πŸ“– Lesson ⏱️ 90 minutes

Hyperparameter Tuning

Grid search, random search, and Bayesian optimization

Hyperparameter Tuning in Machine Learning

Hyperparameter tuning is the process of finding the optimal configuration for your machine learning models. Unlike model parameters that are learned during training, hyperparameters are set before training and control how the learning process works.

What Are Hyperparameters?

Hyperparameters are configuration settings that determine how a model learns. They’re different from model parameters (like weights in neural networks) because they’re set before training begins.
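
For example, a decision tree's maximum depth is a hyperparameter chosen before training, while the split thresholds inside the fitted tree are parameters learned from the data. A minimal sketch (the iris dataset is used purely for illustration):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameter: chosen before training starts
tree = DecisionTreeClassifier(max_depth=3, random_state=42)

# Parameters: the tree structure learned from the data during fit
tree.fit(X, y)
print(tree.get_depth())        # depth actually reached (at most max_depth)
print(tree.tree_.node_count)   # number of nodes learned from the data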

Common Hyperparameters

For Decision Trees:

  • max_depth: Maximum depth of the tree
  • min_samples_split: Minimum samples required to split a node
  • min_samples_leaf: Minimum samples required in a leaf node

For Random Forest:

  • n_estimators: Number of trees in the forest
  • max_features: Number of features to consider for splits
  • bootstrap: Whether to use bootstrapping

For SVM:

  • C: Regularization parameter
  • kernel: Kernel type (linear, rbf, poly)
  • gamma: Kernel coefficient

For Neural Networks:

  • learning_rate: Step size for each weight update
  • batch_size: Number of samples per gradient update
  • epochs: Number of complete passes over the training data
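
In scikit-learn, all of these are passed to the estimator's constructor before the model ever sees data. A brief sketch with illustrative values (not tuned recommendations):

from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(max_depth=10, min_samples_split=5, min_samples_leaf=2)
forest = RandomForestClassifier(n_estimators=200, max_features='sqrt', bootstrap=True)
svm = SVC(C=1.0, kernel='rbf', gamma='scale')
mlp = MLPClassifier(learning_rate_init=0.001, batch_size=32, max_iter=200)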

Why Hyperparameter Tuning Matters

βœ… Improved Performance: Proper tuning can significantly boost model accuracy
βœ… Prevent Overfitting: The right parameters help models generalize better
βœ… Faster Training: Optimal settings can reduce training time
βœ… Better Generalization: Tuned models perform better on unseen data

Hyperparameter Tuning Techniques

1. Grid Search

Grid search exhaustively tries every combination of the specified hyperparameter values. The grid below contains 3 Γ— 4 Γ— 3 = 36 combinations, so with 5-fold cross-validation GridSearchCV fits 180 models.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Example data so the snippets in this lesson are runnable
# (substitute your own feature matrix X and labels y)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Define parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10]
}

# Create model
rf = RandomForestClassifier(random_state=42)

# Grid search with cross-validation
grid_search = GridSearchCV(
    rf, 
    param_grid, 
    cv=5, 
    scoring='accuracy',
    n_jobs=-1
)

# Fit and find best parameters
grid_search.fit(X_train, y_train)
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_}")

2. Random Search

Random search samples hyperparameter values at random from specified distributions, so the computational budget is fixed by n_iter rather than by the size of the grid. This makes it practical for large search spaces.

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# Define parameter distributions
param_dist = {
    'n_estimators': randint(50, 500),
    'max_depth': [None] + list(range(10, 51, 10)),  # None or a depth from 10 to 50
    'min_samples_split': randint(2, 20)
}

# Random search
random_search = RandomizedSearchCV(
    RandomForestClassifier(),
    param_distributions=param_dist,
    n_iter=100,  # Number of parameter settings sampled
    cv=5,
    scoring='accuracy',
    n_jobs=-1
)

random_search.fit(X_train, y_train)
print(f"Best parameters: {random_search.best_params_}")

3. Bayesian Optimization

Bayesian optimization uses probabilistic models to find optimal hyperparameters more efficiently.

from skopt import BayesSearchCV
from skopt.space import Real, Integer

# Define search space
search_space = {
    'n_estimators': Integer(50, 500),
    'max_depth': Integer(10, 50),
    'min_samples_split': Integer(2, 20),
    'min_samples_leaf': Integer(1, 10)
}

# Bayesian search
bayes_search = BayesSearchCV(
    RandomForestClassifier(),
    search_space,
    n_iter=50,
    cv=5,
    scoring='accuracy',
    n_jobs=-1
)

bayes_search.fit(X_train, y_train)
print(f"Best parameters: {bayes_search.best_params_}")

Cross-Validation for Robust Tuning

Always use cross-validation to ensure your hyperparameter selection is robust:

from sklearn.model_selection import cross_val_score

# Get the best model
best_model = grid_search.best_estimator_

# Evaluate with cross-validation
cv_scores = cross_val_score(best_model, X_train, y_train, cv=5)
print(f"Cross-validation scores: {cv_scores}")
print(f"Mean CV score: {cv_scores.mean():.4f} (+/- {cv_scores.std() * 2:.4f})")

Practical Tips

1. Start Simple

Begin with a coarse grid search to identify promising regions, then refine.
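
For example, a coarse pass over several orders of magnitude can locate a promising region, and a second, finer grid is then placed around the best value found. A sketch using an SVM's C parameter and the training split defined earlier (the specific values are illustrative):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Coarse pass: span several orders of magnitude
coarse = GridSearchCV(SVC(), {'C': [0.01, 0.1, 1, 10, 100]}, cv=5)
coarse.fit(X_train, y_train)
best_C = coarse.best_params_['C']

# Fine pass: zoom in around the best coarse value
fine = GridSearchCV(SVC(), {'C': [best_C * f for f in (0.25, 0.5, 1, 2, 4)]}, cv=5)
fine.fit(X_train, y_train)
print(fine.best_params_)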

2. Use Appropriate Metrics

Choose evaluation metrics that align with your business objectives.
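
For example, on an imbalanced classification problem accuracy can be misleading; optimizing a scorer such as macro-averaged F1 may match the objective better. A sketch reusing rf and param_grid from above:

from sklearn.model_selection import GridSearchCV

# Rank configurations by macro-averaged F1 instead of accuracy
grid_search_f1 = GridSearchCV(rf, param_grid, cv=5, scoring='f1_macro', n_jobs=-1)
grid_search_f1.fit(X_train, y_train)
print(grid_search_f1.best_params_)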

3. Consider Computational Cost

Balance search thoroughness with available computational resources.
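
One option for stretching a limited budget, available in scikit-learn 0.24+ as an experimental feature, is successive halving: many candidates are evaluated with small resource budgets and only the best are promoted. A sketch reusing param_grid from above (assuming a recent scikit-learn version):

from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV

halving_search = HalvingGridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    factor=3,            # keep roughly the top third of candidates each round
    cv=5,
    scoring='accuracy',
    n_jobs=-1
)
halving_search.fit(X_train, y_train)
print(halving_search.best_params_)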

4. Nested Cross-Validation

For unbiased performance estimates, use nested cross-validation.

from sklearn.model_selection import cross_validate

# Nested cross-validation
nested_scores = cross_validate(
    GridSearchCV(rf, param_grid, cv=3),
    X, y, cv=5, scoring='accuracy'
)

print(f"Nested CV score: {nested_scores['test_score'].mean():.4f}")

Advanced Techniques

1. Multi-Objective Optimization

When optimizing for multiple objectives (accuracy vs. speed):

from sklearn.model_selection import GridSearchCV

# Custom scoring function: trade accuracy off against model size.
# GridSearchCV accepts any callable with the signature (estimator, X, y).
def custom_scorer(estimator, X, y):
    accuracy = estimator.score(X, y)
    # Penalize larger ensembles so that, at similar accuracy, smaller
    # (faster) models are preferred; n_estimators varies across the grid
    complexity_penalty = estimator.n_estimators * 0.0001
    return accuracy - complexity_penalty

grid_search = GridSearchCV(
    rf, param_grid, cv=5, 
    scoring=custom_scorer
)

2. Early Stopping

For iterative algorithms, implement early stopping to prevent overfitting:

from sklearn.ensemble import GradientBoostingClassifier

gb = GradientBoostingClassifier(
    n_estimators=1000,         # Upper bound on the number of boosting stages
    validation_fraction=0.1,   # Hold out 10% of the training data for validation
    n_iter_no_change=10,       # Stop if no improvement for 10 consecutive iterations
    random_state=42
)

gb.fit(X_train, y_train)
print(f"Boosting stages actually used: {gb.n_estimators_}")

Common Pitfalls to Avoid

❌ Data Leakage: Don’t use test data for hyperparameter tuning (see the sketch after this list)
❌ Overfitting to the Validation Set: Use a separate validation set or cross-validation
❌ Ignoring Computational Cost: Balance search thoroughness with resources
❌ Not Considering Domain Knowledge: Use domain expertise to guide the search
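
To avoid the first two pitfalls, keep a held-out test set out of the search entirely: tune on the training data (cross-validation happens inside the search), then evaluate the selected model once. A sketch reusing the split and grid search defined earlier:

# Tune using only the training data
grid_search.fit(X_train, y_train)

# Touch the held-out test set exactly once, for the final estimate
final_model = grid_search.best_estimator_
print(f"Held-out test accuracy: {final_model.score(X_test, y_test):.4f}")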

Best Practices

βœ… Start with Default Parameters: Establish baseline performance
βœ… Use Logarithmic Scales: For parameters like learning rate (see the sketch below)
βœ… Parallelize Search: Use n_jobs=-1 for faster computation
βœ… Monitor Progress: Track performance throughout the tuning process
βœ… Document Results: Keep track of tried configurations
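
For the logarithmic-scale point, scipy's loguniform distribution samples every order of magnitude with equal probability, which suits parameters such as learning rates. A sketch with RandomizedSearchCV and a gradient boosting model (the ranges are illustrative):

from scipy.stats import loguniform
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

log_search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_distributions={'learning_rate': loguniform(1e-4, 1e0)},
    n_iter=20,
    cv=5,
    n_jobs=-1
)
log_search.fit(X_train, y_train)
print(log_search.best_params_)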

Conclusion

Hyperparameter tuning is crucial for building high-performing machine learning models. Start with grid search for simple cases, use random search for larger parameter spaces, and consider Bayesian optimization for complex scenarios.

Remember that the best hyperparameters are dataset-specific, so always validate your results on unseen data and consider the computational trade-offs involved in your tuning strategy.

Next Steps

  • Practice hyperparameter tuning on different algorithms
  • Explore automated machine learning (AutoML) tools
  • Learn about neural architecture search for deep learning
  • Study multi-objective optimization techniques