TIL: Hyperparameter Tuning with Ray Tune
Today I learned how to efficiently optimize hyperparameters for machine learning models using Ray Tune, a powerful library for distributed hyperparameter tuning.
Why Ray Tune?
After struggling with manual hyperparameter tuning and grid search approaches that took days to complete, I discovered Ray Tune, which offers several advantages:
- Distributed execution: Parallelizes trials across CPU cores and machines
- Early stopping: Automatically terminates underperforming trials
- Advanced search algorithms: Bayesian optimization, HyperBand, and more
- Resource management: Efficiently allocates CPU/GPU resources to each trial (see the sketch after this list)
- Integration: Works with PyTorch, TensorFlow, scikit-learn, and more
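As a quick illustration of the resource management point, each trial can be given an explicit CPU/GPU budget so Ray packs trials onto the available hardware. This is only a minimal sketch with assumed values (2 CPUs and half a GPU per trial); it reuses the objective and search_space defined later in this post, and on newer Ray releases you would wrap the trainable with tune.with_resources instead of passing resources_per_trial:

from ray import tune

# Assumed budgets: reserve 2 CPUs per trial and share a GPU between two trials
result = tune.run(
    objective,                                   # defined in the complete example below
    config=search_space,                         # defined in the next section
    resources_per_trial={'cpu': 2, 'gpu': 0.5},  # placeholder values
    num_samples=50
)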
Key Components
Search Space Definition
Ray Tune supports various sampling methods for defining hyperparameter search spaces:
from ray import tune

search_space = {
    'lr': tune.loguniform(1e-4, 1e-1),                  # Log-uniform distribution
    'hidden_units': tune.choice([64, 128, 256, 512]),   # Discrete choices
    'dropout': tune.uniform(0.1, 0.5),                  # Uniform distribution
    'activation': tune.grid_search(['relu', 'tanh'])    # Grid search these values
}
Search Algorithms
Ray Tune integrates with various optimization libraries:
- Bayesian Optimization (via HyperOpt): Builds a probabilistic model of the objective function (see the sketch after this list)
- Population-Based Training: Evolves a population of models via genetic algorithm principles
- HyperBand/ASHA: Efficiently allocates resources to promising configurations
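For example, HyperOpt-backed Bayesian optimization plugs in as a search algorithm passed to the run. A minimal sketch, assuming the objective function from the complete example below; note that the import path is ray.tune.search.hyperopt on Ray 2.x, and that suggestion-based algorithms expect sampling primitives, so the grid_search entry from earlier is swapped for tune.choice here:

from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch  # ray.tune.search.hyperopt on Ray >= 2.x

# Same search space as before, but with grid_search replaced by tune.choice
hyperopt_space = {
    'lr': tune.loguniform(1e-4, 1e-1),
    'hidden_units': tune.choice([64, 128, 256, 512]),
    'dropout': tune.uniform(0.1, 0.5),
    'activation': tune.choice(['relu', 'tanh'])
}

# HyperOpt proposes new configurations based on the results observed so far
search_alg = HyperOptSearch(metric='loss', mode='min')

result = tune.run(
    objective,
    config=hyperopt_space,
    search_alg=search_alg,
    num_samples=50
)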
Schedulers for Early Stopping
Schedulers determine which trials should be terminated early:
from ray.tune.schedulers import ASHAScheduler

scheduler = ASHAScheduler(
    metric='val_loss',
    mode='min',
    max_t=100,          # Maximum number of training iterations
    grace_period=10,    # Minimum iterations before stopping
    reduction_factor=2
)
Complete Example
Here’s how I implemented a complete hyperparameter tuning workflow:
import ray
from ray import tune
from ray.tune.schedulers import ASHAScheduler

# Define the objective function
def objective(config):
    # Create and train model with the hyperparameters in config
    model = create_model(
        learning_rate=config['lr'],
        hidden_units=config['hidden_units'],
        dropout=config['dropout']
    )

    # Train and evaluate
    for epoch in range(10):
        train_loss = train_epoch(model)
        val_loss = validate(model)

        # Report metrics to Ray Tune
        tune.report(loss=val_loss, training_loss=train_loss)

# Run hyperparameter search
result = tune.run(
    objective,
    config=search_space,
    scheduler=ASHAScheduler(metric='loss', mode='min'),
    num_samples=50
)

# Get best configuration
best_config = result.get_best_config(metric='loss', mode='min')
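To sanity-check the winner, the analysis object returned by tune.run can also surface the best trial's final metrics; a short sketch under the same setup:

# Inspect the winning trial's last reported metrics
best_trial = result.get_best_trial(metric='loss', mode='min')
print('Best config:', best_config)
print('Final validation loss:', best_trial.last_result['loss'])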
Using Ray Tune reduced my hyperparameter optimization time from days to hours while finding better configurations than my manual tuning efforts. The early stopping feature alone saved approximately 70% of computation time by terminating unpromising trials.