AWS Certified AI Practitioner(30) - Hyperparameter Tuning
Hyperparameter Tuning
1. What is a Hyperparameter?
- Definition: Settings that define how the model is structured and how the learning algorithm works.
- Set before training begins (they are not learned from the data).
- Examples:
- Learning rate
- Batch size
- Number of epochs
- Regularization
๐ Exam Tip: Hyperparameters are not learned during training. They are chosen before training and tuned for best performance.
2. Why Hyperparameter Tuning Matters
- Goal: Find the best combination of hyperparameters to optimize model performance.
- Benefits:
- Improves accuracy
- Reduces overfitting
- Enhances generalization to new data
- Methods:
- Grid Search: Tries all possible parameter combinations.
- Random Search: Tests random parameter values.
- Automated Services:
- Amazon SageMaker Automatic Model Tuning (AMT) runs multiple training jobs and finds the best settings.
3. Key Hyperparameters
(1) Learning Rate
- Controls how big the steps are when updating model weights.
- High learning rate: Faster training, but may overshoot the optimal solution.
- Low learning rate: More stable and precise, but much slower.
(2) Batch Size
- Number of training examples processed in one iteration.
- Small batches: More stable, but slower.
- Large batches: Faster, but may cause less stable updates.
(3) Number of Epochs
- How many times the model goes through the entire training dataset.
- Too few: Underfitting (model doesnโt learn enough).
- Too many: Overfitting (model memorizes the data, performs poorly on new data).
(4) Regularization
- Controls the balance between a simple and complex model.
- More regularization โ less overfitting.
๐ Exam Tip: If asked how to reduce overfitting, increasing regularization is often the correct answer.
4. Overfitting
What is it?
- The model performs very well on training data but poorly on new, unseen data.
Causes
- Too little training data โ not representative.
- Training for too many epochs.
- Model too complex โ learns noise instead of patterns.
Solutions
- Increase training data size (best option).
- Use early stopping (stop training before overfitting).
- Apply data augmentation (add diversity to training data).
- Adjust hyperparameters (e.g., increase regularization, change batch size).
๐ Exam Tip: If the question is โbest way to prevent overfittingโ, the answer is usually increase training data.
5. When NOT to Use Machine Learning
- Example:
You have a deck of 10 cards (5 red, 3 blue, 2 yellow).
Q: What is the probability of drawing a blue card?
A: 3/10 = 0.3
This is a deterministic problem:
- The exact answer can be computed mathematically.
- Writing simple code is the best solution.
If we used ML (supervised, unsupervised, or reinforcement learning), weโd only get an approximation, not an exact result.
๐ Exam Tip:
Machine Learning is not appropriate for problems that have a clear, deterministic answer. It is designed for problems where patterns must be learned from data.
6. AWS-Specific Notes for Exams
- Amazon SageMaker Automatic Model Tuning (AMT): Automates hyperparameter tuning by running multiple jobs in parallel.
- Common Exam Questions:
- How to fix overfitting โ Increase data / regularization.
- What hyperparameter affects convergence speed โ Learning rate.
- Which AWS service automates tuning โ SageMaker AMT.
- When NOT to use ML โ Deterministic problem with exact answers.
โ Summary - Hyperparameters (learning rate, batch size, epochs regularization) must be tuned for best performance.
- Tuning improves accuracy, reduces overfitting, and enhances generalization.
- Overfitting occurs when the model memorizes training data โ fix by more data, regularization, early stopping.
- ML is not appropriate for deterministic problems.
- On AWS, SageMaker AMT is the go-to tool for automated hyperparameter tuning.