📋 Prerequisites
- Basic understanding of neural networks and activations
🎯 What You'll Learn
- Understand what normalization means in deep learning
- Learn why normalization is important for stable and efficient training
- Explore batch normalization and layer normalization
- Implement normalization in your deep learning models
Introduction
Normalization techniques in deep learning help stabilize and accelerate training by keeping layer activations within a reasonable range and limiting how much their distributions shift as the network learns (often described as internal covariate shift).
1️⃣ What is Normalization?
Normalization in deep learning refers to adjusting and scaling the activations of layers so that:
✅ The distribution of inputs to each layer remains stable during training.
✅ Training converges faster.
✅ Models become less sensitive to initialization.
2️⃣ Why is Normalization Important?
Without normalization:
✅ The distribution of layer inputs can change during training (internal covariate shift).
✅ Training may be unstable and slow.
✅ The model may get stuck in poor local minima.
Normalization helps:
✅ Use higher learning rates safely.
✅ Improve gradient flow.
✅ Act as a mild regularizer, sometimes reducing the need for dropout.
3️⃣ Types of Normalization
Batch Normalization
Normalizes the output of the previous layer by subtracting the per-feature batch mean and dividing by the batch standard deviation.
Formula:

$$\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}}$$

where $\mu$ is the batch mean, $\sigma^2$ is the batch variance, and $\epsilon$ is a small constant for numerical stability. The normalized value is then scaled and shifted by learnable parameters $\gamma$ and $\beta$: $y = \gamma \hat{x} + \beta$.
✅ Often used in CNNs and MLPs.
✅ Applied before or after the activation function, depending on the implementation (a short NumPy sketch of the normalization step follows).
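To make the formula concrete, here is a minimal NumPy sketch of the normalization step; the input values are made up, and the learnable scale and shift are left out for brevity:

```python
import numpy as np

# A mini-batch of 4 samples with 3 features each (made-up values).
x = np.array([[1.0, 2.0, 0.5],
              [0.0, 1.5, 2.5],
              [2.0, 0.5, 1.0],
              [1.0, 1.0, 3.0]])

eps = 1e-5
mu = x.mean(axis=0)                     # per-feature batch mean
var = x.var(axis=0)                     # per-feature batch variance
x_hat = (x - mu) / np.sqrt(var + eps)   # normalized activations

# Each feature (column) now has roughly zero mean and unit variance.
print(x_hat.mean(axis=0))
print(x_hat.std(axis=0))
```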
Layer Normalization
Normalizes across the features for each sample instead of across the batch.
✅ Useful in RNNs and transformer models, where batch normalization is less effective because batch statistics depend on batch size and sequence length (see the sketch below).
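As a hedged illustration, the sketch below computes layer normalization per sample over the feature axis and compares it with Keras' built-in layers.LayerNormalization; the tensor shape is an arbitrary example:

```python
import tensorflow as tf

# A batch of 4 samples with 8 features each (arbitrary shape).
x = tf.random.normal((4, 8))

# Manual layer normalization: statistics are computed per sample,
# across the feature axis, so each row is normalized independently.
mean = tf.reduce_mean(x, axis=-1, keepdims=True)
var = tf.math.reduce_variance(x, axis=-1, keepdims=True)
x_manual = (x - mean) / tf.sqrt(var + 1e-3)

# Built-in equivalent (epsilon chosen to match the manual version;
# gamma and beta start at 1 and 0, so the outputs agree at initialization).
layer_norm = tf.keras.layers.LayerNormalization(epsilon=1e-3)
x_keras = layer_norm(x)
```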
Other Normalization Techniques:
- Instance Normalization: Common in style transfer tasks.
- Group Normalization: Divides channels into groups and normalizes within each group, which is helpful for small-batch training (a short sketch follows below).
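For completeness, a hedged sketch of group normalization; layers.GroupNormalization ships with recent Keras/TensorFlow releases (older versions provide a similar layer in tensorflow_addons), and the shapes and group count here are arbitrary examples:

```python
import tensorflow as tf
from tensorflow.keras import layers

# A small batch of feature maps: (batch, height, width, channels).
x = tf.random.normal((2, 16, 16, 32))

# 32 channels split into 8 groups of 4; statistics are computed
# per sample and per group, so they do not depend on batch size.
gn = layers.GroupNormalization(groups=8)
y = gn(x)
```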
4️⃣ Example: Using Batch Normalization in TensorFlow
```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Dense(128),             # linear layer, no activation yet
    layers.BatchNormalization(),   # normalize pre-activations over the batch
    layers.Activation('relu'),     # activation applied after normalization
    layers.Dense(10, activation='softmax')
])
```
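Continuing from the model above, a hedged usage sketch; the dataset choice (MNIST) and the hyperparameters are placeholders rather than part of the original tutorial:

```python
# Placeholder data: MNIST digits flattened to 784 features per sample.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, batch_size=64)
```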
5️⃣ Best Practices
✅ For CNNs, place batch normalization after the convolution and before the activation (see the sketch after this list).
✅ For MLPs, use batch normalization after dense layers.
✅ Experiment with layer normalization for sequence models.
✅ Adjust learning rates as normalization often allows for higher rates.
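As a hedged sketch of the CNN placement advice above (the layer sizes and input shape are arbitrary examples):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

cnn = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, padding="same"),  # convolution first
    layers.BatchNormalization(),                       # then batch norm
    layers.Activation("relu"),                         # then the activation
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
```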
Conclusion
Normalization is a powerful tool in deep learning that:
✅ Stabilizes training.
✅ Speeds up convergence.
✅ Improves generalization.
Learning to implement and tune normalization in your networks will make your models more robust and efficient.
What’s Next?
✅ Try adding batch normalization to your existing models and observe its effect on training.
✅ Explore advanced normalization techniques for specialized architectures like transformers.
✅ Continue your structured deep learning journey on superml.org.
Join the SuperML Community to share experiments and learn collaboratively.
Happy Learning! ✨