πŸ“– Lesson ⏱️ 60 minutes

Loss Functions and Optimization

Understanding different loss functions for various tasks

Introduction

A loss function (also called a cost function or objective function) measures how well a model’s predictions align with the actual labels.

During training, the model:

βœ… Makes predictions.
βœ… Compares predictions with true labels using the loss function.
βœ… Adjusts weights to minimize the loss using optimization.


Why are Loss Functions Important?

βœ… They guide the learning process during training.
βœ… Help the model understand how far off its predictions are.
βœ… Enable the optimizer to update weights to improve accuracy and performance.


1️⃣ Loss Functions for Regression

Mean Squared Error (MSE)

Measures the average squared difference between predicted and actual values.

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$

βœ… Penalizes larger errors more heavily.
βœ… Used in tasks like predicting house prices or temperatures.
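As a quick illustration, the MSE formula above can be computed directly with NumPy (the values below are made up for the example):

```python
import numpy as np

# Hypothetical predictions vs. true values for a small regression task.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MSE: average of the squared differences.
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375
```

Notice how the single error of 1.0 contributes as much to the sum as all the smaller errors combined, which is exactly the "penalizes larger errors more heavily" behavior.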


Mean Absolute Error (MAE)

Measures the average absolute difference between predicted and actual values.

$$ MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| $$

βœ… Less sensitive to outliers compared to MSE.
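A sketch of MAE on the same made-up values, so you can compare it against MSE:

```python
import numpy as np

# Same hypothetical regression values as in the MSE example.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MAE: average of the absolute differences.
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # 0.5
```

Because errors are not squared, one large error cannot dominate the total the way it does with MSE.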


2️⃣ Loss Functions for Binary Classification

Binary Cross-Entropy (Log Loss)

Used when predicting a binary label (0 or 1), it measures the difference between the true label and predicted probability.

$$ L = -\left[ y \log(p) + (1 - y) \log(1 - p) \right] $$

βœ… Used in spam detection, medical diagnosis (yes/no tasks).
βœ… Requires sigmoid activation in the output layer.
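The formula above can be sketched as a small NumPy function; the labels and probabilities here are invented for the example, and probabilities are clipped to avoid `log(0)`:

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-12):
    """Average log loss over a batch of binary labels."""
    # Clip probabilities so log(0) never occurs.
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1.0, 0.0, 1.0, 1.0])   # true labels
p      = np.array([0.9, 0.1, 0.8, 0.6])   # sigmoid outputs
print(binary_cross_entropy(y_true, p))    # ~0.236
```

Confident, correct predictions (0.9 for label 1) contribute little loss; the uncertain prediction (0.6) contributes the most.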


3️⃣ Loss Functions for Multiclass Classification

Categorical Cross-Entropy

Used when predicting one of multiple classes, comparing predicted probabilities with true labels (one-hot encoded).

$$ L = -\sum_{i=1}^{n} y_i \log(p_i) $$

βœ… Used in digit classification, object recognition tasks.
βœ… Requires softmax activation in the output layer.
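With a one-hot label, only the true class's term in the sum is nonzero, so the loss reduces to `-log` of the probability assigned to the correct class. A minimal sketch with invented values:

```python
import numpy as np

# One-hot true label (class 2 of 3) and a softmax output vector.
y_true = np.array([0.0, 0.0, 1.0])
p = np.array([0.1, 0.2, 0.7])

# Only the true class contributes: loss = -log(p_true_class).
loss = -np.sum(y_true * np.log(p))
print(loss)  # ~0.357, i.e. -log(0.7)
```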


Sparse Categorical Cross-Entropy

Similar to categorical cross-entropy but uses integer labels instead of one-hot encoding, useful when handling large datasets with many classes.
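To see why integer labels are enough, note that they can index the predicted-probability rows directly. A sketch with made-up softmax outputs:

```python
import numpy as np

# Softmax outputs for a batch of two examples, three classes each.
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1]])

# Integer class labels index the true-class probability directly,
# so no one-hot vectors are ever built.
labels = np.array([2, 0])
loss = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
print(loss)  # ~0.290
```

Skipping the one-hot encoding saves memory when the number of classes is large.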


Summary Table

| Task Type | Common Loss Function | Output Activation |
|---|---|---|
| Regression | MSE, MAE | None / Linear |
| Binary Classification | Binary Cross-Entropy | Sigmoid |
| Multiclass Classification | Categorical Cross-Entropy | Softmax |
| Multiclass (Sparse) | Sparse Categorical Cross-Entropy | Softmax |

Example: Using Loss Functions in TensorFlow

```python
import tensorflow as tf

# Minimal placeholder model so the compile calls below run;
# replace it with your own architecture.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# For regression
model.compile(optimizer='adam', loss='mean_squared_error')

# For binary classification
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# For multiclass classification
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

Conclusion

βœ… Loss functions are essential in guiding the learning process of deep learning models.
βœ… Choosing the correct loss function based on your task type ensures effective learning and performance.
βœ… Experiment with these loss functions in your projects to understand their impact.


What’s Next?

βœ… Try building simple regression and classification models using different loss functions.
βœ… Observe how changing the loss function affects training.
βœ… Continue structured learning with the next superml.org tutorials.


Join the SuperML Community to share your experiments and get feedback on your learning journey.


Happy Learning! πŸ§