πŸ“– Lesson ⏱️ 60 minutes

Loss Functions and Optimization

Understanding different loss functions for various tasks

Introduction

A loss function (also called a cost function or objective function) measures how well a model’s predictions align with the actual labels.

During training, the model:

βœ… Makes predictions.
βœ… Compares predictions with true labels using the loss function.
βœ… Adjusts weights to minimize the loss using optimization.


Why are Loss Functions Important?

βœ… They guide the learning process during training.
βœ… Help the model understand how far off its predictions are.
βœ… Enable the optimizer to update weights to improve accuracy and performance.


1️⃣ Loss Functions for Regression

Mean Squared Error (MSE)

Measures the average squared difference between predicted and actual values.

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$

βœ… Penalizes larger errors more heavily.
βœ… Used in tasks like predicting house prices or temperatures.
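As a quick illustration, the MSE formula above can be computed directly with NumPy (the values below are made up for the example):

```python
import numpy as np

# Hypothetical predictions vs. true values for a small regression task.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MSE: average of the squared differences.
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375
```

Notice how the single error of 1.0 contributes as much to the sum as all the smaller errors combined, which is exactly the "penalizes larger errors more heavily" behavior.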


Mean Absolute Error (MAE)

Measures the average absolute difference between predicted and actual values.

$$ MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| $$

βœ… Less sensitive to outliers compared to MSE.
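A sketch of MAE on the same made-up values, so you can compare it against MSE:

```python
import numpy as np

# Same hypothetical regression values as in the MSE example.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MAE: average of the absolute differences.
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # 0.5
```

Because errors are not squared, one large error cannot dominate the total the way it does with MSE.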


2️⃣ Loss Functions for Binary Classification

Binary Cross-Entropy (Log Loss)

Used when predicting a binary label (0 or 1), it measures the difference between the true label and predicted probability.

$$ L = -\left[ y \log(p) + (1 - y) \log(1 - p) \right] $$

βœ… Used in spam detection, medical diagnosis (yes/no tasks).
βœ… Requires sigmoid activation in the output layer.
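The formula above can be sketched as a small NumPy function; the labels and probabilities here are invented for the example, and probabilities are clipped to avoid `log(0)`:

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-12):
    """Average log loss over a batch of binary labels."""
    # Clip probabilities so log(0) never occurs.
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1.0, 0.0, 1.0, 1.0])   # true labels
p      = np.array([0.9, 0.1, 0.8, 0.6])   # sigmoid outputs
print(binary_cross_entropy(y_true, p))    # ~0.236
```

Confident, correct predictions (0.9 for label 1) contribute little loss; the uncertain prediction (0.6) contributes the most.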


3️⃣ Loss Functions for Multiclass Classification

Categorical Cross-Entropy

Used when predicting one of multiple classes, comparing predicted probabilities with true labels (one-hot encoded).

$$ L = -\sum_{i=1}^{n} y_i \log(p_i) $$

βœ… Used in digit classification, object recognition tasks.
βœ… Requires softmax activation in the output layer.
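With a one-hot label, only the true class's term in the sum is nonzero, so the loss reduces to `-log` of the probability assigned to the correct class. A minimal sketch with invented values:

```python
import numpy as np

# One-hot true label (class 2 of 3) and a softmax output vector.
y_true = np.array([0.0, 0.0, 1.0])
p = np.array([0.1, 0.2, 0.7])

# Only the true class contributes: loss = -log(p_true_class).
loss = -np.sum(y_true * np.log(p))
print(loss)  # ~0.357, i.e. -log(0.7)
```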


Sparse Categorical Cross-Entropy

Similar to categorical cross-entropy but uses integer labels instead of one-hot encoding, useful when handling large datasets with many classes.
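To see why integer labels are enough, note that they can index the predicted-probability rows directly. A sketch with made-up softmax outputs:

```python
import numpy as np

# Softmax outputs for a batch of two examples, three classes each.
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1]])

# Integer class labels index the true-class probability directly,
# so no one-hot vectors are ever built.
labels = np.array([2, 0])
loss = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
print(loss)  # ~0.290
```

Skipping the one-hot encoding saves memory when the number of classes is large.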


Summary Table

| Task Type | Common Loss Function | Output Activation |
|---|---|---|
| Regression | MSE, MAE | None / Linear |
| Binary Classification | Binary Cross-Entropy | Sigmoid |
| Multiclass Classification | Categorical Cross-Entropy | Softmax |
| Multiclass (Sparse) | Sparse Categorical Cross-Entropy | Softmax |

Example: Using Loss Functions in TensorFlow

```python
import tensorflow as tf

# Minimal placeholder model so the compile calls below run;
# replace it with your own architecture.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# For regression
model.compile(optimizer='adam', loss='mean_squared_error')

# For binary classification
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# For multiclass classification
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

Conclusion

βœ… Loss functions are essential in guiding the learning process of deep learning models.
βœ… Choosing the correct loss function based on your task type ensures effective learning and performance.
βœ… Experiment with these loss functions in your projects to understand their impact.


What’s Next?

βœ… Try building simple regression and classification models using different loss functions.
βœ… Observe how changing the loss function affects training.
βœ… Continue structured learning with the next superml.org tutorials.


Join the SuperML Community to share your experiments and get feedback on your learning journey.


Happy Learning! πŸ§