📋 Prerequisites
- Basic Python knowledge
- Understanding of linear regression
- Familiarity with pandas and scikit-learn
🎯 What You'll Learn
- Understand what logistic regression is and how it differs from linear regression
- Implement logistic regression using scikit-learn
- Interpret model outputs and accuracy
- Apply logistic regression to binary classification problems
Introduction
Logistic regression is a fundamental machine learning algorithm for classification. Unlike linear regression, which predicts a continuous value, logistic regression predicts the probability that an example belongs to a class. The most common case is binary: 0 or 1, yes or no.
When to Use Logistic Regression?
Use logistic regression when you need to:
- Predict whether an email is spam or not.
- Determine if a transaction is fraudulent.
- Predict whether a patient has a disease based on medical data.
How Does Logistic Regression Work?
Logistic regression passes the output of a linear equation through the sigmoid function, which converts it into a probability between 0 and 1. That probability is then mapped to a class, typically by using 0.5 as the cutoff.
Mathematically:
σ(z) = 1 / (1 + e^(-z))
where z = w₀ + w₁x₁ + w₂x₂ + ... + wₙxₙ.
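To make the formula concrete, here is a minimal sketch of the sigmoid applied to a one-feature linear equation. The weights `w0` and `w1` and the exam score are hypothetical values chosen only for illustration; they are not fitted to any data.

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array) to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights, picked by hand for illustration only
w0, w1 = -6.0, 0.1
exam_score = 70.0

z = w0 + w1 * exam_score   # linear part: z = w0 + w1 * x1
p = sigmoid(z)             # probability of class 1
label = int(p >= 0.5)      # apply a 0.5 cutoff to get a class

print("z:", z)
print("probability:", p)
print("predicted class:", label)
```

A value of z = 0 gives a probability of exactly 0.5, so the sign of z decides which side of the cutoff a prediction falls on.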
Step-by-Step Implementation
1️⃣ Import Libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
2️⃣ Load Dataset
For this example, we’ll use a small toy dataset with a single feature, the exam score, and a target indicating whether the applicant was admitted (1) or not (0).
data = {
    'Exam_Score': [50, 60, 70, 80, 90, 30, 40, 55, 65, 75],
    'Admitted': [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
}
df = pd.DataFrame(data)
print(df.head())
3️⃣ Prepare Data
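# Features must be 2-D (a DataFrame); the target is 1-D (a Series)
X = df[['Exam_Score']]
y = df['Admitted']
# Hold out 20% of the rows (2 of 10) for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)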
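With only ten rows, it can help to check what the split actually produced. The sketch below rebuilds the same toy data and prints the sizes of each part. It also passes `stratify=y`, which is an addition not used in this tutorial: it asks scikit-learn to keep the class proportions the same in the training and test sets.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Same toy data as in the tutorial
df = pd.DataFrame({
    'Exam_Score': [50, 60, 70, 80, 90, 30, 40, 55, 65, 75],
    'Admitted': [0, 0, 1, 1, 1, 0, 0, 0, 1, 1],
})
X = df[['Exam_Score']]
y = df['Admitted']

# stratify=y is an assumption added here, not part of the original steps
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Training rows:", len(X_train))
print("Test rows:", len(X_test))
print("Test labels:", sorted(y_test.tolist()))
```

A test set of two rows is too small for a reliable accuracy estimate, so treat any score on this dataset as a demonstration of the workflow rather than a measurement.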
4️⃣ Train the Logistic Regression Model
model = LogisticRegression()
model.fit(X_train, y_train)
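After fitting, the learned weights and the predicted probabilities can be inspected directly. The sketch below is self-contained, so it fits on all ten rows instead of the training split, and the three new exam scores are hypothetical. `intercept_` and `coef_` correspond to w₀ and w₁ in the formula above, and `predict_proba` returns one column per class.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same toy data as in the tutorial
df = pd.DataFrame({
    'Exam_Score': [50, 60, 70, 80, 90, 30, 40, 55, 65, 75],
    'Admitted': [0, 0, 1, 1, 1, 0, 0, 0, 1, 1],
})
X = df[['Exam_Score']]
y = df['Admitted']

# Fit on all rows only to keep this sketch short
model = LogisticRegression()
model.fit(X, y)

w0 = model.intercept_[0]   # bias term
w1 = model.coef_[0][0]     # weight for Exam_Score
print("w0:", w0, "w1:", w1)

# Hypothetical new applicants
new_scores = pd.DataFrame({'Exam_Score': [45, 62, 85]})
proba = model.predict_proba(new_scores)[:, 1]   # column 1 = P(Admitted = 1)
pred = model.predict(new_scores)

for score, p, label in zip(new_scores['Exam_Score'], proba, pred):
    print(score, "->", round(float(p), 3), "->", int(label))
```

A positive `w1` means that higher exam scores raise the predicted probability of admission, which is what this dataset suggests.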
5️⃣ Evaluate the Model
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
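To read these outputs, it helps to see how the confusion matrix relates to the other metrics. The sketch below uses hypothetical labels and predictions, made up for illustration rather than taken from the model above. Passing `labels=[0, 1]` keeps the matrix 2x2 even when a class is missing from a very small test set.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predictions, for illustration only
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_hat = [0, 1, 1, 1, 0, 0, 1, 0]

# Rows are true classes, columns are predicted classes
tn, fp, fn, tp = confusion_matrix(y_true, y_hat, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # of predicted 1s, how many are truly 1
recall = tp / (tp + fn)                      # of true 1s, how many were found

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
```

The accuracy computed by hand here matches what `accuracy_score` returns for the same inputs, and the precision and recall values are the ones `classification_report` lists for class 1.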
Conclusion
🎉 You have successfully:
✅ Understood logistic regression basics.
✅ Implemented logistic regression using scikit-learn.
✅ Evaluated your model’s performance on a classification task.
What’s Next?
- Try using logistic regression on a larger dataset, such as the Titanic dataset.
- Learn about regularization in logistic regression to handle overfitting.
- Explore multiclass classification with logistic regression.
Join our SuperML Community to share your project results, ask questions, and continue your machine learning journey!