Introduction to Logistic Regression

Learn what logistic regression is, how it works, and how to implement it using Python and scikit-learn in this clear, beginner-friendly tutorial.

🔰 beginner
⏱️ 20 minutes
👤 SuperML Team


📋 Prerequisites

  • Basic Python knowledge
  • Understanding of linear regression
  • Familiarity with pandas and scikit-learn

🎯 What You'll Learn

  • Understand what logistic regression is and how it differs from linear regression
  • Implement logistic regression using scikit-learn
  • Interpret model outputs and accuracy
  • Apply logistic regression to binary classification problems

Introduction

Logistic regression is a fundamental algorithm for classification problems in machine learning. Unlike linear regression, which predicts a continuous value, logistic regression estimates the probability that an input belongs to a class and uses that probability to predict a categorical outcome, most often a binary one (0 or 1, yes or no).


When to Use Logistic Regression?

Use logistic regression when you need to:

  • Predict whether an email is spam or not.
  • Determine if a transaction is fraudulent.
  • Predict whether a patient has a disease based on medical data.

How Does Logistic Regression Work?

Logistic regression passes the output of a linear equation through the sigmoid function, which converts it into a probability between 0 and 1. That probability is then mapped to a class by comparing it with a threshold, typically 0.5.

Mathematically:

σ(z) = 1 / (1 + e^(-z))

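where z = w₀ + w₁x₁ + w₂x₂ + ... + wₙxₙ. Here w₀ is the intercept, w₁ ... wₙ are the weights learned during training, and x₁ ... xₙ are the input features.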
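
The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the scikit-learn workflow below: the weights `w0` and `w1` and the example scores are hypothetical values chosen by hand, and the 0.5 threshold is the usual default rather than a requirement.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights for a one-feature model: z = w0 + w1 * x1
w0, w1 = -6.5, 0.1

scores = np.array([30.0, 70.0, 90.0])  # example feature values (assumed)
z = w0 + w1 * scores                   # linear part of the model
probabilities = sigmoid(z)             # squashed into (0, 1)

# Map probabilities to classes with a 0.5 threshold
predicted_classes = (probabilities >= 0.5).astype(int)

print("Probabilities:", probabilities)
print("Predicted classes:", predicted_classes)
```

Large negative values of z give probabilities near 0, large positive values give probabilities near 1, and z = 0 gives exactly 0.5.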


Step-by-Step Implementation

1️⃣ Import Libraries

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

2️⃣ Load Dataset

For this example, we’ll use a small toy dataset with a single feature, the exam score, and a target indicating whether the student was admitted (1) or not (0).

data = {
    'Exam_Score': [50, 60, 70, 80, 90, 30, 40, 55, 65, 75],
    'Admitted': [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
}
df = pd.DataFrame(data)
print(df.head())

3️⃣ Prepare Data

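# scikit-learn expects a 2-D feature table and a 1-D target
X = df[['Exam_Score']]
y = df['Admitted']

# Hold out 20% of the rows (2 of 10) for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)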
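
On a dataset this small, a random split can leave the test set with only one class. One possible variation, not used in the original steps, is to pass `stratify=y` so that both splits keep the same class balance. The sketch below rebuilds the same toy dataset so it runs on its own.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Same toy dataset as in step 2
df = pd.DataFrame({
    'Exam_Score': [50, 60, 70, 80, 90, 30, 40, 55, 65, 75],
    'Admitted': [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
})
X = df[['Exam_Score']]
y = df['Admitted']

# stratify=y (an addition for illustration) preserves the 50/50 class balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Train shape:", X_train.shape, "Test shape:", X_test.shape)
print("Train class counts:", y_train.value_counts().to_dict())
print("Test class counts:", y_test.value_counts().to_dict())
```

Checking the shapes and class counts after splitting is a quick way to confirm that the training set contains examples of both classes before fitting.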

4️⃣ Train the Logistic Regression Model

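# The default settings apply L2 regularization (C=1.0)
model = LogisticRegression()
# Learn the intercept and weight from the training rows
model.fit(X_train, y_train)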
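
To connect the fitted model back to the sigmoid formula, you can inspect the learned intercept and weight and ask for probabilities instead of hard class labels. The following is a rough sketch under a few assumptions: it fits on all ten rows purely for illustration, raises `max_iter` as a precaution because the feature is unscaled, and uses made-up scores in `new_scores`.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same toy dataset as in step 2
df = pd.DataFrame({
    'Exam_Score': [50, 60, 70, 80, 90, 30, 40, 55, 65, 75],
    'Admitted': [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
})
X = df[['Exam_Score']]
y = df['Admitted']

# Fit on every row here for illustration only; max_iter=1000 is a precaution
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

w0 = model.intercept_[0]  # corresponds to w0 in z = w0 + w1 * x1
w1 = model.coef_[0][0]    # corresponds to w1
print("Intercept (w0):", w0)
print("Weight (w1):", w1)

# The score at which the predicted probability crosses 0.5 (where z = 0)
print("Decision boundary:", -w0 / w1)

# Hypothetical new students
new_scores = pd.DataFrame({'Exam_Score': [45, 62, 85]})
proba_admitted = model.predict_proba(new_scores)[:, 1]  # column 1 = P(class 1)
labels = model.predict(new_scores)

print("P(admitted):", proba_admitted)
print("Predicted labels:", labels)
```

`predict_proba` returns one column per class, so the second column holds the probability of admission, while `predict` returns the class label directly.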

5️⃣ Evaluate the Model

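# Predict class labels (0 or 1) for the held-out rows
y_pred = model.predict(X_test)

# With only two test rows, accuracy can only be 0.0, 0.5, or 1.0,
# so treat these numbers as a demonstration rather than a reliable estimate
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))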
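
To see how the numbers in these reports relate to each other, it can help to derive them by hand from the confusion matrix. The sketch below uses hypothetical `y_true` and `y_pred` lists rather than the tutorial's two-row test set, so that each cell of the matrix is non-zero.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels and predictions, for illustration only
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# For binary labels [0, 1], ravel() yields: true negatives, false positives,
# false negatives, true positives
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # of predicted positives, how many are right
recall = tp / (tp + fn)                     # of actual positives, how many were found

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)

# The hand-computed accuracy matches scikit-learn's accuracy_score
print("Matches accuracy_score:", accuracy == accuracy_score(y_true, y_pred))
```

Precision and recall for class 1 are the same quantities that `classification_report` lists in the row labelled 1.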

Conclusion

🎉 You have successfully:

✅ Understood logistic regression basics.
✅ Implemented logistic regression using scikit-learn.
✅ Evaluated your model’s performance on a classification task.


What’s Next?

  • Try using logistic regression on a larger dataset, such as the Titanic dataset.
  • Learn about regularization in logistic regression to handle overfitting.
  • Explore multiclass classification with logistic regression.

Join our SuperML Community to share your project results, ask questions, and continue your machine learning journey!
