Build Your First Machine Learning Model: Linear Regression with Python

Introduction

Linear regression is one of the most fundamental algorithms in machine learning. It models the relationship between one or more input variables and a continuous outcome as a straight line, which makes it a good starting point for prediction tasks.

In this tutorial, you’ll learn how to implement linear regression in Python with pandas, scikit-learn, and matplotlib. By the end, you will be able to build, train, and evaluate your first machine learning model.

Prerequisites

Basic knowledge of Python
Installed libraries: pandas, scikit-learn, matplotlib

You can install these using:

pip install pandas scikit-learn matplotlib
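To confirm the installation worked, a quick check like the following can help. It is only a sketch: the `versions` dictionary is a name introduced here for illustration, and the version numbers printed will depend on your environment.

```python
import matplotlib
import pandas as pd
import sklearn

# Collect the installed version of each library used in this tutorial.
versions = {
    "pandas": pd.__version__,
    "scikit-learn": sklearn.__version__,
    "matplotlib": matplotlib.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```

If any of the imports fails with ModuleNotFoundError, rerun the pip command above in the same environment you use to run Python.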

Step 1: Import Libraries

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

Step 2: Load and Explore Data

For this tutorial, we’ll use a small dataset of ten rows that pairs hours studied with scores achieved.

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)
print(df.head())
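Beyond `head()`, a few more pandas calls give a fuller picture of the data before modeling. The sketch below rebuilds the same DataFrame so it runs on its own; `missing` is a name introduced here for illustration.

```python
import pandas as pd

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)

# Number of rows and columns.
print(df.shape)

# Data type of each column; both should be numeric for regression.
print(df.dtypes)

# Summary statistics: count, mean, std, min, quartiles, max.
print(df.describe())

# Count of missing values per column; missing data must be handled before fitting.
missing = df.isnull().sum()
print(missing)
```

On a real-world dataset these checks often reveal missing values or non-numeric columns that need cleaning before training.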

Step 3: Visualize the Data

A scatter plot helps you see the relationship between hours studied and scores before fitting a model.

plt.scatter(df['Hours'], df['Scores'], color='blue')
plt.title('Hours vs Scores')
plt.xlabel('Hours Studied')
plt.ylabel('Score')
plt.show()
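To put a number on what the scatter plot suggests, you could compute the Pearson correlation between the two columns. This is an optional sketch, not part of the original steps; `r` is a name introduced here, and `Series.corr` uses the Pearson method by default.

```python
import pandas as pd

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)

# Pearson correlation coefficient: values near 1 indicate a strong
# positive linear relationship, values near 0 indicate little linear relationship.
r = df['Hours'].corr(df['Scores'])
print(f"Correlation between Hours and Scores: {r:.3f}")
```

A correlation close to 1 suggests a straight line is a reasonable model for this data.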

Step 4: Prepare Data for Training

Split your data into features (X) and labels (y), and then into training and testing sets. Setting random_state makes the split reproducible.

X = df[['Hours']]
y = df['Scores']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
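It is worth checking what the split produced. The sketch below repeats the split so it runs on its own and prints the resulting shapes. Note the double brackets in `df[['Hours']]`: scikit-learn expects the features as a two-dimensional table, while the labels can be a one-dimensional Series.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)

X = df[['Hours']]   # 2-D: DataFrame with one column
y = df['Scores']    # 1-D: Series

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# With 10 rows and test_size=0.2, 8 rows go to training and 2 to testing.
print("X shape:", X.shape)
print("Training rows:", len(X_train))
print("Testing rows:", len(X_test))
```

With only two test rows, any evaluation metric will be a rough estimate, which is one reason to try a larger dataset later.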

Step 5: Train the Linear Regression Model

Now, initialize and train your linear regression model.

model = LinearRegression()
model.fit(X_train, y_train)
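After fitting, the model stores the learned line in its `coef_` and `intercept_` attributes. The sketch below shows one way to inspect them and to predict a score for a new value; `new_hours` and the value 9.5 are hypothetical, chosen only for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)
X = df[['Hours']]
y = df['Scores']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# The fitted line is: score = slope * hours + intercept
slope = model.coef_[0]
intercept = model.intercept_
print(f"Slope: {slope:.2f} points per hour")
print(f"Intercept: {intercept:.2f}")

# Predict for a new, hypothetical input. Using a DataFrame with the same
# column name as the training data avoids a feature-name warning.
new_hours = pd.DataFrame({'Hours': [9.5]})
predicted_score = float(model.predict(new_hours)[0])
print(f"Predicted score for 9.5 hours: {predicted_score:.1f}")
```

The slope can be read as the expected change in score for each additional hour studied.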

Step 6: Evaluate the Model

Evaluate your model by generating predictions on the test set and computing the mean squared error (MSE) and R² score.

y_pred = model.predict(X_test)

print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))
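MSE is expressed in squared units, so it is common to also report its square root, the root mean squared error (RMSE), which is in the same units as the scores. The sketch below computes it with NumPy and also recomputes MSE by hand to show what the metric measures; `rmse` and `mse_manual` are names introduced here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)
X = df[['Hours']]
y = df['Scores']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# MSE: the average of the squared differences between actual and predicted.
mse = mean_squared_error(y_test, y_pred)
mse_manual = np.mean((y_test.values - y_pred) ** 2)

# RMSE: square root of MSE, in the same units as the target (score points).
rmse = np.sqrt(mse)

print("MSE:", mse)
print("MSE (manual):", mse_manual)
print("RMSE:", rmse)
print("R2 Score:", r2_score(y_test, y_pred))
```

Lower MSE and RMSE mean smaller errors, and an R² closer to 1 means the model explains more of the variation in the scores. Keep in mind that with only two test rows these numbers are rough estimates.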

Step 7: Visualize Predictions

Visualize how well your model's predictions match the actual scores in the test set.

plt.scatter(X_test, y_test, color='blue', label='Actual')
plt.plot(X_test, y_pred, color='red', linewidth=2, label='Predicted')
plt.title('Actual vs Predicted Scores')
plt.xlabel('Hours Studied')
plt.ylabel('Score')
plt.legend()
plt.show()
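As a numeric complement to the plot, you could list the residuals, the differences between actual and predicted scores. This is a sketch under the same setup as the earlier steps; the `results` table and its column names are introduced here for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)
X = df[['Hours']]
y = df['Scores']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# One row per test example: input, true score, predicted score, and error.
results = pd.DataFrame({
    'Hours': X_test['Hours'].values,
    'Actual': y_test.values,
    'Predicted': y_pred,
})
results['Residual'] = results['Actual'] - results['Predicted']
print(results)
```

Residuals close to zero indicate predictions that sit near the actual values.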

Conclusion

🎉 Congratulations! You have successfully:

✅ Loaded and visualized your dataset.
✅ Trained a linear regression model using scikit-learn.
✅ Evaluated and visualized your model’s performance.

What’s Next?

Try using a larger, real-world dataset.
Explore polynomial regression for non-linear relationships.
Read our Intermediate Tutorials to learn classification, hyperparameter tuning, and model deployment.
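As a starting point for the polynomial regression suggestion, here is a minimal sketch that assumes a degree of 2 and reuses the tutorial's dataset. It combines scikit-learn's `PolynomialFeatures` with `LinearRegression` in a pipeline; the names `linear_model` and `poly_model` and the choice of degree are illustrative assumptions, not a recommendation.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

data = {
    'Hours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Scores': [10, 20, 30, 40, 50, 55, 65, 75, 85, 95]
}
df = pd.DataFrame(data)
X = df[['Hours']]
y = df['Scores']

# Plain linear model for comparison.
linear_model = LinearRegression()
linear_model.fit(X, y)

# Polynomial model: expand Hours into [Hours, Hours^2], then fit a linear model.
poly_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
poly_model.fit(X, y)

# R² on the data used for fitting (not a held-out test set).
linear_r2 = linear_model.score(X, y)
poly_r2 = poly_model.score(X, y)
print(f"Linear R2 (training data): {linear_r2:.4f}")
print(f"Polynomial R2 (training data): {poly_r2:.4f}")
```

A higher-degree model never fits the training data worse than a straight line, so compare models on held-out data before concluding that the extra flexibility helps.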

If you have questions or want to share your results, join our SuperML Community to learn and grow together!


