Implementation of Polynomial Regression: Python Code & Data Set
Polynomial regression is a machine learning technique used to model non-linear relationships. A good real-world example is the relationship between car speed and braking distance, which follows a quadratic pattern rather than a straight line. In this guide, we will calculate polynomial regression coefficients using the least squares method, solve the equations mathematically, and implement the fit in Python with a small sample dataset. We will also visualize the results to see how braking distance increases with speed.
Scenario
Imagine you are analyzing how a car's speed affects the braking distance required to stop. The relationship is non-linear because braking distance grows roughly with the square of the speed: doubling the speed about quadruples the distance.
A linear model ($Y = b_0 + b_1 X$) won’t fit well, so we use polynomial regression (degree 2):
$$\text{Braking Distance} = b_0 + b_1 \cdot \text{Speed} + b_2 \cdot \text{Speed}^2$$
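To make the "won't fit well" claim concrete, one can fit both a degree-1 and a degree-2 polynomial to the five sample points from the table in this guide and compare the sum of squared residuals. The following is a minimal sketch using `np.polyfit`; the helper name `sum_squared_residuals` is illustrative and not part of the original walkthrough.

```python
import numpy as np

# Sample points from the table in this guide
speed = np.array([10, 20, 30, 40, 50], dtype=float)
braking_distance = np.array([5, 20, 45, 80, 125], dtype=float)

def sum_squared_residuals(degree):
    """Fit a polynomial of the given degree and return its squared error."""
    coeffs = np.polyfit(speed, braking_distance, degree)  # highest power first
    predicted = np.polyval(coeffs, speed)
    return float(np.sum((braking_distance - predicted) ** 2))

sse_linear = sum_squared_residuals(1)
sse_quadratic = sum_squared_residuals(2)

print(f"Linear fit squared error:    {sse_linear:.4f}")
print(f"Quadratic fit squared error: {sse_quadratic:.4f}")
```

On this data the straight line leaves a squared error of about 350 m², while the quadratic error is zero up to floating-point rounding.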
Sample Dataset (Speed vs Braking Distance)
| Speed (km/h) | Braking Distance (m) |
|---|---|
| 10 | 5 |
| 20 | 20 |
| 30 | 45 |
| 40 | 80 |
| 50 | 125 |
- As speed increases, the braking distance grows non-linearly (quadratically).
- We need to calculate $b_0$, $b_1$, $b_2$ to fit a quadratic curve.
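One quick way to check that the data is quadratic is to look at second differences: for equally spaced speeds, a quadratic produces a constant second difference equal to 2·b2·h², where h is the spacing. The sketch below applies that check to the table; the variable names are illustrative.

```python
import numpy as np

speed = np.array([10, 20, 30, 40, 50])
braking_distance = np.array([5, 20, 45, 80, 125])

step = speed[1] - speed[0]                     # spacing between speeds (10 km/h)
first_diff = np.diff(braking_distance)         # change between neighbouring rows
second_diff = np.diff(braking_distance, n=2)   # change of the change

# A constant second difference is the signature of a quadratic relationship
is_quadratic_like = bool(np.all(second_diff == second_diff[0]))

# For a quadratic, second difference = 2 * b2 * step**2
b2_estimate = second_diff[0] / (2 * step**2)

print("First differences: ", first_diff)
print("Second differences:", second_diff)
print("Constant second difference:", is_quadratic_like)
print("Estimated b2:", b2_estimate)
```

The estimate of b2 obtained this way is only a sanity check; it relies on the speeds being equally spaced and the data being noise-free.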
Step-by-Step Solution
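1. Create the equation:
Using $Y = b_0 + b_1 X + b_2 X^2$, we write one equation for each data point. For example, the first row (speed 10, distance 5) gives $5 = b_0 + 10 b_1 + 100 b_2$.
2. Convert into matrix form:
$$A X = B, \qquad A = \begin{bmatrix} 1 & 10 & 100 \\ 1 & 20 & 400 \\ 1 & 30 & 900 \\ 1 & 40 & 1600 \\ 1 & 50 & 2500 \end{bmatrix}, \quad X = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}, \quad B = \begin{bmatrix} 5 \\ 20 \\ 45 \\ 80 \\ 125 \end{bmatrix}$$
3. Solve for $b_0$, $b_1$, $b_2$ using matrix algebra.
The system $AX = B$ has five equations and only three unknowns, so $A$ is not square and has no ordinary inverse. Instead we take the least squares solution:
$$X = (A^T A)^{-1} A^T B$$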
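Because A has five rows and three columns, it cannot be inverted directly; one common way to carry out the matrix step by hand is through the normal equations (AᵀA)X = AᵀB. The sketch below does this with NumPy for the sample data. It is meant to illustrate the algebra; for larger or poorly conditioned problems, `np.linalg.lstsq` is usually the safer choice.

```python
import numpy as np

speed = np.array([10, 20, 30, 40, 50], dtype=float)
B = np.array([5, 20, 45, 80, 125], dtype=float)

# Design matrix A with columns [1, Speed, Speed^2]
A = np.column_stack([np.ones_like(speed), speed, speed**2])

# Normal equations: (A^T A) X = A^T B
AtA = A.T @ A
AtB = A.T @ B

# Solve the 3x3 system instead of forming an explicit inverse
X = np.linalg.solve(AtA, AtB)
b0, b1, b2 = X

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```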
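Solving the system, we get:
$b_0 \approx 0$, $b_1 \approx 0$, $b_2 \approx 0.05$
Thus, the polynomial equation for braking distance is:
$$\text{Braking Distance} = 0.05 \times \text{Speed}^2$$
This equation passes through all five data points exactly, so the residuals are zero for this dataset.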
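The fitted equation can be checked by evaluating it at each tabulated speed and comparing against the table. The sketch below does that and then evaluates the model at 60 km/h; the function name `predicted_braking_distance` is hypothetical, and the 60 km/h value is an extrapolation beyond the sample data, so it should be read with caution.

```python
import math

def predicted_braking_distance(speed_kmh):
    """Braking distance in metres from the fitted equation 0.05 * Speed^2."""
    return 0.05 * speed_kmh ** 2

# Speed (km/h) -> observed braking distance (m), from the sample table
observed = {10: 5, 20: 20, 30: 45, 40: 80, 50: 125}

# Largest absolute difference between the model and the table
max_error = max(
    abs(predicted_braking_distance(s) - d) for s, d in observed.items()
)
matches_table = all(
    math.isclose(predicted_braking_distance(s), d) for s, d in observed.items()
)

print("Model matches every table row:", matches_table)
print(f"Predicted distance at 60 km/h: {predicted_braking_distance(60):.1f} m")
```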
Python Code for Polynomial Regression: Car Speed vs Braking Distance

```python
import numpy as np
import matplotlib.pyplot as plt

# Sample dataset: Speed (X) vs Braking Distance (Y)
speed = np.array([10, 20, 30, 40, 50])
braking_distance = np.array([5, 20, 45, 80, 125])

# Create the coefficient matrix A with columns [1, Speed, Speed^2]
A = np.vstack([np.ones_like(speed), speed, speed**2]).T

# Solve for coefficients b0, b1, b2 in the least squares sense
b0, b1, b2 = np.linalg.lstsq(A, braking_distance, rcond=None)[0]

# Generate smooth curve for visualization
speed_smooth = np.linspace(5, 55, 100)
braking_smooth = b0 + b1 * speed_smooth + b2 * speed_smooth**2

# Plot the data points
plt.scatter(speed, braking_distance, color="blue", label="Data Points")

# Plot the polynomial curve
plt.plot(speed_smooth, braking_smooth, color="red", label="Polynomial Fit (Degree 2)")

# Labels and title
plt.xlabel("Speed (km/h)")
plt.ylabel("Braking Distance (m)")
plt.title("Polynomial Regression: Speed vs Braking Distance")
plt.legend()
plt.grid(True)

# Show the plot
plt.show()

# Print the equation
print(f"Polynomial Equation: Braking Distance = {b0:.2f} + {b1:.2f}*Speed + {b2:.2f}*Speed^2")
```