Linear Regression

Model the relationship between numeric features and a continuous target using one of the simplest and most widely used regression techniques.

Intuition & Equation

Linear Regression assumes the target \(y\) can be approximated as a weighted sum of the input features plus an intercept:

\[ \hat{y} = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n \]

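Weights (coefficients) \(w_1, \dots, w_n\) measure how much the prediction changes for a one-unit increase in the corresponding feature, with the other features held fixed.
Intercept \(w_0\) is the prediction when all features are zero.
We choose the weights to minimize a loss function, usually Mean Squared Error (MSE): the average of the squared differences between \(y\) and \(\hat{y}\).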
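To make the equation and the loss concrete, here is a minimal NumPy sketch. The toy data and the "true" weights are invented for illustration, and solving with `np.linalg.lstsq` is just one way to find the MSE-minimizing weights, not necessarily what any particular library does internally.

```python
import numpy as np

# Hypothetical toy data: 5 samples, 2 features (values chosen for illustration).
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0],
              [5.0, 5.0]])

# Assumed "true" intercept and weights used to generate a noise-free target.
true_w0, true_w = 1.0, np.array([2.0, -1.0])
y = true_w0 + X @ true_w

# Prepend a column of ones so the intercept w_0 is fitted like any other weight.
X_aug = np.column_stack([np.ones(len(X)), X])

# Least-squares solution: the weights that minimize the squared error.
w_hat, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

# Predictions y_hat = w_0 + w_1 x_1 + w_2 x_2, and the resulting MSE.
y_hat = X_aug @ w_hat
mse = np.mean((y - y_hat) ** 2)

print("Estimated [w_0, w_1, w_2]:", w_hat)
print("MSE:", mse)
```

Because the target here contains no noise, the fitted weights should match the assumed ones up to floating-point error and the MSE should be essentially zero.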

Linear Regression with scikit-learn

Training a Linear Regression model

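import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# X (feature matrix, shape n_samples x n_features) and y (continuous target)
# are assumed to be loaded already.

# Hold out 20% of the data for evaluation; fix the seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit ordinary least squares on the training split.
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out split. RMSE is in the same units as the target.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("RMSE:", rmse)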
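The snippet above expects `X` and `y` to exist already. As a self-contained variant you can run as-is, the sketch below substitutes synthetic data from `make_regression` (an assumption for illustration, not a dataset from this tutorial) and compares the model's RMSE with a naive baseline that always predicts the training mean.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (assumed: 200 samples, 3 features).
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# RMSE of the fitted model on the held-out split.
rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))

# Naive baseline: always predict the mean of the training targets.
baseline_pred = np.full_like(y_test, y_train.mean())
baseline_rmse = np.sqrt(mean_squared_error(y_test, baseline_pred))

print("Model RMSE:   ", rmse)
print("Baseline RMSE:", baseline_rmse)
```

Comparing against a trivial baseline like this is a quick sanity check: an RMSE on its own is hard to judge without knowing the scale of the target.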

When to Use Linear Regression

Target is continuous (price, temperature, revenue).
Relationship is approximately linear or can be made linear by feature engineering.
You need a simple, fast and interpretable baseline model.
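As an illustration of making a relationship linear through feature engineering, the sketch below fits a quadratic target twice: once on the raw feature and once after adding a squared term with `PolynomialFeatures`. The data are synthetic and the choice of degree 2 is an assumption matched to how the data were generated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumed for illustration): y depends on x squared, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x[:, 0] ** 2 + 1.0 + rng.normal(scale=0.1, size=200)

# Plain linear regression on the raw feature cannot capture the curve.
plain = LinearRegression().fit(x, y)

# Adding x^2 as a feature makes the target linear in the engineered features.
poly = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(x, y)

# R^2 on the training data, used here only to contrast the two fits.
r2_plain = plain.score(x, y)
r2_poly = poly.score(x, y)

print("R^2, raw feature:       ", r2_plain)
print("R^2, with squared term: ", r2_poly)
```

The model is still linear in its weights; only the inputs have changed. That is why the coefficients remain directly interpretable, now in terms of the engineered features.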
