Logistic Regression

Logistic Regression predicts class probabilities for binary or multiclass classification problems. It is a linear model: the decision boundary between classes is linear in feature space.

Sigmoid Function & Probabilities

Instead of predicting an unbounded continuous value as Linear Regression does, Logistic Regression passes a linear score through the sigmoid function to produce a probability between 0 and 1:

\[ \sigma(z) = \frac{1}{1 + e^{-z}} \]

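where \( z = w_0 + w_1 x_1 + \dots + w_n x_n \) is a linear combination of the features. The output \( \sigma(z) \) is read as the probability of the positive class; applying a threshold (usually 0.5) converts that probability into a class label.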
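To make the formula concrete, here is a minimal NumPy sketch that computes the linear score, applies the sigmoid, and thresholds the result. The weights and the sample are hypothetical values chosen only for illustration; they do not come from any fitted model.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical weights and a single sample, chosen only for illustration.
w0 = -1.0                    # intercept w_0
w = np.array([2.0, -0.5])    # weights w_1, w_2
x = np.array([1.5, 2.0])     # features x_1, x_2

z = w0 + w @ x               # linear score: -1 + 3 - 1 = 1.0
p = sigmoid(z)               # probability of the positive class
label = int(p >= 0.5)        # threshold at 0.5

print(f"z={z:.2f}, p={p:.3f}, label={label}")  # z=1.00, p=0.731, label=1
```

Because the sigmoid equals 0.5 exactly at z = 0, thresholding the probability at 0.5 is the same as checking the sign of z, which is why the decision boundary is linear in the features.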

Logistic Regression with scikit-learn

Binary classification example

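from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# X (feature matrix) and y (binary labels) are assumed to be loaded already.
# Hold out 20% for testing; stratify=y keeps the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# max_iter=1000 gives the solver more iterations than the default of 100.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)             # hard class labels
probs = clf.predict_proba(X_test)[:, 1]  # probability of the positive class (second column)

# Per-class precision, recall, and F1 on the held-out set.
print(classification_report(y_test, y_pred))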
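The link between the fitted model and the sigmoid formula can be checked directly. The sketch below is a self-contained illustration, not part of the original example: it assumes a synthetic dataset from make_classification, and the alternative threshold of 0.3 is an arbitrary value. It rebuilds the probabilities from coef_ and intercept_, then applies the custom threshold in place of the default 0.5.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary data, used here only so the sketch runs on its own.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Linear score z = w_0 + w . x, computed from the fitted parameters.
z = X_test @ clf.coef_.ravel() + clf.intercept_[0]
manual_probs = 1.0 / (1.0 + np.exp(-z))

# Probabilities reported by scikit-learn for the positive class.
probs = clf.predict_proba(X_test)[:, 1]
print(np.allclose(manual_probs, probs))

# A custom threshold (0.3 is an arbitrary illustrative value).
y_pred_low = (probs >= 0.3).astype(int)
print(y_pred_low.sum(), clf.predict(X_test).sum())
```

Lowering the threshold can only add positive predictions, so it trades precision for recall. That may be worth it when missing a positive case costs more than a false alarm.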

Multiclass Logistic Regression

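scikit-learn supports multiclass Logistic Regression using either a one-vs-rest (ovr) strategy, which fits one binary classifier per class, or a multinomial (softmax) strategy, which fits a single model over all classes. The multi_class parameter selects between them. It is deprecated as of scikit-learn 1.5, where solvers such as lbfgs already use the multinomial strategy by default for three or more classes, so on recent versions it can simply be omitted:

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(
    multi_class="multinomial",  # softmax over all classes (deprecated in 1.5+)
    solver="lbfgs",             # a solver that supports the multinomial loss
    max_iter=1000,              # more iterations than the default of 100
)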
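As an end-to-end illustration of the multiclass case, the following sketch fits a three-class model and inspects its outputs. The Iris dataset is an assumption made for this example and does not appear in the original text. The multi_class argument is left out on purpose, relying on the multinomial behavior that recent scikit-learn versions apply by default with lbfgs.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Iris has three classes; it is used here only as a convenient example.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# No multi_class argument: recent versions fit a multinomial model by default.
clf = LogisticRegression(solver="lbfgs", max_iter=1000)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test)  # one column per class
y_pred = clf.predict(X_test)       # class with the highest probability

print(probs.shape)                          # (30, 3)
print(np.allclose(probs.sum(axis=1), 1.0))  # each row sums to 1
print(np.array_equal(y_pred, clf.classes_[probs.argmax(axis=1)]))
```

In the multiclass case no threshold is applied: the predicted label is the class whose column in predict_proba holds the largest value.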
