Logistic Regression
Use Logistic Regression to predict probabilities for binary or multiclass classification problems using a linear decision boundary in feature space.
Sigmoid Function & Probabilities
Instead of predicting a continuous value, Logistic Regression predicts a probability between 0 and 1 using the sigmoid function:
\[ \sigma(z) = \frac{1}{1 + e^{-z}} \]
where \( z = w_0 + w_1 x_1 + \dots + w_n x_n \). We then apply a threshold (usually 0.5) to convert the probability into a class label.
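The formula and thresholding step above can be sketched in a few lines of plain Python. This is a minimal illustration, not how scikit-learn implements it; the helper names (`sigmoid`, `predict_proba`, `predict_label`) and the weights are hypothetical values chosen so that \( z = 0 \).

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: maps any real z into the interval (0, 1)."""
    # Branch on the sign of z so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Probability of the positive class: sigmoid(w_0 + w_1*x_1 + ... + w_n*x_n)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


def predict_label(weights, bias, x, threshold=0.5):
    """Convert the probability into a 0/1 label using the given threshold."""
    return int(predict_proba(weights, bias, x) >= threshold)


# Hypothetical parameters, chosen for illustration only.
w0 = -1.0          # intercept w_0
w = [2.0, -0.5]    # weights w_1, w_2
x = [1.0, 2.0]     # one sample with two features

# z = -1.0 + 2.0*1.0 + (-0.5)*2.0 = 0.0, and sigmoid(0) = 0.5
p = predict_proba(w, w0, x)
label = predict_label(w, w0, x)
print(p, label)
```

A sample with \( z = 0 \) lies exactly on the decision boundary, so its probability is 0.5. Raising the threshold makes the classifier more conservative about predicting the positive class.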
Logistic Regression with scikit-learn
Binary classification example
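from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# X (feature matrix) and y (binary labels) are assumed to be defined already.
# Hold out 20% for testing; stratify=y keeps the class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# max_iter is raised above the default of 100 to give the solver room to converge.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)              # hard class labels
probs = clf.predict_proba(X_test)[:, 1]   # probability of the class in clf.classes_[1]

print(classification_report(y_test, y_pred))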
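The snippet above leaves `X` and `y` undefined. As a self-contained sketch, the version below substitutes a synthetic dataset from `make_classification` (an assumption, not the data used in this tutorial) and shows how the probabilities from `predict_proba` relate to the labels from `predict`. The 0.8 cutoff is an arbitrary illustrative value.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary dataset (assumption: stands in for the tutorial's X and y).
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Column 1 holds the probability of clf.classes_[1], which is label 1 here.
probs = clf.predict_proba(X_test)[:, 1]
default_pred = clf.predict(X_test)

# For a binary problem, predict() corresponds to thresholding at 0.5
# (apart from ties at exactly 0.5).
manual_pred = (probs > 0.5).astype(int)

# A stricter cutoff predicts the positive class less often.
strict_pred = (probs >= 0.8).astype(int)

print("agreement with predict():", np.mean(manual_pred == default_pred))
print("positives at 0.5:", manual_pred.sum(), "positives at 0.8:", strict_pred.sum())
```

Choosing a threshold other than 0.5 is a way to trade precision against recall without retraining the model.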
Multiclass Logistic Regression
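scikit-learn supports multiclass Logistic Regression using either a one-vs-rest (ovr) strategy, which fits one binary classifier per class, or a multinomial (softmax) strategy, which fits a single model over all classes:
# Note: the multi_class parameter is deprecated as of scikit-learn 1.5.
# With solver="lbfgs", multinomial is already the default for 3+ classes.
clf = LogisticRegression(
    multi_class="multinomial",
    solver="lbfgs",
    max_iter=1000
)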
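To compare the two strategies without relying on the `multi_class` argument, the sketch below fits both on the bundled Iris dataset (an assumption, chosen only because it has three classes). It relies on the default behaviour of `LogisticRegression` with `solver="lbfgs"` for the multinomial model and wraps the estimator in `OneVsRestClassifier` for one-vs-rest.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Three-class dataset bundled with scikit-learn (illustrative choice).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Multinomial (softmax): the default for lbfgs when there are more than two classes.
softmax_clf = LogisticRegression(solver="lbfgs", max_iter=1000)
softmax_clf.fit(X_train, y_train)

# One-vs-rest: one binary logistic regression per class.
ovr_clf = OneVsRestClassifier(LogisticRegression(solver="lbfgs", max_iter=1000))
ovr_clf.fit(X_train, y_train)

# predict_proba returns one column per class; each row sums to 1.
softmax_probs = softmax_clf.predict_proba(X_test)
ovr_probs = ovr_clf.predict_proba(X_test)

print("multinomial accuracy:", softmax_clf.score(X_test, y_test))
print("one-vs-rest accuracy:", ovr_clf.score(X_test, y_test))
print("probability matrix shape:", softmax_probs.shape)
```

The predicted label for each row is the class with the highest probability, so no explicit threshold is needed in the multiclass case.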