Decision Trees

Learn how decision trees partition the feature space into regions by asking a sequence of yes/no questions, and how to use them for classification and regression in Python with scikit-learn.

What is a Decision Tree?

A decision tree predicts a target by asking a sequence of questions about the features. Each internal node checks a condition (e.g., feature < threshold), and each leaf node outputs a prediction.

  • Classification trees: predict categories.
  • Regression trees: predict numerical values.
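To make the node-and-leaf picture concrete, here is a minimal hand-written sketch of a tree as nested objects. The `Node` class, the `predict_one` helper, the thresholds, and the labels are all made up for illustration; this is not how scikit-learn stores its trees internally.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    # Internal node: asks "is x[feature] < threshold?"
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # followed when the answer is yes
    right: Optional["Node"] = None   # followed when the answer is no
    # Leaf node: holds the prediction (a class label or a number)
    prediction: Union[str, float, None] = None


def predict_one(node, x):
    """Walk from the root to a leaf, answering one question per node."""
    while node.prediction is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.prediction


# Hypothetical two-question tree over a two-feature input
toy_tree = Node(
    feature=0, threshold=2.5,
    left=Node(prediction="A"),
    right=Node(
        feature=1, threshold=1.0,
        left=Node(prediction="B"),
        right=Node(prediction="C"),
    ),
)

print(predict_one(toy_tree, [1.0, 5.0]))  # first question answered yes -> "A"
print(predict_one(toy_tree, [3.0, 0.5]))  # no, then yes -> "B"
print(predict_one(toy_tree, [3.0, 2.0]))  # no, then no -> "C"
```

A regression tree would have the same shape; only the leaf values would be numbers instead of labels.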

Example: Classification Tree

DecisionTreeClassifier on Iris Dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Create decision tree
tree_clf = DecisionTreeClassifier(
    max_depth=3,      # limit depth to reduce overfitting
    random_state=42
)

tree_clf.fit(X_train, y_train)
y_pred = tree_clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nReport:\n", classification_report(y_test, y_pred, target_names=iris.target_names))
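
One way to see which questions a fitted tree actually asks is to print its rules as text. The sketch below refits the same classifier so that it runs on its own, then uses `export_text` from `sklearn.tree`. The exact features and thresholds depend on the train/test split, so no output is shown here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)

tree_clf = DecisionTreeClassifier(max_depth=3, random_state=42)
tree_clf.fit(X_train, y_train)

# Each indented line is one question; lines starting with "class:" are leaves
rules = export_text(tree_clf, feature_names=list(iris.feature_names))
print(rules)

# max_depth=3 caps the tree at 3 levels of questions (at most 8 leaves)
print("Depth:", tree_clf.get_depth(), "Leaves:", tree_clf.get_n_leaves())
```

The printed rules use `<=` comparisons, which is how scikit-learn expresses each split.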

Example: Regression Tree

Predict House Price
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import numpy as np

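# Toy data: one feature per house (presumably its size) and its price.
# The source does not state units. X must be 2-D: (n_samples, n_features).
X = np.array([[500], [750], [1000], [1250], [1500], [1750], [2000]])
y = np.array([100, 150, 200, 250, 300, 320, 350])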

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

tree_reg = DecisionTreeRegressor(
    max_depth=3,
    random_state=42
)

tree_reg.fit(X_train, y_train)
y_pred = tree_reg.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
print("MSE:", mse)
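
A regression tree predicts one constant per leaf. With the default squared-error criterion, that constant is the mean of the training targets that fall into the leaf. The sketch below checks this on the same toy data. Fitting on all seven points and using `max_depth=1` (a single split) are choices made here to keep the illustration small.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[500], [750], [1000], [1250], [1500], [1750], [2000]])
y = np.array([100, 150, 200, 250, 300, 320, 350])

# A depth-1 tree asks a single question, so it has exactly two leaves
stump = DecisionTreeRegressor(max_depth=1, random_state=42)
stump.fit(X, y)

preds = stump.predict(X)
leaf_ids = stump.apply(X)  # index of the leaf each sample lands in

# Compare each leaf's prediction with the mean target of its samples
for leaf in np.unique(leaf_ids):
    in_leaf = leaf_ids == leaf
    print(
        "leaf", leaf,
        "| mean target:", y[in_leaf].mean(),
        "| prediction:", preds[in_leaf][0],
    )
```

Because every sample in a leaf gets the same value, the predictions form a step function over the feature. Deeper trees add more, smaller steps.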