Decision Trees
Learn how decision trees partition the feature space into regions by asking a sequence of yes/no questions, and how to use them for classification and regression in Python.
What is a Decision Tree?
A decision tree predicts a target by asking a sequence of questions about the features.
Each internal node tests a condition (e.g., feature < threshold) and sends the sample down the matching branch,
and each leaf node outputs the prediction for every sample that reaches it.
- Classification trees: predict categories.
- Regression trees: predict numerical values.
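To make the node-and-leaf idea concrete, here is a minimal hand-built tree, written as a sketch rather than how any library stores its trees. The `Node` class, the `predict` helper, the thresholds, and the labels "small", "medium", and "large" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a toy decision tree (illustrative, not a library type)."""
    feature: Optional[int] = None       # index of the feature tested at an internal node
    threshold: Optional[float] = None   # split point for that feature
    left: Optional["Node"] = None       # branch taken when x[feature] < threshold
    right: Optional["Node"] = None      # branch taken otherwise
    prediction: Optional[str] = None    # set only on leaf nodes


def predict(node: Node, x: Sequence[float]) -> str:
    """Walk from the root to a leaf, answering one question per internal node."""
    while node.prediction is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.prediction


# Hypothetical tree: the thresholds are made up, not learned from data.
toy_tree = Node(
    feature=0, threshold=2.5,
    left=Node(prediction="small"),
    right=Node(
        feature=1, threshold=1.0,
        left=Node(prediction="medium"),
        right=Node(prediction="large"),
    ),
)

print(predict(toy_tree, [1.0, 0.2]))
print(predict(toy_tree, [3.0, 0.5]))
print(predict(toy_tree, [3.0, 2.0]))
```

A learned tree has the same shape; training only decides which feature and threshold each internal node uses and what each leaf predicts.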
Example: Classification Tree
DecisionTreeClassifier on Iris Dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris data: 4 numeric features, 3 classes
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 30% for testing; stratify keeps class proportions equal in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Create decision tree
tree_clf = DecisionTreeClassifier(
    max_depth=3,  # limit depth to reduce overfitting
    random_state=42
)
tree_clf.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = tree_clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
# Concatenate so the report's first line is not shifted by print's separator
print("\nReport:\n" + classification_report(y_test, y_pred, target_names=iris.target_names))
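To see which questions a fitted classifier actually asks, the tree can be printed as plain text. The sketch below refits the same model on the same split so that it runs on its own; `export_text`, `get_depth`, and `get_n_leaves` are part of scikit-learn, while the variable name `rules` is only illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)

tree_clf = DecisionTreeClassifier(max_depth=3, random_state=42)
tree_clf.fit(X_train, y_train)

# Render the learned splits as indented text, one line per node
rules = export_text(tree_clf, feature_names=iris.feature_names)
print(rules)

# Basic size statistics of the fitted tree
print("Depth:", tree_clf.get_depth())
print("Leaves:", tree_clf.get_n_leaves())
```

Each printed line is either a test of the form `feature <= threshold` (scikit-learn uses `<=` rather than `<`) or a leaf showing the predicted class index.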
Example: Regression Tree
Predict House Price
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import numpy as np

# Toy data: one feature per house (X) and its price (y)
X = np.array([[500], [750], [1000], [1250], [1500], [1750], [2000]])
y = np.array([100, 150, 200, 250, 300, 320, 350])

# With 7 samples, test_size=0.3 leaves 4 for training and 3 for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Create regression tree
tree_reg = DecisionTreeRegressor(
    max_depth=3,
    random_state=42
)
tree_reg.fit(X_train, y_train)

# Evaluate on the held-out samples
y_pred = tree_reg.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("MSE:", mse)
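A regression tree does not fit a line through the data. With the default squared-error criterion, each leaf predicts the mean target of the training samples that fall into it, so the prediction is a step function of the input. The sketch below illustrates this on the same toy data. Fitting on all seven points, `max_depth=2`, and the evaluation grid are assumptions made here to keep the output small.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[500], [750], [1000], [1250], [1500], [1750], [2000]])
y = np.array([100, 150, 200, 250, 300, 320, 350])

# Shallow tree fitted on all points (illustrative choice, no train/test split)
reg = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)

# Predict on a dense grid of inputs between the smallest and largest feature value
grid = np.linspace(500, 2000, 61).reshape(-1, 1)
grid_pred = reg.predict(grid)

# Only a handful of distinct values appear: at most one per leaf
distinct = np.unique(grid_pred)
print("Number of leaves:", reg.get_n_leaves())
print("Distinct predictions:", distinct)

# Each leaf's prediction equals the mean target of its training samples
leaf_ids = reg.apply(X)
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    print("leaf", leaf, "mean target:", y[mask].mean())
```

Because predictions are constant within each leaf, a tree cannot extrapolate beyond the range of targets seen during training.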