Machine Learning Anomaly Detection
Fraud & Intrusion

Anomaly Detection

Anomaly detection identifies rare observations that deviate significantly from the majority of the data, such as fraudulent transactions, network intrusions, or faulty sensor readings.

Real-World Use Cases

  • Credit card fraud detection.
  • Network intrusion detection.
  • Industrial equipment fault monitoring.
  • Medical anomaly detection (rare diseases, unusual lab results).

Isolation Forest

Isolation Forest isolates anomalies by recursively partitioning the feature space with random splits. Anomalies are few and lie far from dense regions, so they take fewer splits to isolate and therefore have shorter average path lengths across the trees.
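To make the path-length idea concrete, here is a minimal NumPy sketch of random partitioning. It is not scikit-learn's implementation: the helper names `isolation_path_length` and `mean_path_length`, the synthetic data, and the depth cap are all assumptions made for illustration.

```python
import numpy as np

def isolation_path_length(X, x, rng, max_depth=50):
    """Count the random splits needed to isolate point x from the rows of X.

    x is assumed to be one of the rows of X.
    """
    depth = 0
    while len(X) > 1 and depth < max_depth:
        feature = rng.integers(X.shape[1])           # pick a random feature
        lo, hi = X[:, feature].min(), X[:, feature].max()
        if lo == hi:                                 # cannot split on this feature
            break
        split = rng.uniform(lo, hi)                  # pick a random split value
        # Keep only the points that fall on the same side of the split as x.
        if x[feature] < split:
            X = X[X[:, feature] < split]
        else:
            X = X[X[:, feature] >= split]
        depth += 1
    return depth

def mean_path_length(X, x, n_trees=200, seed=1):
    """Average path length of x over many independent random partitionings."""
    rng = np.random.default_rng(seed)
    return float(np.mean([isolation_path_length(X, x, rng) for _ in range(n_trees)]))

# Synthetic data: a dense Gaussian cloud plus one far-away point.
rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(256, 2))
X = np.vstack([normal, [[6.0, 6.0]]])

inlier = normal[np.argmin(np.linalg.norm(normal, axis=1))]  # point closest to the centre
outlier = X[-1]

inlier_depth = mean_path_length(X, inlier)
outlier_depth = mean_path_length(X, outlier)
print(f"inlier: {inlier_depth:.1f} splits, outlier: {outlier_depth:.1f} splits")
```

The far-away point should need noticeably fewer splits on average than the central one, which is the signal Isolation Forest turns into an anomaly score.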

IsolationForest with scikit-learn
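from sklearn.ensemble import IsolationForest

# X_train and X_test are assumed to be numeric arrays of shape (n_samples, n_features).
iso = IsolationForest(
    n_estimators=200,    # number of isolation trees
    contamination=0.02,  # expected fraction of anomalies; sets the decision threshold
    random_state=42,     # fixed seed for reproducible trees
)
iso.fit(X_train)

# decision_function: lower is more anomalous; negative values fall below the threshold.
scores = iso.decision_function(X_test)
labels = iso.predict(X_test)  # -1 = anomaly, 1 = normal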
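The snippet above leaves X_train and X_test undefined. A self-contained variant on synthetic data (the Gaussian training set and the two hand-placed outliers are assumptions for illustration) shows how the scores, the labels, and contamination relate:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-ins for real data: a Gaussian cloud for training,
# and a test set of five typical points followed by two obvious outliers.
rng = np.random.default_rng(0)
X_train = rng.normal(0, 1, size=(1000, 2))
X_test = np.vstack([rng.normal(0, 1, size=(5, 2)), [[7.0, 7.0], [-8.0, 6.0]]])

iso = IsolationForest(n_estimators=200, contamination=0.02, random_state=42)
iso.fit(X_train)

scores = iso.decision_function(X_test)  # lower = more anomalous
labels = iso.predict(X_test)            # -1 = anomaly, 1 = normal

# predict() is a thresholded decision_function(): a negative score maps to -1.
same = np.array_equal(labels == -1, scores < 0)

# contamination places the threshold so that roughly this fraction of the
# training data is flagged.
train_flagged = float(np.mean(iso.predict(X_train) == -1))

print(labels, same, train_flagged)
```

Because contamination only moves the threshold, ranking points by decision_function is often more useful than the hard labels when the true anomaly rate is unknown.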

One-Class SVM

One‑Class SVM learns a decision boundary around the "normal" class from training data that is assumed to be mostly normal, and flags points that lie outside this region as anomalies.

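from sklearn.svm import OneClassSVM

# X_train_normal is assumed to contain only (or almost only) normal samples.
# nu upper-bounds the fraction of training points treated as outliers.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
ocsvm.fit(X_train_normal)

pred = ocsvm.predict(X_test)  # -1 = anomaly, 1 = normal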
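RBF kernels are sensitive to feature scale, so a common practice is to standardize features before fitting. The following is a sketch on synthetic data, not part of the original snippet: the two features and their scales are hypothetical, and the pipeline is one reasonable arrangement rather than the only one.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical "normal" data with very different scales per feature:
# feature 0 is in the thousands, feature 1 is a ratio around 0.5.
X_train_normal = np.column_stack([
    rng.normal(5000, 800, size=500),
    rng.normal(0.5, 0.1, size=500),
])

X_test = np.array([
    [5100.0, 0.52],   # typical on both features
    [5000.0, 1.50],   # unusual only on the small-scale feature
    [12000.0, 0.50],  # unusual only on the large-scale feature
])

# Standardizing first lets both features contribute to the RBF distance.
model = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
model.fit(X_train_normal)

pred = model.predict(X_test)  # -1 = anomaly, 1 = normal
train_outlier_rate = float(np.mean(model.predict(X_train_normal) == -1))
print(pred, train_outlier_rate)
```

The training outlier rate should land near nu, which is a quick sanity check that the boundary is neither too tight nor too loose.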

Evaluating Anomaly Detectors

Evaluation is tricky because anomalies are rare and labels may be incomplete.

  • Use precision‑recall curves instead of accuracy for highly imbalanced data: with 2% anomalies, a detector that flags nothing is already 98% accurate.
  • Work closely with domain experts to validate flagged anomalies.
  • Consider cost‑sensitive metrics (false negatives are often more expensive than false positives).
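When at least some labels are available, the first and last points above can be sketched as follows. The synthetic data, the 50:1 cost ratio, and the helper name `total_cost` are assumptions for illustration; real costs should come from the domain.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(0)

# Synthetic labelled data: 2000 normal points and 40 anomalies on a distant ring.
X_normal = rng.normal(0, 1, size=(2000, 2))
radius = rng.uniform(5, 8, size=40)
angle = rng.uniform(0, 2 * np.pi, size=40)
X_anom = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])

X = np.vstack([X_normal, X_anom])
y_true = np.r_[np.zeros(2000, dtype=int), np.ones(40, dtype=int)]  # 1 = anomaly

iso = IsolationForest(n_estimators=200, random_state=42).fit(X)
anomaly_score = -iso.score_samples(X)  # negate so that higher = more anomalous

# Precision-recall view: average precision summarizes the curve and, unlike
# accuracy, is not inflated by the large normal class.
ap = average_precision_score(y_true, anomaly_score)
precision, recall, thresholds = precision_recall_curve(y_true, anomaly_score)

# Cost-sensitive view: hypothetical costs where a missed anomaly is 50x
# more expensive than a false alarm.
COST_FN, COST_FP = 50.0, 1.0

def total_cost(threshold):
    flagged = anomaly_score >= threshold
    false_negatives = np.sum((y_true == 1) & ~flagged)
    false_positives = np.sum((y_true == 0) & flagged)
    return COST_FN * false_negatives + COST_FP * false_positives

costs = np.array([total_cost(t) for t in thresholds])
best_threshold = thresholds[np.argmin(costs)]
cost_flag_nothing = COST_FN * y_true.sum()

print(f"average precision: {ap:.3f}")
print(f"lowest cost {costs.min():.0f} at threshold {best_threshold:.3f}")
```

Choosing the threshold by cost rather than by a fixed contamination value makes the trade-off between missed anomalies and false alarms explicit.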