Data Visualization

This section covers Matplotlib and Seaborn visualizations for exploratory data analysis (EDA) in ML.

Data Visualization for ML

Matplotlib Basics

Simple line plot
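import matplotlib.pyplot as plt

# Sample data for illustration; replace with your own values
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

plt.plot(x, y, marker="o")  # line plot with a circular marker at each point
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Simple Plot")
plt.grid(True)  # gridlines make values easier to read off
plt.show()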

Seaborn for Statistical Plots

Distribution and relationship plots
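import matplotlib.pyplot as plt
import seaborn as sns

# df is assumed to be a pandas DataFrame with a numeric "feature" column and a "target" column
sns.histplot(df["feature"], kde=True)  # distribution of one feature with a KDE overlay
sns.pairplot(df, hue="target")  # pairwise relationships, colored by target (creates its own figure)
plt.figure()  # new figure so the heatmap does not draw into the pair plot grid
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm")  # correlations of numeric columns
plt.show()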

Best Practices

  • Always label axes and add clear titles.
  • Use consistent colors across related plots.
  • Prefer simple visuals (bar/line/scatter) over overly complex charts.
  • For ML, focus on plots that reveal data leakage, class imbalance, and non-linear relationships.
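
As one illustration of the last point, a bar chart of class counts is often enough to spot class imbalance before training. The sketch below is a minimal example, not a prescribed method: the helper name plot_class_balance and the 90/10 label array are hypothetical and stand in for a real target column.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_class_balance(labels, ax=None):
    """Draw a bar chart of class counts and return (ax, counts_by_class)."""
    classes, counts = np.unique(labels, return_counts=True)
    if ax is None:
        _, ax = plt.subplots()
    # Labeled axes and a clear title, as recommended in the list above
    ax.bar([str(c) for c in classes], counts, color="tab:blue")
    ax.set_xlabel("Class")
    ax.set_ylabel("Number of samples")
    ax.set_title("Class balance")
    return ax, dict(zip(classes.tolist(), counts.tolist()))


# Hypothetical labels: 90 negatives and 10 positives
y = np.array([0] * 90 + [1] * 10)
ax, counts = plot_class_balance(y)
```

Call plt.show() afterwards to display the figure. Passing an existing Axes through the ax parameter makes it easy to place this chart next to related plots with consistent styling.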

ML-Specific Visualizations

  • Plot learning curves (training and validation scores against the number of training samples) to diagnose bias and variance problems.
  • Use confusion matrices and ROC/PR curves to understand classification performance.
  • Plot feature importances for tree-based models, and use partial dependence plots to see how predictions change with individual features.
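
The learning-curve item can be sketched with scikit-learn's learning_curve function. This is a minimal example under stated assumptions: the synthetic dataset from make_classification, the LogisticRegression model, and accuracy as the score are all placeholders for your own data, estimator, and metric.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic data stands in for a real dataset (assumption for illustration)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Scores come back with one row per training size and one column per CV fold
train_sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000),
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="accuracy",
)

# Average over the cross-validation folds
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

fig, ax = plt.subplots()
ax.plot(train_sizes, train_mean, marker="o", label="Training score")
ax.plot(train_sizes, val_mean, marker="o", label="Validation score")
ax.set_xlabel("Number of training samples")
ax.set_ylabel("Accuracy")
ax.set_title("Learning curve")
ax.legend()
ax.grid(True)
```

As a rough reading guide: a persistent gap between a high training score and a lower validation score suggests high variance (overfitting), while two curves that converge at a low score suggest high bias (underfitting). Call plt.show() to display the figure.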