Matplotlib and Seaborn visualization for exploratory data analysis in ML.
Data Visualization for ML
Matplotlib Basics
Simple line plot
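import matplotlib.pyplot as plt
# Example data (any two equal-length sequences work)
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]
plt.plot(x, y, marker="o")  # line with a marker at each data point
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Simple Plot")
plt.grid(True)
plt.show()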
Seaborn for Statistical Plots
Distribution and relationship plots
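import matplotlib.pyplot as plt
import seaborn as sns
# df is assumed to be a pandas DataFrame with a numeric "feature" column and a "target" column
sns.histplot(df["feature"], kde=True)  # distribution of a single feature
sns.pairplot(df, hue="target")  # pairwise relationships; creates its own figure
plt.figure()  # start a new figure so the heatmap does not draw over an earlier plot
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm")  # numeric columns only
plt.show()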
Best Practices
- Always label axes and add clear titles.
- Use consistent colors across related plots.
- Prefer simple visuals (bar/line/scatter) over overly complex charts.
- For ML, focus on plots that reveal data leakage, class imbalance, and non-linear relationships.
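As a rough illustration of the labeling and class-imbalance points above, the sketch below draws a bar chart of class counts with labeled axes and a title. The helper name `plot_class_balance` and the synthetic labels are hypothetical, introduced only for this example; `Axes.bar_label` assumes Matplotlib 3.4 or newer.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_class_balance(labels, ax=None):
    """Draw a bar chart of class counts; return the axes and the counts.

    Hypothetical helper for illustration, not part of any library.
    """
    classes, counts = np.unique(labels, return_counts=True)
    if ax is None:
        _, ax = plt.subplots()
    # One color for all bars keeps the plot consistent with related figures.
    bars = ax.bar([str(c) for c in classes], counts, color="tab:blue")
    ax.bar_label(bars)  # print the count above each bar (Matplotlib >= 3.4)
    ax.set_xlabel("Class")
    ax.set_ylabel("Number of samples")
    ax.set_title("Class distribution")
    return ax, dict(zip(classes.tolist(), counts.tolist()))


# Made-up imbalanced labels: 90 negatives and 10 positives.
labels = np.array([0] * 90 + [1] * 10)
ax, counts = plot_class_balance(labels)
# Call plt.show() to display the figure in an interactive session.
```

A plot like this makes a 9:1 imbalance visible before any model is trained, which is usually a cue to look at stratified splits and metrics other than accuracy.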
ML-Specific Visualizations
- Plot learning curves (training and validation scores against the number of training samples) to diagnose high bias (underfitting) or high variance (overfitting).
- Use confusion matrices and ROC/PR curves to understand classification performance.
- Plot feature importances and partial dependence plots for tree‑based models.