Seaborn for Data Visualization

Learn how to use Seaborn to create attractive statistical visualizations in Python, including scatter plots, histograms, boxplots, and heatmaps, through short, commented examples.

What is Seaborn?

Seaborn is a high-level data visualization library built on top of Matplotlib. It provides a simple interface to create attractive and informative statistical graphics, especially when working with Pandas DataFrames.
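
Because Seaborn sits on top of Matplotlib, its plotting functions hand back ordinary Matplotlib objects. The sketch below is a minimal illustration of that relationship. It uses a small made-up DataFrame (the column names x and y are placeholders, not part of any real dataset) and assumes a script run without a display, hence the Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, made up purely for illustration
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2.0, 4.1, 5.9, 8.2]})

# Axes-level Seaborn functions return the Matplotlib Axes they drew on
ax = sns.scatterplot(data=df, x="x", y="y")

# Since `ax` is a regular Matplotlib object, the usual Axes methods apply
ax.set_title("Seaborn plot, Matplotlib title")
ax.set_xlabel("x value")

is_matplotlib_axes = isinstance(ax, matplotlib.axes.Axes)
print(is_matplotlib_axes)

plt.close("all")  # release the figure
```

In practice this means any Matplotlib customization you already know (titles, axis limits, annotations) can be layered on top of a Seaborn plot.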

Key Advantages of Seaborn

  • Beautiful default styles with minimal code.
  • Deep integration with Pandas DataFrames.
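  • Example datasets for quick experimentation, available through sns.load_dataset() (downloaded on first use, so an internet connection is needed).
  • Simple parameters such as hue, style, and size for grouping by color and marker, plus built-in statistical summaries.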
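
As a rough illustration of the DataFrame integration and statistical summaries listed above, the sketch below passes a DataFrame straight to sns.barplot, which by default draws one bar per category at the mean of the numeric column. The scores table and its column names are hypothetical and made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data: three scores for each of two teams
scores = pd.DataFrame({
    "team": ["A", "A", "A", "B", "B", "B"],
    "score": [10, 12, 14, 20, 22, 27],
})

# Column names are passed as strings; Seaborn does the grouping and averaging
ax = sns.barplot(data=scores, x="team", y="score")

# Compare the drawn bar heights with the group means computed by pandas
bar_heights = sorted(bar.get_height() for bar in ax.patches)
group_means = sorted(scores.groupby("team")["score"].mean())
print(bar_heights, group_means)

plt.close("all")
```

The same summary could be computed by hand with groupby and then plotted with Matplotlib; Seaborn simply folds both steps into one call.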

Installation & Setup

Install Seaborn and set a clean plotting style.

Install Seaborn
# Install (run in terminal, not in Python)
pip install seaborn
Basic Imports & Style
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

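# Set a default theme; this changes Matplotlib's global settings,
# so it applies to every plot drawn afterwards
# (set_theme is the current name; sns.set is an older alias)
sns.set_theme(style="whitegrid")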
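
If you are curious what a style such as whitegrid actually changes, sns.axes_style() returns the underlying Matplotlib settings as a dictionary. The short sketch below only inspects one of those settings and does not draw anything.

```python
import seaborn as sns

# Look up the settings behind two of Seaborn's built-in styles
whitegrid = sns.axes_style("whitegrid")
white = sns.axes_style("white")

# "axes.grid" controls whether grid lines are drawn behind the data
print(whitegrid["axes.grid"])  # grid lines enabled
print(white["axes.grid"])      # grid lines disabled
```

The other built-in style names are "darkgrid", "dark", and "ticks".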

Example 1: Scatter Plot

Scatter plots show the relationship between two numerical variables. Seaborn makes it easy to add color for different groups.

Total Bill vs Tip
import seaborn as sns
import matplotlib.pyplot as plt

# Load an example dataset (downloaded from Seaborn's online data
# repository on first use, then cached locally)
tips = sns.load_dataset("tips")  # restaurant bills and tips

# Create a scatter plot: total_bill vs tip
sns.scatterplot(
    data=tips,
    x="total_bill",   # column for x-axis
    y="tip",          # column for y-axis
    hue="sex",        # color points by 'sex' column
    style="time",     # marker style by lunch/dinner
    size="size"       # marker size by table size
)

plt.title("Total Bill vs Tip by Sex and Time")
plt.xlabel("Total Bill ($)")
plt.ylabel("Tip ($)")
plt.show()
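
Tip: Use hue, style, and size to encode up to three additional variables in a single scatter plot. Add them sparingly, since each extra encoding makes the legend longer and the plot harder to read.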
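
One way to keep such encodings readable is to point hue and style at the same column, so each group differs in both color and marker, and to choose the colors explicitly with a palette dictionary. The sketch below assumes a small made-up orders table (hypothetical column names and values) rather than the tips dataset, so it runs without a download.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, made up for illustration
orders = pd.DataFrame({
    "bill": [10.0, 15.5, 22.0, 30.0, 18.0, 40.0],
    "tip": [1.5, 2.0, 3.5, 4.0, 2.5, 6.0],
    "meal": ["Lunch", "Lunch", "Dinner", "Dinner", "Lunch", "Dinner"],
})

ax = sns.scatterplot(
    data=orders,
    x="bill",
    y="tip",
    hue="meal",    # color by meal
    style="meal",  # marker shape by the same column (redundant encoding)
    palette={"Lunch": "tab:blue", "Dinner": "tab:orange"},  # explicit colors
)

# The legend entries come from the category values in the 'meal' column
handles, labels = ax.get_legend_handles_labels()
print(labels)

plt.close("all")
```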

Example 2: Distribution Plot

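Distribution plots show how the values of a variable are spread. A histogram counts the observations that fall into each bin, and an optional KDE (kernel density estimate) curve overlays a smoothed version of the same shape.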

Distribution of Total Bill
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")

# Plot the distribution of total_bill
sns.histplot(
    data=tips,
    x="total_bill",  # numeric column to plot
    bins=20,         # number of histogram bars
    kde=True,        # show a smooth density curve
    color="teal"
)

plt.title("Distribution of Total Bill")
plt.xlabel("Total Bill ($)")
plt.ylabel("Count")
plt.show()
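
To make the bins parameter concrete, the following sketch bins a handful of made-up values with NumPy, which performs the same kind of counting that a histogram draws as bars. The values are hypothetical and chosen so the result is easy to check by hand.

```python
import numpy as np

# Hypothetical bill amounts, made up for illustration
values = np.array([3, 7, 8, 12, 13, 14, 18, 21, 22, 30])

# Split the range [3, 30] into 3 equal-width bins and count values per bin
counts, edges = np.histogram(values, bins=3)

print(edges.tolist())   # [3.0, 12.0, 21.0, 30.0]
print(counts.tolist())  # [3, 4, 3]
```

Each bar's height is one of these counts, so the heights always add up to the number of observations. More bins give finer detail but noisier bars; fewer bins give a smoother but coarser picture.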

Example 3: Boxplot

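A boxplot summarizes a numeric variable by its median and quartiles (the box), extends whiskers to the most extreme points within 1.5 times the interquartile range by default, and draws anything beyond the whiskers as potential outliers. This makes boxplots well suited to comparing categories side by side.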

Total Bill by Day
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")

# Boxplot of total_bill for each day in the dataset (Thur, Fri, Sat, Sun)
sns.boxplot(
    data=tips,
    x="day",          # categories on x-axis
    y="total_bill",   # numeric variable on y-axis
    hue="sex"         # split boxes by sex
)

plt.title("Total Bill by Day and Sex")
plt.xlabel("Day of Week")
plt.ylabel("Total Bill ($)")
plt.show()
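
The numbers behind a single box can be reproduced with pandas. The sketch below is a hand calculation on made-up values (not the tips data), assuming the default whisker rule of 1.5 times the interquartile range.

```python
import pandas as pd

# Hypothetical bill amounts, made up for illustration
bills = pd.Series([10, 12, 14, 15, 16, 18, 20, 22, 50])

q1 = bills.quantile(0.25)   # lower edge of the box
median = bills.median()     # line inside the box
q3 = bills.quantile(0.75)   # upper edge of the box
iqr = q3 - q1               # interquartile range (box height)

# Points beyond these fences are drawn individually as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = bills[(bills < lower_fence) | (bills > upper_fence)]

print(q1, median, q3)            # 14.0 16.0 20.0
print(lower_fence, upper_fence)  # 5.0 29.0
print(outliers.tolist())         # [50]
```

Note that the whiskers stop at the most extreme data points inside the fences (10 and 22 here), not at the fence values themselves.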

Example 4: Correlation Heatmap

Heatmaps are useful for visualizing the pairwise correlations between several numeric variables at once.

Correlation Matrix
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")

# Compute the correlation matrix; numeric_only=True skips the
# categorical columns such as 'sex' and 'day' (requires pandas 1.5+)
corr = tips.corr(numeric_only=True)

# Plot correlation heatmap
sns.heatmap(
    corr,
    annot=True,        # show correlation values
    cmap="coolwarm",   # color map
    fmt=".2f",         # number format
    square=True
)

plt.title("Correlation Heatmap (Tips Dataset)")
plt.show()
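
A correlation matrix is symmetric with ones on the diagonal, so half of the heatmap repeats the other half. One optional refinement, sketched below on a made-up three-column DataFrame (hypothetical names x, y, z), is to hide the diagonal and upper triangle with the mask argument of sns.heatmap.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data: y rises with x, z falls with x
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 4, 3, 2, 1],
})

corr = df.corr()  # Pearson correlation between every pair of columns

# True marks cells to hide: the diagonal and everything above it
mask = np.triu(np.ones(corr.shape, dtype=bool))

ax = sns.heatmap(
    corr,
    mask=mask,          # show only the lower triangle
    annot=True,         # print the correlation values in the cells
    cmap="coolwarm",
    vmin=-1, vmax=1,    # fix the color scale to the full correlation range
)

plt.close("all")
```

Fixing vmin and vmax to -1 and 1 keeps the colors comparable between heatmaps, since the default scale adapts to whatever values happen to be present.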