Related Data Science Links
Learn Wrangling Data Science Tutorial, validate concepts with Wrangling Data Science MCQ Questions, and prepare interviews through Wrangling Data Science Interview Questions and Answers.
Reshaping & Transforming Data with Pandas
Data wrangling is the art of turning raw tables into the exact shapes you need for analysis and modeling: long vs wide, aggregated vs detailed, joined and enriched.
Melt & Pivot
Melt converts wide data (many columns) into long format (row per observation), while pivot does the opposite. Many visualization and modeling tasks prefer long format.
Thinking in terms of tidy data, each variable should have its own column and each observation its own row. Melt and pivot are the core operations that let you move between human‑friendly report layouts and machine‑friendly tidy tables without losing information.
import pandas as pd
df = pd.DataFrame({
"country": ["US", "US", "IN", "IN"],
"year": [2023, 2024, 2023, 2024],
"sales_a": [10, 12, 8, 9],
"sales_b": [5, 7, 4, 6]
})
long = df.melt(
id_vars=["country", "year"],
value_vars=["sales_a", "sales_b"],
var_name="product",
value_name="sales"
)
wide = long.pivot(
index=["country", "year"],
columns="product",
values="sales"
).reset_index()
Merging Multiple Tables
pd.merge implements SQL‑style joins (inner, left, right, outer). It is essential
when combining fact tables with dimension tables like users, products or time.
users = pd.DataFrame({
"user_id": [1, 2, 3],
"country": ["US", "IN", "US"]
})
orders = pd.DataFrame({
"order_id": [101, 102, 103],
"user_id": [1, 1, 3],
"amount": [50, 30, 80]
})
joined = pd.merge(
orders,
users,
on="user_id",
how="left"
)
print(joined)