Hierarchical Clustering Q&A
20 Core Questions
Interview Prep
Hierarchical Clustering: Interview Q&A
Short questions and answers on hierarchical clustering: agglomerative vs divisive, linkage criteria and dendrogram interpretation.
Dendrogram
Agglomerative
Divisive
Linkage
1
What is hierarchical clustering?
⚡ Beginner
Answer: Hierarchical clustering builds a tree of clusters showing how data points can be grouped at different levels of granularity.
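A minimal sketch of the agglomerative variant using SciPy (illustrative; assumes `numpy` and `scipy` are available):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 2-D data: two well-separated blobs
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Build the full merge tree (Ward linkage on Euclidean distances)
Z = linkage(X, method="ward")

# Extract a flat clustering at a chosen level of granularity
labels = fcluster(Z, t=2, criterion="maxclust")
```

`Z` encodes the whole hierarchy; `fcluster` just reads off one level of it, which is what "different levels of granularity" means in practice.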
2
What is the difference between agglomerative and divisive clustering?
📊 Intermediate
Answer: Agglomerative starts with each point as its own cluster and merges them; divisive starts with one cluster and recursively splits it.
3
What is a dendrogram?
⚡ Beginner
Answer: A dendrogram is a tree-like diagram that visualizes the sequence of merges or splits in hierarchical clustering.
4
How do you choose the number of clusters from a dendrogram?
📊 Intermediate
Answer: You “cut” the dendrogram horizontally at a chosen height and count the branches the cut crosses; a good height is often where there is a large vertical gap between successive merges (a big jump in merge distance), or it can be set as an explicit distance threshold.
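Cutting at a height corresponds to `criterion="distance"` in SciPy's `fcluster` (an illustrative sketch; the data and threshold are made up):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# 1-D points: a tight group near 0 and another near 10
X = np.array([[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]])
Z = linkage(X, method="single")

# "Cut" the tree at height 1.0: any merge above that height is undone,
# leaving one flat cluster per branch the cut crosses
labels = fcluster(Z, t=1.0, criterion="distance")
```

Here the within-group merges happen at distance 0.2 and the final merge at 9.6, so any cut height in between yields the same two clusters — that robustness is what "big vertical gap" buys you.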
5
What is linkage in hierarchical clustering?
📊 Intermediate
Answer: Linkage defines how the distance between clusters is computed when deciding which ones to merge.
6
Name three common linkage methods.
⚡ Beginner
Answer: Common choices are single linkage, complete linkage and average linkage.
7
What is single linkage in simple terms?
📊 Intermediate
Answer: Single linkage defines cluster distance as the minimum distance between any two points in the two clusters.
8
Why can single linkage cause “chaining”?
🔥 Advanced
Answer: Because it merges clusters as long as any pair of points is close, which can create long, chain-like clusters.
9
What is complete linkage?
📊 Intermediate
Answer: Complete linkage uses the maximum distance between any two points from the clusters when computing cluster distance.
10
How does average linkage work?
📊 Intermediate
Answer: Average linkage uses the average pairwise distance between all points in the two clusters.
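The three linkage definitions above can be checked on a tiny example where the final merge height is easy to compute by hand (illustrative; the points are chosen so the arithmetic is obvious):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage

# Two pairs of 1-D points: {0, 1} and {5, 6}
X = np.array([[0.0], [1.0], [5.0], [6.0]])
D = pdist(X)  # condensed pairwise Euclidean distances

# The last merge joins {0, 1} with {5, 6}; cross-pair distances are 5, 6, 4, 5
h_single   = linkage(D, method="single")[-1, 2]    # min cross-pair distance: 4
h_complete = linkage(D, method="complete")[-1, 2]  # max cross-pair distance: 6
h_average  = linkage(D, method="average")[-1, 2]   # mean of 4, 5, 5, 6:      5
```

Same data, three different merge heights — which is exactly why the choice of linkage changes the shape of the dendrogram.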
11
Does hierarchical clustering require specifying the number of clusters in advance?
⚡ Beginner
Answer: No, it builds a full hierarchy; you can choose the number of clusters afterward by cutting the dendrogram.
12
What are some advantages of hierarchical clustering over k-means?
📊 Intermediate
Answer: It does not require pre‑choosing k, can capture nested cluster structure and works with various distance/linkage choices.
13
What are some disadvantages of hierarchical clustering?
📊 Intermediate
Answer: It can be computationally expensive — the pairwise distance matrix alone is O(n²) memory, and standard agglomerative algorithms run in O(n² log n) to O(n³) time — and results are sensitive to the choice of distance metric and linkage.
14
Can hierarchical clustering be used with different distance metrics?
⚡ Beginner
Answer: Yes, it can work with any valid distance or similarity measure, not just Euclidean distance.
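One way to swap in a non-Euclidean metric with SciPy is to precompute the condensed distance matrix yourself (a sketch; the metric and data are illustrative):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])

# Manhattan (cityblock) distances instead of the default Euclidean
D = pdist(X, metric="cityblock")
Z = linkage(D, method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
```

One caveat worth knowing: SciPy's `ward` linkage assumes Euclidean distances, so arbitrary metrics pair best with single, complete, or average linkage.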
15
What is a cophenetic correlation coefficient in this context?
🔥 Advanced
Answer: It measures how faithfully the dendrogram preserves pairwise distances of the original data; higher is better.
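SciPy computes this directly from the linkage matrix and the original distances (illustrative sketch with synthetic blobs):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, cophenet

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs in 2-D
X = np.vstack([rng.normal(0, 0.5, (10, 2)),
               rng.normal(5, 0.5, (10, 2))])

D = pdist(X)
Z = linkage(D, method="average")

# c correlates original distances with cophenetic (dendrogram) distances;
# values near 1 mean the tree preserves the data's distance structure well
c, coph_dists = cophenet(Z, D)
```

It can also be used to compare linkage methods: fit each one and keep the linkage whose cophenetic correlation is highest.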
16
When is hierarchical clustering particularly useful?
⚡ Beginner
Answer: It’s useful when you want to explore cluster structure interactively, e.g., in gene expression analysis or document clustering.
17
Does hierarchical clustering scale well to very large datasets?
🔥 Advanced
Answer: No, standard algorithms do not — storing the pairwise distance matrix alone requires O(n²) memory — so for very large n you typically need approximations, sampling, or scalable variants such as BIRCH.
18
How do you evaluate the quality of hierarchical clustering?
📊 Intermediate
Answer: With silhouette scores, cophenetic correlation, and by visually inspecting dendrograms for coherent clusters.
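For the silhouette part, one common pattern is to score several candidate cuts of the same tree and keep the best (a sketch; assumes `scikit-learn` is available alongside SciPy):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)
# Two tight, well-separated blobs
X = np.vstack([rng.normal(0, 0.3, (15, 2)),
               rng.normal(4, 0.3, (15, 2))])

Z = linkage(X, method="ward")

# Build the hierarchy once, then evaluate several flat cuts of it
scores = {k: silhouette_score(X, fcluster(Z, t=k, criterion="maxclust"))
          for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

Note the hierarchy is built only once; each candidate k is just a different cut, which is cheaper than re-running k-means per candidate.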
19
How does hierarchical clustering compare to DBSCAN?
🔥 Advanced
Answer: DBSCAN focuses on density-based clusters and can find arbitrary shapes, while hierarchical clustering builds a nested hierarchy; they make different assumptions about structure.
20
What is the key message to remember about hierarchical clustering?
⚡ Beginner
Answer: It’s a flexible, exploratory clustering tool: understand dendrograms, linkage choices and complexity limits to use it effectively.
Quick Recap: Hierarchical Clustering
Think of hierarchical clustering as building a family tree of points; choosing where to cut that tree turns exploration into a concrete clustering solution.