t-SNE: Interview Q&A
Short questions and answers on t-SNE for nonlinear dimensionality reduction and data visualization.
Topics: Visualization, Neighborhoods, Perplexity, Learning Rate
1. What is t-SNE mainly used for? (Beginner)
Answer: t-SNE is used for visualizing high-dimensional data in 2D or 3D while preserving local neighborhood structure.
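For illustration, a minimal sketch of this use with scikit-learn's `TSNE` (an assumed dependency; the random data here just stands in for a real high-dimensional dataset):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))  # 100 points in 20 dimensions

# Project to 2D for plotting; perplexity must stay below the number of samples.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (100, 2)
```

The 2D coordinates in `emb` are then typically passed to a scatter plot, colored by known labels if available.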
2. Is t-SNE a linear or non-linear method? (Beginner)
Answer: t-SNE is a non-linear dimensionality reduction technique.
3. What does t-SNE try to preserve when reducing dimensions? (Intermediate)
Answer: It aims to preserve local neighbor relationships by matching pairwise similarity distributions in high and low dimensions.
4. What is perplexity in t-SNE? (Advanced)
Answer: Perplexity is a parameter roughly related to the effective number of neighbors considered for each point.
5. How does the learning rate affect t-SNE? (Advanced)
Answer: Too small a learning rate can leave points compressed in a dense cloud and converge slowly; too large can make the embedding collapse into a roughly uniform "ball" with little visible structure.
6. Why is t-SNE primarily an exploratory tool, not a general-purpose feature reducer? (Intermediate)
Answer: t-SNE is non-parametric and stochastic, and it is designed for visualization; it doesn't provide a simple mapping for new points, and its distortions can be hard to interpret quantitatively.
7. Is the global structure in a t-SNE plot always reliable? (Advanced)
Answer: Not necessarily; t-SNE is designed to preserve local structure, so global distances and cluster sizes can be misleading.
8. Should you run t-SNE on raw features or after a step like PCA? (Intermediate)
Answer: Often you first apply PCA to reduce dimensionality (e.g., to 30–50 dims) and then run t-SNE for stability and speed.
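The two-step recipe can be sketched as follows (scikit-learn assumed; 50 components is a common choice, not a hard rule):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))  # 200 points with 100 features

# Step 1: PCA to ~50 dims denoises and speeds up the pairwise computations.
X_reduced = PCA(n_components=50, random_state=0).fit_transform(X)

# Step 2: t-SNE on the PCA-reduced data.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_reduced)
print(emb.shape)  # (200, 2)
```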
9. Is t-SNE deterministic? (Beginner)
Answer: No, results vary with random initialization and parameter settings; fixing the random seed improves reproducibility.
10. Is t-SNE suitable as a preprocessing step for clustering? (Advanced)
Answer: Generally no; t-SNE is optimized for visualization, not for preserving cluster geometry needed by clustering algorithms.
11. What does it mean if t-SNE shows well-separated clusters? (Intermediate)
Answer: It often indicates that the classes or groups have distinct local neighborhoods in high-dimensional space, but it's not a rigorous proof.
12. How does t-SNE differ from PCA? (Intermediate)
Answer: PCA is a linear, global variance-based method; t-SNE is non-linear and local-neighborhood based, optimized for visualization.
13. Why can t-SNE be slow on large datasets? (Advanced)
Answer: It needs to compute and optimize over pairwise similarities, though approximate and Barnes–Hut variants help scale it up.
14. Which hyperparameters typically require tuning in t-SNE? (Intermediate)
Answer: Mainly perplexity, learning rate, number of iterations, and sometimes the initialization method.
15. What is a typical perplexity range used in practice? (Beginner)
Answer: Values between 5 and 50 are common; trying a few and comparing plots is recommended.
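One way to "try a few" is a small sweep (scikit-learn assumed; random toy data):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 15))

# Embed at several perplexities; each must stay below the number of samples.
embeddings = {
    p: TSNE(perplexity=p, random_state=0).fit_transform(X)
    for p in (5, 30, 50)
}
# Plot the embeddings side by side and look for structure that persists across values.
```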
16. How can you misuse t-SNE in an analysis? (Advanced)
Answer: Misuse includes over-interpreting distances and cluster sizes, not checking stability across runs, or using it as evidence of separability without other metrics.
17. Is t-SNE appropriate for streaming or online data? (Advanced)
Answer: Not really; it's batch-oriented and doesn't provide a simple incremental update rule for new points.
18. Give a real-world use case where t-SNE is very helpful. (Beginner)
Answer: t-SNE is widely used to visualize embeddings like word vectors, image features or latent representations from neural networks.
19. How can you check if a t-SNE result is robust? (Intermediate)
Answer: Re-run t-SNE with different random seeds and parameter settings; stable qualitative patterns increase confidence.
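A hedged sketch of such a stability check (scikit-learn and SciPy assumed; the Spearman correlation of pairwise distances is just one possible stability signal, not a standard t-SNE API):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))

# Re-run with several seeds; patterns that survive across runs are more trustworthy.
runs = [TSNE(perplexity=15, random_state=s).fit_transform(X) for s in (0, 1, 2)]

# Crude quantitative signal: how well do pairwise distances agree between two runs?
rho, _ = spearmanr(pdist(runs[0]), pdist(runs[1]))
```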
20. What is the key message to remember about t-SNE? (Beginner)
Answer: t-SNE is a powerful visualization tool, not a general-purpose feature extractor; use it to explore structure, but validate findings with other methods.
Quick Recap: t-SNE
Use t-SNE to visually explore embeddings and clusters, always remembering that it focuses on local neighborhoods and is sensitive to parameter choices.