t-SNE: Interview Q&A
Short questions and answers on t-SNE for nonlinear dimensionality reduction and data visualization.
Topics: Visualization, Neighborhoods, Perplexity, Learning Rate
1. What is t-SNE mainly used for? (Beginner)
Answer: t-SNE is used for visualizing high-dimensional data in 2D or 3D while preserving local neighborhood structure.
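For illustration, a minimal sketch of this use with scikit-learn's `TSNE` (an assumed dependency; the random data here just stands in for a real high-dimensional dataset):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))  # 100 points in 20 dimensions

# Project to 2D for plotting; perplexity must stay below the number of samples.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (100, 2)
```

The 2D coordinates in `emb` are then typically passed to a scatter plot, colored by known labels if available.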
2. Is t-SNE a linear or non-linear method? (Beginner)
Answer: t-SNE is a non-linear dimensionality reduction technique.
3. What does t-SNE try to preserve when reducing dimensions? (Intermediate)
Answer: It aims to preserve local neighbor relationships by matching pairwise similarity distributions in high and low dimensions.
4. What is perplexity in t-SNE? (Advanced)
Answer: Perplexity is a parameter roughly related to the effective number of neighbors considered for each point.
5. How does the learning rate affect t-SNE? (Advanced)
Answer: Too small a learning rate can leave points compressed in a dense cloud and converge slowly; too large can make the embedding collapse into a roughly uniform "ball" with little visible structure.
6. Why is t-SNE primarily an exploratory tool, not a general-purpose feature reducer? (Intermediate)
Answer: t-SNE is non-parametric and stochastic, and it is designed for visualization; it doesn't provide a simple mapping for new points, and its distortions can be hard to interpret quantitatively.
7. Is the global structure in a t-SNE plot always reliable? (Advanced)
Answer: Not necessarily; t-SNE is designed to preserve local structure, so global distances and cluster sizes can be misleading.
8. Should you run t-SNE on raw features or after a step like PCA? (Intermediate)
Answer: Often you first apply PCA to reduce dimensionality (e.g., to 30–50 dims) and then run t-SNE for stability and speed.
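The two-step recipe can be sketched as follows (scikit-learn assumed; 50 components is a common choice, not a hard rule):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))  # 200 points with 100 features

# Step 1: PCA to ~50 dims denoises and speeds up the pairwise computations.
X_reduced = PCA(n_components=50, random_state=0).fit_transform(X)

# Step 2: t-SNE on the PCA-reduced data.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_reduced)
print(emb.shape)  # (200, 2)
```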
9. Is t-SNE deterministic? (Beginner)
Answer: No, results vary with random initialization and parameter settings; fixing the random seed improves reproducibility.
10. Is t-SNE suitable as a preprocessing step for clustering? (Advanced)
Answer: Generally no; t-SNE is optimized for visualization, not for preserving cluster geometry needed by clustering algorithms.
11. What does it mean if t-SNE shows well-separated clusters? (Intermediate)
Answer: It often indicates that the classes or groups have distinct local neighborhoods in high-dimensional space, but it's not a rigorous proof.
12. How does t-SNE differ from PCA? (Intermediate)
Answer: PCA is a linear, global variance-based method; t-SNE is non-linear and local-neighborhood based, optimized for visualization.
13. Why can t-SNE be slow on large datasets? (Advanced)
Answer: It needs to compute and optimize over pairwise similarities, though approximate and Barnes–Hut variants help scale it up.
14. Which hyperparameters typically require tuning in t-SNE? (Intermediate)
Answer: Mainly perplexity, learning rate, number of iterations, and sometimes the initialization method.
15. What is a typical perplexity range used in practice? (Beginner)
Answer: Values between 5 and 50 are common; trying a few and comparing plots is recommended.
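One way to "try a few" is a small sweep (scikit-learn assumed; random toy data):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 15))

# Embed at several perplexities; each must stay below the number of samples.
embeddings = {
    p: TSNE(perplexity=p, random_state=0).fit_transform(X)
    for p in (5, 30, 50)
}
# Plot the embeddings side by side and look for structure that persists across values.
```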
16. How can you misuse t-SNE in an analysis? (Advanced)
Answer: Misuse includes over-interpreting distances and cluster sizes, not checking stability across runs, or using it as evidence of separability without other metrics.
17. Is t-SNE appropriate for streaming or online data? (Advanced)
Answer: Not really; it's batch-oriented and doesn't provide a simple incremental update rule for new points.
18. Give a real-world use case where t-SNE is very helpful. (Beginner)
Answer: t-SNE is widely used to visualize embeddings like word vectors, image features or latent representations from neural networks.
19. How can you check if a t-SNE result is robust? (Intermediate)
Answer: Re-run t-SNE with different random seeds and parameter settings; stable qualitative patterns increase confidence.
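A hedged sketch of such a stability check (scikit-learn and SciPy assumed; the Spearman correlation of pairwise distances is just one possible stability signal, not a standard t-SNE API):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))

# Re-run with several seeds; patterns that survive across runs are more trustworthy.
runs = [TSNE(perplexity=15, random_state=s).fit_transform(X) for s in (0, 1, 2)]

# Crude quantitative signal: how well do pairwise distances agree between two runs?
rho, _ = spearmanr(pdist(runs[0]), pdist(runs[1]))
```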
20. What is the key message to remember about t-SNE? (Beginner)
Answer: t-SNE is a powerful visualization tool, not a general-purpose feature extractor; use it to explore structure, but validate findings with other methods.
Quick Recap: t-SNE
Use t-SNE to visually explore embeddings and clusters, always remembering that it focuses on local neighborhoods and is sensitive to parameter choices.