Dimensionality Reduction Q&A

1. What is dimensionality reduction?
Answer: Reducing feature count while preserving useful information.
2. Why use it?
Answer: Faster training, less overfitting, better visualization.
3. PCA in one sentence?
Answer: PCA projects mean-centered data onto orthogonal directions of maximum variance.
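
The projection can be sketched with NumPy's SVD. This is a minimal illustration, not a reference implementation: the helper name pca_project and the synthetic data are assumptions made for the example.

```python
import numpy as np

def pca_project(X, n_components):
    """Project rows of X onto the top principal directions (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)  # PCA works on mean-centered data
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    scores = X_centered @ components.T  # coordinates in the new basis
    return scores, components

# Synthetic data (assumption): three features with very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])
scores, components = pca_project(X, n_components=2)
```

Here scores holds the 2D representation and components holds the two orthonormal directions; the first column of scores has the largest variance.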
4. Does PCA need scaling?
Answer: Usually yes; standardizing prevents features with large numeric ranges from dominating the components.
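
A small sketch of the effect, assuming two correlated synthetic features (the names income and age are invented for illustration). Without scaling, the first direction points almost entirely along the large-unit feature; after standardizing, both features contribute equally.

```python
import numpy as np

def first_direction(X):
    """Return the first principal direction of X (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return Vt[0]

# Synthetic data (assumption): both features follow the same latent factor z,
# but income is measured in much larger units than age.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
income = 50000 + 10000 * (z + 0.3 * rng.normal(size=500))
age = 40 + 10 * (z + 0.3 * rng.normal(size=500))
X = np.column_stack([income, age])

# Standardize each column to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

raw_dir = first_direction(X)            # dominated by the income column
scaled_dir = first_direction(X_scaled)  # equal weight on both columns
```

The sign of each direction is arbitrary, so compare absolute values when inspecting the weights.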
5. What is the explained variance ratio?
Answer: Fraction of total variance captured by each component.
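
As a rough sketch, the ratios can be computed from the singular values of the centered data. The function name and the synthetic data are assumptions for the example.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    X_centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    # Squared singular values are proportional to per-component variance.
    variances = singular_values ** 2 / (X.shape[0] - 1)
    return variances / variances.sum()

# Synthetic data (assumption): four features with decreasing spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4)) * np.array([4.0, 2.0, 1.0, 0.5])
ratios = explained_variance_ratio(X)
```

The ratios sum to 1 and are sorted from largest to smallest, so the first entry is the share of variance captured by the first component.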
6. Feature selection vs. extraction?
Answer: Selection keeps original features; extraction creates new components.
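
The contrast can be illustrated on the same synthetic matrix. Ranking columns by variance is only one simple selection criterion, chosen here as an assumption; PCA stands in for extraction.

```python
import numpy as np

# Synthetic data (assumption): five features with different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([3.0, 0.1, 2.0, 0.05, 1.0])

# Selection: keep the two original columns with the highest variance.
keep = np.sort(np.argsort(X.var(axis=0))[-2:])
X_selected = X[:, keep]  # columns are unchanged original features

# Extraction: build two new features as linear combinations of all columns.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_extracted = X_centered @ Vt[:2].T  # columns are new components
```

Both results have two columns, but X_selected remains interpretable in the original units, while each column of X_extracted mixes all five inputs.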
7. What is t-SNE used for?
Answer: Nonlinear 2D/3D visualization that preserves local neighborhoods of high-dimensional data; distances between far-apart clusters are not reliable.
8. What is UMAP?
Answer: A fast manifold-learning method that produces low-dimensional embeddings preserving local, and to some extent global, structure.
9. Can PCA improve model performance?
Answer: Sometimes, by denoising and reducing multicollinearity.
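
A sketch of the denoising effect, under the assumption that the true signal lies in a low-dimensional subspace and the noise is isotropic. All data here is synthetic, and the sizes n, d, k and the noise level are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 20, 2

# Synthetic data (assumption): a rank-2 signal embedded in 20 dimensions.
latent = rng.normal(size=(n, k))
mixing = rng.normal(size=(k, d))
clean = latent @ mixing
noisy = clean + 0.5 * rng.normal(size=(n, d))

# Keep the top-k principal directions and reconstruct from them.
mean = noisy.mean(axis=0)
noisy_centered = noisy - mean
_, _, Vt = np.linalg.svd(noisy_centered, full_matrices=False)
top = Vt[:k]
denoised = (noisy_centered @ top.T) @ top + mean

# Mean squared error against the clean signal, before and after.
err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
```

Discarding the low-variance directions removes most of the noise in this setup, so err_denoised comes out smaller than err_noisy. On real data the gain depends on how well the signal concentrates in the leading components.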
10. What is the risk of too much reduction?
Answer: Information loss and poorer downstream performance.
11. How do you choose the number of components?
Answer: Use cumulative explained variance (for example, a 95% threshold) and confirm with validation performance.
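
The cumulative-variance criterion can be sketched as follows. The helper name, the 0.95 default, and the synthetic data are assumptions; the validation step is not covered here and would require a downstream model.

```python
import numpy as np

def n_components_for(X, threshold=0.95):
    """Smallest number of components whose cumulative explained
    variance reaches the threshold (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    ratios = singular_values ** 2 / np.sum(singular_values ** 2)
    cumulative = np.cumsum(ratios)
    # Index of the first cumulative value >= threshold, converted to a count.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, len(singular_values))  # guard against rounding at 1.0

# Synthetic data (assumption): two dominant features, four minor ones.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6)) * np.array([10.0, 5.0, 1.0, 0.5, 0.2, 0.1])
k = n_components_for(X, threshold=0.95)
```

In this example the first two components carry nearly all of the variance, so k is 2. A threshold is only a starting point; the final choice should still be checked against validation performance.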
12. One-line summary?
Answer: Dimensionality reduction simplifies data while retaining core signal.