
Autoencoders: 20 Interview Questions

Master undercomplete, sparse, denoising, contractive, and variational autoencoders (VAE). Bottleneck, reconstruction loss, latent space, and applications. Interview-ready answers.

1 What is an autoencoder? Basic architecture. ⚡ Easy
Answer: An autoencoder is an unsupervised neural network that learns to copy its input to output via a bottleneck (latent) layer. Architecture: Encoder compresses input to latent code; Decoder reconstructs from latent code. Trained with reconstruction loss (e.g., MSE).
z = f_enc(x); x̂ = f_dec(z); L = ||x - x̂||²
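The formula above can be sketched in PyTorch; the layer sizes here are illustrative assumptions (784-dim input, 32-dim bottleneck):

```python
import torch
import torch.nn as nn

# Toy undercomplete autoencoder: 784 -> 32 -> 784 (sizes are assumptions)
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.randn(16, 784)                 # dummy batch of 16 inputs
z = encoder(x)                           # latent code z = f_enc(x)
x_hat = decoder(z)                       # reconstruction x̂ = f_dec(z)
loss = nn.functional.mse_loss(x_hat, x)  # L = ||x - x̂||²
```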
2 What is undercomplete vs overcomplete autoencoder? 📊 Medium
Answer: Undercomplete: bottleneck dimension less than input dimension. Forces compression, learns useful features. Overcomplete: bottleneck dimension larger than input. Risks learning identity function; requires regularization (sparse, denoising).
Undercomplete: compression, meaningful features
Overcomplete: needs regularization
3 Common reconstruction loss functions? ⚡ Easy
Answer: MSE for continuous values; binary cross-entropy for pixel values in [0,1] (treating each pixel as a Bernoulli probability); L1 loss for robustness to outliers.
4 Real-world applications of autoencoders? 📊 Medium
Answer: Dimensionality reduction (nonlinear PCA), anomaly detection (high reconstruction error), denoising, inpainting, compression, generative modeling (VAE), feature extraction for pretraining.
5 How does a denoising autoencoder (DAE) work? 📊 Medium
Answer: Input is corrupted (e.g., add noise, mask), model learns to reconstruct clean original. Forces learning robust features, not just copying. Corrupting process: x̃ = x + ε, ε ~ N(0, σ²).
L = ||x - f_dec(f_enc(x̃))||²
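A minimal sketch of one denoising step; the architecture and noise level are assumptions:

```python
import torch
import torch.nn as nn

# Toy encoder/decoder (sizes are assumptions)
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())
decoder = nn.Linear(32, 784)

x = torch.randn(16, 784)
sigma = 0.1
x_noisy = x + sigma * torch.randn_like(x)  # corrupt: x̃ = x + ε, ε ~ N(0, σ²)
x_hat = decoder(encoder(x_noisy))          # reconstruct from the corrupted input
loss = nn.functional.mse_loss(x_hat, x)    # note: the target is the CLEAN x
```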
6 What is a sparse autoencoder? How enforce sparsity? 🔥 Hard
Answer: Enforces that most latent units are inactive for any given input. A sparsity penalty is added to the loss: KL divergence between the average activation ρ̂_j and a target ρ (e.g., ρ = 0.05), or alternatively L1 regularization on activations. Encourages specialized feature detectors.
L = reconstruction + β Σ KL(ρ || ρ̂_j)
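The KL penalty above can be computed directly from the hidden activations; the batch and layer sizes here are hypothetical:

```python
import torch

# Hypothetical sigmoid activations of the hidden layer: (batch, hidden_units)
a = torch.sigmoid(torch.randn(64, 100))
rho = 0.05                    # target average activation ρ
rho_hat = a.mean(dim=0)       # empirical average activation ρ̂_j per unit

# Σ_j KL(ρ || ρ̂_j) between Bernoulli distributions
kl = (rho * torch.log(rho / rho_hat)
      + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()
# total loss would then be: reconstruction + beta * kl
```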
7 Explain contractive autoencoder. How different from denoising? 🔥 Hard
Answer: CAE adds a penalty: the squared Frobenius norm of the encoder's Jacobian ∂z/∂x. This makes the latent representation contractive, i.e., robust to small input perturbations. DAE achieves robustness stochastically by corrupting the input; CAE penalizes the encoder's sensitivity analytically.
L = reconstruction + λ ||∂f_enc(x)/∂x||²_F
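The Jacobian penalty can be computed with autograd; the toy encoder and sizes below are assumptions:

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(10, 4), nn.Sigmoid())  # toy encoder (sizes assumed)
x = torch.randn(1, 10)

# Full Jacobian ∂z/∂x of the encoder for one example, via autograd
J = torch.autograd.functional.jacobian(lambda inp: enc(inp).squeeze(0), x)
penalty = (J ** 2).sum()     # squared Frobenius norm ||∂f_enc(x)/∂x||²_F
# total loss would then be: reconstruction + lam * penalty
```

In practice the Jacobian is expensive for large inputs; for a single sigmoid layer a closed-form expression is typically used instead.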
8 What is a variational autoencoder? Probabilistic perspective. 🔥 Hard
Answer: VAE is a generative model whose encoder outputs the parameters (μ, σ) of an approximate posterior q(z|x); the decoder generates data from a sampled z. Trained by maximizing the ELBO: a reconstruction term plus a KL divergence to the prior p(z) = N(0, I). Enables interpolation and generation.
ELBO: E_q[log p(x|z)] - KL(q(z|x) || p(z))
9 Why reparameterization trick in VAE? 🔥 Hard
Answer: Sampling from N(μ, σ²) is non-differentiable with respect to μ and σ. Trick: sample ε ~ N(0, I), then z = μ + σ·ε. Gradients flow through μ and σ, enabling backprop.
z = mu + sigma * torch.randn_like(sigma)
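A slightly fuller sketch of the trick, together with the closed-form KL term from the ELBO; sizes and the log-variance parameterization are common conventions, assumed here:

```python
import torch
import torch.nn as nn

# Encoder head predicts (μ, log σ²) jointly; sizes are assumptions
x = torch.randn(16, 784)
enc_head = nn.Linear(784, 2 * 32)
mu, log_var = enc_head(x).chunk(2, dim=-1)

eps = torch.randn_like(mu)               # ε ~ N(0, I): randomness moved outside
z = mu + torch.exp(0.5 * log_var) * eps  # z = μ + σ·ε, differentiable in μ, σ

# Closed-form KL(q(z|x) || N(0, I)), averaged over the batch
kl = -0.5 * (1 + log_var - mu.pow(2) - log_var.exp()).sum(dim=1).mean()
```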
10 VAE vs standard autoencoder: key differences? 📊 Medium
Answer: AE learns a deterministic latent code; VAE learns a probabilistic latent distribution. The AE latent space has no enforced structure and can contain "holes"; the VAE's KL term enforces smoothness. VAE is generative; AE is purely reconstructive.
11 What is β-VAE? 🔥 Hard
Answer: VAE variant with β > 1 multiplier on KL term. Encourages more disentangled latent representations. Trade-off: reconstruction vs. independence.
L = reconstruction + β·KL
12 How to use autoencoders for anomaly detection? 📊 Medium
Answer: Train on normal data only. Anomalies yield high reconstruction error. Set threshold based on validation. Used in fraud, industrial defect detection.
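The thresholding step can be sketched as follows; the error values and the 95th-percentile rule are illustrative assumptions:

```python
import torch

# Per-sample reconstruction error; threshold picked on normal validation data
def recon_error(x, x_hat):
    return ((x - x_hat) ** 2).mean(dim=1)   # mean squared error per sample

val_err = torch.tensor([0.01, 0.02, 0.015, 0.03])  # hypothetical normal-data errors
threshold = torch.quantile(val_err, 0.95)          # e.g. 95th-percentile rule

test_err = torch.tensor([0.012, 0.5])
is_anomaly = test_err > threshold                  # → tensor([False,  True])
```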
13 When to use convolutional autoencoder? ⚡ Easy
Answer: For image data. Encoder uses Conv+Pooling, decoder uses Transposed Conv/Upsampling. Preserves spatial structure.
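A toy convolutional autoencoder, assuming MNIST-like 1×28×28 inputs; the channel counts are arbitrary:

```python
import torch
import torch.nn as nn

enc = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 28×28 -> 14×14
    nn.Conv2d(8, 16, 3, stride=2, padding=1),             # 14×14 -> 7×7
)
dec = nn.Sequential(
    nn.ConvTranspose2d(16, 8, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(8, 1, 3, stride=2, padding=1, output_padding=1),
)

x = torch.randn(4, 1, 28, 28)
z = enc(x)        # spatially structured latent: (4, 16, 7, 7)
x_hat = dec(z)    # upsampled back to (4, 1, 28, 28)
```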
14 What is stacked autoencoder? Greedy layer-wise pretraining? 🔥 Hard
Answer: Multiple autoencoders stacked; each layer is trained greedily to reconstruct the previous layer's output, then the whole network is fine-tuned end-to-end. Historically used to pretrain deep networks.
15 List ways to regularize autoencoders. 📊 Medium
Answer: Sparse penalty (KL, L1), denoising (corrupt input), contractive (Jacobian), variational (KL to prior), dropout.
16 Linear autoencoder with MSE – relation to PCA? 🔥 Hard
Answer: A linear autoencoder (no nonlinearity) trained with MSE learns the same principal subspace as PCA: the weights span that subspace, but are not necessarily orthogonal or ordered by explained variance.
17 What is posterior collapse in VAE? 🔥 Hard
Answer: The decoder ignores the latent z, and q(z|x) collapses to the prior p(z). Happens with powerful decoders (e.g., autoregressive). Mitigations: KL annealing, free bits, down-weighting the KL term, reducing decoder capacity.
18 What is adversarial autoencoder? 🔥 Hard
Answer: AAE uses adversarial training to match aggregated posterior of latent code to prior. Discriminator distinguishes true prior samples vs encoder output. Enables more flexible priors.
19 How to visualize autoencoder learned features? 📊 Medium
Answer: For images, visualize decoder weights or generate from latent traversal. For latent space, t-SNE/UMAP on z. For VAE, interpolate between z1 and z2.
20 Can autoencoders handle missing data? 🔥 Hard
Answer: Yes: train a denoising autoencoder with random masking so the model learns to impute missing values. VAE variants (e.g., partial VAEs) can model the conditional distribution of missing entries given observed ones.
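The masking corruption can be sketched as follows; the masking rate is an assumption:

```python
import torch

# DAE-style random masking: drop ~20% of entries as "missing"
x = torch.randn(16, 20)
mask = (torch.rand_like(x) > 0.2).float()   # 1 = observed, 0 = missing
x_masked = x * mask
# The training target is the full x, so the model learns to fill in masked
# entries; at test time, zero out the truly missing values and read the
# imputations from the reconstruction x_hat = decoder(encoder(x_masked)).
```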

Autoencoders – Interview Cheat Sheet

AE Types
  • Undercomplete: bottleneck dim < input dim
  • Sparse: KL / L1 penalty
  • VAE: probabilistic, generative
  • Denoising: corrupt input
  • Contractive: Jacobian penalty
Loss Components
  • Reconstruction: MSE / BCE
  • Sparsity: KL / L1
  • KL (VAE): to prior N(0, I)
  • Jacobian: Frobenius norm (CAE)
Applications
  • Anomaly detection: high reconstruction error
  • Denoising: image / audio
  • Dimensionality reduction: nonlinear PCA
  • Generation: VAE
Pitfalls
  • Overcomplete → identity
  • VAE posterior collapse
  • Blurry VAE generations

Verdict: "Autoencoders learn efficient codes – regularize to get useful features."