
Autoencoders: 20 Essential Q&A

Compress to a bottleneck and reconstruct—unsupervised representation learning and the path to VAEs.

~11 min read · 20 questions · Advanced
Tags: encoder · bottleneck · VAE · reconstruction
1 What is an autoencoder? ⚡ easy
Answer: Neural net trained to copy input to output through a bottleneck: encoder maps x→z, decoder maps z→x̂—forces compact representation.
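A minimal PyTorch sketch (layer sizes 784→128→32 are illustrative, e.g. flattened MNIST; they are not prescribed by this answer):
import torch.nn as nn

# Fully connected autoencoder: encoder compresses to a bottleneck code z, decoder reconstructs.
class AutoEncoder(nn.Module):
    def __init__(self, d_in=784, d_z=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_z))
        self.decoder = nn.Sequential(nn.Linear(d_z, 128), nn.ReLU(), nn.Linear(128, d_in))

    def forward(self, x):
        z = self.encoder(x)      # bottleneck code z
        return self.decoder(z)   # reconstruction x̂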
2 Role of the encoder? ⚡ easy
Answer: Maps high-dimensional input (e.g. image) to a lower-dimensional latent code z—extracts salient factors.
3 Role of the decoder? ⚡ easy
Answer: Maps latent z back to the input space to produce the reconstruction x̂—it can only restore what the code retains, so structure is genuinely compressed only if the bottleneck limits capacity.
4 Why a bottleneck? 📊 medium
Answer: Constrains information flow so the model must learn a compressed code—similar inputs map to nearby latents if the AE is well regularized.
5 Common reconstruction loss? 📊 medium
Answer: MSE (L2) per pixel for continuous images; BCE if outputs are probabilities; perceptual losses use a pretrained net’s features.
loss = F.mse_loss(recon, x)  # vanilla AE reconstruction loss (F = torch.nn.functional)
6 Under-complete vs over-complete? 🔥 hard
Answer: Under-complete: dim(z) < dim(x)—true compression. Over-complete: dim(z) ≥ dim(x)—needs regularization (sparse, denoising, VAE), otherwise it can learn a trivial identity mapping.
7 What is a denoising autoencoder? 📊 medium
Answer: Train on corrupted inputs (noise, masking) to reconstruct clean x—learns robust features instead of copying noise.
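A minimal training-step sketch (assumes an autoencoder like the one in Q1's sketch; Gaussian noise with std 0.2 is an illustrative corruption):
import torch
import torch.nn.functional as F

def denoising_step(model, x, noise_std=0.2):
    x_noisy = x + noise_std * torch.randn_like(x)  # corrupt the input
    recon = model(x_noisy)                         # reconstruct from the corrupted version
    return F.mse_loss(recon, x)                    # loss is measured against the clean target x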
8 Sparse autoencoder? 📊 medium
Answer: Penalize activations (e.g. KL on firing rates) so few units active per example—encourages meaningful distributed codes when over-complete.
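One common variant uses an L1 penalty on the code instead of the KL firing-rate penalty; a sketch with an illustrative weight, assuming the model exposes encoder/decoder as in Q1's sketch:
import torch.nn.functional as F

def sparse_ae_loss(model, x, l1_weight=1e-3):
    z = model.encoder(x)              # latent activations
    recon = model.decoder(z)
    sparsity = z.abs().mean()         # L1 term drives most units toward zero per example
    return F.mse_loss(recon, x) + l1_weight * sparsity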
9 VAE vs deterministic AE? 📊 medium
Answer: VAE encodes a distribution q(z|x); sample z for decoder—adds KL to prior p(z) for a generative model with smooth latent space.
10 What does the KL term do? 🔥 hard
Answer: Pulls approximate posterior toward prior (often N(0,I))—balances reconstruction vs regularization; enables sampling new z ~ p(z).
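For a diagonal Gaussian posterior N(μ, σ²) and prior N(0, I) the KL term has a closed form; a sketch assuming the encoder outputs mu and logvar:
import torch

# KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over the batch
def kl_to_standard_normal(mu, logvar):
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()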
11 Reparameterization trick? 🔥 hard
Answer: Write z = μ(x) + σ(x)⊙ε with ε~N(0,1) so gradients flow through μ,σ—needed to backprop through stochastic sampling.
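A minimal sketch, again assuming the encoder outputs mu and logvar:
import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)   # sigma = exp(logvar / 2)
    eps = torch.randn_like(std)     # eps ~ N(0, I): randomness isolated from the parameters
    return mu + eps * std           # z is differentiable w.r.t. mu and logvar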
12 Use for anomaly detection? 📊 medium
Answer: Train on normal data; high reconstruction error on test indicates out-of-distribution—used in defect and fraud pipelines.
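A hedged sketch: score test examples by per-sample reconstruction error; the threshold would be tuned on held-out normal data:
import torch

@torch.no_grad()
def anomaly_scores(model, x):
    recon = model(x)
    return ((recon - x) ** 2).flatten(1).mean(dim=1)  # per-sample MSE, batch dim kept

# is_anomaly = anomaly_scores(model, x_test) > threshold  # threshold chosen on normal validation data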
13 Link to PCA? 🔥 hard
Answer: Linear AE with MSE and tied weights can recover PCA subspace—deep nonlinear AE generalizes with stronger representational power.
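A sketch of the tied-weight linear case (illustrative, not a proof): trained with MSE, the rows of W come to span the top-k principal subspace, up to rotation:
import torch
import torch.nn as nn

class LinearTiedAE(nn.Module):
    def __init__(self, d_in, k):
        super().__init__()
        self.W = nn.Parameter(0.01 * torch.randn(k, d_in))  # one matrix shared by encoder and decoder

    def forward(self, x):
        z = x @ self.W.t()   # encode: project to k dimensions
        return z @ self.W    # decode with the transposed (tied) weights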
14 Disentangled representations? 🔥 hard
Answer: Ideal latents align with generative factors; plain AE does not guarantee this—β-VAE and supervision help.
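The β-VAE objective simply upweights the KL term; a one-function sketch reusing the closed-form KL helper from Q10's sketch (the beta value is illustrative):
def beta_vae_loss(recon_loss, mu, logvar, beta=4.0):
    # beta > 1 pressures latent dimensions toward independent, factorized codes
    return recon_loss + beta * kl_to_standard_normal(mu, logvar)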
15 AE vs GAN for generation? 📊 medium
Answer: AE/VAE optimize likelihood-like objectives; GAN uses adversarial realism—GANs often sharper; VAEs more stable latent geometry.
16 Convolutional autoencoder? ⚡ easy
Answer: Encoder stacks conv+pool/downsample; decoder uses upsample/transpose conv—standard for images.
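A minimal sketch for 1×28×28 inputs; channel counts and the transposed-conv upsampling are illustrative choices:
import torch.nn as nn

# Encoder downsamples 28x28 -> 7x7; decoder upsamples back with transposed convolutions.
conv_ae = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),                                # 28 -> 14
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),                               # 14 -> 7
    nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),    # 7 -> 14
    nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),  # 14 -> 28
)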
17 AE for super-resolution? 📊 medium
Answer: Condition decoder on low-res input or use skip connections (U-Net style)—AE ideas plus perceptual loss improve texture.
18 Embeddings for search? 📊 medium
Answer: Use encoder output as vector; nearest neighbors in latent space for similar images—may need contrastive training for metric quality.
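A hedged retrieval sketch using cosine similarity on encoder codes (assumes an encoder attribute as in Q1's sketch; a learned metric may still be needed for quality):
import torch
import torch.nn.functional as F

@torch.no_grad()
def nearest_images(model, query, gallery, k=5):
    q = F.normalize(model.encoder(query), dim=1)    # (1, d_z) unit-norm query code
    g = F.normalize(model.encoder(gallery), dim=1)  # (N, d_z) gallery codes
    return (q @ g.t()).topk(k, dim=1).indices       # indices of the k most similar images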
19 Training tips? ⚡ easy
Answer: Normalize inputs; watch for posterior collapse in VAE; use skip connections if reconstruction is blurry from pure bottleneck.
20 Limitations? 📊 medium
Answer: Reconstructions can be blurry (MSE averages); latent may be entangled; vanilla AE is not a sharp generative model without VAE/GAN hybrids.

Autoencoder Cheat Sheet

Core
  • Encoder → z → decoder
Loss
  • MSE / BCE
  • Denoise / sparse
VAE
  • KL + sample z

💡 Pro tip: Bottleneck forces compression; regularize if over-complete.
