
Autoencoders: 20 Essential Q&A

Compress to a bottleneck and reconstruct—unsupervised representation learning and the path to VAEs.

~11 min read · 20 questions · Advanced
Tags: encoder · bottleneck · VAE · reconstruction
1 What is an autoencoder? ⚡ easy
Answer: Neural net trained to copy input to output through a bottleneck: encoder maps x→z, decoder maps z→x̂—forces compact representation.
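A minimal PyTorch sketch (layer sizes 784→128→32 are illustrative, e.g. flattened MNIST; they are not prescribed by this answer):
import torch.nn as nn

# Fully connected autoencoder: encoder compresses to a bottleneck code z, decoder reconstructs.
class AutoEncoder(nn.Module):
    def __init__(self, d_in=784, d_z=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_z))
        self.decoder = nn.Sequential(nn.Linear(d_z, 128), nn.ReLU(), nn.Linear(128, d_in))

    def forward(self, x):
        z = self.encoder(x)      # bottleneck code z
        return self.decoder(z)   # reconstruction x̂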
2 Role of the encoder? ⚡ easy
Answer: Maps high-dimensional input (e.g. image) to a lower-dimensional latent code z—extracts salient factors.
3 Role of the decoder? ⚡ easy
Answer: Maps latent z back to the input space to produce the reconstruction x̂—it can only restore what the code retains, so structure is genuinely compressed only if the bottleneck limits capacity.
4 Why a bottleneck? 📊 medium
Answer: Constrains information flow so the model must learn a compressed code—similar inputs map to nearby latents if the AE is well regularized.
5 Common reconstruction loss? 📊 medium
Answer: MSE (L2) per pixel for continuous images; BCE if outputs are probabilities; perceptual losses use a pretrained net’s features.
loss = F.mse_loss(recon, x)  # vanilla AE reconstruction loss (F = torch.nn.functional)
6 Under-complete vs over-complete? 🔥 hard
Answer: Under-complete: dim(z) < dim(x)—true compression. Over-complete: dim(z) ≥ dim(x)—needs regularization (sparse, denoising, VAE), otherwise it can learn a trivial identity mapping.
7 What is a denoising autoencoder? 📊 medium
Answer: Train on corrupted inputs (noise, masking) to reconstruct clean x—learns robust features instead of copying noise.
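A minimal training-step sketch (assumes an autoencoder like the one in Q1's sketch; Gaussian noise with std 0.2 is an illustrative corruption):
import torch
import torch.nn.functional as F

def denoising_step(model, x, noise_std=0.2):
    x_noisy = x + noise_std * torch.randn_like(x)  # corrupt the input
    recon = model(x_noisy)                         # reconstruct from the corrupted version
    return F.mse_loss(recon, x)                    # loss is measured against the clean target x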
8 Sparse autoencoder? 📊 medium
Answer: Penalize activations (e.g. KL on firing rates) so few units active per example—encourages meaningful distributed codes when over-complete.
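One common variant uses an L1 penalty on the code instead of the KL firing-rate penalty; a sketch with an illustrative weight, assuming the model exposes encoder/decoder as in Q1's sketch:
import torch.nn.functional as F

def sparse_ae_loss(model, x, l1_weight=1e-3):
    z = model.encoder(x)              # latent activations
    recon = model.decoder(z)
    sparsity = z.abs().mean()         # L1 term drives most units toward zero per example
    return F.mse_loss(recon, x) + l1_weight * sparsity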
9 VAE vs deterministic AE? 📊 medium
Answer: VAE encodes a distribution q(z|x); sample z for decoder—adds KL to prior p(z) for a generative model with smooth latent space.
10 What does the KL term do? 🔥 hard
Answer: Pulls approximate posterior toward prior (often N(0,I))—balances reconstruction vs regularization; enables sampling new z ~ p(z).
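For a diagonal Gaussian posterior N(μ, σ²) and prior N(0, I) the KL term has a closed form; a sketch assuming the encoder outputs mu and logvar:
import torch

# KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over the batch
def kl_to_standard_normal(mu, logvar):
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()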
11 Reparameterization trick? 🔥 hard
Answer: Write z = μ(x) + σ(x)⊙ε with ε~N(0,1) so gradients flow through μ,σ—needed to backprop through stochastic sampling.
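A minimal sketch, again assuming the encoder outputs mu and logvar:
import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)   # sigma = exp(logvar / 2)
    eps = torch.randn_like(std)     # eps ~ N(0, I): randomness isolated from the parameters
    return mu + eps * std           # z is differentiable w.r.t. mu and logvar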
12 Use for anomaly detection? 📊 medium
Answer: Train on normal data; high reconstruction error on test indicates out-of-distribution—used in defect and fraud pipelines.
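A hedged sketch: score test examples by per-sample reconstruction error; the threshold would be tuned on held-out normal data:
import torch

@torch.no_grad()
def anomaly_scores(model, x):
    recon = model(x)
    return ((recon - x) ** 2).flatten(1).mean(dim=1)  # per-sample MSE, batch dim kept

# is_anomaly = anomaly_scores(model, x_test) > threshold  # threshold chosen on normal validation data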
13 Link to PCA? 🔥 hard
Answer: Linear AE with MSE and tied weights can recover PCA subspace—deep nonlinear AE generalizes with stronger representational power.
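A sketch of the tied-weight linear case (illustrative, not a proof): trained with MSE, the rows of W come to span the top-k principal subspace, up to rotation:
import torch
import torch.nn as nn

class LinearTiedAE(nn.Module):
    def __init__(self, d_in, k):
        super().__init__()
        self.W = nn.Parameter(0.01 * torch.randn(k, d_in))  # one matrix shared by encoder and decoder

    def forward(self, x):
        z = x @ self.W.t()   # encode: project to k dimensions
        return z @ self.W    # decode with the transposed (tied) weights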
14 Disentangled representations? 🔥 hard
Answer: Ideal latents align with generative factors; plain AE does not guarantee this—β-VAE and supervision help.
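The β-VAE objective simply upweights the KL term; a one-function sketch reusing the closed-form KL helper from Q10's sketch (the beta value is illustrative):
def beta_vae_loss(recon_loss, mu, logvar, beta=4.0):
    # beta > 1 pressures latent dimensions toward independent, factorized codes
    return recon_loss + beta * kl_to_standard_normal(mu, logvar)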
15 AE vs GAN for generation? 📊 medium
Answer: AE/VAE optimize likelihood-like objectives; GAN uses adversarial realism—GANs often sharper; VAEs more stable latent geometry.
16 Convolutional autoencoder? ⚡ easy
Answer: Encoder stacks conv+pool/downsample; decoder uses upsample/transpose conv—standard for images.
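A minimal sketch for 1×28×28 inputs; channel counts and the transposed-conv upsampling are illustrative choices:
import torch.nn as nn

# Encoder downsamples 28x28 -> 7x7; decoder upsamples back with transposed convolutions.
conv_ae = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),                                # 28 -> 14
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),                               # 14 -> 7
    nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),    # 7 -> 14
    nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),  # 14 -> 28
)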
17 AE for super-resolution? 📊 medium
Answer: Condition decoder on low-res input or use skip connections (U-Net style)—AE ideas plus perceptual loss improve texture.
18 Embeddings for search? 📊 medium
Answer: Use encoder output as vector; nearest neighbors in latent space for similar images—may need contrastive training for metric quality.
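A hedged retrieval sketch using cosine similarity on encoder codes (assumes an encoder attribute as in Q1's sketch; a learned metric may still be needed for quality):
import torch
import torch.nn.functional as F

@torch.no_grad()
def nearest_images(model, query, gallery, k=5):
    q = F.normalize(model.encoder(query), dim=1)    # (1, d_z) unit-norm query code
    g = F.normalize(model.encoder(gallery), dim=1)  # (N, d_z) gallery codes
    return (q @ g.t()).topk(k, dim=1).indices       # indices of the k most similar images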
19 Training tips? ⚡ easy
Answer: Normalize inputs; watch for posterior collapse in VAE; use skip connections if reconstruction is blurry from pure bottleneck.
20 Limitations? 📊 medium
Answer: Reconstructions can be blurry (MSE averages); latent may be entangled; vanilla AE is not a sharp generative model without VAE/GAN hybrids.

Autoencoder Cheat Sheet

Core
  • Encoder → z → decoder
Loss
  • MSE / BCE
  • Denoise / sparse
VAE
  • KL + sample z

💡 Pro tip: Bottleneck forces compression; regularize if over-complete.
