
GANs: 20 Interview Questions

Master Generative Adversarial Networks: generator, discriminator, DCGAN, WGAN, CycleGAN, conditional GANs, mode collapse, evaluation metrics. Concise, interview-ready answers with loss formulas.

Tags: Generator · Discriminator · Latent Space · Mode Collapse · DCGAN · WGAN · CycleGAN
1 What is a Generative Adversarial Network (GAN)? Explain the core idea. ⚡ Easy
Answer: GANs consist of two networks: a generator (G) that creates fake data from noise, and a discriminator (D) that tries to distinguish real from fake. They play a minimax game: G tries to fool D, and D tries not to be fooled. At equilibrium, G produces data indistinguishable from real and D outputs 0.5 everywhere.
min_G max_D V(D,G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
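The alternating training implied by the minimax objective can be sketched in PyTorch; the tiny MLPs, 2-D "data", and dimensions here are illustrative stand-ins, not a real architecture:

```python
import torch
import torch.nn as nn

# Toy networks for illustration; a real G/D would be deeper (e.g. DCGAN).
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
D = nn.Sequential(nn.Linear(2, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 2)            # stand-in for a real data batch
z = torch.randn(64, 8)               # latent noise

# --- Discriminator step: maximize log D(x) + log(1 - D(G(z))) ---
fake = G(z).detach()                 # detach: don't backprop into G here
d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# --- Generator step: non-saturating form, maximize log D(G(z)) ---
g_loss = bce(D(G(z)), torch.ones(64, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Note the `detach()` in the discriminator step: D's loss must not push gradients into G, and vice versa the generator step reuses D only as a frozen judge.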
2 Describe the roles of generator and discriminator in detail. 📊 Medium
Answer: Generator maps latent vector z (random noise) to data space, trying to produce realistic samples. Discriminator is a binary classifier that outputs probability of input being real. They are trained alternately: D on real + fake, G to maximize D's error.
3 What is the difference between minimax loss and non-saturating loss in GANs? 🔥 Hard
Answer: Minimax: G minimizes log(1 - D(G(z))), which saturates early in training when D confidently rejects fakes, so G gets vanishing gradients. Non-saturating: G instead maximizes log(D(G(z))), which gives strong gradients exactly when G is doing poorly. Modern GANs use the non-saturating loss together with other training improvements.
L_G = -E_z[log(D(G(z)))] (non-saturating)
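The gradient difference is easy to verify numerically. Below, a single logit stands in for D(G(z)) early in training, when D is confident the sample is fake; the value -6.0 is an illustrative choice:

```python
import torch

# D's logit on a fake sample early in training: D is confident it's fake.
s = torch.tensor(-6.0, requires_grad=True)   # D(G(z)) = sigmoid(-6) ≈ 0.0025

# Minimax (saturating) generator loss: minimize log(1 - D(G(z)))
loss_sat = torch.log(1 - torch.sigmoid(s))
grad_sat, = torch.autograd.grad(loss_sat, s)

# Non-saturating generator loss: minimize -log(D(G(z)))
s2 = torch.tensor(-6.0, requires_grad=True)
loss_ns = -torch.log(torch.sigmoid(s2))
grad_ns, = torch.autograd.grad(loss_ns, s2)

print(grad_sat.item())   # ≈ -0.0025: the gradient has vanished
print(grad_ns.item())    # ≈ -0.9975: still a strong learning signal
```

The saturating gradient is -sigmoid(s), which goes to zero as D rejects the fake; the non-saturating gradient is -(1 - sigmoid(s)), which stays near -1 in exactly that regime.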
4 What is mode collapse in GANs? Why does it happen? 🔥 Hard
Answer: Mode collapse occurs when the generator produces only a limited variety of outputs, covering just a few modes of the data distribution. It happens when G finds a few samples that reliably fool D and over-optimizes them instead of exploring the full distribution; a collapsed G emits repetitive, near-identical samples.
Solutions: WGAN, minibatch discrimination, unrolled GANs
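One simple mitigation in the minibatch-discrimination family is the minibatch-stddev trick (used in ProGAN): give D a batch-level statistic so it can detect suspiciously uniform batches. A minimal sketch, with illustrative feature sizes:

```python
import torch

def minibatch_stddev(features: torch.Tensor) -> torch.Tensor:
    """Append the mean of the per-feature batch stddev as one extra
    feature, so D can detect low-variety (collapsed) batches."""
    std = features.std(dim=0)                  # stddev of each feature across the batch
    stat = std.mean().view(1, 1).expand(features.size(0), 1)
    return torch.cat([features, stat], dim=1)

varied = torch.randn(64, 16)          # healthy, diverse batch
collapsed = varied[:1].repeat(64, 1)  # every sample identical: collapse

print(minibatch_stddev(varied)[0, -1].item())     # clearly > 0
print(minibatch_stddev(collapsed)[0, -1].item())  # 0.0 — D sees the collapse
```

Because the extra feature is exactly zero for a collapsed batch, D can penalize it, which pushes G back toward diverse outputs.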
5 What are the main contributions of DCGAN? 📊 Medium
Answer: DCGAN (Deep Convolutional GAN) introduced: 1) Replace pooling with strided convolutions (D) / fractional-strided (G). 2) BatchNorm in both G and D. 3) No fully connected layers. 4) ReLU in G (except output tanh), LeakyReLU in D. Stabilized training.
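A single DCGAN-style generator block illustrates guidelines 1, 2, and 4 together; the channel counts are illustrative:

```python
import torch
import torch.nn as nn

# One DCGAN-style generator block: a fractional-strided (transposed) conv
# upsamples 2x, followed by BatchNorm and ReLU (tanh only at the output layer).
block = nn.Sequential(
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)
x = torch.randn(2, 128, 8, 8)   # N, C, H, W feature maps
y = block(x)                    # spatial size doubles: 8x8 -> 16x16
```

Stacking a few such blocks takes a reshaped latent vector up to image resolution without any pooling or fully connected layers.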
6 How does WGAN improve GAN training? 🔥 Hard
Answer: WGAN replaces the Jensen-Shannon divergence with the Earth-Mover (Wasserstein) distance, which is continuous and provides meaningful gradients even when the critic separates real and fake perfectly. Enforces a Lipschitz constraint via weight clipping (later via gradient penalty in WGAN-GP). Greatly reduces mode collapse and training instability.
V(G,D) = E_x[D(x)] - E_z[D(G(z))]; Lipschitz constraint via gradient penalty
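A critic update under the original WGAN recipe looks like the sketch below; the toy critic, 2-D batches, and clip value 0.01 follow the paper's defaults, but the data is a stand-in:

```python
import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(2, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1))
opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)  # RMSprop as in the paper

real = torch.randn(64, 2)    # stand-in real batch
fake = torch.randn(64, 2)    # stand-in G(z) batch, detached from G

# Critic maximizes E[D(x)] - E[D(G(z))]  ->  minimize the negation.
# No sigmoid: the critic outputs unbounded scores, not probabilities.
c_loss = -(critic(real).mean() - critic(fake).mean())
opt_c.zero_grad(); c_loss.backward(); opt_c.step()

# Original WGAN: enforce the Lipschitz constraint by clipping weights.
for p in critic.parameters():
    p.data.clamp_(-0.01, 0.01)
```

In practice the critic is updated several times per generator step; the clipping step is exactly what WGAN-GP later replaces (see Q15).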
7 What is a conditional GAN? Where is it used? 📊 Medium
Answer: cGAN feeds additional condition (class label, text, image) to both generator and discriminator. Enables controlled generation. Applications: Pix2Pix, text-to-image synthesis, semantic segmentation.
min_G max_D V(D,G) = E_x[log D(x|y)] + E_z[log(1 - D(G(z|y)))]
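Conditioning is usually implemented by embedding the label and concatenating it with z. A minimal sketch; the class `CondGenerator` and all dimensions are illustrative:

```python
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    """Sketch of a cGAN generator: the condition y (a class label) is
    embedded and concatenated with the latent z before generation."""
    def __init__(self, z_dim=8, n_classes=10, out_dim=2):
        super().__init__()
        self.embed = nn.Embedding(n_classes, n_classes)
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_classes, 32), nn.ReLU(),
            nn.Linear(32, out_dim),
        )

    def forward(self, z, y):
        return self.net(torch.cat([z, self.embed(y)], dim=1))

G = CondGenerator()
z = torch.randn(16, 8)
y = torch.randint(0, 10, (16,))   # class labels to condition on
x_fake = G(z, y)                  # samples conditioned on y
```

The discriminator receives the same condition, so it learns to reject samples that are realistic but mismatched with y.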
8 How does CycleGAN perform unpaired image translation? 🔥 Hard
Answer: CycleGAN uses two generators (G: X→Y, F: Y→X) and two discriminators. Key: cycle-consistency loss – translating X→Y→X should reconstruct the original image. No paired data needed. An identity loss additionally helps preserve color.
L_cyc = E_x[||F(G(x)) - x||_1] + E_y[||G(F(y)) - y||_1]
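Computing the CycleGAN losses is mechanical once the two generators exist; here plain linear layers stand in for them, and the 0.5 identity-loss weight is illustrative:

```python
import torch
import torch.nn as nn

# Stand-ins for CycleGAN's two generators: G maps X->Y, F maps Y->X.
G = nn.Linear(3, 3)
F = nn.Linear(3, 3)

x = torch.randn(8, 3)   # batch from domain X
y = torch.randn(8, 3)   # batch from domain Y

l1 = nn.L1Loss()
# Cycle consistency: X -> Y -> X and Y -> X -> Y should reconstruct.
loss_cyc = l1(F(G(x)), x) + l1(G(F(y)), y)
# Identity loss: a target-domain image passed through should barely change.
loss_idt = l1(G(y), y) + l1(F(x), x)
total = loss_cyc + 0.5 * loss_idt   # weight is illustrative
```

The full objective adds the two adversarial losses from the discriminators on top of `total`.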
9 What is latent space in GANs? Why is interpolation smooth? 📊 Medium
Answer: Latent space (z) is low-dimensional input to generator, typically Gaussian. G learns to map continuous z to realistic images; interpolating between z vectors yields semantically smooth transitions, showing G has learned meaningful representations.
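Latent interpolation itself is one line of arithmetic; each interpolated row would then be fed to a trained generator (omitted here):

```python
import torch

z1, z2 = torch.randn(8), torch.randn(8)   # two latent vectors
alphas = torch.linspace(0, 1, 5)
# Linear interpolation between z1 and z2; feed each row to G for images.
interp = torch.stack([(1 - a) * z1 + a * z2 for a in alphas])
```

For Gaussian latents, spherical interpolation (slerp) is often preferred because linear blends of Gaussian samples fall in lower-density regions of the prior.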
10 What is unique about StyleGAN architecture? 🔥 Hard
Answer: StyleGAN replaces the direct latent input with a learned constant tensor; a mapping network transforms z into an intermediate latent space w, and AdaIN (adaptive instance normalization) injects the style at each layer. Per-layer noise adds stochastic variation. Enables disentangled control (coarse vs. fine styles).
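The AdaIN operation is small enough to write out; in StyleGAN the scale and bias come from learned affine maps of w, which are replaced by random tensors in this sketch:

```python
import torch

def adain(content: torch.Tensor, style_scale, style_bias, eps=1e-5):
    """Adaptive instance norm: normalize each channel per sample, then
    apply a style-derived scale and bias (in StyleGAN, derived from w)."""
    mean = content.mean(dim=(2, 3), keepdim=True)
    std = content.std(dim=(2, 3), keepdim=True) + eps
    normalized = (content - mean) / std
    return style_scale * normalized + style_bias

feat = torch.randn(4, 16, 8, 8)         # N, C, H, W feature maps
scale = torch.rand(4, 16, 1, 1) + 0.5   # per-channel style scale (stand-in for w)
bias = torch.randn(4, 16, 1, 1)         # per-channel style bias (stand-in for w)
styled = adain(feat, scale, bias)
```

Because normalization wipes out the incoming per-channel statistics, the style fully determines them, which is what makes per-layer style mixing possible.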
11 How are GANs evaluated? Explain FID and Inception Score. 🔥 Hard
Answer: Inception Score (IS): uses a pretrained Inception network; measures image quality and diversity (high score if predictions are confident per image and varied across images). Fréchet Inception Distance (FID): Wasserstein-2 distance between Gaussians fitted to real and fake Inception features; lower is better and more robust than IS.
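Given feature means and covariances (the Inception feature extraction is assumed done), FID is the closed-form Fréchet distance between two Gaussians. A NumPy sketch, using the symmetric rearrangement Tr((S1·S2)^1/2) = Tr((S2^1/2·S1·S2^1/2)^1/2) so the matrix square root stays on a symmetric PSD matrix:

```python
import numpy as np

def sqrtm_psd(a):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.T

def fid(mu1, sigma1, mu2, sigma2):
    """||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})."""
    diff = mu1 - mu2
    s2_half = sqrtm_psd(sigma2)
    covmean = sqrtm_psd(s2_half @ sigma1 @ s2_half)
    return diff @ diff + np.trace(sigma1 + sigma2 - 2 * covmean)

d = 4
mu, sigma = np.zeros(d), np.eye(d)
print(fid(mu, sigma, mu, sigma))        # identical distributions -> 0.0
print(fid(mu, sigma, mu + 1.0, sigma))  # shifted mean -> 4.0 (= ||diff||^2)
```

In a real evaluation, mu and sigma are estimated from ~50k Inception pool features per distribution; small sample sizes bias FID upward.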
12 Why do GANs suffer from vanishing gradients? 📊 Medium
Answer: When D becomes too strong (perfectly classifies), log(1-D(G(z))) saturates to 0, giving G almost no gradient. Solutions: non-saturating loss, WGAN (critic scores not probabilities), label smoothing, or making D weaker.
13 What is one-sided label smoothing? Why only for real labels? 📊 Medium
Answer: Replace real labels (1) with soft values like 0.9. Prevents D from becoming overconfident, providing smoother gradients. Only smooth real labels; smoothing fake labels (0→0.1) encourages D to push G samples away, harming training.
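In code, one-sided smoothing is just a change of target tensors; the random logits below stand in for a real discriminator's outputs:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
d_real_logits = torch.randn(64, 1)   # stand-in: D's logits on a real batch
d_fake_logits = torch.randn(64, 1)   # stand-in: D's logits on a fake batch

# One-sided smoothing: real targets become 0.9, fake targets stay exactly 0.
real_targets = torch.full((64, 1), 0.9)
fake_targets = torch.zeros(64, 1)
d_loss = bce(d_real_logits, real_targets) + bce(d_fake_logits, fake_targets)
```

Only the real-label line changes relative to vanilla GAN training; the generator's loss is untouched.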
14 Compare GANs and VAEs. 📊 Medium
GANs: Adversarial training, sharp realistic images, no explicit likelihood, prone to mode collapse, harder to train.
VAEs: Variational lower bound, maximizes likelihood, covers all modes (but blurry outputs), stable training, structured latent space.
15 Why is weight clipping problematic in WGAN? How is it fixed? 🔥 Hard
Answer: Weight clipping forces critic to lie in narrow space, leading to capacity underuse and exploding/vanishing gradients. WGAN-GP replaces it with gradient penalty: penalize if gradient norm deviates from 1 (Lipschitz constraint).
gp = lambda_gp * ((grad_norm - 1) ** 2).mean()  # lambda_gp = 10 in the paper; `lambda` is a reserved word in Python
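The penalty term above can be expanded into a full function. Per WGAN-GP, the gradient norm is evaluated at random interpolations between real and fake samples; the toy critic and 2-D batches are illustrative:

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """WGAN-GP: penalize the critic's gradient norm deviating from 1,
    evaluated on random interpolations between real and fake samples."""
    eps = torch.rand(real.size(0), 1)             # per-sample mixing weight
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    grads, = torch.autograd.grad(scores.sum(), x_hat, create_graph=True)
    grad_norm = grads.norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()

critic = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.LeakyReLU(0.2),
                             torch.nn.Linear(16, 1))
gp = gradient_penalty(critic, torch.randn(32, 2), torch.randn(32, 2))
```

`create_graph=True` matters: the penalty itself is backpropagated during training, so the gradient computation must stay differentiable. Note WGAN-GP also drops BatchNorm in the critic, since the penalty is defined per sample.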
16 What is spectral normalization in GANs? 🔥 Hard
Answer: Normalizes each weight matrix by its largest singular value, enforcing a Lipschitz constraint (spectral norm = 1). Introduced for the discriminator in SNGAN; later work such as SAGAN applies it to both G and D. Stabilizes training without heavy hyperparameter tuning.
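PyTorch ships this as a wrapper, so applying it to a toy discriminator is one call per layer:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Wrap each weight layer of D with spectral normalization; the hook divides
# the weight by its largest singular value (estimated by power iteration).
D = nn.Sequential(
    spectral_norm(nn.Linear(2, 64)), nn.LeakyReLU(0.2),
    spectral_norm(nn.Linear(64, 1)),
)
out = D(torch.randn(8, 2))   # each forward pass refines the power-iteration estimate
```

One power-iteration step per forward pass keeps the overhead negligible, which is why spectral norm became a near-default stabilizer for GAN discriminators.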
17 Why introduce attention in GANs? 📊 Medium
Answer: SAGAN uses self-attention to model long-range dependencies (global features) instead of only local convolutions. Improves image quality in complex scenes (e.g., ImageNet) by capturing relationships between distant regions.
18 Explain feature matching technique in GANs. 🔥 Hard
Answer: G is trained to match the expected features (intermediate activations) of real data from D, not just final D output. Minimizes L2 distance between real/fake feature means. Helps prevent overtraining on current D.
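A sketch of the loss, assuming D has been split into a feature extractor and a classifier head (the split point and dimensions are illustrative):

```python
import torch
import torch.nn as nn

# Split D into an intermediate feature extractor and a final head.
features = nn.Sequential(nn.Linear(2, 16), nn.LeakyReLU(0.2))
head = nn.Linear(16, 1)

real = torch.randn(64, 2)
fake = torch.randn(64, 2)   # stand-in for G(z)

# Feature matching: G minimizes the distance between the mean intermediate
# features of real and generated batches, instead of the head's verdict alone.
f_real = features(real).mean(dim=0).detach()  # real statistics are a fixed target
f_fake = features(fake).mean(dim=0)
fm_loss = torch.sum((f_real - f_fake) ** 2)
```

Because G only has to match batch statistics rather than defeat the current D, it is less prone to chasing whatever narrow weakness D currently has.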
19 What is progressive growing in GANs? 🔥 Hard
Answer: Start training with low-resolution images, gradually add layers to increase resolution. Stabilizes high-resolution GAN training (e.g., 1024x1024). Both G and D grow simultaneously. Used in StyleGAN, ProGAN.
20 What is Nash equilibrium in context of GANs? Do we achieve it? 🔥 Hard
Answer: At the Nash equilibrium, G's distribution equals the real data distribution and D outputs 1/2 everywhere, so neither player can improve unilaterally. In practice, GANs oscillate and rarely converge to the exact equilibrium; we aim for an approximate Nash. Techniques like consensus optimization try to find stable points.

GANs – Interview Cheat Sheet

Generator
  • Creates realistic fake data
  • Latent input z
  • Mode collapse risk
Discriminator
  • Binary classifier: real vs. fake
  • Provides gradient signal to G
Advanced GANs
  • DCGAN: convolutional architecture guidelines
  • WGAN: Wasserstein distance, stable training
  • CycleGAN: unpaired translation
  • StyleGAN: style control
Metrics
  • FID: Fréchet distance (lower is better)
  • IS: Inception Score

Verdict: "GANs: an adversarial game, generator vs. discriminator. From latent z to photorealistic samples; watch for mode collapse, use WGAN or spectral norm."