Computer Vision Interview
20 essential Q&A
Updated 2026
RetinaNet: 20 Essential Q&A
Focal loss and feature pyramids for dense classification without drowning in easy negatives.
~11 min read
20 questions
Advanced
focal loss · FPN · imbalance · one-stage
1
What is RetinaNet?
📊 medium
Answer: A one-stage detector with an FPN backbone and focal loss on the dense classification head—it closes the accuracy gap to two-stage detectors without region proposals.
2
Focal loss intuition?
🔥 hard
Answer: Down-weights easy negatives (well-classified background) so training focuses on hard examples—keeps the CE loss summed over easy backgrounds from overwhelming the gradient.
# FL = -alpha * (1 - p_t)**gamma * log(p_t)  # p_t = prob of the ground-truth class
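Expanding the one-liner above into a runnable NumPy sketch (a minimal per-anchor binary form; production implementations work on logits for numerical stability):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Per-anchor binary focal loss (NumPy sketch, not a reference impl).

    p: predicted probabilities after sigmoid, shape (N,)
    y: binary ground-truth labels in {0, 1}, shape (N,)
    """
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)              # prob of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)  # class-balance weight
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# An easy, confident negative contributes almost nothing; a hard positive dominates:
easy_neg = focal_loss(np.array([0.01]), np.array([0]))  # p_t = 0.99
hard_pos = focal_loss(np.array([0.10]), np.array([1]))  # p_t = 0.10
```

Sanity check: with γ=0 and α=0.5 this reduces to half of plain cross-entropy.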
3
Role of γ (gamma)?
🔥 hard
Answer: The focusing parameter: (1 − p_t)^γ shrinks the loss for high-confidence correct predictions; γ=0 recovers plain CE; the paper's typical setting is γ=2.
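A quick numeric illustration of the focusing effect (a standalone demo, not from the paper): for a well-classified example with p_t = 0.9, the modulating factor (1 − p_t)^γ scales its loss.

```python
# Effect of the focusing parameter gamma on a well-classified example (p_t = 0.9).
p_t = 0.9
factors = {gamma: (1 - p_t) ** gamma for gamma in (0.0, 1.0, 2.0, 5.0)}
for gamma, factor in factors.items():
    print(f"gamma={gamma}: loss scaled by {factor:.0e}")
# gamma=0 leaves CE unchanged; gamma=2 shrinks this example's loss ~100x.
```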
4
Why imbalance in one-stage?
📊 medium
Answer: ~100k anchors per image but only a few positives—vanilla CE is dominated by easy background examples.
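A back-of-envelope count behind the ~100k figure, assuming a typical 800×800 input and the standard configuration (strides 8–128 for P3–P7, 9 anchors per location):

```python
import math

# Rough anchor count for an 800x800 input (assumed standard RetinaNet config:
# strides 8-128 for P3-P7, 9 anchors per spatial location).
H = W = 800
strides = [8, 16, 32, 64, 128]       # P3 .. P7
anchors_per_loc = 9                  # 3 scales x 3 aspect ratios

total = sum(math.ceil(H / s) * math.ceil(W / s) * anchors_per_loc
            for s in strides)
print(total)  # ~120k anchors; typically only dozens match ground-truth boxes
```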
5
How does FPN help RetinaNet?
📊 medium
Answer: Predicts at multiple pyramid levels P3–P7 with shared heads—each level responsible for objects in a scale range.
6
Subnet design?
📊 medium
Answer: Separate small convolutional classification and box-regression subnets applied at every pyramid level—four 3×3 conv layers each in the original paper.
7
Anchors?
⚡ easy
Answer: Similar to the RPN: multiple scales/aspect ratios per location; the classification head predicts per-class scores (sigmoid per class) and the regression head predicts box deltas.
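A sketch of per-level anchor generation under the standard 3-scale × 3-ratio layout (the `base_size` per level, e.g. 32 at P3, is an assumption of the common config, not fixed by the architecture):

```python
import numpy as np

def level_anchors(base_size, scales=(1.0, 2 ** (1 / 3), 2 ** (2 / 3)),
                  ratios=(0.5, 1.0, 2.0)):
    """Return (w, h) for the 9 anchors at one FPN level (sketch)."""
    out = []
    for s in scales:
        area = (base_size * s) ** 2
        for r in ratios:             # r = h / w
            w = np.sqrt(area / r)
            out.append((w, w * r))   # preserves the target area for every ratio
    return np.array(out)

anchors = level_anchors(32)
print(anchors.shape)  # (9, 2)
```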
8
Box regression loss?
⚡ easy
Answer: Smooth L1 on positive anchors only—standard in the Faster R-CNN lineage.
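A minimal smooth L1 sketch (the `beta` threshold of 1.0 is a common default, not mandated by the paper):

```python
import numpy as np

def smooth_l1(x, beta=1.0):
    """Smooth L1 on residuals x = pred - target: quadratic near zero
    (stable gradients), linear for outliers."""
    ax = np.abs(x)
    return np.where(ax < beta, 0.5 * ax ** 2 / beta, ax - 0.5 * beta)

vals = smooth_l1(np.array([0.5, 2.0]))
# 0.5 -> 0.125 (quadratic branch), 2.0 -> 1.5 (linear branch)
```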
9
vs SSD?
📊 medium
Answer: Both are multi-scale one-stage detectors; RetinaNet's focal loss directly addresses the training imbalance that SSD tackled partly with hard-negative mining.
10
vs two-stage?
📊 medium
Answer: No separate proposal stage—simpler pipeline; historically competitive mAP on COCO with proper FPN + focal loss.
11
Training tips?
📊 medium
Answer: Longer schedules help; careful anchor matching; synchronized BN on multi-GPU for large batch stability.
12
Inference cost?
⚡ easy
Answer: Single backbone forward + per-level heads + NMS—faster than two-stage but still heavier than tiny YOLO variants.
13
Anchor-free successors?
🔥 hard
Answer: FCOS, CenterNet, and DETR reduce or remove hand-designed anchors—focal loss ideas still influence the classification heads of some of these successors.
14
Why sigmoid per class?
📊 medium
Answer: Treats detection as K independent binary classifiers—handles rare multi-label cases and avoids softmax's mutual exclusivity (no explicit background class needed).
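A tiny demo of the K-independent-sigmoid view (the scores are made up): with independent sigmoids, "background" is simply all class scores low.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One anchor, K = 3 classes. No softmax competition between classes and
# no explicit background class: everything low means background.
logits = np.array([-4.0, -5.0, -3.5])
probs = sigmoid(logits)
print(probs.max())  # well under any detection threshold -> background
```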
15
Unified loss?
⚡ easy
Answer: Sum of the focal classification loss and the smooth L1 regression loss over all locations (regression masked to positively assigned anchors).
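One way to sketch the masking and normalization (normalizing by the number of positive anchors follows common implementations; the per-anchor loss values below are made up):

```python
import numpy as np

def total_loss(cls_loss, reg_loss, positive_mask):
    """Combine per-anchor losses: classification summed over all anchors,
    regression only over positives, both normalized by the positive count."""
    num_pos = max(int(positive_mask.sum()), 1)   # avoid divide-by-zero
    cls = cls_loss.sum() / num_pos
    reg = (reg_loss * positive_mask).sum() / num_pos
    return cls + reg

# Made-up per-anchor values: two easy negatives, one positive.
cls_l = np.array([0.02, 0.01, 0.80])
reg_l = np.array([0.00, 0.00, 0.30])
pos   = np.array([0, 0, 1])
loss = total_loss(cls_l, reg_l, pos)
```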
16
Variants of focal loss?
🔥 hard
Answer: Quality focal loss, balanced variants, GHM—each adjusts how hard vs. easy examples are weighted.
17
IoU-aware classification?
🔥 hard
Answer: Some heads predict joint IoU quality with class to better rank detections—post-RetinaNet refinement.
18
Historical COCO note?
⚡ easy
Answer: RetinaNet showed one-stage could match two-stage mAP around 2017—important milestone before transformer detectors.
19
Limitations?
📊 medium
Answer: Many hyperparameters (α, γ, anchor design); dense preds still need NMS; superseded in some tracks by newer architectures.
20
When reuse focal loss?
⚡ easy
Answer: Any extreme class imbalance in dense prediction—segmentation, keypoint heatmaps, or custom detectors.
RetinaNet Cheat Sheet
Loss
- FL = −α(1−p_t)^γ log p_t
- Focuses training on hard examples
Backbone
- FPN P3–P7
Type
- One-stage dense
💡 Pro tip: Focal loss fights easy-negative dominance in dense classification.
Full tutorial track
Go deeper with the matching tutorial chapter and code examples.