Transfer Learning MCQ · test your knowledge
From fine‑tuning to domain adaptation – 15 questions covering pretrained models, feature extraction, and best practices.
Transfer learning: leverage pretrained models
Transfer learning lets you take a model trained on a large dataset (such as ImageNet) and adapt it to a new, often much smaller, task. It dramatically reduces training time and the amount of labeled data required. This MCQ covers the core concepts: feature extraction, fine‑tuning, domain adaptation, and popular architectures such as ResNet, BERT, and VGG.
Why transfer?
Transfer learning is the go‑to approach in computer vision and NLP when you have limited data. It also speeds up convergence and often yields better performance than training from scratch.
Transfer learning glossary – key concepts
Feature extraction
Freeze pretrained layers, use them as fixed feature extractors, and train only the new classifier head.
Fine‑tuning
Unfreeze some or all of the pretrained layers and continue training on the new task with a low learning rate.
Pretrained CNN (ResNet, VGG)
Models trained on ImageNet, often used for vision transfer learning.
Pretrained NLP (BERT, GPT)
Transformer models pretrained on massive text corpora, fine‑tuned for downstream tasks.
Domain adaptation
Special case of transfer learning where source and target domains differ but tasks are similar.
Catastrophic forgetting
When fine‑tuned too aggressively (e.g., with a high learning rate or too many unfrozen layers), the model may overwrite, and thus forget, knowledge learned during pretraining.
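One common mitigation is gradual unfreezing: start with only the layers nearest the output trainable and unfreeze earlier groups as training progresses. A hypothetical helper (the function name and `epochs_per_group` schedule are illustrative, not from any library):

```python
import torch.nn as nn

def gradually_unfreeze(layer_groups, epoch, epochs_per_group=2):
    """Unfreeze layer groups from the output end downward as epochs pass."""
    n = min(len(layer_groups), 1 + epoch // epochs_per_group)
    for group in layer_groups:        # reset: everything frozen
        for p in group.parameters():
            p.requires_grad = False
    for group in layer_groups[-n:]:   # unfreeze the top n groups
        for p in group.parameters():
            p.requires_grad = True

# Usage with three stand-in "stages":
groups = [nn.Linear(4, 4) for _ in range(3)]
gradually_unfreeze(groups, epoch=0)  # only the last group trains
gradually_unfreeze(groups, epoch=2)  # last two groups train
```

Keeping early layers frozen longest protects the generic low‑level features that are most worth preserving.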
Freezing layers
Setting `requires_grad=False` for pretrained layers to keep their weights unchanged during training.
# PyTorch transfer learning example (feature extraction)
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False  # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head
Common transfer learning interview questions
- What's the difference between feature extraction and fine‑tuning?
- Why do we typically use a lower learning rate for pretrained layers?
- What is catastrophic forgetting and how do you mitigate it?
- How would you adapt a ResNet trained on ImageNet to a grayscale medical image task?
- Explain domain adaptation and give an example.
- When would you unfreeze layers progressively during fine‑tuning?