Advanced CNN Architectures MCQ

ResNet skip connections, MobileNet depthwise separable convolutions, and EfficientNet scaling.

ResNet MCQ

Residual learning

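Instead of having a stack of layers learn a target mapping H(x) directly, ResNet has each residual block learn the residual F(x) = H(x) − x, so the block outputs F(x) + x when the shapes match. Identity shortcuts give gradients a direct path to earlier layers and ease optimization, which makes much deeper networks trainable, both for ImageNet classification and as detection backbones.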

Why F(x)+x

If the desired mapping is close to identity, learning small perturbations F is easier than learning a full mapping from scratch.
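The identity argument can be sketched in a few lines. This is a minimal illustration, assuming plain Python lists stand in for feature tensors; `residual_block`, `zero_branch`, and `small_perturbation` are hypothetical names, not part of any library. When the branch F outputs zeros, the block reduces to the identity, so a near-identity target only requires F to learn a small correction.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, branch: Callable[[Vector], Vector]) -> Vector:
    """Return F(x) + x, where `branch` plays the role of the learned F."""
    fx = branch(x)
    if len(fx) != len(x):
        # With mismatched shapes the plain identity shortcut cannot be added.
        raise ValueError("shape mismatch: the shortcut would need a projection")
    return [f + xi for f, xi in zip(fx, x)]


def zero_branch(x: Vector) -> Vector:
    """An F whose weights are all zero: it contributes nothing."""
    return [0.0 for _ in x]


def small_perturbation(x: Vector) -> Vector:
    """An F that nudges each feature by 1% of its value (illustrative)."""
    return [0.01 * xi for xi in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_branch)        # same values as x
nudged_out = residual_block(x, small_perturbation)   # x plus a small correction
```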

Key ideas

Residual block

Two or three conv layers plus a shortcut that adds the block's input to their output.

Projection shortcut

1×1 conv (strided when needed) applied to x on the shortcut when the channel count or spatial size changes.

Bottleneck

Reduces cost: a 1×1 conv narrows the channels, a 3×3 conv runs at the reduced width, and a 1×1 conv restores the original width.

Batch norm

Stabilizes training of very deep stacks (used in original ResNet).
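To see why the bottleneck design saves cost, the weight counts can be compared directly. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, biases and batch-norm parameters are ignored, and the widths (256 outer channels, 64 inner channels) are chosen as an example.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k conv, ignoring bias and batch-norm parameters."""
    return c_in * c_out * k * k


def bottleneck_params(c: int, c_mid: int) -> int:
    """1x1 reduce (c -> c_mid), 3x3 at c_mid, then 1x1 expand (c_mid -> c)."""
    return (
        conv_params(c, c_mid, 1)
        + conv_params(c_mid, c_mid, 3)
        + conv_params(c_mid, c, 1)
    )


def basic_params(c: int) -> int:
    """Two 3x3 convs that both run at the full width c."""
    return 2 * conv_params(c, c, 3)


bottleneck = bottleneck_params(256, 64)  # 69632 weights
basic = basic_params(256)                # 1179648 weights
savings = basic / bottleneck             # roughly 17x fewer weights
```

Most of the saving comes from running the 3×3 conv at the narrow width, since its cost grows with the product of input and output channels.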

Deep stack

stem → stages of residual blocks → global average pool → FC (for detection, a head on the stage feature maps replaces pool + FC)

Pro tip: When implementing, verify tensor shapes: if H×W or C differs between the block's input and the output of its conv layers, the shortcut needs a projection conv.
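The shape check can be written down as a small helper. This is a sketch, assuming shapes are (channels, height, width) tuples and that the branch uses "same"-style padding so a stride of s gives ceil(H / s) rows; `block_output_shape` and `shortcut_kind` are hypothetical names for illustration.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def block_output_shape(in_shape: Shape, c_out: int, stride: int) -> Shape:
    """Shape produced by the conv branch, assuming padding keeps ceil(H / stride)."""
    _, h, w = in_shape
    # -(-a // b) is ceiling division for positive integers.
    return (c_out, -(-h // stride), -(-w // stride))


def shortcut_kind(in_shape: Shape, out_shape: Shape) -> str:
    """Identity works only when every dimension matches; otherwise project."""
    return "identity" if in_shape == out_shape else "projection"


x_shape = (64, 56, 56)

# Same width and stride 1: x can be added to the branch output as-is.
same = shortcut_kind(x_shape, block_output_shape(x_shape, c_out=64, stride=1))

# Doubling channels and halving resolution: the shortcut needs a strided 1x1 conv.
down_shape = block_output_shape(x_shape, c_out=128, stride=2)
changed = shortcut_kind(x_shape, down_shape)
```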

MobileNet MCQ

Efficient CNNs

MobileNet factorizes a standard convolution into a depthwise convolution, which applies one spatial filter per input channel, followed by a pointwise 1×1 convolution that mixes channels. With 3×3 kernels this cuts multiply-adds by roughly 8 to 9 times, with a similar reduction in parameters. A width multiplier and an input resolution multiplier offer further accuracy–latency tradeoffs.

Depthwise separable

Cost per output position is roughly channels_in × k² + channels_in × channels_out multiply-adds, vs channels_in × channels_out × k² for a standard k×k conv.
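A short worked example makes the comparison concrete. The sketch below counts multiply-adds for one layer, assuming stride 1 and an output feature map of h × w; the function names and the layer sizes (256 channels, 14 × 14 map) are illustrative assumptions.

```python
def standard_conv_madds(c_in: int, c_out: int, k: int, h: int, w: int) -> int:
    """Multiply-adds for a standard k x k conv over an h x w output map."""
    return h * w * c_in * c_out * k * k


def separable_conv_madds(c_in: int, c_out: int, k: int, h: int, w: int) -> int:
    """Depthwise k x k (one filter per input channel) plus pointwise 1x1."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


std = standard_conv_madds(256, 256, 3, 14, 14)
sep = separable_conv_madds(256, 256, 3, 14, 14)

# The ratio simplifies to 1 / c_out + 1 / k**2, independent of h, w, and c_in.
ratio = sep / std
```

For 3×3 kernels the 1 / k² term dominates once the output channel count is large, which is where the roughly ninefold reduction comes from.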

Key ideas

Depthwise

Each channel convolved independently with its own spatial kernel.

Pointwise

1×1 conv combines channels: a linear mix at each pixel.

Width multiplier α

Uniformly thins channel counts across the network; compute and parameters shrink roughly as α².

Resolution

Resolution multiplier ρ shrinks the input and every feature map after it; compute falls roughly as ρ².
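The effect of the two multipliers on a single depthwise separable layer can be sketched as follows. This is an illustration under assumptions: `separable_layer_madds` is a hypothetical helper, channel counts and the feature-map side are truncated to integers after scaling, and the layer sizes are arbitrary.

```python
def separable_layer_madds(
    c_in: int, c_out: int, k: int, size: int,
    alpha: float = 1.0, rho: float = 1.0,
) -> int:
    """Multiply-adds of one depthwise separable layer on a square feature map.

    alpha scales both channel counts; rho scales the feature-map side length.
    """
    ci = int(alpha * c_in)
    co = int(alpha * c_out)
    s = int(rho * size)
    depthwise = s * s * ci * k * k
    pointwise = s * s * ci * co
    return depthwise + pointwise


base = separable_layer_madds(256, 256, 3, 14)
half_width = separable_layer_madds(256, 256, 3, 14, alpha=0.5)
half_res = separable_layer_madds(256, 256, 3, 14, rho=0.5)

width_ratio = half_width / base  # slightly above 0.25: depthwise part scales with alpha only
res_ratio = half_res / base      # 0.25: both parts scale with rho squared
```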

MobileNet block

depthwise 3×3 → BN → ReLU → pointwise 1×1 → BN → ReLU

Pro tip: MobileNetV2 adds inverted residuals and linear bottlenecks, but it is still built on depthwise separable convolutions.
