Updated 2026
MobileNet: 20 Essential Q&A
Depthwise + pointwise convolutions—building accurate vision models under tight FLOP and latency budgets.
~11 min read
20 questions
Advanced
Tags: depthwise · pointwise · α width · inverted residual
1
What is MobileNet?
📊 medium
Answer: Efficient CNN family for mobile/edge using depthwise separable convolutions to cut FLOPs and parameters vs standard convs.
2
What is depthwise convolution?
📊 medium
Answer: Each input channel has its own spatial filter—no mixing across channels; drastically fewer params than full conv per output channel.
3
What is pointwise convolution?
📊 medium
Answer: 1×1 conv after depthwise—mixes channels at each spatial location, like per-pixel linear layer across depth.
# Depthwise: groups=in_channels; Pointwise: 1x1 conv
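A minimal sketch of the two operations in plain Python (no framework; channel counts, sizes, and function names here are illustrative): depthwise filters each channel with its own spatial kernel and never mixes channels, while pointwise is a per-pixel linear layer across the channel dimension.

```python
def depthwise_3x3(x, filters):
    """x: [C][H][W]; filters: one 3x3 kernel per channel (no channel mixing)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * (W - 2) for _ in range(H - 2)] for _ in range(C)]
    for c in range(C):                      # each channel uses only its own kernel
        for i in range(H - 2):
            for j in range(W - 2):
                out[c][i][j] = sum(
                    x[c][i + di][j + dj] * filters[c][di][dj]
                    for di in range(3) for dj in range(3)
                )
    return out

def pointwise(x, weight):
    """weight: [C_out][C_in]; a per-pixel linear layer across depth."""
    C_in, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(weight[o][c] * x[c][i][j] for c in range(C_in))
              for j in range(W)] for i in range(H)]
            for o in range(len(weight))]
```

Running `pointwise(depthwise_3x3(x, f), w)` on a tiny tensor makes the factorization concrete: spatial filtering first, channel mixing second.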
4
Complexity vs standard conv?
🔥 hard
Answer: Depthwise separable conv costs roughly (1/C_out + 1/k²) of a standard k×k conv—huge savings for large kernels and many output channels.
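The ratio follows directly from counting multiply-accumulates; the layer dimensions below are made up for illustration, but the algebra is exact for stride-1, same-padding convs:

```python
# MAC counts for a k x k conv on an H x W feature map (stride 1, "same" padding).
def standard_conv_macs(k, c_in, c_out, h, w):
    return k * k * c_in * c_out * h * w

def dw_separable_macs(k, c_in, c_out, h, w):
    depthwise = k * k * c_in * h * w    # one k x k filter per input channel
    pointwise = c_in * c_out * h * w    # 1x1 conv mixes channels
    return depthwise + pointwise

k, c_in, c_out, h, w = 3, 256, 256, 14, 14
std = standard_conv_macs(k, c_in, c_out, h, w)
sep = dw_separable_macs(k, c_in, c_out, h, w)
print(sep / std)              # measured cost ratio
print(1 / c_out + 1 / k**2)   # predicted 1/C_out + 1/k^2 — identical
```

For a 3×3 conv the 1/k² term dominates, so the separable version costs roughly one-ninth of the standard conv.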
5
Width multiplier α?
⚡ easy
Answer: Uniformly thin every layer’s channels by α ∈ (0,1]—params and FLOPs shrink roughly as α², giving a smooth accuracy–latency tradeoff knob for deployment targets.
6
Resolution multiplier?
📊 medium
Answer: Train/infer on smaller input resolution ρ—quadratic FLOP savings with predictable accuracy drop.
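Both multipliers can be checked with the same cost formula; the layer shape below is a hypothetical example. The pointwise term (which dominates) scales as α² because both C_in and C_out shrink, while the resolution multiplier ρ scales all terms exactly quadratically:

```python
# FLOPs of one depthwise-separable layer as a function of width multiplier
# alpha and resolution multiplier rho (both in (0, 1]).
def layer_macs(alpha, rho, k=3, c_in=64, c_out=128, h=56, w=56):
    ci, co = int(alpha * c_in), int(alpha * c_out)
    hh, ww = int(rho * h), int(rho * w)
    return k * k * ci * hh * ww + ci * co * hh * ww

base = layer_macs(1.0, 1.0)
print(layer_macs(0.5, 1.0) / base)   # ~0.27x: pointwise term ~alpha^2, depthwise ~alpha
print(layer_macs(1.0, 0.5) / base)   # exactly 0.25x: all terms scale with rho^2
```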
7
MobileNetV2 inverted residual?
🔥 hard
Answer: Expand low-dim bottleneck → depthwise → project back—shortcut connects thin bottlenecks (memory efficient), opposite of classical residual wide→narrow.
8
Why expansion t?
📊 medium
Answer: Depthwise convs need rich features to filter—expand channels by factor t before the DW conv, then project back with a linear 1×1 (no ReLU), since ReLU would destroy information in the low-dimensional bottleneck.
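A quick parameter-count sketch of the expand → depthwise → project structure (the block dimensions are illustrative, biases ignored); note how cheap the depthwise stage is even at the expanded width:

```python
# Parameter count for one inverted residual block in the V2 style:
# 1x1 expand -> k x k depthwise -> linear 1x1 project.
def inverted_residual_params(c_in, c_out, t, k=3):
    expanded = t * c_in
    expand_1x1 = c_in * expanded      # 1x1 conv widens the bottleneck by t
    depthwise = k * k * expanded      # one k x k filter per expanded channel
    project_1x1 = expanded * c_out    # linear projection back to thin bottleneck
    return expand_1x1 + depthwise + project_1x1

print(inverted_residual_params(c_in=24, c_out=24, t=6))
```

The shortcut (when c_in == c_out and stride 1) connects the thin bottlenecks, so the expensive wide tensor never needs to be kept around for the skip path.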
9
ReLU6?
📊 medium
Answer: ReLU clipped at 6—bounds the activation range, which keeps fixed-point/quantized deployment well-behaved; still common in mobile architectures.
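The definition is one line; the bounded output range [0, 6] is what matters for low-precision arithmetic:

```python
def relu6(x):
    """ReLU clipped at 6: output always lies in [0, 6]."""
    return min(max(x, 0.0), 6.0)
```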
10
MobileNet + SSD?
📊 medium
Answer: Lightweight object detectors attach SSD heads to MobileNet stages—real-time on phones with acceptable mAP on constrained devices.
11
vs ShuffleNet?
🔥 hard
Answer: ShuffleNet uses channel shuffle after grouped convs—different structural trick; both target efficient inference.
12
vs EfficientNet?
📊 medium
Answer: EfficientNet scales depth/width/resolution together (compound scaling)—often better Pareto frontier; MobileNet simpler family widely supported in runtimes.
13
MobileNetV3?
🔥 hard
Answer: Uses platform-aware NAS + NetAdapt for layer choices, h-swish activation in later layers, and squeeze-and-excitation modules—improved accuracy per FLOP over V2.
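h-swish replaces the sigmoid in swish with a piecewise-linear approximation built from ReLU6, so it is cheap and quantization-friendly; a minimal sketch:

```python
def relu6(x):
    return min(max(x, 0.0), 6.0)

def h_swish(x):
    """Hard swish: x * relu6(x + 3) / 6 — piecewise-linear approximation of swish."""
    return x * relu6(x + 3.0) / 6.0
```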
14
Squeeze-and-excitation?
📊 medium
Answer: Global pool → small FC → channel gates—recalibrates channel importance; appears in MobileNetV3 and many efficient nets.
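A framework-free sketch of the SE path (the two weight matrices `w1`/`w2` are hypothetical tiny FC layers; real blocks use a reduction ratio like 4): squeeze by global average pooling, excite through reduce-ReLU-restore-sigmoid, then rescale each channel by its gate.

```python
import math

def squeeze_excite(x, w1, w2):
    """x: [C][H][W]; w1: [hidden][C] reduce FC; w2: [C][hidden] restore FC."""
    C = len(x)
    # Squeeze: global average pool per channel -> one scalar each.
    pooled = [sum(sum(row) for row in x[c]) / (len(x[c]) * len(x[c][0]))
              for c in range(C)]
    # Excite: small FC + ReLU, then FC + sigmoid -> per-channel gate in (0, 1).
    hidden = [max(0.0, sum(w1[r][c] * pooled[c] for c in range(C)))
              for r in range(len(w1))]
    gates = [1.0 / (1.0 + math.exp(-sum(w2[c][r] * hidden[r]
                                        for r in range(len(hidden)))))
             for c in range(C)]
    # Scale: recalibrate each channel by its gate.
    return [[[v * gates[c] for v in row] for row in x[c]] for c in range(C)]
```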
15
Quantization?
⚡ easy
Answer: Depthwise-heavy nets often deployed as INT8—fewer MACs and memory; validate accuracy after PTQ/QAT.
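A toy sketch of symmetric per-tensor PTQ (real toolchains add per-channel scales, zero points, and calibration, none of which is shown here): pick a scale from the max magnitude, round to INT8, and check the reconstruction error.

```python
def quantize_int8(values):
    """Symmetric quantization sketch: scale = max|v| / 127, q in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.001, 1.27]
q, s = quantize_int8(weights)
recovered = dequantize(q, s)     # off by at most ~scale/2 per weight
```

This is why accuracy must be validated after quantization: small weights (like the 0.001 above) collapse to zero at this scale.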
16
Strided depthwise?
📊 medium
Answer: Depthwise conv with stride 2 downsamples spatially—paired with pointwise for channel mix; replaces pooling in many blocks.
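The downsampling follows the standard conv output-size formula; a quick check with typical MobileNet stage resolutions (the sizes are illustrative):

```python
def conv_out_size(size, k=3, stride=2, pad=1):
    """Spatial output size of a conv: floor((H + 2p - k) / s) + 1."""
    return (size + 2 * pad - k) // stride + 1

print(conv_out_size(112))   # a stride-2 3x3 depthwise halves 112 -> 56
print(conv_out_size(56))    # 56 -> 28, and so on down the stages
```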
17
Pointwise = ?
⚡ easy
Answer: Standard conv with 1×1 kernel—channel mixing only, no spatial context.
18
Transfer to tasks?
📊 medium
Answer: ImageNet-pretrained MobileNet backbones fine-tune for classification, detection, segmentation with small heads—standard on edge.
19
Accuracy ceiling?
📊 medium
Answer: Aggressive width/resolution cuts hurt top-1 on hard datasets—recover with larger efficient families or knowledge distillation from a bigger teacher.
20
Deployment?
⚡ easy
Answer: Use vendor runtimes (CoreML, NNAPI, TensorRT) with fused DW+PW kernels; profile on-device latency, not just FLOPs.
MobileNet Cheat Sheet
Core
- DW + PW
Scale
- α width
- ρ resolution
V2
- Inverted residual
- Linear bottleneck
💡 Pro tip: Depthwise per channel, pointwise mixes channels—know the cost win.