CNNs for Vision MCQ
Local filters, stacked hierarchies, and why convolutions beat dense layers on images.
Conv: local filters
Pool: downsample
Stride: step size
Receptive field: context
Convolutional networks for images
CNNs apply learned filters locally across the spatial grid, sharing parameters across locations (translation equivariance). Stacked conv layers build hierarchical features; pooling and stride reduce resolution; normalization and skip connections appear in deeper designs used for detection and segmentation.
Parameter sharing
One conv kernel is reused at every spatial position—far fewer parameters than a fully connected layer on the full image.
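A quick back-of-the-envelope comparison makes the savings concrete. The sizes below (3×3 kernel, 64 channels, 224×224 feature map) are illustrative assumptions, not from the text above:

```python
# Parameter count of one 3x3 conv layer (64 in-channels, 64 out-channels)
# vs. a dense layer fully connecting two 224x224x64 feature maps.
# Sizes are assumed for illustration.
conv_params = 3 * 3 * 64 * 64 + 64        # kernel weights + one bias per output channel
dense_params = (224 * 224 * 64) ** 2      # weight matrix alone, no biases

print(conv_params)                        # 36928
print(dense_params // conv_params)        # dense layer is ~10^8 times larger
```

The conv layer's cost is independent of image size; the dense layer's cost grows with the square of the pixel count.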
Key ideas
Convolution
Sliding inner product: output channels mix local neighborhoods of input channels.
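A minimal single-channel sketch of that sliding inner product (strictly, cross-correlation, which is what deep-learning frameworks compute), written out with explicit loops for clarity:

```python
import numpy as np

def conv2d(x, k):
    """'Valid' 2D cross-correlation: one input channel, one output channel."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # inner product of the kernel with one local neighborhood
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

x = np.arange(16.0).reshape(4, 4)   # toy 4x4 input
k = np.ones((3, 3)) / 9.0           # box (mean) filter
print(conv2d(x, k))                 # 2x2 output of local means: [[5. 6.] [9. 10.]]
```

With multiple channels, the same pattern holds: each output channel sums this inner product over all input channels.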
Pooling
Max or average pool reduces spatial size and adds local translation tolerance.
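A 2×2/stride-2 max pool can be sketched with a reshape trick (assumes even spatial dimensions):

```python
import numpy as np

def maxpool2x2(x):
    """2x2 max pooling with stride 2; spatial dims must be even."""
    h, w = x.shape
    # Split each axis into (blocks, 2) and take the max inside each 2x2 block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 8, 1, 0],
              [7, 6, 2, 3]])
print(maxpool2x2(x))   # [[4 8]
                       #  [9 3]]
```

Because only the maximum within each window survives, shifting the input by one pixel inside a window often leaves the output unchanged, which is the local translation tolerance mentioned above.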
Stride & padding
Stride > 1 downsamples; padding preserves spatial size or aligns dimensions.
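Both effects follow from the standard output-size formula, out = floor((n + 2p − k) / s) + 1, for input size n, kernel k, stride s, padding p:

```python
def conv_out_size(n, k, s=1, p=0):
    """Spatial output size: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

print(conv_out_size(224, 3, s=1, p=1))  # 224: "same" padding preserves size
print(conv_out_size(224, 3, s=2, p=1))  # 112: stride 2 halves resolution
```

The same formula applies to pooling layers, with k and s set to the pool window and its stride.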
Receptive field
Region in the input that can influence one output neuron—grows with depth.
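The growth with depth can be computed with the usual recurrence: each layer adds (k − 1) input pixels per tap, scaled by the product of all earlier strides. A small sketch:

```python
def receptive_field(layers):
    """layers: list of (kernel, stride) pairs; returns the receptive-field
    size, in input pixels, of one neuron after the last layer."""
    rf, jump = 1, 1          # jump = product of strides so far
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Three stacked 3x3 stride-1 convs see a 7x7 input region,
# matching one 7x7 conv but with fewer parameters.
print(receptive_field([(3, 1)] * 3))      # 7
# After a stride-2 layer, each later kernel tap covers 2 input pixels.
print(receptive_field([(3, 2), (3, 1)]))  # 7
```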
Typical CNN stack
Conv → activation → pool → … → global pool / FC → task head
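Tracing shapes through such a stack shows how resolution falls before the head. The layer sizes below (32×32 input, two conv+pool blocks) are assumed for illustration:

```python
def trace_shapes(n, stages):
    """Spatial size after each stage; stages are (kernel, stride, padding)."""
    shapes = [n]
    for k, s, p in stages:
        n = (n + 2 * p - k) // s + 1
        shapes.append(n)
    return shapes

# Two blocks of conv(3x3, pad 1) then maxpool(2x2, stride 2) on a 32x32 input.
stages = [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]
print(trace_shapes(32, stages))   # [32, 32, 16, 16, 8]
```

Global pooling then collapses the remaining 8×8 grid to one value per channel, giving the task head a fixed-size vector regardless of input resolution.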