Computer Vision Interview: 20 Essential Q&A (Updated 2026)

Feature Detection Intro: 20 Essential Q&A

Detectors vs descriptors, invariances, and how classical features support matching and SLAM.

~11 min read · 20 questions · Intermediate
keypoints · descriptors · matching · invariance
1 What is a local image feature? ⚡ easy
Answer: A salient image patch with a keypoint (location, scale, orientation) and often a descriptor vector summarizing local appearance for matching.
2 Detector vs descriptor—what's the difference? 📊 medium
Answer: A detector finds stable interest points; a descriptor encodes the local neighborhood for similarity comparison—the two can be mixed (e.g. Harris corners + SIFT descriptors in some pipelines).
3 Why are corners good features? 📊 medium
Answer: High gradient in multiple directions—well localized, repeatable under small viewpoint/light changes vs flat regions or straight edges.
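The "high gradient in multiple directions" intuition can be checked with a minimal Harris response in pure NumPy—a sketch, not a production detector (`window_sum` and the synthetic square are illustrative helpers):

```python
import numpy as np

def window_sum(a, r=2):
    # sum each (2r+1)x(2r+1) neighborhood via shifted copies (np.roll
    # wraps at the border, harmless when features sit away from the edge)
    out = np.zeros_like(a)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
    return out

def harris_response(img, k=0.04):
    # structure tensor M from image gradients; R = det(M) - k * trace(M)^2
    gy, gx = np.gradient(img.astype(float))
    sxx, syy, sxy = window_sum(gx * gx), window_sum(gy * gy), window_sum(gx * gy)
    return sxx * syy - sxy**2 - k * (sxx + syy) ** 2

# white square on black: response peaks at its corners, not along edges
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
R = harris_response(img)
y, x = np.unravel_index(R.argmax(), R.shape)
```

Along a straight edge only one gradient direction fires, so `det(M) ≈ 0` and the response stays low—exactly why corners beat edges for localization.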
4 What is a blob feature? 📊 medium
Answer: Extremum in scale-space (LoG/DoG)—captures roundish regions; complementary to corners for texture-poor scenes.
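A toy DoG response makes the blob idea concrete—a sketch with hand-picked sigmas, using a separable Gaussian built from scratch:

```python
import numpy as np

def gaussian_1d(sigma):
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    # separable Gaussian: filter rows, then columns
    k = gaussian_1d(sigma)
    tmp = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, tmp, k, mode="same")

# bright disk: the DoG extremum lands on the blob center
img = np.zeros((64, 64))
yy, xx = np.mgrid[:64, :64]
img[(yy - 32) ** 2 + (xx - 32) ** 2 < 36] = 1.0
dog = blur(img, 2.0) - blur(img, 3.2)  # DoG approximates scale-normalized LoG
cy, cx = np.unravel_index(np.abs(dog).argmax(), dog.shape)
```

A real detector (e.g. SIFT) searches such extrema across a whole stack of sigmas, not just one pair.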
5 What is scale invariance? 📊 medium
Answer: Detect+describe at multiple scales or with scale-normalized patch so matching works across zoom—SIFT pyramid, ORB octave pyramid.
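The octave idea reduces to repeated downsampling—a minimal sketch using 2x2 average pooling in place of proper Gaussian pre-smoothing:

```python
import numpy as np

def octave_pyramid(img, n_octaves=4):
    # halve resolution each octave with 2x2 average pooling
    pyr = [img.astype(float)]
    for _ in range(n_octaves - 1):
        a = pyr[-1]
        h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2  # crop odd dims
        a = a[:h, :w]
        pyr.append((a[0::2, 0::2] + a[1::2, 0::2]
                    + a[0::2, 1::2] + a[1::2, 1::2]) / 4)
    return pyr

pyr = octave_pyramid(np.ones((100, 80)), n_octaves=4)
```

Detecting on every level lets a feature found at full size in one image match the same feature found at quarter size after a 4x zoom.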
6 How to achieve rotation invariance? 📊 medium
Answer: Assign a dominant orientation from the gradient histogram and rotate the patch to a canonical frame—or use rotation-invariant descriptors (some trade away distinctiveness).
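The orientation-assignment step can be sketched as a magnitude-weighted histogram over gradient angles, roughly as in SIFT (the Gaussian weighting window is omitted for brevity):

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    # magnitude-weighted histogram of gradient orientations
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist, edges = np.histogram(ang, bins=n_bins, range=(0, 360), weights=mag)
    b = hist.argmax()
    return 0.5 * (edges[b] + edges[b + 1])  # bin center, in degrees

# horizontal intensity ramp -> gradient points along +x -> angle near 0
theta = dominant_orientation(np.tile(np.arange(32.0), (32, 1)))
```

Rotating the patch by `-theta` before describing it gives every view of the point the same canonical frame.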
7 What are affine covariant regions? 🔥 hard
Answer: Regions that deform predictably under affine viewing of planar surfaces—MSER, Harris-Affine family; stronger than similarity for wide baselines.
8 What is NCC template matching? 📊 medium
Answer: Normalized cross-correlation over patches—a brightness/contrast-normalized, SSD-like similarity score; dense and expensive vs sparse keypoints.
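The brightness/contrast normalization is the whole point of NCC—subtracting means and dividing by norms makes the score invariant to affine intensity changes, as this minimal sketch shows:

```python
import numpy as np

def ncc(a, b):
    # normalized cross-correlation: zero-mean patches, unit-norm denominator
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0

patch = np.random.default_rng(0).random((16, 16))
score = ncc(patch, 2.5 * patch + 7.0)  # gain + offset change, same content
```

Template matching slides this score over every window of the search image, which is why it is so much costlier than comparing sparse descriptors.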
9 What is Lowe's ratio test? 📊 medium
Answer: Reject match if distance to nearest neighbor is not sufficiently smaller than second-nearest—reduces ambiguous matches.
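The ratio test is a few lines on top of brute-force matching—a sketch with float descriptors and an illustrative 0.75 threshold (Lowe's paper suggests around 0.8):

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.75):
    # keep (i, j) only if the best distance clearly beats the second best
    matches = []
    for i, d in enumerate(desc1):
        dist = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dist)[:2]
        if dist[j1] < ratio * dist[j2]:
            matches.append((i, int(j1)))
    return matches

d1 = np.array([[0.0, 0.0], [4.0, 4.0]])
d2 = np.array([[0.1, 0.0], [4.0, 4.1], [4.1, 4.0]])
m = ratio_test_matches(d1, d2)
```

The second query descriptor has two near-identical candidates, so its best match is ambiguous and gets rejected—exactly the behavior the test is for.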
10 Mutual nearest neighbor? ⚡ easy
Answer: Accept match only if a is nearest to b and b nearest to a—simple filter for symmetric uniqueness.
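Mutual (cross-check) filtering is equally compact—a NumPy sketch over a full pairwise distance matrix:

```python
import numpy as np

def mutual_nn_matches(desc1, desc2):
    # pairwise distances; keep (i, j) only when i->j and j->i agree
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    nn12, nn21 = d.argmin(axis=1), d.argmin(axis=0)
    return [(i, int(j)) for i, j in enumerate(nn12) if nn21[j] == i]

d1 = np.array([[0.0, 0.0], [1.0, 1.0]])
d2 = np.array([[0.9, 1.1], [0.1, 0.1], [0.2, 0.0]])
m = mutual_nn_matches(d1, d2)
```

The third descriptor in `d2` points back to `d1[0]`, which already prefers `d2[1]`, so it is dropped—each point gets at most one symmetric partner.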
11 Why RANSAC after feature matching? 📊 medium
Answer: Estimates geometric model (F/E/H) while rejecting outliers from incorrect matches—essential for robust pose and stitching.
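The RANSAC loop is easiest to see with the simplest geometric model—a 2-D translation, where one correspondence is a minimal sample; the same hypothesize-score-refit structure scales up to H/F/E with larger minimal sets. A sketch (function name and synthetic data are illustrative):

```python
import numpy as np

def ransac_translation(p1, p2, thresh=1.0, iters=50, seed=0):
    # p1, p2: (N, 2) matched points; model: p2 ~ p1 + t
    rng = np.random.default_rng(seed)
    best = np.zeros(len(p1), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(p1))
        t = p2[i] - p1[i]  # hypothesis from a minimal sample
        inliers = np.linalg.norm(p2 - (p1 + t), axis=1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    t = (p2[best] - p1[best]).mean(axis=0)  # refit on the consensus set
    return t, best

rng = np.random.default_rng(1)
p1 = rng.random((30, 2)) * 100
p2 = p1 + np.array([5.0, 3.0])             # true motion
p2[:6] += rng.random((6, 2)) * 40 + 20     # 6 gross outliers (bad matches)
t, inliers = ransac_translation(p1, p2)
```

Least squares over all 30 matches would be dragged far off by the six outliers; RANSAC recovers the motion and flags exactly the bad correspondences.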
12 What is bag-of-visual-words? 📊 medium
Answer: Quantize descriptors to vocabulary clusters; image → histogram of words—classic image retrieval / classification before deep CNNs.
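Given a vocabulary (normally learned by k-means over training descriptors), the encoding step is just nearest-centroid assignment plus a histogram—a sketch with a hand-made 3-word vocabulary:

```python
import numpy as np

def bovw_histogram(descs, vocab):
    # assign each descriptor to its nearest visual word; L1-normalize
    d = np.linalg.norm(descs[:, None, :] - vocab[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / hist.sum()

vocab = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])  # toy vocabulary
descs = np.array([[0.2, 0.1], [9.8, 0.3], [9.9, -0.1], [0.1, 9.7]])
h = bovw_histogram(descs, vocab)
```

Two images can then be compared by histogram distance (often with tf-idf weighting), regardless of how many keypoints each contained.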
13 Features in tracking? ⚡ easy
Answer: Track keypoints frame-to-frame with KLT, optical flow, or re-detect+match—balance drift vs redetection.
14 Features in SLAM / VO? 📊 medium
Answer: Sparse landmarks for bundle adjustment; need repeatable detection and robust data association across frames.
15 Define repeatability. ⚡ easy
Answer: Same real-world point detected under noise, blur, and viewpoint change—measured by overlap of keypoint regions on benchmark sequences.
16 Define distinctiveness. ⚡ easy
Answer: Descriptor separates correct matches from distractors—low false match rate at fixed threshold.
17 Effect of occlusion? ⚡ easy
Answer: Keypoints disappear; need robust matching, wide baseline tolerance, or dense methods / learning-based segmentation.
18 Why Hamming distance? ⚡ easy
Answer: For binary descriptors (BRIEF, ORB)—XOR + bit count; fast with POPCNT hardware.
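The XOR-and-count operation is a one-liner over packed descriptor bytes—a sketch with toy 16-bit descriptors (real ORB uses 256 bits = 32 bytes):

```python
import numpy as np

def hamming(a, b):
    # XOR the packed bytes, then count set bits (what POPCNT accelerates)
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

a = np.array([0b11110000, 0b10101010], dtype=np.uint8)
b = np.array([0b11110000, 0b01010101], dtype=np.uint8)
d = hamming(a, b)  # first bytes agree; second bytes differ in all 8 bits
```

No floats, no square roots—this is why matching binary descriptors is so much cheaper than L2 on 128-dim SIFT vectors.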
19 What is FLANN? 📊 medium
Answer: Fast Library for Approximate Nearest Neighbors—speeds k-NN on high-dim descriptors with trees or LSH; trade accuracy for speed.
20 Learned local features? 🔥 hard
Answer: SuperPoint, LIFT, etc.—CNNs predict keypoints+descriptors end-to-end; outperform classical on some benchmarks with enough data.

Features Intro Cheat Sheet

Pipeline
  • Detect
  • Describe
  • Match
Invariance
  • Scale / rotation
  • Affine (harder)
Robustness
  • Ratio test
  • RANSAC

💡 Pro tip: Separate “where” (detector) from “what” (descriptor).

Full tutorial track

Go deeper with the matching tutorial chapter and code examples.