Computer Vision Interview
60 Q&A
Chapter 5
Local Feature Detectors — Interview Q&A
Harris corners, SIFT, and ORB—scale and rotation aware keypoints for matching and tracking.
Harris Corner Detector: 20 Essential Q&A
1
What does the Harris corner detector find?
⚡ easy
Answer: Locations where intensity changes strongly in two directions (corners and strong junctions), found from the local second-moment matrix of the image gradients.
2
What does Harris maximize?
📊 medium
Answer: The change in SSD of a patch under a small shift (u, v), approximated by the quadratic form E(u, v) ≈ [u v] M [u v]ᵀ with structure tensor M; a corner is a point where E is large for every shift direction.
3
Define the second-moment matrix M.
🔥 hard
Answer: M = Σ w(x,y) [Ix² IxIy; IxIy Iy²] over a window—captures local gradient covariance; eigenvectors give principal gradient directions.
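A minimal NumPy sketch of M at a single pixel may make the definition concrete. It assumes central-difference gradients (np.gradient) and uniform weights w = 1 over a square window; the helper name structure_tensor_at and the synthetic image are illustrative, not part of any library.

```python
import numpy as np

def structure_tensor_at(img, y, x, half=2):
    """Second-moment matrix M at (y, x), uniform weights over a (2*half+1)^2 window."""
    Iy, Ix = np.gradient(img.astype(float))  # derivatives along rows (y) and columns (x)
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    ixx = np.sum(Ix[win] ** 2)
    iyy = np.sum(Iy[win] ** 2)
    ixy = np.sum(Ix[win] * Iy[win])
    return np.array([[ixx, ixy], [ixy, iyy]])

# Synthetic test image (assumption): a bright square on a dark background.
img = np.zeros((20, 20))
img[10:, 10:] = 1.0

M_corner = structure_tensor_at(img, 10, 10)  # at the square's corner
M_edge = structure_tensor_at(img, 15, 10)    # on its vertical edge
M_flat = structure_tensor_at(img, 3, 3)      # in the flat background

for name, M in [("corner", M_corner), ("edge", M_edge), ("flat", M_flat)]:
    print(name, np.linalg.eigvalsh(M))  # eigenvalues in ascending order
```

At the corner both eigenvalues are clearly positive, on the edge one of them vanishes, and in the flat region M is zero.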
4
Interpret eigenvalues λ1, λ2 of M?
📊 medium
Answer: Both small: flat; one large, one small: edge; both large: corner (intensity varies along two orthogonal directions).
5
Harris response R?
📊 medium
Answer: R = det(M) − k·trace(M)² = λ1λ2 − k(λ1+λ2)²—avoids explicit eigen decomposition; k ≈ 0.04–0.06 typical.
6
Effect of k?
⚡ easy
Answer: Tunes sensitivity vs noise; too large suppresses corners; empirical constant, not learned from data in classical form.
7
R on flat region?
⚡ easy
Answer: det≈0 and trace≈0, so |R| is near zero and falls below any reasonable threshold; rejected.
8
R on edge?
⚡ easy
Answer: One eigenvalue ~0 → det≈0 while trace>0 → R negative—rejected as corner.
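The flat, edge, and corner cases can be checked numerically from the eigenvalues alone. This is a small sketch; the eigenvalue pairs are made-up illustrative numbers and k = 0.04 is just the low end of the typical range.

```python
def harris_response(lam1, lam2, k=0.04):
    """Harris score from the two eigenvalues of M: det(M) - k * trace(M)^2."""
    return lam1 * lam2 - k * (lam1 + lam2) ** 2

# Illustrative eigenvalue pairs (assumptions, not measured data).
cases = {
    "flat": (0.01, 0.01),    # both small
    "edge": (10.0, 0.01),    # one large, one small
    "corner": (10.0, 8.0),   # both large
}
for name, (l1, l2) in cases.items():
    print(f"{name}: R = {harris_response(l1, l2):.4f}")
```

The flat case gives a score very close to zero, the edge case a clearly negative one, and only the corner case a large positive one.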
9
What is Shi-Tomasi “good features to track”?
📊 medium
Answer: Score = min(λ1, λ2) with threshold—more stable for tracking; picks corners with minimum directional strength guaranteed.
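For a 2×2 symmetric matrix the smaller eigenvalue has a closed form, so the Shi-Tomasi score does not need a general eigen solver. A sketch, with hypothetical helper names and the threshold left to the caller:

```python
import math

def min_eigenvalue(ixx, ixy, iyy):
    """Smaller eigenvalue of the symmetric matrix [[ixx, ixy], [ixy, iyy]]."""
    half_trace = 0.5 * (ixx + iyy)
    root = math.sqrt((0.5 * (ixx - iyy)) ** 2 + ixy ** 2)
    return half_trace - root

def is_good_feature(ixx, ixy, iyy, threshold):
    """Shi-Tomasi style test: accept when the weaker direction is still strong."""
    return min_eigenvalue(ixx, ixy, iyy) > threshold

print(min_eigenvalue(2.0, 0.0, 5.0))       # diagonal matrix: the smaller diagonal entry
print(is_good_feature(2.5, 0.0, 0.0, 0.1))  # edge-like M: rejected
```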
10
Effect of window size?
📊 medium
Answer: Larger window: smoother M, less localization noise but merges nearby corners; smaller: noisier, better localization.
11
Why Gaussian weights w?
⚡ easy
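Answer: Emphasize the center of the patch and give an isotropic response that varies smoothly as the window slides. Harris and Stephens used a Gaussian; OpenCV's cornerHarris sums over a uniform blockSize×blockSize window instead.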
12
Invariances of Harris?
📊 medium
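Answer: Rotation invariant (the eigenvalues of M do not change when the image rotates); not scale invariant (a corner at one scale can look like an edge at a finer one). Additive brightness shifts cancel in the gradients, but a contrast gain rescales R, so a fixed threshold is only partially robust to lighting changes.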
13
How fix scale weakness?
🔥 hard
Answer: Multi-scale Harris, scale-space extrema (like SIFT), or detectors with inherent scale selection (LoG).
14
Refine corners to sub-pixel?
📊 medium
Answer: Fit a quadratic to the corner response surface, or use iterative gradient-based refinement (OpenCV cornerSubPix).
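The quadratic-fit option can be sketched with a 1-D parabola applied along each axis. This illustrates the surface-fit idea only; it is not the algorithm inside cornerSubPix, and the function names are hypothetical.

```python
def parabola_offset(f_minus, f_zero, f_plus):
    """Offset of the peak of the parabola through (-1, f_minus), (0, f_zero), (1, f_plus)."""
    denom = f_minus - 2.0 * f_zero + f_plus
    if denom == 0.0:
        return 0.0  # degenerate (flat) neighbourhood: keep the integer location
    return 0.5 * (f_minus - f_plus) / denom

def refine_peak(R, y, x):
    """Refine an integer local maximum (y, x) of response map R, one axis at a time."""
    dx = parabola_offset(R[y][x - 1], R[y][x], R[y][x + 1])
    dy = parabola_offset(R[y - 1][x], R[y][x], R[y + 1][x])
    return y + dy, x + dx
```

The caller is assumed to pass an interior local maximum, so the four neighbours exist.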
15
Need NMS?
⚡ easy
Answer: Yes—Harris map is dense; keep local maxima above threshold separated by minimum distance.
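One simple way to do this is greedy selection: sort candidates by response and keep each one only if it is far enough from those already kept. A pure-Python sketch, with hypothetical names; real implementations are vectorized, but the logic is the same in spirit.

```python
def select_corners(response, threshold, min_dist):
    """Greedy NMS: keep the strongest points above threshold, at least min_dist apart.

    response is a list of rows (or any 2-D indexable); returns [(y, x), ...],
    strongest first.
    """
    candidates = [
        (response[y][x], y, x)
        for y in range(len(response))
        for x in range(len(response[0]))
        if response[y][x] > threshold
    ]
    candidates.sort(reverse=True)  # strongest response first
    kept = []
    for _, y, x in candidates:
        if all((y - ky) ** 2 + (x - kx) ** 2 >= min_dist ** 2 for ky, kx in kept):
            kept.append((y, x))
    return kept

R = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 7],
    [0, 0, 0, 0, 0],
]
print(select_corners(R, threshold=1, min_dist=2))
```

In the toy map the response of 8 sits one pixel from the stronger 9 and is suppressed, while the isolated 7 survives.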
16
Harris vs FAST?
📊 medium
Answer: FAST: speed-optimized segment test, not gradient matrix—faster, less accurate localization; Harris more principled, slower.
17
OpenCV cornerHarris output?
⚡ easy
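Answer: Float response map the same size as the input; threshold + NMS to get points. goodFeaturesToTrack is the packaged alternative: it scores (Shi-Tomasi by default), thresholds, and enforces a minimum distance in one call.
import cv2
R = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)  # gray: single-channel uint8 or float32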
18
Why use det − k·trace²?
🔥 hard
Answer: Algebraic proxy for “both eigenvalues large” without sqrt—computationally cheap and continuous score.
19
Gradients Ix, Iy?
⚡ easy
Answer: Usually Sobel or Scharr on smoothed image—noise reduction before derivative recommended.
20
Planar surface assumption?
📊 medium
Answer: The derivation assumes the patch undergoes a small translation in the image plane; under strong perspective a 3D corner deforms rather than shifts, so the approximation holds only if the patch is small enough.
SIFT: 20 Essential Q&A
21
What is SIFT?
⚡ easy
Answer: Scale-Invariant Feature Transform—detects blob-like keypoints in scale-space and builds a 128-D gradient-orientation histogram descriptor; robust to scale, rotation, moderate viewpoint/lighting.
22
What is Difference of Gaussians (DoG)?
📊 medium
Answer: DoG = G(σ1)−G(σ2) approximates scale-normalized LoG—cheap way to find blob-like structures across scales.
23
What is an octave?
📊 medium
Answer: A group of images at one resolution whose blur σ spans a factor of 2 over several levels; the next octave starts from an image downsampled by 2. This covers a large scale range efficiently.
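The blur schedule inside one octave can be written down directly. The sketch below assumes the values commonly quoted from Lowe's paper (base σ of 1.6 and 3 intervals per octave); the function name is hypothetical.

```python
def octave_sigmas(sigma0=1.6, intervals=3):
    """Blur levels for one octave: sigma0 * k**i with k = 2**(1/intervals).

    intervals + 3 Gaussian images give intervals + 2 DoG images, so extrema
    can be searched at `intervals` scales with a neighbour above and below.
    """
    k = 2.0 ** (1.0 / intervals)
    return [sigma0 * k ** i for i in range(intervals + 3)]

sigmas = octave_sigmas()
print([round(s, 3) for s in sigmas])
```

The level at index `intervals` has twice the base blur; that image is the one downsampled to seed the next octave.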
24
How are keypoints detected?
🔥 hard
Answer: 3×3×3 neighborhood search for scale-space extrema (max/min) in DoG volume—candidate keypoints.
25
Refinement and edge rejection?
🔥 hard
Answer: Taylor expansion fit for subpixel location and scale; reject low contrast; use Hessian of DoG to reject edge-like unstable peaks (ratio of principal curvatures).
26
Orientation histogram?
📊 medium
Answer: Weighted gradient orientations in neighborhood; peak(s) define canonical rotation—descriptor becomes rotation invariant.
27
How is the descriptor built?
📊 medium
Answer: 16×16 window into 4×4 cells; each cell has 8-bin orientation histogram of gradients; 4×4×8 = 128 values, normalized.
28
Why 4×4 grid?
⚡ easy
Answer: Balances spatial layout (localization) vs distinctiveness; finer grid more sensitive to deformation.
29
Why normalize twice?
📊 medium
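Answer: L2 normalization cancels contrast gain (a brightness offset already cancels in the gradients); clipping large entries (0.2 in Lowe's paper) limits the influence of a few very strong gradients caused by non-linear lighting such as saturation; renormalizing restores unit length.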
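The three steps are short enough to write out. A sketch on a toy 4-element vector rather than a real 128-D descriptor; the clip value 0.2 follows Lowe's paper, and the function name is hypothetical.

```python
import math

def normalize_clip_renormalize(desc, clip=0.2):
    """L2-normalize, clamp each entry at `clip`, then L2-normalize again."""
    def l2_normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    unit = l2_normalize(desc)
    clipped = [min(x, clip) for x in unit]
    return l2_normalize(clipped)

raw = [100.0, 5.0, 4.0, 3.0]  # toy vector: one bin dominated by a very strong gradient
out = normalize_clip_renormalize(raw)
print([round(x, 3) for x in out])
```

After clipping, the dominant bin loses some weight and the remaining bins gain relative influence; the effect is stronger in 128 dimensions than in this toy case.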
30
What is RootSIFT?
📊 medium
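Answer: L1-normalize the SIFT vector, then take the element-wise square root; the result already has unit L2 norm. Euclidean distance on RootSIFT corresponds to the Hellinger kernel on the original histograms; often improves retrieval.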
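As a sketch, the transform is two lines. The function name and the small eps guard against an all-zero vector are assumptions for illustration.

```python
import math

def root_sift(desc, eps=1e-12):
    """RootSIFT: L1-normalize a non-negative SIFT vector, then take the element-wise sqrt."""
    l1 = sum(abs(x) for x in desc) + eps
    return [math.sqrt(abs(x) / l1) for x in desc]

d = root_sift([4.0, 1.0, 0.0, 4.0])
print(d, sum(x * x for x in d))  # the squared entries sum to (almost exactly) 1
```

Because the squared entries are the L1-normalized values, the output has unit L2 norm without a separate normalization step.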
31
SIFT invariances?
📊 medium
Answer: Scale and rotation by construction; partial robustness to affine distortion and illumination change; not invariant to strong 3D viewpoint change.
32
SIFT vs ORB speed?
⚡ easy
Answer: SIFT heavier (float descriptor, pyramid DoG); ORB binary + FAST—ORB much faster on embedded/CPU.
33
SIFT patents?
⚡ easy
Answer: The US patent expired in March 2020; before that OpenCV kept SIFT in the contrib nonfree module. Since OpenCV 4.4.0 it is in the main module and widely usable.
34
Typical matching?
📊 medium
Answer: L2 or cosine on float vectors; ratio test + RANSAC for geometry.
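The ratio test itself is simple to sketch with brute-force L2 search. The ratio 0.75 is a commonly used value and an assumption here, as are the function name and the toy 2-D descriptors; the RANSAC stage is not shown.

```python
import math

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """For each descriptor in desc_a, keep its nearest neighbour in desc_b only if
    it is clearly closer than the second nearest (Lowe's ratio test)."""
    matches = []
    for i, a in enumerate(desc_a):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the test needs two neighbours
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1, d1))
    return matches

desc_a = [[0.0, 0.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [10.0, 10.0], [5.2, 5.0], [4.8, 5.0]]
print(ratio_test_matches(desc_a, desc_b))
```

The first query has one clear nearest neighbour and is kept; the second has two almost equally close candidates, so it is discarded as ambiguous.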
35
Contrast threshold?
⚡ easy
Answer: Filters weak DoG extrema—reduces unstable keypoints on flat noise.
36
Why DoG approximates LoG?
📊 medium
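Answer: Not an exact identity: the heat equation ∂G/∂σ = σ∇²G gives, by finite difference, G(kσ) − G(σ) ≈ (k − 1)σ²∇²G. The factor (k − 1) is constant across scales, so DoG extrema track those of the scale-normalized LoG; cheap blob detector.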
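The approximation can be checked numerically in 1-D, where the same relation holds with the second derivative in place of the Laplacian. A NumPy sketch; the values σ = 1.6 and k = 1.05 are arbitrary choices for illustration.

```python
import numpy as np

def gaussian(x, sigma):
    """Normalized 1-D Gaussian."""
    return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def gaussian_second_derivative(x, sigma):
    """d^2 G / dx^2 for the 1-D Gaussian above."""
    return gaussian(x, sigma) * (x**2 / sigma**4 - 1 / sigma**2)

x = np.linspace(-8.0, 8.0, 1601)
sigma, k = 1.6, 1.05

dog = gaussian(x, k * sigma) - gaussian(x, sigma)
scaled_log = (k - 1) * sigma**2 * gaussian_second_derivative(x, sigma)

# Cosine similarity close to 1 means the two curves have nearly the same shape.
cosine = dog @ scaled_log / (np.linalg.norm(dog) * np.linalg.norm(scaled_log))
print(f"cosine similarity: {cosine:.4f}")
```

The match tightens as k approaches 1, since the finite difference becomes a better estimate of the derivative with respect to σ.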
37
Color SIFT?
🔥 hard
Answer: Compute SIFT on color channels or opponent color spaces for extra discriminability—more dimensions or fused descriptors.
38
PCA-SIFT?
🔥 hard
Answer: Project gradient patch to lower-dim PCA basis—smaller descriptor; less common now than vanilla SIFT or learned features.
39
OpenCV?
⚡ easy
Answer: cv2.SIFT_create() (main module since OpenCV 4.4.0, after patent expiry); detectAndCompute returns keypoints + descriptors.
40
Limitations?
📊 medium
Answer: Computation cost, repetitive texture ambiguities, limited with strong motion blur or specular highlights—deep features may win with data.
ORB: 20 Essential Q&A
41
What is ORB?
⚡ easy
Answer: Oriented FAST and Rotated BRIEF—free alternative to SIFT/SURF: FAST corners, orientation from intensity centroid, steered BRIEF binary descriptor with learned pattern (rBRIEF).
42
What is FAST?
📊 medium
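Answer: Compare the center pixel with the 16 pixels on a radius-3 circle around it; corner if n contiguous circle pixels (commonly 9 or 12) are all brighter or all darker than the center by a threshold. Very fast, comparison-only test.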
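The segment test can be sketched on a list of the 16 ring intensities. The threshold of 20 and n = 9 are illustrative defaults, and the function name is hypothetical; production FAST adds early-exit checks that are omitted here.

```python
def fast_segment_test(center, ring, threshold=20, n=9):
    """Return True if at least n contiguous ring pixels are all brighter than
    center + threshold or all darker than center - threshold.

    ring holds the 16 intensities on the radius-3 circle, in circular order.
    """
    def has_run(flags):
        # Walk the ring twice so runs that wrap around the end are counted.
        run = 0
        for flag in flags + flags:
            run = run + 1 if flag else 0
            if run >= n:
                return True
        return False

    brighter = [p > center + threshold for p in ring]
    darker = [p < center - threshold for p in ring]
    return has_run(brighter) or has_run(darker)

print(fast_segment_test(100, [150] * 9 + [100] * 7))  # 9 bright in a row
print(fast_segment_test(100, [150] * 8 + [100] * 8))  # only 8 in a row
```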
43
What is BRIEF?
📊 medium
Answer: Binary string from pairwise intensity comparisons in smoothed patch—256 bits typical; match with Hamming distance.
44
ORB orientation?
📊 medium
Answer: Intensity centroid vs corner—angle of vector from keypoint to centroid gives dominant direction to steer BRIEF.
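A sketch of the intensity-centroid angle from first-order image moments. It uses a square patch for brevity (ORB itself uses a circular one), and the function name is hypothetical.

```python
import math

def patch_orientation(patch):
    """Angle from the patch centre to its intensity centroid, in radians.

    patch is a square list of rows with odd side length; x grows to the right
    and y grows downward, as in image coordinates.
    """
    half = len(patch) // 2
    m10 = m01 = 0.0
    for row_idx, row in enumerate(patch):
        for col_idx, value in enumerate(row):
            m10 += (col_idx - half) * value  # first moment in x
            m01 += (row_idx - half) * value  # first moment in y
    return math.atan2(m01, m10)

bright_right = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
bright_bottom = [[0, 0, 0], [0, 0, 0], [1, 1, 1]]
print(patch_orientation(bright_right), patch_orientation(bright_bottom))
```

A patch that is brighter on the right yields an angle of 0, and one that is brighter at the bottom yields π/2, because y points down in image coordinates.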
45
What is rBRIEF?
🔥 hard
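Answer: A subset of BRIEF test pairs chosen greedily from training patches so that each test has high variance (mean near 0.5) and low correlation with the tests already selected; this recovers the discrimination that plain steered BRIEF loses once the tests are oriented.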
46
Scale in ORB?
⚡ easy
Answer: Image pyramid with FAST+BRIEF at each level—approximates scale invariance like other multi-scale detectors.
47
Match ORB how?
⚡ easy
Answer: Hamming distance on bitstrings—very fast with POPCNT; BFMatcher or LSH variants.
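In plain Python the distance is an XOR followed by a bit count. A sketch with hypothetical function names; descriptors are modelled as bytes objects, and a real matcher would use hardware popcount on packed words.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def nearest_descriptor(query: bytes, train: list) -> tuple:
    """Brute-force search: (index, distance) of the closest train descriptor."""
    distances = [hamming_distance(query, t) for t in train]
    best = min(range(len(train)), key=distances.__getitem__)
    return best, distances[best]

train = [b"\xff\xff", b"\x0f\x0e", b"\x00\x00"]
print(nearest_descriptor(b"\x0f\x0f", train))
```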
48
ORB vs BRISK?
📊 medium
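Answer: BRISK detects FAST/AGAST-style corners in a scale-space pyramid and samples a fixed, hand-designed pattern of concentric circles; ORB uses a learned pair pattern. Both are binary and matched with Hamming distance; they differ in pattern design and scale sampling.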
49
ORB vs SIFT?
📊 medium
Answer: ORB: faster, compact binary, less discriminative on hard wide-baseline; SIFT: float 128-D, heavier, often stronger on difficult pairs.
50
Typical ORB length?
⚡ easy
Answer: 256 bits (32 bytes). OpenCV's ORB always outputs 32 bytes per keypoint; WTA_K changes how each group of bits is produced, not the descriptor length.
51
BRIEF pixel pairs?
📊 medium
Answer: Predefined or learned (x_i, y_i) locations in patch; compare I(x_i)<I(y_i) → bit—rotation steers coordinates.
52
Binary descriptor noise?
⚡ easy
Answer: Sensitive to bit flips from noise—Gaussian smoothing before sampling reduces; strong blur changes comparisons.
53
Steered BRIEF?
🔥 hard
Answer: Rotate sampling coordinates by orientation θ before comparisons—makes descriptor rotation invariant.
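A toy sketch of steering: rotate each sampling offset by θ, then run the usual intensity comparison. The function name, the two-pair pattern, and the 5×5 patches are assumptions for illustration; offsets must stay inside the patch after rotation.

```python
import math

def steered_brief(patch, pairs, theta):
    """Binary descriptor from intensity comparisons, with the sampling pattern
    rotated by theta (radians) around the patch centre.

    patch: square list of rows with odd side length.
    pairs: [((x1, y1), (x2, y2)), ...] offsets relative to the centre.
    """
    half = len(patch) // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def sample(point):
        x, y = point
        xr = round(cos_t * x - sin_t * y)  # rotate, then snap to the nearest pixel
        yr = round(sin_t * x + cos_t * y)
        return patch[half + yr][half + xr]

    return [1 if sample(p) < sample(q) else 0 for p, q in pairs]

pairs = [((-2, 0), (2, 0)), ((0, -2), (0, 2))]
ramp_x = [[c for c in range(5)] for _ in range(5)]  # brightness grows to the right
ramp_y = [[r for _ in range(5)] for r in range(5)]  # same pattern rotated by 90 degrees

print(steered_brief(ramp_x, pairs, 0.0))
print(steered_brief(ramp_y, pairs, math.pi / 2))  # steered: same bits as above
print(steered_brief(ramp_y, pairs, 0.0))          # unsteered: bits change
```

With the correct orientation the rotated patch reproduces the original bits, while the unsteered pattern does not.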
54
WTA in ORB OpenCV?
🔥 hard
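Answer: WTA_K sets how many points feed each test: 2 (default) gives one bit per pair; 3 or 4 pick the index of the brightest point, giving 2 bits per test, and such descriptors should be matched with NORM_HAMMING2.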
55
OpenCV?
⚡ easy
Answer: cv2.ORB_create(nfeatures=500), then detectAndCompute.
import cv2
orb = cv2.ORB_create(nfeatures=500)
kp, des = orb.detectAndCompute(gray, None)  # gray: 8-bit grayscale image; des: N×32 uint8
56
NMS on FAST?
⚡ easy
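Answer: Suppress nearby FAST responses; ORB then ranks the survivors by Harris score and keeps the top N per pyramid level. Some implementations (for example ORB-SLAM) also bucket keypoints on a grid to spread them across the image.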
57
Harris on FAST?
📊 medium
Answer: Use Harris measure on candidate FAST points to rank corner quality.
58
Why ORB on mobile?
⚡ easy
Answer: Low memory, integer/bit ops, real-time VO/SLAM on CPUs without GPU.
59
When ORB struggles?
📊 medium
Answer: Strong viewpoint change, repetitive textures, heavy motion blur—may need SIFT/AKAZE or learning methods.
60
What is AKAZE briefly?
📊 medium
Answer: Nonlinear scale space + binary descriptor—often stronger than ORB on some benchmarks, still efficient.