Computer Vision Interview 60 Q&A Chapter 5

Local Feature Detectors — Interview Q&A

Harris corners, SIFT, and ORB: scale- and rotation-aware keypoints for matching and tracking.

Harris Corner Detector: 20 Essential Q&A

1 What does the Harris corner detector find? ⚡ easy
Answer: Locations where intensity changes strongly in two directions (corners and strong junctions), found from the second-moment matrix of local image gradients.
2 What does Harris maximize? 📊 medium
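Answer: Strictly it measures rather than maximizes: the SSD change E(u,v) of a patch under a small shift (u,v) is approximated by the quadratic form [u v] M [u v]^T with structure tensor M, and a corner is a point where E is large for every shift direction.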
3 Define the second-moment matrix M. 🔥 hard
Answer: M = Σ w(x,y) [Ix² IxIy; IxIy Iy²] over a window—captures local gradient covariance; eigenvectors give principal gradient directions.
4 Interpret eigenvalues λ1, λ2 of M? 📊 medium
Answer: Both small: flat; one large, one small: edge; both large: corner (intensity varies along two orthogonal directions).
5 Harris response R? 📊 medium
Answer: R = det(M) − k·trace(M)² = λ1λ2 − k(λ1+λ2)²—avoids explicit eigen decomposition; k ≈ 0.04–0.06 typical.
6 Effect of k? ⚡ easy
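Answer: Trades corner sensitivity against edge rejection: a larger k penalizes the trace term more, so fewer points (and fewer edge-like responses) pass, and too large a value suppresses true corners. It is an empirical constant, not learned from data in the classical form.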
7 R on flat region? ⚡ easy
Answer: Both eigenvalues ≈ 0, so det ≈ 0 and trace ≈ 0 → R ≈ 0, which falls below any positive threshold; rejected.
8 R on edge? ⚡ easy
Answer: One eigenvalue ~0 → det≈0 while trace>0 → R negative—rejected as corner.
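The flat, edge, and corner cases can be checked numerically. The following is a minimal NumPy sketch, not OpenCV's implementation: it assumes uniform window weights over a single 9×9 patch, central-difference gradients, and k = 0.04, and the helper name `harris_response` is made up for illustration.

```python
import numpy as np

def harris_response(patch, k=0.04):
    """Return (R, eigenvalues) of the second-moment matrix summed over one patch."""
    patch = patch.astype(np.float64)
    # np.gradient returns the derivative along axis 0 (rows, y) then axis 1 (cols, x).
    Iy, Ix = np.gradient(patch)
    # Uniform window weights (an assumption): sum gradient products over the patch.
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    R = np.linalg.det(M) - k * np.trace(M) ** 2
    return R, np.linalg.eigvalsh(M)

flat = np.full((9, 9), 50.0)
edge = np.zeros((9, 9))
edge[:, 5:] = 100.0          # vertical step edge
corner = np.zeros((9, 9))
corner[5:, 5:] = 100.0       # bright quadrant, corner near the patch centre

R_flat, _ = harris_response(flat)
R_edge, _ = harris_response(edge)
R_corner, eig_corner = harris_response(corner)
# R_flat is zero, R_edge is negative, R_corner is positive.
```

On these synthetic patches only the corner has two large eigenvalues, which is why it is the only one with a positive response.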
9 What is Shi-Tomasi “good features to track”? 📊 medium
Answer: Score = min(λ1, λ2) with threshold—more stable for tracking; picks corners with minimum directional strength guaranteed.
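For a 2×2 symmetric matrix the smaller eigenvalue has a closed form, so the Shi-Tomasi score needs no general eigen solver. A small sketch under that assumption; the matrix entries below are hypothetical sums of gradient products, not values from a real image.

```python
import math

def min_eigenvalue(a, b, c):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    half_trace = 0.5 * (a + c)
    # The term under the square root is never negative for a symmetric matrix.
    root = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return half_trace - root

# Hypothetical entries (sum Ix^2, sum IxIy, sum Iy^2) for a corner and an edge.
corner_score = min_eigenvalue(20000.0, 2500.0, 20000.0)  # 17500.0
edge_score = min_eigenvalue(45000.0, 0.0, 0.0)           # 0.0
```

The edge scores zero because one gradient direction carries no energy, so any positive threshold on min(λ1, λ2) rejects it.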
10 Effect of window size? 📊 medium
Answer: Larger window: smoother M, less localization noise but merges nearby corners; smaller: noisier, better localization.
11 Why Gaussian weights w? ⚡ easy
Answer: Emphasize center of patch, reduce boundary artifacts when sliding window—standard in cornerHarris.
12 Invariances of Harris? 📊 medium
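Answer: Rotation invariant (the eigenvalues of the symmetric M do not change when the patch rotates); not scale invariant, since a corner at one scale can look like an edge at a finer one. Gradients cancel additive brightness offsets, but a contrast gain rescales R, so a fixed threshold is only partially robust to affine intensity changes.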
13 How fix scale weakness? 🔥 hard
Answer: Multi-scale Harris, scale-space extrema (like SIFT), or detectors with inherent scale selection (LoG).
14 Refine corners to sub-pixel? 📊 medium
Answer: Fit quadratic to corner response surface or iterative refinement (OpenCV cornerSubPix) using gradients.
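The quadratic-fit idea can be illustrated in one dimension: fit a parabola through the response at a peak and its two neighbours and take the vertex. This is only a sketch of the response-surface approach applied per axis; it is not what cornerSubPix does internally, and `parabola_offset` is a hypothetical helper name.

```python
def parabola_offset(left, center, right):
    """Vertex offset (in pixels, relative to the centre sample) of the parabola
    through three equally spaced samples."""
    denom = left - 2.0 * center + right
    if denom == 0.0:
        return 0.0  # locally flat: nothing to refine
    return 0.5 * (left - right) / denom

# Samples of f(x) = 10 - (x - 0.25)^2 at x = -1, 0, 1; the true peak is at x = 0.25.
samples = [10 - (x - 0.25) ** 2 for x in (-1, 0, 1)]
offset = parabola_offset(*samples)  # 0.25
```

Applying the same formula along x and along y gives a simple separable sub-pixel estimate around an integer peak.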
15 Need NMS? ⚡ easy
Answer: Yes—Harris map is dense; keep local maxima above threshold separated by minimum distance.
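A brute-force version of threshold plus non-maximum suppression might look like the sketch below. It assumes a square window of the given radius and keeps every pixel that ties for the window maximum; the function name and test values are illustrative only.

```python
import numpy as np

def local_maxima(R, threshold, radius=1):
    """Return (row, col) positions above threshold that are the maximum of their window."""
    peaks = []
    h, w = R.shape
    for y in range(h):
        for x in range(w):
            v = R[y, x]
            if v <= threshold:
                continue
            # Clip the window at the image border.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            if v >= R[y0:y1, x0:x1].max():
                peaks.append((y, x))
    return peaks

R = np.zeros((7, 7))
R[2, 2] = 5.0
R[2, 3] = 4.0   # weaker neighbour of (2, 2): suppressed
R[5, 5] = 3.0
peaks = local_maxima(R, threshold=1.0, radius=1)  # [(2, 2), (5, 5)]
```

A larger radius plays the role of the minimum distance between accepted corners.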
16 Harris vs FAST? 📊 medium
Answer: FAST: speed-optimized segment test, not gradient matrix—faster, less accurate localization; Harris more principled, slower.
17 OpenCV cornerHarris output? ⚡ easy
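Answer: A float32 response map the same size as the input; threshold it and apply NMS to get points. goodFeaturesToTrack is the usual alternative: it computes the Shi-Tomasi score (or Harris with useHarrisDetector=True) and does the thresholding and minimum-distance filtering itself.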
import cv2
R = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)  # gray: single-channel uint8 or float32 image
18 Why use det − k·trace²? 🔥 hard
Answer: Algebraic proxy for “both eigenvalues large” without sqrt—computationally cheap and continuous score.
19 Gradients Ix, Iy? ⚡ easy
Answer: Usually Sobel or Scharr on smoothed image—noise reduction before derivative recommended.
20 Planar surface assumption? 📊 medium
Answer: Harris assumes small motion model in image plane—breaks for strong perspective on 3D corners unless patch small enough.

SIFT: 20 Essential Q&A

21 What is SIFT? ⚡ easy
Answer: Scale-Invariant Feature Transform—detects blob-like keypoints in scale-space and builds a 128-D gradient-orientation histogram descriptor; robust to scale, rotation, moderate viewpoint/lighting.
22 What is Difference of Gaussians (DoG)? 📊 medium
Answer: DoG = G(σ1)−G(σ2) approximates scale-normalized LoG—cheap way to find blob-like structures across scales.
23 What is an octave? 📊 medium
Answer: A set of images at one resolution whose blur σ spans a doubling of scale over several levels; the next octave starts from that image downsampled by 2, so a large scale range is covered efficiently.
24 How are keypoints detected? 🔥 hard
Answer: 3×3×3 neighborhood search for scale-space extrema (max/min) in DoG volume—candidate keypoints.
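The 26-neighbour comparison can be sketched as follows, assuming the DoG images of one octave are stacked into an array indexed as `dog[scale, row, col]` and that the queried voxel is not on the border. The function name is hypothetical.

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly above or strictly below all 26 neighbours."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = cube[1, 1, 1]
    # Index 13 is the centre of a flattened 3x3x3 cube (1*9 + 1*3 + 1).
    neighbours = np.delete(cube.ravel(), 13)
    return bool(center > neighbours.max() or center < neighbours.min())

dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0  # isolated peak in the middle scale
peak_found = is_scale_space_extremum(dog, 1, 2, 2)     # True
next_to_peak = is_scale_space_extremum(dog, 1, 2, 3)   # False
```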
25 Refinement and edge rejection? 🔥 hard
Answer: Taylor expansion fit for subpixel location and scale; reject low contrast; use Hessian of DoG to reject edge-like unstable peaks (ratio of principal curvatures).
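The edge test compares trace² / det of the 2×2 spatial Hessian of the DoG image against (r + 1)² / r. A minimal sketch, assuming the second derivatives have already been estimated by finite differences and using r = 10, the value from Lowe's paper; the function name is illustrative.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if the ratio of principal curvatures is below r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0.0:
        return False  # curvatures of opposite sign (saddle) or degenerate
    return trace * trace / det < (r + 1.0) ** 2 / r

blob_like = passes_edge_test(-4.0, -4.0, 0.0)    # True: equal curvatures
edge_like = passes_edge_test(-20.0, -1.0, 0.0)   # False: ratio of 20 exceeds r
```

As with the Harris response, the ratio test avoids computing the eigenvalues explicitly.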
26 Orientation histogram? 📊 medium
Answer: Weighted gradient orientations in neighborhood; peak(s) define canonical rotation—descriptor becomes rotation invariant.
27 How is the descriptor built? 📊 medium
Answer: 16×16 window into 4×4 cells; each cell has 8-bin orientation histogram of gradients; 4×4×8 = 128 values, normalized.
28 Why 4×4 grid? ⚡ easy
Answer: Balances spatial layout (localization) vs distinctiveness; finer grid more sensitive to deformation.
29 Why normalize twice? 📊 medium
Answer: L2 normalize to cancel contrast changes, clip large values (0.2 in Lowe's paper) so a few strong gradients from non-linear illumination effects do not dominate, then renormalize.
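The normalize, clip, renormalize sequence is short enough to write out. This sketch assumes a non-negative 128-D histogram vector and a clip value of 0.2 (the value used in Lowe's paper); `normalize_sift` is a hypothetical helper, not an OpenCV function.

```python
import numpy as np

def normalize_sift(vec, clip=0.2):
    """L2-normalise, clip large entries, then L2-normalise again."""
    v = np.asarray(vec, dtype=np.float64)
    v = v / max(np.linalg.norm(v), 1e-12)   # removes overall contrast scaling
    v = np.minimum(v, clip)                 # limits the influence of a few huge bins
    return v / max(np.linalg.norm(v), 1e-12)

raw = np.ones(128)
raw[0] = 50.0                     # one dominant gradient bin
desc = normalize_sift(raw)
same = normalize_sift(3.0 * raw)  # a contrast gain leaves the descriptor unchanged
```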
30 What is RootSIFT? 📊 medium
Answer: Apply square root to L1-normalized SIFT then L2 normalize—uses Hellinger kernel implicitly; often improves retrieval.
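RootSIFT is a post-processing step on existing descriptors. A minimal sketch, assuming non-negative SIFT vectors stored one per row; the final L2 step is kept to mirror the description, although after L1 normalization and a square root the vector already has unit L2 norm.

```python
import numpy as np

def root_sift(desc, eps=1e-12):
    """L1-normalise, take the element-wise square root, then L2-normalise."""
    d = np.asarray(desc, dtype=np.float64)
    d = d / (np.abs(d).sum(axis=-1, keepdims=True) + eps)
    d = np.sqrt(d)
    return d / (np.linalg.norm(d, axis=-1, keepdims=True) + eps)

a = root_sift([1.0, 1.0, 2.0])
b = root_sift([4.0, 0.0, 0.0])
similarity = float(a @ b)  # equals sum(sqrt(a_i * b_i)) of the L1-normalised inputs
```

Comparing RootSIFT vectors with Euclidean distance therefore corresponds to comparing the original histograms with the Hellinger kernel.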
31 SIFT invariances? 📊 medium
Answer: Scale and rotation by construction; tolerant of moderate affine distortion and illumination change, but not truly affine invariant, and it degrades under strong 3D viewpoint change.
32 SIFT vs ORB speed? ⚡ easy
Answer: SIFT heavier (float descriptor, pyramid DoG); ORB binary + FAST—ORB much faster on embedded/CPU.
33 SIFT patents? ⚡ easy
Answer: The US patent expired in March 2020; until then OpenCV kept SIFT in the contrib xfeatures2d module behind the nonfree build flag. It is now freely usable.
34 Typical matching? 📊 medium
Answer: L2 or cosine on float vectors; ratio test + RANSAC for geometry.
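The ratio test can be sketched with a brute-force nearest-neighbour search. The threshold of 0.75 is an assumption (values around 0.7 to 0.8 are common), the descriptors below are toy 2-D vectors rather than real SIFT output, and the geometric verification step with RANSAC is not shown.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Match each row of desc_a to desc_b, keeping only unambiguous nearest neighbours."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # L2 distance to every candidate
        order = np.argsort(dists)
        best, second = order[0], order[1]           # needs at least two candidates
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

desc_a = np.array([[0.0, 0.0], [10.0, 10.0], [2.5, 2.5]])
desc_b = np.array([[0.1, 0.0], [10.0, 10.2], [5.0, 5.0]])
matches = ratio_test_matches(desc_a, desc_b)  # [(0, 0), (1, 1)]
```

The third query sits almost midway between two candidates, so its best and second-best distances are nearly equal and the match is dropped as ambiguous.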
35 Contrast threshold? ⚡ easy
Answer: Filters weak DoG extrema—reduces unstable keypoints on flat noise.
36 Why DoG approximates LoG? 📊 medium
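Answer: Not an exact identity: from the heat equation ∂G/∂σ = σ∇²G, a finite difference gives G(kσ) − G(σ) ≈ (k − 1)σ²∇²G, so DoG equals the scale-normalized LoG up to the constant factor (k − 1); a cheap blob detector.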
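The approximation G(kσ) − G(σ) ≈ (k − 1)σ²∇²G can be checked numerically. The sketch below does this in one dimension with analytic Gaussians; σ = 1.6 and k = 1.1 are arbitrary illustrative choices, and the error grows as k moves away from 1.

```python
import numpy as np

def gauss(x, sigma):
    """Normalised 1-D Gaussian."""
    return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def gauss_xx(x, sigma):
    """Second derivative of the Gaussian with respect to x (the 1-D Laplacian)."""
    return gauss(x, sigma) * (x**2 / sigma**4 - 1 / sigma**2)

x = np.linspace(-8, 8, 801)
sigma, k = 1.6, 1.1
dog = gauss(x, k * sigma) - gauss(x, sigma)
log_scaled = (k - 1) * sigma**2 * gauss_xx(x, sigma)

# Worst-case difference relative to the peak magnitude of the scaled LoG.
rel_err = np.abs(dog - log_scaled).max() / np.abs(log_scaled).max()
```

The two curves have the same shape and sign pattern, which is all that extremum detection needs, since a constant factor does not move the extrema.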
37 Color SIFT? 🔥 hard
Answer: Compute SIFT on color channels or opponent color spaces for extra discriminability—more dimensions or fused descriptors.
38 PCA-SIFT? 🔥 hard
Answer: Project gradient patch to lower-dim PCA basis—smaller descriptor; less common now than vanilla SIFT or learned features.
39 OpenCV? ⚡ easy
Answer: cv2.SIFT_create() (in the main module since OpenCV 4.4.0, after the patent expired); detectAndCompute returns keypoints and descriptors.
40 Limitations? 📊 medium
Answer: Computation cost, repetitive texture ambiguities, limited with strong motion blur or specular highlights—deep features may win with data.

ORB: 20 Essential Q&A

41 What is ORB? ⚡ easy
Answer: Oriented FAST and Rotated BRIEF—free alternative to SIFT/SURF: FAST corners, orientation from intensity centroid, steered BRIEF binary descriptor with learned pattern (rBRIEF).
42 What is FAST? 📊 medium
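Answer: Compare the center pixel with the 16 pixels on a radius-3 circle around it; it is a corner if n contiguous circle pixels (commonly 9 or 12) are all brighter or all darker than the center by a threshold t. Only intensity comparisons are needed, so it is very fast.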
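A direct, unoptimized version of the segment test might look like this. It assumes the 16-pixel radius-3 circle, n = 9, and a threshold of 20, and it omits the high-speed pretest and learned decision tree that real FAST implementations use; the function name is illustrative.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, y, x, t=20, n=9):
    """Segment test: n contiguous circle pixels all brighter than p + t or all darker than p - t."""
    p = int(img[y][x])
    ring = [int(img[y + dy][x + dx]) for dx, dy in CIRCLE]
    for sign in (1, -1):                  # +1 tests "brighter", -1 tests "darker"
        flags = [sign * (v - p) > t for v in ring]
        run = 0
        for f in flags + flags:           # doubled list handles arcs that wrap around
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False

flat = [[50] * 7 for _ in range(7)]
# Bright image with a dark quadrant whose tip is the centre pixel (3, 3).
corner = [[0 if (r >= 3 and c >= 3) else 100 for c in range(7)] for r in range(7)]
# Straight vertical edge: dark on the left (including the centre), bright on the right.
edge = [[100 if c >= 4 else 0 for c in range(7)] for r in range(7)]

flat_hit = is_fast_corner(flat, 3, 3)      # False
corner_hit = is_fast_corner(corner, 3, 3)  # True: 11 contiguous brighter pixels
edge_hit = is_fast_corner(edge, 3, 3)      # False: only 7 contiguous brighter pixels
```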
43 What is BRIEF? 📊 medium
Answer: Binary string from pairwise intensity comparisons in smoothed patch—256 bits typical; match with Hamming distance.
44 ORB orientation? 📊 medium
Answer: Intensity centroid vs corner—angle of vector from keypoint to centroid gives dominant direction to steer BRIEF.
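The intensity-centroid angle comes from first-order image moments. A sketch under simplifying assumptions: a square patch is used here instead of ORB's circular one, and coordinates follow the image convention (x to the right, y downwards), so angles increase clockwise on screen.

```python
import numpy as np

def centroid_angle(patch):
    """Angle (radians) of the vector from the patch centre to its intensity centroid."""
    patch = np.asarray(patch, dtype=np.float64)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0     # coordinates relative to the patch centre
    ys = ys - (h - 1) / 2.0
    m10 = np.sum(xs * patch)    # first-order moment in x
    m01 = np.sum(ys * patch)    # first-order moment in y
    return float(np.arctan2(m01, m10))

bright_right = np.zeros((7, 7))
bright_right[:, 5:] = 1.0
bright_bottom = np.zeros((7, 7))
bright_bottom[5:, :] = 1.0

angle_right = centroid_angle(bright_right)    # 0.0
angle_bottom = centroid_angle(bright_bottom)  # pi / 2
```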
45 What is rBRIEF? 🔥 hard
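Answer: A BRIEF test set learned by greedy search over a training set of oriented keypoints: it keeps pairs whose bits have high variance (mean near 0.5) and low correlation with each other, restoring the discrimination that plain steered BRIEF loses.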
46 Scale in ORB? ⚡ easy
Answer: Image pyramid with FAST+BRIEF at each level—approximates scale invariance like other multi-scale detectors.
47 Match ORB how? ⚡ easy
Answer: Hamming distance on bitstrings—very fast with POPCNT; BFMatcher or LSH variants.
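Hamming distance on packed descriptors is XOR followed by a bit count. The NumPy sketch below assumes descriptors stored as uint8 rows of 32 bytes, as ORB produces; it is a readable reference, not a replacement for BFMatcher with NORM_HAMMING, which is much faster.

```python
import numpy as np

def hamming_matrix(des_a, des_b):
    """Pairwise Hamming distances between uint8 arrays of shape (N, B) and (M, B)."""
    xor = np.bitwise_xor(des_a[:, None, :], des_b[None, :, :])
    # unpackbits expands each byte into 8 bits; summing them is a popcount.
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

des_a = np.zeros((1, 32), dtype=np.uint8)
des_b = np.zeros((3, 32), dtype=np.uint8)
des_b[1, 0] = 255      # differs from des_a in 8 bits
des_b[2, :] = 255      # differs in all 256 bits
dist = hamming_matrix(des_a, des_b)  # [[0, 8, 256]]
```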
48 ORB vs BRISK? 📊 medium
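Answer: BRISK detects AGAST/FAST-style corners in a scale-space pyramid and samples a hand-designed pattern of concentric rings, whereas ORB uses a learned pair set; both are binary and matched with Hamming distance, and they differ mainly in sampling pattern and scale handling.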
49 ORB vs SIFT? 📊 medium
Answer: ORB: faster, compact binary, less discriminative on hard wide-baseline; SIFT: float 128-D, heavier, often stronger on difficult pairs.
50 Typical ORB length? ⚡ easy
Answer: 256 bits (32 bytes). In OpenCV this length is fixed; WTA_K changes how each bit is produced (and the matching norm), not the descriptor size.
51 BRIEF pixel pairs? 📊 medium
Answer: Predefined or learned (x_i, y_i) locations in patch; compare I(x_i)<I(y_i) → bit—rotation steers coordinates.
52 Binary descriptor noise? ⚡ easy
Answer: Sensitive to bit flips from noise—Gaussian smoothing before sampling reduces; strong blur changes comparisons.
53 Steered BRIEF? 🔥 hard
Answer: Rotate sampling coordinates by orientation θ before comparisons—makes descriptor rotation invariant.
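Steering amounts to rotating the pair coordinates before sampling. The sketch below assumes nearest-pixel sampling, pair offsets given as (x, y) relative to the patch centre, and image coordinates with y pointing down; the pair list is made up for illustration and is not ORB's learned pattern.

```python
import numpy as np

def steered_brief(patch, pairs, theta):
    """Bits from intensity comparisons at pair offsets rotated by theta about the patch centre."""
    patch = np.asarray(patch, dtype=np.float64)
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    bits = []
    for p, q in pairs:
        px, py = rot @ np.asarray(p, dtype=np.float64)
        qx, qy = rot @ np.asarray(q, dtype=np.float64)
        a = patch[int(round(cy + py)), int(round(cx + px))]
        b = patch[int(round(cy + qy)), int(round(cx + qx))]
        bits.append(1 if a < b else 0)
    return bits

rng = np.random.default_rng(0)
patch = rng.random((9, 9))
pairs = [((1, 0), (0, 2)), ((-3, 2), (2, -1)), ((4, -4), (-2, 3)), ((0, 0), (3, 3))]

rotated = np.rot90(patch, -1)  # patch rotated 90 degrees clockwise
bits_original = steered_brief(patch, pairs, 0.0)
bits_rotated = steered_brief(rotated, pairs, np.pi / 2)
```

With the correct orientation supplied, the rotated patch yields the same bits as the original, which is the invariance the steering is meant to provide.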
54 WTA in ORB OpenCV? 🔥 hard
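Answer: WTA_K sets how many points feed each test: the default 2 gives an ordinary BRIEF bit, while 3 or 4 take the index of the brightest of 3 or 4 points and store it in 2 bits; those descriptors should be matched with NORM_HAMMING2 instead of NORM_HAMMING.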
55 OpenCV? ⚡ easy
Answer: cv2.ORB_create(nfeatures=500) → detectAndCompute.
orb = cv2.ORB_create(500)
kp, des = orb.detectAndCompute(gray, None)
56 NMS on FAST? ⚡ easy
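Answer: Yes, nearby FAST responses are suppressed. ORB then ranks the survivors by Harris score and keeps the top N per pyramid level; some pipelines add grid bucketing to spread features evenly across the image.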
57 Harris on FAST? 📊 medium
Answer: Use Harris measure on candidate FAST points to rank corner quality.
58 Why ORB on mobile? ⚡ easy
Answer: Low memory, integer/bit ops, real-time VO/SLAM on CPUs without GPU.
59 When ORB struggles? 📊 medium
Answer: Strong viewpoint change, repetitive textures, heavy motion blur—may need SIFT/AKAZE or learning methods.
60 What is AKAZE briefly? 📊 medium
Answer: Nonlinear scale space + binary descriptor—often stronger than ORB on some benchmarks, still efficient.