Local Feature Detectors — Interview Q&A

Question 1

1 What does the Harris corner detector find? ⚡ easy

Answer

Answer: Locations where intensity changes strongly in two directions (corners and strong junctions), found from the second-moment matrix of local image gradients.

Question 2

2 What does Harris maximize? 📊 medium

Answer

Answer: The SSD change E(u,v) of a patch under a small shift (u,v), approximated by the quadratic form [u v] M [u v]ᵀ with structure tensor M. A corner is a point where E is large for every shift direction.

Question 3

3 Define the second-moment matrix M. 🔥 hard

Answer

Answer: M = Σ w(x,y) [Ix² IxIy; IxIy Iy²] over a window—captures local gradient covariance; eigenvectors give principal gradient directions.

Question 4

4 Interpret eigenvalues λ1, λ2 of M? 📊 medium

Answer

Answer: Both small: flat; one large, one small: edge; both large: corner (intensity varies along two orthogonal directions).

Question 5

5 Harris response R? 📊 medium

Answer

Answer: R = det(M) − k·trace(M)² = λ1λ2 − k(λ1+λ2)²—avoids explicit eigen decomposition; k ≈ 0.04–0.06 typical.
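
To make the formula concrete, here is a minimal NumPy sketch of a Harris response map. It assumes central-difference gradients and a plain box window in place of w(x,y); the helper names box_sum and harris_response are illustrative and not taken from any library.

```python
import numpy as np

def box_sum(a, r):
    """Sum each pixel's (2r+1)x(2r+1) neighborhood, using zero padding."""
    p = np.pad(a, r, mode="constant")
    h, w = a.shape
    out = np.zeros_like(a, dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, k=0.04, r=1):
    """R = det(M) - k * trace(M)^2, with M summed over a box window (assumption)."""
    iy, ix = np.gradient(img.astype(float))  # derivatives along rows, then columns
    sxx = box_sum(ix * ix, r)
    syy = box_sum(iy * iy, r)
    sxy = box_sum(ix * iy, r)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic image: bright quadrant whose corner sits at row 8, column 8.
img = np.zeros((20, 20))
img[8:, 8:] = 1.0
R = harris_response(img)
```

On this image R is positive at the quadrant corner, negative along its straight edges, and zero in the flat background.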

Question 6

6 Effect of k? ⚡ easy

Answer

Answer: Tunes sensitivity vs noise; too large suppresses corners; empirical constant, not learned from data in classical form.

Question 7

7 R on flat region? ⚡ easy

Answer

Answer: det≈0 and trace≈0, so R≈0 and falls below any positive threshold: rejected.

Question 8

8 R on edge? ⚡ easy

Answer

Answer: One eigenvalue ~0 → det≈0 while trace>0 → R negative—rejected as corner.
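
The flat, edge, and corner cases can be checked numerically from the eigenvalue form of R. This is a small sketch; the eigenvalue pairs below are made-up illustrative numbers, not measurements.

```python
def harris_r(l1, l2, k=0.04):
    """Harris response written in terms of the eigenvalues of M."""
    return l1 * l2 - k * (l1 + l2) ** 2

flat = harris_r(0.01, 0.01)    # both eigenvalues small
edge = harris_r(10.0, 0.01)    # one large, one small
corner = harris_r(10.0, 8.0)   # both large
```

Here flat is close to zero, edge is negative, and corner is clearly positive, which matches the three cases above.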

Question 9

9 What is Shi-Tomasi “good features to track”? 📊 medium

Answer

Answer: Score = min(λ1, λ2) with threshold—more stable for tracking; picks corners with minimum directional strength guaranteed.
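
For a 2×2 symmetric matrix the smaller eigenvalue has a closed form, so the Shi-Tomasi score needs no general eigen solver. A minimal sketch, where a, b, c stand for the summed Ix², IxIy, Iy² entries of M and the function name is illustrative:

```python
import math

def min_eigenvalue(a, b, c):
    """Smaller eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    half_trace = 0.5 * (a + c)
    return half_trace - math.sqrt((0.5 * (a - c)) ** 2 + b * b)

score_corner = min_eigenvalue(2.0, 0.0, 3.0)  # eigenvalues are 2 and 3
score_edge = min_eigenvalue(4.0, 0.0, 0.0)    # eigenvalues are 4 and 0
```

A point is kept when this score exceeds the chosen threshold, so the edge case (score 0) is rejected regardless of how strong its single gradient direction is.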

Question 10

10 Effect of window size? 📊 medium

Answer

Answer: Larger window: smoother M, less localization noise but merges nearby corners; smaller: noisier, better localization.

Question 11

11 Why Gaussian weights w? ⚡ easy

Answer

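Answer: Emphasize the patch center and give a more isotropic response; a box window weights its far corners as much as its center. Note that OpenCV's cornerHarris sums over a blockSize × blockSize box window rather than a Gaussian.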

Question 12

12 Invariances of Harris? 📊 medium

Answer

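Answer: Rotation invariant (the eigenvalues of symmetric M do not change under rotation); not scale invariant, since the same corner can look like an edge at a finer scale. Additive brightness shifts cancel in the gradients, but contrast scaling changes R, so a fixed threshold is only partially invariant to affine intensity changes.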

Question 13

13 How fix scale weakness? 🔥 hard

Answer

Answer: Multi-scale Harris, scale-space extrema (like SIFT), or detectors with inherent scale selection (LoG).

Question 14

14 Refine corners to sub-pixel? 📊 medium

Answer

Answer: Fit quadratic to corner response surface or iterative refinement (OpenCV cornerSubPix) using gradients.
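
The quadratic-fit option can be sketched per axis: fit a parabola through the response at the peak pixel and its two neighbors and take the vertex. This is an illustrative helper, not what cornerSubPix does internally (that routine uses an iterative gradient-based refinement).

```python
def parabola_offset(left, center, right):
    """Sub-pixel offset of the vertex of the parabola through
    samples at positions -1, 0, +1 (applied separately along x and y)."""
    denom = left - 2.0 * center + right
    if denom == 0.0:          # three collinear samples: no unique peak
        return 0.0
    return 0.5 * (left - right) / denom

# Samples of f(x) = -(x - 0.3)^2 at x = -1, 0, 1; the true peak is at 0.3.
offset = parabola_offset(-1.69, -0.09, -0.49)
```

The refined coordinate is the integer peak position plus the offset, which lies in (-0.5, 0.5) whenever the center sample is a strict local maximum.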

Question 15

15 Need NMS? ⚡ easy

Answer

Answer: Yes—Harris map is dense; keep local maxima above threshold separated by minimum distance.
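
One simple way to implement this is a greedy pass over candidates sorted by response. The sketch below assumes a Euclidean minimum distance and is quadratic in the number of kept points, which is fine for illustration; select_corners is a hypothetical name.

```python
import numpy as np

def select_corners(R, threshold, min_dist):
    """Keep the strongest responses above threshold, at least min_dist apart."""
    ys, xs = np.nonzero(R > threshold)
    order = np.argsort(-R[ys, xs])          # strongest first
    kept = []
    for i in order:
        y, x = int(ys[i]), int(xs[i])
        if all((y - ky) ** 2 + (x - kx) ** 2 >= min_dist ** 2 for ky, kx in kept):
            kept.append((y, x))
    return kept

R = np.zeros((10, 10))
R[2, 2], R[2, 3], R[7, 7], R[5, 5] = 5.0, 4.0, 3.0, 0.5
corners = select_corners(R, threshold=1.0, min_dist=3)
```

In this toy map the response at (2, 3) is suppressed by its stronger neighbor at (2, 2), and (5, 5) falls below the threshold.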

Question 16

16 Harris vs FAST? 📊 medium

Answer

Answer: FAST: speed-optimized segment test, not gradient matrix—faster, less accurate localization; Harris more principled, slower.

Question 17

17 OpenCV cornerHarris output? ⚡ easy

Answer

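Answer: Float response map the same size as the image; threshold + NMS to get points. goodFeaturesToTrack is an alternative rather than a follow-up step: it computes the Shi-Tomasi (or Harris) score itself and returns corner coordinates directly.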

Question 18

18 Why use det − k·trace²? 🔥 hard

Answer

Answer: Algebraic proxy for “both eigenvalues large” without sqrt—computationally cheap and continuous score.

Question 19

19 Gradients Ix, Iy? ⚡ easy

Answer

Answer: Usually Sobel or Scharr on smoothed image—noise reduction before derivative recommended.

Question 20

20 Planar surface assumption? 📊 medium

Answer

Answer: Harris assumes small motion model in image plane—breaks for strong perspective on 3D corners unless patch small enough.

Question 21

21 What is SIFT? ⚡ easy

Answer

Answer: Scale-Invariant Feature Transform—detects blob-like keypoints in scale-space and builds a 128-D gradient-orientation histogram descriptor; robust to scale, rotation, moderate viewpoint/lighting.

Question 22

22 What is Difference of Gaussians (DoG)? 📊 medium

Answer

Answer: DoG = G(σ1)−G(σ2) approximates scale-normalized LoG—cheap way to find blob-like structures across scales.

Question 23

23 What is an octave? 📊 medium

Answer

Answer: A group of images at one resolution whose blur spans a doubling of σ, sampled at several σ levels. The next octave starts from the image downsampled by 2, which covers a large scale range efficiently.

Question 24

24 How are keypoints detected? 🔥 hard

Answer

Answer: 3×3×3 neighborhood search for scale-space extrema (max/min) in DoG volume—candidate keypoints.
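
The 26-neighbor test can be sketched as follows, assuming the DoG images of one octave are stacked into an array of shape (scales, height, width) and the candidate is an interior voxel. The function name is illustrative.

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly above or below all 26 neighbors."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = dog[s, y, x]
    others = np.delete(cube.ravel(), 13)   # index 13 is the cube center
    return bool(center > others.max() or center < others.min())

dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0
is_peak = is_scale_space_extremum(dog, 1, 2, 2)
```

A full detector would run this over every interior voxel and then apply the refinement and rejection steps to the surviving candidates.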

Question 25

25 Refinement and edge rejection? 🔥 hard

Answer

Answer: Taylor expansion fit for subpixel location and scale; reject low contrast; use Hessian of DoG to reject edge-like unstable peaks (ratio of principal curvatures).

Question 26

26 Orientation histogram? 📊 medium

Answer

Answer: Weighted gradient orientations in neighborhood; peak(s) define canonical rotation—descriptor becomes rotation invariant.

Question 27

27 How is the descriptor built? 📊 medium

Answer

Answer: 16×16 window into 4×4 cells; each cell has 8-bin orientation histogram of gradients; 4×4×8 = 128 values, normalized.
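
The 4×4×8 layout can be sketched in NumPy. This is a simplified illustration: it assumes the 16×16 patch is already rotated to the keypoint orientation, and it omits the Gaussian weighting, the trilinear interpolation between bins, and the clipping step that a full SIFT implementation uses.

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-D descriptor from a 16x16 patch: 4x4 cells, 8 orientation bins each."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    bins = np.minimum((ang / (2 * np.pi) * 8).astype(int), 7)
    desc = np.zeros((4, 4, 8))
    for y in range(16):
        for x in range(16):
            desc[y // 4, x // 4, bins[y, x]] += mag[y, x]  # magnitude-weighted vote
    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

# Horizontal intensity ramp: every gradient points along +x (bin 0).
patch = np.tile(np.arange(16.0), (16, 1))
d = sift_like_descriptor(patch)
```

For the ramp, all the mass lands in bin 0 of each of the 16 cells, so the descriptor has 16 equal non-zero entries.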

Question 28

28 Why 4×4 grid? ⚡ easy

Answer

Answer: Balances spatial layout (localization) vs distinctiveness; finer grid more sensitive to deformation.

Question 29

29 Why normalize twice? 📊 medium

Answer

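Answer: L2-normalize, clip each component at 0.2 (Lowe's value), then renormalize. The first normalization cancels contrast (gain) changes, gradients already cancel brightness offsets, and clipping limits the influence of a few large gradients caused by nonlinear lighting such as saturation.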
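
The three steps are short enough to write out. A minimal sketch, assuming a non-zero, non-negative descriptor vector and the commonly quoted clip value of 0.2; the function name is illustrative.

```python
import numpy as np

def normalize_clip_renormalize(desc, clip=0.2):
    """L2-normalize, cap large components, then L2-normalize again."""
    d = desc / np.linalg.norm(desc)
    d = np.minimum(d, clip)
    return d / np.linalg.norm(d)

raw = np.ones(128)
raw[0] = 100.0                      # one dominant gradient bin
out = normalize_clip_renormalize(raw)
```

The dominant bin still ends up as the largest entry, but its share of the vector shrinks, and scaling raw by any positive constant gives the same output.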

Question 30

30 What is RootSIFT? 📊 medium

Answer

Answer: Apply square root to L1-normalized SIFT then L2 normalize—uses Hellinger kernel implicitly; often improves retrieval.
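
As a sketch, RootSIFT is a two-line transform on an existing SIFT vector. The small eps guard against an all-zero descriptor is an assumption added here for safety.

```python
import numpy as np

def root_sift(desc, eps=1e-12):
    """L1-normalize, then take the element-wise square root."""
    d = desc / (np.abs(desc).sum() + eps)
    return np.sqrt(d)

a = root_sift(np.array([1.0, 4.0, 0.0, 9.0]))
```

For non-negative input the result already has unit L2 norm (the squares sum to the L1-normalized total of 1), so the explicit L2 step changes nothing in that case. Euclidean distance between two such vectors then corresponds to the Hellinger kernel on the original descriptors.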

Question 31

31 SIFT invariances? 📊 medium

Answer

Answer: Scale and rotation by construction; only partially robust to affine distortion and moderate viewpoint change (the histogram cells tolerate small shifts); not invariant to strong 3D perspective.

Question 32

32 SIFT vs ORB speed? ⚡ easy

Answer

Answer: SIFT heavier (float descriptor, pyramid DoG); ORB binary + FAST—ORB much faster on embedded/CPU.

Question 33

33 SIFT patents? ⚡ easy

Answer

Answer: Were encumbered in US until expired (~2020); OpenCV contrib had nonfree flag—now widely usable.

Question 34

34 Typical matching? 📊 medium

Answer

Answer: L2 or cosine on float vectors; ratio test + RANSAC for geometry.
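
The ratio test can be sketched with brute-force L2 matching in NumPy. The 0.8 default follows the value commonly attributed to Lowe and is tunable; the function name is illustrative, and the geometric check with RANSAC is not shown.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match rows of desc_a to rows of desc_b, keeping only matches whose
    best distance is clearly smaller than the second-best distance."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches

desc_b = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
desc_a = np.array([[0.1, 0.0], [5.0, 5.0]])
matches = ratio_test_matches(desc_a, desc_b)
```

The first query has one clearly closest neighbor and is kept; the second is equidistant from all three candidates and is dropped as ambiguous.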

Question 35

35 Contrast threshold? ⚡ easy

Answer

Answer: Filters weak DoG extrema—reduces unstable keypoints on flat noise.

Question 36

36 Why DoG approximates LoG? 📊 medium

Answer

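Answer: Not an exact identity but a finite-difference approximation in σ: DoG with a constant σ ratio k (for example near √2) is proportional to the scale-normalized LoG σ²∇²G, so it works as a cheap blob detector.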
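
The relation follows from the heat (diffusion) equation satisfied by the Gaussian. A short derivation, where k denotes the constant ratio between adjacent scales:

```latex
\frac{\partial G}{\partial \sigma} = \sigma \nabla^{2} G
\quad\Longrightarrow\quad
\sigma \nabla^{2} G \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}
\quad\Longrightarrow\quad
G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\,\sigma^{2} \nabla^{2} G
```

Because the factor (k − 1) is the same at every scale, it does not move the extrema, so DoG extrema approximate those of the scale-normalized LoG.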

Question 37

37 Color SIFT? 🔥 hard

Answer

Answer: Compute SIFT on color channels or opponent color spaces for extra discriminability—more dimensions or fused descriptors.

Question 38

38 PCA-SIFT? 🔥 hard

Answer

Answer: Project gradient patch to lower-dim PCA basis—smaller descriptor; less common now than vanilla SIFT or learned features.

Question 39

39 OpenCV? ⚡ easy

Answer

Answer: cv2.SIFT_create() (in the main module since OpenCV 4.4.0, after patent expiry); detectAndCompute returns keypoints + descriptors.

Question 40

40 Limitations? 📊 medium

Answer

Answer: Computation cost, repetitive texture ambiguities, limited with strong motion blur or specular highlights—deep features may win with data.

Question 41

41 What is ORB? ⚡ easy

Answer

Answer: Oriented FAST and Rotated BRIEF—free alternative to SIFT/SURF: FAST corners, orientation from intensity centroid, steered BRIEF binary descriptor with learned pattern (rBRIEF).

Question 42

42 What is FAST? 📊 medium

Answer

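Answer: Compare the center pixel with the 16 pixels on a radius-3 circle; it is a corner if N contiguous circle pixels (commonly 9 or 12) are all brighter or all darker than the center by a threshold. Only comparisons are needed, so it is very fast.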
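
The segment test itself can be sketched in plain Python. This assumes the 16 circle intensities are already gathered in circular order and uses n=9 by default; the early-rejection shortcut and the learned decision tree of the original FAST are not shown.

```python
def fast_segment_test(center, ring, threshold, n=9):
    """True if n contiguous ring pixels are all brighter or all darker
    than the center by more than threshold."""
    brighter = [p > center + threshold for p in ring]
    darker = [p < center - threshold for p in ring]
    for flags in (brighter, darker):
        run = 0
        for f in flags + flags:      # doubled list handles wraparound
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False

# Nine bright pixels that wrap around the end of the ring (indices 12..15, 0..4).
ring = [50] * 16
for i in (12, 13, 14, 15, 0, 1, 2, 3, 4):
    ring[i] = 100
is_corner = fast_segment_test(center=50, ring=ring, threshold=20)
```

Doubling the flag list is a simple way to treat the ring as circular without index arithmetic.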

Question 43

43 What is BRIEF? 📊 medium

Answer

Answer: Binary string from pairwise intensity comparisons in smoothed patch—256 bits typical; match with Hamming distance.

Question 44

44 ORB orientation? 📊 medium

Answer

Answer: Intensity centroid vs corner—angle of vector from keypoint to centroid gives dominant direction to steer BRIEF.
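
The centroid angle comes from the first-order image moments of the patch. A minimal sketch, assuming a square patch centered on the keypoint (ORB itself uses a circular patch) and image coordinates with y pointing down:

```python
import math
import numpy as np

def intensity_centroid_angle(patch):
    """Angle of the vector from the patch center to its intensity centroid."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0          # coordinates relative to the center
    ys = ys - (h - 1) / 2.0
    m10 = float((xs * patch).sum())  # first-order moment in x
    m01 = float((ys * patch).sum())  # first-order moment in y
    return math.atan2(m01, m10)

right = np.zeros((7, 7)); right[:, 4:] = 1.0    # brighter on the right
bottom = np.zeros((7, 7)); bottom[4:, :] = 1.0  # brighter at the bottom
theta_right = intensity_centroid_angle(right)
theta_bottom = intensity_centroid_angle(bottom)
```

The right-heavy patch gives an angle of 0 and the bottom-heavy patch gives π/2, so the angle follows the bright side of the patch as it rotates.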

Question 45

45 What is rBRIEF? 🔥 hard

Answer

Answer: Learn subset of BRIEF pairs with low correlation under rotation—better variance and discrimination than random BRIEF when oriented.

Question 46

46 Scale in ORB? ⚡ easy

Answer

Answer: Image pyramid with FAST+BRIEF at each level—approximates scale invariance like other multi-scale detectors.

Question 47

47 Match ORB how? ⚡ easy

Answer

Answer: Hamming distance on bitstrings—very fast with POPCNT; BFMatcher or LSH variants.
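
Hamming distance is XOR followed by a bit count. A plain-Python sketch on byte strings; real matchers do the same thing with hardware popcount on wider words.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

d_far = hamming(b"\x00" * 32, b"\xff" * 32)          # two 256-bit descriptors
d_near = hamming(bytes([0b1010]), bytes([0b0110]))
```

For 32-byte ORB descriptors the distance ranges from 0 (identical) to 256 (every bit differs).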

Question 48

48 ORB vs BRISK? 📊 medium

Answer

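Answer: BRISK detects FAST/AGAST-style corners in a scale-space pyramid and samples a fixed, hand-designed pattern of concentric rings; ORB uses pyramid FAST plus learned rBRIEF pairs. Both are binary; they differ in sampling pattern and scale handling.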

Question 49

49 ORB vs SIFT? 📊 medium

Answer

Answer: ORB: faster, compact binary, less discriminative on hard wide-baseline; SIFT: float 128-D, heavier, often stronger on difficult pairs.

Question 50

50 Typical ORB length? ⚡ easy

Answer

Answer: 256 bits (32 bytes) in OpenCV. WTA_K changes how each test is formed (and which matching norm to use), not the descriptor length.

Question 51

51 BRIEF pixel pairs? 📊 medium

Answer

Answer: Predefined or learned (x_i, y_i) locations in patch; compare I(x_i)<I(y_i) → bit—rotation steers coordinates.

Question 52

52 Binary descriptor noise? ⚡ easy

Answer

Answer: Sensitive to bit flips from noise—Gaussian smoothing before sampling reduces; strong blur changes comparisons.

Question 53

53 Steered BRIEF? 🔥 hard

Answer

Answer: Rotate sampling coordinates by orientation θ before comparisons—makes descriptor rotation invariant.
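
Steering can be sketched by rotating the pair offsets before sampling. The sketch assumes a square, odd-sized, pre-smoothed patch, integer offsets small enough to stay inside it after rotation, and nearest-pixel rounding; the pair list here is random, not ORB's learned pattern, and the function name is illustrative.

```python
import numpy as np

def steered_brief(patch, pairs, theta):
    """pairs: (n, 4) array of (x1, y1, x2, y2) offsets from the patch center.
    Returns n bits, bit = 1 where I(p1) < I(p2) after rotating offsets by theta."""
    c = patch.shape[0] // 2
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    rot = np.array([[cos_t, -sin_t], [sin_t, cos_t]])
    p1 = np.rint(pairs[:, 0:2] @ rot.T).astype(int)
    p2 = np.rint(pairs[:, 2:4] @ rot.T).astype(int)
    a = patch[c + p1[:, 1], c + p1[:, 0]]   # index as [row = y, column = x]
    b = patch[c + p2[:, 1], c + p2[:, 0]]
    return (a < b).astype(np.uint8)

rng = np.random.default_rng(0)
patch = rng.random((9, 9))
pairs = rng.integers(-3, 4, size=(32, 4))
bits = steered_brief(patch, pairs, 0.0)
bits_rotated = steered_brief(np.rot90(patch), pairs, -np.pi / 2)
```

Rotating the patch by a quarter turn and steering by the matching angle reproduces the same bits, which is the invariance the answer describes. For arbitrary angles, rounding to the nearest pixel makes the match approximate rather than exact.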

Question 54

54 WTA in ORB OpenCV? 🔥 hard

Answer

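Answer: WTA_K sets how many points feed each test. The default 2 gives one bit per pairwise comparison; 3 or 4 stores the index of the brightest point as 2 bits per test, and such descriptors should be matched with NORM_HAMMING2 instead of NORM_HAMMING.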

Question 55

55 OpenCV? ⚡ easy

Answer

Answer: cv2.ORB_create(nfeatures=500) → detectAndCompute.

Question 56

56 NMS on FAST? ⚡ easy

Answer

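Answer: Suppress nearby FAST responses so corners do not cluster. ORB then ranks candidates by a Harris score and keeps the strongest per pyramid level; some pipelines (for example ORB-SLAM) also bucket keypoints on a grid to spread them across the image.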

Question 57

57 Harris on FAST? 📊 medium

Answer

Answer: Use Harris measure on candidate FAST points to rank corner quality.

Question 58

58 Why ORB on mobile? ⚡ easy

Answer

Answer: Low memory, integer/bit ops, real-time VO/SLAM on CPUs without GPU.

Question 59

59 When ORB struggles? 📊 medium

Answer

Answer: Strong viewpoint change, repetitive textures, heavy motion blur—may need SIFT/AKAZE or learning methods.

Question 60

60 What is AKAZE briefly? 📊 medium

Answer

Answer: Nonlinear scale space + binary descriptor—often stronger than ORB on some benchmarks, still efficient.
