Computer Vision Interview
20 essential Q&A
Updated 2026
Stereo Vision: 20 Essential Q&A
Disparity, baselines, rectification, and classical dense matching for depth maps.
~11 min read
20 questions
Advanced
Tags: disparity · baseline · rectify · SGBM
1
What is stereo vision?
⚡ easy
Answer: Using two (or more) calibrated views with known baseline to recover depth via triangulation of corresponding points.
2
Define disparity.
📊 medium
Answer: Horizontal shift between corresponding pixels in a rectified stereo pair—larger disparity means closer surface (inverse relation to depth).
3
Depth from disparity?
📊 medium
Answer: Z ≈ f × B / d (f focal length, B baseline, d disparity)—assumes rectified parallel cameras and pinhole model.
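The formula above can be sketched as a small helper; the focal length, baseline, and disparity values below are hypothetical, chosen only for illustration:

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Z = f * B / d for a rectified, parallel pinhole stereo rig.

    f_px: focal length in pixels; baseline_m: baseline in meters;
    disparity_px: positive disparity in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return f_px * baseline_m / disparity_px

# Hypothetical rig: f = 700 px, B = 0.12 m, disparity = 42 px -> Z = 2.0 m
z = depth_from_disparity(700.0, 0.12, 42.0)
```

Note the inverse relation from Q2: halving the disparity doubles the depth, which is why depth resolution degrades quadratically with distance.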
4
Baseline tradeoff?
📊 medium
Answer: Larger B increases depth precision (more parallax) but worsens occlusions and matching in narrow scenes; small B reduces measurable disparity range.
5
What is rectification?
🔥 hard
Answer: Warp both images so epipolar lines are horizontal scanlines—reduces correspondence search to 1D and simplifies disparity.
6
Epipolar constraint?
📊 medium
Answer: Without rectification, the match for a point is constrained to lie on its epipolar line in the other image—a consequence of the epipolar geometry of two views.
7
What is stereo matching?
📊 medium
Answer: For each pixel (or patch), find best match along epipolar line using photometric cost (SAD, census, CNN features).
8
Cost volume?
🔥 hard
Answer: 3D array H×W×D of matching costs over disparity levels—winner-take-all or global optimization (SGM, belief propagation) picks disparities.
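A toy sketch of a cost volume over a single rectified scanline pair, using absolute intensity difference as the cost and winner-take-all selection (the rows and max_disp below are assumptions for illustration):

```python
import numpy as np

def cost_volume_wta(left, right, max_disp):
    """Build a (W, D) cost slice for one scanline and pick disparities by WTA.

    left, right: 1D intensity rows from a rectified pair.
    """
    W = len(left)
    vol = np.full((W, max_disp), np.inf)
    for d in range(max_disp):
        # In a rectified pair, left pixel x matches right pixel x - d.
        vol[d:, d] = np.abs(left[d:] - right[:W - d])
    disp = np.argmin(vol, axis=1)  # winner-take-all per pixel
    return vol, disp
```

A full pipeline stacks one such slice per image row into the H×W×D volume before aggregation or optimization.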
9
What is SGM?
🔥 hard
Answer: Semi-Global Matching aggregates costs along many paths with smoothness penalties—good quality/speed tradeoff in OpenCV StereoSGBM.
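A minimal sketch of SGM's core recurrence along a single left-to-right path over one scanline (real SGM sums 8–16 such paths; P1/P2 values here are arbitrary illustrations):

```python
import numpy as np

def sgm_aggregate_1d(cost, P1=1.0, P2=4.0):
    """Aggregate a (W, D) scanline cost slice along one path.

    Transitions: same disparity (free), +/-1 disparity (penalty P1),
    larger jumps (penalty P2). Subtracting min(prev) keeps values bounded.
    """
    W, D = cost.shape
    L = np.zeros((W, D), dtype=float)
    L[0] = cost[0]
    for x in range(1, W):
        prev = L[x - 1]
        best_prev = prev.min()
        shift_minus = np.concatenate(([np.inf], prev[:-1]))  # from d-1
        shift_plus = np.concatenate((prev[1:], [np.inf]))    # from d+1
        trans = np.minimum.reduce([prev,
                                   shift_minus + P1,
                                   shift_plus + P1,
                                   np.full(D, best_prev + P2)])
        L[x] = cost[x] + trans - best_prev
    return L
```

The smoothness penalties let a noisy pixel inherit the disparity of its neighbors instead of winning on raw cost alone, which is exactly what suppresses speckle in SGBM output.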
10
Occlusion regions?
📊 medium
Answer: Pixels visible in only one view have undefined disparity—detected by consistency checks or left-right validation.
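The left-right validation mentioned above can be sketched on 1D disparity rows; a pixel passes when the right image's disparity at the matched location agrees within a tolerance (the tolerance of 1 px is a common but assumed default):

```python
import numpy as np

def lr_check(disp_left, disp_right, tol=1):
    """Flag pixels whose left and right disparities are mutually consistent."""
    W = len(disp_left)
    valid = np.zeros(W, dtype=bool)
    for x in range(W):
        d = disp_left[x]
        xr = x - d  # column this pixel maps to in the right image
        if 0 <= xr < W and abs(disp_right[xr] - d) <= tol:
            valid[x] = True
    return valid
```

Pixels that fail are typically marked invalid and filled later by interpolation from valid neighbors.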
11
Sub-pixel disparity?
📊 medium
Answer: Parabolic fit around discrete minimum or phase-based methods—needed for smooth surfaces and accurate 3D.
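The parabolic fit can be written in a few lines: fit a parabola through the costs at the discrete minimum d and its two neighbors, and take the vertex as the refined disparity (the cost values in the test are synthetic):

```python
import numpy as np

def subpixel_disparity(costs, d):
    """Refine integer disparity d via a parabola through costs[d-1..d+1]."""
    c0, c1, c2 = costs[d - 1], costs[d], costs[d + 1]
    denom = c0 - 2 * c1 + c2
    if denom <= 0:           # flat or degenerate cost curve: keep integer d
        return float(d)
    return d + 0.5 * (c0 - c2) / denom
```

Without this step, disparity (and hence depth) is quantized to integer pixel shifts, which shows up as visible staircasing on smooth surfaces.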
12
Common errors?
📊 medium
Answer: Calibration errors, textureless regions, repetitive patterns, specular highlights, and motion if scene moves between exposures.
import cv2
# numDisparities must be a multiple of 16; blockSize should be odd.
sgm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
# compute() returns fixed-point disparity (16 * true disparity) as int16.
disp = sgm.compute(imgL, imgR).astype("float32") / 16.0
13
StereoBM vs SGBM?
⚡ easy
Answer: BM: fixed small block, fast, blocky. SGBM: semi-global, slower, smoother—preferred when quality matters.
14
Monocular depth?
📊 medium
Answer: Single image lacks scale without priors—learned networks predict relative depth; stereo gives metric depth with calibration.
15
vs RGB-D?
⚡ easy
Answer: Structured light / ToF gives depth directly—no correspondence problem, but range and resolution are limited; stereo is passive but needs scene texture.
16
Multi-view stereo?
🔥 hard
Answer: Fuse many images (MVS) for dense point clouds—used in photogrammetry beyond two-camera stereo.
17
Stereo in driving?
📊 medium
Answer: Wide-baseline camera pairs on vehicles for obstacle depth; often fused with radar/LiDAR and learned refinement.
18
Fuse with LiDAR?
🔥 hard
Answer: Sparse but accurate LiDAR points anchor the dense stereo depth map; learning-based fusion of the two is common in autonomy stacks.
19
Learned stereo?
📊 medium
Answer: CNNs build cost volumes or regress disparity directly (e.g. PSMNet)—strong on benchmarks when enough training data is available.
20
Need calibration?
⚡ easy
Answer: Yes for metric depth—need K, distortion, and stereo extrinsics; rectification matrices derived from them.
Stereo Cheat Sheet
Geometry
- Z = fB/d
- Rectify
Match
- Cost + smooth
- SGM
Issues
- Textureless
- Occlusion
💡 Pro tip: Rectify first so disparity is a 1D search per row.