Computer Vision Interview 20 essential Q&A Updated 2026

Object Tracking Basics: 20 Essential Q&A

Follow objects over time—association, identity, and the gap between detection and trajectories.

~11 min read 20 questions Intermediate

MOT, SOT, association, ID

1 What is object tracking? ⚡ easy

Answer: Estimating object state over time in video—position, size, sometimes 3D pose—while preserving identity across frames.

2 SOT vs MOT? 📊 medium

Answer: Single-object tracking (SOT) follows one target given its initial box. Multi-object tracking (MOT) follows many objects with persistent IDs, which requires associating detections across frames.

3 Tracking-by-detection? 📊 medium

Answer: Run a detector on each frame, then link the resulting boxes into trajectories via association. The design is modular, and its accuracy depends heavily on detector quality.

4 What is data association? 🔥 hard

Answer: Deciding which detection belongs to which track. It is classically posed as bipartite matching over a cost matrix built from IoU, appearance, and motion cues.
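
To make the cost matrix concrete, here is a minimal sketch of blending two cues into one matrix. The function name `combine_costs`, the `weight` and `gate` parameters, and their default values are illustrative assumptions, not taken from any particular tracker. Rows are tracks, columns are detections, and lower cost means a better match.

```python
def combine_costs(iou_cost, appearance_cost, weight=0.5, gate=0.7):
    """Blend two same-shaped cost matrices (rows: tracks, cols: detections).

    iou_cost[i][j] is assumed to be 1 - IoU, appearance_cost[i][j] a
    distance between embeddings. Pairs whose IoU cost exceeds `gate`
    are marked as forbidden with an infinite cost.
    """
    forbidden = float("inf")
    combined = []
    for iou_row, app_row in zip(iou_cost, appearance_cost):
        row = []
        for c_iou, c_app in zip(iou_row, app_row):
            if c_iou > gate:
                row.append(forbidden)  # boxes barely overlap: never match
            else:
                row.append(weight * c_iou + (1.0 - weight) * c_app)
        combined.append(row)
    return combined


# One track, two detections: the second detection looks similar
# but is too far away, so gating rules it out.
matrix = combine_costs([[0.2, 0.9]], [[0.4, 0.1]])
```

The gating step is a design choice: it stops a strong appearance match from pulling a track across the frame when the geometry makes the pairing implausible.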

5 IoU matching? ⚡ easy

Answer: Match detections to each track's predicted box, greedily or with the Hungarian algorithm, keeping only pairs whose IoU exceeds a threshold. It is a simple baseline for MOT.
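
A minimal sketch of the greedy variant, assuming boxes are `(x1, y1, x2, y2)` tuples. The helper names `iou` and `greedy_iou_match` and the default threshold of 0.3 are illustrative choices, not a reference implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_match(track_boxes, det_boxes, iou_threshold=0.3):
    """Return (track_index, detection_index) pairs, taking best IoU first."""
    candidates = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(track_boxes)
         for di, d in enumerate(det_boxes)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, ti, di in candidates:
        if score < iou_threshold:
            break  # remaining candidates overlap even less
        if ti in used_tracks or di in used_dets:
            continue  # each track and detection is matched at most once
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches
```

Tracks and detections left out of the returned pairs are the unmatched ones; a tracker would typically age the former and start new tracks from the latter.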

6 Hungarian algorithm? 📊 medium

Answer: Solves the assignment problem in O(n³) time, giving the globally optimal one-to-one matching for a sum-of-costs objective. It is used for association in SORT and DeepSORT.
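
The sketch below shows why a global optimum matters by comparing a greedy matcher with an exhaustive search over all one-to-one assignments. The exhaustive search is only a stand-in for the Hungarian algorithm: it returns the same optimal total cost but runs in factorial time, so it is usable only on tiny matrices. In practice one would usually call an existing solver such as SciPy's `scipy.optimize.linear_sum_assignment`. All function names here are hypothetical.

```python
from itertools import permutations


def greedy_assignment(cost):
    """Repeatedly take the cheapest remaining (row, col) cell."""
    n = len(cost)
    cells = sorted((cost[i][j], i, j) for i in range(n) for j in range(n))
    used_rows, used_cols, pairs = set(), set(), []
    for _, i, j in cells:
        if i not in used_rows and j not in used_cols:
            pairs.append((i, j))
            used_rows.add(i)
            used_cols.add(j)
    return sorted(pairs)


def optimal_assignment(cost):
    """Brute-force minimum-cost assignment for a small square matrix."""
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return [(i, best[i]) for i in range(n)]


def total_cost(cost, pairs):
    return sum(cost[i][j] for i, j in pairs)


# Costs are 1 - IoU here; rows are tracks, columns are detections.
cost = [[0.10, 0.20],
        [0.15, 0.90]]
greedy = greedy_assignment(cost)    # [(0, 0), (1, 1)], total 1.00
optimal = optimal_assignment(cost)  # [(0, 1), (1, 0)], total 0.35
```

Greedy grabs the single cheapest cell and is then forced into a very poor second pairing, while the optimal assignment accepts two slightly worse pairs for a much lower total.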

7 What are ID switches? 📊 medium

Answer: The tracker swaps identities between objects, most often when they cross paths. Each switch is penalized in the MOTA metric.

8 Handle occlusion? 📊 medium

Answer: Keep predicting motion while detections are missing (e.g. with a Kalman filter), re-identify the object by appearance when it becomes visible again, or optimize jointly over a window of frames.
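
As a rough sketch of the first strategy, the code below lets a track coast on its last velocity while detections are missing and drops it after too many misses. The `Track` fields, the `step` function, and the `max_age` default are hypothetical, and the velocity estimate is a plain frame-to-frame difference rather than a filtered one.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    misses: int = 0  # consecutive frames without a matched detection


def step(track, detection, max_age=30):
    """Advance a track by one frame.

    `detection` is an (x, y) center or None when the object was not
    detected. Returns False once the track has been missing for more
    than `max_age` frames and should be deleted.
    """
    if detection is None:
        # Occluded or missed: coast on the last known velocity.
        track.x += track.vx
        track.y += track.vy
        track.misses += 1
    else:
        dx, dy = detection
        track.vx, track.vy = dx - track.x, dy - track.y
        track.x, track.y = dx, dy
        track.misses = 0
    return track.misses <= max_age
```

Because the track keeps its `track_id` while coasting, a detection that reappears near the predicted position can be matched to the same identity instead of starting a new one.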

9 What is drift in SOT? ⚡ easy

Answer: Small localization errors accumulate because the tracker updates its model from its own predictions. It is mitigated by periodic re-detection or a robust loss.

10 Why Kalman filters? 📊 medium

Answer: They predict the box between frames with a constant-velocity model and correct it when measurements arrive, giving a cheap, smooth motion prior.
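
The predict and update cycle can be sketched for a single coordinate (for example the box center x). This is a simplified illustration with assumed noise values, not the state layout of any specific tracker; real box trackers typically filter several box parameters at once. The class name and its parameters are hypothetical.

```python
class ConstantVelocityKF:
    """1D Kalman filter with state [position, velocity], measuring position."""

    def __init__(self, pos, pos_var=1.0, vel_var=10.0, q=0.01, r=1.0):
        self.p, self.v = float(pos), 0.0
        self.P = [[pos_var, 0.0], [0.0, vel_var]]  # state covariance
        self.q = q  # process noise added to each diagonal entry (assumed)
        self.r = r  # measurement noise variance (assumed)

    def predict(self, dt=1.0):
        """Propagate the state: p <- p + dt * v, P <- F P F^T + Q."""
        (p00, p01), (p10, p11) = self.P
        self.p += dt * self.v
        self.P = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.p

    def update(self, z):
        """Correct the state with a position measurement z."""
        (p00, p01), (p10, p11) = self.P
        innovation = z - self.p
        s = p00 + self.r           # innovation variance
        k0, k1 = p00 / s, p10 / s  # Kalman gain for position and velocity
        self.p += k0 * innovation
        self.v += k1 * innovation
        self.P = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


# Object moving one unit per frame; the filter learns the velocity.
kf = ConstantVelocityKF(pos=0.0)
for z in range(1, 10):
    kf.predict()
    kf.update(float(z))
```

Calling `predict()` without a following `update()` is how the filter coasts through missed detections; the position variance `P[0][0]` grows on each such call, reflecting rising uncertainty.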

11 Role of Re-ID features? 📊 medium

Answer: Cosine distance between appearance embeddings reduces ID switches when IoU is ambiguous, as in DeepSORT.
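
A small sketch of the appearance cue, assuming embeddings are plain lists of floats produced by some Re-ID network that is not shown. The function names and the `max_distance` default are illustrative assumptions.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as uninformative
    return 1.0 - dot / (norm_a * norm_b)


def closest_track(det_embedding, track_embeddings, max_distance=0.3):
    """Index of the most similar track, or None if all are too far."""
    best_index, best_distance = None, max_distance
    for index, emb in enumerate(track_embeddings):
        d = cosine_distance(det_embedding, emb)
        if d <= best_distance:
            best_index, best_distance = index, d
    return best_index
```

In a full tracker this distance would normally be one term in the association cost matrix rather than a standalone nearest-neighbour lookup.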

12 What is MOTA? 🔥 hard

Answer: Multiple Object Tracking Accuracy. It combines false positives, misses (false negatives), and ID switches into one score, normalized by the number of ground-truth objects.
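
The standard definition is MOTA = 1 - (FN + FP + IDSW) / GT, with every count summed over all frames. A minimal sketch follows; the function name and argument names are illustrative.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA from error counts summed over all frames.

    num_gt is the total number of ground-truth objects across frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


score = mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100)
```

Because the error count can exceed the number of ground-truth objects, MOTA can be negative. It also weights detection errors far more heavily than identity errors in typical sequences, which is one reason IDF1 is often reported alongside it.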

13 Online tracking? ⚡ easy

Answer: Uses only past and current frames—needed for robotics/live video; batch methods use future frames (smoother but not causal).

14 Classical KLT? 📊 medium

Answer: Track corner features with local flow—fast but fragile to appearance change; less common alone for generic objects now.

15 Siamese trackers? 📊 medium

Answer: A shared CNN embeds a target template and a search region, and the target is located by cross-correlating the two. Early versions (the SiamFC family) gave fast SOT without online fine-tuning.

16 Transformer MOT? 🔥 hard

Answer: Track queries attend across space and time (e.g. TrackFormer), part of a trend toward joint detection and association in one model.

17 Real-time MOT? 📊 medium

Answer: Pair a lightweight detector with simple association (e.g. SORT), or use specialized accelerators. Appearance models help identity but add compute.

18 BEV tracking? 🔥 hard

Answer: Track in bird’s-eye view from multi-camera or LiDAR—used in autonomous driving stacks.

19 3D MOT? 📊 medium

Answer: Associate 3D boxes or point clusters using 3D IoU or GIoU variants, with a Kalman filter over position (x, y, z) and yaw.

20 Common benchmarks? ⚡ easy

Answer: MOTChallenge, KITTI tracking, nuScenes tracking—each defines detection input protocol and metrics.

Tracking Cheat Sheet

Paradigm: Det + associate

Match: IoU / cost matrix, Hungarian

Metric: MOTA, IDF1

💡 Pro tip: MOT = detect every frame + keep consistent IDs.
