Computer Vision Interview
Updated 2026
Object Tracking Basics: 20 Essential Q&A
Follow objects over time—association, identity, and the gap between detection and trajectories.
~11 min read
20 questions
Intermediate
MOT, SOT, association, ID
1
What is object tracking?
⚡ easy
Answer: Estimating object state over time in video—position, size, sometimes 3D pose—while preserving identity across frames.
2
SOT vs MOT?
📊 medium
Answer: Single-object tracking (SOT) follows one target given an initial box in the first frame. Multi-object tracking (MOT) follows many objects with persistent IDs, which requires associating detections across frames.
3
Tracking-by-detection?
📊 medium
Answer: Run detector each frame, link boxes into trajectories via association—modular and strong with good detectors.
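The loop below is a minimal sketch of that pipeline, not a production tracker. It assumes the detector output is already available as per-frame lists of `(x1, y1, x2, y2)` boxes, and it swaps in nearest-center greedy matching as a stand-in for a real association cost. The names `track_by_detection` and `max_dist` are illustrative.

```python
import math
from itertools import count

def center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def track_by_detection(frames, max_dist=50.0):
    """Link per-frame boxes into trajectories keyed by track ID.

    frames: list of lists of boxes, standing in for detector output.
    Association is nearest-center greedy matching, a deliberate
    simplification; real trackers use IoU, motion, or appearance costs.
    """
    next_id = count(1)
    last_box = {}      # track ID -> most recent box
    trajectories = {}  # track ID -> list of (frame index, box)
    for t, detections in enumerate(frames):
        unmatched_tracks = set(last_box)  # tracks that existed before this frame
        for det in detections:
            # Find the closest still-unmatched track within the distance gate.
            best_id, best_d = None, max_dist
            for tid in unmatched_tracks:
                d = math.dist(center(det), center(last_box[tid]))
                if d < best_d:
                    best_id, best_d = tid, d
            if best_id is None:
                best_id = next(next_id)   # no match: start a new track
                trajectories[best_id] = []
            else:
                unmatched_tracks.discard(best_id)
            last_box[best_id] = det
            trajectories[best_id].append((t, det))
    return trajectories

frames = [
    [(0, 0, 10, 10), (100, 100, 110, 110)],
    [(2, 0, 12, 10), (98, 100, 108, 110)],
    [(4, 0, 14, 10), (96, 100, 106, 110)],
]
trajectories = track_by_detection(frames)
```

Here the two objects keep IDs 1 and 2 across all three frames. The sketch never deletes tracks, so a real system would add a track-termination rule.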
4
What is data association?
🔥 hard
Answer: Decide which detection belongs to which track—classic bipartite matching with cost matrix (IoU, appearance, motion).
5
IoU matching?
⚡ easy
Answer: Match detections to each track's predicted box by IoU, greedily or with the Hungarian algorithm, and reject pairs below a threshold. A simple baseline for MOT.
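A rough sketch of the greedy variant, assuming boxes are `(x1, y1, x2, y2)` tuples. The function names and the 0.3 threshold are illustrative choices, not values from any particular tracker.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def greedy_iou_match(track_boxes, det_boxes, iou_threshold=0.3):
    """Greedily pair tracks and detections, highest IoU first.

    Returns (matches, unmatched_tracks, unmatched_dets) as index lists.
    """
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(track_boxes)
         for di, d in enumerate(det_boxes)),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break  # every remaining pair overlaps even less
        if ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched_t = [i for i in range(len(track_boxes)) if i not in used_t]
    unmatched_d = [i for i in range(len(det_boxes)) if i not in used_d]
    return matches, unmatched_t, unmatched_d

tracks = [(0, 0, 10, 10), (50, 50, 60, 60)]
dets = [(51, 50, 61, 60), (1, 0, 11, 10), (200, 200, 210, 210)]
matches, lost_tracks, new_dets = greedy_iou_match(tracks, dets)
```

Unmatched detections (here the third one) would typically seed new tracks, and unmatched tracks would coast or be dropped.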
6
Hungarian algorithm?
📊 medium
Answer: Solves the assignment problem in O(n³) time, giving the globally optimal one-to-one matching for a summed (linear) cost. Used for association in SORT and DeepSORT.
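To make "global optimum" concrete, here is a compact pure-Python sketch of the potentials formulation of the Hungarian algorithm, next to a greedy baseline. It assumes a cost matrix with no more rows than columns. In practice a library routine such as SciPy's `linear_sum_assignment` would usually be preferred; the helper names here are illustrative.

```python
def hungarian(cost):
    """Minimum-cost assignment for a matrix with rows <= columns.

    Returns a list where entry i is the column assigned to row i.
    Internally uses 1-based indices, with index 0 as a sentinel.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row matched to column j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break  # reached a free column: augmenting path found
        while j0 != 0:  # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment

def greedy_total(cost):
    """Total cost when always taking the cheapest remaining pair."""
    cells = sorted((c, r, k) for r, row in enumerate(cost) for k, c in enumerate(row))
    used_r, used_c, total = set(), set(), 0
    for c, r, k in cells:
        if r not in used_r and k not in used_c:
            used_r.add(r)
            used_c.add(k)
            total += c
    return total

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
assignment = hungarian(cost)
optimal_total = sum(cost[i][j] for i, j in enumerate(assignment))
greedy_cost = greedy_total(cost)
```

On this matrix the greedy pass grabs the zero-cost cell first and ends with a total of 6, while the optimal assignment totals 5.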
7
What are ID switches?
📊 medium
Answer: The tracker swaps or reassigns identities between objects, most often near crossings or after occlusion. Each switch counts as an error in the MOTA metric.
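A simplified sketch of how switches can be counted, assuming the ground-truth-to-track matching has already been computed for each frame. Benchmark toolkits handle more edge cases; the input format and the name `count_id_switches` are illustrative.

```python
def count_id_switches(matches_per_frame):
    """Count ID switches from per-frame ground-truth-to-track matches.

    matches_per_frame: list of dicts {gt_id: track_id}, one per frame.
    A ground-truth object absent from a dict was unmatched in that frame.
    A switch is counted when a ground-truth object is matched to a track
    ID different from the one it was last matched to.
    """
    last_track = {}
    switches = 0
    for frame_matches in matches_per_frame:
        for gt_id, track_id in frame_matches.items():
            if gt_id in last_track and last_track[gt_id] != track_id:
                switches += 1
            last_track[gt_id] = track_id
    return switches

# Two pedestrians, A and B, cross paths; the tracker swaps their IDs
# in the third frame, which counts as one switch per pedestrian.
frames = [
    {"A": 1, "B": 2},
    {"A": 1, "B": 2},
    {"A": 2, "B": 1},
    {"A": 2, "B": 1},
]
id_switches = count_id_switches(frames)
print(id_switches)  # 2
```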
8
Handle occlusion?
📊 medium
Answer: Predict motion during missing detections (Kalman), re-identify with appearance when visible again, or joint optimization over windows.
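The first of these strategies, coasting on a motion model, can be sketched as follows. This is a toy 1-D example with a crude velocity estimate instead of a Kalman filter. `Track`, `step`, and `max_age` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Track:
    track_id: int
    x: float          # 1-D position, standing in for a box center
    vx: float = 0.0   # velocity estimate in units per frame
    misses: int = 0   # consecutive frames without a matched detection

def step(track, detection, max_age=3):
    """Advance a track one frame; return False once it should be deleted.

    detection is the matched position, or None while the object is occluded.
    """
    if detection is None:
        track.x += track.vx             # coast on the motion model
        track.misses += 1
    else:
        track.vx = detection - track.x  # crude one-step velocity estimate
        track.x = detection
        track.misses = 0
    return track.misses <= max_age

track = Track(track_id=7, x=0.0)
observations = [2.0, 4.0, None, None, 10.0]  # two occluded frames
alive = [step(track, z) for z in observations]
```

The track survives the two missed frames and keeps ID 7 when the object reappears. After more than `max_age` consecutive misses it would be dropped, at which point appearance-based re-identification is the remaining way to recover the identity.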
9
What is drift in SOT?
⚡ easy
Answer: Small localization errors accumulate because the tracker updates its model from its own predictions. Mitigated by periodic re-detection or a robust loss.
10
Why Kalman filters?
📊 medium
Answer: Predict box between frames with constant-velocity model; update when measurements arrive—cheap smooth motion prior.
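As a minimal sketch, the filter below tracks a single coordinate with state [position, velocity]. Box trackers such as SORT use a larger state, so this is only the core predict/update cycle. The noise values `q` and `r` and the class name are assumptions picked for the example.

```python
class ConstantVelocityKalman1D:
    """Kalman filter over state [position, velocity], measuring position only."""

    def __init__(self, pos, pos_var=1.0, vel_var=100.0, q=0.01, r=1.0):
        self.pos, self.vel = pos, 0.0
        # Covariance P as a 2x2 nested list; velocity starts very uncertain.
        self.P = [[pos_var, 0.0], [0.0, vel_var]]
        self.q, self.r = q, r  # process and measurement noise variances

    def predict(self, dt=1.0):
        """Propagate: x = F x, P = F P F^T + Q, with F = [[1, dt], [0, 1]]."""
        self.pos += self.vel * dt
        (a, b), (c, d) = self.P
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.pos

    def update(self, z):
        """Correct the state with a measured position z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r               # innovation variance
        k_pos, k_vel = a / s, c / s  # Kalman gain
        residual = z - self.pos
        self.pos += k_pos * residual
        self.vel += k_vel * residual
        self.P = [
            [(1 - k_pos) * a, (1 - k_pos) * b],
            [c - k_vel * a, d - k_vel * b],
        ]

kf = ConstantVelocityKalman1D(pos=0.0)
for frame in range(1, 21):
    kf.predict()
    kf.update(2.0 * frame)  # object moves 2 units per frame
coasted = [kf.predict() for _ in range(3)]  # three frames with no detection
```

After the 20 updates the velocity estimate settles near 2, so the three measurement-free predictions keep advancing the position by about 2 units per frame.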
11
Role of Re-ID features?
📊 medium
Answer: Cosine distance between appearance embeddings reduces ID switches when IoU is ambiguous (as in DeepSORT).
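A small sketch of the idea, using hand-made vectors in place of real Re-ID embeddings. The weighted blend in `association_cost` is a generic illustration and an assumption of this example, not DeepSORT's exact formulation.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm > 0 else 1.0

def association_cost(iou_value, track_emb, det_emb, appearance_weight=0.5):
    """Blend an overlap cost (1 - IoU) with an appearance cost."""
    overlap_cost = 1.0 - iou_value
    appearance_cost = cosine_distance(track_emb, det_emb)
    return (1 - appearance_weight) * overlap_cost + appearance_weight * appearance_cost

track_emb = [0.9, 0.1, 0.0]     # hypothetical embedding stored for a track
same_person = [0.8, 0.2, 0.1]   # detection that looks like the track
other_person = [0.1, 0.2, 0.9]  # detection that looks different
# Both detections overlap the track equally, so IoU alone cannot decide.
cost_same = association_cost(0.5, track_emb, same_person)
cost_other = association_cost(0.5, track_emb, other_person)
```

With equal IoU, the appearance term makes `cost_same` clearly lower than `cost_other`, so the matcher prefers the detection that looks like the track.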
12
What is MOTA?
🔥 hard
Answer: Multiple Object Tracking Accuracy—combines false positives, misses, and ID switches vs ground truth trajectories.
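For reference, the standard CLEAR MOT definition is MOTA = 1 - (FN + FP + IDSW) / GT, with every count summed over all frames. The helper below is a small sketch of that formula; the function name, argument names, and example counts are illustrative.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

score = mota(false_negatives=10, false_positives=5, id_switches=2, num_gt=100)
print(round(score, 2))  # 0.83
```

Because the errors are not capped by the number of ground-truth objects, MOTA can go negative for a tracker that produces many false positives.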
13
Online tracking?
⚡ easy
Answer: Uses only past and current frames—needed for robotics/live video; batch methods use future frames (smoother but not causal).
14
Classical KLT?
📊 medium
Answer: Track corner features with local flow—fast but fragile to appearance change; less common alone for generic objects now.
15
Siamese trackers?
📊 medium
Answer: A shared CNN embeds a template and a search region, and cross-correlation between the two localizes the target. Early versions (SiamFC family) give fast SOT without online fine-tuning.
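The matching step in SiamFC-style trackers is a cross-correlation between template features and search-region features. The toy sketch below uses small hand-written grids in place of CNN feature maps, which is an assumption made purely for illustration.

```python
def cross_correlate(search, template):
    """Slide template over search (2-D lists) and return the response map."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    response = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            row.append(sum(
                search[y + dy][x + dx] * template[dy][dx]
                for dy in range(th) for dx in range(tw)
            ))
        response.append(row)
    return response

def argmax_2d(grid):
    """Return (row, col) of the largest value in a 2-D list."""
    return max(
        ((y, x) for y in range(len(grid)) for x in range(len(grid[0]))),
        key=lambda yx: grid[yx[0]][yx[1]],
    )

# Toy "feature maps": in a real Siamese tracker both come from the same CNN.
template = [[1, -1],
            [-1, 1]]
search = [[0, 0, 0, 0],
          [0, 0, 1, -1],
          [0, 0, -1, 1],
          [0, 0, 0, 0]]
response = cross_correlate(search, template)
peak = argmax_2d(response)
print(peak)  # (1, 2)
```

The peak of the response map marks where the template pattern sits inside the search region, which is how the tracker localizes the target in each new frame.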
16
Transformer MOT?
🔥 hard
Answer: Track queries attend across space and time (e.g. TrackFormer), part of a trend toward joint detection and association in one model.
17
Real-time MOT?
📊 medium
Answer: Light detector + simple association (SORT) or specialized accelerators—appearance models add compute.
18
BEV tracking?
🔥 hard
Answer: Track in bird’s-eye view from multi-camera or LiDAR—used in autonomous driving stacks.
19
3D MOT?
📊 medium
Answer: Associate 3D boxes or point clusters using 3D IoU or GIoU variants, with a Kalman filter over position (x, y, z) and yaw.
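A simplified sketch of the overlap term, assuming axis-aligned boxes given as `(x1, y1, z1, x2, y2, z2)`. It ignores yaw, so rotated boxes would need a polygon intersection in the ground plane instead. The function name is illustrative.

```python
def iou_3d_axis_aligned(a, b):
    """IoU of two axis-aligned 3-D boxes given as (x1, y1, z1, x2, y2, z2)."""
    inter = 1.0
    for lo, hi in ((0, 3), (1, 4), (2, 5)):
        # Overlap length along this axis; no overlap means zero IoU.
        overlap = min(a[hi], b[hi]) - max(a[lo], b[lo])
        if overlap <= 0:
            return 0.0
        inter *= overlap
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)

car_track = (0, 0, 0, 4, 2, 2)  # predicted box of an existing track
car_det = (1, 0, 0, 5, 2, 2)    # new detection shifted 1 unit along x
overlap = iou_3d_axis_aligned(car_track, car_det)
```

Here the intersection volume is 12 and the union is 20, so the IoU is 0.6.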
20
Common benchmarks?
⚡ easy
Answer: MOTChallenge, KITTI tracking, nuScenes tracking—each defines detection input protocol and metrics.
Tracking Cheat Sheet
Paradigm
- Det + associate
Match
- IoU / cost matrix
- Hungarian
Metric
- MOTA
- IDF1
💡 Pro tip: MOT = detect every frame + keep consistent IDs.