Object tracking: CV guide

Detection vs tracking

Detection classifies and localizes all objects each frame—independent of history. Tracking exploits temporal continuity: prediction from the previous frame reduces search cost and stabilizes identity. Hybrid pipelines run a detector every N frames and a cheap tracker in between, or fuse detections with Kalman prediction and Hungarian matching.

Single-object

One initialized box; tracker updates each frame—OpenCV Tracker* API.

Multi-object (MOT)

Many IDs; needs association to match detections to trajectories across frames.

OpenCV: CSRT tracker (example)

CSRT (Channel and Spatial Reliability) is accurate but slower than KCF. On OpenCV 4.x, legacy trackers often live under cv2.legacy.

import cv2

cap = cv2.VideoCapture("clip.mp4")
ok, frame = cap.read()
bbox = cv2.selectROI("ROI", frame, showCrosshair=True, fromCenter=False)
cv2.destroyWindow("ROI")

tracker = cv2.legacy.TrackerCSRT_create()
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)
    if ok:
        x, y, w, h = [int(v) for v in bbox]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("track", frame)
    if cv2.waitKey(1) == 27:
        break

If cv2.legacy is missing, try cv2.TrackerCSRT_create() on older builds, or install opencv-contrib-python.

Faster option: KCF

tracker = cv2.legacy.TrackerKCF_create()
tracker.init(frame, bbox)

KCF is faster; CSRT handles deformation and occlusion slightly better. MOSSE is older and very fast but brittle on scale change.

When trackers fail

Drift — model updates on wrong pixels; use conservative learning rates or stop updating on low confidence.
Occlusion / motion blur — switch to detection-based re-id (DeepSORT) or manual re-init.
Scale / out-of-plane rotation — use scale-pyramid extensions or bounding-box regression from a detector.

MIL tracker (brief)

MIL (Multiple Instance Learning) treats ambiguous positive bags of patches inside the box—more robust to slight misalignment than naive correlation trackers. Create with cv2.legacy.TrackerMIL_create() where available.

                    Takeaways
                    Classic OpenCV trackers = single-object, short-term, init once.
For many objects + IDs, combine a detector with SORT / DeepSORT.
Profile CSRT vs KCF on your resolution and FPS budget.

                

Quick FAQ

Maintain a list of tracker instances, each init with its ROI, and call update per frame. For consistent IDs across occlusions, prefer detection + association (next chapters).

Color-histogram modes in OpenCV (cv2.meanShift, CamShift) work on controlled color distributions; modern pipelines usually prefer learned trackers or detectors.