Face & Pose Estimation MCQ
Face recognition pipelines and human pose keypoint estimation.
Face Recognition MCQ
Face recognition pipeline
Modern systems detect faces, align them using facial landmarks, crop, and map each crop to an embedding vector with a CNN trained via metric objectives (triplet, contrastive) or margin-based softmax variants. Verification (1:1) compares two embeddings against a threshold; identification (1:N) searches a gallery. Fairness, spoofing, and privacy are active concerns.
Metric space
Same-identity pairs should be closer in cosine or L2 distance than different-identity pairs.
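As a rough illustration of how a triplet objective encourages this ordering, the sketch below computes a hinge-style triplet loss on toy embeddings. The function names, the use of squared L2 distance, and the margin of 0.2 are assumptions chosen for the example, not values taken from any particular system.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (margin value is an illustrative assumption).

    The loss is zero once the anchor is closer to the positive
    (same identity) than to the negative (different identity)
    by at least `margin`.
    """
    return max(0.0, squared_l2(anchor, positive)
               - squared_l2(anchor, negative) + margin)


# Toy 2-D embeddings: the positive sits near the anchor, the negative far away.
anchor, positive, negative = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
well_separated = triplet_loss(anchor, positive, negative)
violated = triplet_loss(anchor, negative, positive)  # roles swapped on purpose
```

A well-separated triplet contributes no loss, while a violated one produces a positive loss that would push the embeddings apart during training.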
Key ideas
Detection + alignment
Find face and warp to canonical pose before embedding.
Embedding
Compact vector representing identity-discriminative features.
Verification
1:1 decision: same or different person.
Identification
1:N search: match a probe against a gallery. Closed set assumes the probe is enrolled; open set may reject it as unknown.
Scoring
cosine similarity or L2 distance compared against a threshold (typically tuned on held-out pairs)
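The verification and identification decisions above can be sketched with cosine similarity as follows. This is a minimal example under stated assumptions: embeddings are non-zero lists of floats, the gallery is a plain dict from identity label to embedding, and the threshold of 0.5 is a placeholder rather than a recommended value. The names `verify` and `identify` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def verify(emb_a, emb_b, threshold=0.5):
    """1:1 decision: True if the two embeddings are judged the same person."""
    return cosine_similarity(emb_a, emb_b) >= threshold


def identify(probe, gallery, threshold=0.5):
    """1:N open-set search: best-matching identity, or None if below threshold."""
    best_id, best_score = None, -1.0
    for identity, emb in gallery.items():
        score = cosine_similarity(probe, emb)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score < threshold:
        return None, best_score  # open set: reject as unknown
    return best_id, best_score


# Toy gallery with two enrolled identities (illustrative data only).
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
known = identify([0.9, 0.1, 0.0], gallery)
unknown = identify([0.0, 0.0, 1.0], gallery)
```

Dropping the threshold check in `identify` turns it into closed-set identification, where the nearest gallery entry is always returned.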
Pose Estimation MCQ
Human pose from pixels
Pose estimation predicts 2D (or 3D) locations of anatomical joints. Top-down methods detect people then estimate pose per crop; bottom-up methods predict all joints then group them (e.g. Part Affinity Fields). Datasets like COCO define a standard skeleton and metrics (OKS).
OKS / PCK
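Object Keypoint Similarity (OKS) plays the role that IoU plays in detection: it scores a predicted pose by scale-normalized keypoint distances, and AP is computed over OKS thresholds. PCK (Percentage of Correct Keypoints) counts a keypoint as correct when it falls within a set fraction of a reference length, such as head or torso size.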
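A minimal sketch of the OKS computation, assuming the usual form in which each labeled keypoint contributes exp(-d² / (2 s² k²)), with d the pixel distance to ground truth, s² the object area, and k a per-keypoint constant. The function name, argument layout, and the constants in the example are illustrative assumptions, not the official COCO values.

```python
import math


def oks(pred, gt, labeled, area, k):
    """Object Keypoint Similarity for one person instance.

    pred, gt : lists of (x, y) keypoints in pixels
    labeled  : list of bools, True where the ground-truth keypoint is annotated
    area     : object area in pixels (s^2 in the formula)
    k        : per-keypoint falloff constants (assumed values in this sketch)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), is_labeled, ki in zip(pred, gt, labeled, k):
        if not is_labeled:
            continue  # unlabeled keypoints do not affect the score
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * ki ** 2))
        count += 1
    return total / count if count else 0.0


gt = [(10.0, 10.0), (20.0, 20.0)]
perfect = oks(gt, gt, [True, True], area=100.0, k=[0.5, 0.5])
shifted = oks([(10.0, 10.0), (23.0, 24.0)], gt, [True, True],
              area=100.0, k=[0.5, 0.5])
```

Because distances are divided by the object area, the same pixel error is penalized more heavily on a small person than on a large one.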
Key ideas
Keypoints
Wrist, elbow, hip, etc., as (x,y) or heatmap peaks.
Heatmap head
One channel per joint; location is read off by argmax, optionally with sub-pixel refinement.
Top-down
Person detector → single-person pose network per box.
Bottom-up
All joints + pairwise cues to assemble instances.
Typical stack
backbone → heatmaps / offsets → grouping or single-person decode
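The single-person decode step at the end of this stack can be sketched as a per-channel argmax. The example assumes heatmaps arrive as nested lists indexed `[joint][row][col]` and that the network output is downsampled by a stride of 4 relative to the input crop; both are assumptions for illustration, and `decode_heatmaps` is a hypothetical helper rather than any library's API.

```python
def decode_heatmaps(heatmaps, stride=4):
    """Return one (x, y, score) per joint from per-joint heatmaps.

    heatmaps : nested lists indexed [joint][row][col]
    stride   : assumed downsampling factor from input crop to heatmap
    """
    keypoints = []
    for hm in heatmaps:
        best_col, best_row, best_val = 0, 0, hm[0][0]
        for r, row in enumerate(hm):
            for c, val in enumerate(row):
                if val > best_val:
                    best_col, best_row, best_val = c, r, val
        # Map heatmap coordinates back to input-crop pixels.
        keypoints.append((best_col * stride, best_row * stride, best_val))
    return keypoints


# Two toy 3x3 heatmaps, one per joint (illustrative data only).
toy_heatmaps = [
    [[0.0, 0.1, 0.0],
     [0.1, 0.2, 0.9],
     [0.0, 0.1, 0.0]],
    [[0.8, 0.1, 0.0],
     [0.1, 0.0, 0.0],
     [0.0, 0.0, 0.0]],
]
decoded = decode_heatmaps(toy_heatmaps)
```

In a top-down pipeline the resulting coordinates would still need to be shifted by the person box origin; a bottom-up pipeline would instead keep multiple peaks per channel and pass them to a grouping step.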