CV MCQ: Chapter 17
Face & Pose Estimation

Face & Pose Estimation MCQ

Face recognition pipelines and human pose keypoint estimation.

Face Recognition MCQ

Face recognition pipeline

Modern systems detect faces, align them using facial landmarks, crop, and map each crop to an embedding vector with a CNN trained via metric-learning objectives (triplet, contrastive) or softmax variants, often with an added margin. Verification (1:1) compares two embeddings against a threshold; identification (1:N) searches a gallery for the closest match. Fairness, spoofing, and privacy remain active concerns.
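The alignment step can be illustrated with the simplest possible case: a similarity transform (rotation, uniform scale, translation) fixed by two eye landmarks. This is only a sketch. The function names and the canonical eye positions (chosen here for a hypothetical 112x112 crop) are assumptions, and real pipelines usually fit the transform to five or more landmarks by least squares.

```python
def similarity_from_eyes(left_eye, right_eye,
                         canon_left=(38.0, 52.0), canon_right=(74.0, 52.0)):
    """Return complex (a, b) such that z -> a*z + b maps the detected eyes
    onto the canonical eye positions.

    Points are (x, y) pixel coordinates treated as complex numbers x + iy.
    The canonical positions are illustrative values, not a standard.
    """
    s0, s1 = complex(*left_eye), complex(*right_eye)
    d0, d1 = complex(*canon_left), complex(*canon_right)
    if s0 == s1:
        raise ValueError("eye landmarks coincide; transform is undefined")
    a = (d1 - d0) / (s1 - s0)   # rotation and uniform scale
    b = d0 - a * s0             # translation
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point through z -> a*z + b."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Usage: a tilted face whose eyes are at (100, 120) and (140, 100).
a, b = similarity_from_eyes((100.0, 120.0), (140.0, 100.0))
aligned_left = apply_similarity(a, b, (100.0, 120.0))
aligned_right = apply_similarity(a, b, (140.0, 100.0))
```

This maps landmark coordinates only. To warp the image itself, one would sample source pixels through the inverse transform, which an image library normally handles.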

Metric space

Pairs of embeddings from the same identity should be closer, in cosine or L2 distance, than pairs from different identities.
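One way to turn this property into a training signal is a triplet loss with a margin. The sketch below is a plain-Python illustration, not a training-ready implementation; the margin value of 0.2 and the function names are assumptions. Embeddings are L2-normalized first, in which case squared L2 distance equals 2 minus twice the cosine similarity, so the two distances rank pairs identically.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that is zero once the anchor-positive distance is smaller
    than the anchor-negative distance by at least `margin`."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, squared_l2(a, p) - squared_l2(a, n) + margin)


# Usage: a well-separated triplet gives zero loss; swapping the roles of
# positive and negative gives a large loss.
good = triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])
bad = triplet_loss([1.0, 0.0], [0.0, 1.0], [0.9, 0.1])
```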

Key ideas

Detection + alignment

Find face and warp to canonical pose before embedding.

Embedding

Compact vector representing identity-discriminative features.

Verification

1:1 decision: same or different person.

Identification

Match probe to gallery (closed or open set).

Scoring

cosine similarity or L2 distance between embeddings, compared against a threshold tuned on validation data
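The verification, identification, and scoring ideas above can be sketched together in a few lines. This is an illustrative sketch under stated assumptions: the function names, the dictionary-based gallery, and the threshold of 0.5 are all hypothetical, and in practice the threshold would be chosen from validation data for a target error rate.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def verify(emb_a, emb_b, threshold=0.5):
    """1:1 decision: True if the two embeddings are judged the same person."""
    return cosine_similarity(emb_a, emb_b) >= threshold


def identify(probe, gallery, threshold=0.5):
    """1:N search over a gallery mapping identity name -> embedding.

    Returns (name, score) for the best match, or (None, score) when the
    best score falls below the threshold (open-set rejection).
    """
    best_name, best_score = None, -1.0
    for name, emb in gallery.items():
        score = cosine_similarity(probe, emb)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Usage with toy 3-D embeddings (real embeddings are much longer).
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
match, score = identify([0.9, 0.1, 0.0], gallery)
unknown, _ = identify([0.0, 0.0, 1.0], gallery)
```

A closed-set system would skip the threshold check and always return the best match; the rejection branch is what makes the search open-set.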

Pro tip: Use large diverse training data; test on demographic slices for bias evaluation.
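A sliced evaluation can be as simple as computing error rates separately per group at one fixed threshold. The sketch below assumes each verification trial is recorded as a (group, score, same_identity) tuple; that record layout, the function name, and the toy scores are hypothetical and carry no real measurements.

```python
from collections import defaultdict


def error_rates_by_group(trials, threshold):
    """Compute false match rate (FMR) and false non-match rate (FNMR)
    per group.

    trials: iterable of (group, score, same_identity) tuples, where a
    score >= threshold means the pair is accepted as the same person.
    A rate is None when the group has no trials of the relevant kind.
    """
    counts = defaultdict(lambda: {"fm": 0, "impostor": 0, "fnm": 0, "genuine": 0})
    for group, score, same_identity in trials:
        c = counts[group]
        accepted = score >= threshold
        if same_identity:
            c["genuine"] += 1
            if not accepted:
                c["fnm"] += 1       # genuine pair wrongly rejected
        else:
            c["impostor"] += 1
            if accepted:
                c["fm"] += 1        # impostor pair wrongly accepted
    rates = {}
    for group, c in counts.items():
        fmr = c["fm"] / c["impostor"] if c["impostor"] else None
        fnmr = c["fnm"] / c["genuine"] if c["genuine"] else None
        rates[group] = {"fmr": fmr, "fnmr": fnmr}
    return rates


# Usage with made-up scores for two groups labeled "A" and "B".
toy_trials = [
    ("A", 0.9, True), ("A", 0.4, True), ("A", 0.6, False), ("A", 0.1, False),
    ("B", 0.8, True), ("B", 0.2, False),
]
rates = error_rates_by_group(toy_trials, threshold=0.5)
```

Large gaps between groups at the same threshold are the kind of signal a bias evaluation looks for.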

Pose Estimation MCQ

Human pose from pixels

Pose estimation predicts 2D (or 3D) locations of anatomical joints. Top-down methods detect people then estimate pose per crop; bottom-up methods predict all joints then group them (e.g. Part Affinity Fields). Datasets like COCO define a standard skeleton and metrics (OKS).

OKS / PCK

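Object Keypoint Similarity (OKS) plays the role for keypoints that IoU plays for boxes: it scores a predicted pose against the ground truth using scale-normalized keypoint distances, and AP is then computed by thresholding OKS. PCK (Percentage of Correct Keypoints) counts a keypoint as correct when it lies within a set fraction of a reference length, such as head or torso size.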
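A minimal sketch of both metrics named in the heading above may help make the scale normalization concrete. The OKS function follows the COCO-style form, where each labeled keypoint contributes exp(-d^2 / (2 * area * k^2)); the per-keypoint constants passed as `kappas` in the usage lines are placeholder values, not the official COCO constants, and the function and parameter names are assumptions.

```python
import math


def oks(pred, gt, visibility, area, kappas):
    """Object Keypoint Similarity for one person.

    pred, gt:    lists of (x, y) keypoints in pixels
    visibility:  per-keypoint flags; values > 0 mean the keypoint is labeled
    area:        object area in square pixels (the scale term)
    kappas:      per-keypoint constants controlling tolerance
    Unlabeled keypoints are ignored. Returns 0.0 if none are labeled.
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, k in zip(pred, gt, visibility, kappas):
        if v <= 0:
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * k * k))
        labeled += 1
    return total / labeled if labeled else 0.0


def pck(pred, gt, visibility, ref_length, alpha=0.5):
    """Fraction of labeled keypoints within alpha * ref_length of the truth."""
    correct, labeled = 0, 0
    for (px, py), (gx, gy), v in zip(pred, gt, visibility):
        if v <= 0:
            continue
        labeled += 1
        if math.hypot(px - gx, py - gy) <= alpha * ref_length:
            correct += 1
    return correct / labeled if labeled else 0.0


# Usage: two keypoints, the second predicted 3 px off.
gt = [(0.0, 0.0), (10.0, 0.0)]
pred = [(0.0, 0.0), (10.0, 3.0)]
score = oks(pred, gt, visibility=[2, 2], area=100.0, kappas=[0.1, 0.1])
hit_rate = pck(pred, gt, visibility=[2, 2], ref_length=10.0, alpha=0.2)
```

The same 3-pixel error would cost far less OKS on a larger person, since the area term grows with object size.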

Key ideas

Keypoints

Wrist, elbow, hip, etc., as (x,y) or heatmap peaks.

Heatmap head

One channel per joint; the location is read off by argmax, optionally with sub-pixel refinement.

Top-down

Person detector → single-person pose network per box.

Bottom-up

Detect all joints in the image, then use pairwise cues (such as Part Affinity Fields) to group them into person instances.

Typical stack

backbone → heatmaps / offsets → grouping or single-person decode
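The single-person decode at the end of this stack can be sketched as an argmax over each joint's heatmap followed by a small sub-pixel nudge. The quarter-pixel shift toward the higher neighbor is one common heuristic rather than the only option, and the stride of 4, the simple multiply-by-stride mapping back to image pixels, and the function names are assumptions.

```python
def _sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def decode_heatmap(heatmap, stride=4):
    """Decode one joint from a 2D heatmap given as a list of rows.

    Returns (x, y, score), with x and y in input-image pixels. The peak is
    found by argmax, then shifted a quarter of a heatmap cell toward the
    higher of its two neighbors along each axis.
    """
    height, width = len(heatmap), len(heatmap[0])
    best_y, best_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda yx: heatmap[yx[0]][yx[1]],
    )
    score = heatmap[best_y][best_x]
    x, y = float(best_x), float(best_y)
    if 0 < best_x < width - 1:
        x += 0.25 * _sign(heatmap[best_y][best_x + 1] - heatmap[best_y][best_x - 1])
    if 0 < best_y < height - 1:
        y += 0.25 * _sign(heatmap[best_y + 1][best_x] - heatmap[best_y - 1][best_x])
    return (x * stride, y * stride, score)


def decode_pose(heatmaps, stride=4):
    """Decode every joint: one heatmap channel per joint."""
    return [decode_heatmap(h, stride) for h in heatmaps]


# Usage: a 3x4 heatmap whose peak leans toward the right-hand neighbor.
toy_heatmap = [
    [0.0, 0.0, 0.1, 0.0],
    [0.0, 0.2, 1.0, 0.5],
    [0.0, 0.0, 0.1, 0.0],
]
x, y, score = decode_heatmap(toy_heatmap, stride=4)
```

Bottom-up decoding differs in that each channel can hold several peaks, one per person, which are then grouped rather than reduced to a single argmax.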

Pro tip: Occlusion and rare poses hurt both paradigms. Data augmentation helps, as do 3D priors, particularly in video.
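Horizontal flipping is a common pose augmentation with one well-known pitfall: after mirroring, left and right joints must also swap labels, or the network is trained on a "left wrist" that sits on the right side of the body. The sketch below shows the keypoint side of that operation; the function name and the three-joint layout in the usage lines are hypothetical.

```python
def hflip_keypoints(keypoints, image_width, flip_pairs):
    """Mirror keypoints horizontally and swap left/right joint labels.

    keypoints:   list of (x, y) in pixels, indexed by joint
    image_width: width of the image in pixels
    flip_pairs:  list of (left_index, right_index) joint index pairs
    Returns a new list; the input is not modified.
    """
    # Pixel column x maps to column (image_width - 1 - x) after mirroring.
    flipped = [(image_width - 1 - x, y) for x, y in keypoints]
    for left, right in flip_pairs:
        flipped[left], flipped[right] = flipped[right], flipped[left]
    return flipped


# Usage: joints are [left_wrist, right_wrist, nose] in a 100-pixel-wide image.
original = [(10, 5), (20, 5), (50, 0)]
augmented = hflip_keypoints(original, image_width=100, flip_pairs=[(0, 1)])
```

The image itself must be mirrored with the same convention so that pixels and labels stay consistent.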