OCR & Autonomous Driving MCQ
Optical character recognition and computer vision stacks for autonomous vehicles.
Optical Character Recognition MCQ
Reading text in images
OCR splits into locating text (detection) and reading glyphs or sequences (recognition). Classical pipelines segment individual characters and classify each one; deep models read whole text lines with CNN+RNN+CTC or attention decoders. Scene text in photos is harder than scanned documents because of blur, perspective distortion, and background clutter.
Detection vs recognition
A two-stage system detects word-level boxes or quadrilaterals, crops and rectifies each patch, then runs a sequence recognizer on it. End-to-end models combine both stages in one network.
Key ideas
Text detection
EAST regresses boxes or quadrilaterals directly; DB (Differentiable Binarization) and other segmentation approaches predict masks for text regions.
Line recognition
Reshape the CNN feature map into a left-to-right sequence of column features, then decode it with an RNN trained with CTC or with an attention decoder.
CTC loss
Trains on variable-length transcriptions without per-frame character labels by summing over all valid alignments between frames and the target string.
Lexicon / LM
Constrains decoding with dictionaries or language models.
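As a rough illustration of how CTC output becomes text, here is a minimal greedy (best-path) decoding sketch. It assumes the recognizer emits one score vector per frame and that index 0 is the blank symbol; the names `ctc_greedy_decode`, `ALPHABET`, and `BLANK` are hypothetical and chosen only for this example. Real systems often use beam search, optionally with a lexicon or language model, instead of the greedy rule shown.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol
ALPHABET = ["-", "a", "c", "t"]  # toy alphabet; position 0 stands for blank


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge consecutive duplicates first, then remove blanks.
    collapsed = [label for label, _ in groupby(best_path) if label != blank]
    return "".join(alphabet[label] for label in collapsed)


# Five frames whose argmax path is: c, blank, a, a, t
frames = [
    [0.10, 0.10, 0.70, 0.10],
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.10],
    [0.20, 0.60, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.70],
]
print(ctc_greedy_decode(frames, ALPHABET))  # cat
```

The blank symbol is what lets the model emit a doubled letter: the path a, blank, a decodes to "aa", while a, a, a decodes to "a".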
Classic stack
detect → deskew / rectify → segment characters or run a line-level CRNN → post-process
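The post-process step is often where a lexicon comes in. The sketch below is one simple way to do it, not the only one: it snaps a recognized word to the nearest dictionary entry using the standard-library `difflib.get_close_matches`. The function name `snap_to_lexicon`, the toy `LEXICON`, and the 0.7 similarity cutoff are assumptions made for illustration.

```python
import difflib

LEXICON = ["stop", "yield", "exit"]  # toy dictionary, assumed for the example


def snap_to_lexicon(word, lexicon, cutoff=0.7):
    """Return the closest lexicon entry, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(snap_to_lexicon("st0p", LEXICON))  # stop
```

A cutoff that is too low can overwrite correct but out-of-vocabulary words, so the threshold is worth tuning on real recognizer output.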
Autonomous Vehicles MCQ
Vision in self-driving stacks
Autonomous systems use cameras for rich semantics (lanes, signs, color) and often fuse LiDAR/radar for range and weather robustness. Semantic segmentation labels drivable space; detectors find vehicles and pedestrians, which are then tracked; HD maps and odometry help integrate these observations over time. Redundancy and validation matter as much as model accuracy.
Functional safety
Production stacks use redundant sensing modalities and monitor perception health at runtime, rather than relying only on offline metrics such as mAP.
Key ideas
Lane detection
Polynomial fits, segmentation masks, or row-wise classifiers over the road region.
Critical objects
Vehicles, pedestrians, cyclists—often tracked over time.
Segmentation
Freespace vs obstacles; curb and road boundary cues.
Fusion
Project LiDAR points into the camera image using calibrated extrinsics and intrinsics; fuse early (raw data or features) or late (per-sensor outputs).
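To make the LiDAR-to-camera projection concrete, here is a minimal pinhole-model sketch in plain Python. It assumes a camera frame with z pointing forward, a known rotation and translation from the LiDAR frame to the camera frame, and no lens distortion; `project_lidar_to_image` and all parameter names are hypothetical and not taken from any particular library.

```python
def project_lidar_to_image(points, rotation, translation, fx, fy, cx, cy, width, height):
    """Project 3-D LiDAR points to pixels; keep points in front of the camera and inside the image.

    rotation is a 3x3 nested list and translation a length-3 list (LiDAR -> camera).
    Returns a list of (u, v, depth) tuples.
    """
    pixels = []
    for p in points:
        # Rigid transform into the camera frame: p_cam = R * p + t
        x, y, z = (
            sum(r * c for r, c in zip(row, p)) + t
            for row, t in zip(rotation, translation)
        )
        if z <= 0:
            continue  # behind the image plane
        # Pinhole projection with focal lengths (fx, fy) and principal point (cx, cy)
        u = fx * x / z + cx
        v = fy * y / z + cy
        if 0 <= u < width and 0 <= v < height:
            pixels.append((u, v, z))
    return pixels


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cloud = [(0, 0, 10), (1, 0, 10), (0, 0, -5), (10, 0, 10)]
print(project_lidar_to_image(cloud, IDENTITY, [0, 0, 0], 100, 100, 50, 40, 100, 80))
```

The returned depth per pixel is what an early-fusion model could attach to image features; a late-fusion design would instead match camera detections against LiDAR detections.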
Perception loop
capture → calibrate → detect/segment → track → plan
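The track step in the loop above can be sketched, under simplifying assumptions, as greedy association between existing track boxes and new detections by intersection over union (IoU). Boxes are assumed to be `(x1, y1, x2, y2)` tuples; `iou`, `associate`, and the 0.3 threshold are hypothetical choices for illustration. Production trackers typically add motion prediction and more careful matching.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedily match tracks to detections in order of descending IoU.

    tracks: dict of track_id -> box; detections: list of boxes.
    Returns (dict of track_id -> detection index, list of unmatched detection indices).
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if tid in matches or j in used:
            continue  # each track and detection is matched at most once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 0, 11, 10), (50, 50, 60, 60)]
print(associate(tracks, detections))  # ({1: 1, 2: 0}, [2])
```

Unmatched detections would usually start new tracks, and tracks that stay unmatched for several frames would be dropped.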