Optical Character Recognition MCQ
Find text regions, normalize crops, and transcribe characters or sequences, whether printed or in the wild.
Text
Unicode
Detection
Boxes / masks
Recognition
Sequence
Scene
Wild text
Reading text in images
OCR splits into locating text (detection) and reading glyphs or sequences (recognition). Classical pipelines use segment-then-classify; deep models use CNN+RNN+CTC or attention decoders for line-level text. Scene text in photos is harder than scanned documents due to blur, perspective, and clutter.
Detection vs recognition
A two-stage system runs a detector to find word-level boxes (often quadrilaterals), crops and rectifies each patch, then passes it to a sequence recognizer. End-to-end models combine both stages in one network.
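A minimal sketch of the crop-and-rectify step, assuming the detector returns four corners ordered top-left, top-right, bottom-right, bottom-left. The function name `rectify_quad` and the nested-list image format are illustrative assumptions. It uses bilinear corner interpolation with nearest-neighbour sampling, which is a simplification of a true perspective warp (such as OpenCV's `cv2.getPerspectiveTransform` with `cv2.warpPerspective`).

```python
def rectify_quad(image, quad, out_w, out_h):
    """Resample a quadrilateral region into an upright out_h x out_w patch.

    image: 2D list of pixel values, indexed image[y][x] (assumed format).
    quad:  four (x, y) corners ordered top-left, top-right,
           bottom-right, bottom-left.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = quad
    height, width = len(image), len(image[0])
    patch = []
    for row in range(out_h):
        # v runs from 0 (top edge of the quad) to 1 (bottom edge).
        v = row / (out_h - 1) if out_h > 1 else 0.0
        out_row = []
        for col in range(out_w):
            # u runs from 0 (left edge of the quad) to 1 (right edge).
            u = col / (out_w - 1) if out_w > 1 else 0.0
            # Bilinear blend of the four corners gives the source position.
            x = (1 - u) * (1 - v) * x0 + u * (1 - v) * x1 + u * v * x2 + (1 - u) * v * x3
            y = (1 - u) * (1 - v) * y0 + u * (1 - v) * y1 + u * v * y2 + (1 - u) * v * y3
            # Nearest-neighbour sampling, clamped to the image bounds.
            xi = min(max(int(round(x)), 0), width - 1)
            yi = min(max(int(round(y)), 0), height - 1)
            out_row.append(image[yi][xi])
        patch.append(out_row)
    return patch


# Toy image where each pixel encodes its own position as 10*y + x.
image = [[10 * y + x for x in range(5)] for y in range(4)]
upright = rectify_quad(image, [(1, 1), (3, 1), (3, 2), (1, 2)], out_w=3, out_h=2)
# The same region detected upside down: the corner order undoes the rotation.
flipped = rectify_quad(image, [(3, 2), (1, 2), (1, 1), (3, 1)], out_w=3, out_h=2)
```

Because the corner order carries the reading direction, the same routine handles rotated or upside-down boxes without a separate deskew step.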
Key ideas
Text detection
Detectors such as EAST or DB (Differentiable Binarization) predict boxes or segmentation masks for text regions.
Line recognition
Collapse the CNN feature map into a left-to-right sequence of column features, then decode it with an RNN plus CTC or with an attention decoder.
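The map-to-sequence step can be sketched in plain Python, assuming a feature map laid out as `fmap[channel][row][column]`. The helper name `feature_map_to_sequence` is hypothetical. Each image column becomes one time step, so the sequence length equals the feature map width.

```python
def feature_map_to_sequence(fmap):
    """Turn a C x H x W feature map into W feature vectors of length C*H.

    fmap is a nested list indexed fmap[c][h][w] (assumed layout).
    """
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    sequence = []
    for w in range(width):
        # Stack every channel and row of column w into one time-step vector.
        step = [fmap[c][h][w] for c in range(channels) for h in range(height)]
        sequence.append(step)
    return sequence


# Toy map with 2 channels, 2 rows, 3 columns.
fmap = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
seq = feature_map_to_sequence(fmap)
```

In a real recognizer the height is usually pooled down to 1 before this step, so each vector is just the channel dimension. The sketch keeps the height to show the general case.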
CTC loss
Trains on unsegmented line images by summing over every valid frame-to-character alignment, so no per-character frame labels are needed.
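A small sketch of how CTC sums over alignments, assuming per-frame probabilities `probs[t][k]` and symbol id 0 as the blank. The function names are illustrative and the code works in plain probabilities for readability. Production implementations (for example `torch.nn.CTCLoss`) work in log space for numerical stability.

```python
import math


def ctc_collapse(path, blank=0):
    """Map a frame-level path to its label sequence: merge repeats, drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != blank:
            out.append(symbol)
        prev = symbol
    return out


def ctc_probability(probs, target, blank=0):
    """Total probability of all frame paths that collapse to `target`.

    probs[t][k] is the probability of symbol k at frame t (assumed input).
    """
    # Extended target: blanks between labels and at both ends.
    ext = [blank]
    for symbol in target:
        ext += [symbol, blank]
    size = len(ext)
    # alpha[s] = probability of all path prefixes ending at ext[s].
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank between distinct labels
            new[s] = total * probs[t][ext[s]]
        alpha = new
    # Valid paths end on the last label or the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under the CTC model."""
    return -math.log(ctc_probability(probs, target, blank))


# Two frames, two symbols (0 = blank, 1 = "a"), uniform probabilities.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p_a = ctc_probability(uniform, [1])       # paths "a-", "-a", "aa"
p_empty = ctc_probability(uniform, [])    # path "--"
```

With two uniform frames, three of the four possible paths collapse to "a" and one collapses to the empty string, which is what the forward recursion adds up.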
Lexicon / LM
Constrains decoding with dictionaries or language models.
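One simple form of lexicon constraint is to snap the raw transcription to the closest dictionary word by edit distance. The sketch below assumes a small word list and a hypothetical `max_distance` threshold. Stronger systems score candidates during beam search with a language model instead of correcting afterwards.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def snap_to_lexicon(raw, lexicon, max_distance=2):
    """Return the nearest lexicon word, or `raw` if nothing is close enough."""
    if not lexicon:
        return raw
    best = min(lexicon, key=lambda word: edit_distance(raw, word))
    return best if edit_distance(raw, best) <= max_distance else raw


lexicon = ["hello", "help", "world"]
fixed = snap_to_lexicon("he1lo", lexicon)      # recognizer confused "l" with "1"
untouched = snap_to_lexicon("xyzzy", lexicon)  # too far from every entry
```

The threshold keeps out-of-vocabulary strings such as serial numbers from being forced onto an unrelated dictionary word.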
Classic stack
detect → deskew / rectify → segment characters (classical) or run a line-level CRNN → post-process