CV Libraries & Frameworks — Interview Q&A

Question 1

1 What is OpenCV? ⚡ easy

Answer

Answer: Open-source computer vision library with C++ core and Python/Java bindings—image I/O, filters, geometry, features, ML/DNN hooks.

Question 2

2 What is a Mat? 📊 medium

Answer

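Answer: Dense n-dimensional array that stores pixels with a typed element layout (CV_8UC3, etc.) in a reference-counted buffer; an ROI shares data with its parent unless you copy it with clone() or copyTo() in C++, or .copy() on the NumPy array in Python.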

Question 3

3 imread flags? ⚡ easy

Answer

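Answer: IMREAD_COLOR (default), IMREAD_GRAYSCALE, IMREAD_UNCHANGED. The default returns 8-bit, 3-channel BGR and drops alpha; use IMREAD_UNCHANGED to keep alpha and 16-bit depth for medical/raw imagery. In Python, imread returns None on failure, so check the result.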

Question 4

4 Why BGR? 📊 medium

Answer

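Answer: Historical choice, commonly attributed to the BGR order used by early camera and Windows imaging formats; matplotlib expects RGB, so convert with cvtColor(img, COLOR_BGR2RGB) before display in Python notebooks.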
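Because the Python bindings expose images as NumPy arrays, the BGR-to-RGB swap amounts to reversing the channel axis. A minimal sketch, assuming only NumPy is installed; the 2x2 image is made up for illustration, and `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` is expected to give the same result.

```python
import numpy as np

# Hypothetical 2x2 image in OpenCV's BGR channel order: pure blue everywhere.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR

# Reverse the last axis to get RGB. ascontiguousarray makes a real copy,
# because the reversed slice alone is only a view with a negative stride.
rgb = np.ascontiguousarray(bgr[..., ::-1])

print(rgb[0, 0])  # blue now sits in the last channel
```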

Question 5

5 resize interpolation? 📊 medium

Answer

Answer: INTER_LINEAR default; INTER_AREA for downscale; INTER_CUBIC/LANCZOS4 for quality upsampling—trade speed vs sharpness.

Question 6

6 Drawing functions? ⚡ easy

Answer

Answer: line, rectangle, circle, putText. They modify the image in place; pass lineType=LINE_AA for anti-aliased output.

Question 7

7 GaussianBlur? 📊 medium

Answer

Answer: Separable Gaussian kernel smoothing that reduces noise before edge detection; kernel width and height must be positive and odd (or zero, in which case they are derived from sigma).
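Separability means the 2-D Gaussian kernel is the outer product of a 1-D kernel with itself, which is why the filter can run as a row pass followed by a column pass. A small NumPy sketch of that idea follows; `gaussian_kernel_1d` is a hypothetical helper written for this example, and OpenCV's own kernel generation may differ in details.

```python
import numpy as np

def gaussian_kernel_1d(ksize: int, sigma: float) -> np.ndarray:
    """Normalized 1-D Gaussian; ksize must be odd so the kernel has a center."""
    if ksize <= 0 or ksize % 2 == 0:
        raise ValueError("ksize must be a positive odd integer")
    x = np.arange(ksize) - ksize // 2
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

k1 = gaussian_kernel_1d(5, sigma=1.0)
k2 = np.outer(k1, k1)  # the equivalent 2-D kernel

# Filtering rows then columns with k1 matches one pass with k2,
# but costs 2 * ksize multiplies per pixel instead of ksize ** 2.
print(k2.shape, round(float(k2.sum()), 6))
```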

Question 8

8 Canny steps? 📊 medium

Answer

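Answer: Smoothing (usually a Gaussian blur applied beforehand), gradient computation, non-maximum suppression, then hysteresis thresholding with a low and a high threshold. Produces thin, connected edges; results are sensitive to the amount of blur and to threshold tuning.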

Question 9

9 findContours? 📊 medium

Answer

Answer: Expects binary mask; returns boundary curves—CHAIN_APPROX_SIMPLE compresses polygons; use for shape analysis.

Question 10

10 Morphology? ⚡ easy

Answer

Answer: erode/dilate/open/close with structuring element—clean masks, separate blobs, fill holes.

Question 11

11 VideoCapture? 📊 medium

Answer

Answer: Read camera or file; check isOpened(); codec fourcc for VideoWriter—platform quirks on macOS/Windows.

Question 12

12 Python vs C++? 📊 medium

Answer

Answer: Same algorithms; Python is faster to prototype, C++ suits embedded and latency-critical code. In the Python bindings images are NumPy arrays, and in some flows the C++ Mat wraps the same buffer without a copy.

Question 13

13 dnn module? 📊 medium

Answer

Answer: Read ONNX/Caffe/TF frozen graphs—blobFromImage, setInput, forward—good for deployment without full DL framework.

Question 14

14 calib3d snapshot? 🔥 hard

Answer

Answer: calibrateCamera, undistort, stereoRectify—pinhole + distortion model for AR and measurement.

Question 15

15 ROI pitfalls? 📊 medium

Answer

Answer: Slicing shares memory, so mutations through the ROI also change the parent Mat; call clone() in C++ or .copy() in Python for an independent crop.
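The pitfall can be reproduced with plain NumPy, since that is what the Python bindings hand back. A minimal sketch, assuming only NumPy; the 4x4 array stands in for a grayscale image.

```python
import numpy as np

img = np.zeros((4, 4), dtype=np.uint8)  # stand-in for a grayscale image

roi_view = img[1:3, 1:3]         # slicing returns a view on the same buffer
roi_copy = img[1:3, 1:3].copy()  # an explicit copy owns its own memory

roi_view[:] = 255  # also changes img
roi_copy[:] = 7    # leaves img untouched

print(img)
print(np.shares_memory(img, roi_view), np.shares_memory(img, roi_copy))
```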

Question 16

16 Performance? 📊 medium

Answer

Answer: Avoid Python loops on pixels; use vectorized OpenCV; optional IPP/TBB builds; profile hot paths.

Question 17

17 UMat / OpenCL? 🔥 hard

Answer

Answer: Transparent OpenCL offload when T-API enabled—mixed pipelines need careful sync with Mat.

Question 18

18 Why build from source? ⚡ easy

Answer

Answer: Enable nonfree algorithms (SURF, and SIFT in builds before 4.4), CUDA, or custom flags; wheels on PyPI are convenient but ship with fixed build options.

Question 19

19 License? ⚡ easy

Answer

Answer: Apache 2.0 from 4.5.0 onward; earlier releases use the 3-clause BSD license. Check contrib modules and patent notes for individual algorithms.

Question 20

20 Alternatives? 📊 medium

Answer

Answer: scikit-image, Pillow (limited CV), VTK, vendor SDKs—OpenCV remains default for classical CV education and tooling.

Question 21

21 What is torchvision? ⚡ easy

Answer

Answer: PyTorch domain library for vision—datasets, transforms, model architectures, and utilities (ops, io).

Question 22

22 Transforms v2? 📊 medium

Answer

Answer: Tensor-based, torchscript-friendly transforms with consistent API for image/video/bbox/mask—prefer over legacy PIL transforms.

Question 23

23 Compose? ⚡ easy

Answer

Answer: Chains transforms in order; with transforms v2 a typical pipeline is Resize → ToImage → ToDtype(torch.float32, scale=True) → Normalize before batching.

Question 24

24 ImageFolder? 📊 medium

Answer

Answer: Folder-per-class dataset returning image, label—pairs with DataLoader for supervised classification finetuning.

Question 25

25 Common augmentations? 📊 medium

Answer

Answer: RandomResizedCrop, hflip, ColorJitter, RandAugment—match train vs eval (no randomness at test).

Question 26

26 Normalize mean/std? 📊 medium

Answer

Answer: Per-channel (x-mean)/std—use weights’ documented stats (ImageNet) when loading pretrained backbones.
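The per-channel arithmetic can be written out in a few lines. This is a NumPy sketch of the formula only, not torchvision's implementation; `normalize_chw` is a hypothetical helper, and the statistics are the ImageNet values commonly documented alongside pretrained weights.

```python
import numpy as np

# ImageNet statistics commonly documented for pretrained weights (RGB order).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_chw(img: np.ndarray) -> np.ndarray:
    """Apply (x - mean) / std per channel to a float CHW image in [0, 1]."""
    return (img - MEAN[:, None, None]) / STD[:, None, None]

# Hypothetical input: a 3x2x2 image where every pixel equals the channel mean.
x = np.broadcast_to(MEAN[:, None, None], (3, 2, 2)).copy()
y = normalize_chw(x)
print(y.max(), y.min())  # both are zero, since x equals the mean everywhere
```

A frequent bug is normalizing uint8 data in [0, 255] with these statistics; the input should already be scaled to [0, 1].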

Question 27

27 models.resnet50 pattern? ⚡ easy

Answer

Answer: Factory functions return the architecture; pass weights=ResNet50_Weights.IMAGENET1K_V2 to load pretrained weights (weights=None gives random initialization).

Question 28

28 Weights enums? 📊 medium

Answer

Answer: Typed enums carry meta (categories, metrics)—get_weight() or auto-download on first use; reproducible defaults.

Question 29

29 Finetune classifier? 🔥 hard

Answer

Answer: Replace final FC layer to num_classes; freeze backbone optionally; differential LR for head vs body.

Question 30

30 DataLoader notes? 📊 medium

Answer

Answer: num_workers, pin_memory=True on GPU, persistent_workers—collate_fn for variable-size detection batches.
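The `collate_fn` point matters because the default collation tries to stack every field into one tensor, which fails when each image has a different number of boxes. A minimal sketch, assuming only PyTorch; `ToyDetectionDataset` and `detection_collate` are hypothetical names for this example.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDetectionDataset(Dataset):
    """Hypothetical dataset: each image has a different number of boxes."""

    def __len__(self):
        return 4

    def __getitem__(self, idx):
        image = torch.zeros(3, 32, 32)
        boxes = torch.zeros(idx + 1, 4)  # idx + 1 boxes, so sizes vary
        return image, {"boxes": boxes}

def detection_collate(batch):
    # Keep images and targets as tuples instead of stacking them, because
    # per-image box tensors cannot be stacked into a single tensor.
    images, targets = zip(*batch)
    return images, targets

loader = DataLoader(ToyDetectionDataset(), batch_size=2, collate_fn=detection_collate)
images, targets = next(iter(loader))
print(len(images), [t["boxes"].shape[0] for t in targets])
```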

Question 31

31 Detection helpers? 🔥 hard

Answer

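Answer: NMS and box utilities live in torchvision.ops; Faster R-CNN and Mask R-CNN reference implementations live in torchvision.models.detection. The coco_eval helper is part of the detection reference scripts in the torchvision repository rather than the installed package.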

Question 32

32 ONNX export? 📊 medium

Answer

Answer: torch.onnx.export on wrapped model—watch dynamic axes and op support; verify in onnxruntime.

Question 33

33 torchvision vs timm? 📊 medium

Answer

Answer: timm: huge model zoo; torchvision: tightly coupled PyTorch references—often mix timm backbone + custom head.

Question 34

34 AMP? ⚡ easy

Answer

Answer: autocast + GradScaler—most torchvision ops support fp16 on CUDA; watch BatchNorm numerics.

Question 35

35 torchvision.ops? 📊 medium

Answer

Answer: ROIAlign, NMS, box_iou—building blocks for detectors; CUDA kernels behind the scenes.
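To make the semantics of two of these building blocks concrete, here is a plain NumPy sketch of pairwise IoU and greedy NMS. It is an illustration of the technique, not torchvision's implementation; the function names mirror the library only for readability, and the boxes and scores are made up.

```python
import numpy as np

def box_iou(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise IoU for boxes in (x1, y1, x2, y2) format; shape [len(a), len(b)]."""
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    lt = np.maximum(a[:, None, :2], b[None, :, :2])  # top-left of intersection
    rb = np.minimum(a[:, None, 2:], b[None, :, 2:])  # bottom-right of intersection
    wh = np.clip(rb - lt, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float) -> list:
    """Greedy NMS: keep the best box, drop others that overlap it too much."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        ious = box_iou(boxes[best][None, :], boxes[rest])[0]
        order = rest[ious <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores, iou_threshold=0.5))
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box survives.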

Question 36

36 Video datasets? 📊 medium

Answer

Answer: Kinetics-style readers + temporal transforms—memory heavy; clip sampling strategies matter.

Question 37

37 Extract features? 🔥 hard

Answer

Answer: Forward hooks or intermediate layers API—FPN-style multi-scale features for segmentation/detection heads.
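The forward-hook route can be sketched on a toy network. This assumes only PyTorch; the two-stage backbone and the names `stride2` and `stride4` are made up, standing in for the stages of a real model.

```python
import torch
from torch import nn

# Tiny stand-in backbone; the layer layout is hypothetical.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
)

features = {}

def save_output(name):
    # Returns a hook that stores the module output under the given name.
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

handles = [
    backbone[1].register_forward_hook(save_output("stride2")),
    backbone[3].register_forward_hook(save_output("stride4")),
]

with torch.no_grad():
    backbone(torch.randn(1, 3, 64, 64))

for h in handles:
    h.remove()  # remove hooks once the features are captured

print({k: tuple(v.shape) for k, v in features.items()})
```

Hooks are convenient for inspection; for training pipelines, a dedicated feature-extraction API that returns the intermediate outputs directly is usually easier to export and script.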

Question 38

38 torch.jit? 🔥 hard

Answer

Answer: Trace or script model+transforms carefully—some dynamic Python in transforms blocks scripting.

Question 39

39 Version coupling? ⚡ easy

Answer

Answer: torchvision releases track specific torch versions—install matched pairs to avoid binary incompatibility.

Question 40

40 Debug pipeline? ⚡ easy

Answer

Answer: Visualize tensors after transforms; assert value ranges [0,1] or normalized; check label mapping in ImageFolder.

Question 41

41 TensorFlow vision stack? ⚡ easy

Answer

Answer: Keras high-level API + tf.data for input pipelines + optional KerasCV for detection/segmentation blocks.

Question 42

42 Why tf.data? 📊 medium

Answer

Answer: Declarative, parallel map/prefetch/batch—keeps GPU fed; cache() and prefetch(AUTOTUNE) are standard patterns.
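A typical ordering of these calls looks roughly like the sketch below. It assumes TensorFlow 2.x; the random tensors stand in for decoded images, and the batch and buffer sizes are arbitrary.

```python
import tensorflow as tf

# Synthetic stand-ins for decoded images and labels.
images = tf.random.uniform((8, 32, 32, 3), maxval=256, dtype=tf.int32)
labels = tf.range(8)

def preprocess(image, label):
    # Convert to float32 in [0, 1]; heavier augmentation would also go here.
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .cache()                     # reuse preprocessed elements after epoch one
    .shuffle(buffer_size=8)
    .batch(4)
    .prefetch(tf.data.AUTOTUNE)  # overlap input work with training steps
)

for batch_images, batch_labels in ds.take(1):
    print(batch_images.shape, batch_labels.shape)
```

Random augmentation belongs after `cache()`, otherwise the same augmented samples are replayed every epoch.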

Question 43

43 Preprocessing layers? 📊 medium

Answer

Answer: Augment inside model (RandomFlip, RandomRotation) for exportable training graph—same code path in TFLite if supported.

Question 44

44 keras.applications? ⚡ easy

Answer

Answer: Pretrained ResNet, EfficientNet, etc.; set include_top=False for feature extraction and weights="imagenet" to load ImageNet weights.

Question 45

45 Finetuning recipe? 🔥 hard

Answer

Answer: Freeze base, train head; unfreeze top layers with small LR; use early stopping—watch catastrophic forgetting on tiny data.

Question 46

46 TPU strategy? 🔥 hard

Answer

Answer: TPUStrategy scope + dataset stored in GCS—batch size and image size constraints differ from GPU pipelines.

Question 47

47 MirroredStrategy? 📊 medium

Answer

Answer: Data-parallel multi-GPU on one host—simple path to scale batch without manual gradient sync code.

Question 48

48 SavedModel? 📊 medium

Answer

Answer: Directory with graph, variables, and signatures; TF Serving loads it directly and the TFLite converter takes it as input.

Question 49

49 TFLite conversion? 📊 medium

Answer

Answer: TFLiteConverter.from_keras_model—select ops, FP16/INT8 quantization for mobile/NPU deployment.

Question 50

50 Quantization-aware training? 🔥 hard

Answer

Answer: Simulate low precision during training—better INT8 accuracy than post-training quant on hard models.
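"Simulating low precision" usually means a quantize-then-dequantize step in the forward pass, so the weights stay in float but only take values an INT8 grid can represent. The NumPy sketch below shows that forward step only, assuming a symmetric per-tensor scheme; it is not the TensorFlow Model Optimization API, and `fake_quantize` is a hypothetical helper.

```python
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Quantize to signed integers and immediately dequantize (symmetric scheme)."""
    qmax = 2 ** (num_bits - 1) - 1    # 127 for int8
    scale = np.max(np.abs(x)) / qmax  # one scale for the whole tensor
    if scale == 0:
        return x.copy()
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

weights = np.array([-1.0, -0.5, 0.0, 0.3, 1.0])  # made-up weights
w_q = fake_quantize(weights)
print(np.max(np.abs(weights - w_q)))  # rounding error the network learns to tolerate
```

During training, the rounding is typically treated as identity in the backward pass (a straight-through estimator), which lets gradients flow through the fake-quantized weights.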

Question 51

51 TensorFlow Serving? 📊 medium

Answer

Answer: Model server with REST/gRPC—batching scheduler for production latency/throughput tradeoffs.

Question 52

52 What is KerasCV? 📊 medium

Answer

Answer: Modular OD/seg components (YOLO, RetinaNet building blocks), augmentations—accelerates TF-native vision research.

Question 53

53 TF vs PyTorch (interview angle)? 📊 medium

Answer

Answer: TF: deployment/serving story, TPU; PyTorch: research ergonomics—both mature for vision; know your team stack.

Question 54

54 TFRecord? ⚡ easy

Answer

Answer: Serialized Example protos for large datasets—efficient sequential read on GCS; not mandatory for small local image folders.

Question 55

55 @tf.function? 📊 medium

Answer

Answer: Trace graph for performance—avoid Python side effects; use Tensor ops inside for autograph success.
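The warning about Python side effects is easiest to see by counting how often the Python body runs. A minimal sketch, assuming TensorFlow 2.x; `scaled_sum` and `trace_log` are made-up names.

```python
import tensorflow as tf

trace_log = []  # records how many times the Python body actually runs

@tf.function
def scaled_sum(x):
    trace_log.append("traced")  # Python side effect: runs only while tracing
    return tf.reduce_sum(x) * 2.0

a = scaled_sum(tf.constant([1.0, 2.0]))
b = scaled_sum(tf.constant([3.0, 4.0]))  # same shape and dtype: graph is reused

print(float(a), float(b), len(trace_log))
```

The list is appended to once even though the function is called twice, which is why logging, counters, and Python-level randomness inside a traced function rarely behave as intended.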

Question 56

56 tf.io.decode_image? ⚡ easy

Answer

Answer: Decodes JPEG, PNG, BMP, or GIF bytes to a uint8 tensor by default; pair with convert_image_dtype for a float pipeline.

Question 57

57 Losses for segmentation? 📊 medium

Answer

Answer: SparseCategoricalCrossentropy(from_logits=True) for integer masks; Dice/Focal variants, often from add-on packages. Match the loss to the output activation (sigmoid vs softmax) and to how ignored mask pixels are handled.

Question 58

58 Profiler? ⚡ easy

Answer

Answer: TensorBoard profiler traces input bottleneck vs GPU kernel gaps—optimize prefetch and augmentation placement.

Question 59

59 TF Hub? 📊 medium

Answer

Answer: Reusable SavedModel modules (image feature vectors)—quick baseline without training from scratch.

Question 60

60 Debugging? ⚡ easy

Answer

Answer: Inspect one batch with dataset.take(1), rely on eager execution (the default in TF2), and assert shapes; static vs dynamic shapes affect XLA compilation.
