PyTorch for Neural Networks — 15 Interview Questions
Tensors, requires_grad, nn.Module, optimizers, GPU moves, and the standard train loop pattern.
1. Tensor vs NumPy array. (Easy)
Answer: PyTorch tensors add GPU execution, autograd, and deep-learning ops on top of NumPy-style arrays; bridge with .numpy() / torch.from_numpy, which share memory when possible on CPU.
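A quick sketch of the CPU bridge; the values are illustrative:

```python
import numpy as np
import torch

a = np.ones(3)
t = torch.from_numpy(a)  # zero-copy on CPU: t and a share one buffer
t[0] = 5.0
print(a[0])              # prints 5.0; the NumPy array sees the change
b = t.numpy()            # back to NumPy, still the same buffer
```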
2. What does requires_grad=True mean? (Easy)
Answer: Autograd tracks operations on the tensor to build a graph for .backward(); needed for parameters, and sometimes for inputs (e.g., meta-learning).
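A minimal sketch of the track-then-backward flow:

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)  # track ops on x
y = (x ** 2).sum()                                # graph: y = x0^2 + x1^2
y.backward()                                      # fills x.grad
print(x.grad)                                     # tensor([4., 6.]), i.e. dy/dx = 2x
```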
3. nn.Module: what must you implement? (Easy)
Answer: forward(self, x) defines the computation; parameters are registered as nn.Parameter or via child modules. Call model(x) rather than model.forward(x) so that hooks run.
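A minimal sketch; the name TinyNet is illustrative:

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)                # child module: params auto-registered
        self.scale = nn.Parameter(torch.ones(1)) # explicit learnable parameter

    def forward(self, x):                        # the one method you must define
        return self.fc(x) * self.scale

model = TinyNet()
out = model(torch.randn(8, 4))                   # model(x), not model.forward(x)
```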
4. nn.Sequential vs subclassing Module. (Medium)
Answer: Sequential chains layers in a fixed order, which keeps simple stacks simple. Subclass nn.Module when you need branching, conditionals, or a multi-input forward.
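A sketch contrasting the two; SkipBlock is a hypothetical name:

```python
import torch
import torch.nn as nn

# Sequential: a straight pipeline, no control flow
mlp = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))

# Subclass when forward needs branching, e.g. a skip connection
class SkipBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)

    def forward(self, x):
        return x + torch.relu(self.fc(x))  # residual add Sequential can't express
```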
5. Training step skeleton. (Easy)
Answer: optimizer.zero_grad() → forward → loss → loss.backward() → optimizer.step(). zero_grad clears the previous iteration's .grad buffers, since PyTorch accumulates gradients by default.
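The skeleton as code; a sketch assuming model, optimizer, loss_fn, and loader are already defined:

```python
for xb, yb in loader:
    optimizer.zero_grad()     # clear last iteration's accumulated .grad
    pred = model(xb)          # forward pass
    loss = loss_fn(pred, yb)  # scalar loss
    loss.backward()           # backprop: write gradients into .grad
    optimizer.step()          # update parameters from .grad
```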
6. DataLoader purpose. (Easy)
Answer: Batches samples, optionally shuffles, runs num_workers processes for parallel loading, and can pin_memory for faster host-to-GPU transfer.
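A minimal sketch with synthetic data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))
loader = DataLoader(
    ds,
    batch_size=16,
    shuffle=True,     # reshuffle every epoch
    num_workers=2,    # worker processes for parallel loading
    pin_memory=True,  # page-locked memory speeds host-to-GPU copies
)
for xb, yb in loader:
    pass  # xb: (16, 4), yb: (16,)
```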
7. Move model and tensors to GPU. (Easy)
Answer: device = torch.device("cuda"); model.to(device); move each batch tensor with .to(device). All operands of an operation must be on the same device.
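A sketch, assuming the model and loader from the earlier snippets:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)                   # moves parameters and buffers in place
for xb, yb in loader:
    xb, yb = xb.to(device), yb.to(device)  # tensor .to returns a new tensor
    out = model(xb)                        # every operand now on one device
```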
8. model.eval() and torch.no_grad(). (Medium)
Answer: eval() switches BatchNorm/Dropout to inference behavior; no_grad() disables autograd tracking, saving memory and compute. Use both for inference.
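The usual evaluation pattern, assuming model and a batch xb from earlier:

```python
import torch

model.eval()           # BatchNorm uses running stats, Dropout is off
with torch.no_grad():  # no graph is built: less memory, faster
    preds = model(xb)
model.train()          # switch back before resuming training
```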
9. detach() vs item(). (Medium)
Answer: detach() returns a tensor cut out of the gradient graph (still a tensor). item() extracts a Python scalar from a single-element tensor; no gradient flows from either.
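A quick illustration:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x * 2
d = y.detach()  # still a tensor, but gradients won't flow back through d
v = y.item()    # plain Python float 6.0; only valid on one-element tensors
```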
10. CrossEntropyLoss vs BCEWithLogitsLoss. (Medium)
Answer: CrossEntropyLoss expects raw logits plus class indices (the softmax happens inside the loss). BCEWithLogitsLoss applies per-element sigmoid + BCE, for binary or multi-label targets.
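A sketch with random logits and targets:

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 5)                   # raw scores: no softmax/sigmoid
targets = torch.randint(0, 5, (8,))          # class indices for CE
ce = nn.CrossEntropyLoss()(logits, targets)  # softmax applied inside the loss

multi = torch.randint(0, 2, (8, 5)).float()  # 0/1 label per class (multi-label)
bce = nn.BCEWithLogitsLoss()(logits, multi)  # per-element sigmoid + BCE
```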
11. Save/load checkpoints. (Easy)
Answer: torch.save({"model": model.state_dict(), "opt": opt.state_dict()}, path); restore with load_state_dict. A state_dict holds weights only, not architecture, so rebuild the model before loading.
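A sketch, assuming model and opt exist; ckpt.pt is a hypothetical path:

```python
import torch

path = "ckpt.pt"
torch.save({"model": model.state_dict(), "opt": opt.state_dict()}, path)

ckpt = torch.load(path, map_location="cpu")
model.load_state_dict(ckpt["model"])  # model must already be constructed
opt.load_state_dict(ckpt["opt"])
```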
12. torch.compile (high level). (Hard)
Answer: JIT-style graph capture and optimization (the Inductor backend); can speed up training and inference, but may need fallbacks for dynamic shapes.
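A minimal sketch, assuming PyTorch 2.x and the model from earlier snippets:

```python
import torch

compiled = torch.compile(model)  # capture + optimize; recompiles on new shapes
out = compiled(xb)               # first call pays compile time, later calls are fast
```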
13. Automatic Mixed Precision (AMP). (Medium)
Answer: Run most of the forward/backward pass in float16 under autocast, with GradScaler keeping gradients numerically stable; faster on Tensor Cores.
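A sketch of the standard CUDA AMP loop (torch.cuda.amp API; the training objects are assumed from the earlier snippets):

```python
import torch

scaler = torch.cuda.amp.GradScaler()  # scales loss to avoid float16 underflow
for xb, yb in loader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():   # eligible ops run in float16
        loss = loss_fn(model(xb), yb)
    scaler.scale(loss).backward()     # backward on the scaled loss
    scaler.step(optimizer)            # unscales gradients, then steps
    scaler.update()                   # adjusts the scale factor
```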
14. Custom autograd.Function: when? (Hard)
Answer: When you need a new op with an explicit forward/backward pair; rare in application code. Use it only when no built-in op fits.
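A toy sketch: a hypothetical SquareFn computing y = x^2 with an explicit backward:

```python
import torch

class SquareFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # stash inputs needed by backward
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * 2 * x   # chain rule with dy/dx = 2x

x = torch.tensor(3.0, requires_grad=True)
SquareFn.apply(x).backward()
print(x.grad)                     # tensor(6.)
```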
15. PyTorch vs TensorFlow (eager), in one line. (Easy)
Answer: Both default to eager execution now; PyTorch is historically more Pythonic for research, while TensorFlow is strong in production tooling (TF Serving, TFLite); in practice they have converged.
Interview tip: state that zero_grad() comes before backward(), never between backward() and step(); the ordering is a classic trick question.

Quick review checklist
- Tensor, requires_grad, backward, zero_grad.
- nn.Module, Sequential, device, eval/no_grad.
- DataLoader; CE vs BCE logits; state_dict; AMP sketch.