Mixed NLP Q&A – Set 2

Mixed NLP questions – core concepts set 2

20 more mixed questions and answers that test your understanding of sequence labeling, NER, QA, summarization, attention, retrieval and evaluation across the NLP stack.

1

What is sequence labeling and which NLP tasks fit this formulation?

Answer: Sequence labeling assigns a label to each token in a sequence; tasks include POS tagging, chunking, NER, BIO tagging for entities and token-level classification in many structured prediction problems.
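To make the BIO formulation concrete, here is a minimal sketch (with made-up tokens and tags) of how per-token BIO labels encode entity spans and how spans are recovered from them:

```python
# Hypothetical example: BIO tags pair each token with a label.
# B- marks the beginning of an entity, I- its continuation, O is outside.
tokens = ["Barack", "Obama", "visited", "Paris", "yesterday"]
tags   = ["B-PER",  "I-PER", "O",       "B-LOC", "O"]

def extract_entities(tokens, tags):
    """Collect (entity_text, entity_type) spans from BIO tags."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:  # an "O" tag closes any open span
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

extract_entities(tokens, tags)
# → [('Barack Obama', 'PER'), ('Paris', 'LOC')]
```

The same token-level framing carries over directly to POS tagging and chunking; only the label set changes.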

2

Why are CRF layers often added on top of BiLSTM or transformer encoders for NER?

Answer: CRFs model dependencies between labels (e.g. valid BIO transitions), enforcing global consistency across the sequence and reducing inconsistent tag sequences that a per-token classifier might produce independently.
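A per-token classifier can emit label sequences that violate the BIO scheme, such as an I- tag with no matching B-. A CRF rules these out by assigning very low (or negative-infinity) scores to invalid transitions in its transition matrix. A minimal sketch of the constraint the CRF enforces:

```python
def is_valid_bio(tags):
    """Check the BIO constraint a CRF transition matrix would enforce:
    I-X may only follow B-X or I-X of the same entity type."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-"):
            etype = tag[2:]
            if prev not in (f"B-{etype}", f"I-{etype}"):
                return False
        prev = tag
    return True

is_valid_bio(["B-PER", "I-PER", "O"])  # valid: I-PER follows B-PER
is_valid_bio(["O", "I-LOC"])           # invalid: I-LOC with no opening B-LOC
```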

3

What is the difference between extractive and abstractive summarization at a high level?

Answer: Extractive methods select sentences or phrases verbatim from the source, while abstractive methods generate new text that may paraphrase or compress the content, similar to how humans write summaries.

4

How does multi-hop question answering differ from simple factoid QA?

Answer: Multi-hop QA requires reasoning across multiple pieces of evidence, often from different sentences or documents, whereas factoid QA can be answered from a single local span that directly contains the fact.

5

What is attention masking and where is it used?

Answer: Attention masks prevent attending to certain positions, such as padding tokens or future tokens in causal language models, ensuring correct semantics and avoiding leakage of information the model should not see.
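The standard trick is to set disallowed positions to negative infinity before the softmax, so they receive exactly zero attention weight. A small NumPy sketch of a causal mask:

```python
import numpy as np

def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend only to positions j <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_softmax(scores, mask):
    """Set disallowed positions to -inf so softmax assigns them zero weight."""
    scores = np.where(mask, scores, -np.inf)
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.zeros((4, 4))  # uniform raw attention scores
weights = masked_softmax(scores, causal_mask(4))
# Row 0 attends only to itself; row 3 attends uniformly over all 4 positions.
```

Padding masks work the same way, except the masked positions are the padding columns rather than future ones.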

6

Why is calibration of model confidence important in deployed NLP systems?

Answer: Calibrated probabilities allow systems to know when to abstain, defer to humans or request clarification, reducing harmful mistakes and enabling better risk management in high-stakes applications like medicine or finance.
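One widely used post-hoc recalibration method is temperature scaling: dividing logits by a scalar T > 1 (fit on held-out data) softens overconfident probability distributions without changing the predicted class. A minimal sketch, with made-up logits:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def scale_temperature(logits, T):
    """Divide logits by temperature T > 1 to soften overconfident predictions;
    the argmax (predicted class) is unchanged."""
    return softmax(logits / T)

logits = np.array([4.0, 1.0, 0.0])
softmax(logits).max()                  # high confidence on the top class
scale_temperature(logits, 2.0).max()   # same top class, lower confidence
```

Once probabilities are calibrated, a simple confidence threshold can trigger abstention or deferral to a human.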

7

What role does a retriever play in retrieval-augmented generation?

Answer: The retriever searches external knowledge sources (such as dense vector indexes or sparse BM25 indexes) for documents relevant to the query, and the generator then conditions on this retrieved context to produce more accurate, grounded responses.
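A dense retriever, stripped to its core, ranks documents by similarity between a query embedding and precomputed document embeddings. A toy sketch with hypothetical 2-dimensional vectors standing in for real embeddings:

```python
import numpy as np

# Hypothetical toy index: each document represented by a dense vector.
doc_vecs = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
docs = ["capital cities", "capital and geography", "animal facts"]

def retrieve(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query and return top-k indices."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    return np.argsort(-scores)[:k]

top = retrieve(np.array([1.0, 0.2]), doc_vecs)
# The generator would then condition on docs[top[0]], docs[top[1]], ...
```

Production systems replace the toy vectors with learned embeddings and the brute-force scan with an approximate nearest-neighbor index.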

8

Why is it useful to separate retrieval and generation components in NLP systems?

Answer: Separation allows independent optimization, easier updates of knowledge (by changing the index instead of retraining the LM) and modular design where different retrievers or generators can be swapped for various tasks and constraints.

9

What is label smoothing and why is it used in sequence models?

Answer: Label smoothing replaces one-hot target distributions with slightly softened ones, preventing the model from becoming overconfident and often improving generalization, especially in encoder–decoder generation tasks like machine translation.
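One common variant puts 1 − ε on the true class and spreads ε uniformly over the remaining classes (other formulations spread ε over all classes, including the true one). A minimal sketch:

```python
import numpy as np

def smooth_labels(target_idx, num_classes, eps=0.1):
    """One variant of label smoothing: (1 - eps) on the true class,
    eps / (num_classes - 1) on each of the other classes."""
    dist = np.full(num_classes, eps / (num_classes - 1))
    dist[target_idx] = 1.0 - eps
    return dist

smooth_labels(2, 5)
# → [0.025, 0.025, 0.9, 0.025, 0.025]
```

Training then minimizes cross-entropy against this softened distribution instead of the one-hot target.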

10

How can evaluation metrics encourage undesirable behavior in generative models?

Answer: If metrics reward surface overlap or length characteristics too strongly, models may learn to game them by copying source text, being overly verbose or repeating common phrases, rather than optimizing for user satisfaction or correctness.

11

What is catastrophic forgetting in the context of fine-tuning language models?

Answer: Catastrophic forgetting occurs when fine-tuning on a small dataset overwrites useful general knowledge learned during pretraining, degrading performance on previously acquired capabilities; it can be mitigated with careful learning rates and regularization.

12

Why might you use adapter layers instead of full fine-tuning?

Answer: Adapters add small trainable modules within a frozen backbone, significantly reducing the number of parameters to update, easing storage of multiple task adaptations and mitigating forgetting across tasks sharing a base model.
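The standard adapter design is a bottleneck with a residual connection: down-project to a small dimension, apply a nonlinearity, up-project, and add the input back. A NumPy sketch with hypothetical sizes, using the common zero-init trick so the adapter starts as an identity function:

```python
import numpy as np

def adapter_forward(h, W_down, W_up):
    """Bottleneck adapter: down-project, ReLU, up-project, plus a residual
    connection so the frozen backbone's output passes through unchanged
    when the adapter contributes nothing."""
    z = np.maximum(0.0, h @ W_down)  # ReLU bottleneck
    return h + z @ W_up              # residual connection

rng = np.random.default_rng(0)
d_model, d_bottleneck = 8, 2                      # hypothetical sizes
h = rng.standard_normal(d_model)                  # frozen backbone's hidden state
W_down = rng.standard_normal((d_model, d_bottleneck)) * 0.01
W_up = np.zeros((d_bottleneck, d_model))          # zero init → exact identity at start

out = adapter_forward(h, W_down, W_up)
```

With a bottleneck of size r, the adapter adds roughly 2 · d_model · r trainable parameters per layer, far fewer than the d_model² weights of the frozen projections it sits beside.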

13

What is the difference between extractive QA and open-domain QA?

Answer: Extractive QA answers questions from a specified context passage by selecting spans, while open-domain QA must first locate relevant documents from a large corpus or the web, then extract or generate the answer from those sources.

14

How can QA-based evaluation be used to assess summarization quality?

Answer: QA-based evaluation creates questions from the source document and checks whether answers can be recovered from the summary, indirectly measuring how much critical information the summary preserves compared to the original text.

15

What are hallucinations in retrieval-augmented systems and why can they still occur?

Answer: Even with retrieval, generators may ignore or misinterpret documents and invent details; retrieval reduces but does not eliminate hallucinations, so careful prompt design, grounding checks and output validation remain necessary.

16

Why are system and developer messages important in chat-based LLM APIs?

Answer: They provide high-priority instructions and constraints that shape the assistant’s behavior across many turns, enabling consistent persona, policy enforcement and separation between user content and system-level rules.
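The role separation is visible in the request payload itself. A sketch of a typical chat-style message list (the exact field names and supported roles vary by provider; the `system`/`user`/`assistant` convention shown here is the most common one):

```python
# Hypothetical chat payload illustrating role separation.
messages = [
    {"role": "system",
     "content": "You are a concise support assistant. Answer in one paragraph."},
    {"role": "user",
     "content": "How do I reset my password?"},
]
# The system message stays fixed across turns and takes priority over
# user content, while user/assistant messages accumulate as the
# conversation proceeds.
```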

17

What is data leakage and how can it affect NLP model evaluation?

Answer: Data leakage occurs when test information unintentionally appears in training or prompts; it leads to over-optimistic performance estimates because the model effectively memorizes answers rather than generalizing from unseen data.
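A crude but useful first check is measuring verbatim overlap between test and training examples; real audits should also catch near-duplicates via n-gram or embedding similarity. A minimal sketch with made-up data:

```python
def overlap_fraction(train_texts, test_texts):
    """Fraction of test examples that appear verbatim (case-insensitive)
    in the training data — a first-pass leakage check."""
    train_set = {t.strip().lower() for t in train_texts}
    hits = sum(t.strip().lower() in train_set for t in test_texts)
    return hits / len(test_texts)

train = ["the cat sat", "dogs bark loudly", "rain in spain"]
test = ["Dogs bark loudly", "a novel sentence"]
overlap_fraction(train, test)
# → 0.5: one of the two test items leaked from training data
```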

18

Why should evaluation sets be representative and diverse?

Answer: Representative, diverse evaluation sets uncover weaknesses across topics, styles and user groups, avoiding misleading conclusions that a system is strong when it only performs well on narrow or biased test slices.

19

How does curriculum learning sometimes help in training NLP models?

Answer: Curriculum learning presents easier examples first and gradually increases difficulty, which can stabilize optimization and help models learn better representations before tackling more complex inputs or tasks.
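In its simplest form, a curriculum is just an ordering of the training data by some difficulty proxy. A sketch using sentence length, one common heuristic for text (real curricula often use model-based difficulty scores or staged data mixtures instead):

```python
def curriculum_order(examples, difficulty):
    """Sort training examples from easy to hard by a difficulty proxy."""
    return sorted(examples, key=difficulty)

examples = [
    "a long and rather complicated sentence",
    "short",
    "medium length here",
]
curriculum_order(examples, difficulty=len)
# → ['short', 'medium length here', 'a long and rather complicated sentence']
```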

20

What mindset is helpful when preparing for mixed-topic NLP interviews?

Answer: Focus on understanding core principles rather than memorizing details, practice explaining trade-offs and design decisions clearly and be ready to connect concepts across modeling, data, evaluation and deployment concerns.

🔍 Mixed NLP concepts covered – Set 2

This page reinforces advanced NLP concepts: structured prediction, QA and summarization, attention and RAG patterns, calibration, adapters, evaluation design and deployment issues important for real interview discussions.

Sequence labeling & CRFs
QA & multi-hop reasoning
Attention masks & RAG
Fine-tuning & adapters
Metric pitfalls & leakage
Interview preparation mindset