Machine translation – short Q&A
20 questions and answers on machine translation, including neural seq2seq models, attention, transformers and evaluation metrics like BLEU for real-world MT systems.
What is machine translation (MT)?
Answer: Machine translation is the automatic conversion of text from a source language to a target language by a computer system, aiming to preserve meaning and produce fluent target sentences.
How did statistical machine translation (SMT) work?
Answer: SMT modeled translation using phrase tables, alignment models and language models, selecting the target sentence that maximized a combination of translation and fluency probabilities learned from parallel corpora.
What is neural machine translation (NMT)?
Answer: NMT uses neural networks, typically encoder–decoder architectures with attention or transformers, to model translation as a sequence-to-sequence learning problem trained end-to-end on parallel data.
Why was attention a breakthrough for NMT?
Answer: Attention lets the decoder dynamically focus on relevant source tokens at each step, alleviating the bottleneck of compressing the entire source sentence into a single fixed vector and improving translation quality, especially for long sentences.
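The dynamic focusing described above can be sketched in a few lines. This is a minimal dot-product attention over toy vectors, not any particular framework's API; `query`, `keys`, and `values` are plain Python lists standing in for a decoder state and the encoder states.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Score each source position by the dot product of its key with
    # the decoder query, normalize to weights, and mix the values.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights
```

With a query that matches the second key, the second source position receives most of the weight and dominates the context vector, which is exactly the per-step focusing the answer describes.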
How is BLEU used to evaluate MT systems?
Answer: BLEU compares n-grams in the system translation to one or more reference translations, computing a precision-based score with a brevity penalty; a higher BLEU score roughly indicates a closer match to the human references.
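A rough sketch of how the score is computed, for a single sentence with no smoothing. Real evaluations should use a standard implementation such as sacreBLEU, since tokenization and smoothing details change the numbers.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        if not cand:
            return 0.0  # no smoothing: an empty n-gram order zeroes the score
        # clip each candidate n-gram count by its max count in any reference
        max_ref = Counter()
        for ref in references:
            for gram, count in ngram_counts(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
        precisions.append(clipped / sum(cand.values()))
    if min(precisions) == 0:
        return 0.0
    # brevity penalty against the reference length closest to the candidate's
    ref_len = min((abs(len(r) - len(candidate)), len(r)) for r in references)[1]
    bp = 1.0 if len(candidate) >= ref_len else math.exp(1 - ref_len / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A translation identical to its reference scores 1.0; a translation that is a prefix of the reference keeps perfect n-gram precision but is pulled below 1.0 by the brevity penalty.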
What are limitations of BLEU?
Answer: BLEU is sensitive to tokenization and reference choice, does not capture meaning directly, may undervalue valid paraphrases, and is less reliable at the sentence level than at the corpus level, so human evaluation remains important.
What is beam search in NMT decoding?
Answer: Beam search keeps the top-k partial translations at each decoding step, expanding them and pruning low-probability candidates, approximating the most likely output sequence under the model distribution.
What is exposure bias in sequence generation?
Answer: Exposure bias arises when models are trained with teacher forcing (seeing gold histories) but must generate based on their own predictions at test time, causing compounding errors; methods like scheduled sampling try to mitigate this.
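The core choice in scheduled sampling is easy to sketch: at each step, feed the gold token with probability epsilon (pure teacher forcing when epsilon is 1), otherwise feed the model's own previous prediction; epsilon is typically decayed over training. `model_predict` here is a hypothetical stand-in for the model's argmax output.

```python
import random

def scheduled_sampling_inputs(gold_tokens, model_predict, epsilon):
    # Build the decoder input sequence for one training example.
    inputs, prev = [], "<s>"
    for gold in gold_tokens:
        inputs.append(prev)
        # coin flip per step: gold history vs. the model's own prediction
        prev = gold if random.random() < epsilon else model_predict(prev)
    return inputs
```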
How do transformers improve MT compared to RNN-based models?
Answer: Transformers rely on self-attention instead of recurrence, enabling better parallelization, modeling of long-range dependencies and richer context mixing, which leads to state-of-the-art MT performance on many benchmarks.
What is subword tokenization and why is it useful for MT?
Answer: Techniques like BPE or SentencePiece split words into subword units, handling rare and compound words more effectively, reducing OOV issues and improving vocabulary efficiency in NMT systems.
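The merge-learning loop at the heart of BPE can be sketched directly. This is a toy version over a tiny word list; production systems use the subword-nmt or SentencePiece toolkits, which handle corpora, frequencies and edge cases properly.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    # Start from character sequences with an end-of-word marker.
    vocab = Counter()
    for w in words:
        vocab[tuple(w) + ("</w>",)] += 1
    merges = []
    for _ in range(num_merges):
        # count every adjacent symbol pair, weighted by word frequency
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # fuse the best pair everywhere it occurs
        new_vocab = Counter()
        for symbols, count in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += count
        vocab = new_vocab
    return merges
```

Because merges are learned from frequency, common stems and endings become single units while rare words decompose into smaller pieces, which is how OOV words stay representable.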
What is multilingual NMT?
Answer: Multilingual NMT trains a single model on multiple language pairs, often using language tags to indicate target language, enabling transfer learning across languages and even zero-shot translation between unseen pairs.
How does back-translation help low-resource MT?
Answer: Back-translation generates synthetic source sentences from target-side monolingual data using a reverse model, creating additional parallel data that boosts performance when genuine parallel data is scarce.
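The data flow is simple to sketch; `reverse_translate` below is a hypothetical target-to-source MT system, and the function just mixes the resulting synthetic pairs with the genuine parallel data.

```python
def back_translate_augment(mono_target, reverse_translate, real_pairs):
    # Each monolingual target sentence yields one synthetic pair:
    # (noisy machine-generated source, clean human-written target).
    synthetic = [(reverse_translate(t), t) for t in mono_target]
    return real_pairs + synthetic
```

The key asymmetry is that the target side of every synthetic pair is genuine human text, so the forward model still learns to produce fluent output even though the synthetic sources are noisy.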
What is word alignment in MT?
Answer: Word alignment links source words to corresponding target words in parallel sentences, useful for phrase extraction in SMT, analyzing NMT behavior and building lexicons or terminology constraints in hybrid systems.
Why is domain adaptation important in MT?
Answer: MT models trained on general data often underperform on specialized domains (legal, medical, technical); fine-tuning or adapting on in-domain parallel or monolingual data improves terminology and style fidelity.
What are common human evaluation methods for MT?
Answer: Human evaluation often rates adequacy and fluency on Likert scales or uses direct assessment and pairwise ranking, providing a more reliable view of translation quality than automatic metrics alone.
How do large language models influence MT today?
Answer: Large language models can perform translation via prompting or act as strong target-language priors in hybrid systems, but dedicated MT models with strong encoders and training data still play a major role in production.
What are some ethical considerations in MT deployment?
Answer: Issues include handling sensitive content, preserving privacy, avoiding harmful mistranslations in critical domains and managing biases or offensive outputs that arise from training data artifacts.
What is terminology control in MT?
Answer: Terminology control ensures specific domain terms are translated in consistent, predefined ways, using techniques like constrained decoding, dictionaries or post-editing to meet business or legal requirements.
How is MT used in industrial workflows?
Answer: MT supports localization pipelines, real-time chat translation, document translation services and human-in-the-loop post-editing, often integrated with translation memory and terminology management tools.
What is quality estimation (QE) in MT?
Answer: QE predicts translation quality without reference translations, flagging segments that need human review and helping allocate post-editing effort efficiently in large-scale MT deployments.
🔍 Machine translation concepts covered
This page covers machine translation: SMT and NMT basics, attention and transformers, subword modeling, BLEU and human evaluation, domain adaptation and industrial deployment considerations.