NLP Tutorial

Translation, Summarization & QA

Machine translation, text summarization, and question answering systems.

Machine Translation

The Quest for Universal Translation

Machine Translation (MT) is the use of software to translate text or speech from one language to another. It has evolved through four distinct generations.

  1. 1950s: RBMT (rule-based). Word-for-word dictionaries and complex hand-written grammar rules.
  2. 1990s: SMT (statistical). Translation probabilities and patterns learned from large parallel corpora.
  3. 2014: NMT (deep learning). Seq2Seq models, soon combined with attention mechanisms.
  4. 2017+: LLMs (generative). Transformer-based models with strong zero-shot translation ability.

The Neural Paradigm — Seq2Seq

Neural Machine Translation uses an Encoder-Decoder architecture. The Encoder converts the source sentence into a numerical representation (in early models, a single fixed-length "thought vector"), and the Decoder generates the target-language sentence one token at a time.

The "Encoder-Decoder" Loop
  1. Input: "Je t'aime" (French)
  2. Encoder: Transforms text into an abstract numerical representation.
  3. Attention: Focuses on specific relevant words (e.g., "Je" connects to "I").
  4. Decoder: Sequentially predicts "I", then "love", then "you".
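
The attention step in the loop above can be sketched in a few lines of plain Python. This is a toy illustration rather than how a trained model is implemented: the 2-dimensional vectors are hand-picked values, and `softmax` and `attention_weights` are helper names introduced here. The sketch computes a dot-product score between one decoder state and each encoder state, then normalizes the scores into weights.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Dot-product attention: score each encoder state against the query."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

# Hand-picked toy encoder states, one per source token (assumed values).
source_tokens = ["Je", "t'", "aime"]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Toy decoder state while producing "I"; it points the same way as "Je".
decoder_state = [2.0, 0.0]

weights = attention_weights(decoder_state, encoder_states)
for token, weight in zip(source_tokens, weights):
    print(f"{token:>5}: {weight:.2f}")
```

With these values the largest weight falls on "Je", which is the sense in which "Je" connects to "I". In a real model the vectors are learned and have hundreds of dimensions.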

Level 1 — Translating with Transformers

A common way to translate text today is to use a pre-trained Transformer model such as T5 or MarianMT from the Hugging Face ecosystem.

Python: MarianMT Workflow
from transformers import pipeline

# Load an English-to-German MarianMT model
# (requires the transformers, torch, and sentencepiece packages)
translator = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")

texts = [
    "Machine learning is the future of translation.",
    "Artificial intelligence can help us communicate better.",
    "The weather in London is usually very rainy."
]

for text in texts:
    translated = translator(text)[0]['translation_text']
    print(f"EN: {text}")
    print(f"DE: {translated}\n")

Evaluating Performance — BLEU Score

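Classification tasks are usually scored with accuracy, but a sentence can have many valid translations, so translation quality is commonly measured with BLEU (Bilingual Evaluation Understudy). BLEU computes the clipped precision of n-grams (typically 1- to 4-grams) in the machine output against one or more human reference translations, and applies a brevity penalty to outputs that are too short. Scores are usually reported on a 0-100 scale.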
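
To make the n-gram matching concrete, here is a minimal sentence-level BLEU sketch in plain Python. It assumes whitespace tokenization, a single reference, and no smoothing, so it is a simplification of what evaluation toolkits do. The `ngrams` and `bleu` helpers and the example sentences are introduced here for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU on a 0-100 scale against a single reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped matches: an n-gram is credited at most as often
        # as it appears in the reference.
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precision_sum += math.log(matches / total)
    # Brevity penalty: punish candidates shorter than the reference.
    if len(cand) > len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * penalty * math.exp(log_precision_sum / max_n)

reference = "the cat is sitting on the mat"
candidate = "the cat is sitting on a mat"
score = bleu(candidate, reference)
print(f"BLEU: {score:.1f}")
```

BLEU was designed as a corpus-level metric, so a score for one short sentence like this is noisy and should not be read as a measure of system quality.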

BLEU Score    Interpretation / Quality Level
< 10          Almost useless or garbled output
10 - 29       The gist is clear, but with heavy grammatical errors
30 - 50       High quality; highly understandable by humans
50 - 60       Very high quality; adequate and fluent
> 60          Superior quality, often better than human translation

Text Summarization

The Art of Compression

Text Summarization is the task of shortening a document into a summary that retains its most important points. There are two primary types:

Extractive

Original sentences are ranked by importance, and the top ones are selected verbatim. (Think: Highlighting key parts of a book)

Abstractive

The machine understands the context and generates entirely new sentences to convey the message. (Think: Paraphrasing in its own words)

Level 1 — Extractive Summarization (NLTK)

Simple extractive summarization works by calculating word frequencies and scoring sentences based on those frequencies.

Python: Frequency-based Summarizer (NLTK)
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize

# One-time downloads: tokenizer models and the stopword list
# ("punkt_tab" is needed by newer NLTK releases)
for resource in ("punkt", "punkt_tab", "stopwords"):
    nltk.download(resource, quiet=True)

text = """Artificial intelligence is a branch of computer science that deals with creating smart machines.
It has become a central part of the technology industry.
The research associated with artificial intelligence is highly technical and specialized.
The core problems of artificial intelligence include programming computers for certain traits such as knowledge and reasoning."""

# 1. Word frequency table (skip stopwords and punctuation)
stop_words = set(stopwords.words("english"))
freq_table = {}
for word in word_tokenize(text.lower()):
    if word.isalpha() and word not in stop_words:
        freq_table[word] = freq_table.get(word, 0) + 1

# 2. Score each sentence by summing the frequencies of its own tokens
#    (token matching avoids substring hits such as "art" in "part")
sentences = sent_tokenize(text)
sentence_scores = {}
for sentence in sentences:
    tokens = word_tokenize(sentence.lower())
    sentence_scores[sentence] = sum(freq_table.get(tok, 0) for tok in tokens)

# 3. Keep the two top-scoring sentences, in their original order
top = sorted(sentences, key=sentence_scores.get, reverse=True)[:2]
summary = [s for s in sentences if s in top]
print("Summary:\n", " ".join(summary))

Level 2 — Abstractive Summarization (Hugging Face)

Modern abstractive summarization uses pre-trained sequence-to-sequence Transformer models such as BART or T5, which rephrase the source content fluently instead of copying sentences.

Python: BART Transformer
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

ARTICLE = """The Apollo 11 mission was the first manned mission to land on the Moon. 
Launched by NASA on July 16, 1969, it carried Commander Neil Armstrong and lunar module pilot Buzz Aldrin. 
Armstrong became the first person to step onto the lunar surface on July 21, 1969. 
The event was broadcast on live TV to a worldwide audience and is considered a milestone in human history."""

summary = summarizer(ARTICLE, max_length=50, min_length=20, do_sample=False)
print(summary[0]['summary_text'])

Question Answering

Asking the Machines

Question Answering (QA) is an area of AI and NLP concerned with building systems that automatically answer questions posed in natural language.

Extractive QA

The model identifies the exact span of text in the document that contains the answer. (Think: Pointing to the answer in a book)

Generative QA / RAG

The model generates a free-form answer based on retrieved documents or internal knowledge. (Think: Formulating its own sentence)

Level 1 — Extractive QA (Hugging Face)

Using models like BERT or RoBERTa fine-tuned on a QA dataset such as SQuAD, we pass in a context and a question and get back the answer span, together with its start/end character offsets in the context and a confidence score.

Python: BERT QA Pipeline
from transformers import pipeline

qa_model = pipeline("question-answering", model="deepset/bert-base-cased-squad2")

context = "The Nile is the longest river in the world, stretching over 6,650 kilometers through northeastern Africa."
question = "How long is the Nile river?"

result = qa_model(question=question, context=context)
print(f"Answer: {result['answer']}")
print(f"Span: characters {result['start']} to {result['end']}")
print(f"Confidence: {round(result['score'], 4)}")
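
Under the hood, an extractive QA model scores every token as a possible start and as a possible end of the answer, and the valid span with the best combined score is returned. The sketch below shows only that selection step. The scores are made-up numbers rather than real model output, and `best_span` is a helper name introduced here.

```python
def best_span(start_scores, end_scores, max_answer_len=5):
    """Return the (start, end) token indices with the highest combined score."""
    best, best_score = (0, 0), float("-inf")
    for start, start_score in enumerate(start_scores):
        # The end may not precede the start, and the span length is capped.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best, best_score = (start, end), score
    return best

tokens = ["The", "Nile", "stretches", "over", "6,650", "kilometers", "through", "Africa"]

# Hypothetical per-token scores; a real model would produce these as logits.
start_scores = [0.1, 0.3, 0.0, 1.2, 2.5, 0.4, 0.0, 0.2]
end_scores = [0.0, 0.2, 0.1, 0.1, 0.9, 2.8, 0.3, 0.5]

start, end = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])
print(answer)  # 6,650 kilometers
```

Capping the span length and requiring the end to follow the start keeps the search from returning implausible answers such as an entire paragraph.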

Level 2 — Generative QA & RAG (OpenAI)

Instead of just pointing to a span, Retrieval-Augmented Generation (RAG) first retrieves relevant documents and then has an LLM formulate a natural-language answer grounded in them.

Python: Generative QA Prompt
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

# Simplified RAG workflow: the retrieval step is stubbed out with a fixed passage
context_docs = "Solar panels convert sunlight into electrical energy using photovoltaic cells."
user_question = "How do solar panels work?"

prompt_template = f"""
Based strictly on the provided context, answer the user's question clearly.

Context: {context_docs}
Question: {user_question}
Answer:
"""

# Send the grounded prompt to the chat model
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt_template}]
)

print(response.choices[0].message.content)
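
The example above supplies its context by hand. In a full RAG pipeline, a retrieval step selects that context first. Here is a minimal sketch of that step, assuming a tiny in-memory document list and plain word overlap as the relevance score. Production systems typically use embedding similarity or BM25 instead. The `tokenize`, `retrieve`, and `build_prompt` helpers and the sample documents are illustrative and not part of any library.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many distinct words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context_docs):
    """Assemble a grounded prompt from the retrieved passages."""
    context = "\n".join(context_docs)
    return (
        "Based strictly on the provided context, "
        "answer the user's question clearly.\n\n"
        f"Context: {context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical knowledge base
documents = [
    "Wind turbines convert the kinetic energy of wind into electricity.",
    "Solar panels convert sunlight into electrical energy using photovoltaic cells.",
    "Hydroelectric dams generate power from flowing water.",
]

question = "How do solar panels work?"
top_docs = retrieve(question, documents)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The resulting prompt string can be sent to the chat model as the user message, exactly as in the example above.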