Part-of-Speech Tagging
Discover Part-of-Speech (POS) tagging and how to extract linguistic structure and word grammar from text.
Part-of-Speech (POS) Tagging
Part-of-Speech (POS) tagging is the process of labeling each word in a text with its grammatical part of speech, based on both the word's definition and its context among adjacent words. It is a critical step in syntactic analysis.
POS Breakdown Example
| Word | Part of speech |
|---|---|
| The | Determiner |
| quick | Adjective |
| brown | Adjective |
| fox | Noun |
| jumps | Verb |
| over | Preposition |
| the | Determiner |
| lazy | Adjective |
| dog | Noun |
Why POS Tagging?
- Word Sense Disambiguation: Tells the difference between "I like to bear (Verb) weight" and "I saw a brown bear (Noun)".
- Lemmatization: The POS tag tells a lemmatizer which base form to choose. For example, "saw" reduces to "see" as a Verb but stays "saw" as a Noun.
- Named Entity Recognition: Proper Nouns (NNP) are crucial flags for identifying people, cities, and companies.
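The lemmatization point in the list above can be illustrated with a toy lookup table. This is a minimal sketch, not how a production lemmatizer works: the table `LEMMA_TABLE` and the function `lemmatize` are hypothetical names, and the entries are hand-picked for illustration.

```python
# Toy lookup table keyed on (lowercased word, coarse POS tag).
# The entries are hand-picked for illustration only.
LEMMA_TABLE = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
}


def lemmatize(word: str, pos: str) -> str:
    """Return the lemma for (word, pos); fall back to the lowercased word."""
    return LEMMA_TABLE.get((word.lower(), pos), word.lower())


print(lemmatize("saw", "VERB"))     # see
print(lemmatize("saw", "NOUN"))     # saw
print(lemmatize("leaves", "NOUN"))  # leaf
```

Without the POS tag, the lookup key would be ambiguous: the same surface form maps to different lemmas depending on its grammatical role.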
Penn Treebank POS Tags
Many English POS taggers are trained on corpora annotated with the Penn Treebank tagset.
| Tag | Description | Examples |
|---|---|---|
| NN | Noun, singular or mass | dog, truth, time |
| NNP | Proper noun, singular | London, John |
| VB | Verb, base form | run, eat, be |
| VBD | Verb, past tense | ran, ate, was |
| JJ | Adjective | happy, large |
| RB | Adverb | quickly, very |
| IN | Preposition or subordinating conjunction | in, of, under |
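Fine-grained Penn Treebank tags can be collapsed into coarser categories such as the Universal POS tags (NOUN, VERB, ADJ, and so on). The sketch below covers only the tags in the table above; the names `PTB_TO_UPOS` and `to_universal` are hypothetical, and mapping IN to ADP is a simplification, since IN also covers subordinating conjunctions.

```python
# Partial mapping from Penn Treebank tags to Universal POS tags.
# Only the tags from the table above are covered.
PTB_TO_UPOS = {
    "NN": "NOUN",
    "NNP": "PROPN",
    "VB": "VERB",
    "VBD": "VERB",
    "JJ": "ADJ",
    "RB": "ADV",
    "IN": "ADP",  # simplification: IN also covers subordinating conjunctions
}


def to_universal(ptb_tag: str) -> str:
    """Map a Penn Treebank tag to a Universal POS tag; unknown tags become 'X'."""
    return PTB_TO_UPOS.get(ptb_tag, "X")


tagged = [("John", "NNP"), ("ran", "VBD"), ("quickly", "RB")]
print([(word, to_universal(tag)) for word, tag in tagged])
# [('John', 'PROPN'), ('ran', 'VERB'), ('quickly', 'ADV')]
```

Note that VB and VBD both collapse to VERB: the coarse tag keeps the word class but drops the tense distinction.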
Implementation with spaCy
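spaCy is highly optimized and widely used in industry for POS tagging and dependency parsing. Each token exposes both a coarse-grained universal tag (token.pos_) and a fine-grained tag (token.tag_), which for English models follows a Penn Treebank-style scheme.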
POS Tagging with spaCy
import spacy
# Load the small English language model
nlp = spacy.load("en_core_web_sm")
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)
# Print each token's text, coarse-grained POS tag, and a readable explanation
print(f"{'Text':<12} | {'POS':<8} | {'Detailed POS Description'}")
print("-" * 50)
for token in doc:
    print(f"{token.text:<12} | {token.pos_:<8} | {spacy.explain(token.pos_)}")
# Output sample (exact tags can vary with the model version):
# Apple        | PROPN    | proper noun
# is           | AUX      | auxiliary
# looking      | VERB     | verb
# at           | ADP      | adposition
# buying       | VERB     | verb
# U.K.         | PROPN    | proper noun
# startup      | NOUN     | noun
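The example above prints only the coarse-grained tag. A minimal sketch of reading the fine-grained tag (token.tag_) alongside it follows. So that it runs without a downloaded model, it uses hand-written stand-in tokens built with `SimpleNamespace`; the fine-grained tag values are illustrative assumptions, not captured model output, and the helper names `tag_rows` and `pos_counts` are hypothetical. With spaCy installed, a `Doc` returned by `nlp(text)` can be passed in place of `stand_in_doc`.

```python
from collections import Counter
from types import SimpleNamespace


def tag_rows(tokens):
    """Return (text, coarse POS, fine-grained tag) triples for token-like objects."""
    return [(t.text, t.pos_, t.tag_) for t in tokens]


def pos_counts(tokens):
    """Count how often each coarse-grained POS tag occurs."""
    return Counter(t.pos_ for t in tokens)


# Hand-written stand-ins mimicking the attributes of spaCy tokens (assumed values).
stand_in_doc = [
    SimpleNamespace(text="Apple", pos_="PROPN", tag_="NNP"),
    SimpleNamespace(text="is", pos_="AUX", tag_="VBZ"),
    SimpleNamespace(text="looking", pos_="VERB", tag_="VBG"),
    SimpleNamespace(text="at", pos_="ADP", tag_="IN"),
    SimpleNamespace(text="buying", pos_="VERB", tag_="VBG"),
]

for text, pos, tag in tag_rows(stand_in_doc):
    print(f"{text:<12} | {pos:<8} | {tag}")

print(pos_counts(stand_in_doc).most_common(1))  # [('VERB', 2)]
```

Keeping the helpers independent of spaCy types means they accept any object with text, pos_, and tag_ attributes, which also makes them easy to test.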