Part-of-Speech Tagging Tutorial Section

Part-of-Speech Tagging

Discover Part-of-Speech (POS) tagging and how to extract linguistic structure and word grammar from text.

Part-of-Speech (POS) tagging is the process of labeling each word in a text with its grammatical part of speech, based on both the word's definition and its context, that is, the words around it. It is a key step in syntactic analysis.

POS Breakdown Example
Word    Part of Speech
The     Determiner
quick   Adjective
brown   Adjective
fox     Noun
jumps   Verb
over    Preposition
the     Determiner
lazy    Adjective
dog     Noun
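
The breakdown above can be represented in code as a list of (word, tag) pairs. The following is only a minimal sketch: the `LEXICON` dictionary and the `lookup_tag` function are hypothetical names made up for this illustration, and a plain dictionary lookup ignores context, which a real tagger would not.

```python
# Hypothetical toy lexicon covering only the words of the example sentence.
LEXICON = {
    "the": "Determiner",
    "quick": "Adjective",
    "brown": "Adjective",
    "fox": "Noun",
    "jumps": "Verb",
    "over": "Preposition",
    "lazy": "Adjective",
    "dog": "Noun",
}


def lookup_tag(sentence):
    """Tag each whitespace-separated word by case-insensitive dictionary lookup."""
    return [(word, LEXICON.get(word.lower(), "Unknown")) for word in sentence.split()]


tagged = lookup_tag("The quick brown fox jumps over the lazy dog")
for word, tag in tagged:
    # Print one aligned "word  tag" row per token.
    print(f"{word:<6} {tag}")
```

Words missing from the lexicon come back as "Unknown", which is one reason practical taggers rely on statistical models rather than fixed word lists.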

Why POS Tagging?

  • Word Sense Disambiguation: Tells the difference between "I like to bear (Verb) weight" and "I saw a brown bear (Noun)".
  • Lemmatization: Lemmatizers use the POS tag to choose the correct base form. For example, "saw" reduces to "see" as a verb but stays "saw" as a noun.
  • Named Entity Recognition: Proper Nouns (NNP) are crucial flags for identifying people, cities, and companies.
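
To make the first two points concrete, here is a rough sketch of how context can decide a tag and how that tag can then drive lemmatization. Everything in it is an assumption for illustration: the word lists, the `guess_tag` rule (look only at the previous word), and the `LEMMAS` table are hypothetical and far simpler than what a trained tagger does.

```python
# Hypothetical toy word lists used as context clues.
DETERMINERS = {"a", "an", "the"}
ADJECTIVES = {"brown", "lazy", "quick"}


def guess_tag(tokens, index):
    """Guess NOUN or VERB for tokens[index] from the previous word only."""
    prev = tokens[index - 1].lower() if index > 0 else ""
    if prev == "to":
        return "VERB"  # "to bear" reads as an infinitive
    if prev in DETERMINERS or prev in ADJECTIVES:
        return "NOUN"  # "a brown bear" reads as a noun phrase
    return "UNKNOWN"


# Hypothetical lemma table keyed by (word, tag).
LEMMAS = {("saw", "VERB"): "see", ("saw", "NOUN"): "saw"}


def lemmatize(word, tag):
    """Return the lemma for (word, tag), falling back to the lowercased word."""
    return LEMMAS.get((word.lower(), tag), word.lower())


verb_sentence = "I like to bear weight".split()
noun_sentence = "I saw a brown bear".split()

print(guess_tag(verb_sentence, verb_sentence.index("bear")))  # VERB
print(guess_tag(noun_sentence, noun_sentence.index("bear")))  # NOUN
print(lemmatize("saw", "VERB"), lemmatize("saw", "NOUN"))     # see saw
```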

Penn Treebank POS Tags

Many English POS taggers are trained on data annotated with the Penn Treebank tagset.

Tag   Description        Examples
NN    Noun, singular     dog, truth, time
NNP   Proper noun        London, John
VB    Verb, base form    run, eat, be
VBD   Verb, past tense   ran, ate, was
JJ    Adjective          happy, large
RB    Adverb             quickly, very
IN    Preposition        in, of, under
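
Fine-grained tags like these are often collapsed into broader categories. The sketch below shows one way this might look for the seven tags in the table. The `PENN_TO_COARSE` mapping and the `coarse_tag` helper are hypothetical names, and the mapping is deliberately simplified: for instance, IN also covers subordinating conjunctions, which this sketch ignores.

```python
# Simplified, hypothetical mapping from the Penn Treebank tags listed above
# to coarse categories. It covers only the tags shown in the table.
PENN_TO_COARSE = {
    "NN": "NOUN",
    "NNP": "PROPN",
    "VB": "VERB",
    "VBD": "VERB",
    "JJ": "ADJ",
    "RB": "ADV",
    "IN": "ADP",
}


def coarse_tag(penn_tag):
    """Return the coarse category for a Penn tag, or "X" if it is not in the table."""
    return PENN_TO_COARSE.get(penn_tag, "X")


for tag in ["NN", "VBD", "JJ", "ZZ"]:
    print(f"{tag:<4} -> {coarse_tag(tag)}")
```

Note that VB and VBD both collapse to VERB: the coarse category drops the tense information that the fine-grained tag carries.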

Implementation with spaCy

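spaCy is highly optimized and widely used in industry for POS tagging and dependency parsing. Each token exposes both a coarse-grained universal POS tag (token.pos_) and a fine-grained tag (token.tag_), which for English models follows a Penn Treebank-style scheme.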

POS Tagging with spaCy
import spacy

# Load the small English language model
nlp = spacy.load("en_core_web_sm")

text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Token text, coarse POS tag, and a plain-language explanation of that tag
print(f"{'Text':<12} | {'POS':<8} | {'Description'}")
print("-" * 50)

for token in doc:
    print(f"{token.text:<12} | {token.pos_:<8} | {spacy.explain(token.pos_)}")

# Sample output (truncated; exact tags can vary with the model version):
# Apple        | PROPN    | proper noun
# is           | AUX      | auxiliary
# looking      | VERB     | verb
# at           | ADP      | adposition
# buying       | VERB     | verb
# U.K.         | PROPN    | proper noun
# startup      | NOUN     | noun
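
The example above prints only the coarse tag. To see the coarse and fine-grained tags side by side, something like the following sketch could be used. The `tag_rows` helper is a hypothetical name, and the stand-in tokens are written by hand so the snippet runs without a model installed; their tag values are illustrative, not real model output. The attribute names `text`, `pos_`, and `tag_` are the ones spaCy tokens provide.

```python
from types import SimpleNamespace


def tag_rows(tokens):
    """Build (text, coarse POS, fine-grained tag) rows from token-like objects."""
    return [(t.text, t.pos_, t.tag_) for t in tokens]


# Hand-written stand-ins for spaCy tokens (illustrative values, not model output).
sample = [
    SimpleNamespace(text="Apple", pos_="PROPN", tag_="NNP"),
    SimpleNamespace(text="looking", pos_="VERB", tag_="VBG"),
]

for text, pos, tag in tag_rows(sample):
    print(f"{text:<12} | {pos:<8} | {tag}")
```

With a pipeline loaded as in the example above, passing `nlp(text)` to `tag_rows` should work the same way, since iterating over a spaCy `Doc` yields tokens with these attributes.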