Word Sense Disambiguation

Understand Word Sense Disambiguation (WSD) and how algorithms determine which definition of a polysemous word applies in context.

Word Sense Disambiguation (WSD)

One of the classic hard problems in computational linguistics is polysemy: the capacity of a single word to carry multiple distinct meanings (senses). Word Sense Disambiguation (WSD) is the task of determining which sense of a word is being used in a given sentence.

The Ambiguity Framework

Target Word: Bass

Sense 1: Music Context

"He played the bass guitar."

Sense 2: Animal Context

"I caught a massive bass while fishing."

The Lesk Algorithm (Knowledge-Based WSD)

Introduced by Michael Lesk in 1986, this is the classic dictionary-based algorithm for WSD. Modern variants typically draw their definitions (glosses) from lexical databases such as WordNet. The core idea is elegant: count the overlapping words between the context of the sentence and the dictionary definition of each sense, and pick the sense with the highest overlap.

How Lesk Calculates

Given the sentence: "We had grilled pine cones for dessert."

  1. Fetch Dictionary Definitions for "pine":
    • Sense A: "A kind of evergreen tree with needle-shaped leaves and cones."
    • Sense B: "To waste away through sorrow or illness."
  2. Define the Sentence Context: Context = {"grilled", "cones", "dessert"}
  3. Calculate Intersection Overlap: Compare Context to Definitions. Sense A overlaps on the word "cones" (Score=1). Sense B has no overlap (Score=0). The algorithm correctly assigns Sense A!
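The steps above can be sketched as a minimal "simplified Lesk" in Python. The glosses and stopword list here are illustrative stand-ins, not pulled from a real dictionary or WordNet:

```python
import re

# Small illustrative stopword list (a real implementation would use a fuller one).
STOPWORDS = {"a", "an", "the", "of", "to", "we", "had", "for", "or",
             "with", "through", "kind", "and"}

def tokenize(text):
    """Lowercase, split on non-letters, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def simplified_lesk(context_sentence, glosses):
    """Return (best_sense, score): the sense whose gloss overlaps the context most."""
    context = tokenize(context_sentence)
    best_sense, best_score = None, -1
    for sense, gloss in glosses.items():
        score = len(context & tokenize(gloss))  # intersection overlap
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense, best_score

# Illustrative glosses for the two senses of "pine":
glosses = {
    "pine_tree":  "a kind of evergreen tree with needle-shaped leaves and cones",
    "pine_yearn": "to waste away through sorrow or illness",
}

sense, score = simplified_lesk("We had grilled pine cones for dessert.", glosses)
print(sense, score)  # pine_tree 1  ("cones" is the single overlapping word)
```

Note that the context set here is {"grilled", "pine", "cones", "dessert"}, matching the walkthrough: Sense A scores 1 on "cones" and Sense B scores 0.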

Modern NLP WSD

The Lesk Algorithm struggles in practice because dictionary glosses are short, so overlap scores tend to be very low (often zero for every sense), leaving the algorithm no basis for a decision.

Deep contextualized embeddings (such as ELMo and BERT) dramatically improved WSD. Because they generate a distinct vector for each occurrence of a word, the embedding of "bass" in a fishing context lands far from the embedding of "bass" in a guitar context. A simple k-nearest-neighbors classifier over labeled sense embeddings is then enough to assign a sense to a new occurrence.
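A minimal sketch of the nearest-neighbor step, assuming the encoder has already run: the hand-made 3-dimensional vectors below are toy stand-ins for real BERT/ELMo embeddings, which would normally be hundreds of dimensions and produced by a pretrained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Labeled sense inventory: (stand-in) contextual embeddings of "bass"
# taken from sentences whose sense is known.
sense_examples = [
    ([0.9, 0.1, 0.0], "bass_music"),  # e.g. "He played the bass guitar."
    ([0.8, 0.2, 0.1], "bass_music"),
    ([0.1, 0.9, 0.3], "bass_fish"),   # e.g. "I caught a massive bass while fishing."
    ([0.0, 0.8, 0.4], "bass_fish"),
]

def disambiguate(query_vec):
    """1-nearest-neighbor: return the sense label of the most similar example."""
    return max(sense_examples, key=lambda ex: cosine(query_vec, ex[0]))[1]

# A new occurrence of "bass" whose (stand-in) embedding sits near the fishing cluster:
print(disambiguate([0.05, 0.85, 0.35]))  # bass_fish
```

In a real pipeline, each vector would be the hidden state for the token "bass" extracted from a contextual encoder; the 1-NN (or k-NN) lookup itself is unchanged.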