Shift-Reduce Parsing Tutorial Section

Shift-Reduce Parsing

Learn the mechanics of transition-based parsing algorithms used in state-of-the-art dependency parsers.

Transition-Based Parsing (Shift-Reduce)

How does an algorithm actually construct a Dependency tree in code? The most famous algorithm is the Shift-Reduce Parser. It reads words one by one and uses a stack data structure to draw grammatical arrows incrementally as it processes the sentence left-to-right.

This algorithm runs in highly efficient O(N) time complexity, the primary reason spaCy uses it for its lightning-fast dependency parser.

The Three Core Data Structures

  1. The Buffer: A queue containing all the words waiting to be read.
  2. The Stack: LIFO structure holding words currently being processed.
  3. The Output Memory: A list of the dependency linkages `(Head → Dependent)` mapped out so far.

The 3 Possible Actions

1. SHIFT

Move the first word from the Buffer onto the top of the Stack.

2. LEFT-ARC

Draw an arrow from the 2nd token on the stack to the 1st token on stack. Then Pop the 1st token.

3. RIGHT-ARC

Draw an arrow from the 1st token on the stack to the 2nd token on stack. Then Pop the 2nd token.

Example Walkthrough: "I saw John"
Step Action Taken Stack (Top is right) Buffer (Next is left) Link Created
0Initialize[ROOT]["I", "saw", "John"]None
1SHIFT[ROOT, "I"]["saw", "John"]None
2SHIFT[ROOT, "I", "saw"]["John"]None
3LEFT-ARC[ROOT, "saw"]["John"]saw → I (nsubj)
4SHIFT[ROOT, "saw", "John"][]None
5RIGHT-ARC[ROOT, "saw"][]saw → John (dobj)
6RIGHT-ARC[ROOT][]ROOT → saw (root)

Note: Originally, parsers used heuristic rules to decide which action to take. Modern parsers train a Neural Network classification model to look at the stack at any given frame and mathematically classify whether it should output Shift, Left, or Right!