Flair
Powerful NLP framework for sequence labeling using contextual string embeddings.
Developed by Zalando Research, Flair is a PyTorch-based NLP library that became popular for introducing "Contextual String Embeddings" (now commonly called Flair embeddings).
Why Character-Level Matters
A standard Word2Vec model stores one fixed vector per vocabulary word, so "President" maps to a single vector. But what happens if a user makes a typo and types "Presdient"? The word is "Out Of Vocabulary" (OOV): the model has no vector for that sequence of letters, so the lookup fails or falls back to a generic unknown-word vector, depending on the implementation.
Flair instead treats text as a sequence of characters and passes them through a character-level LSTM language model. Because a word's embedding is computed from its own characters and the characters around it, Flair can still produce a useful representation for misspelled words, slang, or novel usernames, and the same word gets a different vector in different contexts.
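The contrast above can be sketched in plain Python. This is a toy illustration, not Flair's implementation: `word_vectors` is a hypothetical lookup table standing in for Word2Vec, and `embed_word` runs an untrained, forward-only recurrent network over characters, whereas Flair uses trained forward and backward LSTM language models. All names, sizes, and weights here are assumptions made for the example.

```python
import math
import random

HIDDEN = 8  # toy hidden size (assumption; real models are far larger)

# Hypothetical word-level lookup table standing in for Word2Vec
word_vectors = {"President": [0.1, 0.3, -0.2]}

# Untrained recurrent weights with a fixed seed, for illustration only
_rng = random.Random(0)
W_H = [[_rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(HIDDEN)]


def char_input(ch):
    """Deterministic pseudo-random input vector for one character."""
    r = random.Random(ord(ch))
    return [r.uniform(-1.0, 1.0) for _ in range(HIDDEN)]


def step(hidden, ch):
    """One recurrent update: mix the previous state with the new character."""
    x = char_input(ch)
    return [
        math.tanh(x[i] + sum(W_H[i][j] * hidden[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]


def embed_word(sentence, word):
    """Hidden state after reading the sentence up to the last character of `word`."""
    end = sentence.index(word) + len(word)
    hidden = [0.0] * HIDDEN
    for ch in sentence[:end]:
        hidden = step(hidden, ch)
    return hidden


# Word-level lookup: the typo has no entry
try:
    word_vectors["Presdient"]
except KeyError:
    print("word lookup: 'Presdient' is out of vocabulary")

# Character-level: any string gets a vector
typo_vec = embed_word("The Presdient spoke.", "Presdient")
print("char-level vector length:", len(typo_vec))

# The vector depends on the preceding characters, not just the word itself
print(embed_word("river bank", "bank") != embed_word("savings bank", "bank"))
```

Because the weights are random, these vectors carry no real meaning. The sketch only shows the mechanism: a character-level model never needs a vocabulary entry, and the state it reaches at the end of a word reflects what came before it.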
Level 1 — Named Entities via Flair
from flair.data import Sentence
from flair.models import SequenceTagger

# Load a pretrained NER model (downloaded on first run).
# 'ner-fast' is a smaller model that runs well on CPU;
# 'ner-large' is more accurate but much heavier and best run on a GPU.
tagger = SequenceTagger.load('ner-fast')

# Wrap the input string in Flair's Sentence object
sentence = Sentence("George Washington travelled to New York.")

# Run the tagger; predictions are attached to the sentence in place
tagger.predict(sentence)

# Iterate over the predicted entity spans
for entity in sentence.get_spans('ner'):
    label = entity.get_label('ner')
    print(f"Text: {entity.text:<20} | Label: {label.value:<5} | Confidence: {label.score:.4f}")

# Example output (exact scores vary by model version):
# Text: George Washington    | Label: PER   | Confidence: 0.9997
# Text: New York             | Label: LOC   | Confidence: 0.9996
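A common next step is to keep only confident predictions and group them by label. The sketch below works on plain `(text, label, score)` tuples, so it runs without Flair installed. `group_entities` and the 0.9 threshold are hypothetical choices for this example, not part of Flair's API.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.9):
    """Group entity texts by label, dropping predictions below min_score."""
    grouped = defaultdict(list)
    for text, label, score in entities:
        if score >= min_score:
            grouped[label].append(text)
    return dict(grouped)


# Tuples mirroring the example output, plus one made-up low-confidence span
predictions = [
    ("George Washington", "PER", 0.9997),
    ("New York", "LOC", 0.9996),
    ("Example Corp", "ORG", 0.42),  # invented for illustration
]

print(group_entities(predictions))
```

With a real tagger, the tuples could be built from the spans, for instance `(e.text, e.get_label('ner').value, e.get_label('ner').score)` for each `e` in `sentence.get_spans('ner')`.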