Named entity recognition – short Q&A
20 questions and answers on NER, covering entity types, tagging schemes, classical and neural models and evaluation metrics.
What is named entity recognition (NER)?
Answer: NER is the task of finding spans of text that refer to real-world entities and assigning them types such as person, location, organization, date or other predefined categories.
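As a concrete illustration (the sentence and spans below are invented for this sketch), NER output can be represented as character-offset spans over the input text:

```python
# Hypothetical NER output for a sample sentence: each entity is a
# (start_char, end_char_exclusive, type) span over the original text.
text = "Barack Obama visited Paris in July 2009."
entities = [
    (0, 12, "PERSON"),    # "Barack Obama"
    (21, 26, "LOCATION"), # "Paris"
    (30, 39, "DATE"),     # "July 2009"
]
for start, end, etype in entities:
    print(text[start:end], "->", etype)
```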
What are common entity types in standard NER datasets?
Answer: Typical coarse types include PERSON, LOCATION, ORGANIZATION and MISC, with some corpora adding finer-grained types such as DATE, TIME, MONEY, GPE, NORP and more.
What is the BIO (or IOB) tagging scheme?
Answer: BIO tags label each token as B-TYPE (beginning of an entity of that type), I-TYPE (inside an entity) or O (outside any entity), so multi-token entity spans can be reconstructed from the per-token tag sequence.
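The reconstruction step can be sketched as a small decoder; this is a minimal illustrative implementation, not taken from any particular library:

```python
def bio_to_spans(tags):
    """Decode a BIO tag sequence into (start, end_exclusive, type) spans."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:          # close any open entity
                spans.append((start, i, etype))
            start, etype = i, tag[2:]      # open a new entity
        elif tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue                       # current entity continues
        else:                              # "O" or an inconsistent I- tag
            if start is not None:
                spans.append((start, i, etype))
            start, etype = None, None
    if start is not None:                  # entity running to end of sentence
        spans.append((start, len(tags), etype))
    return spans

tags = ["B-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
print(bio_to_spans(tags))  # [(0, 1, 'PER'), (3, 6, 'LOC')]
```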
Why is NER often modeled as a sequence labeling problem?
Answer: Whether a token is part of an entity depends on neighboring tokens and entity boundaries; sequence models like CRFs or BiLSTM-CRF capture dependencies between adjacent BIO tags across the sentence.
How do CRFs help NER compared to token-independent classifiers?
Answer: CRFs enforce global consistency across tags (e.g. preventing I-PER from following O directly) and exploit correlations between neighboring labels, reducing invalid or fragmented entity predictions.
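In a CRF these constraints are typically realized as transition scores (e.g. setting invalid transitions to a very large negative value before Viterbi decoding). The rule itself can be sketched as a standalone check:

```python
def valid_bio_transition(prev_tag, tag):
    """Return True if `tag` may follow `prev_tag` under BIO constraints.
    An I-X tag is only valid directly after B-X or I-X of the same type."""
    if tag.startswith("I-"):
        etype = tag[2:]
        return prev_tag in (f"B-{etype}", f"I-{etype}")
    return True  # "O" and any "B-" tag may follow any tag

print(valid_bio_transition("O", "I-PER"))      # False: I-PER cannot follow O
print(valid_bio_transition("B-PER", "I-PER"))  # True
print(valid_bio_transition("B-LOC", "I-PER"))  # False: type mismatch
```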
What features were used in traditional (non-neural) NER systems?
Answer: Classic systems used lexical features (word identity), orthographic patterns (capitalization, digits), affixes, gazetteers, POS tags and surrounding context words as input to CRF or MaxEnt models.
How do neural NER models differ from feature-engineered ones?
Answer: Neural models learn features automatically from embeddings and contextual encoders (BiLSTMs or transformers), often with a CRF layer on top, reducing manual feature design and improving performance.
What is span-based NER?
Answer: Span-based NER treats candidate spans (start, end indices) as units to classify into entity types or non-entities, rather than predicting token-level BIO tags, which can better handle nested or overlapping entities.
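The candidate-generation step is usually a bounded enumeration of token spans; a minimal sketch (the `max_len` cap is a common efficiency assumption, not a fixed standard):

```python
def enumerate_spans(tokens, max_len=4):
    """Enumerate candidate (start, end_exclusive) spans up to max_len tokens;
    a span classifier then scores each against entity types or 'no entity'."""
    return [(i, j)
            for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]

# Note that (0, 3) "Bank of America" and the nested (2, 3) "America" are
# both candidates, which is how span-based models handle nested entities.
print(enumerate_spans(["Bank", "of", "America"]))
```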
How is NER evaluated?
Answer: NER is usually evaluated with precision, recall and F1 based on correctly predicted entity spans and types; partial overlaps or type mismatches are counted as errors in strict evaluation.
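Strict entity-level scoring can be sketched in a few lines; spans here are (start, end, type) triples, so a correct boundary with the wrong type still counts as an error:

```python
def span_f1(gold, pred):
    """Strict entity-level precision/recall/F1: a predicted entity counts
    only if both its span boundaries and its type match a gold entity."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                       # exact matches only
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = [(0, 2, "PER"), (5, 6, "LOC")]
pred = [(0, 2, "PER"), (5, 6, "ORG")]  # right span, wrong type -> an error
print(span_f1(gold, pred))  # (0.5, 0.5, 0.5)
```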
Why is domain adaptation important for NER?
Answer: Entity names, formats and contexts vary across domains (news vs. biomedical vs. social media), so models trained on one domain may miss or mislabel entities in another without adaptation or fine-tuning.
What is nested NER?
Answer: Nested NER deals with entities that contain other entities (e.g. "[University of [California]]" with ORG containing LOC), requiring models and evaluation that support multiple levels of entity structure.
How do pre-trained language models like BERT help NER?
Answer: BERT provides rich contextual token embeddings that capture syntax and semantics, so fine-tuning a simple classifier or CRF on top of BERT often yields state-of-the-art NER performance with less feature engineering.
Why are gazetteers still useful in NER?
Answer: Gazetteers (lists of known entities) provide strong prior knowledge and can boost recall for specific domains, although overreliance on them can reduce generalization and introduce bias if not curated carefully.
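A gazetteer is often used as a longest-match dictionary lookup whose hits feed into the model as features; a toy sketch (the entries below are invented examples):

```python
def gazetteer_match(tokens, gazetteer):
    """Longest-match lookup of known entity names in a token sequence.
    Returns (start, end_exclusive, type) spans for every dictionary hit."""
    max_len = max(len(name) for name in gazetteer)
    spans, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first so "New York" beats "New".
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in gazetteer:
                spans.append((i, i + length, gazetteer[candidate]))
                i += length
                break
        else:
            i += 1
    return spans

gaz = {("New", "York"): "LOC", ("Apple",): "ORG"}
print(gazetteer_match(["Apple", "opened", "in", "New", "York"], gaz))
# [(0, 1, 'ORG'), (3, 5, 'LOC')]
```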
What is the difference between coarse and fine-grained NER?
Answer: Coarse-grained NER uses a small set of broad types (PER, ORG, LOC), while fine-grained NER distinguishes many subtypes (e.g. company vs. university vs. sports team) for more detailed entity categorization.
How does NER relate to information extraction?
Answer: NER is often the first step in information extraction pipelines, identifying key entities which are then linked, normalized and used in relation or event extraction to build structured knowledge.
What challenges does NER face in social media text?
Answer: Noisy spelling, hashtags, creative capitalization and informal language make entity boundaries and types harder to recognize, requiring robust tokenization, normalization and character-level modeling.
What is cross-lingual or multilingual NER?
Answer: Cross-lingual NER learns from data in one or multiple languages and transfers to new languages, often using multilingual embeddings or models like mBERT to share representations across languages.
How is NER connected to entity linking?
Answer: NER identifies entity mentions and their types, while entity linking maps those mentions to entries in a knowledge base; successful linking depends on accurate and complete NER output.
What are BIOES or BILOU tagging schemes?
Answer: BIOES (or BILOU) refines BIO by distinguishing single-token entities and end-of-entity tokens, which can help sequence models better learn precise entity boundaries in NER tasks.
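The refinement is mechanical and can be sketched as a BIO-to-BIOES converter (S- marks single-token entities, E- marks the last token of a multi-token entity):

```python
def bio_to_bioes(tags):
    """Convert BIO tags to BIOES: single-token entities become S-TYPE,
    and the final token of a multi-token entity becomes E-TYPE."""
    out = []
    for i, tag in enumerate(tags):
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        if tag.startswith("B-"):
            # Entity continues only if the next tag is I- of the same type.
            out.append(("B-" if nxt == "I-" + tag[2:] else "S-") + tag[2:])
        elif tag.startswith("I-"):
            out.append(("I-" if nxt == "I-" + tag[2:] else "E-") + tag[2:])
        else:
            out.append(tag)
    return out

print(bio_to_bioes(["B-PER", "O", "B-LOC", "I-LOC", "I-LOC"]))
# ['S-PER', 'O', 'B-LOC', 'I-LOC', 'E-LOC']
```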
Why is high-quality annotation important for NER?
Answer: NER models are sensitive to label noise and inconsistent guidelines; high-quality, consistently annotated corpora are essential for training reliable systems and benchmarking progress.
NER concepts covered
This page covers named entity recognition: entity types, BIO tagging, classical and neural NER, evaluation metrics, domain adaptation and the role of NER in broader information extraction pipelines.