Ling 570 Day 17: Named Entity Recognition & Chunking
Sequence Labeling
• Goal: Find most probable labeling of a sequence
• Many sequence labeling tasks:
  – POS tagging
  – Word segmentation
  – Named entity tagging
  – Story/spoken sentence segmentation
  – Pitch accent detection
  – Dialog act tagging
NER AS SEQUENCE LABELING
NER as Classification Task
• Instance: token
• Labels:
  – Position: B(eginning), I(nside), O(utside)
  – NER types: PER, ORG, LOC, NUM
  – Label: Type-Position, e.g. PER-B, PER-I, O, …
  – How many tags?
    • (|NER types| × 2) + 1 (enumerated in the sketch below)
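With the four types above, this works out to nine tags. A quick Python sketch of the enumeration (the variable names are illustrative):

    # Type-Position labels, plus a single O for tokens outside any entity:
    # (|NER types| x 2) + 1 = 9 tags in total
    NER_TYPES = ["PER", "ORG", "LOC", "NUM"]
    LABELS = [f"{t}-{p}" for t in NER_TYPES for p in ("B", "I")] + ["O"]
    assert len(LABELS) == 2 * len(NER_TYPES) + 1   # 9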
NER as Classification: Features
• What information can we use for NER?
  – Predictive tokens: e.g. MD, Rev, Inc., …
• How general are these features?
  – Language? Genre? Domain?
NER as Classification: Shape Features
• Shape types (an extractor is sketched below):
  – lower: e.g. e. e. cummings (all lower case)
  – capitalized: e.g. Washington (first letter uppercase)
  – all caps: e.g. WHO (all letters capitalized)
  – mixed case: e.g. eBay (mixed upper and lower case)
  – capitalized with period: e.g. H.
  – ends with digit: e.g. A9
  – contains hyphen: e.g. H-P
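A minimal sketch of a shape-feature extractor for the classes above; the class names and the order in which the rules fire are illustrative assumptions, not a fixed standard:

    import re

    def token_shape(token):
        # Map a token to one coarse shape class. Rules fire in a fixed
        # order, so e.g. "H-P" is classed by its hyphen, not its caps.
        if not token:
            return "empty"
        if re.fullmatch(r"[A-Z]\.", token):
            return "capitalized-period"      # H.
        if token[-1].isdigit():
            return "ends-with-digit"         # A9
        if "-" in token:
            return "contains-hyphen"         # H-P
        if token.islower():
            return "lower"                   # cummings
        if token.isupper():
            return "all-caps"                # WHO
        if token[0].isupper() and token[1:].islower():
            return "capitalized"             # Washington
        return "mixed-case"                  # eBay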
Example Instance Representation
• Example: (feature-vector illustration not reproduced in this transcript)

Sequence Labeling
• Example: (labeled-sequence illustration not reproduced in this transcript)
Evaluation
• System: output of automatic tagging
• Gold standard: true tags
• Precision: # correct chunks / # system chunks
• Recall: # correct chunks / # gold chunks
• F-measure: F_β = ((β² + 1) · P · R) / (β² · P + R)
• F1 (β = 1) balances precision & recall (a scorer is sketched below)
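These chunk-level measures are simple to compute once system and gold output are represented as chunks; a minimal sketch, assuming CoNLL-style exact matching on (start, end, type) triples:

    def chunk_prf(system_chunks, gold_chunks):
        # A chunk counts as correct only if span and type both match.
        system, gold = set(system_chunks), set(gold_chunks)
        correct = len(system & gold)
        p = correct / len(system) if system else 0.0
        r = correct / len(gold) if gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    # e.g. chunk_prf({(0, 2, "PER"), (4, 5, "LOC")}, {(0, 2, "PER")})
    # -> (0.5, 1.0, 0.666...)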
Evaluation
• Standard measures:
  – Precision, Recall, F-measure
  – Computed on entity types (CoNLL evaluation)
• Classifiers vs. evaluation measures:
  – Classifiers optimize tag accuracy
    • Most common tag? O (most tokens aren't NEs)
  – Evaluation measures focus on NEs
• State of the art:
  – Standard tasks: PER, LOC: 0.92; ORG: 0.84
Hybrid Approaches
• Practical systems
  – Exploit lists, rules, learning, …
  – Multi-pass:
    • Early passes: high precision, low recall
    • Later passes: noisier sequence learning
• Hybrid system (sketched below):
  – High-precision rules tag unambiguous mentions
    • Use string matching to capture substring matches
  – Tag items from domain-specific name lists
  – Apply a sequence labeler
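A minimal sketch of such a multi-pass pipeline; the gazetteer, the three-pass split, and the tag-merging rule are illustrative assumptions, not a specific published system:

    def hybrid_tag(tokens, gazetteer, sequence_labeler):
        tags = ["O"] * len(tokens)
        # Pass 1 (high precision, low recall): exact match against a
        # domain-specific name list, e.g. {"Denver": "LOC"}.
        for i, tok in enumerate(tokens):
            if tok in gazetteer:
                tags[i] = gazetteer[tok] + "-B"
        # Pass 2: string matching; reuse names already tagged above to
        # capture later, otherwise ambiguous mentions of the same string.
        found = {tokens[i]: t for i, t in enumerate(tags) if t != "O"}
        for i, tok in enumerate(tokens):
            if tags[i] == "O" and tok in found:
                tags[i] = found[tok]
        # Pass 3 (noisier): a learned sequence labeler decides the rest.
        predicted = sequence_labeler(tokens)
        return [t if t != "O" else p for t, p in zip(tags, predicted)]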
CHUNKING
What is Chunking?
• A form of partial (shallow) parsing
  – Extracts major syntactic units, but not full parse trees
• Task: identify and classify
  – Flat, non-overlapping segments of a sentence
  – Basic non-recursive phrases
  – Corresponding to major POS categories
• May ignore some categories, e.g. base NP chunking
• Creates a simple bracketing:
  – [NP The morning flight] [PP from] [NP Denver] [VP has arrived]
  – Base NP chunking: [NP The morning flight] from [NP Denver] has arrived
Example
• Parse tree:
  (S (NP (NNP Breaking) (NNP Dawn))
     (VP (VBZ has)
         (VP (VBN broken)
             (PP (IN into)
                 (NP (DT the) (NN box) (NN office) (NN top) (NN ten))))))
• Chunks: [NP Breaking Dawn] [VP has broken] [PP into] [NP the box office top ten]
Why Chunking?
• Used when a full parse is unnecessary
  – Or infeasible or impossible (when?)
• Extraction of subcategorization frames
  – Identify verb arguments, e.g.:
    • VP NP
    • VP NP NP
    • VP NP to NP
• Information extraction: who did what to whom
• Summarization: keep base information, remove modifiers
• Information retrieval: restrict indexing to base NPs
Processing Example
• Tokenization: The morning flight from Denver has arrived
• POS tagging: DT JJ N PREP NNP AUX V
• Chunking: NP PP NP VP
• Extraction: NP NP VP
• etc.
Approaches
• Finite-state approaches
  – Grammatical rules in FSTs
  – Cascaded to produce more complex structure
• Machine learning
  – Similar to POS tagging
Finite-State Rule-Based Chunking
• Hand-crafted rules model phrases
  – Typically application-specific
• Left-to-right longest match (Abney 1996)
  – Start at the beginning of the sentence
  – Find the longest matching rule
  – Greedy approach, not guaranteed optimal
Finite-State Rule-Based Chunking
• Chunk rules cannot contain recursion:
  – NP → Det Nominal: okay
  – Nominal → Nominal PP: not okay
• Examples (a longest-match chunker over such rules is sketched below):
  – NP → (Det) Noun* Noun
  – NP → Proper-Noun
  – VP → Verb
  – VP → Aux Verb
• Consider: Time flies like an arrow
• Is this what we want?
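A minimal sketch of left-to-right longest-match chunking, with the non-recursive rules written as regular expressions over bracketed POS tags; the rule set, tag names, and bracketing trick are illustrative assumptions:

    import re

    RULES = [
        ("NP", re.compile(r"(<DT>)?(<JJ>|<NN>|<NNP>)*(<NN>|<NNP>)")),  # (Det) Noun* Noun
        ("VP", re.compile(r"(<AUX>)?<VB[A-Z]*>")),                     # (Aux) Verb
    ]

    def fs_chunk(pos_tags):
        # Wrap each tag in <...> so chunk rules are ordinary regexes.
        s = "".join(f"<{t}>" for t in pos_tags)
        out, pos = [], 0
        while pos < len(s):
            best_label, best_end = None, pos
            for label, pat in RULES:
                m = pat.match(s, pos)
                if m and m.end() > best_end:       # longest match wins
                    best_label, best_end = label, m.end()
            if best_label is None:                 # no rule: tag stays outside
                best_label, best_end = "O", s.index(">", pos) + 1
            out.append((best_label, s[pos:best_end]))
            pos = best_end
        return out

    # fs_chunk(["DT", "JJ", "NN", "PREP", "NNP", "AUX", "VB"])
    # -> [('NP', '<DT><JJ><NN>'), ('O', '<PREP>'), ('NP', '<NNP>'),
    #     ('VP', '<AUX><VB>')]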
Cascading FSTs
• Richer partial parsing
  – Pass the output of one FST to the next
• Approach:
  – First stage: base phrase chunking
  – Next stage: larger constituents, e.g. PPs, VPs (see the sketch below)
  – Highest stage: sentences
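A small sketch of the second stage in such a cascade; it assumes stage 1 emits (label, tokens) pairs where unchunked tokens keep their POS tag as the label, and applies one illustrative rule, PP → IN NP:

    def stage2(chunks):
        out, i = [], 0
        while i < len(chunks):
            # PP -> IN NP: attach a stranded preposition to the base NP
            # that follows it, producing a larger constituent.
            if (chunks[i][0] == "IN" and i + 1 < len(chunks)
                    and chunks[i + 1][0] == "NP"):
                out.append(("PP", chunks[i][1] + chunks[i + 1][1]))
                i += 2
            else:
                out.append(chunks[i])
                i += 1
        return out

    # stage2([("NP", ["Breaking", "Dawn"]), ("VP", ["has", "broken"]),
    #         ("IN", ["into"]), ("NP", ["the", "box", "office", "top", "ten"])])
    # -> ends with ("PP", ["into", "the", "box", "office", "top", "ten"])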
Example
• (cascade example figure not reproduced in this transcript)
Chunking by Classification
• Model chunking as a task similar to POS tagging
• Instance: tokens
• Labels:
  – Simultaneously encode segmentation & identification
  – IOB (or BIO) tagging (also BIOE, etc.; a decoder is sketched below)
    • Segment: B(eginning), I(nternal), O(utside)
    • Identity: phrase category: NP, VP, PP, etc.
• The morning flight from Denver has arrived
  – Full chunking: NP-B NP-I NP-I PP-B NP-B VP-B VP-I
  – Base NP chunking: NP-B NP-I NP-I O NP-B O O
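Decoding these tags back into chunks is mechanical; a minimal sketch, assuming the TYPE-B / TYPE-I / O spelling used on the slide:

    def bio_to_chunks(tags):
        # Returns (start, end, type) spans; end is exclusive. The final
        # "O" sentinel flushes a chunk that runs to the sentence end.
        chunks, start, ctype = [], None, None
        for i, tag in enumerate(tags + ["O"]):
            if start is not None and not tag.endswith("-I"):
                chunks.append((start, i, ctype))
                start = None
            if tag.endswith("-B"):
                start, ctype = i, tag[:-2]
        return chunks

    # bio_to_chunks(["NP-B", "NP-I", "NP-I", "PP-B", "NP-B", "VP-B", "VP-I"])
    # -> [(0, 3, 'NP'), (3, 4, 'PP'), (4, 5, 'NP'), (5, 7, 'VP')]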
Features for Chunking
• What are good features?
  – Preceding tags: for the 2 preceding words
  – Words: for the 2 preceding, current, and 2 following words
  – Parts of speech: for the 2 preceding, current, and 2 following words
• The feature vector includes those features plus the true label (sketched below)
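A sketch of that window as a feature dictionary; the PAD symbol and the feature names are illustrative assumptions:

    def chunk_features(i, words, pos, prev_tags):
        # Window features for token i: 2 preceding tags, plus words and
        # POS for the 2 preceding, current, and 2 following positions.
        def get(seq, j):
            return seq[j] if 0 <= j < len(seq) else "PAD"
        feats = {"tag-2": get(prev_tags, i - 2), "tag-1": get(prev_tags, i - 1)}
        for k in (-2, -1, 0, 1, 2):
            feats[f"w{k}"] = get(words, i + k)
            feats[f"pos{k}"] = get(pos, i + k)
        # At training time, pair this vector with the token's true label.
        return feats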
Chunking as Classification
• Example: (worked feature-vector example not reproduced in this transcript)
Evaluation
• System: output of automatic tagging
• Gold standard: true tags
  – Typically extracted from a parsed treebank
• Precision: # correct chunks / # system chunks
• Recall: # correct chunks / # gold chunks
• F-measure: F_β = ((β² + 1) · P · R) / (β² · P + R)
• F1 (β = 1) balances precision & recall
State-of-the-Art
• Base NP chunking: 0.96
• Complex phrases:
  – Learning: 0.92-0.94 (most learners achieve similar results)
  – Rule-based: 0.85-0.92
• Limiting factors:
  – POS tagging accuracy
  – Inconsistent labeling (parse tree extraction)
  – Conjunctions
    • Late departures and arrivals are common in winter
    • Late departures and cancellations are common in winter