Joint Posterior Revision of NLP Annotations via Ontological ......Eric Clapton 3. General...

1
Tools: Gold Corpus (NERC): AIDA CoNLL-YAGO (train) Datasets: AIDA CoNLL-YAGO (test-b) MEANTIME TAC-KBP Research question: Does the posterior joint revision of the annotations from Stanford CoreNLP (NERC) and DBpedia Spotlight (EL), via YAGO, improve their performances? Measures: NERC / EL / NERC+EL Deterministically computable from the classes of the entity (possibly leveraging alignments between EL Knowledge Base and ) Joint Posterior Revision of NLP Annotations via Ontological Knowledge Marco Rospocher ([email protected]), Francesco Corcoglioniti ([email protected]) Mr. Washington was runner-up at Wimbledon in 1996. 1. Problem: Incoherent Mention-level NLP Annotations The GW Bridge is a suspension bridge over the Hudson Washington (state) George Washington Bridge 2. Solution: Coherence via Ontology Eric Clapton is one of the greatest guitar players. PER Eric Clapton 3. General Probabilistic Model Variables Conditional Independence Assumptions 4. Model Instantiated on NERC + EL Ontological Background Knowledge confidence score learned from data Estimating NERC Estimating EL 6,016,695 entities Taxonomy of 568,255 classes (only ingoing links) #4479 5. Evaluation ②③ M CP CP M * CP Prior (popularity based on entity ingoing links) Computed from gold corpus (NERC + class annotations) Consider only class sets restricted to popular classes (seen at least n * times in the gold corpus) Restricted to gold mentions. Similar improvements also considering all mentions, and macro-averaging by document or by NERC type. PER ORG NERC EL Named Entity Recognition and Classification Entity Linking

Transcript of Joint Posterior Revision of NLP Annotations via Ontological ......Eric Clapton 3. General...

Page 1: Joint Posterior Revision of NLP Annotations via Ontological ......Eric Clapton 3. General Probabilistic Model Variables Conditional Independence Assumptions 4. Model Instantiated on

Tools:

Gold Corpus (NERC): AIDA CoNLL-YAGO (train)

Datasets: AIDA CoNLL-YAGO (test-b)MEANTIME TAC-KBP

Research question: Does the posterior joint revision of the annotations from Stanford CoreNLP (NERC) and DBpedia Spotlight (EL),via YAGO, improve their performances?

Measures: NERC / EL / NERC+EL

Deterministically computable from the classes of the entity (possibly leveraging alignments between EL Knowledge Base and )

Joint Posterior Revision of NLP Annotationsvia Ontological KnowledgeMarco Rospocher ([email protected]), Francesco Corcoglioniti ([email protected])

Mr. Washington was runner-up at Wimbledon in 1996.

1. Problem: Incoherent Mention-level NLP Annotations

The GW Bridge is a suspension bridge over the Hudson

Washington (state)

George Washington Bridge

2. Solution: Coherence via Ontology

Eric Clapton is one of the greatest guitar players.PER

Eric Clapton

3. General Probabilistic Model

Variables Conditional Independence Assumptions

4. Model Instantiated on NERC + EL

Ontological Background Knowledge

confidence score learned from data

Estimating NERC

Estimating EL

6,016,695 entitiesTaxonomy of 568,255 classes

(only ingoing links)

#4479

5. Evaluation

①②③

②③

M

CP

CP

M*

CP

Prior (popularity basedon entity ingoing links)

Computed from gold corpus(NERC + class annotations)

Consider only class sets restricted to popular classes(seen at least n* times in the gold corpus)

Restricted to gold mentions. Similar improvements also considering all mentions, and macro-averaging by document or by NERC type.

PER

ORG

NERC

EL

Named Entity Recognition and Classification

Entity Linking

①② ③