Linguistic Resources for the 2013 TAC KBP Entity Linking Evaluation

Joe Ellis (presenter), Justin Mott, Xuansong Li, Jeremy Getman, Jonathan Wright, Stephanie Strassel
Linguistic Data Consortium, University of Pennsylvania, USA

2013 Source Corpus

Language  Genre              Documents
English   Newswire           1,000,257
English   Web Text             999,999
English   Discussion Forums     99,063
Chinese   Newswire           2,000,256
Chinese   Web Text             815,886
Chinese   Discussion Forums    199,321
Spanish   Newswire             910,734
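As a quick check on the scale of the source corpus, the per-language totals can be summed from the table above. This is only an illustrative sketch: the dictionary and function names are made up for this example, and the counts are copied from the table.

```python
# Document counts copied from the source corpus table.
SOURCE_CORPUS = {
    ("English", "Newswire"): 1_000_257,
    ("English", "Web Text"): 999_999,
    ("English", "Discussion Forums"): 99_063,
    ("Chinese", "Newswire"): 2_000_256,
    ("Chinese", "Web Text"): 815_886,
    ("Chinese", "Discussion Forums"): 199_321,
    ("Spanish", "Newswire"): 910_734,
}


def totals_by_language(corpus):
    """Sum document counts over genres for each language."""
    totals = {}
    for (language, _genre), count in corpus.items():
        totals[language] = totals.get(language, 0) + count
    return totals


if __name__ == "__main__":
    for language, total in totals_by_language(SOURCE_CORPUS).items():
        print(f"{language}: {total:,}")
    print(f"All languages: {sum(SOURCE_CORPUS.values()):,}")
```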

TAC KBP Evaluation Workshop – NIST, November 18-19, 2013

Entity Linking Overview

Stage 1: Select name strings and reference documents

Stage 2: Link name strings to KB or mark as NIL

Stage 3: Co-reference NIL entities

Example name strings: "Wendy", "Wendy Gaxiola"

Entity Linking – Stage 1

Run named entity taggers over source corpora
  Provides guided search through the corpus
  Thanks KBP coordinators!

Namestring Selection
  Confusable
  Ambiguous
  Varied

Entity Linking – Stage 1: Namestring Selection

Ratios
  NIL & non-NIL
  Entity types
  Genre

Measurable confusability
  Multiple-entity namestrings ("Smith")
  Multiple-namestring entities ("Barack Obama", "Bam-Bam", "Bammy")
  NIL singletons
  Cross-lingual
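One way to make "measurable confusability" concrete is to count, for a set of annotated queries, how many distinct entities each namestring refers to and how many distinct namestrings each entity has. The sketch below assumes annotations are available as (namestring, entity ID) pairs; the entity IDs and the toy data are hypothetical, and this is not a description of the tooling LDC actually used.

```python
from collections import defaultdict


def confusability(annotations):
    """Index (namestring, entity_id) pairs in both directions.

    Returns two dicts of sets: entities seen per namestring, and
    namestrings seen per entity.
    """
    entities_per_name = defaultdict(set)
    names_per_entity = defaultdict(set)
    for name, entity in annotations:
        entities_per_name[name].add(entity)
        names_per_entity[entity].add(name)
    return dict(entities_per_name), dict(names_per_entity)


# Hypothetical toy annotations; the IDs E1..E3 and NIL001 are invented.
ANNOTATIONS = [
    ("Smith", "E1"), ("Smith", "E2"), ("Smith", "NIL001"),
    ("Barack Obama", "E3"), ("Bam-Bam", "E3"), ("Bammy", "E3"),
]

per_name, per_entity = confusability(ANNOTATIONS)

# Multiple-entity namestrings: one string, several referents.
multi_entity_names = {n: len(e) for n, e in per_name.items() if len(e) > 1}
# Multiple-namestring entities: one referent, several strings.
multi_name_entities = {e: len(n) for e, n in per_entity.items() if len(n) > 1}
```

With the toy data, "Smith" maps to three entities and the entity E3 has three namestrings, mirroring the two examples on the slide.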

Entity Linking – Stages 2 & 3: KB Linking and NIL Coref

KB Linking
  Review reference document and search KB for matching node
  Multiple namestrings viewed together for quicker linking

NIL Coreference
  NIL queries (no KB match) require manual co-reference annotation
  Time-limited quality control pass to enhance completeness and accuracy
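The NIL coreference step can be pictured as turning pairwise "these two NIL queries are the same entity" judgments into clusters, each with its own NIL identifier. The sketch below does this with a small union-find structure. It is an illustration only: the query IDs, the `NIL0001`-style ID format, and the function name are assumptions, and the annotation itself is done manually as described above.

```python
def assign_nil_ids(nil_query_ids, coreferent_pairs):
    """Cluster NIL queries from pairwise coreference judgments.

    nil_query_ids: list of query IDs that had no KB match.
    coreferent_pairs: (query_id, query_id) pairs judged to be the same entity.
    Returns a dict mapping each query ID to a cluster ID (assumed format).
    """
    parent = {q: q for q in nil_query_ids}

    def find(q):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    for a, b in coreferent_pairs:
        parent[find(a)] = find(b)  # merge the two clusters

    cluster_ids = {}
    result = {}
    for q in nil_query_ids:
        root = find(q)
        if root not in cluster_ids:
            cluster_ids[root] = f"NIL{len(cluster_ids) + 1:04d}"
        result[q] = cluster_ids[root]
    return result


if __name__ == "__main__":
    # Hypothetical query IDs: Q1 and Q3 were judged coreferent, Q2 was not.
    print(assign_nil_ids(["Q1", "Q2", "Q3"], [("Q1", "Q3")]))
```

A query that is never paired with another ends up alone in its cluster, which corresponds to a NIL singleton.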

Delivered 2013 Resources

Corpus Title                                                                     Type        LDC Catalog  Language          Size
TAC 2013 KBP English Entity Linking Evaluation Queries and Knowledge Base Links  Evaluation  LDC2013E90   English           803 GPE, 686 PER, 701 ORG
TAC 2013 KBP Chinese Entity Linking Evaluation Queries and Knowledge Base Links  Evaluation  LDC2013E96   Chinese, English  714 GPE, 706 PER, 735 ORG
TAC 2013 KBP Spanish Entity Linking Evaluation Queries and Knowledge Base Links  Evaluation  LDC2013E97   Spanish, English  660 GPE, 695 PER, 762 ORG
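The per-type sizes can be totaled to compare the three evaluation sets. The sketch below assumes the "Size" figures are counts per entity type (the table does not state the unit); the variable names are illustrative.

```python
# Sizes per entity type, copied from the delivered resources table.
# Assumption: each figure is a count for that entity type.
SIZES = {
    "English": {"GPE": 803, "PER": 686, "ORG": 701},
    "Chinese": {"GPE": 714, "PER": 706, "ORG": 735},
    "Spanish": {"GPE": 660, "PER": 695, "ORG": 762},
}

# Total per evaluation set (row sums).
total_by_language = {lang: sum(counts.values()) for lang, counts in SIZES.items()}

# Total per entity type across the three sets (column sums).
total_by_type = {}
for counts in SIZES.values():
    for entity_type, count in counts.items():
        total_by_type[entity_type] = total_by_type.get(entity_type, 0) + count

if __name__ == "__main__":
    print(total_by_language)
    print(total_by_type)
```

Under that assumption the three sets are close in size (2,190, 2,155, and 2,117), and each is roughly balanced across GPE, PER, and ORG.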