Text Analysis Conference
Knowledge Base Population
2013
Hoa Trang Dang
National Institute of Standards and Technology
Sponsored by:
TAC KBP Goals
• Goal: Populate a knowledge base (KB) with information about entities as found in a collection of source documents, following a specified schema for the KB
• KBP 2009-2011: Focus on augmenting an existing KB. Decompose KBP into two tasks
  ▫ Entity-Linking: link each given named entity mention to a node in reference KB (or create new node)
  ▫ Slot-Filling: Learn attributes about target entities from the source documents and add new information about the entity to the reference KB
• KBP 2012: Combine entity-linking and slot-filling to build a KB from scratch -> Cold Start
• KBP 2013:
  ▫ Conversational, informal data (discussion fora)
  ▫ Temporal constraints for Slot Filling (2011 pilot)
  ▫ Sentiment analysis for Slot Filling
TAC KBP 2013 Track Participants
• Track coordinators
  ▫ Hoa Dang (Slot Filler Validation)
  ▫ Jim Mayfield (Entity Linking, Cold Start KBP)
  ▫ Margaret Mitchell (Sentiment Slot Filling)
  ▫ Mihai Surdeanu (English Slot Filling and Temporal Slot Filling)
• LDC linguistic resource providers: Joe Ellis, Jeremy Getman, Justin Mott, Xuansong Li, Kira Griffitt, Stephanie M. Strassel, Jonathan Wright
• Coordinators emeritus: Ralph Grishman, Heng Ji
• Advisor: Boyan Onyshkevych
• 45 Teams
  ▫ 14 countries (21 USA, 9 China, 3 Spain, 2 Germany, …)
6 (8) TAC KBP 2013 Tracks
• Entity-Linking
  ▫ English
  ▫ Chinese
  ▫ Spanish
• Slot-Filling (English)
  ▫ Regular
  ▫ Sentiment
  ▫ Temporal
  ▫ Slot Filler Validation Task
• Cold Start (English)
Entity Linking and Slot Filling Tracks
• Goal: Augment a reference knowledge base (KB) with info about query entities (PER, ORG, GPE) as found in a diverse collection of documents
• Reference KB: Oct 2008 Wikipedia snapshot. Each KB node corresponds to a Wikipedia page and contains:
  ▫ Infobox
  ▫ Wiki_text (free text not in infobox)
• English source documents:
  ▫ 1M News docs
  ▫ 1M Web docs
  ▫ 99K Discussion Forum docs (threads)
• Chinese source documents: 2M news, 800K Web
• Spanish source documents: 900K news
Entity-Linking Evaluation Results
• English
  ▫ Participants: 26 teams
  ▫ Highest F1: 0.721 (0.730 in 2012)
  ▫ Median F1: 0.583 (0.536 in 2012)
• Chinese
  ▫ Participants: 4 teams
  ▫ Highest F1: 0.622 (0.740 in 2012)
  ▫ Median F1: 0.619 (0.617 in 2012)
• Spanish
  ▫ Participants: 3 teams
  ▫ Highest F1: 0.709 (0.641 in 2012)
  ▫ Median F1: 0.651 (0.612 in 2012)
Regular Slot Filling Evaluation Results
• Participants: 18 teams
• Human F1: 0.685 (0.814 in 2012)
• Highest System F1: 0.373 (0.517 in 2012)
• 2nd Highest System F1: 0.339 (0.296 in 2012)
• Median System F1: 0.150 (0.099 in 2012)
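F1 is conventionally the harmonic mean of precision and recall. The sketch below shows that computation only; the helper name `f1_score` and the counts in the usage note are illustrative assumptions, not figures from the evaluation, and the official scorer applies additional rules that are not modelled here.

```python
def f1_score(num_correct: int, num_system: int, num_reference: int) -> float:
    """Harmonic mean of precision and recall (hypothetical helper).

    num_correct:   fills returned by the system that were judged correct
    num_system:    all fills returned by the system
    num_reference: all correct fills in the answer key
    """
    if num_correct == 0:
        # Covers empty output and zero overlap without dividing by zero.
        return 0.0
    precision = num_correct / num_system
    recall = num_correct / num_reference
    return 2 * precision * recall / (precision + recall)
```

With made-up counts of 30 correct fills out of 60 returned and 100 in the key, precision is 0.5, recall is 0.3, and `f1_score(30, 60, 100)` evaluates to 0.375.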
Sentiment Slot Filling Track
• Sentiment analysis for KBP:
  ▫ Holder (PER, ORG, GPE)
  ▫ Target (PER, ORG, GPE)
  ▫ Polarity (positive, negative)
• Implemented as regular slot filling, with different set of slots
  ▫ {per,org,gpe}:positive-towards
  ▫ {per,org,gpe}:negative-towards
  ▫ {per,org,gpe}:positive-from
  ▫ {per,org,gpe}:negative-from
• Participants: 3 teams
• Evaluation results:
  ▫ Human F1: 0.727
  ▫ Highest System F1: 0.132
  ▫ Median System F1: 0.014
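The brace notation above expands to one slot per entity type, polarity, and direction. A minimal sketch of that expansion follows; the function and constant names are illustrative assumptions, and only the slot-name pattern comes from the slide.

```python
from itertools import product

ENTITY_TYPES = ("per", "org", "gpe")
DIRECTIONS = ("towards", "from")
POLARITIES = ("positive", "negative")


def sentiment_slots() -> list[str]:
    """Expand {per,org,gpe}:{positive,negative}-{towards,from} (hypothetical helper)."""
    return [
        f"{etype}:{polarity}-{direction}"
        for etype, direction, polarity in product(ENTITY_TYPES, DIRECTIONS, POLARITIES)
    ]
```

The expansion yields 3 × 2 × 2 = 12 slot names, for example `per:positive-towards` and `org:negative-from`.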
Temporal Slot Filling Track
• Find tightest temporal constraints [T1 T2 T3 T4] on a given relation
  ▫ Relation is true for a period beginning between T1 and T2
  ▫ Relation is true for a period ending between T3 and T4
• Participants: 5 teams
• Evaluation results:
  ▫ Human Accuracy: 0.688
  ▫ Highest System Accuracy: 0.331
  ▫ Median System Accuracy: 0.148
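The four-tuple can be read as two intervals: one bounding the start of the relation and one bounding its end. The sketch below checks whether a candidate period is consistent with such a tuple. The class and function names are illustrative, and treating `None` as an unknown (unbounded) value is an assumption of this sketch, not something stated on the slide.

```python
from datetime import date
from typing import NamedTuple, Optional


class TemporalConstraint(NamedTuple):
    t1: Optional[date]  # earliest possible start
    t2: Optional[date]  # latest possible start
    t3: Optional[date]  # earliest possible end
    t4: Optional[date]  # latest possible end


def _within(value: date, low: Optional[date], high: Optional[date]) -> bool:
    # None means "no bound known on this side" (assumption of this sketch).
    return (low is None or low <= value) and (high is None or value <= high)


def is_consistent(c: TemporalConstraint, start: date, end: date) -> bool:
    """True if the period [start, end] begins in [T1, T2] and ends in [T3, T4]."""
    return _within(start, c.t1, c.t2) and _within(end, c.t3, c.t4)
```

For instance, with hypothetical bounds T1 = 2001-01-01, T2 = 2001-12-31, T3 = 2005-06-01 and T4 unknown, a period starting in May 2001 and ending in 2007 is consistent, while one starting in January 2002 is not.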
Slot Filler Validation Track (SFV)
• Task: Determine whether or not a candidate slot filler is correct
• Objective: improve precision without excessive reduction of recall
• Participants: 5 teams
• Some SFV runs had overwhelmingly positive impact on individual SF runs!
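The precision/recall trade-off named in the objective can be illustrated by filtering a slot-filling run through a validator and rescoring it. This is a toy sketch under stated assumptions: the helpers `precision_recall` and `apply_validator`, the validator itself, and the data in the usage note are all hypothetical and do not describe any participant's system.

```python
from typing import Callable, Hashable


def precision_recall(fills: set[Hashable], gold: set[Hashable]) -> tuple[float, float]:
    """Score a set of candidate fills against a gold set (hypothetical helper)."""
    correct = len(fills & gold)
    precision = correct / len(fills) if fills else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def apply_validator(
    fills: set[Hashable], validator: Callable[[Hashable], bool]
) -> set[Hashable]:
    """Keep only the candidate fills that the validator accepts."""
    return {fill for fill in fills if validator(fill)}
```

With a made-up gold set `{"a", "b", "c", "d"}` and a run `{"a", "b", "x", "y"}`, the unfiltered run scores precision 0.5 and recall 0.5. A validator that rejects only `"x"` raises precision to 2/3 and leaves recall at 0.5; a validator that also rejected `"b"` would lower recall, which is the cost the objective warns about.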
Cold Start KBP Track
• Goal: Build a KB from scratch, containing all targeted info about all entities as found in a relatively closed domain corpus of documents
• KB schema: same entity types and slots as regular slot-filling task
• Source document collection:
  ▫ 50K Web pages from small-town publications (from TREC KBA document stream)
• Required capabilities:
  ▫ Entity-linking: Grounding all named entity mentions in docs to KB nodes
  ▫ Slot-filling: Learning attributes about all named entities
• Post-submission evaluation queries traverse KB starting from a single entity node (entity mention):
  ▫ 0-hop: Find all children of Michael Jordan
  ▫ 1-hop: Find date of birth of each of the children of Michael Jordan
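The 0-hop and 1-hop queries can be pictured as lookups over (entity, slot) pairs, where a 1-hop query follows the results of a 0-hop query to a second slot. The sketch below is a toy in-memory model: `TinyKB`, `zero_hop`, `one_hop`, and the entity identifiers in the usage note are illustrative assumptions, not the submission format or the official evaluation procedure.

```python
from collections import defaultdict


class TinyKB:
    """Toy knowledge base mapping (entity, slot) to a set of fills."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], set[str]] = defaultdict(set)

    def add(self, subject: str, slot: str, value: str) -> None:
        self._facts[(subject, slot)].add(value)

    def fills(self, subject: str, slot: str) -> set[str]:
        # Return a copy so callers cannot mutate the stored facts.
        return set(self._facts.get((subject, slot), set()))


def zero_hop(kb: TinyKB, entity: str, slot: str) -> set[str]:
    """0-hop: all fills of one slot for the starting entity."""
    return kb.fills(entity, slot)


def one_hop(kb: TinyKB, entity: str, slot0: str, slot1: str) -> dict[str, set[str]]:
    """1-hop: for each 0-hop fill of slot0, look up its fills for slot1."""
    return {mid: kb.fills(mid, slot1) for mid in kb.fills(entity, slot0)}
```

For the example on the slide, `zero_hop(kb, parent, "per:children")` would return the child nodes, and `one_hop(kb, parent, "per:children", "per:date_of_birth")` would return a date-of-birth set for each child, empty where the KB has no value. An error at the first hop propagates to the second, which is one way to read the lower 1-hop scores.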
Cold Start Evaluation Results (Preliminary)
• Participants: 3 teams
• 0-hop queries:
  ▫ Highest F1: 0.384 (0.497 in 2012)
• 1-hop queries:
  ▫ Highest F1: 0.145 (0.255 in 2012)
• Combined 0-hop and 1-hop F1
  ▫ Highest F1: 0.278 (~0.352 in 2012)
TAC KBP Discussion/Planning Sessions
• Monday, November 18 (2:15-3:10pm):
  ▫ English Slot Filling
  ▫ Slot Filler Validation
  ▫ Temporal Slot Filling?
  ▫ +Spanish Slot Filling?
  ▫ +Event identification and argument extraction?
• Tuesday, November 19 (3:00-4:00pm):
  ▫ Cold Start
  ▫ English Entity Linking (as queries in Cold Start framework?)
  ▫ Cross-Lingual Spanish and Chinese Entity Linking + Discussion forum