Slot Filling based on Knowledge Graph and Truth Finding Dian Yu, Haibo Li, Hongzhao Huang and Heng Ji Computer Science Department Rensselaer Polytechnic Institute {yud2 , jih}@rpi.edu November 18, 2013



Our Starting Point = 0

BLENDER SF2010 System BLENDER SFV2012 System

Why? Because Heng wants everything new in her new place.


Outline
• Limitations of State-of-the-art
• Our Vision and Approach Overview
• Knowledge Graph Construction
  • Knowledge Path Extraction
  • Knowledge Path Selection
  • Knowledge Graph Clustering
• Truth Finding
  • 2-Layer Mutual Enhancement Truth Finding
  • Credibility Initialization
  • Credibility Propagation
• Experimental Results
• Remaining Challenges
• Conclusions and Future Work


Unpleasant Situation of Slot Filling, 2009-2012
• The most challenging task in KBP
• Most previous systems hit the 30% “performance ceiling”
• No significant publications on this task at major venues

What are the bottlenecks?
• Limited amount of labeled data → supervised learning is infeasible
• Low coverage of patterns → construct knowledge graph with enriched semantic IE annotations and coreference
• Knowledge gap → linguistic constraints mining, path selection and graph clustering
• Conflicting results → truth finding


Inspiration from Gravitational Theory

[Figure labels: Query, Slot Filler?, Slot Type?]


Approach Overview

[Figure: system pipeline diagram. Box labels: Query, Query Expansion, Source Corpus, Information Retrieval, Semantic Annotation, Information Extraction, Dependency Parsing, Knowledge Graph Construction (Path Extraction, Path Selection, Graph Clustering), Truth Finding, Redundancy Removal & Filler Normalization, Slot Fills, Wikipedia Mining, Merged KBs, Alternative Name Slot Fill Extraction]


Bottleneck: Low Coverage of Patterns
• Manually crafted/edited patterns: low coverage; expensive
• Bootstrapping: hard to generalize; long-tail distribution
• Typical dependency patterns for per:place_of_birth
  • <Query_PER> nsubjpass-1 born prep_in <Filler_LOC>
  • <Query_PER> partmod born prep_in <Filler_LOC>
  • <Query_PER> nsubjpass-1 born prep_on <Filler_LOC>
  • <Query_PER> rcmod born prep_in <Filler_LOC>
• Missing some simple cases
  • Charles Gwathmey [1] was born on June 19, 1938, in Charlotte [2], N.C.
  • Dependency path between [1] and [2]: ['nsubjpass', 'born', 'prep_on', 'June', 'prep_in', 'N.C', 'nn']


Bottleneck: Low Coverage of Patterns
• Typical dependency patterns for per:place_of_death
  • <Q_PER> nsubj-1 dies prep_in <A_LOC>
  • <Q_PER> nsubj-1 died prep_in <A_LOC>
  • <Q_PER> nsubj-1 died prep_on <A_LOC>
  • <Q_PER> nsubj-1 died prep_in hospital nn <A_LOC>
• Missing some simple cases
  • ``60 Minutes'' was the brainchild of Don Hewitt [1], the show's longtime executive producer who died Wednesday of pancreatic cancer at his home in Bridgehampton, N.Y. [2], at age 86.
  • Dependency path between [1] and [2]: ['appos', 'producer', 'nsubj', 'died', 'who', 'rcmod', 'died', 'prep_at', 'home', 'prep_in']
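The coverage gap on the two pattern slides can be illustrated with a minimal sketch of exact pattern matching. The four patterns are copied from the per:place_of_birth slide; the function name `matches_any_pattern` and the exact-match policy are assumptions made for this illustration, not the authors' implementation.

```python
# Hypothetical illustration of exact dependency-pattern matching.
# Patterns are the per:place_of_birth patterns listed on the slide.
BIRTH_PATTERNS = [
    ("nsubjpass-1", "born", "prep_in"),
    ("partmod", "born", "prep_in"),
    ("nsubjpass-1", "born", "prep_on"),
    ("rcmod", "born", "prep_in"),
]


def matches_any_pattern(path, patterns):
    """Return True only if the query-to-filler path equals one of the patterns."""
    return tuple(path) in {tuple(p) for p in patterns}


# A path of the shape the patterns were written for ("X was born in Y").
simple_path = ["nsubjpass-1", "born", "prep_in"]

# Path given on the slide for the Charles Gwathmey sentence; the date phrase
# and the "N.C." modifier make it longer than every listed pattern.
gwathmey_path = ["nsubjpass", "born", "prep_on", "June", "prep_in", "N.C", "nn"]

print(matches_any_pattern(simple_path, BIRTH_PATTERNS))
print(matches_any_pattern(gwathmey_path, BIRTH_PATTERNS))
```

The first call succeeds and the second fails, even though the sentence states the birthplace plainly, which is the low-coverage problem the slides describe.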


Knowledge Gap 1
• Deep Knowledge Acquisition: Nominal Coreference
  • Almost overnight, he became fabulously rich, with a $3-million book deal, a $100,000 speech making fee, and a lucrative multifaceted consulting business, Giuliani Partners. As a celebrity rainmaker and lawyer, his income last year exceeded $17 million. His consulting partners included seven of those who were with him on 9/11, and in 2002 Alan Placa, his boyhood pal, went to work at the firm.
  • After successful karting career in Europe, Perera became part of the Toyota F1 Young Drivers Development Program and was a Formula One test driver for the Japanese company in 2006.
  • “Alexandra Burke is out with the video for her second single … taken from the British artist’s debut album”
  • “a woman charged with running a prostitution ring … her business, Pamela Martin and Associates”
• Our Solution: Online knowledge graph construction; enrich paths with semantic annotations and Information Extraction (coreference/relation/event)


Knowledge Path Extraction

[Figure: three-stage flow ① → ② → ③]
① Relevant Document Set
② Sentence Set [Tree representation]
③ Extracted Paths


Knowledge Path Extraction

Sentence: Mays, 50, had died in his sleep at his Tampa home the morning of June 28.

Extracted Knowledge Paths:
1) Mays…amod…50
2) Mays…nsubj…died…prep_at…home…Tampa
3) Mays…nsubj…died…prep_at…June, 28

Annotations attached to nodes and edges in the figure:
{PER, NAM, Billy Mays}, {Death-Trigger}, {NUM}, {GPE, NAM, FL-USA}, {06/28/2009, TIME-WITHIN}, {FAC, NOM}, {Located}, {PER, PRO, Mays}


Knowledge Path Extraction
• Each node is an entity/time/value mention extent or a word, enriched by
  • Entity type/subtype
  • Time normalization, role
  • Mention head
  • Full entity mention name a mention node refers to
  • Slot type of trigger phrases mined from Gigaword, Wikipedia articles and KBs
• Each edge is a derivation path from syntactic parsing, or a type-labeled dependency path, or an event/semantic relation extracted by IE, labeled with argument roles
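One way to picture the node and edge description above is a small data structure, sketched here under assumptions. The class and field names (`Node`, `Edge`, `KnowledgeGraph`, `trigger_slot`, `kind`) are hypothetical, and the pairing of annotations with words in the Mays sentence is a reading of the slide, not something the slides state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A mention or word, optionally enriched with IE annotations."""
    text: str
    entity_type: Optional[str] = None    # e.g. PER, GPE, FAC
    mention_type: Optional[str] = None   # e.g. NAM, NOM, PRO
    full_name: Optional[str] = None      # full entity name the mention refers to
    trigger_slot: Optional[str] = None   # trigger label for trigger words


@dataclass
class Edge:
    source: int
    target: int
    label: str   # dependency label or relation/event role
    kind: str    # "dependency", "relation" or "event" (assumed categories)


@dataclass
class KnowledgeGraph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> int:
        self.nodes.append(node)
        return len(self.nodes) - 1

    def add_edge(self, source: int, target: int, label: str, kind: str) -> None:
        self.edges.append(Edge(source, target, label, kind))

    def outgoing(self, index: int) -> List[Tuple[str, str]]:
        """Return (edge label, target text) pairs for edges leaving a node."""
        return [(e.label, self.nodes[e.target].text)
                for e in self.edges if e.source == index]


# Assumed mapping of the slide's annotations onto the Mays sentence.
graph = KnowledgeGraph()
mays = graph.add_node(Node("Mays", "PER", "NAM", full_name="Billy Mays"))
died = graph.add_node(Node("died", trigger_slot="Death-Trigger"))
home = graph.add_node(Node("home", "FAC", "NOM"))
tampa = graph.add_node(Node("Tampa", "GPE", "NAM"))
graph.add_edge(died, mays, "nsubj", "dependency")
graph.add_edge(died, home, "prep_at", "dependency")
graph.add_edge(home, tampa, "Located", "relation")

print(graph.outgoing(died))
```

Keeping dependency edges and IE relation edges in the same list, distinguished only by `kind`, is a simplification chosen so that a single traversal can follow both.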


Knowledge Gap 2
Deep Knowledge Acquisition: Implicit paraphrases & long-tail distribution

“employee/member”:
• Sutil, a trained pianist, tested for Midland in 2006 and raced for Spyker in 2007 where he scored one point in the Japanese Grand Prix.
• Daimler Chrysler reports 2004 profits of $3.3 billion; Chrysler earns $1.9 billion.
• In her second term, she received a seat on the powerful Ways and Means Committee
• Jennifer Dunn was the face of the Washington state Republican Party for more than two decades
• State of Residence: Davis became Virginia's first Republican woman elected to Congress in 2000, and she was a member of the House Armed Services Committee and the Foreign Affairs Committee
• Buchwald lied about his age and escaped into the Marine Corps.
• By 1942, Peterson was performing with one of Canada's leading big bands, the Johnny Holmes Orchestra.
• Even more: “would join”, “would be appointed”, “will start at”, “went to work”, “was transferred to”, “was recruited by”, “took over as”, “succeeded PERSON”, “began to teach piano”, …

“spouse”:
• Buchwald's 1952 wedding -- Lena Horne arranged for it to be held in London's Westminster Cathedral -- was attended by Gene Kelly, John Huston, Jose Ferrer, Perle Mesta and Rosemary Clooney, to name a few


Too Rich is not Always a Good Thing
• Need to filter out noisy contexts: 97% of paths are irrelevant
• Our Solution: Multi-Layer Path Selection
  • Encode slot type-specific linguistic constraints for deep understanding
• Constraint Examples
  • Candidate/context node attributes (entity type, mention type, time, number, url…)
  • Stop words, upper case/lower case
  • Match name gazetteer and existing KBs (YAGO, Wikipedia infoboxes, Freebase, DBPedia) and the KB mined by Wikipedia Mining
  • Path length
    His most noticeable moment in the public eye came in 1979, when Muslim militants in Iran seized the U.S. Embassy and took the Americans stationed there hostage.
    path = ('poss', 'moment', 'nsubj', 'came', 'advcl', 'seized', 'nsubj', 'Muslim militants', 'amod')
  • Coreference link/relation argument roles/event argument roles
  • Position of a particular node/edge type in the path
  • Semantic categories of context nodes from IE annotations
  • Entity node’s role in the entire sentence (e.g. remove commenter/reporter)
    Filter “origin” if the person is a commenter: “Canada and Russia, they have unbelievable rosters,'' Forsberg said.”


More Constraint Examples
• Edge type
  • place_of_death and place_of_birth paths should include prep_in or prep_at edges
  • Filter “Employee” if the dependency path includes “prep_on”
    Manh called on the Asian Development Bank to play a greater role in helping improve national infrastructure. (path: ['nsubj', 'called', 'prep_on'])
• Lexical constraints based on trigger phrases/words [Heng’s several days of paper-and-pen work]
  • Mining from Gigaword, Wikipedia articles
  • Mining from KBs (Wikipedia infoboxes, Freebase, YAGO, DBPedia)
  • CMU NELL knowledge base (e.g. religion list from is-a relation)
  • Examples:
    “top-employees”: chief executive officer, chief financial officer, chief operating officer, chief strategy and development officer, chief information officer, e-commerce and security officer,…
    “headquarters”: based, headquarter, headquarters, 's
  • Disease list from medical ontology
• Comparison with competing context nodes
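The two edge-type constraints on this slide can be sketched as a simple filter. The slot keys, the table names `REQUIRED_EDGES` and `FORBIDDEN_EDGES`, and the function `keep_path` are assumptions for illustration; the actual system combines many more constraint layers than shown here.

```python
# Hypothetical encoding of the edge-type constraints named on the slide.
REQUIRED_EDGES = {
    # paths for these slots should include prep_in or prep_at
    "place_of_death": {"prep_in", "prep_at"},
    "place_of_birth": {"prep_in", "prep_at"},
}
FORBIDDEN_EDGES = {
    # filter "employee" candidates whose path includes prep_on
    "employee": {"prep_on"},
}


def keep_path(slot_type, path):
    """Return True if the path passes the edge-type constraints for the slot."""
    labels = set(path)
    required = REQUIRED_EDGES.get(slot_type)
    if required is not None and not (labels & required):
        return False
    if labels & FORBIDDEN_EDGES.get(slot_type, set()):
        return False
    return True


# "Manh called on the Asian Development Bank ..." is rejected as an employee path.
print(keep_path("employee", ["nsubj", "called", "prep_on"]))
# A death path through prep_at passes; one with no locative edge does not.
print(keep_path("place_of_death", ["nsubj", "died", "prep_at", "home"]))
print(keep_path("place_of_death", ["appos", "producer"]))
```

Treating each constraint as a set intersection keeps the filter cheap enough to run over every candidate path before the more expensive selection layers.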


Slot Filling != Binary Relation Extraction
• A sentence is usually anchored by a predicate instead of a pair of entities
• Slot fillers need to be extracted from multiple documents instead of a local context involving two entities (main difference from ACE relation extraction)
• Capture the interactions among query, candidate slot filler and all other (competing) entities from global contexts, instead of only the path between query and candidate slot filler
• Cross-slot, cross-entity reasoning is required
• Generalization of similar specific graphs
• Model a candidate mention or context word’s latent semantic role based on its local context knowledge graph


Small Universe of a Mention/Word
• Filler competing with popular entities involved in centroid events/topics
  “Hewitt was born Dec. 14, 1922, in New York City, but his family soon moved to Boston, where his father worked as the classified advertising manager for the Boston Herald American.”
  Query: Hewitt
  Candidate Filler 1: New York City
    Path: 'nsubjpass', 'born', 'prep_in'
  Candidate Filler 2: Boston
    Path: 'nsubjpass', 'born', 'conj_but', 'moved', 'prep_to'
• Small Universe of “Boston”
  [Figure: local knowledge graph around “Boston”. Node labels: Boston, Herald American, his, father, family, Hewitt; edge labels: movement, coref, mod, employee, family]


Knowledge Graph Clustering
• Hypothesis: Entity mentions/words that share similar local graph structures and labels are likely to play similar roles
• Local graphs for correct spouse fillers: [figure]
• Local graphs for incorrect spouse fillers: [figure]


Knowledge Graph Clustering: Similarity Measures
• Structure similarity
  • number of nodes
  • number of edges
  • radius
  • degree assortativity of graph
  • the maximum degree centrality for nodes in the graph (density)
• Similarity between attributes of nodes and edges (PageRank)
• More powerful for persons than organizations
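Several of the structural measures listed above can be computed directly from an adjacency map. The following is a rough sketch in plain Python, assuming small, connected, undirected local graphs; degree assortativity is omitted for brevity, and the L1 distance used to compare two feature vectors is an assumption, since the slides do not say how the features are combined.

```python
from collections import deque


def structure_features(adjacency):
    """Structural features of a connected undirected graph.

    adjacency maps each node to the set of its neighbours.
    """
    n = len(adjacency)
    m = sum(len(nb) for nb in adjacency.values()) // 2
    max_degree = max(len(nb) for nb in adjacency.values())

    def eccentricity(start):
        # Breadth-first search: largest shortest-path distance from start.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return max(dist.values())

    return {
        "nodes": n,
        "edges": m,
        "radius": min(eccentricity(v) for v in adjacency),
        "max_degree_centrality": max_degree / (n - 1) if n > 1 else 0.0,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
    }


def feature_distance(a, b):
    """Assumed comparison: sum of absolute feature differences."""
    return sum(abs(a[k] - b[k]) for k in a)


# A star-shaped local graph versus a chain with the same node and edge counts.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

star_f = structure_features(star)
chain_f = structure_features(chain)
print(star_f)
print(chain_f)
print(feature_distance(star_f, chain_f))
```

The two graphs agree on node count, edge count and density but differ in radius and maximum degree centrality, which shows why several measures are listed rather than one.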


Truth Finding
• Negative Statement
  • Steinmeier, who became Chancellor Angela Merkel's foreign minister in 2005, has denied the U.S. planned to send Kurnaz to Germany.
• Conflicting Evidence from Multiple Sources
  • Yolanda King, daughter of Martin Luther King Jr., dies ATLANTA She was 51
  • King died late Tuesday in Santa Monica, California, at age 51, said Steve Klein, a spokesman for the King Center
• A unique challenge that did not exist in traditional single-document information extraction
• Our Solution
  • Develop a new validation approach based on a "truth-finding" framework
  • Propagate evidence among system, evidence and claim


Hypotheses

Hypothesis 1: A claim is likely to be true if it is supported by many trustworthy pieces of evidence. A piece of evidence is more likely to be trustworthy if many of the claims it supports are true.

Hypothesis 2: A claim or piece of evidence is more likely to be true if it is extracted by many trustworthy systems. A system is more likely to be credible if it extracts many trustworthy claims or pieces of evidence.


2-Layer Mutual Enhancement Truth Finding

[Figure: two linked networks over three node types (Evidence, Claim, System): Claim-Evidence Networks and Claim-System Networks]


What’s New
• There might be multiple true claims, some redundant, some distinct but of the same type
• Most previous truth finding methods relied on the wisdom of the crowd (“great minds think alike"); but majority voting may not always work because certain implicit truths might only be discovered by a few good systems/sources
• The performance of a system may vary over time
• Systems may share similar resources and be dependent on each other
• Not only provide confidence scores, but also detailed evidence and aspects


Credibility Initialization
Initializing scores for claims: evaluate each claim based on evidence from its dependency path

[Figure: chain from Query through an Entity node to Filler, with a dependency tag on each link]

Example:
  sentence: US actress Patricia Neal dies at 84.
  query: Patricia Neal
  slot: per:origin
  filler: US
  dependency path from query to filler: nn


Credibility Propagation based on Tri-HITS (Huang, 2012)
• Propagate credibility scores from claim to system
• Update system credibility
• Propagate credibility from system to claim
• Update claim credibility
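The four propagation steps can be illustrated with a deliberately simplified loop over claims and systems. This is not the Tri-HITS update rule, whose formulas do not survive in this transcript; the averaging updates, the interpolation weight `lam`, the function name `propagate` and the toy data are all assumptions.

```python
def propagate(initial_claims, system_claims, lam=0.5, iterations=5):
    """Simplified mutual reinforcement between claims and systems.

    initial_claims: dict claim -> initial credibility in [0, 1]
    system_claims:  dict system -> set of claims that system extracted
    lam:            assumed weight of propagated vs. initial credibility
    """
    claims = dict(initial_claims)
    systems = {}
    for _ in range(iterations):
        # Claim -> system: a system is as credible as its claims on average.
        systems = {s: sum(claims[c] for c in cs) / len(cs)
                   for s, cs in system_claims.items()}
        # System -> claim: mix each claim's initial score with the average
        # credibility of the systems that extracted it.
        updated = {}
        for c, init in initial_claims.items():
            support = [systems[s] for s, cs in system_claims.items() if c in cs]
            propagated = sum(support) / len(support) if support else init
            updated[c] = (1 - lam) * init + lam * propagated
        claims = updated
    return claims, systems


# Toy data: claim A starts with strong path evidence, claim B with weak evidence.
initial = {"A": 0.9, "B": 0.2, "C": 0.5}
extracted_by = {"sys1": {"A"}, "sys2": {"A", "C"}, "sys3": {"B", "C"}}

claims, systems = propagate(initial, extracted_by)
print(sorted(claims, key=claims.get, reverse=True))
print(sorted(systems, key=systems.get, reverse=True))
```

Mixing the propagated score with the initial score keeps the path-based evidence from being washed out after several rounds; pure averaging would pull every score toward the same value.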


Overall Performance

System        Data                       Precision   Recall   F-Measure
2010 System   2010 Eval Data             27.4%       26.6%    27.0%
2013 System   2012 Eval Data (Approx.)   36.4%       61.0%    45.6%
2013 System   2013 Eval Data             40.7%       29.0%    33.9%

• Our fresh system significantly outperforms our old system (18.6% F-Measure difference: 45.6% vs. 27.0%)
• The pool is much better this year (11.7% F-Measure difference: 45.6% vs. 33.9%)
• Top 3 among all teams; Top 1 among all DEFT teams
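As a quick consistency check, the F-Measure column and the two quoted differences can be recomputed from the precision and recall figures, assuming the standard balanced F-measure (harmonic mean of precision and recall). The dictionary keys are labels chosen for this sketch.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) in percent, copied from the table.
rows = {
    "2010 system, 2010 data": (27.4, 26.6),
    "2013 system, 2012 data": (36.4, 61.0),
    "2013 system, 2013 data": (40.7, 29.0),
}

scores = {name: round(f_measure(p, r), 1) for name, (p, r) in rows.items()}
gain_over_old = round(scores["2013 system, 2012 data"] - scores["2010 system, 2010 data"], 1)
pool_gap = round(scores["2013 system, 2012 data"] - scores["2013 system, 2013 data"], 1)

print(scores)
print(gain_over_old, pool_gap)
```

The recomputed values match the table (27.0, 45.6, 33.9), and the differences match the 18.6% and 11.7% quoted in the bullets.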


Impact of Knowledge Path Extraction
• Alternative name feedback based query expansion: 1.1% gain in F-Measure
• Entity coreference resolution to enrich knowledge paths: 1.2% gain in F-Measure
• Relax path length constraints: 0.5% gain in F-Measure


Impact of Knowledge Path Selection and Truth Finding

[Figure: F-Measure across KBP2013 SF systems]


Remaining Challenges
• Name Tagging Errors
• Coreference Resolution Errors
  • He worked his way up the organization under founder Ted Arison and his son Micky, who now leads Carnival Corp. and called Dickinson, `` one of the most influential people in the development of the modern-day cruise industry.
  • Indiana Muslim running for Congress wants to combat ignorance about his [Andre Carson] faith INDIANAPOLIS -- A convert to Islam stands an election victory away from becoming the second Muslim elected to Congress and a role model for a faith community seeking to make its mark in national politics.
• Vague Justification
  • It was in December 1970 that Anderson criticized Hoover's pretrial attack on two Roman Catholic priests, Daniel J. and Philip F. Berrigan, who were later convicted of destroying draft board records. (religion filler?)
• Fuzzy Definition
  • She and Russell Simmons, 50, have two daughters: 8-year-old Ming Lee and 5-year-old Aoki Lee.


Remaining Challenges
• Distinguish Slot Directions
  • Organization parent/subsidiary; members/member_of
• Implicit Relations
  • He [Pascal Yoadimnadji] has been evacuated to France on Wednesday after falling ill and slipping into a coma in Chad, Ambassador Moukhtar Wawa Dahab told The Associated Press. His wife, who accompanied Yoadimnadji to Paris, will repatriate his body to Chad, the amba. (Is he dead? In Paris?)
  • Until last week, Palin was relatively unknown outside Alaska, and as facts have dribbled out about her, the McCain campaign has insisted that its examination of her background was thorough and that nothing that has come out about her was a surprise. (Does she live in Alaska?)
  • The list says that the state is owed $2,665,305 in personal income taxes by singer Dionne Warwick of South Orange, N.J., with the tax lien dating back to 1997. (Does she live in NJ?)
  • Vernon Bellecourt -- whose Ojibwe name, WaBun-Inini, means "Man of Dawn" or "Daybreak" -- was born on the White Earth Indian Reservation in Minnesota. He left home at 15 after finding work in a carnival. (Did he live in Minnesota?)


Conclusions and Future Work
• Mined and incorporated rich knowledge from multiple lexical, syntactic and semantic levels for slot filling
• Proposed a new knowledge graph representation
• Developed a new truth-finding framework for answer validation
• Married low-level IE with high-level Data Mining
• Future Work
  • Incorporate more knowledge resources such as NELL into path selection
  • Hierarchical knowledge graph clustering
  • Collective joint extraction across queries and slot types
  • Truth Finding
    • Source: publication agency, reporter’s profile, social network and his/her role in the event, reporting time and location
    • System: add history, profile and confidence values (this year’s data is not very discriminative)
    • Claim: compute similarity based on coreference resolution, entity/event clustering and equivalence, modeling complexity; distance from answers from the top-tier systems
    • Evidence Dimensions: soft constraints in path selection


Resources Sharing Plans
• January 2014
  • Heng’s paper-and-pen constraints & dictionaries
  • BLENDER KB (merged and cleaned from Wikipedia infoboxes, Freebase, YAGO and DBPedia)
• March 2014
  • Slot Filling system to share with KBP community; integrated into BBN DEFT platform
