WDAqua introduction presentation

Handling Dynamicity and Temporality of Web Data. Hady Elsahar, Jean Monnet University, Saint-Étienne, France

Transcript of WDAqua introduction presentation

Page 1: WDAqua introduction presentation

Handling Dynamicity and Temporality of Web Data

Hady Elsahar

Jean Monnet University, Saint-Étienne, France

Page 2: WDAqua introduction presentation

First try with Question Answering

Weet it: Natural language interface for Linked Data (ElSahar et al. ‘11)

Page 3: WDAqua introduction presentation

● Most of the current knowledge bases focus on static facts and ignore the temporal dimension of facts.

● Aspects of temporality and dynamicity of datasets:

○ Aspect 1: Many facts are valid only during a particular time period.

○ Aspect 2: Newly extracted facts can contradict, verify or modify existing ones.

○ Aspect 3: Some facts are collectively induced from a series of events.

Handling Dynamicity of Data

Page 4: WDAqua introduction presentation

Challenges and Motivations (1): Stephen Hawking

Many facts are valid only during a particular time period.

Use Case : Questions about Temporal facts

● Who is the first wife of Stephen Hawking?

● Who is the 10th President of France?

● Who is the past CEO of Google?

Page 5: WDAqua introduction presentation

Extraction and Representation of Temporal Data

Extraction and representation of temporal facts and events

❏ Representation:

    ❏ Keeping only the last updated fact is not enough (DBpedia)

    ❏ Higher-order facts (Erdal and Weikum ‘11)

        ❏ f1: Bill_Clinton isPresidentOf USA.

        ❏ f2: f1 startedOnDate 20-01-1993

    ❏ Wikidata qualifiers (Vrandečić ‘12)

❏ Temporal fact and event extraction:

    ❏ Free text and structured data from Wikipedia (patterns and pattern induction) (Erdal and Weikum ‘11)
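The higher-order representation above (f2 is a statement about f1) can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names, the `valid_on` helper and the `endedOnDate` predicate are hypothetical and do not come from the slides or from any particular knowledge base; only f1 and f2 are taken from the slide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    """A base triple, e.g. f1: Bill_Clinton isPresidentOf USA."""
    fact_id: str
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class MetaFact:
    """A higher-order fact: its subject is the id of another fact."""
    fact_id: str
    about: str       # id of the fact being described
    predicate: str   # "startedOnDate" (from the slide) or "endedOnDate" (assumed)
    value: date


def valid_on(fact, meta_facts, day):
    """Return True if `fact` holds on `day` according to its temporal meta-facts."""
    start = end = None
    for meta in meta_facts:
        if meta.about != fact.fact_id:
            continue
        if meta.predicate == "startedOnDate":
            start = meta.value
        elif meta.predicate == "endedOnDate":
            end = meta.value
    if start is None:
        return False  # no temporal scope known; treated as not valid in this sketch
    return start <= day and (end is None or day <= end)


f1 = Fact("f1", "Bill_Clinton", "isPresidentOf", "USA")
f2 = MetaFact("f2", "f1", "startedOnDate", date(1993, 1, 20))
# f3 is not on the slide; it is added here so the interval has an end.
f3 = MetaFact("f3", "f1", "endedOnDate", date(2001, 1, 20))

print(valid_on(f1, [f2, f3], date(1995, 6, 1)))
print(valid_on(f1, [f2, f3], date(2005, 1, 1)))
```

Keeping the time scope in separate meta-facts leaves the base triple untouched, which is why a store that keeps only the latest value per property cannot answer questions about past office holders.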

Page 6: WDAqua introduction presentation

Annotation of temporal facts in documents for Question answering

SemEval-2015 Task 5: QA TempEval

Page 7: WDAqua introduction presentation

SemEval-2015 Task 5: QA TempEval

Question examples in the evaluation dataset:

Yes / No:

● “Did the Indonesian stock market rise again after its last fall?”

List:

● “What happened after the crash?”

● “What happened between the crash and yesterday?”

When (Factoid):

● “When did the Oscar ceremony end yesterday?”

Applications ?

Page 9: WDAqua introduction presentation

(Frank Sinatra, profession, Singer) confidence : 0.9

(Jared Leto, influenced_by, Frank Sinatra) confidence : 0.8

● People influenced by writers are probably writers as well

● People are probably born in the same place as their siblings

Challenges and Motivations (2): Stephen Hawking

In highly dynamic datasets, newly extracted facts can contradict, verify or modify existing ones.

Page 10: WDAqua introduction presentation

Evaluation of new facts using Link prediction

Link Prediction

● Add new facts without extra knowledge

● Assess the validity of an unknown fact

Page 11: WDAqua introduction presentation

Embedding Models for Knowledge Bases

TransE: Modeling Relations as Translations (Bordes et al. ’13):

● Modeling facts as translations between vectors of entities: V_subject + V_relation ≈ V_object

● Distance is used to quantify confidence in facts

● Training objective: find the representations that minimize distances across all true facts and maximize them across “corrupted” facts (s’, o’).
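The formula for the training objective is not preserved in this transcript. A common way to write it is a margin-based ranking loss over pairs of true and corrupted triples, which the sketch below illustrates. The embeddings are hand-picked toy values rather than trained ones, and the function names and the margin of 1.0 are assumptions made for illustration.

```python
import math


def distance(subject_vec, relation_vec, object_vec):
    """L2 distance ||s + r - o||; a small value means the fact is plausible."""
    return math.sqrt(sum(
        (s + r - o) ** 2
        for s, r, o in zip(subject_vec, relation_vec, object_vec)
    ))


def margin_loss(true_triple, corrupted_triple, margin=1.0):
    """Hinge loss: zero once the true triple is closer than the corrupted one by `margin`."""
    return max(0.0, margin + distance(*true_triple) - distance(*corrupted_triple))


# Toy 2-dimensional embeddings, chosen by hand (not learned).
entity = {
    "Frank_Sinatra": [0.0, 0.0],
    "Singer": [1.0, 0.0],
    "Writer": [0.0, 3.0],
}
relation = {"profession": [1.0, 0.0]}

true_triple = (entity["Frank_Sinatra"], relation["profession"], entity["Singer"])
# Corrupted triple: the object is replaced by another entity.
corrupted_triple = (entity["Frank_Sinatra"], relation["profession"], entity["Writer"])

print(distance(*true_triple))       # distance of the true fact
print(distance(*corrupted_triple))  # distance of the corrupted fact
print(margin_loss(true_triple, corrupted_triple))
```

During training, the embeddings would be updated by gradient descent on this loss summed over many such pairs; at prediction time, the distance alone ranks candidate facts.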

Page 12: WDAqua introduction presentation

Other Embedding Models:

● Structured Embeddings (SE) (Bordes et al. ‘11)

● Collective Matrix Factorization (RESCAL) (Nickel et al. ’11)

● Neural Tensor Networks (Socher et al. ‘13)

● TATEC (Garcia-Duran et al. ’14)

Embedding Models for Text + Knowledge Bases:

● Joint Learning of Words and Meaning Representations (Bordes et al. ‘12)

● Knowledge Graph and Text Jointly Embedding (Wang et al. ‘14)

Link prediction using Embedding Models

Page 13: WDAqua introduction presentation

Applications?

● Verification of newly extracted facts

● Completeness of newly added datasets

● Modeling literal datatypes (length, date, etc.), not only relations and entities

Other benefits of embedding models? (collaboration potential)

● Entity Disambiguation for Fact Extraction and QA (Bordes et al. ‘12)

● Paraphrase Detection for Questions (PARALEX) (Fader et al. ‘13)

Page 14: WDAqua introduction presentation

Challenges and Motivations (3) :

Reasoning with more than one supporting fact

● Reasoning about positions (e.g. geo data)

● Reasoning about counts

● Reasoning about sizes

Fact 1 : 55 passengers crammed into the smuggler’s boat.

Fact 2 : The boat made it to the Greek island.

Question : Where are the passengers ?
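Answering this question requires chaining both facts: the passengers are in the boat, and the boat is at the Greek island. The sketch below hard-codes that chain with a hypothetical `locate` helper and hand-built dictionaries. It only illustrates the two-step reasoning; it is not a Memory Network, which would have to learn such chains from data.

```python
def locate(entity, inside, location):
    """Follow 'inside' links until reaching something with a known location."""
    seen = set()
    current = entity
    while current not in location:
        if current in seen or current not in inside:
            return None  # no supporting facts, or a containment cycle
        seen.add(current)
        current = inside[current]
    return location[current]


# Hand-built from the two facts above (an assumption: no text extraction is done here).
inside = {"passengers": "boat"}       # Fact 1: the passengers are in the boat
location = {"boat": "Greek island"}   # Fact 2: the boat reached the Greek island

answer = locate("passengers", inside, location)
print(answer)
```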

Stephen Hawking

Facts induced from a series of events

● Towards AI-Complete QA: A Set of Prerequisite Toy Tasks (Weston et al. ‘15)

● Memory Networks (Weston et al. ‘14)