A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations...

109
A Hitchhiker’s guide to Ontology Fabian M. Suchanek el´ ecom ParisTech University Paris, France me>13 intro>19 1

Transcript of A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations...

Page 1: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

A Hitchhiker’s guideto Ontology

Fabian M. SuchanekTelecom ParisTech UniversityParis, France

me>13intro>19

1

Page 2: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

permanent

money freedom

2003:

2005:

2008:

Fabian M. Suchanek

22

Page 3: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

money

permanent

freedom

2003: BSc in Cognitive ScienceOsnabruck University/DE

2005: MSc in Computer ScienceSaarland University/DE

2008: PhD in Computer Science

Max Planck Institute/DE

Fabian M. Suchanek

33

Page 4: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2003: BSc in Cognitive ScienceOsnabruck University/DE

2005: MSc in Computer ScienceSaarland University/DE

2008: PhD in Computer Science

Max Planck Institute/DE

money

permanent

freedom

Fabian M. Suchanek

44

Page 5: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

permanent

money freedom

Fabian M. Suchanek

55

Page 6: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

permanent

money freedom

Fabian M. Suchanek

66

Page 7: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

money

permanent

freedom

Fabian M. Suchanek

77

Page 8: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

money

permanent

freedom

Fabian M. Suchanek

88

Page 9: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

2012: Research group leader

Max Planck Institute/DE

permanent

money freedom

Fabian M. Suchanek

99

Page 10: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

2012: Research group leader

Max Planck Institute/DE

permanent

money freedom

Fabian M. Suchanek

1010

Page 11: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

2012: Research group leader

Max Planck Institute/DE2013: Associate Professor

Telecom ParisTech/FRmoney

permanent

freedom

Fabian M. Suchanek

1111

Page 12: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

2012: Research group leader

Max Planck Institute/DE2013: Associate Professor

Telecom ParisTech/FRmoney

permanent

freedom

Fabian M. Suchanek

1212

Page 13: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

2012: Research group leader

Max Planck Institute/DE2013: Associate Professor

Telecom ParisTech/FR2016: Full Professor

Telecom ParisTech/FR

money

permanent

freedom

Fabian M. Suchanek

1313

Page 14: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2009: PostDoc at Microsoft ResearchSilicon Valley/US

2010: PostDocINRIA Saclay/FR

2012: Research group leader

Max Planck Institute/DE2013: Associate Professor

Telecom ParisTech/FR2016: Full Professor

Telecom ParisTech/FR

money

permanent

freedom

Fabian M. Suchanek

1414

Page 15: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

15

I am an Elvis Fan!

1961

15

Page 16: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Old Elvis from Sachsmedia 16

Meeting Elvis Presley

2015

1961

16

Page 17: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Old Elvis from Sachsmedia

At Elvis’ ranchin New Zealand

Ranch is Murchison Lodge in New Zealand 17

Meeting Elvis Presley

1961

2015

17

Page 18: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

18

Knowledge Cards

Elvis Presley

18

Page 19: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

19

Knowledge Cards

Elvis Presley

???

19

Page 20: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

For us, a knowledge base (“KB”, “ontology”) is a graph, wherethe nodes are entities and the edges are relations.

20

Knowledge Bases

type

born 1935

Singer

Person

subclassOf

(We do not distinguish T-Box and A-Box.)

20

Page 21: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

21

Singer

type

born

Knowledge Base Life Cycle

1935

21

Page 22: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Elvis Presley was one

of the best blah blahblub blah don’t readthis, listen to thespeaker! blah blah blahblubl blah you are stillreading this! blah blah

blah blah blabbel blah

Born: 1935In: Tupelo...

Categories:Rock&Roll, American Singers,Academy Award winners...

Extracting from Wikipedia

22

Elvis Presley

ElvisPresley

AmericanSinger

1935 Tupelo

USAAcAward

type

bornIn

locatedInwon

born

22

Page 23: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

See it online

Intermediate extractors• clean facts• deduplicate facts and entities• check consistency

Extractor

GeoNames Extractor

Extractor

Extractor

23

Creating a large knowledge base

ensuring high quality (95%)

Extractor

Extractor

ExtractorWordNet

See it locally23

Page 24: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Intermediate extractors• clean facts• deduplicate facts and entities• check consistency

See it online See it locally

multilingual>29

Extractor

GeoNames

WordNet

Extractor

Extractor

Extractor

24

Creating a large knowledge base

ensuring high quality (95%)

Extractor

Extractor

Extractor

24

Page 25: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Extractor

25

Integrating multilingual Wikis

was born

e nato

est ne

Extractor

25

Page 26: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Extractor

26

was born

e nato

est ne

map?

Extractor

Integrating multilingual Wikis

26

Page 27: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Extractor

27

was born

e nato

est ne

?Extractor

Extractor

Integrating multilingual Wikis

27

Page 28: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

<Elvis,USA><Mao,China>

<Elvis,USA><Mao,China>

28

bornIn

est ne

Integrating multilingual Wikis

28

Page 29: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

<Elvis,USA><Mao,China>

<Elvis,USA><Mao,China>

est ne= bornIn

P (S1 ⊇ S2) > θ

29

bornIn

est ne

[CIDR 2015]

?

Integrating multilingual Wikis

29

Page 30: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

<Elvis,USA><Mao,China>

<Elvis,USA><Mao,China>

est ne= bornIn

P (S1 ⊇ S2) > θ

30

bornIn

est ne

[CIDR 2015]

?

Integrating multilingual Wikis

30

Page 31: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Example: YAGO about Elvis

3131

Page 32: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Wikipedia + WordNettime and space10 languages

100 relations100m facts10m entities95% accuracyused by DBpedia

and IBM Watson

http://yago-knowledge.org

Caveat:focus onprecision!

canonic&IBEX>42 32

YAGO: a large knowledge base

[WWW2007,JWS2008,WWW2011 demo,AIJ2013,WWW2013 demo,CIDR2015]

canonic>37

soon open-source!

32

Page 33: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

ElvisPresley

Mr.Presley

Mr.Obama

ElvisHurricane

Canonicalizing New Entities

33

Elvis?

??

??

?

?

33

Page 34: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Use hierarchical agglomerative clustering

• TF-IDF token overlap• Triple overlap• String Similarity

•Word overlap in source docs• Type overlap• Overlap of co-mentions

ElvisHurricane

ElvisPresley

Mr.Presley

Mr.Obama

details>36

Canonicalizing New Entities

34

Elvis

34

Page 35: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

• TF-IDF token overlap• Triple overlap• String Similarity

•Word overlap in source docs• Type overlap• Overlap of co-mentions

MachineLearning

Use hierarchical agglomerative clustering

ElvisHurricane

ElvisPresley

Mr.Presley

Mr.Obama

Canonicalizing New Entities

35

Elvis

35

Page 36: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

• TF-IDF token overlap• Triple overlap• String Similarity

•Word overlap in source docs• Type overlap• Overlap of co-mentions

MachineLearning

MachineLearning

Use hierarchical agglomerative clustering

ElvisHurricane

ElvisPresley

Mr.Presley

Mr.Obama

Canonicalizing New Entities

36

Elvis

36

Page 37: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

• TF-IDF token overlap• Triple overlap• String Similarity

•Word overlap in source docs• Type overlap• Overlap of co-mentions

MachineLearning

MachineLearning

Use hierarchical agglomerative clustering

ElvisHurricane

ElvisPresley

Mr.Presley

Mr.Obama

Canonicalizing New Entities

37

Elvis

[CIKM 2014]

37

Page 38: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

ElvisHurricane

ElvisPresley

Mr.Presley

Mr.Obama

IBEX>42

Canonicalizing New Entities

38

Elvis

[CIKM 2014]

38

Page 39: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

39

Goal: Harvest entities from the Web

39

Page 40: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

c© Whittleseacitylac

Unique identifiers can be verified by a checksum.They exist for products, books, documents, chemicals,...

40

IBEX: Collect unique ids

887128476661

40

Page 41: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

c© Whittleseacitylac 41

IBEX: Collect unique ids

id name

1234 Puma PowerTech Blaze1234 Please choose

1234 Puma Shoe1234 Puma PowerTech Blaze

5678Sony Cybershot TS1005678Please choose

... ...

URLu1u1u2u2u3u3...

41

Page 42: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Found• 13m email addresses

with their name• 235K chemical products

• 1.4m books• 1.1m products... with an accuracy of 73%-96%

Analyzed

• Global trade flow• frequent email providers• frequent people names and more

All data available online athttp://resources.mpi-inf.mpg.de/d5/ibex/

42

IBEX: analyses

[WebDB 2015]

42

Page 43: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

43

Singer

type

born

Knowledge Base Life Cycle

1935

43

Page 44: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

married(x, y) ∧ hasChild(x, z)⇒ hasChild(y, z)

PCA>49

Rule Mining finds patterns

married

hasChild hasChild

4444

Page 45: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

But: Rule mining needs counter examplesand RDF ontologies are positive only

married(x, y) ∧ hasChild(x, z)⇒ hasChild(y, z)

PCA>49

Rule Mining finds patterns

45

hasChild hasChild

married

45

Page 46: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Assumption:If we know r(x,y1),..., r(x,yn),then all other r(x,z) are false.

Partial Completeness Assumption

46

hasChild

46

Page 47: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Assumption:If we know r(x,y1),..., r(x,yn),then all other r(x,z) are false.

Partial Completeness Assumption

47

hasChild

47

Page 48: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Assumption:If we know r(x,y1),..., r(x,yn),then all other r(x,z) are false.

conf>49

Partial Completeness Assumption

48

hasChild

48

Page 49: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

~B ⇒ r(x, y)

married(z, x) ∧ hasChild(z, y)⇒ hasChild(x, y)

conf( ~B ⇒ r(x, y)) = #x,y: ~B∧r(x,y)#x,y:∃y′: ~B∧r(x,y′)

# of parent-child

relations that wepredict correctly

in the KB

# of parent-child

relations that wepredict and where

a child is known

Partial Completeness Assumption

4949

Page 50: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Caveat: rules cannotpredict the unknownwith high precision

AMIE is based on an efficient in-memory databaseimplementation.

r(x, y) ∧ r′(z, y)⇒ r”(x, z)

AMIE finds rules in ontologies

AMIE

(5min)

5050

Page 51: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

type(x, pope)⇒diedIn(x,Rome)

AMIE finds rules in ontologies

51

[WWW 2013, VLDB journal 2015]

AMIE

(5min)

>Rosa

51

Page 52: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Matching heterogeneous KBs

52

USA

bornInCountryUSA

TupelobornInCity locatedIn

52

Page 53: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

1. Coalesce the KBs

53

USAbornInCountry

bornInCityTupelo

locatedIn

53

Page 54: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

bornInCity(x, y) ∧ locatedIn(y, z)⇒ bornInCountry(x, z)

Caveat:Spurious

correlations

2. Mine rules

54

USAbornInCountry

AMIE

“ROSA rule”

TupelobornInCity locatedIn

54

Page 55: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

⇒ bornInCountry(x, z)bornInCity(x, y) ∧ locatedIn(y, z)

ROSA rules match ontologies

55

[AKBC 2013]

“ROSA rule”

55

Page 56: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

56

Singer

type

born

Knowledge Base Life Cycle

1935

56

Page 57: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Incompleteness

57

marriedTo

c© Girl Happy57

Page 58: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Incompleteness

58

marriedTo

c© Girl Happy58

Page 59: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Incompleteness

59

marriedTo

c© Girl Happy

?

marriedTo

Given a subject s anda relation r, do we knowall o with r(s, o) ?

59

Page 60: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Signals for Incompleteness

60

marriedTo

?

marriedTo

Closed World AssumptionPartial Completeness AssumptionPopularity oracle

AMIE oracle: Learn rules such asmoreThan1(x, hasParent)⇒ complete(x, hasParent)

No-change oracleStar-pattern oracle

Class-oracle

>Details

60

Page 61: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Signals for Incompleteness (F1)

61* = biased training sample61

Page 62: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Signals for Incompleteness

62

marriedTo

marriedTo

AMIE can predict incompleteness

• bornIn: 100% F1-measure• diedIn: 96%• directed: 100%• graduatedFrom: 87%

• hasChild: 78%• isMarriedTo: 46%... and more.

[WSDM 2017]

>paris

62

Page 63: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

If incomplete, use another KB

63

birthDate

1935

type

RockSinger

married

1935-01-08

born

Singer

type

plays

YAGO ElvisPedia

63

Page 64: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Wo is the spouseof the guitar

player?

No Links => No Use

64

birthDate

1935

type

RockSinger

married

1935-01-08

born

Singer

type

plays

YAGO ElvisPedia

64

Page 65: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Match classes, entities, & relations

subClassOf

sameAs

subPropertyOf

65

birthDate

1935

type

RockSinger

married

1935-01-08

born

Singer

type

plays

YAGO ElvisPedia >details

65

Page 66: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Match classes, entities, & relations

”Elvis” ”Elvis”

name label

6666

Page 67: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

1. Match literals

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

6767

Page 68: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

1. Match literals

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

6868

Page 69: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2. Assume small equivalence of all relations

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

6969

Page 70: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

2. Assume small equivalence of all relations

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

7070

Page 71: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

What about matching the entities?

What does it mean that both Elvises share the same name?

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

71>fun

71

Page 72: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

ifun(r, y) = 1#x:r(x,y)

ifun(r) = HMyifun(r, y)

Inverse Functionality

”Elvis”

name

1935

born

ifun(name, Elvis)= 1 / 2ifun(born, 1935)= 1/10

ifun(name)= 0.9ifun(born)= 0.1

72>fun

72

Page 73: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

ifun(r) = #x:∃y:r(x,y)#x,y:r(x,y)

# of subjectsdivided by

# of factsifun(r) = HMyifun(r, y)

Inverse Functionality

”Elvis”

name

1935

born

7373

Page 74: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

3. If subjects share a relation that is highlyinverse functional, and the object is matched,then match the subjects.

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

7474

Page 75: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

4. If relations share many pairs,

increase their match

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

7575

Page 76: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

4. If relations share many pairs,

increase their match

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

7676

Page 77: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

5. Iterate

Caveat: Convergence proof only in subcase

P (r1 ⊆ r2) = 42φγ∑...P (e1 = e2)...

P (e1 ≡ e2) =∏1

42αβ...P (r1 ⊆ r2)...

Match classes, entities, & relations

”Elvis”

name label

”Elvis”

7777

Page 78: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

6. Compute class subsumption

P (c1 ⊆ c2) = arcsin(4.1125)× P (e1 ≡ e2)× ...

Match classes, entities, & relations

singer AmericanSinger

type type

7878

Page 79: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Caveat: Class alignmentworks with “only” 85%precision due tospurious correlation.

6. Compute class subsumption

Match classes, entities, & relations

AmericanSingersinger

type type

7979

Page 80: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

PARIS matches DBpedia & YAGO

• in 2 hours• with 90% accuracy

PARIS:match entities,classes,relations

80

[VLDB 2012, APWeb 2014 invited paper]

80

Page 81: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

81

Singer

type

born

Knowledge Base Life Cycle

1935

81

Page 82: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

People may “steal” from other ontologies without giving due credit.Most ontologies have licenses that require attribution.

stolen

Plagiarism

YAGO

cc

82

EvilPedia

82

Page 83: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

By adding a few fake facts to the source ontology,one can prove theft in the target ontology.

Additive Watermarking

[WWW 2012 demo]83

stolen

YAGOEvilPedia

83

Page 84: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

One can also prove theft by selectivelyremoving facts from the source ontology.

details>76

details&DIVINA>99

Subtractive Watermarking

[ISWC 2011]84

YAGO

stolen

EvilPedia

84

Page 85: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Subtractive Watermarking

85

YAGO

85

Page 86: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

ratio of overlap

ratio of covered holes

An Innocent KB

86

YAGOElvisPedia

86

Page 87: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

high overlap

no holes covered

but

Probability that thishappens by chance

overlap ratio

number of holes

α :

n :

∼ (1− α)n

DIVINA>99

A Stolen KB

87

YAGO

Dieb-i-pedia

(Argument by Mauro Sozio; original paper uses chi square test)87

Page 94: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

CC by-sa VectorPortalc©Ethan Hill

resetpasswordby credit card

resetpasswordby backup email

resetpasswordby backup email

Two-factorauthentication

Security: How many keys does a hacker need?Safety: How many keys can I afford to lose?Protection: How many keys to destroy my data?

94

Protecting the user

Mat Honan

Owen’s storyMat’s story94

Page 95: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

gmail-destructiongmail-access

gmail-2ndfactor

gmail-

trusteddevice

phone-

authen-tication

phone-app

phone-

SMS

mac-access

dell-access

phone-possession

phone-password

phone-

simPin

gmail-password

gmail-1stfactor

gmail-recovery

code

yahoo-

data

95

Account Dependenciesgmail-data

95

Page 96: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

https://suchanek.name/programs/divina96

DIVINADiscovering Vulnerabilities of Internet Accounts

[WWW 2015 demo]

96

Page 97: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

97

Singer

type

born

Knowledge Base Life Cycle

1935

97

Page 99: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

99

Description Logics do not work

c© Vileda

+ ( - ) +

c© Avsar Arasc© Stone Mens Wear

BabyMop ≡Romper u ∃has.(Mop u ¬∃has.Stick) u ∃has.Baby≡ ⊥

Mop ≡ Tool u ∃has.Stick u ∃has.Strings

99

Page 100: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

100

Language for Combinatorial Creativity

c© Vileda

+ ( - ) +

c© Avsar Arasc© Stone Mens Wear

Romper + ∃has.(Mop − ∃has.Stick) + ∃has.Baby≡ BabyMop

Mop ≡ Tool u ∃has.Stick u ∃has.Strings

Mop − ∃has.> ≡ Tool u ∃StringsMop + ∃has.> ≡ Mop

Mop → ∃r.> ≡ Stick

Mop ↑ ∃has.> ≡ ∃has.Stick

Subtraction:Addition:Succession:Selection*:

100

Page 101: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Language for Combinatorial CreativityMop ≡ Tool u ∃has.Stick u ∃has.Strings

[ISWC 2016paper & demo]

1) Descriptive experiments 2) Generative experiments1/3 nonsense, 1/3 exists,1/3 “imaginable”

Mop − ∃has.> ≡ Tool u ∃StringsMop + ∃has.> ≡ Mop

Mop → ∃r.> ≡ Stick

Mop ↑ ∃has.> ≡ ∃has.Stick

Subtraction:Addition:Succession:Selection*:

>LeMonde

101

Page 102: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

102

Another creative idea...

1944 2013

102

Page 103: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

103

Mining Le Monde

time place entity

1967 USA

103

Page 104: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

red: many foreign companies mentionedblue: few foreign companies mentioned

104

Presence of foreign companies

104

Page 105: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

105

[AKBC 2013]

[VLDB 2014 vision]

Average age of people mentioned

105

Page 106: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

106

Singer

type

born

Knowledge Base Life Cycle

1935

106

Page 107: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

107

Is Elvis dead?

1977died

Elvis

???

107

Page 108: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

100m statements95% precision-> 5m wrong statements

108

Is Elvis dead?

1977died

Elvis

???

108

Page 109: A Hitchhiker’s guide to OntologyYAGO ElvisPedia 64. Match classes, entities, & relations subClassOf sameAs subPropertyOf 65 birthDate 1935 type RockSinger married 1935-01-08 born

Miningknowledge

bases->

DeterminingIncompleteness->

A ∧B ⇒ C

Protectingknowledge bases->

Creativly extendingknowledge bases ->

Constructingknowledge bases

109

Singer

type

born

Knowledge Base Life Cycle

1935

Slides done with Powerline, my free SVG slide editor with LaTex support109