Enterprise Knowledge Graphs for Large Scale Analytics · “We defines a Knowledge Graph as an RDF...
Enterprise Knowledge Graphs for Large Scale Analytics
Nidhi Rajshree, IBM Watson, USA; Nitish Aggarwal, IBM Watson, USA; Sumit Bhatia, IBM Research, India; Anshu Jain, IBM Research, Almaden, USA

The material presented in this tutorial represents the personal opinions of the presenters and not those of IBM or affiliated organizations.

Outline of the tutorial
Part 1: Knowledge Graph Construction
• Introduction
• DBpedia: Knowledge extraction
• Approaches to extend knowledge graph
• Knowledge extraction from scratch
Part 2: Knowledge Graph Analytics
• Finding entities of interest
• Entity exploration
• Upcoming challenges

What is Knowledge Graph
“The Knowledge Graph is a knowledge base used by Google to enhance its search engine's search results with semantic-search information gathered from a wide variety of sources.”
“A Knowledge graph (i) mainly describes real world entities and interrelations, organized in a graph (ii) defines possible classes and relations of entities in a schema (iii) allows potentially interrelating arbitrary entities with each other…” [Paulheim H.]
“We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered set of following RDF term….” [Pujara J. et al.]

What is Knowledge Graph
No single formal definition…
• Defines real world entities
• Provides relationships between them
• Contains rules defined through ontologies
• Enables reasoning to infer new knowledge

Why Knowledge Graph
Building an intelligent system that can interact with humans requires knowledge about real-world entities.
• Enhances search results.
• Enhances ad sense.
• Helps in language understanding.
• Enables knowledge discovery.

Is there an existing knowledge graph ready to use for my application?
Google Knowledge Graph
Facebook Entity Graph
Microsoft Satori
LinkedIn Knowledge Graph
Amazon Product Graph

DBpedia: Knowledge extraction
The City of New York, often called New York City or simply New York, is the most populous city in the United States.
<NewYorkCity>, <CityIn> <UnitedStates>.
<CityName>, <locatedIn> <CountryName>.

DBpedia: Knowledge extraction
<head entity>, <rel> <tail entity>
Wikipedia Infobox

DBpedia: Knowledge extraction
Parsers
Ontology (Classes, properties)
dbr:IBM dbp:foundedBy dbr:Charles_Ranlett_Flint
……………

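The extraction output above is a list of (subject, predicate, object) triples. As a minimal sketch of how such triples might be held in memory and queried by pattern, the Python below uses only the two example triples from the slides; the `match` helper is a hypothetical name introduced for illustration and is not part of DBpedia or any RDF library.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

# Example triples copied from the slides; a real DBpedia dump holds many more.
TRIPLES = [
    ("dbr:IBM", "dbp:foundedBy", "dbr:Charles_Ranlett_Flint"),
    ("NewYorkCity", "CityIn", "UnitedStates"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that agrees with the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# Who founded IBM, according to the stored triples?
founders = [o for _, _, o in match(TRIPLES, s="dbr:IBM", p="dbp:foundedBy")]
print(founders)
```

A production system would more likely use an RDF store and SPARQL; the list-of-tuples form is only meant to make the head/relation/tail structure concrete.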
(Research) problems in knowledge graphs
• Incomplete knowledge
  – Missing entities
  – Missing relations
  – Limited entity and relation types
• Incorrect knowledge
  – Wrong entity label recognition
  – Wrong entity and relation type
  – Wrong facts
• Inconsistency in knowledge
  – Different labels for same entity
  – Merging entities with same labels

Approaches to extend knowledge graphs
• Extracting knowledge from Wikipedia tables
  – Large amount of raw data in the form of tables
  – Tables have some implicit structure/patterns
Example: Wiki:AFC_Ajax, containing relations between players, their shirt numbers, and countries

Approaches to extend knowledge graphs
• <Wiki:AFC_Ajax, dbp:rel, Wiki:Andre_Onana>
• 80% of the entities in the table have the relation dbp:rel with the Wikipedia title entity Wiki_AFC_Ajax
• The other 20% of the entities are likely to have the same relationship dbp:rel with Wiki_AFC_Ajax
[Munoz E. et al.] Using Linked Data to Mine RDF from Wikipedia's Tables, WSDM 2014

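The 80%/20% observation above can be read as a simple heuristic: if most entities in a table column already share one relation with the page's title entity, propose that relation for the remaining entities. The sketch below is an illustration under that reading, not the method of Munoz et al.; `propose_missing`, the 0.8 threshold default, and the `Wiki:Player_*` names are assumptions made up for the example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def propose_missing(title: str, column: List[str], known: List[Triple],
                    threshold: float = 0.8) -> List[Triple]:
    """Suggest triples for column entities that lack the dominant relation.

    threshold=0.8 mirrors the 80% figure on the slide; it is a tunable assumption.
    """
    # Map each relation to the column entities it already links to the title entity.
    linked: Dict[str, Set[str]] = {}
    for s, p, o in known:
        if s == title and o in column:
            linked.setdefault(p, set()).add(o)
    if not linked:
        return []
    # Pick the relation that covers the most entities in the column.
    rel = max(linked, key=lambda p: len(linked[p]))
    if len(linked[rel]) / len(column) < threshold:
        return []
    return [(title, rel, e) for e in column if e not in linked[rel]]


# Only Wiki:Andre_Onana comes from the slide; the other players are placeholders.
column = ["Wiki:Andre_Onana", "Wiki:Player_B", "Wiki:Player_C",
          "Wiki:Player_D", "Wiki:Player_E"]
known = [("Wiki:AFC_Ajax", "dbp:rel", e) for e in column[:4]]
candidates = propose_missing("Wiki:AFC_Ajax", column, known)
print(candidates)
```

Candidates produced this way are guesses rather than facts, which is why the following slides combine many features in a classifier instead of relying on one rule.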
Approaches to extend knowledge graphs
[Munoz E. et al.] Using Linked Data to Mine RDF from Wikipedia's Tables, WSDM 2014
• Features
  – Article features: no. of tables, length
  – Table features: no. of rows, no. of columns
  – Column features: no. of entities in column, potential relations
  – Cell features: no. of entities in a cell, length of cell
  – Many others
• Features are combined using a classification method

Method       Prec.   Rec.    F1
Rule-based   64.23   70.46   67.20
SVM          72.43   75.77   74.06
Logistic     79.62   79.01   79.31

• Rule/heuristic-based methods make mistakes, and it is hard to create one rule that fits every case.
• Even though combining different features reaches roughly 80% precision, about 20% of the extracted triples are still noise.

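As a quick consistency check on the results table, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the precision and recall values reported on the slide; the function and variable names are illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Precision and recall (in percent) as reported on the slide.
reported = {
    "Rule-based": (64.23, 70.46),
    "SVM": (72.43, 75.77),
    "Logistic": (79.62, 79.01),
}

scores = {name: round(f1(p, r), 2) for name, (p, r) in reported.items()}
for name, score in scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

The recomputed values agree with the F1 column of the table to two decimal places.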
Table data is limited; we need to go beyond it

Approaches to extend knowledge graphs
• Missing entity/literal for a relation
  – “Christopher A. Welty is an American computer scientist, who works at Google Research in NY”
    • <dbr:Chris_Welty> <employedBy> <?>
  – "Tom Cruise and Brad Pitt appear in Interview with the Vampire"
    • <dbr:Brad_Pitt> <?> <dbr:Tom_Cruise>
• Knowledge Base Completion
  – Similar to link prediction in social networks, but a bit more challenging
  – Need to identify the relation type in addition to the binary output.

Approaches to extend knowledge graphs
• Knowledge Base Completion
  – TransE: learns entity and relation embeddings by assuming that a relation corresponds to a translation between entity embeddings. [Bordes et al. 2013]
  – S + R ≈ T for a triple <S, R, T>
  – TransH: learns a different entity embedding for different relationships. [Wang et al. 2014]
  – TransR: learns entity and relation embeddings in different spaces, followed by a translation performed in the relation space. [Lin Y. et al. 2015]
  – Many more methods [Nickel M. et al. 2015]

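To make the S + R ≈ T idea concrete, the sketch below scores candidate tail entities by how close S + R lands to each candidate's embedding. It covers scoring only and leaves out training; the 2-dimensional embeddings are hand-picked, hypothetical values chosen so that the arithmetic is easy to follow, and `transe_score` and `rank_tails` are illustrative names.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def transe_score(s: Vector, r: Vector, t: Vector) -> float:
    """Negative Euclidean distance between s + r and t; higher means more plausible."""
    return -math.sqrt(sum((si + ri - ti) ** 2 for si, ri, ti in zip(s, r, t)))


# Hypothetical embeddings (not learned); real models use many more dimensions.
entity: Dict[str, Vector] = {
    "dbr:Chris_Welty": (0.0, 1.0),
    "Google_Research": (1.0, 1.0),
    "IBM": (3.0, 0.0),
}
relation: Dict[str, Vector] = {"employedBy": (1.0, 0.0)}


def rank_tails(head: str, rel: str) -> List[str]:
    """Rank all other entities as fillers for <head> <rel> <?>."""
    candidates = [e for e in entity if e != head]
    return sorted(candidates,
                  key=lambda e: transe_score(entity[head], relation[rel], entity[e]),
                  reverse=True)


ranking = rank_tails("dbr:Chris_Welty", "employedBy")
print(ranking)
```

In an actual TransE model the embeddings are learned so that true triples score higher than corrupted ones; here they are fixed by hand purely to show how a missing tail would be ranked.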
Knowledge base completion approaches focus on finding missing entities/relations

Need to add new entities from external sources
• Entity recognition in external text resources
  – Many Named Entity Recognition systems
• Link the extracted entity to the KG, or create a new node if it does not have a corresponding entity
• TAC-KBP (Entity Discovery and Linking task) [Ji H. et al. 2016]

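The "link or create a new node" step can be sketched as a lookup with a fallback. The example below is a deliberately simplified illustration: it matches mentions to existing nodes by normalized label only, whereas real entity linking systems also use context to disambiguate. The `TinyKG` class and the `node:N` identifier scheme are hypothetical.

```python
from typing import Dict, Tuple


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivially different mentions match."""
    return " ".join(label.lower().split())


class TinyKG:
    """Toy knowledge graph that only tracks node identifiers by label."""

    def __init__(self) -> None:
        self.label_to_id: Dict[str, str] = {}

    def link_or_create(self, mention: str) -> Tuple[str, bool]:
        """Return (node_id, created); created is True when a new node was added."""
        key = normalize(mention)
        if key in self.label_to_id:
            return self.label_to_id[key], False
        node_id = f"node:{len(self.label_to_id)}"
        self.label_to_id[key] = node_id
        return node_id, True


kg = TinyKG()
first = kg.link_or_create("IBM")        # no such node yet, so one is created
second = kg.link_or_create("  ibm ")    # same label after normalization, so it links
third = kg.link_or_create("SMU")        # a different entity gets its own node
print(first, second, third)
```

Exact label matching would wrongly merge distinct entities that share a name, which is the "merging entities with same labels" problem listed earlier.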
Building a knowledge graph such as DBpedia requires a lot of manual effort
• Many applications require domain/data-specific custom knowledge graphs.
• Creating a schema with class structure and constraints for each KG is difficult.

How to create a knowledge graph from unstructured text?

Jonathon Watson works at IBM. He has more than 50 patents, and won best inventor award for his invention “Neural Chip” by Jon Watson et al.
Pipeline: Entity extraction → Relation extraction → Noise reduction → KG
• Entity extraction: Jonathon Watson, IBM, Jon Watson
• Relation extraction: employedBy(Jonathon Watson, IBM); Jon Watson
• Noise reduction: Jonathon Watson, Jon Watson

Relation extraction
• Supervised methods
Predefined schema (employedBy, bornOn, BirthPlace …)
Training data:
  Jonathon Watson works at IBM. → employedBy
  Jonathon Watson joined IBM. → employedBy
Test data:
  Jonathon Watson is manager at IBM. → ? (predicted: employedBy)
Pros: High accuracy and less noise
Cons: Hard and expensive to build labeled data

Relation extraction
• Supervised methods
• Distantly supervised methods
Known facts:
  employedBy(Jon Watson, IBM)
  affiliated(Michael Decker, SMU)
Training sentences:
  Jon Watson works at IBM.
  Jon Watson becomes VP at IBM.
  ……….
  Michael Decker joins Data Science group at SMU.
  Michael Decker won a national funding award at SMU.
  ……….
Pros: Overcomes the effort of labeling data
Cons: Depends on an existing knowledge graph and corresponding text

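The labeling step in distant supervision can be sketched in a few lines: every sentence that mentions both arguments of a known fact is assumed to express that fact's relation. The facts and sentences below are the ones from the slide; the `label_sentences` helper and the plain substring matching are simplifying assumptions made for illustration.

```python
from typing import List, Tuple

Fact = Tuple[str, str, str]  # (relation, entity1, entity2)

# Facts and sentences as shown on the slide.
facts: List[Fact] = [
    ("employedBy", "Jon Watson", "IBM"),
    ("affiliated", "Michael Decker", "SMU"),
]
sentences = [
    "Jon Watson works at IBM.",
    "Jon Watson becomes VP at IBM.",
    "Michael Decker joins Data Science group at SMU.",
    "Michael Decker won a national funding award at SMU.",
]


def label_sentences(facts: List[Fact], sentences: List[str]) -> List[Tuple[str, str]]:
    """Label a sentence with a relation when it mentions both entities of a fact."""
    labeled = []
    for sentence in sentences:
        for relation, e1, e2 in facts:
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, relation))
    return labeled


training = label_sentences(facts, sentences)
for sentence, relation in training:
    print(f"{relation}: {sentence}")
```

Note that the funding-award sentence is labeled `affiliated` only because both entities co-occur in it, which shows how this labeling assumption can introduce noisy training examples.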
Relation extraction
• Supervised methods
• Distantly supervised methods
• Unsupervised methods (OpenIE, Universal Schema)
Jonathon Watson works at IBM.
(ROOT (S (NP (Jon Watson)) (VP (VBZ works) (PP (IN at) (NP IBM)))))
→ work
Jonathon Watson joined IBM.
(ROOT (S (NP (Jon Watson)) (VP (VBD joined) (NP IBM))))
→ join
Pros: Eliminates the effort of labeling data
Cons: Noisy, large number of relations

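The slide derives relation phrases from parse trees. As a much simpler stand-in (not OpenIE itself), the sketch below takes the text between two already-recognized entity mentions as the relation phrase; the `relation_between` helper is a hypothetical name, and the entity mentions are assumed to be given.

```python
from typing import Optional


def relation_between(sentence: str, e1: str, e2: str) -> Optional[str]:
    """Return the text between the first mention of e1 and the next mention of e2."""
    start = sentence.find(e1)
    if start == -1:
        return None
    after_e1 = start + len(e1)
    end = sentence.find(e2, after_e1)
    if end == -1:
        return None
    phrase = sentence[after_e1:end].strip()
    return phrase or None


extracted = [
    relation_between("Jonathon Watson works at IBM.", "Jonathon Watson", "IBM"),
    relation_between("Jonathon Watson joined IBM.", "Jonathon Watson", "IBM"),
]
print(extracted)
```

The two sentences yield two different surface relations for what a schema would treat as one, which illustrates the "large number of relations" drawback and motivates the grouping shown on the next slides.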
Relation extraction
• Supervised methods
• Distantly supervised methods
• Unsupervised methods (OpenIE, Universal Schema)
Relation 1: Works, employer, Company, employedBy, ….
Relation 2: livesIn, currentCity, Country, ….
Relation 3: Vice President, executive, Board member, ….

Relation extraction (Universal Schema)
• Clustering using vector similarity
• Matrix completion to fill the empty values [Yao L. et al., 2012]
Columns: employedBy, affiliated, leaderOf (x marks an observed cell; column positions were lost in extraction)
Jon x x
Michael x
Steve x x
Joyce x x x

Entity types identification (Universal Schema)
• Clustering using vector similarity
• Matrix completion to fill the empty values [Yao L. et al., 2012]
Columns: director, musician, actor (x marks an observed cell; column positions were lost in extraction)
Jon x x
Michael x x
Steve x
Joyce x x

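The "clustering using vector similarity" bullet can be illustrated by treating each relation as a column vector over entities and comparing columns with cosine similarity. This sketch covers only that similarity step, not the matrix factorization of Yao et al. Because the exact cell positions are not recoverable from the slide, the observations below are made-up, hypothetical data.

```python
import math
from itertools import combinations
from typing import Dict, List, Set

# Hypothetical entity-by-relation observations; NOT the cells from the slide.
observed: Dict[str, Set[str]] = {
    "Jon": {"employedBy", "affiliated"},
    "Michael": {"affiliated"},
    "Steve": {"employedBy", "affiliated"},
    "Joyce": {"employedBy", "affiliated", "leaderOf"},
}
relations = ["employedBy", "affiliated", "leaderOf"]


def column(rel: str) -> List[float]:
    """Binary vector with one entry per entity: 1.0 if the relation was observed."""
    return [1.0 if rel in observed[e] else 0.0 for e in observed]


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


similarity = {(a, b): cosine(column(a), column(b))
              for a, b in combinations(relations, 2)}
closest = max(similarity, key=similarity.get)
print(closest, round(similarity[closest], 3))
```

Relations whose columns are highly similar are candidates for the same cluster; with this toy data that is `employedBy` and `affiliated`.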
Relation extraction in domain
• Supervised methods – Need domain experts to label the data
• Distantly supervised methods – Hard to find corresponding text
• Unsupervised methods (OpenIE, Universal Schema) – Noisy
A 59-year-old African American man with a past medical history of hypertension, benign prostatic hypertrophy, type II diabetes mellitus for the past 15 years, and chronic back pain presents to the hospital with gross hematuria. The patient states that he noticed blood in his urine last night. The patient also reports mild, intermittent flank pain. The patient states that his diabetes and blood pressure are well controlled with medications, and that he has managed his chronic back pain with 2 aspirin per day for the past 4 years. Vital signs are Temp- 98.6°F, BP- 124/82 mm/Hg, pulse- 88/min, and RR- 14/min. Blood work is notable for HbA1C of 6.5%. A pyelogram reveals a ring sign. His current fasting glucose is 140mmol/L.
What is the most likely etiology of hematuria in this patient?
Symptom

Knowledge graphs in domain
• Domain-specific entity extraction is more challenging
• Limited relation types
• Less explicit mention of entity and relation types in text
• Creating a simple schema requires domain experts

Knowledge graph - Simple
Jonathon Watson works at IBM.
Michael Decker joined IBM.
Michael Decker attends SMU
Nodes: Jonathon Watson, IBM, Michael Decker, SMU

Knowledge graph - Simple + Schema
Jonathon Watson works at IBM.
Michael Decker joins IBM.
Michael Decker attended SMU
Nodes: Jonathon Watson, IBM, Michael Decker, SMU
Edge labels: affiliated, affiliated, affiliated

Knowledge graph - Simple + Schema + Ontology
Jonathon Watson works at IBM.
Michael Decker joined IBM.
Michael Decker attends SMU
Nodes: Jonathon Watson, IBM, Michael Decker, SMU
Edge labels: affiliated, affiliated, affiliated
Ontology: Domain, range, constraint

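What the ontology layer adds over a plain schema can be sketched as a domain/range check on each edge. In the example below, the type assignments (`Person`, `Organization`) and the rule that `affiliated` links a person to an organization are assumptions introduced for illustration; the slide names only the concepts "domain, range, constraint".

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Assumed entity types; the slides do not state them.
entity_type: Dict[str, str] = {
    "Jonathon Watson": "Person",
    "Michael Decker": "Person",
    "IBM": "Organization",
    "SMU": "Organization",
}

# Assumed ontology: relation -> (domain type, range type).
schema: Dict[str, Tuple[str, str]] = {"affiliated": ("Person", "Organization")}


def violations(triples: List[Triple]) -> List[Tuple[str, str, str, str]]:
    """Return triples that break a domain or range constraint, with the reason."""
    bad = []
    for s, p, o in triples:
        if p not in schema:
            bad.append((s, p, o, "unknown relation"))
            continue
        domain, rng = schema[p]
        if entity_type.get(s) != domain:
            bad.append((s, p, o, "domain"))
        elif entity_type.get(o) != rng:
            bad.append((s, p, o, "range"))
    return bad


graph = [
    ("Jonathon Watson", "affiliated", "IBM"),
    ("Michael Decker", "affiliated", "IBM"),
    ("Michael Decker", "affiliated", "SMU"),
]
reversed_edge = [("IBM", "affiliated", "Michael Decker")]  # deliberately wrong direction
print(violations(graph))
print(violations(graph + reversed_edge))
```

Checks like this let a knowledge graph reject or flag wrongly extracted facts, at the cost of having to define the types and constraints up front.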
Summary
• A simple knowledge graph works for many applications.
• Identify the requirement before finding the solution.
• Many knowledge graphs are publicly available.
https://www.youtube.com/watch?v=kao05ArIiok&feature=youtu.be