Integrating Wikidata in the DBpedia ecosystem

Presenter: Ali Ismayilov
Supervisors: Dimitris Kontokostas, Prof. Dr. Sören Auer


Wikidata to DBpedia integration

● Use Wikidata like all other Wikipedia editions

● Apply our extractors to each Wikidata item

● Generate triples in the DBpedia domain: wikidata.dbpedia.org/resource/
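As a minimal sketch of the URI scheme above, the helper below builds a resource URI in the wikidata.dbpedia.org domain from a Wikidata item ID. The function name `dbpedia_resource_uri` and the `http://` scheme are assumptions made for illustration; they are not taken from the extraction framework.

```python
import re

# Base namespace from the slide; the "http://" scheme is an assumption.
BASE = "http://wikidata.dbpedia.org/resource/"


def dbpedia_resource_uri(item_id: str) -> str:
    """Build the DBpedia-domain URI for a Wikidata item ID such as "Q64"."""
    if not re.fullmatch(r"Q[1-9][0-9]*", item_id):
        raise ValueError(f"not a Wikidata item ID: {item_id!r}")
    return BASE + item_id


print(dbpedia_resource_uri("Q64"))
```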

Wikidata Layout

Simple Extractors

● WikidataLabelExtractor

● WikidataDescriptionExtractor

● WikidataAliasExtractor

● WikidataSameAsExtractor

● WikidataNameSpaceSameAsExtractor

● WikidataLLExtractor

WikidataLabelExtractor

● A label is the language-specific name used for items, properties and queries.

● Example output:

wd:Q64 rdfs:label "Berlyn"@fy .

wd:Q64 rdfs:label "Berlin"@gd .
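The label extraction can be sketched as follows. This is an illustration, not the framework's code: it assumes the item arrives in the Wikidata JSON layout, where `labels` is keyed by language code and each entry carries a `value`. The function name `label_triples` is hypothetical.

```python
from typing import Dict, List


def label_triples(item: Dict) -> List[str]:
    """Emit one rdfs:label triple per language found in the item's labels."""
    subject = "wd:" + item["id"]
    triples = []
    for lang in sorted(item.get("labels", {})):
        value = item["labels"][lang]["value"]
        # Escape backslashes and quotes so the literal stays valid Turtle.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} rdfs:label "{escaped}"@{lang} .')
    return triples


# Sample input assumed for illustration, shaped like Wikidata's JSON.
item = {
    "id": "Q64",
    "labels": {
        "fy": {"language": "fy", "value": "Berlyn"},
        "gd": {"language": "gd", "value": "Berlin"},
    },
}
for triple in label_triples(item):
    print(triple)
```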

WikidataAliasExtractor

● Aliases are shown under "Also known as" in the user interface.

● They are language-specific alternative labels; an item can have as many as necessary.

● Example output:

wd:Q64 dbo:alias "Berlin, Germany"@en .

wd:Q64 dbo:alias "Горад Берлін"@be .

WikidataDescriptionExtractor

● A description is a language-specific descriptive phrase for an item, property or query.

● A description does not need to be unique on its own, but it must be unique together with the label.

● Example output:

wd:Q64 dbo:description "Hööftstadt vun Düütschland un en Bundsland"@de .

wd:Q64 dbo:description "Hauptstadt von Deutschland"@de .

WikidataSameAsExtractor

● owl:sameAs links to the URIs of DBpedia resources

● Example output:

wd:Q64 owl:sameAs <http://de.dbpedia.org/resource/Berlin> .

wd:Q64 owl:sameAs <http://nl.dbpedia.org/resource/Berlijn> .
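One way such links could be produced is sketched below. It assumes the links are derived from the item's Wikipedia sitelinks in the Wikidata JSON layout (`sitelinks` keyed by site ID such as `dewiki`, each with a `title`) and that a DBpedia resource name is the page title with spaces replaced by underscores. Both the derivation and the name `same_as_triples` are assumptions for illustration.

```python
import re
from typing import Dict, List


def same_as_triples(item: Dict) -> List[str]:
    """Emit owl:sameAs links to language-specific DBpedia resources."""
    subject = "wd:" + item["id"]
    triples = []
    for site in sorted(item.get("sitelinks", {})):
        # Keep only site IDs of the form "<2-3 letter language code>wiki";
        # other projects such as "commonswiki" are skipped.
        match = re.fullmatch(r"([a-z]{2,3})wiki", site)
        if match is None:
            continue
        lang = match.group(1)
        title = item["sitelinks"][site]["title"].replace(" ", "_")
        triples.append(
            f"{subject} owl:sameAs <http://{lang}.dbpedia.org/resource/{title}> ."
        )
    return triples


# Sample input assumed for illustration.
item = {
    "id": "Q64",
    "sitelinks": {
        "dewiki": {"site": "dewiki", "title": "Berlin"},
        "nlwiki": {"site": "nlwiki", "title": "Berlijn"},
        "commonswiki": {"site": "commonswiki", "title": "Berlin"},
    },
}
for triple in same_as_triples(item):
    print(triple)
```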

WikidataNameSpaceSameAsExtractor

● owl:sameAs for namespaces

● Example output:

wd:Q64 owl:sameAs <http://wikidata.org/entity/Q64> .

wd:Q30 owl:sameAs <http://wikidata.org/entity/Q30> .

WikidataLLExtractor

● owl:sameAs links for sitelinks

● Example output:

dbr:Berlin owl:sameAs <http://pt.dbpedia.org/resource/Berlim> .

dbr:Berlin owl:sameAs <http://it.dbpedia.org/resource/Berlino> .

Wikidata Statement

Property based mapping

● Mappings from Wikidata properties are used to transform triples into the DBpedia ontology.

● Hard-coded JSON configuration file (subject to change in the future)

Property based mapping

● The following mapping options are available:

○ One-to-one mapping

■ Represented as String to String in the JSON config file.

■ Example config line: "P102": "party"

○ One-to-one mapping, extended version

■ Represented as String to Object in the JSON config file.

■ The Wikidata property is mapped to a DBpedia ontology property and the Wikidata value is mapped to a new value.

■ Example config line: "P214": {"owl:sameAs": "http://viaf.org/viaf/$1"}

Property based mapping

○ One-to-many mapping

■ More than one triple is created for a single Wikidata statement.

■ Represented as String to Array in the configuration file.

■ Example config line:

"P109": [
  {"thumbnail": "http://commons.wikimedia.org/wiki/Special:FilePath/$1?width=300"},
  {"foaf:depiction": "http://commons.wikimedia.org/wiki/Special:FilePath/$1"}
]
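The three mapping cases can be sketched as a small interpreter over the configuration. The config entries are the ones shown on the slides; the function name `apply_mapping` and the reading of `$1` as a placeholder for the Wikidata value are assumptions for illustration, not the framework's actual implementation.

```python
from typing import Dict, List, Tuple

# Config entries copied from the slides.
CONFIG: Dict[str, object] = {
    "P102": "party",
    "P214": {"owl:sameAs": "http://viaf.org/viaf/$1"},
    "P109": [
        {"thumbnail": "http://commons.wikimedia.org/wiki/Special:FilePath/$1?width=300"},
        {"foaf:depiction": "http://commons.wikimedia.org/wiki/Special:FilePath/$1"},
    ],
}


def apply_mapping(config: Dict[str, object], prop: str, value: str) -> List[Tuple[str, str]]:
    """Return (predicate, object) pairs for one Wikidata property/value pair."""
    rule = config.get(prop)
    if rule is None:
        return []  # unmapped property: nothing is generated
    if isinstance(rule, str):
        return [(rule, value)]  # one-to-one: value is passed through
    # Extended one-to-one is a single object; one-to-many is a list of them.
    entries = rule if isinstance(rule, list) else [rule]
    pairs = []
    for entry in entries:
        for predicate, template in entry.items():
            pairs.append((predicate, template.replace("$1", value)))
    return pairs


print(apply_mapping(CONFIG, "P102", "Q49768"))
print(apply_mapping(CONFIG, "P214", "12345"))
print(apply_mapping(CONFIG, "P109", "Example.svg"))
```

Treating the extended one-to-one case as a one-element list keeps the three cases on a single code path.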

WikidataR2RExtractor

Example output:

wd:Q64 <http://www.georss.org/georss/point> "52.516666666 13.383333333"^^xsd:string .

wd:Q64 rdf:type geo:SpatialThing .

wd:Q64 geo:lat "52.516666666"^^xsd:float .

wd:Q64 geo:long "13.383333333"^^xsd:float .

wd:Q64 dbo:thumbnail <http://commons.wikimedia.org/wiki/Special:FilePath/Locator_map_Berlin_in_Germany.svg?width=300> .

wd:Q64 foaf:depiction <http://commons.wikimedia.org/wiki/Special:FilePath/Locator_map_Berlin_in_Germany.svg> .
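The coordinate part of this output shows one Wikidata value expanding into several triples. A minimal sketch of that expansion follows; it assumes the latitude and longitude are available as decimal-degree strings and that the type triple uses the W3C WGS84 class geo:SpatialThing. The function name `geo_triples` is hypothetical.

```python
from typing import List


def geo_triples(subject: str, lat: str, lon: str) -> List[str]:
    """Expand one coordinate pair into georss and WGS84 triples."""
    return [
        f'{subject} <http://www.georss.org/georss/point> "{lat} {lon}"^^xsd:string .',
        f"{subject} rdf:type geo:SpatialThing .",
        f'{subject} geo:lat "{lat}"^^xsd:float .',
        f'{subject} geo:long "{lon}"^^xsd:float .',
    ]


# Decimal-degree values for Berlin, passed as strings to avoid float formatting.
for triple in geo_triples("wd:Q64", "52.516666666", "13.383333333"):
    print(triple)
```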

Qualifier

Mapping of qualifiers

● There are several approaches for representing n-ary relations: http://www.w3.org/TR/swbp-n-aryRelations/

● We use the RDF reification vocabulary to represent n-ary relations: http://www.w3.org/TR/rdf-schema/#ch_reificationvocab

Mapping of qualifiers

● Pros of this approach:

○ Simple

○ Closer to the DBpedia OWL structure than the other approaches

● For this purpose, unique statement URIs are created.

Mapping of qualifiers

● WikidataReificationExtractor

● WikidataReificationMapping

WikidataReificationExtractor

Example output:

wd:Q30_P6_Q23
  rdf:type rdf:Statement ;
  rdf:subject wd:Q30 ;
  rdf:predicate wd:P6 ;
  rdf:object wd:Q23 ;
  wd:P580 "1789-04-30"^^xsd:date ;
  wd:P582 "1797-03-04"^^xsd:date .
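The reification pattern in this output can be sketched as follows. The statement URI scheme (subject, property and value joined by underscores) is inferred from the example `wd:Q30_P6_Q23`; the framework's real scheme may differ for other value types. The function name `reify` is hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def reify(subject: str, prop: str, value: str,
          qualifiers: List[Tuple[str, str]]) -> List[Triple]:
    """Describe one Wikidata statement as an rdf:Statement with qualifiers."""
    # Statement URI pattern assumed from the slide example.
    statement = f"wd:{subject}_{prop}_{value}"
    triples = [
        (statement, "rdf:type", "rdf:Statement"),
        (statement, "rdf:subject", f"wd:{subject}"),
        (statement, "rdf:predicate", f"wd:{prop}"),
        (statement, "rdf:object", f"wd:{value}"),
    ]
    # Each qualifier becomes a further triple about the statement itself.
    for qualifier_prop, literal in qualifiers:
        triples.append((statement, f"wd:{qualifier_prop}", literal))
    return triples


triples = reify(
    "Q30", "P6", "Q23",
    [("P580", '"1789-04-30"^^xsd:date'), ("P582", '"1797-03-04"^^xsd:date')],
)
for s, p, o in triples:
    print(s, p, o, ".")
```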

WikidataReificationMapping

Example output:

wd:Q30_P6_Q23
  rdf:type rdf:Statement ;
  rdf:subject wd:Q30 ;
  dbo:startDate "1789-04-30"^^xsd:date ;
  dbo:endDate "1797-03-04"^^xsd:date .
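The mapping step can be sketched as a lookup from Wikidata qualifier properties to DBpedia ontology properties. The table below is an assumption: `dbo:endDate` for P582 appears in the example output, and `dbo:startDate` for P580 is assumed by analogy. Dropping unmapped qualifiers is a choice made for this sketch, and the name `map_qualifiers` is hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed qualifier mapping, for illustration only.
QUALIFIER_MAP: Dict[str, str] = {
    "P580": "dbo:startDate",
    "P582": "dbo:endDate",
}


def map_qualifiers(qualifiers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Replace Wikidata qualifier properties with DBpedia ontology ones."""
    mapped = []
    for prop, literal in qualifiers:
        predicate = QUALIFIER_MAP.get(prop)
        if predicate is not None:  # qualifiers without a mapping are skipped
            mapped.append((predicate, literal))
    return mapped


mapped = map_qualifiers([
    ("P580", '"1789-04-30"^^xsd:date'),
    ("P582", '"1797-03-04"^^xsd:date'),
])
for predicate, literal in mapped:
    print("wd:Q30_P6_Q23", predicate, literal, ".")
```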

WikidataReferenceExtractor

● All Wikidata references are extracted using the same statement URIs.

● Example output:

wd:Q76_P552_Vc1118 dbo:reference "http://www.nytimes.com/2009/01/22/us/politics/22obama.html?ref=us&_r=0"^^xsd:string .

Thank You

Tools and code

● Wikidata Toolkit

● https://github.com/alismayilov/extraction-framework/tree/wikidataAllCommits

● It will be merged into the main branch after it has been announced on the DBpedia mailing list and feedback has been received.