2014 10-11 Wikidata talk London WMF UK

A free knowledge base that can be read and edited by humans and machines alike


My Wikidata talk at the WMF UK

Transcript of 2014 10-11 Wikidata talk London WMF UK

Page 1: 2014 10-11 Wikidata talk London WMF UK

A free knowledge base that can be read and edited by humans and machines alike

Page 2: 2014 10-11 Wikidata talk London WMF UK

What is WikiData?

● A project by Wikimedia Deutschland

● Launched October 2012

● An interlinked database representing “the sum of all human knowledge”

● Centralising key data about “items”

● Serving data to other Wikimedia projects

● Serving machine-readable data to third parties

● MediaWiki extension (“WikiBase”)

● Fifth most active Wikimedia project!

Page 3: 2014 10-11 Wikidata talk London WMF UK

Wikimedia Commons

Page 4: 2014 10-11 Wikidata talk London WMF UK

Centralising file storage

Page 5: 2014 10-11 Wikidata talk London WMF UK

Centralising key data storage

Page 6: 2014 10-11 Wikidata talk London WMF UK

Phases of Wikidata

1. Language/project links (done for some)

2. Statements (in progress)

3. Queries and lists (planned)

Page 7: 2014 10-11 Wikidata talk London WMF UK

Phases of Wikidata

1. Language/project links (done for some)

2. Statements (in progress)

3. Queries and lists (planned)

Currently in phase 2:

● Wikipedia

● Wikivoyage

● Wikisource

● Wikimedia Commons (partial)

● Wikiquote

Page 8: 2014 10-11 Wikidata talk London WMF UK

Phase 1 : Language links

● Old : Each Wikipedia article contains links to all other Wikipedia pages about the same topic in different languages

Page 9: 2014 10-11 Wikidata talk London WMF UK

Phase 1 : Language links

● Old : Each Wikipedia article contains links to all other Wikipedia pages about the same topic in different languages

● New : WikiData contains links to all Wikipedia pages about an item

● ca. 250,000,000 language links removed!
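The saving comes from replacing a pairwise scheme with a central one. A rough sketch of the arithmetic, assuming a topic covered in n language editions where every article used to link to every other one (the value n = 100 is hypothetical, not a figure from the talk):

```python
def links_old(n: int) -> int:
    """Links stored when each of n articles links to the other n - 1."""
    return n * (n - 1)


def links_new(n: int) -> int:
    """Links stored when one central item links to each of the n articles."""
    return n


n = 100  # hypothetical number of language editions covering one topic
print(f"old scheme: {links_old(n)} links, new scheme: {links_new(n)} links")
```

The old scheme grows quadratically with the number of languages, the new one linearly, which is why so many links could be removed from the articles themselves.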

Page 10: 2014 10-11 Wikidata talk London WMF UK

Items and “notability”

● All Wikipedia articles are automatically “notable” on Wikidata

● Items can be created without associated Wikipedia pages, if they either

– would be notable by Wikipedia standards

– serve a “structural need”

Page 11: 2014 10-11 Wikidata talk London WMF UK

Items with statements

Item ID: One unique identifier “Qxxx” per item

Label: One per item, per language

Description: One per item, per language

Alias: Multiple per item, per language

Page 12: 2014 10-11 Wikidata talk London WMF UK

Items with statements

Item ID: One unique identifier “Qxxx” per item

Label: One per item, per language

Description: One per item, per language

Alias: Multiple per item, per language

Statements: Multiple per item, per property

Links: One per item, per language/project
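The cardinalities on this slide map naturally onto a small record type. The following is only an illustrative sketch, not Wikidata's actual storage format; the class name `Item`, its field names, and the example values are assumptions made for this example:

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    item_id: str                                   # one "Qxxx" identifier per item
    labels: dict[str, str] = field(default_factory=dict)         # one per language
    descriptions: dict[str, str] = field(default_factory=dict)   # one per language
    aliases: dict[str, list[str]] = field(default_factory=dict)  # many per language
    statements: dict[str, list[str]] = field(default_factory=dict)  # many per property
    links: dict[str, str] = field(default_factory=dict)          # one per language/project


# Illustrative values only
item = Item(item_id="Q42")
item.labels["en"] = "Douglas Adams"
item.descriptions["en"] = "English writer"
item.aliases.setdefault("en", []).append("Douglas Noel Adams")
item.statements.setdefault("P31", []).append("Q5")  # "instance of" -> "human"
item.links["enwiki"] = "Douglas Adams"
print(item.labels["en"], item.statements)
```

Using a dictionary keyed by language (or property) makes the "one per language" versus "multiple per language" distinction explicit in the types.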

Page 13: 2014 10-11 Wikidata talk London WMF UK

Phase 2: Statements

Statement

Item reference

Property

Qualifier(s)

Source(s)

Rank
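As a rough sketch of how these parts fit together, the example below models a statement and picks the "best" values by rank (preferred if any exist, otherwise normal, never deprecated). The class and function names are hypothetical, the property IDs and numbers are illustrative, and this is not the WikiBase implementation:

```python
from dataclasses import dataclass, field

RANKS = ("preferred", "normal", "deprecated")


@dataclass
class Statement:
    prop: str                                  # property ID, e.g. "P1082"
    value: object                              # item reference, string, time, ...
    qualifiers: dict[str, object] = field(default_factory=dict)
    sources: list[str] = field(default_factory=list)
    rank: str = "normal"


def best_statements(statements: list[Statement]) -> list[Statement]:
    """Return preferred statements if there are any, else the normal ones."""
    preferred = [s for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.rank == "normal"]


# Illustrative data: three population figures with a "point in time" qualifier
population = [
    Statement("P1082", 8_173_941, {"P585": "2011"}),
    Statement("P1082", 8_416_535, {"P585": "2013"}, rank="preferred"),
    Statement("P1082", 1, rank="deprecated"),
]
print([s.value for s in best_statements(population)])
```

Qualifiers refine a value (here, the year it applies to), sources say where it came from, and the rank lets several values coexist while marking which one to show by default.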

Page 14: 2014 10-11 Wikidata talk London WMF UK

Datatypes

Datatypes, depending on property:

● Item reference

● string

● time (precision: from billion years to the second)

● globe coordinate

● URL

● Quantity (numeric value & precision)

● Commons media

● Monolingual string

Page 15: 2014 10-11 Wikidata talk London WMF UK

Browse in any language

English, Chinese, Scots

Page 16: 2014 10-11 Wikidata talk London WMF UK

Using Wikidata in other Wikimedia projects

● Show a statement value from the current page's item in Wikipedia etc.

● Parser function: {{#property:PROPERTY}}

● Lua scripts: mw.wikibase

● Usually “hidden away” in transcluded templates

● Popular on smaller Wikipedias

Page 17: 2014 10-11 Wikidata talk London WMF UK

Metrics

● As of October 2014

● 15.8 million items (English Wikipedia: 4.6M articles)

● ~48 million statements

– 32.5 million item references

– 10.8 million strings

– 2.5 million dates

– 1.7 million coordinates

– 283K quantities

– 927 monolingual strings (those are new...)
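As a quick sanity check on these figures, the breakdown can be summed and divided by the item count. This uses only the numbers on the slide; datatypes not listed there (such as URL and Commons media) are not included, so the sum falls slightly short of the ~48 million total:

```python
# Statement counts per datatype, as given on the slide (October 2014)
by_datatype = {
    "item references": 32_500_000,
    "strings": 10_800_000,
    "dates": 2_500_000,
    "coordinates": 1_700_000,
    "quantities": 283_000,
    "monolingual strings": 927,
}
items = 15_800_000

listed_total = sum(by_datatype.values())
per_item = listed_total / items

print(f"{listed_total:,} statements in the listed datatypes")
print(f"{per_item:.1f} statements per item on average")
```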

Page 18: 2014 10-11 Wikidata talk London WMF UK

Statements per item over time

Page 19: 2014 10-11 Wikidata talk London WMF UK

WikiData API

● Extension of MediaWiki API

● Full-“text” search

● Request all statements/labels/links etc. for individual items

● Editing via API

● OAuth bindings

● No queries for statements => items!
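A minimal sketch of the read side of the API, using the `wbgetentities` and `wbsearchentities` actions of the MediaWiki API at `www.wikidata.org/w/api.php`. The helper names, the chosen parameters, and the User-Agent string are assumptions for this example; `fetch` is defined but not called here, so nothing touches the network until you call it yourself:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def entity_url(item_id: str, props=("labels", "claims"), languages=("en",)) -> str:
    """URL requesting labels and statements (claims) for one item."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "|".join(props),
        "languages": "|".join(languages),
        "format": "json",
    }
    return API + "?" + urlencode(params)


def search_url(text: str, language: str = "en") -> str:
    """URL for a label/alias search."""
    params = {
        "action": "wbsearchentities",
        "search": text,
        "language": language,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def fetch(url: str) -> dict:
    """Fetch and decode a JSON response (performs a network request)."""
    request = Request(url, headers={"User-Agent": "wikidata-talk-example/0.1"})
    with urlopen(request) as response:
        return json.load(response)


print(entity_url("Q42"))
print(search_url("Douglas Adams"))
```

Both calls start from a known item ID or a search string; neither answers "which items have this statement?", which is the gap the last bullet points out.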

Page 20: 2014 10-11 Wikidata talk London WMF UK

The Wikidata tools ecosystem

Page 21: 2014 10-11 Wikidata talk London WMF UK

WikiData Query

● Stand-alone WikiData query server

● Uses data dumps and Recent Changes, updated every 10 minutes

● Keeps all item-to-item links, strings, times, locations in RAM

● Can be queried over HTTP, returns JSON

http://wdq.wmflabs.org/
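To illustrate why keeping item-to-item links in RAM makes "statement => items" lookups cheap, here is a toy in-memory reverse index. It is a hypothetical sketch of the general idea only, not the query server's actual data structure or query language; the class name, method names, and example IDs are assumptions:

```python
from collections import defaultdict


class ClaimIndex:
    """Toy reverse index: (property, target item) -> set of items."""

    def __init__(self) -> None:
        self._by_claim: dict[tuple[str, str], set[str]] = defaultdict(set)

    def add(self, item: str, prop: str, target: str) -> None:
        """Record that `item` has a statement `prop` pointing at `target`."""
        self._by_claim[(prop, target)].add(item)

    def items_with(self, prop: str, target: str) -> set[str]:
        """All items that have the statement prop -> target."""
        return set(self._by_claim.get((prop, target), ()))


# Illustrative data: "instance of" (P31) statements
index = ClaimIndex()
index.add("Q42", "P31", "Q5")
index.add("Q1339", "P31", "Q5")
index.add("Q64", "P31", "Q515")

print(sorted(index.items_with("P31", "Q5")))
```

Each lookup is a single dictionary access, so a server built along these lines can answer such queries without scanning every item.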

Page 22: 2014 10-11 Wikidata talk London WMF UK

Query editor

Page 23: 2014 10-11 Wikidata talk London WMF UK

People related to Queen Elizabeth II

Page 24: 2014 10-11 Wikidata talk London WMF UK

GeneaWiki

Page 25: 2014 10-11 Wikidata talk London WMF UK

AutoLists

Page 26: 2014 10-11 Wikidata talk London WMF UK

Tempo-spatial display

● Battles are “part of” Franco-Prussian War

● Battles have date or date range

● Battles have location link or coordinates

● Some have an image

Page 27: 2014 10-11 Wikidata talk London WMF UK

Reasonator

• Improved visualisation

• Special displays by item type (maps for locations, relatives for people)

• Uses statements from related items

• Automatic description

• Iterates property trees (location, species, subclass)

• Timelines, auto-lists, related images

• Quick info in item link hoverboxes

• >100,000 views this month

tools.wmflabs.org/reasonator

Page 28: 2014 10-11 Wikidata talk London WMF UK

Map of all WikiData items

Page 29: 2014 10-11 Wikidata talk London WMF UK

URLs

WikiData http://www.wikidata.org

EtherPad https://etherpad.wikimedia.org/p/LEskcETL2p

Page 30: 2014 10-11 Wikidata talk London WMF UK

Concept cloud