Linked Open Government Data and the Semantic Web

Tetherless World Constellation Open Government Data Jim Hendler Tetherless World Professor of Computer and Cognitive Science Assistant Dean of Information Technology and Web Science Rensselaer Polytechnic Institute http://www.cs.rpi.edu/~hendler @jahendler (twitter)

description

Linked data (Semantic Web) technology has been valuable in promoting govt transparency by allowing mashups of govt data in the US, UK and elsewhere. This talk overviews the promise, status and challenges in this space.

Transcript of Linked Open Government Data and the Semantic Web

Page 1: Linked Open Government Data and the Semantic Web

Tetherless World Constellation

Open Government Data

Jim Hendler

Tetherless World Professor of Computer and Cognitive Science

Assistant Dean of Information Technology and Web Science

Rensselaer Polytechnic Institute

http://www.cs.rpi.edu/~hendler

@jahendler (twitter)

Page 2: Linked Open Government Data and the Semantic Web

Government Data on the Web

Page 3: Linked Open Government Data and the Semantic Web

Current state (academic)

• Lots of data is being opened

• But much of it is opaque and contains (sometimes) significant errors

• Smart mark-up (including annotation) is needed

• But also needed are information and visual presentation capabilities to really put people in the loop

• Technical approaches are helping but curation (by people and computers) is sorely needed

Page 4: Linked Open Government Data and the Semantic Web

Linked Data + Semantics

• "Linked Data" approach finds its use cases in Web Applications (at Web scales)
  – A lot of data, a little semantics
  – Finding anything in the mess can be a win!

• Example
  – Declare simple inferable relationships and apply, at scale, to large, heterogeneous data collections
    • e.g. use InverseFunctional triangulation to find the entities that can be inferred to be the same
  – These are "heuristics": not every answer must be right (qua Google)
  – But remember time = money!
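
The "InverseFunctional triangulation" idea can be sketched in a few lines: if a property is declared inverse functional, two subjects that share a value for it are inferred to be the same entity, and those inferences chain. The sketch below is only an illustration under assumptions: the triples, the `ex:` property names and the site prefixes are all hypothetical, and a union-find over plain tuples stands in for whatever reasoner or store is actually used.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples from three sites.
TRIPLES = [
    ("siteA:agency/42", "ex:homepage", "http://www.epa.gov/"),
    ("siteB:org/epa", "ex:homepage", "http://www.epa.gov/"),
    ("siteB:org/epa", "ex:registryId", "123"),
    ("siteC:entity/9", "ex:registryId", "123"),
    ("siteC:entity/77", "ex:homepage", "http://www.ntia.doc.gov/"),
]

# Assumption: these properties are declared inverse functional.
INVERSE_FUNCTIONAL = {"ex:homepage", "ex:registryId"}


def same_entity_groups(triples, ifps):
    """Group subjects that share a value for any inverse-functional property."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # (property, value) -> first subject carrying it
    for s, p, o in triples:
        find(s)  # register every subject, even unmatched ones
        if p in ifps:
            key = (p, o)
            if key in first_seen:
                union(first_seen[key], s)
            else:
                first_seen[key] = s

    groups = defaultdict(set)
    for s in parent:
        groups[find(s)].add(s)
    return sorted(sorted(g) for g in groups.values())


groups = same_entity_groups(TRIPLES, INVERSE_FUNCTIONAL)
for g in groups:
    print(g)
```

Here the siteA and siteC subjects never share a value directly; they end up in one group only because each matches the siteB subject. As the slide says, this is a heuristic: one wrong value in the data would merge entities that are in fact different.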

Page 5: Linked Open Government Data and the Semantic Web

Fits Web Architecture

• ~2006: Web app developers discover the Semantic Web

• 2008 examples include sites from "regular" Web players such as Dow Jones, Reuters and Yahoo!

[Diagram labels: RDF Triple Store, Dynamic Content Engine, HTTP, HTML; RDF Triple Store, RDF, Web App (w/ SPARQL)]
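
The "Web App (w SPARQL)" box in the diagram amounts to an ordinary HTTP exchange: the app sends a query string to an endpoint and reads back a results document. The sketch below builds such a request and parses a response in the standard SPARQL JSON results layout. It assumes a hypothetical endpoint URL and a made-up sample response, and it deliberately does not send anything over the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint, not a real service

QUERY = """
SELECT ?dataset ?title WHERE {
  ?dataset <http://purl.org/dc/terms/title> ?title .
} LIMIT 10
"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL-over-HTTP GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(payload)
    names = doc["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in doc["results"]["bindings"]
    ]


# Made-up response, shaped like the SPARQL 1.1 JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["dataset", "title"]},
    "results": {"bindings": [
        {"dataset": {"type": "uri", "value": "http://example.org/dataset/1"},
         "title": {"type": "literal", "value": "Demo dataset"}},
    ]},
})

request = build_request(ENDPOINT, QUERY)
rows = rows_from_results(SAMPLE_RESPONSE)
print(request.get_method(), request.full_url)
print(rows)
```

Passing `request` to `urllib.request.urlopen` would perform the call against a live endpoint; everything else in the app is regular Web programming, which is the point of the slide.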

Page 6: Linked Open Government Data and the Semantic Web

Government Data on the Web

Page 7: Linked Open Government Data and the Semantic Web

What’s promising

• Linked open data (data-gov.tw.rpi.edu, data.gov.uk)

• Open (access) commons and data publishing (and citation)

• Markup languages and semantics and tools to enable transparency

• Web 2.0 to put people in the loop and use and contribute to annotations

• Lower barriers to internet visualization, e.g. Google graphics

Page 8: Linked Open Government Data and the Semantic Web

Moving data.gov to linked data (UK)

• Built around linked data with top-down push from “Number 10”

Page 9: Linked Open Government Data and the Semantic Web

Moving data.gov to linked data (US)

• Third parties (like RPI) translate the govt data into Sem Web forms and link to sources

• Plans for a semantic.data.gov in OGD implementation plans, but unfunded
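
Translating a published table into Semantic Web form while linking back to its source can be sketched as follows. This is a minimal illustration, not the conversion pipeline used for the real data: the base URI, vocabulary URI, source URL and CSV content are all hypothetical, each row simply becomes one subject, and `dcterms:source` is used as one reasonable choice of link property.

```python
import csv
import io

BASE = "http://example.org/dataset/demo/"        # hypothetical dataset namespace
VOCAB = "http://example.org/vocab/"              # hypothetical property namespace
DCT_SOURCE = "http://purl.org/dc/terms/source"   # Dublin Core "source" property
SOURCE_URL = "http://example.org/raw/demo.csv"   # hypothetical original file

SAMPLE_CSV = "state,year,value\nNY,2009,12\nCA,2009,30\n"  # made-up data


def literal(text):
    """Render a plain N-Triples literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def csv_to_ntriples(text):
    """Emit one subject per row, one triple per cell, plus a link to the source."""
    lines = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        subject = f"<{BASE}row/{i}>"
        for column, value in row.items():
            lines.append(f"{subject} <{VOCAB}{column}> {literal(value)} .")
        lines.append(f"{subject} <{DCT_SOURCE}> <{SOURCE_URL}> .")
    return lines


ntriples = csv_to_ntriples(SAMPLE_CSV)
print("\n".join(ntriples))
```

Keeping every value as a plain literal avoids guessing at datatypes the source did not declare; typing and linking to shared vocabularies can be layered on later.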

Page 10: Linked Open Government Data and the Semantic Web

Pump through to Google Viz for demos
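
One way to "pump" query results into a charting layer is to reshape the rows into the cols/rows JSON layout that Google's visualization DataTable accepts. The sketch below assumes the rows have already been flattened into plain dicts; the column names and values are made up, and the helper `to_datatable` is a hypothetical name, not part of any Google library.

```python
import json


def to_datatable(columns, rows):
    """Reshape dict rows into a DataTable-style {"cols": [...], "rows": [...]} dict.

    columns is a list of (name, type) pairs, with type such as "string" or "number".
    """
    cols = [{"id": name, "label": name, "type": kind} for name, kind in columns]
    table_rows = [
        {"c": [{"v": row.get(name)} for name, _ in columns]}
        for row in rows
    ]
    return {"cols": cols, "rows": table_rows}


# Made-up query results for illustration only.
COLUMNS = [("state", "string"), ("value", "number")]
ROWS = [{"state": "NY", "value": 12}, {"state": "CA", "value": 30}]

table = to_datatable(COLUMNS, ROWS)
print(json.dumps(table, indent=2))
```

The resulting JSON can be handed to browser-side charting code, so the demo needs no server logic beyond the query itself.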

Page 11: Linked Open Government Data and the Semantic Web

Data.gov + epa.gov

Page 12: Linked Open Government Data and the Semantic Web

Adding some Web magic

Web Analytics

Social Data Networks

External Links

Page 13: Linked Open Government Data and the Semantic Web

Identifying cross cuts in the data

Page 14: Linked Open Government Data and the Semantic Web

NTIA internet study vs. libraries

Page 15: Linked Open Government Data and the Semantic Web

NTIA internet funding vs. tweets about #haiti

Page 16: Linked Open Government Data and the Semantic Web

Visualization can help identify data errors

Correlates fires, acres burned, and agency budgets

Page 17: Linked Open Government Data and the Semantic Web

Visualization can help identify data errors

Were there really no fires in 1985?
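
The same check a viewer makes by eye on the chart can be approximated in code: flag any year whose count is zero or missing while both neighbouring years have data. The numbers below are invented purely to illustrate the check and are not the actual wildfire statistics.

```python
def suspicious_years(counts):
    """Return years that are zero or missing while both neighbours are non-zero."""
    flagged = []
    for year in range(min(counts), max(counts) + 1):
        value = counts.get(year)
        before = counts.get(year - 1)
        after = counts.get(year + 1)
        if (value is None or value == 0) and before and after:
            flagged.append(year)
    return flagged


# Invented counts for illustration only.
SAMPLE_COUNTS = {1983: 120, 1984: 98, 1985: 0, 1986: 110, 1987: 105}

flagged = suspicious_years(SAMPLE_COUNTS)
print(flagged)
```

A flag is a prompt for human curation, not a verdict: the zero may be a real value, a missing record, or a conversion error.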

Page 18: Linked Open Government Data and the Semantic Web

Combining data from different sites

Page 19: Linked Open Government Data and the Semantic Web

Presents a challenge – different ontologies

Page 20: Linked Open Government Data and the Semantic Web

Presents a challenge – different ontologies

Page 21: Linked Open Government Data and the Semantic Web

Presents a challenge – different ontologies

Same or different?
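
A small sketch of the "same or different?" decision: an explicit alignment table records which site-specific properties have been judged equivalent to a shared term, and anything not in the table is left under its original name rather than merged by guesswork. All property names and records here are hypothetical.

```python
# Hypothetical alignment: site-specific property -> shared property.
ALIGNMENT = {
    "siteA:stateName": "shared:state",
    "siteB:st": "shared:state",
    "siteA:fundingUSD": "shared:funding",
}


def align(records, alignment):
    """Rename aligned properties; keep unaligned ones under their original name."""
    return [
        {alignment.get(prop, prop): value for prop, value in record.items()}
        for record in records
    ]


# Made-up records from two sites describing similar things differently.
RECORDS = [
    {"siteA:stateName": "New York", "siteA:fundingUSD": 1000},
    {"siteB:st": "New York", "siteB:amount": 1200},
]

aligned = align(RECORDS, ALIGNMENT)
print(aligned)
```

Whether `siteB:amount` means the same thing as `shared:funding` is exactly the open question the slide raises, so the sketch leaves it unmapped until someone decides.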

Page 22: Linked Open Government Data and the Semantic Web

And many other interesting issues

• Trust
  – Government data is controversial, and potentially biased
    • How do we confirm or dispute?

• Combination
  – When we combine data we need to keep the provenance of information (see trust)
    • How can we show and use?

• Scaling
  – Data-gov Wiki has already converted 5,448,693,510 triples

• Versioning and updating

• Archiving

• Searching

• …
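
Keeping provenance when combining data can be sketched by storing each statement as a quad (subject, predicate, object, source) instead of a bare triple. Disagreements between sources then stay visible and attributable, which is what "confirm or dispute" needs. The source names and statements below are hypothetical.

```python
from collections import defaultdict


def merge_with_provenance(sources):
    """Combine per-source triples into quads that remember where each came from."""
    quads = []
    for source, triples in sources.items():
        for s, p, o in triples:
            quads.append((s, p, o, source))
    return quads


def disputed(quads):
    """Return (subject, predicate) pairs for which different values are asserted."""
    seen = defaultdict(dict)  # (s, p) -> {object: [sources]}
    for s, p, o, source in quads:
        seen[(s, p)].setdefault(o, []).append(source)
    return {key: values for key, values in seen.items() if len(values) > 1}


# Made-up statements from two hypothetical sources.
SOURCES = {
    "agency-site": [("ex:program1", "ex:budget", "100"),
                    ("ex:program1", "ex:state", "NY")],
    "third-party": [("ex:program1", "ex:budget", "90"),
                    ("ex:program1", "ex:state", "NY")],
}

quads = merge_with_provenance(SOURCES)
conflicts = disputed(quads)
print(conflicts)
```

The budget figures conflict and each value is traceable to its source, while the agreeing state values are not reported.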

Page 23: Linked Open Government Data and the Semantic Web

Summary

• The Open Govt data is a great playground
  – Government data released as RDF (UK)
  – Government data converted to RDF (US)
  – Government data that can be found in many forms and used or converted (WWW)

• Great showcase for the web nature of the Semantic Web
  – Mashups

• But many challenges remain
  – Scaling, Trust, Provenance, Archiving, Curation, …