Flanders Open Data Day II - KeyNote - Erik Mannens

Transcript of Flanders Open Data Day II - KeyNote - Erik Mannens

Page 1: Flanders Open Data Day II - KeyNote - Erik Mannens

ELIS – Multimedia Lab

What if

dr. Erik Mannens @erikmannens

Open Data, Linked Data, and Big Data

We need

together

Page 2: Flanders Open Data Day II - KeyNote - Erik Mannens

Open Data

Page 3: Flanders Open Data Day II - KeyNote - Erik Mannens

Way of … Thinking

Page 4: Flanders Open Data Day II - KeyNote - Erik Mannens

Silos of Data

Page 5: Flanders Open Data Day II - KeyNote - Erik Mannens

“Stop Hugging your Data”

Page 6: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 7: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … Open Learning

Page 8: Flanders Open Data Day II - KeyNote - Erik Mannens

Linked Open Data

Page 9: Flanders Open Data Day II - KeyNote - Erik Mannens

Way of … Publishing

Page 10: Flanders Open Data Day II - KeyNote - Erik Mannens

Semantic Web

Page 11: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 12: Flanders Open Data Day II - KeyNote - Erik Mannens

Connect your Silos

Page 13: Flanders Open Data Day II - KeyNote - Erik Mannens

5-stars (Technical Perspective)

Open Linked Data (Tim Berners-Lee)

Make your Stuff available on the Web

Make it available as Structured Data

In a non-proprietary Format

Use URLs to identify Things, so one can point at your Stuff

Link your Data to other People’s Data to provide Context
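
The five steps above are cumulative: a dataset earns a star only if it has also earned every earlier one. A minimal Python sketch of that reading follows; the `Dataset` fields and the `star_rating` helper are hypothetical names chosen for illustration, not part of any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical checklist for one published dataset."""
    on_web: bool       # star 1: available on the Web
    structured: bool   # star 2: available as structured data
    open_format: bool  # star 3: in a non-proprietary format
    uses_urls: bool    # star 4: URLs identify things
    links_out: bool    # star 5: linked to other people's data


def star_rating(ds: Dataset) -> int:
    """Count stars in order, stopping at the first unmet criterion."""
    stars = 0
    for met in (ds.on_web, ds.structured, ds.open_format,
                ds.uses_urls, ds.links_out):
        if not met:
            break
        stars += 1
    return stars


# A CSV file on a website: structured and open, but no URLs or links yet.
csv_on_website = Dataset(True, True, True, False, False)
print(star_rating(csv_on_website))  # 3
```

Stopping at the first unmet criterion is a design choice of this sketch: it means, for example, that a linked dataset in a proprietary format still scores only two stars.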

Page 14: Flanders Open Data Day II - KeyNote - Erik Mannens

5-stars (Organisational Perspective)

Open Data Engagement (Tim Davies)

Be Demand-driven

Provide Context

Support Conversation

Build Skills & Capacity

Collaborate with the Community

Page 15: Flanders Open Data Day II - KeyNote - Erik Mannens

5-stars (Functional Perspective)

Open Data Portal Functionalities (iMinds)

Dataset Registry

Metadata Provider

Co-creation Platform

Data Publishing Platform

Common Data Hub

Page 16: Flanders Open Data Day II - KeyNote - Erik Mannens

Data as Commodity

Page 17: Flanders Open Data Day II - KeyNote - Erik Mannens

Sidenote

R&Wbase

Page 18: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 19: Flanders Open Data Day II - KeyNote - Erik Mannens

15’ Open Data Publishing Framework

e.g. data.gent.be

opendata.antwerpen.be

Page 20: Flanders Open Data Day II - KeyNote - Erik Mannens

Publishes 2 to 5 Star Data

tdt/core tdt/input triple store

Page 21: Flanders Open Data Day II - KeyNote - Erik Mannens

RESTful API for Developers

[Diagram labels: triple store; core RESTful data adapter; CSV, XLS, JSON, XML, SPARQL endpoint, ...]

e.g. datatank.gent.be/Grondgebied/Straten or data.irail.be/NMBS/Stations
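
The two example addresses follow a host/package/resource pattern. The Python sketch below only builds such addresses and does not contact any server. The `resource_url` helper is hypothetical, and the optional format suffix (for example `.json`) is an assumption about how a client might ask for one of the listed formats; the slide itself does not confirm it.

```python
from urllib.parse import quote


def resource_url(host, package, resource, fmt=None, scheme="http"):
    """Build a host/package/resource address like the examples on the slide.

    `fmt` appends a format suffix such as ".json"; that convention is an
    assumption made for illustration, not something stated in the talk.
    """
    path = "/".join(quote(part) for part in (package, resource))
    suffix = f".{fmt}" if fmt else ""
    return f"{scheme}://{host}/{path}{suffix}"


print(resource_url("data.irail.be", "NMBS", "Stations"))
# http://data.irail.be/NMBS/Stations
print(resource_url("datatank.gent.be", "Grondgebied", "Straten", fmt="json"))
# http://datatank.gent.be/Grondgebied/Straten.json
```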

Page 22: Flanders Open Data Day II - KeyNote - Erik Mannens

R&Wbase

git for triples

Page 23: Flanders Open Data Day II - KeyNote - Erik Mannens

Read/Write

LINKED DATA

Page 24: Flanders Open Data Day II - KeyNote - Erik Mannens

TRIPLE STORES are they up for the challenge?

Page 25: Flanders Open Data Day II - KeyNote - Erik Mannens

Distributed Triple Version Control

Commits

Deltas Virtual graphs

Versions

store describe

identify resolve

Page 26: Flanders Open Data Day II - KeyNote - Erik Mannens

LIVE triples require fast version retrieval through a LIGHTWEIGHT algorithm

Page 27: Flanders Open Data Day II - KeyNote - Erik Mannens

Store triples using QUADS: <subject> <predicate> <object> <context>
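
One way to picture version control on top of quads is to let the fourth element, the context, name the delta in which a triple was added or removed. The Python sketch below illustrates that idea only; it is not R&Wbase's actual algorithm. The `Quad` type, the `+`/`-` context labels, the `resolve` helper, and the sample data are all hypothetical.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str  # hypothetical delta label: "<delta>+" adds, "<delta>-" removes


def resolve(quads, deltas):
    """Rebuild the triples visible after replaying `deltas` in order."""
    triples = set()
    for delta in deltas:
        removed = {q[:3] for q in quads if q.context == f"{delta}-"}
        added = {q[:3] for q in quads if q.context == f"{delta}+"}
        # Within one delta, removals are applied before additions.
        triples = (triples - removed) | added
    return triples


# Made-up data: delta d2 replaces the label written by delta d1.
store = [
    Quad("ex:city", "ex:label", "Gent", "d1+"),
    Quad("ex:city", "ex:label", "Gent", "d2-"),
    Quad("ex:city", "ex:label", "Ghent", "d2+"),
]

print(resolve(store, ["d1"]))        # {('ex:city', 'ex:label', 'Gent')}
print(resolve(store, ["d1", "d2"]))  # {('ex:city', 'ex:label', 'Ghent')}
```

Because every change stays in the store as a quad, any earlier version can be rebuilt by replaying a shorter list of deltas. A real system would presumably index quads by context rather than scan the whole list for each delta.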

Page 28: Flanders Open Data Day II - KeyNote - Erik Mannens

R&Wbase provides VERSION control for TRIPLE STORES with direct GRAPH access and PROVENANCE

Page 29: Flanders Open Data Day II - KeyNote - Erik Mannens

BIG Data

Page 30: Flanders Open Data Day II - KeyNote - Erik Mannens

Way of … Analyzing

Page 31: Flanders Open Data Day II - KeyNote - Erik Mannens

How Difficult Can It Be?

Page 32: Flanders Open Data Day II - KeyNote - Erik Mannens

Collaborative Effort found Higgs Boson

Page 33: Flanders Open Data Day II - KeyNote - Erik Mannens

Banking Industry

Healthcare Industry

Marketing Industry

Smart Cities

Deep understanding of some key Big Data markets

Page 34: Flanders Open Data Day II - KeyNote - Erik Mannens

Big (Data) Bang in Banking

•  The US Securities and Exchange Commission has estimated that it would need to collect 20 terabytes of data per month to monitor all US capital market activity

•  Unstructured data comprises some 80% of the total data held by the average financial institution

•  The total number of non-cash payments in the EU amounted to 90.6 billion in 2011

•  The total number of automated teller machines (ATMs) in the EU in 2011 was 0.44 million

•  The number of point-of-sale (POS) terminals in the EU was 8.8 million in 2011

Page 35: Flanders Open Data Day II - KeyNote - Erik Mannens

What if it were

OPEN & LINKED

Page 36: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenSpending

Page 37: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenSpending

Page 38: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenBank

Page 39: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenCorporates

Page 40: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenCorporates - Belgium

Page 41: Flanders Open Data Day II - KeyNote - Erik Mannens

Big (Data) Bang in Healthcare

•  Medical images are increasing by 20-40% annually

•  Electronic medical records: in 2009, 99% of primary care physicians in the Netherlands used EMRs, compared to 46% in the United States and 36% in Canada

•  Medical research, in which 100,000 participants are genotyped (ca. 1.5 GB/person), could result in a staggering 150 terabytes of data

•  As of July 2012 PatientsLikeMe members have shared 4,029,661 symptom reports about 7,338 symptoms and 548,650 treatment histories about 12,838 treatments
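
The genotyping figure can be checked with a few lines of Python. The sketch assumes decimal units (1 TB = 1,000 GB); the slide does not say which convention it uses, and the variable names are illustrative only.

```python
participants = 100_000
gb_per_person = 1.5  # "ca. 1.5 GB/person" from the slide

total_gb = participants * gb_per_person
total_tb = total_gb / 1000  # assumption: decimal units, 1 TB = 1,000 GB

print(total_tb)  # 150.0
```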

Page 42: Flanders Open Data Day II - KeyNote - Erik Mannens

What if it were

OPEN & LINKED

Page 43: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … PatientsLikeMe

Page 44: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … 23AndMe

Page 45: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … PlayStation III

Page 46: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenPhacts

Page 47: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … DisQover (iMinds – Ontoforce)

Page 48: Flanders Open Data Day II - KeyNote - Erik Mannens

Big (Data) Bang in Marketing

•  Data use is expected to grow by as much as 44 times, amounting to some 35.2 ZB (zettabytes; 1 ZB is a billion terabytes) globally

•  Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data

•  Twitter has 200 million tweets per day or approximately 46 MB/sec of data created (August 2011)

•  25% of search results for the world’s top 20 largest brands are links to user-generated content

•  YouTube has 3 billion views per day, and 48 hours of video are uploaded per minute (May 2011)

•  There are over 200,000,000 blogs: 34% of their posts are opinions about products & brands
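
The two Twitter figures can be related with a short calculation. The Python sketch below assumes decimal megabytes and a constant rate over a full day. The slide does not say what the 46 MB/sec covers, so the per-tweet figure is only what the two numbers imply together.

```python
tweets_per_day = 200_000_000
bytes_per_second = 46_000_000  # 46 MB/sec, assuming 1 MB = 1,000,000 bytes
seconds_per_day = 24 * 60 * 60

bytes_per_day = bytes_per_second * seconds_per_day
bytes_per_tweet = bytes_per_day // tweets_per_day

print(bytes_per_day)    # 3974400000000, i.e. about 3.97 TB per day
print(bytes_per_tweet)  # 19872, i.e. roughly 20 KB per tweet
```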

Page 49: Flanders Open Data Day II - KeyNote - Erik Mannens

What if it were

OPEN & LINKED

Page 50: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … Consumers in 1990

Page 51: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … Consumers in 2000

Page 52: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … Consumers since 2010

Page 53: Flanders Open Data Day II - KeyNote - Erik Mannens

The Tyranny of the Empowered ConsYOUmers

Page 54: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 55: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 56: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … GoodRelations

Page 57: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … Nike

Page 58: Flanders Open Data Day II - KeyNote - Erik Mannens

Big (Data) Bang in Smart Cities

•  Data use is expected to grow by as much as 44 times, amounting to some 35.2 ZB (zettabytes; 1 ZB is a billion terabytes) globally

•  Sensors, social media feeds, photos, video and cellphone GPS signals account for 2.5 quintillion bytes of data per day

•  More than 50% of the global population lives in cities and this number is forecast to rise to 69% by 2050

•  The number of city residents is expected to grow from 3.5 billion to 5 billion in the next 20 years

•  The ‘Internet of Things’ Age is approaching: 25 billion devices connected to the Internet by 2015 and 50 billion by 2020

•  Access to public data is estimated to be worth €27 billion in the EU

•  ICT-enabled energy efficiency could translate into over €600 billion worth of cost savings for the public and private sector

Page 59: Flanders Open Data Day II - KeyNote - Erik Mannens

What if it were

OPEN & LINKED

Page 60: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenTransport

Page 61: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenTransport

Page 62: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … OpenEnergyMonitor

Page 63: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 64: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 65: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … Big Data … in Iceland?

Page 66: Flanders Open Data Day II - KeyNote - Erik Mannens

e.g. … a Trillion Sensors … in Iceland!

Page 67: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 68: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 69: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 70: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 71: Flanders Open Data Day II - KeyNote - Erik Mannens

Page 72: Flanders Open Data Day II - KeyNote - Erik Mannens

QUESTIONS?

dr. Erik Mannens [email protected]

@erikmannens

Thoughts?

Page 73: Flanders Open Data Day II - KeyNote - Erik Mannens

Credits

•  EMC - Greenplum

•  Peter Hinssen

•  Scott Brinker

•  Jim Lecinski

•  David Armano

•  Did not have time to check all licenses of the Flickr photos – in my defense, I did not kill anyone nor did I in any way insult and/or infringe the CIA, NSA, NDA, or any other JAA (Just Another Acronym)