Flanders Open Data Day II - KeyNote - Erik Mannens
Transcript of Flanders Open Data Day II - KeyNote - Erik Mannens
ELIS – Multimedia Lab
What if we need Open Data, Linked Data, and Big Data together?
dr. Erik Mannens, @erikmannens
Open Data
Way of … Thinking
Silos of Data
“Stop Hugging your Data”
e.g. … Open Learning
Linked Open Data
Way of … Publishing
Semantic Web
Connect your Silos
5-stars (Technical Perspective)
Open Linked Data (Tim Berners-Lee)
Make your Stuff available on the Web
Make it available as Structured Data
In a non-proprietary Format
Use URLs to identify Things, so one can point at your Stuff
Link your Data to other People’s Data to provide Context
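The five technical stars above are cumulative: each level presupposes the ones before it. A minimal sketch of that scoring rule, assuming a hypothetical `Dataset` record whose field names are invented here for illustration (they are not part of any published scheme):

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative fields)."""
    on_web: bool        # 1 star: available on the Web
    structured: bool    # 2 stars: available as structured data
    open_format: bool   # 3 stars: non-proprietary format (e.g. CSV, not XLS)
    uses_urls: bool     # 4 stars: things are identified by URLs
    linked: bool        # 5 stars: linked to other people's data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a star only counts if all earlier ones are earned."""
    stars = 0
    for earned in (ds.on_web, ds.structured, ds.open_format,
                   ds.uses_urls, ds.linked):
        if not earned:
            break
        stars += 1
    return stars


spreadsheet = Dataset(True, True, False, False, False)
csv_export = Dataset(True, True, True, False, False)
print(star_rating(spreadsheet))  # 2
print(star_rating(csv_export))   # 3
```

Stopping at the first missing criterion is a design choice of this sketch: a dataset that links to other data but is only offered in a proprietary format still scores two stars.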
5-stars (Organisational Perspective)
Open Data Engagement (Tim Davies)
Be Demand-driven
Provide Context
Support Conversation
Build Skills & Capacity
Collaborate with the Community
5-stars (Functional Perspective)
Open Data Portal Functionalities (iMinds)
Dataset Registry
Metadata Provider
Co-creation Platform
Data Publishing Platform
Common Data Hub
Data as Commodity
Sidenote
R&Wbase
15’ Open Data Publishing Framework
e.g. data.gent.be
opendata.antwerpen.be
Publishes 2 to 5 Star Data
tdt/core tdt/input triple store
RESTful API for Developers
[Diagram labels: triple store; core; RESTful; data adapter; sources: CSV, XLS, JSON, XML, SPARQL endpoint, ...]
e.g. datatank.gent.be/Grondgebied/Straten or data.irail.be/NMBS/Stations
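The example addresses above follow a host/package/resource pattern. A small sketch of how a developer might build such URLs; the `resource_url` helper is hypothetical, and the convention of appending a format extension such as `.json` is an assumption made here, not something the slides confirm:

```python
from urllib.parse import quote


def resource_url(host: str, *path: str, fmt: str = "json") -> str:
    """Build a resource URL such as http://data.irail.be/NMBS/Stations.json.

    Assumption: the server selects the output format from a file-style
    extension appended to the resource path.
    """
    if not path:
        raise ValueError("at least one path segment is required")
    # Percent-encode each segment so spaces or slashes in names stay safe.
    segments = "/".join(quote(part, safe="") for part in path)
    return f"http://{host}/{segments}.{fmt}"


print(resource_url("data.irail.be", "NMBS", "Stations"))
print(resource_url("datatank.gent.be", "Grondgebied", "Straten", fmt="xml"))
```

The resulting string can be passed to any HTTP client; no request is made here, so the sketch runs offline.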
R&Wbase
git for triples
Read/Write
LINKED DATA
TRIPLE STORES: are they up for the challenge?
Distributed Triple Version Control
[Diagram labels: Commits, Deltas, Virtual graphs, Versions; store, describe, identify, resolve]
LIVE triples require fast version retrieval through a LIGHTWEIGHT algorithm
Store triples using QUADS: <subject> <predicate> <object> <context>
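To illustrate how a fourth "context" element can carry version information, here is a minimal sketch. It is not R&Wbase's actual algorithm: the `QuadLog` class and the context encoding (`"<n>+"` for a triple added in commit n, `"<n>-"` for one removed) are assumptions chosen for readability.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str  # assumed encoding: "3+" = added in commit 3, "3-" = removed


class QuadLog:
    """Append-only log of deltas; every delta triple is stored as one quad."""

    def __init__(self) -> None:
        self.quads: List[Quad] = []
        self.version = 0

    def commit(self, added: Iterable[Triple] = (),
               removed: Iterable[Triple] = ()) -> int:
        """Record one delta and return the new version number."""
        self.version += 1
        for s, p, o in added:
            self.quads.append(Quad(s, p, o, f"{self.version}+"))
        for s, p, o in removed:
            self.quads.append(Quad(s, p, o, f"{self.version}-"))
        return self.version

    def resolve(self, version: int) -> Set[Triple]:
        """Rebuild the graph at `version` by replaying deltas in order."""
        graph: Set[Triple] = set()
        for q in self.quads:
            number, sign = int(q.context[:-1]), q.context[-1]
            if number > version:
                break  # quads are appended in commit order
            triple = (q.subject, q.predicate, q.obj)
            if sign == "+":
                graph.add(triple)
            else:
                graph.discard(triple)
        return graph


log = QuadLog()
v1 = log.commit(added=[("ex:Gent", "rdfs:label", "Gent")])
v2 = log.commit(added=[("ex:Gent", "rdfs:label", "Ghent")],
                removed=[("ex:Gent", "rdfs:label", "Gent")])
print(log.resolve(v1))  # {('ex:Gent', 'rdfs:label', 'Gent')}
print(log.resolve(v2))  # {('ex:Gent', 'rdfs:label', 'Ghent')}
```

Replaying every delta is linear in the size of the history; a real system would need something faster to serve live triples, which is the concern the preceding slide raises.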
R&Wbase provides VERSION control for TRIPLE STORES with direct GRAPH access and PROVENANCE
BIG Data
Way of … Analyzing
How Difficult Can It Be?
Collaborative Effort found Higgs Boson
Deep understanding of some key Big Data markets
Banking Industry
Healthcare Industry
Marketing Industry
Smart Cities
Big (Data) Bang in Banking
• The US Securities and Exchange Commission has estimated that it would need to collect 20 terabytes of data per month to monitor all US capital market activity
• Unstructured data comprises some 80% of the total data held by the average financial institution
• The total number of non-cash payments in the EU amounted to 90.6 billion in 2011
• The total number of automated teller machines (ATMs) in the EU in 2011 was 0.44 million
• The number of point-of-sale (POS) terminals in the EU was 8.8 million in 2011
What if it were
OPEN & LINKED
e.g. … OpenSpending
e.g. … OpenSpending
e.g. … OpenBank
e.g. … OpenCorporates
e.g. … OpenCorporates - Belgium
Big (Data) Bang in Healthcare
• Medical images are increasing by 20-40% annually
• Electronic medical records: in 2009, 99% of primary care physicians in the Netherlands used EMRs, compared to 46% in the United States and 36% in Canada
• Medical research in which 100,000 participants are genotyped (ca. 1.5 GB/person) could result in a staggering 150 terabytes of data
• As of July 2012, PatientsLikeMe members have shared 4,029,661 symptom reports about 7,338 symptoms and 548,650 treatment histories about 12,838 treatments
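The genotyping estimate in this slide is simple arithmetic, which a quick check confirms. The sketch assumes decimal units (1 TB = 1,000 GB); the variable names are illustrative.

```python
participants = 100_000
gb_per_person = 1.5           # approximate figure quoted on the slide

total_gb = participants * gb_per_person
total_tb = total_gb / 1000    # decimal units: 1 TB = 1000 GB

print(total_tb)  # 150.0
```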
What if it were
OPEN & LINKED
e.g. … PatientsLikeMe
e.g. … 23AndMe
e.g. … PlayStation III
e.g. … OpenPhacts
e.g. … DisQover (iMinds – Ontoforce)
Big (Data) Bang in Marketing
• Data use is expected to grow by as much as 44 times, amounting to some 35.2 ZB globally (one zettabyte is a billion terabytes)
• Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data
• Twitter has 200 million tweets per day, or approximately 46 MB/sec of data created (August 2011)
• 25% of search results for the world's 20 largest brands are links to user-generated content
• YouTube has 3 billion views per day; 48 hours of video are uploaded per minute (May 2011)
• There are over 200,000,000 blogs; 34% of their posts are opinions about products and brands
What if it were
OPEN & LINKED
e.g. … Consumers in 1990
e.g. … Consumers in 2000
e.g. … Consumers since 2010
The Tyranny of the Empowered ConsYOUmers
e.g. … GoodRelations
e.g. … Nike
Big (Data) Bang in Smart Cities
• Data use is expected to grow by as much as 44 times, amounting to some 35.2 ZB globally (one zettabyte is a billion terabytes)
• Sensors, social media feeds, photos, video and cellphone GPS signals account for 2.5 quintillion bytes of data per day
• More than 50% of the global population lives in cities, and this share is forecast to rise to 69% by 2050
• The number of city residents is expected to grow from 3.5 billion to 5 billion in the next 20 years
• The 'Internet of Things' age is approaching: 25 billion devices connected to the Internet by 2015 and 50 billion by 2020
• Access to public data is estimated to be worth €27 billion in the EU
• ICT-enabled energy efficiency could translate into over €600 billion worth of cost savings for the public and private sector
What if it were
OPEN & LINKED
e.g. … OpenTransport
e.g. … OpenTransport
e.g. … OpenEnergyMonitor
e.g. … Big Data … in Iceland?
e.g. … a Trillion Sensors … in Iceland!
Credits
• EMC - Greenplum
• Peter Hinssen
• Scott Brinker
• Jim Lecinski
• David Armano
• Did not have time to check all licenses of the Flickr photos – in my defense, I did not kill anyone, nor did I in any way insult and/or infringe the CIA, NSA, NDA, or any other JAA (Just Another Acronym)