Making Linked Data Diachronic Vassilis Christophides University of Crete & FORTH-ICS Heraklion, Crete

Page 1

Making Linked Data Diachronic

Vassilis Christophides

University of Crete & FORTH-ICS

Heraklion, Crete

Page 2

Data as an asset!

• One of the most significant changes of the past decade has been the widespread recognition of data as an asset
  – Data is the new “raw material of business” (The Economist)

Data Products

Page 3

Emerging Data Ecosystem

Big Data has blurred the distinction between public and private

[Figure: data ecosystem. Labels: Public; Volunteered Data; Curated Data; Observed Data]

Page 4

Emerging Data Subjects

data marketers

data brokers

data aggregators

http://www.ftc.gov/bcp/workshops/privacyroundtables/personalDataEcosystem.pdf

A series of data stewards, custodians, and curators are producing, consuming and brokering data products, forming a far more complex value-making chain than in traditional enterprise or scientific contexts

Page 5

Any data, any size, anywhere

What to Do with this Data?

• Search:
  – Find structured data when it’s relevant to search queries
• Visualize, enhance, communicate to relevant audiences
  – Support communities [biodiversity, climate, water, …]
• Relate data across sources
• Fuse data from multiple sources
  – Data integration!

Immersive insight, wherever you are

Connecting with the world’s data

Microsoft’s Approach to Big Data

Page 6

Emerging Data Life-cycle

http://www.ipsr.ku.edu/naddi/about.shtml

Page 7

Data as a Service (DaaS)

Data as a Service

Software as a Service

Platform as a Service

Infrastructure as a Service

© www.emc.com/collateral/software/white-papers/h10839-big-data-as-a-service-perspt.pdf

• DaaS promises that data products can be provided on demand to the user regardless of geographic or organizational separation of provider & consumer

• DaaS brings the notion that data-related services (aggregation, quality control, cleansing and enrichment) can happen in a centralized place, offering the data to different systems, applications or mobile users, irrespective of where they are
  – Virtualized
  – On-demand
  – Self-service
  – Scalable
  – Pay as you go

Page 8

Data Marketplaces

• Services that make it easy to find data from a range of secondary data sources, then consume the data in a usable and unified format
  – Several of these services are trying to create marketplaces for data, envisioning that data providers can offer their data sets for sale to data seekers (DataMarket.com)

Data Aggregation and Curation Layer

Data Connection Layer

Data Visualization and Analysis Layer

Data Hosted by Third Party

Data Hosted by Data Provider

Data Hosted in Marketplace

Data as a Service

Preservation Service

Page 9

9 Vertical Data Markets

Source: François Bancilhon (Data Publica), “de data rerum”, WOD Tutorials 2013, Paris

Vertical                          Example             Size (M€)
Financial                         Reuters             300
Press                             Press Index         250
Legal                             Francis Lefebvre    240
Solvability                       Altarès             160
Scientific, Technical, Medical    Meteo France        160
Image                             Sipa                60
Economy                           Société.com         55
Marketing                         Acxiom              55
Patents                           Reuters             25

Page 10

Only a Small Portion of Big Data!

idgknowledgehub.com/idc-releases-first-worldwide-big-data-technology-and-services-market-forecast-shows-big-data-as-the-next-essential-capability-and-a-foundation-for-the-intelligent-economy/2012/05/07/

Page 11

Data Hub for Market Intelligence

Source: Hjalmar Gislason (DataMarket, Inc.), “Emerging DaaS business models: A case study”, European Data Forum (EDF), Dublin, 2013

Page 12

hortonworks.com/blog/7-key-drivers-for-the-big-data-market

Page 13

Potential Benefits of Linked Data for Data Marketplaces

• Abstraction layer for virtualized data access across sources
  – Basis for automating dataset discovery, linking & fusion
• Flexible data representation model (RDF) and global identifiers for all objects (URI)
  – Eases incremental data integration, interactive exploration and ad hoc analysis of data
• Interlinked datasets
  – Newly added data can be integrated with existing ones in the marketplace
  – Network effects
• Data marketplace interoperability
  – Data from different marketplaces can be easily federated
• Derived knowledge / facts
  – RDF inference of additional implicit facts
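The last point, inference of implicit facts, can be illustrated with a small sketch. It is not taken from the talk: it applies only two RDFS entailment rules (subclass transitivity and type propagation) to plain Python tuples, and every `ex:` name is a hypothetical placeholder.

```python
# Minimal forward-chaining sketch of two RDFS entailment rules over
# (subject, predicate, object) tuples. All ex: names are hypothetical.
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer(triples):
    """Return the closure of `triples` under rules rdfs9 and rdfs11."""
    closure = set(triples)
    changed = True
    while changed:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    # rdfs11: rdfs:subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
                elif p == RDF_TYPE:
                    # rdfs9: an instance of a class is an instance of its superclasses
                    new.add((s, RDF_TYPE, o2))
        changed = not new <= closure
        closure |= new
    return closure


explicit = {
    ("ex:Reuters", RDF_TYPE, "ex:FinancialDataProvider"),
    ("ex:FinancialDataProvider", SUBCLASS, "ex:DataProvider"),
    ("ex:DataProvider", SUBCLASS, "ex:Organization"),
}
implicit = infer(explicit) - explicit
for triple in sorted(implicit):
    print(triple)
```

A production setting would more likely rely on an RDF store or reasoner; the nested loop here only aims to make the idea of derived facts concrete.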

Page 14

Web Data of Increasing Standardization

Not all linked data is open and not all open data is linked!

★ Available on the web (whatever format) but with an open license, to be Open Data

★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)

★★★ as (2) plus non-proprietary format (e.g. CSV instead of Excel)

★★★★ as (3), plus using open standards from W3C (RDF and SPARQL) to identify things through dereferenceable HTTP URIs, to ensure effective access

★★★★★ as all the above plus establishing links between data of different sources

File format    Recommendation (on a scale of 0-5)
csv            ★★★
xls            ★
pdf            ★
doc            ★
xml            ★★★★
rdf            ★★★★★
shp            ★★★
ods            ★★
tiff           ★
jpeg           ★
json           ★★★
txt            ★
html           ★★
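The star scheme is cumulative: each level presupposes all the previous ones. The sketch below encodes that reading as a small function. The `Dataset` class and its field names are assumptions made for illustration, and the function follows the five criteria as worded on the slide rather than the per-format recommendations in the table.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (field names are assumptions)."""
    on_web_open_license: bool      # criterion for 1 star
    machine_readable: bool         # criterion for 2 stars
    non_proprietary_format: bool   # criterion for 3 stars
    uses_rdf_and_uris: bool        # criterion for 4 stars
    linked_to_other_sources: bool  # criterion for 5 stars


def star_rating(d: Dataset) -> int:
    """Count stars cumulatively: a level only counts if all lower levels hold."""
    criteria = [
        d.on_web_open_license,
        d.machine_readable,
        d.non_proprietary_format,
        d.uses_rdf_and_uris,
        d.linked_to_other_sources,
    ]
    stars = 0
    for satisfied in criteria:
        if not satisfied:
            break
        stars += 1
    return stars


open_csv = Dataset(True, True, True, False, False)
linked_rdf = Dataset(True, True, True, True, True)
unlicensed_rdf = Dataset(False, True, True, True, True)
print(star_rating(open_csv), star_rating(linked_rdf), star_rating(unlicensed_rdf))
```

Under this reading, a dataset without an open license earns no stars however well it is structured, which matches the slide's remark that not all linked data is open.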

Page 15

Key Players Offers Classification

[Figure: classification of key players' offers. Recoverable label: Data Cube]

Page 16

DIACHRON Objectives & Approach

[Figure: data lifecycle. Labels: Appraising; Integrating; Archiving; Producing; Publishing; Cleaning]

• Preserve (semi-)structured, interrelated, evolving data by keeping them constantly accessible & reusable from an open framework such as the Data Web

• Calls for effective & efficient techniques to manage the lifecycle of web data involving data producers, curators, brokers and consumers
  – Pay-as-you-go data preservation spreading costs among key players in a community of interest

• Diachronic Data: Enhance data with temporal and provenance annotations as data products are re-used through complex value-making chains
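One way to picture temporal and provenance annotations is to attach a validity interval and a source to each fact. The sketch below is only an illustration under that assumption: the class, its fields, and all `ex:` identifiers and values are hypothetical and do not come from the DIACHRON data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class AnnotatedFact:
    """A triple plus temporal and provenance annotations (hypothetical model)."""
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_to: Optional[date]  # None means the fact is still valid
    source: str               # provenance: the dataset the fact came from
    curated_by: str           # provenance: who asserted or repaired it

    def holds_on(self, day: date) -> bool:
        """True if `day` falls in the half-open interval [valid_from, valid_to)."""
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


def facts_on(facts: List[AnnotatedFact], day: date) -> List[AnnotatedFact]:
    """Return the facts that were valid on the given day."""
    return [f for f in facts if f.holds_on(day)]


# Illustrative values only; they are not real measurements.
facts = [
    AnnotatedFact("ex:datasetX", "ex:recordCount", "1000",
                  date(2010, 1, 1), date(2012, 1, 1), "ex:release1", "ex:curatorA"),
    AnnotatedFact("ex:datasetX", "ex:recordCount", "1200",
                  date(2012, 1, 1), None, "ex:release2", "ex:curatorB"),
]
for fact in facts_on(facts, date(2011, 6, 1)):
    print(fact.obj, fact.source)
```

Keeping the annotations next to the fact means a consumer further down the chain can still ask when a value held and where it came from.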

Page 17

DIACHRON Research Agenda

• How can we assess the quality of harvested datasets in order to decide which of them (the data quality dimensions problem), and how many versions of them (the appraisal problem), deserve to be preserved for future use?

• How can we understand dependencies of datasets (the provenance problem), and how can metadata (temporal, spatial, thematic) be smoothly represented along with the data (the annotation problem)?

• How can we monitor changes of third-party datasets (the evolution tracking problem), and how can local/remote data imperfections (e.g., due to change propagation) be repaired (the curation problem)?

• How do we cite particular versions of a dataset (the citation problem), and how will we be able to retrieve them when looking up a reference (the long-term accessibility problem)?

• How do we maintain the consistency of multiple versions of dependent datasets (the archiving problem), and how will we access the datasets along their evolution history (the longitudinal querying problem)?
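As a concrete reading of the evolution tracking problem, the simplest form of change detection is a low-level delta between two versions of a dataset, each viewed as a set of triples. The sketch below assumes that representation; the function names and the `ex:` triples are hypothetical, and it does not attempt higher-level change recognition.

```python
def diff_versions(old, new):
    """Low-level delta: the triples added and deleted between two versions."""
    old, new = set(old), set(new)
    return {"added": new - old, "deleted": old - new}


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and a delta."""
    return (set(old) - delta["deleted"]) | delta["added"]


# Two hypothetical versions of a third-party dataset.
v1 = {
    ("ex:geneA", "ex:label", "Gene A"),
    ("ex:geneA", "ex:status", "provisional"),
}
v2 = {
    ("ex:geneA", "ex:label", "Gene A"),
    ("ex:geneA", "ex:status", "approved"),
    ("ex:geneA", "ex:seeAlso", "ex:geneB"),
}
delta = diff_versions(v1, v2)
print(sorted(delta["added"]))
print(sorted(delta["deleted"]))
```

Because the delta is enough to rebuild `v2` from `v1`, it can also serve as a unit of change to propagate to dependent datasets.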

Page 18

DIACHRON Data Services & Work Plan

[Figure: DIACHRON architecture and work packages]

Data sources: Knowledge Bases; Datasets; Linked Open Data cloud

Service groups: Acquisition Services (WP5); Annotation Services (WP2); Evolution Services (WP3); Archiving Services (WP4)

Service components: Quality-driven Adaptive Crawling; Ranking and Appraisal; Temporal and Provenance Annotations; Diachronic Citations; Change Recognition and Propagation; Cleaning and Repairing; Multiversion Archiving; Longitudinal Query Processing

Integration: The DIACHRON Platform (WP6)

Applications: Open Data Applications (WP7); Enterprise Data Intranets (WP8); Scientific Linked Data (WP9)

Arrow labels: distribute; fetch; apply; annotate; share
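Multiversion archiving and longitudinal querying can be sketched together with a delta-based archive: the first version is stored in full, later versions as added and deleted triples, and a query walks the whole history. This is an illustrative design under those assumptions, not the DIACHRON archiving service; the class, its method names, and the `ex:` data are hypothetical, and version labels are assumed to be unique.

```python
class VersionArchive:
    """Delta-based archive: first version stored in full, later ones as deltas."""

    def __init__(self):
        self._labels = []          # version labels, in commit order (assumed unique)
        self._base = frozenset()   # full content of the first version
        self._deltas = []          # (added, deleted) for each later version

    def commit(self, label, triples):
        """Archive a new version under `label`."""
        triples = frozenset(triples)
        if not self._labels:
            self._base = triples
        else:
            previous = self.checkout(self._labels[-1])
            self._deltas.append((triples - previous, previous - triples))
        self._labels.append(label)

    def checkout(self, label):
        """Reconstruct one version by replaying deltas on top of the base."""
        index = self._labels.index(label)
        current = set(self._base)
        for added, deleted in self._deltas[:index]:
            current = (current - deleted) | added
        return frozenset(current)

    def history(self, subject, predicate):
        """Longitudinal query: values of (subject, predicate) in every version."""
        return [
            (label, sorted(o for s, p, o in self.checkout(label)
                           if s == subject and p == predicate))
            for label in self._labels
        ]


archive = VersionArchive()
archive.commit("v1", {("ex:d1", "ex:status", "draft")})
archive.commit("v2", {("ex:d1", "ex:status", "published"),
                      ("ex:d1", "ex:license", "ex:open")})
print(archive.history("ex:d1", "ex:status"))
```

Storing deltas keeps the archive small when versions differ little, at the price of replaying them on checkout; storing full snapshots would make the opposite trade-off.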

Page 19

Diachronic Data Services Lifecycle

Data Repurposing

Data Archiving

Data Evolution

Data Appraisal

Data Citation

Page 20

Concluding Remarks

• The integrated DIACHRON platform and services aim to support long-term usability of open and/or linked data published on the Web and within Enterprise Intranets

• The concept of diachronic data intends to foster self-preserving data embedding an understanding of their evolving semantics, use contexts, and interpretations

• DIACHRON is expected to:
  – Improve our understanding of how linked/open data evolves
  – Reduce the maintenance costs when integrating linked/open data
  – Foster data accountability and transparency in open dynamic data spaces
  – Address sustainability issues for preserving Big Data

[Figure labels: Data Custodians’ Effort; Data Consumer’s Effort; Data Publisher’s Effort; Fix Overall Data Preservation Effort]

Page 21

Page 22

Business Models for Linked Data Publishers

http://chiefmartec.com/2010/03/business-models-for-linked-data-and-web-30

Page 23

Business Webs as Types of Value Creation

• Agora: Open electronic marketplaces with regard to pricing and offered products (e.g. Android marketplace)

• Aggregation: Closed, controlled electronic marketplaces (e.g. Apple App Store)

• Distributed Network: Value Network

• Value Chain: ICT-enabled Value Chains

• Alliance: Loosely cooperating market players (e.g. Open Source projects)

Page 24

Data-Driven Business Models

Source: Michalis Vafopoulos