Prizms for Data Publication and Management Katie Chastain May 9, 2014.


The Goal

• Allow a scientist…
– to convert data from comma-separated value (CSV) format to a linked data format
– to record provenance information about the original dataset and conversion transformation
– to receive RDF immediately, for use in visualizations or other semantic applications
– to have a repository with appropriate access for collaborators


The Value of Linked Data

• Data is converted and annotated using community-standard vocabularies (ontologies).

• Allows queries in SPARQL across any or all data sets

• Provides a way to make metadata machine-readable


Revealing Implicit Information


[Diagram: a Measurement node with three properties: ofCharacteristic (pointing to a characteristic), hasValue (pointing to a double), and hasUnit (pointing to a unit)]
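As a rough illustration of the diagram, the sketch below turns one table cell into explicit statements about a measurement. The `http://example.org/` namespace, the column name, the sample row, and the helper `measurement_triples` are all hypothetical; only the property names ofCharacteristic, hasValue, and hasUnit come from the slide.

```python
import csv
import io

# Hypothetical namespace, used only for illustration.
EX = "http://example.org/"


def measurement_triples(row_id, characteristic, value, unit):
    """Return triples that spell out what a single table cell means."""
    m = f"{EX}measurement/{row_id}"
    return [
        (m, "rdf:type", f"{EX}Measurement"),
        (m, f"{EX}ofCharacteristic", f"{EX}{characteristic}"),
        (m, f"{EX}hasValue", float(value)),  # the "double" in the diagram
        (m, f"{EX}hasUnit", f"{EX}{unit}"),
    ]


# Made-up sample table: the header implies both the characteristic and
# the unit, but neither is machine-readable until it is stated as triples.
table = io.StringIO("site,temperature_celsius\nA,21.5\n")
for i, row in enumerate(csv.DictReader(table)):
    for triple in measurement_triples(
        i, "temperature", row["temperature_celsius"], "celsius"
    ):
        print(triple)
```

The header `temperature_celsius` carries the characteristic and the unit only implicitly; the triples state both outright.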


Tabular Data to Linked Data

• CSV2RDF4LOD is a powerful tool for converting tabular data into linked open data, able to represent many complex relationships between data fields
– Command-line based
– Requires prior knowledge of RDF

• Semantic Annotator provides a graphical user interface for this converter, building up conversion parameters in the background.
– Browser-based
– Still in development, currently not as powerful


[Diagram: Data and Enhancement Parameters combine to produce Linked Data]

• CSV2RDF4LOD and Semantic Annotator enable creation of enhancement parameters for conversion
• These parameters can also be applied to other datasets with similar structure
• Provide provenance about how data conversion was performed
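The reuse described above can be sketched roughly as follows. This is not the parameter format that CSV2RDF4LOD or Semantic Annotator actually use; `PARAMS`, `convert`, the example.org URIs, and the sample rows are hypothetical stand-ins meant only to show one set of parameters driving the conversion of two tables that share a column layout.

```python
import csv
import io

# Hypothetical "enhancement parameters": each column maps to a predicate
# and a datatype cast. The real tools record richer parameters than this.
PARAMS = {
    "label": {"predicate": "http://example.org/vocab/label", "cast": str},
    "depth_m": {"predicate": "http://example.org/vocab/depth", "cast": float},
}


def convert(csv_text, params, base):
    """Apply the same parameters to any CSV that shares the column layout."""
    triples = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        subject = f"{base}/row/{i}"
        for column, rule in params.items():
            triples.append(
                (subject, rule["predicate"], rule["cast"](row[column]))
            )
    return triples


# Two made-up tables with the same structure.
sample_a = "label,depth_m\ns1,1.5\n"
sample_b = "label,depth_m\ns2,2.0\ns3,3.25\n"

first = convert(sample_a, PARAMS, "http://example.org/sample-a")
second = convert(sample_b, PARAMS, "http://example.org/sample-b")
```

Because the parameters live apart from the data, they also serve as a record of how each conversion was performed.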


Converted Data


Prizms Overview

• By Tim Lebo from the TWC
• A pipeline of tools for taking data from CSV to linked open data ready for visualizations.
– CSV2RDF4LOD and Semantic Annotator for conversion
– DataFAQs for quality management
– LODSPeaKr by Alvaro Graves for data publication
• Organizes data with a Source-Dataset-Version schema
• Captures provenance information at each step of data processing

Prizms on GitHub: https://github.com/timrdf/prizms/
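One way to picture a Source-Dataset-Version organization is as a three-level identifier. The sketch below assumes a `source/…/dataset/…/version/…` path layout; that layout, the helper `version_uri`, and every name passed to it are illustrative assumptions rather than a statement of Prizms' exact URI design.

```python
from urllib.parse import quote


def version_uri(base, source, dataset, version):
    """Build an identifier from the three levels of the schema."""
    parts = ("source", source, "dataset", dataset, "version", version)
    # Percent-encode each segment so unusual names stay valid in a URI.
    return base.rstrip("/") + "/" + "/".join(quote(p, safe="") for p in parts)


# Hypothetical source, dataset, and version names.
uri = version_uri("http://example.org", "my-lab", "water-quality", "2014-May-09")
print(uri)
```

Keeping the version as its own level lets a re-converted or updated file sit beside earlier versions instead of overwriting them.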


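[Diagram: Data and enhancement parameters (E Params) pass through Conversion + Annotation to become Linked Data, which is held in a Triple Store (Data Repository, SPARQL Endpoint) and feeds Presentation + Publication (LODSPeaKr, SADI web services, etc.)]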


Hosting Data on LODSPeaKr

• Quick statistics for each dataset
• SPARQL endpoint
• Ability to plug in to other semantic web services, such as SADI