Prizms for Data Publication and Management Katie Chastain May 9, 2014.


Prizms for Data Publication and Management

Katie Chastain

May 9, 2014

The Goal

• Allow a scientist…
– to convert data from comma-separated value format (CSV) to a linked data format
– to record provenance information about the original dataset and conversion transformation
– to receive RDF immediately, for use in visualizations or other semantic applications
– to have a repository with appropriate access for collaborators

The Value of Linked Data

• Data is converted and annotated using community-standard vocabularies (ontologies).

• Allows queries in SPARQL across any or all data sets

• Provides a way to make metadata machine-readable
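As a rough illustration of a query across data sets, the sketch below builds a SPARQL query that matches the same property in every named graph, then encodes it as a GET request in the style of the SPARQL 1.1 Protocol. The endpoint URL and the `ex:` vocabulary are hypothetical placeholders, not part of Prizms.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary, used only for illustration.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX ex: <http://example.org/vocab/>
SELECT ?dataset ?measurement ?value
WHERE {
  GRAPH ?dataset {
    ?measurement ex:hasValue ?value .
  }
}
LIMIT 10
"""

def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})

url = build_request_url(ENDPOINT, QUERY)
print(url)
```

Because `?dataset` is a variable, the pattern ranges over every named graph in the store, which is only useful when the data sets share a vocabulary.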


Revealing Implicit Information

[Diagram: a Measurement node with three properties: ofCharacteristic (pointing to a characteristic), hasValue (a double), and hasUnit (a unit).]
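A minimal sketch of the Measurement pattern in the diagram, assuming a made-up CSV whose unit is only implied by the column header. The namespaces, the sample data, and the `Temperature` and `DegreeCelsius` terms are hypothetical; only the property names (`ofCharacteristic`, `hasValue`, `hasUnit`) come from the slide.

```python
import csv
import io

# Hypothetical namespaces; the property names follow the diagram labels.
VOCAB = "http://example.org/vocab/"
DATA = "http://example.org/data/"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Made-up sample table: the unit lives in the header, so it is implicit.
CSV_TEXT = "site,temperature_celsius\nA,21.5\nB,19.0\n"

def row_to_triples(index: int, row: dict) -> list:
    """Turn one CSV row into N-Triples lines describing a Measurement."""
    m = f"<{DATA}measurement/{index}>"
    value = row["temperature_celsius"]
    return [
        f"{m} <{RDF_TYPE}> <{VOCAB}Measurement> .",
        f"{m} <{VOCAB}ofCharacteristic> <{VOCAB}Temperature> .",
        f'{m} <{VOCAB}hasValue> "{value}"^^<{XSD_DOUBLE}> .',
        f"{m} <{VOCAB}hasUnit> <{VOCAB}DegreeCelsius> .",
    ]

triples = []
for i, row in enumerate(csv.DictReader(io.StringIO(CSV_TEXT)), start=1):
    triples.extend(row_to_triples(i, row))

print("\n".join(triples))
```

The characteristic and the unit, which a human reads from the header, become explicit statements that a query can match.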

Tabular Data to Linked Data

• CSV2RDF4LOD is a powerful tool for converting tabular data into linked open data, able to represent many complex relationships between data fields
– Command-line based
– Requires prior knowledge of RDF

• Semantic Annotator provides a graphical user interface for this converter, building up conversion parameters in the background.
– Browser-based
– Still in development, currently not as powerful

[Diagram labels: Data; Enhancement Parameters; Linked Data; Converted Data]

• CSV2RDF4LOD and Semantic Annotator enable creation of enhancement parameters for conversion

• These parameters can also be applied to other datasets with similar structure

• Provide provenance about how data conversion was performed
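To make the reuse of enhancement parameters concrete, here is a minimal sketch in which one set of parameters converts two tables with the same columns. The dictionary is an illustrative stand-in and does not reproduce the actual CSV2RDF4LOD parameter format; the column names, IRIs, and sample data are all hypothetical.

```python
import csv
import io

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical enhancement parameters: column name -> (predicate IRI, datatype IRI).
PARAMS = {
    "station": ("http://example.org/vocab/station", XSD + "string"),
    "depth_m": ("http://example.org/vocab/depth", XSD + "double"),
}

def convert(csv_text: str, base: str, params: dict) -> list:
    """Apply the same parameters to any CSV whose columns match the keys.

    Literal values are not escaped, so this only suits simple sample data.
    """
    lines = []
    for n, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        subject = f"<{base}row/{n}>"
        for column, (predicate, datatype) in params.items():
            lines.append(f'{subject} <{predicate}> "{row[column]}"^^<{datatype}> .')
    return lines

# Two made-up datasets with the same structure share one set of parameters.
survey_2013 = "station,depth_m\nS1,4.2\n"
survey_2014 = "station,depth_m\nS1,4.6\nS2,3.1\n"

out_2013 = convert(survey_2013, "http://example.org/data/2013/", PARAMS)
out_2014 = convert(survey_2014, "http://example.org/data/2014/", PARAMS)
print(len(out_2013), len(out_2014))
```

Keeping the parameters separate from the data is also what lets them serve as a record of how the conversion was done.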

Prizms Overview

• By Tim Lebo from the TWC

• A pipeline of tools for taking data from CSV to linked open data ready for visualizations.
– CSV2RDF4LOD and Semantic Annotator for conversion
– DataFAQs for quality management
– LODSPeaKr by Alvaro Graves for data publication

• Organizes data with a Source-Dataset-Version schema

• Captures provenance information at each step of data processing

Prizms on GitHub: https://github.com/timrdf/prizms/
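One way to picture the Source-Dataset-Version schema is as a three-part identifier for each conversion. The sketch below is a guess at how such an identifier could map to an IRI; the path layout, base URL, and sample identifiers are assumptions for illustration and may not match what Prizms actually generates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetVersion:
    source: str    # who provided the data
    dataset: str   # which dataset from that source
    version: str   # which retrieval or release of that dataset

    def iri(self, base: str) -> str:
        """Build an IRI; this path layout is an assumption, not a Prizms guarantee."""
        return (
            f"{base}/source/{self.source}"
            f"/dataset/{self.dataset}"
            f"/version/{self.version}"
        )

# Hypothetical identifiers.
v = DatasetVersion("example-lab", "water-quality", "2014-May-09")
print(v.iri("http://example.org"))
```

Separating the three parts lets a new version of the same dataset be published without disturbing earlier ones.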

[Pipeline diagram labels: Data; E Params; Conversion + Annotation; Linked Data; Triple Store; Data Repository; SPARQL Endpoint; Presentation + Publication; LODSPeaKr, SADI web services, etc.]

Hosting Data on LODSPeaKr

• Quick statistics for each dataset

• SPARQL endpoint

• Ability to plug in to other semantic web services, such as SADI