A Role for Provenance in Quality Assessment

Post on 03-Dec-2014

Chris Baillie, Pete Edwards, and Edoardo Pignotti

c.baillie@abdn.ac.uk

Overview

Motivation

Evaluating Data Quality

A Role for Provenance

Future work

Motivation

“We don’t know whether the information we find [on the Web] is accurate or not. So we have to teach people how to assess what they’ve found.”

Vint Cerf, 2010

The Web of Documents has become the Web of documents, services, data, and people.

Anyone can publish anything so we need a way to evaluate quality.

We are investigating these issues within the Internet of Things.

Sensors are now at the centre of many applications.

Example Scenario

Evaluating Data Quality

F(E, R) = Q

Entity and context (E): To evaluate quality, we must examine the context around data. The WIQA Framework examines data content, context, and external ratings (Bizer et al. 2009).

Quality Scores (Q): Quality is a multi-dimensional construct
- Accuracy
- Timeliness
- Relevance

Data Requirements (R): Fürber and Hepp (2011) use rules to identify quality problems
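As a minimal sketch of how F(E, R) = Q can be read, the Python below applies a set of requirements to one entity and returns one score per quality dimension. The `Entity` class, its context keys, the `assess` function, and both example rules are hypothetical illustrations. They are not part of the framework described in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Entity:
    """An observation value plus its context (field names are assumptions)."""
    value: float
    context: Dict[str, float]


# A requirement maps an entity to a score for a single quality dimension.
Requirement = Callable[[Entity], float]


def assess(entity: Entity, requirements: Dict[str, Requirement]) -> Dict[str, float]:
    """F(E, R) = Q: apply every requirement in R to E, one score per dimension."""
    return {dimension: rule(entity) for dimension, rule in requirements.items()}


# Hypothetical requirements: the 60 second threshold is invented for illustration.
requirements = {
    "timeliness": lambda e: 1.0 if e.context["age_seconds"] <= 60 else 0.0,
    "relevance": lambda e: 1 - e.context["distance_from_route"] / 100,
}

observation = Entity(value=12.5, context={"age_seconds": 30, "distance_from_route": 25})
scores = assess(observation, requirements)
```

Keeping each dimension as a separate rule reflects the point that quality is multi-dimensional: the result is a set of scores, not a single number.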

Representing Sensor Observations

Linked Data: “recommended best practice for exposing, sharing, and connecting pieces of data using URIs and RDF”
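To make the idea concrete, here is a small sketch of one sensor observation expressed as subject-predicate-object triples with URI identifiers. The `http://example.org/` namespace and the `observedBy` property are placeholders chosen for illustration. The slides do not state which vocabulary the authors used, and a real system would use an RDF library instead of plain tuples.

```python
# Hypothetical namespace; the slides do not give the actual URIs.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each triple is (subject, predicate, object). Subjects and predicates are URIs.
triples = [
    (EX + "obs/1", RDF_TYPE, EX + "Observation"),
    (EX + "obs/1", EX + "observedBy", EX + "sensor/7"),
    (EX + "obs/1", EX + "distanceFromRoute", 25.0),
]


def objects(graph, subject, predicate):
    """Return every object attached to the subject through the predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


distance = objects(triples, EX + "obs/1", EX + "distanceFromRoute")
```

Because every resource is named by a URI, other datasets can refer to the same observation or sensor, which is what allows context to be connected to the data.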

Performing Quality Assessment

R_{relevance} = 1 - \frac{E_{distanceFromRoute}}{100}

    CONSTRUCT {
      _:b0 a QualityScore .
      _:b0 score ?qs .
      _:b0 dqm:ruleViolation _:b1 .
      _:b1 a DataRequirementViolation .
      _:b1 dqm:affectedInstance ?instance .
    }
    WHERE {
      ?instance a Observation .
      ?instance distanceFromRoute ?distance .
      LET (?qs := (1 - (?distance / 100))) .
    }
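The rule above can be mirrored in plain Python to show what it computes. This is a sketch under assumptions: observations are modelled as dictionaries with hypothetical `id` and `distanceFromRoute` keys, and the output dictionaries only imitate the shape of the constructed triples.

```python
def relevance_rule(observations):
    """Mirror of the CONSTRUCT rule: one quality score per matching observation."""
    results = []
    for obs in observations:
        # Like the WHERE clause, skip instances that have no distanceFromRoute.
        if "distanceFromRoute" not in obs:
            continue
        score = 1 - obs["distanceFromRoute"] / 100
        results.append({
            "type": "QualityScore",
            "score": score,
            "ruleViolation": {
                "type": "DataRequirementViolation",
                "affectedInstance": obs["id"],
            },
        })
    return results


# Hypothetical observations; distances are in the same unit the rule assumes.
observations = [
    {"id": "obs1", "distanceFromRoute": 25},
    {"id": "obs2", "distanceFromRoute": 50},
    {"id": "obs3", "distanceFromRoute": 150},
    {"id": "obs4"},  # no distance recorded, so the rule does not fire
]
results = relevance_rule(observations)
```

As written, the rule does not clamp its result, so an observation more than 100 units from the route receives a negative relevance score.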

Quality Assessment Results

Observation Provenance

Provenance is a critical part of observation context

Describes the entities, agents, and activities involved in data creation:
- How was the observation value measured?
- Who controlled the sensing process?
- How has the observation been transformed since it was created?

The W3C PROV-O model provides a linked data representation of provenance
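The questions above map onto provenance relations fairly directly. The sketch below records a small provenance graph as triples, using the PROV-O property names `prov:wasGeneratedBy`, `prov:wasAssociatedWith`, and `prov:wasDerivedFrom`. All `ex:` identifiers and both helper functions are hypothetical and are not taken from the slides.

```python
# Hypothetical provenance for one observation and a corrected version of it.
prov = [
    ("ex:obs1", "prov:wasGeneratedBy", "ex:sensing1"),
    ("ex:sensing1", "prov:wasAssociatedWith", "ex:sensor7"),
    ("ex:obs1-corrected", "prov:wasDerivedFrom", "ex:obs1"),
    ("ex:obs1-corrected", "prov:wasGeneratedBy", "ex:calibration1"),
    ("ex:calibration1", "prov:wasAssociatedWith", "ex:operator"),
]


def objects(graph, subject, predicate):
    """Return every object attached to the subject through the predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def responsible_agents(graph, entity):
    """Who controlled the process? Agents of the activities that generated the entity."""
    agents = []
    for activity in objects(graph, entity, "prov:wasGeneratedBy"):
        agents.extend(objects(graph, activity, "prov:wasAssociatedWith"))
    return agents


def derivation_chain(graph, entity):
    """How was it transformed? Follow wasDerivedFrom back to the original entity."""
    chain = [entity]
    while True:
        sources = objects(graph, chain[-1], "prov:wasDerivedFrom")
        # Stop at the original entity, or if the data contains a cycle.
        if not sources or sources[0] in chain:
            return chain
        chain.append(sources[0])


agents = responsible_agents(prov, "ex:obs1-corrected")
chain = derivation_chain(prov, "ex:obs1-corrected")
```

For simplicity the chain follows only the first source of each entity. An entity derived from several sources would need a graph traversal instead.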

Observation Provenance

Quality Score Provenance

Future Work
- Implementation of quality rules that examine provenance
- Investigate quality score re-use

Work To Date
Developed Quality Assessment Framework that enables:
- Linked data representation of sensor observations
- Definition of quality requirements using SPARQL rules
- Generation of quality scores via reasoning

Any questions?

Come and see the IRP demo (D9) to see quality assessment in action.

Implementation