Understanding the utility and fitness of Workflow Provenance for Experiment Reporting
Pınar Alper
Supervisor: Carole A. Goble
Research Reporting
[Diagram: Local Data, Local Tool, Data, Tool, Analysis, and Results, with the reporting steps select, recollect, share, package, publish]
Build a citation string
Package results by origin
Document important run parameters
C. Tenopir, S. Allard, et al. Data sharing by scientists: Practices and perceptions. PLoS ONE, 6(6):e21101, 2011.
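The three reporting tasks listed on this slide (build a citation string, package results by origin, document run parameters) can be illustrated with a minimal Python sketch. The record layout, field names, file names, and the "Data from:" wording are all hypothetical and are not taken from the talk.

```python
from collections import defaultdict

# Toy result records; every field name and value here is hypothetical.
results = [
    {"file": "hits_a.csv", "origin": "Database A", "params": {"threshold": 0.8}},
    {"file": "hits_b.csv", "origin": "Database B", "params": {"threshold": 0.8}},
    {"file": "extra_a.csv", "origin": "Database A", "params": {"threshold": 0.5}},
]


def package_by_origin(records):
    """Group result file names under the source each one came from."""
    packages = defaultdict(list)
    for record in records:
        packages[record["origin"]].append(record["file"])
    return dict(packages)


def citation_string(records):
    """Join the distinct origins, in first-seen order, into one credit line."""
    origins = list(dict.fromkeys(r["origin"] for r in records))
    return "Data from: " + "; ".join(origins)


def run_parameters(records):
    """Map each result file to the parameter settings that produced it."""
    return {r["file"]: dict(r["params"]) for r in records}


packages = package_by_origin(results)
citation = citation_string(results)
parameters = run_parameters(results)
```

Each function needs the origin and parameters to be attached to the result in the first place, which is the information a researcher otherwise has to recollect by hand.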
Provenance we have
• WF description (prospective)
• Execution provenance (retrospective)
Generic information: data artefacts, consumption/production relations, execution times/status
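To make the "generic information" concrete, here is a minimal sketch of a retrospective trace in Python: each execution records what it consumed and produced, plus its status and duration. The class, field names, and the toy trace are assumptions made for illustration and do not reflect any particular workflow system's provenance format.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    """One step execution: what it consumed, what it produced, how it ended."""
    step: str
    used: list
    generated: list
    status: str = "completed"
    seconds: float = 0.0


def lineage(artefact, executions):
    """Return every artefact that `artefact` was transitively derived from."""
    producers = {out: ex for ex in executions for out in ex.generated}
    seen, stack = set(), [artefact]
    while stack:
        current = stack.pop()
        ex = producers.get(current)
        if ex is None:
            continue  # a workflow input: no execution produced it
        for source in ex.used:
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


# Hypothetical three-step run.
trace = [
    Execution("fetch", used=["gene_list"], generated=["raw_records"], seconds=2.1),
    Execution("clean", used=["raw_records"], generated=["clean_records"], seconds=0.3),
    Execution("analyse", used=["clean_records", "ref_table"], generated=["report"], seconds=5.0),
]

report_sources = lineage("report", trace)
```

The query answers "which artefacts fed into this one" but nothing about where the data originated or what it means scientifically, since a generic trace holds only artefact names and relations.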
Provenance that is reported
– Origin
– Methodological context
– Scientific context
Scientific Data Provenance
Motifs
D Garijo, P Alper, K Belhajjame, O Corcho, Y Gil, C Goble, Common motifs in scientific workflows: An empirical analysis, Future Generation Computer Systems. ISSN 0167-739X.
Minority (~30%): Data-creation
Majority (~70%): Data-preparation (value-copying)
Workflows as implementation artefacts:
• 240 workflows, 4 systems, 10 domains
• A domain-independent characterization of activities
• ~90% characterizable
http://purl.org/net/wf-motifs#
Research Framework
I. WF Motifs
II. WF Summaries
III. Labeling WF
Minimal additional design-time information
High-level categorization, as Semantic Annotations
Based on empirical evidence
Process Model for labeling
Motifs inform when to collect and when to propagate labels
Novelty: Dynamic, domain specific
Novelty: Partial transparency
Graph Re-write primitives
Configurable filters
More informed abstraction with motifs
Novelty: Declarative abstraction and contextual grouping
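A rough Python sketch of how a graph re-write primitive and a configurable filter could combine to summarize a workflow: steps annotated with a hidden motif are eliminated and their neighbours reconnected. The `eliminate` and `summarize` functions, the dict-of-sets graph encoding, and the toy workflow are assumptions for illustration, not the primitives defined in the cited work. The sketch assumes an acyclic workflow.

```python
def eliminate(graph, step):
    """Rewrite primitive: remove `step`, wiring its predecessors to its successors."""
    successors = graph[step]
    rewritten = {}
    for node, targets in graph.items():
        if node == step:
            continue
        if step in targets:
            targets = (targets - {step}) | successors
        rewritten[node] = set(targets)
    return rewritten


def summarize(graph, motifs, hidden=frozenset({"data-preparation"})):
    """Apply `eliminate` to every step whose motif is in the `hidden` filter."""
    for step in [s for s in graph if motifs.get(s) in hidden]:
        graph = eliminate(graph, step)
    return graph


# Hypothetical workflow: step -> set of downstream steps.
workflow = {
    "fetch": {"reformat"},
    "reformat": {"align"},
    "align": {"extract"},
    "extract": {"plot"},
    "plot": set(),
}
# Hypothetical motif annotations, using the two broad categories from the Motifs slide.
motifs = {
    "fetch": "data-creation",
    "reformat": "data-preparation",
    "align": "data-creation",
    "extract": "data-preparation",
    "plot": "data-creation",
}

summary = summarize(workflow, motifs)
```

Changing the `hidden` set changes which motifs are filtered out, so the same rewrite can yield different summaries of one workflow.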
Grey-box
Ground truth: user behavior
P Alper, K Belhajjame, C Goble, P Karagoz, Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotations, IEEE Big Data, July 2013.
P Alper, C Goble, and K Belhajjame. 2013. On assisting scientific data curation in collection-based dataflows using labels. In Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science (WORKS '13). ACM, New York, NY, USA, 7-16. DOI=10.1145/2534248.2534249
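The labeling idea from the framework slide, where motifs inform when to collect and when to propagate labels, might look roughly like the following Python sketch. Value-copying (data-preparation) steps pass their input labels on to their outputs, while other steps obtain fresh labels from a labeling function if one is registered. The function names, the tuple layout, and the example label are hypothetical and are not the process model from the cited paper.

```python
def label_trace(executions, motifs, collectors):
    """Assign labels to artefacts, walking executions in the order they ran."""
    labels = {}
    for step, used, generated in executions:
        if motifs[step] == "data-preparation":
            # Value-copying step: outputs inherit the union of the input labels.
            inherited = set()
            for item in used:
                inherited |= labels.get(item, set())
            for item in generated:
                labels[item] = set(inherited)
        elif step in collectors:
            # Step with a registered labeling function: collect fresh labels.
            for item in generated:
                labels[item] = set(collectors[step](item))
    return labels


# Hypothetical run: (step, consumed artefacts, produced artefacts).
executions = [
    ("query_db", ["gene_list"], ["records"]),
    ("filter", ["records"], ["kept"]),
    ("analyse", ["kept"], ["scores"]),
]
motifs = {
    "query_db": "data-creation",
    "filter": "data-preparation",
    "analyse": "data-creation",
}
# Hypothetical domain-specific labeling function for the retrieval step.
collectors = {"query_db": lambda item: {"origin: Database A"}}

labels = label_trace(executions, motifs, collectors)
```

In this toy run `kept` inherits the origin label from `records`, while `scores` stays unlabeled because `analyse` creates new data and has no labeling function registered.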
How do I use Taverna
• Workbench: make a WF
• scufl2-api: inquire about details
• Scufl2-wfdesc: we operate on the abstract WF description
Issues
• Additional characteristics (port depths, iteration config)
• Annotation support at the UI with key-value pairs
• List handling representation
• Resource uniqueness
Thank you!
Carole A. GOBLE, University of Manchester
Khalid BELHAJJAME, Université Paris Dauphine
Pinar KARAGOZ, Middle East Technical University
Pinar ALPER, University of Manchester