Page 1: Formalising a protocol for recording provenance in Grids Paul Groth – pg03r@ecs.soton.ac.uk University of Southampton All Hands Meeting 2004.

Formalising a protocol for recording provenance in Grids

Paul Groth – pg03r@ecs.soton.ac.uk

University of Southampton

All Hands Meeting 2004

Page 2:

Or…How to show your work.

In a Grid

Page 3:

Contents

1. What is Provenance and why you should care.

2. The Grid and Provenance

3. An Architectural Vision

4. PReP

5. Let’s get formal (yawn…)

6. What’s next.

7. Conclusion

Page 4:

A Definition

Main Entry: prov·e·nance
Pronunciation: 'präv-n&n(t)s, 'prä-v&-"nän(t)s
Function: noun
Etymology: French, from provenir to come forth, originate, from Latin provenire, from pro- forth + venire to come
Date: 1785
1 : ORIGIN, SOURCE
2 : the history of ownership of a valued object or work of art or literature

Documentation of Process i.e. showing your work

Page 5:

The importance of provenance

Process is IMPORTANT

Art
Wine
Drug Discovery
Financial Auditing
Aerospace
…

Page 6:

The Grid

The Grid problem is defined as coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations [FKT01].

Effort is required to allow users to place their trust in the data produced by such virtual organisations

Page 7:

… and the Provenance Problem

Given a set of services in an open grid environment that decide to form a virtual organisation with the aim of producing a given result, how can we determine the process that generated the result, especially after the virtual organisation has been disbanded?

Page 8:

Provenance Problem cont.

Provenance recording should be part of the infrastructure, so that users can elect to enable it when they execute their complex tasks over the Grid or in Web Services environments.

Currently, the Web Services protocol stack and the Open Grid Services Architecture do not provide any support for recording provenance.

Methods are generally ad hoc and do not interoperate.

Page 9:

Execution Provenance

2 Types
• Provenance about an interaction
• Provenance about an actor

Provenance is not a one-way street

No standard way to record execution provenance.

Page 10:

An Architecture

Page 11:

An Architecture with Provenance Support

Page 12:

PReP: Provenance Recording Protocol

[Diagram: client and service exchange "negotiate", "invocation" and "result" messages; client and service each "record invocation and result" with the Provenance Service.]

Why record 2 views?

Page 13:

Provenance Services

[Diagram: three client/service pairs, each exchanging "invocation" and "result"; in each pair both parties send an "invocation and result record" to a Provenance Service.]

Provenance services may be shared or different

Page 14:

Linking Records

[Diagram: three client/service pairs, each exchanging "invocation" and "result"; in each pair both parties send an "invocation and result record" to a Provenance Service. Legend: Provenance Record, Record Link.]
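The idea of linking records can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PASOA data model: the names `RecordLink`, `ProvenanceRecord` and `trace`, the field names, and the service names `ps-a` and `ps-b` are all hypothetical. It assumes each record carries a link to the record it follows from, and that the linked record may live in a different provenance service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RecordLink:
    """Hypothetical pointer to a record, possibly held by another provenance service."""
    service: str      # name of the provenance service holding the record
    record_id: str    # identifier of the record within that service


@dataclass
class ProvenanceRecord:
    """One 'invocation and result record' (field names are assumptions)."""
    record_id: str
    invocation: str
    result: str
    previous: Optional[RecordLink] = None  # link to the record this one follows from


# Each provenance service is modelled as a plain dictionary of records.
Stores = Dict[str, Dict[str, ProvenanceRecord]]


def trace(stores: Stores, start: RecordLink) -> List[str]:
    """Follow record links backwards and return the invocations, oldest first."""
    chain: List[str] = []
    seen = set()
    link: Optional[RecordLink] = start
    while link is not None and link not in seen:
        seen.add(link)  # guard against cyclic links
        record = stores[link.service][link.record_id]
        chain.append(record.invocation)
        link = record.previous
    return list(reversed(chain))


stores: Stores = {
    "ps-a": {"r1": ProvenanceRecord("r1", "fetch(data)", "raw")},
    "ps-b": {
        "r2": ProvenanceRecord("r2", "clean(raw)", "tidy", RecordLink("ps-a", "r1")),
        "r3": ProvenanceRecord("r3", "plot(tidy)", "figure", RecordLink("ps-b", "r2")),
    },
}

process = trace(stores, RecordLink("ps-b", "r3"))
print(process)
```

Starting from the last record, `trace` recovers the whole process even though the records are spread over two stores, which is the point of keeping links alongside the records.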

Page 15:

PReP in detail

Model PReP using asynchronous message passing.
• Maps well to any implementation
• Helpful for scalability

Four Phase Protocol
• Negotiation
• Invocation
• Provenance Recording
• Termination

Page 16:

PReP’s messages

Propose
Reply
Invoke
Result
Record Negotiation
Record Invocation
Record Result
Submission Finished
Additional Provenance

Record Negotiation Ack
Record Invocation Ack
Record Result Ack
Submission Finished Ack
Additional Provenance Ack
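As a rough sketch, the message vocabulary can be written down as plain data. The message and acknowledgement names below are taken from the slide; the grouping of messages into the four phases is an assumption made for illustration (the slides name the phases and the messages but do not state which message belongs to which phase), and the helper names `ack_for` and `phase_of` are hypothetical.

```python
from typing import Optional

# Message names as listed on the slide.
MESSAGES = (
    "Propose", "Reply", "Invoke", "Result",
    "Record Negotiation", "Record Invocation", "Record Result",
    "Submission Finished", "Additional Provenance",
)

# Acknowledgement names as listed on the slide.
ACKS = (
    "Record Negotiation Ack", "Record Invocation Ack", "Record Result Ack",
    "Submission Finished Ack", "Additional Provenance Ack",
)

# ASSUMPTION: this phase grouping is a guess for illustration only.
PHASES = {
    "Negotiation": ("Propose", "Reply"),
    "Invocation": ("Invoke", "Result"),
    "Provenance Recording": (
        "Record Negotiation", "Record Invocation", "Record Result",
        "Additional Provenance",
    ),
    "Termination": ("Submission Finished",),
}


def ack_for(message: str) -> Optional[str]:
    """Return the acknowledgement paired with a message, or None if it has none."""
    ack = f"{message} Ack"
    return ack if ack in ACKS else None


def phase_of(message: str) -> str:
    """Return the (assumed) phase a message belongs to."""
    for phase, names in PHASES.items():
        if message in names:
            return phase
    raise ValueError(f"unknown message: {message}")


for name in MESSAGES:
    print(f"{name:22} phase={phase_of(name):21} ack={ack_for(name)}")
```

Only the five messages addressed to the provenance service have a matching acknowledgement; Propose, Reply, Invoke and Result do not appear in the acknowledgement list.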

Page 17:

PReP’s messages

Propose
Reply
Invoke
Result
Record Negotiation
Record Invocation
Record Result
Submission Finished
Additional Provenance

Record Negotiation Ack
Record Invocation Ack
Record Result Ack
Submission Finished Ack
Additional Provenance Ack

Used for connecting provenance records and for recording provenance about actors.

Page 18:

Provenance Service – An abstract state machine

Formalise the protocol by formalising the individual entities in the protocol

Know exactly how the Provenance Service responds to receipt of messages

Use to show a liveness property: something good will eventually happen
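To give a feel for what "state machine that responds to receipt of messages" means, here is a toy Python model. It is not the ASM from the talk, whose rules are not reproduced in this transcript. The class name, the fields, and in particular the assumption that a Submission Finished message carries the number of records the sender submitted are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

RECORD_KINDS = {
    "Record Negotiation", "Record Invocation", "Record Result",
    "Additional Provenance",
}


@dataclass
class ProvenanceService:
    """Toy state machine: the state is the store plus the announced record counts."""
    store: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)
    expected: Dict[str, int] = field(default_factory=dict)
    notified: Set[str] = field(default_factory=set)

    def receive(self, sender: str, kind: str, payload: str = "") -> List[str]:
        """Apply the rule for one incoming message; return the messages sent in reply."""
        replies: List[str] = []
        if kind in RECORD_KINDS:
            # Store the record and acknowledge it.
            self.store.setdefault(sender, []).append((kind, payload))
            replies.append(f"{kind} Ack")
        elif kind == "Submission Finished":
            # ASSUMPTION: the payload is the number of records the sender submitted.
            self.expected[sender] = int(payload)
        else:
            raise ValueError(f"unexpected message: {kind}")
        replies.extend(self._notify(sender))
        return replies

    def _notify(self, sender: str) -> List[str]:
        """Fire once per sender, when every announced record has been stored."""
        complete = (
            sender in self.expected
            and len(self.store.get(sender, [])) >= self.expected[sender]
            and sender not in self.notified
        )
        if complete:
            self.notified.add(sender)
            return ["Submission Finished Ack"]
        return []


ps = ProvenanceService()
replies: List[str] = []
replies += ps.receive("client", "Submission Finished", "2")  # may arrive early
replies += ps.receive("client", "Record Invocation", "invoke f(x)")
replies += ps.receive("client", "Record Result", "y")
print(replies)
```

Because messages are asynchronous, the sketch lets Submission Finished arrive before the records it announces; the final acknowledgement is held back until the store has caught up.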

Page 19:

Page 20:

Rules of the ProvenanceService’s ASM

Page 21:

Client and Service

State transition diagram

Cannot formalise internals, only the response to PReP

Show Termination Property

Page 22:

Page 23:

VRML Demo

Page 24:

Sketch showing Liveness

Goal: Submission Finished Ack Sent

Assume
• Client & Service are live
• Communication channels work
• Personally, do not like this assumption
• Finite number of additional prov msgs

Show termination of Client & Service using graph

ASM rules guarantee that the provenance service fills up. Notify rule fires. Ack Sent

Q.E.D.
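The shape of this argument can be checked on a small example. The Python sketch below is a bounded check, not a proof, and it uses a deliberately simplified stand-in for the provenance service: the function name `ack_sent`, the payload values, and the assumption that Submission Finished carries the number of submitted records are all hypothetical. It delivers a finite set of messages in every possible order and tests that the final acknowledgement is always sent, then shows that the property fails once a message is lost.

```python
from itertools import permutations
from typing import Iterable, Optional, Tuple

Delivery = Tuple[str, object]


def ack_sent(deliveries: Iterable[Delivery]) -> bool:
    """Return True if the toy service would send Submission Finished Ack."""
    stored = 0
    expected: Optional[int] = None
    for kind, payload in deliveries:
        if kind == "Submission Finished":
            # ASSUMPTION: the payload is the number of records submitted.
            expected = int(payload)
        else:
            stored += 1  # every other message is stored as a record
        if expected is not None and stored >= expected:
            return True  # the service is "full": the notify step fires
    return False


messages = [
    ("Record Negotiation", "n"),
    ("Record Invocation", "i"),
    ("Record Result", "r"),
    ("Additional Provenance", "a"),
    ("Submission Finished", 4),
]

# Channels work and the number of messages is finite: try every delivery order.
orderings = list(permutations(messages))
all_live = all(ack_sent(order) for order in orderings)

# Drop one record to model a broken channel: the acknowledgement never comes.
lossy = ack_sent(messages[1:])

print(len(orderings), all_live, lossy)
```

The second check is why the working-channels assumption matters: with a lost record the service never fills up, so the acknowledgement is never sent.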

Page 25:

What’s next? Security

Support some “classical” properties of distributed algorithms.
• Using mutual authentication, an invoked service can ensure that it submits data to a specific provenance server, and vice-versa, a provenance server can ensure that it receives data from a given service.
• With non-repudiation, we can retain evidence of the fact that a service has committed to executing a particular invocation and has produced a given result.
• We anticipate that cryptographic techniques will be useful to ensure such properties.

Page 26:

But wait there’s more…

Err… What if you have a lot of data?
• Look at scalability

A real (prototype) provenance service
• We have one in the lab
• Now let other people use it

And along comes trust

Page 27:

Conclusion

Provenance is important

Execution provenance is the first layer

Provenance recording must be part of the infrastructure.

Standards. Start from specification not implementation.

PReP is a first start.

…and it’s cool.

Page 28:

Acknowledgements

Luc Moreau
Michael Luck
Victor Tan
Simon Miles

Page 29:

Visit http://www.pasoa.org

E-mail me: pg03r@ecs.soton.ac.uk

The End