CHESS seminar July 2005 Promoting reuse and repurposing on the Semantic Grid Antoon Goderis University of Manchester, UK CHESS seminar, 19 July 2005

CHESS seminar July 2005

Promoting reuse and repurposing on the Semantic Grid

Antoon Goderis

University of Manchester, UK

CHESS seminar, 19 July 2005

Talk plan

• The grid

• The semantic grid

• Reuse and repurposing

• 7 bottlenecks to repurposing

• Semantics to the rescue

The Grid

1. Pervasive and dependable computing utility

2. A distributed computing infrastructure for advanced science and engineering

3. Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations

Science in the 21st century

• Huge quantities of data

• Huge number of data collection devices

• Analysis is the bottleneck

• Global distributed science

– Collaboration and sharing the norm

• In silico experiments

– Build, reuse, repurpose on-line concurrent processes (workflows)

114 genomes, 735 in progress

Grid application evolution

Large scale data, large number of machines, expensive computation, simple semantics, small numbers of people

Smaller scale data, less computationally intensive, complex heterogeneous applications, complex semantics, many people

High Energy Physics

Functional Genomics, Oceanography, Biodiversity, Earth Science, Neuroscience

The Semantic Grid

• The Grid has been about large scale computation

• But the applications are also about collaboration

• A gap between grid computing endeavours and the vision of Grid computing

• To support the full richness of the vision we need both grid and semantic web (technologies)

• Knowledge explicitly asserted & explicitly used

[Figure: Classical Web, Semantic Web, Classical Grid and Semantic Grid positioned on two axes, "Richer semantics" and "More computation"]

Source: Norman Paton

Semantics in Grid workflows

• Classification and discovery of computational and data resources; provenance trails

• Declarative specification of services, workflows and their requirements; problem solving selection

• Job control, distributed execution models, semantic integration, resource brokering, resource scheduling

• Encoding performance metrics, service state, event notification topics, access rights to databases, personal profiles and security groupings; charging infrastructure

Talk plan

• The grid

• The semantic grid

• Reuse and repurposing

• 7 bottlenecks to repurposing

• Semantics to the rescue

From building workflows to recycling them

• Reuse of workflows

– Best practice

– Training

– Peer review

• Repurposing

– Adapt and extend useful fragments

– Build on best practice

– Across groups / communities

Analyze This

Analyze This × #scientists × #workflows × #versions × #runs

Bridging user information need and workflow descriptions

Network effects!

Bridging user information need and workflow descriptions

Reuse and repurposing

• A user will reuse a workflow or workflow fragment that fits their purpose and could be customised with different parameter settings or data inputs to solve their particular scientific problem.

Reuse and repurposing

• A user will reuse a workflow or workflow fragment that fits their purpose and could be customised with different parameter settings or data inputs to solve their particular scientific problem.

– A workflow fragment is a piece of an experimental description: a coherent sub-workflow that makes sense to a domain specialist (in Ptolemy, a composite actor)

– A snippet of workflow code + annotation

Reuse and repurposing

• A user will reuse a workflow or workflow fragment that fits their purpose and could be customised with different parameter settings or data inputs to solve their particular scientific problem.

• A user will repurpose a workflow or workflow fragment by

1. finding one that is close enough to be the basis of a new workflow for a different purpose and

2. making small changes to its structure to fit it to its new purpose.

Aiming for automated discovery of ranked fragments
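The distinction between the two definitions can be made concrete with a small sketch. It assumes a hypothetical dictionary representation of a workflow (named steps with parameters, plus data links); the step names, task labels and helper functions are illustrative only and do not come from any workflow system.

```python
from copy import deepcopy

# Hypothetical workflow: task-annotated steps with parameters, plus data links.
alignment_workflow = {
    "steps": {
        "fetch": {"task": "sequence retrieval", "params": {"database": "db_a"}},
        "align": {"task": "sequence alignment", "params": {"gap_penalty": 10}},
        "plot": {"task": "visualisation", "params": {}},
    },
    "links": [("fetch", "align"), ("align", "plot")],
}


def reuse(workflow, step, **params):
    """Reuse: keep the structure, change only parameter settings."""
    result = deepcopy(workflow)
    result["steps"][step]["params"].update(params)
    return result


def repurpose(workflow, old_step, new_step, task, params=None):
    """Repurpose: make a small structural change, here swapping one step."""
    result = deepcopy(workflow)
    del result["steps"][old_step]
    result["steps"][new_step] = {"task": task, "params": dict(params or {})}
    # Rewire every link that touched the old step to the new one.
    result["links"] = [
        (new_step if src == old_step else src, new_step if dst == old_step else dst)
        for src, dst in result["links"]
    ]
    return result


reused = reuse(alignment_workflow, "align", gap_penalty=5)
repurposed = repurpose(alignment_workflow, "plot", "annotate", task="gene annotation")
```

In this sketch the reused workflow has exactly the same links as the original, while the repurposed one differs in structure by a single step.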

7 bottlenecks to workflow repurposing

1. Lack of a comprehensive discovery model

2. Process knowledge acquisition bottleneck

3. Lack of workflow fragment rankings

4. Workflow interoperability

5. Restrictions on service availability

6. Rigidity of service and workflow definitions

7. Intellectual property rights on workflows

Collect enough workflows

Make workflows usable

A comprehensive discovery model

• A user will repurpose a workflow or workflow fragment by

1. finding one that is close enough to be the basis of a new workflow for a different purpose and

2. making small changes to its structure to fit it to its new purpose.

• Based on semantic annotation, find a set of workflows, which people can then edit

– For scientists: data flow based queries in their jargon, largely abstracting from control

– For developers: control flow based queries, largely abstracting from data

Kepler: http://kepler.ecoinformatics.org/

Courtesy Bertram Ludaescher

A comprehensive discovery model

• Scientist queries

– Find all processes where sequence alignment is followed by visualisation

– Given a set of data points, services, or fragments, have these been connected up in an existing base of workflows? Alternatives?

– Show me the provenance of this workflow

• Developer queries

– How have people applied this dataflow execution model (e.g. in Ptolemy, an SDF Director)?

– How can it be combined with other execution models?
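The first scientist query on this slide can be sketched as a graph search over task-annotated workflows. This is only an illustration under stated assumptions: the repository, step names and task labels are hypothetical, and "followed by" is interpreted here as "reachable downstream via data links", not necessarily as the immediately next step.

```python
from collections import deque

# Hypothetical repository: each workflow has task-annotated steps and data links.
workflows = {
    "wf1": {
        "tasks": {"a": "sequence retrieval", "b": "sequence alignment", "c": "visualisation"},
        "links": [("a", "b"), ("b", "c")],
    },
    "wf2": {
        "tasks": {"a": "visualisation", "b": "sequence alignment"},
        "links": [("a", "b")],
    },
    "wf3": {
        "tasks": {"a": "sequence alignment", "b": "filtering", "c": "visualisation"},
        "links": [("a", "b"), ("b", "c")],
    },
}


def downstream(links, start):
    """Return all steps reachable from `start` by following data links."""
    successors = {}
    for src, dst in links:
        successors.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def followed_by(workflow, first_task, later_task):
    """True if some step doing `first_task` has a downstream step doing `later_task`."""
    tasks = workflow["tasks"]
    return any(
        tasks[step] == first_task
        and any(tasks[d] == later_task for d in downstream(workflow["links"], step))
        for step in tasks
    )


matches = sorted(
    name
    for name, wf in workflows.items()
    if followed_by(wf, "sequence alignment", "visualisation")
)
print(matches)
```

Exact string matching on task labels stands in for the semantic annotation the talk has in mind; an ontology-backed version would also match subclasses of a task.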

A comprehensive discovery model

• Challenges

– Libraries of (scientific) task based patterns

• E.g. task semantics of gene annotation pipelines classified in OWL

– Libraries of design patterns for distributed behaviour

• Identify how people build concurrent systems; how they choose (combinations of) execution semantics

• A good start: workflow patterns for Petri Nets

– E.g. synchronizing merge and multi-merge
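The two workflow patterns named on this slide differ in how converging branches trigger the next step. A minimal sketch, assuming a hypothetical event model in which each branch reports completion once; the function names and string labels are illustrative and not taken from any workflow system.

```python
def multi_merge(completed):
    """Multi-merge: no synchronisation, so the next step starts
    once for every incoming branch that completes."""
    return ["next_step after " + branch for branch in completed]


def synchronizing_merge(activated, completed):
    """Synchronizing merge: the next step starts once, and only
    after every branch that was activated has completed."""
    if activated and set(activated) <= set(completed):
        return ["next_step after " + "+".join(sorted(activated))]
    return []  # still waiting for at least one activated branch


print(multi_merge(["A", "B"]))
print(synchronizing_merge(["A", "B"], ["A"]))
print(synchronizing_merge(["A", "B"], ["A", "B"]))
print(synchronizing_merge(["A"], ["A"]))
```

With two activated branches, the multi-merge starts the next step twice, while the synchronizing merge starts it once after both have finished. When only one branch was activated, both behave the same.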

Workflow fragment rankings

• A user will repurpose a workflow or workflow fragment by

1. finding one that is close enough to be the basis of a new workflow for a different purpose and

2. making small changes to its structure to fit it to its new purpose.

• We need metrics for processes

– For scientists: ranking scientific relevance

– For developers:

• compare processes based on the same execution semantics

• compare different execution semantics

• Challenge: defining the metrics, and combining them into rankings
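As a rough illustration of the challenge of combining metrics into a ranking, the sketch below scores fragments with two stand-in metrics and a weighted sum. Everything in it is an assumption made for illustration: the metrics (Jaccard overlap of task annotations, and an exact match on execution model), the weights, and the fragment data are hypothetical and are not proposed in the talk.

```python
def jaccard(a, b):
    """Overlap of two annotation sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def score(query, fragment, w_task=0.8, w_model=0.2):
    """Weighted sum of a task metric and an execution-model metric."""
    task_score = jaccard(query["tasks"], fragment["tasks"])
    model_score = 1.0 if query["model"] == fragment["model"] else 0.0
    return w_task * task_score + w_model * model_score


# Hypothetical fragments, annotated with tasks and an execution model.
fragments = {
    "f1": {"tasks": ["sequence alignment", "visualisation"], "model": "SDF"},
    "f2": {"tasks": ["sequence alignment", "gene annotation"], "model": "PN"},
    "f3": {"tasks": ["image processing"], "model": "SDF"},
}
query = {"tasks": ["sequence alignment", "visualisation"], "model": "SDF"}

ranking = sorted(fragments, key=lambda name: score(query, fragments[name]), reverse=True)
print(ranking)
```

The ordering is sensitive to the weights: giving the execution-model match more weight would move `f3` above `f2`, which is one reason the combination step is a challenge in its own right.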

Workflow interoperability

• A user will repurpose a workflow or workflow fragment by

1. finding one that is close enough to be the basis of a new workflow for a different purpose and

2. making small changes to its structure to fit it to its new purpose.

• Workflows take a long time to build and get very large

• The nice thing about standards…

• Different workflow systems, different (implicit) semantics

• Import workflows across workflow environments

1. Manually redo it in your own

2. Wrapping

3. Auto-rewrite to new environment

• eg

Workflow interoperability

• To inform interoperation, we need a layer of abstraction that captures behavioural semantics

• Many non-standardised formalisms out there

– Functional languages - one paradigm fits all?

– Petri nets

– Process algebras

– Finite State Machines

– All (hierarchical) combinations of these

• Challenge:

– Behavioural design patterns to compare formalism classes, e.g. PN and SDF Director

Conclusions

• Grid = Semantic Grid

• Reuse <> repurposing

• Task and behavioural semantics both needed for repurposing

• Design patterns for distributed processes: a long road ahead

– Task semantics

– Behavioural semantics

EPSRC funded UK eScience Program Pilot Project

Many slides taken from Carole Goble

Core

• Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizauskas, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Jan Humble, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Ian Roberts, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson, Jimi Worthington and Chris Wroe.

Users

• Simon Pearce and Claire Jennings, Institute of Human Genetics School of Clinical Medical Sciences, University of Newcastle, UK

• Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK

• Steve Kemp, Liverpool, UK

Postgraduates

• Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire

Industrial

• Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM)

• Robin McEntire (GSK)

Collaborators

• Keith Decker

References

• Publications on

– Home page: www.cs.man.ac.uk/~goderisa

– myGrid site: www.mygrid.org.uk