P-GRADE Portal tutorial at EGEE'09. Gergely Sipos, MTA SZTAKI, EGEE Training and Induction.
1
P-GRADE Portal tutorial at EGEE'09
www.lpds.sztaki.hu/gasuc www.portal.p-grade.hu
Gergely Sipos, MTA SZTAKI
EGEE Training and Induction
EGEE Application Porting Support
2
Agenda of the morning
• Introduction to workflow concept
• Workflow hands-on
~ Break
• Parameter studies
• Parameter study hands-on
• Further information and next steps
3
Workflow
The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules to achieve, or contribute to, an overall business goal.
• Workflow management system (WFMS) is the software that does it
www.wfmc.org
Workflow Reference Model, 19/11/1998
4
Why use workflows in Grid?
• Build distributed applications through orchestration of multiple services
• A single job or a single service is good for nothing…
• Integration of multiple teams involved
• Collaborative work
• Unit of reuse
• (E-)science requires traceable, repeatable analysis
• (Typically) ease the use of grids
• Graphical representation
9
Grid WFMS
Source: Jia Yu and Rajkumar Buyya: A Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Volume 3, Numbers 3-4 / September, 2005
15
(Some of the) available grid workflow systems
http://www.gridworkflow.org
Categories for
– Composition tools
– Description languages
  • Scientific
  • Industrial
  • Formalism
– Engines
Some relevant tools for ARC, gLite, Globus, UNICORE grid users
• Condor DAGMan
  – Used as an enactor in P-GRADE Portal, Pegasus, …
  – Uses DAGMan WF language (DAG = Directed Acyclic Graph)
• MOTEUR
  – Interfaced with “pilot job” framework on EGEE (pull style job execution)
  – Uses SCUFL WF language
• gLite WMS
  – Describe workflows in JDL
  – Share Input-Output sandboxes with multiple jobs
• Taverna
  – Mainly for cluster computing
  – ARC interface is available from Lübeck University
• …
17
Short History of P-GRADE portal
• Parallel Grid Application Development Environment
• Initial development started in the Hungarian SuperComputing Grid project in 2003
• It has been continuously developed since 2003
• Around 30 man-years of development + training + user support
• Detailed information: http://portal.p-grade.hu/
• Open Source community development since January 2008: https://sourceforge.net/projects/pgportal/
• Current version: 2.8
18
Current P-GRADE Portal related projects
• GGF GIN (Since 2006)
  – Providing the GIN Resource Testing portal
• EU EGEE-II, EGEE-III (2006-2010)
  – Tool recommended for application development
  – Intensively used in new users’ training
• EU SEE-GRID-SCI (2008-2010)
  – Interfacing to DSpace-based workflow storage
  – Infrastructure testing workflows
• EU CancerGrid (2007-2009)
  – Development of new generation P-GRADE (gUSE and WS-PGRADE)
  – Integration with desktop grids
• EU EDGeS (2008-2009)
  – Transparent access to Desktop Grid systems
19
Portal installations
P-GRADE Portal services:
– SEE-GRID infrastructure
– Several VOs of EGEE:
  • Biomed, Astronomy, Central European, NA4,...
– GILDA: Training VO of EGEE
– Many national Grids (UK National Grid Service, HunGrid, Turkish Grid, etc.)
– US Open Science Grid, TeraGrid
– OGF Grid Interoperability Now (GIN) VO
– …
Portal services and account request: http://portal.p-grade.hu/index.php?m=3&s=0
Account request form on portal login page
20
Multi-Grid portal installation: www.lpds.sztaki.hu/multi-grid
21
Design principles of P-GRADE portal
• P-GRADE Portal is not only a user interface, it is a
  – General purpose
  – Workflow-level
  – Multi-Grid
  – Application Development and Execution Environment
• P-GRADE Portal includes a high-level middleware layer for orchestrating jobs on grid resources
  – inside a grid
  – among several different grids (and several VOs)
• P-GRADE Portal is grid-neutral:
  – Unlike many existing grid portals it is not tailored to any particular grid type
  – Can be connected to various grids based on different grid middleware
    • LCG-2, gLite, GT2, GT4, ARC, Unicore, etc.
  – Implements the high-level grid middleware services on top of the existing grid middleware services
  – The workflow interface is the same no matter which type of grid is connected to it
22
What is a P-GRADE Portal workflow?
• A directed acyclic graph where
  – Nodes represent jobs (batch programs to be executed on a computing element)
  – Ports represent input/output files the jobs expect/produce
  – Arcs represent file transfer operations
• Semantics of the workflow:
  – A job can be executed if all of its input files are available
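The execution rule on this slide can be illustrated with a small simulation. This is only a sketch, not the portal's or DAGMan's scheduler: the `Job` class, the function names, and the diamond-shaped demo workflow are hypothetical, and files are modelled as plain names.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A workflow node: a batch program with named input and output files."""
    name: str
    inputs: set = field(default_factory=set)   # files read through input ports
    outputs: set = field(default_factory=set)  # files written through output ports


def runnable_jobs(jobs, available_files, finished):
    """Return names of unfinished jobs whose input files are all available."""
    return sorted(
        job.name
        for job in jobs
        if job.name not in finished and job.inputs <= available_files
    )


def run_workflow(jobs, initial_files):
    """Simulate execution; each round runs every job that is currently ready."""
    available, finished, rounds = set(initial_files), set(), []
    by_name = {job.name: job for job in jobs}
    while len(finished) < len(jobs):
        ready = runnable_jobs(jobs, available, finished)
        if not ready:
            raise RuntimeError("no runnable job: missing input or cyclic graph")
        rounds.append(ready)
        for name in ready:
            finished.add(name)
            available |= by_name[name].outputs  # arcs deliver outputs downstream
    return rounds


# Hypothetical diamond-shaped workflow: split -> (left, right) -> merge
demo_jobs = [
    Job("split", {"in.dat"}, {"a.dat", "b.dat"}),
    Job("left", {"a.dat"}, {"a.out"}),
    Job("right", {"b.dat"}, {"b.out"}),
    Job("merge", {"a.out", "b.out"}, {"result.dat"}),
]
schedule = run_workflow(demo_jobs, {"in.dat"})
print(schedule)  # [['split'], ['left', 'right'], ['merge']]
```

Jobs that appear in the same round have no file dependency on each other, which is where the branch parallelism of a workflow comes from.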
23
Three levels of parallelism
– PS workflow level: Parameter study execution of the workflow
  Multiple instances of the same workflow process different data files
– Workflow level: Parallel execution among workflow nodes (WF branch parallelism)
  Multiple jobs run in parallel
– Job level: Parallel execution inside a workflow node (MPI job as workflow component)
  Each job can be a parallel program
24
Example: Computational Chemistry
~100 independent jobs to run
Department of Chemistry, University of Perugia
SOLUTION OF SCHRODINGER EQUATION FOR TRIATOMIC SYSTEMS USING TIME-DEPENDENT (RWAVEPR) OR TIME INDEPENDENT (ABC) METHOD
A single execution can be between 5 hours and 10 hours
SEQUENTIAL FORTRAN 90
Many simulations at the same time
Full story: EGEE Grid Application Porting Support - http://www.lpds.sztaki.hu/gasuc/index.php?m=7&s=3
25
Typical user scenario: Job compilation phase
Portal server
Grid services
DOWNLOAD BINARY(IES)
UPLOAD JOB SOURCE(S)
Client COMPILE – EDIT
26
Typical user scenario: Workflow development phase
Portal server
Grid services
START EDITOR
OPEN & EDIT WORKFLOW
ADD BINARIES
SAVE WORKFLOW
Client
DSpace WF repository
IMPORT WORKFLOW
27
Typical user scenarios: Workflow execution phase
MyProxy Certificate servers
Portal server
Grid services
TRANSFER FILES, SUBMIT JOBS
DOWNLOAD (SMALL) RESULTS
VISUALIZE JOBS and WORKFLOW PROGRESS
MONITOR JOBS
DOWNLOAD PROXY CERTIFICATES
Client
28
Accessing local and remote files
Portal server
Grid services
Computing elements
Storage elements and File catalogs
REMOTE INPUT FILES
REMOTE OUTPUT FILES
LOCAL INPUT FILES & EXECUTABLES
LOCAL OUTPUT FILES
Only the permanent files!
Use legacy executables with Grid files without touching the code
29
P-GRADE Portal structural overview
Web browser
Java Webstart workflow editor
Extended DAGMan
Extended DAGMan WF specification
Globus and gLite command line clients + scripts
Globus GIIS, gLite BDII
DSpace repository
EGEE, Globus (and ARC) Grid services + MyProxy service (gLite WMS, LFC,…; Globus GRAM, …)
30
Web interface - Portlets
31
Email notifications
NOTIFY
32
Workflow portlet
WORKFLOW EDITOR
33
Graphical workflow editing
• To define a graph:
  1. Drag & drop components: jobs and ports
  2. Define their properties
  3. Connect ports by channels (no cycles, no loops)
System generates JDL for each job automatically
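The "no cycles, no loops" restriction on channels can be checked mechanically. The sketch below uses Kahn's topological-sort algorithm on a list of (producer job, consumer job) pairs. It is an illustration only: the function name and the pair representation are assumptions, and nothing here describes how the workflow editor itself validates a graph.

```python
from collections import defaultdict, deque


def is_acyclic(channels):
    """Kahn's algorithm: True if the (producer, consumer) channels form a DAG."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for producer, consumer in channels:
        successors[producer].append(consumer)
        indegree[consumer] += 1
        nodes.update((producer, consumer))

    # Start from jobs that consume nothing produced by another job.
    queue = deque(node for node in nodes if indegree[node] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Jobs on a cycle never reach indegree 0, so they are never visited.
    return visited == len(nodes)


diamond = [("split", "left"), ("split", "right"),
           ("left", "merge"), ("right", "merge")]
print(is_acyclic(diamond))                 # True
print(is_acyclic([("a", "b"), ("b", "a")]))  # False
```

A channel from a job back to itself (a loop) is rejected by the same test, since that job never reaches an in-degree of zero.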
34
Workflow Editor: Properties of a job
Properties of a job:
• Executable file
• Type of executable (Sequential / Parallel)
• Command line parameters
• Which resource to use?
  • Which VO?
  • Broker or Computing element?
35
Workflow Editor: Defining input-output files
File properties
Type:
  input: the executable reads
  output: the executable generates
File type:
  local: comes from my desktop
  remote: comes from an SE
File: location of the file
Internal file name: Executable uses this, e.g. fopen(“file.in”, …)
File storage type (output files only):
  Permanent: final result
  Volatile: temp. data channel
36
How to refer to an I/O file?
Input file
  Local file
  • Client side location: c:\experiments\11-04.dat
  Remote file
  • LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04.dat
  • GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/11-04.dat
Output file
  Local file
  • Client side location: result.dat
  Remote file
  • LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04_-_result.dat
  • GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/result.dat
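The three kinds of file reference on this slide differ by prefix, so they are easy to tell apart in code. The helper below is a hypothetical illustration (its name and return strings are not part of the portal); it treats anything that is neither an `lfn:` name nor a `gsiftp://` address as a client side location.

```python
def classify_reference(ref):
    """Classify an I/O file reference by its prefix."""
    if ref.startswith("lfn:"):
        return "remote (LFC logical file name)"
    if ref.startswith("gsiftp://"):
        return "remote (GridFTP address)"
    # Assumption: everything else is a path on the user's own machine.
    return "local (client side location)"


for ref in (r"c:\experiments\11-04.dat",
            "lfn:/grid/gilda/sipos/11-04.dat",
            "gsiftp://somengshost.ac.uk/mydir/11-04.dat"):
    print(ref, "->", classify_reference(ref))
```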
37
Upload a workflow from client side or from FTP server
UPLOAD
STORED on FTP server
38
Importing an application
INCOMPLETE WORKFLOW Open it in editor and save it again
39
Import a workflow from DSpace repository
40
External access to DSpace: http://pgrade-dspace.sztaki.hu
41
Certificate and proxy management Portlet
42
OGF GIN interoperability portal by P-GRADE
Accessing Globus, gLite and ARC based grids/VOs simultaneously
P-GRADE GEMLCA Portal
GEMLCA Repository
P-GRADE portal
Proxy 1
Proxy 2
Proxy 5
Proxy 4
Proxy 3
Proxy 6
43
Application execution
44
Fault-tolerant execution
• Utilizing
  – Condor DAGMan’s rescue mechanism
  – EGEE job resubmission mechanism of WMS
• If the EGEE broker leaves a job stuck in a CE’s queue, the portal automatically
  – kills the job on this site and
  – resubmits the job to the broker, prohibiting this site.
• As a result
  – the portal guarantees the correct submission of a job as long as there exists at least one matching resource
  – job submission is reliable even in an unreliable grid
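The kill-and-resubmit behaviour described on this slide amounts to a loop that grows an exclusion list until the job lands on a working site. The sketch below only simulates that idea: `broker` and `is_stuck` are hypothetical callbacks standing in for the grid broker and for the portal's stuck-job detection, and no real WMS or DAGMan interface is used.

```python
def submit_reliably(job, broker, is_stuck):
    """Resubmit a job, excluding every site where it got stuck.

    broker(job, excluded) returns a matching site not in `excluded`, or None.
    is_stuck(job, site) returns True if the job hangs in that site's queue.
    """
    excluded = set()
    while True:
        site = broker(job, excluded)
        if site is None:
            raise RuntimeError(f"no matching resource left for {job!r}")
        if not is_stuck(job, site):
            return site, excluded
        # Here the job would be killed on this site before resubmission.
        excluded.add(site)


# Simulated grid: three matching sites, the first two never start the job.
SITES = ["ce1", "ce2", "ce3"]
BROKEN = {"ce1", "ce2"}


def demo_broker(job, excluded):
    return next((site for site in SITES if site not in excluded), None)


def demo_is_stuck(job, site):
    return site in BROKEN


print(submit_reliably("job-42", demo_broker, demo_is_stuck))
```

The loop terminates because every failed attempt removes one site from the candidates, which matches the slide's guarantee: submission succeeds as long as at least one matching resource works.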
45
Information system visualization
46
LFC-SE file browser portlet
47
Compilation support
48
WORKFLOW HANDS-ON
49
From workflows to parameter studies
Advanced execution patterns
50
Scaling up a workflow to a parameter study
Complete workflow
P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs)
P-GRADE Portal: Results produced in the same catalog
51
Advanced parameter studies
Generator component(s)
Initial input data
Generate or cut input into smaller pieces
Collector component(s)
Aggregate result
Complete workflow
P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs)
P-GRADE Portal: Results produced in the same catalog
52
Concept of parameter study workflows
GEN
SEQ
COLL
SEQ SEQ SEQ
Parameter study part
Collector part evaluates and integrates the results
Generator part generates the input parameter space
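The GEN / SEQ / COLL structure is the familiar split, process, aggregate pattern. The following sketch shows the three parts as plain functions; the function names, the piece size, and the sum-of-squares workload are all made up for illustration and say nothing about what a real generator or collector job contains.

```python
def generator(data, piece_size):
    """GEN part: cut the initial input into smaller pieces."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def seq_job(piece):
    """SEQ part: stand-in for the sequential job that runs once per piece."""
    return sum(x * x for x in piece)


def collector(partial_results):
    """COLL part: evaluate and integrate the per-piece results."""
    return sum(partial_results)


pieces = generator(list(range(1, 7)), 2)       # [[1, 2], [3, 4], [5, 6]]
partials = [seq_job(piece) for piece in pieces]  # [5, 25, 61]
total = collector(partials)                      # 91
print(pieces, partials, total)
```

Each call to `seq_job` depends only on its own piece, so the calls are independent of one another; that independence is what allows the parameter study part to run as many separate workflow instances.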
53
Turning a WF into a parameter study
By switching at least one of the open input ports into a “PS Input port” the WF is turned into a Parameter Study
54
Input-output files are stored in SEs
/grid/gilda/sipos/InputImages: Image.0 Image.1
/grid/gilda/sipos/XCoordinates: XCoordinate.0 XCoordinate.1
/grid/gilda/sipos/YCoordinates: YCoordinate.0 YCoordinate.1
/grid/gilda/sipos/Output: ImagePart.0 ImagePart.1 . . .
2 x 2 x 2 = 8 executions of the whole workflow
CROSS PRODUCT of data items
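The count of 8 follows directly from taking the cross product of the three input directories. A minimal sketch with `itertools.product`, using the file names from the slide; the dictionary layout is an assumption, and the enumeration order is that of `itertools`, not necessarily the order in which the portal numbers its workflow instances.

```python
from itertools import product

# Contents of the three PS input directories, as listed on the slide.
ps_inputs = {
    "InputImages": ["Image.0", "Image.1"],
    "XCoordinates": ["XCoordinate.0", "XCoordinate.1"],
    "YCoordinates": ["YCoordinate.0", "YCoordinate.1"],
}

# One tuple per execution of the whole workflow.
executions = list(product(*ps_inputs.values()))

print(len(executions))  # 8
for combo in executions:
    print(combo)
```

Adding a third file to any one directory would raise the count to 3 x 2 x 2 = 12, which is why cross-product parameter studies grow quickly.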
55
Typical data-flow compositions
[Figure: three ways of feeding the input sets {A1, A2, A3} and {B1, B2, B3} into a workflow WF]
• DOT ITERATOR: one-to-one
• CROSS ITERATOR (A X B): all-to-all
• MATCH ITERATOR (A M B): Ai and Bj are paired if they have a common ancestor
Find these in e.g. TAVERNA, MOTEUR
P-GRADE Portal supports this
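The three iterators can be written as three small pairing functions. This is a sketch of the concepts only: the function names are hypothetical, and the `ancestors` mapping used for the match iterator is invented test data, since the slide does not say how ancestry is recorded.

```python
from itertools import product


def dot(a_items, b_items):
    """Dot iterator: one-to-one pairing by position."""
    return list(zip(a_items, b_items))


def cross(a_items, b_items):
    """Cross iterator: all-to-all pairing."""
    return list(product(a_items, b_items))


def match(a_items, b_items, ancestors):
    """Match iterator: keep pairs whose ancestor sets share an element."""
    return [(a, b) for a, b in product(a_items, b_items)
            if ancestors[a] & ancestors[b]]


A = ["A1", "A2", "A3"]
B = ["B1", "B2", "B3"]
# Hypothetical provenance: which source item each data item was derived from.
ancestors = {"A1": {"s1"}, "A2": {"s2"}, "A3": {"s3"},
             "B1": {"s1"}, "B2": {"s1"}, "B3": {"s3"}}

print(dot(A, B))               # 3 pairs
print(len(cross(A, B)))        # 9 pairs
print(match(A, B, ancestors))  # only pairs with a shared ancestor
```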
56
PS Input Port
Grid Directory instead of FILE reference
57
Parameter generator
Generator can be attached to any parameter input port
Generator can be
• Auto generator: to generate text files
• Custom generator: to generate any content
Generated files are moved into SE by the portal
58
Definition Window of Auto Generator Job
User defines the template of the text file
User puts key(s) into the template
User defines values for the key(s)
• Integer number
• Real number
• Custom set
• …
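The template-plus-keys idea can be sketched with Python's `string.Template`. Everything specific here is an assumption made for illustration: the `$key` placeholder syntax is `string.Template`'s, not the portal's; the `input.N` file names are invented; and combining the key values as a cross product is a guess about how multiple keys interact.

```python
from itertools import product
from string import Template


def auto_generate(template_text, key_values):
    """Produce one text file body per combination of key values."""
    keys = list(key_values)
    template = Template(template_text)
    files = {}
    combos = product(*(key_values[key] for key in keys))
    for index, combo in enumerate(combos):
        # Hypothetical naming scheme: input.0, input.1, ...
        files[f"input.{index}"] = template.substitute(dict(zip(keys, combo)))
    return files


generated = auto_generate(
    "x = $x\ny = $y\n",
    {"x": [1, 2], "y": [0.5]},  # an integer key and a real-number key
)
for name, body in generated.items():
    print(name, repr(body))
```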
59
Placement of result
60
Placement of result
Will contain one compressed file for each execution of the workflow.
Use the default value!
Choose a "reliable" Storage Element
61
Executing PS workflows
PS Details for parameter sweep workflow applications
62
Detailed view of a PS workflow
Workflow instances
Overall statistics of workflow instances
Collector job(s)
Generator job(s)
63
PARAMETER STUDY HANDS-ON
65
Backup slides to answer questions
66
Proxy delegations
MyProxy server
P-GRADE Portal server
GILDA services
VOMS server
Proxy
Proxy + VOMS ext.
Proxy based authentication
Login & psw based authentication (username, password)
67
Settings
Portal administrator can
– connect the portal to several grids
– register default resources of the connected grids
68
Settings
User can customize the connected grids by adding and removing resources