06/08/10 PBS, LSF and ARC 2
Outline
• Introduction
• Requirements
• PBS and LSF
• ARC
• Architecture of P-GRADE Portal runtime layer
• PBS/LSF integration
• ARC integration
• Summary
Introduction
• P-GRADE Portal supported gLite and Globus
• ETHZ requirement:
  • Make use of PBS local clusters
  • Make use of LSF local clusters (Brutus)
  • Sometimes make use of ARC grid resources
• All this should be integrated within P-GRADE Portal
PBS (and LSF)
• Portable Batch System
• (Load Sharing Facility)
• Schedule users' jobs on a cluster
• Interactive login to a submission node
• Users execute different commands:
  • qsub (bsub): submit
  • qstat (bjobs): status
  • qdel (bkill): abort
[Figure: a submission node, a scheduler node, and several cluster nodes]
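The PBS and LSF commands listed above line up one to one, so a portal-side script can treat them as a lookup table. The following is a minimal sketch of that idea, not the portal's actual code; `SchedulerCommands` and `command_for` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedulerCommands:
    """Command names for the three basic job operations (hypothetical type)."""
    submit: str
    status: str
    abort: str


# Command names are taken from the slide; the table layout is illustrative.
COMMANDS = {
    "PBS": SchedulerCommands(submit="qsub", status="qstat", abort="qdel"),
    "LSF": SchedulerCommands(submit="bsub", status="bjobs", abort="bkill"),
}


def command_for(scheduler: str, action: str) -> str:
    """Return the command name for 'submit', 'status', or 'abort'."""
    try:
        commands = COMMANDS[scheduler.upper()]
    except KeyError:
        raise ValueError(f"unknown scheduler: {scheduler!r}") from None
    if action not in ("submit", "status", "abort"):
        raise ValueError(f"unknown action: {action!r}")
    return getattr(commands, action)
```

With this table, the same calling code can serve both schedulers: `command_for("LSF", "status")` yields `bjobs`, while `command_for("PBS", "status")` yields `qstat`.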
ARC
• Advanced Resource Connector
• Complete grid middleware with:
  • Information system
  • Command-line clients with integrated broker
  • Data management stack (GridFTP)
• Usable through client programs:
  • Job description: xRSL
  • ngsub: submit
  • ngstat: status update
  • ngkill: cancel
  • ngget: get results
P-GRADE Portal Architecture
• Workflow Editor-related components
• Portlet-related components
• Workflow data storage
• Execution layer
• See next slide!
[Figure: P-GRADE Portal architecture. Labels: P-GRADE Portal machine; Apache Tomcat servlet container; GridSphere portal framework; P-GRADE Portal portlets; Workflow Editor servlet; Workflow Editor client; DAGMan; P-GRADE Portal's filesystem; user workflow data; common workflow and job execution scripts; Globus scripts; EGEE scripts; PBS scripts; Globus Grid; EGEE Grid; PBS clusters]
LSF and PBS integration I.
• Principal idea:
  • Users can configure a remote ssh connection to submission nodes through the Settings portlet
  • Connections are established using ssh key pairs
  • Established connections are reused to minimize ssh connection attempts
• Connections are used on a:
  • Per-user,
  • Per-resource basis
  → a given user's connection isn't accessible to other users
  → different resources use different connections
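The per-user, per-resource reuse rule can be sketched as a cache keyed by the (user, resource) pair. This is an illustration under assumptions, not the portal's implementation: `ConnectionPool` and the `connect` callback are hypothetical, and the callback stands in for whatever actually opens the ssh session with the user's key pair.

```python
from typing import Callable, Dict, Generic, Tuple, TypeVar

C = TypeVar("C")  # type of the connection object; left abstract on purpose


class ConnectionPool(Generic[C]):
    """Reuse one connection per (user, resource) pair (hypothetical class)."""

    def __init__(self, connect: Callable[[str, str], C]) -> None:
        # 'connect' is an assumed factory that opens a new connection.
        self._connect = connect
        self._connections: Dict[Tuple[str, str], C] = {}

    def get(self, user: str, resource: str) -> C:
        """Return the cached connection, opening it on first use only."""
        key = (user, resource)
        if key not in self._connections:
            self._connections[key] = self._connect(user, resource)
        return self._connections[key]

    def drop(self, user: str, resource: str) -> None:
        """Forget a connection, for example after it has failed."""
        self._connections.pop((user, resource), None)
```

Because the user name is part of the key, a lookup for one user can never return another user's connection, and each resource gets its own entry.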
LSF and PBS integration II.
[Figure: Portal Machine with Connection Pool User 1 and Connection Pool User 2, each holding a private key (PRIV), connected to LSF resources 1 to 3 and PBS resources 1 and 2, each holding a public key (PUB)]
LSF and PBS integration III.
• Job preparation:
  • wkf_pre_LSF.sh: prepare job, wrapper, collect files
  • wkf_pre_PBS.sh: prepare job, wrapper, collect files
• Job execution:
  • wkf_LSF.sh: submit and observe job using b* commands
  • wkf_PBS.sh: submit and observe job using q* commands
  • Wrappers:
    • LSF_fake.sh: handle generator and collector jobs, run exe
    • PBS_fake.sh: handle generator and collector jobs, run exe
• Job post-processing:
  • No real task (wkf_post_LSF.sh and wkf_post_PBS.sh)
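"Submit and observe" implies capturing the job ID printed at submission and passing it to the status command later. The sketch below shows one way to do that; it is not taken from wkf_LSF.sh or wkf_PBS.sh. The function names are hypothetical, and the output formats (`Job <N> is submitted ...` for bsub, `N.server` for qsub) are the commonly seen ones and may differ between versions and sites.

```python
import re
from typing import List, Optional

# Assumed typical output formats; verify against the local installation.
_LSF_SUBMIT = re.compile(r"Job <(\d+)> is submitted")
_PBS_SUBMIT = re.compile(r"^\s*(\d+\.\S+)\s*$", re.MULTILINE)


def parse_job_id(scheduler: str, output: str) -> Optional[str]:
    """Extract the job ID from bsub or qsub output, or None if absent."""
    pattern = {"LSF": _LSF_SUBMIT, "PBS": _PBS_SUBMIT}[scheduler.upper()]
    match = pattern.search(output)
    return match.group(1) if match else None


def status_argv(scheduler: str, job_id: str) -> List[str]:
    """Build the argument vector that queries one job's status."""
    command = {"LSF": "bjobs", "PBS": "qstat"}[scheduler.upper()]
    return [command, job_id]
```

The returned argument vector is meant to be run on the submission node, for instance over the user's established ssh connection, and polled until the job leaves the queue.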
LSF and PBS integration features
• Full PS (parameter study) support
• Very quick response time compared to grid middleware
• Support for any kind of executable
ARC integration I.
• Very similar to the EGEE support
• An ARC client stack has to be installed on the P-GRADE Portal machine
• Users can gain access with X.509 proxy certificates
• Two possible resource selections:
  • User can specify the target cluster
  • Cluster can be selected by the client broker
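The two resource selection modes differ only in whether a target cluster is passed to the submission client. A minimal sketch follows, assuming the classic ng* client accepts `-c` for an explicit cluster and `-f` for the xRSL file; both flags should be checked against `ngsub -h` on the installed client. `ngsub_argv` is a hypothetical helper, not part of the portal.

```python
from typing import List, Optional


def ngsub_argv(xrsl_file: str, cluster: Optional[str] = None) -> List[str]:
    """Build an ngsub argument vector (flags are assumptions, see lead-in).

    With a cluster name the job is pinned to that cluster; without one,
    the choice is left to the client's integrated broker.
    """
    argv = ["ngsub"]
    if cluster:
        argv += ["-c", cluster]  # assumed flag: explicit cluster selection
    argv += ["-f", xrsl_file]    # assumed flag: xRSL job description file
    return argv
```

Leaving `cluster` unset is the broker-driven case; the placeholder host name in a call such as `ngsub_argv("job.xrsl", "cluster.example.org")` is illustrative only.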
ARC integration II.
• Job preparation: wkf_pre_nordugrid.sh
  • Wrapper script preparation
  • Generator-related cleanups (as needed)
  • Autogenerator-related file uploads (as needed)
• Job execution: wkf_nordugrid.sh
  • xRSL prepared based on job properties
  • Job submission and management using ng* commands
  • Wrapper script: manage generator and collector jobs if needed
• Job post-processing: wkf_post_nordugrid.sh
  • No real job to perform
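Preparing xRSL from job properties amounts to emitting one `(attribute=value)` relation per property after a leading `&`. The sketch below is illustrative and not the logic of wkf_nordugrid.sh; `build_xrsl` and its parameters are hypothetical, and it assumes the RSL convention of doubling a quote character inside a quoted string.

```python
from typing import Iterable, Optional


def _quote(value: str) -> str:
    """Quote a string value, doubling embedded quotes (assumed RSL rule)."""
    return '"' + value.replace('"', '""') + '"'


def build_xrsl(
    executable: str,
    arguments: Iterable[str] = (),
    job_name: Optional[str] = None,
    count: Optional[int] = None,
    runtime_environments: Iterable[str] = (),
) -> str:
    """Assemble a single-job xRSL description from basic job properties."""
    parts = [f"(executable={_quote(executable)})"]
    args = list(arguments)
    if args:
        parts.append("(arguments=" + " ".join(_quote(a) for a in args) + ")")
    if job_name:
        parts.append(f"(jobName={_quote(job_name)})")
    if count is not None:
        if count < 1:
            raise ValueError("count must be at least 1")
        parts.append(f"(count={count})")  # number of slots for multi-node jobs
    for rte in runtime_environments:
        parts.append(f"(runTimeEnvironment={_quote(rte)})")
    return "&" + "".join(parts)
```

For example, `build_xrsl("wrapper.sh", ["in.dat"], job_name="demo", count=4)` produces `&(executable="wrapper.sh")(arguments="in.dat")(jobName="demo")(count=4)`; the file and job names here are placeholders.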
ARC integration features
• Full PS support
• Offers the possibility to select the execution resource
• Support for any kind of executable
• Multi-node job support
• Offers the possibility to specify runTimeEnvironment attributes