1 E-infrastructure shared between Europe and Latin America Workload Management System-WMS Luciano Diaz Universidad Nacional Autónoma de México - UNAM Mexico City, 23 October 2007


E-infrastructure shared between Europe and Latin America

Workload Management System-WMS

Luciano Diaz

Universidad Nacional Autónoma de México - UNAM

Mexico City, 23 October 2007

Introduction

The purpose of the Workload Manager Service (WMS) is to accept requests for job submission and management coming from its clients and take the appropriate actions to satisfy them.

A user can submit and cancel jobs, query their status, and retrieve their output. These tasks go under the name of Workload Management.


User interaction with the WMS is limited to describing the characteristics and requirements of the request in a high-level, user-oriented specification language, the Job Description Language (JDL), and submitting it through the provided interfaces.

The JDL allows the description of the following request types supported by the WMS:

• Job: a simple application

• DAG: a directed acyclic graph of dependent jobs (gLite only).

Jobs, in turn, can be batch, interactive, or MPI-based.
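
To make the idea of a JDL request more concrete, here is a minimal sketch that renders a simple Job description from Python. The attribute names (Executable, StdOutput, StdError, OutputSandbox) are commonly used JDL attributes; the helper render_jdl and all example values are hypothetical and do not come from the slides. The sketch only handles string and list-of-string values, so expression attributes such as requirements are out of its scope.

```python
def render_jdl(attributes):
    """Render a dict of attributes as JDL text, one 'Name = value;' per line.

    Only strings and lists of strings are supported in this sketch.
    """
    lines = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            rendered = "{" + ", ".join(f'"{item}"' for item in value) + "}"
        else:
            rendered = f'"{value}"'
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines) + "\n"


# Hypothetical job: run /bin/hostname and bring back its output files.
hello_job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

jdl_text = render_jdl(hello_job)
print(jdl_text)
```

The resulting text could be saved to a file and passed as the <jdl_file> argument of the commands described in the following slides.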

Service Architecture

The WMS provides a set of client tools (that will be referred to as WMS-UI):

• Command line interface

• Graphical interface

• API: providing both C++ and Java bindings

The main operations made possible by the WMS-UI are:

• Find the list of resources suitable to run a specific job.

• Submit a job for execution on a remote CE.

• Check the status of a submitted job.

• Cancel one or more submitted jobs.

• Retrieve the output files of a completed job.

• Retrieve and display bookkeeping information about submitted jobs.

• Retrieve and display logging information about submitted jobs.

• Start a local listener for an interactive job.


Main Commands

The most relevant commands to interact with the WMS are:

• edg-job-list-match <jdl_file>

• edg-job-submit <jdl_file>

• edg-job-status <job_Id>

• edg-job-get-output <job_Id>

• edg-job-cancel <job_Id>

You can access information about the usage of each command by issuing either:

<command> --help

or

man <command>
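
The typical sequence of these commands can be sketched as a list of command lines built in Python. Nothing is executed here; the sketch only assembles the arguments. The file names hello.jdl, jobs.txt and results are hypothetical, and the sketch assumes that -o on edg-job-submit saves the returned job identifier to the given file and that -i reads job identifiers back from it.

```python
import shlex


def build_command(tool, *args, noint=False):
    """Assemble an argument list for one WMS-UI command (nothing is run)."""
    argv = [tool]
    if noint:
        # --noint skips every interactive question, useful in scripts.
        argv.append("--noint")
    argv.extend(args)
    return argv


jdl_file = "hello.jdl"    # hypothetical JDL file
job_id_file = "jobs.txt"  # hypothetical file holding submitted job IDs

workflow = [
    build_command("edg-job-list-match", "--rank", jdl_file),
    build_command("edg-job-submit", "-o", job_id_file, jdl_file),
    build_command("edg-job-status", "-i", job_id_file),
    build_command("edg-job-get-output", "-i", job_id_file,
                  "--dir", "results", noint=True),
]

for argv in workflow:
    print(shlex.join(argv))
```

On a configured UI machine, each list could be handed to subprocess.run; that step is left out because it depends on the local grid installation.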


edg-job-list-match

• edg-job-list-match [options] <jdl file>

Displays the list of identifiers of the resources (and the corresponding ranks, if requested) on which the user is authorized and which satisfy the job requirements included in the JDL.

options:

--help
--version
--verbose, -v
--rank
--config, -c <config file>
--config-vo <config-vo file>
--vo <vo value>
--output, -o <output file>
--noint
--debug
--logfile <logfile file>

edg-job-submit

• edg-job-submit [options] <jdl file>

This command submits a job to the grid. It requires a JDL file as input and returns a job identifier.

options:

--help
--version
--input, -i <input file>
--resource, -r <resource value>
--chkpt <chkpt file>
--nolisten
--nogui
--nomsg
--lrms <lrms value>
--config, -c <config file>
--config-vo <config-vo file>
--vo <vo value>
--output, -o <output file>
--noint
--debug
--logfile <logfile file>

edg-job-status

• edg-job-status [options] <job Id(s)>

This command prints the status of a job previously submitted using edg-job-submit. The job status request is sent to the LB (Logging and Bookkeeping service) that provides the requested information.

options:

--help
--version
--all
--input, -i <input file>
--verbosity [0|1|2]
--from [MM:DD:]hh:mm[:[CC]YY]
--to [MM:DD:]hh:mm[:[CC]YY]
--config, -c <config file>
--status, -s <status value>
--exclude, -e <exclude value>
--config-vo <config-vo file>
--vo <vo value>
--output, -o <output file>
--noint
--debug
--logfile <logfile file>

Job state machine

• Job life-cycle state machine:
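
The life-cycle can be sketched in code using the status values that the edg-job-status option descriptions list later in this talk. The transition table below is an assumption made for illustration: it takes the listed order as the nominal path and lets a job end in ABORTED or CANCELLED from any state before DONE. It is a simplified model, not a reproduction of the slide's diagram.

```python
# Status names as listed for edg-job-status; the ordering into a nominal
# path and the failure exits are assumptions of this sketch.
NOMINAL_PATH = ["SUBMITTED", "WAITING", "READY", "SCHEDULED",
                "RUNNING", "DONE", "CLEARED"]
FINAL_STATES = {"CLEARED", "ABORTED", "CANCELLED"}


def allowed_next(state):
    """Return the set of states assumed reachable in one step from state."""
    if state in FINAL_STATES:
        return set()
    index = NOMINAL_PATH.index(state)  # raises ValueError for unknown names
    successors = {NOMINAL_PATH[index + 1]}
    if state != "DONE":
        successors |= {"ABORTED", "CANCELLED"}
    return successors


def is_valid_history(states):
    """Check that each consecutive pair of states is an allowed transition."""
    return all(b in allowed_next(a) for a, b in zip(states, states[1:]))


print(sorted(allowed_next("RUNNING")))
print(is_valid_history(NOMINAL_PATH))
```

A polling script could use FINAL_STATES, together with DONE, to decide when to stop querying the status and retrieve the output.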

edg-job-get-output

edg-job-get-output [options] <job Id(s)>

The edg-job-get-output command can be used to retrieve the output files of a job that has been submitted through the edg-job-submit command with a job description file including the OutputSandbox attribute.

After submission, once the job has finished executing, the user can download the files generated by the job, which are temporarily stored on the Resource Broker machine as specified by the OutputSandbox attribute, by issuing edg-job-get-output with the job ID returned by edg-job-submit as input.

Options:

--help
--version
--input, -i <input file>
--dir <dir value>
--config, -c <config file>
--noint
--debug
--logfile <logfile file>

edg-job-cancel

edg-job-cancel [options] <job Id(s)>

This command cancels a job previously submitted using edg-job-submit. Before cancellation, it prompts the user for confirmation. The cancel request is sent to the Network Server that forwards it to the WM that fulfills it.

Options:

--help
--version
--all
--input, -i <input file>
--config, -c <config file>
--config-vo <config-vo file>
--vo <vo value>
--output, -o <output file>
--noint
--debug
--logfile <logfile file>

Additional commands

The WMS-UI also provides the following additional commands:

• edg-job-get-logging-info (mostly useful for debugging purposes)

• edg-job-attach (for interactive jobs only)

• edg-job-get-logging-info [options] <job Id(s)>

This command prints all the events related to a previously submitted job that have been logged to the LB during the request's lifetime by the WMS components that handled it. The job logging-info request is sent to the LB (Logging and Bookkeeping service), which provides the requested information.

• edg-job-attach [options] <job Id>

This command attaches a listener to a previously submitted interactive job. The job's standard streams are then redirected to the command shell (or to a dedicated graphical window, if requested).


Descriptions for the command options


EDG-JOB-LIST-MATCH

edg-job-list-match [options] <jdl file>

--help

--version

--verbose, -v It displays on the standard output the job class-ad, generated from the job description file, that is sent to the Network Server. This differs from the content of the job description file because some attributes cannot be directly inserted by the user.

--rank It displays the matching CEIds and the associated ranking values.

--config, -c file_path The configuration file pointed to by file_path is used instead of the standard configuration file.

--config-vo file_path The vo-specific configuration file pointed to by file_path is used instead of the standard vo-specific configuration file.

--output, -o file_path It writes the CEIds list to the file specified by file_path. This can be either a simple name or an absolute path.

--logfile file_path

--noint Every interactive question to the user is skipped. All warnings and errors are written to the file edg-job-output_<UID>_<PID>_<timestamp>.log under /tmp.

--vo vo-name It allows the user to specify the Virtual Organisation she/he is currently working for. The following are tried to determine the user's VO:

• the default VO from the user proxy

• the VO specified through the --vo or --config-vo options

• the VirtualOrganisation attribute in the JDL

If none of the listed attempts succeeds, an error is returned and the submission is aborted. This option is allowed only when used together with the --all one.
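
The rule for determining the user's VO can be sketched as a small lookup. This is an illustration only: the function resolve_vo and its parameter names are hypothetical, and it assumes the three sources are tried in the order in which the slide lists them.

```python
def resolve_vo(proxy_default_vo=None, option_vo=None, jdl_vo=None):
    """Pick the user's VO from the first source that provides one.

    Assumed order, following the slide: default VO from the user proxy,
    then the --vo / --config-vo options, then the VirtualOrganisation
    attribute in the JDL.
    """
    for candidate in (proxy_default_vo, option_vo, jdl_vo):
        if candidate:
            return candidate
    # Mirrors "an error is returned and the submission is aborted".
    raise ValueError("no Virtual Organisation could be determined")


# Hypothetical VO names, for illustration only.
print(resolve_vo(proxy_default_vo=None, option_vo="vo_b", jdl_vo="vo_c"))
```

With a proxy that already carries a default VO, the value passed through the option is ignored in this model.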


EDG-JOB-SUBMIT

edg-job-submit [options] <jdl file>

--help

--version

--config, -c file_path The configuration file pointed to by file_path is used instead of the standard configuration file.

--input, -i file_path

--output, -o file_path

--resource, -r <full hostname>:<port number>/jobmanager-<service>-<queue name>
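
The resource identifier format accepted by --resource can be checked before submission. The following is a minimal sketch, assuming that the <service> part contains no hyphen; the function parse_ce_id and the example host ce.example.org are hypothetical.

```python
import re

# <full hostname>:<port number>/jobmanager-<service>-<queue name>
CE_ID_RE = re.compile(
    r"^(?P<host>[^:/\s]+):(?P<port>\d+)"
    r"/jobmanager-(?P<service>[^-\s]+)-(?P<queue>\S+)$"
)


def parse_ce_id(ce_id):
    """Split a CE identifier into host, port, service and queue."""
    match = CE_ID_RE.match(ce_id)
    if match is None:
        raise ValueError(f"not a valid CE identifier: {ce_id!r}")
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return parts


print(parse_ce_id("ce.example.org:2119/jobmanager-pbs-short"))
```

Identifiers in this form are what edg-job-list-match reports as CEIds, so one of its results can be passed on to edg-job-submit with -r.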


EDG-JOB-STATUS

edg-job-status [options] <job Id(s)>

--help

--version

--verbosity, -v verb_level It sets the detail level of the information about the job displayed to the user. Possible values for verb_level are 0, 1, 2 and 3.

--config, -c file_path The configuration file pointed to by file_path is used instead of the standard configuration file.

--config-vo file_path The vo-specific configuration file pointed to by file_path is used instead of the standard vo-specific configuration file.

--input, -i file_path

--output, -o file_path It writes the bookkeeping information to the file specified by file_path instead of the standard output. file_path can be either a simple name or an absolute path (on the submitting machine).

--all It displays status information about all jobs owned by the user. This option can't be used if one or more jobIds have been specified or if the --input option has been specified. All LBs listed in the vo-specific WMS-UI configuration file are contacted to fulfil this request.

--from, --to [MM:DD:]hh:mm[:[CC]YY]

--status, -s status_value It returns the status information for all those jobs that are currently in the status specified by status_value. Possible status values are: SUBMITTED, WAITING, READY, SCHEDULED, RUNNING, DONE, ABORTED, CANCELLED, CLEARED.

--exclude, -e status_value

--user_tag tag_name/tag_value= It returns the status of the jobs that have been submitted with an associated tag named <tag name> whose value is <tag value>.

--vo vo-name It allows the user to specify the Virtual Organisation she/he is currently working for. If the user proxy contains VOMS extensions, the VO specified through this option is overridden by the default VO contained in the proxy (i.e. this option is only useful when working with non-VOMS proxies). The following are tried to determine the user's VO:

• the default VO from the user proxy

• the VO specified through the --vo or --config-vo options

• the VirtualOrganisation attribute in the JDL

If none of the listed attempts succeeds, an error is returned and the submission is aborted. This option is allowed only when used together with the --all one.

--noint Every interactive question to the user is skipped. All warnings and errors are written to the file edg-job-status_<UID>_<PID>_<timestamp>.log under /tmp.

--debug Information about the API functions is displayed on the standard output and written to the file edg-job-status_<UID>_<PID>_<timestamp>.log under /tmp.

--logfile file_path The log file of this command is relocated to the location pointed to by file_path.

jobId It must be the last argument of the command.
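
The [MM:DD:]hh:mm[:[CC]YY] pattern used by --from and --to can be produced and checked with a few lines of Python. This is a sketch: the helper names are hypothetical, and the validator only checks the shape of the string, not whether the numbers form a real date.

```python
import re
from datetime import datetime

# Optional "MM:DD:" prefix, mandatory "hh:mm", optional ":YY" or ":CCYY".
TIME_BOUND_RE = re.compile(
    r"^(?:(\d{2}):(\d{2}):)?(\d{2}):(\d{2})(?::(\d{2}|\d{4}))?$"
)


def format_time_bound(moment):
    """Render a datetime in the fully specified form MM:DD:hh:mm:CCYY."""
    return moment.strftime("%m:%d:%H:%M:%Y")


def is_valid_time_bound(text):
    """Return True if text has the [MM:DD:]hh:mm[:[CC]YY] shape."""
    return TIME_BOUND_RE.match(text) is not None


print(format_time_bound(datetime(2007, 10, 23, 14, 30)))
print(is_valid_time_bound("14:30"))
```

The formatted value can then be passed as the argument of --from or --to when querying job status over a time window.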


EDG-JOB-ATTACH

edg-job-attach [options] <job Id>

--help

--version

--port, -p port_num It starts a listener on the local machine on the specified port and logs this information to the Logging & Bookkeeping service associated with the job.

--nogui If a connection to the UI node isn't possible, the user can specify this option, which provides a simple, standard non-graphical interaction with the running job.

--nolisten It allows the user to interact with the job through her/his own tools. Note that the WMS-UI then has no control over the launched listener process, which has to be killed by the user.

--config, -c file_path The configuration file pointed to by file_path is used instead of the standard configuration file.

--input, -i file_path

--noint Every interactive question to the user is skipped. All warnings and errors are written to the file edg-job-attach_<UID>_<PID>_<timestamp>.log under /tmp.

--debug Information about the API functions is displayed on the standard output and written to the file edg-job-attach_<UID>_<PID>_<timestamp>.log under /tmp.

--logfile file_path The log file of this command is relocated to the location pointed to by file_path.

jobId It must be the last argument of the command.


EDG-JOB-GET-OUTPUT

edg-job-get-output [options] <job Id(s)>

--help

--version

--config, -c file_path The configuration file pointed to by file_path is used instead of the standard configuration file.

--input, -i file_path It makes the command return the OutputSandbox files for each jobId contained in file_path. This option can't be used if one or more jobIds have already been specified. The format of the input file must be as follows: one jobId per line, and comment lines must begin with a "#" or a "" character.

--dir directory_path The retrieved files (those listed by the user through the OutputSandbox attribute) are stored in the location indicated by directory_path/<login name>_<jobId unique string>.

--noint Every interactive question to the user is skipped. All warnings and errors are written to the file edg-job-output_<UID>_<PID>_<timestamp>.log under /tmp.

--debug Information about the API functions is displayed on the standard output and written to the file edg-job-output_<UID>_<PID>_<timestamp>.log under /tmp.

--logfile file_path The log file of this command is relocated to the location pointed to by file_path.

jobId It must be the last argument of the command.
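
The input-file format described for the -i option (one jobId per line, with comment lines) is easy to read back in a script. The sketch below treats only "#" as a comment marker, since the second comment character is not legible in this transcript, and it additionally skips blank lines, which is an assumption. The function name and the example job identifiers are hypothetical.

```python
def read_job_ids(text):
    """Extract job identifiers from the contents of a job ID file."""
    job_ids = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines (assumption) and "#" comment lines.
        if not line or line.startswith("#"):
            continue
        job_ids.append(line)
    return job_ids


# Hypothetical file contents; real identifiers are returned by edg-job-submit.
sample = """# jobs submitted during the tutorial
https://rb.example.org:9000/first-job
https://rb.example.org:9000/second-job
"""

print(read_job_ids(sample))
```

A wrapper script could loop over the returned list and pass each identifier as the last argument of edg-job-status or edg-job-get-output.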


EDG-JOB-CANCEL

edg-job-cancel [options] <job Id(s)>

--help

--version

--verbose

--config, -c file_path The configuration file pointed to by file_path is used instead of the standard configuration file.

--config-vo file_path The vo-specific configuration file pointed to by file_path is used instead of the standard vo-specific configuration file. This option is allowed only when used together with the --all one.

--input, -i file_path

--output, -o file_path

--all It cancels all jobs owned by the user. It can't be used if one or more jobIds have been specified explicitly or together with the --input option.

--vo vo-name It allows the user to specify the Virtual Organisation she/he is currently working for.

--noint Every interactive question to the user is skipped. All warnings and errors are written to the file edg-job-cancel_<UID>_<PID>_<timestamp>.log under /tmp.

--debug Information about the API functions is displayed on the standard output and written to the file edg-job-cancel_<UID>_<PID>_<timestamp>.log under /tmp.

--logfile file_path The log file of this command is relocated to the location pointed to by file_path.

jobId It must be the last argument of the command.
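
Because --noint and --debug write to files named <prefix>_<UID>_<PID>_<timestamp>.log under /tmp, a script run non-interactively may want to locate those files afterwards. The sketch below builds a glob pattern from that naming scheme; the helper names are hypothetical, and the timestamp format is not specified in the slides, so it is matched with a wildcard.

```python
import glob
import os


def log_pattern(prefix, uid=None, directory="/tmp"):
    """Build a glob pattern for <prefix>_<UID>_<PID>_<timestamp>.log files."""
    uid_part = "*" if uid is None else str(uid)
    return os.path.join(directory, f"{prefix}_{uid_part}_*_*.log")


def find_logs(prefix, uid=None, directory="/tmp"):
    """Return the matching log files, sorted by name."""
    return sorted(glob.glob(log_pattern(prefix, uid, directory)))


# Look for logs left by edg-job-cancel for the current user.
print(find_logs("edg-job-cancel", uid=os.getuid()))
```

The same helpers work for the other commands by changing the prefix, for example edg-job-status or edg-job-attach.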


QUESTIONS?