CREAM-CE status and evolution plans


Paolo Andreetto, Sara Bertocco, Alvise Dorigo, Eric Frizziero, Alessio Gianelle, Massimo Sgaravatto, Lisa Zangrando

CREAM-CE implements a Grid job management service available to end users and to other higher-level Grid job submission services. It supports the submission, management and monitoring of computational jobs on Local Resource Management Systems (LRMS).

As part of the European Middleware Initiative (EMI), the service now faces the challenge of supporting a larger community of users, so it needs to be consolidated and evolved.

Among the new functionality introduced with the first EMI release (EMI-1), we highlight the integration with the new authorization framework (ARGUS). The major developments foreseen for the next EMI release include the adoption of standard job-related interfaces (EMI-ES) and the enhancement of CREAM to meet the High Availability (HA) criteria through a clustered approach.

Definition and adoption of job-related standard interfaces

Several services providing compute and job-related functionality have been implemented in the context of different Grid projects. However, although these services provide the same core functionality, each has been realized with a proprietary solution. While standard mechanisms for job description (e.g. JSDL) and standard interfaces for job submission and management (e.g. BES) do exist, they are not really suited for production use because they lack significant capabilities.

This issue is being addressed in parallel on two fronts within:

• the EMI project, where a specification for an EMI Execution Service (EMI-ES) interface has been defined

• the PGI-OGF working group where, besides the EMI partners, some other actors are involved.

The EMI job management components (i.e. CREAM-CE, ARC-CE and UNICORE) will adopt the EMI-ES interface.

Main achievements:

• increased interoperability through the implementation of a standard solution

• simpler usage and maintenance

• expansion of the user community

Towards CREAM High Availability

One of the main objectives described in the CREAM evolution plan is to meet the High Availability (HA) criteria. In particular, CREAM, like several popular Internet services, must rely on large clusters of commodity computers to provide high performance, scalability, availability and fault tolerance. From the user's point of view, the main benefit of this enhancement is guaranteed access to their own jobs and related resources (i.e. the job sandbox) during planned and unplanned outages. We are therefore focusing on making CREAM continuously available to serve user requests, regardless of any critical conditions that may occur.

The figure illustrates the high-level CREAM clustered architecture, which is based on a horizontal topology. We define a CREAM node as a separate CREAM instance running on its own dedicated (virtual) machine, while a collection of such nodes is referred to as a CREAM cluster. The web server (e.g. Apache) acts as the gateway for incoming requests from authenticated users. These requests are delivered to the load balancer, which redirects them to the proper CREAM nodes according to the selected scheduling algorithm (e.g. round robin, weight-based, etc.). Moreover, if appropriately configured, the load balancer can also provide fault tolerance.
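The scheduling step described above can be illustrated with a minimal sketch. It is not the CREAM implementation: the names `CreamNode` and `LoadBalancer`, the `healthy` flag and the way weights are expanded into the rotation are all assumptions made for illustration. It shows round-robin and weight-based selection, with failed nodes skipped as a simple form of fault tolerance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CreamNode:
    """Hypothetical stand-in for one CREAM instance in the cluster."""
    name: str
    weight: int = 1       # relative share of requests (weight-based policy)
    healthy: bool = True  # assumed to be updated by some external health check


class LoadBalancer:
    """Illustrative balancer: round robin, optionally weight-based."""

    def __init__(self, nodes: List[CreamNode], weighted: bool = False):
        # Weight-based policy: a node appears `weight` times in the rotation.
        # Plain round robin: every node appears exactly once.
        self._rotation = [
            node
            for node in nodes
            for _ in range(node.weight if weighted else 1)
        ]
        self._next = 0

    def pick(self) -> CreamNode:
        """Return the next healthy node, skipping nodes marked as failed."""
        for _ in range(len(self._rotation)):
            node = self._rotation[self._next]
            self._next = (self._next + 1) % len(self._rotation)
            if node.healthy:
                return node
        raise RuntimeError("no healthy CREAM node available")


if __name__ == "__main__":
    cluster = [CreamNode("node-a", weight=2), CreamNode("node-b"), CreamNode("node-c")]
    balancer = LoadBalancer(cluster)
    cluster[1].healthy = False  # simulate an outage of node-b
    for _ in range(4):
        print(balancer.pick().name)
```

In this sketch a request is never routed to a node marked as failed, so the cluster keeps serving requests as long as at least one node is healthy. A production load balancer would also need health probing and shared job state across nodes, which are outside the scope of this illustration.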

Main achievements:

• improved performance, scalability, availability and fault tolerance

• guaranteed access to the CREAM service regardless of any critical conditions it is experiencing

ARGUS as the unique authorization mechanism

The use of different authorization mechanisms, each providing the same or similar functionality, is clearly a complication from a deployment and maintenance point of view. Moreover, bugs or misconfigurations could lead to inconsistent authorization decisions. The EMI project has addressed these issues by adopting a single authorization service (ARGUS), which is intended to be the unique authorization service for all EMI services.

The integration with the ARGUS service has been introduced in the first EMI release (EMI-1).
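The benefit of a single authorization service can be sketched as follows. This is a generic illustration and does not use the ARGUS API: the policy table, `central_decision` and the `Service` class are hypothetical names introduced here. The point is only that services delegating to one shared decision point cannot disagree with each other.

```python
from typing import Callable, Dict, Set

# Hypothetical policy: the actions each subject is allowed to perform.
POLICY: Dict[str, Set[str]] = {
    "alice": {"submit", "cancel"},
    "bob": {"submit"},
}


def central_decision(subject: str, action: str) -> bool:
    """Single decision point shared by every service (illustrative only)."""
    return action in POLICY.get(subject, set())


class Service:
    """A service that delegates every authorization decision."""

    def __init__(self, name: str, authorize: Callable[[str, str], bool]):
        self.name = name
        self._authorize = authorize

    def handle(self, subject: str, action: str) -> str:
        # The service holds no policy of its own, so it cannot drift
        # from what the shared decision point says.
        return "permit" if self._authorize(subject, action) else "deny"


if __name__ == "__main__":
    compute = Service("compute-element", central_decision)
    other = Service("another-service", central_decision)
    for subject, action in [("alice", "cancel"), ("bob", "cancel")]:
        print(subject, action, compute.handle(subject, action), other.handle(subject, action))
```

Because both services call the same function, a policy change made in one place takes effect everywhere, which is the deployment and maintenance simplification described above.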

Main achievements:

• avoidance of inconsistent authorization decisions due to the use of multiple authorization systems (bugs or misconfigurations)

• simplification of the deployment and maintenance of the authorization layer

• use of the same authorization framework adopted by EMI

Contact: gLite job management product team

email: [email protected]

For more information: www.eu-emi.eu