
RMS and Scheduling for Future Generation Grids

Ramin Yahyapour

University Dortmund

Leader, CoreGRID Institute on Resource Management and Scheduling

CoreGRID – Summer School, Bonn, 24 July 2006

European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies

24.07.06


Introduction

We all know what “the Grid” is…

– one of the many definitions:

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)

– however, the actual scope of “the Grid” is still quite controversial

Many people consider High Performance Computing (HPC) as the main Grid application.

– today’s Grids are mostly Computational Grids or Data Grids with HPC resources as building blocks

– thus, Grid resource management is closely related to resource management on HPC resources (our starting point).

– we will return to a broader Grid scope and its implications later



Key Question

“Which services/resources to use for an activity, when, where, how?”

Typically: a particular user, business application, or component application needs one or several services/resources for an activity, under given constraints:

• Trust & Security
• Timing & Economics
• Functionality & Service level
• Application-specifics & Inter-dependencies
• Scheduling and Access Policies

This question has to be answered in an automatic, efficient, and reliable way.

Part of the invisible and smart infrastructure!



Motivation

Resource Management for Future/Next Generation Grids!

But what are Future Generation Grids?

HPC Computing
– Parallel Computing
– Cluster Computing
– Desktop Computing

Enterprise Grids
– Business Services
– Application Server
– Web services

Ambient Intelligence, Ubiquitous Computing
– PDA, Mobile Devices

It depends on who you ask!



Resource Definition

Concluding from the different interpretations of “Grid”: for broad acceptance, a Grid RMS should probably cover the whole scope.

Resources:

Compute

Network

Storage

Data

Software

– components, licenses

Services

– functionality, ability

Management of some resources is less complex, while other resources require coordination and orchestration to be effective (e.g. HW and SW).



Resource Management Layer

A Grid Resource Management System consists of:

Local resource management system (Resource Layer)
– Basic resource management unit
– Provides a standard interface for using remote resources
– e.g. GRAM, etc.

Global resource management system (Collective Layer)
– Coordinates all local resource management systems within multiple or distributed Virtual Organizations (VOs)
– Provides high-level functionalities to use all resources efficiently
• Job Submission
• Resource Discovery and Selection
• Scheduling
• Co-allocation
• Job Monitoring, etc.
– e.g. Meta-scheduler, Resource Broker, etc.



Grid RMS

[Figure: layered Grid RMS architecture. Labels: User/Application; Higher-Level Services; Resource Broker; Grid Resource Manager; Core Grid Infrastructure Services (Information Services, Monitoring Services, Security Services); Grid Middleware; Local Resource Management (PBS, LSF, …); Resources.]



Core Functionalities of a Grid RMS

Resource Discovery

– online, on-demand process

Access to Resource Information

– static and dynamic information

Status Monitoring

– general resource monitoring

– monitoring with respect to a job

Allocation/Scheduling

– coordination is required

SLA Management

– reliable agreements

Execution Management/Provisioning

– start of a job / use of a resource

Accounting and Billing



Case 1: RMS for specialized Applications

Specialized resource management dedicated to a single application domain.

– Goal: high efficiency

– Cost: higher development effort

The RMS is adapted to:

– application and its workflow

– resource configuration

There is need for specific interfaces to the resources.

Highly specialized for the application and therefore easier to handle for the user.

– The know-how has been built into the system.

Only certain types of jobs and resources are considered.




Case 2: RMS as Generic Grid-Middleware

Grid RMS is open for many applications

This may be less efficient than Case 1.

Generic interfaces are required that are adapted to many front- and backends.

This approach requires additional user-/application supplied information:

– job description
• workflow, objectives, requirements, constraints

Consideration of security is an integral aspect

– wide variety of security levels

RMS for Future Generation Grids needs the flexibility to cover all kinds of jobs and resources



FGG Resource Management

Need for well-defined interfaces to core services

Inherent support for different implementations

While maintaining cooperation between these implementations

Core services:
– Resource Discovery
– Access to Resource Information
– Status Monitoring
– Allocation/Scheduling
– SLA Management
– Execution Management/Provisioning
– Accounting and Billing



Requirements

Resource Discovery:
– scalable
• from cluster grids,
• business grids
• to global grids
– centralized or decentralized implementations, P2P
– unified naming scheme

Aspects: flexibility, scalability, efficiency



Requirements


Access to resource information:
– static and historic information,
– dynamic (future) information:
• planned, predicted
– may be subject to privacy concerns
• user and owner dependent

Aspects: flexibility, scalability, efficiency



Problem: Job Submission Descriptions differ

The deliverables of the GGF/OGF Working Group JSDL:

A specification for an abstract standard Job Submission Description Language (JSDL) that is independent of language bindings, including:
– the JSDL feature set and attribute semantics,
– the definition of the relationship between attributes,
– and the range of attribute values.

A normative XML Schema corresponding to the JSDL specification.

A document of translation tables to and from the scheduling languages of a set of popular batch systems for both the job requirements and resource description attributes of those languages, which are relevant to the JSDL.



JSDL Attribute Categories

The job attribute categories include:

– Job Identity Attributes
• ID, owner, group, project, type, etc.

– Job Resource Attributes
• hardware, software, including applications, Web and Grid Services, etc.

– Job Environment Attributes
• environment variables, argument lists, etc.

– Job Data Attributes
• databases, files, data formats, and staging, replication, caching, and disk requirements, etc.

– Job Scheduling Attributes
• start and end times, duration, immediate dependencies, etc.

– Job Security Attributes
• authentication, authorisation, data encryption, etc.
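The six attribute categories can be pictured as the fields of a single job description record. The Python sketch below is only an illustration of that grouping: the class name `JobDescription`, its field names, and the example values are hypothetical and are not the JSDL XML schema (real JSDL documents are XML).

```python
from dataclasses import dataclass, field


@dataclass
class JobDescription:
    """Hypothetical container mirroring the six attribute categories."""
    identity: dict = field(default_factory=dict)     # ID, owner, group, project, type
    resources: dict = field(default_factory=dict)    # hardware, software, services
    environment: dict = field(default_factory=dict)  # environment variables, arguments
    data: dict = field(default_factory=dict)         # files, staging, disk requirements
    scheduling: dict = field(default_factory=dict)   # start/end times, dependencies
    security: dict = field(default_factory=dict)     # authentication, encryption

    def missing_categories(self):
        """Return the names of the categories that were left empty."""
        return [name for name, value in vars(self).items() if not value]


job = JobDescription(
    identity={"owner": "alice", "project": "demo"},
    resources={"cpu_count": 48, "memory_gb": 1},
    scheduling={"duration_minutes": 60},
)
print(job.missing_categories())
```

A scheduler front end could use such a check to ask the user for categories that a given backend requires but the submission left out.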



Requirements

Status monitoring:

– job and resource condition

– SLA status

Autonomic aspects:

– detection of unexpected changes

– allows prediction of system behavior

• related to an individual job
• and to general demand

– trigger of re-scheduling/re-allocation

Aspects: reliability, scalability



Requirements

Allocation/Scheduling:
– Different application scenarios
• parallel, sequential jobs
• co-allocation and orchestration
• workflows
– Provider policies
• access, cost, security
– User/application policies
• scheduling objectives,
• cost/budget management
• deadlines
– Cooperation between RM systems
– Support for different (= individual) algorithms and strategies

Aspects: flexibility, easy-to-use, support business models, person-centric, efficiency



Different Levels of Scheduling

Resource-level scheduler

– low-level scheduler, local scheduler, local resource manager

– scheduler close to the resource, controlling a supercomputer, cluster, or network of workstations, on the same local area network

– Examples: Open PBS, PBS Pro, LSF, SGE

Enterprise-level scheduler

– Scheduling across multiple local schedulers belonging to the same organization

– Examples: PBS Pro peer scheduling, LSF Multicluster

Grid-level scheduler

– also known as super-scheduler, broker, community scheduler

– Discovers resources that can meet a job’s requirements

– Schedules across lower level schedulers



Grid-Level Scheduler

Discovers & selects the appropriate resource(s) for a job

If selected resources are under the control of several local schedulers, a meta-scheduling action is performed

Architecture:
– Centralized: all lower level schedulers are under the control of a single Grid scheduler
• not realistic in global Grids
– Distributed: lower level schedulers are under the control of several grid scheduler components; a local scheduler may receive jobs from several components of the grid scheduler



Grid Scheduling

[Figure: Grid User, Grid-Scheduler, and Machines 1–3, each machine with its own Scheduler, Job-Queue, and Schedule (time axis).]



Activities of a Grid Scheduler

GGF Document: “10 Actions of Super Scheduling (GFD-I.4)”

Phase One - Resource Discovery
1. Authorization Filtering
2. Application Definition
3. Min. Requirement Filtering

Phase Two - System Selection
4. Information Gathering
5. System Selection

Phase Three - Job Execution
6. Advance Reservation
7. Job Submission
8. Preparation Tasks
9. Monitoring Progress
10. Job Completion
11. Clean-up Tasks

Source: Jennifer Schopf
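The first two phases can be sketched as a small filter-then-select pipeline. This is a minimal illustration, not the procedure from the GGF document: the resource dictionaries, the function names `discover` and `select`, and the rule "pick the candidate with the fewest queued jobs" are all assumptions made for the example. Phase three (reservation, submission, monitoring, clean-up) is not shown.

```python
def discover(resources, user, min_requirements):
    """Phase one: keep resources the user may access and that meet the minimum requirements."""
    authorized = [r for r in resources if user in r["allowed_users"]]
    return [
        r for r in authorized
        if all(r["static"].get(key, 0) >= value for key, value in min_requirements.items())
    ]


def select(candidates, gather_info):
    """Phase two: gather dynamic information and pick the least-loaded candidate."""
    if not candidates:
        return None
    return min(candidates, key=lambda r: gather_info(r)["queued_jobs"])


resources = [
    {"name": "site-a", "allowed_users": {"alice"}, "static": {"nodes": 64, "memory_gb": 2}},
    {"name": "site-b", "allowed_users": {"alice", "bob"}, "static": {"nodes": 32, "memory_gb": 4}},
    {"name": "site-c", "allowed_users": {"bob"}, "static": {"nodes": 128, "memory_gb": 8}},
]
queue_lengths = {"site-a": 20, "site-b": 5, "site-c": 0}  # made-up dynamic state

candidates = discover(resources, "alice", {"nodes": 32, "memory_gb": 1})
chosen = select(candidates, lambda r: {"queued_jobs": queue_lengths[r["name"]]})
print(chosen["name"])
```

Here `site-c` has the emptiest queue but is filtered out in phase one because the user is not authorized there, which is why filtering precedes information gathering.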



Select a Resource for Execution

Most systems do not provide advance information about future job execution
– user information not accurate as mentioned before
– new jobs arrive that may surpass current queue entries due to higher priority

Grid scheduler might consider current queue situation, however this does not give reliable information for future executions:
– A job may wait long in a short queue while it would have been executed earlier on another system.

Available information:
– Grid information service gives the state of the resources and possibly authorization information
– Prediction heuristics: estimate job’s wait time for a given resource, based on the current state and the job’s requirements.
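One very crude prediction heuristic, given here only as a sketch, divides the outstanding work on a machine by its size. The function name `estimate_wait_minutes` and all numbers are hypothetical, and the estimate assumes that the requested runtimes are accurate and that no higher-priority jobs arrive, which the points above say is often not the case.

```python
def estimate_wait_minutes(total_nodes, running, queued):
    """Crude estimate: outstanding node-minutes divided by the machine size.

    running and queued are lists of (nodes, remaining_or_requested_minutes).
    """
    outstanding = sum(nodes * minutes for nodes, minutes in running + queued)
    return outstanding / total_nodes


# Site A: only one queued job, but it is large and long.
site_a = estimate_wait_minutes(64, running=[(32, 30), (32, 60)], queued=[(64, 120)])
# Site B: six queued jobs, but all of them small and short.
site_b = estimate_wait_minutes(128, running=[(128, 10)], queued=[(16, 30)] * 6)
print(site_a, site_b)
```

The example reproduces the effect described above: the site with the shorter queue has the longer estimated wait.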



Requirements (contd)

SLA management:
– reliability
– orchestration of services
– quality of service
– business models
– accountability

Aspects: persistence




Co-allocation

It is often requested that several resources are used for a single job.
– that is, a scheduler has to assure that all resources are available when needed.
• in parallel (e.g. visualization and processing)
• with time dependencies (e.g. a workflow)

The task is especially difficult if the resources belong to different administrative domains.
– The actual allocation time must be known for co-allocation
– or the different local resource management systems must synchronize with each other (wait for availability of all resources)
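If every local system could publish its free time windows, the parallel case of co-allocation reduces to finding the earliest interval that is free on all resources at once. The sketch below assumes exactly that; the function name `earliest_common_start`, the window format, and the sample data are hypothetical.

```python
def earliest_common_start(free_windows, duration):
    """Return the earliest start at which every resource is free for `duration`.

    free_windows maps a resource name to a list of (start, end) free intervals.
    Returns None if no common slot exists.
    """
    # The earliest feasible start always coincides with the start of some window.
    candidates = sorted(start for windows in free_windows.values() for start, _ in windows)
    for start in candidates:
        end = start + duration
        if all(
            any(w_start <= start and end <= w_end for w_start, w_end in windows)
            for windows in free_windows.values()
        ):
            return start
    return None


free = {
    "cluster": [(0, 30), (60, 180)],
    "vr_cave": [(45, 100), (120, 240)],
    "network": [(0, 240)],
}
start = earliest_common_start(free, duration=60)
print(start)
```

In practice the windows are rarely known this precisely, which is why the allocation time must either be guaranteed in advance or the local systems must synchronize.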



Example Multi-Site Job Execution

A job uses several resources at different sites in parallel. Network communication is an issue.

[Figure: Grid-Scheduler and a Multi-Site Job spanning Machines 1–3, each machine with its own Scheduler, Job-Queue, and Schedule (time axis).]



Advance Reservation

Co-allocation and other applications require a priori information about the precise resource availability

With the concept of advance reservation, the resource provider guarantees a specified resource allocation.
– includes a two- or three-phase commit for agreeing on the reservation

Implementations:
– GARA/DUROC/SNAP provide interfaces for Globus to create advance reservations
– implementations for network QoS are available
• setup of a dedicated bandwidth between endpoints
– “WS-Agreement” defines a protocol for agreement management
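The two-phase commit mentioned above can be sketched generically: first every provider tentatively holds the slot, and only if all of them succeed are the holds confirmed. The `Provider` class and `reserve_all` function are hypothetical illustrations and do not reflect the GARA, DUROC, SNAP, or WS-Agreement interfaces.

```python
class Provider:
    """Hypothetical resource provider with a simple reservation table."""

    def __init__(self, name, busy_slots=()):
        self.name = name
        self.confirmed = set(busy_slots)
        self.tentative = set()

    def prepare(self, slot):
        """Phase one: hold the slot tentatively if it is still free."""
        if slot in self.confirmed or slot in self.tentative:
            return False
        self.tentative.add(slot)
        return True

    def commit(self, slot):
        """Phase two: turn the tentative hold into a guaranteed reservation."""
        self.tentative.discard(slot)
        self.confirmed.add(slot)

    def abort(self, slot):
        """Release a tentative hold."""
        self.tentative.discard(slot)


def reserve_all(providers, slot):
    """Reserve `slot` on every provider, or on none of them."""
    prepared = []
    for provider in providers:
        if provider.prepare(slot):
            prepared.append(provider)
        else:
            for held in prepared:
                held.abort(slot)
            return False
    for provider in prepared:
        provider.commit(slot)
    return True


cluster = Provider("cluster")
vr_cave = Provider("vr_cave", busy_slots={"10:00"})
ok_first = reserve_all([cluster, vr_cave], "10:00")   # fails: vr_cave is busy
ok_second = reserve_all([cluster, vr_cave], "11:00")  # succeeds on both
print(ok_first, ok_second)
```

The all-or-nothing behaviour matters for co-allocation: after the failed attempt the cluster holds no leftover reservation for 10:00.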



Using Service Level Agreements

The mapping of jobs to resources can be abstracted using the concept of Service Level Agreements (SLAs)

SLA: Contract negotiated between
– resource provider, e.g. local scheduler
– resource consumer, e.g. grid scheduler, application

SLAs provide a uniform approach for the client to
– specify resource and QoS requirements, while
– hiding from the client details about the resources, such as queue names and current workload



GGF/OGF – GRAAP Working Group

Goal: Defining Web-service-based protocols for negotiation and agreement management

WS-Agreement Protocol:



Requirements


Execution Management
– services, software, data/storage, compute, network

Aspects: persistence




GGF/OGF-WG DRMAA

GGF Working Group “Distributed Resource Management Application API”

From the charter:

Develop an API specification for the submission and control of jobs to one or more Distributed Resource Management (DRM) systems.

The scope of this specification is all the high level functionality which is necessary for an application to consign a job to a DRM system including common operations on jobs like termination or suspension.

The objective is to facilitate the direct interfacing of applications to today's DRM systems by application's builders, portal builders, and Independent Software Vendors (ISVs).



Requirements

Accounting and Billing
– providing economic/financial services
– foundation of business models

Aspects: persistence


Scheduling in Future Generation Grids

Outlook on future Grid Resource Management and Scheduling



Limitations of current Grid RMS

The interaction between local scheduling and higher-level Grid scheduling is currently a one-way communication
– current local schedulers are not optimized for Grid-use
– limited information available about future job execution
– a site is usually selected by a Grid scheduler and the job enters the remote queue.

The decision about job placement is inefficient.
– Actual job execution is usually not known
– Co-allocation is a problem as many systems do not provide advance reservation



Example of Grid Scheduling Decision Making

[Figure: Grid User, Grid-Scheduler, and Machines 1–3, each machine with its own Scheduler, Job-Queue, and Schedule (time axis); annotation: “15 jobs running, 20 jobs queued”.]

Where to put the Grid job?



Available Information from the Local Schedulers

Decision making is difficult for the Grid scheduler

– limited information about local schedulers is available

– available information may not be reliable

Possible information:

– queue length, running jobs

– detailed information about the queued jobs
• execution length, process requirements,…

– tentative schedule about future job executions

This information is often not technically provided by the local scheduler

In addition, this information may be subject to privacy concerns!



Consequence

Consider a workflow with 3 short steps (e.g. 1 minute each) that depend on each other

Assume available machines with an average queue waiting time of 1 hour.

The Grid scheduler can only submit the subsequent step if the previous job step is finished.

Result:
– The completion time of the workflow may be larger than 3 hours (compared to 3 minutes of execution time)

– Current Grids are suitable for simple jobs, but still quite inefficient in handling more complex applications

Need for better coordination of higher- and lower-level scheduling!
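The arithmetic behind this example can be written out in a few lines. The function name is hypothetical, and the calculation assumes that every step waits the full average queue time of 60 minutes.

```python
def workflow_completion_minutes(steps, run_minutes, queue_wait_minutes):
    """Each dependent step is submitted only after the previous one finishes,
    so every step pays the full queue wait before it runs."""
    return steps * (queue_wait_minutes + run_minutes)


sequential = workflow_completion_minutes(steps=3, run_minutes=1, queue_wait_minutes=60)
pure_execution = 3 * 1
print(sequential, pure_execution)
```

Under these assumptions the workflow takes 183 minutes for 3 minutes of actual computation, so almost all of the completion time is queue waiting.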



Example Grid Scenario

[Figure: a Remote Center that reads and generates TB of data, Compute Resources, and Visualization, connected by LAN/WAN and WAN transfers.]

Assume a data-intensive simulation that should be visualized and steered during runtime!



Resource Request of a Simple Grid Job

A specified architecture with 48 processing nodes, 1 GB of available memory, and a specified licensed software package

for 1 hour between 8am and 6pm of the following day
• Time must be known in advance.

A specific visualization device during program execution

Minimum bandwidth between the VR device and the main computer during program execution

Input: a specified data set from a data repository

Cost of at most 4 €

Preference for cheaper job execution over an earlier execution.
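Part of this request can be expressed as hard constraints plus a preference order over provider offers. The sketch below is an assumption-laden illustration: the `Offer` fields and the sample offers are invented, memory is treated as a single number, and the visualization device, bandwidth, license, and data set requirements are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical provider offer for the compute part of the request."""
    provider: str
    start_hour: int      # hour of the following day, 24h clock
    duration_hours: int
    nodes: int
    memory_gb: int
    cost_eur: float


def acceptable(offer):
    """Check the hard constraints of the request."""
    return (
        offer.nodes >= 48
        and offer.memory_gb >= 1
        and offer.duration_hours >= 1
        and offer.start_hour >= 8
        and offer.start_hour + 1 <= 18   # the 1-hour run must end by 6pm
        and offer.cost_eur <= 4.0
    )


def choose(offers):
    """Prefer the cheaper offer; break ties by the earlier start."""
    valid = [o for o in offers if acceptable(o)]
    return min(valid, key=lambda o: (o.cost_eur, o.start_hour), default=None)


offers = [
    Offer("site-a", start_hour=9, duration_hours=1, nodes=48, memory_gb=2, cost_eur=3.5),
    Offer("site-b", start_hour=16, duration_hours=1, nodes=64, memory_gb=1, cost_eur=2.0),
    Offer("site-c", start_hour=8, duration_hours=1, nodes=48, memory_gb=1, cost_eur=5.0),
    Offer("site-d", start_hour=18, duration_hours=1, nodes=48, memory_gb=1, cost_eur=1.0),
]
best = choose(offers)
print(best.provider)
```

The later but cheaper `site-b` wins over the earlier `site-a`, matching the stated preference for cheaper over earlier execution.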



Example: Coordinated Simulation and Visualization

Expected output of a Grid scheduler:

[Figure: Gantt-style chart over time showing the reserved resources (Data, Network 1, Computer 1, Network 2, Computer 2, Network 3, VR-Cave, Software License, Storage) and their activities (Data Access, Storing Data, Data Transfer, Loading Data, Parallel Computation, Providing Data, Communication for Computation, Communication for Visualization, Visualization, Software Usage, Data Storage).]

Reservations are necessary!



Conclusions for Grid Scheduling

Grids ultimately require coordinated scheduling services.

Support for different scheduling instances

– different local management systems

– different scheduling algorithms/strategies

For arbitrary resources

– not only computing resources, also

– data, storage, network, software etc.

Support for co-allocation and reservation

– necessary for coordinated grid usage (see data, network, software, storage)

Different scheduling objectives

– cost, quality, other




Grid Scheduling Scenarios – Examples I, II, and III

[Figures: three example Grid scheduling scenarios.]



Towards Grid Scheduling

Grid Scheduling Methods:

– Support for individual scheduling objectives and policies

– Multi-criteria scheduling models

– Economic scheduling methods for Grids

Architectural requirements:

– Generic job description

– Negotiation interface between higher- and lower-level scheduler

– Economic management services

– Workflow management

– Integration of data and network management



Scheduling Objectives in the Grid

In contrast to local computing, there is no general scheduling objective anymore

– minimizing response time, minimizing cost

– tradeoff between quality, cost, response-time etc.

Cost and different service quality come into play

– the user will introduce individual objectives

– the Grid can be seen as a market where resources are competing alternatives

Similarly, the resource provider has individual scheduling policies

Problem:

– the different policies and objectives must be integrated in the scheduling process

– different objectives require different scheduling strategies

– part of the policies may not be suitable for public exposure (e.g. different pricing or quality for certain user groups)
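One simple way to model individual user objectives, shown here only as a sketch, is a weighted sum of cost and response time with user-specific weights. The weights, offer data, and function names are made up for illustration; real objectives may be far richer than a linear trade-off.

```python
def score(offer, cost_weight, time_weight):
    """Lower is better: weighted sum of price and response time."""
    return cost_weight * offer["cost_eur"] + time_weight * offer["response_hours"]


def best_offer(offers, cost_weight, time_weight):
    """Pick the offer with the lowest weighted score for this user."""
    return min(offers, key=lambda o: score(o, cost_weight, time_weight))


offers = [
    {"provider": "cheap-but-slow", "cost_eur": 1.0, "response_hours": 10.0},
    {"provider": "fast-but-pricey", "cost_eur": 4.0, "response_hours": 1.0},
]
budget_user = best_offer(offers, cost_weight=1.0, time_weight=0.1)
deadline_user = best_offer(offers, cost_weight=0.1, time_weight=1.0)
print(budget_user["provider"], deadline_user["provider"])
```

The same two offers lead to different choices for the two users, which is why a single global scheduling objective is no longer sufficient.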



Grid Scheduling Algorithms

Due to the mentioned requirements in Grids, it is not to be expected that a single scheduling algorithm or strategy is suitable for all problems.

Therefore, there is a need for an infrastructure that
– allows the integration of different scheduling algorithms
– lets the individual objectives and policies be included
– leaves resource control with the participating service providers

Transition into a market-oriented Grid scheduling model



Economic Scheduling

Market-oriented approaches are a suitable way to implement the interaction of different scheduling layers
– agents in the Grid market can implement different policies and strategies
– negotiations and agreements link the different strategies together
– participating sites stay autonomous

Need for suitable scheduling algorithms and strategies for creating and selecting offers
– need for creating the Pareto-optimal scheduling solutions

Performance relies highly on the available information
– negotiation can be a hard task if many potential providers are available.
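For two criteria such as cost and finish time, the Pareto-optimal offers are those that no other offer beats in one criterion without being worse in the other. The following sketch filters a list of hypothetical offers accordingly; the field names and values are assumptions for the example.

```python
def dominates(a, b):
    """a dominates b if it is no worse in both criteria and better in at least one."""
    return (
        a["cost"] <= b["cost"]
        and a["finish"] <= b["finish"]
        and (a["cost"] < b["cost"] or a["finish"] < b["finish"])
    )


def pareto_front(offers):
    """Keep only offers that no other offer dominates."""
    return [o for o in offers if not any(dominates(other, o) for other in offers)]


offers = [
    {"provider": "a", "cost": 2.0, "finish": 10},
    {"provider": "b", "cost": 3.0, "finish": 6},
    {"provider": "c", "cost": 3.5, "finish": 8},   # dominated by b
    {"provider": "d", "cost": 5.0, "finish": 4},
]
front = [o["provider"] for o in pareto_front(offers)]
print(front)
```

The pairwise comparison is quadratic in the number of offers, which is one reason negotiation becomes harder when many providers respond.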



Economic Scheduling (2)

Several possibilities for market models:
– auctions of resources/services
– auctions of jobs

Offer-request mechanisms support:
– inclusion of different cost models, price determination
– individual objective/utility functions for optimization goals

Market-oriented algorithms are considered:
– robust
– flexible in case of errors
– simple to adapt
– markets can have unforeseeable dynamics
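An auction of jobs can be illustrated as a reverse auction in which providers bid a price for running the job and the lowest acceptable bid wins. This is only one of many possible market models; the function name, the first-price rule, and the tie-break by provider name are assumptions of this sketch.

```python
def run_job_auction(bids, max_price):
    """Reverse auction: the provider asking the lowest price wins the job.

    bids maps provider name to asking price. Returns (winner, price), or None
    if no bid is within the user's maximum price.
    """
    valid = {name: price for name, price in bids.items() if price <= max_price}
    if not valid:
        return None
    winner = min(valid, key=lambda name: (valid[name], name))  # name breaks ties
    return winner, valid[winner]


result = run_job_auction({"site-a": 3.2, "site-b": 2.7, "site-c": 4.5}, max_price=4.0)
no_deal = run_job_auction({"site-a": 6.0}, max_price=4.0)
print(result, no_deal)
```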



Conclusions

Key Challenges for FGG RMS
– Cooperation
• interoperability between Grid-RMS implementations and types
• and between Grid-RMS and local RM systems
– Interoperability through well defined interfaces
• identification and adaptation
– Scalability
• domain-specific implementation may have limited scalability,
• but the general architecture should cover millions of resources.
– Fault-tolerance
• resources and instances of core services
– Common security model

The RMS should be invisible to the user and provide a pervasive common architecture allowing different implementations while maintaining interoperability.
