Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation...

25
Dependable and Coordinated Resources Allocation Algorithms for Distributed Computing Victor Toporkov and Dmitry Yemelyanov National Research University “Moscow Power Engineering Institute” RSCDays 2018

Transcript of Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation...

Page 1: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Dependable and Coordinated Resources Allocation Algorithms for Distributed

Computing

Victor Toporkov and Dmitry Yemelyanov

National Research University “Moscow Power Engineering Institute”

RSCDays 2018

Page 2: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Introduction

• Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed environments with non-dedicated resources

• Economic models are used to solve resource sharing and scheduling problems in a transparent and an efficient way in cloud computing, utility Grids, and multi-agent systems

• The main aspect of job-flow scheduling is its efficiency in terms of QoS and resources utilization

2 RSCDays 2018

Page 3: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Job Flow Scheduling Problem

Job 1 Job 2 Job N-1

Job N

Start Time Runtime Cost Data

Storage

. . .

Nodes

Reserved

Reserved

Local task

Reserved Reserved

1

2

3

4

5 Time

3 RSCDays 2018

Page 4: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Job Flow Scheduling: V. Toporkov et al. 2017. Cyclic Anticipation Scheduling in Grid VOs with Stakeholders Preferences. PaCT 2017, LNCS 10421, pp. 372–383.

R E S O U R C E S

Cycle i-1 Cycle i

Job Batch Job Batch Job Flow

4 RSCDays 2018

Page 5: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Job Resource Request

The resource requirements are arranged into a resource request containing:

n - number of simultaneously reserved computational nodes

p - minimal performance requirement for each computational node

V – computational volume for a single node task

C - maximum total job execution cost (budget)

z – preferred job optimization criterion

1

2

n

.

.

t = V/pmin

5 RSCDays 2018

Page 6: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Window Search Problem

Nodes

Time

Reserved

Reserved

Local task

Reserved Reserved

1

2

3

4

5

Slot: • Node • Performance • Cost • Start Time • Finish Time

Allocate a window of four nodes for a time T, with requirements on nodes performance and total cost. Minimize window start time:

6 RSCDays 2018

Page 7: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Deadline and Scheduling Horizon

Reserved

Reserved

Local task

Reserved Reserved

1

2

3

4

5 Time

Job 1 Job 2

There’s a practical limit on the slots availability: • deadline • backfilling • scheduling horizon

7 RSCDays 2018

Page 8: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

General Window Search Scheme

All available time-slots are ordered by the start time;

for each pi in (pmin; pmax) {

while there is at least one slot available {

Add next available slot to the window list;

Check all slots in the window considering required length t = V/pmin

and remove the slots being late;

Select n-slot window best by the given user criterion Z; }

}

return the best of the found interim windows;

8 RSCDays 2018

Page 9: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Window Search Scheme Visualized

5

Increasing Start Time

1

2

3

4

3

2

1

6

4

3

7

6

4

3

Slots

3

6

7

9 RSCDays 2018

Page 10: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Optimal Slots Subset Allocation

c1, z1

c2, z2

cm, zm

1

2

m

1

2

n

n < m

ci, zi

cj, zj

ck, zk

Maximize:

10 RSCDays 2018

Page 11: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Optimization Scheme 0-1 Knapsack Problem with O(m∗C)

k=1 c=3 z=3

c=2 z=1

c=4 z=6

0 - - -

1 - - -

2 - 1 1

3 3 3 3

4 3 3 6

5 3 3 6

6 3 3 6

k=2 c=3 z=3

c=2 z=1

c=4 z=6

0 - - -

1 - - -

2 - - -

3 - - -

4 - - -

5 - 4 4

6 - 4 7

Cj

i

11 RSCDays 2018

n << m

n << C

Page 12: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Simulation Study https://github.com/dmieter/mimapr Intel Core i3-4130,16GB RAM

Simulation environment: • heterogeneous resource domain model

N =100 computing nodes

L = 1200 scheduling horizon

• non-linear probabilistic model for the computing environment parameters generation: cost, performance, initial load

• a space-shared resources allocation policy

Job request:

n = 7 nodes

V = 800 computational volume

C = 644 maximum budget

qi in [0;10] randomly generated for

each node to compare algorithms against Q (sum of qi, i = 1,…,n)

q=5.6

q=1.2

q=9.4

q=4.3

q=3.9

q=2.0

q=7.1

q=5.2

12 RSCDays 2018

Page 13: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Resource Domain Utilization Example

13 RSCDays 2018

Page 14: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Algorithms Comparison

• First Fit returns first suitable and affordable window found

• Multiple Best returns the best window over a set of execution alternatives found

• General Window Search Scheme:

Min Finish Time

Min Runtime

Min Cost

• Max Q additionally implements an optimal slots subset allocation

Kovalenko et al., 2007 Buyya et al., 2007

Ernemann et al., 2002

Toporkov et al., 2012

14 RSCDays 2018

Toporkov, Yemelyanov, 2018

Page 15: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Simulation Study: Time I (7 nodes out of 100)

15 RSCDays 2018

Page 16: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Simulation Study: Time II (7 nodes out of 40)

RSCDays 2018 16

Page 17: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Simulation Study: Execution Cost

17 RSCDays 2018

Page 18: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Simulation Study: Custom Criterion Q = sum of qi, i = 1,…,n, qi in [0;10], Q in [0;70]

18 RSCDays 2018

Page 19: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Running Time and Complexity

19 RSCDays 2018

Page 20: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Running Time Depending on the Scheduling Interval Length L

RSCDays 2018 20

Page 21: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Running Time Depending on a Number n of Nodes Required for a Job Execution

RSCDays 2018 21

Page 22: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Conclusions and Further Work

The general square window search algorithm with the special slots subset allocation procedure is proposed for the resources co-allocation problem

Simulation study proved 44% advantage over the First Fit algorithm and 18% over the MultipleBest optimization heuristic

As a drawback, the general case algorithm requires 600 times longer realization time compared to FirstFit

In our further work, we will be focused on a practical resources allocation tasks implementation based on the proposed general case approach

22 RSCDays 2018

Page 23: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Acknowledgments

This work was partially supported by:

Ministry on Education and Science of the Russian Federation (project no. 2.9606.2017/8.9)

Council on Grants of the President of the Russian Federation for State Support of Young Scientists (grant YPhD-2297.2017.9)

RFBR (grants 18-07-00456 and 18-07-00534)

23 RSCDays 2018

Page 24: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

References https://github.com/dmieter/mimapr СloudOfferFinder SquareWindowFinder https://srmpds.github.io/

International Workshop on SRMPDS (Scheduling and Resource Management for Parallel and Distributed Systems) 2018, University of Oregon, USA

https://www.youtube.com/watch?v=h6KfBKdVHOU Victor Toporkov and Dmitry Yemelyanov. Resources Co-Allocation

Optimization Algorithms for Distributed Computing Environments

RSCDays 2018 24

Page 25: Resources Allocation Algorithms for Distributed and ... · Introduction •Resource co-allocation and scheduling of complex sets of parallel jobs is an essential issue in distributed

Thank You!

25 RSCDays 2018