Robust Resource Allocation of DAGs in a Heterogeneous Multi-core System


Transcript of Robust Resource Allocation of DAGs in a Heterogeneous Multi-core System

Page 1: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

ROBUST RESOURCE ALLOCATION OF DAGS IN A HETEROGENEOUS MULTI-CORE SYSTEM

Luis Diego Briceño, Jay Smith, H. J. Siegel, Anthony A. Maciejewski, Paul Maxwell, Russ Wakefield, Abdulla Al-Qawasmeh, Ron C. Chiang, and Jiayin Li

outline
● motivation and introduction
● system model
● robustness
● example of heuristic
● results and conclusions

Supported by the NSF under grants CNS-0615170 and CNS-0905399

Page 2: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Motivation
● need to execute applications on satellite data
● satellite data is processed in a heterogeneous computing system
● results are needed before a deadline

[Figure: satellite data is fed to applications (app1, app2, ...) on a multi-core heterogeneous data processing system; the result is needed before the deadline]

Page 3: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Problem Statement
● multiple applications (for this presentation consider one)
● each application is a DAG of tasks
● a set of applications must complete before a deadline Δ
● completion time of an application must be robust against uncertainties in the estimated execution time of its tasks
  • actual time is data dependent
● goal: robust resource allocation of data and tasks to heterogeneous multi-core system to meet deadline Δ for applications

[Figure: application α drawn as a DAG of tasks t1,α through t7,α, with a time axis and the deadline Δ]

Page 4: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Environment
● consider a heterogeneous environment used to analyze satellite imaging
  • based on commodity hardware
● these environments require analysis of large data sets
● environment similar to systems in use at
  • National Center for Atmospheric Research (NCAR)
  • DigitalGlobe
● static resource allocation
● estimated time to compute a task is known in advance

Page 5: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Contributions
● contributions
  • model and simulation of a complex multi-core-based data processing environment that executes data intensive applications
    • multi-core machines
      • RAM management
      • hard drive management
    • parallel tasks
    • satellite data placement
  • a robustness metric for this environment
  • resource allocation heuristics to maximize robustness using this metric

Page 6: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

System Model — Satellite Data Placement
● satellite data is split into smaller subsets and distributed among the hard drives of the compute nodes
● processing element (PE) is a core
● PEj,k — PE k on compute node j (1 – 8 per node)
  • PEs within a compute node are homogeneous
● no multi-tasking within a PE

[Figure: satellite data is distributed to the multi-core heterogeneous data processing system; compute node j contains PEj,1 … PEj,8, RAMj, and HDj]

Page 7: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

System Model — Processing
● tasks execute on processing elements (PEs)
  • required input data must be present in RAM to execute task
● ex. [if data on HDj]
  • input data sets are staged to RAM
  • task 1 (t1) can start execution
  • result is stored in RAMj
    • RAM space is limited

[Figure: compute node j with PEj,1 … PEj,8, RAMj, and HDj; satellite data at compute node j supplies the input data sets of task 1 (t1), which produces results]

Page 8: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

System Model — RAM Management
● RAM has a fixed capacity
  • 160 Gbytes (based on DigitalGlobe computer center)
    • assume 152 Gbytes available for data
    • typical data set was from 1 Gbyte to 32 Gbytes
● data sets can be swapped in and out of RAM if needed later
● all input data sets must be in RAM before task execution
  • data sets must remain in RAM until execution is finished
  • must reserve space in RAM for result
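The RAM rules on this slide can be read as a simple capacity check. The following is a minimal sketch under that reading, not the authors' simulator: the function name `fits_in_ram` and its arguments are hypothetical, and only the 152 Gbyte figure comes from the slide.

```python
# 152 Gbytes of the 160 Gbytes are assumed available for data (figure from the slide).
RAM_FOR_DATA_GB = 152


def fits_in_ram(resident_gb, input_sizes_gb, result_size_gb,
                capacity_gb=RAM_FOR_DATA_GB):
    """Return True if a task's inputs plus its reserved result space fit in RAM.

    resident_gb:     space held by data sets that must stay in RAM (hypothetical input)
    input_sizes_gb:  sizes of the input data sets that still have to be staged
    result_size_gb:  space that must be reserved for the task's result
    """
    needed = sum(input_sizes_gb) + result_size_gb
    return resident_gb + needed <= capacity_gb


print(fits_in_ram(100, [32, 16], 4))  # 100 + 48 + 4 = 152 Gbytes
print(fits_in_ram(100, [32, 16], 8))  # 100 + 48 + 8 = 156 Gbytes
```

The check reserves the result space up front, which mirrors the rule that space for the result must be reserved before the task runs.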

Page 9: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

System Model — Storage
● satellite data sets allocated prior to task execution
  • two scenarios for satellite data allocation
    • determined by the heuristic
    • randomly assigned (pre-determined)
● inter-task data is transmitted if destination is not equal to source

Page 10: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

System Model — Applications
● each application appα must complete before Δ
● appα is divided into Tα tasks (tasks form a DAG)
  • each task requires satellite data sets or produced inter-task data sets
  • ti,α is the ith task in the application α
  • each task produces other data items (e.g., data 7)
    • last task produces a result

[Figure: DAG of appα with tasks t1,α, t2,α, and t3,α; satellite data sets sat. data 1, sat. data 4, and sat. data 6; data items data 2, data 3, and data 7; and the result]

Page 11: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

System Model — Computation Parallelism
● 50% of tasks are parallelizable
  • only parallelizable on PEs in the same compute node
  • parallel time = sequential time / divider
  • parallel execution time is used to model different speed ups
● two types of parallelizable tasks
  • 25% good parallel tasks

    PEs      1  2     3    4     5  6     7    8
    divider  1  1.75  2.5  3.25  4  4.75  5.5  6.25

  • 25% average parallel tasks

    PEs      1  2     3    4     5  6     7    8
    divider  1  1.5   2    2.5   3  3.5   4    4.5

  • divider values chosen arbitrarily for the simulation study
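The rule "parallel time = sequential time / divider" can be sketched as a table lookup. This is a minimal illustration: the divider values are copied from the slide, the pairing of the first table with "good" tasks and the second with "average" tasks follows the slide order and is an assumption, and the names `GOOD_DIVIDERS`, `AVERAGE_DIVIDERS`, and `parallel_time` are hypothetical.

```python
# Divider tables from the slide; index 0 corresponds to 1 PE, index 7 to 8 PEs.
GOOD_DIVIDERS = [1, 1.75, 2.5, 3.25, 4, 4.75, 5.5, 6.25]
AVERAGE_DIVIDERS = [1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5]


def parallel_time(sequential_time, num_pes, dividers):
    """Execution time of a parallelizable task on num_pes PEs of one compute node."""
    if not 1 <= num_pes <= len(dividers):
        raise ValueError("num_pes must be between 1 and %d" % len(dividers))
    return sequential_time / dividers[num_pes - 1]


print(parallel_time(100.0, 4, GOOD_DIVIDERS))     # 100 / 3.25
print(parallel_time(100.0, 4, AVERAGE_DIVIDERS))  # 100 / 2.5
```

Both tables grow by a constant step per added PE (0.75 and 0.5), so the modeled speed-up is sublinear in the number of PEs.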

Page 12: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Robustness — Three Questions
● What behavior of the system makes it robust?
  • all applications finish before Δ
● What uncertainties is the system robust against?
  • differences between actual and estimated times
    • assume communications times are fixed
● Quantitatively, exactly how robust is the system?
  • smallest common percentage increase (ρ) for all task execution times that causes the makespan > Δ
    • note: in a real system, the execution times of all tasks will not be increased by the same common percentage
      • ρ is just a mathematical value used as a robustness measure
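One way to compute such a ρ for a fixed mapping is to scale every estimated execution time by a common factor and search for the point where the makespan reaches Δ. The sketch below is an illustration under simplifying assumptions, not the authors' method: ρ is treated as a multiplier on the estimated times, communication times stay fixed, RAM and hard drive effects are ignored, and the names `makespan` and `robustness` and all example values are hypothetical.

```python
def makespan(order, exec_time, pe_of, preds, comm, rho=1.0):
    """Makespan of a fixed mapping when every execution time is scaled by rho.

    order: tasks in a topological order that also respects the order on each PE
    preds: dict task -> list of predecessor tasks
    comm:  dict (pred, task) -> fixed communication time (not scaled by rho)
    """
    finish, pe_free = {}, {}
    for t in order:
        ready = max((finish[p] + comm.get((p, t), 0.0) for p in preds.get(t, [])),
                    default=0.0)
        start = max(ready, pe_free.get(pe_of[t], 0.0))
        finish[t] = start + rho * exec_time[t]
        pe_free[pe_of[t]] = finish[t]
    return max(finish.values())


def robustness(deadline, order, exec_time, pe_of, preds, comm, tol=1e-9):
    """Bisect for the largest rho whose makespan still meets the deadline."""
    def span(rho):
        return makespan(order, exec_time, pe_of, preds, comm, rho)

    if span(0.0) > deadline:
        return 0.0  # fixed communication alone already misses the deadline
    lo, hi = 0.0, 1.0
    while span(hi) <= deadline:  # grow hi until the deadline is violated
        lo, hi = hi, hi * 2.0
        if hi > 1e12:
            return float("inf")  # makespan does not depend on rho
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if span(mid) <= deadline:
            lo = mid
        else:
            hi = mid
    return lo


# Made-up example: a and c share one PE, b runs on another PE and needs data from a.
exec_time = {"a": 3.0, "b": 2.0, "c": 4.0}
pe_of = {"a": "PE1,1", "b": "PE2,1", "c": "PE1,1"}
preds = {"b": ["a"]}
comm = {("a", "b"): 1.0}
rho = robustness(14.0, ["a", "c", "b"], exec_time, pe_of, preds, comm)
print(round(rho, 6))
```

Bisection works here because the makespan never decreases as ρ grows; any increase beyond the returned value pushes the makespan past Δ.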

Page 13: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Robustness — Example
● assume 3 applications
  • blue (b, d, g, and h), green (a, e, and i), and pink (c and f)

[Figure: two charts of completion time on PE1,1, PE2,1, and PE3,1, each showing the makespan against Δ. First chart: tasks a through i, makespan based on estimated task time. Second chart: tasks a′ through i′, makespan when task times = ρ ∙ estimated task time]

Page 14: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Related Work
● significant amount of research
  • assign a DAG to a heterogeneous computing system
    • several critical path heuristics
  • robustness in resource allocation
  • our research considers the robustness of the allocation in DAGs
● two heuristics for minimization of makespan from literature were adapted to this paper
  • heuristics originally meant to minimize makespan
  • adapted heuristics can handle memory, satellite data placement, and robustness
● Dynamic Available Tasks Critical Path (DATCP) heuristic
    • will be explained today

Page 15: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Dynamic Available Tasks Critical Path (DATCP)
outline
1. calculate the critical path for each application
   • for each task, from texit to tentry
     • edge labels are average transfer time/byte between any two nodes ∙ data size
     • determine the maximum time from any successor (child) node to the texit (maxtime)
     • critical path value is the sum of task data and satellite data transfer times, maxtime, and average execution time of ti

[Figure: example DAG; each node is labeled with its average exec. time and its critical path value]
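Step 1 can be sketched as a backward pass over the DAG. This is a minimal illustration under one reading of the slide: the edge time to a child is folded into maxtime, satellite data transfer time is an optional per-task term, and the function name `critical_path_values`, its arguments, and the example numbers are hypothetical (they are not the values in the slide's figure).

```python
def critical_path_values(avg_exec, succs, edge_time, sat_time=None):
    """Critical path value of every task, computed from the exit task backwards.

    avg_exec:  dict task -> average execution time
    succs:     dict task -> list of successor (child) tasks
    edge_time: dict (task, child) -> average transfer time/byte * data size
    sat_time:  optional dict task -> satellite data transfer time
    """
    sat_time = sat_time or {}
    cp = {}

    def visit(t):
        if t not in cp:
            # maxtime: largest time from any child to the exit, including the edge
            maxtime = max((edge_time.get((t, c), 0) + visit(c)
                           for c in succs.get(t, [])), default=0)
            cp[t] = avg_exec[t] + sat_time.get(t, 0) + maxtime
        return cp[t]

    for t in avg_exec:
        visit(t)
    return cp


# Made-up diamond-shaped DAG: t1 -> t2 -> t4 and t1 -> t3 -> t4.
avg_exec = {"t1": 8, "t2": 7, "t3": 6, "t4": 5}
succs = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"]}
edge_time = {("t1", "t2"): 3, ("t1", "t3"): 5, ("t2", "t4"): 4, ("t3", "t4"): 8}
cp_example = critical_path_values(avg_exec, succs, edge_time)
print(cp_example)
```

The memoized recursion visits each task once, so the exit task is resolved first and the entry task last, matching the "from texit to tentry" direction on the slide.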

Page 16: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Dynamic Available Tasks Critical Path (DATCP)
outline
1. calculate the critical path for each application
2. dynamically create a list of all tasks available for mapping
3. determine the task with the longest critical path from the list of available tasks
4. task ti determined in (3) is assigned to the PE that gives the maximum system robustness based on partial mapping
5. repeat steps (2)–(4) until all tasks are mapped

[Figure: example DAG; each node is labeled with its average exec. time and its critical path value]
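Steps 2 to 5 of the outline form a list-scheduling loop. The sketch below shows only that control flow and should not be taken as the authors' implementation: the robustness evaluation is passed in as a hypothetical callback, the toy callback used in the example ignores dependencies, communication, and memory, and all names and numbers are made up for illustration.

```python
def datcp_map(tasks, preds, cp_value, pes, robustness_if_assigned):
    """Map tasks one at a time, always taking the available task with the largest
    critical path value and the PE that maximizes robustness of the partial mapping.

    robustness_if_assigned(mapping, task, pe) is a hypothetical callback that returns
    the system robustness if task is added to the partial mapping on pe.
    """
    mapping, order = {}, []
    unmapped = set(tasks)
    while unmapped:
        # step 2: a task is available once all of its predecessors are mapped
        available = [t for t in tasks
                     if t in unmapped and all(p in mapping for p in preds.get(t, []))]
        # step 3: longest critical path among the available tasks
        task = max(available, key=lambda t: cp_value[t])
        # step 4: PE that gives the maximum robustness for the partial mapping
        best_pe = max(pes, key=lambda pe: robustness_if_assigned(mapping, task, pe))
        mapping[task] = best_pe
        order.append(task)
        unmapped.remove(task)  # step 5: repeat until nothing is left
    return order, mapping


# Made-up example data.
exec_time = {"t1": 8, "t2": 7, "t3": 6, "t4": 5}
preds = {"t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
cp_value = {"t1": 32, "t2": 16, "t3": 19, "t4": 5}


def toy_robustness(mapping, task, pe):
    """Toy stand-in: a deadline of 100 divided by the largest per-PE load."""
    load = {}
    for t, p in list(mapping.items()) + [(task, pe)]:
        load[p] = load.get(p, 0) + exec_time[t]
    return 100 / max(load.values())


order, mapping = datcp_map(list(exec_time), preds, cp_value,
                           ["PE1,1", "PE2,1"], toy_robustness)
print(order)
print(mapping)
```

Keeping the robustness evaluation behind a callback leaves room for a model that also accounts for RAM, data placement, and parallel tasks, as the later slides describe.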


Page 20: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

DATCP — Memory Management
● determine available space in RAM
  • decide if the required task and the input data can be stored in RAM immediately
● if there is not enough space
  • heuristic checks when the task's input data sets can be moved into memory
  • heuristic schedules task to start execution at that time
● if incoming data is from another compute node
  • send it to destination compute node's RAM
  • if there is no space in RAM then send to the HD
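The "not enough space" case can be sketched as a scan over future times at which RAM is freed. This is a simplified illustration, assuming the release times of resident data sets are already known and ignoring the time needed to move data; the function name `earliest_start` and its arguments are hypothetical.

```python
def earliest_start(now, free_gb, needed_gb, releases):
    """Earliest time at which needed_gb of RAM is free on a compute node.

    now:       current scheduling time
    free_gb:   RAM that is free right now
    needed_gb: space for the task's input data sets plus its result
    releases:  list of (time, freed_gb) pairs for data sets that will leave RAM
    Returns None if the space never becomes available.
    """
    if free_gb >= needed_gb:
        return now  # the task's data can be stored immediately
    for time, freed in sorted(releases):
        free_gb += freed
        if free_gb >= needed_gb:
            return max(now, time)
    return None


print(earliest_start(0, 10, 40, [(50, 16), (20, 8), (90, 32)]))
print(earliest_start(5, 64, 40, []))
```

In the first call the free space grows to 18, then 34, then 66 Gbytes, so the task can only be scheduled once the third release has happened.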

Page 21: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

DATCP — Parallelizable Tasks
● two approaches are studied
  • no parallelization
  • "max" approach
    • heuristic always parallelizes across multiple PEs within a compute node
      • determine system robustness for each possible assignment
      • determine the node with the most PEs that have same maximum robustness
      • map the task to all PEs that have the same robustness value within this compute node
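The "max" approach can be sketched as a tie-counting step over per-PE robustness values. This reflects one interpretation of the slide (the maximum is taken over all PEs of all nodes, and ties are detected by exact equality); the function name `max_parallel_assignment` and the example values are hypothetical.

```python
def max_parallel_assignment(robustness_by_pe):
    """Pick the compute node with the most PEs that share the maximum robustness.

    robustness_by_pe: dict (node, pe) -> system robustness if the task is
                      assigned to that PE (values are assumed to be precomputed)
    Returns (node, list of PEs on that node that reach the maximum).
    """
    best = max(robustness_by_pe.values())
    tied = {}
    for (node, pe), value in robustness_by_pe.items():
        if value == best:  # exact tie; a tolerance may be preferable for floats
            tied.setdefault(node, []).append(pe)
    node = max(tied, key=lambda n: len(tied[n]))
    return node, sorted(tied[node])


example = {(1, 1): 3.0, (1, 2): 3.0, (1, 3): 3.0, (2, 1): 3.0, (2, 2): 2.5}
print(max_parallel_assignment(example))
```

The task would then be mapped to every returned PE, and its execution time would follow the divider that corresponds to that number of PEs.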

Page 22: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

DATCP — Satellite Data Placement
● two methods
  • random placement
  • first time a satellite data set is required, that data set and the task that requires it are mapped
    • task is assigned to the PE that maximizes robustness
      • storage location of satellite data set has not been previously determined
      • satellite data set is stored in the HD of this PE's corresponding compute node

Page 23: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Results

[Figure: bar chart of robustness (ρ), axis from 0 to 8, for the heuristic variants listed below. Bar labels as extracted, in order: 7.49, 7.44, 5.39, 3.31, 3.31, 2.58, 2.25, 2.23]

● DATCP 1: max parallel with satellite mapping
● DATCP 2: max parallel with random satellite mapping
● DATCP 3: no parallelism with random satellite mapping
● HRD 1: satellite data (SD) placement based on first task placement with duplication
● HRD 2: SD placement based on first task placement with no duplication
● HRD 3: SD placement based on reference count with no duplication
● HRD 4: random SD placement with duplication
● HRD 5: random SD placement and no duplication

Page 24: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Plot of Makespan vs. Robustness

[Figure: plot of robustness (ρ), ticks 0 to 10, against makespan (s), ticks 0 to 14000, for HRD and DATCP, with the deadline marked]

● different robustness using DATCP despite having similar makespans
● different robustness using HRD despite having similar makespans

Page 25: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System

Conclusions
● derived a metric to measure the robustness
● interdependency of tasks within applications complicates the derivation of a robustness metric
● DATCP has highest average robustness values
  • initial ordering created by DATCP is much better than the order created by HRD
  • if DATCP order is used in HRD then the results of HRD are significantly improved
● satellite data placement did not have any apparent effect on robustness

Page 26: Robust Resource Allocation  of DAGs in a Heterogeneous  Multi-core System


QUESTIONS?