Scheduling for cloud systems with multi level data locality

Post on 10-Feb-2017


Scheduling for Cloud Systems with Multi-level Data Locality: Throughput and Heavy-traffic Optimality

Ali Yekkehkhany

In collaboration with Qiaomin Xie and Professor Yi Lu

University of Illinois at Urbana-Champaign (UIUC)

1

Data Processing

• Previously, storage and computing were separate

[Diagram: Computing and Storage connected by a Network]

2

Data-Intensive Processing

Explosion of data sets by industry and research

[Diagram: Computing and Storage connected by a Network, with a bottleneck marked]

3

Data Centers

• Use separate smaller centers for storage
• Move computing to data

Bottleneck

4

Data Centers

[Diagram: two racks of servers, each under a Top of Rack Switch, joined by a Core Switch]

5

Data-parallel Processing

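[Diagram: data blocks A, B, C, D stored on servers in Rack 1 and Rack 2, with tasks TA and TB; a task can be served at a local, rack-local, or remote server]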

6

Data-parallel Processing

[Diagram: the same layout of data blocks A, B, C, D in Rack 1 and Rack 2, with tasks TA and TB]

7

Convention

A task type is defined by the locations of its data block

[Diagram: task types (2,5,6), (4,7,8), (3,4,9), (7,8,9), ..., (i,j,k) mapped to servers 1, 2, 3, ..., n. Labels: O(n^3) task types; unknown.]
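The O(n^3) count follows from the convention above if each data block is assumed to be stored on three distinct servers, as the triples in the diagram suggest. A task type is then an unordered set of three servers. A minimal Python sketch of that counting argument, where `task_types` is a hypothetical helper name and the replica count of 3 is an assumption:

```python
from itertools import combinations
from math import comb

def task_types(n, replicas=3):
    """Enumerate task types as sorted tuples of server ids 1..n.

    Assumes every data block has `replicas` copies on distinct servers.
    """
    return list(combinations(range(1, n + 1), replicas))

types = task_types(10)
# The number of types is C(n, 3) = n(n-1)(n-2)/6, which grows as O(n^3).
assert len(types) == comb(10, 3) == 120
assert (2, 5, 6) in types
```

Because the number of types grows cubically in the number of servers, keeping one queue per task type quickly becomes impractical for large clusters.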

8

Local, Rack-local, and Remote Service

9

[Diagram: servers 1 to 10 in Rack 1 and Rack 2, with an example task of type (1, 3, 4)]
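The three service levels can be illustrated with a small Python sketch. The rack assignment (servers 1 to 5 in Rack 1, servers 6 to 10 in Rack 2) is an assumption made for the example, and `locality` and `rack_of` are hypothetical names:

```python
def locality(server, task_type, rack_of):
    """Classify a server relative to a task type.

    task_type: tuple of servers that store the task's data block.
    rack_of:   dict mapping server id -> rack id.
    """
    if server in task_type:
        return "local"        # the server holds the data itself
    if any(rack_of[s] == rack_of[server] for s in task_type):
        return "rack-local"   # a server in the same rack holds the data
    return "remote"           # the data is only in other racks

# Assumed layout: servers 1-5 in Rack 1, servers 6-10 in Rack 2.
rack_of = {s: 1 if s <= 5 else 2 for s in range(1, 11)}
task = (1, 3, 4)

print(locality(1, task, rack_of))  # local
print(locality(2, task, rack_of))  # rack-local
print(locality(7, task, rack_of))  # remote
```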

Question

10

[Diagram: servers 1 to 10 in Rack 1 and Rack 2; a new task arrives, and one server is idle]

What algorithm to use for routing and scheduling?

• Routing: to which queue should the task be routed?
• Scheduling: to which queue should the server give service when it becomes idle?

Metrics of Optimality for the Algorithm

Throughput Optimality: Stabilizing any arrival rate vector within the capacity region.

Delay Optimality in Heavy-traffic: Asymptotically minimizing the average delay as the arrival rate vector approaches the boundary of the capacity region.
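One common way to read "within the capacity region" is that the arrival rates can be split across servers so that every server's load stays strictly below 1. The Python sketch below checks a single given split; it does not search for one. The service rate values and the names `RATES`, `server_loads`, and `is_stabilizable` are assumptions made for illustration and do not come from the slides:

```python
# Assumed service rates per locality level (illustrative values only).
RATES = {"local": 1.0, "rack-local": 0.8, "remote": 0.5}

def server_loads(split):
    """Compute per-server load for a given split of the arrival rates.

    split: dict mapping (server, locality) -> arrival rate routed there.
    """
    loads = {}
    for (server, loc), lam in split.items():
        # Each task served at rate mu occupies the server for 1/mu on average.
        loads[server] = loads.get(server, 0.0) + lam / RATES[loc]
    return loads

def is_stabilizable(split):
    """True if every server's load is strictly below 1 under this split."""
    return all(load < 1.0 for load in server_loads(split).values())

split = {(1, "local"): 0.5, (1, "remote"): 0.2, (2, "rack-local"): 0.4}
print(is_stabilizable(split))
```

Under these assumed rates, a throughput-optimal algorithm keeps the queues stable whenever some such split exists, without needing to know the arrival rates in advance.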

11

Previous Work for Two Levels of Data Locality

1- Fluid Model Planning: Harrison (98), Harrison-López (99), Bell-Williams (05).

12

[Diagram: task types and servers, repeated from the Convention slide]

Previous Work for Two Levels of Data Locality

1- Fluid Model Planning:
1.1 Throughput optimal
1.2 Heavy-traffic optimal

But NOT practical!

13

Previous Work for Two Levels of Data Locality

2- Join the Shortest Queue-MaxWeight (JSQ-MW), Wang et al. (13).

2.1 Throughput optimal
2.2 Not heavy-traffic optimal in all loads
2.3 Heavy-traffic optimal in SPECIFIC loads

14

Previous Work for Two Levels of Data Locality

3- Priority Algorithm for Near-Data Scheduling (Pandas), Q. Xie, Y. Lu (15)

3.1 Throughput optimal
3.2 Heavy-traffic optimal for all loads

15

Three Levels of Data Locality

1. Fluid Model Planning
   1. Throughput optimal
   2. Heavy-traffic optimal
   3. NOT practical!

2. Extension of JSQ-MaxWeight
   1. Throughput optimal
   2. NOT heavy-traffic optimal for all loads

3. Pandas
   1. Not throughput optimal
   2. Not heavy-traffic optimal

16

Extension of JSQ-MW for Three Levels of Locality

17

[Diagram: a task of type (1, 2, 3) joining the shortest queue]

Extension of JSQ-MW for Three Levels of Locality

• Extension of JSQ-MaxWeight for systems with rack structure, Xie et al. (16):
– Throughput optimal.
– Not heavy-traffic optimal in all loads; heavy-traffic optimal only in specific loads.

18

Our Throughput and Heavy-traffic Optimal Algorithm

• The routing and scheduling for our algorithm are as follows:
– Routing: Weighted Workload
– Scheduling: Priority Scheduling for local, rack-local, and remote tasks queued in the 3 queues associated with each server.

19

Weighted-Workload Routing

20

[Diagram: servers 1 to 4 in Rack 1 and Rack 2; each server has three queues, labeled l, k, r]

Weighted-Workload Routing

21

[Diagram: servers 1 to 4 in Rack 1 and Rack 2; each server has three queues (l - local, k - rack-local, r - remote) and a workload, W1 to W4]

Weighted-Workload Routing

22

[Diagram: servers 1 to 4 in Rack 1 and Rack 2 with workloads W1 to W4; servers are marked local, rack-local, and remote]


Weighted-Workload Routing

24

[Diagram: servers 1 to 4 in Rack 1 and Rack 2; their workloads W1 to W4 are compared with "<" signs]
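The routing step pictured in these slides can be sketched in Python. The sketch rests on assumptions that the slides do not state explicitly: a server's workload is taken as the expected time to drain its three queues, the workload is divided by the service rate the new task would receive at that server, and the task joins the server with the smallest result. The rate values, the three-rack layout, and all function names are hypothetical.

```python
# Assumed service rates: local > rack-local > remote (illustrative values only).
RATE = {"local": 1.0, "rack-local": 0.8, "remote": 0.5}

def locality(server, task_type, rack_of):
    """Classify `server` relative to the servers in `task_type` that hold the data."""
    if server in task_type:
        return "local"
    if any(rack_of[s] == rack_of[server] for s in task_type):
        return "rack-local"
    return "remote"

def workload(q):
    """Assumed workload: expected time for a server to drain its three queues."""
    return sum(q[loc] / RATE[loc] for loc in RATE)

def route(task_type, queues, rack_of):
    """Send the task to the server with the smallest weighted workload."""
    def weighted(m):
        return workload(queues[m]) / RATE[locality(m, task_type, rack_of)]
    best = min(queues, key=weighted)  # ties go to the first server in this sketch
    loc = locality(best, task_type, rack_of)
    queues[best][loc] += 1  # the task joins that server's l, k, or r queue
    return best, loc

# Hypothetical cluster: three racks with two servers each.
rack_of = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3}
queues = {
    1: {"local": 3, "rack-local": 0, "remote": 0},
    2: {"local": 2, "rack-local": 1, "remote": 0},
    3: {"local": 4, "rack-local": 0, "remote": 0},
    4: {"local": 1, "rack-local": 1, "remote": 0},
    5: {"local": 1, "rack-local": 0, "remote": 0},
    6: {"local": 2, "rack-local": 0, "remote": 0},
}
print(route((1, 2, 3), queues, rack_of))  # (5, 'remote')
```

In this example the lightly loaded remote server 5 wins even after its workload is penalized by the slow remote rate, which shows how the weighting trades data locality against load balancing.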

Priority Scheduling

25

[Diagram: servers 1 to 4 in Rack 1 and Rack 2, each with queues l, k, r]

Each server serves its local queue first.

Priority Scheduling

26

[Diagram: servers 1 to 4 in Rack 1 and Rack 2, each with queues l, k, r]

Each server serves in the order of local, rack-local, remote.
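The priority rule can be written as a short Python sketch. It assumes each server keeps a count of waiting tasks in its own three queues; `PRIORITY`, `next_queue`, and `serve` are hypothetical names:

```python
# Service order stated on the slide: local first, then rack-local, then remote.
PRIORITY = ("local", "rack-local", "remote")

def next_queue(queues):
    """Return the highest-priority non-empty queue of a server, or None if all are empty."""
    for loc in PRIORITY:
        if queues[loc] > 0:
            return loc
    return None

def serve(queues):
    """Take one task from the chosen queue when the server becomes idle."""
    loc = next_queue(queues)
    if loc is not None:
        queues[loc] -= 1
    return loc

q = {"local": 0, "rack-local": 2, "remote": 5}
print(serve(q))  # rack-local
```

The server only looks at its own three queues, so the decision needs no information about other servers.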

Weighted Workload Algorithm

The Weighted Workload (WW) algorithm proposed by Xie et al. (16) is proven to be both throughput optimal and heavy-traffic optimal for all loads.

27

Evaluation

28

Comparing the Stability Regions

29

Heavy-traffic Optimality in Special Load

30

Heavy-traffic optimality of WW

31

References

• [1] Q. Xie, A. Yekkehkhany, and Y. Lu. Scheduling with Multi-level Data Locality: Throughput and Heavy-traffic Optimality. In Proceedings of INFOCOM. IEEE, 2016.

• [2] Q. Xie and Y. Lu. Priority Algorithm for Near-data Scheduling: Throughput and Heavy-traffic Optimality. In Proceedings of INFOCOM. IEEE, 2015.

• [3] W. Wang, K. Zhu, L. Ying, J. Tan, and L. Zhang. Map Task Scheduling in MapReduce with Data Locality: Throughput and Heavy-traffic Optimality. In Proceedings of INFOCOM. IEEE, 2013.

• [4] J. M. Harrison. Heavy traffic analysis of a system with parallel servers: Asymptotic optimality of discrete review policies. Annals of Applied Probability, 1998.

• [5] J. M. Harrison and M. J. López. Heavy traffic resource pooling in parallel-server systems. Queueing Syst. Theory Appl., 33(4), Apr. 1999.

32

Future Work

• Scheduling for an arbitrary number of data locality levels, rather than only three.

33

Thanks for Your Attention

34

Any Questions?!

35

Ali Yekkehkhany
yekkehk2@illinois.edu