Page 1: Apache Hadoop India Summit 2011 talk "An Extension of Fairshare-Scheduler and a Novel SLA based Learning Scheduler in Hadoop" by G Sudha Sadhasivam and Priya N

AN EXTENSION OF FAIRSHARE SCHEDULER AND A NOVEL SLA BASED LEARNING SCHEDULER IN HADOOP

BY

Dr G SUDHA SADHASIVAM, PROFESSOR

&

PRIYA N, STUDENT

PSG COLLEGE OF TECHNOLOGY, COIMBATORE

Page 2:

AGENDA

Introduction

- Metascheduler in Fairsharescheduler.

Features.

Extended Fairscheduler Architecture.

Work Flow.

Experimental results.

Learning Scheduler with SLA.

Design of Proposed System.

Work Flow

Page 3:

FAIRSHARE SCHEDULER

Existing System:

Jobs in a pool are executed in a fair-share manner.

Proposed System:

Fair-share execution of jobs from the pool, with large jobs scheduled first and small jobs backfilled.

Page 4:

FEATURES

Jobs in pools

Guaranteed capacity

Minimum Shares

Job Limits

Job Priorities

Pool Weights

Page 5:

ARCHITECTURE

[Diagram: User 1 to User 4; Pool; Fairshare Scheduler (Large Job First + Small Job Backfilling); Node 1 to Node 4]

Page 6:

Calculate:

User Estimated Time = (no. of maps * map time) + (no. of reduces * reduce time)

Update:

Runnability

Task count = total_Tasks - running_Tasks - finished_Tasks + needed_Tasks_for_job

Weight = weight * priority factor

Fairshare = (weight * oldslots) / totalweight

Deficit (MR_Deficit) = (fairshare - running) * timedelta
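
The quantities on this slide can be written as small functions. This is a minimal sketch for illustration only, not the scheduler's actual code; the function and parameter names are assumptions chosen to mirror the slide's terms, and the numbers in the usage lines are made up.

```python
def user_estimated_time(maps, reduces, map_time, reduce_time):
    """User Estimated Time = (no. of maps * map time) + (no. of reduces * reduce time)."""
    return maps * map_time + reduces * reduce_time


def task_count(total_tasks, running_tasks, finished_tasks, needed_tasks_for_job):
    """Task count = total - running - finished + needed tasks for the job."""
    return total_tasks - running_tasks - finished_tasks + needed_tasks_for_job


def adjusted_weight(weight, priority_factor):
    """Weight = weight * priority factor."""
    return weight * priority_factor


def fair_share(weight, old_slots, total_weight):
    """Fairshare = (weight * oldslots) / totalweight."""
    return weight * old_slots / total_weight


def deficit(fair_share_value, running, time_delta):
    """Deficit = (fairshare - running) * timedelta."""
    return (fair_share_value - running) * time_delta


# Illustrative values only (not taken from the talk).
estimate = user_estimated_time(maps=6, reduces=2, map_time=10.0, reduce_time=20.0)
share = fair_share(weight=adjusted_weight(1.0, 2.0), old_slots=8, total_weight=4.0)
owed = deficit(share, running=1, time_delta=0.5)
```

A job whose deficit is positive has been running fewer tasks than its fair share for the elapsed interval, which is why the deficit grows with `time_delta`.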

Page 7:

WORKFLOW

[Flowchart; box labels in reading order]

• Get jobs in pool

• Calculate no. of maps and reduces

• Find User Estimated Time

• Create a list of jobs

• fairscheduler.start()

• Get runstate of job in progress

• Finished / running

• Remove from list

• Update: weight, taskcount, min. slots, runnability, fairshare

• Job finish time < user estimated time

• Bring large job first and backfill small jobs

• Categorize jobs as small and large: no. of maps * map time + no. of reduces * reduce time

• Backfill if exe_time < delay
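
The "large job first, backfill small jobs" step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the `threshold` that separates small from large jobs and the `delay` window are hypothetical parameters, and the job names and times are made up. Only the rule "backfill if exe_time < delay" comes from the slide.

```python
def split_by_size(jobs, threshold):
    """Split (name, estimated_time) pairs into large and small jobs.

    Large jobs are returned longest-first; `threshold` is an assumed cutoff.
    """
    large = sorted((j for j in jobs if j[1] >= threshold),
                   key=lambda j: j[1], reverse=True)
    small = [j for j in jobs if j[1] < threshold]
    return large, small


def plan(jobs, threshold, delay):
    """Order large jobs first and pick small jobs that fit in the delay window."""
    large, small = split_by_size(jobs, threshold)
    # Slide rule: backfill a small job only if its execution time < delay.
    backfill = [j for j in small if j[1] < delay]
    deferred = [j for j in small if j[1] >= delay]
    return {
        "large_first": [name for name, _ in large],
        "backfill": [name for name, _ in backfill],
        "deferred": [name for name, _ in deferred],
    }


# Hypothetical jobs: (name, estimated execution time in seconds).
JOBS = [("A", 300), ("B", 40), ("C", 500), ("D", 90)]
result = plan(JOBS, threshold=200, delay=60)
```

With these made-up numbers, the two large jobs are ordered longest-first, job B fits in the 60-second window and is backfilled, and job D waits.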

Page 9:

RESULT (LFSB): Different Jobs

Job Size    Scheduler with Backfilling    Scheduler without Backfilling
2.5 MB      2 min 25 sec                  2 min 25 sec
884.8 KB    2 min 32 sec                  3 min 17 sec
2.5 MB      2 min 41 sec                  3 min 27 sec
884.8 KB    3 min 27 sec                  3 min 21 sec
1.2 MB      3 min 49 sec                  4 min 40 sec
Total       4 min 15 sec                  5 min 39 sec

Page 10:

More small jobs

Job Size    Scheduler with Backfilling    Scheduler without Backfilling
2.5 MB      1 min 43 sec                  2 min 3 sec
884.8 KB    1 min 57 sec                  2 min 11 sec
1.2 MB      2 min 14 sec                  1 min 45 sec
884.8 KB    2 min 22 sec                  3 min 02 sec
884.8 KB    2 min 59 sec                  4 min 0 sec
Total       3 min 23 sec                  4 min 34 sec

Page 11:

A NOVEL SLA BASED LEARNING SCHEDULER

Page 12:

SCHEDULERS IN HADOOP

Hadoop on Demand

• FIFO with Torque

• No data locality

Fairshare

• Fair-shares resources among jobs in pools

• Excess resources are shared between pools

Capacity

• Fair sharing among organisations

• Inter-queue priority is maintained manually (not dynamic)

Dynamic priority scheduler

• Priority is adjustable dynamically

• Demand / budget of the user

• More priority for smaller jobs

• Large jobs have to be broken up into smaller ones

Page 13:

PATCHES

Security features to isolate users

Launching multiple tasks per heartbeat

Parallelise jobs and launch smaller jobs faster

Prevent oversubscribing nodes (only after job submission): RAM / HD

Page 14:

Existing System:

• Task assignment to the right node.

• No policies and less user-level response.

Proposed System:

SLA: the user specifies requirements.

Job executes at the right node.

Classify jobs as I/O bound or CPU bound, set priority, and assign jobs.

Page 15:

PROPOSED METHODOLOGY

SLA: user details, job requirements and charge sheet.

Scheduler:

Classifies a new job based on (SLA + job features) and node features.

Classification based on job traces history (learning).

Creation of queues for I/O-bound and CPU-bound jobs.

Assignment to queues based on a utility function.

Page 16:

[Diagram: User; SLA (owner, description, user details and requirements); Learning Scheduler; Node 1 to Node 5]

Gather all node details and check for SLA approval. If yes, allow the user to submit jobs.

Page 20:

WORKFLOW OF SCHEDULER

[Flowchart; box labels in reading order]

• Job features + SLA

• Classifier

• Node features

• Job traces history

• Right node & job type

• Calculate & compare utility

• I/O queue

• CPU queue

• I/O or CPU

• (MIS+MOS)/MTCT > Avg. disk I/O rate

• Change priority

Page 21:

EXAMPLE

Node feature values

RAM (MB)    HD (GB)    No. Map tasks    No. Reduce tasks    Tracker name
1037        320        8                2                   Cluster1
700         256        5                3                   Cluster2
1124        128        5                1                   Cluster3
437         100        6                3                   Cluster4
1540        128        6                2                   Cluster5
456         80         5                2                   Cluster6

RAM (MB)    HD (GB)    No. Map tasks    No. Reduce tasks
0.1         0.5        0.25             0.15

Page 22:

Job Submitted (Job Features)

1. RAM = 400 MB, HD = 100 GB, M = 6, R = 2

2. RAM = 500 MB, HD = 120 GB, M = 8, R = 0

P(node) = {no. of job features + no. of node features * (P(F1) + P(F2) + … + P(Fn))} / total features

P(J1M1) = 1, P(J1M2) = 0.875, P(J1M3) = 0.8, P(J1M4) = 1, P(J1M5) = 1, P(J1M6) = 0.625

P(J2M1) = 1, P(J2M2) = 0.857, P(J2M3) = 0.857, P(J2M4) = 0.514, P(J2M5) = 0.857, P(J2M6) = 0.514

Job 1: M1, M4, M5. M4 satisfies.

Job 2: M1.
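
The candidate lists above can be cross-checked with a simple feasibility filter over the node table from the example. This sketch does not reproduce the P(node) formula. It only checks which nodes meet or exceed every requested feature, which for these two jobs happens to give the same nodes that score 1. Mapping M1 to M6 onto Cluster1 to Cluster6 and the dictionary keys are assumptions made for illustration.

```python
# Node features from the example table; M1..M6 are assumed to be Cluster1..Cluster6.
NODES = {
    "M1": {"ram": 1037, "hd": 320, "maps": 8, "reduces": 2},
    "M2": {"ram": 700, "hd": 256, "maps": 5, "reduces": 3},
    "M3": {"ram": 1124, "hd": 128, "maps": 5, "reduces": 1},
    "M4": {"ram": 437, "hd": 100, "maps": 6, "reduces": 3},
    "M5": {"ram": 1540, "hd": 128, "maps": 6, "reduces": 2},
    "M6": {"ram": 456, "hd": 80, "maps": 5, "reduces": 2},
}

# The two submitted jobs from the slide.
JOB1 = {"ram": 400, "hd": 100, "maps": 6, "reduces": 2}
JOB2 = {"ram": 500, "hd": 120, "maps": 8, "reduces": 0}


def candidate_nodes(job, nodes=NODES):
    """Return nodes whose every feature is at least what the job requests."""
    return [
        name
        for name, features in nodes.items()
        if all(features[key] >= needed for key, needed in job.items())
    ]


job1_candidates = candidate_nodes(JOB1)
job2_candidates = candidate_nodes(JOB2)
```

The slide then singles out M4 for Job 1 but does not state the tie-breaking rule, so the sketch stops at the candidate list.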

Page 23:

CPU OR I/O BOUND JOB

Tasks    Map Input Size    Map Output Size    (MIS+MOS)/MTCT    CPU or I/O
T1       10                20                 3                 CPU
T2       100               200                30                I/O
T3       50                60                 11                I/O
T4       60                60                 12                I/O
T5       70                30                 10                Cluster5

I/O rate: 10 Mbytes / sec

MTCT: 10 sec
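
The classification rule behind this table, (MIS+MOS)/MTCT compared with the average disk I/O rate, can be sketched as below. The function names are assumptions for illustration, and the input and output sizes are used in whatever unit the table intends, which the slide does not state.

```python
def io_ratio(map_input_size, map_output_size, mtct):
    """(MIS + MOS) / MTCT, as given on the slide."""
    return (map_input_size + map_output_size) / mtct


def classify(map_input_size, map_output_size, mtct, avg_io_rate):
    """Label a task I/O bound when its ratio strictly exceeds the average disk I/O rate."""
    if io_ratio(map_input_size, map_output_size, mtct) > avg_io_rate:
        return "I/O"
    return "CPU"


# (Map Input Size, Map Output Size) per task, taken from the table.
TASKS = {
    "T1": (10, 20),
    "T2": (100, 200),
    "T3": (50, 60),
    "T4": (60, 60),
    "T5": (70, 30),
}

labels = {
    name: classify(mis, mos, mtct=10, avg_io_rate=10)
    for name, (mis, mos) in TASKS.items()
}
```

T5 has a ratio of exactly 10, which is not strictly greater than the I/O rate, so this sketch labels it CPU; the corresponding cell in the transcript is unreadable, so that label is an inference from the rule rather than a value from the slide.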

Page 24:

SCHEDULER

Find the right node for the job using a classifier: Naïve Bayes classifier.

Find the job type, whether I/O or CPU bound: (MIS+MOS)/MTCT > Avg. disk I/O rate.

Calculate the utility function value: FIFO, Deficit, SJF.

Pass the jobs to the queue.
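
To make the first step concrete, here is a generic categorical Naïve Bayes classifier with add-one smoothing, trained on a tiny job-trace history. It is a textbook sketch, not the scheduler presented in the talk: the features (job type, node load), the labels ("good", "bad") and the trace rows are all hypothetical.

```python
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, index) -> value counts
        self.values = defaultdict(set)              # index -> distinct values seen

    def fit(self, rows, labels):
        """Count label and per-feature value frequencies from past traces."""
        for row, label in zip(rows, labels):
            self.label_counts[label] += 1
            for i, value in enumerate(row):
                self.feature_counts[(label, i)][value] += 1
                self.values[i].add(value)

    def posterior(self, row):
        """Return normalized P(label | row) for every known label."""
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            score = count / total  # prior
            for i, value in enumerate(row):
                seen = self.feature_counts[(label, i)][value]
                score *= (seen + 1) / (count + len(self.values[i]))
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}


# Hypothetical trace history: (job type, node load) -> outcome of the assignment.
ROWS = [("io", "low"), ("io", "low"), ("io", "high"),
        ("cpu", "high"), ("cpu", "low"), ("cpu", "high")]
LABELS = ["good", "good", "bad", "bad", "good", "bad"]

model = NaiveBayes()
model.fit(ROWS, LABELS)
probabilities = model.posterior(("io", "low"))
best = max(probabilities, key=probabilities.get)
```

In a learning scheduler, a node would be preferred for a job when the "good" posterior for that job and node pair is highest; each finished job adds a row to the history.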

Page 25:

ADVANTAGES

The fair scheduler with backfilling reduces waiting time for large jobs. It introduces a "no starvation" slogan and improves response time.

The SLA-based scheduler brings high user-level response and better utilization of resources.

Page 27:

THANK YOU