Apache Hadoop India Summit 2011 talk "An Extension of Fairshare-Scheduler and a Novel SLA based...
Transcript of Apache Hadoop India Summit 2011 talk "An Extension of Fairshare-Scheduler and a Novel SLA based...
AN EXTENSION OF
FAIRSHARE SCHEDULER AND A
NOVEL SLA BASED LEARNING
SCHEDULER IN HADOOP
BY
Dr G SUDHA SADHASIVAM
PROFESSOR
&
PRIYA N
STUDENT, PSG COLLEGE OF TECHNOLOGY
COIMBATORE
AGENDA
- Introduction
- Metascheduler in Fairshare Scheduler
- Features
- Extended Fairscheduler Architecture
- Work Flow
- Experimental Results
- Learning Scheduler with SLA
- Design of Proposed System
- Work Flow
FAIRSHARE SCHEDULER
Existing System:
Jobs in a pool are executed in a fairshare manner.
Proposed System:
Fairshare execution of jobs from the pool such that large jobs run first and small jobs are backfilled.
FEATURES
Jobs in pools
Guaranteed capacity
Minimum Shares
Job Limits
Job Priorities
Pool Weights
ARCHITECTURE
[Diagram: Users 1-4 submit jobs into a pool spanning Nodes 1-4; the Fairshare Scheduler executes them with Large Job First + Small Job Backfilling.]
Calculate:
User Estimated Time = (no. of maps * map time) + (no. of reduces * reduce time)
Update runnability:
Task count = total_tasks - running_tasks - finished_tasks + needed_tasks_for_job
Weight = weight * priority_factor
Fairshare = (weight * old_slots) / total_weight
Deficit (MR_Deficit) = (fairshare - running) * time_delta
WORKFLOW
1. Get jobs in the pool.
2. Calculate the number of maps and reduces.
3. Find the user estimated time = (no. of maps * map time) + (no. of reduces * reduce time).
4. Create a list of jobs and call fairscheduler.start().
5. Get the run state of the job in progress; if finished or running, remove it from the list.
6. Update weight, task count, minimum slots, runnability, and fairshare.
7. If job finish time < user estimated time, categorize jobs as small and large.
8. Bring the large job first and backfill small jobs (backfill if exe_time < delay).
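The "large job first + small job backfilling" ordering in the workflow can be sketched as follows. This is a simplified illustration, not the actual patch code; the job fields, the size threshold, and the delay window are assumptions:

```python
# Hypothetical sketch of Large Job First + Small Job Backfilling.
# A job is a dict with 'name' and 'est_time' (its user estimated time).

def order_jobs(jobs, size_threshold, delay):
    """Large jobs (est_time >= size_threshold) run first, biggest first;
    small jobs are backfilled ahead of deferred ones only when their
    estimated execution time fits within the delay window."""
    large = sorted([j for j in jobs if j["est_time"] >= size_threshold],
                   key=lambda j: j["est_time"], reverse=True)
    small = [j for j in jobs if j["est_time"] < size_threshold]
    backfilled = [j for j in small if j["est_time"] < delay]
    deferred = [j for j in small if j["est_time"] >= delay]
    return large + backfilled + deferred
```

For example, with a size threshold of 40 and a delay window of 10, jobs with estimated times 50, 5, 200, and 20 would run as: the two large jobs biggest-first (200, 50), then the backfilled small job (5), then the deferred small job (20).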
RESULT (LFSB): Different Jobs

Job Size    Scheduler with Backfilling    Scheduler without Backfilling
2.5 MB      2 min 25 sec                  2 min 25 sec
884.8 KB    2 min 32 sec                  3 min 17 sec
2.5 MB      2 min 41 sec                  3 min 27 sec
884.8 KB    3 min 27 sec                  3 min 21 sec
1.2 MB      3 min 49 sec                  4 min 40 sec
Total       4 min 15 sec                  5 min 39 sec
More small jobs

Job Size    Scheduler with Backfilling    Scheduler without Backfilling
2.5 MB      1 min 43 sec                  2 min 3 sec
884.8 KB    1 min 57 sec                  2 min 11 sec
1.2 MB      2 min 14 sec                  1 min 45 sec
884.8 KB    2 min 22 sec                  3 min 02 sec
884.8 KB    2 min 59 sec                  4 min 0 sec
Total       3 min 23 sec                  4 min 34 sec
A NOVEL SLA BASED LEARNING
SCHEDULER
SCHEDULERS IN HADOOP
- Hadoop on Demand: FIFO with Torque; no data locality.
- Fairshare: fair-shares resources among jobs in pools; excess resources are shared between pools.
- Capacity: fairsharing among organisations; inter-queue priority is maintained manually (not dynamic).
- Dynamic priority scheduler: priority is adjustable dynamically based on the demand/budget of the user; smaller jobs get more priority, so large jobs have to be broken up into smaller ones.
PATCHES
- Security features to isolate users.
- Launching multiple tasks per heartbeat.
- Parallelise jobs and launch smaller jobs faster.
- Prevent oversubscribing nodes (only after job submission): RAM / HD.
Existing System:
- Task assignment to the right node.
- No policies and limited user-level response.
Proposed System:
- SLA: the user specifies requirements.
- The job executes at the right node.
- Jobs are classified as I/O bound or CPU bound, prioritised, and assigned.
PROPOSED METHODOLOGY
SLA - user details, job requirements, and charge sheet.
Scheduler:
- Classifies new jobs based on (SLA + job features) and node features.
- Classification based on job trace history (learning).
- Creation of queues for jobs as I/O and CPU.
- Assignment to queues based on a utility function.
[Diagram: The user submits an SLA (owner, description, user details, and requirements) to the Learning Scheduler, which gathers details of Nodes 1-5 and checks for SLA approval; if approved, the user is allowed to submit jobs.]
WORKFLOW OF SCHEDULER
Job features + SLA, node features, and job trace history feed a classifier, which outputs the right node and the job type. The job is classified as I/O or CPU bound: if (MIS + MOS) / MTCT > avg. disk I/O rate, it is I/O bound. The utility is then calculated and compared to assign the job to the I/O queue or the CPU queue, and its priority may be changed.
EXAMPLE
Node feature values:

RAM (MB)  HD (GB)  No. Map tasks  No. Reduce tasks  Tracker name
1037      320      8              2                 Cluster1
700       256      5              3                 Cluster2
1124      128      5              1                 Cluster3
437       100      6              3                 Cluster4
1540      128      6              2                 Cluster5
456       80       5              2                 Cluster6

Feature weights:

RAM (MB)  HD (GB)  No. Map tasks  No. Reduce tasks
0.1       0.5      0.25           0.15
Job Submitted (Job Features)
1. RAM = 400 MB, HD = 100 GB, M = 6, R = 2
2. RAM = 500 MB, HD = 120 GB, M = 8, R = 0

P(node) = {no. of job features + no. of node features * (P(F1) + P(F2) + ... + P(Fn))} / total features

P(J1M1) = 1, P(J1M2) = 0.875, P(J1M3) = 0.8, P(J1M4) = 1, P(J1M5) = 1, P(J1M6) = 0.625
P(J2M1) = 1, P(J2M2) = 0.857, P(J2M3) = 0.857, P(J2M4) = 0.514, P(J2M5) = 0.857, P(J2M6) = 0.514

JOB 1 = M1, M4, M5; M4 satisfies.
JOB 2 = M1.
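The exact form of the slide's P(node) scoring formula is ambiguous in this transcript, so the sketch below uses a simpler substitute: score each node by the fraction of the job's resource requirements it satisfies. The feature names and dict layout are hypothetical, and the scores it produces will not match the weighted probabilities quoted above:

```python
# Hypothetical node-job matching sketch: score = fraction of the job's
# requirements the node satisfies.  This is a simplification standing in
# for the slide's weighted probability formula, which is garbled in the
# transcript.

FEATURES = ("ram", "hd", "maps", "reduces")

def match_score(job, node):
    satisfied = sum(1 for f in FEATURES if node[f] >= job[f])
    return satisfied / len(FEATURES)

# Job 1 and two of the example nodes, using the table's values:
job1 = {"ram": 400, "hd": 100, "maps": 6, "reduces": 2}
cluster1 = {"ram": 1037, "hd": 320, "maps": 8, "reduces": 2}
cluster6 = {"ram": 456, "hd": 80, "maps": 5, "reduces": 2}
```

Under this scoring, Cluster1 satisfies all four of Job 1's requirements, while Cluster6 fails on HD (80 < 100) and map slots (5 < 6).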
CPU OR I/O BOUND JOB

Tasks  Map Input Size  Map Output Size  (MIS+MOS)/MTCT  CPU or I/O
T1     10              20               3               CPU
T2     100             200              30              I/O
T3     50              60               11              I/O
T4     60              60               12              I/O
T5     70              30               10              CPU

I/O rate: 10 MB/sec
MTCT: 10 sec
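The classification rule can be sketched directly from the table. Note that T5 evaluates to exactly the average I/O rate (10), so under the strict inequality it falls on the CPU-bound side:

```python
# Classification rule from the slides: a task is I/O bound when
# (MIS + MOS) / MTCT exceeds the average disk I/O rate, else CPU bound.

def classify(mis, mos, mtct, avg_io_rate):
    return "I/O" if (mis + mos) / mtct > avg_io_rate else "CPU"

# Tasks from the example table (MTCT = 10 s, avg disk I/O rate = 10 MB/s):
for name, mis, mos in [("T1", 10, 20), ("T2", 100, 200),
                       ("T3", 50, 60), ("T4", 60, 60), ("T5", 70, 30)]:
    print(name, classify(mis, mos, 10, 10))
```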
SCHEDULER
- Find the right node for the job using a classifier (Naïve Bayes classifier).
- Find the job type, whether I/O or CPU bound: (MIS + MOS) / MTCT > avg. disk I/O rate.
- Calculate the utility function value (FIFO, Deficit, SJF).
- Pass the jobs to the queue.
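The final step, passing classified jobs to their queues, can be sketched as follows. This is an illustrative sketch, not the actual scheduler code; SJF (shortest job first) is used as the ordering policy here, though the slides also mention FIFO and Deficit:

```python
# Hypothetical sketch: route each classified job to the I/O or CPU queue
# and order each queue by SJF (shortest estimated time first).
from collections import defaultdict

def enqueue(jobs):
    """jobs: list of (name, job_type, est_time) tuples, where job_type
    is 'I/O' or 'CPU'.  Returns a dict of per-type queues."""
    queues = defaultdict(list)
    for name, job_type, est_time in jobs:
        queues[job_type].append((name, est_time))
    for q in queues.values():
        q.sort(key=lambda item: item[1])  # SJF: shortest estimate first
    return queues
```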
ADVANTAGES
Fairscheduler with backfilling improves the waiting time for large jobs; it introduces a "no starvation" guarantee and improves response time.
The SLA-based scheduler brings high user-level responsiveness and better utilisation of resources.
REFERENCES
- Saeed Iqbal, Rinku Gupta, Yung-Chin Fang, "Job Scheduling in HPC Clusters", DELL Power Solutions, 2005.
- Juan Wang, Wenming Guo, "The Application of Backfilling in Cluster Systems", 2009 IEEE International Conference on Communications and Mobile Computing.
- Jaideep Dhok and Vasudeva Varma, "Using Pattern Classification for Task Assignment in MapReduce", 10th IEEE/ACM International Conference CCGrid 2010.
- Amy W. Apon, Thomas D. Wagner, and Lawrence Dowdy, "A Learning Approach to Processor Allocation in Parallel Systems", CIKM '99: Proceedings of the Eighth International Conference on Information and Knowledge Management, pages 531-537, New York, NY, USA, 1999.
- Harry Zhang, "The Optimality of Naive Bayes", in Valerie Barr and Zdravko Markov, editors, FLAIRS Conference, AAAI Press, 2004.
THANK YOU