Observations on dag scheduling and dynamic load-balancing using genetic algorithm
Posted 17-Oct-2014
Category: Education
1
Observations on DAG Scheduling and Dynamic Load-Balancing using Genetic Algorithm
Rahul Jain, IDD Part V
Roll No. 07020007, IT-BHU, Varanasi
2
Outline
Introduction
Thesis Objective
Directed Acyclic Graph
Basic Genetic Algorithm
Proposed Algorithm for DAG-Scheduling
Proposed Algorithm for DLB
Experimental Results and Discussion
Conclusion
References
3
Introduction
1. Heterogeneous Computing System
Heterogeneous computing systems refer to electronic systems that use a variety of
different types of computational units.
2. Task Scheduling
The multiprocessor scheduling problem is to allocate the tasks of a parallel program to
processors in a way that minimizes its completion time and optimizes the performance.
3. Load Balancing
The technique of distributing load among processors in order to avoid overloading on a
particular processor is called load balancing.
4
Thesis Objective
This thesis comprises two research projects: DAG Scheduling using Genetic Algorithm in Part I and Dynamic Load Balancing using Genetic Algorithm in Part II.
Part I: The objective of Part I is to design an algorithm that schedules the tasks of a DAG on heterogeneous processors so as to minimize the total completion time (makespan).
Part II: This part designs an algorithm for distributing the load among the processors so that no processor is overloaded.
Various metrics are compared for both DAG-Scheduling and Dynamic Load Balancing.
5
Directed Acyclic Graph
A process or an application can be broken down into a set of tasks; we represent these tasks in the form of a directed acyclic graph (DAG). A parallel program with n tasks can be represented by a 4-tuple (T, E, D, [Ai]):
1) T = {t1, t2, ..., tn} is the set of tasks.
2) E, the edges, represents the communication between tasks.
3) D is an n x n matrix, where the element dij of D is the data volume which ti must transmit to tj.
4) Ai, 1 <= i <= n, is a vector [ei1, ei2, ..., eim], where eiu is the execution time of ti on processor pu.
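The 4-tuple above can be sketched as plain Python data. This is a minimal illustration for a 3-task DAG on 2 processors; the names (`tasks`, `edges`, `data_volume`, `exec_time`) and all numbers are illustrative, not from the thesis:

```python
# Sketch of the 4-tuple (T, E, D, [Ai]); names and values are illustrative.

tasks = ["t1", "t2", "t3"]                  # T: the set of tasks
edges = [("t1", "t2"), ("t1", "t3")]        # E: precedence/communication edges

# D: data volume d_ij that t_i transmits to t_j (entries absent where no edge)
data_volume = {
    ("t1", "t2"): 4,
    ("t1", "t3"): 2,
}

# [Ai]: exec_time[t][u] = execution time of task t on processor p_u
exec_time = {
    "t1": [14, 16],
    "t2": [13, 19],
    "t3": [11, 13],
}

# Derive each task's parents from the edge list
parents = {t: [a for (a, b) in edges if b == t] for t in tasks}
print(parents["t2"])   # ['t1']
```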
6
Directed Acyclic Graph -- example execution-time table:

Task No.  P1  P2  P3
   1      14  16   9
   2      13  19  18
   3      11  13  19
   4      13   8  17
   5      12  13  10
   6      13  16   9
   7       7  15  11
   8       5  11  14
   9      18  12  20
  10      21   7  16
7
DAG-Scheduling Basic Assumptions
Any processor can execute the task and communicate with other
machines at the same time.
Each processor can only execute one process at each moment.
Graph is fully connected.
Once a processor has started task execution it continues without
interruption, and after completing the execution it immediately sends
the output data to children tasks in parallel.
Intra-processor communication cost is negligible compared to the
inter-processor communication cost.
8
DAG-Scheduling
1. Task Selection and Schedule Phase
Task Selection phase: tasks are selected according to their height in the DAG.
Calculation of a task's start and finish time:
ST(ti, pu) = max( PAT(ti, pu), max over parent tasks tk of DAT(ti, tk, pu) )
FT(ti, pu) = ST(ti, pu) + ET(ti, pu)
where PAT(ti, pu) = processor available time of pu,
DAT(ti, tk, pu) = data available time on pu of ti's input from parent tk,
ET(ti, pu) = execution time of ti on pu.
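A minimal sketch of the start/finish-time rule above: the start time is the later of the processor-available time and the latest data-available time over the task's parents, and the finish time adds the execution time. The dictionaries `parents`, `pat`, `dat`, and `et` are illustrative stand-ins, not structures from the thesis:

```python
# Hedged sketch of ST/FT; all names and numbers here are illustrative.

parents = {"t3": ["t1", "t2"]}          # t3 receives data from t1 and t2

def start_time(task, proc, pat, dat):
    # ST(ti, pu) = max(PAT(ti, pu), max_k DAT(ti, tk, pu))
    arrivals = [dat[(k, task, proc)] for k in parents.get(task, [])]
    return max(pat[proc], max(arrivals, default=0))

def finish_time(task, proc, pat, dat, et):
    # FT(ti, pu) = ST(ti, pu) + ET(ti, pu)
    return start_time(task, proc, pat, dat) + et[task][proc]

pat = {"p1": 5}                                          # p1 is free from time 5
dat = {("t1", "t3", "p1"): 9, ("t2", "t3", "p1"): 7}     # data arrival times
et = {"t3": {"p1": 4}}                                   # ET(t3, p1)
print(finish_time("t3", "p1", pat, dat, et))             # max(5, 9) + 4 = 13
```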
9
DAG-Scheduling
2. Scheduling Encoding (Chromosome)
A string is a candidate solution for the problem. A string consists of several lists, each associated with a processor.
Suppose for an application of 10 tasks the generated schedule is:
• Processor 1: t3 t4 t8
• Processor 2: t5 t7 t9
• Processor 3: t0 t1 t2 t6
Then the chromosome can be represented as a matrix of size [No. of tasks x No. of processors]:

P1  P2  P3
t3  t5  t0
t4  t7  t1
t8  t9  t2
        t6
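The chromosome above maps naturally to a list of per-processor task lists (a ragged list is a convenient stand-in for the padded matrix). This sketch just checks the defining invariant, that every task appears exactly once:

```python
# The example chromosome as per-processor task lists (illustrative encoding).
chromosome = [
    ["t3", "t4", "t8"],          # P1
    ["t5", "t7", "t9"],          # P2
    ["t0", "t1", "t2", "t6"],    # P3
]

# Invariant: every task appears exactly once across all processor lists.
all_tasks = [t for proc in chromosome for t in proc]
assert sorted(all_tasks) == sorted(f"t{i}" for i in range(10))
print(len(all_tasks))   # 10
```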
10
DAG-Scheduling
3. Initialization
A population of size POP_SIZE is initialized.
4. Fitness Function
The GA requires a fitness function that assigns a score to each chromosome in the population. The fitness function in a GA is the objective function to be optimized. In the proposed algorithm, the fitness function returns the time at which all tasks in the DAG complete their executions; that is, the fitness f of a string x is the makespan of the schedule that x encodes.
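A hedged sketch of that fitness: the makespan is the time at which the last processor finishes its task list. For brevity this sketch ignores communication delays, and `exec_time` is an illustrative table, not data from the thesis:

```python
# Illustrative per-task execution times (communication ignored in this sketch).
exec_time = {"t0": 3, "t1": 2, "t2": 4, "t3": 5}

def fitness(chromosome):
    # Time when all tasks in the schedule complete: the busiest processor's load.
    return max(sum(exec_time[t] for t in proc_list) for proc_list in chromosome)

print(fitness([["t0", "t1"], ["t2", "t3"]]))   # max(3+2, 4+5) = 9
```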
11
DAG-Scheduling
5. Roulette-Wheel Selection
Roulette-wheel selection is used for selecting potentially useful solutions for recombination (crossover). The probability of a chromosome being selected is proportional to its fitness.
Example: four chromosomes with fitness values 0.5, 1.5, 4 and 2, so the sum of fitness = 8. Rand(8) = 3 falls in the slice of chromosome 3 (cumulative range 2 to 6), so chromosome 3 is the parent.
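The worked example (fitness values 0.5, 1.5, 4, 2) can be sketched as follows; the deterministic `Fixed` spinner is only there to reproduce the slide's Rand(8) = 3:

```python
import random

fitnesses = [0.5, 1.5, 4.0, 2.0]           # sum = 8

def roulette_select(fitnesses, rng=random):
    # Spin a wheel whose slices are proportional to fitness.
    spin = rng.uniform(0, sum(fitnesses))   # Rand(8)
    cum = 0.0
    for i, f in enumerate(fitnesses):
        cum += f
        if spin <= cum:
            return i
    return len(fitnesses) - 1

class Fixed:                                # deterministic stand-in for random
    def uniform(self, a, b):
        return 3.0

# Cumulative slices are 0.5, 2.0, 6.0, 8.0; spin = 3 lands in chromosome 3's.
print(roulette_select(fitnesses, Fixed()) + 1)   # 3
```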
12
DAG-Scheduling
6. Crossover
A new chromosome is generated with this operator. A parent chromosome is selected by the roulette-wheel operator, and then two processors are selected from this chromosome. Single-point crossover is applied to the task lists of the two selected processors.
Figure: Modified Single Point Crossover
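A sketch of this modified single-point crossover: two processor lists from one parent swap their tails at a crossover point, so every task still appears exactly once in the child. The processor indices and the point are fixed here for illustration; the algorithm chooses them at random:

```python
def single_point_crossover(chrom, p_a, p_b, point):
    # Swap the tails of two processor task lists within one chromosome.
    child = [list(proc) for proc in chrom]
    child[p_a] = chrom[p_a][:point] + chrom[p_b][point:]
    child[p_b] = chrom[p_b][:point] + chrom[p_a][point:]
    return child

parent = [["t3", "t4", "t8"], ["t5", "t7", "t9"], ["t0", "t1", "t2", "t6"]]
child = single_point_crossover(parent, 0, 1, 1)
print(child[0], child[1])   # ['t3', 't7', 't9'] ['t5', 't4', 't8']
```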
13
DAG-Scheduling
7. Mutation
The mutation operation is designed to reduce the idle time of a processor waiting for data from other processors.
The Data Dominating Parent (DDP) of a task ti is the parent task that transmits the largest volume of data to ti; that is, DDP(ti) = tk such that dki = max over parents tj of ti of dji.
Figure: Mutation example.
14
DAG-Scheduling
8. Termination Conditions
Condition 1: If we find an individual whose makespan is less than the specified minimum, the GA stops evolving.
Condition 2: The variable gen stores how many generations the GA should run; the user inputs it on every run. When the generation count exceeds gen, the GA stops evolving.
15
DAG-Scheduling
9. Pseudo Code

Begin
  initialize P(k);   {create an initial population of size POP_SIZE}
  evaluate P(k);     {evaluate the fitness of all the chromosomes}
  Repeat
    For i = 1 to POP_SIZE do
      Select a chromosome as parent from the population;
      Child1 <= Crossover(parent);
      Child2 <= Mutation(Child1);
      Add(new temporary population, Child1, Child2);
    End For;
    Make(new population, new temporary population, old population);
    Population = new population;
  While (not termination condition);
  Select the best chromosome in the population as the solution and return it;
End
16
Dynamic Load Balancing
1. Basic Definitions
Load: the load of a processor is the sum of the execution times of the processes allocated to it.
Maxspan: the maximal finishing time over all processes,
maxspan(T) = max{ Load(pi) : 1 <= i <= number of processors }.
Processor Utilization: the ratio of Load(pi) to maxspan.
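The three definitions above fit in a few lines; the per-processor loads here are illustrative numbers, not results from the thesis:

```python
# Load(pi): sum of execution times of the processes on pi (illustrative values).
loads = {"p1": 40, "p2": 30, "p3": 50}

# maxspan(T) = max over i of Load(pi)
maxspan = max(loads.values())

# Utilization(pi) = Load(pi) / maxspan
utilization = {p: load / maxspan for p, load in loads.items()}

print(maxspan)              # 50
print(utilization["p1"])    # 0.8
```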
17
Dynamic Load Balancing
2. Basic Assumptions
Each processor can only execute one process at each moment.
Tasks are non-preemptive.
Tasks are totally independent, i.e. no data transfer takes place among tasks and there are no precedence relations.
Heterogeneity of processors is defined by a multiplying factor x. If the 1st processor's speed is P1, then the ith processor's speed is
Pi = (1 + (i-1)*x) * P1
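For example, the speed formula gives a linear ramp of processor speeds; the values x = 0.5 and P1 = 10 below are illustrative:

```python
# Pi = (1 + (i-1)*x) * P1, for four processors (x and P1 are illustrative).
x, p1 = 0.5, 10.0
speeds = [(1 + (i - 1) * x) * p1 for i in range(1, 5)]
print(speeds)   # [10.0, 15.0, 20.0, 25.0]
```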
18
Dynamic Load Balancing
3. Sliding Window Technique
How many tasks are selected at a time from the pool of tasks is decided by the size of the sliding window.
The size is input by the user. The number of tasks in a chromosome equals the size of the sliding window.
The sliding window contains task IDs.
Example: a sliding window of size 10:
1 2 3 4 5 6 7 8 9 10
19
Dynamic Load Balancing
4. Scheduling Encoding (Chromosome)
The chromosome is a 2D matrix of size [no. of processors x size of sliding window].
Figure: Chromosome Representation
5. Initialization
A population of size POP_SIZE is initialized by randomly assigning tasks to processors.

P1  P2  P3
3   5   0
4   7   1
8   9   2
        6
20
Dynamic Load Balancing
6. Fitness Function
The fitness function attaches a value to each chromosome in the population, which indicates the quality of the schedule.
7. Roulette-Wheel Selection
Roulette-wheel selection is used, as described in the DAG-Scheduling section.
21
Dynamic Load Balancing
8. Cycle Crossover
Single-point crossover cannot be used in this GA, as it may cause some tasks to be assigned more than once while others are not assigned at all.
A new crossover operator, cycle crossover, is used instead. It works as follows:

Parents:
A  =  8  6  4 10  9  7  1  5  3  2
B  = 10  2  3  5  6  9  8  7  4  1
Step 1 (copy the first gene of each parent into its child):
A' =  8  -  -  -  -  -  -  -  -  -
B' = 10  -  -  -  -  -  -  -  -  -
Step 2 (complete the cycle, copying from the same parent):
A' =  8  6  - 10  9  7  1  5  -  2
B' = 10  2  -  5  6  9  8  7  -  1
Step 3 (fill the remaining positions from the other parent):
A' =  8  6  3 10  9  7  1  5  4  2
B' = 10  2  4  5  6  9  8  7  3  1
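The three steps above are standard cycle crossover, which can be sketched as below. This reproduces the worked example (reading B's third gene as 3, so that both parents are permutations of 1..10):

```python
def cycle_crossover(a, b):
    # Trace the cycle starting at position 0: each child keeps its own
    # parent's genes on the cycle positions and takes the other parent's
    # genes everywhere else, so both children stay permutations.
    n = len(a)
    in_cycle = [False] * n
    i = 0
    while not in_cycle[i]:
        in_cycle[i] = True
        i = a.index(b[i])    # follow b's value back into a
    child_a = [a[i] if in_cycle[i] else b[i] for i in range(n)]
    child_b = [b[i] if in_cycle[i] else a[i] for i in range(n)]
    return child_a, child_b

A = [8, 6, 4, 10, 9, 7, 1, 5, 3, 2]
B = [10, 2, 3, 5, 6, 9, 8, 7, 4, 1]
ca, cb = cycle_crossover(A, B)
print(ca)   # [8, 6, 3, 10, 9, 7, 1, 5, 4, 2]
print(cb)   # [10, 2, 4, 5, 6, 9, 8, 7, 3, 1]
```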
22
Dynamic Load Balancing
9. Random Swap Mutation
A random-swap mutation is applied to each newly generated child.
Two processors are selected at random from the processor list; they must be different.
Then one random task is selected from each of the two processors, and the two tasks are swapped.
Example (task 8 on P1 is swapped with task 1 on P3):

Before mutation:        After mutation:
P1  P2  P3              P1  P2  P3
3   5   0               3   5   0
4   7   1               4   7   8
8   9   2               1   9   2
        6                       6
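A sketch of random-swap mutation under the description above: pick two different processors, one random task from each, and swap them. Every task still appears exactly once afterwards; the chromosome values are illustrative:

```python
import random

def random_swap(chrom, rng=random):
    # Pick two different non-empty processors, then swap one random
    # task from each between them.
    child = [list(p) for p in chrom]
    p1, p2 = rng.sample([i for i, p in enumerate(child) if p], 2)
    i = rng.randrange(len(child[p1]))
    j = rng.randrange(len(child[p2]))
    child[p1][i], child[p2][j] = child[p2][j], child[p1][i]
    return child

before = [[3, 4, 8, 6], [5, 7, 9], [0, 1, 2]]
after = random_swap(before, random.Random(0))
# The mutation preserves the task set: each ID still appears exactly once.
print(sorted(t for p in after for t in p))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```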
23
Dynamic Load Balancing
10. Termination Condition
The variable gen stores how many generations the GA should run. When the generation count exceeds gen, the GA stops evolving.
11. Task Allocation and Updating the Window
When the termination condition is met, the fittest chromosome is assigned to the final schedule.
The window is then filled up again by sliding along the subsequent tasks waiting in the task pool.
24
Dynamic Load Balancing
12. Pseudo Code

Begin
  Repeat
    save the tasks into the sliding window;
    initialize P(k);   {create an initial population of size POP_SIZE}
    evaluate P(k);     {evaluate the fitness of all the chromosomes}
    Repeat
      For i = 1 to POP_SIZE do
        Select two chromosomes as parents from the population;
        Child1, Child2 <= Crossover(parent1, parent2);
        Child3 <= Mutation(Child1);
        Child4 <= Mutation(Child2);
        Add(new temporary population, Child1, Child2, Child3, Child4);
      End For;
      Make(new population, new temporary population, old population);
      Population = new population;
    While (not termination condition);
    Assign the best chromosome in the population to the final schedule;
  While (task pool has more tasks);
End
25
Experimental Results and Discussion
1. Dynamic Load Balancing
1. Test Parameters
The performance of the proposed algorithm was measured by two metrics: total completion time (makespan) and average processor utilization. The calculation of these metrics depends on the following parameters and their default values:
Population Size ( POP_SIZE ) 100
Sliding window size ( sizeSlidingWindow ) 10
No. of Generations ( gen ) 100
No. of Processors ( no_of_Proc ) 10
No. of Tasks ( no_of_Tasks ) 100
26
Experimental Results and Discussion
1. Dynamic Load Balancing
2. Changing the Population Size
The population sizes ranged from 20 to 200. It was observed that increasing the population does not improve performance beyond a certain limit: above a population size of about 120, the completion time is approximately constant.
Increasing the population size had a positive effect on processor utilization.
27
Experimental Results and Discussion
1. Dynamic Load Balancing
3. Changing the No. of Generation Cycles
The number of generation cycles was changed from 1 to 500. As the number of generation cycles was increased, the quality of the schedule also increased: the total completion time was significantly reduced as the number of generations was increased from 1 to 50.
Increasing the number of generations also had a positive effect on processor utilization.
28
Experimental Results and Discussion
1. Dynamic Load Balancing
4. Changing the No. of Processors
The number of processors was changed from 2 to 20. As the number of processors was increased for the same number of tasks, the completion time decreased, because the system now has more processing elements.
As the number of processors was increased, average processor utilization decreased.
29
Experimental Results and Discussion
1. Dynamic Load Balancing
5. Changing the No. of Tasks
The number of tasks was varied from 10 to 1000. As the number of tasks increases, the total completion time also increases. When the number of tasks is large, average utilization is more than 96%.
30
Experimental Results and Discussion
1. Dynamic Load Balancing
6. Changing the Sliding Window
The sliding window size was changed from 2 to 50, and the effect on completion time and average processor utilization was measured.
31
Experimental Results and Discussion
2. DAG-Scheduling
1. Test Parameters
The performance of the DAG scheduling algorithm was measured by speedup. The speedup value for a given graph is computed by dividing the sequential execution time by the parallel execution time.
Speedup for the proposed DAG scheduling algorithm depends on the following parameters:
No. of Generations
32
Experimental Results and Discussion
2. DAG-Scheduling
2. Changing the No. of Generation Cycles
The number of generations was varied from 1 to 1000. As the number of generation cycles was increased, the quality of the schedule also increased. After 250 generations, running the GA further does not seem to improve performance much.
33
Experimental Results and Discussion
2. DAG-Scheduling
3. Changing the No. of Tasks
The number of tasks was varied from 10 to 60.
34
Conclusion
The results generated by the proposed dynamic load-balancing mechanism using a Genetic Algorithm were very good when the number of tasks is large.
The average processor utilization of the proposed algorithm was found to be more than 97-98%.
The complete genetic algorithm for DAG scheduling was implemented and tested on various input task graphs in a heterogeneous system.
The proposed DAG-Scheduling algorithm gives the best speedup when the number of generation cycles is more than 250.
35
References
1) Albert Y. Zomaya, Chris Ward, and Ben Macey, "Genetic Scheduling for Parallel Processor Systems: Comparative Studies and Performance Issues", IEEE Transactions on Parallel and Distributed Systems, Vol. 10, No. 8, August 1999.
2) Andrew J. Page, Thomas M. Keane, and Thomas J. Naughton, "Multi-heuristic dynamic task allocation using genetic algorithms in a heterogeneous distributed system", Journal of Parallel and Distributed Computing, Vol. 70, No. 7, July 2010, pp. 758-766.
3) Yuan Yuan and Huifeng Xue, "Modified Genetic Algorithm for Task Scheduling in Multiprocessor Systems", Jisuanji Celiang yu Kongzhi / Computer Measurement & Control (China), Vol. 13, No. 5, pp. 488-490, May 2005.
4) Y.K. Kwok and I. Ahmad, "Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors", IEEE Transactions on Parallel and Distributed Systems, Vol. 7, No. 5, pp. 506-521, May 1996.
5) A.T. Haghighat, K. Faez, M. Dehghan, A. Mowlaei, and Y. Ghahremani, "GA-based heuristic algorithms for bandwidth-delay-constrained least-cost multicast routing", Computer Communications, Vol. 27, 2004, pp. 111-127.
6) D.E. Goldberg, "Genetic Algorithms in Search, Optimization, and Machine Learning", Addison-Wesley, Reading, Mass., 1989.
7) Albert Y. Zomaya and Yee-Hwei Teh, "Observations on Using Genetic Algorithms for Dynamic Load-Balancing", IEEE Transactions on Parallel and Distributed Systems, Vol. 12, No. 9, September 2001.
8) H.C. Lin and C.S. Raghavendra, "A Dynamic Load-Balancing Policy with a Central Job Dispatcher (LBC)", IEEE Transactions on Software Engineering, Vol. 18, No. 2, pp. 148-158, Feb. 1992.
9) M. Munetomo, Y. Takai, and Y. Sato, "A Genetic Approach to Dynamic Load-Balancing in a Distributed Computing System", Proc. First Int'l Conf. on Evolutionary Computation, IEEE World Congress on Computational Intelligence, Vol. 1, pp. 418-421, 1994.
36
Thank you. Questions?