Machine Learning applied to Process Scheduling
Benoit Zanotti
[email protected]
www.lse.epita.fr
July 17, 2013

Plan
1 Introduction and definitions
2 Our target: CFS
3 What can we do?
4 Results and analysis
5 Conclusion

1 Introduction and definitions
    Machine Learning
    Process Scheduling

Definition of Machine Learning
Definition: Machine Learning is a field of Computer Science about the construction and study of systems that can learn from data.
Usual organization of ML algorithms:
- Supervised learning (classification, ...)
- Unsupervised learning (clustering, ...)
- Semi-supervised learning
- ...

Notes about Machine Learning
We won't really talk about the theory. But:
- Preprocessing is very important.
- There is usually a big tradeoff between speed and quality of results.
In process scheduling, both of these factors will be limiting.

What is Process Scheduling?
Definition: Process Scheduling is the method by which processes are given access to processor time. It is used to achieve multitasking.
There are many well-known scheduling algorithms. For example:
- First In, First Out
- Round-Robin (fixed time unit, processes in a circle)
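
As an illustration of the Round-Robin idea above, here is a minimal sketch of a simulation. The function name `round_robin`, the process names and the burst times are hypothetical and only serve the example; a real scheduler also has to deal with sleeping processes, priorities and so on.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate Round-Robin and return the executed (name, run_time) slices.

    burst_times: dict mapping a process name to the CPU time it still needs.
    quantum: the fixed time unit given to a process on each turn.
    """
    queue = deque(burst_times.items())  # the processes "in a circle"
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)  # a process may finish before its quantum ends
        timeline.append((name, run))
        if remaining > run:
            # Not finished yet: go back to the end of the circle.
            queue.append((name, remaining - run))
    return timeline


if __name__ == "__main__":
    print(round_robin({"A": 5, "B": 2, "C": 3}, quantum=2))
```

Each process gets at most one quantum per turn, so a long process cannot starve the short ones, at the price of more context switches.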

Main concerns
A scheduler is mainly judged on 3 metrics: throughput, latency and fairness. In practice, we can simplify them to:
- Speed (how much time the scheduler itself uses, number of context switches, ...)
- Fairness (giving equal CPU time to each process)
- Reactivity (are interactive processes given any advantage?)
A scheduler is complicated. Let's optimize one using ML!

2 Our target: CFS
    Inner workings
    Advantages/Drawbacks

Inner workings of CFS
- Stands for Completely Fair Scheduler
- Scheduler of Linux since 2.6.23
- Just an RB-tree (red-black tree) with elements indexed by the runtime of the process.
- Straightforward algorithm: just take the minimum of the tree.
CFS in the Linux kernel is actually more complicated (handling real-time tasks, nice values, ...)
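
The "take the minimum of the tree" idea can be sketched in a few lines. This is a toy model, not the kernel's code: the class `ToyCFS` and its methods are hypothetical names, and a binary heap (`heapq`) stands in for the RB-tree because both give cheap access to the element with the smallest runtime.

```python
import heapq


class ToyCFS:
    """Toy model of CFS: always run the process with the smallest runtime."""

    def __init__(self):
        # Entries are (runtime, name); heapq replaces the RB-tree here (assumption).
        self._heap = []

    def add(self, name, runtime=0):
        heapq.heappush(self._heap, (runtime, name))

    def pick_next(self):
        # The "minimum of the tree": the process that has run the least so far.
        return heapq.heappop(self._heap)

    def run(self, slice_length, steps):
        """Run `steps` scheduling decisions and return the order of execution."""
        order = []
        for _ in range(steps):
            runtime, name = self.pick_next()
            order.append(name)
            # Account for the time just used and reinsert the process.
            self.add(name, runtime + slice_length)
        return order


if __name__ == "__main__":
    sched = ToyCFS()
    for name in ("A", "B", "C"):
        sched.add(name)
    print(sched.run(slice_length=10, steps=6))
```

A process that has accumulated less runtime than the others is picked repeatedly until it catches up, which is where the fairness comes from.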

Why CFS?
- Quite simple and works really well
- Most familiar (I implemented one in mikro)
- Already efficient. I wanted to see what ML could do.

Advantages/Drawbacks
✓ Very simple to understand
✓ Works really well in general cases
✓ No real corner cases
✗ A little light on the handling of interactive processes.

3 What can we do?
    ML considerations
    Applying ML to the CFS

ML considerations
- Restricted to supervised learning (mainly classification and regression)
- The scheduler must be as fast as possible. Its ML components too.
- Avoiding complex code in the kernel is often a good idea.
→ precomputed model/profile for each process
→ no complex methods, so results will be mixed

Applying ML to the CFS
Objective: reducing the number of context switches:
- A process's time quantum should ideally not expire (the process goes to sleep instead)
- An estimate of the next quantum would help
- Based on the last N quanta
- Be careful not to be too unfair
Note: many other objectives were possible...

Actual implementation
- Proof of concept
- One using Taylor's theorem and one using a classifier
- Need to extract real runtime quanta and to create profiles

Taylor's theorem
- The sequence of quanta can be seen as a function of time.
- Taylor's theorem gives an approximation of a function at a point given its derivatives
- Discrete differentiation is only subtraction
→ an approximation of the next quantum is:

$$f(x + 1) \approx f(x) + f'(x - 1) + \frac{f''(x - 1)}{2}$$

This method is simple and fast, but not very precise.
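
The extrapolation can be sketched as follows. The slides do not say how the discrete derivatives are computed, so this sketch assumes finite differences over the last three quanta: f'(x - 1) is taken as f(x) - f(x - 1), and f''(x - 1) as the difference between the last two first differences. The function name `extrapolate_next` is hypothetical.

```python
def extrapolate_next(quanta):
    """Estimate the next quantum from the last three observed ones.

    Assumption (not stated in the slides): derivatives are approximated by
    finite differences, which only need subtractions.
    """
    if len(quanta) < 3:
        raise ValueError("need at least 3 past quanta")
    f2, f1, f0 = quanta[-3], quanta[-2], quanta[-1]
    d1 = f0 - f1                  # stands for f'(x - 1)
    d2 = (f0 - f1) - (f1 - f2)    # stands for f''(x - 1)
    return f0 + d1 + d2 / 2


if __name__ == "__main__":
    print(extrapolate_next([1, 2, 3]))  # linear sequence
    print(extrapolate_next([1, 4, 9]))  # quadratic sequence
```

On a linear sequence the estimate is exact; on `[1, 4, 9]` it gives 15 where the true next square is 16, which matches the remark that the method is cheap but not very precise.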

Classifier
Naive Bayes classifier using the last 4 quanta:
- It is the best compromise found between speed and results
- Parameters and output are ranges of time, not the actual values
- Based on Bayes' theorem. Outputs the label with the highest probability
- Only 4 multiplications are needed for each label (there are 10 of them).
- Using bit manipulation, we can avoid any conditionals
→ it is fast, but clearly not the most accurate
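
A minimal sketch of such a classifier is given below. It is not the code from the talk (the profiles there were computed with WEKA): the function names, the bucket width `width_us` and the Laplace smoothing are assumptions made for the example. Only the overall shape follows the slide: 4 past quanta as features, 10 time ranges as labels, and 4 multiplications per label at prediction time.

```python
N_LABELS = 10  # number of time ranges, as on the slide
HISTORY = 4    # number of past quanta used as features


def bucket(quantum_us, width_us=1000):
    """Map a duration to one of N_LABELS ranges (the width is an assumption)."""
    return min(quantum_us // width_us, N_LABELS - 1)


def train(sequence):
    """Precompute probability tables from a sequence of bucketed quanta."""
    # Counts start at 1 (Laplace smoothing) so that no probability is zero.
    prior = [1] * N_LABELS
    # cond[i][label][value]: how often feature i had `value` before `label`.
    cond = [[[1] * N_LABELS for _ in range(N_LABELS)] for _ in range(HISTORY)]
    for t in range(HISTORY, len(sequence)):
        label = sequence[t]
        prior[label] += 1
        for i in range(HISTORY):
            cond[i][label][sequence[t - HISTORY + i]] += 1
    total = sum(prior)
    prior_p = [count / total for count in prior]
    cond_p = [[[v / sum(row) for v in row] for row in feature] for feature in cond]
    return prior_p, cond_p


def predict(model, last_quanta):
    """Return the most probable range for the next quantum."""
    prior_p, cond_p = model
    best_label, best_score = 0, -1.0
    for label in range(N_LABELS):
        score = prior_p[label]
        for i in range(HISTORY):  # the 4 multiplications per label
            score *= cond_p[i][label][last_quanta[i]]
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train([2, 7] * 20)  # a process alternating short and long quanta
    print(predict(model, [2, 7, 2, 7]))
```

The tables play the role of the precomputed per-process profile: training happens offline, and the scheduler only performs table lookups and multiplications.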

4 Results and analysis
    perf and Linsched
    Methodology and results
    Analysis

perf
- Performance analysis tools for Linux
- Based on kernel-based performance counters
- Can be used to extract many scheduling stats

Linsched
Linux Scheduler Simulator (in userland...)
✓ Easy to use (development cycle, debugging, ...) and fast
✓ Can replay records from perf
✗ Hard to quantify how much time is used by the scheduler

Methodology of the tests
- Use perf to extract records and datasets
- Use WEKA to compute profiles for each process
- Test using vanilla/modified linsched to see the gain
- Time the tests of vanilla/modified linsched to estimate how costly each method is

Results
Results of the simulation (without scheduler time)
Time used (base = 100):
    Vanilla        100
    Extrapolation   98
    Classifier      95

Results
Results of the simulation (with scheduler time)
Time used (base = 100):
    Vanilla        100
    Extrapolation  102
    Classifier      98

Analysis
- CFS is already quite good
- ML results are positive but very limited
- More complex preprocessing/ML techniques would yield better results... but at what cost?

5 Conclusion
Conclusion
- It was only one idea on one objective.
- Using ML in scheduling is hard, because of the speed/results tradeoff
- Difficulties for a real kernel integration (passing the models, limiting abuses, ...)
- Basic rule in scheduling: "Simpler is Better"
- Another idea: run a (kernel?) process every X hours to compute new profiles...

Reference: K. Kumar Pusukuri, A. Negi, Applying machine learning techniques to improve Linux process scheduling, Dec. 2005.

Questions?