Cross-Layer Scheduling in Cloud Computing Systems
Authors: Hilfi Alkaff, Indranil Gupta
Motivation
• Many cloud computing frameworks exist
  – Batch processing framework: Hadoop
  – Stream processing framework: Storm
• Current applications are not aware of the underlying network topology
  – Might schedule tasks on machines with low bandwidth
Challenges
• Need to expose underlying network topology efficiently to applications
• Huge state space to search
  – Thousands of machines in a cluster
  – Users demand more interactive jobs
• Multiple possible data-path representations
  – Want to have generic schedulers
Data-Path: Map-Reduce
Data-Path: Stream
Proposed Solution
• Cross-Layer Scheduling Framework
  – First-level scheduler at the application level
  – Second-level scheduler at the routing level
• Use Simulated Annealing at each level
  – Probabilistic framework
  – Idea: if the neighboring state is better, always move there; if it is not, move there with probability P(T), which decreases with time
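The acceptance rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the exponential (Metropolis-style) form of P(T), the geometric cooling schedule, and the names `accept_probability`, `anneal`, `cost`, and `gen_state` are all assumptions made for the example.

```python
import math
import random


def accept_probability(current_cost, neighbor_cost, temperature):
    """Probability of moving to the neighboring state (lower cost is better)."""
    if neighbor_cost <= current_cost:
        return 1.0  # a neighbor that is no worse is always accepted
    if temperature <= 0:
        return 0.0
    # Assumed Metropolis form: shrinks as the temperature drops over time.
    return math.exp((current_cost - neighbor_cost) / temperature)


def anneal(initial_state, cost, gen_state, t_start=1.0, cooling=0.95,
           steps=200, rng=None):
    """Generic annealing loop; cost and gen_state are supplied by the caller."""
    rng = rng or random.Random(0)
    state, best, temperature = initial_state, initial_state, t_start
    for _ in range(steps):
        neighbor = gen_state(state, rng)
        p = accept_probability(cost(state), cost(neighbor), temperature)
        if rng.random() < p:
            state = neighbor
        if cost(state) < cost(best):
            best = state  # remember the best state seen so far
        temperature *= cooling  # assumed geometric cooling schedule
    return best


# Toy usage: walk along the integers toward the minimum of (x - 3)^2.
result = anneal(10, lambda x: (x - 3) ** 2,
                lambda s, rng: s + rng.choice([-1, 1]))
```

The same loop could in principle serve both levels, with only `cost` and `gen_state` differing between the application-level and routing-level schedulers.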
Proposed Architecture
[Figure: architecture diagram with components labeled Application Master, SDN Controller, and Cross-Layer Scheduling]
Algorithm: Pre-computation
• Pre-compute all-pairs k-shortest paths
  – Stored in the Topology-Map hash table with key=node pair, value=array of k-shortest paths
• Too many duplicates
  – Intelligently merge similar sub-paths
  – Hash table's value is now a tree instead of an array
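A rough sketch of the pre-computation step follows. Several things here are assumptions for illustration only: paths are ranked by hop count, the k-shortest-paths routine is a naive breadth-first enumeration rather than whatever algorithm the authors used, "merging similar sub-paths" is interpreted as merging shared prefixes into a nested-dict tree, and the four-node topology is made up.

```python
from collections import deque


def k_shortest_paths(adj, src, dst, k):
    """Return up to k loop-free paths from src to dst, fewest hops first.

    Naive breadth-first enumeration; only suitable for toy graphs.
    """
    found, queue = [], deque([[src]])
    while queue and len(found) < k:
        path = queue.popleft()
        if path[-1] == dst:
            found.append(path)
            continue
        for nxt in sorted(adj[path[-1]]):
            if nxt not in path:  # keep paths loop-free
                queue.append(path + [nxt])
    return found


def merge_into_tree(paths):
    """Merge paths that share a prefix into one nested-dict tree."""
    root = {}
    for path in paths:
        node = root
        for hop in path:
            node = node.setdefault(hop, {})
    return root


def build_topology_map(adj, k):
    """Map every ordered node pair to a tree of its k shortest paths."""
    return {
        (s, d): merge_into_tree(k_shortest_paths(adj, s, d, k))
        for s in adj for d in adj if s != d
    }


# Hypothetical 4-node topology with links A-B, A-C, B-C, B-D, C-D.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
topology_map = build_topology_map(adj, k=3)
paths_a_d = k_shortest_paths(adj, "A", "D", 3)
```

In this toy case the paths A-B-D and A-B-C-D share the prefix A-B, so the tree stores that prefix once instead of twice.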
Algorithm: Main
Algorithm: genState() Heuristic
• Too many neighboring states
  – Not possible to traverse all of them
• Application level
  – Prefer nodes that have a higher number of sink vertices
  – Prefer nodes that have a higher number of source vertices
• Routing level
  – Prefer paths that have a lower number of hops
  – Prefer paths that have a higher amount of available bandwidth
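One possible way to encode the routing-level preferences is to sample the next candidate path with a bias toward fewer hops and more available bandwidth. The weight formula (bottleneck bandwidth divided by hop count), the function names, and the bandwidth figures below are assumptions chosen for the example; the slides do not specify how the two preferences are combined.

```python
import random


def path_weight(path, available_bw):
    """Heuristic weight: grows with bottleneck bandwidth, shrinks with hops."""
    links = list(zip(path, path[1:]))
    bottleneck = min(available_bw[link] for link in links)
    return bottleneck / len(links)


def pick_path(candidates, available_bw, rng):
    """Sample one candidate path, biased toward higher heuristic weight."""
    weights = [path_weight(p, available_bw) for p in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical available bandwidth per directed link (arbitrary units).
available_bw = {
    ("A", "B"): 10.0, ("B", "D"): 10.0,
    ("A", "C"): 2.0, ("C", "D"): 10.0,
    ("B", "C"): 10.0,
}
candidates = [["A", "B", "D"], ["A", "C", "D"], ["A", "B", "C", "D"]]
weights = [path_weight(p, available_bw) for p in candidates]
chosen = pick_path(candidates, available_bw, random.Random(0))
```

Sampling instead of always taking the top-weighted path keeps some randomness in the neighbor generation, which simulated annealing relies on to escape local optima.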
Emulab Result: Throughput
Simulation Result: Computation Time
Simulation Results: CDF
Questions?
Algorithm: Failures
• Link failures
  – Need to re-allocate flows using that link
  – Keep a separate hash table where key=edge, value=flows
  – Get another path from Topology-Map
• Machine failures
  – Re-run main algorithm on
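The link-failure handling above can be sketched as follows. This is an illustrative assumption, not the authors' code: the `FlowTable` class and its method names are hypothetical, links are treated as undirected, and for brevity the Topology-Map holds a plain list of candidate paths per node pair rather than the merged tree.

```python
from collections import defaultdict


class FlowTable:
    """Tracks which flows use each link so a failure can be handled quickly."""

    def __init__(self, topology_map):
        self.topology_map = topology_map    # (src, dst) -> candidate paths
        self.flow_path = {}                 # flow id -> current path
        self.edge_flows = defaultdict(set)  # edge -> ids of flows using it

    @staticmethod
    def edges(path):
        """Undirected links traversed by a path."""
        return [frozenset(link) for link in zip(path, path[1:])]

    def assign(self, flow, path):
        """Place a flow on a path, updating the edge -> flows table."""
        for edge in self.edges(self.flow_path.get(flow, [])):
            self.edge_flows[edge].discard(flow)
        self.flow_path[flow] = path
        for edge in self.edges(path):
            self.edge_flows[edge].add(flow)

    def on_link_failure(self, u, v):
        """Move every flow on the failed link to another stored path."""
        failed = frozenset((u, v))
        moved = {}
        for flow in list(self.edge_flows[failed]):
            path = self.flow_path[flow]
            for alt in self.topology_map[(path[0], path[-1])]:
                if failed not in self.edges(alt):
                    self.assign(flow, alt)
                    moved[flow] = alt
                    break
            # Flows with no surviving alternative are left untouched here.
        return moved


# Toy usage with a hypothetical two-path Topology-Map entry.
table = FlowTable({("A", "D"): [["A", "B", "D"], ["A", "C", "D"]]})
table.assign("f1", ["A", "B", "D"])
moved = table.on_link_failure("B", "D")
```

Keeping the edge-to-flows table means only the flows that actually cross the failed link are touched; everything else keeps its current path.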