Data Center Load Balancing T-106.5840 Seminar Kristian Hartikainen Aalto University, Helsinki, Finland 9.12.2015

Data Center Load Balancing

T-106.5840 Seminar
Kristian Hartikainen
Aalto University, Helsinki, Finland
9.12.2015

Load Balancing

• Efficient distribution of the workload across the available computing resources
  – Distributing computation over multiple CPU cores
  – Distributing network requests across multiple servers
  – And many others...
• The goal is efficient resource usage to optimize the desired performance metrics
  – Maximizing network throughput
  – Minimizing latency
  – And many others...

Data Center Load Balancing

• Load balancing problems arise in several (computing) contexts
• Our focus is on data center load balancing
• Data center load balancing also consists of several different levels
  – Network traffic, CPUs inside servers, servers, server racks, server clusters, between data centers
• We studied load balancing of network traffic and virtual servers

MOTIVATION

The Free Lunch Is Over

Herb Sutter: The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software. Dr. Dobb’s Journal, 30(3), March 2005 (updated graph in August 2009).

• Single-threaded performance has hit the wall
• The number of transistors in microprocessors is still growing

Amdahl’s Law

• Clear limitations in the speed-up gains that parallel programs can achieve
• Thus, keeping the resources utilized is challenging

https://en.wikipedia.org/wiki/Amdahl%27s_law
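
To make the limitation concrete, the following is a minimal sketch of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of workers. The function and parameter names are illustrative and do not come from the slides.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup when `parallel_fraction` of the work is spread over `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is never sped up; only the parallel part shrinks with more workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on the speedup as the number of workers grows without bound."""
    if not 0.0 <= parallel_fraction < 1.0:
        raise ValueError("parallel_fraction must be in [0, 1)")
    return 1.0 / (1.0 - parallel_fraction)
```

For a program that is 95% parallelizable, the bound is 1 / 0.05 = 20, so no number of cores yields more than a 20x speedup, which is why keeping many cores busy is hard.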

Towards Data Center Computing

• At the same time, the number of mobile devices and sensors, and the amount of data transferred in general, have proliferated
• Data center technology and virtualization have been developing rapidly as well
• Data centers are becoming larger and more common
  – Large companies are building their own data centers
  – Many other companies are moving their computation, storage and operations to the cloud

Large Scale Data Centers

• Data centers provide many advantages over traditional computing
  – Economies of scale
  – Enables use of cheap commodity hardware
  – Cheap hardware, cooling, electricity, network, etc...

• However, to efficiently utilize data center resources, and to provide the required performance guarantees, efficient load balancing mechanisms are needed on different levels of the data center

NETWORK TRAFFIC LOAD BALANCING

Network Traffic Load Balancing

• Today’s data centers are huge
  – Hundreds of thousands of servers
  – Supporting a huge number of services
    • Big data apps
    • Web services
    • High performance computing
• Network traffic grows
  – Both inter and intra data center traffic
• Network bandwidth is one of the major bottlenecks in data centers

Data Center Networks

• Traditional data center network topologies are single-rooted trees
• Limited port density (even in the highest-end switches) forces the data center topology to take the form of a multi-rooted tree
• For example fat-tree or leaf-spine

Data Center Networks

• The problem: How to efficiently utilize the theoretical bandwidth gains of the multi-rooted design?

Flow Hashing

• Most of today’s load balancing mechanisms are based on flow hashing
  – E.g. Equal-Cost Multi-Path (ECMP) forwarding
• Basic idea: split the packet flows randomly across multiple network paths
  – E.g. by hashing the packet header (e.g. 5-tuple)
• ECMP
  – Forwarding decisions are made hop-by-hop
  – All the routes are equal cost
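
The hashing step can be sketched as follows. This is an illustration only: the `FiveTuple` type, the `ecmp_next_hop` function and the use of SHA-256 are assumptions made for readability, whereas real switches compute a much cheaper hash in hardware.

```python
import hashlib
from typing import NamedTuple, Sequence


class FiveTuple(NamedTuple):
    """Header fields that identify a flow (hypothetical representation)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def ecmp_next_hop(flow: FiveTuple, next_hops: Sequence[str]) -> str:
    """Pick one of the equal-cost next hops by hashing the flow's 5-tuple."""
    if not next_hops:
        raise ValueError("at least one next hop is required")
    key = "|".join(str(field) for field in flow).encode()
    # SHA-256 is a stand-in for the hardware hash; any deterministic hash works here.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]
```

Because the choice depends only on the header fields, every packet of a flow takes the same next hop at a given switch, and each switch along the path repeats the decision independently.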

Flow Hashing

• Pros:
  – Easy to implement
  – Good performance in ideal system conditions
  – Packets are automatically kept in order, which is crucial for certain protocols such as TCP
• Cons:
  – Hashing decisions are purely local
  – And totally unaware of the congestion state of the system
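
The downside can be illustrated with a small sketch that sums the bytes landing on each path when flows are placed by hash alone. The helper names and the flow identifiers are hypothetical, and SHA-256 again stands in for a switch's hardware hash.

```python
import hashlib
from collections import Counter
from typing import Dict, Mapping


def hash_to_path(flow_id: str, num_paths: int) -> int:
    """Map a flow identifier to a path index, ignoring flow size and path load."""
    digest = hashlib.sha256(flow_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_paths


def load_per_path(flow_bytes: Mapping[str, int], num_paths: int) -> Dict[int, int]:
    """Total bytes assigned to each path when every flow is placed by hash only."""
    loads = Counter({path: 0 for path in range(num_paths)})
    for flow_id, size in flow_bytes.items():
        loads[hash_to_path(flow_id, num_paths)] += size
    return dict(loads)
```

With five equally large flows and four paths, at least one path necessarily carries two of them, and the hash has no way to notice or correct this. Large flows that collide on one path stay there for their whole lifetime.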

Congestion Aware Load Balancing

• Several proposals for congestion-aware load balancing have been made to overcome the problems of hash-based methods
• Difficulties:
  – How to handle packet reordering?
  – Centralized vs. distributed systems?
  – How to implement a fast system without specialized hardware?
• A couple of examples: Hedera, Presto, CONGA

Hedera: Dynamic Flow Scheduling for Data Center Networks

Flowlets

• One of the problems in non-hash-based load balancing mechanisms is packet reordering

• Several solutions overcome this problem by making the load balancing decisions on a per-flow or per-flowlet basis instead of a per-packet basis

• A flowlet is a burst of packets belonging to the same flow that is separated from the other bursts of that flow by a gap large enough that sending the bursts over separate paths does not cause reordering problems
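
Flowlet detection can be sketched with a per-flow timestamp table. The class below is a hypothetical illustration, not code from any of the cited systems; `gap_threshold` is an assumed tuning parameter that would have to exceed the delay difference between the candidate paths for reordering to be avoided.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable


@dataclass
class FlowletTracker:
    """Assigns packets of a flow to numbered flowlets based on inter-packet gaps."""
    gap_threshold: float  # seconds of idle time that start a new flowlet (assumed value)
    _last_seen: Dict[Hashable, float] = field(default_factory=dict, init=False)
    _flowlet_id: Dict[Hashable, int] = field(default_factory=dict, init=False)

    def observe(self, flow_key: Hashable, timestamp: float) -> int:
        """Record a packet arrival and return the flowlet number it belongs to."""
        last = self._last_seen.get(flow_key)
        if last is None:
            # First packet of the flow starts flowlet 0.
            self._flowlet_id[flow_key] = 0
        elif timestamp - last > self.gap_threshold:
            # The flow was idle long enough: a new flowlet may safely take another path.
            self._flowlet_id[flow_key] += 1
        self._last_seen[flow_key] = timestamp
        return self._flowlet_id[flow_key]
```

A load balancer would make a fresh path decision only when the returned flowlet number changes, and keep all packets of one flowlet on the same path.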

CONGA: Distributed Congestion-Aware Load Balancing For Datacenters

• Distributed load balancing scheme
• Maintains the congestion state of each path in the leaf nodes
• Congestion information is carried directly in the hardware data plane of the switches (in the VXLAN virtualization overlay headers)
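
The idea of a leaf keeping per-path congestion state and steering traffic away from congested paths can be sketched as below. This is a heavily simplified illustration rather than CONGA's actual mechanism: the `LeafSwitch` class, the scalar metric in [0, 1] and the explicit `update_feedback` call are assumptions, whereas in CONGA the feedback travels in the overlay headers of data packets.

```python
from typing import Dict, Iterable, List


class LeafSwitch:
    """Keeps a congestion estimate per (destination leaf, uplink) pair."""

    def __init__(self, uplinks: Iterable[str]) -> None:
        self.uplinks: List[str] = list(uplinks)
        if not self.uplinks:
            raise ValueError("at least one uplink is required")
        # congestion[dest_leaf][uplink] -> latest reported metric, 0.0 = idle, 1.0 = saturated
        self.congestion: Dict[str, Dict[str, float]] = {}

    def update_feedback(self, dest_leaf: str, uplink: str, metric: float) -> None:
        """Store the most recent congestion report for a path."""
        self.congestion.setdefault(dest_leaf, {})[uplink] = metric

    def pick_uplink(self, dest_leaf: str) -> str:
        """Choose the uplink with the lowest known congestion towards `dest_leaf`."""
        table = self.congestion.get(dest_leaf, {})
        # Paths without feedback are assumed idle; ties go to the first uplink listed.
        return min(self.uplinks, key=lambda uplink: table.get(uplink, 0.0))
```

In a flowlet-based design, `pick_uplink` would be consulted once per new flowlet, so the decision tracks congestion without reordering packets inside a burst.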

CONGA: Distributed Congestion-Aware Load Balancing For Datacenters

• Pros:
  – High utilization of the network
  – Reacts fast to congestion
  – Fairly simple
• Cons:
  – Distributed load balancing systems are often slow
  – CONGA and many other distributed systems overcome this problem by using customized networking hardware
    • Makes deployment hard

Presto: Edge-based Load Balancing for Fast Datacenter Networks

• Load balancing mechanism implemented in the soft network edge (virtual switches)

• Routes small, near-uniform chunks of each flow (called flowcells in the Presto paper) through the network using a round-robin algorithm

• Solves the problems of hash-based algorithms
  – Works even in asymmetric topologies
  – Elephant flows do not cause problems
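
Spreading consecutive pieces of a flow over the available paths in round-robin order can be sketched as follows. The class, the `chunk_bytes` parameter and the path labels are hypothetical; the sketch shows only the path rotation and leaves out the receiver-side handling of reordering.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


class RoundRobinSpreader:
    """Sends each flow over the paths in turn, switching path every `chunk_bytes`."""

    def __init__(self, paths: Iterable[str], chunk_bytes: int) -> None:
        self.paths: List[str] = list(paths)
        if not self.paths:
            raise ValueError("at least one path is required")
        self.chunk_bytes = chunk_bytes
        # flow_key -> (index of current path, bytes already sent in the current chunk)
        self._state: Dict[Hashable, Tuple[int, int]] = {}

    def path_for(self, flow_key: Hashable, segment_bytes: int) -> str:
        """Return the path for the next segment of `flow_key`."""
        index, used = self._state.get(flow_key, (0, 0))
        if used > 0 and used + segment_bytes > self.chunk_bytes:
            # The current chunk is full: move on to the next path in the rotation.
            index = (index + 1) % len(self.paths)
            used = 0
        self._state[flow_key] = (index, used + segment_bytes)
        return self.paths[index]
```

Because every flow is cut into pieces of bounded size, a single large flow ends up spread over all paths instead of being pinned to one of them.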

Presto: Edge-based Load Balancing for Fast Datacenter Networks

• Pros:
  – Deals well with network failures and asymmetry
  – Fully implemented in software (~500 lines of code in Open vSwitch and ~900 lines of code in Linux Generic Receive Offload (GRO))
  – Thus easy to deploy
• Cons:
  – Too slow compared to HW solutions?

VIRTUAL SERVER LOAD BALANCING

Virtual Server Load Balancing

• Another part of the data center that needs to be balanced
• Goals and methods differ from network load balancing
  – The goal seems to be more about energy efficiency than pure speed-ups or scalability

Power Usage of Warehouse Scale Server

Figure: L. A. Barroso, J. Clidaras, and U. Hölzle, “The datacenter as a computer: An introduction to the design of warehouse-scale machines, second edition”

Server Load Balancing

• Cloud data center servers are often virtualized
• Virtual machine migration allows flexible movement of servers between physical hardware
• Migration brings overhead
• When and which virtual machine should be migrated, and where?
• How to develop an algorithm that scales?
• How to cope with heterogeneous allocation policies and different objectives?

Virtual Server Migration

• Distributed vs. centralized load balancing
  – Similarly as with network traffic load balancing
• Dynamic vs. static load balancing
• Metrics to make the migration decisions
  – CPU, memory, network usage, etc...
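
A dynamic, metric-driven migration decision can be sketched with a single CPU utilization threshold. Everything in this example is an assumption made for illustration: the `Host` type, the 0.8 threshold and the selection rules are not taken from the slides or from any particular system, and a real policy would also weigh memory, network usage and the migration overhead itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Host:
    name: str
    cpu_capacity: float
    vms: Dict[str, float]  # VM name -> CPU demand, in the same unit as cpu_capacity

    @property
    def utilization(self) -> float:
        return sum(self.vms.values()) / self.cpu_capacity


def plan_migration(hosts: List[Host], upper_threshold: float = 0.8) -> Optional[Tuple[str, str, str]]:
    """Return (vm, source host, target host) for one migration, or None if none is needed."""
    overloaded = [host for host in hosts if host.utilization > upper_threshold]
    if not overloaded:
        return None
    source = max(overloaded, key=lambda host: host.utilization)
    excess = sum(source.vms.values()) - upper_threshold * source.cpu_capacity
    # Prefer the smallest VM that alone removes the overload; otherwise move the largest.
    enough = [(demand, vm) for vm, demand in source.vms.items() if demand >= excess]
    demand, vm = min(enough) if enough else max((d, v) for v, d in source.vms.items())
    # Only consider targets that stay at or below the threshold after receiving the VM.
    targets = [host for host in hosts if host is not source
               and (sum(host.vms.values()) + demand) / host.cpu_capacity <= upper_threshold]
    if not targets:
        return None
    target = min(targets, key=lambda host: host.utilization)
    return vm, source.name, target.name
```

Calling the function periodically on fresh measurements gives a dynamic policy, while a static policy would fix the placement once and never revisit it.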

Example Load Balancing System

[A. Beloglazov and R. Buyya, “Energy efficient resource management in virtualized cloud data centers”]

• Decentralized
• Three-level system architecture
  – Dispatcher
    • Distributes requests between global managers
  – Global manager
    • Supervises a set of local managers
    • Shares the data of its own local managers with the other global managers
  – Local manager
    • Resides inside each of the physical server nodes
    • Responsible for continuous monitoring of the resource utilization

Example Load Balancing System

[A. Beloglazov and R. Buyya, “Energy efficient resource management in virtualized cloud data centers”]

EXPERIMENT

Experiment: Simulation of ECMP

• Equal-Cost Multi-Path (ECMP) experiment in a small data center
• Simulated using the Performance Simulation Environment (PSE)
  – In-house discrete event simulator

Simulation setup

Figure: http://simula.stanford.edu/~alizade/papers/conga-sigcomm14.pdf

PSE Model

CONCLUSIONS

Conclusions

• Load balancing is important
• Load balancing is challenging
• The experiment is not ready
• Network traffic load balancing is more about scalability without sacrificing latency or throughput under unexpected network conditions
• Server load balancing is more about efficient utilization of the server nodes, to reduce energy consumption