Load balancing in the SRE way
Ke Zhu @shawnzhu Site (Un)Reliability Engineer at IBM
For What?
• GitHub Enterprise Cluster
• On Internet
• Zero downtime
• 100M+ HTTP requests per week
• 30k+ attacks per week
• 26k+ git clones per hour (https://help.github.com/enterprise/2.8/admin/guides/installation/maintenance-mode/)
Design Goals
• Scripting Platform
• traffic conducting via code
• do social coding
• Observable
• Blue/Green deployment
• High performance
• Security from day one
Software Stack
&& 🎩 magic kernel parameters in /etc/sysctl.conf 🐰
Scripting Platform
• OpenResty (Nginx + Lua) - https://openresty.org/en/
(Example: customized request rate limiting)
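The rate-limiting idea mentioned on this slide can be sketched as a per-client token bucket. This is only an illustration of the logic in Python, not the Lua code from the talk and not an OpenResty API; the class name `TokenBucket` and the `rate` and `burst` parameters are assumptions made for the example.

```python
import time


class TokenBucket:
    """Per-key token bucket (hypothetical helper, not from the talk).

    Each key (for example a client IP) may burst up to `burst` requests,
    and tokens refill at `rate` tokens per second.
    """

    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = rate
        self.burst = burst
        self._now = now          # injectable clock, handy for testing
        self._buckets = {}       # key -> (tokens_left, last_timestamp)

    def allow(self, key):
        now = self._now()
        tokens, last = self._buckets.get(key, (self.burst, now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[key] = (tokens - 1, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

A request handler would call `allow(client_ip)` and reject the request (typically with HTTP 429) when it returns `False`.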
Blue/Green Deployment
• Cannot terminate any TCP connection
• Two stacks:
• load-balancer-green
• load-balancer-blue (for experiment)
• Cloud DNS
• Switching A record + short TTL (~5m)
• Simple/Weighted Routing policy
• Run experiment by using docker image tags
• Real time metrics collection by librato.com
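The weighted routing policy above is applied by the DNS provider, not by application code. Still, the effect can be sketched in a few lines: each new DNS resolution lands on one of the two stacks in proportion to its weight. The function name `pick_stack` and the 90/10 split are assumptions made for illustration.

```python
import random


def pick_stack(weights, rng=random):
    """Pick one stack name with probability proportional to its weight.

    `weights` maps a stack name to a non-negative weight; a weight of 0
    means the stack receives no new resolutions.
    """
    stacks = list(weights)
    return rng.choices(stacks, weights=[weights[s] for s in stacks], k=1)[0]


# Hypothetical experiment: send roughly 10% of new resolutions to blue.
experiment = {"load-balancer-green": 90, "load-balancer-blue": 10}
```

Because only the DNS answer changes, clients that already hold a connection to the old stack keep it, which matches the constraint that no TCP connection may be terminated. With a TTL of about 5 minutes, a weight change takes effect gradually as cached records expire.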
Test Driven for Container
• Test docker images
  • RSpec + Serverspec
  • Travis CI
• Test docker host
  • RSpec + Serverspec
  • Test Kitchen
❤ Vault
• Secret management via API - https://www.vaultproject.io/
• Retrieve all secrets for provisioning the load balancer via a single token with a 5-minute TTL
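As a rough sketch of the short-lived-token flow, the snippet below builds (but does not send) the two HTTP requests involved: one that asks Vault for a child token with a 5-minute TTL, and one that reads a secret with that token. The Vault address, the secret path, and the function names are hypothetical; the talk does not show how the provisioning was actually wired up.

```python
import json
import urllib.request

VAULT_ADDR = "https://vault.example.com:8200"  # hypothetical address


def token_create_request(parent_token, ttl="5m"):
    """Build a request for a child token that expires after `ttl`."""
    body = json.dumps({"ttl": ttl}).encode("utf-8")
    req = urllib.request.Request(
        VAULT_ADDR + "/v1/auth/token/create", data=body, method="POST"
    )
    req.add_header("X-Vault-Token", parent_token)
    req.add_header("Content-Type", "application/json")
    return req


def secret_read_request(token, path):
    """Build a read request for `path`, authenticated with `token`."""
    req = urllib.request.Request(VAULT_ADDR + "/v1/" + path, method="GET")
    req.add_header("X-Vault-Token", token)
    return req
```

Passing each request to `urllib.request.urlopen` would perform the call. The short TTL limits the damage if the token leaks from a provisioning log or a failed build.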
Blocking mode in Production
• Signal Sciences - https://signalsciences.net/
Summary
• Conducting HTTPS traffic via Lua code
• Blue/green deployment of the load balancer via DNS
• Testing Docker images and hosts with RSpec + Serverspec
• Signal Sciences in blocking mode