Self service build and deployment at Netflix (Agile 2013)


description

How Netflix lets all of our engineers build and deploy their own code to production.

Transcript of Self service build and deployment at Netflix (Agile 2013)

Page 1: Self service build and deployment at Netflix (Agile 2013)

Tweet @garethbowles with feedback!

Self-Service Build & Deployment @Netflix

Monday, August 5, 13

Page 2: Self service build and deployment at Netflix (Agile 2013)


• How would your organization be different if all of your engineers could build, test and deploy their own code ...

• ... and were responsible for fixing what they broke at 3am?

Page 3: Self service build and deployment at Netflix (Agile 2013)

Gareth Bowles

Page 5: Self service build and deployment at Netflix (Agile 2013)

Netflix is the world’s leading Internet television network with more than 36 million members in 40 countries enjoying more than one billion hours of TV shows and movies per month, including original series.

Source: http://ir.netflix.com

Page 6: Self service build and deployment at Netflix (Agile 2013)

The Challenge

• We need to innovate rapidly, driven by:

• Global competition

• New connected devices

• Continuous customer feedback

• And we need fast rollback

Page 7: Self service build and deployment at Netflix (Agile 2013)

The Challenge

• We need to scale to cope with:

• Growing customer base

• Peaks in demand:

• Special events: holidays, Oscars

• Daily fluctuations (weekdays vs. weekends, daytime vs. evening)

Page 8: Self service build and deployment at Netflix (Agile 2013)

Things That Help

• We can push out updates whenever we like

• Company culture

Page 9: Self service build and deployment at Netflix (Agile 2013)

Things That Got in Our Way

Page 10: Self service build and deployment at Netflix (Agile 2013)

A Few Short Years Ago ...

• Monolithic web app

• Single points of failure

• Releases were done by following runbooks

• DC-based infrastructure

• Different teams used different tools

Page 11: Self service build and deployment at Netflix (Agile 2013)

Meeting the Challenge

Page 12: Self service build and deployment at Netflix (Agile 2013)

http://www.slideshare.net/reed2001/culture-1798664

Page 13: Self service build and deployment at Netflix (Agile 2013)

Freedom and Responsibility

• Hire mature people who work well with others

• Give them the context for company success

• Then get out of their way

• But hold them responsible for results

Page 14: Self service build and deployment at Netflix (Agile 2013)

Context, not Control

• Be transparent about what the company needs to succeed

• Minimize the processes people need to go through to achieve success

• Value results, not planning and process

Page 15: Self service build and deployment at Netflix (Agile 2013)

Highly Aligned, Loosely Coupled

• Clear strategy and goals

• Team interactions focus on strategy, not tactics

• Minimal cross-functional meetings

• Occasional post-mortems to increase alignment

Page 16: Self service build and deployment at Netflix (Agile 2013)

What This Helped Us Achieve

• DVD to Streaming

• DC to cloud

• US-only to 40-plus countries

Page 17: Self service build and deployment at Netflix (Agile 2013)

Architecture

Credit: Steve Somers

Page 18: Self service build and deployment at Netflix (Agile 2013)

Key Changes

• Service oriented architecture

• Many small teams, each providing their own interconnected service

• Deploy on Amazon Web Services

• Increased reliance on open source

Page 19: Self service build and deployment at Netflix (Agile 2013)

Highly aligned, loosely coupled

• Services are built by different teams who work together to figure out what each service will provide.

• The service owner publishes an API that anyone can use.

Page 20: Self service build and deployment at Netflix (Agile 2013)

What AWS Provides

• Machine Images (AMI)

• Instances (EC2)

• Elastic IPs

• Load Balancers

• Security groups / Autoscaling groups

Page 21: Self service build and deployment at Netflix (Agile 2013)

Freedom and Responsibility

• Developers deploy when they want

• They also manage their own capacity and autoscaling

• And fix anything that breaks at 3am!

Page 22: Self service build and deployment at Netflix (Agile 2013)

Netflix API and its dependencies (diagram)

• 2B requests per day into the Netflix API

• 12B outbound requests per day to API dependencies

• API dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine

Page 24: Self service build and deployment at Netflix (Agile 2013)

Build and Deployment

Page 25: Self service build and deployment at Netflix (Agile 2013)

The Audience

• ~700 engineers

• Large majority are developers

• Test engineers

• Delivery teams

• Operations & reliability engineering

Page 26: Self service build and deployment at Netflix (Agile 2013)

Our Goal

• Lower the barriers to build, test and deployment until the entire process is accessible to every developer.

Page 27: Self service build and deployment at Netflix (Agile 2013)

The Team

• 11 engineers and 1 director (but we’re hiring!)

• Developers, build / release engineers, DevOps

• Specialize, but understand the full stack

• Service oriented

Page 31: Self service build and deployment at Netflix (Agile 2013)

Self-Service Build & Deployment

• Channel best practices

• Promote, don’t dictate

• Make adoption easy

• Make tools flexible

Page 32: Self service build and deployment at Netflix (Agile 2013)

Building and Deploying

Pipeline diagram (labels recovered from the slide):

• Inputs: source (Perforce / Git), libraries

• Jenkins: Ant targets (sync, resolve, compile, build, test, report, publish), Ivy, Groovy all over

• Outputs: snapshot / release libraries / apps, rpms

• Artifactory, yum, Aminator, Asgard

Page 34: Self service build and deployment at Netflix (Agile 2013)

Is That Really Self-Service?

Page 35: Self service build and deployment at Netflix (Agile 2013)

Common Build Framework

• Define a build with just a few lines of Ant code

• Templates for libraries and webapps

• Override standard targets if you need to

Page 36: Self service build and deployment at Netflix (Agile 2013)

Jenkins Job DSL

• Define Jenkins build jobs using a domain-specific language (based on Groovy)

• Loop to create multiple jobs (e.g. for building different branches)

• Make one change and rerun to update all jobs

• The code is the configuration

• https://wiki.jenkins-ci.org/display/JENKINS/Job
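The "loop to create multiple jobs" and "the code is the configuration" points can be illustrated with a minimal sketch. This is not the Job DSL itself (which is Groovy-based); it is a Python stand-in, and `JobDefinition`, `define_jobs`, the project name and the branch names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobDefinition:
    """Hypothetical, simplified stand-in for one Jenkins job's configuration."""
    name: str
    branch: str
    ant_target: str


def define_jobs(project: str, branches: list[str], ant_target: str = "build") -> list[JobDefinition]:
    """Generate one job definition per branch from a single template."""
    return [
        JobDefinition(name=f"{project}-{branch}", branch=branch, ant_target=ant_target)
        for branch in branches
    ]


if __name__ == "__main__":
    # Changing one argument here and rerunning regenerates every job consistently.
    for job in define_jobs("myservice", ["master", "release", "hotfix"]):
        print(job.name, job.branch, job.ant_target)
```

Because every job comes from the same template, a single change (for example a different Ant target) followed by a rerun updates all of them at once.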

Page 37: Self service build and deployment at Netflix (Agile 2013)

Jenkins Dynaslaves

• Create build slaves in AWS

• Dedicated slave pools for teams

• Scale slave pools up and down on demand

• https://github.com/Netflix-Skunkworks/dynaslave-plugin

Page 38: Self service build and deployment at Netflix (Agile 2013)

From Build to Deployment

Page 39: Self service build and deployment at Netflix (Agile 2013)

Aminator

• Create (“bake”) AMIs

• Image contains a service and everything needed to run it

• Can be automatically triggered as a build step

• https://github.com/Netflix/aminator

Page 46: Self service build and deployment at Netflix (Agile 2013)

How Baking is Different

Traditional: Generic AMI -> Instance

• launch OS

• install packages

• install app

Netflix: App AMI -> Instance

• launch OS + app

https://github.com/Netflix/aminator
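The contrast between the two approaches can be sketched as a comparison of what has to happen at instance launch. This is an illustrative model only, not Aminator's implementation; the function names, AMI names and package names are hypothetical.

```python
def traditional_launch(generic_ami: str, packages: list[str], app: str) -> list[str]:
    """Steps that run on every new instance when starting from a generic image."""
    steps = [f"launch OS from {generic_ami}"]
    steps += [f"install package {pkg}" for pkg in packages]
    steps.append(f"install app {app}")
    return steps


def bake(base_ami: str, packages: list[str], app: str) -> str:
    """Done once at build time; returns a hypothetical name for the baked app image."""
    return "+".join([base_ami, *packages, app])


def baked_launch(app_ami: str) -> list[str]:
    """With a baked image, launching is the only step left at deploy time."""
    return [f"launch OS+app from {app_ami}"]


if __name__ == "__main__":
    packages = ["jdk", "tomcat"]
    print(traditional_launch("base-ami", packages, "myservice"))
    print(baked_launch(bake("base-ami", packages, "myservice")))
```

In this model the installation work moves from every instance launch to a single bake, so each new instance needs only one step.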

Page 47: Self service build and deployment at Netflix (Agile 2013)

Linux Base AMI (CentOS or Ubuntu)

Java (JDK 6 or 7)

Tomcat

Optional Apache

Monitoring

Log Rotation to S3

AppDynamics Machine Agent

AppDynamics App Agent monitoring

Application war file, base servlet, platform, interface jars for dependent services

GC and thread dump logging

Healthcheck, status servlets, JMX interface, Servo autoscale

Page 48: Self service build and deployment at Netflix (Agile 2013)

At Netflix, the AMI is the unit of deployment.

Page 49: Self service build and deployment at Netflix (Agile 2013)

Asgard

• Web UI and REST API for service deployment and management

• Manage ASGs, ELBs, security groups, ...

• Application -> cluster -> ASG

• Rapid deployment and rollback

• Available to all engineers

• https://github.com/Netflix/asgard
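The "Application -> cluster -> ASG" hierarchy can be sketched as a grouping of ASG names. The sketch assumes a naming convention in which an ASG name is the cluster name plus a `-vNNN` version suffix, and the application is the first hyphen-separated token; the slides do not state this, so treat the convention, the function names and the sample names as assumptions.

```python
import re
from collections import defaultdict

# Assumed convention: "<app>-<stack>-vNNN", e.g. "api-prod-v001".
_VERSION_SUFFIX = re.compile(r"-v\d{3}$")


def cluster_of(asg_name: str) -> str:
    """Cluster name = ASG name without its version suffix (assumption)."""
    return _VERSION_SUFFIX.sub("", asg_name)


def application_of(asg_name: str) -> str:
    """Application name = first hyphen-separated token (assumption)."""
    return asg_name.split("-", 1)[0]


def group_asgs(asg_names: list[str]) -> dict[str, dict[str, list[str]]]:
    """Build the application -> cluster -> ASG tree from a flat list of names."""
    tree: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for name in sorted(asg_names):
        tree[application_of(name)][cluster_of(name)].append(name)
    return {app: dict(clusters) for app, clusters in tree.items()}


if __name__ == "__main__":
    print(group_asgs(["api-prod-v002", "api-prod-v001", "api-test-v001"]))
```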

Page 50: Self service build and deployment at Netflix (Agile 2013)

Red / Black Deployment

Page 51: Self service build and deployment at Netflix (Agile 2013)

Netflix has moved the granularity from the instance to the cluster.
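The slides only name red / black deployment, so the following is a sketch under a stated assumption: a new ASG running the new AMI is started alongside the old one, traffic is switched to it, and the old ASG is kept (without traffic) so that rollback is just a switch back. `Asg`, `Cluster` and the naming scheme are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Asg:
    name: str
    ami: str
    taking_traffic: bool = False


@dataclass
class Cluster:
    name: str
    asgs: list[Asg] = field(default_factory=list)

    def active(self) -> list[Asg]:
        """ASGs currently receiving traffic."""
        return [asg for asg in self.asgs if asg.taking_traffic]

    def deploy(self, ami: str) -> Asg:
        """Bring up a new ASG, switch traffic to it, keep the old one idle."""
        new = Asg(name=f"{self.name}-v{len(self.asgs) + 1:03d}", ami=ami)
        previously_active = self.active()
        self.asgs.append(new)
        new.taking_traffic = True          # enable the new ASG first
        for asg in previously_active:      # then take the old ASG out of traffic
            asg.taking_traffic = False
        return new

    def rollback(self) -> Asg:
        """Switch traffic back to the previous ASG, which was never torn down."""
        if len(self.asgs) < 2:
            raise RuntimeError("no previous ASG to roll back to")
        current, previous = self.asgs[-1], self.asgs[-2]
        previous.taking_traffic = True
        current.taking_traffic = False
        return previous


if __name__ == "__main__":
    cluster = Cluster("api-prod")
    cluster.deploy("ami-old")
    cluster.deploy("ami-new")
    print([asg.name for asg in cluster.active()])
    cluster.rollback()
    print([asg.ami for asg in cluster.active()])
```

The unit being swapped here is the whole ASG rather than individual instances, which is one way to read "from the instance to the cluster".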

Page 52: Self service build and deployment at Netflix (Agile 2013)

Simple Service Setup Effort

• Write the code (variable :-))

• 15 minutes to write a build file and define dependencies

• 15 mins to create a Jenkins build, 2 to 10 mins to run it

• 5 mins to bake an AMI

• 10 mins to deploy in test, another 10 for prod
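Adding up the figures above (excluding the variable time to write the code) gives a rough end-to-end range. The step labels in this sketch are paraphrased from the bullets; the only inputs are the numbers on the slide.

```python
# Fixed-duration steps from the slide, in minutes.
FIXED_STEPS_MIN = {
    "write build file and define dependencies": 15,
    "create Jenkins build": 15,
    "bake AMI": 5,
    "deploy to test": 10,
    "deploy to prod": 10,
}

# Running the Jenkins build is quoted as a range: 2 to 10 minutes.
BUILD_RUN_RANGE_MIN = (2, 10)


def total_setup_minutes() -> tuple[int, int]:
    """Return (best case, worst case) total minutes, excluding coding time."""
    fixed = sum(FIXED_STEPS_MIN.values())
    low, high = BUILD_RUN_RANGE_MIN
    return fixed + low, fixed + high


if __name__ == "__main__":
    print(total_setup_minutes())  # (57, 65)
```

So a simple service can go from written code to production in roughly an hour.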

Page 53: Self service build and deployment at Netflix (Agile 2013)

Just a quick reminder...

(Some of) Netflix is open source:

https://github.com/netflix

Page 54: Self service build and deployment at Netflix (Agile 2013)

Why We Open Source

• Give back to the Apache-licensed OSS community

• Motivate, retain, hire top engineers

• Benefit from a shared ecosystem

• Make Netflix solutions into common standards

Page 55: Self service build and deployment at Netflix (Agile 2013)

The Netflix Platform

• Discovery (Eureka)

• Entrypoints (Edda)

• Configuration (Archaius)

• Zookeeper (Exhibitor)

• Logging (Blitz4j & Honu)

• NIWS (Ribbon)

• GeoBase

• Circuit Breakers (Hystrix)

• Cassandra (Priam & Astyanax & CassJMeter)

• Cryptex

• AKMS

• EvCache

• Zuul

• i18n

• L10n

Open Source

Page 56: Self service build and deployment at Netflix (Agile 2013)

https://github.com/Netflix/Cloud-Prize/wiki

Page 57: Self service build and deployment at Netflix (Agile 2013)

Thank You!

Email: gbowles@{gmail,netflix}.com

Twitter: @garethbowles

LinkedIn: www.linkedin.com/in/garethbowles
