Engineering Operations

Posted on 24-May-2015


A presentation for the Automation session at the 2014 Cyber Summit by Subbu Allamaraju, Chief Engineer of Cloud at eBay Inc.

Transcript of Engineering Operations

2012

A dev/test cloud
Less than a rack of compute
Handcrafted by an engineer
Supported by another engineer
Zero automation
Dev to op ratio = 1:0

2014

Thousands of nodes
Distributed across several AZs
Automated
Operated 24x7
Running the business
Dev to op ratio: 5:1


Treat infrastructure as code

1. Fully automate deployments
   o  Well known principle
2. Treat automation artifacts like you treat code
   o  Source control → code reviews → tests → deployment
3. Treat automation as a product feature
   o  Road map, sprints, bugs, backlog, releases
4. Measure outcomes with KPIs
   o  Time to deploy, time to recover, time to roll out a change
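As an illustration of item 4, two of the KPIs named above (time to deploy and time to recover) can be derived from timestamps that most deployment and incident tooling already records. The sketch below is only one possible shape: the `Deployment` and `Incident` records and their field names are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Iterable


@dataclass
class Deployment:
    # Hypothetical record; field names are illustrative assumptions.
    started: datetime
    finished: datetime


@dataclass
class Incident:
    # Hypothetical record; field names are illustrative assumptions.
    detected: datetime
    resolved: datetime


def mean_time_to_deploy(deployments: Iterable[Deployment]) -> timedelta:
    """Average wall-clock duration from start to finish of a deployment."""
    seconds = mean((d.finished - d.started).total_seconds() for d in deployments)
    return timedelta(seconds=seconds)


def mean_time_to_recover(incidents: Iterable[Incident]) -> timedelta:
    """Average duration from detection of an incident to its resolution."""
    seconds = mean((i.resolved - i.detected).total_seconds() for i in incidents)
    return timedelta(seconds=seconds)
```

Both functions raise `statistics.StatisticsError` on an empty input, which is arguably the right behavior: a KPI with no data points should not silently report zero.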


Manage drift

System is in a state other than the desired state!

-  Incidents waiting to happen
-  Impacts time to recover
-  Impacts customers

Drift

-  Automation Gaps
-  Habits
-  Transitional
-  Debugging
-  Incidents

Accept that drift happens - and manage drift mitigation

1. Automated audits
2. Drift tracking
3. Mitigation as a planned routine activity
4. Culture - reward right habits
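An automated audit (item 1) can be as simple as comparing the desired state held in source control with the state observed on each node, and recording every mismatch so that drift can be tracked over time (item 2). The sketch below assumes both states are available as plain per-node dictionaries of settings; the `DriftItem` type, the `audit` function, and the sample node and setting names are all hypothetical, not from the talk.

```python
from typing import Dict, List, NamedTuple, Optional


class DriftItem(NamedTuple):
    # One setting on one node whose observed value differs from the desired one.
    node: str
    key: str
    desired: Optional[str]  # None: the setting is not part of the desired state
    actual: Optional[str]   # None: the setting is missing on the node


def audit(desired: Dict[str, Dict[str, str]],
          actual: Dict[str, Dict[str, str]]) -> List[DriftItem]:
    """Compare desired and observed settings per node and list every mismatch."""
    drift = []
    for node in sorted(set(desired) | set(actual)):
        want = desired.get(node, {})
        have = actual.get(node, {})
        for key in sorted(set(want) | set(have)):
            if want.get(key) != have.get(key):
                drift.append(DriftItem(node, key, want.get(key), have.get(key)))
    return drift


# Illustrative data only: one changed setting and one unmanaged extra.
desired = {"web-1": {"nginx": "1.6.2", "ntp": "enabled"}}
actual = {"web-1": {"nginx": "1.6.2", "ntp": "disabled", "tcpdump": "installed"}}

for item in audit(desired, actual):
    print(item)
```

Running such an audit on a schedule and storing the number of `DriftItem` entries per run would give a simple drift trend, which can then feed the planned mitigation work in item 3.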


Awareness of systems and operations

Measure everything

-  Business KPIs
-  Config management
-  Alerts
-  Drift

Another product feature!


Culture of shared accountability

Ops

-  Working on what’s running the business
-  Knows how the system fails
-  Worries about TTR

Dev

-  Working on (wants to work on) new things
-  Knows how the system is supposed to work
-  Wants to understand why

Make TTR a shared goal!

Operations is Engineering

Thanks