Engineering Operations

A presentation for the Automation session at the 2014 Cyber Summit by Subbu Allamaraju, Chief Engineer of Cloud at eBay Inc.

Page 1: Engineering Operations

2012

-  A dev/test cloud
-  Less than a rack of compute
-  Handcrafted by an engineer
-  Supported by another engineer
-  Zero automation
-  Dev to op ratio: 1:0

2014

-  Thousands of nodes
-  Distributed across several AZs
-  Automated
-  Operated 24x7
-  Running the business
-  Dev to op ratio: 5:1

Page 2: Engineering Operations

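1. Treat infrastructure as code
2. Manage drift
3. Awareness of systems and operations
4. Culture of shared accountability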

Page 3: Engineering Operations

1. Treat infrastructure as code

Page 4: Engineering Operations

1. Fully automate deployments
   o  Well-known principle
2. Treat automation artifacts like you treat code
   o  Source control → code reviews → tests → deployment
3. Treat automation as a product feature
   o  Road map, sprints, bugs, backlog, releases
4. Measure outcomes with KPIs
   o  Time to deploy, time to recover, time to roll out a change
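
The KPIs named in item 4 can be derived from timestamps that most deployment and incident tooling already records. The following is a minimal sketch, not the speaker's implementation: the `deployments` and `incidents` logs and the `mean_duration_minutes` helper are hypothetical, and the data is made up for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical deployment log: (started, finished) per deployment.
deployments = [
    (datetime(2014, 9, 1, 10, 0), datetime(2014, 9, 1, 10, 30)),
    (datetime(2014, 9, 2, 14, 0), datetime(2014, 9, 2, 14, 20)),
]

# Hypothetical incident log: (detected, resolved) per incident.
incidents = [
    (datetime(2014, 9, 3, 2, 0), datetime(2014, 9, 3, 3, 0)),
    (datetime(2014, 9, 5, 8, 0), datetime(2014, 9, 5, 8, 30)),
]


def mean_duration_minutes(intervals):
    """Average length of (start, end) intervals, in minutes."""
    return mean((end - start).total_seconds() / 60 for start, end in intervals)


time_to_deploy = mean_duration_minutes(deployments)
time_to_recover = mean_duration_minutes(incidents)

print(f"time to deploy:  {time_to_deploy:.1f} min")
print(f"time to recover: {time_to_recover:.1f} min")
```

Time to roll out a change could be measured the same way, assuming a log of (change approved, change live everywhere) timestamps is available.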

Page 5: Engineering Operations

2. Manage drift

Page 6: Engineering Operations

System is in a state other than the desired state!

-  Incidents waiting to happen
-  Impacts time to recover
-  Impacts customers
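
Drift in this sense can be made concrete as the difference between a desired-state description and what is actually observed on a node. The sketch below assumes both states are available as flat key/value dictionaries; `find_drift`, `MISSING`, and the sample settings are hypothetical names chosen for illustration.

```python
MISSING = "<missing>"  # marker for a setting present on only one side


def find_drift(desired, actual):
    """Return {setting: (desired_value, actual_value)} for every mismatch."""
    drift = {}
    for key in desired.keys() | actual.keys():
        want = desired.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical desired state, as it might be kept in source control.
desired = {"ntp_server": "ntp.example.com", "kernel": "3.13", "agent": "running"}

# Hypothetical observed state of one node.
actual = {
    "ntp_server": "ntp.example.com",
    "kernel": "3.11",
    "agent": "stopped",
    "debug_port": "open",  # left behind after a debugging session
}

drift = find_drift(desired, actual)
for setting, (want, have) in sorted(drift.items()):
    print(f"{setting}: desired={want} actual={have}")
```

An empty result means the node matches its desired state; anything else is drift that has to be either reverted or folded back into the desired-state definition.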

Page 7: Engineering Operations

Drift

-  Automation gaps
-  Habits
-  Transitional
-  Debugging
-  Incidents

Page 8: Engineering Operations

Accept that drift happens - and manage drift mitigation

1. Automated audits
2. Drift tracking
3. Mitigation as a planned routine activity
4. Culture - reward right habits
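
Items 1 and 2 can be combined into a recurring job: audit every node against the desired state, then keep the results so drift can be tracked over time. This is only a sketch under stated assumptions; the `audit` function, the `snapshots` structure, and the node names are hypothetical, and a real audit would collect state from live systems rather than from a dictionary.

```python
from datetime import date


def audit(fleet, desired):
    """Return the sorted names of nodes whose state differs from desired."""
    return sorted(name for name, actual in fleet.items() if actual != desired)


desired = {"kernel": "3.13", "agent": "running"}

# Hypothetical fleet state captured by two weekly audit runs.
snapshots = {
    date(2014, 9, 1): {
        "node-a": {"kernel": "3.13", "agent": "running"},
        "node-b": {"kernel": "3.11", "agent": "running"},
        "node-c": {"kernel": "3.13", "agent": "stopped"},
    },
    date(2014, 9, 8): {
        "node-a": {"kernel": "3.13", "agent": "running"},
        "node-b": {"kernel": "3.13", "agent": "running"},
        "node-c": {"kernel": "3.13", "agent": "stopped"},
    },
}

# Drift tracking: one (day, drifted nodes) entry per audit run, oldest first.
history = [(day, audit(fleet, desired)) for day, fleet in sorted(snapshots.items())]

for day, drifted in history:
    print(day.isoformat(), len(drifted), drifted)

# Nodes that were drifted in every audit are candidates for planned mitigation.
persistent = set.intersection(*(set(drifted) for _, drifted in history))
print("persistent drift:", sorted(persistent))
```

Keeping the history rather than only the latest result is what turns an audit into tracking: a falling count suggests mitigation is working, while a node that appears in every run points at an automation gap.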

Page 9: Engineering Operations

3. Awareness of systems and operations

Page 10: Engineering Operations

Measure everything

-  Business KPIs
-  Config management
-  Alerts
-  Drift

Another product feature!

Page 11: Engineering Operations

4. Culture of shared accountability

Page 12: Engineering Operations

Ops

-  Working on what’s running the business
-  Knows how the system fails
-  Worries about TTR

Dev

-  Working on (wants to work on) new things
-  Knows how the system is supposed to work
-  Wants to understand why

Make TTR (time to recover) a shared goal!

Page 13: Engineering Operations

Operations is Engineering

Page 14: Engineering Operations

Thanks