Everything breaks All the time -...

Everything breaksAll the time

• After IT related jobs;• Dutch Police – enterprise IT• OCOM group - hosting & cloud• SDL – Software As A Service• Spoken Communications – Call Center as Service

History of Cloud backend

• Behind the Website and Apps;• Running in to scale problems • Cost• Complexity• Speed

Running at scale

• The ‘normal’ things didn’t work any more.• New server hardware, networking, software• New datacenters• New economics

Running at scale

• Many layers, abstracted away

Large scale – everybody's problem now

• For cloud providers –millions of servers• For cloud costumers –

millions of instances

Automation

• Large scale;• Deployment• Config• CI/CD• Capacity mgmt• Monitoring

Everything fails… all the time…

• Hardware will fail.• Software will have bugs• Force Majeure will hit

• Focus on restore not on preventing

AFR harddisk failure

• Force Majeure will hit; Amazon datacenter failure 2010

• Human error; Amazon S3 failure 2017

Balancing cost and risk

- Cost goes up- Complexity goes up

Outage !

• What happens ?• Typical Process (ITIL or similar)• Typical Human behaviour

• What happens after ?• Review process, retro or post mortem• Typical Human behaviour

Retro’s

• When ?• Who ?• Structure:

• Detailed Timeline• Root cause• What went well• What could we have done better• Action items

• Automate the input (ChatOps)• Importance of Blameless Retro’s.• Share retro output internal and external

Materials / references

• Barraso, Clidaras, Holzle: The Datacenter as a Computer. 2009/2013

• Mark Burgess: In Search of Certainty. 2013

• Woods DD. STELLA: Report from the SNAFUcatchers Workshop on Coping With Complexity. 2017.

Everything breaks All the time -...

Documents

Transcript of Everything breaks All the time -...

When Time is Everything . . . Time We Manufactu re in Days - Not … · 2018. 11. 23. · When Time is Everything . . . We Manufactu re in Days - Not Weeks ! Time OCTOBER 2018 ...

A Time for Everything There is a time for everything, and a season for every activity under heaven: a time to be born and a time to die, a time to plant.

Beverley kalnin - a time for everything

Top 10 Everything of 2009 by Time

Real-Time Everything - the Era of Communication Ubiquity

Time Heals Everything

cystic fibrosis lecture.pdf

Mirzaei-FractureMechanics Lecture.pdf

1-Basic Terminology of SOCIOLOGY - Class Lecture.pdf

Blais ch01 lecture.pdf

Computer Architecture All lecture.pdf

1974_Flory Nobel-lecture.pdf

Corbett lecture.pdf

all lecture.pdf

TIMING IS EVERYTHING (Ecclesiastes 3:1-11) -G-God’s time is perfect time -G-God is a God of Timing. - HE appointed time for everything - HE set a time.

Early Conditions Class Lecture.pdf

He hath made everything beautiful in his time ...

modeling simulation and system engineering lecture.pdf

Time upgrades everything

A Time for Everything