Automating the Application Lifecycle

Devan Stormont

“Big Picture” Goals

What should we be aiming for?

■ Don’t try to do everything perfectly

■ Do tighten every feedback loop to respond as quickly as possible to problems

Goals in Practice

How do we accomplish this?

■ One-click manual steps

■ Monitoring results at every phase

■ Automatic reporting or action

[Diagram: the lifecycle as a loop: Develop, Deploy, Production, Analyze]

Development

■ One-click builds

■ Continuous builds

■ Test suites

■ Static code analysis

Deployment

■ One-click deployments

■ Minimizing down time

■ Monitoring rollout health

■ Incremental node rollout

Production

■ Instance health

■ Process/service health

■ Third-party service health

■ User activity

Analysis

■ Automatic data collection

■ Scheduled analysis

■ Report notifications

■ Automatic rollbacks

Development

The primary goals of development automation are:

■ Notify developers of errors

■ Prevent bad code from being released

Development

(There shouldn’t be anything new here for most developers)

■ One-click (or one-command) builds

Ideally, this is exactly the same regardless of developer OS

■ Sanity checks proactively fail builds

Unit testing

Property testing

Static code analysis

■ Continuous build systems

More thorough functional/integration tests

Every customer-reported issue should have an automated regression test!

Deeper code analysis
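A one-command build that fails proactively might look like the sketch below. The `run_build` name and the check names are hypothetical; each check is assumed to be a callable that returns True on success.

```python
from typing import Callable, List, Tuple

# A check is a (name, callable) pair; the callable returns True on success.
Check = Tuple[str, Callable[[], bool]]


def run_build(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run checks in order and stop at the first failure."""
    log: List[str] = []
    for name, check in checks:
        try:
            passed = bool(check())
        except Exception as exc:
            # A crashing check counts as a failed check.
            passed = False
            log.append(f"{name}: ERROR {exc}")
        else:
            log.append(f"{name}: {'ok' if passed else 'FAILED'}")
        if not passed:
            return False, log
    return True, log
```

Stopping at the first failure keeps the feedback loop short: the developer sees the earliest broken check rather than a wall of follow-on errors.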

Development - Feedback Loop

(There shouldn’t be anything new here for most developers)

■ Developer systems: Failed builds should prevent code check-ins

■ Continuous build failures

Send out notifications

Automatically roll back check-ins to release branches

Alternatively, success automatically integrates into release branches

Push system - the system is the gatekeeper

■ Continuous builds also generate reports about lower-threshold warnings

Static code analysis, test code coverage

Pull system - up to developers to be proactive

Minimize these as much as possible!
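The push/pull split can be sketched as a small gate on the continuous build result. `BuildResult` and `gate` are hypothetical names, and the action strings are illustrative only: hard failures trigger actions automatically (push), while warnings only land in a report that developers have to go and read (pull).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BuildResult:
    passed: bool
    warnings: List[str] = field(default_factory=list)


def gate(result: BuildResult) -> Dict[str, List[str]]:
    """Decide what the build system does on its own versus what it only reports."""
    if result.passed:
        actions = ["integrate into release branch"]
    else:
        actions = ["notify developers", "roll back check-in"]
    # Warnings never block anything; they are only listed in the report.
    return {"actions": actions, "report": list(result.warnings)}
```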

[Diagram: Development, Build, Check-In (Staging), Check-In (Release), with Fail paths]

Deployment

The primary goals of deployment automation are:

■ Automatically push out changes

■ Actively monitor rollout for problems

■ Automatically roll back to known good states

Deployment

■ One-click (one-command) automatic rollouts

Should be staged across instances/regions

Should minimize down time - hot swap!

■ Monitor rollout health

Node availability

Process/service availability

Data migration health

■ Failure thresholds

Developer notifications

Rollouts automatically unwind to known-good states
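A staged rollout with a failure threshold could be sketched as follows. Everything here is an assumption made for illustration: `staged_rollout`, the `deploy`/`healthy`/`rollback` callbacks, and the default 10% threshold are not from the slides.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    stages: Sequence[Sequence[str]],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    max_failure_ratio: float = 0.1,
) -> bool:
    """Deploy one stage at a time; unwind everything if a stage is unhealthy."""
    deployed: List[str] = []
    for stage in stages:
        if not stage:
            continue  # nothing to deploy or measure in an empty stage
        for node in stage:
            deploy(node)
            deployed.append(node)
        failures = sum(1 for node in stage if not healthy(node))
        if failures / len(stage) > max_failure_ratio:
            # Unwind in reverse order, back to the known-good state.
            for node in reversed(deployed):
                rollback(node)
            return False
    return True
```

Later stages are never touched once an earlier stage crosses the threshold, which limits how many instances or regions see a bad release.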

Deployment

With enough seamless monitoring in place:

■ Deployments should be invisible to users

■ A “good” code check-in can automatically drive a new deployment

[Diagram: Check-In (Release), Staged Deploy, Production, with a Rollback path]

Production

The primary goals of end-user production automation are:

■ Monitor for problems

■ Proactively address problems

■ If necessary, roll back to known good states

Production

There are two distinct elements to monitoring in production

■ Detecting system problems

■ Monitoring users

Production

■ System monitoring

Notifications if systems/instances go down or are overloaded

Automatically scale up new resources upon need

■ Service watchdogs

Automatic service restarts

Capture and storage of logs

Pushed by client, service, or cron job/scheduled task

■ Third-party APIs

Periodically check health/accessibility

Notifications upon failure

A problem with a necessary third-party service is a problem for your service

Your users will blame you for every issue
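A periodic third-party health check might be sketched like this. `check_dependencies`, the probe callables, and the `notify` callback are hypothetical; a real probe would typically make a network call, and the function would be run from a scheduler such as a cron job.

```python
from typing import Callable, Dict, List


def check_dependencies(
    probes: Dict[str, Callable[[], bool]],
    notify: Callable[[str], None],
) -> List[str]:
    """Run every probe once and notify about each dependency that fails."""
    failed: List[str] = []
    for name, probe in probes.items():
        try:
            ok = bool(probe())
        except Exception:
            # A probe that raises (timeout, DNS error, ...) counts as down.
            ok = False
        if not ok:
            failed.append(name)
            notify(f"health check failed: {name}")
    return failed
```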

Production

■ User monitoring

How many users are active?

What services are those users using?

What services are users hitting errors with?

■ Extended user monitoring

Email

Social media

App store reviews

Automatic notifications!

■ Users like interaction - people like to be noticed

Immediate, graceful interaction is likely to earn positive public feedback

Even from users who were complaining about a problem!

Production

Resolve problems

■ Automatically spin up/down resources to adapt to user load

■ Proactive notifications about errors

■ For critical issues, allow the production environment to automatically roll back to the last known-good state

■ Users who feel like you helped them personally are likely to become your evangelists
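Spinning resources up or down with user load can be reduced to a small sizing rule. The sketch below assumes load is measured in requests per second and that each instance has a known capacity; `desired_instances` and its bounds are hypothetical.

```python
def desired_instances(
    requests_per_second: int,
    capacity_per_instance: int,
    minimum: int = 1,
    maximum: int = 10,
) -> int:
    """Return how many instances the current load calls for, within bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Integer ceiling division: enough instances to cover the whole load.
    needed = -(-requests_per_second // capacity_per_instance)
    return max(minimum, min(maximum, needed))
```

The minimum keeps the service available when traffic drops to zero, and the maximum caps cost if load spikes or the metric itself is wrong.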

[Diagram: Production, Monitor, Rollback, Development]

Analysis

The primary goals of analysis automation are:

■ Proactive, early warning of known problems

Notifications of significant issues

Automatically resolving where possible

Unwinding bad deployments upon certain thresholds

■ Ability to more easily detect unknown problems

Requires prior collection of good enough information to resolve

Usually feeds into the next development iteration

■ Really touches all of the previous pieces, as already shown

Listed here because analysis should be treated as a first-class citizen
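Threshold-driven analysis can be sketched as a classifier over collected counts. The function name, the 1% notification threshold, and the 5% rollback threshold are assumptions chosen for illustration, not values from the talk.

```python
def classify_error_rate(
    errors: int,
    requests: int,
    notify_at: float = 0.01,
    rollback_at: float = 0.05,
) -> str:
    """Map an observed error rate to 'ok', 'notify', or 'rollback'."""
    if requests <= 0:
        return "ok"  # no traffic in this window, nothing to judge
    rate = errors / requests
    if rate >= rollback_at:
        return "rollback"
    if rate >= notify_at:
        return "notify"
    return "ok"
```

Run on a schedule against freshly collected data, a rule like this turns known problems into early warnings and, past the higher threshold, into an automatic unwind of the deployment.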

Analysis

■ If you’re not driving development (or even features) through the use of measurement, all you’re really doing is educated guessing

[Image labels: “Not this problem”; “Fix this problem first”]

Recap - Problem Resolution

The main points applicable to every stage

■ Automatic notifications of failures

■ Rollback to known-good state

■ Automatic resource scaling (up/down)

What this buys us

■ Immediate visibility into every link in the chain

■ Rapid, iterative releases for problem resolution

■ Rapid learning about your users