Experiences with AWS immutable deploys and job processing

Experiences with Docker on AWS at Gilt

Kevin O’Riordan, Sr. Software Engineer @ Gilt Groupe

Agenda

Introduction: Introduction to Gilt, Docker in development, Docker in production

Deployments: Overview, Ionroller immutable deploys, Codedeploy

Issues/Monitoring: Monitoring, Cloudwatch logging setup

Job/batch system: Evaluation of options, Sundial

Introduction

Introduction to Gilt

Online flash sales company

We source luxury fashion goods, shoot the products in our studios, then sell them every day at noon.

Micro-services architecture (200 services in production).

Mostly Scala services

Running on AWS (formerly Carpathia).

How we use Docker

In development: Bootstrapping team development environments with minimum fuss

In production: Standardizing deployments and production environments

Architecture overview

Our legacy architecture

Migration from Carpathia to AWS happened in earnest in March 2015

Most of our deployments at this time were RPM-based

Centralized management of service dependencies

Legacy “stop-gap” architecture consists of static nodes behind a Stingray traffic manager

Newer architecture

Migration to AWS team accounts

Moving from a centrally managed account to team-owned AWS accounts

Aiming for simplest possible architecture

One container per VM architecture.

Reasoning: Burned in the past with shared hosts.

Service discovery: Each service behind an ELB. Static DNS entry for the ELB.

Container strategy

Why not multiple containers per instance?

Small decentralized teams

Clustering solutions such as Mesos, Kubernetes, CoreOS were deemed too complex for teams to set up and manage on a per team/account basis

Weren’t fully aware/trusting of Docker isolation features at decision time

Amazon ECS wasn’t a mature option at the time but we may revisit this.

Deployments

Voluntary Adoption

Up to each team to decide how they get their software to production

Teams tend to fall into the following camps:

Ionroller (Immutable deployment stacks)

Cloudformation + Codedeploy

Legacy tooling (mostly RPM-based as opposed to Docker-based)

Ionroller

Tool for immutable deploys developed by Gilt and open sourced

Allows deploys without downtime and instant rollback

https://github.com/gilt/ionroller

Based on Elasticbeanstalk

Using Docker means Ionroller doesn’t need to be aware of the nature of the software being deployed

Using Docker means each service can have its own environment while being deployed on the standard AMI that all Ionroller services use

Ionroller – DNS migration

First iteration achieved zero downtime using simultaneous deployments and DNS traffic migration

Ionroller spins up a new stack using the Elasticbeanstalk Docker stack

Ionroller changes the DNS alias in Route53 to point to the ELB for the new stack

After a configurable period of time, Ionroller terminates the stack running the old version of the service
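
The DNS switch step could be expressed roughly as follows. This is only a sketch in Python, not Ionroller's own code: it builds the change batch that Route 53 expects for repointing an alias record at a new ELB. The record name, ELB DNS name and hosted zone ID below are made-up placeholders.

```python
def alias_upsert_batch(record_name, elb_dns_name, elb_hosted_zone_id):
    """Build a Route 53 change batch that repoints an alias record at an ELB."""
    return {
        "Comment": "Point service alias at the ELB of the new stack",
        "Changes": [
            {
                # UPSERT creates the record if missing, otherwise overwrites it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        # Hosted zone of the ELB itself, not of the service domain.
                        "HostedZoneId": elb_hosted_zone_id,
                        "DNSName": elb_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


# Hypothetical values, for illustration only.
batch = alias_upsert_batch(
    record_name="my-service.example.com.",
    elb_dns_name="my-service-v2-123456.us-east-1.elb.amazonaws.com.",
    elb_hosted_zone_id="Z_EXAMPLE_ELB_ZONE",
)
```

With boto3, a dict like this would be passed as the ChangeBatch argument of the Route 53 client's change_resource_record_sets call, together with the HostedZoneId of the service's own zone.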

Issues with Ionroller DNS traffic migration

Gilt internal version used DNS for traffic migration

A new ELB was set up for each version deploy

Loss of traffic history in ELB

In some situations the DNS entry for the old version might be cached

Public/open source version can use fixed ELBs and does traffic migration by adding/removing instances to/from autoscaling group

Ionroller Demo

Ionroller – Traffic migration through auto scaling groups

The open source version of Ionroller uses a more reliable form of traffic migration

In this version a single ELB is used

When a new version of the service is deployed, a new auto scaling group is created for the new version and registered with the ELB

For phased migration, an instance running the old version of the service is removed from the old version’s ASG and an instance running the new version is added to the new version’s ASG. This allows phased traffic migration.

When deployment is finished, the old ASG is eventually destroyed
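
The phased hand-over can be modelled in a few lines. This is a toy Python simulation under stated assumptions, not Ionroller's implementation: it assumes instances are swapped a fixed number at a time and that the new instance is brought in before the old one is removed, so total capacity behind the ELB never drops. The function name and the step parameter are invented for the illustration.

```python
def phased_migration(old_count, step=1):
    """Yield (old_in_service, new_in_service) after each migration phase."""
    old, new = old_count, 0
    yield old, new
    while old > 0:
        moved = min(step, old)
        new += moved  # bring new-version instances into service first
        old -= moved  # then take the same number of old-version instances out
        yield old, new


phases = list(phased_migration(3))
# phases == [(3, 0), (2, 1), (1, 2), (0, 3)]
```

Once the last phase reaches zero old instances, the old ASG has nothing left in service and can be destroyed.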

The alternative: Cloudformation + Codedeploy

Adopted by teams who were unhappy with Ionroller limitations

Greater flexibility

Default Elasticbeanstalk Docker stack uses Nginx to do port forwarding between ELB and Docker container, which can be problematic

Most teams continue to use Docker to define environment on top of standard Amazon Linux AMI

Docker host networking proved most reliable for us (--net=host)

Gilt tooling supports live and dark canary deploys

Mutable deployments: no instant rollback or guarantees of zero downtime like with Ionroller

Monitoring

Monitoring

Current situation: Cloudwatch + New Relic

Also tried:

Self-hosted Logstash + Kibana + Elasticsearch.

Advantages: Great search capabilities, low cost

Disadvantage: Having to manually delete old log files, but the team is working on automating this process.

Logentries: Excellent product but abandoned due to high cost

Cloudwatch logging implementation

Directory containing logs in Docker container is mounted onto host filesystem.

Pass -v flag with host directory and container directory to docker run command

Install Cloudwatch logging agent on EC2 instance

yum install -y awslogs

Configuration in /etc/awslogs/awslogs.conf

Doc at http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/AgentReference.html

service awslogs start

chkconfig awslogs on
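
To make the setup above more concrete, here is a rough Python sketch that builds the docker run arguments for the volume mount and generates a minimal awslogs.conf. The log paths, log group name, image name and state file location are assumptions for the example, not Gilt's actual values; the agent reference linked above lists the full set of options.

```python
import configparser
import io


def docker_run_args(image, host_dir, container_dir):
    """docker run arguments that mount the container's log directory on the host."""
    return ["docker", "run", "-d", "-v", f"{host_dir}:{container_dir}", image]


def build_awslogs_conf(log_file, log_group):
    """Render a minimal awslogs.conf that ships one log file to CloudWatch Logs."""
    # interpolation=None so the '%' signs in datetime_format are kept literally.
    conf = configparser.ConfigParser(interpolation=None)
    # Assumed state file location; check the installed package's default.
    conf["general"] = {"state_file": "/var/lib/awslogs/agent-state"}
    conf[log_file] = {
        "file": log_file,
        "log_group_name": log_group,
        "log_stream_name": "{instance_id}",  # expanded by the agent at runtime
        "datetime_format": "%Y-%m-%d %H:%M:%S",
    }
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


# Hypothetical names and paths, for illustration only.
args = docker_run_args("my-service:1.0", "/var/log/my-service", "/opt/my-service/logs")
conf_text = build_awslogs_conf("/var/log/my-service/app.log", "my-service")
```

The generated text would go into /etc/awslogs/awslogs.conf before the agent is started.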

Cloudwatch Demo

Future development…

Will likely use AWS logger in Docker 1.9 when Amazon Linux has a Docker 1.9 package

Amazon Elasticsearch

Launched early October

Allows us to search across Cloudwatch logs

In Cloudwatch logs UI, click on Log group and select “Start streaming to Elasticsearch”

Quite expensive compared to hosting own Elasticsearch solution though.
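
For the Docker 1.9 AWS logger mentioned above, the container would send its output to Cloudwatch directly instead of going through a mounted directory and the awslogs agent. A minimal Python sketch of the docker run arguments involved, assuming the awslogs logging driver and its awslogs-region, awslogs-group and awslogs-stream options; the image, region and group names are placeholders.

```python
import shlex


def docker_run_awslogs_args(image, region, log_group, log_stream=None):
    """docker run arguments that route container output to CloudWatch Logs."""
    args = [
        "docker", "run", "-d",
        "--log-driver=awslogs",
        "--log-opt", f"awslogs-region={region}",
        "--log-opt", f"awslogs-group={log_group}",
    ]
    if log_stream is not None:
        # Optional: without it the driver picks a stream name itself.
        args += ["--log-opt", f"awslogs-stream={log_stream}"]
    args.append(image)
    return args


# Hypothetical values, for illustration only.
cmd = docker_run_awslogs_args("my-service:1.0", "us-east-1", "my-service")
print(shlex.join(cmd))
```

The EC2 instance would still need IAM permissions to write to Cloudwatch logs for this to work.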

Job/batch processing

Job and batch processing

First foray into multiple Docker containers per EC2 instance

Our legacy system ran Ruby scripts on a cron timer

Explored Chronos as a replacement but not adopted due to bias against Mesos

Personalization team developed a solution based on Amazon ECS (EC2 Container Service) and Dockerized jobs which we call Sundial

Currently monitoring memory usage through Codahale metrics and Graphite server in companion container.

More detailed monitoring of ECS cluster currently being assessed (currently doing Datadog trial as well as trialling New Relic Docker support).

Sundial

Higher level API and UI on top of Amazon ECS and Docker

ECS is Amazon EC2 Container Service: a clustering solution for Docker containers

Sundial aggregates Docker jobs/ECS tasks into processes

Supports dependencies between jobs, graphical representation of dependency tree

Collects logs and Graphite metrics from running jobs using Docker companion container and streams logs to Cloudwatch

Supports viewing live logs and saved logs from jobs

Dependency graph shows failed/pending/succeeded status of jobs

Originally authored by Gary LosHuertos (https://twitter.com/garylosh), owned and maintained by Gilt Personalization team

Plan to open source in coming months.
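
The idea of a process made of jobs with dependencies and a failed/pending/succeeded status can be illustrated with a small Python sketch. This is not Sundial's code or API: the function, the job names and the rule that a job stays pending when any prerequisite did not succeed are all assumptions made for the example.

```python
from graphlib import TopologicalSorter


def run_process(dependencies, run_job):
    """Run jobs in dependency order and return a status per job.

    dependencies maps a job name to the set of jobs it depends on;
    run_job(name) returns True on success and False on failure.
    """
    status = {}
    # static_order() yields every job after all of its prerequisites.
    for job in TopologicalSorter(dependencies).static_order():
        prerequisites = dependencies.get(job, ())
        if any(status[p] != "succeeded" for p in prerequisites):
            status[job] = "pending"  # blocked by a failed or blocked prerequisite
        else:
            status[job] = "succeeded" if run_job(job) else "failed"
    return status


# Hypothetical process: extract -> transform -> load -> report.
deps = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}
result = run_process(deps, run_job=lambda name: name != "transform")
# result == {"extract": "succeeded", "transform": "failed",
#            "load": "pending", "report": "pending"}
```

In a real system each run_job call would correspond to starting a Docker job as an ECS task and waiting for its exit status.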

Sundial Demo

Thank you!

Kevin O’Riordan

@kevinoriordan