Experiences with AWS immutable deploys and job processing
Experiences with Docker on AWS at Gilt
Kevin O’Riordan, Sr. Software Engineer @ Gilt Groupe
Agenda
Introduction: Introduction to Gilt, Docker in development, Docker in production
Deployments: Overview, Ionroller immutable deploys, Codedeploy
Issues/Monitoring: Monitoring, Cloudwatch logging setup
Job/batch system: Evaluation of options, Sundial
Introduction
Introduction to Gilt
Online Flash sales company
We source luxury fashion goods, shoot the product in our studios,
then sell every day at Noon.
Micro-services architecture (200 services in production).
Mostly Scala services
Running on AWS (formerly Carpathia).
How we use Docker
In development: Bootstrapping team development environments with
minimum fuss
In production: Standardizing deployments and production
environments
Architecture overview
Our legacy architecture
Migration from Carpathia to AWS happened in earnest in March 2015
Most of our deployments at this time were RPM based deployments
Centralized management of service dependencies
Legacy “stop-gap” architecture consists of static nodes behind
Stingray traffic manager
Newer architecture
Migration to AWS team accounts
Moving from a centrally managed account to team-owned AWS
accounts
Aiming for simplest possible architecture
One container per VM architecture.
Reasoning: Burned in the past with shared hosts.
Service discovery: Each service behind an ELB. Static DNS entry for
ELB.
Container strategy
Why not multiple containers per instance?
Small decentralized teams
Clustering solutions such as Mesos, Kubernetes, CoreOS were
deemed too complex for teams to set up and manage on a per
team/account basis
Weren’t fully aware/trusting of Docker isolation features at decision
time
Amazon ECS wasn’t a mature option at the time but we may revisit
this.
Deployments
Voluntary Adoption
Up to each team to decide how they get their software to production
Teams tend to fall into the following camps:
Ionroller (Immutable deployment stacks)
Cloudformation + Codedeploy
Legacy tooling (mostly RPM based as opposed to Docker based)
Ionroller
Tool for immutable deploys developed by Gilt and open sourced
Allows deploys without downtime and instant rollback
https://github.com/gilt/ionroller
Based on Elasticbeanstalk
Using Docker means Ionroller doesn’t need to be aware of nature of
software being deployed
Using Docker means each service can have its own environment
while being deployed on the standard AMI that all Ionroller services use
Ionroller – DNS migration
First iteration achieved zero downtime using simultaneous
deployments and DNS traffic migration
Ionroller spins up new stack using Elasticbeanstalk Docker stack
Ionroller changes DNS alias in Route53 to point to ELB for new stack
After a configurable period of time, Ionroller terminates stack running
old version of service
Issues with Ionroller DNS traffic migration
Gilt internal version used DNS for traffic migration
A new ELB was set up for each version deploy
Loss of traffic history in ELB
In some situations DNS entry for old version might be cached
Public/open source version can use fixed ELBs and does traffic
migration by adding/removing instances to/from autoscaling group
Ionroller Demo
Ionroller – Traffic migration through ASGs
The open source version of Ionroller uses a more reliable method of
traffic migration
In this version a single ELB is used
When a new version of the service is deployed, a new auto scaling
group is created for the new version and registered with ELB
For phased migration, an instance running the old version of the service
is removed from the old ASG and an instance running the new version is
added to the new ASG, so traffic shifts one instance at a time.
When deployment is finished, old ASG is eventually destroyed
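The phased swap described above can be sketched as a small simulation. This is not Ionroller's code and makes no AWS calls; the AutoScalingGroup class, the phased_migration function and the step parameter are hypothetical names used only to illustrate the idea of shifting capacity between two groups behind one ELB.

```python
from dataclasses import dataclass


@dataclass
class AutoScalingGroup:
    """Hypothetical stand-in for an ASG registered with a single ELB."""
    name: str
    version: str
    instances: int = 0


def phased_migration(old, new, step=1):
    """Move capacity from the old group to the new one, `step` instances per phase.

    Returns the (old_count, new_count) pair recorded after every phase,
    starting with the initial state.
    """
    history = [(old.instances, new.instances)]
    while old.instances > 0:
        moved = min(step, old.instances)
        new.instances += moved  # add new-version capacity first
        old.instances -= moved  # then retire the same number of old instances
        history.append((old.instances, new.instances))
    return history


old_asg = AutoScalingGroup("service-v1", "1.0", instances=3)
new_asg = AutoScalingGroup("service-v2", "1.1")
phases = phased_migration(old_asg, new_asg)
```

In this sketch total capacity behind the ELB never drops below its starting value, and a rollback would simply be the same loop run with the two groups exchanged.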
The alternative: Cloudformation + Codedeploy
Adopted by teams who were unhappy with Ionroller limitations
Greater flexibility
Default Elasticbeanstalk Docker stack uses Nginx to do port
forwarding between ELB and Docker container which can be problematic
Most teams continue to use Docker to define environment on top of
standard Amazon Linux AMI
Docker host networking proved most reliable for us (--net=host)
Gilt tooling supports live and dark canary deploys
Mutable deployments, no instant rollback or guarantees of zero
downtime like with Ionroller
Monitoring
Monitoring
Current situation: Cloudwatch + New Relic
Also tried:
Self-hosted Logstash + Kibana + Elasticsearch.
Advantages: Great search capabilities, low cost
Disadvantage: Having to manually delete old log files but team is
working on automating this process.
Logentries: Excellent product but abandoned due to high cost
Cloudwatch logging implementation
Directory containing logs in Docker container is mounted onto host
filesystem.
Pass -v flag with host directory and container directory to docker run
command
Install Cloudwatch logging agent on EC2 instance
yum install -y awslogs
Configuration in /etc/awslogs/awslogs.conf
Doc at
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/AgentReference.html
service awslogs start
chkconfig awslogs on
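As a rough illustration of the steps above, the sketch below builds the docker run arguments for the volume mount and renders a minimal awslogs.conf. The helper names, the image name, the log paths and the log group are all hypothetical; the state_file path and the datetime_format value are assumptions to check against the agent reference, not values taken from the talk.

```python
import configparser
import io


def docker_run_args(image, host_log_dir, container_log_dir):
    """Build a docker run command that mounts the container's log directory on the host."""
    return ["docker", "run", "-d", "-v", f"{host_log_dir}:{container_log_dir}", image]


def render_awslogs_conf(log_file, log_group):
    """Render a minimal awslogs.conf with one stream section for a host log file."""
    # interpolation=None so the '%' characters in datetime_format are kept verbatim
    conf = configparser.ConfigParser(interpolation=None)
    conf["general"] = {"state_file": "/var/lib/awslogs/agent-state"}  # assumed default path
    conf[log_file] = {
        "file": log_file,
        "log_group_name": log_group,
        "log_stream_name": "{instance_id}",  # placeholder expanded by the agent
        "datetime_format": "%Y-%m-%d %H:%M:%S",  # assumed log timestamp format
    }
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


args = docker_run_args("my-service:latest", "/var/log/my-service", "/opt/app/logs")
conf_text = render_awslogs_conf("/var/log/my-service/app.log", "my-service")
```

The rendered text would be written to /etc/awslogs/awslogs.conf before starting the agent; the file entry must point at the host side of the mount, since the agent runs on the EC2 instance and not in the container.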
Cloudwatch Demo
Future development…
Will likely use AWS logger in Docker 1.9 when Amazon Linux has
docker 1.9 package
Amazon Elasticsearch Service
Launched early October
Allows us to search across Cloudwatch logs
In Cloudwatch logs UI, click on Log group and select “Start streaming
to Elasticsearch”
Quite expensive compared to hosting own Elasticsearch solution
though.
Job/batch processing
Job and batch processing
First foray into multiple Docker containers per EC2 instance
Our legacy system ran Ruby scripts on a cron timer
Explored Chronos as a replacement but not adopted due to bias
against Mesos
Personalization team developed a solution based on Amazon ECS
(EC2 Container Service) and Dockerized jobs which we call Sundial
Currently monitoring memory usage through Codahale metrics and
Graphite server in companion container.
More detailed monitoring of ECS cluster currently being assessed
(running a Datadog trial as well as trialling New Relic Docker
support).
Sundial
Higher level API and UI on top of Amazon ECS and Docker
ECS is Amazon EC2 Container Service: A clustering solution for
Docker containers
Sundial aggregates Docker jobs/ECS tasks into processes
Supports dependencies between jobs, graphical representation of
dependency tree
Collects logs and Graphite metrics from running jobs using Docker
companion container and streams logs to Cloudwatch
Supports viewing live logs and saved logs from jobs
Dependency graph shows failed/pending/succeeded status of jobs
Originally authored by Gary LosHuertos (https://twitter.com/garylosh),
owned and maintained by Gilt Personalization team
Plan to open source in coming months.
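The dependency handling described above can be illustrated with a small sketch. This is not Sundial's implementation or API: run_process, the job names and the rule that a job stays "pending" when an upstream job has not succeeded are assumptions made for the example. It uses graphlib from the Python standard library (3.9+) to order the jobs.

```python
from graphlib import TopologicalSorter


def run_process(dependencies, run_job):
    """Run jobs in dependency order and report a status per job.

    dependencies maps each job name to the set of jobs it depends on.
    run_job(name) returns True on success and False on failure.
    """
    status = {}
    # static_order() yields every job after all of the jobs it depends on
    for job in TopologicalSorter(dependencies).static_order():
        upstream = dependencies.get(job, ())
        if any(status[dep] != "succeeded" for dep in upstream):
            status[job] = "pending"  # blocked by an upstream job; never started
        else:
            status[job] = "succeeded" if run_job(job) else "failed"
    return status


# Hypothetical three-job process in which the middle job fails.
jobs = {"extract": set(), "load": {"extract"}, "report": {"load"}}
result = run_process(jobs, run_job=lambda name: name != "load")
```

Here result marks extract as succeeded, load as failed and report as pending, which is the kind of per-job state a dependency graph view would display.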
Sundial Demo
Thank you!
Kevin O’Riordan, @kevinoriordan