Lessons learned managing large AWS Environments
Ronald Bradford
http://ronaldbradford.com @RonaldBradford
2013.06
EffectiveMySQL.com - Performance, Scalability & Business Continuity
SCOPE
Consulting experiences with AWS
Several different clients
Largest - 500+ servers
Some 40-50+ servers
Some 2-5 servers
LAMP/RoR/RDS/Windows
ABOUT MySELF
Enterprise Data Architecture
24 years with RDBMS - 13 years with MySQL
Using AWS 4+ years
Published author - 4 books
Accomplished presenter - 8 years
Work as an independent MySQL consultant
Ronald BRADFORD
Covering
1. Products
2. Cost
3. Web Scale
4. Security
5. Instrumentation
6. Failure
1. AWS Products & Ecosystem
ABOUT AWS
Many, many products and features
EC2, S3, EBS, ELB, RDS, EMR, VPC, CDN, SWF, SQS, SES, SNS, IAM, ...
Mechanical Turk
Flexible Payments Service (FPS)
AMAZON WEB SERVICES: 30+
AWS CONSOLE
[Console screenshots: May 2013 and Aug 2012]
Announcements
Product Announcements
Pricing Changes
New instance types
New features (e.g. IOPS)
New Products (e.g. Redshift/ OpsWorks)
Examples in presentation
http://aws.amazon.com/about-aws/newsletters/
ECOSYSTEM
AWS Marketplace
https://aws.amazon.com/marketplace/
Over 800
Product growth
When I started
No RDS, In-memory Cache, DynamoDB, Glacier
No Elastic Beanstalk, OpsWorks
No management console
2. AWS Costs
operating cost
Are you monitoring your costs?
Daily
Hourly
Operating Cost
https://github.com/ronaldbradford/aws
$ ec2_cost.sh
$29,000 p.m.
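Hourly, daily and monthly figures are simple arithmetic once you have an inventory. The following is a minimal Python sketch of that arithmetic, not the logic of ec2_cost.sh. The fleet counts are hypothetical, and only the m1.large rate ($0.24) comes from this talk; the m1.xlarge rate is a placeholder assumption.

```python
from decimal import Decimal

# Hourly on-demand rates in USD. Only m1.large ($0.24) is quoted in this talk;
# the m1.xlarge figure is a placeholder assumption for illustration.
HOURLY_RATE = {
    "m1.large": Decimal("0.24"),
    "m1.xlarge": Decimal("0.48"),
}


def fleet_cost(fleet, hours=1):
    """Return the cost of running `fleet` ({instance_type: count}) for `hours`."""
    hourly = sum(HOURLY_RATE[itype] * count for itype, count in fleet.items())
    return hourly * hours


if __name__ == "__main__":
    fleet = {"m1.large": 100, "m1.xlarge": 20}  # hypothetical inventory
    print("hourly :", fleet_cost(fleet))
    print("daily  :", fleet_cost(fleet, hours=24))
    print("30 days:", fleet_cost(fleet, hours=24 * 30))
```

Running this on a schedule and comparing against yesterday's number is usually enough to notice a forgotten batch of instances.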
Your Money
What is AWS costing you?
Instance types/sizes
Cost options
http://aws.amazon.com/ec2/instance-types
http://aws.amazon.com/ec2/pricing
Instance Types
General-purpose
Compute-optimized
Memory-optimized
Storage-optimized
GPU
Instance Prices
Large Instance (m1.large)

Option      Price/hour   Notes                           Saving
On Demand   $0.24        Per hour investment
Reserved    $0.136 *     + Annual contract (+$0.043)     40%
Spot        $0.03+ *     Can be terminated (budget)      up to 80+%

Was $0.32 until 11/19/2012
Was $0.26 until 1/16/2013
Light/Medium/Heavy utilization
SPOT EXAMPLE
One hour (24 cents)
1 x Large - Reserved
7.5G RAM, 4 ECUs, 850G storage
8 x Large - Spot
or
1 x Eight Extra Large - Spot (cc2.8xlarge)
60G RAM, 88 ECUs, 3.4T storage, 10Gb NIC
price has changed 3 times in 8 months
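The "8 x Large" figure is just the budget divided by the spot price. A small Python sketch of that arithmetic, assuming the $0.03 spot floor quoted earlier actually holds for the hour (it often does not, as the spot history shows):

```python
from decimal import Decimal


def instances_for_budget(budget, hourly_price):
    """How many instance-hours a budget buys at a given hourly price."""
    return int(budget // hourly_price)


BUDGET = Decimal("0.24")      # one hour of spend, from the slide
SPOT_LARGE = Decimal("0.03")  # m1.large spot floor quoted in this talk

count = instances_for_budget(BUDGET, SPOT_LARGE)
print(count, "x m1.large spot for one hour")
print("memory :", count * Decimal("7.5"), "G")
print("compute:", count * 4, "units")
```

Decimal is used rather than float so that the division is exact for currency amounts.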
SPOT HISTORY
$ ec2-describe-spot-price-history -t m1.large -d Linux/UNIX
SPOTINSTANCEPRICE 0.030000 2013-05-28T17:20:41-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.100000 2013-05-28T17:07:02-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.030000 2013-05-28T16:37:51-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.100000 2013-05-28T16:31:03-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.030000 2013-05-28T16:24:48-0500 m1.large Linux/UNIX us-east-1d
SPOTINSTANCEPRICE 0.030000 2013-05-28T16:24:48-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.100000 2013-05-28T16:15:03-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.060000 2013-05-28T16:08:34-0500 m1.large Linux/UNIX us-east-1d
SPOTINSTANCEPRICE 0.030000 2013-05-28T16:01:59-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.240000 2013-05-28T15:55:12-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.030000 2013-05-28T15:48:32-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.030000 2013-05-28T15:42:07-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.045000 2013-05-28T15:35:47-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.050000 2013-05-28T15:35:47-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.400000 2013-05-28T15:29:15-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.260000 2013-05-28T15:22:47-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.030000 2013-05-28T15:16:01-0500 m1.large Linux/UNIX us-east-1d
SPOTINSTANCEPRICE 0.030000 2013-05-28T15:16:01-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.026000 2013-05-28T15:09:30-0500 m1.large Linux/UNIX us-east-1a
2013: 3c to 10c Zone A, 3c to 40c Zone B
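The per-zone ranges can be pulled out of that output mechanically. A minimal Python sketch, assuming the six whitespace-separated fields seen in the 2013 listing (tag, price, timestamp, type, description, zone); the sample is a subset of those records:

```python
from collections import defaultdict

# A few records copied from the 2013 listing in this talk.
SAMPLE = """\
SPOTINSTANCEPRICE 0.030000 2013-05-28T17:20:41-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.100000 2013-05-28T17:07:02-0500 m1.large Linux/UNIX us-east-1a
SPOTINSTANCEPRICE 0.240000 2013-05-28T15:55:12-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.400000 2013-05-28T15:29:15-0500 m1.large Linux/UNIX us-east-1b
SPOTINSTANCEPRICE 0.030000 2013-05-28T15:16:01-0500 m1.large Linux/UNIX us-east-1d
"""


def price_range_by_zone(text):
    """Return {zone: (lowest, highest)} for every zone found in the output."""
    prices = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not a well-formed price record.
        if len(fields) != 6 or fields[0] != "SPOTINSTANCEPRICE":
            continue
        prices[fields[5]].append(float(fields[1]))
    return {zone: (min(p), max(p)) for zone, p in prices.items()}


if __name__ == "__main__":
    for zone, (low, high) in sorted(price_range_by_zone(SAMPLE).items()):
        print(f"{zone}: {low:.3f} to {high:.3f}")
```

The same dictionary is a reasonable starting point for graphing or for an email alert when the high end crosses your bid.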
SPOT HISTORY
$ ec2-describe-spot-price-history -t m1.large -d Linux/UNIX
0.0260 2012-09-27T09:45:46-0800 m1.large Linux/UNIX us-east-1b
0.0260 2012-09-27T09:45:46-0800 m1.large Linux/UNIX us-east-1d
0.0290 2012-09-27T09:38:37-0800 m1.large Linux/UNIX us-east-1b
0.0370 2012-09-27T09:38:37-0800 m1.large Linux/UNIX us-east-1d
0.0600 2012-09-27T09:31:29-0800 m1.large Linux/UNIX us-east-1b
0.1700 2012-09-27T09:31:29-0800 m1.large Linux/UNIX us-east-1d
0.1600 2012-09-27T09:24:20-0800 m1.large Linux/UNIX us-east-1d
0.0600 2012-09-27T09:17:11-0800 m1.large Linux/UNIX us-east-1b
0.0900 2012-09-27T09:17:11-0800 m1.large Linux/UNIX us-east-1d
0.0260 2012-09-27T09:09:55-0800 m1.large Linux/UNIX us-east-1c
0.0260 2012-09-27T09:09:55-0800 m1.large Linux/UNIX us-east-1b
2012: 2.6c to 17c (1/2 of 34c), one AZ only
Using SPOTS
Is your volume predictable?
Splitting on-demand/spot instances
Can work be done asynchronously?
i.e. can be queued
Is work restartable?
WARNING: Not for general workloads
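"Queued and restartable" can be illustrated with a toy in-memory queue. This is a sketch only: `Interrupted` is a hypothetical exception standing in for a spot instance being terminated mid-job, and jobs are assumed to be safe to run again from the start.

```python
from collections import deque


class Interrupted(Exception):
    """Hypothetical stand-in for a spot instance terminated mid-job."""


def run_queue(jobs, process):
    """Process jobs in order; requeue any job whose worker is interrupted.

    Assumes interruptions are transient: a job that always fails would
    loop forever, so a real system needs a retry limit.
    """
    pending = deque(jobs)
    done = []
    while pending:
        job = pending.popleft()
        try:
            done.append(process(job))
        except Interrupted:
            pending.append(job)  # restartable work goes back on the queue
    return done


def make_flaky(fail_once):
    """Build a processor that is interrupted the first time it sees some jobs."""
    seen = set()

    def process(job):
        if job in fail_once and job not in seen:
            seen.add(job)
            raise Interrupted(job)
        return job.upper()

    return process


if __name__ == "__main__":
    print(run_queue(["a", "b", "c"], make_flaky({"b"})))
```

If a job cannot be requeued like this without harm, it is a poor fit for spot instances.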
Instance sizes
Evaluating the right instance size
What is your bottleneck?
Developing a tool to recommend savings
TRUSTED ADVISOR
AWS now offers Trusted Advisor
Recommendations to save money
Improve performance
Close security problems
http://aws.amazon.com/premiumsupport/trustedadvisor/
COST SAVINGS
Other players
http://www.newvem.com/
http://www.cloudyn.com/
OTHER COST SAVINGS
CDN - Cloudfront
Bandwidth
Reduce response size (e.g. 10%)
Storage
old EBS snapshots
Remove unused instances
http://aws.amazon.com/cloudfront/
NEW: Announced 1/9/2013 CloudWatch Alarm Actions
3. Web Scale (hint: no humans)
ABOUT WEB SCALE
GUI = #FAIL
CLI is necessary
Manual CLI use is slow
Automation is crucial
Parallel
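To show why "Parallel" matters at 500+ servers, here is a minimal Python sketch that runs the same task against many hosts concurrently. The host names and the `slow_check` task are hypothetical; in practice the task would wrap whatever CLI call or SSH command you are automating.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_on_all(hosts, task, max_workers=20):
    """Apply task(host) to every host concurrently; return {host: result}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with hosts.
        return dict(zip(hosts, pool.map(task, hosts)))


def slow_check(host):
    """Hypothetical per-host operation that takes 0.1s (e.g. a remote call)."""
    time.sleep(0.1)
    return "ok"


if __name__ == "__main__":
    hosts = [f"web{n}" for n in range(1, 41)]  # hypothetical host names
    start = time.time()
    results = run_on_all(hosts, slow_check)
    print(len(results), "hosts checked in", round(time.time() - start, 2), "s")
```

Threads are enough here because the work is I/O bound: each worker spends its time waiting on the network, not the CPU.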
AWS CLIs
Different for EC2, ELB, RDS etc
Updated frequently (i.e. monthly)
$ git clone https://github.com/ronaldbradford/aws.git
$ cd aws/scripts
$ ./aws_cli_configure.sh
Simple helper
RTFM
http://aws.amazon.com/archives/Amazon-EC2
Identifiers
Access Key ID
Secret Access Key
X.509 Certificates (2 of)
Private (*) & Public
AWS Account ID
Canonical User ID
https://portal.aws.amazon.com/gp/aws/securityCredentials
CLI Examples
Launch Script
Demand/Spot or switch between
Verify SSH
Verify MySQL
Verify replication in sync
Add to ELB
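The verification steps in a launch script might look like the following Python sketch. It is not the author's script. `port_open` covers "Verify SSH" (port 22) and "Verify MySQL" (port 3306) at the network level only, and `replication_in_sync` takes one row of `SHOW SLAVE STATUS` as a dict, however you choose to fetch it; treating zero seconds of lag as "in sync" is a simplifying assumption.

```python
import socket
import time


def port_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=5.0):
    """Poll until the port answers; new instances take a while to boot."""
    for _ in range(attempts):
        if port_open(host, port):
            return True
        time.sleep(delay)
    return False


def replication_in_sync(status, max_lag=0):
    """Check one SHOW SLAVE STATUS row (as a dict) for healthy replication."""
    lag = status.get("Seconds_Behind_Master")
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and lag is not None          # NULL lag means replication is not running
        and int(lag) <= max_lag
    )
```

Only once all three checks pass would the script go on to add the instance to the ELB.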
CLI Examples
Audit Script
Consolidates information
Parallel operations
Unused EC2/EBS etc
Feeds reporting
ELB/EC2 usage
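The "unused EC2/EBS" part of an audit reduces to filtering a consolidated inventory. A minimal Python sketch, assuming the inventory has already been collected into dicts; the keys (`id`, `attached_to`, `state`) and the sample IDs are hypothetical, not the output format of any AWS tool.

```python
def unused_volumes(volumes):
    """Return IDs of EBS volumes that are not attached to any instance."""
    return [v["id"] for v in volumes if not v.get("attached_to")]


def stopped_instances(instances):
    """Return IDs of instances that exist but are not running."""
    return [i["id"] for i in instances if i.get("state") == "stopped"]


def audit(instances, volumes):
    """Consolidate the findings into one report that can feed reporting."""
    return {
        "unused_volumes": unused_volumes(volumes),
        "stopped_instances": stopped_instances(instances),
    }


if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    instances = [
        {"id": "i-aaaa", "state": "running"},
        {"id": "i-bbbb", "state": "stopped"},
    ]
    volumes = [
        {"id": "vol-1111", "attached_to": "i-aaaa"},
        {"id": "vol-2222", "attached_to": None},
    ]
    print(audit(instances, volumes))
```

Unattached volumes and stopped instances still accrue storage charges, which is why they belong in a cost-focused audit.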
CLI EXAMPLES
Others
Cost Measurement
Cloning (optimizes scale-up)
Move servers between load balancers
Spot History graphing
Spot History email alerts
4. AWS Security
SECURITY
Do not give away the front door keys
Do not open all the windows
SECURITY OPTIONS
Keypairs
Security groups
Virtual Private Cloud (VPC)
Identity and Access Management (IAM)
Multi-factor authentication
Learn the different benefits
http://aws.amazon.com/mfa/
SECURITY TIPS
Restrict open access to port 80/443
Jump box
Restrict IP Access
Additional authentication
Per user SSH authentication
Do not use keypair
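One way to check for "open windows" is to scan security group rules for anything reachable from the whole internet. The Python sketch below reads the first tip as "only ports 80/443 should be open to the world", which is an interpretation. The rule dicts (`group`, `port`, `cidr`) are a hypothetical format, and only IPv4 CIDRs are considered.

```python
import ipaddress

PUBLIC_PORTS = {80, 443}  # ports assumed acceptable to expose to everyone


def overly_open_rules(rules):
    """Flag rules that admit any IPv4 address on a non-public port."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"])
        # A prefix length of 0 (0.0.0.0/0) matches every address.
        if network.prefixlen == 0 and rule["port"] not in PUBLIC_PORTS:
            flagged.append(rule)
    return flagged


if __name__ == "__main__":
    # Hypothetical rules for illustration.
    rules = [
        {"group": "web", "port": 80, "cidr": "0.0.0.0/0"},
        {"group": "web", "port": 22, "cidr": "0.0.0.0/0"},
        {"group": "db", "port": 3306, "cidr": "10.0.0.0/8"},
        {"group": "jump", "port": 22, "cidr": "203.0.113.10/32"},
    ]
    for rule in overly_open_rules(rules):
        print("open to the world:", rule["group"], rule["port"])
```

With a jump box in place, SSH should only ever appear in rules restricted to that box or to known office addresses.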
products
Many Others (AWS Summit 2013)
Cloudaware
Enstratius
AlertLogic
Dome9
SafeNet
5. Instrumentation
Instrumentation
What is important to you?
All server stats
Sampling issues
Deceiving averages (frequency)
REQUESTS PER SEC
5 second averages, not 1 minute sample
https://github.com/ronaldbradford/reqstat
-1,500 RPS
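Why the sampling window matters can be shown with synthetic numbers. This Python sketch is not how reqstat works; the traffic is invented (a steady 100 requests/sec with one 5-second burst of 1,500) purely to compare the two window sizes.

```python
def window_averages(per_second, window):
    """Average requests/sec over consecutive non-overlapping windows."""
    averages = []
    for start in range(0, len(per_second), window):
        chunk = per_second[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Synthetic minute: steady 100 req/s with a 5-second burst of 1,500 req/s.
minute = [100] * 60
minute[30:35] = [1500] * 5

one_minute = window_averages(minute, 60)
five_second = window_averages(minute, 5)

print("1 minute average :", round(one_minute[0], 1))
print("peak 5 sec window:", max(five_second))
```

The one-minute figure hides the burst almost entirely, while the five-second windows show the load the servers actually had to absorb.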
outliers
I care about these
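Outliers vanish in an average but show up in high percentiles. A small Python sketch with invented response times (98 fast requests, 2 very slow ones), using the nearest-rank percentile definition:

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic response times in ms: 98 fast requests and 2 very slow ones.
latencies = [20] * 98 + [2000] * 2
mean = sum(latencies) / len(latencies)

print("mean:", mean, "ms")
print("p50 :", percentile(latencies, 50), "ms")
print("p99 :", percentile(latencies, 99), "ms")
```

The mean suggests a slightly slow service; the 99th percentile shows that some users waited two full seconds.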
TESTING
End to end testing critical
Network latency
ELB performance
products
AWS Cloudwatch
Many Others (AWS Summit 2013)
Datadog
Boundary
CopperEgg
AppDynamics
What features matter?
6. Failure
FAILURE
Instances fail
Outages occur
AWS scheduled reboots
Be prepared
Chaos Monkey
http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html
CONCLUSION
Cost Management (saving money)
CLI automation
Instrumentation (inc business metrics)
Distribute your application & data
Disaster is inevitable
AWS for FREE
http://aws.amazon.com/free/
Free EC2 t1.micro for a year
Free RDS t1.micro for a year
S3, DynamoDB, SimpleDB, +++
http://effectiveMySQL.com
Ronald Bradford