Advanced Topics - Session 3 - Optimizing AWS Applications
Paul Duffy
Optimizing Your AWS Applications and Usage to Reduce Costs
AWS Product Marketing
Agenda
• Objective
– Review the spectrum of ways to save money on your AWS application
• Tenet: Fit the cloud to your product and business model
– Use Only What You Need (and pay only for what you use!)
– Measure and Manage
– Scale Opportunistically
• Customer Spotlight
– National Rail Enquiries
Use Only What You Need
And pay only for what you use!
Customer Example
Chris Scoggins
CEO, National Rail Enquiries
Background
• Private company created in 1996 owned by the TOCs
• From the busiest phone number in the UK to the #1 website in travel
• Over 1 million visits every day across web & mobile
• Achieved over 99% migration to self-service
• Customer complaints 1.3 per 100,000 contacts
• Over £800m of sales leads provided to TOCs and 3rd parties p.a.
• Over 500 services provided to 150 clients
• Annual growth of 50%
The Challenge
• Volatility of up to 10x peak demand
• Large deployed computer estate across 6 data centres
• Ageing computer estate
• Rapid growth in B2C and B2B business
• Ever increasing rich functionality in channels
• Multiple service desks
• Suppliers are experts in application development, not hosting
Why Cloud?
• Agility and elasticity – use what we need, when needed
• High performance – availability & resilience
• Market knowledge – solution provided by hosting & SIAM experts
• Low cost – pay for use, savings of 30%
• Commodity culture – ready and easy to use
• Flexibility and freedom – keep up to date & not locked in
Scale on demand
[Figure: two charts of Capacity vs. Time]
• Rigid On-Premise Resources: Predicted Demand vs. Actual demand, with the gaps labeled Waste and Customer Dissatisfaction
• VS. Elastic Cloud Resources: Resources scaled to demand, tracking Actual demand
Use only what you need: AWS cost savings opportunities
• Right-size your cloud resources
– Use resources that suit your needs (instance types, storage options, etc.)
– Improve performance: reduce churn, underutilization, bottlenecks
– Lower costs: maximize your output per dollar, don’t pay for performance you don’t require
• Fit your payment model to your business model
– Do you value flexibility or predictability?
– Use a portfolio of payment models
• Measure and manage your application and cloud resources
– Monitor your applications to identify new savings opportunities
Right-size your cloud resources: broad EC2 selection
• An instance type for every purpose
• Assess your memory & CPU requirements
– Fit your application to the resource
– Fit the resource to your application
• Only use a larger instance when needed
Optimize your storage choice too: S3 & Glacier
• S3 and Glacier are both:
– Secure
– Flexible
– Low-cost
– Scalable: over 2 trillion customer objects
– Durable: 99.999999999% (11 “9”s)
Choosing between S3 and Glacier
• Amazon Simple Storage Service (S3)
– Designed to serve static content at high volumes, low latency, frequent access
– Low cost: as low as 5.5¢ per GB-month (or 3.7¢ for reduced redundancy)
• Amazon Glacier
– Designed for long-term cold storage: infrequent access, long retrieval times (3-5 hrs)
– Extremely low-cost: 1¢ per GB-month
• Tips:
– Optimize access: Reduce payload size, # of accesses (e.g., consolidated logs)
– Monitor for unexpected access/growth patterns: e.g., misconfigured log archiving
– Set Lifecycle Policies: object expiration dates; auto-move S3 files to Glacier
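As a rough illustration of the lifecycle tip above, the sketch below estimates the monthly storage bill when a share of the data is moved from S3 to Glacier. The per-GB prices are the ones quoted on this slide (5.5¢ and 1¢ per GB-month); the function name, the 10 TB data set, and the 80% cold fraction are assumptions for illustration, and request, retrieval, and transfer fees are ignored.

```python
# Prices quoted on the slide, in dollars per GB-month.
S3_PRICE = 0.055
GLACIER_PRICE = 0.01


def monthly_storage_cost(total_gb, cold_fraction):
    """Estimate monthly cost when `cold_fraction` of the data lives in Glacier.

    Hypothetical helper: ignores request, retrieval, and transfer fees.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    cold_gb = total_gb * cold_fraction
    hot_gb = total_gb - cold_gb
    return hot_gb * S3_PRICE + cold_gb * GLACIER_PRICE


if __name__ == "__main__":
    # Assumed example: 10 TB of logs, 80% of which is rarely read.
    all_s3 = monthly_storage_cost(10_000, 0.0)
    tiered = monthly_storage_cost(10_000, 0.8)
    print(f"All in S3:      ${all_s3:,.2f}/month")
    print(f"80% in Glacier: ${tiered:,.2f}/month")
```

The comparison only makes sense for data that can tolerate Glacier's multi-hour retrieval time.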
Illumina, the leading provider of DNA sequencing instruments, uses Glacier to store large blocks of genomic data all over the world
Fit your payment model to your business model: EC2 pricing plans
• On-Demand Instances
– Pay as you go for computing power
– Flat hourly rate, no up-front commitments
• Reserved Instances
– Pay an up-front fee for a capacity reservation and a lower hourly rate (up to 72% savings)
– 1-year or 3-year terms
– RI Marketplace: sell RIs you no longer need; buy RIs at a discount
• Spot Instances
– Pay what you want for spare EC2 capacity: your instances run if your bid exceeds the Spot price
– Potential for large scale at low cost: When they’re available, take advantage of 1,000s of Spot Instances at up to 90% savings
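The trade-off between On-Demand and Reserved pricing comes down to a break-even point: the up-front fee is recovered once enough hours have been billed at the lower rate. The sketch below works that out. The slides give no Reserved Instance prices, so the fee and rates in the example are made-up numbers, and the function names are hypothetical.

```python
def break_even_hours(upfront_fee, reserved_hourly, on_demand_hourly):
    """Hours of use after which a Reserved Instance becomes the cheaper option."""
    hourly_saving = on_demand_hourly - reserved_hourly
    if hourly_saving <= 0:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront_fee / hourly_saving


def cheaper_plan(hours_used, upfront_fee, reserved_hourly, on_demand_hourly):
    """Return which plan costs less for a given number of hours over the term."""
    reserved_total = upfront_fee + reserved_hourly * hours_used
    on_demand_total = on_demand_hourly * hours_used
    return "reserved" if reserved_total < on_demand_total else "on-demand"


if __name__ == "__main__":
    # Illustrative prices only (not AWS list prices).
    fee, ri_rate, od_rate = 1000.0, 0.50, 1.00
    print("Break-even after", break_even_hours(fee, ri_rate, od_rate), "hours")
    print("Running 24x7 for a year:", cheaper_plan(8760, fee, ri_rate, od_rate))
    print("Running 1,000 hours:", cheaper_plan(1000, fee, ri_rate, od_rate))
```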
Use a spectrum of payment models
For example: Frontend Applications on On-Demand/Reserved Instances + Backend Applications* on Spot Instances
* e.g., batch video transcoding
Reserved Instance Marketplace: Buy and Sell Your RIs
• Benefits for Buyers:
– Same underlying EC2 hardware
– Buy RIs at a discount from AWS price
– Increased selection of term lengths & prices
• Benefits for Sellers:
– Moving to a new AWS region
– Changing your instance type
– Switching operating systems
– Selling capacity when project ends
Measure and Manage
“If you cannot measure it, you cannot improve it.”
- Lord Kelvin
Overview of AWS Monitoring and Management Services
• AWS provides detailed cloud monitoring and management
– Consolidated Billing (see “Account Activity” navigation panel)
– CloudWatch (see AWS Management Console)
– Billing Alerts (see “Account Activity” navigation panel)
– Trusted Advisor (see “Support Center”)
– Other APIs: tags, programmatic access, etc.
• Third-party services are also available
Consolidated Billing: Single payer for a group of accounts
• One Bill for multiple accounts
• Easy Tracking of account charges (e.g., download CSV of cost data)
• Group Activities by Paying Account (e.g., Dev, Stage, Test, Prod)
• Volume Discounts can be reached faster with combined usage
• Reserved Instances are shared across accounts (including RDS Reserved DBs)
• AWS Credits are combined to minimize your bill
Consolidated Billing Demo (1/3)
• Get an overall summary total for all your users and accounts
Consolidated Billing Demo (2/3)
• From your payment account login, view details of each linked account in one place
Consolidated Billing Demo (3/3)
• Drill down into details of each account
• Download a CSV file for line item details, then analyze via spreadsheet, pivot tables, etc.
Amazon CloudWatch
• Overview
– Monitoring for AWS cloud resources and applications
• AWS Resources: EC2, RDS, EBS, ELB, SQS, SNS, DynamoDB, EMR, Auto Scaling, …
• Custom metrics from your application (use Put API call)
– Gain insight, set alarms and notifications, react immediately
– Start using within minutes, auto-scale with your application
• Sophisticated Automation
– Use CloudWatch metrics with Auto Scaling to dynamically scale EC2 instances
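A custom metric is published with the CloudWatch "Put" call mentioned above (`put_metric_data` in the boto3 SDK). The sketch below builds the payload and, optionally, sends it. The namespace, metric name, and dimension are invented for illustration, and `boto3` is imported only when a real client is needed, so the payload builder runs without AWS credentials.

```python
def build_metric_datum(name, value, unit="Count", dimensions=None):
    """Build one entry of the MetricData list expected by put_metric_data."""
    datum = {"MetricName": name, "Value": float(value), "Unit": unit}
    if dimensions:
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ]
    return datum


def publish_metric(namespace, datum, client=None):
    """Send one datum to CloudWatch; pass a client to avoid creating one here."""
    if client is None:
        import boto3  # lazy import: only needed when actually calling AWS

        client = boto3.client("cloudwatch")
    return client.put_metric_data(Namespace=namespace, MetricData=[datum])


if __name__ == "__main__":
    # Hypothetical application metric: jobs waiting in a work queue.
    datum = build_metric_datum("QueueDepth", 42, dimensions={"Service": "transcoder"})
    print(datum)
    # publish_metric("MyApp", datum)  # requires AWS credentials
```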
Use CloudWatch to monitor & manage resource usage
• Monitor your resource utilization
– Are you using the right instance type?
– Have you left instances idle?
– Is your instance usage level or bursty?
• Manage your resource utilization
– Move bursty workloads to other instances
– Rebalance your worker nodes
– Scale nodes automatically with Auto Scaling
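The three monitoring questions above (right size, idle, level or bursty) can be approximated from a series of CPU utilization samples. The sketch below is one simple way to do it. The thresholds are arbitrary assumptions, not AWS guidance, and in practice the samples would come from CloudWatch rather than a hard-coded list.

```python
from statistics import mean


def classify_utilization(cpu_samples, idle_threshold=5.0, burst_ratio=3.0):
    """Label a series of CPU percentages as 'idle', 'bursty', or 'level'.

    Assumed heuristics: 'idle' if the peak never reaches idle_threshold,
    'bursty' if the peak is at least burst_ratio times the average.
    """
    if not cpu_samples:
        raise ValueError("need at least one sample")
    average = mean(cpu_samples)
    peak = max(cpu_samples)
    if peak < idle_threshold:
        return "idle"
    if average > 0 and peak / average >= burst_ratio:
        return "bursty"
    return "level"


if __name__ == "__main__":
    print(classify_utilization([1, 2, 1, 0]))      # candidate for shutdown
    print(classify_utilization([5, 5, 90, 5]))     # candidate for Auto Scaling
    print(classify_utilization([40, 45, 50, 42]))  # candidate for right-sizing review
```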
Use CloudWatch to create Billing Alerts
• Billing Alerts notify you when estimated charges reach a given threshold
• Use Billing Alerts to track an individual developer, or your whole business
• Easily set up your billing alarm and actions
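A billing alert is an ordinary CloudWatch alarm on the `EstimatedCharges` metric in the `AWS/Billing` namespace. The sketch below only builds the alarm parameters, so it can be read and run without an AWS account. The alarm name, the $500 threshold, and the SNS topic ARN are placeholders.

```python
def billing_alarm_params(threshold_usd, sns_topic_arn, alarm_name="monthly-billing-alert"):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours; estimated charges update only a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # notified when the threshold is reached
    }


if __name__ == "__main__":
    # Placeholder ARN; replace with a real SNS topic before use.
    params = billing_alarm_params(500, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
    print(params)
    # With credentials configured, the alarm could then be created with:
    #   import boto3
    #   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```

Billing metrics are published in the US East (N. Virginia) region, which is why the commented-out client call pins `us-east-1`.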
Trusted Advisor: Enterprise Strength Monitoring/Optimization
• Monitors and recommends optimizations for:
– Cost
– Security
– Fault Tolerance
– Performance
• Available to customers with Business and Enterprise-level support
http://aws.amazon.com/premiumsupport/trustedadvisor/
Trusted Advisor: Cost Optimization Tips
Trusted Advisor: Performance Tips
Third-party services to optimize your AWS usage
Scale Opportunistically
Opportunity favors the prepared application
Time-to-Result Case 1: Value of result quickly diminishes
Example: Engineering simulation. Delay: loss of productivity, project slips
Time-to-Result Case 2: Result is valuable…until it’s not
Example: Weekend regression tests. Delay: minimal impact until 8:00AM Monday
Consider Spot Instances for greater savings and scale
• Spot in a nutshell
– Spot instances run when Your Bid ≥ Spot Price
– Spot instances = Spare EC2 instances
– Spot instances might be interrupted at any time
• Benefits
– Savings: Up to 90% off On-Demand
– Scale: Access up to 1,000s of EC2 instances
• To use Spot
– Decide on a bid price
– Launch via Console, API, Auto Scaling
– Monitor Bid Statuses via Console/API
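The rule "Spot instances run when Your Bid ≥ Spot Price" can be simulated against a price history to see how much up-time and cost a given bid would have produced. The sketch below is a toy model: the hourly prices are invented, and it assumes you are charged the Spot price (not your bid) for each hour the instance runs.

```python
def simulate_spot(bid, hourly_spot_prices):
    """Return (hours_run, total_cost) for one instance under a fixed bid."""
    hours_run = 0
    total_cost = 0.0
    for price in hourly_spot_prices:
        if bid >= price:
            hours_run += 1
            total_cost += price  # assumption: billed at the Spot price, not the bid
        # otherwise the instance is interrupted (or not started) for this hour
    return hours_run, total_cost


if __name__ == "__main__":
    prices = [0.05, 0.08, 0.12, 0.10]  # made-up Spot price history, $/hour
    hours, cost = simulate_spot(0.10, prices)
    print(f"Ran {hours} of {len(prices)} hours for ${cost:.2f}")
```

Trying several bids against the same history gives a feel for the trade-off between up-time and cost.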
What applications work on Spot?
• Good Spot applications are:
– Delayable: to balance SLA/cost
– Scalable: “embarrassingly parallel”
– Fault-tolerant: can be terminated without losing all work
– Portable across regions, AZs, instance types
• Examples:
– MapReduce (Hadoop, Amazon EMR)
– Scientific Computing (Monte Carlo simulations)
– Batch Processing (video transcoding)
– Financial Computing (high-frequency trading algorithm backtesting)
– and many others…
Lucky Oyster crawled 3.4B Web Pages, building a 400M entry index in around 14 hours for $100 (>85% savings)!
Use Auto Scaling to dynamically scale your app
• Auto Scaling auto-sizes your cluster based on preset triggers and schedules
• Integrates with CloudWatch metrics
• Use Auto Scaling to
– Improve customer experience, application performance
– Maximize CPU/IO/Memory utilization
– Optimize other metrics
Scale with Real-Time Demand
Auto-Scaling Example: Netflix
Follow the Money vs. Follow the Customer
• Optimize utilization
– Auto Scale on utilization metrics: CPU, memory, requests, connections, …
• Optimize price paid
– Scale with Spot instances when Spot prices are low
– e.g., Run batch processes off-peak (nights, weekends) when Spot prices are lower
Follow the Money vs. Follow the Customer
• Optimize customer experience with Auto Scaling
• Example 1: Scale resources to meet customer demand
– Video service Auto Scales instances to respond to customer web service requests
• Example 2: Scale resources to ensure fresh results
– A scientific paper search engine Auto Scales on queue depth (# of new docs to crawl)
– 10 instances steady state and up to 5,000+ to ensure minimum throughput time
• Example 3: Scale resources preemptively before large demand
– A TV show marketing site scales up before the show and back down after
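Example 2 (scaling on queue depth between 10 and 5,000+ instances) amounts to a capacity calculation like the one sketched below. Only the 10-instance floor and 5,000-instance ceiling come from the slide. The per-instance throughput and target drain time are hypothetical inputs, and a real deployment would feed the result to an Auto Scaling policy rather than print it.

```python
import math


def desired_instances(queue_depth, docs_per_instance_hour, target_hours,
                      min_instances=10, max_instances=5000):
    """Instances needed to drain the queue within target_hours, clamped to bounds."""
    if docs_per_instance_hour <= 0 or target_hours <= 0:
        raise ValueError("throughput and target time must be positive")
    needed = math.ceil(queue_depth / (docs_per_instance_hour * target_hours))
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    # Assumed throughput: each instance crawls 1,000 documents per hour.
    for depth in (0, 100_000, 1_000_000_000):
        print(depth, "docs queued ->", desired_instances(depth, 1000, 2), "instances")
```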
Cost-Saving Examples
• Achieve potentially large savings by profiling your application and paying only for what you need
Base Case Savings Examples
You run 10 m3.2xlarge’s On-Demand 24x7: 10 instances X $1.00/inst-hour X 24 hours/day X ~30.5 days/month = $7,320/month
• If you need to run 100% of the time, indefinitely:
– 10x 3-yr Heavy RIs @ 100% Utilization
– = $2,731/month (63% savings)
• If you can layer RIs and On Demand to meet demand:
– 4x 3-yr Heavy RIs @ 100% Utilization
– 4x 3-yr Light RIs @ 15% Utilization
– 2x On-Demand @ 5% Utilization
– = $1,843/month (75% savings)
• If you Auto Scale from 2 to 10 instances around primetime TV (6-11pm, Mon-Fri):
– 2x 3-yr Heavy RIs @ 100% Utilization
– 8x 3-yr Light RIs @ 15% Utilization
– = $1,683/month (77% savings)
• If you can use 40x Spot Instances at 25% up-time:
– = $840/month (89% savings)
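The base-case arithmetic and the savings percentages on this slide can be checked with a few lines. The sketch below recomputes the On-Demand baseline from the slide's inputs and derives each savings figure from the monthly totals shown. The scenario labels are shorthand, and the RI and Spot totals are taken from the slide as given rather than re-derived, since the underlying RI and Spot rates are not stated.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30.5  # the slide's approximation


def on_demand_monthly(instances, hourly_rate):
    """Monthly cost of running `instances` On-Demand around the clock."""
    return instances * hourly_rate * HOURS_PER_DAY * DAYS_PER_MONTH


def savings_percent(baseline, scenario_cost):
    """Savings relative to the baseline, rounded to a whole percent."""
    return round(100 * (1 - scenario_cost / baseline))


# Monthly totals as quoted on the slide.
SCENARIOS = {
    "All Heavy RIs": 2731,
    "Layered RIs + On-Demand": 1843,
    "Auto Scaled RIs": 1683,
    "Spot at 25% up-time": 840,
}

if __name__ == "__main__":
    baseline = on_demand_monthly(10, 1.00)
    print(f"On-Demand baseline: ${baseline:,.0f}/month")
    for name, cost in SCENARIOS.items():
        print(f"{name}: ${cost:,}/month ({savings_percent(baseline, cost)}% savings)")
```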
Conclusion (Part I):
Fit the cloud to your product and business model
• Use Only What You Need (and pay only for what you use!)
• Measure and Manage
• Scale Opportunistically
An example putting it all together: Saving on Batch Processing
http://aws.amazon.com/architecture/
1. Pay Only for What You Use: Right-size your cloud resources
2. Monitor and Manage your system with CloudWatch, Billing Alerts, Trusted Advisor
3. Scale Opportunistically: Auto Scale worker nodes based on size of input queue
Conclusion (Part II):
Use the cloud to create new products & business models
On-Premises
• Failure is expensive
• Experiment infrequently
• Less Innovation
Optimized Cloud
• Failure is inexpensive
• Experiment early and often
• More Innovation
THANK YOU
APPENDICES
Other simple optimization tips
• Don’t forget to…
– Disassociate unused EIPs
– Delete unassociated Amazon EBS volumes
– Delete older Amazon EBS snapshots
– Leverage Amazon S3 Object Expiration
– Defer batch activity (e.g., Hadoop) to periods when your RIs are regularly underutilized
(For Enterprise-level support, Trusted Advisor can help with some of these.)
• Netflix’s Janitor Monkey automates clean-up
– Reduces “unintentional” resource usage
– Reduces cost and clutter
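Finding unattached EBS volumes, one of the clean-up tips above, is a matter of filtering the EC2 `describe_volumes` response for volumes in the `available` state. The sketch below works on a response-shaped dictionary so it runs offline. The sample volume IDs are invented, and deleting anything is deliberately left out.

```python
def unattached_volume_ids(describe_volumes_response):
    """Return IDs of volumes that are not attached to any instance."""
    return [
        volume["VolumeId"]
        for volume in describe_volumes_response.get("Volumes", [])
        if volume.get("State") == "available" and not volume.get("Attachments")
    ]


if __name__ == "__main__":
    # Hand-written sample in the shape returned by EC2's describe_volumes.
    sample = {
        "Volumes": [
            {"VolumeId": "vol-aaa111", "State": "in-use",
             "Attachments": [{"InstanceId": "i-123"}]},
            {"VolumeId": "vol-bbb222", "State": "available", "Attachments": []},
        ]
    }
    print(unattached_volume_ids(sample))
    # With credentials configured, a live response could come from:
    #   import boto3
    #   response = boto3.client("ec2").describe_volumes()
```

Review the list, and snapshot anything that might still be needed, before deleting volumes.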
Other Spot Instance Use Cases
• Batch Processing: Generic batch processing (scale out computing)
• Hadoop: MapReduce processing (e.g., Search, Big Data)
• Scientific Computing: Scientific trials, simulations, analysis
• Video/Image Processing: Encoding, transcoding, rendering
• Testing: Continuous testing, load testing websites, etc.
• Web/Data Crawling: Analyzing data and processing it
• Financial: Hedge fund analytics, energy trading, etc.
• HPC/HTC: Embarrassingly parallel jobs
• Cheap Compute: Backend servers for Facebook games, MineCraft
Application Usage Patterns
• Steady State
– Example: Corporate Website
• Spiky Predictable
– Example: Marketing Promotions Website
• Uncertain unpredictable
– Example: Social game or Mobile Website
Amazon EMR (Hadoop): Run Task Nodes on Spot
[Diagram: Amazon Elastic MapReduce Hadoop Cluster. Labels: Data Source (upload large datasets or log files directly); Amazon S3 (Input Data, Output Data, Code/Scripts); Amazon Elastic MapReduce Service (runs multiple JobFlow Steps; HiveQL, Pig Latin, Cascading, Mapper, Reducer); cluster with Name Node, Core Nodes (HDFS), and Task Nodes; Metadata in Amazon SimpleDB; BI Apps querying over JDBC/ODBC with HiveQL or Pig Latin.]
Paying as you go on AWS lowers your Total Cost of Ownership
• By paying only for what you use, you can save on:
– Servers
– Storage
– Network
– Environment
– Administration
• Example: 82% TCO savings for Thomson Reuters
• Learn more: aws.amazon.com/economics
Example Spot Customers
Example Architecture 2: Web Application Hosting
http://aws.amazon.com/architecture/