AWS Webinar - Measuring Your Application Performance and Health

AWS 201 Measuring Your Application Performance and Health Markku Lepistö - Technology Evangelist @markkulepisto

description

AWS Webinar - Measuring and monitoring application performance and health

Transcript of AWS Webinar - Measuring Your Application Performance and Health

Page 1: AWS Webinar - Measuring Your Application Performance and Health

AWS 201

Measuring Your Application Performance and Health

Markku Lepistö - Technology Evangelist @markkulepisto

Page 2: AWS Webinar - Measuring Your Application Performance and Health

Housekeeping

• Presentation ~40 mins
• Post Questions Online
• Q&A at the end using the online chat
• Reminder – Fill in the survey!

Page 3: AWS Webinar - Measuring Your Application Performance and Health

Why monitor?

Page 4: AWS Webinar - Measuring Your Application Performance and Health

Without Instrumentation You Are Flying Blind

Page 5: AWS Webinar - Measuring Your Application Performance and Health

Instrumentation Gives You

• Actionable insights into historical, current, and predicted system state
• Data-driven decisions

• Availability
• Performance
• Cost-optimization
• Release speed & quality
• …

Page 6: AWS Webinar - Measuring Your Application Performance and Health

What to monitor?

Page 7: AWS Webinar - Measuring Your Application Performance and Health

Business KPIs
• Transactions total
• Customer QoS
• Customer QoE
• Revenue
• Cost
• …

Operational KPIs
• Transaction – success & error rate, latency
• Throughput
• Load - system, service, node, component
• Health
• Availability
• …

KPI = Key Performance Indicator, i.e. a metric

Page 8: AWS Webinar - Measuring Your Application Performance and Health

What are we actually measuring?

System Inputs, State Changes and Outputs

delta

Page 9: AWS Webinar - Measuring Your Application Performance and Health

What causes system changes?

• Inputs (customer traffic)
• Code changes
• Manual operations (Ops -> Oops!)
• Automated operations (Complex Adaptive System)
• OS packages & patches
• Dependent services
• Underlying infrastructure

delta

Page 10: AWS Webinar - Measuring Your Application Performance and Health

When and where should we measure?

Page 11: AWS Webinar - Measuring Your Application Performance and Health

Everywhere - All the Time!

Page 12: AWS Webinar - Measuring Your Application Performance and Health

“Big Data is what happened when the cost of storing information became less than the cost of making the decision to throw it away”

George Dyson, Author of “The Digital Universe”

Page 13: AWS Webinar - Measuring Your Application Performance and Health

COLLECT | ANALYZE | DISPLAY | ACT

Page 14: AWS Webinar - Measuring Your Application Performance and Health

COLLECT | ANALYZE | DISPLAY | ACT

Page 15: AWS Webinar - Measuring Your Application Performance and Health

Top to Bottom: Technology Stack

End-to-End: Client – Server / Service

When and Where to Measure & Collect?

Page 16: AWS Webinar - Measuring Your Application Performance and Health

Top to Bottom: Technology Stack

End-to-End: Client – Server / Service

When and Where to Measure & Collect?

Page 17: AWS Webinar - Measuring Your Application Performance and Health

When to Measure? Throughout Application Lifecycle

plan | code | build | test | release | deploy | operate

Agile Development | Continuous Integration | Continuous Delivery | DevOps

Source: http://www.collab.net

Page 18: AWS Webinar - Measuring Your Application Performance and Health

When to Measure? Throughout Application Lifecycle

plan | code | build | test | release | deploy | operate

• Commits
• Lines changed
• Modules changed
• Issues resolved
• Features implemented

Page 19: AWS Webinar - Measuring Your Application Performance and Health

When to Measure? Throughout Application Lifecycle

plan | code | build | test | release | deploy | operate

• Successful builds
• Failed builds
• Build duration vs HW resources used
• Images (AMI) built

Page 20: AWS Webinar - Measuring Your Application Performance and Health

When to Measure? Throughout Application Lifecycle

plan | code | build | test | release | deploy | operate

• Integration test success/failure
• Performance test metrics
  - Throughput as a function of virtual HW used
• Stability test metrics
  - Memory leak?
  - Filesystem trends – fill/cleanup etc?
  - Degradation of any KPI over time?
• Security test metrics – PEN…

Page 21: AWS Webinar - Measuring Your Application Performance and Health

When to Measure? Throughout Application Lifecycle

plan | code | build | test | release | deploy | operate

• # of releases
• # of deploys
• # of rollbacks
• Operational KPIs
  - Stability, availability
  - Performance, security
  - …

Page 22: AWS Webinar - Measuring Your Application Performance and Health

When to Measure? Throughout Application Lifecycle

plan | code | build | test | release | deploy | operate

• # of bugs reported ++
• # of features requested ++
• Performance & Cost optimization
• A/B test results

Feedback Loop

Page 23: AWS Webinar - Measuring Your Application Performance and Health

Challenge: DevOps & Cloud Increase Rate of Change

Rare Releases – Static Servers “Waterfall”
• Long-lived servers
• Static roles
• Static IP addresses

Frequent Releases – Dynamic Instances “Agile, Lean, DevOps”
• New code, on bursts of new instances
• Instance role changes
• Dynamic, recycled IP addresses

[Charts: Change plotted over Time for each release model]

Page 24: AWS Webinar - Measuring Your Application Performance and Health

Top to Bottom: Technology Stack

End-to-End: Client – Server / Service

When and Where to Measure and Collect?

Page 25: AWS Webinar - Measuring Your Application Performance and Health

Where to Measure? End-to-End

Client | Transport Net/Services | Your App/Service | 3rd Party Services

AWS Services

Page 26: AWS Webinar - Measuring Your Application Performance and Health

Where to Measure? End-to-End

• Test Client Agents
• QoS, QoE KPIs

Client | Transport Net/Services | Your App/Service | 3rd Party Services

AWS Services

Page 27: AWS Webinar - Measuring Your Application Performance and Health

Where to Measure? End-to-End

• Tcpdump on Client and App Servers
• Wireshark for Transport QoS KPIs

Client | Transport Net/Services | Your App/Service | 3rd Party Services

AWS Services

Page 28: AWS Webinar - Measuring Your Application Performance and Health

Client/Server QoS with Transport Layer Metrics

Client

Server

Page 29: AWS Webinar - Measuring Your Application Performance and Health

Where to Measure? End-to-End

• AWS Service Health Dashboard
• AWS CloudTrail
• Amazon CloudWatch

Client | Transport Net/Services | Your App/Service | 3rd Party Services

AWS Services

Page 30: AWS Webinar - Measuring Your Application Performance and Health

Monitoring AWS - Service Health Dashboard

Page 31: AWS Webinar - Measuring Your Application Performance and Health

Monitoring AWS Account Activities - AWS CloudTrail

You are making API calls...

On a growing set of services around the world…

CloudTrail is continuously recording API calls…

And delivering log files to you
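
As a rough illustration of what can be done with those delivered log files, the sketch below counts API calls per service and action. It assumes the usual CloudTrail layout (a JSON document with a top-level "Records" list whose entries carry fields such as eventSource and eventName); the sample records and the function name count_api_calls are made up for this example, and real files arrive gzip-compressed in S3.

```python
import json
from collections import Counter

# Hypothetical sample shaped like one CloudTrail log file.
SAMPLE_LOG = json.dumps({
    "Records": [
        {"eventTime": "2014-03-06T21:22:54Z", "eventSource": "ec2.amazonaws.com",
         "eventName": "StartInstances", "awsRegion": "us-east-1"},
        {"eventTime": "2014-03-06T21:25:10Z", "eventSource": "ec2.amazonaws.com",
         "eventName": "StartInstances", "awsRegion": "us-east-1"},
        {"eventTime": "2014-03-06T21:30:02Z", "eventSource": "iam.amazonaws.com",
         "eventName": "CreateUser", "awsRegion": "us-east-1"},
    ]
})


def count_api_calls(log_text):
    """Count API calls per (service, action) in one CloudTrail log document."""
    records = json.loads(log_text).get("Records", [])
    return Counter((r["eventSource"], r["eventName"]) for r in records)


calls = count_api_calls(SAMPLE_LOG)
for (source, name), n in calls.most_common():
    print(f"{source:25} {name:20} {n}")
```

The same counting could be run over many files to see which services and actions dominate account activity.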

Page 32: AWS Webinar - Measuring Your Application Performance and Health

Partner CloudTrail Solutions

Page 33: AWS Webinar - Measuring Your Application Performance and Health

Monitoring AWS Resources – Amazon CloudWatch

Page 34: AWS Webinar - Measuring Your Application Performance and Health
Page 35: AWS Webinar - Measuring Your Application Performance and Health

AWS Service Measurements

• Auto Scaling groups
• AWS estimated charges
• Amazon DynamoDB tables
• Amazon EBS volumes
• Amazon EC2 instances
• Amazon ElastiCache caches
• Elastic Load Balancing
• Amazon Elastic MapReduce jobs
• Amazon RDS databases
• Amazon SNS notifications
• Amazon SQS queues
• AWS Storage Gateway
• ++

Page 36: AWS Webinar - Measuring Your Application Performance and Health

CloudWatch Alarms

Page 37: AWS Webinar - Measuring Your Application Performance and Health

EC2: Tell me if my instance needs attention
DynamoDB: Help me balance cost and performance
Billing: Tell me when my bill is getting too high
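
To make the alarm idea concrete, here is a simplified model of how a threshold alarm might be evaluated. It is a sketch, not the CloudWatch API: the function name alarm_state and the sample CPU values are assumptions, and real alarms offer more options (statistics, comparison operators, missing-data handling). Only the three state names mirror CloudWatch.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Evaluate a simple 'greater than threshold' alarm.

    Returns "ALARM" when each of the last `evaluation_periods` datapoints
    exceeds the threshold, "INSUFFICIENT_DATA" when there are too few
    datapoints to decide, and "OK" otherwise.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(v > threshold for v in recent) else "OK"


# Hypothetical CPU utilization samples (percent), one per period.
cpu = [35, 42, 91, 95, 97]

print(alarm_state(cpu[:2], threshold=90, evaluation_periods=3))
print(alarm_state(cpu[:4], threshold=90, evaluation_periods=3))
print(alarm_state(cpu, threshold=90, evaluation_periods=3))
```

Requiring several consecutive breaching periods is one way to avoid alerting on a single short spike.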

Page 38: AWS Webinar - Measuring Your Application Performance and Health
Page 39: AWS Webinar - Measuring Your Application Performance and Health

Custom Metrics Example – Instance Memory
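
The slide content for this example did not survive extraction, so the following is only a sketch of one possible approach, not the presenter's script. It parses Linux /proc/meminfo-style text, computes memory utilization, and builds a datum in the shape accepted by the CloudWatch PutMetricData call. The metric name, instance ID, and helper names are hypothetical, and an embedded sample is used so the code runs anywhere.

```python
# Hypothetical excerpt in the format of Linux /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:        4000000 kB
MemFree:          500000 kB
Buffers:          200000 kB
Cached:          1300000 kB
"""


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def memory_utilization_percent(meminfo):
    """Treat free + buffers + cached as available memory."""
    available = (meminfo["MemFree"] + meminfo.get("Buffers", 0)
                 + meminfo.get("Cached", 0))
    used = meminfo["MemTotal"] - available
    return 100.0 * used / meminfo["MemTotal"]


def build_metric_datum(percent, instance_id):
    """Build one entry for the MetricData list of PutMetricData."""
    return {
        "MetricName": "MemoryUtilization",  # hypothetical metric name
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Value": percent,
        "Unit": "Percent",
    }


percent = memory_utilization_percent(parse_meminfo(SAMPLE_MEMINFO))
datum = build_metric_datum(percent, "i-12345678")  # hypothetical instance ID
print(datum)
```

With boto3 installed and credentials configured, the datum could be sent with the CloudWatch client's put_metric_data(Namespace=..., MetricData=[datum]); the namespace is yours to choose.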

Page 40: AWS Webinar - Measuring Your Application Performance and Health
Page 41: AWS Webinar - Measuring Your Application Performance and Health
Page 42: AWS Webinar - Measuring Your Application Performance and Health
Page 43: AWS Webinar - Measuring Your Application Performance and Health
Page 44: AWS Webinar - Measuring Your Application Performance and Health
Page 45: AWS Webinar - Measuring Your Application Performance and Health
Page 46: AWS Webinar - Measuring Your Application Performance and Health
Page 47: AWS Webinar - Measuring Your Application Performance and Health
Page 48: AWS Webinar - Measuring Your Application Performance and Health
Page 49: AWS Webinar - Measuring Your Application Performance and Health
Page 50: AWS Webinar - Measuring Your Application Performance and Health
Page 51: AWS Webinar - Measuring Your Application Performance and Health
Page 52: AWS Webinar - Measuring Your Application Performance and Health
Page 53: AWS Webinar - Measuring Your Application Performance and Health
Page 54: AWS Webinar - Measuring Your Application Performance and Health
Page 55: AWS Webinar - Measuring Your Application Performance and Health
Page 56: AWS Webinar - Measuring Your Application Performance and Health
Page 57: AWS Webinar - Measuring Your Application Performance and Health
Page 58: AWS Webinar - Measuring Your Application Performance and Health
Page 59: AWS Webinar - Measuring Your Application Performance and Health
Page 60: AWS Webinar - Measuring Your Application Performance and Health
Page 61: AWS Webinar - Measuring Your Application Performance and Health
Page 62: AWS Webinar - Measuring Your Application Performance and Health
Page 63: AWS Webinar - Measuring Your Application Performance and Health
Page 64: AWS Webinar - Measuring Your Application Performance and Health
Page 65: AWS Webinar - Measuring Your Application Performance and Health

Where to Measure? End-to-End

• Request/Response success/fail
• Response latency

Client | Transport Net/Services | Your App/Service | 3rd Party Services

AWS Services

Page 66: AWS Webinar - Measuring Your Application Performance and Health

Measuring External, Dependent Services
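
One way to capture request success/failure and response latency for a dependency is to wrap each call, as sketched below. The wrapper, the summary function, and the flaky_service stand-in are all hypothetical names for illustration; a real client call would take the place of flaky_service.

```python
import time


def measure_call(fn, *args, **kwargs):
    """Call fn and return (ok, latency_seconds, result_or_exception)."""
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        ok = True
    except Exception as exc:  # record the failure instead of propagating it
        result = exc
        ok = False
    latency = time.perf_counter() - start
    return ok, latency, result


def summarize(samples):
    """Summarize a list of (ok, latency_seconds) tuples."""
    total = len(samples)
    successes = sum(1 for ok, _ in samples if ok)
    latencies = sorted(latency for _, latency in samples)
    return {
        "requests": total,
        "success_rate": successes / total,
        "max_latency": latencies[-1],
    }


def flaky_service(n):
    """Stand-in for an external dependency: every third call fails."""
    if n % 3 == 0:
        raise RuntimeError("upstream error")
    return n * 2


samples = []
for n in range(1, 7):
    ok, latency, _ = measure_call(flaky_service, n)
    samples.append((ok, latency))

report = summarize(samples)
print(report)
```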

Page 67: AWS Webinar - Measuring Your Application Performance and Health

Where to Measure? End-to-End

Client | Transport Net/Services | Your App/Service | 3rd Party Services

AWS Services

Page 68: AWS Webinar - Measuring Your Application Performance and Health

Top to Bottom: Technology Stack

End-to-End: Client – Server / Service

When and Where to Measure and Collect?

Page 69: AWS Webinar - Measuring Your Application Performance and Health

Measure the Entire Stack, Top to Bottom

• User Application
• Application Server
• Web / DB Server
• Language Interpreter / JVM
• Guest Operating System & Services
• EC2 Instance

Page 70: AWS Webinar - Measuring Your Application Performance and Health

Application Internal Metrics

Page 71: AWS Webinar - Measuring Your Application Performance and Health

COLLECT | ANALYZE | DISPLAY | ACT

Page 72: AWS Webinar - Measuring Your Application Performance and Health

Leverage AWS Big Data Services

STORE | ANALYZE

• Glacier
• S3
• EC2
• Redshift
• DynamoDB
• EMR
• Data Pipeline
• Kinesis

Page 73: AWS Webinar - Measuring Your Application Performance and Health

COLLECT | ANALYZE | DISPLAY | ACT

Page 74: AWS Webinar - Measuring Your Application Performance and Health

METRICS @ETSY

Page 75: AWS Webinar - Measuring Your Application Performance and Health

Visualization - Graph

Values over Time at Sampling Rate

Page 76: AWS Webinar - Measuring Your Application Performance and Health

Sampling Rate

How often should I measure?

Depends on what you measure - Depends on its rate of change (frequency)

Page 77: AWS Webinar - Measuring Your Application Performance and Health

Nyquist Frequency

Original signal = Red
Measured signal = Blue

You should measure at least twice as often as your value changes
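
A small sketch of why the sampling rate matters: the hypothetical metric below oscillates once every 10 seconds. Sampled every 2 seconds the swing is visible; sampled every 10 seconds (once per cycle, well below the twice-per-cycle guideline) every reading lands on the same point of the wave and the metric looks flat. The metric function and its numbers are invented for illustration.

```python
import math


def metric(t, period=10.0):
    """Hypothetical metric oscillating once every `period` seconds."""
    return 50 + 40 * math.sin(2 * math.pi * t / period + math.pi / 4)


def sample(interval, duration=60):
    """Read the metric every `interval` seconds, rounded to one decimal."""
    return [round(metric(t), 1) for t in range(0, duration, interval)]


fast = sample(2)    # five samples per cycle
slow = sample(10)   # one sample per cycle: too slow

print("2 s sampling :", fast[:5])
print("10 s sampling:", slow)
print("observed swing at 2 s :", max(fast) - min(fast))
print("observed swing at 10 s:", max(slow) - min(slow))
```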

Page 78: AWS Webinar - Measuring Your Application Performance and Health

System Measurements == Signal
We can do Digital Signal Processing

Linear Regression – trendline predicts filesystem running out of inodes (cannot create files)
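
The trendline idea can be sketched with an ordinary least-squares fit: fit a straight line to inode usage over time, then extrapolate to the point where it reaches the filesystem's inode limit. The sample readings, the limit, and the function names are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def time_until_limit(xs, ys, limit):
    """Extrapolate the trend; return time remaining after the last sample.

    Returns None when usage is flat or shrinking (no exhaustion predicted).
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    return (limit - intercept) / slope - xs[-1]


# Hypothetical hourly readings of inodes in use.
hours = [0, 1, 2, 3, 4]
inodes_used = [600_000, 610_000, 620_000, 630_000, 640_000]
INODE_LIMIT = 1_000_000

remaining = time_until_limit(hours, inodes_used, INODE_LIMIT)
print(f"Predicted hours until inodes run out: {remaining:.1f}")
```

Real usage is rarely this linear, so such a prediction is best treated as an early warning rather than a precise deadline.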

Page 79: AWS Webinar - Measuring Your Application Performance and Health

System Measurements == Signal
We can do Digital Signal Processing

Linear regression & Fast Fourier Transform (FFT) for patterns, anomalies and future predictions
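
As a sketch of pattern detection in the frequency domain, the code below finds the dominant cycle in a series of hourly request counts. Note the substitution: it uses a plain discrete Fourier transform written in pure Python, which is slower than an FFT but yields the same spectrum; a library FFT would normally be used for long series. The sample data and function names are made up.

```python
import cmath
import math


def dft_magnitudes(values):
    """Magnitudes of the DFT bins 0..n/2 of the mean-removed series."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        magnitudes.append(abs(coeff))
    return magnitudes


def dominant_period(values):
    """Return the period (in samples) of the strongest frequency bin."""
    magnitudes = dft_magnitudes(values)
    k = max(range(1, len(magnitudes)), key=lambda i: magnitudes[i])
    return len(values) / k


# Hypothetical request rate: 72 hourly samples with a daily cycle.
hourly_requests = [1000 + 300 * math.sin(2 * math.pi * h / 24)
                   for h in range(72)]

period = dominant_period(hourly_requests)
print(f"Dominant cycle: every {period:.0f} samples")
```

Once the expected cycle is known, readings that deviate strongly from it are candidates for anomalies.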

Page 80: AWS Webinar - Measuring Your Application Performance and Health

Visualization – Scatter Plot

Page 81: AWS Webinar - Measuring Your Application Performance and Health

Visualization – Box Plot
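
The numbers behind a box plot can be computed directly. This sketch derives the quartiles, whiskers, and outliers for a set of latency samples, using the common 1.5 x IQR rule for outliers (an assumption; the slide does not say which rule its plot uses). The sample values and function name are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Quartiles, whisker ends, and outliers using the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical response latencies in milliseconds, with one slow request.
latencies_ms = [12, 14, 15, 15, 16, 17, 18, 19, 21, 250]

stats = box_plot_stats(latencies_ms)
print(stats)
```

An average of these samples would be dragged up by the single 250 ms request, while the median and quartiles stay close to typical behavior.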

Page 82: AWS Webinar - Measuring Your Application Performance and Health

Visualization – Normal Curve & Histogram

Including outliers & ends of distribution

opsly.com

Page 83: AWS Webinar - Measuring Your Application Performance and Health

COLLECT | ANALYZE | DISPLAY | ACT

Page 84: AWS Webinar - Measuring Your Application Performance and Health

Manual / Human Actions - OODA Loop

Page 85: AWS Webinar - Measuring Your Application Performance and Health

Automated Human Actions - Amazon CloudWatch, Amazon SNS & PagerDuty

Page 86: AWS Webinar - Measuring Your Application Performance and Health

Automated Actions – AWS Auto Scaling

Automatic resizing of compute clusters based on measurements, thresholds and actions

[Diagram: Amazon CloudWatch - Trigger auto-scaling policy]

Feature | Details
Control | Define minimum and maximum instance pool sizes and when scaling and cool down occurs.
Integrated to Amazon CloudWatch | Use metrics gathered by CloudWatch to drive scaling.
Instance types | Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.

as-create-auto-scaling-group MyGroup --launch-configuration MyConfig --availability-zones us-east-1a --min-size 4 --max-size 200
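
The interplay of minimum size, maximum size, and cool down can be illustrated with a toy model. This is not the Auto Scaling API: the class, the thresholds, and the adjustment sizes are assumptions chosen for the example; only the min and max sizes (4 and 200) are taken from the command on this slide.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Toy model of a scaling group with size limits and a cooldown."""
    min_size: int
    max_size: int
    cooldown_seconds: int
    desired: int
    last_scaled_at: float = float("-inf")

    def apply(self, adjustment, now):
        """Apply an adjustment unless cooling down; True if size changed."""
        if now - self.last_scaled_at < self.cooldown_seconds:
            return False
        new_desired = max(self.min_size,
                          min(self.max_size, self.desired + adjustment))
        if new_desired == self.desired:
            return False
        self.desired = new_desired
        self.last_scaled_at = now
        return True


def decide(cpu_percent, scale_out_above=70, scale_in_below=30):
    """Map a CPU measurement to a capacity adjustment (hypothetical policy)."""
    if cpu_percent > scale_out_above:
        return 2
    if cpu_percent < scale_in_below:
        return -1
    return 0


group = ScalingGroup(min_size=4, max_size=200, cooldown_seconds=300, desired=4)
history = []
# (time in seconds, average CPU percent) measurements, made up for the demo.
for now, cpu in [(0, 85), (120, 90), (600, 10), (1000, 10), (1400, 10)]:
    group.apply(decide(cpu), now)
    history.append(group.desired)

print(history)
```

The second measurement arrives during the cooldown and is ignored, and the last one cannot shrink the group below its minimum size.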

Page 87: AWS Webinar - Measuring Your Application Performance and Health

Automated Actions - PID Controller

System Reaches Target State with Calculated Changes and Monitoring Feedback Loop

Proportional, Integral, Derivative
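
A minimal sketch of a PID control loop, assuming a toy process in which the controller output is simply added to the measured value at each step. The gains, the target, and the class name are illustrative choices, not tuned values from the talk.

```python
class PIDController:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.previous_error = None

    def update(self, measurement, dt):
        """Return the control output for the latest measurement."""
        error = self.setpoint - measurement
        self.integral += error * dt
        if self.previous_error is None:
            derivative = 0.0  # no history yet on the first step
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return (self.kp * error              # proportional: current error
                + self.ki * self.integral    # integral: accumulated error
                + self.kd * derivative)      # derivative: error trend


# Toy process: each step, the control output is added to the value.
controller = PIDController(kp=0.5, ki=0.1, kd=0.05, setpoint=100.0)
value = 0.0
for step in range(50):
    value += controller.update(value, dt=1.0)

print(f"value after 50 steps: {value:.3f}")
```

Each iteration is one pass through the feedback loop: measure, compute a correction, apply it, and measure again.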

Page 88: AWS Webinar - Measuring Your Application Performance and Health

Useful Tools and Services

Page 89: AWS Webinar - Measuring Your Application Performance and Health
Page 90: AWS Webinar - Measuring Your Application Performance and Health

Thank you

Markku Lepistö - Technology Evangelist @markkulepisto

Page 91: AWS Webinar - Measuring Your Application Performance and Health

Your Feedback is Important

Please complete the Survey!

• What’s good, what’s not
• What you want to see at these events
• What you want AWS to deliver for you

Page 92: AWS Webinar - Measuring Your Application Performance and Health

Q&A