Deep Dive on Amazon Elastic Block Store

Amazon EBS Deep Dive

Rob Alexander, AWS Principal Solutions Architect

August 2016

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Elastic (Flexible) Block Store

What is Amazon EBS?

[Diagram: an EC2 instance attached to an EBS volume, both in the same Availability Zone of an AWS region]

What is Amazon EBS?

[Diagram: an EC2 instance in an Availability Zone with an EBS boot volume and two EBS data volumes attached]

Separation of Compute and Storage

• EC2 instance: compute and memory

• EBS volume: persistent block storage

What is Amazon EBS?

[Diagram: an AWS region with two Availability Zones; an EBS volume and its replica within an Availability Zone]

What is Amazon EBS?

[Diagram: an EBS volume and its replica in an Availability Zone; an EBS snapshot of the volume is stored in Amazon S3]

How does an EBS snapshot work?

[Diagram sequence, reconstructed from the slide labels]

1. The EBS volume holds blocks A, B, C. Snapshot 1 stores A, B, C.

2. Block C changes to C1. Snapshot 2 stores only C1.

3. Blocks D and E are added and B is removed. Snapshot 3 stores D and E and records the removal of B.

4. With only snapshot 3 remaining, it still references A, C1, D, and E.

5. Create new EBS volume: a volume created from snapshot 3 holds A, C1, D, E.
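The incremental behaviour in the snapshot slides can be sketched as a toy model. This is only an illustration of the idea of storing changed blocks; the class name `SnapshotChain`, the integer block addresses, and the use of `None` as a deletion marker are assumptions made for the example, not a description of how EBS is implemented.

```python
from typing import Dict, List, Optional

Blocks = Dict[int, str]  # block address -> block contents


class SnapshotChain:
    """Toy model of incremental snapshots (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # Each entry holds only the blocks that changed since the previous
        # snapshot; None marks a block that was removed from the volume.
        self.deltas: List[Dict[int, Optional[str]]] = []

    def restore(self) -> Blocks:
        """Replay every delta in order to rebuild the volume contents."""
        volume: Blocks = {}
        for delta in self.deltas:
            for addr, data in delta.items():
                if data is None:
                    volume.pop(addr, None)
                else:
                    volume[addr] = data
        return volume

    def snapshot(self, volume: Blocks) -> Dict[int, Optional[str]]:
        """Store only what differs from the state captured so far."""
        previous = self.restore()
        delta: Dict[int, Optional[str]] = {
            addr: data for addr, data in volume.items()
            if previous.get(addr) != data
        }
        for addr in previous:
            if addr not in volume:
                delta[addr] = None
        self.deltas.append(delta)
        return delta


chain = SnapshotChain()
first = chain.snapshot({0: "A", 1: "B", 2: "C"})
second = chain.snapshot({0: "A", 1: "B", 2: "C1"})
third = chain.snapshot({0: "A", 2: "C1", 3: "D", 4: "E"})
restored = chain.restore()
```

The toy model does not cover snapshot deletion, where the blocks a later snapshot still needs have to be retained.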

What is Amazon EBS?

[Diagram: an EBS snapshot in Amazon S3 is used to create a new EBS volume, with its own replica, in a second Availability Zone of the same AWS region]

What is Amazon EBS?

[Diagram: an EBS snapshot is copied to a second AWS region, where it is used to create a new EBS volume with its own replica]

EBS Snapshot: Public Data Sets

Public data sets available as EBS snapshots:

• Genomic

• Census

• Global weather

• Transportation

https://aws.amazon.com/public-data-sets/

[Diagram: an EBS volume, with replica, in an Availability Zone of an AWS region]

What if an EBS volume fails?

[Diagram, two slides: an EC2 instance in an Availability Zone attached to an EBS volume that is backed by a replica; the second slide shows the instance still attached to an EBS volume with a replica]

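What about EC2 instance termination?

[Diagram: an EC2 instance and its EBS volume, with replica, in an Availability Zone of an AWS region]

• DeleteOnTermination = True: the volume is deleted when the instance terminates

• DeleteOnTermination = False: the volume persists after the instance terminates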

What about EC2 instance failure?

[Diagram, two slides: when the EC2 instance fails, the EBS volume and its replica remain in the Availability Zone, and the volume is attached to a new EC2 instance]

EBS Enables EC2 Auto Recovery

CloudWatch per-instance metrics:

• StatusCheckFailed_System

• StatusCheckFailed_Instance

Amazon CloudWatch Alarm Actions

• Instance status check fails? REBOOT

• System status check fails? RECOVER

Instance retains:

• Instance ID

• Instance metadata

• Private IP addresses

• Elastic IP addresses

• EBS volume attachments

• Limited to C3, C4, M3, M4, R3, and T2 instance types with EBS-only storage

Amazon EC2 Auto Recovery (CloudWatch console)

• Metric = StatusCheckFailed_System

• Choose 1-minute period and statistic minimum

• Set your failed check threshold

• Choose recover action

Amazon EC2 Auto Reboot (CloudWatch console)

• Metric = StatusCheckFailed_Instance

• Choose reboot action
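The console steps for auto recovery map onto the parameters of the CloudWatch PutMetricAlarm API. The sketch below only builds the parameter dictionary; the function name, the alarm naming scheme, and the default of two failed checks are assumptions for illustration.

```python
from typing import Any, Dict


def recovery_alarm_params(instance_id: str, region: str,
                          failed_checks: int = 2) -> Dict[str, Any]:
    """Build keyword arguments for a PutMetricAlarm call (illustrative)."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Minimum",             # "statistic minimum" from the slide
        "Period": 60,                       # 1-minute period
        "EvaluationPeriods": failed_checks, # failed check threshold (assumed default)
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


params = recovery_alarm_params("i-0123456789abcdef0", "us-west-2")
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. An auto reboot alarm would use `StatusCheckFailed_Instance` and the `ec2:reboot` action instead.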

EBS Volume Types

Solid state drive (SSD):

• General Purpose SSD (gp2)

• Provisioned IOPS SSD (io1)

Hard disk drive (HDD):

• Throughput Optimized HDD (st1)

• Cold HDD (sc1)

EBS Volume Types: I/O Provisioned

General Purpose SSD

gp2

Throughput: 160 MB/s

Latency: Single-digit ms

Capacity: 1 GB to 16 TB

Baseline: 3 IOPS per GB up to 10,000

Burst: 3,000 IOPS (for volumes up to 1 TB)

Great for boot volumes, low-latency applications, and bursty databases

Burst Bucket: General Purpose SSD (gp2)

• Max I/O credit per bucket is 5.4 million

• Always accumulating 3 I/O credits per GiB per second

• You can spend up to 3,000 I/O credits per second (3,000 IOPS)

• Baseline performance = 3 IOPS per GiB, with a minimum of 100 IOPS

Burst & Baseline: General Purpose SSD (gp2)

[Chart: IOPS (0 to 10,000) against volume size (0 to 16 TB). Baseline IOPS rise at 3 IOPS/GB and reach 10,000 IOPS at about 3,334 GB; smaller volumes are burstable to 3,000 IOPS]

Time to Deplete a Full GP2 Credit Bucket

[Chart: minutes of burst (0 to 700) against volume size in GB (1 to 950), with callouts at 43 min, 1 hour, and 10 hours]
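The gp2 figures above (3 IOPS per GiB, 100 IOPS minimum, 10,000 IOPS maximum, 3,000 IOPS burst, 5.4 million credits) are enough to estimate how long a full bucket lasts. This is a minimal sketch under the assumption that the volume is driven at a constant 3,000 IOPS while credits refill at the baseline rate; the function names are made up for the example.

```python
GP2_MAX_CREDITS = 5_400_000  # full burst bucket, in I/O credits
GP2_BURST_IOPS = 3_000
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 10_000


def gp2_baseline_iops(size_gib: float) -> float:
    """3 IOPS per GiB, clamped to the 100..10,000 range from the slides."""
    return min(max(3 * size_gib, GP2_MIN_IOPS), GP2_MAX_IOPS)


def gp2_burst_minutes(size_gib: float) -> float:
    """Minutes a full bucket lasts at a sustained 3,000 IOPS."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return float("inf")  # baseline already meets the burst rate
    # Credits drain at the burst rate while refilling at the baseline rate.
    return GP2_MAX_CREDITS / (GP2_BURST_IOPS - baseline) / 60


for size in (300, 500, 950):
    print(size, "GiB:", round(gp2_burst_minutes(size)), "minutes")
```

Under these assumptions, volumes of 300, 500, and 950 GiB give roughly 43 minutes, 1 hour, and 10 hours, which lines up with the callouts on the chart.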

EBS Volume Types: I/O Provisioned

Provisioned IOPS SSD

io1

Baseline: 100 to 20,000 IOPS

Throughput: 320 MB/s

Latency: Single-digit ms

Capacity: 4 GB to 16 TB

Ideal for critical applications and databases with sustained IOPS

Scaling Provisioned IOPS SSD (io1)

[Chart: available provisioned IOPS (0 to 20,000) against volume size (0 to 16 TB). Maximum provisioned IOPS follow a maximum IOPS:GB ratio of 30:1 and reach 20,000 IOPS at about 667 GB]
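The 30:1 ratio and the 100 to 20,000 IOPS range can be turned into two small helpers. This is a sketch using only the limits quoted on the slides (which reflect 2016 values); the function names are illustrative.

```python
import math

IO1_MIN_IOPS = 100
IO1_MAX_IOPS = 20_000
IO1_IOPS_PER_GB = 30   # maximum IOPS:GB ratio from the slide
IO1_MIN_SIZE_GB = 4


def io1_max_provisioned_iops(size_gb: float) -> int:
    """Largest IOPS value that can be provisioned for a given size."""
    return int(min(IO1_IOPS_PER_GB * size_gb, IO1_MAX_IOPS))


def io1_min_size_gb(target_iops: int) -> int:
    """Smallest volume size that allows the requested IOPS."""
    if not IO1_MIN_IOPS <= target_iops <= IO1_MAX_IOPS:
        raise ValueError("io1 supports 100 to 20,000 provisioned IOPS")
    return max(IO1_MIN_SIZE_GB, math.ceil(target_iops / IO1_IOPS_PER_GB))
```

For example, `io1_min_size_gb(20_000)` gives 667, the point marked on the chart.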

EBS Volume Types: Throughput Provisioned

Throughput Optimized HDD

st1

Baseline: 40 MB/s per TB up to 500 MB/s

Capacity: 500 GB to 16 TB

Burst: 250 MB/s per TB up to 500 MB/s

Ideal for large block, high throughput sequential workloads

Burst Bucket: Throughput Optimized HDD (st1)

• Max I/O bucket credit is 1 TB of credit per TB in volume

• Always accumulating 40 MB/s per TB

• You can spend up to 250 MB/s per TB

• Baseline performance = 40 MB/s per TB

Burst Bucket: Example 8 TB ST1 Volume

• Up to 8 TB in I/O credit

• Always accumulating 320 MB/s

• You can spend up to 500 MB/s

• Baseline performance = 320 MB/s

Throughput Optimized HDD – Burst and Base

[Chart: burst and base throughput in MB/s (0 to 600) against volume size in TB (0.5 to 16) for st1; the 320 MB/s level is marked]

EBS Volume Types: Throughput Provisioned

Cold HDD

sc1

Baseline: 12 MB/s per TB up to 192 MB/s

Capacity: 500 GB to 16 TB

Burst: 80 MB/s per TB up to 250 MB/s

Ideal for sequential throughput workloads such as logging and backup

Burst Bucket: Cold HDD (sc1)

• Max I/O bucket credit is 1 TB of credit per TB in volume

• Always accumulating 12 MB/s per TB

• You can spend up to 80 MB/s per TB

• Baseline performance = 12 MB/s per TB

Cold HDD – Burst and Base

[Chart: burst and base throughput in MB/s (0 to 300) against volume size in TB (0.5 to 16) for sc1; the 192 MB/s level is marked]
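The st1 and sc1 baseline and burst figures follow the same pattern: a per-TB rate with a cap. The sketch below encodes the numbers from the slides. The `HddProfile` type and function names are illustrative, and the burst-duration estimate assumes 1 TB of credit equals 1,000,000 MB and a constant load at the burst rate.

```python
from typing import Dict, NamedTuple, Tuple


class HddProfile(NamedTuple):
    base_per_tb: float   # MB/s per TB
    base_cap: float      # MB/s
    burst_per_tb: float  # MB/s per TB
    burst_cap: float     # MB/s


PROFILES: Dict[str, HddProfile] = {
    "st1": HddProfile(40, 500, 250, 500),
    "sc1": HddProfile(12, 192, 80, 250),
}


def hdd_throughput(volume_type: str, size_tb: float) -> Tuple[float, float]:
    """Return (baseline, burst) throughput in MB/s for an st1 or sc1 volume."""
    p = PROFILES[volume_type]
    base = min(p.base_per_tb * size_tb, p.base_cap)
    burst = min(p.burst_per_tb * size_tb, p.burst_cap)
    return base, burst


def full_bucket_burst_hours(volume_type: str, size_tb: float) -> float:
    """Hours a full bucket lasts at the burst rate (assumes 1 TB = 10^6 MB)."""
    base, burst = hdd_throughput(volume_type, size_tb)
    if burst <= base:
        return float("inf")  # nothing to drain: baseline equals burst
    credits_mb = size_tb * 1_000_000  # 1 TB of credit per TB of volume
    return credits_mb / (burst - base) / 3600
```

For the 8 TB st1 example, `hdd_throughput("st1", 8)` returns 320 MB/s base and 500 MB/s burst, and the estimate puts a full bucket at a little over 12 hours of sustained burst.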

I/O Provisioned Volumes

• gp2: $0.10 per GB

• io1: $0.125 per GB, plus $0.065 per PIOPS

Throughput Provisioned Volumes

• st1: $0.045 per GB

• sc1: $0.025 per GB

* All prices are per month and from the us-west-2 region as of April 2016
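As a worked example of the price list, the following sketch estimates a monthly volume cost. It uses only the April 2016 us-west-2 prices quoted on the slide, so the numbers are historical; the function name is illustrative.

```python
# Prices per GB-month from the slide (us-west-2, April 2016).
PRICE_PER_GB = {"gp2": 0.10, "io1": 0.125, "st1": 0.045, "sc1": 0.025}
IO1_PRICE_PER_PIOPS = 0.065  # per provisioned IOPS per month


def monthly_cost(volume_type: str, size_gb: float,
                 provisioned_iops: int = 0) -> float:
    """Estimated monthly cost in USD for a single volume."""
    cost = PRICE_PER_GB[volume_type] * size_gb
    if volume_type == "io1":
        cost += IO1_PRICE_PER_PIOPS * provisioned_iops
    return round(cost, 2)
```

A 1,000 GB io1 volume with 16,000 provisioned IOPS comes to 125 + 1,040 = 1,165 USD per month at these prices, against 100 USD for a 1,000 GB gp2 volume.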

EBS Volume Types: Magnetic

Baseline: 100 IOPS best effort

Capacity: 1 GB to 1 TB

Burst: 40 MB/s to 90 MB/s best effort

$0.05 per GB

$0.05 per 1 million I/O requests

EBS Best Practices

Performance

Performance: Counting IOPS

[Diagram: the application on the instance issues 4 * 16 K random I/Os to blocks 77, 125, 11, and 400. On io1 and gp2, an I/O of up to 256 KiB counts as one, so IOPS = 4]

Performance: Counting IOPS

[Diagram: the application on the instance issues 4 * 16 K sequential I/Os to blocks 11, 12, 13, and 14. On io1 and gp2 they are merged into a single 64 KiB I/O, so IOPS = 1]
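The two counting slides can be reproduced with a small merge rule: adjacent blocks combine into one I/O until the 256 KiB limit is reached. This is a simplified sketch that assumes equally sized requests processed in submission order; the function name is illustrative and the real merging behaviour is more involved.

```python
from typing import Iterable, Optional

MAX_IO_KIB = 256  # largest I/O counted as a single operation on gp2/io1


def count_ssd_ios(blocks: Iterable[int], block_kib: int = 16) -> int:
    """Count I/Os after merging contiguous blocks up to 256 KiB."""
    ios = 0
    previous: Optional[int] = None
    merged_kib = 0
    for block in blocks:
        contiguous = previous is not None and block == previous + 1
        if contiguous and merged_kib + block_kib <= MAX_IO_KIB:
            merged_kib += block_kib  # extend the current merged I/O
        else:
            ios += 1                 # start a new I/O
            merged_kib = block_kib
        previous = block
    return ios


random_ios = count_ssd_ios([77, 125, 11, 400])
sequential_ios = count_ssd_ios([11, 12, 13, 14])
```

With the block numbers from the slides, the random pattern counts as 4 I/Os and the sequential pattern as 1.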

Performance: Counting Throughput

[Diagram: the application on the instance issues 4 * 1 MiB sequential I/Os (1, 2, 3, 4). On st1 and sc1, an I/O of up to 1 MiB counts as one, so MB/s = 4]

Performance: Counting Throughput

[Diagram: applications on the instance issue 2 * 1 MB sequential I/Os (blocks 1 and 2) and 2 * 16 K random I/Os (blocks 87 and 621). On st1 and sc1, MB/s = 4]
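The mixed workload above consumes as much burst credit as four full 1 MB I/Os because, in this simplified model, each non-contiguous request is rounded up to a whole 1 MiB unit. The sketch below illustrates that accounting; the function name and the (offset, size) request format are assumptions for the example.

```python
import math
from typing import Iterable, Optional, Tuple

CREDIT_KIB = 1024  # st1/sc1 credits are spent in 1 MiB units


def hdd_credits_mib(requests: Iterable[Tuple[int, int]]) -> int:
    """MiB of credit used by (offset_kib, size_kib) requests, in order.

    Contiguous requests are merged first; each merged run is then
    rounded up to whole 1 MiB units.
    """
    credits = 0
    run_kib = 0
    next_offset: Optional[int] = None
    for offset, size in requests:
        if next_offset is not None and offset == next_offset:
            run_kib += size                          # extends the current run
        else:
            credits += math.ceil(run_kib / CREDIT_KIB)
            run_kib = size                           # start a new run
        next_offset = offset + size
    return credits + math.ceil(run_kib / CREDIT_KIB)


sequential = [(0, 1024), (1024, 1024), (2048, 1024), (3072, 1024)]
mixed = [(0, 1024), (1024, 1024), (87 * 1024, 16), (621 * 1024, 16)]
```

Both patterns cost 4 MiB of credit here, although the mixed one transfers only about 2 MiB of data. This is why small random I/O drains an st1 or sc1 burst bucket quickly while delivering little throughput.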

Performance: Burst Balance

[Chart: burst balance % (0 to 120) against time in hours (0 to 10) for a 4 TB SC1 volume, comparing 1 MB sequential and 16 KB random I/O]

• 1 MB Sequential: 500 MB/s for 3 hours

• 16 KB Random: 8 MB/s for 3 hours

Performance: Verifying ST1 & SC1 Workloads

• iostat: check the average request size; 256 sectors x 512 bytes/sector = 128 KiB

• CloudWatch console: average I/O size under 64 KiB? Small or random IOPS likely

• CloudWatch console: average I/O size stuck around 44 KiB? Upgrade your kernel to at least 3.8
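The checks above come down to computing an average I/O size and comparing it with the thresholds on the slides. This sketch assumes the iostat figure is an average request size in 512-byte sectors and that the CloudWatch figure is derived from the summed `VolumeReadBytes` and `VolumeReadOps` metrics (or the write equivalents) over the same period. The 40 to 48 KiB band used to detect "stuck around 44 KiB" is an arbitrary choice for illustration.

```python
SECTOR_BYTES = 512


def iostat_request_kib(avg_request_sectors: float) -> float:
    """Convert an iostat average request size in sectors to KiB."""
    return avg_request_sectors * SECTOR_BYTES / 1024


def cloudwatch_io_kib(volume_bytes: float, volume_ops: float) -> float:
    """Average I/O size in KiB from summed byte and operation counts."""
    if volume_ops == 0:
        return 0.0
    return volume_bytes / volume_ops / 1024


def diagnose(avg_io_kib: float) -> str:
    """Map an average I/O size to the advice given on the slides."""
    if 40 <= avg_io_kib <= 48:  # hypothetical band around 44 KiB
        return "stuck near 44 KiB: upgrade kernel to at least 3.8"
    if avg_io_kib < 64:
        return "small or random I/O likely"
    return "large I/O: suited to st1/sc1"
```

For instance, `iostat_request_kib(256)` gives 128.0, matching the 128 KiB worked out on the iostat slide.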

Performance: Bandwidth Matters

[Diagram: EC2 instances share their network link among EBS, S3, databases, and the Internet. c3.2xlarge: ~ 125 MB/s. c3.8xlarge: 10 Gbps ~ 1250 MB/s]

Performance: EBS-Optimized Instances

For max throughput statistics per instance types, see:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html

• Dedicated network bandwidth for EBS I/O

• Enabled by default on c4, d2, and m4 instances

• Can be enabled at instance launch or on a running instance

• Not an option on some 10 Gbps instance types

(c3.8xlarge, r3.8xlarge, i2.8xlarge)

Performance: EBS-Optimized Instances

[Diagram: an EBS-optimized c3.2xlarge has a dedicated ~ 125 MB/s link to EBS, separate from its traffic to S3, databases, and the Internet]

Performance: Bandwidth Matters

2 TB GP2 volume: 6,000 IOPS, 160 MB/s max throughput

• c3.xlarge, EBS-optimized: dedicated 500 Mbps ~ 62.5 MB/s, 4,000 16K IOPS

• c3.2xlarge, EBS-optimized: dedicated 1 Gbps ~ 125 MB/s, 8,000 16K IOPS
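The comparison above is a bottleneck calculation: the achievable IOPS is the smaller of what the volume delivers and what the instance's dedicated EBS link can carry. The sketch below uses decimal units (1 MB = 1,000 KB), so its results are slightly below the rounded 4,000 and 8,000 figures on the slide; the function names are illustrative.

```python
def instance_16k_iops(dedicated_mbps: float) -> float:
    """16 KB I/Os per second that a dedicated EBS link can carry."""
    mb_per_s = dedicated_mbps / 8   # megabits/s -> megabytes/s
    return mb_per_s * 1000 / 16     # 16 KB per I/O, decimal units


def effective_iops(volume_iops: float, dedicated_mbps: float) -> float:
    """The lower of the volume limit and the instance link limit."""
    return min(volume_iops, instance_16k_iops(dedicated_mbps))


small = effective_iops(6000, 500)    # c3.xlarge with the 2 TB gp2 volume
large = effective_iops(6000, 1000)   # c3.2xlarge with the same volume
```

On the c3.xlarge the instance link is the bottleneck (about 3,900 IOPS at 16 KB), while the c3.2xlarge lets the volume reach its full 6,000 IOPS.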

Performance: Throughput Workload

[Diagram: an m4.10xlarge has a 10 Gbps ~ 1250 MB/s link to S3, databases, and the Internet, and ~ 500 MB/s (32,000 16K IOPS) to EBS]

8 TB ST1 volume: 320 MB/s base, 500 MB/s burst

Performance: I/O Workload

[Diagram: an m4.10xlarge has a 10 Gbps ~ 1250 MB/s link to S3, databases, and the Internet, and ~ 500 MB/s (32,000 16K IOPS) to EBS]

• 1 TB io1 volume: 16,000 IOPS

• RAID0 of 2 * 1 TB io1 volumes: 32,000 IOPS

Best Practice: RAID

When to RAID?

• Storage requirement > 16 TB

• Throughput requirement > 500 MB/s

• IOPS requirement > 20,000 @ 16K
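The three thresholds above suggest a rough sizing rule for a RAID0 stripe. This is a sketch under strong assumptions: every volume in the stripe is taken to deliver the per-volume maximum quoted on the slide for the requirement in question, and instance-level limits are ignored. The function name is illustrative.

```python
import math

MAX_VOLUME_TB = 16        # per-volume capacity limit from the slide
MAX_VOLUME_MBPS = 500     # per-volume throughput limit from the slide
MAX_VOLUME_IOPS = 20_000  # per-volume IOPS limit at 16K from the slide


def raid0_volumes_needed(capacity_tb: float, throughput_mbps: float,
                         iops: float) -> int:
    """Minimum number of striped volumes to meet all three requirements."""
    return max(
        1,
        math.ceil(capacity_tb / MAX_VOLUME_TB),
        math.ceil(throughput_mbps / MAX_VOLUME_MBPS),
        math.ceil(iops / MAX_VOLUME_IOPS),
    )
```

For example, a requirement of 32,000 IOPS on 1 TB of data gives 2 volumes, while anything within all three limits gives 1, meaning RAID is not needed.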

Best Practice: RAID

[Diagram: an EC2 instance attached to two EBS volumes, each with its own replica, in an Availability Zone of an AWS region; labels: RAID0, RAID10]

Best Practice: RAID

Avoid RAID for redundancy

• RAID1 halves available EBS bandwidth

• RAID5/6 loses 20 – 30% of usable I/O to parity

Performance: Volume Initialization

New EBS volume?

• Attach and it's ready to go

New EBS volume from snapshot?

• Initialize for best performance

• Random read across volume

EBS Best Practices

Performance

Management

Best Practice: Taking Snapshots

Quiesce I/O

1. Database: FLUSH and LOCK tables

2. Filesystem: sync and fsfreeze

3. EBS: snapshot all volumes

When the CreateSnapshot API returns success, it is safe to resume I/O

Best Practice: Automate Snapshots

Key ingredients:

• AWS Lambda

• Amazon EC2 Run Command (https://aws.amazon.com/ec2/run-command/)

• Tagging

Best Practice: Automate Snapshots

AWS Lambda scheduled event: daily snapshots

EC2 instances tagged Backup, with Retention = 30 days

1. Search for instances tagged “Backup”

2. EC2 Run commands to fsfreeze

3. Snapshot all attached volumes

4. Tag snapshots with expire date

Best Practice: Automate Snapshot Expiration

AWS Lambda scheduled event: daily expire

EBS snapshots tagged Backup, with ExpireOn = Date

1. Search for snapshots tagged to “Expire On” today

2. Delete expired snapshots
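The tagging and expiration logic in the two automation slides can be sketched without any AWS calls. The example assumes the expiry date is stored as an ISO date in an `ExpireOn` tag and that snapshots are represented as dictionaries with `SnapshotId` and `Tags` keys, in the shape returned by the EC2 DescribeSnapshots API. The function names are illustrative.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List


def expire_on_tag(taken: date, retention_days: int = 30) -> Dict[str, str]:
    """Tag to attach to a snapshot taken on the given day."""
    expires = taken + timedelta(days=retention_days)
    return {"Key": "ExpireOn", "Value": expires.isoformat()}


def expired_snapshot_ids(snapshots: Iterable[dict], today: date) -> List[str]:
    """IDs of snapshots whose ExpireOn date is today or earlier."""
    expired = []
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        expire_on = tags.get("ExpireOn")
        if expire_on and date.fromisoformat(expire_on) <= today:
            expired.append(snap["SnapshotId"])
    return expired


snapshots = [
    {"SnapshotId": "snap-old", "Tags": [expire_on_tag(date(2016, 7, 1))]},
    {"SnapshotId": "snap-new", "Tags": [expire_on_tag(date(2016, 8, 1))]},
    {"SnapshotId": "snap-untagged"},
]
to_delete = expired_snapshot_ids(snapshots, today=date(2016, 8, 15))
```

The sketch selects snapshots that expire today or earlier rather than only today, so that a missed daily run does not leave expired snapshots behind. In a Lambda function, the returned IDs would then be passed to the EC2 DeleteSnapshot API.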

EBS Best Practices

Performance

Management

Security

Best Practice: Security

EBS encryption

• Attach both encrypted and unencrypted volumes

• No volume performance impact

• Any current generation instance

• Supported by all EBS volume types

• Snapshots also encrypted

• No extra cost

Best Practice: Security

EBS encryption:

data volumes

Best Practice: Security

Create a new AWS KMS Master key for EBS

• Define key rotation policy

• Enable AWS CloudTrail auditing

• Control who can use key

• Control who can administer key

Best Practice: Security

EBS encryption:

data volumes

Best Practice: Security

[Diagram: an EBS master key in AWS KMS protects data keys 1, 2, and 3, one for each of three EBS volumes]

Envelope encryption

• Limits exposure risk

• Performance

• Simplifies key management

Best Practice: Security

EBS encryption:

boot volumes

Summary

• Use encryption if you need it

• Take snapshots, tag snapshots

• Select the right instance for your workload

• Select the right volume for your workload

Thank you!