(BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. November 12, 2014 | Las Vegas BDT312 Using the Cloud to Scale from a Database to a Data Platform Ryan Horn, Lead Software Engineer at Twilio

description

Scaling highly available database infrastructure to 100x, 1000x, and beyond has historically been one of the hardest technical challenges that any successful web business must face. This is quickly changing with fully managed database services such as Amazon DynamoDB and Amazon Redshift, as scaling work that previously required herculean effort is now as simple as an API call. Over the last few years, Twilio has evolved their database infrastructure into a pipeline consisting of Amazon SQS, sharded MySQL, Amazon DynamoDB, Amazon S3, Amazon EMR, and Amazon Redshift. In this session, Twilio covers how they achieved success, specifically:
- How they replaced their data pipeline deployed to Amazon EC2 to meet their scaling needs with zero downtime.
- How they adopted Amazon DynamoDB and Amazon Redshift at the same scale as their MySQL infrastructure, at 1/5th the cost and operational overhead.
- Why they believe adopting managed database services like Amazon DynamoDB is key to accelerating delivery of value to their customers.
Sponsored by Twilio.

Transcript of (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Page 1: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

November 12, 2014 | Las Vegas

BDT312

Using the Cloud to Scale from a Database to a Data Platform

Ryan Horn, Lead Software Engineer at Twilio

Page 2: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Hi, I’m Ryan

Tech Lead of the User Data team at Twilio

Page 3: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

What is Twilio?

We provide a communications API that enables phones, VoIP, and messaging to be embedded into web, desktop and mobile software.

Page 4: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

How Does it Work?

1. A user calls your number

2. Twilio receives the call

3. Your app responds

Page 5: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

What is the User Data Team?

• We scale Twilio's backend database infrastructure

• We build customer facing data APIs

• We manage data policies and data security at rest

Page 6: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Databases at Twilio

Page 7: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Calls and Messages are Stateful

Calls: Queued → Ringing → In Progress → Completed

Messages: Queued → Sending → Sent → Delivered
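The first lifecycle on this slide (Queued, Ringing, In Progress, Completed) can be modeled as a small state machine. The sketch below is illustrative only: the `CallStatus` enum, the `advance` and `is_in_flight` helpers, and the strictly linear transitions are assumptions, and only the states listed on the slide are covered.

```python
from enum import Enum


class CallStatus(Enum):
    QUEUED = "queued"
    RINGING = "ringing"
    IN_PROGRESS = "in-progress"
    COMPLETED = "completed"


# Assumed linear progression, taken from the order shown on the slide.
NEXT_STATUS = {
    CallStatus.QUEUED: CallStatus.RINGING,
    CallStatus.RINGING: CallStatus.IN_PROGRESS,
    CallStatus.IN_PROGRESS: CallStatus.COMPLETED,
}


def advance(status: CallStatus) -> CallStatus:
    """Return the next status, or raise if the call has already finished."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.name} is a terminal status")
    return NEXT_STATUS[status]


def is_in_flight(status: CallStatus) -> bool:
    """A call counts as in flight until it reaches its terminal status."""
    return status is not CallStatus.COMPLETED
```

A record that is still in flight changes often, while a completed one is effectively read-only, which is one reason the two kinds of data can have different performance characteristics.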

Page 8: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

In the Beginning…

All data was placed in the same physical database regardless of where the call or message was in its lifecycle.

Page 9: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

The Monolithic Database Model

[Diagram components: API, Web, Billing, Call/Message Service, Carriers, MySQL]

Page 10: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Problems at Scale

• Many consumers of data

• Data with different performance characteristics

• Failure in the database degrades many services

• Horizontal scaling and orchestration is complicated

Page 11: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Moving to a Service-Oriented Architecture

Page 12: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

What is a Service-Oriented Architecture?

An architecture in which required system behavior is decomposed into discrete units of functionality, implemented as individual services for applications to compose and consume.

Page 13: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Communicate Through Interfaces, Not Databases

[Diagram components: API, Web, Billing, Carriers, Call/Message Service, In Flight Service, In Flight MySQL, Post Flight Service, Post Flight MySQL]

Page 14: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Database Can Change Without Changing Every Service

[Diagram components: API, Web, Billing, Carriers, Call/Message Service, In Flight Service, In Flight MySQL, Post Flight Service, Post Flight Amazon DynamoDB]

Page 15: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

SOA Doesn’t Solve Everything

No matter how many services you put in front of MySQL, it’s still a single point of failure.

Page 16: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Sharding MySQL

Page 17: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Implementing Sharding (the easy part)

1. Choose partitioning scheme

2. Implement routing logic

3. Send application queries through router

4. Go!
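Steps 1 to 3 can be sketched as a range-based router. This is a minimal illustration, not Twilio's implementation: the `ShardRouter` class, the integer partition key, and the boundary values are all hypothetical, and a range scheme is only one of several possible partitioning schemes.

```python
from bisect import bisect_right


class ShardRouter:
    """Map an integer partition key to a shard using sorted range boundaries."""

    def __init__(self, boundaries, shards):
        # boundaries[i] is the first key owned by shards[i + 1].
        if len(shards) != len(boundaries) + 1:
            raise ValueError("need exactly one more shard than boundaries")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        self.boundaries = list(boundaries)
        self.shards = list(shards)

    def shard_for(self, key):
        # bisect_right counts how many boundaries are <= key,
        # which is the index of the shard that owns the key.
        return self.shards[bisect_right(self.boundaries, key)]


# Hypothetical layout: keys 0-2 on shard0, 3-5 on shard1, 6 and above on shard2.
router = ShardRouter(boundaries=[3, 6], shards=["shard0", "shard1", "shard2"])
```

Because the application only ever calls `shard_for`, the boundaries can change later without touching the queries that go through the router.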

Page 18: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Sharding at Twilio

[Diagram: Application → Router → Shard0, Shard1, Shard2, with key ranges labeled 0-3, 3-6, 6-9]

Page 19: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Rolling it Out With Zero Downtime (the hard part)

• We provide a 24/7, always on service

• Communications is intolerant of inconsistency and latency

• There is no maintenance window

Page 20: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Bringing Up a New Shard

[Diagram components: Application, Master1, Slave1, Master2, Slave2; routing label: 0-9]

Page 21: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Split Odds and Evens for Writes

[Diagram components: Application, Master1, Slave1, Master2, Slave2; labels: Odds, Evens; routing label: 0-9]

Page 22: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Update Routing

[Diagram components: Application, Master1, Slave1, Master2, Slave2; labels: Odds, Evens; routing labels: 0-4, 5-9]

Page 23: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Cut Slave Link

[Diagram components: Application, Master1, Slave1, Master2, Slave2; routing labels: 0-4, 5-9]
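The slides do not say how the odd/even split for writes is implemented. One common way to let two masters accept writes at the same time without generating colliding primary keys is to give each an interleaved ID sequence; the sketch below models that idea only. The `id_sequence` helper and the starting offsets are assumptions, not details from the talk.

```python
from itertools import count, islice


def id_sequence(offset, increment=2):
    """Yield offset, offset + increment, ... as an endless ID sequence."""
    return count(start=offset, step=increment)


# Assumed assignment: Master1 hands out odd IDs, Master2 hands out even IDs.
master1_ids = list(islice(id_sequence(offset=1), 5))
master2_ids = list(islice(id_sequence(offset=2), 5))

# The two sequences never overlap, so rows written on either master
# can be replicated to the other without primary-key conflicts.
no_collisions = set(master1_ids).isdisjoint(master2_ids)
```

With non-overlapping IDs in place, routing can be switched from one range (0-9) to two (0-4 and 5-9) while both masters still hold all rows, and the replication link is cut only afterwards.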

Page 24: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

New Solutions, New Problems

Page 25: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

A Necessary Burden

In the beginning, the burden of managing our own databases was non-negotiable.

Page 26: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

The Landscape has Changed

We now have a variety of managed database services which solve these problems for us, such as Amazon RDS, Amazon DynamoDB, Amazon SimpleDB, Amazon Redshift, etc.

Page 27: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Cost Is Never Optimized

Application developers do not (and should not) optimize for database cost.

Page 28: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Self Managed Databases are Costly

Databases: 78%

Everything Else: 22%

Source: Twilio Data Usage

Page 29: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Keeping up With Growth

As growth continues to accelerate, we need to somehow keep up.

Page 30: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

A Change in Approach

• Change our hiring practices and bring in specialists

• Remove the context switching

Page 31: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Focusing on What We Do Well

Page 32: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Adopting Amazon DynamoDB

Page 33: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Thinking in Terms of Throughput

Amazon DynamoDB allows us to scale in terms of throughput, not machines. This is the future of resource provisioning.
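As a rough illustration of provisioning by throughput, the arithmetic below estimates capacity units from a request rate and an item size. It assumes DynamoDB's documented unit sizes (one write unit covers one write per second of an item up to 1 KB; one read unit covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads). The function names and example numbers are hypothetical and are not Twilio's figures.

```python
import math


def write_capacity_units(writes_per_second, item_size_bytes):
    """Write units needed: item size is rounded up to the next 1 KB."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return writes_per_second * units_per_write


def read_capacity_units(reads_per_second, item_size_bytes, strongly_consistent=True):
    """Read units needed: item size is rounded up to the next 4 KB."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = reads_per_second * units_per_read
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


# Hypothetical workload: 500 writes/s of 1.5 KB items, 1000 reads/s of 3 KB items.
example_writes = write_capacity_units(500, 1536)
example_reads = read_capacity_units(1000, 3072, strongly_consistent=False)
```

Nothing in the calculation mentions instance counts or disk sizes; capacity is expressed only as a request rate.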

Page 34: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Operations

Management and scaling of our cluster is fully abstracted away from us.

Page 35: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Cost Compared to MySQL

MySQL: 82%

Amazon DynamoDB: 18%

Source: Twilio Data Usage

Page 36: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Cost with MySQL Fully Replaced

Databases: 39%

Everything Else: 61%

Source: Twilio Data Usage

Page 37: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

A Relational Model with Amazon DynamoDB

Many of our services allow for querying data in a way that maps naturally to a relational database.

Page 38: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

GET /Accounts/2/Events

SELECT * FROM events ORDER BY date DESC;

Page 39: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

SELECT * FROM events WHERE IpAddress='5.6.7.8' ORDER BY date DESC;

Page 40: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

SELECT * FROM events WHERE IpAddress='5.6.7.8' AND Date<='2014-10-03' ORDER BY date DESC;

GET /Accounts/2/Events?IpAddress=5.6.7.8&Date<=2014-10-03

Page 41: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

AccountId (Hash)  Date (Range)  IpAddress_Date      Type
2                 2014-10-03    5.6.7.8|2014-10-03  call
2                 2014-10-01    5.6.7.8|2014-10-01  message

GET /Accounts/2/Events

AccountId=2, ScanIndexForward=false

Page 42: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

AccountId (Hash)  IpAddress_Date (Range)  Date        Type
2                 5.6.7.8|2014-10-03      2014-10-03  call
2                 5.6.7.8|2014-10-01      2014-10-01  message

GET /Accounts/2/Events?IpAddress=5.6.7.8

AccountId=2, IpAddress_Date begins with “5.6.7.8|”, ScanIndexForward=false
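To see why a composite `IpAddress_Date` range key supports this lookup, the sketch below models one hash key's items as a list sorted by range key and applies a prefix match. It is an in-memory illustration, not a DynamoDB client: the `query_begins_with` function and the third sample row are hypothetical.

```python
# Items stored under AccountId=2, as (range key, type) pairs.
ROWS = [
    ("5.6.7.8|2014-10-03", "call"),
    ("5.6.7.8|2014-10-01", "message"),
    ("9.9.9.9|2014-10-02", "call"),  # hypothetical row for a different IP
]


def query_begins_with(rows, prefix, scan_index_forward=True):
    """Return rows whose range key starts with prefix, in range-key order."""
    matches = [row for row in sorted(rows) if row[0].startswith(prefix)]
    # ScanIndexForward=false corresponds to walking the range key in reverse.
    return matches if scan_index_forward else matches[::-1]


newest_first = query_begins_with(ROWS, "5.6.7.8|", scan_index_forward=False)
```

Because the date part uses the YYYY-MM-DD form, string order within one IP prefix is also chronological order, so a reverse scan returns the newest event first.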

Page 43: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

AccountId (Hash)  IpAddress_Date (Range)  Date        Type
2                 5.6.7.8|2014-10-03      2014-10-03  call
2                 5.6.7.8|2014-10-01      2014-10-01  message

GET /Accounts/2/Events?IpAddress=5.6.7.8&Date<=2014-10-03

AccountId=2, IpAddress_Date LT “5.6.7.8|2014-10-03”, ScanIndexForward=false
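The three lookups on these slides might translate into DynamoDB Query request parameters roughly as follows. This is a sketch under several assumptions: the table name `events`, the index name `IpAddress_Date-index`, and the numeric type of `AccountId` are hypothetical, and the code uses the expression-based request syntax rather than the operator names (`LT`, "begins with") shown on the slides. For the date-bounded case it uses `BETWEEN`, which makes the upper bound inclusive to match `Date<=` and adds a lower bound so only one IP address matches; that is a choice made here, not necessarily what Twilio runs.

```python
def events_query_params(account_id, ip_address=None, on_or_before=None):
    """Build low-level Query parameters for an account's events, newest first."""
    values = {":account": {"N": str(account_id)}}  # assumes a numeric AccountId
    params = {
        "TableName": "events",      # hypothetical table name
        "ScanIndexForward": False,  # descending range-key order
    }
    if ip_address is None:
        condition = "AccountId = :account"
        if on_or_before is not None:
            # "Date" is a reserved word, so it is referenced via a placeholder.
            condition += " AND #d <= :date"
            params["ExpressionAttributeNames"] = {"#d": "Date"}
            values[":date"] = {"S": on_or_before}
    else:
        params["IndexName"] = "IpAddress_Date-index"  # hypothetical index name
        prefix = f"{ip_address}|"
        if on_or_before is None:
            condition = "AccountId = :account AND begins_with(IpAddress_Date, :prefix)"
            values[":prefix"] = {"S": prefix}
        else:
            condition = "AccountId = :account AND IpAddress_Date BETWEEN :low AND :high"
            values[":low"] = {"S": prefix}
            values[":high"] = {"S": prefix + on_or_before}
    params["KeyConditionExpression"] = condition
    params["ExpressionAttributeValues"] = values
    return params


bounded = events_query_params(2, ip_address="5.6.7.8", on_or_before="2014-10-03")
```

The inclusive upper bound works here only because the stored dates are day-granular strings like those in the table; keys that carry a time of day would need a different upper bound.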

Page 44: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Need to Handle Exceeded Throughput Failures

Exceeding provisioned throughput is a runtime error.

Page 45: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Handling Exceeded Write Throughput with Amazon SQS

Queuing events to Amazon SQS and processing them asynchronously allows us to gracefully deal with write throughput errors.
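The idea can be sketched with an in-memory queue standing in for Amazon SQS and a fake table standing in for Amazon DynamoDB. Everything named here (`ThroughputExceeded`, `FakeTable`, `drain`, the attempt limit) is hypothetical; the sketch only shows that a rejected write goes back on the queue instead of being lost or surfaced to the caller.

```python
import queue


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for a provisioned-throughput error."""


class FakeTable:
    """Rejects the first `failures` writes, then accepts the rest."""

    def __init__(self, failures):
        self.failures = failures
        self.items = []

    def put_item(self, item):
        if self.failures > 0:
            self.failures -= 1
            raise ThroughputExceeded()
        self.items.append(item)


def drain(events, table, max_attempts=5):
    """Write queued events to the table; return events that ran out of attempts."""
    dead_letters = []
    while not events.empty():
        event, attempts = events.get()
        try:
            table.put_item(event)
        except ThroughputExceeded:
            if attempts + 1 < max_attempts:
                events.put((event, attempts + 1))  # retry later
            else:
                dead_letters.append(event)
    return dead_letters


events = queue.Queue()
for name in ("a", "b", "c"):
    events.put((name, 0))
table = FakeTable(failures=2)
dead = drain(events, table)
```

With a real queue, a consumer would typically wait before retrying and would leave a failed message undeleted so that it becomes visible again, rather than re-enqueueing it by hand.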

Page 46: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

[Diagram components: API, Web, Billing, Amazon SQS, Events Processor, Amazon DynamoDB]

Page 47: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Maximum of 5 Global and 5 Local Indexes

You can manage your own indexes, but your application must then handle partial mutation failures.
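As a sketch of what handling a partial mutation failure can involve, the example below writes an event to a base store and to an application-managed index store, and undoes the base write if the index write fails. The `FlakyStore` class, the `WriteError` exception, and the undo strategy are assumptions for illustration; other strategies, such as retrying the index write from a queue, are equally plausible.

```python
class WriteError(Exception):
    """Hypothetical error raised when a store rejects a write."""


class FlakyStore(dict):
    """Dict-backed stand-in for a table; rejects writes while `fail` is True."""

    fail = False

    def put(self, key, value):
        if self.fail:
            raise WriteError("simulated write failure")
        self[key] = value


def put_event(base, index, event):
    """Write an event and its index entry, keeping the two stores consistent."""
    base_key = (event["AccountId"], event["Date"])
    index_key = (event["AccountId"], f'{event["IpAddress"]}|{event["Date"]}')
    base.put(base_key, event)
    try:
        index.put(index_key, base_key)
    except WriteError:
        # Partial mutation: the base write landed but the index write did not.
        # Undo the base write so the stores do not disagree, then re-raise.
        base.pop(base_key, None)
        raise


base, index = FlakyStore(), FlakyStore()
event = {"AccountId": 2, "Date": "2014-10-03", "IpAddress": "5.6.7.8", "Type": "call"}
```

The undo step here assumes the key was new; if the write replaced an existing item, or if the undo itself could fail, the application would need more bookkeeping than this sketch shows.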

Page 48: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Local Index Size Limits

Local secondary indexes provide immediate consistency… and limit the data set for a given hash key to 10GB.

Page 49: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Data Warehouse

Page 50: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Brief History

2008 - 2011

All business intelligence queries run on replicas of MySQL clusters serving production traffic.

Page 51: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Brief History

2011 - 2013

Data pushed to Amazon S3 and queried with Pig on Amazon EMR, improving our ability to aggregate, but with high latency.

Page 52: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Brief History

2013 - Present

The move to Amazon Redshift cut the time these reports took from hours to seconds, allowing us to answer critical BI and financial questions in near real time.

Page 53: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Pushing Data Into Amazon Redshift

[Diagram components: Post Flight Service, Kafka, SQS (DLQ), S3 Loader, Amazon S3, Warehouse Loader, Amazon Redshift]

Page 54: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Wrapping Up

Page 55: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Managed Services as a Culture

Our focus on creating an experience that unifies and simplifies communications is reflected in our adoption of managed services.

Page 56: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Managed Services as a Culture

Understanding and focusing on our areas of expertise and leveraging managed services for the rest accelerates the delivery of value and innovation to our customers.

Page 57: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

Thank You!

Page 58: (BDT312) Using the Cloud to Scale from a Database to a Data Platform | AWS re:Invent 2014

http://bit.ly/awsevals