Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

description

A presentation by Redis Labs' CTO, Yiftach Shoolman, given at the July 2nd meetup, hosted by I am OnDemand and IGT Cloud at the Microsoft ILDC Auditorium. See the video at: https://www.youtube.com/watch?v=eymqHZaUOH4 In this session, Yiftach shares tips on how the company manages 50,000+ scalable and highly available Redis databases over the 4 largest public clouds, 8 leading Platforms-as-a-Service, and across 10 geographical regions. He explains the service's back-end architecture, the open-source projects it uses, and which tools the company builds in-house. Shoolman also shares what Redis Labs' small DevOps team does automatically, and what it still does manually. Finally, he offers advice on how to build a strong R&D team that lives and breathes DevOps. Since the company launched its Redis Cloud service, it has dealt with 150+ node failure events and a half-dozen complete data-center outages. In addition, its team has experienced many interesting scenarios, such as hard-to-believe scaling patterns like 0 to a few hundred gigabytes of in-memory data in just a few minutes, and 0 to 300K+ ops/sec in just a few seconds.

Transcript of Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Page 1: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

powering lightning fast apps

Page 2: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Redis

The newest NoSQL

The fastest data store available today (served entirely from RAM)

Among the top 3 databases chosen by developers

Much more than a simple key/value store - Strings, Hashes, Lists, Sets, Sorted Sets, Lua, transactions, bit operations

Strong use cases, dynamic community, large ecosystem

Page 3: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Redis Labs

Leading the commercial Redis market

Founded in 2011; GA in 02/2013

2,400+ paying customers; 52,000+ DBs; 100+ new DBs/day

2nd largest contributor to open source Redis

Raised $13M - Bain/Carmel/Strategic/Angels

Offices in Santa Clara and Tel-Aviv

Page 4: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Our offering

Redis Cloud, Memcached Cloud

Fully-managed cloud services.

On-prem server license - soon.

Page 5: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Fast apps requirements

100msec = max E2E response time, under any load

50msec = average Internet latency

50msec = required app response time (includes processing & multi DB accesses)

1msec = required DB response time

The only database to meet the requirement
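
The budget on this slide is plain arithmetic, and a short Python sketch can make the consequence concrete: once Internet latency takes its 50 msec, the app has 50 msec left for everything, so the number of back-to-back database accesses it can afford depends directly on DB response time. The 30 msec processing figure and the function name below are assumptions for illustration, not numbers from the talk.

```python
E2E_BUDGET_MS = 100        # max end-to-end response time (from the slide)
INTERNET_LATENCY_MS = 50   # average Internet latency (from the slide)

def max_sequential_db_calls(db_latency_ms, processing_ms):
    """How many back-to-back DB accesses fit in the app's share of the budget."""
    app_budget_ms = E2E_BUDGET_MS - INTERNET_LATENCY_MS   # 50 msec for the app
    remaining_ms = app_budget_ms - processing_ms
    return max(0, int(remaining_ms // db_latency_ms))

# Hypothetical app that spends 30 msec on its own processing.
print(max_sequential_db_calls(db_latency_ms=1, processing_ms=30))    # 20
print(max_sequential_db_calls(db_latency_ms=20, processing_ms=30))   # 1
```

With a 1 msec database the hypothetical app can make 20 sequential accesses; at 20 msec per access it can make only one.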

Page 6: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

DB performance comparison

[Chart: latency classes shown, in order: @<1msec, @<1msec, @<1msec, @<20msec, @<10-50msec, @<10-50msec, @<100msec, @<100msec, @>100msec]

Page 7: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Why is Redis efficient ?

Many data-structures

Many cool commands (atomicity maintained)

Complexity aware

Page 8: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Real world use case:

•500+GB

•400K writes/sec

•1500 reads/sec

•37.5KB average object size

Efficiency

No extra work at app level: 1.5Gbps, 6 Nodes cluster

Tons of work at app level (NoSQL): 120Gbps, 150+ Nodes cluster
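
The transcript does not spell out where the two bandwidth figures come from. One plausible reading is that 120Gbps is what it costs to rewrite a whole 37.5KB object on every one of the 400K writes/sec. The sketch below checks that arithmetic, assuming 1KB = 1000 bytes, and also shows how many bytes per write the 1.5Gbps figure would imply.

```python
WRITES_PER_SEC = 400_000     # from the slide
AVG_OBJECT_BYTES = 37_500    # 37.5KB, assuming 1KB = 1000 bytes

def gbps(bytes_per_sec):
    """Convert bytes/sec to gigabits/sec."""
    return bytes_per_sec * 8 / 1e9

# Bandwidth if every write ships the full object over the network.
full_object_gbps = gbps(WRITES_PER_SEC * AVG_OBJECT_BYTES)
print(full_object_gbps)          # 120.0

# Average payload per write that a 1.5Gbps stream would carry.
bytes_per_write = 1.5e9 / 8 / WRITES_PER_SEC
print(bytes_per_write)           # 468.75
```

Under this reading, sending only the changed part of an object (a few hundred bytes) instead of the whole object accounts for the gap between the two figures.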

Page 9: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Timeline

Followers

Caching

Messaging

Geo search

Leaderboards

Job management

RT analytics

Verticals & main use cases

Online advertising

Social Gaming

Financial Services

Page 10: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

• Multi-TB in memory

• ~ 300,000 reads/sec

• ~ 5,000*N writes/sec

N - # of followers

Twitter

Every Timeline

(800 tweets per user)

is on Redis
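
The 5,000*N figure reads as fan-out on write: each incoming tweet becomes one timeline write per follower, and every timeline is capped at 800 entries. The sketch below models that with plain Python containers; it is not Twitter's code, and the function and variable names are made up for illustration. In Redis, a common way to get the same effect on a List is LPUSH followed by LTRIM.

```python
from collections import defaultdict, deque

TIMELINE_CAP = 800  # tweets kept per user timeline (from the slide)

# author -> set of users who follow that author
followers = defaultdict(set)
# user -> newest-first tweet ids; the oldest entry falls off once the cap is hit
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))

def follow(user, author):
    followers[author].add(user)

def post_tweet(author, tweet_id):
    """Fan out on write: one timeline write per follower. Returns the write count."""
    for user in followers[author]:
        timelines[user].appendleft(tweet_id)
    return len(followers[author])
```

An author with N followers costs N writes per tweet, which is why the write rate on the slide scales with N while reads stay a cheap lookup of a precomputed list.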

Page 11: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

• 20TB+ in memory

• ~ 6,000,000 reads/sec

• ~ 600,000 writes/sec

Weibo (Chinese Twitter)

• Counting

• Reverse cache

• Top 10 lists

• Last Index

• Relational list/Message Queue

• Fast transactions w/ LUA

Page 12: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Pinterest

Object graph:

• Per user (Sorted Set w/ timestamp as score)

store the users followed (explicit+implicit)

store the user’s followers (explicit+implicit)

• Per board

Redis Hash for storing explicit followers

Redis Set for storing explicit unfollowers
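
The per-user half of this object graph can be sketched as two sorted sets per user, scored by follow time, so that "most recent follows" is a range query. The class below is a tiny in-memory stand-in for a Redis Sorted Set, not Pinterest's code; all names are hypothetical, and the comments note the Redis command each method loosely corresponds to.

```python
import time

class SortedSet:
    """Minimal in-memory stand-in for a Redis Sorted Set (member -> score)."""

    def __init__(self):
        self._scores = {}

    def add(self, member, score):      # roughly ZADD
        self._scores[member] = score

    def remove(self, member):          # roughly ZREM
        self._scores.pop(member, None)

    def newest_first(self, count):     # roughly ZREVRANGE 0 count-1
        ordered = sorted(self._scores, key=self._scores.get, reverse=True)
        return ordered[:count]

following = {}  # user -> SortedSet of users they follow
followers = {}  # user -> SortedSet of users who follow them

def follow(user, target, ts=None):
    """Record the edge in both directions, scored by the follow timestamp."""
    ts = time.time() if ts is None else ts
    following.setdefault(user, SortedSet()).add(target, ts)
    followers.setdefault(target, SortedSet()).add(user, ts)
```

Keeping both directions means "who do I follow" and "who follows me" are each a single range read, at the cost of two writes per follow.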

Page 13: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Stack Overflow

Three levels of cache:

• Local cache (no persistence)

sessions, and pending view count updates

• Site cache

hot question id lists, users acceptance rates..

• Global cache

Inboxes, API usage quotas, …

Page 14: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Github

• Redis is used for routing info

• Matching user repositories to server names

Page 15: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Hipchat

• Which users are in which room

• Who is online

• XMPP server balancing

Page 16: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Youporn

Most data is found in Hashes with ordered Sets used to know what data to show

(1) ZINTERSTORE on: {videos:filters:release} {videos:filters:orientation:straight} {videos:filters:categories(id)} {videos:ordering:rating}

(2) Perform a ZRANGE to get the pages we want and get the list of video_ids back

(3) Start pipelining to get all the videos from Hashes
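
The three steps above can be modelled in a few lines of Python. This is an in-memory sketch, not the site's code: sorted sets are plain member-to-score dicts, the sample data and names are invented, and `zinterstore` follows Redis' default behaviour of summing scores across the intersected sets. Giving the filter sets a score of 0 is an assumption that keeps the final order driven by the rating set alone.

```python
def zinterstore(sorted_sets):
    """Intersect member->score dicts, summing scores (Redis' default aggregate)."""
    common = set(sorted_sets[0])
    for s in sorted_sets[1:]:
        common &= set(s)
    return {m: sum(s[m] for s in sorted_sets) for m in common}

def zrange(sorted_set, start, stop):
    """Members by ascending score; stop is inclusive. Non-negative indexes only."""
    ordered = sorted(sorted_set, key=lambda m: (sorted_set[m], m))
    return ordered[start:stop + 1]

# Hypothetical data: filter sets score 0, the ordering set carries the rating.
release = {"v1": 0, "v2": 0, "v3": 0}
category = {"v2": 0, "v3": 0, "v4": 0}
rating = {"v1": 4.0, "v2": 3.5, "v3": 4.8, "v4": 2.0}
video_hashes = {v: {"id": v, "title": v.upper()} for v in rating}

def page_of_videos(filters, ordering, start, stop):
    matches = zinterstore(filters + [ordering])    # step 1: intersect filters + ordering
    video_ids = zrange(matches, start, stop)       # step 2: slice out one page of ids
    return [video_hashes[v] for v in video_ids]    # step 3: fetch each video's Hash

print([v["id"] for v in page_of_videos([release, category], rating, 0, 1)])
```

In real Redis, step 3 would batch the Hash reads into one pipelined round trip, and ZREVRANGE would return the highest-rated videos first.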

Page 17: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Snapchat

• 500+ instances

• 15-50TB

• Running on GCE

400M messages/day

Page 18: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Why Redis Labs ?

Page 19: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Users choose us because..

Infinite seamless scalability

True high-availability

Stable top performance

Zero management

Page 20: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Dynamic Clustering Technology

Zero-latency proxy

Cluster manager

In-Memory Node

Cross-shard processor

In-Memory Cluster

Page 21: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Challenge #1

How to serve users from the same data-center ?

Page 22: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

4 clouds /10 regions

Page 23: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

18 data-centers / 30 clusters

Page 24: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

AWS zones mapping dilemma

Redis Labs    User
us-east-1a    us-east-1c
us-east-1b
us-east-1c    us-east-1e
us-east-1d    us-east-1a
us-east-1e    us-east-1b
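
The dilemma is that AWS can map zone names to physical zones independently for each account, so the same name in two accounts may point at different data-centers. Once a mapping for a given user is known, placing their database in the matching zone is a lookup. The sketch below uses only the four complete rows from this slide; the dictionary and function names are hypothetical, and how the mapping is discovered is outside its scope.

```python
# Redis Labs' zone name -> the same zone's name in one user's account.
# Only the rows that are complete on the slide are included.
LABS_TO_USER_ZONE = {
    "us-east-1a": "us-east-1c",
    "us-east-1c": "us-east-1e",
    "us-east-1d": "us-east-1a",
    "us-east-1e": "us-east-1b",
}

# Inverted view: the user's zone name -> our name for that same zone.
USER_TO_LABS_ZONE = {user: labs for labs, user in LABS_TO_USER_ZONE.items()}

def labs_zone_for(user_zone):
    """Return our name for the zone the user calls `user_zone`, or None if unknown."""
    return USER_TO_LABS_ZONE.get(user_zone)

print(labs_zone_for("us-east-1c"))   # us-east-1a
```

A user who reports running in "us-east-1c" would be served from the zone Redis Labs sees as "us-east-1a".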

Page 25: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

How did we solve it

Eric Hammond’s post on: Matching EC2 Availability Zones Across AWS Accounts

Page 26: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

How did we solve it

Redis Labs

User

Page 27: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Challenge #2

Which instance type shall we use for our cluster?

Page 28: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Cluster’s node requirements

Various instance types in the same cluster

• High load scenarios

• High memory usage scenarios

• New generation of instances

Dedicated instances

As cheap as possible

Page 29: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

The tip

Adrian Cockcroft's Blog - Understanding and using Amazon EBS - Elastic Block Store

• use large instances and get dedicated instances for free

Page 30: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

What we use today

C3 & R3

A4/5/6/7

n1-standard, n1-highmem, n1-highcpu

BM+VM

Page 31: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Challenge #3

How to manage data-persistence with high volumes of ‘writes’ and slow cloud storage ?

Page 32: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Ephemeral vs. Persistent storage

Ephemeral: direct attached, ephemeral, “Fast”

EBS/Cloud Drive/Persistent Disk/SAN: network attached, persistent, slow

Page 33: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

The tips

Adrian’s blog: use the larger EBSes if you want speed

Google (GCP): “Larger volumes can achieve higher I/O levels than smaller volumes”

Page 34: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

What we do

We use large volumes (1TB+)

We use both ephemeral and persistent storage

We improved/tuned/optimized the Redis persistent storage interface

If replication is enabled, the slave writes to disk

We don’t use PIOPS

Page 35: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Why not PIOPS

Page 36: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Challenge #4

How to monitor 50K+ databases, 30+ clusters and hundreds of nodes ?

Page 37: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Monitoring

Zabbix (not Nagios) - per node metrics

Limbic (home made) - databases’ metrics

• 50K (databases) x 100+ (metrics) x 10K+ (time resolutions)

• Based on Python, RRD, Redis

Redis adminUI – cluster configuration
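
The talk names RRD as one of Limbic's building blocks but does not show how Limbic is built. The general round-robin idea is that each metric keeps a few fixed-size archives at different resolutions, so storage per metric stays constant no matter how long it runs. The sketch below illustrates that idea only; the class names and parameters are assumptions and this is not Limbic's implementation.

```python
from collections import deque

class RoundRobinArchive:
    """Fixed-size archive: averages `step_samples` raw values into one point
    and keeps only the most recent `rows` points."""

    def __init__(self, step_samples, rows):
        self.step_samples = step_samples
        self.points = deque(maxlen=rows)   # oldest point is dropped when full
        self._pending = []

    def add(self, value):
        self._pending.append(value)
        if len(self._pending) == self.step_samples:
            self.points.append(sum(self._pending) / self.step_samples)
            self._pending.clear()

class Metric:
    """One metric stored at several resolutions at once."""

    def __init__(self, archives):
        self.archives = archives

    def add(self, value):
        for archive in self.archives:
            archive.add(value)

# Hypothetical layout: last 3 raw samples, plus last 2 two-sample averages.
ops_per_sec = Metric([RoundRobinArchive(1, 3), RoundRobinArchive(2, 2)])
for sample in [1, 2, 3, 4, 5, 6]:
    ops_per_sec.add(sample)
print(list(ops_per_sec.archives[0].points))   # [4.0, 5.0, 6.0]
print(list(ops_per_sec.archives[1].points))   # [3.5, 5.5]
```

Because every archive has a fixed number of rows, the total footprint is the product of databases, metrics, and rows, which can be sized up front.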

Page 38: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Team/Method/Spirit

Page 39: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Team/Method/Spirit

Tiny devops team

Core dev. team knows ops (very well)

Baby steps, especially in production

The practical approach always wins

Review your plans every 3 months

Page 40: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

We are hiring !

Page 41: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Thank You

Page 42: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Why is Redis efficient ?

Many data-structures

Many cool commands (atomicity maintained)

Complexity aware

Page 43: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Think data-structure

• Strings

• Hashes

• Lists

• Sets

• Sorted Sets

• HyperLogLogs

Page 44: Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

Cool commands

• SET if it doesn’t exist – O(1)

• Blocking POP (with timeout) – O(1)

• (blocking) POP from one list, PUSH to another – O(1)

• Get/Set string ranges (and bit operation) – O(N)

• Union/Intersect/Ranges of SETs – O(N)+O(M×log(M))

• Pub/Sub – O(1)/O(M)/O(M+N)

• LUA / Transactions / Pipelining
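
Two of the commands listed above are often combined into simple coordination patterns: "SET if it doesn't exist" as a lock, and "POP from one list, PUSH to another" as a hand-off that never leaves a job in limbo. The sketch below is an in-memory model of those semantics so it can run without a server. `MiniRedis` and its method names are stand-ins made up for illustration, not a real client API.

```python
from collections import defaultdict, deque

class MiniRedis:
    """In-memory model of the semantics of three Redis commands."""

    def __init__(self):
        self.strings = {}
        self.lists = defaultdict(deque)

    def set_nx(self, key, value):
        """Models SET key value NX: write only if the key is absent. O(1)."""
        if key in self.strings:
            return False
        self.strings[key] = value
        return True

    def lpush(self, key, value):
        """Models LPUSH: push onto the head of the list. O(1)."""
        self.lists[key].appendleft(value)

    def rpoplpush(self, src, dst):
        """Models RPOPLPUSH: move the tail of src to the head of dst. O(1)."""
        if not self.lists[src]:
            return None
        item = self.lists[src].pop()
        self.lists[dst].appendleft(item)
        return item

r = MiniRedis()
print(r.set_nx("lock:report", "worker-1"))   # True: first caller takes the lock
print(r.set_nx("lock:report", "worker-2"))   # False: the key already exists
r.lpush("jobs", "a")
r.lpush("jobs", "b")
print(r.rpoplpush("jobs", "processing"))     # a: oldest job moves to "processing"
```

In real Redis each of these commands executes atomically on the server, which is what makes the lock and the hand-off safe when several clients race.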