4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine...

46
4 Myths about in-memory databases busted Yiftach Shoolman – Co-Founder & CTO @ Redis Labs @yiftachsh, @redislabsinc

Transcript of 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine...

Page 1: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

4 Myths about

in-memory databases busted

Yiftach Shoolman – Co-Founder & CTO @ Redis Labs @yiftachsh, @redislabsinc

Page 2: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Background - Redis

2

Created by Salvatore Sanfilippo (@antirez)

OSS, in-memory NoSQL k/v database/data-structure engine

# 1 # 2 # 3 # 12 Fastest growing

DB 2013-01 – to date:

DB-Engines

Most popular database on containers:

@DevOps.com & ClusterHQ

Top NoSQL databases: DB-Engines

Top tools developers love:

StackShare

Page 3: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Backgroud – Redis Labs

3

Founded in 2011. HQ in Mountain View CA, R&D in Tel-Aviv IL

The largest commercial company behind OSS Redis

•  5000+ paying customers

•  30,000+ free users

•  100,000+ databases under management

•  ±200 new databases/day

Provide enterpsie class Redis deployement

•  As a service – Redis Cloud

•  On-premises – Redis Labs Enterprise Cluster (RLEC)

$28MM VC funding

Page 4: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Why an in-memory operational DBMS ?

4

Database App Servers

100msec = The new standard for E2E application response time, under any load

50msec = Average internet latency (round-trip)

50msec = Required app response time (includes processing & multi-DB access)

1msec = Required DB response time

Page 5: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Why an in-memory analytics DBMS ?

5

Query time:

Days/Hours

Disk-based

Query time:

Mins/Secs

RAM

RAM-based

Page 6: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Myth #1 all in-memory databases

are equally fast

6

Page 7: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many ops/sec can a single EC2 instance do?

7

1,000,000

0 200,000 400,000 600,000 800,000 1,000,000 1,200,000

1xc3.8xlarge EC2 instance

(1)  Deployed in a VPC (2)  Enhanced Network Interface

(ENI) enabled (3)  Placement group enabled (4)  IRQ Distribution enabled (5)  Interrupt coalescing enabled (6)  4xNICs configured (7)  Receive Packet Steering (RPS)

enabled

Page 8: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many ops/sec can a single EC2 instance do?

8

1,228,432

1,000,000

0 200,000 400,000 600,000 800,000 1,000,000 1,200,000 1,400,000

1xc3.8xlarge EC2 instance Fully optimized (1)  Deployed in a VPC (2)  ENI enabled (3)  Placement group enabled (4)  IRQ Distribution enabled (5)  Interrupt coalescing enabled (6)  4xNICs configured (7)  RPS enabled

Not optimized (1)  Deployed on Classic EC2 (2)  No tuning (3)  Reached packets/sec limit

Page 9: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many ops/sec can a single EC2 instance do?

9

1,750,000

1,228,432

1,000,000

0 200,000 400,000 600,000 800,000 1,000,000 1,200,000 1,400,000 1,600,000 1,800,000 2,000,000

1xc3.8xlarge EC2 instance

Not optimized (1)  Deployed on Classic EC2 (2)  No tuning (3)  Reached packets/sec limit

Half optimized (1)  Deployed in a VPC (2)  ENI enabled (3)  RPS enabled (4)  Only 1xNIC

Fully optimized (1)  Deployed in a VPC (2)  ENI enabled (3)  Placement group enabled (4)  IRQ Distribution enabled (5)  Interrupt coalescing enabled (6)  4xNICs configured (7)  RPS enabled

Page 10: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many ops/sec can a single EC2 instance do?

10

2,000,000

1,750,000

1,228,432

1,000,000

0 500,000 1,000,000 1,500,000 2,000,000 2,500,000

1xc3.8xlarge EC2 instance

Not optimized (1)  Deployed on Classic EC2 (2)  No tuning (3)  Reached packets/sec limit

Half optimized (1)  Deployed in a VPC (2)  ENI enabled (3)  RPS enabled (4)  Only 1xNIC

Fully optimized (1)  Deployed in a VPC (2)  ENI enabled (3)  Placement group enabled (4)  IRQ Distribution enabled (5)  Interrupt coalescing enabled (6)  4xNICs configured (7)  RPS enabled

Fully optimized

Page 11: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many servers to get 1M writes/sec on GCE?

300

0

50

100

150

200

250

300

350

11

Page 12: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many servers to get 1M writes/sec on GCE?

300

50 50

0

50

100

150

200

250

300

350

12

Page 13: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How many servers to get 1M writes/sec on GCE?

300

50 50

2 0

50

100

150

200

250

300

350

In-memory databases 13

Page 14: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Real-world write intensive app

14

Page 15: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Hbase+Internal Cache vs. Hbase+Redis

15

Page 16: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

16

Same Redis core, same HW, different performance

16

Page 17: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

So why aren’t in-memory DBs equally fast?

17

Most are written in C/C++…. but programming language isn’t the only thing to consider

Page 18: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

What affects in-memory DB performance?

(1) Complexity of processing commands

à How many lines of code per command ? What is the computation

complexity (e.g. in Redis most commands are O(1))?

(2) Query efficiency

à Is it limited to blob queries? Can you query a discrete value?

(3) Pipelining

à Can you send multiple requests at once to get lower latency and less

context switches?

18

Page 19: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

What affects in-memory DB performance?

(4) Protocol efficiency

à How long it takes to parse a request or to serialize a response

(5) TCP overhead

à Long-lived (connection pool) vs. short-lived connections

19

Page 20: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

What affects in-memory DB performance?

(6) Single-threaded or multi-threaded architecture

à Lock-free vs. parallel computing

(7) Shared-nothing (the best) vs. shared-something vs. shared-everything

(8) Built-in acceleration components

20

Page 21: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Myth #2 A single node is not a cluster

21

Page 22: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

The truth: A single node can be a cluster

but not a HA cluster

22

Page 23: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

In the new containers/VMs world a cluster is:

A bunch of processes that

together look like one big

process

23

Page 24: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

A real-world example

A Binary Option platform; 400MB dataset; 1,000,000 ops/sec

A single node, 9 shards cluster

=

3 nodes, 9 shards cluster 24

Page 25: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Myth #3 In-memory databases are

inconsistent and unreliable

25

Page 26: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

A few facts/questions about consistency

Almost all NoSQL databases (not just in-memory) ack the client before

commiting to disk

Almost all in-memory databases can commit to disk before they ack the

client

However, even if you ack after everything is committed:

•  Is your driver memory buffer persistent and consistent?

•  Is your storage system cache persistent and consistent?

Most databases are NOT bulletproof

26

Page 27: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Most in-memory databases are async most of the time

Master

Storage

Slave

Storage

2 1

4

3

6 5

Application Async replication Async disk I/O

27

Page 28: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Some of them can partially sync

Master

Storage

Slave

Storage

4 1

3

2

6 5

Application Sync replication Async disk I/O

28

Page 29: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

A few of them fully sync

Master

Storage

Slave

Storage

6 1

5

2

4 3

Application Sync replication Sync disk I/O

29

Page 30: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

CP

AP

CAP and in-memory databases

CA

Memcached

in-memory

30

Page 31: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

AP

Behavior during network splits

Y X X

Wri

te

Inconsistency

Read

CP

X X X

Full-consistency

Read

31

31

Page 32: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Are in-memory databases reliable?

Redis Labs facts: •  Provisioned 100s of TBs of RAM

•  500+ node failure events à 1 failure every 2 days

•  ~30 complete data-center outages à 1 outage every month

•  Users with high availability (HA) features enabled haven’t lost a single

byte of data

32

Page 33: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Myth #4 In-memory databases

are expensive

33

Page 34: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Real-world use case:

•  500+GB

•  400K writes/sec

•  1500 reads/sec

•  37.5KB average object size

Which one costs more?

No extra work at

app level

1.5Gbps 120Gbps

Tons of work at app level

Others

6-node cluster

30+ node cluster

34

Page 35: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Redis Labs Aerospike/Couchbase Cassandra

Read

Write

Cost

Which one costs more (2)?

35

Page 36: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Sometimes in-memory can be very expensive

#1 200GB

#2 200GB

#50 200GB

50 x r3.8xlarge instances

#51 200GB

#51 200GB

#100 200GB

1st replica for HA

#101 200GB

#102 200GB

#150 200GB

2nd replica for quorum

HA deployment of 10TB in-memory dataset on EC2

Total cost (reserved instances) = $2,132,250/yr

36

Page 37: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Do we really need 2 replicas?

#1 200GB

#2 200GB

#50 200GB

50 x r3.8xlarge instances

#51 200GB

#51 200GB

#100 200GB

1 replica for HA

#101 15GB

Efficient HA deployment of 10TB in-memory dataset on EC2

Total cost (reserved instances) = $1,421,500/yr Savings = $710,750/yr

Redis Labs Patent Pending Technology

1 quorum server

37

Page 38: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Can we save more?

38

Page 39: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Price/performance of memory technologies

$10

$3

$1

0.10

1.00

10.00

100.00

1000.00

$0

$2

$4

$6

$8

$10

$12

DRAM SCM NAND (flash)

mic

ose

con

d

1GB memory - cost vs read/write performance

1 GB cost Read latency Write latency

39

Page 40: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Redis on Flash

Flash (slow RAM)

RAM All keys + ‘hot’ values

‘cold’ values ●  Asynchronous ●  Multi-threaded

Flash used as a RAM extender and NOT as persistent storage

40

Page 41: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

How to achieve optimal price/performance

By dynamically setting RAM/Flash ratio

RAM

4410 GB 590 GB

88.2% 11.8%

41

Page 42: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Single server performance - 10% in RAM / 90% in Flash

*IBM P8 with IBM FlashSystem 42

RAM Hits Ratio Ops/Sec Latency

Low latency scenarios

100% 1.35M 1.00 msec

80% 340K 1.07 msec

50% 200K 0.96 msec

20% 160K 1.00 msec

High throughput scenarios

100% 2.00M 2.40 msec

80% 671K 6.20 msec

50% 483K 10.00 msec

20% 366K 14.50 msec

Page 43: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

10TB Redis deployment on EC2

43

Redis (on RAM) 2 replicas

Redis (on RAM) 1 replicas

Redis on Flash 1 replica

Instance type r3.8xlarge r3.8xlarge i2.8xlarge

# of instances 150 100 10

RAM 30TB 20TB 2TB

Flash - - 64TB

Persistent storage (EBS) 150TB 100TB 80TB

1yr costs (reserved instances)

$2,132,250 $1,421,500 $318,090

Yearly savings - $710,750 $1,814,160

Savings % - 33.33% 85.08%

Page 44: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Summary

44

Page 45: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

4 myths about in-memory databases busted

All in-memory databases are NOT equally fast

You can create a single node in-memory cluster

In-memory databases can be consistent and reliable

With the right technology, in-memory databases are not expensive

45

Page 46: 4 Myths about in-memory databases busted · OSS, in-memory NoSQL k/v database/data-structure engine # 1 # 2 # 3 # 12 Fastest growing DB 2013-01 – to date: DB-Engines Most popular

Thank you Click to get more info about

Redis, Redis Labs, Redis Cloud and RLEC