Transcript of Understanding and tuning WiredTiger, the new high performance database engine in MongoDB / Henrik Ingo (MongoDB)

Page 1:

Understanding and tuning WiredTiger, the new high performance database engine in MongoDB

Henrik Ingo, Solutions Architect, MongoDB

Page 2:

Agenda:

- MongoDB and NoSQL
- Storage Engine API
- WiredTiger configuration + performance

Page 3:

Most popular NoSQL database

Page 4:

5 NoSQL categories

• Key Value: Redis, Riak
• Wide Column: Cassandra
• Document
• Graph: Neo4j
• Map Reduce: Hadoop

Page 5:

MongoDB is a Document Database

Rich Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul's car collection

Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
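The first two rich queries and the aggregation can be sketched against the sample document. This is a minimal illustration, not part of the talk: the collection name "owners" is hypothetical, the elided fields are left out, and the plain JavaScript functions only mirror what the filter documents would select on the server.

```javascript
// Sample data in the shape of the slide's document (elided fields omitted).
const people = [
  {
    first_name: "Paul",
    surname: "Miller",
    city: "London",
    location: [45.123, 47.232],
    cars: [
      { model: "Bentley", year: 1973, value: 100000 },
      { model: "Rolls Royce", year: 1965, value: 330000 },
    ],
  },
];

// Filter documents as they would be passed to find() in the mongo shell.
// "owners" is a hypothetical collection name:
//   db.owners.find(paulsCars, { cars: 1 })
//   db.owners.find(londonSeventies)
const paulsCars = { first_name: "Paul" };
const londonSeventies = {
  city: "London",
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Plain JavaScript equivalent of londonSeventies: $elemMatch requires that
// one single car satisfies both year bounds.
function matchesLondonSeventies(person) {
  return (
    person.city === "London" &&
    person.cars.some((car) => car.year >= 1970 && car.year <= 1980)
  );
}

// Plain JavaScript equivalent of the aggregation example:
// the average value of one owner's car collection.
function averageCarValue(person) {
  const total = person.cars.reduce((sum, car) => sum + car.value, 0);
  return total / person.cars.length;
}
```

Paul matches the London query because of the 1973 Bentley; the 1965 Rolls Royce alone would not qualify.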

Page 6:

Operational Database Landscape

Page 7:

MongoDB 3.0 & storage engines

Page 8:

MongoDB until 3.0

Read-heavy apps
• Great performance
• B-tree
• Low overhead
• Good scale-out perf
• Secondary reads
• Sharding

Write-heavy apps
• Good scale-out perf
• Sharding
• Per-node efficiency wish-list:
• Doc level locking
• Write-optimized data structures (LSM)
• Compression

Other
• Multi statement transactions
• In-memory engine
• SSD optimized engine
• etc...

Page 9:

Current state in MongoDB 2.6

Read-heavy apps
• Great performance
• B-tree
• Low overhead
• Good scale-out perf
• Secondary reads
• Sharding

Write-heavy apps
• Good scale-out perf
• Sharding
• Per-node efficiency wish-list:
• Doc level locking
• Write-optimized data structures (LSM)
• Compression

Other
• Complex transactions
• In-memory engine
• SSD optimized engine
• etc...

How to get all of the above?

Page 10:

MongoDB 3.0 Storage Engine API

• MMAP: Read-heavy app
• WiredTiger: Write-heavy app
• 3rd party: Special app

Page 11:

MongoDB 3.0 Storage Engine API

• MMAP: Read-heavy app
• WiredTiger: Write-heavy app
• 3rd party: Special app

• One at a time:
  – Many engines built into mongod
  – Choose 1 at startup
  – All data stored by the same engine
  – Incompatible on-disk data formats (obviously)
  – Compatible client API
• Compatible Oplog & Replication
  – Same replica set can mix different engines
  – No-downtime migration possible

Page 12:

Some existing engines

• MMAPv1
  – Improved MMAP (collection-level locking)
• WiredTiger
  – Discussed next
• RocksDB
  – LSM style engine developed by Facebook
  – Based on LevelDB
• TokuMXse
  – Fractal Tree indexing engine from Percona

Page 13:

Some rumored engines

• Heap
  – In-memory engine
• Devnull
  – Write all data to /dev/null
  – Based on idea from famous flash animation...
• SSD optimized engine (e.g. Fusion-IO)
• KV simple key-value engine

https://github.com/mongodb/mongo/tree/master/src/mongo/db/storage

Page 14:

WiredTiger

Page 15:

What is WiredTiger

• Modern NoSQL database engine
  – flexible schema
• Advanced database engine
  – Secondary indexes, MVCC, non-locking algorithms
  – Multi-statement transactions (not in MongoDB)
• Very modular, tunable
  – Btree, LSM and columnar indexes
  – Snappy, Zlib, 3rd-party compression
  – Index prefix compression, etc...
  – Encryption at rest
• Built by creators of BerkeleyDB
• Acquired by MongoDB in 2014
• source.wiredtiger.com, @WiredTigerInc

Page 16:

Choosing WiredTiger at server startup

mongod --storageEngine wiredTiger

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine

Default engine:
• MongoDB 3.0 = MMAP
• MongoDB 3.2 = WiredTiger

Page 17:

Main tunables exposed as MongoDB options

mongod --storageEngine wiredTiger \
  --wiredTigerCacheSizeGB 8 \
  --wiredTigerDirectoryForIndexes \
  --wiredTigerCollectionBlockCompressor zlib \
  --dbpath /data/datafiles

--wiredTigerDirectoryForIndexes is a switch without an argument: indexes and collections are stored in separate subdirectories (index, collection) under --dbpath.

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine

Page 18:

All WiredTiger options via configString (hidden)

mongod --storageEngine wiredTiger \
  --wiredTigerEngineConfigString "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)" \
  --wiredTigerCollectionConfigString "block_compressor=zlib" \
  --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib" \
  --wiredTigerDirectoryForIndexes

See docs for wiredtiger_open() & WT_SESSION::create():
http://source.wiredtiger.com/2.5.0/group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed
http://source.wiredtiger.com/2.5.0/struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb
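The configString syntax is a comma-separated list of key=value pairs, with nested groups in parentheses. As a rough sketch, a small helper can build such strings from a plain object, which avoids hand-typed parenthesis mistakes. The helper name toConfigString is made up for this illustration and is not part of MongoDB or WiredTiger.

```javascript
// Hypothetical helper: turn a nested options object into a WiredTiger-style
// config string. Nested objects become parenthesized groups.
function toConfigString(options) {
  return Object.entries(options)
    .map(([key, value]) =>
      value !== null && typeof value === "object"
        ? `${key}=(${toConfigString(value)})`
        : `${key}=${value}`
    )
    .join(",");
}

// The engine settings from the slide, written as an object.
const engineConfig = toConfigString({
  cache_size: "8GB",
  eviction: { threads_min: 4, threads_max: 8 },
  checkpoint: { wait: 30 },
});

// The index settings from the slide.
const indexConfig = toConfigString({ type: "lsm", block_compressor: "zlib" });

console.log(engineConfig);
console.log(indexConfig);
```

The resulting strings are what would be passed to --wiredTigerEngineConfigString and --wiredTigerIndexConfigString. The helper does no validation; WiredTiger itself decides which keys are accepted.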

Page 19:

Also via createCollection(), createIndex()

db.createCollection( "users", { storageEngine: { wiredTiger: { configString: "block_compressor=none" } } } )

http://docs.mongodb.org/master/reference/method/db.createCollection/#db.createCollection
http://docs.mongodb.org/master/reference/method/db.collection.createIndex/#db.collection.createIndex
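Both methods take the same storageEngine option document, so it can be built once and reused. The following is a sketch under assumptions: wiredTigerOptions is a hypothetical helper, the index key and the prefix_compression setting are chosen only for illustration, and the shell calls are shown as comments because they need a running mongod started with WiredTiger.

```javascript
// Hypothetical helper: wrap a WiredTiger config string in the option
// document that createCollection() and createIndex() expect.
function wiredTigerOptions(configString) {
  return { storageEngine: { wiredTiger: { configString: configString } } };
}

const collectionOptions = wiredTigerOptions("block_compressor=none");
const indexOptions = wiredTigerOptions("prefix_compression=true");

// In the mongo shell:
//   db.createCollection("users", collectionOptions);
//   db.users.createIndex({ surname: 1 }, indexOptions);

console.log(JSON.stringify(collectionOptions));
```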

Page 20:

More...

• db.serverStatus()
• db.collection.stats()
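As a sketch of what to look for in that output: db.serverStatus() reports the active engine under storageEngine.name and, when WiredTiger is running, cache counters under wiredTiger.cache. The function below is an illustration only. It takes the status document as an argument, and the sample object is invented so the snippet runs without a server.

```javascript
// Summarize the engine name and WiredTiger cache fill from a
// serverStatus()-shaped document.
function summarizeWiredTiger(status) {
  const cache = status.wiredTiger.cache;
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  return {
    engine: status.storageEngine.name,
    cacheFillPercent: Math.round((used / max) * 100),
  };
}

// Invented sample: 2 GB used of an 8 GB cache.
const GB = 1024 * 1024 * 1024;
const sampleStatus = {
  storageEngine: { name: "wiredTiger" },
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 2 * GB,
      "maximum bytes configured": 8 * GB,
    },
  },
};

const summary = summarizeWiredTiger(sampleStatus);
console.log(summary.engine, summary.cacheFillPercent + "%");
```

In the mongo shell the same function could be called as summarizeWiredTiger(db.serverStatus()).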

Page 21:

Understanding and Optimizing WiredTiger

Page 22:

Understanding WiredTiger architecture

[Diagram: WiredTiger SE layers, top to bottom]
• Btree / LSM / Columnar
• Cache (default: 50%)
• None / Snappy / Zlib
• OS Disk Cache (default: 50%)
• Physical disk

Page 23:

Covering 90% of your optimization needs

[Diagram: WiredTiger SE layers, top to bottom]
• Btree / LSM / Columnar
• Cache (default: 50%)
• None / Snappy / Zlib
• OS Disk Cache (default: 50%)
• Physical disk

Costs highlighted on the diagram:
• Decompression time
• Disk seek time

Page 24:

Strategy 1: fit working set in Cache

[Diagram: WiredTiger SE layers, top to bottom]
• Btree / LSM / Columnar
• Cache (default: 50%)
• None / Snappy / Zlib
• OS Disk Cache (default: 50%)
• Physical disk

cache_size = 80%

Page 25:

Strategy 2: fit working set in OS Disk Cache

[Diagram: WiredTiger SE layers, top to bottom]
• Btree / LSM / Columnar
• Cache (default: 50%)
• None / Snappy / Zlib
• OS Disk Cache (default: 50%)
• Physical disk

cache_size = 10%

OS Disk Cache (Remaining: 90%)
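Strategies 1 and 2 give cache_size as a share of RAM, while --wiredTigerCacheSizeGB takes an absolute number. A minimal sketch of the arithmetic follows, assuming whole gigabytes and a floor of 1 GB. The function name and the 64 GB machine are made up for the example.

```javascript
// Hypothetical helper: convert a share of total RAM into a whole-GB value
// for --wiredTigerCacheSizeGB. Rounds down and never returns less than 1.
function cacheSizeGB(totalRamGB, share) {
  if (!(share > 0 && share < 1)) {
    throw new RangeError("share must be between 0 and 1");
  }
  return Math.max(1, Math.floor(totalRamGB * share));
}

const ramGB = 64; // example machine, not from the talk

// Strategy 1: working set in the WiredTiger cache (80% of RAM).
const strategy1 = cacheSizeGB(ramGB, 0.8);

// Strategy 2: small WiredTiger cache (10%), leaving the rest to the OS disk cache.
const strategy2 = cacheSizeGB(ramGB, 0.1);

console.log(`--wiredTigerCacheSizeGB ${strategy1}`);
console.log(`--wiredTigerCacheSizeGB ${strategy2}`);
```

On the 64 GB example this yields 51 for strategy 1 and 6 for strategy 2.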

Page 26:

Strategy 3: SSD disk + compression to save €

[Diagram: WiredTiger SE layers, top to bottom]
• Btree / LSM / Columnar
• Cache (default: 50%)
• None / Snappy / Zlib
• OS Disk Cache (default: 50%)
• Physical disk: SSD

Page 27:

Strategy 4: SSD disk (no compression)

[Diagram: WiredTiger SE layers, top to bottom]
• Btree / LSM / Columnar
• Cache (default: 50%)
• None / Snappy / Zlib
• OS Disk Cache (default: 50%)
• Physical disk: SSD

Page 28:

Compression benchmarks

Page 29:

What problem is solved by LSM indexes?

Performance
• Fast reads: Easy: Add indexes
• Fast writes: Easy: No indexes
• Both: Hard: Smart schema design (hire a consultant), LSM index structures (or columnar)
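To make the "both" case concrete, here is a toy sketch of the general LSM idea: writes land in a small in-memory table and are flushed as immutable sorted runs, while reads check the newest data first. It illustrates the technique only and is not how WiredTiger implements LSM trees. The class name, the flush threshold, and the absence of durability and deletes are all simplifications.

```javascript
// Binary search over one sorted run of [key, value] pairs.
function findInRun(run, key) {
  let lo = 0;
  let hi = run.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (run[mid][0] === key) return run[mid][1];
    if (run[mid][0] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

// Sort a Map's entries by key.
function sortedEntries(map) {
  return [...map.entries()].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

class ToyLsm {
  constructor(memtableLimit) {
    this.memtableLimit = memtableLimit; // hypothetical flush threshold
    this.memtable = new Map();          // recent writes, in memory
    this.runs = [];                     // immutable sorted runs, newest first
  }

  // Writes are cheap: no existing index page has to be located or rewritten.
  put(key, value) {
    this.memtable.set(key, value);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  // Turn the memtable into a new sorted run.
  flush() {
    this.runs.unshift(sortedEntries(this.memtable));
    this.memtable = new Map();
  }

  // Reads check the memtable, then runs from newest to oldest.
  get(key) {
    if (this.memtable.has(key)) return this.memtable.get(key);
    for (const run of this.runs) {
      const hit = findInRun(run, key);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }

  // Merge all runs into one so reads have fewer places to look.
  compact() {
    const merged = new Map();
    for (const run of [...this.runs].reverse()) {
      for (const [key, value] of run) merged.set(key, value); // newer overwrites older
    }
    this.runs = [sortedEntries(merged)];
  }
}
```

The trade-off shows up in get(): every extra run is one more place a read may have to search, which is why LSM engines merge runs in the background.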

Page 30:

2B inserts (with 3 secondary indexes)

http://smalldatum.blogspot.fi/2014/12/read-modify-write-optimized.html

Page 31: