MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)

Post on 15-Jul-2015

What’s New in MongoDB 3.0

Jake Angerman, Sr. Solutions Architect, MongoDB

Agenda

•  Pluggable Storage Engines
•  WiredTiger Storage Engine
   –  Document-Level Locking Concurrency Control
   –  Compression
   –  Installation & Upgrade
•  Other New Stuff in 3.0
•  Public Service Announcement
•  There will be a test at the end

Pluggable Storage Engines

How does MongoDB persist data?

•  MongoDB <= 2.6
   –  MMAPv1 Storage Engine
   –  Uses Memory Mapped Files
•  MongoDB 3.0
   –  MMAPv1
      •  still the default
      •  now with collection-level locking!
   –  WiredTiger

Storage Engine

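[Diagram: Example Future State. Applications (Content Repo, IoT Sensor Backend, Ad Service, Customer Analytics, Archive) sit on top of the MongoDB Query Language (MQL) + Native Drivers and the MongoDB Document Data Model. Beneath these are the storage engines: MMAP V1, WT, In-Memory, and two unnamed ("?") engines. Management and Security run alongside all layers. Legend: Supported in MongoDB 3.0; Experimental; Future Possible Storage Engines.]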

Storage Engine API

•  Lets you "plug in" different storage engines
   –  Different working sets require different performance characteristics
   –  MMAPv1 is not ideal for all workloads
   –  More flexibility: you can mix storage engines on the same replica set/sharded cluster
•  Opportunity to integrate further (HDFS, native encrypted, hardware optimized …)
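The plug-in idea can be illustrated with a toy sketch. This is not MongoDB's actual storage engine API (that is an internal C++ interface); the class and method names below (StorageEngine, PlainEngine, CompressedEngine, Collection) are hypothetical and exist only to show how one data model can sit on interchangeable engines.

```python
from abc import ABC, abstractmethod
import json
import zlib


class StorageEngine(ABC):
    """Hypothetical minimal engine contract: store and fetch documents by key."""

    @abstractmethod
    def put(self, key, document):
        ...

    @abstractmethod
    def get(self, key):
        ...


class PlainEngine(StorageEngine):
    """Toy engine that keeps documents as uncompressed JSON text."""

    def __init__(self):
        self._data = {}

    def put(self, key, document):
        self._data[key] = json.dumps(document)

    def get(self, key):
        return json.loads(self._data[key])


class CompressedEngine(StorageEngine):
    """Toy engine that compresses each document with zlib before storing it."""

    def __init__(self):
        self._data = {}

    def put(self, key, document):
        self._data[key] = zlib.compress(json.dumps(document).encode("utf-8"))

    def get(self, key):
        return json.loads(zlib.decompress(self._data[key]).decode("utf-8"))


class Collection:
    """The layer above the engine: same calls, whichever engine is plugged in."""

    def __init__(self, engine):
        self._engine = engine

    def insert(self, document):
        self._engine.put(document["_id"], document)

    def find_one(self, doc_id):
        return self._engine.get(doc_id)
```

Code written against Collection does not change when the engine does, which is the property that makes mixing engines across members of a deployment possible.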

WiredTiger

History

•  Authors are former members of the Berkeley DB team
   –  WT product and team acquired by MongoDB
   –  Standalone engine already in use in large deployments, including Amazon

Why Is WiredTiger Awesome?

•  Document-level concurrency
•  Compression
•  Consistency without journaling
•  Better performance on certain workloads
   –  write heavy
•  Vertically scalable
   –  Allows full hardware utilization
   –  More tunable

Document-Level Concurrency

•  Uses algorithms to minimize contention between threads
   –  One thread yields on write contention to same document
   –  Atomic update replaces latching/locking
•  Writes no longer block all other writers
•  CPU utilization directly correlates with performance
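The "one thread yields on write contention" behavior resembles optimistic concurrency control. The sketch below is a toy illustration of that general technique, not WiredTiger's implementation: VersionedStore, compare_and_swap, increment and run_demo are hypothetical names, and a short Python lock stands in for a hardware atomic operation.

```python
import threading


class VersionedStore:
    """Toy in-memory document store with a version number per document."""

    def __init__(self):
        self._docs = {}                 # doc_id -> (version, value)
        self._guard = threading.Lock()  # stands in for an atomic instruction

    def read(self, doc_id):
        with self._guard:
            return self._docs.get(doc_id, (0, None))

    def compare_and_swap(self, doc_id, expected_version, new_value):
        """Install new_value only if no other writer got there first."""
        with self._guard:
            current_version, _ = self._docs.get(doc_id, (0, None))
            if current_version != expected_version:
                return False  # write conflict on this document
            self._docs[doc_id] = (current_version + 1, new_value)
            return True


def increment(store, doc_id):
    """Read-modify-write; the losing writer yields and retries."""
    while True:
        version, value = store.read(doc_id)
        if store.compare_and_swap(doc_id, version, (value or 0) + 1):
            return


def run_demo(workers=4, updates_per_worker=200):
    """Several threads update one shared document and one private document each."""
    store = VersionedStore()

    def work(worker_id):
        for _ in range(updates_per_worker):
            increment(store, "shared")               # contended document
            increment(store, f"private-{worker_id}")  # uncontended document

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

Writers only interfere with each other when they touch the same document; updates to the private documents never conflict, so no update is lost and no writer blocks the whole collection.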

50%-80% Less Storage via Compression

•  Better storage utilization
•  Higher I/O scalability
•  Multiple compression options
   –  Snappy (default) - Good compression benefits with little CPU/performance impact
   –  zlib - Extremely good compression at a cost of additional CPU/degraded performance
   –  None
•  Data and journal compressed on disk
•  Indexes compressed on disk and in memory
•  No more cryptic field names in documents!
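The point about field names can be seen with a rough experiment. The sketch below compresses a batch of JSON documents that share long, descriptive field names. It is only an approximation: WiredTiger compresses on-disk blocks rather than JSON text, Snappy is not in the Python standard library so two zlib levels stand in for the speed/ratio trade-off, and the sample data and function names are made up for illustration.

```python
import json
import zlib


def sample_documents(count=1000):
    """Made-up documents with long, readable field names."""
    return [
        {
            "customerFirstName": "Ada",
            "customerLastName": "Lovelace",
            "orderTotalInCents": 1000 + i,
            "shippingCountryCode": "US",
        }
        for i in range(count)
    ]


def compression_report(docs):
    """Return raw and compressed sizes in bytes for a batch of documents."""
    raw = json.dumps(docs).encode("utf-8")
    return {
        "raw": len(raw),
        "zlib_level_1": len(zlib.compress(raw, 1)),  # faster, lighter compression
        "zlib_level_9": len(zlib.compress(raw, 9)),  # slower, heavier compression
    }


if __name__ == "__main__":
    report = compression_report(sample_documents())
    for name, size in report.items():
        print(f"{name}: {size} bytes ({size / report['raw']:.0%} of raw)")
```

Because the same field names repeat in every document, they compress very well, which is why shortening them by hand to save space matters far less once compression is on. Real ratios depend on the data.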

WiredTiger Internals

Filesystem Layout

•  Data stored as conventional B+ tree on disk
•  Each collection and index stored in own file
•  WT fails to start if MMAPv1 files found in dbpath
•  No in-place updates
   –  Rewrites document every time, reuses space
   –  No more padding factor!
•  Journal has own folder under dbpath
•  You can now store indexes on separate volumes!

Cache

•  WT uses two caches
   –  WiredTiger cache stores uncompressed data
      •  ideally, working set fits in WT cache
   –  File system cache stores compressed data
   –  By default, WT cache size is the larger of 50% of system memory or 1GB
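The default sizing rule above works out as follows. This is a minimal sketch of the arithmetic only; the function name is hypothetical and 1GB is taken to mean 1024^3 bytes.

```python
GIB = 1024 ** 3


def default_wt_cache_bytes(system_memory_bytes):
    """Larger of half of system memory or 1GB, per the rule stated on the slide."""
    return max(system_memory_bytes // 2, GIB)


if __name__ == "__main__":
    for ram_gib in (1, 2, 4, 16, 64):
        cache = default_wt_cache_bytes(ram_gib * GIB)
        print(f"{ram_gib:>3} GB RAM -> {cache / GIB:g} GB WT cache")
```

On small machines the 1GB floor applies, so the cache can take most or all of memory. The default can be overridden at startup with mongod's --wiredTigerCacheSizeGB option.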

Supported Platforms

•  Supported Platforms
   –  Linux
   –  Windows
   –  Mac OS X
•  Non-Supported Platforms
   –  NO Solaris (yet)
   –  NO 32-bit (ever)

Gotchas

•  MMAPv1-specific catalog metadata is deprecated
   –  system.indexes & system.namespaces
   –  System metadata should be accessed via explicit commands going forward
      •  db.collection.getIndexes()
      •  db.getCollectionNames()
•  Cold start penalty
   –  due to separate WiredTiger cache

How to Run WiredTiger

How Do I Install It?

•  If starting from scratch, add one additional flag when launching mongod: --storageEngine=wiredTiger
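For scripted launches, the command line can be assembled like this. A minimal sketch: the helper name and the /data/wt path are assumptions, and the dbpath is expected to be empty, since WT will not start on MMAPv1 files.

```python
import shlex


def mongod_command(dbpath, storage_engine="wiredTiger"):
    """Build the argument list for starting mongod with a given storage engine."""
    return ["mongod", "--dbpath", dbpath, f"--storageEngine={storage_engine}"]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run() to actually start the server.
    print(shlex.join(mongod_command("/data/wt")))
```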

How Do I Upgrade to It?

•  2 ways:
   1.  Mongodump/Mongorestore
   2.  Initial sync a new replica member running WT
•  Note: you can run replicas with mixed storage engines
•  CANNOT copy raw data files!
   –  WT will fail to start if wrong data format in dbpath
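The first upgrade path (dump from the old instance, restore into a new WT-backed one) might be scripted along these lines. A minimal sketch: the helper name, host addresses and dump directory are placeholders, and it assumes a mongod running WiredTiger is already listening at the target address.

```python
def migration_commands(old_host, new_host, dump_dir):
    """Return (dump, restore) argument lists for a logical copy between instances."""
    dump = ["mongodump", "--host", old_host, "--out", dump_dir]
    restore = ["mongorestore", "--host", new_host, dump_dir]
    return dump, restore


if __name__ == "__main__":
    dump, restore = migration_commands("localhost:27017", "localhost:27018", "/backup/dump")
    # Print the commands; run them in order with subprocess.run(cmd, check=True).
    for cmd in (dump, restore):
        print(" ".join(cmd))
```

This moves data as BSON documents rather than raw files, which is why it works across storage engines.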

Other New Stuff in 3.0

Native Auditing for Any Operation

•  Essential for many compliance standards (e.g., PCI DSS, HIPAA, NIST 800-53, European Union Data Protection Directive)
•  MongoDB Native Auditing
   –  Construct and filter audit trails for any operation against the database, whether DML, DCL or DDL
   –  Can filter by user or action
   –  Audit log can be written to multiple destinations

50 Node Replica Sets

Enhanced Query Language and Tools

•  All tools rewritten in Go
   –  Smaller Package Size
   –  More rapid iteration
   –  Faster Loading and Export
•  Easier Query Optimization
   –  Explain 2.0
•  Improved Logging System
   –  Faster Debugging
•  Aggregation Framework Improvements
•  Geospatial Index Improvements

MMS & Ops Manager 1.6

The Best Way to Manage MongoDB
Up to 95% Reduction in Operational Overhead

•  Single-click provisioning, scaling & upgrades, admin tasks
•  Monitoring, with charts, dashboards and alerts on 100+ metrics
•  Backup and restore, with point-in-time recovery, support for sharded clusters

A Public Service Announcement

Please Upgrade to the Latest Version

•  2.4.14
•  2.6.9

25% off discount code: JakeAngerman

Questions?