MongoDB: Awesomely Dangerous

Post on 15-Jan-2015


MongoDB: Awesomely Dangerous

Twin Cities Code Camp, October 2010

Sunday, October 10, 2010

Ethan Gunderson

http://ethangunderson.com

Twitter & Github: ethangunderson


Our agenda

1) Intro to MongoDB

2) Advanced Features

3) Write Path

4) Durability

5) Scaling

6) Takeaways

7) Qs and As


30,000 foot overview

• Schema-less

• Scalable

• High-Performance

• Open Source

• Document Database


Driving Principles

1) Performance

2) Performance

3) Scalability


Documents

• Similar to traditional rows

• First attribute is _id, which is an ObjectId


ObjectId

1) Timestamp

2) Machine Id

3) Process Id

4) Counter

4b6857a07613c367094426b2
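A minimal sketch of how the example id above splits into the four parts listed. It assumes the 12-byte layout of 4 bytes of timestamp, 3 of machine id, 2 of process id, and 3 of counter; `split_object_id` is a made-up helper name, not a driver function.

```python
from datetime import datetime, timezone


def split_object_id(hex_id: str) -> dict:
    """Split a 24-character ObjectId hex string into its four fields.

    Assumed layout: 4-byte timestamp, 3-byte machine id,
    2-byte process id, 3-byte counter.
    """
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    return {
        "timestamp": int.from_bytes(raw[0:4], "big"),  # seconds since the Unix epoch
        "machine_id": raw[4:7].hex(),
        "process_id": raw[7:9].hex(),
        "counter": int.from_bytes(raw[9:12], "big"),
    }


parts = split_object_id("4b6857a07613c367094426b2")
created = datetime.fromtimestamp(parts["timestamp"], tz=timezone.utc)
print(parts)
print(created.isoformat())
```

Because the timestamp comes first, ids generated later sort after ids generated earlier.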


Embedded Documents

• Documents in Documents

• Indexable

• Queryable


Max Document Size

• 4MB limit on individual documents

• Realistically ~250kb

• Used to force better data modeling


Collections

• Similar to traditional tables

• Collections of like documents

• Schema-less


Capped Collections

• Fixed size collections

• Must be explicitly created

• Limited functionality (no deletes, limited updates)
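A toy model of the fixed-size behaviour, in plain Python rather than Mongo itself. It assumes the cap is a document count for simplicity (a real capped collection is sized in bytes), and `CappedLog` is a hypothetical name.

```python
from collections import deque


class CappedLog:
    """Toy fixed-size collection: keeps only the newest max_docs documents."""

    def __init__(self, max_docs: int):
        # A deque with maxlen discards the oldest entry once it is full.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc: dict) -> None:
        self._docs.append(doc)

    def find(self) -> list:
        # Documents come back in insertion order.
        return list(self._docs)


log = CappedLog(max_docs=3)
for n in range(5):
    log.insert({"n": n})
print(log.find())
```

There is no delete method on purpose: old documents only leave by being overwritten.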


BSON

A language-independent data interchange format

• The language of Mongo

• Similar to JSON, but BETTER

• Fast

• 10gen driver support for: C, C++, Java, JavaScript, Perl, PHP, Python, Ruby

• Community support for: REST, C#, Clojure, ColdFusion, Scala, and many more
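To make the format concrete, here is a hand-rolled encoder for a tiny subset of BSON (string and 32-bit integer values only). It is a sketch for reading alongside the spec at bsonspec.org, not a replacement for a driver; `encode_bson` is a made-up name.

```python
import struct


def encode_bson(doc: dict) -> bytes:
    """Encode a flat dict of str / int32 values as a BSON document."""
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"  # element name as a C string
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # type 0x02: int32 byte length (including the null), then the bytes
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # type 0x10: little-endian 32-bit integer
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError(f"type not covered by this sketch: {type(value).__name__}")
    # The total length counts the 4-byte prefix and the trailing null byte.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


encoded = encode_bson({"hello": "world"})
print(encoded.hex())
```

Every value carries its type and length up front, which is what lets a reader skip over fields without parsing them.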


B-Tree Indexes

• Similar to a traditional RDBMS

• Can index any field, including arrays

• In a unique index, documents missing the key are indexed with a value of null

• Index builds block by default


Inserting


Updating


Modifier Operators

$inc $set $push $pushAll $pop $pull $pullAll $addToSet

Conventional updates do work; they’re just not as fast.
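A rough in-memory illustration of what a few of these modifiers mean. It mimics the semantics on a plain Python dict and says nothing about how the server applies them; `apply_modifiers` and the sample post are hypothetical.

```python
import copy


def apply_modifiers(doc: dict, update: dict) -> dict:
    """Return a copy of doc with $inc, $set, $push and $addToSet applied."""
    result = copy.deepcopy(doc)
    for op, fields in update.items():
        for field, value in fields.items():
            if op == "$inc":
                result[field] = result.get(field, 0) + value
            elif op == "$set":
                result[field] = value
            elif op == "$push":
                result.setdefault(field, []).append(value)
            elif op == "$addToSet":
                items = result.setdefault(field, [])
                if value not in items:  # only append values not already present
                    items.append(value)
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
    return result


post = {"title": "Hello", "views": 1, "tags": ["mongo"]}
updated = apply_modifiers(post, {"$inc": {"views": 1}, "$push": {"tags": "nosql"}})
print(updated)
```

The point of a modifier is that the client sends only the change, not the whole document.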


Querying


Query Operators

$in $nin $all $ne $gt $gte $lt $lte $size $where $limit $offset $sort $slice
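A small sketch of how the comparison and array operators read, as a matcher over plain Python dicts. This is an illustration only; it assumes a document missing a queried field never matches, and `matches` and the sample data are made up.

```python
import operator

# Each operator takes (field value, query argument).
COMPARATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$ne": operator.ne,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
    "$all": lambda value, arg: all(item in value for item in arg),
    "$size": lambda value, arg: len(value) == arg,
}


def matches(doc: dict, query: dict) -> bool:
    """True when doc satisfies every condition in query."""
    for field, condition in query.items():
        if field not in doc:
            return False  # simplification made for this sketch
        value = doc[field]
        if isinstance(condition, dict):
            if not all(COMPARATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            return False
    return True


people = [
    {"name": "Ann", "age": 31, "langs": ["ruby", "js"]},
    {"name": "Bob", "age": 17, "langs": ["php"]},
]
adults_with_ruby = [
    p["name"]
    for p in people
    if matches(p, {"age": {"$gte": 18}, "langs": {"$all": ["ruby"]}})
]
print(adults_with_ruby)
```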


Map / Reduce

• Replaces GROUP BY in SQL

• Similar in spirit to Hadoop, with all input coming from a collection and all output going to a collection

• Runs in parallel on all shards, but only one thread per node

• map and reduce functions are written in JavaScript
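The GROUP BY comparison can be sketched in a few lines. This version is in Python rather than the JavaScript Mongo expects, runs in a single process, and uses made-up names (`map_reduce`, `map_by_author`, `count_values`) and sample data.

```python
from collections import defaultdict


def map_reduce(collection, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    grouped = defaultdict(list)
    for doc in collection:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Output documents take the shape {_id: key, value: reduced}.
    return [{"_id": key, "value": reduce_fn(key, values)} for key, values in grouped.items()]


def map_by_author(doc):
    # Emit one count per post, keyed by author.
    yield doc["author"], 1


def count_values(key, values):
    return sum(values)


posts = [{"author": "ethan"}, {"author": "kate"}, {"author": "ethan"}]
result = map_reduce(posts, map_by_author, count_values)
print(result)
```

This is the equivalent of `SELECT author, COUNT(*) ... GROUP BY author`, with the result landing in a collection of its own.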


GIS

• Built with location based queries in mind

• Assumes a flat map model of the Earth(!)


Near Queries


Bounded Queries
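A rough sketch of what near and bounded queries compute under a flat-map model, in plain Python rather than Mongo's query syntax. Distances are straight-line (Euclidean) on the coordinates; `near`, `within_box`, and the sample places are made-up names for illustration.

```python
import math


def near(points, center, limit=2):
    """Return the points closest to center, using flat (Euclidean) distance."""
    def distance(point):
        x, y = point["loc"]
        return math.hypot(x - center[0], y - center[1])

    return sorted(points, key=distance)[:limit]


def within_box(points, lower_left, upper_right):
    """Return the points inside the box, edges included."""
    (x1, y1), (x2, y2) = lower_left, upper_right
    return [
        p for p in points
        if x1 <= p["loc"][0] <= x2 and y1 <= p["loc"][1] <= y2
    ]


places = [
    {"name": "a", "loc": (0, 0)},
    {"name": "b", "loc": (3, 4)},
    {"name": "c", "loc": (10, 10)},
]
closest = [p["name"] for p in near(places, center=(1, 1), limit=2)]
boxed = [p["name"] for p in within_box(places, (2, 2), (11, 11))]
print(closest, boxed)
```

Treating longitude and latitude as flat x and y is cheap, but the error grows with distance and near the poles.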


GridFS

• How you store large files in Mongo

• Spreads data across multiple 256 KB documents in a ‘chunks’ collection

• Metadata about the file is stored in a ‘files’ collection

• Permits range operations (x bytes from file)
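The chunking and the range read can be sketched without a database. The field names follow the GridFS convention (`files_id`, `n`, `data`, `length`, `chunkSize`), but `store` and `read_range` are hypothetical helpers working on in-memory lists, not driver calls.

```python
CHUNK_SIZE = 256 * 1024  # 256 KB, as on the slide


def store(file_id, data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into chunk documents plus one metadata document."""
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunks = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def read_range(files_doc, chunks, offset: int, length: int) -> bytes:
    """Read length bytes from offset, touching only the chunks that overlap.

    Assumes chunks is ordered by its "n" field.
    """
    size = files_doc["chunkSize"]
    end = min(offset + length, files_doc["length"])
    if offset >= end:
        return b""
    first, last = offset // size, (end - 1) // size
    needed = b"".join(chunk["data"] for chunk in chunks[first:last + 1])
    start = offset - first * size
    return needed[start:start + (end - offset)]


data = bytes(i % 251 for i in range(600 * 1024))
files_doc, chunks = store("report.pdf", data)
piece = read_range(files_doc, chunks, offset=262140, length=10)
print(len(chunks), piece == data[262140:262150])
```

The read above spans the boundary between the first and second chunks, so only those two documents are needed.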


The journey of a write


Memory Mapped File

Save!

Success!


Safe Mode

• Lets you choose the durability of a write on a per-query basis

• Sacrifice performance for safety

• Options for each stage of the write


Safe mode query


Memory Mapped File

Save!

Success!


Fsync

Every 60 seconds, or when the kernel forces it
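The gap between "written to the memory mapped file" and "on disk" can be seen with Python's own `mmap` module. This is an analogy for the write path described here, not Mongo's code; `write_through_mmap` and the `data.0` file name are made up.

```python
import mmap
import os
import tempfile


def write_through_mmap(path: str, payload: bytes, sync: bool) -> None:
    """Write payload at the start of a memory mapped file, optionally forcing it to disk."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            # The bytes land in memory first; a caller could be told
            # "Success!" at this point without anything being on disk.
            mm[:len(payload)] = payload
            if sync:
                # Ask the OS to write the dirty pages out now rather than
                # whenever it gets around to it.
                mm.flush()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.0")
    with open(path, "wb") as f:
        f.write(b"\x00" * mmap.PAGESIZE)  # a memory map needs a non-empty file
    write_through_mmap(path, b"hello", sync=True)
    with open(path, "rb") as f:
        stored = f.read(5)

print(stored)
```

With `sync=False` the data usually still reaches disk eventually, but a crash or power loss in between can lose it.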


Safe mode with fsync


Memory Mapped File

Save!

Success!


Safe mode with replication flag


Memory Mapped File

Save!

Success!


Safe mode with fsync and replication


Memory Mapped File

Save!

Success!


Low and High Value

• Worth deciding, as you build each query, whether the data is low or high value

• Lets you be more careful (and slower) with data that is more important


Single Server Durability

It’s not


What 10gen has to say...

http://blog.mongodb.org/post/381927266/what-about-durability

True single server durability is almost never done correctly. First, there are many scenarios in which that server loses all its data no matter what. If there is water damage, fire, some hardware problems, etc… no matter how durable the software is, data can be lost.

The path to true durability is replication.


Scaling

Since we’re forced to think about it


Determining how to scale

Reads

Writes


Replica Sets

• Distribute reads across the cluster

• Replaces the traditional Master/Slave setup

• Replication is done via an operations log (the oplog)

• Auto failover

• Rack and Datacenter aware

• Smart, very smart


Diagram: a master with three slaves. The master fails (X), one of the slaves becomes the new master, and the old master comes back as a slave.

Diagram: after a further failure (X), two slaves both volunteer to become master ("Me! Me!"). An arbiter running on a web slice settles it ("You").
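A toy vote count showing why an arbiter helps. It assumes a hypothetical set in which two data-bearing members each vote for themselves, and it ignores everything else a real election weighs; `elect` is a made-up function, not Mongo's election protocol.

```python
from collections import Counter


def elect(votes: dict, total_voters: int):
    """Return the candidate holding a strict majority of all voters, or None."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > total_voters // 2 else None


# Two members, each voting for itself: a tie, so nobody becomes master.
tied = elect({"slave_a": "slave_a", "slave_b": "slave_b"}, total_voters=2)

# An arbiter holds no data but does get a vote, which breaks the tie.
settled = elect(
    {"slave_a": "slave_a", "slave_b": "slave_b", "arbiter": "slave_b"},
    total_voters=3,
)
print(tied, settled)
```

Because the arbiter stores nothing, it can run on a small box that is doing something else, such as a web slice.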


Determining how to scale

Reads

Writes


Being write heavy

Currently, Mongo can only process one write at a time.

Usually not a problem, as writes are wicked fast


Auto-Sharding

• Partitions data across the cluster in an order-preserving manner

• No support for load-based partitioning

• Automatic failover and balancing of nodes

• Distributes writes across the cluster

• Based heavily on Yahoo!’s PNUTS and Google’s BigTable
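Order-preserving partitioning can be sketched as a lookup over sorted split points. This is a minimal illustration with fixed ranges and no balancing; `RangeSharder`, the split points, and the shard names are all hypothetical.

```python
import bisect


class RangeSharder:
    """Route keys to shards by range, so neighbouring keys stay together."""

    def __init__(self, split_points, shards):
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = sorted(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        # Ranges are [min, s0), [s0, s1), ..., [sN, max): a key equal to a
        # split point belongs to the range that starts there.
        return self.shards[bisect.bisect_right(self.split_points, key)]


sharder = RangeSharder(["h", "p"], ["shard0", "shard1", "shard2"])
routed = {name: sharder.shard_for(name) for name in ["ethan", "h", "mongo", "zed"]}
print(routed)
```

Keeping key order means a range query only has to visit the shards whose ranges overlap it, at the cost of hot spots when inserts cluster around one key range.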


Takeaways

• Mongo is fast, but it does interesting things to be that fast

• Mongo is not SQL. You will need to learn new things


Qs and As

http://spkr8.com/t/4756


Resources

Official MongoDB site: http://mongodb.org

BSON site: http://bsonspec.org

Comprehensive writeup of Mongo features: http://www.markus-gattol.name/ws/mongodb.html
