Capped collections

Doing great things with MongoDB Capped Collections

Lennart Koopmann, MongoBerlin2011

www.lennartkoopmann.net / @_lennart

About me

22 years old
Living in Hamburg
Rails developer at XING AG
Developer of Graylog2

Capped collections are fixed sized collections that have a very high performance auto-FIFO age-out feature (age out is based on insertion order).

...mongodb.org says.

Let's take a deeper look at that

Capped collections have a fixed size

Can also have a fixed number of objects, but must always have a fixed size.

db.createCollection("somecoll", {capped:true,size:100000})

size is given in bytes

max 1GB on 32 bit system

only limited by system resources on 64bit systems

db.createCollection("somecoll", {capped:true,size:100000,max:100})

max is the number of objects

someDb.createCollection("somecoll", BasicDBObjectBuilder.start().add("capped", true).add("size", 100000).add("max", 100).get());

JAVA

Easy to create from code.

Should be available in all drivers.

some_db.create_collection("somecoll", {:capped => true, :size => 100000, :max => 100})

RUBY equivalent

automatic FIFO age-out

Once the space is fully utilized, newly added objects will replace the oldest objects in the collection.

CAPPED COLLECTIONS HAVE A...

everything just flowing through the available space
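The age-out behaviour can be pictured with a toy in-memory model. This is only a sketch of the idea, not how mongod stores data: the class name ToyCappedCollection and its size accounting (the length of doc.to_s) are assumptions made up for illustration.

```ruby
# Toy model of FIFO age-out. Not MongoDB code: the class name and the
# size accounting (length of doc.to_s) are invented for illustration.
class ToyCappedCollection
  def initialize(size:, max: nil)
    @size = size   # capacity in (pretend) bytes
    @max  = max    # optional cap on the number of documents
    @docs = []     # kept in insertion order
  end

  def insert(doc)
    @docs << doc
    # Drop the oldest documents until we are back within both limits.
    @docs.shift while over_limit?
    self
  end

  # Documents in insertion order, oldest first.
  def to_a
    @docs.dup
  end

  private

  def bytes
    @docs.sum { |d| d.to_s.length }
  end

  def over_limit?
    return false if @docs.empty?
    bytes > @size || (!@max.nil? && @docs.length > @max)
  end
end

coll = ToyCappedCollection.new(size: 1000, max: 3)
(1..5).each { |i| coll.insert({ "n" => i }) }
p coll.to_a.map { |d| d["n"] }  # only the three newest remain, oldest first
```

Documents 1 and 2 were pushed out by 4 and 5; nothing had to be deleted by hand.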

natural ordering

Queries with no sorting return objects in forward natural order.

Much faster than, for example, sorting on a timestamp field. Can be reversed: .sort({$natural:-1})

No sorting required: fast!

Like tail -f on logfile

Fully compatible with normal collections

No special queries, no special connections.

Normal collections are 100% compatible (e.g. $natural will still work)

No one would notice that it's not capped.

Re-create it from your code. Don't rely on automatic/lazy re-creation by mongod: the collection would be re-created as a normal collection otherwise.

Pitfall: Must be explicitly created!

Must be (re-)created with their capped/size/max parameters. Think of this in your application design.
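One way to follow this advice is to create the collection at application startup. The sketch below assumes a database handle that responds to collection_names and create_collection, as the legacy Ruby driver's DB object is understood to; ensure_capped and FakeDb are hypothetical names, and FakeDb only exists so the sketch runs without mongod.

```ruby
# Sketch: create the capped collection from application code at startup.
# ensure_capped is a hypothetical helper; db is assumed to respond to
# collection_names and create_collection.
def ensure_capped(db, name, opts = { :capped => true, :size => 100000, :max => 100 })
  return :exists if db.collection_names.include?(name)
  db.create_collection(name, opts)
  :created
end

# Stand-in for a real database handle so the sketch runs without mongod.
class FakeDb
  attr_reader :created

  def initialize
    @created = {}
  end

  def collection_names
    @created.keys
  end

  def create_collection(name, opts)
    @created[name] = opts
  end
end

db = FakeDb.new
p ensure_capped(db, "somecoll")  # first call creates it with capped options
p ensure_capped(db, "somecoll")  # second call leaves it alone
```

This sketch does not verify that an already existing collection is really capped. The "capped" flag in the stats() output could be checked for that.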

Perfect for:

Logging: Speedy by natural ordering. LRU behavior. Also think of activity logs.

LRU: least recently used. No size explosions.

GRAYLOG2

Perfect for:

Archiving: Roll out data over time. No need to write (and run) cron cleaner scripts.

graylog2 historic_system_values

Restrictions

Yepp, there are some restrictions

Objects/Documents can't grow in size.

You can change attributes, but only in the range of the original document size. Smaller is okay. Update will fail otherwise. Pre-filling is a workaround.
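Pre-filling might look roughly like this: reserve room for a field at insert time, then only ever write values that fit. The helper names prefill and fits? are hypothetical, and the JSON length is just a rough stand-in for the real BSON size.

```ruby
require 'json'

# Sketch of pre-filling: reserve room for a field at insert time so later
# updates never make the document bigger. JSON length is only a rough
# stand-in for the real BSON size; prefill and fits? are hypothetical names.
def prefill(doc, field, width)
  doc.merge(field => " " * width)
end

def fits?(original, updated)
  JSON.generate(updated).bytesize <= JSON.generate(original).bytesize
end

original = prefill({ "host" => "web1" }, "status", 16)
ok       = original.merge("status" => "processed")  # 9 characters
too_big  = original.merge("status" => "x" * 17)     # 17 characters

p fits?(original, ok)       # true: the update stays within the reserved space
p fits?(original, too_big)  # false: this update would grow the document
```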

Really important for your application design! Always, always keep this in mind. Once the docs are in, it's too late.

Objects/Documents can't be deleted.

You'll need logical deletion. Remember to already set deleted=false when creating the objects because they can't grow.
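Logical deletion can be sketched on plain hashes, without any driver. The field name deleted comes from the slide; the sample messages are made up.

```ruby
# Sketch of logical deletion on plain hashes (no driver involved).
# The flag is written as false at insert time, so flipping it to true later
# replaces one boolean with another and does not grow the document.
messages = [
  { "text" => "disk full", "deleted" => false },
  { "text" => "login ok",  "deleted" => false },
]

# "Delete" the first message by flipping the flag in place.
messages.first["deleted"] = true

# Readers filter on the flag, much like a query on {deleted: false} would.
visible = messages.reject { |m| m["deleted"] }
p visible.map { |m| m["text"] }
```

The flagged document still occupies its slot until it ages out on its own.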

Graylog2 messages. Can be turned off completely for performance reasons.

Maximum size on 32bit arch.

1GB on 32 bit architectures. (Only limited by system resources on 64 bit)

Sharding not supported.

> db.somecoll.stats();
{
  "count" : 137502,
  "size" : 47799720,
  "avgObjSize" : 347.6292708469695,
  "storageSize" : 50000128,
  "capped" : 1,
  "max" : 2147483647
}

(Only parameters relevant to capped collections for better readability.)

count: Number of objects

size: Current size (but pre-allocated on disk to storageSize)

storageSize: Capped to this size

max: Max # of objects. No limit was given here.

Can be read from every driver. Show in interfaces!
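As a rough idea of what an interface could show, the sketch below derives a fill level from the stats values. The hash repeats the numbers from the stats() example; fill_percent and object_limit are hypothetical helper names, and treating 2147483647 as "no limit" follows the note on the slide.

```ruby
# Sketch: turn stats() output into something an interface could display.
# The hash repeats the numbers from the stats example; fill_percent and
# object_limit are hypothetical helper names.
stats = {
  "count"       => 137502,
  "size"        => 47799720,
  "storageSize" => 50000128,
  "capped"      => 1,
  "max"         => 2147483647,
}

# Share of the capped size currently used by documents, in percent.
def fill_percent(stats)
  (stats["size"].to_f / stats["storageSize"] * 100).round(1)
end

# The slide reads max = 2147483647 (2**31 - 1) as "no limit was given".
def object_limit(stats)
  stats["max"] == 2**31 - 1 ? nil : stats["max"]
end

puts "#{stats['count']} objects, #{fill_percent(stats)}% of capped size used"
puts object_limit(stats).nil? ? "no object limit" : "limit: #{object_limit(stats)}"
```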

Tailable Cursors

OMG! It's tail -f on a collection!

Cursor is not closed after all data is received. Continue later and get all new data. Available in most mongo drivers.

Live tail, web sockets, candy

require 'mongo'

db = Mongo::Connection.new().db("my_db")
coll = db.collection("somecoll")
cursor = Mongo::Cursor.new(coll, :tailable => true)

loop do
  # next_document returns nil when there is nothing new yet
  if doc = cursor.next_document
    puts doc
  else
    sleep 1
  end
end

This example misses handling of cursors that became invalid.

RUBY

Prepare to become invalid.

Example: Cursor may become invalid if the last object returned is at the end of the collection and is deleted. Re-query in surrounding loop. Cursor with id 0 is dead.

Cursor can be dead upon creation when the initial query returned nothing.
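A surrounding re-query loop could be shaped like the sketch below. It is written against a stand-in cursor so it runs without a server: tail and FakeCursor are hypothetical names, and the alive? check is an assumption about what a driver cursor offers, so check your driver version before relying on it.

```ruby
# Sketch of a tail loop that re-queries when the cursor dies.
# open_cursor is any callable returning an object with next_document and
# alive?; whether a given driver version offers alive? is an assumption.
def tail(open_cursor, polls:, pause: 1)
  seen = []
  cursor = open_cursor.call
  polls.times do
    unless cursor.alive?
      sleep pause                 # avoid hammering the server
      cursor = open_cursor.call   # re-query in the surrounding loop
      next
    end
    if (doc = cursor.next_document)
      seen << doc
    else
      sleep pause                 # nothing new yet, wait and poll again
    end
  end
  seen
end

# Fake cursor so the sketch runs without a server: it is dead once drained,
# and dead upon creation when it starts out empty.
class FakeCursor
  def initialize(docs)
    @docs = docs
  end

  def alive?
    !@docs.empty?
  end

  def next_document
    @docs.shift
  end
end

batches = [[{ "n" => 1 }, { "n" => 2 }], [], [{ "n" => 3 }]]
opener  = -> { FakeCursor.new(batches.shift || []) }
p tail(opener, polls: 8, pause: 0).map { |d| d["n"] }
```

A real version would also have to re-query starting after the last document it has seen, otherwise already processed documents are read again.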

MongoDB replication uses tailable cursors.

Needs fast lookups to the end of the oplog. No indexes: faster inserts!

Thank you.

http://www.lennartkoopmann.net / @_lennart