Mongo presentation conf

MongoDB Introduction and Internal by Shridhar Joshi

Transcript of Mongo presentation conf

Page 1: Mongo presentation conf

MongoDB Introduction and Internal

by Shridhar Joshi

Page 2: Mongo presentation conf

What is MongoDB?

• Open source, scalable, high-performance, document-oriented NoSQL Key-Value based database.

• Features
  • JSON-style, document-oriented, schema-less storage
  • B-tree indexes supported on any attribute
  • Log-based replication for Master/Slave and Replica Set
  • Auto-sharding architecture (via horizontal partitioning) that scales to thousands of nodes
  • NoSQL-style queries
  • Surprising update behaviors
  • Map/Reduce support
  • GridFS specification for storing large files
  • Developed by 10gen, with commercial support

Page 3: Mongo presentation conf

Well/Less Well Suited

Source: http://www.mongodb.org/display/DOCS/Use+Cases

Page 4: Mongo presentation conf

Basic concepts in MongoDB

MongoDB (NoSQL)    Relational DBMS
Database           Database
Collection         Relation
Document           Tuple
Field              Column
Index              Index
Cursor             Cursor

MongoDB
  Databases*
    Collections*
      Documents*
        Fields*
      Indexes*

Relational DBMS
  Databases*
    Relations*
      Columns*
      Indexes*

* means 0 or more objects

Each document defines its own fields, which is what makes MongoDB schema-less.
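To make the schema-less point concrete, here is a minimal sketch in plain JavaScript. It models a collection as an ordinary array, which is an assumption made purely for illustration; no MongoDB server or driver is involved, and the two documents are sample data.

```javascript
// In-memory stand-in for a collection: just an array of documents.
// (Illustrative only; this is not how MongoDB stores data.)
const collection = [];

// Two documents in the same collection with different field sets.
collection.push({ name: "bob", age: 30 });
collection.push({ name: "alice", gender: "female" });

// Collect the field names each document actually carries.
const fieldSets = collection.map((doc) => Object.keys(doc).sort());

console.log(fieldSets);
```

Nothing declares in advance which fields a document may have; each document simply carries its own.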

Page 5: Mongo presentation conf

CRUD Demo time

> show dbs                                               // view existing databases
> use test                                               // use database "test"
> db.t.insert({name:'bob',age:30})                       // insert bob, 30 years old
> db.t.insert({name:'alice',gender:'female'})            // insert alice, female
> db.t.find()                                            // list all documents in collection t
> db.t.find({name:'bob'},{age:1})                        // find bob, returning only the age field (plus _id)
> db.t.find().limit(1).skip(1)                           // find the second document
> db.t.find().sort({name:1})                             // sort the results by name, ascending
> db.t.find({$or:[{name:'bob'},{name:'tom'}]})           // find the documents for bob or tom
> db.t.update({name:'bob'},{$set:{age:31}},false,true)   // set age to 31 for every bob (upsert=false, multi=true)
> db.stats()                                             // database statistics
> db.getCollectionNames()                                // collections in this db
> db.t.ensureIndex({name:1})                             // create an index on name
> db.people.find({name:"bob"}).explain()                 // show the query plan
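The difference between the first argument of find (the filter) and the second (the projection) is easy to miss in the demo. The sketch below emulates both on an in-memory array in plain JavaScript. The helper names matches and find, the integer _id values, and the restriction to equality filters and inclusion projections are all assumptions made for this illustration; it is not the server's implementation.

```javascript
// Hypothetical in-memory model of collection t holding the two demo documents.
const t = [
  { _id: 1, name: "bob", age: 30 },
  { _id: 2, name: "alice", gender: "female" },
];

// A document matches when every key/value pair in the query equals its own.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Emulates find(query, projection) for equality queries and inclusion projections only.
function find(docs, query = {}, projection = null) {
  const selected = docs.filter((doc) => matches(doc, query));
  if (!projection) return selected.map((doc) => ({ ...doc }));
  return selected.map((doc) => {
    const out = { _id: doc._id }; // _id is returned unless explicitly excluded
    for (const key of Object.keys(projection)) {
      if (projection[key] && key in doc) out[key] = doc[key];
    }
    return out;
  });
}

const bobAge = find(t, { name: "bob" }, { age: 1 });  // filter on name, project age
const second = find(t).slice(1, 2);                   // same effect as .skip(1).limit(1)
const sorted = find(t).sort((a, b) => a.name.localeCompare(b.name)); // like sort({name:1})

console.log(bobAge, second, sorted);
```

Here {age:1} selects which fields come back; it does not filter by age.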

Page 6: Mongo presentation conf

Query Optimization

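db.people.find({x:10,y:"foo"})

[Diagram: three candidate plans for this query on collection people: an index scan using the index on x, an index scan using the index on y, and a DiskLocation scan over the whole collection.]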
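The three candidate plans on this slide can be compared with a small in-memory sketch in plain JavaScript. The data set, the helper names buildIndex, indexScan and collectionScan, and the selection rule (prefer the plan that examines the fewest documents) are assumptions made for illustration. The slide does not say how MongoDB actually picks a plan.

```javascript
// Hypothetical data: 1000 documents in collection "people".
const people = [];
for (let i = 0; i < 1000; i++) {
  people.push({ x: i % 100, y: i % 2 === 0 ? "foo" : "bar" });
}

// Simple index: field value -> list of document positions
// (the positions stand in for disk locations).
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, loc) => {
    if (!index.has(doc[field])) index.set(doc[field], []);
    index.get(doc[field]).push(loc);
  });
  return index;
}

const indexOnX = buildIndex(people, "x");
const indexOnY = buildIndex(people, "y");
const query = { x: 10, y: "foo" };
const isMatch = (doc) => doc.x === query.x && doc.y === query.y;

// Each plan reports its results and how many documents it had to examine.
function indexScan(index, field) {
  const locs = index.get(query[field]) || [];
  return { examined: locs.length, results: locs.map((loc) => people[loc]).filter(isMatch) };
}
function collectionScan() {
  return { examined: people.length, results: people.filter(isMatch) };
}

const plans = {
  "index scan on x": indexScan(indexOnX, "x"),
  "index scan on y": indexScan(indexOnY, "y"),
  "collection scan": collectionScan(),
};

// Assumed selection rule: prefer the plan that examined the fewest documents.
const best = Object.keys(plans).reduce((a, b) =>
  plans[a].examined <= plans[b].examined ? a : b
);
console.log(best, plans[best].examined);
```

All three plans return the same documents; they differ only in how much work they do to find them.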

Page 7: Mongo presentation conf

MongoDB Architecture

Source: mongoDB Replication and Replica Set by Dwight Merriman 10gen

Page 8: Mongo presentation conf

MongoDB Sharding

MongoDB uses two key operations to facilitate sharding: split and migrate. Split divides a chunk into two ranges; it is done to ensure that no single chunk is unusually large. Migrate moves a chunk (the data associated with a key range) to another shard. This is done as needed to rebalance.

Split is an inexpensive metadata operation, while migrate is expensive as large amounts of data may be moving server to server. Both splits and migrates are performed automatically.
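The cost difference between the two operations is easier to see in a sketch. The plain JavaScript below models a chunk as a key range plus an owning shard. The object shape, the median split point, and the function names split and migrate are assumptions made for illustration, not MongoDB internals.

```javascript
// A chunk covers the shard-key range [min, max) and lives on one shard.

// Split: replace one chunk with two chunks that cover the same range.
// Only the range metadata changes; no document moves to another shard.
function split(chunk, keysInChunk) {
  const sorted = [...keysInChunk].sort((a, b) => a - b);
  const middle = sorted[Math.floor(sorted.length / 2)]; // assumed split point: the median key
  return [
    { min: chunk.min, max: middle, shard: chunk.shard },
    { min: middle, max: chunk.max, shard: chunk.shard },
  ];
}

// Migrate: reassign a chunk to another shard. In a real cluster this is
// the expensive step, because the chunk's documents are copied between servers.
function migrate(chunk, toShard) {
  return { ...chunk, shard: toShard };
}

const chunk = { min: 0, max: 100, shard: "shardA" };
const keys = [5, 12, 40, 41, 77, 93]; // hypothetical shard-key values in the chunk
const [left, right] = split(chunk, keys);
const moved = migrate(right, "shardB");

console.log(left, right, moved);
```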

MongoDB has a subsystem called the Balancer, which monitors the load on each shard and moves chunks around if it finds an imbalance.

If you add a new shard to the system, some chunks will eventually be moved to that shard to spread out the load.

A recently split chunk may be moved immediately to a new shard if the system predicts that future insertions will benefit from that move.
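The balancing behavior described above, including what happens when an empty shard is added, can be sketched in plain JavaScript. The rule used here (move one chunk at a time from the most loaded to the least loaded shard until the chunk counts differ by less than a threshold) and the threshold value are assumptions made for illustration. The real Balancer's policy is not given on the slide.

```javascript
// shards: shard name -> list of chunk ids. Returns the list of moves performed.
function balance(shards, threshold = 2) {
  const moves = [];
  for (;;) {
    const names = Object.keys(shards);
    // Find the most and least loaded shards by chunk count.
    const most = names.reduce((a, b) => (shards[a].length >= shards[b].length ? a : b));
    const least = names.reduce((a, b) => (shards[a].length <= shards[b].length ? a : b));
    // Stop once the imbalance falls below the (assumed) threshold.
    if (shards[most].length - shards[least].length < threshold) break;
    const chunk = shards[most].pop();
    shards[least].push(chunk);
    moves.push({ chunk, from: most, to: least });
  }
  return moves;
}

// Hypothetical cluster: shardC has just been added and holds no chunks yet.
const shards = {
  shardA: ["c1", "c2", "c3", "c4", "c5", "c6"],
  shardB: ["c7", "c8"],
  shardC: [],
};
const moves = balance(shards);

console.log(moves.length, shards);
```

After the run, chunks have drifted from the overloaded shard to the others, including the newly added one.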

Page 9: Mongo presentation conf

MongoDB Sharding

Pull mode

Page 10: Mongo presentation conf

MongoDB Sharding: Briefly

FROM: C   TO: N

Sequence

C (FROM):
# Make sure my view is complete and lock
# Get the documents' DiskLoc for sharding
# Trigger N to start sharding in pull mode

N (TO):
# Copy the index definitions from C
# Remove existing data in [min~max]
# Clone the data in [min~max] from C
# Ask C to replicate the changes

Commit:
# C: ask N to commit
# N: commit
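The pull-mode steps on the receiving side can be sketched with two in-memory shards in plain JavaScript. The object shapes, the change log kept on C, and the function names pullChunk and replicateChanges are assumptions made for illustration. Treating [min~max] as a half-open range is also an assumption.

```javascript
// Hypothetical in-memory shards: documents keyed by shard key, plus index definitions.
const C = {
  indexes: [{ name: 1 }],
  docs: new Map([[10, "a"], [20, "b"], [30, "c"]]),
  changes: [], // writes that arrive on C while the migration is running
};
const N = { indexes: [], docs: new Map([[20, "stale"]]), committed: false };
const range = { min: 10, max: 30 }; // chunk being moved: keys in [min, max)
const inRange = (key) => key >= range.min && key < range.max;

// Steps performed by N, which pulls everything from C.
function pullChunk(from, to) {
  // Copy the index definitions.
  to.indexes = from.indexes.map((def) => ({ ...def }));
  // Remove any existing data in the range.
  for (const key of [...to.docs.keys()]) {
    if (inRange(key)) to.docs.delete(key);
  }
  // Clone the data in the range.
  for (const [key, doc] of from.docs) {
    if (inRange(key)) to.docs.set(key, doc);
  }
}

// Replay the changes that C recorded while the clone was running.
function replicateChanges(from, to) {
  for (const change of from.changes.splice(0)) {
    if (!inRange(change.key)) continue;
    if (change.op === "delete") to.docs.delete(change.key);
    else to.docs.set(change.key, change.doc);
  }
}

pullChunk(C, N);
// A write hits C during the migration and is logged for N to pick up.
C.docs.set(20, "b2");
C.changes.push({ op: "update", key: 20, doc: "b2" });
replicateChanges(C, N);
N.committed = true; // C asks N to commit, and N commits

console.log([...N.docs], N.indexes, N.committed);
```

The change log is what lets C keep accepting writes during the copy: N replays whatever it missed before committing.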

Page 11: Mongo presentation conf

MongoDB Sharding: In Detail

[Diagram: step-by-step exchange between the FROM shard and the TO shard]

Note: Documents on FROM can still be updated or deleted during sharding; TO catches up on those changes in step 4.

Page 12: Mongo presentation conf

Replication and Sharding

Source: http://www.mongodb.org/display/DOCS/Simple+Initial+Sharding+Architecture

Page 13: Mongo presentation conf

MongoDB Replication: Pull mode

The slave continuously pulls the oplog from the master.
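Pull-mode replication can be sketched with an in-memory master and slave in plain JavaScript. The integer timestamps, the entry shape, and the function names write and pullOnce are assumptions made for illustration. A real slave would poll continuously rather than call pullOnce by hand.

```javascript
// Hypothetical master: applies each write and appends it to an ordered oplog.
const master = { data: {}, oplog: [] };
function write(key, value) {
  master.data[key] = value;
  master.oplog.push({ ts: master.oplog.length + 1, key, value });
}

// The slave remembers the timestamp of the last oplog entry it applied.
const slave = { data: {}, lastTs: 0 };

// One pull: fetch the entries newer than lastTs and apply them in order.
function pullOnce() {
  const fresh = master.oplog.filter((entry) => entry.ts > slave.lastTs);
  for (const entry of fresh) {
    slave.data[entry.key] = entry.value;
    slave.lastTs = entry.ts;
  }
  return fresh.length;
}

write("a", 1);
write("b", 2);
const firstPull = pullOnce();  // applies both entries
write("a", 3);
const secondPull = pullOnce(); // applies only the new entry

console.log(firstPull, secondPull, slave.data);
```

The master never pushes anything; the slave decides when to ask and where to resume.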