MongoSV - Schema Design
8/8/2019 MongoSV - Schema Design
Schema Design
Alvin Richards
-
Topics
Introduction
Basic Data Modeling
Evolving a schema
Common patterns
- Single table inheritance
- One-to-Many & Many-to-Many
- Trees
- Queues
-
So why model data?
http://www.flickr.com/photos/42304632@N00/493639870/
-
A brief history of normalization
1970 E.F. Codd introduces 1st Normal Form (1NF)
1971 E.F. Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)
2002 Date, Darwen, Lorentzos define 6th Normal Form (6NF)
Goals:
Avoid anomalies when inserting, updating or deleting
Minimize redesign when extending the schema
Make the model informative to users
Avoid bias towards a particular style of query
* Source: Wikipedia
-
The real benefit of relational
Before relational
- Data and Logic combined
After relational
- Separation of concerns
- Data modeled independent of logic
- Logic freed from concerns of data design
MongoDB continues this separation
-
Relational made normalized data look like this
-
Document databases make normalized data look like this
-
Terminology
RDBMS          MongoDB
Table          Collection
Row(s)         JSON Document
Index          Index
Join           Embedding & Linking
Partition      Shard
Partition Key  Shard Key
-
DB Considerations
How can we manipulate this data?
- Dynamic Queries
- Secondary Indexes
- Atomic Updates
- Map Reduce
Considerations
- No Joins
- Document writes are atomic
Access Patterns?
- Read / Write Ratio
- Types of updates
- Types of queries
- Data life-cycle
-
So today's example will use...
-
Design Session
Design documents that simply map to your application
post = {author: "Hergé",
        date: new Date(),
        text: "Destination Moon",
        tags: ["comic", "adventure"]}
> db.posts.save(post)
-
Find the document
> db.posts.find()
{ _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author: "Hergé",
  date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text: "Destination Moon",
  tags: ["comic", "adventure"] }
Notes:
- ID must be unique, but can be anything you'd like
- MongoDB will generate a default ID if one is not supplied
-
Add an index, find via index
Secondary index for author
// 1 means ascending, -1 means descending
> db.posts.ensureIndex({author: 1})
> db.posts.find({author: 'Hergé'})
{ _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
  date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  author: "Hergé",
  ... }
-
Verifying indexes exist
> db.system.indexes.find()
// Index on ID
{ name: "_id_",
  ns: "test.posts",
  key: {"_id": 1} }
// Index on author
{ _id: ObjectId("4c4ba6c5672c685e5e8aabf4"),
  ns: "test.posts",
  key: {"author": 1},
  name: "author_1" }
-
Examine the query plan
> db.posts.find({author: 'Hergé'}).explain()
{
  "cursor": "BtreeCursor author_1",
  "nscanned": 1,
  "nscannedObjects": 1,
  "n": 1,
  "millis": 5,
  "indexBounds": {
    "author": [
      ["Hergé", "Hergé"]
    ]
  }
}
-
Query operators
Conditional operators:
$ne, $in, $nin, $mod, $all, $size, $exists, $type, ...
$lt, $lte, $gt, $gte
// find posts with any tags
> db.posts.find({tags: {$exists: true}})
Regular expressions:
// posts where author starts with h
> db.posts.find({author: /^h/i})
Counting:
// number of posts written by Hergé
> db.posts.find({author: "Hergé"}).count()
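To make the operator semantics concrete, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB: the `posts` array and its sample data are assumptions made for illustration, and each filter only approximates what `$exists`, a case-insensitive regular expression, and `count()` do on the server.

```javascript
// In-memory stand-in for the posts collection (sample data is illustrative).
const posts = [
  { author: "Hergé", text: "Destination Moon", tags: ["comic", "adventure"] },
  { author: "Kyle", text: "untagged note" },
];

// Roughly {tags: {$exists: true}}: the field is present on the document.
const withTags = posts.filter((p) =>
  Object.prototype.hasOwnProperty.call(p, "tags")
);

// Roughly {author: /^h/i}: author starts with "h", ignoring case.
const startsWithH = posts.filter((p) => /^h/i.test(p.author));

// Roughly find({author: "Hergé"}).count(): number of matching documents.
const hergeCount = posts.filter((p) => p.author === "Hergé").length;

console.log(withTags.length, startsWithH.length, hergeCount);
```

The point of the sketch is that each query is just a predicate over documents; whether the server can answer it efficiently depends on the indexes available.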
-
Extending the Schema
new_comment = {author: "Kyle",
               date: new Date(),
               text: "great book"}
> db.posts.update(
    {text: "Destination Moon"},
    {$push: {comments: new_comment},
     $inc: {comments_count: 1}})
-
Extending the Schema
{ _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author: "Hergé",
  date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text: "Destination Moon",
  tags: ["comic", "adventure"],
  comments: [
    { author: "Kyle",
      date: "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
      text: "great book" }
  ],
  comments_count: 1 }
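The effect of combining `$push` and `$inc` in one update can be mimicked in plain JavaScript. This is only a sketch: `applyUpdate` is a hypothetical helper, not a MongoDB function, and it handles just these two operators on a single in-memory document.

```javascript
// Hypothetical helper: applies a tiny subset of update operators to one document.
function applyUpdate(doc, update) {
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (!Array.isArray(doc[field])) doc[field] = []; // create the array on first push
    doc[field].push(value);
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // a missing field starts from 0
  }
  return doc;
}

const post = { author: "Hergé", text: "Destination Moon" };
const newComment = { author: "Kyle", date: new Date(), text: "great book" };

// Neither comments nor comments_count exists yet; the update creates both.
applyUpdate(post, {
  $push: { comments: newComment },
  $inc: { comments_count: 1 },
});

console.log(post.comments.length, post.comments_count);
```

Note that the original document had no `comments` field at all. The schema evolves as a side effect of the update, with no migration step.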
-
Extending the Schema
// create index on nested documents:
> db.posts.ensureIndex({"comments.author": 1})
> db.posts.find({"comments.author": "Kyle"})
// find last 5 posts:
> db.posts.find().sort({date: -1}).limit(5)
// most commented post:
> db.posts.find().sort({comments_count: -1}).limit(1)
When sorting, check if you need an index
-
Watch for full table scans
> db.posts.find({text: 'Destination Moon'}).explain()
{
  "cursor": "BasicCursor",
  "nscanned": 1,
  "nscannedObjects": 1,
  "n": 1,
  "millis": 0,
  "indexBounds": {}
}
-
Map Reduce
-
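Map reduce: count tags
mapFunc = function () {
  this.tags.forEach(function (z) { emit(z, {count: 1}); });
}
reduceFunc = function (k, v) {
  var total = 0;
  for (var i = 0; i < v.length; i++) { total += v[i].count; }
  return {count: total};
}
res = db.posts.mapReduce(mapFunc, reduceFunc)
> db[res.result].find()
{ _id: "comic", value: {count: 1} }
{ _id: "adventure", value: {count: 1} }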
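The data flow of the tag count can be traced with a small in-memory emulation. This is a sketch under stated assumptions: `runMap` is a hypothetical helper, `emit` is passed in as an argument rather than being a server-provided global, and the two sample posts are invented for illustration.

```javascript
// Sample documents (illustrative only).
const posts = [
  { tags: ["comic", "adventure"] },
  { tags: ["comic"] },
];

// Hypothetical map phase: run mapFunc on each document and
// collect every emitted value under its key.
function runMap(docs, mapFunc) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs.forEach((doc) => mapFunc.call(doc, emit));
  return emitted;
}

// Emit one {count: 1} per tag, as in the slide.
const mapFunc = function (emit) {
  this.tags.forEach((tag) => emit(tag, { count: 1 }));
};

// Sum the counts emitted for one key.
const reduceFunc = function (key, values) {
  let total = 0;
  for (const v of values) total += v.count;
  return { count: total };
};

// Reduce phase: one output value per distinct key.
const result = {};
for (const [key, values] of runMap(posts, mapFunc)) {
  result[key] = reduceFunc(key, values);
}

console.log(result);
```

The reduce function returns a value with the same shape as the values it receives, which is what lets a real map-reduce engine apply it repeatedly to partial results.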
-
Group
Equivalent to a Group By in SQL
Specify the attributes to group the data by
Process the results in a Reduce function
-
Group - Count posts by author
cmd = { key: {"author": true},
        initial: {count: 0},
        reduce: function (obj, prev) { prev.count++; }
      };
result = db.posts.group(cmd);
[
  { "author": "Hergé", "count": 1 },
  { "author": "Kyle", "count": 3 }
]
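The roles of `key`, `initial`, and `reduce` can be seen in a plain JavaScript analogue. The `group` function below is a hypothetical in-memory helper written for this illustration, not the shell's `db.collection.group`, and the sample documents are assumed.

```javascript
// Sample documents (illustrative only).
const posts = [
  { author: "Hergé" },
  { author: "Kyle" },
  { author: "Kyle" },
  { author: "Kyle" },
];

// Hypothetical in-memory group: one accumulator per distinct key value,
// seeded from `initial` and updated in place by `reduce`.
function group(docs, keyField, initial, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    if (!groups.has(key)) groups.set(key, { [keyField]: key, ...initial });
    reduce(doc, groups.get(key));
  }
  return [...groups.values()];
}

const result = group(posts, "author", { count: 0 }, (obj, prev) => {
  prev.count++;
});

console.log(result);
```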
-
Review
So Far:
- Started out with a simple schema
- Queried Data
- Evolved the schema
- Queried / Updated the data some more
-
http://devilseve.blogspot.com/2010/06/like-drinking-from-fire-hose.html
-
Inheritance
-
Single Table Inheritance - RDBMS
shapes table
id  type    area  radius  d  length  width
1   circle  3.14  1
2   square  4             2
3   rect    10               5       2
-
Single Table Inheritance - MongoDB
> db.shapes.find()
{ _id: "1", type: "circle", area: 3.14, radius: 1 }
{ _id: "2", type: "square", area: 4, d: 2 }
{ _id: "3", type: "rect", area: 10, length: 5, width: 2 }
// find shapes where radius > 0
> db.shapes.find({radius: {$gt: 0}})
// create index
> db.shapes.ensureIndex({radius: 1})
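On the application side, documents of different shapes in one collection are usually handled by dispatching on the `type` field. The following is a minimal sketch in plain JavaScript: the `areaByType` table is an assumption made for illustration, and the filter only approximates the `$gt` query above.

```javascript
// The three shape documents from the slide, held in memory.
const shapes = [
  { _id: "1", type: "circle", area: 3.14, radius: 1 },
  { _id: "2", type: "square", area: 4, d: 2 },
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 },
];

// Roughly {radius: {$gt: 0}}: documents without a radius simply do not match.
const withRadius = shapes.filter(
  (s) => typeof s.radius === "number" && s.radius > 0
);

// Hypothetical per-type area calculation, dispatching on the type field.
const areaByType = {
  circle: (s) => Math.PI * s.radius ** 2,
  square: (s) => s.d ** 2,
  rect: (s) => s.length * s.width,
};
const computed = shapes.map((s) => areaByType[s.type](s));

console.log(withRadius.map((s) => s._id), computed);
```

Each document carries only the fields its type needs, so there are no empty columns as in the relational table.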
-
One to Many
One to Many relationships can specify
- degree of association between objects
- containment
- life-cycle
-
One to Many
- Embedded Array / Array Keys
  - slice operator to return subset of array
  - some queries hard, e.g. find latest comments across all documents
- Embedded tree
  - Single document
  - Natural
  - Hard to query
- Normalized (2 collections)
  - most flexible
  - more queries
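To see why "latest comments across all documents" is awkward with embedded arrays, consider what the application has to do itself. The sketch below is plain JavaScript over assumed sample data, and `latestComments` is a hypothetical helper: every post must be visited and its comments flattened before they can be sorted.

```javascript
// Posts with embedded comment arrays (sample data is illustrative).
const posts = [
  { _id: 1, comments: [
    { author: "Kyle", date: new Date("2010-07-24T20:51:03Z") },
  ] },
  { _id: 2, comments: [
    { author: "Fred", date: new Date("2010-07-25T09:00:00Z") },
    { author: "Jim", date: new Date("2010-07-23T12:00:00Z") },
  ] },
];

// Hypothetical helper: flatten all embedded comments, newest first, keep n.
function latestComments(docs, n) {
  return docs
    .flatMap((post) => post.comments.map((c) => ({ postId: post._id, ...c })))
    .sort((a, b) => b.date - a.date)
    .slice(0, n);
}

const latest = latestComments(posts, 2);
console.log(latest.map((c) => c.author));
```

With a normalized comments collection the same question becomes a single sorted query, at the cost of an extra query whenever a post is displayed with its comments.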
-
One to Many - patterns
- Embedded Array / Array Keys
- Embedded tree
- Normalized
-
Many - Many
Example:
- Product can be in many categories
- Category can have many products
-
Many - Many
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Destination Moon",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af") ] }
categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "adventure",
    product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                   ObjectId("4c4ca30433fb5941681b9130"),
                   ObjectId("4c4ca30433fb5941681b913a") ] }
// All categories for a given product
> db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
-
Alternative
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Destination Moon",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af") ] }
categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "adventure" }
// All products for a given category
> db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
// All categories for a given product
product = db.products.findOne({_id: some_id})
> db.categories.find({_id: {$in: product.category_ids}})
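The trade-off of storing the links on one side only is that one direction needs two steps. A minimal in-memory sketch in plain JavaScript, assuming short string ids in place of ObjectIds and invented sample data; `productsIn` and `categoriesOf` are hypothetical helper names:

```javascript
// Links are stored on the product side only (ids are simplified strings).
const products = [
  { _id: "p1", name: "Destination Moon", category_ids: ["c1", "c2"] },
  { _id: "p2", name: "Explorers on the Moon", category_ids: ["c1"] },
];
const categories = [
  { _id: "c1", name: "adventure" },
  { _id: "c2", name: "comic" },
  { _id: "c3", name: "cooking" },
];

// All products for a given category: a single match on the array field.
const productsIn = (categoryId) =>
  products.filter((p) => p.category_ids.includes(categoryId));

// All categories for a given product: fetch the product first,
// then do an $in-style lookup on the category ids it holds.
function categoriesOf(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  return categories.filter((c) => product.category_ids.includes(c._id));
}

console.log(productsIn("c1").length, categoriesOf("p1").map((c) => c.name));
```

Keeping the ids on one side means a link is added or removed with a single document write, so the two collections cannot drift out of step.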
-
Trees
Full Tree in Document
{ comments: [
    { author: "Kyle", text: "...",
      replies: [
        { author: "Fred", text: "...", replies: [] }
      ] }
] }
Pros: Single Document, Performance, Intuitive
Cons: Hard to search, Partial Results, 4MB limit
-
Trees
Parent Links
- Each node is stored as a document
- Contains the id of the parent
Child Links
- Each node contains the ids of the children
- Can support graphs (multiple parents / child)
-
Array of Ancestors
- Store all Ancestors of a node
{ _id: "a" }
{ _id: "b", ancestors: ["a"], parent: "a" }
{ _id: "c", ancestors: ["a", "b"], parent: "b" }
{ _id: "d", ancestors: ["a", "b"], parent: "b" }
{ _id: "e", ancestors: ["a"], parent: "a" }
{ _id: "f", ancestors: ["a", "e"], parent: "e" }
// find all descendants of b:
> db.tree2.find({ancestors: 'b'})
// find all direct descendants of b:
> db.tree2.find({parent: 'b'})
// find all ancestors of f:
> ancestors = db.tree2.findOne({_id: 'f'}).ancestors
> db.tree2.find({_id: {$in: ancestors}})
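The three tree queries can be checked against the sample nodes with a small in-memory sketch. It is plain JavaScript, not MongoDB; `descendantsOf`, `childrenOf`, and `ancestorsOf` are hypothetical helper names that mirror the queries on `ancestors`, `parent`, and `$in`.

```javascript
// The six nodes from the slide, held in memory.
const tree = [
  { _id: "a" },
  { _id: "b", ancestors: ["a"], parent: "a" },
  { _id: "c", ancestors: ["a", "b"], parent: "b" },
  { _id: "d", ancestors: ["a", "b"], parent: "b" },
  { _id: "e", ancestors: ["a"], parent: "a" },
  { _id: "f", ancestors: ["a", "e"], parent: "e" },
];

// Roughly find({ancestors: id}): any node listing id among its ancestors.
const descendantsOf = (id) =>
  tree.filter((n) => (n.ancestors || []).includes(id)).map((n) => n._id);

// Roughly find({parent: id}): direct children only.
const childrenOf = (id) =>
  tree.filter((n) => n.parent === id).map((n) => n._id);

// Roughly findOne + find({_id: {$in: ancestors}}): two steps.
function ancestorsOf(id) {
  const node = tree.find((n) => n._id === id);
  const ids = node ? node.ancestors || [] : [];
  return tree.filter((n) => ids.includes(n._id)).map((n) => n._id);
}

console.log(descendantsOf("a"), childrenOf("b"), ancestorsOf("f"));
```

Descendant lookups need no recursion because every node carries its full ancestor list. The cost is that moving a subtree means rewriting that list on every node beneath it.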
-
Trees as Paths
Store hierarchy as a path expression
- Separate each node by a delimiter, e.g. "/"
- Use text search to find parts of a tree
{ comments: [
    { author: "Kyle", text: "initial post", path: "/" },
    { author: "Jim", text: "jim's comment", path: "/jim" },
    { author: "Kyle", text: "Kyle's reply to Jim", path: "/jim/kyle" } ] }
// Find the conversations Jim was part of
> db.posts.find({"comments.path": /^\/jim/i})
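Prefix matching on a path can be tried out in plain JavaScript. This is a sketch under assumptions: `subtree` and `escapeRegExp` are hypothetical helpers, and the pattern is anchored a little more tightly than a bare prefix so that a sibling such as "/jimmy" would not match.

```javascript
// The three comments from the slide, held in memory.
const comments = [
  { author: "Kyle", text: "initial post", path: "/" },
  { author: "Jim", text: "jim's comment", path: "/jim" },
  { author: "Kyle", text: "Kyle's reply to Jim", path: "/jim/kyle" },
];

// Escape regex metacharacters so a node name is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical helper: comments at or below the top-level node `name`.
function subtree(name) {
  // The name must be followed by "/" or the end of the path.
  const pattern = new RegExp("^/" + escapeRegExp(name) + "(/|$)", "i");
  return comments.filter((c) => pattern.test(c.path));
}

console.log(subtree("jim").map((c) => c.path));
```

Because the pattern is anchored at the start of the string, a database can serve this kind of query from an ordinary index on the path field; an unanchored pattern would have to examine every path.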
-
Queue
Requirements
- See jobs waiting, jobs in progress
- Ensure that each job is started once and only once
{ inprogress: false,
  priority: 1,
  ...
}
// find highest priority job and mark as in-progress
job = db.jobs.findAndModify({
  query: {inprogress: false},
  sort: {priority: -1},
  update: {$set: {inprogress: true, started: new Date()}},
  new: true})
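The "once and only once" requirement comes down to selecting a job and marking it in the same step. The sketch below shows that claim logic in plain JavaScript on an in-memory array; `claimNextJob` is a hypothetical helper and the sample jobs are assumed. In a real deployment the atomicity comes from `findAndModify` on the server, not from application code like this.

```javascript
// In-memory job queue (sample data is illustrative).
const jobs = [
  { _id: 1, inprogress: false, priority: 1 },
  { _id: 2, inprogress: false, priority: 5 },
  { _id: 3, inprogress: true, priority: 9 },
];

// Hypothetical analogue of the findAndModify call: pick the highest-priority
// waiting job and mark it in progress before handing it to the caller.
function claimNextJob(queue) {
  const waiting = queue.filter((j) => !j.inprogress);
  if (waiting.length === 0) return null; // nothing left to claim
  const job = waiting.reduce((best, j) => (j.priority > best.priority ? j : best));
  job.inprogress = true;
  job.started = new Date();
  return job;
}

const first = claimNextJob(jobs);
const second = claimNextJob(jobs);
const third = claimNextJob(jobs);

console.log(first._id, second._id, third);
```

Job 3 is never handed out even though it has the highest priority, because it was already in progress; once every job is claimed the function returns `null`.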
-
Remember me?
http://devilseve.blogspot.com/2010/06/like-drinking-from-fire-hose.html
-
Summary
Schema design is different in MongoDB
Basic data design principles stay the same
Focus on how the app manipulates data
Rapidly evolve schema to meet your requirements
Enjoy your new freedom, use it wisely :-)
-
@mongodb
conferences, appearances, and meetups
http://www.10gen.com/events
http://bit.ly/mongo>
Facebook | Twitter | LinkedIn
http://linkd.in/joinmongo
download at mongodb.org
We're Hiring [email protected]