Schema design mongo_boston

48
Senior Solution Architect, 10gen Chad Tindel #MongoBoston Schema Design

Transcript of Schema design mongo_boston

Page 1: Schema design mongo_boston

Senior Solution Architect, 10gen

Chad Tindel

#MongoBoston

Schema Design

Page 2: Schema design mongo_boston

Single Table En

Agenda

• Working with documents

• Evolving a Schema

• Queries and Indexes

• Common Patterns

Page 3: Schema design mongo_boston

Terminology

RDBMS MongoDB

Database ➜ Database

Table ➜ Collection

Row ➜ Document

Index ➜ Index

Join ➜ Embedded Document

Foreign Key ➜ Reference

Page 4: Schema design mongo_boston

Working with Documents

Page 5: Schema design mongo_boston

Modeling Data

Page 6: Schema design mongo_boston

Documents

Provide flexibility and performance

Page 7: Schema design mongo_boston

Normalized Data

Page 8: Schema design mongo_boston

Seek = 5+ ms Read = really really fast

Post

AuthorComment

Tags

Disk seeks and data locality

Page 9: Schema design mongo_boston

De-Normalized (embedded) Data

Page 10: Schema design mongo_boston

Disk seeks and data locality

Article

Author

CommentCommentCommentCommentComment

Comment

Tag

Page 11: Schema design mongo_boston

Relational Schema Design

Focus on data storage

Page 12: Schema design mongo_boston

Document Schema Design

Focus on data use

Page 13: Schema design mongo_boston

Schema Design Considerations• How do we manipulate the data?

– Dynamic Ad-Hoc Queries

– Atomic Updates

– Map Reduce

• What are the access patterns of the application?– Read/Write Ratio

– Types of Queries / Updates

– Data life-cycle and growth rate

Page 14: Schema design mongo_boston

Data Manipulation

• Conditional Query Operators– Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte, $ne

– Vector: $in, $nin, $all, $size

• Atomic Update Operators– Scalar: $inc, $set, $unset

– Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet

Page 15: Schema design mongo_boston

Data Access

• Flexible Schemas

• Ability to embed complex data structures

• Secondary Indexes

• Compound Indexes

• Multi-Value Indexes (Indexes on Arrays)

• Aggregation Framework– $project, $match, $limit, $skip, $sort, $group, $unwind

• No Joins

Page 16: Schema design mongo_boston

Getting Started

Page 17: Schema design mongo_boston

Library Management Application• Patrons

• Books

• Authors

• Publishers

Page 18: Schema design mongo_boston

An Example

One to One Relations

Page 19: Schema design mongo_boston

patron = {_id: "joe"name: "Joe Bookreader”

}

address = {patron_id = "joe",street: "123 Fake St. ",city: "Faketon",state: "MA",zip: 12345

}

Modeling Patrons

patron = {

_id: "joe"

name: "Joe Bookreader",

address: {

street: "123 Fake St. ",

city: "Faketon",

state: "MA",

zip: 12345

}

}

Page 20: Schema design mongo_boston

One to One Relations

• Mostly the same as the relational approach

• Generally good idea to embed “contains” relationships

• Document model provides a holistic representation of objects

Page 21: Schema design mongo_boston

An Example

One To Many Relations

Page 22: Schema design mongo_boston

patron = {_id: "joe"name: "Joe Bookreader",join_date: ISODate("2011-10-15"),addresses: [

{street: "1 Vernon St.", city: "Newton", state: "MA", …},{street: "52 Main St.", city: "Boston", state: "MA", …},

]}

Modeling Patrons

Page 23: Schema design mongo_boston

Publishers and Books

• Publishers put out many books

• Books have one publisher

Page 24: Schema design mongo_boston

MongoDB: The Definitive Guide,By Kristina Chodorow and Mike DirolfPublished: 9/24/2010Pages: 216Language: English

Publisher: O’Reilly Media, CA

Book

Page 25: Schema design mongo_boston

book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ]published_date: ISODate("2010-09-24"),pages: 216,language: "English",publisher: {

name: "O’Reilly Media",founded: "1980",location: "CA"

}}

Modeling Books – Embedded Publisher

Page 26: Schema design mongo_boston

publisher = {name: "O’Reilly Media",founded: "1980",location: "CA"

}

book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ]published_date: ISODate("2010-09-24"),pages: 216,language: "English"

}

Modeling Books & Publisher Relationship

Page 27: Schema design mongo_boston

publisher = {_id: "oreilly",name: "O’Reilly Media",founded: "1980",location: "CA"

}

book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ]published_date: ISODate("2010-09-24"),pages: 216,language: "English",publisher_id: "oreilly"

}

Publisher _id as a Foreign Key

Page 28: Schema design mongo_boston

publisher = {name: "O’Reilly Media",founded: "1980",location: "CA"books: [ "123456789", ... ]

}

book = {_id: "123456789",title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ]published_date: ISODate("2010-09-24"),pages: 216,language: "English"

}

Book _id as a Foreign Key

Page 29: Schema design mongo_boston

Where do you put the foreign Key?

• Array of books inside of publisher– Makes sense when many means a handful of items

– Useful when items have bound on potential growth

• Reference to single publisher on books– Useful when items have unbounded growth (unlimited # of

books)

• SQL doesn’t give you a choice, no arrays

Page 30: Schema design mongo_boston

Another Example

One to Many Relations

Page 31: Schema design mongo_boston

Books and Patrons

• Book can be checked out by one Patron at a time

• Patrons can check out many books (but not 1000s)

Page 32: Schema design mongo_boston

patron = {_id: "joe"name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... }

}

book = {_id: "123456789"title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],...

}

Modeling Checkouts

Page 33: Schema design mongo_boston

patron = {

_id: "joe"

name: "Joe Bookreader",

join_date: ISODate("2011-10-15"),

address: { ... },

checked_out: [

{ _id: "123456789", checked_out: "2012-10-15" },

{ _id: "987654321", checked_out: "2012-09-12" },

...

]

}

Modeling Checkouts

Page 34: Schema design mongo_boston

De-normalize for speed

De-normalizationProvides data locality

Page 35: Schema design mongo_boston

patron = {_id: "joe"name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... },checked_out: [

{ _id: "123456789",title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],checked_out: ISODate("2012-10-15")

},{ _id: "987654321"

title: "MongoDB: The Scaling Adventure", ...}, ...

]}

Modeling Checkouts -- de-normalized

Page 36: Schema design mongo_boston

Referencing vs. Embedding

• Embedding is a bit like pre-joined data

• Document level ops are easy for server to handle

• Embed when the “many” objects always appear with (viewed in the context of) their parents.

• Reference when you need more flexibility

Page 37: Schema design mongo_boston

An Example

Single Table Inheritance

Page 38: Schema design mongo_boston

book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ]published_date: ISODate("2010-09-24"),kind: ”loanable"locations: [ ... ]pages: 216,language: "English",publisher: {

name: "O’Reilly Media",founded: "1980",location: "CA"

}}

Single Table Inheritance

Page 39: Schema design mongo_boston

An Example

Many to Many Relations

Page 40: Schema design mongo_boston

book = {title: "MongoDB: The Definitive Guide",authors = [

{ _id: "kchodorow", name: ”Kristina Chodorow” }, { _id: "mdirolf", name: ”Mike Dirolf” }

]published_date: ISODate("2010-09-24"),pages: 216,language: "English"

}

author = { _id: "kchodorow",name: "Kristina Chodorow",hometown: "New York"

}

db.authors.find ( { _id : ‘kchodorow’ } ) db.books.find( { authors.id : ‘kchodorow’ } )

Books and Authors

Page 41: Schema design mongo_boston

An Example

Trees

Page 42: Schema design mongo_boston

Modeling Trees

• Parent Links

- Each node is stored as a document

- Contains the id of the parent

• Child Links

- Each node contains the id’s of the children

- Can support graphs (multiple parents / child)

Page 43: Schema design mongo_boston

book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",category: "MongoDB"

}

category = { _id: MongoDB, parent: Databases }category = { _id: Databases, parent: Programming }category = { _id: Programming }

Parent Links

Page 44: Schema design mongo_boston

book = {_id: 123456789,title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English"

}

category = { _id: MongoDB, children: [ 123456789, … ] }category = { _id: Databases, children: [ MongoDB, Postgres }category = { _id: Programming, children: [ DB, Languages ] }

Child Links

Page 45: Schema design mongo_boston

book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",parent: ”MongoDB”,categories: [ Programming, Databases, MongoDB ]

}

book = {title: "MySQL: The Definitive Guide",authors: [ ”Michael Kofler" ],published_date: ISODate("2010-09-24"),pages: 320,language: "English",parent: ”MySQL",categories: [ Programming, Databases, MySQL ]

}

db.books.find( { categories : MongoDB } )

Array of Ancestors

Page 46: Schema design mongo_boston

An Example

Queues

Page 47: Schema design mongo_boston

book = {_id: 123456789,title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",available: 3

}

db.books.findAndModify({query: { _id: 123456789, available: { "$gt": 0 } },update: { $inc: { available: -1 } }

})

Book Document

Page 48: Schema design mongo_boston

Job Title, 10gen

Speaker Name

#ConferenceHashTag

Thank You