Webinar: MongoDB for Content Management

Bryan Reinero, Consulting Engineer, 10gen

https://twitter.com/mongodb

description

MongoDB's flexible schema makes it a good fit for content management applications: its data model makes it easy to catalog multiple content types with diverse metadata. In this session, we'll review schema design for content management, using GridFS to store binary files, and using MongoDB's auto-sharding to partition your content across multiple servers.

Transcript of Webinar: MongoDB for Content Management

Page 1: Webinar: MongoDB for Content Management

Consulting Engineer, 10gen

Bryan Reinero

https://twitter.com/mongodb

MongoDB for Content Management

Page 2: Webinar: MongoDB for Content Management

Agenda

• Sample Content Management System (CMS) Application

• Schema Design Considerations

• Viewing the Final Product

• Building Feeds and Querying Data

• Replication, Failover, and Scaling

• Further Resources

Page 3: Webinar: MongoDB for Content Management

Sample CMS Application

Page 4: Webinar: MongoDB for Content Management

CMS Application Overview

• Business news service

• Hundreds of stories per day

• Millions of website visitors per month

• Comments

• Related stories

• Tags

Page 5: Webinar: MongoDB for Content Management

Viewing Stories (Web Site)

Headline

Date, Byline

Copy

Comments

Tags

Related Stories

Page 6: Webinar: MongoDB for Content Management

Viewing Categories/Tags (Web Site)

Headline

Date, Byline

Lead Text

Headline

Date, Byline

Lead Text

Page 7: Webinar: MongoDB for Content Management

Sample Article

Headline

Byline, Date, Comments

Copy

Related Stories

Image

Page 8: Webinar: MongoDB for Content Management

Schema Design Considerations

Page 9: Webinar: MongoDB for Content Management

Sample Relational DB Structure

story: id, headline, copy, authorid, slug

author: id, first_name, last_name, title

tag: id, name

comment: id, storyid, name, email, comment_text

related_story: id, storyid, related_storyid

link_story_tag: id, storyid, tagid

Page 10: Webinar: MongoDB for Content Management

Sample Relational DB Structure

• Number of queries per page load?

• Caching layers add complexity

• Tables may grow to millions of rows

• Joins become slower over time as the database increases in size

• Schema changes

• Scaling database to handle more reads

Page 11: Webinar: MongoDB for Content Management

MongoDB Schema Design

• “Schemaless,” but schema design is still important

• JSON documents

• Design for the use case and work backwards

• Do not use a relational model in MongoDB

• No joins or transactions, so most related information should be contained in the same document

• Updates to a single document are atomic, which is the equivalent of a transaction
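
To illustrate the last point, here is a minimal sketch of a single-document update that changes two fields at once. The helper name `buildStoryEdit` is hypothetical, and the `tags` and `comments` field names are assumed from the sample story schema used in this deck. MongoDB applies all operators in one update to one document as a single atomic change.

```javascript
// Hypothetical helper (not part of MongoDB): builds ONE update document
// that modifies two fields of the same story.
function buildStoryEdit(tag, comment) {
  return {
    $addToSet: { tags: tag },     // add the tag only if it is not already present
    $push: { comments: comment }, // append the embedded comment
  };
}

const edit = buildStoryEdit("Earnings", { name: "Frank", comment: "Great story!" });

// In the mongo shell, both changes would be applied together with:
//   db.cms.update({ _id: 375 }, edit)
console.log(JSON.stringify(edit));
```

Because both operators target the same document, a reader never sees the new tag without the new comment, or the reverse.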

Page 12: Webinar: MongoDB for Content Management

Sample MongoDB Schema

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  comments: [
    { name: "Frank", comment: "Great story!" }
  ]
}

Page 13: Webinar: MongoDB for Content Management

Adding Fields Based on Story

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  image: "/images/aapl/tim-cook.jpg",
  ticker: "AAPL"
}

Page 14: Webinar: MongoDB for Content Management

High Comment Volume

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  Last25Comments: [
    { name: "Frank", comment: "Great story!" },
    { name: "John", comment: "This is interesting" }
  ]
}
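
One way to keep the embedded array bounded, sketched under assumptions: every comment is written to a separate comment collection, while the story keeps only the newest 25 through `$push` with `$each` and `$slice` (available since MongoDB 2.4). The helper names and the `comments` collection name are hypothetical; `trimToLast` only imitates in plain JavaScript what `$slice: -25` does on the server.

```javascript
const KEEP = 25; // matches the Last25Comments field name

// Hypothetical helper: describes the two writes made for each new comment.
function buildCommentWrites(storyId, comment) {
  return {
    // Full history lives in a separate comment collection.
    commentDoc: { storyid: storyId, name: comment.name, comment: comment.comment },
    // The story itself keeps only the newest KEEP comments.
    storyUpdate: {
      $push: { Last25Comments: { $each: [comment], $slice: -KEEP } },
    },
  };
}

// Plain-JavaScript imitation of "append, then keep the last `keep` items".
// For illustration only; the server performs this step itself.
function trimToLast(existing, comment, keep) {
  return existing.concat([comment]).slice(-keep);
}

const writes = buildCommentWrites(375, { name: "John", comment: "This is interesting" });

// Shell usage (collection names are assumptions):
//   db.comments.insert(writes.commentDoc)
//   db.cms.update({ _id: 375 }, writes.storyUpdate)
console.log(JSON.stringify(writes.storyUpdate));
```

The story page can then render recent comments from the story document alone, and only a "show all comments" view needs to query the comment collection.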

Page 15: Webinar: MongoDB for Content Management

Managing Related Stories

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  RelatedStories: [
    {
      headline: "Google Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "goog-revenue-third-quarter"
    }, {
      headline: "Yahoo Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "yhoo-revenue-third-quarter"
    }
  ]
}
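
A small sketch of how such embedded summaries might be produced. Only the three fields shown on this slide are copied; `toRelatedSummary`, `buildRelatedUpdate`, and the `_id` of the Google story are hypothetical.

```javascript
// Copy only the fields needed to render a "related story" link.
function toRelatedSummary(story) {
  return { headline: story.headline, date: story.date, slug: story.slug };
}

// Update document that links `related` to another story without duplicates.
function buildRelatedUpdate(related) {
  return { $addToSet: { RelatedStories: toRelatedSummary(related) } };
}

const google = {
  _id: 376, // hypothetical _id
  headline: "Google Reports on Revenue",
  date: new Date("2012-07-15T01:00:00+01:00"),
  slug: "goog-revenue-third-quarter",
  copy: "…",
};

// Shell usage:
//   db.cms.update({ _id: 375 }, buildRelatedUpdate(google))
console.log(JSON.stringify(buildRelatedUpdate(google)));
```

The trade-off of this denormalization is that a changed headline or slug has to be updated in every story that embeds it.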

Page 16: Webinar: MongoDB for Content Management

Full Sample Schema

{ // Story collection (sample document)
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  Last25Comments: [
    { name: "Frank", comment: "Great story!" },
    { name: "John", comment: "This is interesting" }
  ],

Page 17: Webinar: MongoDB for Content Management

Full Sample Schema (Contd.)

  image: "/images/aapl/tim-cook.jpg",
  ticker: "AAPL",
  RelatedStories: [
    {
      headline: "Google Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "goog-revenue-third-quarter"
    }, {
      headline: "Yahoo Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "yhoo-revenue-third-quarter"
    }
  ]
}

{ // Comment collection (sample document)
  _id: 1891, storyid: 375, name: "Frank", comment: "Great story!"
}

Page 18: Webinar: MongoDB for Content Management

Querying and Indexing

Page 19: Webinar: MongoDB for Content Management

Inserting and Updating Stories

// Inserting a new story is easy: just submit a JSON document
db.cms.insert( { headline: "Apple Reports Revenue"... });

// Adding story tags
db.cms.update( { _id : 375 }, { $addToSet : { tags : "AAPL" } } )

// Adding a comment (if embedding comments in the story)
db.cms.update( { _id : 375 }, { $addToSet : { comments: { name: "Jason", comment: "Great Story" } } } )

Page 20: Webinar: MongoDB for Content Management

MongoDB Indexes for CMS

// createIndex is the current name; ensureIndex is the older, deprecated alias

// Index on story slug
db.cms.createIndex( { slug : 1 });

// Index on story tags
db.cms.createIndex( { tags: 1 });

Page 21: Webinar: MongoDB for Content Management

Querying MongoDB

// All story information
db.cms.find( { slug : "apple-reports-second-quarter-earnings" });

// All stories for a given tag
db.cms.find( { tags: "Earnings" });
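
For a category or tag page, only a few fields of each story are needed. The sketch below builds the filter, projection, sort, and limit for such a listing; the helper name, the page size of 20, and the choice of projected fields are assumptions.

```javascript
// Hypothetical helper: query parts for a tag listing page.
function buildTagListing(tag, pageSize = 20) {
  return {
    filter: { tags: tag }, // matches stories whose tags array contains `tag`
    projection: { headline: 1, date: 1, byline: 1, slug: 1 }, // omits copy and comments
    sort: { date: -1 },    // newest first
    limit: pageSize,
  };
}

const listing = buildTagListing("Earnings");

// Shell usage:
//   db.cms.find(listing.filter, listing.projection).sort(listing.sort).limit(listing.limit)
console.log(JSON.stringify(listing));
```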

Page 22: Webinar: MongoDB for Content Management

Building Custom RSS Feeds

Page 23: Webinar: MongoDB for Content Management

Query Tags and Sort by Date

// Very simple to gather specific information for a feed
db.cms.find( { tags: { $in : ["Earnings", "AAPL"] } }).sort({ date : -1 });
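
The documents returned by such a query can be turned into a feed with a few lines of code. This is a minimal sketch rather than a complete RSS implementation: the feed title, the base URL `https://example.com/stories/`, and all function names are hypothetical, and the stories are assumed to arrive already sorted by date.

```javascript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render one story document as an RSS <item>.
function storyToRssItem(story, baseUrl) {
  const categories = (story.tags || []).map(
    (tag) => `    <category>${escapeXml(tag)}</category>`
  );
  return [
    "  <item>",
    `    <title>${escapeXml(story.headline)}</title>`,
    `    <link>${escapeXml(baseUrl + story.slug)}</link>`,
    `    <pubDate>${story.date.toUTCString()}</pubDate>`,
    ...categories,
    "  </item>",
  ].join("\n");
}

// Wrap the items in a minimal RSS 2.0 channel.
function buildRssFeed(title, baseUrl, stories) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "<channel>",
    `  <title>${escapeXml(title)}</title>`,
    `  <link>${escapeXml(baseUrl)}</link>`,
    `  <description>${escapeXml(title)}</description>`,
    ...stories.map((story) => storyToRssItem(story, baseUrl)),
    "</channel>",
    "</rss>",
  ].join("\n");
}

const sampleStories = [
  {
    headline: "Apple Reports Second Quarter Earnings",
    date: new Date("2012-07-14T01:00:00+01:00"),
    slug: "apple-reports-second-quarter-earnings",
    tags: ["AAPL", "Earnings"],
  },
];
console.log(buildRssFeed("Earnings & AAPL", "https://example.com/stories/", sampleStories));
```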

Page 24: Webinar: MongoDB for Content Management

Replication, Failover, and Scaling

Page 25: Webinar: MongoDB for Content Management

Replication

• Extremely easy to set up

• A replica node trails the primary node and maintains a copy of the primary's data

• Useful for disaster recovery, failover, backups, and specific workloads such as analytics

• When the primary goes down, a secondary automatically becomes the new primary
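
As a rough sketch of the setup step, a replica set is initiated from a configuration document that lists its members. The set name and host names below are placeholders, and `buildReplicaSetConfig` is a hypothetical helper; the resulting object has the shape that `rs.initiate()` accepts in the mongo shell.

```javascript
// Hypothetical helper: builds a replica set configuration document.
function buildReplicaSetConfig(setName, hosts) {
  return {
    _id: setName, // must match the --replSet name each mongod was started with
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const config = buildReplicaSetConfig("cmsReplSet", [
  "db1.example.net:27017",
  "db2.example.net:27017",
  "db3.example.net:27017",
]);

// Shell usage, run once while connected to one of the members:
//   rs.initiate(config)
console.log(JSON.stringify(config, null, 2));
```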

Page 26: Webinar: MongoDB for Content Management

Replication

Page 27: Webinar: MongoDB for Content Management

Reading from Secondaries (Delayed Consistency)
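
Secondary reads are opted into through a read preference. A minimal sketch, assuming placeholder hosts, a database named `cmsdb`, and a set named `cmsReplSet`: the hypothetical helper below builds a connection string with the standard `replicaSet` and `readPreference` URI options.

```javascript
// The five read preference modes MongoDB defines.
const READ_PREFERENCES = [
  "primary",
  "primaryPreferred",
  "secondary",
  "secondaryPreferred",
  "nearest",
];

// Hypothetical helper: builds a replica set connection string.
function buildMongoUri(hosts, database, setName, readPreference) {
  if (!READ_PREFERENCES.includes(readPreference)) {
    throw new Error("Unknown read preference: " + readPreference);
  }
  return (
    "mongodb://" + hosts.join(",") + "/" + database +
    "?replicaSet=" + encodeURIComponent(setName) +
    "&readPreference=" + readPreference
  );
}

const uri = buildMongoUri(
  ["db1.example.net:27017", "db2.example.net:27017"],
  "cmsdb",
  "cmsReplSet",
  "secondaryPreferred"
);

// Per-query alternative in the mongo shell:
//   db.cms.find({ tags: "Earnings" }).readPref("secondaryPreferred")
console.log(uri);
```

Because secondaries trail the primary, a reader may briefly see a story without its newest comment; that is the delayed consistency named in the slide title.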


Page 28: Webinar: MongoDB for Content Management

Scaling Horizontally

• Important to keep working data set in RAM

• When working data set exceeds RAM, easy to add additional machines and segment data across machines (sharding)
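
The sharding step might look like the following sketch. The database name `cmsdb` and the hashed `_id` shard key are assumptions, not a recommendation from the talk. The function takes the shell's `sh` helper as an argument so that it can also be tried with a stand-in object outside a cluster.

```javascript
// Enables sharding for the (assumed) cmsdb database and shards the cms
// collection on a hashed _id.
function shardCmsCollection(sh) {
  sh.enableSharding("cmsdb");
  sh.shardCollection("cmsdb.cms", { _id: "hashed" });
}

// In a mongo shell connected to a mongos:
//   shardCmsCollection(sh)

// Stand-in for `sh` that only records the calls, for trying the sketch locally.
const calls = [];
const fakeSh = {
  enableSharding: (dbName) => calls.push(["enableSharding", dbName]),
  shardCollection: (namespace, key) => calls.push(["shardCollection", namespace, key]),
};
shardCmsCollection(fakeSh);
console.log(JSON.stringify(calls));
```

A hashed `_id` spreads stories evenly across shards, but queries by tag or slug then have to touch every shard; a key that matches the dominant query pattern may suit a real deployment better.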

Page 29: Webinar: MongoDB for Content Management

Sharding with MongoDB

Page 30: Webinar: MongoDB for Content Management

Additional Resources

• Use Case Tutorials: http://docs.mongodb.org/manual/use-cases/

• What others are doing: http://www.10gen.com/use-case/content-management

• This presentation & video recording: https://www.10gen.com/presentations/webinar

Page 31: Webinar: MongoDB for Content Management

Consulting Engineer, 10gen

Bryan Reinero

https://twitter.com/mongodb

Thank You