Page 1: Webinar: MongoDB for Content Management

Consulting Engineer, 10gen

Bryan Reinero

https://twitter.com/mongodb

MongoDB for Content Management

Page 2: Webinar: MongoDB for Content Management

Agenda

• Sample Content Management System (CMS) Application

• Schema Design Considerations

• Viewing the Final Product

• Building Feeds and Querying Data

• Replication, Failover, and Scaling

• Further Resources

Page 3: Webinar: MongoDB for Content Management

Sample CMS Application

Page 4: Webinar: MongoDB for Content Management

CMS Application Overview

• Business news service

• Hundreds of stories per day

• Millions of website visitors per month

• Comments

• Related stories

• Tags

Page 5: Webinar: MongoDB for Content Management

Viewing Stories (Web Site)

Headline

Date, Byline

Copy

Comments

Tags

Related Stories

Page 6: Webinar: MongoDB for Content Management

Viewing Categories/Tags (Web Site)

Headline

Date, Byline

Lead Text

Headline

Date, Byline

Lead Text

Page 7: Webinar: MongoDB for Content Management

Sample Article

Headline

Byline, Date, Comments

Copy

Related Stories

Image

Page 8: Webinar: MongoDB for Content Management

Schema Design Considerations

Page 9: Webinar: MongoDB for Content Management

Sample Relational DB Structure

story: id, headline, copy, authorid, slug

author: id, first_name, last_name, title

tag: id, name

comment: id, storyid, name, email, comment_text

related_story: id, storyid, related_storyid

link_story_tag: id, storyid, tagid

Page 10: Webinar: MongoDB for Content Management

Sample Relational DB Structure

• Number of queries per page load?

• Caching layers add complexity

• Tables may grow to millions of rows

• Joins will become slower over time as the database increases in size

• Schema changes

• Scaling database to handle more reads
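To make the first bullet concrete, one can count the lookups needed to assemble a single story page from the six tables in the sample relational structure. The sketch below models the tables as in-memory arrays. The sample rows and the `query` helper are hypothetical, and a real application might fold some of these lookups into joins.

```javascript
// Toy in-memory "tables" mirroring the sample relational structure.
// All rows are made-up sample data.
const tables = {
  story: [
    { id: 375, headline: "Apple Reports Second Quarter Earnings", authorid: 1 },
    { id: 376, headline: "Google Reports on Revenue", authorid: 1 },
  ],
  author: [{ id: 1, first_name: "Jason", last_name: "Zucchetto" }],
  tag: [{ id: 10, name: "AAPL" }, { id: 11, name: "Earnings" }],
  comment: [{ id: 1891, storyid: 375, name: "Frank", comment_text: "Great story!" }],
  related_story: [{ id: 1, storyid: 375, related_storyid: 376 }],
  link_story_tag: [
    { id: 1, storyid: 375, tagid: 10 },
    { id: 2, storyid: 375, tagid: 11 },
  ],
};

let queryCount = 0;

// Hypothetical helper: every call stands in for one round trip to the database.
function query(table, predicate) {
  queryCount += 1;
  return tables[table].filter(predicate);
}

// Assemble everything the story page displays.
function loadStoryPage(storyId) {
  const [story] = query("story", (r) => r.id === storyId);
  const [author] = query("author", (r) => r.id === story.authorid);
  const comments = query("comment", (r) => r.storyid === storyId);
  const tagIds = query("link_story_tag", (r) => r.storyid === storyId).map((l) => l.tagid);
  const tags = query("tag", (r) => tagIds.includes(r.id)).map((t) => t.name);
  const relatedIds = query("related_story", (r) => r.storyid === storyId)
    .map((r) => r.related_storyid);
  const related = query("story", (r) => relatedIds.includes(r.id));
  return { story, author, comments, tags, related };
}

const page = loadStoryPage(375);
console.log(`lookups for one page: ${queryCount}`);
console.log(page.tags);
```

In this toy model a single page view costs seven lookups, which is the kind of overhead that tends to push teams toward a caching layer.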

Page 11: Webinar: MongoDB for Content Management

MongoDB Schema Design

• “Schemaless,” but schema design is still important

• JSON documents

• Design for the use case and work backwards

• Do not use a relational model in MongoDB

• No joins or multi-document transactions, so most related information should be contained in the same document

• Updates to a single document are atomic, the equivalent of a transaction

Page 12: Webinar: MongoDB for Content Management

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  comments: [
    { name: "Frank", comment: "Great story!" }
  ]
}

Sample MongoDB Schema

Page 13: Webinar: MongoDB for Content Management

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  image: "/images/aapl/tim-cook.jpg",
  ticker: "AAPL"
}

Adding Fields Based on Story
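Because documents in the same collection need not share the same fields, code that reads stories should treat `image` and `ticker` as optional. A minimal sketch of that idea follows. The `renderStory` function and the second sample story are hypothetical and not part of the webinar.

```javascript
// Two stories from the same collection; only the first carries the
// optional `image` and `ticker` fields shown on the slide.
const stories = [
  {
    _id: 375,
    headline: "Apple Reports Second Quarter Earnings",
    image: "/images/aapl/tim-cook.jpg",
    ticker: "AAPL",
  },
  { _id: 376, headline: "Google Reports on Revenue" },
];

// Hypothetical renderer: emits a line only for fields that are present.
function renderStory(story) {
  const lines = [`Headline: ${story.headline}`];
  if (story.image !== undefined) lines.push(`Image: ${story.image}`);
  if (story.ticker !== undefined) lines.push(`Ticker: ${story.ticker}`);
  return lines.join("\n");
}

for (const story of stories) {
  console.log(renderStory(story));
}
```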

Page 14: Webinar: MongoDB for Content Management

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  Last25Comments: [
    { name: "Frank", comment: "Great story!" },
    { name: "John", comment: "This is interesting" }
  ]
}

High Comment Volume
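The `Last25Comments` array implies that the story document keeps only the most recent comments. One way this cap might be maintained is MongoDB's `$push` operator with the `$each` and `$slice` modifiers (available from MongoDB 2.4). The sketch below builds such an update document and, so that it runs without a server, mimics its effect with a hypothetical local function `applyCappedPush`. Neither helper comes from the webinar.

```javascript
const MAX_EMBEDDED_COMMENTS = 25;

// Update document one might pass as the second argument of db.cms.update(...).
// $each with $slice: -25 keeps only the last 25 array elements after the push.
function cappedCommentUpdate(comment) {
  return {
    $push: {
      Last25Comments: { $each: [comment], $slice: -MAX_EMBEDDED_COMMENTS },
    },
  };
}

// Hypothetical in-memory stand-in for the server-side behaviour above.
function applyCappedPush(story, comment) {
  const all = [...(story.Last25Comments || []), comment];
  return { ...story, Last25Comments: all.slice(-MAX_EMBEDDED_COMMENTS) };
}

let story = { _id: 375, Last25Comments: [] };
for (let i = 1; i <= 30; i++) {
  story = applyCappedPush(story, { name: `Reader ${i}`, comment: `Comment ${i}` });
}

console.log(JSON.stringify(cappedCommentUpdate({ name: "Frank", comment: "Great story!" })));
console.log(story.Last25Comments.length);  // 25
console.log(story.Last25Comments[0].name); // "Reader 6"
```

Older comments would then be served from a separate comment collection rather than from the story document.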

Page 15: Webinar: MongoDB for Content Management

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  RelatedStories: [
    {
      headline: "Google Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "goog-revenue-third-quarter"
    }, {
      headline: "Yahoo Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "yhoo-revenue-third-quarter"
    }
  ]
}

Managing Related Stories
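The `RelatedStories` array stores a small copy of each related story (headline, date, slug), so the page can render its links without further queries. A rough sketch of building those copies is given here. The helper names `toRelatedSummary` and `relateStories` are hypothetical.

```javascript
// Copy only the fields the story page needs to render a "related" link.
function toRelatedSummary(story) {
  return { headline: story.headline, date: story.date, slug: story.slug };
}

// Return a copy of `story` with `other` embedded as a related story,
// skipping it if a summary with the same slug is already present.
function relateStories(story, other) {
  const existing = story.RelatedStories || [];
  if (existing.some((r) => r.slug === other.slug)) return story;
  return { ...story, RelatedStories: [...existing, toRelatedSummary(other)] };
}

const apple = {
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: new Date("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
};
const google = {
  _id: 376,
  headline: "Google Reports on Revenue",
  date: new Date("2012-07-15T01:00:00+01:00"),
  slug: "goog-revenue-third-quarter",
  copy: "Full article text stays in the related story's own document.",
};

let linked = relateStories(apple, google);
linked = relateStories(linked, google); // second call is a no-op
console.log(linked.RelatedStories);
```

The trade-off of this duplication is that a later change to a related story's headline or slug has to be propagated to every document that embeds a copy.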

Page 16: Webinar: MongoDB for Content Management

{ // Story Collection (sample document)
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  Last25Comments: [
    { name: "Frank", comment: "Great story!" },
    { name: "John", comment: "This is interesting" }
  ],

Full Sample Schema

Page 17: Webinar: MongoDB for Content Management

  image: "/images/aapl/tim-cook.jpg",
  ticker: "AAPL",
  RelatedStories: [
    {
      headline: "Google Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "goog-revenue-third-quarter"
    }, {
      headline: "Yahoo Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "yhoo-revenue-third-quarter"
    }
  ]
}

{ // Comment collection (sample document)
  _id: 1891, storyid: 375, name: "Frank", comment: "Great story!"
}

Full Sample Schema (Contd.)

Page 18: Webinar: MongoDB for Content Management

Querying and Indexing

Page 19: Webinar: MongoDB for Content Management

// Inserting a new story is easy: just submit a JSON document

db.cms.insert( { headline: "Apple Reports Revenue"... });

// Adding story tags

db.cms.update( { _id : 375 }, { $addToSet : { tags : "AAPL" } } )

// Adding a comment (if embedding comments in story)

db.cms.update( { _id : 375 }, { $addToSet : { comments: { name: "Jason", comment: "Great Story" } } } )

Inserting and Updating Stories
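`$addToSet` appends a value only when an equal value is not already in the array. The sketch below is a simplified local model of that behaviour, written so it runs without a database. The `addToSet` function is hypothetical, and comparing values through `JSON.stringify` only approximates how MongoDB compares values.

```javascript
// Simplified local model of $addToSet: append only when no equal element exists.
// Equality via JSON.stringify is an approximation of MongoDB's value comparison.
function addToSet(array, value) {
  const key = JSON.stringify(value);
  return array.some((item) => JSON.stringify(item) === key) ? array : [...array, value];
}

let tags = ["Earnings"];
tags = addToSet(tags, "AAPL");
tags = addToSet(tags, "AAPL"); // no effect: "AAPL" is already present

let comments = [];
comments = addToSet(comments, { name: "Jason", comment: "Great Story" });
comments = addToSet(comments, { name: "Jason", comment: "Great Story" }); // deduplicated

console.log(tags);            // [ 'Earnings', 'AAPL' ]
console.log(comments.length); // 1
```

For tags this is the desired behaviour. For comments it means an identical comment posted twice by the same name is stored once. If every submission should be kept, `$push` is the operator that appends unconditionally.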

Page 20: Webinar: MongoDB for Content Management

// Index on story slug

db.cms.ensureIndex( { slug : 1 });

// Index on story tags

db.cms.ensureIndex( { tags: 1 });

MongoDB Indexes for CMS
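Because `tags` is an array, an index on it holds one entry per array element (MongoDB calls this a multikey index), which is what lets a single tag value locate a story. The following toy model illustrates the idea with a plain `Map`. The `buildTagIndex` function and the sample stories are hypothetical, and the model says nothing about how MongoDB stores its indexes internally.

```javascript
// Toy model of an index on an array field: one entry per array element,
// each pointing back to the _id of the story that contains it.
function buildTagIndex(stories) {
  const index = new Map();
  for (const story of stories) {
    for (const tag of story.tags || []) {
      if (!index.has(tag)) index.set(tag, []);
      index.get(tag).push(story._id);
    }
  }
  return index;
}

const stories = [
  { _id: 375, tags: ["AAPL", "Earnings"] },
  { _id: 376, tags: ["GOOG", "Earnings"] },
  { _id: 377, tags: [] },
];

const tagIndex = buildTagIndex(stories);
console.log(tagIndex.get("Earnings")); // [ 375, 376 ]
console.log(tagIndex.get("AAPL"));     // [ 375 ]
```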

Page 21: Webinar: MongoDB for Content Management

// All Story information

db.cms.find( { slug : "apple-reports-second-quarter-earnings" });

// All Stories for a given tag

db.cms.find( { tags: "Earnings" });

Querying MongoDB

Page 22: Webinar: MongoDB for Content Management

Building Custom RSS Feeds

Page 23: Webinar: MongoDB for Content Management

// Very simple to gather specific information for a feed

db.cms.find( { tags: { $in : ["Earnings", "AAPL"] } }).sort({ date : -1 });

Query Tags and Sort by Date
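To show how the query result might become a feed, the sketch below reproduces the same selection in plain JavaScript (a story matches if any of its tags is in the wanted list, newest first) and renders RSS `<item>` elements. The functions `feedStories` and `toRssItems`, the sample stories, and the base URL `https://example.com/stories/` are all assumptions made for illustration.

```javascript
// Escape the characters that are unsafe inside XML text nodes.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Local equivalent of find({ tags: { $in: wantedTags } }).sort({ date: -1 }).
function feedStories(stories, wantedTags) {
  return stories
    .filter((s) => (s.tags || []).some((t) => wantedTags.includes(t)))
    .sort((a, b) => b.date - a.date);
}

// Render each story as an RSS <item>; baseUrl is a hypothetical site prefix.
function toRssItems(stories, baseUrl) {
  return stories
    .map((s) =>
      [
        "<item>",
        `  <title>${escapeXml(s.headline)}</title>`,
        `  <link>${baseUrl}${s.slug}</link>`,
        `  <pubDate>${s.date.toUTCString()}</pubDate>`,
        "</item>",
      ].join("\n")
    )
    .join("\n");
}

const stories = [
  { headline: "Apple Reports Second Quarter Earnings", slug: "apple-reports-second-quarter-earnings",
    date: new Date("2012-07-14T01:00:00+01:00"), tags: ["AAPL", "Earnings"] },
  { headline: "Google Reports on Revenue", slug: "goog-revenue-third-quarter",
    date: new Date("2012-07-15T01:00:00+01:00"), tags: ["Earnings"] },
  { headline: "Markets & Weather", slug: "markets-weather",
    date: new Date("2012-07-16T01:00:00+01:00"), tags: ["Misc"] },
];

const selected = feedStories(stories, ["Earnings", "AAPL"]);
console.log(toRssItems(selected, "https://example.com/stories/"));
```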

Page 24: Webinar: MongoDB for Content Management

Replication, Failover, and Scaling

Page 25: Webinar: MongoDB for Content Management

Replication

• Extremely easy to set up

• A secondary node trails the primary and maintains a copy of the primary's data

• Useful for disaster recovery, failover, backups, and specific workloads such as analytics

• When the primary goes down, a secondary is automatically elected as the new primary
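As a rough illustration of the setup, a three-member replica set is described by a small configuration document of the shape passed to `rs.initiate()` in the mongo shell, and clients connect with a URI that lists the members and names the set. The set name `rs0`, the host names, and the database name `cms` below are hypothetical placeholders.

```javascript
// Hypothetical hosts; replace with real machine names.
const hosts = [
  "db1.example.com:27017",
  "db2.example.com:27017",
  "db3.example.com:27017",
];

// Shape of the configuration document passed to rs.initiate() in the mongo shell.
const replicaSetConfig = {
  _id: "rs0",
  members: hosts.map((host, i) => ({ _id: i, host })),
};

// Listing several members lets the driver locate the current primary,
// including after a failover.
const uri = `mongodb://${hosts.join(",")}/cms?replicaSet=${replicaSetConfig._id}`;

console.log(JSON.stringify(replicaSetConfig, null, 2));
console.log(uri);
```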

Page 26: Webinar: MongoDB for Content Management

Replication

Page 27: Webinar: MongoDB for Content Management

Reading from Secondaries (Delayed Consistency)
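The practical consequence of delayed consistency is that a read served by a secondary can miss a write the primary has already accepted. The toy model below illustrates this. `ToyReplicaPair` is a hypothetical class, and explicit `replicate()` calls stand in for the asynchronous replication a real replica set performs on its own.

```javascript
// Toy model of replication lag: the secondary applies the primary's
// operation log only when replicate() is called.
class ToyReplicaPair {
  constructor() {
    this.oplog = [];            // operations accepted by the primary
    this.primary = new Map();
    this.secondary = new Map();
    this.applied = 0;           // number of oplog entries the secondary has applied
  }

  write(key, value) {
    this.primary.set(key, value);
    this.oplog.push({ key, value });
  }

  replicate() {
    for (; this.applied < this.oplog.length; this.applied++) {
      const { key, value } = this.oplog[this.applied];
      this.secondary.set(key, value);
    }
  }

  read(key, { fromSecondary = false } = {}) {
    return (fromSecondary ? this.secondary : this.primary).get(key);
  }
}

const pair = new ToyReplicaPair();
pair.write("story:375:headline", "Apple Reports Second Quarter Earnings");

const beforeSync = pair.read("story:375:headline", { fromSecondary: true });
pair.replicate();
const afterSync = pair.read("story:375:headline", { fromSecondary: true });

console.log(beforeSync); // undefined: the secondary has not caught up yet
console.log(afterSync);  // the headline, once replication has run
```

For a news site this is often acceptable for story pages, while a reader who has just posted a comment may expect to see it immediately, which argues for reading that view from the primary.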


Page 28: Webinar: MongoDB for Content Management

Scaling Horizontally

• Important to keep the working data set in RAM

• When the working data set exceeds RAM, it is easy to add machines and partition the data across them (sharding)
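The core idea of sharding, that each document is routed to one machine based on a key, can be shown with a toy router. This is a deliberate simplification: real MongoDB divides a collection into chunks by ranges of the shard key (or of its hash) and routes requests through `mongos`. The functions `hashKey` and `shardFor`, the shard count, and the choice of `slug` as the key are assumptions made for illustration.

```javascript
// Simple deterministic string hash (djb2 variant); for illustration only.
function hashKey(key) {
  let h = 5381;
  for (let i = 0; i < key.length; i++) {
    h = (h * 33 + key.charCodeAt(i)) >>> 0; // keep within unsigned 32 bits
  }
  return h;
}

// Route a document to one of `shardCount` shards based on its key.
function shardFor(shardKey, shardCount) {
  return hashKey(shardKey) % shardCount;
}

const SHARD_COUNT = 3; // hypothetical cluster size
const slugs = [
  "apple-reports-second-quarter-earnings",
  "goog-revenue-third-quarter",
  "yhoo-revenue-third-quarter",
];

// Group the sample stories by the shard they would land on.
const placement = new Map();
for (const slug of slugs) {
  const shard = shardFor(slug, SHARD_COUNT);
  if (!placement.has(shard)) placement.set(shard, []);
  placement.get(shard).push(slug);
}

console.log(placement);
```

Because the routing is deterministic, a lookup by the same key goes straight to the shard that holds the document, while a query that does not include the key has to ask every shard.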

Page 29: Webinar: MongoDB for Content Management

Sharding with MongoDB

Page 30: Webinar: MongoDB for Content Management

Additional Resources

• Use Case Tutorials: http://docs.mongodb.org/manual/use-cases/

• What others are doing: http://www.10gen.com/use-case/content-management

• This presentation & video recording: https://www.10gen.com/presentations/webinar

Page 31: Webinar: MongoDB for Content Management

Consulting Engineer, 10gen

Bryan Reinero

https://twitter.com/mongodb

Thank You