Creating social features at BranchOut using MongoDB

Building Social Features with MongoDB, Nathan Smith, BranchOut.com, Jan. 22, 2013

description

Slides from the MongoDB MeetUp "IRC Bots and Activity Feeds with MongoDB - At BranchOut", presented by the San Francisco MongoDB User Group and 10gen. http://www.meetup.com/San-Francisco-MongoDB-User-Group/events/95713262/ Over the past year, we've used MongoDB to power more and more of BranchOut's functionality, including some cool social features such as a Facebook-like activity feed. In this talk, I discuss the design decisions that went into developing these features and outline how Mongo is used under the hood. I discuss not only what makes Mongo a good technology choice, but also list a few things about Mongo that need to be worked around. If you have any questions regarding these slides, feel free to reach out to me on Twitter: @nate510. Thanks!

Transcript of Creating social features at BranchOut using MongoDB

Page 1: Creating social features at BranchOut using MongoDB

Building Social Features with MongoDB

Nathan Smith

BranchOut.com

Jan. 22, 2013

Tuesday, January 22, 13

Page 2: Creating social features at BranchOut using MongoDB

BranchOut

A more social professional network

• Connect with your colleagues (follow)

• Activity feed of their professional activity

• Timeline of an individual’s posts

Page 3: Creating social features at BranchOut using MongoDB

BranchOut

A more social professional network

• 30MM installed users

• 750MM total user records

• Average 300 connections per installed user

Page 7: Creating social features at BranchOut using MongoDB

MongoDB @ BranchOut

• 100% MySQL until ~July 2012

• Much of our data fits well into a document model

• Our data design avoids RDBMS features

Page 12: Creating social features at BranchOut using MongoDB

Follow System

Business logic

• Limit of 2000 followees (people you follow)

• Unlimited followers

• Both lists reflect updates in near-real time

Page 15: Creating social features at BranchOut using MongoDB

Follow System

Traditional RDBMS (e.g. MySQL)

follower_uid  followee_uid  follow_time
123           456           2013-01-22 15:43:00
456           123           2013-01-22 15:52:00

Advantage: Easy inserts, deletes

Disadvantage: Data locality, index size

Page 18: Creating social features at BranchOut using MongoDB

Follow System

MongoDB (first pass)

followee: { _id: 123, uids: [456, 567, 678] }

Advantage: Compact data, read locality

Disadvantage: Can’t display a user’s followers

Page 20: Creating social features at BranchOut using MongoDB

Follow System

Can’t display a user’s followers (easily)

followee: { _id: 123, uids: [456, 567, 678] }

db.follow.find({uids: 456}, {_id: 1});

...with multi-key index on uids

Expensive! Also, no guarantee of order.
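The reverse lookup on this slide can be mimicked in plain JavaScript without a database. This is a minimal sketch, assuming an in-memory array stands in for the follow collection; `followDocs` and `followersOf` are hypothetical names, and the linear scan only imitates what the query asks for, not how a multi-key index would serve it.

```javascript
// In-memory stand-in for the first-pass collection: one document per user,
// whose uids array lists the people that user follows.
const followDocs = [
  { _id: 123, uids: [456, 567, 678] },
  { _id: 234, uids: [456] },
  { _id: 345, uids: [567] },
];

// Hypothetical helper mirroring db.follow.find({uids: uid}, {_id: 1}):
// return the _id of every document whose uids array contains uid.
function followersOf(docs, uid) {
  return docs.filter((doc) => doc.uids.includes(uid)).map((doc) => doc._id);
}

// The ids come back in storage order, not in the order the follows happened.
console.log(followersOf(followDocs, 456));
```

Each follower found this way lives in a different document, so listing the followers of a popular user touches as many documents as that user has followers.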

Page 23: Creating social features at BranchOut using MongoDB

Follow System

MongoDB (second pass)

followee: { _id: 1, uids: [2, 3] },
followee: { _id: 2, uids: [1, 3] }

follower: { _id: 1, uids: [2] },
follower: { _id: 2, uids: [1] },
follower: { _id: 3, uids: [1, 2] }

Advantages: Local data, fast selects

Disadvantages: Follower doc size
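A small sketch of how the two collections might be kept in step. It assumes in-memory maps in place of the real collections; `follow` and `addTo` are hypothetical helpers, not BranchOut's code. The four calls at the end reproduce the sample data on the slide.

```javascript
// In-memory stand-ins for the two collections, keyed by user id.
const followee = new Map(); // user id -> uids that user follows
const follower = new Map(); // user id -> uids that follow that user

// Append value to the array stored under key, creating the array if needed.
function addTo(collection, key, value) {
  const uids = collection.get(key) || [];
  if (!uids.includes(value)) uids.push(value);
  collection.set(key, uids);
}

// Hypothetical write path: a single follow updates both documents.
function follow(fromUid, toUid) {
  addTo(followee, fromUid, toUid);
  addTo(follower, toUid, fromUid);
}

// Reproduce the data shown on the slide.
follow(1, 2);
follow(1, 3);
follow(2, 1);
follow(2, 3);

// Either list for a user is now a single-document read.
console.log(followee.get(1), follower.get(3));
```

The price of the fast reads is that every follow becomes two separate writes, so a real implementation would have to cope with one of them succeeding without the other.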

Page 28: Creating social features at BranchOut using MongoDB

Follow System

Follower document size

• Max Mongo doc size: 16MB

• Number of people who follow our community manager: 30MM

• 30MM uids × 8 bytes/uid = 240MB

• Max followers per doc: ~2MM
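The arithmetic on this slide, worked through as a short script. The constant names are illustrative, and the 8 bytes per uid is the slide's own figure for the raw value only.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // MongoDB's 16MB document limit
const BYTES_PER_UID = 8;                // 64-bit uid, as on the slide
const FOLLOWERS = 30e6;                 // 30MM followers

const bytesNeeded = FOLLOWERS * BYTES_PER_UID;
const maxUidsPerDoc = Math.floor(MAX_DOC_BYTES / BYTES_PER_UID);
const docsNeeded = Math.ceil(bytesNeeded / MAX_DOC_BYTES);

console.log(bytesNeeded / 1e6); // 240 (MB)
console.log(maxUidsPerDoc);     // 2097152, i.e. roughly 2MM
console.log(docsNeeded);        // 15
```

BSON also stores a type byte and a string index key for every array element, so the practical ceiling is somewhat lower than the ~2MM computed from raw uid size alone.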

Page 31: Creating social features at BranchOut using MongoDB

Follow System

MongoDB (final pass)

followee: { _id: 1, uids: [2, 3] },
followee: { _id: 2, uids: [1, 3] }

follower: { _id: "1", uids: [2,3,4,...], count: 20001, next_page: 2 },
follower: { _id: "1_p2", uids: [23,24,25,...], count: 10000 }

follower: { _id: "1", uids: [2,3,4,...], count: 10001, next_page: 3 },
follower: { _id: "1_p2", uids: [23,24,25,...], count: 10000 }

Asynchronous thread manages follower documents
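One way the paging of follower documents could work, as a sketch only. The slides do not spell out the split rule, so everything here is an assumption: the thresholds `SPLIT_AT` and `PAGE_SIZE` are picked to match the counts shown, `next_page` is read as the number of the most recently created overflow page, and `addFollower` and `splitIfNeeded` are hypothetical helpers operating on an in-memory map.

```javascript
// Assumed thresholds, chosen only to match the counts shown on the slide.
const SPLIT_AT = 20000;  // split the head document once it holds more than this
const PAGE_SIZE = 10000; // uids moved into each overflow page

// In-memory stand-in for the follower collection, keyed by _id.
const followerDocs = new Map();

// Hypothetical write path: new followers always land in the head document.
function addFollower(followeeId, followerUid) {
  const headId = String(followeeId);
  if (!followerDocs.has(headId)) {
    // next_page: 1 means "no overflow pages yet" in this sketch.
    followerDocs.set(headId, { _id: headId, uids: [], count: 0, next_page: 1 });
  }
  const head = followerDocs.get(headId);
  head.uids.push(followerUid);
  head.count = head.uids.length;
}

// Work an asynchronous job might do: move the oldest uids out of an
// oversized head document into a new overflow page.
function splitIfNeeded(followeeId) {
  const head = followerDocs.get(String(followeeId));
  if (!head || head.count <= SPLIT_AT) return false;
  const pageNumber = head.next_page + 1;
  const moved = head.uids.splice(0, PAGE_SIZE);
  const pageId = `${head._id}_p${pageNumber}`;
  followerDocs.set(pageId, { _id: pageId, uids: moved, count: moved.length });
  head.next_page = pageNumber;
  head.count = head.uids.length;
  return true;
}

// Give user 1 exactly 20001 followers, then let the background job run once.
for (let uid = 2; uid <= 20002; uid++) addFollower(1, uid);
splitIfNeeded(1);
console.log(followerDocs.get("1").count, followerDocs.get("1_p2").count);
```

Keeping the split out of the write path means a follow stays a cheap append, at the cost of the head document briefly exceeding its target size until the background job catches up.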

Page 35: Creating social features at BranchOut using MongoDB

Activity Feed

Push vs Pull architecture

Page 41: Creating social features at BranchOut using MongoDB

Activity Feed

Business logic

• All connections and followees appear in your feed

• Reverse-chronological sort order (but should support other rankings)

• Support for evolving set of feed event types

• Tagging creates multiple feed events for the same underlying object

• Feed events are not ephemeral (they also appear on the Timeline)

Page 44: Creating social features at BranchOut using MongoDB

Activity Feed

Traditional RDBMS (e.g. MySQL)

activity_id  uid  event_time           type    oid1    oid2
1            123  2013-01-22 15:43:00  photo   123abc  789ghi
2            345  2013-01-22 15:52:00  status  456def  foobar

Advantage: Easy inserts

Disadvantages: Rigid schema adapts poorly to new activity types, doesn’t scale

Page 45: Creating social features at BranchOut using MongoDB

Activity Feed

MongoDB

user_feed_card

ufc: {
  _id: 123, // UID
  total_events: 18,
  2013_01_total: 4,
  2012_12_total: 8,
  2012_11_total: 6,
  ...other counts...
}

user_feed_month

ufm: {
  _id: "123_2013_01",
  events: [
    {
      uid: 123,
      type: "photo_upload",
      content_id: "abcd9876",
      timestamp: 1358824502,
      ...more metadata...
    },
    ...more events...
  ]
}
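The slides show the two document shapes but not how they are written. The following is a minimal sketch of one possible write path, assuming in-memory maps in place of the collections, UTC month buckets, and hypothetical helpers `monthKey` and `recordEvent`.

```javascript
const cards = new Map();  // user_feed_card documents, keyed by uid
const months = new Map(); // user_feed_month documents, keyed by "<uid>_<YYYY>_<MM>"

// Derive the "YYYY_MM" bucket from a Unix timestamp in seconds (UTC assumed).
function monthKey(timestampSeconds) {
  const d = new Date(timestampSeconds * 1000);
  const month = String(d.getUTCMonth() + 1).padStart(2, "0");
  return `${d.getUTCFullYear()}_${month}`;
}

// Hypothetical write path: append the event to its month document and
// bump the matching counters on the user's card.
function recordEvent(event) {
  const key = monthKey(event.timestamp);
  const monthId = `${event.uid}_${key}`;
  const monthDoc = months.get(monthId) || { _id: monthId, events: [] };
  monthDoc.events.push(event);
  months.set(monthId, monthDoc);

  const card = cards.get(event.uid) || { _id: event.uid, total_events: 0 };
  card.total_events += 1;
  card[`${key}_total`] = (card[`${key}_total`] || 0) + 1;
  cards.set(event.uid, card);
}

// The sample event from the slide.
recordEvent({ uid: 123, type: "photo_upload", content_id: "abcd9876", timestamp: 1358824502 });
console.log(cards.get(123), months.get("123_2013_01"));
```

Because the card holds only counters, it stays small no matter how many events a user accumulates, while the event bodies are spread across one document per user per month.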

Page 52: Creating social features at BranchOut using MongoDB

Activity Feed

Algorithm

1. Load user_feed_cards for all connections

2. Calculate which user_feed_months to load

3. Load user_feed_months

4. Aggregate events that refer to the same story

5. Sort (reverse chronological)

6. Load content, comments, etc. and build stories
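Steps 1 through 5 can be sketched over in-memory data. This is an illustration under assumptions, not BranchOut's implementation: the month-selection rule (walk months newest first until the counters promise at least `limit` events), grouping stories by `content_id`, and the helper names `monthsToLoad` and `buildFeed` are all hypothetical. Step 6 is left out because it depends on content stores the slides do not describe.

```javascript
// Step 2: use the per-month counters on the cards to pick month documents,
// newest month first, until at least `limit` events are covered.
function monthsToLoad(cards, limit) {
  const monthSet = new Set();
  for (const card of cards) {
    for (const key of Object.keys(card)) {
      const match = /^(\d{4}_\d{2})_total$/.exec(key);
      if (match) monthSet.add(match[1]);
    }
  }
  const newestFirst = [...monthSet].sort().reverse(); // "YYYY_MM" sorts as text
  const ids = [];
  let covered = 0;
  for (const month of newestFirst) {
    if (covered >= limit) break;
    for (const card of cards) {
      const n = card[`${month}_total`] || 0;
      if (n > 0) {
        ids.push(`${card._id}_${month}`);
        covered += n;
      }
    }
  }
  return ids;
}

function buildFeed(connectionUids, cardStore, monthStore, limit) {
  // Step 1: load the cards of all connections.
  const cards = connectionUids.map((uid) => cardStore.get(uid)).filter(Boolean);
  // Steps 2 and 3: decide which month documents to load, then load them.
  const events = monthsToLoad(cards, limit).flatMap(
    (id) => (monthStore.get(id) || { events: [] }).events
  );
  // Step 4: merge events that point at the same underlying object.
  const stories = new Map();
  for (const event of events) {
    const story = stories.get(event.content_id) ||
      { content_id: event.content_id, timestamp: 0, events: [] };
    story.events.push(event);
    story.timestamp = Math.max(story.timestamp, event.timestamp);
    stories.set(event.content_id, story);
  }
  // Step 5: newest story first.
  return [...stories.values()].sort((a, b) => b.timestamp - a.timestamp).slice(0, limit);
}
```

Only the small card documents are read for every connection; the larger month documents are fetched just for the users and months that can contribute to the page being built.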

Page 56: Creating social features at BranchOut using MongoDB

Activity Feed

Performance

• Response times average under 500 ms (98th percentile under 1 sec)

• Design expected to scale well horizontally

• Need to continue to optimize

Page 57: Creating social features at BranchOut using MongoDB

Building Social Features with MongoDB

Nathan Smith

BrO: http://branchout.com/nate

FB: http://facebook.com/neocortica

Twitter: @nate510

Email: [email protected]

Aditya Agarwal on Facebook’s architecture: http://www.infoq.com/presentations/Facebook-Software-Stack

Dan McKinley on Etsy’s activity feed: http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture

Good Quora questions on activity feeds: http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed

http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed
