Creating social features at BranchOut using MongoDB
Building Social Features with MongoDB
Nathan Smith, BranchOut.com, Jan. 22, 2013
Tuesday, January 22, 13
BranchOut: A more social professional network
• Connect with your colleagues (follow)
• Activity feed of their professional activity
• Timeline of an individual’s posts
• 30MM installed users
• 750MM total user records
• Average 300 connections per installed user
MongoDB @ BranchOut
• 100% MySQL until ~July 2012
• Much of our data fits well into a document model
• Our data design avoids RDBMS features
Follow System: Business logic
• Limit of 2000 followees (people you follow)
• Unlimited followers
• Both lists reflect updates in near-real time
Follow System: Traditional RDBMS (e.g. MySQL)
follower_uid  followee_uid  follow_time
123           456           2013-01-22 15:43:00
456           123           2013-01-22 15:52:00
Advantage: Easy inserts, deletes
Disadvantages: Poor data locality, large index size
Follow System: MongoDB (first pass)
followee: { _id: 123, uids: [456, 567, 678] }
Advantage: Compact data, read locality
Disadvantage: Can’t display a user’s followers
Follow System: Can’t display a user’s followers (easily)
followee: { _id: 123, uids: [456, 567, 678] }
db.follow.find({uids: 456}, {_id: 1});
...with multi-key index on uids
Expensive! Also, no guarantee of order.
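To make the asymmetry concrete, here is a minimal in-memory sketch of the first-pass model. Plain Python lists and dicts stand in for the collection; the extra documents and the helper names followees_of and followers_of are hypothetical. Reading who a user follows touches one document, while finding a user's followers has to look inside every document's uids array. A multi-key index avoids the scan but needs one index entry per follow relationship, which is presumably where the cost noted on the slide comes from.

```python
# Hypothetical sample data in the first-pass shape: _id follows everyone in uids.
followee_docs = [
    {"_id": 123, "uids": [456, 567, 678]},
    {"_id": 789, "uids": [456]},
    {"_id": 456, "uids": [123]},
]


def followees_of(uid):
    """People `uid` follows: a single document read."""
    for doc in followee_docs:
        if doc["_id"] == uid:
            return doc["uids"]
    return []


def followers_of(uid):
    """People who follow `uid`: every document must be inspected.

    The result comes back in storage order, not in follow order.
    """
    return [doc["_id"] for doc in followee_docs if uid in doc["uids"]]


print(followees_of(123))
print(followers_of(456))
```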
Follow System: MongoDB (second pass)
followee: { _id: 1, uids: [2, 3] }
followee: { _id: 2, uids: [1, 3] }
follower: { _id: 1, uids: [2] }
follower: { _id: 2, uids: [1] }
follower: { _id: 3, uids: [1, 2] }
Advantages: Local data, fast selects
Disadvantage: Follower doc size
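A rough sketch of what keeping both lists in step might look like, assuming each follow writes to two documents: the follower's followee list and the target's follower list. Dicts keyed by _id stand in for the two collections, and the follow and unfollow helpers are hypothetical names. The 2000-followee cap from the business rules is checked on the write path.

```python
MAX_FOLLOWEES = 2000  # limit stated in the business rules


def follow(followee_docs, follower_docs, user, target):
    """Record that `user` follows `target` in both collections."""
    following = followee_docs.setdefault(user, [])
    if target in following:
        return False  # already following, nothing to do
    if len(following) >= MAX_FOLLOWEES:
        raise ValueError("followee limit reached")
    following.append(target)
    follower_docs.setdefault(target, []).append(user)
    return True


def unfollow(followee_docs, follower_docs, user, target):
    """Remove the relationship from both collections."""
    if target not in followee_docs.get(user, []):
        return False
    followee_docs[user].remove(target)
    follower_docs[target].remove(user)
    return True


# Rebuild the example documents from the slide.
followee_docs, follower_docs = {}, {}
for user, target in [(1, 2), (1, 3), (2, 1), (2, 3)]:
    follow(followee_docs, follower_docs, user, target)
print(followee_docs)
print(follower_docs)
```

Because the followee list is capped, only the follower side can grow without bound.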
Follow System: Follower document size
• Max Mongo doc size: 16MB
• Number of people who follow our community manager: 30MM
• 30MM uids × 8 bytes/uid = 240MB
• Max followers per doc: ~2MM
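The arithmetic on this slide can be checked in a few lines. The sketch uses the slide's figure of 8 bytes per uid and treats 16MB as 16 × 1024 × 1024 bytes. It ignores the per-element overhead that BSON arrays carry, so the real capacity of one document would be somewhat lower than the number computed here.

```python
MAX_DOC_BYTES = 16 * 1024 * 1024  # MongoDB document size limit
BYTES_PER_UID = 8                 # figure used on the slide
FOLLOWERS = 30_000_000            # followers of the community manager

needed_bytes = FOLLOWERS * BYTES_PER_UID            # raw size of the uid list
max_uids_per_doc = MAX_DOC_BYTES // BYTES_PER_UID   # upper bound, roughly 2MM
docs_needed = -(-needed_bytes // MAX_DOC_BYTES)     # ceiling division

print(needed_bytes, max_uids_per_doc, docs_needed)
```

So a single follower document cannot hold the list, and it has to be spread over many documents.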
Follow System: MongoDB (final pass)
followee: { _id: 1, uids: [2, 3] }
followee: { _id: 2, uids: [1, 3] }
follower: { _id: "1", uids: [2,3,4,...], count: 20001, next_page: 2 }
follower: { _id: "1_p2", uids: [23,24,25,...], count: 10000 }
follower: { _id: "1", uids: [2,3,4,...], count: 10001, next_page: 3 }
follower: { _id: "1_p2", uids: [23,24,25,...], count: 10000 }
Asynchronous thread manages follower documents
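The slide does not spell out the paging rules, so the following is only one reading of it. It assumes new followers are appended to the head document, that a background worker splits the head once it holds more than 20,000 uids, that the split moves the oldest 10,000 uids into a new page document, and that next_page records the number of the most recently created page. PAGE_SIZE, SPLIT_AT, add_follower and rebalance are all hypothetical names, and a dict keyed by _id stands in for the collection.

```python
PAGE_SIZE = 10_000         # assumed number of uids moved per overflow page
SPLIT_AT = 2 * PAGE_SIZE   # assumed threshold: split when the head exceeds this


def add_follower(store, followee_uid, follower_uid):
    """Request path: append to the head document only."""
    key = str(followee_uid)
    head = store.setdefault(key, {"_id": key, "uids": [], "count": 0})
    head["uids"].append(follower_uid)
    head["count"] += 1


def rebalance(store, followee_uid):
    """Background path: move the oldest PAGE_SIZE uids into a new page."""
    head = store[str(followee_uid)]
    if head["count"] <= SPLIT_AT:
        return False
    page_no = head.get("next_page", 1) + 1   # the head counts as page 1
    page_id = f"{followee_uid}_p{page_no}"
    moved = head["uids"][:PAGE_SIZE]
    head["uids"] = head["uids"][PAGE_SIZE:]
    head["count"] = len(head["uids"])
    head["next_page"] = page_no
    store[page_id] = {"_id": page_id, "uids": moved, "count": len(moved)}
    return True
```

Under these assumptions, a head document at count 20001 with next_page 2 becomes count 10001 with next_page 3 after one rebalance, which matches the numbers in the two states shown on the slide.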
Activity Feed: Push vs Pull architecture
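The transcript keeps only the title of this slide. In the usual sense of the terms, a push design copies each new event into every follower's feed at write time, while a pull design stores each event once with its author and assembles the feed at read time. The sketch below contrasts the two with in-memory dicts; all names and sample data are hypothetical and nothing here is taken from BranchOut's code. The Algorithm slide later in the deck reads like a pull design.

```python
from collections import defaultdict

followers = {1: [2, 3]}        # hypothetical: user 1 is followed by users 2 and 3
followees = {2: [1], 3: [1]}   # the same relationships, seen from the readers

inboxes = defaultdict(list)    # push: one precomputed feed per reader
outboxes = defaultdict(list)   # pull: one event list per author


def publish_push(author, event):
    """Write cost grows with the author's follower count."""
    for reader in followers.get(author, []):
        inboxes[reader].append(event)


def read_push(reader):
    """Read is a single lookup; events were appended in time order."""
    return list(reversed(inboxes[reader]))


def publish_pull(author, event):
    """Write is a single append."""
    outboxes[author].append(event)


def read_pull(reader):
    """Read cost grows with the number of people the reader follows."""
    events = [e for a in followees.get(reader, []) for e in outboxes[a]]
    return sorted(events, key=lambda e: e["ts"], reverse=True)
```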
Activity Feed: Business logic
• All connections and followees appear in your feed
• Reverse-chronological sort order (but should support other rankings)
• Support for an evolving set of feed event types
• Tagging creates multiple feed events for the same underlying object
• Feed events are not ephemeral: they also make up the Timeline
Activity Feed: Traditional RDBMS (e.g. MySQL)
activity_id  uid  event_time           type    oid1    oid2
1            123  2013-01-22 15:43:00  photo   123abc  789ghi
2            345  2013-01-22 15:52:00  status  456def  foobar
Advantage: Easy inserts
Disadvantages: Rigid schema adapts poorly to new activity types, doesn’t scale
Activity Feed: MongoDB
user_feed_card
ufc: {
  _id: 123, // UID
  total_events: 18,
  2013_01_total: 4,
  2012_12_total: 8,
  2012_11_total: 6,
  ...other counts...
}
user_feed_month
ufm: {
  _id: "123_2013_01",
  events: [
    {
      uid: 123,
      type: "photo_upload",
      content_id: "abcd9876",
      timestamp: 1358824502,
      ...more metadata...
    },
    ...more events...
  ]
}
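As an illustration of how the two documents relate, here is a sketch of the write path for a single event. It assumes the month bucket is derived from the event timestamp in UTC and that the per-month counter on the card is bumped together with the append. The record_event helper is a hypothetical name, and dicts keyed by _id stand in for the two collections.

```python
from datetime import datetime, timezone


def record_event(cards, months, uid, event_type, content_id, timestamp):
    """Append an event to the user's month document and update the card counts."""
    # Assumption: months are bucketed in UTC; the slide does not say.
    ym = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y_%m")
    month_id = f"{uid}_{ym}"

    month = months.setdefault(month_id, {"_id": month_id, "events": []})
    month["events"].append({
        "uid": uid,
        "type": event_type,
        "content_id": content_id,
        "timestamp": timestamp,
    })

    card = cards.setdefault(uid, {"_id": uid, "total_events": 0})
    card["total_events"] += 1
    card[f"{ym}_total"] = card.get(f"{ym}_total", 0) + 1
    return month_id


cards, months = {}, {}
print(record_event(cards, months, 123, "photo_upload", "abcd9876", 1358824502))
print(cards[123])
```

The card stays small because it holds only counts, so many cards can be read cheaply before deciding which month documents are worth loading.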
Activity Feed: Algorithm
1. Load user_feed_cards for all connections
2. Calculate which user_feed_months to load
3. Load user_feed_months
4. Aggregate events that refer to the same story
5. Sort (reverse chron)
6. Load content, comments, etc. and build stories
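Steps 1 to 5 can be sketched as follows. This is an illustration under assumptions, not the production algorithm: months are chosen newest-first until their combined counts reach the requested number of events, events are treated as the same story when they share a content_id, and step 6 is left out. The function names and sample data are hypothetical, and dicts keyed by _id stand in for the collections.

```python
import re
from collections import defaultdict

MONTH_KEY = re.compile(r"^(\d{4}_\d{2})_total$")


def months_to_load(cards, limit):
    """Step 2: newest months whose combined event counts reach `limit`."""
    per_month = defaultdict(int)
    for card in cards:
        for key, count in card.items():
            match = MONTH_KEY.match(str(key))
            if match:
                per_month[match.group(1)] += count
    chosen, seen = [], 0
    for ym in sorted(per_month, reverse=True):  # "2013_01" sorts after "2012_12"
        chosen.append(ym)
        seen += per_month[ym]
        if seen >= limit:
            break
    return chosen


def build_feed(connection_uids, cards_by_uid, months_by_id, limit=20):
    cards = [cards_by_uid[u] for u in connection_uids if u in cards_by_uid]  # step 1
    wanted = months_to_load(cards, limit)                                    # step 2
    events = []
    for card in cards:                                                       # step 3
        for ym in wanted:
            if card.get(f"{ym}_total", 0):
                doc = months_by_id.get(f"{card['_id']}_{ym}")
                if doc:
                    events.extend(doc["events"])
    stories = {}                                                             # step 4
    for ev in events:
        story = stories.setdefault(
            ev["content_id"],
            {"content_id": ev["content_id"], "uids": [], "timestamp": 0},
        )
        story["uids"].append(ev["uid"])
        story["timestamp"] = max(story["timestamp"], ev["timestamp"])
    ranked = sorted(stories.values(), key=lambda s: s["timestamp"], reverse=True)  # step 5
    return ranked[:limit]


cards_by_uid = {
    123: {"_id": 123, "total_events": 2, "2013_01_total": 1, "2012_12_total": 1},
    345: {"_id": 345, "total_events": 1, "2013_01_total": 1},
}
months_by_id = {
    "123_2013_01": {"events": [{"uid": 123, "type": "photo_upload",
                                "content_id": "abcd9876", "timestamp": 1358824502}]},
    "123_2012_12": {"events": [{"uid": 123, "type": "status",
                                "content_id": "old1", "timestamp": 1356000000}]},
    "345_2013_01": {"events": [{"uid": 345, "type": "photo_tag",
                                "content_id": "abcd9876", "timestamp": 1358824600}]},
}
feed = build_feed([123, 345], cards_by_uid, months_by_id, limit=3)
print([story["content_id"] for story in feed])
```

Because the limit is counted in events before aggregation, the number of stories returned can be smaller than the limit.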
Activity Feed: Performance
• Response times average under 500 ms (98th percentile under 1 sec)
• Design expected to scale well horizontally
• Need to continue to optimize
Building Social Features with MongoDB
Nathan Smith
BrO: http://branchout.com/nate
FB: http://facebook.com/neocortica
Twitter: @nate510
Email: [email protected]
Aditya Agarwal on Facebook’s architecture: http://www.infoq.com/presentations/Facebook-Software-Stack
Dan McKinley on Etsy’s activity feed: http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture
Good Quora questions on activity feeds:
http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed
http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed