Implementing MongoDB
MongoSF April 2010 Kenny Gorman, Data Architect
Shutterfly Inc.
• Founded in December 1999 • Public company (NASDAQ: SFLY) • Millions of customers have billions of pictures on
Shutterfly • Photo site, books, sharing, prints, gifts • Only photo sharing site that doesn’t down-
sample, compress, or force delete photos • > 6B photos, adding 400TB/mo
April 30, 2010 Business Confidential 2
Existing Metadata Storage Architecture
• Metadata is persisted in RDBMS • Images/media stored outside DB • Java/Spring, C#,.Net • Oracle™ RDBMS • Sun™ servers and storage • Vertically partitioned by function • Hot Standbys used for availability • > 20tb of RDBMS storage • > 10000 ex/sec • Extreme uptime requirements
April 30, 2010 Business Confidential 3
Problems
• Time to Market • Cost • Performance • Scalability
April 30, 2010 Business Confidential 4
New Metadata Storage Architecture
• Performance ! Reduce complexity ! Partition data
• Scalability ! Move to clustered system
• Time to Market ! Simple API
• Cost ! OSS software ! Simple hardware
April 30, 2010 Business Confidential 5
New Data Architecture Fundamentals
• Partition data • Relax consistency (where applicable) • Data locality • Highly available configuration • Keep design simple/fast • Keep hardware simple/cheap • Keep software simple/cheap
April 30, 2010 Business Confidential 6
MongoDB
• Open Source • Best of RDBMS, yet not quite k,v store • Features we need • Commercial support • Active community • Performance
April 30, 2010 Business Confidential 7
MongoDB Development
• Data modeling • Java, .Net • Simple, fast development • JSON just makes sense • Data access layer • GridFS
April 30, 2010 Business Confidential 8
MongoDB in production
• Simple use case, simple project • Primary and 2 replica DB’s, 1 ‘lagged’ • Manual failover • Monitoring: http interface • Tools: mongostat, custom rrd graphs • Linux on Intel™ • MongoDB 1.4.2 (stable)
April 30, 2010 Business Confidential 9
Going Live Plan
• Walk before you run • Shutterfly project/product selection • Write through architecture
• \
• Good metrics • Subset of MongoDB features
April 30, 2010 Business Confidential 10
So how did we do?
• Time to Market • Application developed in 1 sprint
• Cost • 500% improvement
• Performance • 900% improvement • 18ms to 2ms avg latency for inserts
• Scalability • Shard on demand
April 30, 2010 Business Confidential 11
The future
• More MongoDB • Replication as durability (getLasterror(w=2)) • Replica sets
• Excitement from developers • Lots of attribute and media metadata types • Object mapper
• New projects and old systems • Evaluate as they come up
April 30, 2010 Business Confidential 12
Lessons Learned
• Keep it simple • Data Modeling • Walk before you run • Use Jira for MongoDB issues • There is life after Larry
April 30, 2010 Business Confidential 13
Q&A
April 30, 2010 Business Confidential 14
Questions?
Contact: [email protected] http://www.kennygorman.com http://github.com/kgorman http://www.shutterfly.com [email protected]
Top Related