MongoDB @ fliptop

MongoDB @ Fliptop 2011/12/10


Tech talk at TWJUG on how Fliptop leverages MongoDB in its infrastructure for better scalability.

Transcript of MongoDB @ fliptop

Page 1: MongoDB @ fliptop

 MongoDB @ Fliptop

2011/12/10

Page 2: MongoDB @ fliptop

Agenda

• Fliptop
  o infrastructure
• MongoDB
  o architecture
  o sharding strategy
  o data schema
  o index and query
  o miscellaneous

Page 3: MongoDB @ fliptop

What is Fliptop?

• Social profiles lookup
  o Facebook, Twitter, LinkedIn
  o campaign analysis
  o API lookup
• Our problems
  o scalability
      Data: ~7 billion data points
      Infrastructure: ~1MM lookups/day

Page 4: MongoDB @ fliptop

Fliptop Infrastructure

• Infrastructure
  o Amazon EC2
• NoSQL Database
  o MongoDB
• Indexing and full-text search
  o Apache Solr
• Distributed computing
  o AWS Elastic MapReduce (Hadoop)

Page 5: MongoDB @ fliptop

Fliptop Databases

• Fliptop Data
  o ~50MM records
• Without MongoDB
  o MySQL: AWS RDS x 1
  o Solr: AWS EC2 m1.large x 10
• With MongoDB
  o MySQL: AWS RDS x 1
  o Solr: AWS EC2 m1.large x 2 (master/slave)
  o MongoDB: AWS EC2 m2.large x 10 (replica set)

Page 6: MongoDB @ fliptop

From Solr to MongoDB

• Our Storage Requirements
  o auto sharding
  o richness of queries
  o short insert latency
• Other Reasons
  o documentation
  o active community
  o word of mouth
• Migration Efforts
  o queries
  o db driver
  o performance tuning

Page 7: MongoDB @ fliptop

MongoDB Features

• Auto-Sharding
  o scale out to 1000 nodes
• Replication & High Availability
  o master/slave and replica set
• Querying
  o equivalents for most SQL queries
• Document-oriented storage
  o JSON, schema-free
• Full Index Support
  o index any field
• Map/Reduce
  o JavaScript at server side

Page 8: MongoDB @ fliptop

MongoDB Servers

 

Page 9: MongoDB @ fliptop

MongoDB Sharding

• Automatic balancing for changes in load and data distribution

• Easy addition of new machines
• Scaling out to one thousand nodes
• No single points of failure
• Automatic failover

Page 10: MongoDB @ fliptop

MongoDB Replication

• master/slave
  o easy setup
  o manual failover
• replica set
  o slightly more complex setup
  o automatic failover
  o minimum nodes: 3 (1 may be an arbiter)
  o maximum nodes: 12
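A minimal sketch of what the smallest replica set from this slide could look like as a configuration document. The set name and host names are hypothetical placeholders, not Fliptop's actual servers; in the mongo shell a document of this shape would be passed to `rs.initiate()`.

```javascript
// Hypothetical three-member replica set: two data-bearing nodes plus one
// arbiter, which votes in elections but stores no data.
const replicaSetConfig = {
  _id: "rs0", // assumed set name
  members: [
    { _id: 0, host: "mongo-a.example.com:27017" },
    { _id: 1, host: "mongo-b.example.com:27017" },
    { _id: 2, host: "mongo-arbiter.example.com:27017", arbiterOnly: true },
  ],
};

// Members that actually hold a copy of the data.
const dataBearing = replicaSetConfig.members.filter((m) => !m.arbiterOnly);
console.log("members:", replicaSetConfig.members.length, "data-bearing:", dataBearing.length);
```

The arbiter keeps the member count odd, so a single node failure still leaves two voters, without paying for a third full copy of the data.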

Page 11: MongoDB @ fliptop

MongoDB Failover

• Voting algorithm (replica set)
  o votes needed to elect a primary: floor(all nodes / 2) + 1
• Priority
  o if 0, never becomes primary
      e.g. a backup on a small machine
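Replica set elections require a strict majority of the voting members. The sketch below works through that arithmetic; the function names are made up for illustration and are not part of any MongoDB API.

```javascript
// Votes a candidate needs in a set with `totalVotingNodes` voting members.
function votesNeeded(totalVotingNodes) {
  return Math.floor(totalVotingNodes / 2) + 1;
}

// A primary can be elected only if the reachable members form a majority.
function canElectPrimary(totalVotingNodes, reachableNodes) {
  return reachableNodes >= votesNeeded(totalVotingNodes);
}

for (const total of [3, 4, 5]) {
  console.log(total + " nodes need " + votesNeeded(total) + " votes");
}
```

This is why an arbiter is useful: with only two members, losing one leaves a single vote out of two, which is not a majority, so no primary can be elected.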

Page 12: MongoDB @ fliptop

Fliptop MongoDB Infrastructure

• Data
  o 10MM/replica set
• MongoDB servers
  o router x 1
  o config server x 1
  o shard servers x 10
      5 primary, 5 secondary
  o arbiter servers x 5
• AWS EC2 Instances
  o m2.large x 10

Page 13: MongoDB @ fliptop

MongoDB and AWS EC2

• Instance type
  o m2.xlarge
      17.1 GB of memory
      6.5 EC2 Compute Units
• Storage
  o Local Drive
      faster i/o
      not portable
  o EBS
      i/o = network + disk i/o
      portable
      easy backup
      RAID 1/0

Page 14: MongoDB @ fliptop

MongoDB Sharding Strategy

• Sharding Key Strategy
  o Ascending shard key
      data locality
      hotspot for read/write
      ex. timestamp, auto-increment PK
  o Random shard key
      evenly distributed read/write
      no data locality
      ex. UUID, md5
  o Hybrid shard key
      ascending + evenly distributed
      ex. timestamp + uuid
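The hotspot effect of an ascending key versus a random key can be seen in a toy model of range-based chunks. This is only a sketch under simplifying assumptions: the chunk boundaries are fixed, chunks are never split or migrated, and the helper names are invented for the example.

```javascript
const crypto = require("crypto");

// Chunk i holds keys below BOUNDARIES[i] (and at or above the previous
// boundary); the last chunk is open-ended.
const BOUNDARIES = [250, 500, 750];

function chunkFor(key) {
  const i = BOUNDARIES.findIndex((b) => key < b);
  return i === -1 ? BOUNDARIES.length : i;
}

// Ascending key: a counter standing in for a timestamp that keeps growing
// past the range the existing chunks were split on.
function ascendingKey(n) {
  return 1000 + n;
}

// Random key: first 8 hex digits of an md5 digest, folded into [0, 1000).
function randomKey(n) {
  const hex = crypto.createHash("md5").update(String(n)).digest("hex");
  return parseInt(hex.slice(0, 8), 16) % 1000;
}

// Count how many of `inserts` new documents land in each chunk.
function distribute(keyFn, inserts) {
  const counts = new Array(BOUNDARIES.length + 1).fill(0);
  for (let n = 0; n < inserts; n++) counts[chunkFor(keyFn(n))]++;
  return counts;
}

const ascendingCounts = distribute(ascendingKey, 1000);
const randomCounts = distribute(randomKey, 1000);
console.log("ascending:", ascendingCounts);
console.log("random:   ", randomCounts);
```

With the ascending key every new insert falls into the last chunk, so one shard takes all the write load; with the md5-derived key the inserts spread across all chunks, at the cost of losing range locality.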

Page 15: MongoDB @ fliptop

From timestamp to uuid

• Why timestamp?
  o same sharding key as our Solr
  o issues
      slowness of count (traverse) query
      maintenance headache
        add nodes more frequently
      duplication of uuids
• From timestamp to uuid
  o performance gain with count
      2x faster
      ex. count 1MM, from 10s to 5s
  o less maintenance
      enable multiple nodes at the same time
  o dedup
      uniqueness of uuid is guaranteed locally only

Page 16: MongoDB @ fliptop

MongoDB Balancer

• if the number of chunks is not evenly distributed, the balancer can fix it
  o stop criteria
      until the difference between any two nodes is <= 2
  o balancer window
      active time window
  o blocking if moving massive data
      while adding a brand-new node
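The stop criterion can be illustrated with a toy balancer that moves one chunk at a time from the fullest shard to the emptiest until the gap is at most 2. This is a simplified model for intuition only, not MongoDB's actual balancer logic; the function name and the starting counts are invented.

```javascript
// Move one chunk per round from the fullest shard to the emptiest shard
// until the spread between them is within `threshold`.
function balance(chunkCounts, threshold = 2) {
  const counts = chunkCounts.slice();
  let migrations = 0;
  for (;;) {
    const max = Math.max(...counts);
    const min = Math.min(...counts);
    if (max - min <= threshold) break;
    counts[counts.indexOf(max)]--;
    counts[counts.indexOf(min)]++;
    migrations++;
  }
  return { counts, migrations };
}

// Two loaded shards plus a brand-new, empty one.
const result = balance([12, 3, 0]);
console.log(result);
```

The number of migrations grows with how empty the new shard is, which is why adding a brand-new node triggers the large data movement the slide warns about.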

Page 17: MongoDB @ fliptop

MongoDB Schema

• Document oriented
  o JSON
• Schema Free
  o pros
      no predefined schema is required
      save 'as is'
  o cons
      overhead of headers (field names are stored in every document)
      low sensitivity to broken data

Page 18: MongoDB @ fliptop

MongoDB Schema and Size

• Size matters
  o simple schema is better
      payment:[{"publisher_id": 176, "paid":true}] -> payment:[176_1]
  o abbreviation of headers
      payment:[176_1] -> pm:[176_1]
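A rough way to see the effect of both tricks is to compare serialized sizes. The sketch below measures JSON text rather than BSON, so the absolute numbers differ from what MongoDB stores, but the ordering illustrates the point; `jsonBytes` is a helper made up for this example.

```javascript
// Size in bytes of a document serialized as JSON (an approximation of
// storage cost; MongoDB actually stores BSON).
function jsonBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

const verbose = { payment: [{ publisher_id: 176, paid: true }] };
const flattened = { payment: ["176_1"] }; // publisher id and paid flag packed into one string
const abbreviated = { pm: ["176_1"] };    // same value, shorter field name

const sizes = {
  verbose: jsonBytes(verbose),
  flattened: jsonBytes(flattened),
  abbreviated: jsonBytes(abbreviated),
};
console.log(sizes);
```

Because field names are repeated in every document, a few bytes saved per record add up quickly across tens of millions of records.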

Page 19: MongoDB @ fliptop

MongoDB Queries

1) COLUMN = VALUE
2) COLUMN in RANGE
3) boolean operators AND, OR, NOT
4) pagination (start, rows)
5) sort
6) count (of query result)
7) COLUMN is non-existent
8) multiValued fields
9) dynamic fields
10) dynamic multiValued fields
11) stats queries (min, max)
12) faceted queries (aggregation of specific fields)
13) free text search (regular expression)
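Several of these query types map directly onto MongoDB filter documents. The sketch below writes a few of them as plain JavaScript objects; the field names (name, age, city, job, tags) are hypothetical and not taken from Fliptop's schema.

```javascript
// Filter documents of the kind passed to db.<collection>.find() in the shell.
const filters = {
  equality: { name: "robbie" },                            // 1) COLUMN = VALUE
  range: { age: { $gte: 20, $lt: 30 } },                   // 2) COLUMN in RANGE
  boolean: {                                               // 3) OR combined with NOT
    $or: [{ city: "Taipei" }, { age: { $not: { $lt: 18 } } }],
  },
  missingField: { job: { $exists: false } },               // 7) COLUMN is non-existent
  multiValued: { tags: { $in: ["java", "mongodb"] } },     // 8) multiValued fields
  freeText: { name: /^robbie/ },                           // 13) regular expression
};

// Pagination and sorting (4, 5) are applied on the cursor, not in the filter,
// e.g. db.persons.find(filters.range).sort({ name: 1 }).skip(20).limit(10)
// and counting (6) uses db.persons.find(filters.range).count().
const cursorOptions = { sort: { name: 1 }, skip: 20, limit: 10 };

console.log(Object.keys(filters));
```

Stats and faceted queries (11, 12) have no single-operator equivalent in this list; they are the kind of work the deck assigns to server-side map/reduce.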

Page 20: MongoDB @ fliptop

MongoDB Index

• Tree-structured index
• At most 64 indexes per collection (table)
• A query only leverages 1 index unless using an $or query
• An index entails additional work on insert, delete, update

Page 21: MongoDB @ fliptop

MongoDB Index Types

• Basic Index
  o db.persons.ensureIndex({name: 1});
• Embedded Index
  o db.persons.ensureIndex({"location.city": 1})
• Compound Index
  o db.persons.ensureIndex({name: 1, "location.city": 1})
• Sparse Index
  o db.persons.ensureIndex({job: 1}, {sparse: true})

Page 22: MongoDB @ fliptop

MongoDB Index Limits

• negation operations
  o $ne, $not
  o ex. db.things.find( { x : { $ne : 3 } } );
• arithmetic operations
  o $mod
  o ex. db.things.find( "this.a % 10 == 1")
• most regular expressions
  o yes
      db.persons.find({name: /^robbie/})
      db.persons.find({name: /^robbie.*/})
      db.persons.find({name: /^robbie.*/i})
  o no
      db.persons.find({name: /robbie/})
• $where
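Why an anchored prefix pattern can use an index while an unanchored one cannot is easier to see with a toy model, where a sorted array stands in for the index. The names in the array and the helper functions are invented for this sketch.

```javascript
// Toy "index": names kept in sorted order (hypothetical data).
const index = ["alice", "bob", "robbie", "robbie cheng", "robert", "zoe"];

// Binary search for the position of the first entry >= target.
function lowerBound(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Anchored prefix (like /^robbie/): jump to the first candidate, then read
// neighbors until the prefix stops matching.
function prefixSearch(sorted, prefix) {
  const hits = [];
  let examined = 0;
  for (let i = lowerBound(sorted, prefix); i < sorted.length; i++) {
    examined++;
    if (!sorted[i].startsWith(prefix)) break;
    hits.push(sorted[i]);
  }
  return { hits, examined };
}

// Unanchored pattern (like /robbie/): a match can sit anywhere in the
// string, so every entry has to be tested.
function containsSearch(sorted, needle) {
  const hits = sorted.filter((s) => s.includes(needle));
  return { hits, examined: sorted.length };
}

const anchored = prefixSearch(index, "robbie");
const unanchored = containsSearch(index, "robbie");
console.log(anchored, unanchored);
```

Both searches find the same entries here, but the prefix search touches only the matching range plus one neighbor, while the unanchored search examines the whole index.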

Page 23: MongoDB @ fliptop

MongoDB Index Optimization

• simple data type
  o ex. int is faster than string
• simple data schema
  o ex. {payment: "176_1"}
• sparse index
  o for optional fields

Page 24: MongoDB @ fliptop

MongoDB Miscellaneous

• Monitoring
  o CPU
      if high, it implies an index is broken
  o Drive size
      time to add a new instance
• Backup
  o EBS: snapshot
  o mongo import/export tools
      mongodump/mongorestore, mongoexport/mongoimport
• Auto Deployment
  o Hudson + fabric (python)

Page 25: MongoDB @ fliptop

What's Next?

• Further data and index weight loss
  o target: 20MM/instance
• Introduce Java POJO/DAO
  o Morphia
  o Spring mongodb
• Watchdog mechanism
  o restart server automatically

Page 26: MongoDB @ fliptop

Q & A

Robbie Cheng
Lead Software [email protected]

Page 27: MongoDB @ fliptop

We're Hiring

• please mail to [email protected]

Page 28: MongoDB @ fliptop

Thank you!