CouchConf Israel 2013_Couchbase at Playtika


SOME GENERIC STUFF

Preface

Myself

• Many years @ 888, an online casino & poker company. Finished as Gaming Architect

• CTO @ UMOO, a financial market gaming platform

• CTO @ Playtika

Playtika

• Founded mid 2010
• Social gaming company
• Multiple social networks and sites
• Multiple platforms (SNs, destination, Android, iOS)
• Acquired by Caesars Interactive Entertainment mid 2011

Where

Our Games

• A few million daily active users
• A few (more) million monthly active users
• Our biggest game today, Slotomania: 200K+ concurrent users at peak hours. No less than 100K during off-peak
• At least one request per user every 5 seconds
• At least one read-modify-write on user state per request
• In many cases multiple read-modify-writes
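As a back-of-the-envelope check on the numbers above, the lower bound on request rate follows directly from concurrent users divided by the request interval. The function name below is made up for illustration, and the result is only a floor, since each request may carry more than one read-modify-write.

```python
def min_requests_per_second(concurrent_users: int, seconds_between_requests: float) -> float:
    """Lower bound on request rate if every user sends one request per interval."""
    return concurrent_users / seconds_between_requests

# Figures taken from the slide: 200K users at peak, 100K off-peak, one request per 5 s.
peak_rps = min_requests_per_second(200_000, 5)
off_peak_rps = min_requests_per_second(100_000, 5)

print(f"peak: at least {peak_rps:,.0f} req/s")
print(f"off-peak: at least {off_peak_rps:,.0f} req/s")
```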

OUR STORY BEGINS

So…

2 years ago, Slotomania on Facebook

• 70K concurrent users
• A few million registered users
• One MySQL node

• Hit a glass ceiling. Needed to scale “yesterday”

So what now?

• Had Sharding code branch that had to be dropped

• Within 3 weeks!!! MySQL only => MySQL + Couchbase

• Meantime – 3 ring MySQL circus

Why Couchbase

• Open source
• Backed by a serious company
• Indications all around that the tech is being successfully used with some big-time players
• Almost no comparison work. Online only!

OUR SOLUTION

introducing

Keep MySQL – Why?

• Fear is the mind killer
• Site to site replication
• Can still “SQL Query” the data
• Easy as pie…
• BUT, not everyone can keep everything on one node.

What do we keep in Couchbase?

• Everything key value
• Anything we need to access fast and frequently
  – User Score
  – User Save
  – User Messages
• Only things where true ACIDness is not 100% mandatory or can be faked.

Keys and Values

• Be creative with keys and objects
• Most common: user_id
• E.g. user_id & session_id
• E.g. user_id & game_type
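A minimal sketch of what composite keys like these might look like. The `make_key` helper, the `:` separator and the `score`/`session`/`save` prefixes are illustrative assumptions, not the naming scheme used in production.

```python
def make_key(*parts: object, sep: str = ":") -> str:
    """Join the parts into a single string key, e.g. make_key("save", 1001, "slots")."""
    return sep.join(str(p) for p in parts)

user_id, session_id, game_type = 1001, "a1b2", "slots"

score_key = make_key("score", user_id)                  # user_id only
session_key = make_key("session", user_id, session_id)  # user_id & session_id
save_key = make_key("save", user_id, game_type)         # user_id & game_type

print(score_key, session_key, save_key)
```

Putting the object type first keeps different kinds of documents for the same user from colliding in one bucket.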

“Schema” – SQL vs. Document based

• Sometimes you need to turn the data on its head

• 2 Table join in SQL = multi get in Couchbase
• Excessive ‘multigetting’ = network saturation
• Our Lesson: Don’t migrate as is. Refactor/reimagine the schema for a document-oriented database
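To make the join vs. multi-get point concrete, here is a rough sketch that uses a plain dict as a stand-in for a bucket. The document shapes, key names and the `multi_get` helper are assumptions for illustration only; a real client would do the multi-get over the network, which is where the saturation comes from.

```python
from typing import Dict, Iterable

store: Dict[str, dict] = {}  # in-memory stand-in for a key-value bucket

def multi_get(keys: Iterable[str]) -> Dict[str, dict]:
    """Fetch several keys in one call; missing keys are skipped."""
    return {k: store[k] for k in keys if k in store}

# "As is" migration: one document per former SQL row, linked by ids.
store["user:1001"] = {"name": "dana", "message_ids": [1, 2]}
store["message:1"] = {"text": "Welcome!"}
store["message:2"] = {"text": "Daily bonus"}

user = store["user:1001"]
rows = multi_get(f"message:{m}" for m in user["message_ids"])
keys_touched_as_is = 1 + len(rows)  # grows with the number of messages

# Reimagined schema: the messages live inside one per-user document.
store["messages:1001"] = {"messages": [{"text": "Welcome!"}, {"text": "Daily bonus"}]}
inbox = store["messages:1001"]["messages"]
keys_touched_embedded = 1  # constant, however many messages there are

print(keys_touched_as_is, keys_touched_embedded)
```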

Serialization

• Right now, some JSON some Binary
• Intend to move most if not all to JSON – 2.0!
  – Browse keys without having to have entire context
  – Easier to support object versioning
• Consider other serialization methods (thrift, protobuf)
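One way JSON can make object versioning easier is to carry a version field inside each document and upgrade old documents on read. The sketch below assumes a hypothetical `v` field and a made-up v1 to v2 change (renaming `coins` to `balance`); neither comes from the talk.

```python
import json

CURRENT_VERSION = 2

def dumps_save(save: dict) -> str:
    """Serialize a user save, stamping it with the current schema version."""
    return json.dumps({"v": CURRENT_VERSION, **save}, sort_keys=True)

def loads_save(raw: str) -> dict:
    """Parse a user save and upgrade older versions in memory."""
    doc = json.loads(raw)
    if doc.get("v", 1) == 1:
        # Hypothetical v1 -> v2 migration: "coins" was renamed to "balance".
        doc["balance"] = doc.pop("coins", 0)
        doc["v"] = 2
    return doc

old_raw = '{"coins": 500, "level": 3}'  # written before versioning existed
upgraded = loads_save(old_raw)
round_trip = loads_save(dumps_save({"balance": 10, "level": 1}))

print(upgraded)
print(round_trip)
```

Because the payload is readable JSON, a stored value can also be inspected on its own, without the code that wrote it.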

Data Services or Direct Access?

• Encapsulate data access?
• Direct Access => High Performance
• Data layer Encapsulation => mitigate complex SOA-ish environments

Data Services vs. Direct Access

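[Diagram, Direct Access: Service A and Service B each embed a DAL Lib and talk to Couchbase directly.]

[Diagram, Data Services: Service A and Service B each embed an RPC Lib and call Data Services, which talks to Couchbase.]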
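The two layouts can be sketched in code as follows. This is only an illustration under assumptions: every class name is invented, a dict stands in for the Couchbase bucket, and an in-process call stands in for the RPC hop.

```python
from typing import Dict

class DirectAccess:
    """DAL library linked into each service; it reads the bucket itself."""
    def __init__(self, bucket: Dict[str, int]) -> None:
        self._bucket = bucket

    def get_score(self, user_id: int) -> int:
        return self._bucket.get(f"score:{user_id}", 0)

class DataService:
    """Central service that owns the bucket; the key layout lives only here."""
    def __init__(self, bucket: Dict[str, int]) -> None:
        self._dal = DirectAccess(bucket)

    def handle(self, request: dict) -> dict:
        return {"score": self._dal.get_score(request["user_id"])}

class RpcClient:
    """RPC library linked into each service; adds one extra hop per call."""
    def __init__(self, service: DataService) -> None:
        self._service = service

    def get_score(self, user_id: int) -> int:
        # In a real deployment this would be a network call.
        return self._service.handle({"user_id": user_id})["score"]

bucket = {"score:1001": 4200}
direct_score = DirectAccess(bucket).get_score(1001)           # service -> bucket
rpc_score = RpcClient(DataService(bucket)).get_score(1001)    # service -> data service -> bucket

print(direct_score, rpc_score)
```

Both paths return the same data; the trade-off is one fewer hop with direct access versus a single place that knows the storage details with data services.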

Locks

• Pessimistic vs. Optimistic
• We use 2 different lock types in different cases:
  – CAS + spin X times (and fail)
  – Lock object with timeout (actually can be done better than we do it!)
• Avoid contention if possible (especially cross-process) by old-school tricks
  – Queue stuff
  – IP based stickiness (you lose nothing by this)
  – Don’t be a nerd! Cut corners where possible.
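The "CAS + spin X times (and fail)" pattern can be sketched as below. `InMemoryBucket`, `CasMismatch` and `update_with_cas` are stand-ins written for this example, not the Couchbase client API; the point is only the shape of the retry loop. The lock-object-with-timeout variant is not shown.

```python
from typing import Callable, Dict, Tuple

class CasMismatch(Exception):
    """Raised when the stored value changed since it was read."""

class InMemoryBucket:
    """Toy key-value store that keeps a CAS counter next to each value."""
    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, dict]] = {}

    def set(self, key: str, value: dict) -> None:
        cas = self._data.get(key, (0, {}))[0] + 1
        self._data[key] = (cas, dict(value))

    def gets(self, key: str) -> Tuple[dict, int]:
        cas, value = self._data[key]
        return dict(value), cas

    def cas(self, key: str, value: dict, cas: int) -> None:
        current_cas, _ = self._data[key]
        if current_cas != cas:
            raise CasMismatch(key)
        self._data[key] = (current_cas + 1, dict(value))

def update_with_cas(bucket: InMemoryBucket, key: str,
                    modify: Callable[[dict], dict], max_spins: int = 5) -> dict:
    """Optimistic read-modify-write: retry up to max_spins times, then fail."""
    for _ in range(max_spins):
        value, cas = bucket.gets(key)
        new_value = modify(value)
        try:
            bucket.cas(key, new_value, cas)
            return new_value
        except CasMismatch:
            continue  # someone else wrote first; re-read and try again
    raise RuntimeError(f"gave up on {key!r} after {max_spins} attempts")

bucket = InMemoryBucket()
bucket.set("save:1001", {"coins": 100})
result = update_with_cas(bucket, "save:1001", lambda s: {**s, "coins": s["coins"] + 50})
print(result)
```

Failing after a bounded number of spins keeps a hot key from stalling a request indefinitely; the caller decides whether to surface the error or queue the update.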

THE OPS ANGLE

Barzelim

Our cluster(s)

• 12 Couchbase buckets + 4 memcache buckets on two clusters

• The 2 “heavy” buckets are on their own cluster

• 40K ops/s on Couchbase, 20K ops/s on memcache

• 30-35% of Couchbase ops are writes
• We still live in a pre-2.0 world (1.8.1)

Ops/s (off-peak)

• “Small cluster” (actually with a bug on prod)

• “Big cluster”

Our hardware

• Couchbase boxes are physical
• Latest “spec” – Dell R710 with
  – 2 CPUs (12 cores) – overkill!!
  – 6 SAS HDDs
  – 64 Gigs RAM
• Running CentOS 5.8

What’s Next?

QUESTIONS?

Thanks for listening