2012-11-30-scalable game servers

Scalable game servers, Knut Nesheim @knutin, Tuesday 4 December 2012

Page 1: 2012-11-30-scalable game servers

Scalable game servers

Knut Nesheim @knutin

Page 2: 2012-11-30-scalable game servers

Page 3: 2012-11-30-scalable game servers

What we do

• Casual social games

• Casual: simpler, easier, nicer, cuter

• Social: interact with your friends

• Average title: 1 000 000 players every day

Page 4: 2012-11-30-scalable game servers

Purpose of backend

• Play anywhere, HTTP

• Cheat prevention

• Ensure state integrity

• Highscores

• Multiplayer

• Payments

Page 5: 2012-11-30-scalable game servers

The first backend

Page 6: 2012-11-30-scalable game servers

The first backend: What

• Stateless: state only in DB, mutate in app

• Linux, on EC2

• Apache

• MySQL

• Ruby on Rails

Page 7: 2012-11-30-scalable game servers

The first backend: Challenges

• MySQL becomes a bottleneck

• Sharding necessary

• Working data set must fit in memory

• Slow restores

Page 8: 2012-11-30-scalable game servers

The first backend: Challenges

[Diagram: app and db, with 10+ DB requests per HTTP request]

Page 9: 2012-11-30-scalable game servers

The first backend: Challenges

• Very sensitive to network latency, synchronous IO

• 16 Ruby processes per node × 125 nodes = 2,000 concurrent requests

• 10,000 HTTP requests/s ÷ 2,000 workers = 5 requests per worker per second

• Each request must be served in 1,000 ms ÷ 5 = 200 ms
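
The arithmetic on this slide can be reproduced with a short calculation. This is only a check of the figures quoted above; the variable names are illustrative and not from the talk.

```python
# Capacity budget for the first backend, using the figures from the slide.
processes_per_node = 16      # Ruby processes per node
nodes = 125
http_rps = 10_000            # HTTP requests per second across the cluster

workers = processes_per_node * nodes      # requests the cluster can hold at once
reqs_per_worker = http_rps / workers      # requests each worker must finish per second
budget_ms = 1000 / reqs_per_worker        # time available per request, in milliseconds

print(workers, reqs_per_worker, budget_ms)
```

With synchronous IO a worker is blocked for the whole request, so every extra millisecond of database or network latency comes straight out of that budget.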

Page 10: 2012-11-30-scalable game servers

The first backend: Solutions

Challenge               Solution
MySQL bottleneck        Shard, tune, use Redis
Sensitive to latency    Redis, more app servers
~200 nodes              Chef, Scalarium

Page 11: 2012-11-30-scalable game servers

The first backend: Conclusions

• Pain to operate massive DB cluster

• Must read & write on master

• Many single points of failure

• Inefficient

• All data in memory

Page 12: 2012-11-30-scalable game servers

The second backend

Page 13: 2012-11-30-scalable game servers

The second backend: What

• Evolution

• Dedicated hardware and network

• Ruby on Rails

• Redis only

• Fewer and faster DB reqs

Page 14: 2012-11-30-scalable game servers

The second backend: Conclusions

• Good progress

• Still sensitive to latency

• Single points of failure

• Can we do better?

Page 15: 2012-11-30-scalable game servers

The third backend

Page 16: 2012-11-30-scalable game servers

The third backend: What

• Revolution

• Stateful: app stores state while user is online, flush to disk when user goes offline

• Only used data in memory

• Not sensitive to latency

• Use Database-as-a-service
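
A minimal sketch of the stateful idea, assuming a dict-like persistent store and a made-up "coins" game state. The `UserSession` class, its methods, and the store are hypothetical names chosen for illustration; the production system described in the talk is written in Erlang.

```python
import json


class UserSession:
    """Holds one user's game state in memory while the user is online."""

    def __init__(self, user_id, store):
        self.user_id = user_id
        self.store = store  # hypothetical dict-like persistent store
        # Load the state once, when the user comes online.
        self.state = json.loads(store.get(user_id, '{"coins": 0}'))

    def handle(self, action, amount):
        # Requests mutate in-memory state only; no database round trip.
        if action == "earn":
            self.state["coins"] += amount
        elif action == "spend":
            if self.state["coins"] < amount:
                raise ValueError("not enough coins")
            self.state["coins"] -= amount
        return self.state["coins"]

    def logout(self):
        # Flush once, when the user goes offline.
        self.store[self.user_id] = json.dumps(self.state)


store = {}  # stands in for the database
session = UserSession("alice", store)
session.handle("earn", 10)
session.handle("spend", 3)
session.logout()
print(store["alice"])
```

The store is touched once at login and once at logout, which is why request latency no longer depends on the database.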

Page 17: 2012-11-30-scalable game servers

The third backend: Challenges

• Erlang

• State

• Resource locking

• Load balancing

• Deployment

• Lack of libraries

Page 18: 2012-11-30-scalable game servers

The third backend: Challenges

State

• Data structures

• Functional language

• Immutability

• Transactions

Page 19: 2012-11-30-scalable game servers

The third backend: Challenges

Resource locking

• Only one process per user

• Central serialization

• Redis, CAS, SETNX

• Single point of failure
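
A sketch of how a SETNX-style lock can guarantee one process per user. `FakeRedis` is an in-memory stand-in that mimics only "set if not exists", get, and delete, so the example runs without a server; the key layout and the helper names `acquire_user` and `release_user` are assumptions, not the code from the talk.

```python
class FakeRedis:
    """In-memory stand-in for the few Redis operations used here."""

    def __init__(self):
        self.data = {}

    def setnx(self, key, value):
        # Set only if the key does not exist; report whether we set it.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

    def delete(self, key):
        self.data.pop(key, None)


def acquire_user(redis, user_id, node):
    """Try to become the single owner of this user's process."""
    return redis.setnx(f"lock:user:{user_id}", node)


def release_user(redis, user_id, node):
    """Release the lock, but only if this node actually holds it."""
    key = f"lock:user:{user_id}"
    # Against a real server this check-then-delete would need to be atomic.
    if redis.get(key) == node:
        redis.delete(key)
        return True
    return False


redis = FakeRedis()
print(acquire_user(redis, 42, "node-a"))  # first caller gets the lock
print(acquire_user(redis, 42, "node-b"))  # second caller is refused
```

All nodes go through the same lock server, which is the central serialization point and also the single point of failure the slide mentions.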

Page 20: 2012-11-30-scalable game servers

The third backend: Challenges

Load balancing

• Users play for 3-5 minutes

• “proxy” forwards requests

• “worker” notifies “proxy” about health
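
A rough sketch of the proxy and worker roles. It assumes sticky routing (a user keeps hitting the worker that holds their state) and a least-loaded choice for new users; both policies, and the `Proxy` class itself, are illustrative guesses rather than details given in the talk.

```python
class Proxy:
    """Forwards each user to a healthy worker, keeping users sticky."""

    def __init__(self):
        self.healthy = {}  # worker name -> number of users routed to it
        self.routes = {}   # user id -> worker name

    def report(self, worker, healthy, sessions=0):
        """Called by a worker to notify the proxy about its health."""
        if healthy:
            self.healthy[worker] = sessions
        else:
            self.healthy.pop(worker, None)

    def route(self, user_id):
        worker = self.routes.get(user_id)
        if worker not in self.healthy:
            # New user, or the previous worker went away: pick the least loaded.
            if not self.healthy:
                raise RuntimeError("no healthy workers")
            worker = min(self.healthy, key=self.healthy.get)
            self.routes[user_id] = worker
            self.healthy[worker] += 1
        return worker


proxy = Proxy()
proxy.report("worker-1", healthy=True)
proxy.report("worker-2", healthy=True)
print(proxy.route("alice"), proxy.route("bob"), proxy.route("alice"))
```

Because sessions last only a few minutes, load evens out quickly as users leave and new ones are placed.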

Page 21: 2012-11-30-scalable game servers

The third backend: Challenges

Deployment

• Use Erlang code reload

• “git pull && make && ./upgrade.sh”

• State migration on the fly

• Rolling upgrades
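
One way to picture "state migration on the fly" is a chain of small upgrade functions keyed by state version, applied to in-memory state when new code is loaded. The sketch below assumes a `version` field and invented field names; in Erlang/OTP this kind of step usually lives in a `code_change` callback, but the talk does not show the actual implementation.

```python
def v1_to_v2(state):
    # Hypothetical change: version 2 introduces a "gems" counter.
    return {**state, "version": 2, "gems": 0}


def v2_to_v3(state):
    # Hypothetical change: version 3 renames "coins" to "gold".
    rest = {k: v for k, v in state.items() if k != "coins"}
    return {**rest, "version": 3, "gold": state["coins"]}


MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}
CURRENT_VERSION = 3


def migrate(state):
    """Upgrade state step by step until it matches the running code."""
    while state["version"] < CURRENT_VERSION:
        state = MIGRATIONS[state["version"]](state)
    return state


print(migrate({"version": 1, "coins": 5}))
```

Stepwise migrations let a rolling upgrade meet state written by any older release without a dedicated conversion for every pair of versions.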

Page 22: 2012-11-30-scalable game servers

The third backend: Challenges

Lack of libraries

• No CPAN, pip, RubyGems

• Roll our own

• Eredis, github.com/wooga/eredis

Page 23: 2012-11-30-scalable game servers

The third backend: Conclusions

First backend        Second backend     Third backend
Frequent downtime    Little downtime    No downtime
150+ servers         10+ servers        3 servers
100k DB reqs         10k DB reqs        1k DB reqs
:(                   :|                 :)

Page 24: 2012-11-30-scalable game servers

The fourth backend

Page 25: 2012-11-30-scalable game servers

The fourth backend: What

• Evolution

• Multiplayer!

• Real-time push updates

• Improve operations

• Use hosted databases

• Remove SPOF

Page 26: 2012-11-30-scalable game servers

The fourth backend: Challenges

Multiplayer

• User and world as Erlang processes

• Serialization

• Reason about concurrency

• Scale
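
The "world as a process" idea can be approximated in Python with one thread that owns the world state and a mailbox that serializes every update. This is only an analogy to Erlang processes; the `World` class and its message format are invented for illustration.

```python
import queue
import threading


class World:
    """Owns the shared world state; all updates arrive through one mailbox."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.positions = {}
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        # Messages are handled one at a time, so no locks are needed.
        while True:
            msg = self.mailbox.get()
            if msg is None:
                break
            user, dx = msg
            self.positions[user] = self.positions.get(user, 0) + dx

    def send(self, user, dx):
        """Any number of user processes may call this concurrently."""
        self.mailbox.put((user, dx))

    def stop(self):
        self.mailbox.put(None)
        self.thread.join()


world = World()
world.send("alice", 1)
world.send("bob", 2)
world.send("alice", 3)
world.stop()
print(world.positions)
```

Since only the owning thread touches `positions`, reasoning about concurrency reduces to reasoning about message order.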

Page 27: 2012-11-30-scalable game servers

The fourth backend: Challenges

Real-time push updates

• Push updates to client

• Need browser support

• “Transfer-Encoding: chunked”

• Elli, github.com/knutin/elli
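
With "Transfer-Encoding: chunked" the server keeps the response open and writes each update as a chunk: the payload length in hexadecimal, CRLF, the payload, CRLF, with a zero-length chunk marking the end. The helper names below are illustrative and unrelated to Elli's API.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one update as an HTTP/1.1 chunk."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


# A zero-length chunk tells the client the stream is finished.
LAST_CHUNK = b"0\r\n\r\n"


def stream(updates):
    """Yield the bytes to write to the socket for a series of updates."""
    for update in updates:
        if update:  # an empty payload would look like the final chunk
            yield encode_chunk(update)
    yield LAST_CHUNK


print(b"".join(stream([b"hello", b"world!!"])))
```

The browser can process each chunk as it arrives, which is what makes this usable as a push channel over plain HTTP.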

Page 28: 2012-11-30-scalable game servers

The fourth backend: Challenges

Improve operations

• Better load balancing

• Nodes broadcast “alive” message

• Local failure detector

• Divergent cluster views
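
A minimal timeout-based failure detector, assuming each node records when it last heard an "alive" broadcast from every peer. The class name, the timeout value, and the explicit `now` argument are illustrative choices. The example also shows how two nodes can end up with divergent views when one of them misses a broadcast.

```python
class FailureDetector:
    """Local view of which nodes are alive, based on recent broadcasts."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of last "alive" message

    def alive(self, node, now):
        """Record an "alive" broadcast received from a node."""
        self.last_seen[node] = now

    def live_nodes(self, now):
        """Nodes heard from within the timeout window."""
        return {n for n, t in self.last_seen.items() if now - t <= self.timeout}


view_a = FailureDetector(timeout=5)
view_b = FailureDetector(timeout=5)

for view in (view_a, view_b):
    view.alive("node-1", now=0)

view_a.alive("node-1", now=4)  # view_b misses this broadcast

print(view_a.live_nodes(now=8), view_b.live_nodes(now=8))
```

Each node decides locally, so there is no central health registry to fail, at the cost of views that can temporarily disagree.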

Page 29: 2012-11-30-scalable game servers

The fourth backend: Challenges

Improve operations, cont.

• Collect more metrics

• ETS

• Histograms

• github.com/knutin/statman
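
A sketch of a histogram-based latency metric: values are counted into integer buckets, and percentiles are read back from the counts. A `Counter` stands in for an ETS table here; the class and method names are illustrative and do not reflect statman's API.

```python
from collections import Counter


class Histogram:
    """Counts observations per integer bucket and answers percentile queries."""

    def __init__(self):
        self.buckets = Counter()
        self.count = 0

    def record(self, value_ms):
        # Recording only increments a counter, so it stays cheap on hot paths.
        self.buckets[int(value_ms)] += 1
        self.count += 1

    def percentile(self, p):
        """Smallest bucket with at least p percent of observations at or below it."""
        if self.count == 0:
            raise ValueError("no observations recorded")
        seen = 0
        for bucket in sorted(self.buckets):
            seen += self.buckets[bucket]
            if seen * 100 >= p * self.count:
                return bucket


latency = Histogram()
for ms in range(1, 101):
    latency.record(ms)

print(latency.percentile(50), latency.percentile(99))
```

Storing counts rather than raw samples keeps memory bounded by the number of distinct buckets, whatever the request volume.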

Page 30: 2012-11-30-scalable game servers

The fourth backend: Challenges

Hosted databases

• Amazon AWS

• DynamoDB

• S3

• Low operational overhead

• Higher availability

Page 31: 2012-11-30-scalable game servers

The fourth backend: Challenges

Remove SPOF

• “locker”: distributed consistent leases

• Atomic CAS

• Write quorum, 2PC

• Eventually consistent replication

• 330 SLOC
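
To illustrate the write-quorum idea, the sketch below grants a lease only when a majority of replicas accept it and rolls back otherwise. It is a deliberately simplified model, not locker's protocol: the `Replica` class, the explicit `now` and `ttl` arguments, and the rollback step are all assumptions made for the example.

```python
class Replica:
    """One node's local table of leases."""

    def __init__(self):
        self.leases = {}  # key -> (owner, expires_at)
        self.up = True

    def try_lock(self, key, owner, now, ttl):
        if not self.up:
            return False
        holder = self.leases.get(key)
        # Refuse if someone else holds an unexpired lease.
        if holder and holder[1] > now and holder[0] != owner:
            return False
        self.leases[key] = (owner, now + ttl)
        return True

    def release(self, key, owner):
        if self.leases.get(key, (None,))[0] == owner:
            del self.leases[key]


def acquire_lease(replicas, key, owner, now, ttl):
    """Take the lease only if a majority of replicas grant it."""
    quorum = len(replicas) // 2 + 1
    granted = [r for r in replicas if r.try_lock(key, owner, now, ttl)]
    if len(granted) >= quorum:
        return True
    for r in granted:  # no majority: undo the partial writes
        r.release(key, owner)
    return False


replicas = [Replica() for _ in range(3)]
print(acquire_lease(replicas, "user:1", "node-a", now=0, ttl=10))
print(acquire_lease(replicas, "user:1", "node-b", now=5, ttl=10))
```

Because any two majorities overlap, two owners cannot both hold a valid lease, and losing a single replica does not block the service.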

Page 32: 2012-11-30-scalable game servers

The fourth backend: Conclusions

• Still in development

• Benchmarks show good results

• Must prove it live

Page 33: 2012-11-30-scalable game servers

The future backends

Page 34: 2012-11-30-scalable game servers

The future backends

• Erlang: processes, good IO, clustering

• Harder, better games

• Faster, stronger backends

• Stateful, also with JVM

• Another revolution?

Page 35: 2012-11-30-scalable game servers

Conclusions

• Understand the past

• The future is stateful

• Identify, think, do

Page 36: 2012-11-30-scalable game servers

http://wooga.com/jobs/students/
