(GAM201) Cloud Gaming Architectures from Mobile to Social to MMO

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Mark Bate, Solutions Architecture, AWS

Jaeman An, Software Engineer, Devsisters Corp.

GAM201 Cloud Gaming Architectures: From Social to Mobile to MMO

Gratuitous logo slide

Traditional: Rigid vs. AWS: Elastic

[Chart: servers over time, capacity vs. demand. Excess capacity: wasted $$. Unmet demand: upset players, missed revenue :(]

Scale to what you need, pay for what you use
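
The rigid-versus-elastic chart can be put into numbers with a small sketch. The demand figures and the fixed capacity of 70 servers below are made up for illustration and do not come from the talk; `fixed_capacity_gap` is a hypothetical helper name.

```python
def fixed_capacity_gap(demand, capacity):
    """Return (wasted, unmet) server-hours for a fixed fleet size.

    wasted: capacity that sat idle while demand was below the fleet size.
    unmet:  demand that could not be served while it was above the fleet size.
    """
    wasted = sum(max(capacity - d, 0) for d in demand)
    unmet = sum(max(d - capacity, 0) for d in demand)
    return wasted, unmet


# Hypothetical hourly demand, in servers needed (not from the talk).
hourly_demand = [20, 35, 80, 120, 60, 30]

wasted, unmet = fixed_capacity_gap(hourly_demand, capacity=70)
print(f"wasted server-hours: {wasted}, unmet server-hours: {unmet}")
```

With a fleet that tracks demand, both numbers shrink toward zero, which is the point of the "AWS: Elastic" half of the chart.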

11 regions

53 edge locations

Continuous expansion

Global is good

Common game back-end concepts

Think in terms of APIs

HTTP + JSON

Get friends, leaderboard

Binary asset data

Multiplayer servers

High availability

Scalability

Core (HA) game back end

ELB

• Choose region

• >=2 Availability Zones

• Amazon EC2 for app

• Elastic Load Balancing

• Amazon RDS database

• Multi-AZ

Region

Scale it way out

• Amazon S3 for game data

• Assets

• UGC

• Analytics

• ... with CloudFront!

• Auto Scaling group

• Capacity on demand

• Respond to users

• Automatic healing

• Amazon ElastiCache

• Memcached

• Redis

[Diagram labels: ELB, Region, CloudFront CDN]

Writing is painful

• Games are write heavy

• Caching of limited use

• Key value

• Binary structures

• Database = bottleneck

Sharding (not fun)

Amazon DynamoDB

• Fully managed

• NoSQL data store

• Provisioned throughput

• Secondary indexes

• PUT/GET keys

• Document support!

Example: Leaderboard in DynamoDB

• Hash key = Primary key

• Range key = Sub key

• Range key = Sort key

• Other attributes are undefined

• So… how do we sort by top score?

UserID (hash key) | BoardName (range key) | TopScore | TopScoreDate
"101"             | "Galaxy Invaders"     | 5842     | "2014-09-15T17:24:31"
"101"             | "Meteor Blasters"     | 1000     | "2014-10-22T23:18:01"
"101"             | "Starship X"          | 24       | "2014-08-31T13:14:21"
"102"             | "Alien Adventure"     | 192      | "2014-07-12T11:07:56"
"102"             | "Galaxy Invaders"     | 0        | "2014-09-18T07:33:42"
"103"             | "Attack Ships"        | 3        | "2014-10-19T01:13:24"
"103"             | "Galaxy Invaders"     | 2317     | "2014-09-11T06:53:00"
"103"             | "Meteor Blasters"     | 723      | "2014-10-19T01:14:24"
"103"             | "Starship X"          | 42       | "2014-07-11T06:53:03"

Leaderboard with secondary indexes

• Create a secondary index!

• Set hash key to BoardName

• Set range key to TopScore

• Project extra attributes as needed

• Can now query by BoardName, sorted by TopScore

• Handles many common gaming use cases
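
What such an index does can be mimicked in a few lines of plain Python. This is a conceptual sketch only: it does not call DynamoDB, and `build_index` and `top_scores` are hypothetical helper names. The items are the ones from the leaderboard table on the slide.

```python
from collections import defaultdict

# Items from the base table: UserID is the hash key, BoardName the range key.
ITEMS = [
    {"UserID": "101", "BoardName": "Galaxy Invaders", "TopScore": 5842},
    {"UserID": "101", "BoardName": "Meteor Blasters", "TopScore": 1000},
    {"UserID": "101", "BoardName": "Starship X", "TopScore": 24},
    {"UserID": "102", "BoardName": "Alien Adventure", "TopScore": 192},
    {"UserID": "102", "BoardName": "Galaxy Invaders", "TopScore": 0},
    {"UserID": "103", "BoardName": "Attack Ships", "TopScore": 3},
    {"UserID": "103", "BoardName": "Galaxy Invaders", "TopScore": 2317},
    {"UserID": "103", "BoardName": "Meteor Blasters", "TopScore": 723},
    {"UserID": "103", "BoardName": "Starship X", "TopScore": 42},
]


def build_index(items, hash_key, range_key):
    """Group items by hash_key and keep each group sorted by range_key."""
    index = defaultdict(list)
    for item in items:
        index[item[hash_key]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[range_key])
    return index


def top_scores(index, board_name, limit=10):
    """Return up to `limit` items for one board, highest score first."""
    return list(reversed(index.get(board_name, [])))[:limit]


# Re-key the same data: BoardName as hash key, TopScore as range key.
board_index = build_index(ITEMS, hash_key="BoardName", range_key="TopScore")
leaders = top_scores(board_index, "Galaxy Invaders", limit=2)
print([(item["UserID"], item["TopScore"]) for item in leaders])
```

In the real service the index is maintained for you as the base table changes; the sketch only shows why re-keying the data makes "top N per board" a cheap, already-sorted lookup.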

BoardName (hash key) | TopScore (range key) | UserID
"Alien Adventure"    | 192                  | "102"
"Attack Ships"       | 3                    | "103"
"Galaxy Invaders"    | 0                    | "102"
"Galaxy Invaders"    | 2317                 | "103"
"Galaxy Invaders"    | 5842                 | "101"
"Meteor Blasters"    | 723                  | "103"
"Meteor Blasters"    | 1000                 | "101"
"Starship X"         | 24                   | "101"
"Starship X"         | 42                   | "103"

UserID (hash key) | BoardName (range key) | TopScore | TopScoreDate
"101"             | "Galaxy Invaders"     | 5842     | "2014-09-15T17:24:31"

Documents in DynamoDB

Scalar types: String, Number, Binary, Boolean, Null

Multivalue types: String Set, Number Set, Binary Set

Document types: List, Map

Document content addressing

    {
      "name": "Mark",
      "games": ["Megablast", "Spacerace"],
      "score": {"Megablast": 123, "Spacerace": 41}
    }

    {
      "name": {"S": "Mark"},
      "games": {"L": [{"S": "Megablast"}, {"S": "Spacerace"}]},
      "score": {"M": {"Megablast": {"N": "123"}, "Spacerace": {"N": "41"}}}
    }

    document.score.Megablast
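
The mapping between the plain document and its typed form, and the lookup of a nested path such as `score.Megablast`, can be sketched in plain Python. The helpers `to_typed` and `resolve` are hypothetical names for illustration, not part of any AWS SDK, and the sketch covers only the scalar and document types, not the set types.

```python
def to_typed(value):
    """Convert a plain Python value to DynamoDB-style typed JSON."""
    if isinstance(value, bool):  # check before numbers: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are carried as strings
    if isinstance(value, list):
        return {"L": [to_typed(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_typed(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def resolve(typed_item, path):
    """Follow a dotted path through nested Map attributes of a typed item."""
    node = {"M": typed_item}
    for key in path.split("."):
        node = node["M"][key]
    return node


document = {
    "name": "Mark",
    "games": ["Megablast", "Spacerace"],
    "score": {"Megablast": 123, "Spacerace": 41},
}

typed = {key: to_typed(value) for key, value in document.items()}
print(resolve(typed, "score.Megablast"))
```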

Related sessions

DAT204 NoSQL? No Worries: Building Scalable Applications on AWS NoSQL Services

DAT401 Amazon DynamoDB Deep Dive: Schema Design, Indexing, JSON, Search, and More

GAM401 Serverless Mobile Game Development with Amazon Cognito, AWS Lambda, and Amazon DynamoDB

Jaeman An (Devsisters Corp.)

What to expect from the session

How we started

How we improved our design

Tips and tricks

Retrospect

Cookie Run

Cookie Run video

About Cookie Run

• 70M+ downloads

• 10M DAU

• Top free 1st in 10 countries

• Top free 10th in 38 countries

More about Devsisters and Cookie Run

How We Started

In early 2013…

Lack of infrastructure, lack of developers, no hope

(1 server developer / 0 system engineers)

Only 1 game in service

Ovenbreak 2

- AWS US East

Cookie Run

- Only 1 person, 1 month left

Goal

Highly reliable

Quality assured

Scalable design

Auto configuring and scaling

Real-time monitoring system

Log system

First design

Game server: Java, Spring MVC, MySQL 5.5

Operation tool: Python, Django, Boto

Monitoring: Amazon CloudWatch, Zabbix, Statsd, Graphite

First design

After 11 days

Design Improvements

Design improvements

Improving the logging system

Improving the game patch system

Adding global user ranking system

Redesigning the back end

Redesigning the back end

Situation

Players send game hearts to each other, and the back end does the bookkeeping

- Plan A: Store the data in MySQL

Trouble: MySQL couldn’t keep up; too many rows (100M+)

- Plan B: Give users unlimited hearts and disable the feature

Trouble: Not so bad, but we needed to come up with a better solution

Solution

MySQL → NoSQL (Couchbase)

Use MySQL for game data (shop data, stage data, …)

Use NoSQL for user data (user items, level, coin, …)

Before

After

Improving the logging system

Situation

We need real-time log querying capability

Solution

Real-time log viewing system based on ELK

Before

After

Real-time log viewing system

Improving the game patch system

Situation

App Store binary size limit

Some resources need to be downloaded on demand

Wanted to distribute patches without an App Store update

Solution

Constructed a decent patch system

Based on Amazon S3 and Amazon CloudFront
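
The slides do not describe how the patch system decides what to download. One common approach, shown here purely as an assumption, is for the client to compare a local manifest of file digests with a manifest published next to the assets, and fetch only the files that differ. The names `file_digest` and `files_to_download` and the manifest layout are hypothetical.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Content digest used to tell whether a file has changed."""
    return hashlib.sha256(data).hexdigest()


def files_to_download(local_manifest, remote_manifest):
    """Return the paths whose remote digest is missing or different locally."""
    return sorted(
        path
        for path, digest in remote_manifest.items()
        if local_manifest.get(path) != digest
    )


# Hypothetical manifests: path -> digest of the file contents.
local = {
    "stage1.dat": file_digest(b"stage 1 v1"),
    "shop.json": file_digest(b"shop v1"),
}
remote = {
    "stage1.dat": file_digest(b"stage 1 v1"),   # unchanged
    "shop.json": file_digest(b"shop v2"),       # updated
    "stage2.dat": file_digest(b"stage 2 v1"),   # new
}

print(files_to_download(local, remote))
```

Because only changed files are fetched, a patch can ship without a new store binary, and the assets themselves can be served from a CDN.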

Before

After

Improving the logging system

Situation

Total log size >10 TB; want to analyze all logs

Situation forced us to look for big data solutions

Solution

Adopted big data platforms using Amazon EMR or Amazon EC2

Eventually migrated to Spark and Spark SQL

Before

After

Spark

Log dashboard

Adding global user ranking system

Situation

Want to introduce a global user ranking system

Solution

Use an ordered set, based on a skip list, with ElastiCache

…with custom caching and a lot of optimization techniques
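
The idea of ranking with an ordered set can be sketched in plain Python. This is not the team's implementation: it swaps the skip list for a sorted list with binary search (so inserts are O(n) instead of O(log n)) and leaves out the custom caching. The class name `Ranking` and its methods are hypothetical.

```python
import bisect


class Ranking:
    """Keeps each user's best score and answers rank queries."""

    def __init__(self):
        self._best = {}     # user -> best score so far
        self._ordered = []  # sorted list of (-score, user), so the best comes first

    def submit(self, user, score):
        """Record a score; only an improvement changes the ranking."""
        old = self._best.get(user)
        if old is not None:
            if score <= old:
                return
            # Remove the user's previous entry before inserting the new one.
            idx = bisect.bisect_left(self._ordered, (-old, user))
            del self._ordered[idx]
        self._best[user] = score
        bisect.insort(self._ordered, (-score, user))

    def rank(self, user):
        """Return the user's 1-based rank (1 is the highest score)."""
        score = self._best[user]
        return bisect.bisect_left(self._ordered, (-score, user)) + 1

    def top(self, n):
        """Return the n best (user, score) pairs."""
        return [(user, -neg) for neg, user in self._ordered[:n]]


ranking = Ranking()
ranking.submit("anna", 100)
ranking.submit("bob", 300)
ranking.submit("joe", 200)
ranking.submit("anna", 400)  # anna improves and moves to the top
print(ranking.rank("anna"), ranking.top(2))
```

A Redis sorted set, which ElastiCache for Redis exposes, offers the same operations (add or update a score, get a member's rank, read a range) backed by a skip list.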

Before

After

Tips and Tricks

WARNING

THIS CAN HAPPEN TO YOU

BASED ON THE TRUE STORY OF OUR TEAM

Auto Scaling gotchas

Frequency: More than 10 times during 2 years

Many users connect to the game simultaneously

• During holiday seasons

• Start of in-game events

• When bulk push notifications are sent

• Or reasons unknown

Booting instances takes several minutes, which isn’t quick enough to handle spiky loads

We have to predict traffic surges and prepare beforehand

Our bulk push system

Auto Scaling gotchas

Don’t set the minimum instance count to 1 or 2

If one machine dies, service fails

Use multiple Availability Zones

Sometimes a single AZ can run out of instance capacity

Use multiple AZs with ELB cross-zone load balancing

Auto Scaling gotchas

Set scale-out (and scale-in) policies meticulously

scale-out: +4 when Latency >= 0.1 for 2 minutes

scale-in: -2 when CPUUtilization < 10 for 2 minutes

Sometimes scale-up can be a useful option
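
The two policies above can be written down as a small decision function. It is a simulation sketch, not an Auto Scaling API call: it assumes one metric sample per minute, and the `minimum` and `maximum` bounds are invented for the example (the slides only say not to use a minimum of 1 or 2). `desired_capacity` is a hypothetical name.

```python
def desired_capacity(samples, current, minimum=4, maximum=40):
    """Apply the scale-out / scale-in rules to the last two minutes of metrics.

    samples: list of (latency, cpu_utilization) tuples, one per minute, oldest first.
    current: current number of instances in the group.
    minimum, maximum: assumed group bounds (not from the talk).
    """
    last_two = samples[-2:]
    if len(last_two) < 2:
        return current  # not enough data to evaluate a 2-minute condition

    # scale-out: +4 when Latency >= 0.1 for 2 minutes
    if all(latency >= 0.1 for latency, _ in last_two):
        return min(current + 4, maximum)

    # scale-in: -2 when CPUUtilization < 10 for 2 minutes
    if all(cpu < 10 for _, cpu in last_two):
        return max(current - 2, minimum)

    return current


print(desired_capacity([(0.12, 50), (0.15, 60)], current=6))  # sustained high latency
print(desired_capacity([(0.02, 5), (0.03, 4)], current=6))    # sustained low CPU
```

Scaling out in larger steps than scaling in, as the policy does, reflects the earlier point that new instances take minutes to boot: adding too few is costlier than removing too slowly.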

Chef server failure

Auto Scaling group relying on Chef server is dangerous

Chef server is a single point of failure (SPOF)

May become unresponsive when too many servers start simultaneously

Errors happen in unexpected places!

Couchbase storage failure

Hardware problems can occur in EC2 instances

The worst, the most hopeless system failure

Front end API server can crash; that’s OK

But if you are maintaining a database on EC2, this can be a tragedy

It really happens

Couchbase storage failure

June 2015

A monumental hell gate in our company history

Server down for 12 consecutive hours because of a disk error in Couchbase

Also, our daily backup script had not worked for 1 week prior to the shutdown

Some data were restored via replication

The rest was restored by adding the lost week’s logs to the previously backed-up data

Lesson learned: Replicas are necessary. Verify your backups.
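
A backup script that fails silently for a week is the kind of problem a small freshness check can catch. The sketch below is one possible check, not something described in the talk: `backup_is_fresh` is a hypothetical helper, and the 24-hour threshold is an assumption based on the backups being daily.

```python
from datetime import datetime, timedelta


def backup_is_fresh(backup_times, now, max_age=timedelta(hours=24)):
    """Return True if the newest backup is no older than max_age."""
    if not backup_times:
        return False  # no backups at all is the worst case
    return now - max(backup_times) <= max_age


now = datetime(2015, 6, 10, 12, 0)
healthy = [datetime(2015, 6, 9, 3, 0), datetime(2015, 6, 10, 3, 0)]
stale = [datetime(2015, 6, 2, 3, 0), datetime(2015, 6, 3, 3, 0)]

print(backup_is_fresh(healthy, now))  # newest backup is 9 hours old
print(backup_is_fresh(stale, now))    # newest backup is a week old
```

A check like this only proves that a backup file exists and is recent; restoring it periodically is the stronger confirmation.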

Overseas network failure

Frequency: More than 5 times over 2 years

This situation has really happened

ISPs cut costs leading to overseas packet loss

Just Call AWS

Final Design Review

First design

Final

Future Plans

Future plans

Transactional log system (Logstash → Kafka)

High latency / packet loss networks : QUIC

Entertain the world!

Thank you!

Amazon Cognito

[Diagram labels: Identity providers; Your own auth; Amazon Cognito; Unique identities (Joe, Anna, Bob); Any device; Any platform; Any AWS service (Mobile Analytics, S3, DynamoDB, Kinesis)]

Helps implement security best practices

Securely access any AWS service from a mobile device; it simplifies the interaction with AWS Identity and Access Management

Support multiple login providers

Easily integrate with major login providers for authentication, or use your own authentication system

Unique users vs. devices

Manage unique identities; automatically recognize a unique user across devices and platforms

Synchronize data across devices with Amazon Cognito

Sync game state across OS, devices

State transition (link multiple accounts)

Sync user profiles across OS, devices, web

Related sessions

GAM401 Serverless Mobile Game Development with Amazon Cognito, AWS Lambda, and Amazon DynamoDB

MBL402 Mobile Identity Management and Data Synchronization Using Amazon Cognito

WRK202 Build a Scalable Mobile App on Serverless, Event-Triggered, Back-End Logic

Player Two

Press Start

Multiplayer game servers

Region

• API back-end app

• Core session

• Matchmaking

• S3 + CloudFront

• DLC, assets

• Game saves

• UGC

• Public server tier

• Direct client socket

• Scale on players

Multiplayer game servers

① Login via API

② Request matchmaking

③ Get game server IP

④ Connect to server

⑤ Pull down assets

⑥ Other players join
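
Steps ② and ③ can be sketched as a tiny in-memory matchmaker that hands out the address of a game server with a free slot. Everything here is illustrative: the `Matchmaker` class, the first-fit placement rule, and the IP addresses (taken from a documentation-only range) are assumptions, not details from the talk.

```python
class Matchmaker:
    """Assigns players to game servers that still have free slots."""

    def __init__(self, server_ips, slots_per_server):
        # server IP -> number of free player slots
        self._free = {ip: slots_per_server for ip in server_ips}

    def request_match(self, player_id):
        """Return the IP of the first server with a free slot and reserve it."""
        for ip, free in self._free.items():
            if free > 0:
                self._free[ip] = free - 1
                return ip
        # In a real deployment this is the signal to launch more game servers.
        raise RuntimeError(f"no free slot for player {player_id}")


matchmaker = Matchmaker(["203.0.113.10", "203.0.113.11"], slots_per_server=2)
assignments = [matchmaker.request_match(p) for p in ("anna", "bob", "joe")]
print(assignments)
```

The client then opens a direct socket to the returned address (step ④), which is why the game server tier sits in a public subnet and scales on player count rather than on request rate.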

Region

Multiregion game servers

[Diagram labels: Region A, Region B]

Related sessions

GAM403 From 0 to 60 Million Player Hours in 400 Billion Star Systems

GAM404 Evolve: Hunting Monsters in a Low Latency Multiplayer Game on Amazon EC2

GAM407 Quiplash: The Multiscreen, Multidevice, Multiplayer Game for 10,000

Wrap it up already

Use Auto Scaling to save money

Amazon CloudFront + Amazon S3 for download and upload

Painful DIYDB? No! Use Amazon DynamoDB

Dynamically manage game servers using the APIs

• Even multiregion!

Remember to complete your evaluations!