Sharding Architectures

38
Sharding Architectures International PHP Conference 2008

description

 

Transcript of Sharding Architectures

Page 1: Sharding Architectures

Sharding ArchitecturesInternational PHP Conference 2008

Page 2: Sharding Architectures

Who the f*** is talking?

David Soria Parra <[email protected]>

PHP since 7 years

Software Engineer at Sun Microsystems

Sun Microsystems Web Stack

PHP Source Committer

Various other OSS Projects (Mercurial, etc)

Page 3: Sharding Architectures

Agenda

Problem

Possible Solutions ...and what not

What is sharding?

Implementation

Optimization

Migration

Page 4: Sharding Architectures

Problem

Scalability

What kind of scalability?

High read and write traffic

Huge amount of data

Page 5: Sharding Architectures

Solutions

Page 6: Sharding Architectures

Master/Slave replication

SlaveMaster

Slave

Page 7: Sharding Architectures

Master/Master replication

SlaveMaster

SlaveMaster Master

Page 8: Sharding Architectures

Master/Master Master/Slave

Master / Master

unstable / no official support

replication not distribution

Master / Slave

no write optimization

replication not distribution

Page 9: Sharding Architectures

Sharding

Shard

Shard

Shard

Aggregation

Database

Page 10: Sharding Architectures

What is sharding?

Application side implementation

Application side aggregation

Splitting data across databases

Parallelization

Page 11: Sharding Architectures

Sharding is not...

sharding is...

NOT replication / backup

NOT clustering

NOT just spreading

Page 12: Sharding Architectures

Implementation

Page 13: Sharding Architectures

shard 0

PHPapplication

shard 1

shard 2

Page 14: Sharding Architectures

shard 0

PHPapplication

user ID: 9

shard 1

shard 2

Page 15: Sharding Architectures

shard 0

PHPapplication

user ID: 9

shard 1

shard 2

9 % 3 = 0

Page 16: Sharding Architectures

shard 0

PHPapplication

user ID: 9

shard 1

shard 2

9 % 3 = 0

Page 17: Sharding Architectures

shard 0

PHPapplication

user ID: 9

shard 1

shard 2

9 % 3 = 0

Page 18: Sharding Architectures

Shard

Tables with a lot of writes

Tables with a lot of reads

Related data on one shard

Page 19: Sharding Architectures

Splitting: Tables

DatabasePosts

FeedsUser

Page 20: Sharding Architectures

Splitting: Tables

Shard1Posts

User

Shard2

Feeds

Page 21: Sharding Architectures

Splitting: Data

DatabaseThread

ConfigUser

Post

Page 22: Sharding Architectures

Splitting: Data

Shard1 Shard2

even user id odd user id

Thread

ConfigUser

PostThread

ConfigUser

Post

Page 23: Sharding Architectures

Data distribution

Page 24: Sharding Architectures

Criteria

Fast lookup

Few database connections

Inexpensive reorganization

Equal distribution

Page 25: Sharding Architectures

Modulo

ShardUid: 10-19

ShardUid: 20-30

ShardUid: 1-9

Page 26: Sharding Architectures

Modulo

ShardUid: 8-15

ShardUid: 16-23

ShardUid: 1-7

ShardUid: 24-30

24

6

Page 27: Sharding Architectures

Lookup table

ShardUid: 10-18

ShardUid: 1-9

Lookup

ShardUid: 19-28

Page 28: Sharding Architectures

Optimizing

Page 29: Sharding Architectures

Optimizing

caches (memcache)

model optimization

normalization? no!

connection pools

persistant connections (using mysqli+mysqlnd)

Page 30: Sharding Architectures

Optimizing II

aggregation daemons

persistent

MySQL Proxy with LUA

asynchronous queries (mysqlnd)

Page 31: Sharding Architectures

Shard 0

App

Shard 1

Shard 2

Optimizing III

Aggregation

App

Page 32: Sharding Architectures

Problems

Joins, Unions, Intersections

Grouping

Selecting and projecting on groups

Aggregation

Primary keys

Referential integrity (foreign keys)

(De-)Normalization

Page 33: Sharding Architectures

Problems

Global tables

Sharing

Tagging

Invitations

Search

Lots of relations (due to normalization) between tables

Page 34: Sharding Architectures

Migration

Find the bottleneck

Refactoring of DB model needed?

Prepare implementation

prepare DBA layer

write unit tests

split existing data

Page 35: Sharding Architectures

Migration II

Setup shards

Migrate your code

Test, Test, Test

Check if everything runs smooth

MySQL proxy

DTrace

Page 36: Sharding Architectures

Scale!

Page 37: Sharding Architectures

Some code

$conn->getConnection($userId)->query();

$conn->getGlobalConnection()->query();