Advanced Sharding Techniques with Spider (MUC2010)

Advanced sharding techniques with Spider

Kentoku SHIBA (kentokushiba at gmail dot com)

How to shard a database without stopping the service

How to shard a database

What is database sharding? When the data volume or the update traffic grows, a single database server can no longer process the updates effectively. A common solution is to divide the data across two or more databases. This is database sharding.

Here, I will explain how to shard data without stopping the service.

Initial Structure

There is 1 MySQL server without Spider.

DB1: tbl_a

Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;

Step 1 (for sharding)

Create the table on DB2 and DB3. Then create the tables on DB1.

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4
DB2: tbl_a (col_a%2=0)
DB3: tbl_a (col_a%2=1)

Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
Connection 'table "tbl_a", user "user", password "pass"'
partition by list(mod(col_a, 2)) (
  partition pt1 values in(0) comment 'host "DB2"',
  partition pt2 values in(1) comment 'host "DB3"'
);

Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3"';

) engine = InnoDB;
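The list partitions on tbl_a3 amount to a routing rule: the value of mod(col_a, 2) picks the host. A minimal Python sketch of that rule follows. The names PARTITIONS, mysql_mod and route are illustrative and not part of Spider, and the sketch only models which host a row lands on, not how Spider talks to that host.

```python
# Partition list copied from the tbl_a3 definition: mod(col_a, 2) value -> host.
PARTITIONS = {0: "DB2", 1: "DB3"}


def mysql_mod(a: int, n: int) -> int:
    """MySQL's MOD() keeps the sign of the dividend, unlike Python's % operator."""
    remainder = abs(a) % abs(n)
    return -remainder if a < 0 else remainder


def route(col_a: int) -> str:
    """Return the host whose list partition accepts this col_a value."""
    key = mysql_mod(col_a, 2)
    if key not in PARTITIONS:
        # A LIST-partitioned table rejects rows whose value is in no partition.
        raise ValueError(f"no partition for mod(col_a, 2) = {key}")
    return PARTITIONS[key]


for value in (4, 7):
    print(value, "->", route(value))
```

Negative odd keys are worth noting: mod(-1, 2) is -1 in MySQL, which matches neither partition, so this sketch raises an error for them instead of silently choosing a host.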

Step 2

Rename tables on DB1.
(rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a)

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a5
DB2: tbl_a (col_a%2=0)
DB3: tbl_a (col_a%2=1)

Step 3

Copy data from tbl_a2 to tbl_a3 on DB1.
(select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3'))

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a5
DB2: tbl_a (col_a%2=0)
DB3: tbl_a (col_a%2=1)
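Why the copy can run while the service stays up can be pictured with a toy model: new writes reach both the old and the new table, while a background copy fills in the rows that were already there. The Python sketch below is only that general pattern, written under the assumption that tbl_a mirrors writes to tbl_a2 and tbl_a3 during the copy. It is not a description of the VP engine's internals, and MirroredTable and backfill are hypothetical names.

```python
class MirroredTable:
    """Toy stand-in for tbl_a during the copy: every write goes to both tables."""

    def __init__(self, old_rows: dict, new_rows: dict) -> None:
        self.old_rows = old_rows
        self.new_rows = new_rows

    def write(self, col_a: int, col_b: int) -> None:
        self.old_rows[col_a] = col_b
        self.new_rows[col_a] = col_b


def backfill(old_rows: dict, new_rows: dict, chunk_size: int = 2) -> None:
    """Copy existing rows to the new table a few keys at a time."""
    keys = sorted(old_rows)
    for start in range(0, len(keys), chunk_size):
        for key in keys[start:start + chunk_size]:
            new_rows[key] = old_rows[key]


old_rows = {1: 10, 2: 20, 3: 30}   # existing data (plays the role of tbl_a2)
new_rows = {}                      # empty target (plays the role of tbl_a3)
table = MirroredTable(old_rows, new_rows)

table.write(4, 40)   # an insert that arrives before the copy finishes
table.write(2, 21)   # an update that arrives before the copy finishes
backfill(old_rows, new_rows)

print(new_rows == old_rows)
```

Once both tables hold the same rows, swapping the names as in Step 4 does not lose any write.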

Step 4

Rename tables on DB1.
(rename table tbl_a to tbl_a4, tbl_a3 to tbl_a)

DB1: tbl_a, tbl_a2, tbl_a4, tbl_a5
DB2: tbl_a (col_a%2=0)
DB3: tbl_a (col_a%2=1)

Finish

Drop tables on DB1.
(drop table tbl_a2, tbl_a4, tbl_a5)

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB3: tbl_a (col_a%2=1)

How to re-shard a database without stopping the service

How to re-shard a database

What is re-sharding? When the data volume or the update traffic grows far enough, even a sharded database can no longer keep up with the updates. We solve that problem by increasing the number of servers and redistributing the load across them. This is called re-sharding.

Here, I will explain how to re-shard without stopping the service.

Initial Structure

There is 1 MySQL server with Spider, and there are 2 remote MySQL servers without Spider.

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB3: tbl_a (col_a%2=1)

Create table tbl_a (col_a int, col_b int, primary key(col_a)

Step 1 (for re-sharding)

Create the table on DB4 and DB5. Then create the tables on DB3.

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB3: tbl_a, tbl_a2, tbl_a3, tbl_a4 (col_a%2=1)
DB4: tbl_a (col_a%4=1)
DB5: tbl_a (col_a%4=3)

Step 2

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB3: tbl_a, tbl_a2, tbl_a3, tbl_a5 (col_a%2=1)
DB4: tbl_a (col_a%4=1)
DB5: tbl_a (col_a%4=3)

Step 3

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB3: tbl_a, tbl_a2, tbl_a3, tbl_a5 (col_a%2=1)
DB4: tbl_a (col_a%4=1)
DB5: tbl_a (col_a%4=3)

Step 4

Rename tables on DB3. Then alter the table on DB1.

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB3: tbl_a, tbl_a2, tbl_a4, tbl_a5 (col_a%2=1)
DB4: tbl_a (col_a%4=1)
DB5: tbl_a (col_a%4=3)

Rename table tbl_a to tbl_a4, tbl_a3 to tbl_a;

Alter table tbl_a
partition by list(mod(col_a, 4)) (
  partition pt1 values in(0,2) comment 'host "DB2"',
  partition pt2 values in(1) comment 'host "DB4"',
  partition pt3 values in(3) comment 'host "DB5"'
);
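The switch from mod(col_a, 2) to mod(col_a, 4) only relocates the rows that used to live on DB3. A small Python sketch makes this visible by comparing the two partition lists; OLD_PARTITIONS, NEW_PARTITIONS, old_host and new_host are illustrative names, and the sketch assumes non-negative col_a values.

```python
# Before re-sharding: value of mod(col_a, 2) -> host.
OLD_PARTITIONS = {0: "DB2", 1: "DB3"}
# After re-sharding: value of mod(col_a, 4) -> host, as in the alter table statement.
NEW_PARTITIONS = {0: "DB2", 2: "DB2", 1: "DB4", 3: "DB5"}


def old_host(col_a: int) -> str:
    return OLD_PARTITIONS[col_a % 2]


def new_host(col_a: int) -> str:
    return NEW_PARTITIONS[col_a % 4]


# Keys whose host changes, with the (old, new) host pair for each.
moved = {
    key: (old_host(key), new_host(key))
    for key in range(8)
    if old_host(key) != new_host(key)
}

for key, (before, after) in sorted(moved.items()):
    print(f"col_a={key}: {before} -> {after}")
```

Every even key stays on DB2, which is why DB2 needs no data copy in this procedure and only DB3 is dropped at the end.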

Finish

Drop DB3.

DB1: tbl_a
DB2: tbl_a (col_a%2=0)
DB4: tbl_a (col_a%4=1)
DB5: tbl_a (col_a%4=3)

How to add an index without stopping the service

How to add an index

If you add an index in MySQL, you cannot update your data until the process is completed. For a big table this takes a long time, and sometimes the service cannot be used during the change.

Here, I will explain how to add an index without stopping updates to your data.

Initial Structure

There is 1 MySQL server.

DB1: tbl_a

) engine = InnoDB;

Step 1 (for adding an index)

Create tables on DB1.

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4

Create table tbl_a4 (col_a int, col_b int, primary key(col_a)

) engine = InnoDB;

Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a),
  key idx1(col_b)
) engine = InnoDB;

Step 2

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a5

Step 3

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a5

Step 4

DB1: tbl_a, tbl_a2, tbl_a4, tbl_a5

Finish

How to change the schema without stopping the service

How to change the schema

If you change the schema in MySQL, you cannot update your data until the process is completed. For a big table this takes a long time, and sometimes the service cannot be used during the change.

Here, I will explain how to change the schema without stopping updates to your data.

Initial Structure

DB1: tbl_a

) engine = InnoDB;

Step 1 (for adding a column)

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4

Create table tbl_a4 (col_a int, col_b int, primary key(col_a)

) engine = InnoDB;

Create table tbl_a3 (
  col_a int,
  col_b int,
  col_c int default null,
  primary key(col_a)
) engine = InnoDB;

Step 2

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a5

Step 3

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a5

Step 4

DB1: tbl_a, tbl_a2, tbl_a4, tbl_a5

Finish

How to set up a cluster for fault tolerance without stopping the service

How to set up a cluster for fault tolerance

Spider can set up a fault-tolerant cluster on a per-table basis.

Here, I will explain how to set up a cluster without stopping the service.

'Monitoring node' in these slides is a node that watches each node in the cluster for failures. 'spider_copy_tables' in these slides is still in development, so please wait a while before using it.

Initial Structure

There is 1 MySQL server with Spider and 1 remote MySQL server without Spider.

DB1: tbl_a
DB2: tbl_a

) engine = Spider
Connection 'table "tbl_a", user "user", password "pass", host "DB2"';

) engine = InnoDB;

Step 1 (for clustering)

Add new data nodes (DB3 and DB4) and tables.

DB1: tbl_a
Data nodes DB2, DB3, DB4: tbl_a

) engine = InnoDB;

Step 2

Add new monitoring nodes (DB5, DB6, DB7) and tables.

DB1: tbl_a
Data nodes DB2, DB3, DB4: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

) engine = Spider
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4"';

Step 3

Register the monitoring node information on the MySQL servers with Spider.

Then alter the table on DB1.

DB1: tbl_a
Data nodes DB2, DB3, DB4: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

insert into mysql.spider_link_mon_servers
  (db_name, table_name, link_id, sid, server, scheme, host, port, socket, username, password)
values
  ('db_name', 'tbl_a', 0, DB5_sid, null, 'mysql', 'DB5', 3306, null, 'user', 'pass'),
  ('db_name', 'tbl_a', 0, DB6_sid, null, 'mysql', 'DB6', 3306, null, 'user', 'pass'),
  ('db_name', 'tbl_a', 0, DB7_sid, null, 'mysql', 'DB7', 3306, null, 'user', 'pass');

Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4", mbk "2", mkd "2", msi "DB5_sid", link_status "0 2 2"';
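The host and link_status lists in the Connection string pair up by position. The Python sketch below decodes them, assuming the status codes documented for MariaDB's Spider (0 leaves the status unchanged, 1 is OK, 2 is RECOVERY, 3 is NG). Please check the documentation of your Spider version before relying on these meanings; LINK_STATUS and describe_links are illustrative names.

```python
# Assumed meaning of each link_status code (see the lead-in for the caveat).
LINK_STATUS = {0: "unchanged", 1: "OK", 2: "RECOVERY", 3: "NG"}


def describe_links(hosts: str, link_status: str) -> dict:
    """Pair each host with the status requested for it, by position."""
    host_list = hosts.split()
    codes = [int(code) for code in link_status.split()]
    if len(host_list) != len(codes):
        raise ValueError("host and link_status must list the same number of entries")
    return {host: LINK_STATUS[code] for host, code in zip(host_list, codes)}


links = describe_links("DB2 DB3 DB4", "0 2 2")
for host, status in links.items():
    print(host, status)
```

Read this way, the statement leaves DB2 as it is and marks the two new data nodes as still being filled, which fits the data copy in the next step.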

Step 4

Copy data from DB2 to DB3 and DB4.

Select spider_copy_tables('tbl_a', '', '');

DB1: tbl_a
Data nodes DB2, DB3, DB4: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

Finish

Alter the table on DB1.

DB1: tbl_a
Data nodes DB2, DB3, DB4: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

How to add a new node after failover and preparing a new server

Create a table of a new node for the clustered table

When a node in the cluster fails, you need to add a new node in order to maintain redundancy.

Here, I will explain how to add a table of a new node without stopping the service.

'Monitoring node' in these slides is a node that watches each node in the cluster for failures. 'spider_copy_tables' in these slides is still in development; it will be available in future releases.

Initial Structure

There are 4 MySQL servers with Spider (including 3 monitoring nodes) and 3 MySQL servers without Spider (including 1 broken node).

DB1: tbl_a
Data nodes DB2, DB3, DB4: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

Step 1

Add a new data node (DB8) and table.

DB1: tbl_a
Data nodes DB2, DB3, DB4, DB8: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

) engine = InnoDB;

Step 2

Alter the table on the monitoring nodes (DB5, DB6 and DB7).

DB1: tbl_a
Data nodes DB2, DB3, DB4, DB8: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8"';

Step 3

Alter the table on DB1.

DB1: tbl_a
Data nodes DB2, DB3, DB4, DB8: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

Step 4

Copy data from DB2 to DB8.

Select spider_copy_tables('tbl_a', '', '');

DB1: tbl_a
Data nodes DB2, DB3, DB4, DB8: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

Finish

Alter the table on DB1.

DB1: tbl_a
Data nodes DB2, DB3, DB4, DB8: tbl_a
Monitoring nodes DB5, DB6, DB7: tbl_a

How to avoid the table partitioning UNIQUE column limitation

How to avoid the table partitioning UNIQUE column limitation

Currently, MySQL has a restriction: if a table has a PRIMARY KEY or UNIQUE key, you cannot partition it by columns that are not part of that key.

Here, I will show you how to partition a table by any column even if it has a PRIMARY KEY or UNIQUE key.

Initial Structure

DB1: tbl_a

) engine = InnoDB;

Step 1 (for avoiding partitioning limitation)

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4, tbl_a5

Create table tbl_a5 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'ctm "1", ist "1", zru "1", pcm "1"'
Connection 'tnl "tbl_a2 tbl_a3 tbl_a4"';

) engine = InnoDB;

Create table tbl_a3 (
  col_a int,
  primary key(col_a)
) engine = InnoDB
partition by linear hash(col_a)
partitions 4;

Create table tbl_a4 (
  col_a int,
  col_b int,
  key idx1(col_a),
  key idx2(col_b)
) engine = InnoDB
partition by list(mod(col_b, 2)) (
  partition pt1 values in(0),
  partition pt2 values in(1)
);
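The MySQL rule behind this layout is that every PRIMARY KEY or UNIQUE key on a partitioned table must contain all columns used in the partitioning expression. The Python sketch below checks that rule for the tables of this step; partitioning_allowed is a hypothetical helper for illustration, not a MySQL or Spider function.

```python
def partitioning_allowed(unique_keys: list, partition_columns: list) -> bool:
    """True when every unique key contains all partitioning columns."""
    needed = set(partition_columns)
    return all(needed <= set(key) for key in unique_keys)


# Original tbl_a: primary key(col_a), but we want to partition by col_b.
original = partitioning_allowed([["col_a"]], ["col_b"])

# tbl_a3: primary key(col_a), partitioned by col_a.
tbl_a3 = partitioning_allowed([["col_a"]], ["col_a"])

# tbl_a4: only non-unique keys, partitioned by col_b.
tbl_a4 = partitioning_allowed([], ["col_b"])

print(original, tbl_a3, tbl_a4)
```

Splitting the table this way lets tbl_a3 keep the uniqueness of col_a while tbl_a4, which has no unique key, is free to be partitioned by col_b.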

Step 2

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4, tbl_a6

Step 3

Copy data from tbl_a2 to tbl_a3 and tbl_a4.
(select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3 tbl_a4'))

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4, tbl_a6

Step 4

Alter table tbl_a.

DB1: tbl_a, tbl_a2, tbl_a3, tbl_a4, tbl_a6

Alter table tbl_a
Comment 'ctm "1", ist "1", pcm "1"',
Connection 'tnl "tbl_a3 tbl_a4"';

Finish

Drop tables.
(drop table tbl_a2, tbl_a6)

DB1: tbl_a, tbl_a3, tbl_a4

Case study

About MicroAd

MicroAd is an advertising company.

The company delivers advertising efficiently using "behavioral targeting" technology.

MicroAd, Inc.: http://www.microad.jp/english/

The previous architecture

Batch processing updates new statistical rules every day (for every advertiser, every advertising medium and every user).

Diagram: the batch server registers new statistical rules on MasterDB; MasterDB replicates to the SlaveDBs; the AP servers read from the SlaveDBs.

The problem with business expansion

The data volume and the number of requests increased. At that time the limit was 20 million record updates a day. They needed to update 100 million records a day.

They also wanted to improve the read performance of the slaves by reducing the volume of updates that a single slave has to apply.

They did not want to change or modify their application to support the increase.

So they used Spider.

The architecture with Spider

They created the shards with the replication set (a MasterDB and its SlaveDBs) as the unit.

Diagram: the batch server registers new statistical rules through SpiderDB (MySQL with Spider); Spider sharding distributes them across several MasterDBs; each MasterDB replicates to its own SlaveDBs; the AP servers run with Spider.

Resolved the problem

As a result, they achieved 100 million record updates a day and improved read performance.

They did not need to change or modify their applications much.

They plan to re-shard in the near future as the business expands.

http://wild-growth.blogspot.com/
http://spiderformysql.com

Kentoku SHIBA (kentokushiba at gmail dot com)

Any Questions?

Thank you for your time!!
