Massively Distributed Backups at Facebook Scale - Shlomo Priymak, Facebook - DevOpsDays Tel Aviv...

Post on 14-Apr-2017



Massively Distributed Backup at Facebook Scale

Shlomo Priymak (shlomo@fb.com, @shlomoid)
Production Engineering Manager, MySQL Infrastructure

MySQL at Facebook

Sharding

fbid is a 64-bit integer
map(fbid) = shard id

{ "id": "101231234567123", "name": "Shlomo Priymak" }

Graph API Example: graph.facebook.com/me

[Diagram: a Replica Set consists of a Master and its Slaves; each server instance hosts shards #1 to #4.]

Prineville, Oregon

Altoona, Iowa

Forest City, North Carolina

Ashburn, Virginia

Luleå, Sweden

100

1000+

Backup Fundamentals

Full Dumps

• `mysqldump`
• --single-transaction
• Logical Read Ahead
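As a minimal sketch of how such a full dump might be launched, the snippet below only builds the `mysqldump` command line with `--single-transaction`. The database name `shard_0001` and the `build_dump_command` helper are hypothetical, and Logical Read Ahead is not modeled here.

```python
import shlex


def build_dump_command(database, host="localhost"):
    """Return an argv list for a consistent logical dump of one database.

    The helper name and the database name used below are illustrative
    assumptions, not taken from the talk.
    """
    return [
        "mysqldump",
        "--single-transaction",  # dump from one consistent InnoDB snapshot
        f"--host={host}",
        database,
    ]


cmd = build_dump_command("shard_0001")
# Printing the command keeps the sketch runnable without a MySQL server.
print(shlex.join(cmd))
```

The list could be handed to `subprocess.run` on a host where `mysqldump` is installed.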

Logical vs. Physical

                            Logical     Physical
External Tools              Yes         No
Size                        Small       Large
Single Table Restore        Easy        Difficult
Debug Corruption            Easy        Difficult
Compressibility             Excellent   Meh
Backup / Restore Duration   Long        Short

Differential Backup

[Chart: % of space taken by differential backups, Day 1 to Day 4]

[Chart: relative backup space usage, Day 1 to Day 4; series: Full Backup, Differential Backup]

Differential Backup Generation

CREATE TABLE t (id int, city char(50)); /* ORDERING KEY: (id) */

Full Backup (old):
INSERT INTO t VALUES (1, 'San Francisco'), (2, 'Oakland'), (3, 'Menlo Park'), [...];

Full Backup (new):
INSERT INTO t VALUES (1, 'San Francisco'), (2, 'Santa Clara'), (400, 'Los Angeles'), [...];

The two dumps are compared row by row in ordering-key order:

Step 1, id 1: No Change
Inserted Rows: (empty)
Deleted Rows: (empty)

Step 2, id 2: Row Updated
Inserted Rows: INSERT INTO t1 VALUES (2, 'Santa Clara');
Deleted Rows: INSERT INTO t1 VALUES (2, 'Oakland');

Step 3, id 3: Row Deleted
Inserted Rows: INSERT INTO t1 VALUES (2, 'Santa Clara');
Deleted Rows: INSERT INTO t1 VALUES (2, 'Oakland'), (3, 'Menlo Park');

Step 4, id 400: Row Inserted
Inserted Rows: INSERT INTO t1 VALUES (2, 'Santa Clara'), (400, 'Los Angeles');
Deleted Rows: INSERT INTO t1 VALUES (2, 'Oakland'), (3, 'Menlo Park');

Final Output

Inserted Rows:
INSERT INTO t1 VALUES (2, 'Santa Clara'), (400, 'Los Angeles');

Deleted Rows:
INSERT INTO t1 VALUES (2, 'Oakland'), (3, 'Menlo Park');
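The row-by-row comparison can be sketched as a merge over two dumps that are both sorted by the ordering key. This is an illustrative reconstruction, not the actual differ: the function name `generate_diff` and the tuple representation of rows are assumptions, and the old dump is taken to be the one containing Oakland and Menlo Park.

```python
def generate_diff(old_rows, new_rows):
    """Walk two dumps sorted by ordering key; return (deleted, inserted)."""
    deleted, inserted = [], []
    old_it, new_it = iter(old_rows), iter(new_rows)
    old, new = next(old_it, None), next(new_it, None)
    while old is not None or new is not None:
        if new is None or (old is not None and old[0] < new[0]):
            deleted.append(old)      # key exists only in the old dump
            old = next(old_it, None)
        elif old is None or new[0] < old[0]:
            inserted.append(new)     # key exists only in the new dump
            new = next(new_it, None)
        else:
            if old != new:           # same key, changed payload: an update
                deleted.append(old)
                inserted.append(new)
            old, new = next(old_it, None), next(new_it, None)
    return deleted, inserted


old_full = [(1, "San Francisco"), (2, "Oakland"), (3, "Menlo Park")]
new_full = [(1, "San Francisco"), (2, "Santa Clara"), (400, "Los Angeles")]
deleted, inserted = generate_diff(old_full, new_full)
print("Deleted Rows:", deleted)
print("Inserted Rows:", inserted)
```

Because both inputs are consumed as iterators in key order, the same loop would work on streams without holding either dump in memory.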

Restoring Diff Backup

Full Backup (old):
INSERT INTO t VALUES (1, 'San Francisco'), (2, 'Oakland'), (3, 'Menlo Park'), [...];

Deleted Rows:
INSERT INTO t1 VALUES (2, 'Oakland'), (3, 'Menlo Park');

Inserted Rows:
INSERT INTO t1 VALUES (2, 'Santa Clara'), (400, 'Los Angeles');

3-Way Merge

Full Backup (new):
INSERT INTO t VALUES (1, 'San Francisco'), (2, 'Santa Clara'), (400, 'Los Angeles'), [...];
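A small sketch of the restore direction, under stated assumptions: it applies the deleted and inserted rows to the old full backup using an in-memory dictionary keyed by id, which is a simplification of the streaming 3-way merge named on the slide. The function name `apply_diff` is hypothetical.

```python
def apply_diff(old_rows, deleted, inserted):
    """Rebuild the new full backup from the old full backup plus a diff."""
    by_key = {row[0]: row for row in old_rows}
    for row in deleted:
        # A deleted row must match the old backup exactly, otherwise the
        # diff was generated against a different base.
        if by_key.get(row[0]) != row:
            raise ValueError(f"diff does not match base backup at key {row[0]}")
        del by_key[row[0]]
    for row in inserted:
        by_key[row[0]] = row
    return sorted(by_key.values())  # restore ordering-key order


old_full = [(1, "San Francisco"), (2, "Oakland"), (3, "Menlo Park")]
deleted_rows = [(2, "Oakland"), (3, "Menlo Park")]
inserted_rows = [(2, "Santa Clara"), (400, "Los Angeles")]
restored = apply_diff(old_full, deleted_rows, inserted_rows)
print(restored)
```

Checking the deleted rows against the base is a design choice of this sketch; it catches the case where a diff is applied to the wrong full backup.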

Binary Logs

• Point in time recovery
• Global Transaction IDs

Continuous Restore

Backup Schedule: What, When, Where

• Everything, Every Day
• Streaming Binary Logs
• Multiple stages
  • HDFS
  • Offsite

Backup Schedule
Full, Diff, Diff, Diff, Full, Diff, Diff, Diff, Full, Diff, Diff, Diff…

[Calendar: 1 Full, 2 Diff, 3 Diff, 4 Diff, 5 Full, 6 Diff, 7 Diff, 8 Diff, 9 Full, 10 Diff, 11 Diff, 12 Diff]
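The repeating pattern above (one full backup followed by three differentials) can be written as a one-line rule. This is only a sketch: the function name `backup_type` and the assumption that days are numbered from 1 are mine.

```python
def backup_type(day, cycle_length=4):
    """Return 'Full' on the first day of each cycle, 'Diff' otherwise.

    Days are assumed to be numbered from 1, as on the calendar slide.
    """
    return "Full" if (day - 1) % cycle_length == 0 else "Diff"


schedule = [backup_type(day) for day in range(1, 13)]
print(", ".join(schedule))
```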

Backup Traffic

Peak: 3.5 Tb/s

Trough: ~0.5 Tb/s

Differential Backup Stages

1) mysql → Full (new) → HDFS
2) HDFS → Full (new) + Full (old) → Differ → Diff (new) → HDFS

• Too much HDFS I/O
• Too much network I/O
• Too long

#fail


Differential Streaming

[Diagram: on the Database Server, mysql streams Full (new) into the Differ; the Differ reads Full (old) from HDFS and writes Diff (new) to HDFS.]

Prineville, Oregon

Altoona, Iowa

Forest City, North Carolina

Ashburn, Virginia

Luleå, Sweden

1. Target
2. Source

System Design Goals

• Equalize HDFS cluster usage
• Minimize cross-region traffic
• Avoid broken replicas
• Consistency
• Backup at least once!

Distribution Algorithm

• Define allocation globally
• Hash shards into 1000 buckets
• Allocate buckets to clusters, proportional to size

[Diagram: shards 1 to 14 on a server instance are hashed into buckets numbered 1 to 1000.]

1000 Buckets

Cluster   Size     Buckets       Bucket range
HDFS 1    1 PB     100 buckets   1 to 100
HDFS 2    3 PB     300 buckets   101 to 400
HDFS 3    2 PB     200 buckets   401 to 600
HDFS 4    2.5 PB   250 buckets   601 to 850
HDFS 5    1.5 PB   150 buckets   851 to 1000

Total buckets: 1000
Total size: 10 PB (Example)
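The example allocation can be reproduced with a short sketch. The cluster sizes come from the slide; the function names and the use of MD5 as the shard hash are assumptions made for illustration, since the talk does not say which hash is used.

```python
import hashlib

NUM_BUCKETS = 1000


def shard_bucket(shard_id):
    """Hash a shard id into a bucket in 1..NUM_BUCKETS (hash choice assumed)."""
    digest = hashlib.md5(str(shard_id).encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS + 1


def allocate_buckets(cluster_sizes_pb):
    """Give each cluster a contiguous bucket range proportional to its size."""
    total = sum(cluster_sizes_pb.values())
    ranges, start = {}, 1
    for name, size in cluster_sizes_pb.items():
        # Rounding is exact for the slide's numbers; other size mixes may
        # need a correction so that the counts sum to NUM_BUCKETS.
        count = round(NUM_BUCKETS * size / total)
        ranges[name] = (start, start + count - 1)
        start += count
    return ranges


def cluster_for_bucket(bucket, ranges):
    """Look up which cluster owns a given bucket."""
    for name, (low, high) in ranges.items():
        if low <= bucket <= high:
            return name
    raise KeyError(bucket)


sizes = {"HDFS 1": 1, "HDFS 2": 3, "HDFS 3": 2, "HDFS 4": 2.5, "HDFS 5": 1.5}
ranges = allocate_buckets(sizes)
print(ranges)
print(cluster_for_bucket(shard_bucket(14), ranges))
```

Because the allocation is defined globally and depends only on the shard id and the cluster sizes, any host can compute the same target cluster independently.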

[Diagram: example buckets placed on clusters by range. Buckets 20 and 30 → HDFS 1 (1 PB); Buckets 200 and 400 → HDFS 2 (3 PB); Bucket 500 → HDFS 3 (2 PB); Buckets 650, 700 and 800 → HDFS 4 (2.5 PB); Bucket 900 → HDFS 5 (1.5 PB).]

Unified Pool

Some are More Equal than Others?

sorted unsorted

sorted unsorted

Rebalance / Convergence

[Diagram: MySQL servers A, B and C, each shown with the same ordered list: 1. C, 2. A, 3. B]

Is Alive?

[Diagram: MySQL servers A, B and C; HDFS clusters α and ɣ]

Δ(B, α) = 10

Δ(A, α) = 0

Δ(A, ɣ) = 20

Δ(B, ɣ) = 15

Δ(C, ɣ) = 0

HDFS Priority

ɣ: 0
α: 1

rank(mysql, hdfs) ≡ (Δ(mysql, hdfs), priority(hdfs))

𝕄 = {A, B, C} // MySQL Servers
𝓗 = {α, ɣ} // HDFS Clusters
𝕄×𝓗 = {(m, h) : m ∈ 𝕄 ∧ h ∈ 𝓗}
return sort(𝕄×𝓗, key=rank(m, h))

Before sorting:

(m, h)   Δ    priority
A, α     0    1
A, ɣ     20   0
B, α     10   1
B, ɣ     15   0
C, α     20   1
C, ɣ     0    0

After sorting by rank:

(m, h)   Δ    priority
C, ɣ     0    0
A, α     0    1
B, α     10   1
B, ɣ     15   0
A, ɣ     20   0
C, α     20   1
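The ranking can be checked with a few lines of Python. The Δ and priority values are the ones from the slides; the cluster names are spelled out as "alpha" and "gamma" here, and the variable names are illustrative.

```python
from itertools import product

# Delta(mysql, hdfs) values from the slides.
delta = {
    ("A", "alpha"): 0, ("A", "gamma"): 20,
    ("B", "alpha"): 10, ("B", "gamma"): 15,
    ("C", "alpha"): 20, ("C", "gamma"): 0,
}
priority = {"gamma": 0, "alpha": 1}


def rank(pair):
    """Sort key: distance first, then HDFS cluster priority as tie-breaker."""
    _, hdfs = pair
    return (delta[pair], priority[hdfs])


mysql_servers = ["A", "B", "C"]
hdfs_clusters = ["alpha", "gamma"]
ordered = sorted(product(mysql_servers, hdfs_clusters), key=rank)
for pair in ordered:
    print(pair, rank(pair))
```

Tuples compare element by element, so the priority only matters when two pairs have the same Δ, as with (C, ɣ) and (A, α).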

Pile it Up

[Diagram: MySQL servers A, B and C]

War Story: Cluster Turn Up

New Region Cluster Turn Up

• New HDFS Cluster
• New Datacenter
• Slow Ramp-Up

New Region Cluster Turn Up

• Network woes
• Pulling full backups to create diffs!
• Fix: run full when target HDFS changes
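The fix in the last bullet can be sketched as a small decision rule: when the target HDFS cluster differs from the one that holds the previous backup, take a full backup instead of a diff, so the old full never has to be pulled across regions. The function and parameter names are hypothetical.

```python
def choose_backup_type(scheduled_type, previous_target, current_target):
    """Force a full backup when the target HDFS cluster has changed.

    previous_target is None when no earlier backup exists for the shard.
    """
    if previous_target is None or previous_target != current_target:
        return "Full"
    return scheduled_type


# Same cluster as before: follow the schedule.
print(choose_backup_type("Diff", "HDFS 2", "HDFS 2"))
# Shard was re-allocated to a new cluster: override the schedule.
print(choose_backup_type("Diff", "HDFS 2", "HDFS 5"))
```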

New Region Cluster Turn Up, cont.


Call From the Engine Room

Divert Power to the Shields!

• Emergency meeting
• Fix: Turn off a few racks

The Future!

Row Based Binary Logs

• Record previous value by default
• Binary Logs + Binary Logs => Diff
• Full + Diff => Full
• In theory, run full backup only once!

Questions!

Shlomo Priymak (shlomo@fb.com), @shlomoid