WalB: Block-level WAL. Concept.
![Page 1: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/1.jpg)
WalB: Block-level WAL for Efficient Incremental Backup
Dec 2, 2010
Takashi HOSHINO
Cybozu Labs, Inc.
![Page 2: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/2.jpg)
Contents
• Motivation
• WalB
– Architecture
– Main Algorithm
– Data Format
– Pros and Cons
• Current Progress
• Summary
![Page 3: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/3.jpg)
Motivation
• There is no good backup solution
– Online
– Small performance overhead
– Supports various applications
– Cost-effective
• We need
– Consistent full backup
– Incremental backup
– Block-level backup
– To use commodity hardware and free software only
![Page 4: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/4.jpg)
Requirements inside Cybozu
• Guarantee backup interval
– within 5-10 minutes
• Keep multiple backup archives
– also in remote site
• Keep multiple snapshots
– one per day, kept for one week
![Page 5: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/5.jpg)
WalB
• A wrapper block device driver
– Data device to store data
– Log device to store WAL (write-ahead log)
• Related user-land tools
– Device controller
– Log extractor
• Target OS and architecture
– Latest Linux kernel (2.6.36)
– x86_64 host
![Page 6: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/6.jpg)
System Architecture with WalB
[Figure: layered system stack]
• App layer: Applications, Database System
• OS layer: File System, WalB (for backup), Software RAID1 (for mirroring)
• Storage layer: Volume0, Volume1
![Page 7: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/7.jpg)
Backup vs Mirroring
| | Backup | Mirroring |
| --- | --- | --- |
| Typical methods | Making snapshots, logging writes | RAID1, replication |
| Recoverable failures | Operation mistakes, application bugs | Failure of facilities |
| Can keep latest data? | No | Yes |
We need both functionalities to protect data from loss.
![Page 8: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/8.jpg)
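WalB Architecture
[Figure: components and data paths]
• Any application (file system, DBMS, etc.)
– Reads from and writes to the wrapper block device
• Wrapper block device (WalB device)
– Read/write: any block device for data (data device), no special format
– Log: any block device for the log (log device), an original format
• WalB device controller
– Controls the WalB device
• WalB log extractor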
![Page 9: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/9.jpg)
WalB Functionalities
• Online incremental backup
• Online consistent full backup
• Snapshot creation/deletion
– snapshots are not accessible because there is no index
• Volume resize
– the log capacity cannot be resized
![Page 10: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/10.jpg)
Incremental Backup with WalB
[Figure: volumes and their backups]
• Primary Server: LV1, LV2, …
• Backup Server 1: LV1 bkp1, LV2 bkp1, …
• Backup Server 2: LV1 bkp2, LV2 bkp2, …
![Page 11: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/11.jpg)
Log Transfer
• Each log is compressed with lzop and transferred via ssh/rsh.
• Proxy may be useful for pipelined transfer to multiple sites.
• Delete original log in primary server after all backup servers have a replica.
[Figure: log transfer in progress]
• Primary Server: Log 1, Log 2, Log 3, Log 4
• Backup Server 1: Log 1 (lzoped), Log 2 (lzoped)
• Backup Server 2: Log 1 (lzoped)
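The deletion rule above (delete the original only after all backup servers hold a replica) can be sketched as a small bookkeeping structure. This is a minimal illustration, not WalB code: `log_replica_state`, `log_state_ack`, `log_can_delete`, and the limit `MAX_BACKUP_SERVERS` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BACKUP_SERVERS 8  /* hypothetical upper bound, for illustration */

/* Hypothetical bookkeeping for one extracted log on the primary server. */
struct log_replica_state {
    size_t n_servers;                /* number of backup servers */
    bool acked[MAX_BACKUP_SERVERS];  /* acked[i]: server i holds a replica */
};

/* Start with no replica confirmed. */
void log_state_init(struct log_replica_state *st, size_t n_servers)
{
    assert(n_servers <= MAX_BACKUP_SERVERS);
    st->n_servers = n_servers;
    for (size_t i = 0; i < MAX_BACKUP_SERVERS; i++)
        st->acked[i] = false;
}

/* Record that backup server `idx` has stored its replica. */
void log_state_ack(struct log_replica_state *st, size_t idx)
{
    assert(idx < st->n_servers);
    st->acked[idx] = true;
}

/* The original may be deleted only when every backup server has a replica. */
bool log_can_delete(const struct log_replica_state *st)
{
    for (size_t i = 0; i < st->n_servers; i++) {
        if (!st->acked[i])
            return false;
    }
    return true;
}
```

With two backup servers, `log_can_delete` stays false until both `log_state_ack(&st, 0)` and `log_state_ack(&st, 1)` have been called.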
![Page 12: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/12.jpg)
Consistent Full Backup with WalB
[Figure: Primary Server (WalB device, online; log) and Backup Server (full archive, inconsistent; log), with a timeline t0, t1, t2]
• (A) Start full backup
• (B) Get log from t0 to t1
• (C) Apply the log to get a consistent image at t1
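Step (C) works because replaying every write logged between t0 and t1, in log order, overwrites any block that was copied before its write arrived. A minimal sketch of that replay follows, assuming a simplified in-memory record; `logged_write`, `apply_log`, and the tiny `BLOCK_SIZE` are hypothetical and do not reflect the real log format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4  /* tiny block size, for illustration only */

/* Hypothetical in-memory form of one logged write. */
struct logged_write {
    uint64_t lsid;             /* log sequence id; defines replay order */
    uint64_t block;            /* destination block index */
    uint8_t data[BLOCK_SIZE];  /* block contents that were written */
};

/*
 * Replay logged writes onto an image. The records must be sorted by
 * ascending lsid. Returns the number of records applied.
 */
size_t apply_log(uint8_t *image, size_t n_blocks,
                 const struct logged_write *log, size_t n_records)
{
    for (size_t i = 0; i < n_records; i++) {
        assert(log[i].block < n_blocks);
        assert(i == 0 || log[i - 1].lsid < log[i].lsid);
        memcpy(image + log[i].block * BLOCK_SIZE, log[i].data, BLOCK_SIZE);
    }
    return n_records;
}
```

Replaying a write whose data already reached the archive is harmless, since it stores the same bytes again; this is why the archive may be copied while the device stays online.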
![Page 13: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/13.jpg)
Read/Write Algorithm (1)
Requests are taken from the RequestQueue.
Write
• Generate logpack header
• Generate request for log device
• Issue the request
• Wait completion
• Generate request for data device
• Issue the request
• Wait completion
• Send completion for upper layer
• Update written_lsid
• Free logpack header buffer
Read
• Generate and issue to the data device
• Wait completion
• Send completion for upper layer
Simple Algorithm
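The ordering in algorithm (1) can be sketched in user space with arrays standing in for the two devices. This is a synchronous toy model under stated assumptions, not the driver: `sim_walb`, `sim_write`, and `sim_read` are hypothetical, each write consumes one lsid, the log does not wrap, and `written_lsid` is taken to mean "all records below this value have reached the data device".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_BLOCKS 8
#define BLOCK_SIZE 4
#define LOG_CAPACITY 16

/* One record on the simulated log device. */
struct sim_log_record {
    uint64_t block;
    uint8_t data[BLOCK_SIZE];
};

/* Hypothetical in-memory stand-in for a WalB device. */
struct sim_walb {
    uint8_t data_dev[N_BLOCKS][BLOCK_SIZE];      /* data device */
    struct sim_log_record log_dev[LOG_CAPACITY]; /* log device */
    uint64_t next_lsid;     /* lsid of the next log record */
    uint64_t written_lsid;  /* records below this are on the data device */
};

/* Write path: log first, then data, then completion bookkeeping. */
int sim_write(struct sim_walb *w, uint64_t block, const uint8_t *buf)
{
    if (block >= N_BLOCKS || w->next_lsid >= LOG_CAPACITY)
        return -1;
    uint64_t lsid = w->next_lsid++;
    /* 1. Write the record to the log device (completes immediately here). */
    w->log_dev[lsid].block = block;
    memcpy(w->log_dev[lsid].data, buf, BLOCK_SIZE);
    /* 2. Only after the log write is done, write to the data device. */
    memcpy(w->data_dev[block], buf, BLOCK_SIZE);
    /* 3. Completion: advance written_lsid past this record. */
    w->written_lsid = lsid + 1;
    return 0;
}

/* Read path: served directly from the data device. */
int sim_read(const struct sim_walb *w, uint64_t block, uint8_t *buf)
{
    if (block >= N_BLOCKS)
        return -1;
    memcpy(buf, w->data_dev[block], BLOCK_SIZE);
    return 0;
}
```

Because the upper layer is notified only after the data write, a read never needs to consult the log, which is what keeps this variant simple.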
![Page 14: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/14.jpg)
Read/Write Algorithm (2)
Requests are taken from the RequestQueue.
Write
• Generate logpack header
• Allocate data buffer and copy data for data device write
• Generate request for log device
• Issue the request
• Wait completion
• Insert to PendingTree
• Send completion for upper layer
• Generate request for data device
• Issue the request
• Wait completion
• Delete from PendingTree
• Free data buffer for data device write
• Update written_lsid
• Free logpack header buffer
Read
• Search PendingTree
– If all data are in the tree, then copy data and send completion for upper layer
– Else generate a request for the missing data and issue the request to the data device
• Wait completion
• If all requests have finished, then send completion for upper layer
Faster but more complex than (1)
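The read path of algorithm (2) must return logged-but-not-yet-written data, since the write was already acknowledged. A minimal sketch follows, assuming whole-block requests and a linear table in place of the PendingTree; `pending_set`, `pending_insert`, `pending_delete`, and `read_block` are hypothetical names, and a real implementation would also handle partial overlaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4
#define MAX_PENDING 8

/* A logged write whose data-device write has not completed yet. */
struct pending_write {
    bool in_use;
    uint64_t lsid;   /* newer writes have larger lsid */
    uint64_t block;
    uint8_t data[BLOCK_SIZE];
};

/* Linear table standing in for the PendingTree (an assumption for brevity). */
struct pending_set {
    struct pending_write slot[MAX_PENDING];
};

/* Insert after log write completion. Returns the slot index, or -1 if full. */
int pending_insert(struct pending_set *ps, uint64_t lsid, uint64_t block,
                   const uint8_t *data)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!ps->slot[i].in_use) {
            ps->slot[i].in_use = true;
            ps->slot[i].lsid = lsid;
            ps->slot[i].block = block;
            memcpy(ps->slot[i].data, data, BLOCK_SIZE);
            return i;
        }
    }
    return -1;
}

/* Delete after the data-device write has completed. */
void pending_delete(struct pending_set *ps, int idx)
{
    assert(idx >= 0 && idx < MAX_PENDING);
    ps->slot[idx].in_use = false;
}

/*
 * Read one block. Returns true if it was served from pending data,
 * false if it had to come from the data device image `data_dev`.
 */
bool read_block(const struct pending_set *ps, const uint8_t *data_dev,
                uint64_t block, uint8_t *out)
{
    const struct pending_write *newest = NULL;
    for (int i = 0; i < MAX_PENDING; i++) {
        const struct pending_write *p = &ps->slot[i];
        if (p->in_use && p->block == block &&
            (newest == NULL || p->lsid > newest->lsid))
            newest = p;
    }
    if (newest != NULL) {
        memcpy(out, newest->data, BLOCK_SIZE);
        return true;
    }
    memcpy(out, data_dev + block * BLOCK_SIZE, BLOCK_SIZE);
    return false;
}
```

When several pending writes target the same block, the one with the largest lsid wins, so a read observes the most recently acknowledged data.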
![Page 15: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/15.jpg)
Parallel Task Processing
[Figure: task pipeline for algorithm (1)]
• Queues: RequestQueue, ReadQueue, LogpackQueue, Logpack Completion Queue, DatapackQueue, Datapack Completion Queue
• Request tasks: select read/write; generate read request; generate logpack
• Read tasks: submit read request, wait completion, send completion
• Log tasks: submit logpack; write logpack (several in parallel); wait completion, generate datapack
• Data tasks: write datapack; wait completion, update written_lsid, send completion
This pipeline is for algorithm (1).
• Data write must start after log write completion
• Send completion for write must be serialized
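The serialization constraint can be sketched as a tracker that accepts data-write completions in any order but releases them to the upper layer strictly in lsid order. This is an illustration only: `completion_tracker`, `tracker_finish`, and `WINDOW` are hypothetical, and one lsid per write is assumed for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WINDOW 16  /* max in-flight writes tracked (illustrative) */

struct completion_tracker {
    uint64_t written_lsid;  /* all writes below this have been completed */
    bool done[WINDOW];      /* done[lsid % WINDOW]: data write finished */
};

/*
 * Mark the write `lsid` as finished on the data device. Returns how many
 * completions can now be sent to the upper layer, in lsid order.
 */
unsigned tracker_finish(struct completion_tracker *t, uint64_t lsid)
{
    assert(lsid >= t->written_lsid && lsid < t->written_lsid + WINDOW);
    t->done[lsid % WINDOW] = true;

    unsigned sent = 0;
    /* Release the contiguous finished prefix starting at written_lsid. */
    while (t->done[t->written_lsid % WINDOW]) {
        t->done[t->written_lsid % WINDOW] = false;
        t->written_lsid++;
        sent++;
    }
    return sent;
}
```

If writes 1 and 2 finish before write 0, nothing is released until write 0 finishes, at which point all three completions go out together and `written_lsid` becomes 3.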
![Page 16: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/16.jpg)
WalB Data Format
• Data device
– The same image as the wrapper block device
• Log device
– Overview
– Snapshot metadata
– Ring buffer
– Logpack
![Page 17: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/17.jpg)
Log Device Format
• Snapshot metadata
– The size is determined at device creation.
• Ring buffer
– Stores write-ahead log.
– The size is determined at device creation.
– The size cannot be changed.
[Figure: log device layout along the address axis, with regions labeled Reserved (PAGE_SIZE), Superblock (SECTOR_SIZE), Snapshot metadata, and Ring buffer]
PAGE_SIZE = 4096 bytes
SECTOR_SIZE = 512 or 4096 bytes
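Because both region sizes are fixed at device creation, the region offsets can be computed once. The sketch below assumes the regions appear in the order reserved page, superblock, snapshot metadata, ring buffer, and that the ring buffer takes all remaining space; the slide does not state either point, and `log_layout` and `log_layout_compute` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WALB_PAGE_SIZE 4096u  /* PAGE_SIZE from the slide */

/* Region offsets and sizes, all in sectors. */
struct log_layout {
    uint64_t superblock_off;
    uint64_t snapshot_off;
    uint64_t ring_off;
    uint64_t ring_size;
};

/*
 * Compute the layout for a log device of `dev_sectors` sectors with
 * `snapshot_sectors` sectors of snapshot metadata.
 * Assumed order: reserved page, superblock (1 sector), snapshot
 * metadata, ring buffer. Returns 0 on success, -1 on invalid input.
 */
int log_layout_compute(uint32_t sector_size, uint64_t dev_sectors,
                       uint64_t snapshot_sectors, struct log_layout *out)
{
    assert(out != NULL);
    if (sector_size != 512 && sector_size != 4096)
        return -1;
    if (snapshot_sectors >= dev_sectors)
        return -1;

    uint64_t reserved = WALB_PAGE_SIZE / sector_size;  /* 8 or 1 sectors */
    uint64_t ring_off = reserved + 1 + snapshot_sectors;
    if (ring_off >= dev_sectors)
        return -1;  /* no room left for the ring buffer */

    out->superblock_off = reserved;
    out->snapshot_off = reserved + 1;
    out->ring_off = ring_off;
    out->ring_size = dev_sectors - ring_off;
    return 0;
}
```

For a 1000-sector device with 512-byte sectors and 4 snapshot sectors, this yields a superblock at sector 8, snapshot metadata at sector 9, and a 987-sector ring buffer starting at sector 13.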
![Page 18: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/18.jpg)
Snapshot Metadata
• Snapshot header contains checksum and allocation bitmap
• Lsid: Log sequence id.
• Snapshot record size is 80 bytes. (6 records in 512 byte sector)
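[Figure: snapshot metadata is a sequence of snapshot sectors; each snapshot sector holds a snapshot header followed by fixed-size snapshot records]
• Snapshot header
– u32 checksum
– u32 bitmap
• Snapshot record (fixed size)
– u64 lsid
– u64 timestamp
– u8[64] name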
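The 80-byte record size follows from the listed fields (8 + 8 + 64). The sketch below checks that at compile time and derives the records-per-sector figure. It assumes the header holds only the two listed fields, that the field order within each structure is as written here, and that the 32-bit bitmap caps a sector at 32 records; none of these is stated on the slide, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Snapshot record: fields from the slide; their order is an assumption. */
struct snapshot_record {
    uint64_t lsid;       /* log sequence id of the snapshot */
    uint64_t timestamp;
    uint8_t name[64];
};
static_assert(sizeof(struct snapshot_record) == 80,
              "snapshot record must be 80 bytes");

/* Snapshot header: assumed to hold only the two fields on the slide. */
struct snapshot_header {
    uint32_t checksum;
    uint32_t bitmap;     /* bit i set: record slot i is allocated */
};

/* Records that fit in one sector, capped by the 32 bitmap bits. */
unsigned records_per_sector(unsigned sector_size)
{
    assert(sector_size >= sizeof(struct snapshot_header));
    unsigned fit = (unsigned)((sector_size - sizeof(struct snapshot_header))
                              / sizeof(struct snapshot_record));
    return fit < 32 ? fit : 32;
}

/* Allocate the first free record slot; returns its index or -1 if full. */
int snapshot_alloc(struct snapshot_header *h, unsigned n_records)
{
    assert(n_records <= 32);
    for (unsigned i = 0; i < n_records; i++) {
        if (!(h->bitmap & (UINT32_C(1) << i))) {
            h->bitmap |= UINT32_C(1) << i;
            return (int)i;
        }
    }
    return -1;
}
```

Under these assumptions a 512-byte sector holds 6 records, matching the slide, while a 4096-byte sector would be limited to 32 by the bitmap width.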
![Page 19: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/19.jpg)
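Ring Buffer
[Figure: the ring buffer begins at start_offset and is divided into SECTOR_SIZE units. Each log pack consists of a logpack header followed by its written data (1st, 2nd, …). One log pack is marked as the log pack with oldest_lsid.]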
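One natural way to address such a ring is to let lsid count sectors and map it to a device offset by modulo. The slide does not spell out the mapping, so the sketch below is an assumption, and `ring_buffer`, `ring_offset_of_lsid`, and `ring_has_room` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ring buffer geometry, in sectors. */
struct ring_buffer {
    uint64_t start_offset;  /* first sector of the ring on the log device */
    uint64_t size;          /* ring capacity in sectors */
    uint64_t oldest_lsid;   /* lsid of the oldest log pack still kept */
};

/* Map an lsid to a sector offset on the log device (assumed modulo mapping). */
uint64_t ring_offset_of_lsid(const struct ring_buffer *rb, uint64_t lsid)
{
    assert(rb->size > 0);
    return rb->start_offset + lsid % rb->size;
}

/*
 * True if a log pack of `n_sectors` starting at `next_lsid` fits without
 * overwriting anything at or after oldest_lsid.
 */
bool ring_has_room(const struct ring_buffer *rb, uint64_t next_lsid,
                   uint64_t n_sectors)
{
    assert(next_lsid >= rb->oldest_lsid);
    return next_lsid - rb->oldest_lsid + n_sectors <= rb->size;
}
```

With a 100-sector ring starting at sector 13, lsid 299 maps to sector 112 and lsid 300 wraps back to sector 13. Advancing `oldest_lsid` after logs are extracted is what frees space for new log packs.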
![Page 20: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/20.jpg)
Logpack Header
[Figure: a log pack is a logpack header followed by the 1st, 2nd, 3rd, … written data. The logpack header holds the fields below followed by one log record per IO.]
Logpack header fields:
u32 checksum;
u16 # of IO;
u16 total IO size;
Log record:
u64 lsid; /* Log sequence id */
u64 offset; /* IO offset by the sector. */
u16 lsid_local; /* local sequence id as the data offset in the log record. */
u16 size; /* IO size by the sector. */
u16 is_exist; /* 0 if this record does not exist. */
u16 reserved1;
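The fields above translate directly into C structures, and `lsid_local` can be filled in as IOs are appended. In this sketch the names `n_records` and `total_io_size` stand for "# of IO" and "total IO size". It further assumes that the header occupies the first sector of the log pack (so the first IO gets `lsid_local` 1), that a record's `lsid` is the log pack's lsid plus `lsid_local`, and that a header holds at most `MAX_RECORDS` records; these are illustrative choices, and the checksum is left uncomputed.

```c
#include <assert.h>
#include <stdint.h>

/* Log record, field for field from the slide (24 bytes). */
struct log_record {
    uint64_t lsid;        /* log sequence id */
    uint64_t offset;      /* IO offset, in sectors */
    uint16_t lsid_local;  /* data offset within the log pack, in sectors */
    uint16_t size;        /* IO size, in sectors */
    uint16_t is_exist;    /* 0 if this record does not exist */
    uint16_t reserved1;
};
static_assert(sizeof(struct log_record) == 24, "log record must be 24 bytes");

#define MAX_RECORDS 16  /* hypothetical per-header limit */

struct logpack_header {
    uint32_t checksum;       /* not computed in this sketch */
    uint16_t n_records;      /* "# of IO" on the slide */
    uint16_t total_io_size;  /* sum of IO sizes, in sectors */
    struct log_record record[MAX_RECORDS];
};

/*
 * Append one IO to the log pack that starts at `logpack_lsid`.
 * Returns the record index, or -1 if the IO does not fit.
 */
int logpack_add_io(struct logpack_header *h, uint64_t logpack_lsid,
                   uint64_t offset, uint16_t size)
{
    if (size == 0 || h->n_records >= MAX_RECORDS)
        return -1;
    if ((uint32_t)h->total_io_size + size + 1u > UINT16_MAX)
        return -1;

    struct log_record *r = &h->record[h->n_records];
    r->lsid_local = (uint16_t)(1u + h->total_io_size);  /* header is sector 0 */
    r->lsid = logpack_lsid + r->lsid_local;
    r->offset = offset;
    r->size = size;
    r->is_exist = 1;
    r->reserved1 = 0;

    h->total_io_size = (uint16_t)(h->total_io_size + size);
    return h->n_records++;
}
```

Appending an 8-sector IO and then a 16-sector IO to a log pack at lsid 1000 gives records with `lsid_local` 1 and 9, and a total IO size of 24 sectors.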
![Page 21: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/21.jpg)
Pros and Cons
| | WalB (redo log) | Snapshot with redo log | Snapshot with undo log |
| --- | --- | --- | --- |
| Read | No overhead | Index search | Bitmap search |
| Write | Write twice (1) | Index modification | Bitmap search/modification and old data copy |
| Typical software | --- | ZFS, BtrFS | LVM |
[Slide label whose cell is not recoverable: "No overhead (2)"]
WalB + Index ~= Block device with snapshot management
![Page 22: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/22.jpg)
Current Progress
• Survey and study of Linux kernel programming
• Design of rough architecture
• Prototype implementation
• Basic evaluation and redesign if required
• Implementation of full functionalities and test
• Operation inside Cybozu
• Publication as GPLv2
• Merging to device-mapper if required
• Merging to main repository (hopefully)
![Page 23: WalB: Block-level WAL. Concept.](https://reader034.fdocuments.in/reader034/viewer/2022042813/54814707b4795979578b489f/html5/thumbnails/23.jpg)
Summary
• Motivation
– No good backup solution covering various applications
• WalB
– Is a block device driver with WAL
– Provides efficient incremental backup