DRBD + Heartbeat + Xen: HA Virtualization
Slides: data.guug.de/slides/lk2008/le_drbd9-lk2008-slides.pdf
„DRBD 9“
Linux Storage Replication
Lars Ellenberg
LINBIT HA Solutions GmbH, Vienna, Austria
What this talk is about
• What is replication
• Why block level replication
• Why replication
• What do we have to deal with
• How we are dealing with it now
• Where development is headed
Linux Storage Replication
Replication Basics
DRBD 8 Overview
DM-Replicator
DRBD 9
Other Ideas
Replication Basics
Standalone Servers
[Diagram: Node 1, Node 2, Node 3, labeled "Important Systems"]
• No System Level Redundancy
• Vulnerable to Failures
Application Level Replication
[Diagram: Node 1 and Node 3, labeled "Important Systems", each running an App]
• Special Purpose Solution
• Difficult to add to an application after the fact
Filesystem Level Replication
[Diagram: Node 1 and Node 3, labeled "Important Systems", each with an FS]
• Special Filesystem
• Complex
• Replicate on dirty?
• ... on writeout?
• ... on close?
• What about metadata?
• Resilience?
Shared Storage (SAN)
[Diagram: Node 1, Node 2, Node 3, labeled "Important Systems", attached via FC or iSCSI to a Shared Storage/SAN holding the shared data]
• No Storage Redundancy
Replication capable SAN
[Diagram: Node 1, Node 2, Node 3, labeled "Important Systems", attached via FC or iSCSI to a Shared Storage/SAN holding the shared data, with a second Shared Storage/SAN as Replica]
• Application agnostic
• Expensive Hardware
• Expensive License costs
Block Level Replication
[Diagram: Cluster of Node 1 and Node 2, linked by DRBD]
• Storage Redundancy
• Application Agnostic
• Generic
• Flexible
SAN Replacement Storage Cluster
[Diagram: Storage Cluster of Node 1 and Node 2, linked by DRBD, serving Node 1, Node 2, Node 3, labeled "Important Systems", via iSCSI]
• Storage Redundancy
• Application Agnostic
• Generic
• Flexible
DRBD 8 Overview
How it works: Normal operation
[Diagram: Application, Primary Node, Secondary Node, each node with its Data blocks; arrows labeled Read I/O, Write I/O, Replicate, Acknowledge]
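The replicate/acknowledge flow of normal operation can be sketched as a toy model in Python. Everything here is illustrative: the class names `Node` and `ReplicatedDevice` are made up, dictionaries stand in for block devices, and completing a write only after the peer has acknowledged it is an assumption that corresponds to fully synchronous replication.

```python
class Node:
    """A toy block device: block number -> data (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block_no, data):
        self.blocks[block_no] = data
        return True  # acknowledge the write


class ReplicatedDevice:
    """Writes go to both nodes; reads are served by the primary only."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block_no, data):
        local_ok = self.primary.write(block_no, data)
        # "Replicate" to the peer, then wait for its "Acknowledge".
        peer_ack = self.secondary.write(block_no, data)
        # Assumption: completion is reported only when both copies exist.
        return local_ok and peer_ack

    def read(self, block_no):
        return self.primary.blocks.get(block_no)


dev = ReplicatedDevice(Node("node1"), Node("node2"))
dev.write(0, b"hello")
```

After the write returns, both nodes hold the same block, so the secondary could take over without data loss.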
How it works: Primary Node Failure
[Diagram: Application, Primary Node, Secondary Node, Data blocks; after the failure the remaining node is labeled Primary Node and serves the Application's Read I/O and Write I/O from its own Data blocks]
How it works: Secondary Node Failure
[Diagram: the Secondary Node is shown as Offline Node; the Primary Node keeps serving the Application's Read I/O and Write I/O from its Data blocks]
How it works: Secondary Node Recovery
[Diagram: Application, Primary Node, Secondary Node, each node with its Data blocks; arrows labeled Read I/O, Resync, Acknowledge]
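Secondary failure and recovery can be sketched together: while the peer is offline the primary remembers which blocks changed, and on recovery only those blocks are resynced. This is a minimal Python model, not DRBD code. The class `MirroredVolume` and its method names are hypothetical, and a plain `set` stands in for the on-disk change-tracking bitmap.

```python
class MirroredVolume:
    """Toy two-node mirror with dirty-block tracking (hypothetical model)."""

    def __init__(self):
        self.primary = {}             # block number -> data
        self.secondary = {}
        self.secondary_online = True
        self.dirty = set()            # blocks the secondary is missing

    def write(self, block_no, data):
        self.primary[block_no] = data
        if self.secondary_online:
            self.secondary[block_no] = data   # normal replication
        else:
            self.dirty.add(block_no)          # remember for later resync

    def secondary_fails(self):
        self.secondary_online = False

    def secondary_recovers(self):
        """Resync only the blocks written while the peer was offline."""
        resynced = sorted(self.dirty)
        for block_no in resynced:
            self.secondary[block_no] = self.primary[block_no]
        self.dirty.clear()
        self.secondary_online = True
        return resynced


vol = MirroredVolume()
vol.write(0, b"a")
vol.secondary_fails()
vol.write(1, b"b")
vol.write(1, b"c")                    # overwrites cost nothing extra
resynced_blocks = vol.secondary_recovers()
```

Tracking block numbers rather than the writes themselves keeps the bookkeeping small, at the price of losing write ordering during the resync.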
What if ...
• We want an additional replica for disaster recovery
- we can stack DRBD
• The latency to the remote site is too high
- stack DRBD for local redundancy, run the high latency link in asynchronous mode, add buffering and compression with DRBD proxy
• Primary node/site fails during resync
- Snapshot before becoming sync target
It Works.
• Though it may be ugly.
• Can we do better?
DM-Replicator
Generic Replication Framework
• Track Data changes
- Persistent (on Disk) Data Journal
- “global” write ordering over multiple volumes
- Fallback to bitmap based change tracking
• Multi-node.
- many “site links” feed from the journal
• Flexible Policy
- When to report completion to upper layers
- (when to) do fallback to bitmap
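The journal, site link, and fallback ideas above can be sketched as a small in-memory model. This is a hedged illustration, not the DM-Replicator design: the names `SiteLink`, `ReplicationLog`, `capacity`, and `ship` are made up, the journal lives in memory rather than on disk, and the fallback policy (a link falls back as soon as the journal evicts an entry it has not shipped) is just one possible choice.

```python
from collections import deque


class SiteLink:
    """One remote site feeding from the journal (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.next_seq = 0   # next journal entry this link must ship
        self.dirty = None   # set of (volume, block) after fallback, else None


class ReplicationLog:
    """Bounded journal with one global write order over all volumes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()   # (seq, volume, block, data)
        self.next_seq = 0
        self.links = []

    def add_link(self, link):
        link.next_seq = self.next_seq
        self.links.append(link)

    def write(self, volume, block, data):
        self.entries.append((self.next_seq, volume, block, data))
        self.next_seq += 1
        for link in self.links:
            if link.dirty is not None:          # already in fallback mode
                link.dirty.add((volume, block))
        while len(self.entries) > self.capacity:
            self._evict_oldest()

    def _evict_oldest(self):
        oldest_seq = self.entries[0][0]
        for link in self.links:
            if link.dirty is None and link.next_seq <= oldest_seq:
                # The link still needs this entry: fall back to tracking
                # changed blocks, which loses the write ordering.
                link.dirty = {(v, b) for (s, v, b, _) in self.entries
                              if s >= link.next_seq}
        self.entries.popleft()

    def ship(self, link, max_entries):
        """Return the next journal entries for a link, in write order."""
        if link.dirty is not None:
            return []                           # needs a resync instead
        batch = [e for e in self.entries if e[0] >= link.next_seq]
        batch = batch[:max_entries]
        if batch:
            link.next_seq = batch[-1][0] + 1
        return batch
```

A fast link keeps reading ordered entries from the journal, while a slow link that overruns the journal ends up with a set of changed blocks to resync.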
Current „default“ reference implementation
• Only talks to “dumb” block devices
• “Software RAID1”, allowing some legs to lag behind
• No concept of “data generation”
• Cannot communicate metadata
• Not directly suitable for failover solutions
• Primary objective: cut down on “hardware” replication licence costs, replicate SAN-LUNs in software to disaster recovery sites.
DRBD 9
Replicating smarter, asynchronous
• Detect and discard overwrites
- shipped batches must be atomic
• Compress
• Compress XOR-diff
• Side effects
- Can be undone
- Checkpointing of generic block data
- Point in time recovery
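The points above can be illustrated with a short Python sketch: writes to the same block are coalesced, each surviving block is shipped as a compressed XOR diff against its old content, and because XOR is its own inverse, applying the same batch a second time undoes it. The function names and the fixed block size are assumptions for the example, and all blocks are assumed to have equal length.

```python
import zlib


def xor_bytes(a, b):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def build_batch(device, writes):
    """Coalesce writes [(block_no, data), ...] into one batch.

    Later writes to a block discard earlier ones. Each surviving block is
    stored as a compressed XOR diff against the block's current content.
    """
    latest = {}
    for block_no, data in writes:
        latest[block_no] = data           # an overwrite replaces the entry
    return {block_no: zlib.compress(xor_bytes(device[block_no], new))
            for block_no, new in latest.items()}


def apply_batch(device, batch):
    """Apply a batch to a device; applying it again undoes it.

    The batch must be applied as a whole: intermediate writes were
    discarded, so a partially applied batch matches no point in time.
    """
    for block_no, payload in batch.items():
        device[block_no] = xor_bytes(device[block_no],
                                     zlib.decompress(payload))


BLOCK = 16                                 # hypothetical block size
replica = [bytes(BLOCK) for _ in range(4)]
writes = [(1, b"a" * BLOCK), (2, b"b" * BLOCK), (1, b"c" * BLOCK)]
batch = build_batch(replica, writes)       # two blocks, not three writes
apply_batch(replica, batch)
```

Keeping applied batches around is what makes checkpointing and point in time recovery possible in this model: replaying them in reverse order walks the data set back.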
Replicating smarter, synchronous
• Identify a certain Data Set Version
• Start from scratch
• continuous stream of changes
• Data Generation Tags, dagtag
- which clone (node name)
- which volume (label)
- who modified it last (committer)
- modification date (position in the change stream)
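One way to picture a dagtag is as a small immutable record with the four fields listed above. The Python sketch below is an interpretation of the slide, not the DRBD 9 data format: the class `DagTag`, its field names, and the methods `applied` and `in_sync_with` are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DagTag:
    node: str        # which clone (node name)
    volume: str      # which volume (label)
    committer: str   # who modified it last
    position: int    # position in the committer's change stream

    def applied(self, committer, position):
        """Tag of this clone after applying changes up to `position`."""
        return replace(self, committer=committer, position=position)

    def in_sync_with(self, other):
        """Same volume, same committer, same point in the change stream."""
        return ((self.volume, self.committer, self.position) ==
                (other.volume, other.committer, other.position))


primary = DagTag(node="alice", volume="r0", committer="alice", position=42)
replica = DagTag(node="bob", volume="r0", committer="alice", position=40)
caught_up = replica.applied("alice", 42)
```

The node name identifies the clone and stays fixed, while committer and position together identify the data set version that clone currently holds.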
Colorful Replication Stream
[Diagram labels: Primary Node Changes; atomic batch discarding overwrites; Data Set Divergence]
Advantages of the Data Generation Tag scheme
• On handshake, exchange dagtags
- Trivially see who has the best data, even on primary site failure with multiple secondaries possibly lagging behind
• Communicate dagtags with atomic (compressed, xor-diff) batches
- allows for daisy chaining
• keep dagtag and batch payload
- Checkpointing: just store the dagtag.
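The handshake idea can be sketched as follows: after the primary site fails, the surviving secondaries exchange their tags and the one furthest along the change stream has the best data. This is a simplified Python illustration. The function `best_replica` and the `(committer, position)` tuple layout are assumptions, and the sketch simply reports divergence when the tags name different committers instead of trying to resolve it.

```python
def best_replica(tags):
    """Pick the node with the most recent data.

    tags maps node name -> (committer, position in the change stream).
    """
    committers = {committer for committer, _ in tags.values()}
    if len(committers) != 1:
        # Different last committers: the data sets have diverged.
        raise ValueError("data sets diverged, manual decision needed")
    return max(tags, key=lambda node: tags[node][1])


def lag(tags, node):
    """How many changes `node` is behind the best replica."""
    best = best_replica(tags)
    return tags[best][1] - tags[node][1]


# Hypothetical state after the primary "alice" failed.
survivors = {"bob": ("alice", 120), "charlie": ("alice", 97)}
winner = best_replica(survivors)
```

Because a tag travels with every batch, a node can also forward batches it has received to the next node, which is what makes daisy chaining workable in this model.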
Other Ideas
Stretched cluster file systems?
• Multiple branch offices
• One cluster filesystem
• Latency would make it unusable
• But when
- keeping leases and
- inserting lock requests into the replication data stream
- while having mostly self-contained access in the branch offices
• It may feel like low latency most of the time, with occasional longer delays on access.
• Tell me why I'm wrong :-)
Comments?
http://www.linbit.com
http://www.drbd.org
If you think you can help,
we are hiring!