Introduction to DRBD
Sudoers Barcelona, October 2013
Alba Ferrer
What is it?
Distributed Replicated Block Device
Software-based, shared-nothing replicated storage solution mirroring the contents of block devices
• In real time
• Transparently
• Synchronously/asynchronously
Kernel module
User space admin tools
• drbdsetup
  • Used to configure the kernel module
  • All parameters are passed on the command line
• drbdmeta
  • Create/dump/restore/modify DRBD metadata
• drbdadm
  • High-level frontend for drbdsetup/drbdmeta
  • Reads from /etc/drbd.conf
  • Has a dry-run option (-d)
Resources
• A particular replicated storage device
  • Resource name
  • DRBD device: virtual block device (major=147). The associated block device is always /dev/drbdm (m = minor number)
  • Disk configuration: local copy of the data
  • Network configuration: communication with the peer
Configuration
Per resource (/etc/drbd.d/mysql.res):
resource mysql {
  device minor 0; # /dev/drbd0
  disk /dev/sdb;
  meta-disk internal;
  on alice {
    address 192.168.133.111:7000;
  }
  on bob {
    address 192.168.133.112:7000;
  }
  syncer {
    rate 10M; # static resync rate of 10 MByte/s
  }
}
Configuration
Global (/etc/drbd.d/global_common.conf):
global {
  usage-count yes;
}
common {
  protocol C;
  disk {
    on-io-error detach;
  }
  syncer {
    al-extents 3833;
  }
}
Resource roles
• Primary: read and write operations
• Secondary: receives updates from the primary, disallows any other access
• Promotion: from secondary to primary
  drbdadm primary all
• Demotion: from primary to secondary
  drbdadm secondary all
Modes
• Single-primary
• Dual-primary (>= 8.0)
• Replication modes:
  • Protocol A: asynchronous
  • Protocol B: memory synchronous
  • Protocol C: synchronous
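The three replication protocols differ in the point at which a write on the primary is reported as complete. The following is a minimal Python sketch of that rule, assuming the usual DRBD semantics: protocol A needs the local disk write plus the packet in the local TCP send buffer, B needs the local disk write plus the packet having reached the peer, and C needs both the local and the remote disk write. `WriteState` and `is_write_complete` are hypothetical names for illustration, not part of DRBD.

```python
from dataclasses import dataclass


@dataclass
class WriteState:
    """Progress of one write request (hypothetical model, not a DRBD structure)."""
    local_disk_done: bool   # local backing device finished the write
    in_send_buffer: bool    # replication packet placed in the local TCP send buffer
    peer_received: bool     # replication packet reached the peer node
    peer_disk_done: bool    # peer finished writing to its own disk


def is_write_complete(protocol: str, w: WriteState) -> bool:
    """Return True once the primary may report the write as complete."""
    if protocol == "A":   # asynchronous
        return w.local_disk_done and w.in_send_buffer
    if protocol == "B":   # memory synchronous
        return w.local_disk_done and w.peer_received
    if protocol == "C":   # synchronous
        return w.local_disk_done and w.peer_disk_done
    raise ValueError(f"unknown protocol: {protocol!r}")


# A write that has reached the peer's memory but not yet the peer's disk.
w = WriteState(local_disk_done=True, in_send_buffer=True,
               peer_received=True, peer_disk_done=False)
for p in "ABC":
    print(p, is_write_complete(p, w))
```

In this state the write already counts as complete under protocols A and B, but not under C, which is why C is the choice when no acknowledged write may be lost on failover.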
Features: efficient synchronization
• Synchronization != replication
• Inconsistent remote dataset during sync
  • The remote copy is unusable until synchronization completes
  • Service in active node unaffected
• Synchronization and replication happen at the same time
Features: efficient synchronization
• Several successive writes to the same block on the active node are synchronized with a single write
• Linear access to blocks
• Configurable synchronization rate
• Checksum-based synchronization
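With a fixed synchronization rate, the expected duration of a resync can be estimated as the amount of out-of-sync data divided by the rate. The sketch below illustrates that arithmetic in Python; the function name and the 4096 MB figure are hypothetical, and the 10 MB/s rate is taken from the `rate 10M` setting in the resource configuration.

```python
def estimated_sync_seconds(out_of_sync_mb: float, rate_mb_per_s: float) -> float:
    """Estimate resync time as (data to synchronize) / (configured rate)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return out_of_sync_mb / rate_mb_per_s


# Hypothetical example: 4096 MB out of sync, resync rate of 10 MB/s.
seconds = estimated_sync_seconds(4096, 10)
print(f"{seconds:.1f} s (about {seconds / 60:.1f} min)")
```

This is only an upper-level estimate: it ignores checksum-based synchronization, which can skip blocks that already match on both nodes.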
Features: data verification
• On-line device verification
  • Block-by-block data integrity check between nodes
• Replication traffic integrity checking
  • End-to-end message integrity checking using cryptographic message digest algorithms
Features: disk
• Support for disk flushes
• Disk error handling strategies
  • Passing
  • Masking
  • DIY
• Deal with outdated data
  • DRBD won't promote an outdated resource -> fencing
Features: replication
• Three-way replication
• Long distance replication with DRBD Proxy
  • Not free
• Truck based replication
Split-brain
Split brain is a situation where, due to temporary failure of all network links between cluster nodes, and possibly due to intervention by a cluster management software or human error, both nodes switched to the primary role while disconnected.
Split-brain
• Configurable notifications
• Automatic recovery methods
  • Discard modifications on 'younger' primary.
  • Discard modifications on 'older' primary.
  • Discard modifications on primary with fewer changes.
  • Graceful recovery if one primary had no changes.
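The automatic recovery methods each pick one node whose modifications are thrown away. The Python sketch below is a toy model of that choice, not DRBD's implementation: the `Node` fields and `pick_victim` are hypothetical, and the policy strings are assumed to mirror the `after-sb-0pri` option values in drbd.conf.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    primary_since: float   # when the node became primary (hypothetical field)
    changed_blocks: int    # blocks modified while disconnected (hypothetical field)


def pick_victim(policy: str, a: Node, b: Node) -> Optional[Node]:
    """Return the node whose modifications are discarded, or None if undecided.

    Ties are broken arbitrarily in favour of discarding `b` in this sketch.
    """
    if policy == "discard-younger-primary":
        return a if a.primary_since > b.primary_since else b
    if policy == "discard-older-primary":
        return a if a.primary_since < b.primary_since else b
    if policy == "discard-least-changes":
        return a if a.changed_blocks < b.changed_blocks else b
    if policy == "discard-zero-changes":
        # Graceful only when exactly one side made no changes.
        if a.changed_blocks == 0 and b.changed_blocks > 0:
            return a
        if b.changed_blocks == 0 and a.changed_blocks > 0:
            return b
        return None
    raise ValueError(f"unknown policy: {policy!r}")


alice = Node("alice", primary_since=100.0, changed_blocks=50)
bob = Node("bob", primary_since=200.0, changed_blocks=5)
for policy in ("discard-younger-primary", "discard-older-primary",
               "discard-least-changes", "discard-zero-changes"):
    victim = pick_victim(policy, alice, bob)
    print(policy, "->", victim.name if victim else "no automatic decision")
```

Here bob became primary later and changed fewer blocks, so he is the victim under the "younger" and "fewer changes" policies, alice under the "older" policy, and the zero-changes policy cannot decide because both sides wrote data.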
Metadata
• DRBD stores various pieces of information about the data it keeps in a dedicated area
  • The size of the DRBD device
  • The generation identifier
  • The activity log
  • The quick-sync bitmap
Metadata
• Can be stored internally or externally
• Size
root@bob:~ # blockdev --getsz /dev/drbd0
8388280
ceil(8388280 / 2^18) * 8 + 72 = 328 sectors
328 sectors = 0.16 MB
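The size calculation above can be checked with a few lines of Python. This is a sketch of the same formula, with the quotient rounded up to a whole number before multiplying; the function name is hypothetical and a sector is assumed to be 512 bytes, as reported by `blockdev --getsz`.

```python
SECTOR_BYTES = 512  # blockdev --getsz reports 512-byte sectors


def internal_metadata_sectors(device_sectors: int) -> int:
    """Metadata size in sectors: ceil(Cs / 2^18) * 8 + 72."""
    chunks = (device_sectors + 2**18 - 1) // 2**18  # integer ceiling division
    return chunks * 8 + 72


sectors = internal_metadata_sectors(8388280)
print(sectors)                                    # 328
print(round(sectors * SECTOR_BYTES / 2**20, 2))   # 0.16 (MB)
```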
What it’s not/What it can’t do
• It’s not a backup system
• It can’t add features to upper layers
  • DRBD cannot auto-detect file system corruption
  • DRBD cannot add active-active clustering capability to file systems like ext3 or XFS.
Limitations
• Only two nodes
  • Stacked resources
  • Version 9
• There is no automatic failover.
• Promotion/demotion is manual.
• Needs a CRM (cluster resource manager) to be useful
PACEMAKER FTW
Operation
root@alice:/etc/drbd.d # cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 234a142f7cf5bb21ffa1e95afa4f31608089c8b8 build by buildsystem@linbit, 2012-09-12 14:27:28
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:152 nr:4 dw:156 dr:4017 al:5 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
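The status line in the /proc/drbd output packs the connection state (cs), the local/peer roles (ro) and the local/peer disk states (ds) into one row. A small Python sketch for pulling those fields out, assuming the 8.3-style layout shown above; `STATUS_RE` and `parse_status` are hypothetical helpers, not part of the DRBD tools.

```python
import re

# Matches e.g. "0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate"
STATUS_RE = re.compile(
    r"(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+ro:(?P<ro_local>[^/\s]+)/(?P<ro_peer>\S+)"
    r"\s+ds:(?P<ds_local>[^/\s]+)/(?P<ds_peer>\S+)"
)


def parse_status(text: str):
    """Return a dict of status fields for the first resource found, or None."""
    m = STATUS_RE.search(text)
    return m.groupdict() if m else None


sample = " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"
status = parse_status(sample)
print(status["cs"], status["ro_local"], status["ro_peer"],
      status["ds_local"], status["ds_peer"])
```

A healthy pair shows `Connected`, one `Primary` and one `Secondary` (in single-primary mode), and `UpToDate` on both sides, which is the state captured in the sample.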
More info
• drbd.org
• www.drbd.org/home/mailinglists
• www.linbit.com