www.galeracluster.com
Agenda
● MariaDB Galera Cluster
● Release 3.0 Features
  ● WAN Replication
  ● MySQL Replication Support
● Towards Galera 4
  ● Intelligent Donor Selection
  ● Cluster Crash Recovery
  ● Inconsistency Voting
  ● Huge Transaction Support
  ● Non Blocking DDL
MariaDB Galera Cluster
[Diagram: Clients connect to three MariaDB nodes; each node sits on the WSREP API, on top of the Galera Replication Plugin]
➔ Synchronous Replication
➔ Multi-Master
➔ Reads & Writes to Any Node
➔ Automatic Node Provisioning
➔ No Lost Transactions
➔ No Slave Lag
➔ Scalability
➔ Works in LAN / WAN / Cloud
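Galera Cluster
[Diagram: the DBMS, with wsrep hooks, loads the Galera Plugin (the wsrep provider) via dlopen through the wsrep Replication API; inside the plugin: replication, certification, and the GCS framework with vsbes, gcomm and spread backends]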
Replication Plugin
Replication plugin is runtime loadable
● SET GLOBAL wsrep_provider=none;
● SET GLOBAL wsrep_provider='/usr/lib/libgalera_smm.so';
With no replication plugin specified, the server works as a vanilla MariaDB server
MariaDB 10.1 will have the wsrep API built in
Galera 3.0
Released Nov 2013, featuring:
● Optimization for WAN replication
  ● Cluster can be divided into segments based on location
● Asynchronous replication topologies
  ● Async replication can be interleaved with Galera replication
  ● Support for MySQL 5.6 GTID
● New write set key format
  ● Makes certification faster, takes less RAM
  ● A step towards huge transaction support
● A number of bug fixes and minor improvements
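To see why segments help over a WAN, here is a minimal sketch that counts how many copies of one write set cross segment boundaries. It assumes that, with segments configured, the sender ships one copy to each remote segment and a node there relays it locally; that relay behavior, the function name `wan_messages` and the node names are assumptions of this sketch, not taken from the slides.

```python
from collections import defaultdict


def wan_messages(node_segments, sender, use_segments):
    """Count write-set copies that cross segment boundaries for one transaction.

    node_segments maps node name -> segment id (hypothetical layout).
    """
    sender_segment = node_segments[sender]
    remote = defaultdict(list)
    for node, segment in node_segments.items():
        if segment != sender_segment:
            remote[segment].append(node)
    if use_segments:
        # Assumption: one copy per remote segment, relayed locally from there.
        return len(remote)
    # Without segments: one copy per remote node.
    return sum(len(nodes) for nodes in remote.values())


cluster = {"a1": 0, "a2": 0, "b1": 1, "b2": 1, "b3": 1, "c1": 2}
flat = wan_messages(cluster, "a1", use_segments=False)
segmented = wan_messages(cluster, "a1", use_segments=True)
```

In this made-up six-node layout the sender in segment 0 has four remote nodes but only two remote segments, so the segmented count is the smaller one.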
Towards 4.0
1. 3.x Improvements
● We will have a constant backlog of 3.0 issues to sort out while 4.0 is under development
2. Test system development
● MySQL test suite integration
3. New Features
● Some new features going in for 3.* releases
● Major uplift for 4.0, feature set fixed
● Feedback from community & partners
New Features...
Galera 4.0
● Non Blocking DDL
● Huge transactions by streaming replication
● Inconsistency voting
Galera 3.*
● Intelligent Donor selection
● Cluster crash recovery
Better Donor Selection Support
● In 3.0, the SST donor was selected at random
● New SST “handshake” makes an intelligent donor choice:
  ● Favor a donor which can provide IST
  ● Favor proximity (segment)
  ● Introduced in Galera 3.6
● SST donor can still be forced by wsrep_sst_donor
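The preference order above can be sketched as a small ranking function. This is an illustration only, not Galera's handshake: the `Candidate` type, the `choose_donor` name, and the choice to rank IST capability ahead of proximity are assumptions, and the `forced` parameter merely stands in for wsrep_sst_donor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    segment: int    # location segment the node belongs to
    can_ist: bool   # True if the node can serve the joiner by IST


def choose_donor(candidates, joiner_segment, forced=None):
    """Pick a donor: a forced name wins, then IST capability, then same segment."""
    if forced is not None:
        for candidate in candidates:
            if candidate.name == forced:
                return candidate
        raise ValueError(f"forced donor {forced!r} is not available")
    if not candidates:
        raise ValueError("no donor candidates")
    # Tuples compare element by element and True > False,
    # so IST capability is decided first and proximity breaks ties.
    return max(candidates, key=lambda c: (c.can_ist, c.segment == joiner_segment))


nodes = [
    Candidate("a", segment=0, can_ist=False),
    Candidate("b", segment=1, can_ist=True),
    Candidate("c", segment=0, can_ist=True),
]
```

With these made-up nodes, a joiner in segment 0 gets donor "c" (IST-capable and nearby), while `forced="a"` overrides the ranking.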
Engine Room Power Out
[Diagram: Node A, Node B and Node C, with "service mysql start" issued on two of the nodes]
Cluster Crash Recovery
● Use case: engine room power out
● If all nodes shut down:
  ● A new cluster must be started and the first node elected
  ● This is a manual operation (error prone)
  ● Other nodes can join back automatically, either through IST or SST
Cluster Crash Recovery
● Configure automatic crash recovery:
  ● pc.recovery=ON
● Nodes maintain the group information in persistent storage
● After shutdown, the full group can start with the same configuration
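The idea of persisting group information can be sketched as follows. The JSON file format, the function names, and the rule that the group is restored only once every saved member is reachable again are all assumptions of this sketch; the slides do not describe Galera's on-disk format or its exact restore condition.

```python
import json
import os


def save_view(path, members):
    """Persist the current group membership (hypothetical file format)."""
    with open(path, "w") as f:
        json.dump(sorted(members), f)


def can_restore(path, reachable):
    """Return True once every member of the saved group is reachable again."""
    if not os.path.exists(path):
        return False  # nothing was saved, so manual bootstrap is still needed
    with open(path) as f:
        saved = set(json.load(f))
    return saved <= set(reachable)
```

A node would call `save_view` whenever membership changes and `can_restore` after a restart, as peers come back one by one.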
Huge Transaction Support
● Currently, a transaction is processed on the master node until commit time
● For large transactions, the write set will be big and hard to handle
● Maximum supported writeset size: 2GB
● There are means to prevent too large transactions
  ● wsrep_max_ws_rows
  ● wsrep_max_ws_size (not enforced atm)
  ● wsrep_provider_options='repl.max_ws_size=#'
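As a rough picture of how such limits act as a gate before replication, here is a minimal sketch. The function name, the reading of "2GB" as 2 GiB, and the convention that a row limit of 0 means "no limit" are assumptions of this sketch rather than documented behavior.

```python
TWO_GB = 2 * 1024**3  # the "2GB" maximum above, read here as 2 GiB (assumption)


def check_write_set(rows, size_bytes, max_ws_rows=0, max_ws_size=TWO_GB):
    """Return None if the write set is acceptable, else a short reason string.

    max_ws_rows=0 is treated as "no row limit" (an assumption of this sketch).
    """
    if size_bytes > TWO_GB:
        return "exceeds maximum supported write set size"
    if size_bytes > max_ws_size:
        return "exceeds max_ws_size"
    if max_ws_rows and rows > max_ws_rows:
        return "exceeds max_ws_rows"
    return None
```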
Huge Transaction Replication
[Diagram sequence: Node A and Node B joined by Galera Replication. A huge transaction (huge trx) runs on Node A; at commit its write set (WS) is replicated to Node B; further write sets (WS) queue up in the slave queue]
Huge Transaction Support
● Galera Cluster 4.0 implements Streaming Replication:
  ● Possible to replicate transactions of any size
  ● Transaction size limits will remain; the cluster can still reject too large transactions
Streaming Replication
[Diagram sequence: Node A and Node B joined by Galera Replication. The huge transaction (huge trx) on Node A is sent to Node B as write set (WS) fragments while it is still running, ending with commit]
Streaming Replication
● Transaction is replicated in small increments
● Size threshold for replication is configurable
● Replicated rows are locked in all cluster nodes
  ➔ later transactions cannot conflict with them
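The three points above can be modeled with a toy simulation: a transaction is cut into fragments, each fragment is applied on every node, and the rows it touched stay locked there until commit. Everything here (`StreamNode`, `stream`, counting the threshold in rows) is a hypothetical illustration, not Galera's implementation.

```python
def fragments(rows, fragment_size):
    """Yield successive slices of at most fragment_size rows."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    for start in range(0, len(rows), fragment_size):
        yield rows[start:start + fragment_size]


class StreamNode:
    """Toy cluster node that tracks rows locked by streaming transactions."""

    def __init__(self, name):
        self.name = name
        self.locked = {}  # row key -> id of the streaming trx holding it

    def apply_fragment(self, trx_id, fragment):
        for key in fragment:
            self.locked[key] = trx_id

    def try_local_write(self, key):
        """A local write succeeds only if no streaming trx holds the row."""
        return key not in self.locked

    def commit(self, trx_id):
        # Release every row held by the committed transaction.
        self.locked = {k: t for k, t in self.locked.items() if t != trx_id}


def stream(trx_id, rows, fragment_size, nodes):
    """Replicate rows fragment by fragment; return the number of fragments sent."""
    count = 0
    for fragment in fragments(rows, fragment_size):
        for node in nodes:
            node.apply_fragment(trx_id, fragment)
        count += 1
    return count
```

For example, streaming 25 rows with a threshold of 10 produces three fragments, and a local write on Node B to any of those rows is refused until the huge transaction commits.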
Streaming Replication
[Diagram: Node A and Node B joined by Galera Replication, with the huge transaction (huge trx), a write set (WS) fragment, and a concurrent "Update t1....." statement]
Inconsistency Voting
● Current policy for inconsistency:
  ● For suspected inconsistency, a cluster node will do an emergency shutdown
  ● (However, DDL failures are logged only as warnings)
  ● Injected inconsistency in one node can cause all other nodes to shut down
Inconsistency Voting
[Diagram sequence: Node A, Node B and Node C joined by Galera Replication]
1. CREATE TABLE t1 (i INT) is replicated; t1 exists on all three nodes
2. On one node: SET wsrep_on=OFF; INSERT INTO t1 VALUES (8); row 8 now exists on that node only
3. On the same node: SET wsrep_on=ON; DELETE FROM t1; the delete of row 8 (Del 8) is replicated to the other two nodes, which do not have the row
Inconsistency Voting
● Galera Cluster 4.0 will minimize downtime due to suspected inconsistency
● Nodes will communicate through an inconsistency voting protocol if inconsistency is observed
● Target is to shut down a minimal number of nodes
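One way to picture such a vote is a simple majority count over each node's result of applying the same write set. The slides do not specify the protocol, so the majority rule, the tie handling and the `vote` function below are assumptions of this sketch.

```python
from collections import Counter


def vote(results):
    """Decide which nodes leave, given each node's outcome for one write set.

    results maps node name -> outcome string. Returns (winning outcome,
    sorted list of nodes to shut down). On a tie this sketch refuses to
    decide and returns (None, []).
    """
    if not results:
        raise ValueError("no votes")
    ranked = Counter(results.values()).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None, []
    winner = ranked[0][0]
    losers = sorted(node for node, outcome in results.items() if outcome != winner)
    return winner, losers


# Hypothetical outcomes: only Node A had row 8, so the delete
# succeeds there and fails on the other two nodes.
outcome, to_shut_down = vote({"A": "ok", "B": "row not found", "C": "row not found"})
```

Under this rule only Node A, the node holding the injected row, would be shut down, instead of Nodes B and C both stopping.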
Inconsistency Voting
[Diagram sequence: Node A, Node B and Node C each hold t1; row 8 exists only in Node A's t1. After SET wsrep_on=ON; DELETE FROM t1, the delete (Del 8) reaches the other nodes and the nodes run inconsistency voting]
Non-Blocking DDL
Current DDL replication blocks the whole cluster for the duration of DDL statement processing
Galera Cluster 4.0 optimizes DDL replication (TOI, Total Order Isolation) to lock only the affected table
Non-Blocking DDL - TOI
[Diagram sequence: ALTER TABLE t1 is issued and applied as "ALTER t1" on both Node A and Node B; concurrent UPDATE t1 and UPDATE t3 statements are shown, with write sets (WS) ordered by seqno]
Non Blocking DDL
● Affected table is locked in all cluster nodes
● This table lock is a native MySQL lock; Galera is not adding replication-level locks anymore
● Other tables are accessible to everybody
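The effect of locking only the affected table can be shown with a toy model: a DDL on t1 blocks writes to t1 on every node while writes to t3 go through. `DdlNode`, `begin_ddl` and `end_ddl` are hypothetical names for this sketch and say nothing about how the native table lock is actually taken.

```python
class DdlNode:
    """Toy node that tracks which tables are locked by a running DDL."""

    def __init__(self, name):
        self.name = name
        self.ddl_locked = set()

    def can_write(self, table):
        return table not in self.ddl_locked


def begin_ddl(nodes, table):
    """Lock only the affected table, on every node."""
    for node in nodes:
        node.ddl_locked.add(table)


def end_ddl(nodes, table):
    """Release the table once the DDL has finished everywhere."""
    for node in nodes:
        node.ddl_locked.discard(table)
```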
Huge Transaction Demo Setup
1. Two nodes
2. Steady load of pure autocommit updates to measure trx throughput
3. A huge table with ~1.5M rows
4. Run update on huge table to modify all rows
→ monitor trx/sec rate in the cluster when the huge transaction kicks in
Impact of Huge Transaction
[Chart: cluster trx/sec rate while the huge transaction runs; y-axis 0 to 4500]
Huge Transaction Slave Lag
● Trx in master: 24 secs
● Trx in slave: 9 secs
Streaming Replication Demo Setup
1. Same scenario as before
2. Configure node1 to fragment huge transaction in 10K batches
→ monitor trx/sec rate in the cluster when streaming replication progresses
Streaming Replication
[Chart: trx/sec over time; y-axis 0 to 4500]
● Streaming Replication: 70 secs