Managing and Visualizing your Replication Topologies with Orchestrator
-
Upload
shlomi-noach -
Category
Technology
-
view
224 -
download
0
Transcript of Managing and Visualizing your Replication Topologies with Orchestrator
Managing and VisualizingYour Replication Topologies
with
OrchestratorShlomi Noach Percona Live, Sep 2015
Agenda:● What? Why?● The killer demo● Breakdown: discovery, refactoring, recovery, interface● Pseudo GTID, GTID, binlog servers● Architecture & stack● Deployment at scale @ Booking.com● The CLI demo● Supported/unsupported● Contributing
Managing and Visualizing … yada yada ...Orchestrator
Not a sales pitch
● orchestrator is free and open source
● Designed to be as generic as possible
● Some company specific rules or processes externalized via configuration
https://github.com/outbrain/orchestrator
3
What? Why?
● With so many replication topologies; with many servers per topology, spanning multiple data
centers; with the periodic server failures and movements,
- Do you know how your topologies look like?
- Does management know?
● With the complexity of moving slaves around the topology; the rules allowing/disallowing server
X to replication from Y; with the implications of cross-DC traffic on slave latency,
- Who in your company can refactor your topologies other than yourself?
● In the event of server failure, master or intermediate master breakage,
- Do you have a clear visual into what fails?
- What kind of solutions do you use?
- Who can execute failover / override a failover / understand what’s going on?
4
The killerdemo
or
Let’s break our
production servers
right now
and deal with the
consequences
once this conference
is over
Orchestrator breakdown: Discovery
● Crawls through your topologies
● Automatically recognizes new servers
● Resolves IPs, CNAMEs
● Revisits your servers periodically
● Collects data (version, binlogs, replication, …)
7
8
Anonymous Booking.com topology
9
Anonymous Booking.com topology
10
Anonymous Booking.com topology
Orchestrator breakdown: Refactoring
● Understands:
● binlog file:pos
● Pseudo-GTID
● GTID (Oracle + MariaDB)
● binlog servers
● Knows the rules for replication X from Y
● Will refactor your topology for you: safely redesign your topology
● Fine grained control or “just do it for me, I’m too tired to think”
● Can refactor via slick web UI
● Or via nerdy command line interface
11
Orchestrator breakdown: Recovery
● Keeps a state of your topology
● Uses holistic approach to detect failures
http://code.openark.org/blog/mysql/what-makes-a-mysql-server-failurerecovery-case
● If replication breaks, orchestrator knows what the expected topology looked like
● And can recommend “the next best option”, based on state, not on configuration
● And, if you like, can execute an automated/manual failover that heals your topology and
leaves no slave (or only those utterly incapable of restoring) behind
12
Orchestrator breakdown: Interface
● Command line
● Well formed output
● Go-to if you like your --debug logs
● Web API
● Simple GET (not REST)
● Web UI
● Uses Web API
● Designed to be friendly
13
Binary logs coordinates: recap
● Replication based on file:pos
● Different file names on masters & slaves
● Different positions on masters & slaves
● Once the connection is broken, difficult to match up again
14
MySQL binary & relay logs
15
MasterSlave
MySQL binary & relay logs: different languages
16
MasterSlave
MySQL binary & relay logs: even more languages
17
Master
Slave
Slave
GTID
● Every transaction has a unique identifier
● When a slave connects to a master, it looks for the last GTID statement it already executed
● Available in Oracle MySQL 5.6, MariaDB 10.0
● Completely different implementations; may cause lockup
● 5.6 migration path is painful (alleviated in 5.7)
● 5.6 requires binary logs & log-slave-updates enabled on all slaves (alleviated in 5.7)
● 5.6 errant transactions, unexecuted sequences
● GTID will be the requirement in future Oracle features
● MariaDB GTID supports domains; easy to use
18
Pseudo GTID
● Pseudo GTID offers what GTID offers, without GTID. This includes:
● Slave repointing
● Failover schemes
● With less requirements
● Bulk operations
● Without upgrading your servers; without installing anything on them; in short: not touching your
beloved existing setup
● No vendor lockdown; no migration paths
19
Pseudo GTID
● Application-side enhancement
● We inject a uniquely identified statement every X seconds. We call it Pseudo GTID.
● Pseudo GTID statements are searchable and identifiable in binary and relay logs
● Make for “markers” in the binary/relay logs
● Injection can be made via MySQL event scheduler or externally
● Otherwise non intrusive. No changes to topology/versions/methodologies
20
Injecting Pseudo GTID
create event if not exists create_pseudo_gtid_event
on schedule every 5 second starts current_timestamp
on completion preserve enable
do begin
set @pseudo_gtid_hint := uuid();
set @_create_statement := concat('drop ',
'view if exists `meta`.`_pseudo_gtid_hint__', @pseudo_gtid_hint, '`');
PREPARE st FROM @_create_statement;
EXECUTE st;
DEALLOCATE PREPARE st;
end $$
21
In the binary logs
mysql> show binlog events in 'mysql-bin.015631' \G
...
Log_name: mysql-bin.015631
Pos: 1632
Event_type: Query
Server_id: 1
End_log_pos: 1799
Info: use `meta`; drop view if exists `meta`.`_pseudo_gtid_hint__50731a22-9ca4-
11e4-aec4-e25ec4bd144f`
...
22
Recap: MySQL binary & relay logs
23
MasterSlave
MySQL binary & relay logs: a virtual contiguous log file
24
MasterSlave
MySQL binary & relay logs: Pseudo GTID injection
25
Master Slaveinsert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsertupdatedropupdate
insert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsertupdatedrop
insert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsert
insert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsert
Pseudo GTID: repoint, based on binary logs
26
Master Slaveinsert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsertupdatedropupdate
insert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsertupdatedrop
Pseudo GTID: repoint, based on relay logs
27
Master Slaveinsert> PGTID 17updatedeletecreate> PGTID 82deletedelete> PGTID 56insertinsertupdatedropupdate
● Please see
https://speakerdeck.com/shlominoach/pseudo-gtid-and-easy-mysql-replication-topology-
management
to learn about advantages, limitations and implementation.
● Pseudo-GTID deployed on all Booking.com chains
More on Pseudo GTID
28
● A MySQL-server-like entity
● Which merely relays the master’s binary logs
● Under same name and position
● Nested binlog servers allow for simplified refactoring and offer a simplified & faster master
recovery mechanism
● See Binlog Servers @ Booking.comhttps://www.percona.com/live/europe-amsterdam-2015/sessions/binlog-servers-bookingcom
● Orchestrator supports:
● hybrid standard + binlog-server replication topologies
● Pure binlog server topologies
Binlog Servers
29
● Can execute as long-running service
● Provides HTTP UI, Web API
● Polls servers, checks for crashes, recovers, periodic operations
● Leader election
● Can run as command line
● Issue a single command & exit
● Requires (same, single) MySQL backend for any operation.
● Backend database has the state of topologies
● orchestrator itself mostly stateless (pending operation excluded, optimistic mode)
● Agent-less for most operations; communicates directly with MySQL instances
Orchestrator architecture
30
● HA: orchestrator leader election
● Self healing backend data
● All locks auto expiring
● Support authentication (basic-auth, reverse proxy)
● Operations friendly, e.g.:
● Server maintenance flag
● Downtiming servers
● Marking as “best candidate”
Orchestrator architecture
31
Orchestrator general architecture
32
orchestratorservice
backend db
web API
web xface
● Stack:
● golang - in retrospect a very good choice: a lot of concurrency; easy deployment; rapid
development
● MySQL as backend database (duh)
● go-martini web framework
● Page generation via dirty JavaScript/jQuery (sue me)
● Twitter bootstrap
● Graphs via D3
● Development:
● Github, open source; accepting pull-requests
https://github.com/outbrain/orchestrator/
Orchestrator stack & development
33
MySQL @ Booking.com 2015
● We are a big MySQL shop
● We have ALOT production servers on ALOT topologies (aka chains, aka clusters)
● As small as 1 server per topology, as large as hundreds of servers per topology
● Two major data centers, now populating our third
● Single master, plenty slaves
● All chains are deployed with Pseudo-GTID and controlled by orchestrator
● Larger chains: hybrid, normal + binlog servers topologies (complex!)
● “Pure” binlog-server topologies experimental, non-production
● Some topologies sharded
● A little bit of active/passive master-master
34
Orchestrator architecture @ Booking.com
35
appleader
app
app
app
HTTP load balancerorchestrator-cli on all MySQL nodes
Orchestrator @ Booking.com 2015
● 5-6 hosts running orchestrator service, one is elected as leader at any given time
● ALOT hosts with orchestrator as CLI
● Single elected service polls all our instances
● Each MySQL instance polled every 30s
● Pseudo-GTID deployed on all chains
● Orchestrator configured to auto-recover the death of any intermediate master
● Orchestrator configured to auto recover from some master failures
● Both the above happen
● Some checks & dashboards rely on orchestrator data (API / DB)
● Some operations rely on orchestrator logic
36
CLI demoWhere real stuff gets done
including pie charts
..,.. .M +.. .....N$7$$O$77MZN?.. .~8?: M+8.. ...~7$7777$77$$$$$$$$$7D......=I+?M. .M?+=MMM$$Z$Z$$7$$$7$$7$77$$$$7$$$MZI+=I= . .~III7$$$$$$7777$$$$777777777$$$$$7$7?IM . .M7Z$$777777777777777777777$$$ZD+Z?M. MZ$$77777=7777777777$$Z7777$Z8OMMN~. .M,M8$7777O.MI7$$7IONMMMMMZ777$N.8DM,NM8.,.... ... M+D.MZ7777Z~M88OOODNZZZZZ8NM$77NM=..MD7$Z7$ZM:..,, ..~:DO777IMZZ$777$7777777$$ZZOMODNNNZ$7ZZ$ZZZZ$ZZZZZ8ONMM.. ..MMNZ$77M8Z$77777777777777777$ZM$7I7I$ZZ$ZZOO$OZO$ZZZZZZZ8M... M777778MZZ777777777777777777777$ZN$777$Z$Z$Z$ZOZOOZOZZZZ$7$$Z$ZMDI$7ZM.. I777$NZZ777777777777777777777777$ZOD77$Z$$ZZZZOZ$$$O$$$ZZZZZZZOZZOOZOZZOZ... M7$7MO$777777777777777777777777777O8M77Z$ZZZZZ$ZZ$Z$$$$ZZ$ZZZ$$$ZZZZZ$ZZZZ8M. OI$77$877777777777777777777777777777$ZZI$$$ZZ$ZZZZZZZZ$$ZZ$Z$$$ZZZZ$$ZZZZ$$O$M.. .,$$$I777777777777777777777777777777777$Z7$$$$ZZZZZ$Z$ZZZZ$$ZZ$ZZZ$ZZZ$$$ZZZZZZZI8. M777777777777777777777777777777777777777$$$7$ZZZZ$ZZZZZZZZZZZZ7$ZZZZZZZ$ZZZZZ$$$ZO,. .NZ7I77777777777777777777777777777777777777$$ZZZZZ$$ZZZZ7ZZZZZ$Z$$$$$Z$$$Z$Z7$Z$OODMM ~$$7II77777777777777777777777777777777777777$Z$$ZZ$$ZZ$Z$$$$Z$ZZZZ7$$Z$Z$$$Z$ZZ777$.. ,ZZ$7III7777777777777777777777777777777I77777Z$$$$$$$$$$$Z$$$Z$77$ZZ$Z$$$$$Z$ZZ7$$77M,.. .NZMZ$III777777777777777777777777777777III7777Z$$Z7$$$$77$$$$Z$$Z$7$$$$$$$Z77$$$$7$$$$M:... ,:.MZ$IIII7777777777777777777777777777IIII7777ZZ$$$$$$$777$Z77Z7Z$$$$Z$Z$$$$Z$$$$$7$$$7,:. .NOI7III777777777777777777777777III7IIIII777$$$$$$$$$$Z$$$$$$$7$$$$$$$7$$7$$$$$$$$$$I8. OZOI77II77777777777777777777777IIIIIIII7777Z$$7$7Z$$$7$7$$$7$$$$$$$$$$$$Z7$$Z77777$77$.. .,777$7III777777777777777777IIIIIIIIIII77777Z$$$$7$$$Z7$$$$$$7$Z$$$$$7$Z$7Z$$$$7777II7$M. .:$MI$$I$7I77I7I7777777777777IIIIII?II777777$$$$$$$$7$$$7$$$$777II7$$777Z$$Z$$$77$I$7I7+. .:N I7O7$$IIII7I7777777777IIIIIII???77777777$$$$$$$$$$7$77$$$7I77IIIII77$$$$$77II7$I$7$O ?$ZZ$$I?7IIIIIIIIIIIIIIII????I?I777777II$777$7$7$777I$I7II7???I??II7I778777777I77$$M. +8O$I$IIIIIII7III7IIIIII?????I?I7777777II77I7I$$7777777IIII??III?IIIIII7I77777$7777I.. .MMII7I7III7III7II??7?????IIII777777777II77$7III7II777777I7II7I7III77?I?77II77I$7777N ,8OZMII7I7I?7I7????I????I7I77777777777III777II?II7I77I777II?II77777I77I7777II7777777N ,8ONO$Z$77??I??7+?+???IN7777777777777O$$III77III?I7I?II?77?II7777777777?77I77II7777DM. .OZZOONMZ77ZI77??I?+I++7$$777777777777OZ$7I7?IIII?IIIIIIII?IDII7777777777777777I777777M . $$$$ZZZOOMZ7$I$7II?77I7$ZD$77777777777$D$77IIIIIIIIIIIIIIIIZ$$M$$7777777777I777777777777.. ..77$$$$$ZZZOMIII7$$I$$7$$ZZ$777777777777O$$?IIIIII7I7IIIIII7I7$$$MI7777777777777777777777M. =I777$I777$$Z$ZMO777777777$$M$777777777777M$7IIIIIIIIIIIIIIIIIII77$M$I7777777777777777777I7M. .O77777777777$M.. OM.877777$$7$777777777777$Z$7I7IIIIIIIIIIII7I7II77ZO7777777777777777777777. M7777777777777. .877I77$$77777777777777MN7$7IIIIIIIIIIIII?77MD7$D77777777777777777I7777D. NM7M+MM++MNM,,. :MMN77777777777777IMZ7Z?IIIN$I7I.Z8M$M.$ON+M77777I77II7DMM77MM?II7M. ,M. .DM7MDIIMMM77MM.. .8 .=MM. .7 . MN77IMZ7NMM$.M. N .MM..M:..MOOMM.~. ...... ..
orchestrator -c discover -i replica1.host:3306
orchestrator -c topology -i master.host:3306
orchestrator -c relocate -i replica3.host:3306 -d replica2.host:3306
orchestrator -c relocate -i replica3.host:3306 -d master.host --debug
orchestrator -c which-slaves -i master.host | while read i ;
do orchestrator -c disable-gtid -i $i ; done
orchestrator -c regroup-slaves -i master.host --debug
orchestrator -c relocate-slaves -i replica1.host -d master.host
orchestrator -c which-slaves -i master.host | while read i ;
do orchestrator -c enable-gtid -i $i ; done
CLI samples
In-production experiments, trust● Periodic crash experiments
● And periodically avoiding these experiments as well
● Getting more people involved (on call sys admins)
● ALOT of input is gained by people inexperienced with MySQL, leading to more visibility on
orchestrator’s side
● And of course periodic real crash scenarios
39
Supported replication topologies & technologies
● Standard binlog file:pos replication
● GTID (Oracle & MariaDB)
● Pseudo GTID
● Binlog servers (MaxScale)
● Statement based, row based, semi-sync replication
● Single master replication
● Master-master (2 node circular) replication
● 5.7 parallel replication (in-order required for Pseudo-GTID)
40
Unsupported
● 5.6 per-schema parallel replication
● Discovery & visualization good, operations unsupported
● Master-master-master (#nodes > 2) replication
● Galera
● Unrecognized by orchestrator, identifies each co-master as its own head of topology
● Multi-master aka multi source (neither Oracle 5.7 nor MariaDB)
● Tungsten
41
Contributions & usage● Known to be deployed by various companies
● Orchestrator accepts pull requests
● Please consider making your own PR
● Please submit bug reports
● Please assist in documentation
42
● Orchestrator manualhttps://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual
● Orchestrator deploymenthttps://github.com/outbrain/orchestrator/wiki/Orchestrator-deployment
● Orchestrator first stepshttps://github.com/outbrain/orchestrator/wiki/Orchestrator-first-steps
● Orchestrator for developers https://github.com/outbrain/orchestrator/wiki/Orchestrator-for-developers
● openark.orghttp://code.openark.org/blog/tag/orchestratorhttp://code.openark.org/blog/tag/pseudo-gtid
● Binlog servers master promotionhttp://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfiguring_slaves.html
Links of interest
● http://hyperboleandahalf.blogspot.nl/2010/04/alot-is-better-than-you-at-everything.html● http://www.hbo.com/game-of-thrones● https://imgflip.com/memegenerator/Brace-Yourselves-X-is-Coming● http://www.glassgiant.com/ascii/● https://www.keepcalm-o-matic.co.uk/● @isamlambert for making a couple sparkles to ignite this ● Team @ Booking.com for ideas, input, time testing, time using● Contributors!
Image, sources & other credits
● Binlog Servers at Booking.comhttps://www.percona.com/live/europe-amsterdam-2015/sessions/binlog-servers-bookingcom
● Combining Redis and MySQL to store HTTP cookie datahttps://www.percona.com/live/europe-amsterdam-2015/sessions/combining-redis-and-mysql-store-http-cookie-data
● Encrypted MySQL Backups and instant recoverability on large scalehttps://www.percona.com/live/europe-amsterdam-2015/sessions/encrypted-mysql-backups-and-instant-recoverability-large-scale
● Events storage and analysis with Riak at Booking.comhttps://www.percona.com/live/europe-amsterdam-2015/sessions/events-storage-and-analysis-riak-bookingcom
● Riding the Binlog: an in Deep Dissection of the Replication Streamhttps://www.percona.com/live/europe-amsterdam-2015/sessions/riding-binlog-deep-dissection-replication-stream
● Unicode and MySQLhttps://www.percona.com/live/europe-amsterdam-2015/sessions/unicode-and-mysql
● Your Clone Army: Better scalability through more database servershttps://www.percona.com/live/europe-amsterdam-2015/sessions/your-clone-army-better-scalability-through-more-database-servers
● The CIS MySQL Security Benchmark (LT)https://www.percona.com/live/europe-amsterdam-2015/sessions/cis-mysql-security-benchmark
● The Virtues of Boring Technology (Keynote)https://www.percona.com/live/europe-amsterdam-2015/sessions/virtues-boring-technology
Other Booking.com talks
Questions?
@ShlomiNoachhttp://openark.orghttp://blog.booking.com
Thank you!