Building a High-Availability PostgreSQL Cluster
Transcript of the ARIN presentation “Building a High-Availability PostgreSQL Cluster”
Building a High-Availability PostgreSQL Cluster
Presenter: Devon Mizelle, System Administrator
Co-Author: Steven Bambling, System Administrator
ARIN — “critical internet infrastructure”
What is ARIN?
• Regional Internet registry for North America and parts of the Caribbean
• Distributes IPv4 & IPv6 addresses and Autonomous System Numbers (Internet number resources) in the region
• Provides authoritative WHOIS services for number resources in the region
ARIN’s Internal Data
Inside our database exist all of the IPv4 and IPv6 networks that we manage, the organizations they belong to, and the contacts at those organizations. This means that data integrity and how we store that data are extremely important.
Requirements
• Multi-member automatic failover
• Prevent a ‘tainted’ master from coming online
• Needs to be ACID-compliant
Why Not Slony or pgpool-II?
• Slony replaces pgSQL’s replication – why do this? Why not let pgSQL handle it?
• pgpool-II is not ACID-compliant – it doesn’t confirm writes to multiple nodes
Our solution
• CMAN / Corosync – Red Hat’s open-source solution for cross-node communication
• Pacemaker – Red Hat and Novell’s solution for service management and fencing
• Both under active development by ClusterLabs
We were interested in using it because of the active development by ClusterLabs.
CMAN / Corosync
• Provides a messaging framework between nodes
• Handles a heartbeat between nodes – “Are you up and available?”
  – Does not provide the ‘status’ of a service; Pacemaker does
• Pacemaker uses Corosync to send messages between nodes
CMAN has the ability to do more, but we use it only as a messaging framework.
CMAN / Corosync
Builds a cluster ‘ring’ using a configuration file. Used by Pacemaker to pass status messages between the nodes. Simply a framework for communication – no heavy lifting in our implementation.
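The configuration file mentioned above is CMAN’s `/etc/cluster/cluster.conf`. A minimal sketch of a three-node ring (the cluster and node names are hypothetical, not taken from the slides):

```xml
<?xml version="1.0"?>
<cluster name="pgcluster" config_version="1">
  <!-- Three nodes form the messaging ring; Pacemaker rides on top of it. -->
  <clusternodes>
    <clusternode name="db1" nodeid="1"/>
    <clusternode name="db2" nodeid="2"/>
    <clusternode name="db3" nodeid="3"/>
  </clusternodes>
</cluster>
```

In practice the file also carries fencing definitions, but in this setup CMAN is used purely for messaging.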
About Pacemaker
• Developed / maintained by Red Hat and Novell
• Scalable – anywhere from a two-node to a 16-node setup
• Scriptable – resource scripts can be written in any language
• Monitoring – watches for service state changes
• Fencing – disables a box and switches roles when failures occur
• Shareable database between nodes about the status of services / nodes
Pacemaker
[Diagram: a Master node with Sync and Async replicas]
An XML ‘database’ (known as a CIB – cluster information base) is generated with the status of each resource and passed between nodes. The state of pgSQL is controlled by Pacemaker itself. Pacemaker uses a ‘resource script’ to interact with pgSQL and can determine the state of the service (Master / Sync / Async).
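The stock `ocf:heartbeat:pgsql` resource agent supports exactly this Master / Sync / Async replication tracking. A crmsh sketch of wiring it up as a multi-state resource – the paths, node names, and master vip address are hypothetical, not taken from the slides:

```shell
# A master/slave (multi-state) pgSQL resource driven by the pgsql resource agent.
crm configure primitive postgresql ocf:heartbeat:pgsql \
    params pgctl="/usr/pgsql-9.2/bin/pg_ctl" \
           pgdata="/var/lib/pgsql/9.2/data" \
           rep_mode="sync" node_list="db1 db2 db3" \
           master_ip="192.0.2.11" restart_on_promote="true" \
    op monitor interval="7s" \
    op monitor interval="4s" role="Master"

# Multi-state wrapper: at most one master, a copy on every node.
crm configure ms ms-postgresql postgresql \
    meta master-max="1" clone-max="3" notify="true"
```

With `rep_mode="sync"` the agent itself tracks which standby is the synchronous one and records each node’s state in the CIB.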
Other Pacemaker Resources
Pacemaker also handles the following resources besides pgSQL:
• Fencing of resources
• IP address colocation
How does it all tie together? From the bottom up…
Pacemaker
[Diagram: a client ‘vip’ and a replication ‘vip’ both pointing at the Master; Sync and Async slaves; application servers]
All slaves in the cluster point to a replication ‘vip’. This interface moves to whichever node is the master – this is called a colocation constraint. Another ‘vip’, which our application servers connect to, follows the master as well.
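These colocation constraints can be expressed in crmsh roughly as follows – a sketch with hypothetical addresses and resource names, assuming the multi-state pgSQL resource is called `ms-postgresql`:

```shell
# Two virtual IPs: one for clients, one that the slaves replicate from.
crm configure primitive vip-client ocf:heartbeat:IPaddr2 \
    params ip="192.0.2.10" cidr_netmask="24" \
    op monitor interval="10s"
crm configure primitive vip-rep ocf:heartbeat:IPaddr2 \
    params ip="192.0.2.11" cidr_netmask="24" \
    op monitor interval="10s"

# Colocation constraints: both vips must run wherever the pgSQL master runs,
# so they move with the master on failover.
crm configure colocation vip-client-with-master inf: vip-client ms-postgresql:Master
crm configure colocation vip-rep-with-master inf: vip-rep ms-postgresql:Master
```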
Event Scenario
[Diagram: the Master fails; the Sync slave is promoted to Master and the Async slave becomes the Sync slave]
In the event that a node becomes unavailable, CMAN notifies Pacemaker to ‘fence’ the node, i.e. shut off communication to it via SNMP to the switch. The SYNC slave becomes the Master, and the ASYNC slave becomes the SYNC slave. Upon manual recovery, the old Master becomes the ASYNC slave. If any resource inside Pacemaker on the master fails its monitoring check, fencing occurs as well. These resources include both the replication and client ‘vips’.
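Switch-port fencing over SNMP, as described above, is typically configured as a stonith resource. A sketch assuming the `fence_ifmib` agent (which disables a node’s switch port via SNMP); the switch hostname, port name, and community string are hypothetical:

```shell
# Sketch only: fence_ifmib shuts down a node's switch port via SNMP.
crm configure primitive fence-db1 stonith:fence_ifmib \
    params ipaddr="sw1.example.net" community="private" \
           port="Gi1/0/1" pcmk_host_list="db1" \
    op monitor interval="60s"

# Never run a node's own fencing device on that node:
crm configure location fence-db1-not-on-db1 fence-db1 -inf: db1
```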
PostgreSQL
• Still in charge of replicating data
• The state of the service and how it starts are controlled by Pacemaker
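PostgreSQL’s own streaming replication underlies the Master / Sync / Async roles. A sketch of the relevant 9.x-era settings – hostnames, users, and the replication vip are hypothetical, and in this setup the pgsql resource agent manages the standby configuration rather than the administrator:

```ini
# postgresql.conf on the master (illustrative values)
wal_level = hot_standby
max_wal_senders = 5
hot_standby = on
# One standby replicates synchronously; the rest are asynchronous:
synchronous_standby_names = 'db2'

# recovery.conf on a standby - points at the replication 'vip',
# so it always reaches whichever node is currently the master
standby_mode = 'on'
primary_conninfo = 'host=192.0.2.11 user=replicator application_name=db2'
```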
Layout
[Diagram: a Client connecting to a three-node cluster – Master, Slave, Slave – with cman running on every node]
Using Tools to Look Deeper: Introspection…
# crm_mon -i 1 -Arf
We disable quorum within the Pacemaker HA cluster to allow for failure down to a single-node cluster in the event multiple nodes fail.
• 8 resources configured
• ocf::heartbeat::IPaddr2 is the resource agent used to create the vip – resource scripts can be shell, Ruby, etc.
• Primitive vs. multi-state:
  – Primitive – runs on only one of the nodes in the cluster (vips, fencing)
  – Multi-state resource – runs on multiple nodes (pgsql)
• The vips are colocated. If anything happens to either of them, the entire node fails over and control moves to the next master
• There is a specific check interval for each resource
• stonith for fencing
# crm_mon -i 1 -Arf (cont)
• All of the status comes from the pgsql Pacemaker resource script
• receiver-status shows an error because the resource is written to monitor and check for cascading replication. We don’t use cascading and haven’t invested cycles in it
• master-postgresql is the ‘weight’. Pacemaker uses the weight to determine who should be promoted next in line, which is why the async slave has -INFINITY
• STREAMING
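The promotion ‘weight’ lives in the CIB as a per-node attribute, named after the resource (`master-postgresql` here). A sketch of inspecting it with the standard Pacemaker CLI, assuming a hypothetical node name `db2`:

```shell
# Read the master score the pgsql resource script maintains for node db2;
# the node with the highest score is promoted first, -INFINITY never is.
crm_attribute -l reboot -N db2 -n master-postgresql -G
```

The same scores appear in crm_mon’s node-attribute output when run with -A.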
Questions?
Devon Mizelle