Pacemaker: Recent changes and future plans

Ken Gaillot <[email protected]>

Senior Software Engineer, Red Hat

ClusterLabs Summit 2020



Pacemaker: Overview

● Recent changes (2.0.1 - 2.0.3)

● Coming soon (2.0.4 - 2.0.5)

● Future directions


Pacemaker: Recent changes

Fence history display

● Synchronized across all nodes

● Can be erased manually

    stonith_admin -H '*' --cleanup

● Failures/pending shown in crm_mon by default

    # crm_mon -1 --fence-history=3
    … <snip> ...
    Fencing History:
      * reboot of rhel8-3 successful: delegate=rhel8-1, client=stonith_admin.5516, origin=rhel8-4, completed='2020-01-20 13:06:40 -06:00'
      * reboot of rhel8-3 successful: delegate=rhel8-1, client=stonith_admin.1633, origin=rhel8-4, completed='2020-01-20 12:05:43 -06:00'
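For scripts that need to read the fence history shown above, a minimal Python sketch that splits one history line into fields. It assumes the line layout in the slide's sample output, which is not a stable interface (the XML output covered later in the deck is the safer route). `parse_fence_event` is a hypothetical helper, not part of Pacemaker.

```python
import re

# Layout assumed from the slide's sample output; not a stable interface.
LINE_RE = re.compile(
    r"^\*\s+(?P<action>\S+) of (?P<target>\S+) (?P<state>[^:]+):\s*(?P<details>.*)$"
)
# key=value pairs; a value is either single-quoted or runs to the next comma/space.
PAIR_RE = re.compile(r"(\w+)=('[^']*'|[^,\s]+)")


def parse_fence_event(line):
    """Return a dict of fields for one fence-history line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    event = {key: match.group(key) for key in ("action", "target", "state")}
    for key, value in PAIR_RE.findall(match.group("details")):
        event[key] = value.strip("'")
    return event


sample = (
    "* reboot of rhel8-3 successful: delegate=rhel8-1, "
    "client=stonith_admin.5516, origin=rhel8-4, "
    "completed='2020-01-20 13:06:40 -06:00'"
)
event = parse_fence_event(sample)
print(event["target"], event["state"], event["completed"])
```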


Pacemaker: Recent changes

Dynamic recheck interval

● cluster-recheck-interval

● Applies to failure-timeout and date-based rules (date_expression with an operation of gt, lt, or in_range, but not date_spec)
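The slide only names the option and the cases it covers. One reading is that, when a failure-timeout or a qualifying date_expression is in play, the cluster can work out the next moment the result would change and recheck then, with cluster-recheck-interval as the periodic fallback. A minimal Python sketch of that idea follows. It is an illustration under that reading, not Pacemaker's scheduler code, and `next_recheck` and all of its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def next_recheck(now, last_failures, failure_timeout, rule_boundaries, recheck_interval):
    """Return the earliest future time at which a time-based re-evaluation is due.

    last_failures:    datetimes of recorded failures (each expires after failure_timeout)
    rule_boundaries:  datetimes at which a gt/lt/in_range date_expression changes result
    recheck_interval: periodic fallback, standing in for cluster-recheck-interval
    """
    candidates = [now + recheck_interval]
    candidates += [failed_at + failure_timeout for failed_at in last_failures]
    candidates += list(rule_boundaries)
    # Ignore anything already in the past; the periodic fallback is always in the future.
    return min(t for t in candidates if t > now)


now = datetime(2020, 1, 20, 12, 0)
due = next_recheck(
    now,
    last_failures=[datetime(2020, 1, 20, 11, 58)],
    failure_timeout=timedelta(minutes=5),
    rule_boundaries=[datetime(2020, 1, 20, 12, 10)],
    recheck_interval=timedelta(minutes=15),
)
print(due)
```

In this example the failure recorded at 11:58 expires before either the rule boundary or the periodic interval, so it determines the next recheck.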



Pacemaker: Recent changes

Pacemaker Remote hardening

Start-up environment variables (/etc/sysconfig, /etc/default)

● TLS priority preferences (see https://gnutls.org/manual/html_node/Priority-Strings.html)

    PCMK_tls_priorities="NORMAL:-VERS-SSL3.0:-VERS-TLS1.0:-MD5:-3DES-CBC"

● Listen address and port

    PCMK_remote_address="192.0.2.1"
    PCMK_remote_port=3121

● TLS Diffie-Hellman prime length

    PCMK_dh_min_bits=1024
    PCMK_dh_max_bits=2048
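Because these settings live in a plain KEY=value file, a quick sanity check before restarting the daemon can catch typos. The following Python sketch parses such text and checks two obvious constraints. The checks are assumptions chosen for illustration (a valid TCP port range, minimum bits not above maximum), not Pacemaker's own validation, and `parse_sysconfig` and `check_remote_settings` are hypothetical helpers.

```python
import shlex


def parse_sysconfig(text):
    """Parse KEY=value lines, skipping blanks and comments; surrounding quotes are removed."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parts = shlex.split(value)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def check_remote_settings(settings):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    # 3121 is the default Pacemaker Remote port.
    port = int(settings.get("PCMK_remote_port", "3121"))
    if not 1 <= port <= 65535:
        problems.append(f"PCMK_remote_port out of range: {port}")
    # Sketch-only convention: treat a missing value as 0, meaning "not set".
    min_bits = int(settings.get("PCMK_dh_min_bits", "0"))
    max_bits = int(settings.get("PCMK_dh_max_bits", "0"))
    if min_bits and max_bits and min_bits > max_bits:
        problems.append(f"PCMK_dh_min_bits ({min_bits}) exceeds PCMK_dh_max_bits ({max_bits})")
    return problems


SAMPLE = """
# Values taken from the slide
PCMK_tls_priorities="NORMAL:-VERS-SSL3.0:-VERS-TLS1.0:-MD5:-3DES-CBC"
PCMK_remote_address="192.0.2.1"
PCMK_remote_port=3121
PCMK_dh_min_bits=1024
PCMK_dh_max_bits=2048
"""
settings = parse_sysconfig(SAMPLE)
print(check_remote_settings(settings))
```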


Pacemaker: Recent changes

Bundle enhancements

    <bundle id="httpd-bundle">
      <podman image="pcmk:http" replicas="3" options="-e MY_VAR=true"/>
      <storage>
        <storage-mapping id="httpd-env" options="r"
            source-dir="/srv/httpd-files/pacemaker-environment-vars"
            target-dir="/etc/pacemaker/pcmk-init.env" />
      </storage>
      <primitive class="ocf" id="httpd" provider="heartbeat" type="apache"/>
    </bundle>

● Support for Docker, podman, and rkt

● Can be used in cluster with sbd

● Per-node environment variables for Pacemaker Remote
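To make the structure of the bundle definition above concrete, here is a small Python sketch that reads the slide's XML and reports which container runtime it uses and which host paths are mapped into the container. It relies only on the element and attribute names visible in the slide. `summarize_bundle` is a hypothetical helper, and the sketch does not validate against Pacemaker's schema.

```python
import xml.etree.ElementTree as ET

# Bundle definition as shown on the slide, with plain ASCII quotes.
BUNDLE_XML = """
<bundle id="httpd-bundle">
  <podman image="pcmk:http" replicas="3" options="-e MY_VAR=true"/>
  <storage>
    <storage-mapping id="httpd-env" options="r"
        source-dir="/srv/httpd-files/pacemaker-environment-vars"
        target-dir="/etc/pacemaker/pcmk-init.env"/>
  </storage>
  <primitive class="ocf" id="httpd" provider="heartbeat" type="apache"/>
</bundle>
"""

# Runtimes named on the slide.
RUNTIMES = ("docker", "podman", "rkt")


def summarize_bundle(xml_text):
    """Return the bundle id, runtime element name, replica count, and storage mappings."""
    bundle = ET.fromstring(xml_text.strip())
    runtime = next((child for child in bundle if child.tag in RUNTIMES), None)
    mappings = [
        (m.get("source-dir"), m.get("target-dir"), m.get("options"))
        for m in bundle.iter("storage-mapping")
    ]
    return {
        "id": bundle.get("id"),
        "runtime": runtime.tag if runtime is not None else None,
        "replicas": runtime.get("replicas") if runtime is not None else None,
        "mappings": mappings,
    }


summary = summarize_bundle(BUNDLE_XML)
print(summary["runtime"], summary["replicas"])
for source, target, options in summary["mappings"]:
    print(f"{source} -> {target} ({options})")
```

In the slide's example, the single storage mapping is what delivers the per-node environment file to /etc/pacemaker/pcmk-init.env inside the container.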



Pacemaker: Recent changes

Machine-friendly tool output

● --output-as/--output-to

    # stonith_admin --list-installed --output-as=xml
    <pacemaker-result api-version="2.0" request="stonith_admin --list-installed --output-as=xml">
      <list name="Installed fence devices" count="3">
        <item name="device">fence_virt</item>
        <item name="device">fence_virtd</item>
        <item name="device">fence_xvm</item>
      </list>
      <status code="0" message="OK"/>
    </pacemaker-result>

● Output formats can take options (such as HTML stylesheet)

    crm_mon -1 --output-as=html --html-stylesheet="https://example.com/css/main.css"

● API XML schema
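The XML form is meant for scripts, so a consumer can check the status element and read the list items instead of scraping text. A minimal Python sketch, using the slide's sample output as input, follows. `installed_devices` is a hypothetical helper, and treating any non-zero status code as an error is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Sample output copied from the slide.
SAMPLE = """<pacemaker-result api-version="2.0" request="stonith_admin --list-installed --output-as=xml">
  <list name="Installed fence devices" count="3">
    <item name="device">fence_virt</item>
    <item name="device">fence_virtd</item>
    <item name="device">fence_xvm</item>
  </list>
  <status code="0" message="OK"/>
</pacemaker-result>"""


def installed_devices(xml_text):
    """Return the device names from a pacemaker-result document, raising on a failed status."""
    root = ET.fromstring(xml_text)
    status = root.find("status")
    if status is None or status.get("code") != "0":
        message = status.get("message") if status is not None else "no status element"
        raise RuntimeError(f"command did not report success: {message}")
    return [item.text for item in root.iter("item") if item.get("name") == "device"]


devices = installed_devices(SAMPLE)
print(devices)
```

On a cluster node, the same function could be fed the standard output of the stonith_admin command shown on the slide, for example via `subprocess.run(..., capture_output=True, text=True)`.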



Pacemaker: Coming soon

Expected this year

● Section selection in crm_mon

● Machine-friendly output: high-level C API

● Fencing reasons

● Shutdown locks

● Enhancements to access control lists (ACLs)

○ Colorized display

○ Groups

○ PAM integration



Pacemaker: Future directions

Big ideas

● More intelligent disaster recovery support

○ Coordinated status across related clusters

○ Coordinated configuration changes

○ Easier failover testing (live or dry run)

● Event-driven resources

○ “Push” rather than poll for systemd resource status

○ Persistent daemonized resources

● More configurable failure response

○ Colocation constraint option for “noncritical resources”

○ failure-restart + failure-escalation

Questions?


Thank you
