QRep Performance and Optimisation - GSE...
QRep Performance and Optimisation
Mark Turner, Royal Bank of Scotland
01/11/2016, Session IC
Agenda
QRep Introduction/Refresher
Replication Tools
Application Considerations
Monitoring & Analyzing Replication Performance
Performance Best Practices
Acknowledgments
Some of these materials were 'borrowed' from Serge Bourbonnais, Lead Architect for IBM InfoSphere Q Replication
QRep Introduction/Refresher
What is QRep
IBM's strategic data replication technology for the DB2 family
– Developed in close cooperation with the DB2 development teams
– Supports new DB2 features at DB2 GA time
– Supports both mainframe and distributed platforms
– A key technology of IBM's Active-Active strategy
Technical characteristics
– Asynchronous replication – not limited by geographical distance
– Application oriented – can replicate a subset of tables
– Log-based change data capture – lowest impact on the source system
– Only changed data is delivered – minimum data processing
– Data is transferred and staged in IBM WebSphere® MQ queues – excellent data recovery capability
– Target data is always transactionally consistent – target data is available at any time
– Parallel data apply – high performance
QRep Architecture
• Q Capture reads data changes using IFI (IFCID 306) and publishes each transaction to an MQ queue (usually 1 message = 1 transaction)
• Q Apply reads the messages, constructs Insert, Update and Delete statements from the message content, and applies the changes to target tables
• End-to-end latency ~1 second across any distance
[Architecture diagram: applications update the source DB2; Q Capture reads the recovery log via IFI and puts messages on MQ application data queues (plus an admin queue); at the target, Q Apply agents (Agent1..Agentn) apply changes from SOURCE1-3 to TARGET1-3; administration and monitoring via the Replication Center and the Replication Dashboard]
What is Replication Used For?
Q Replication (and other software replication solutions) has been used for many years to:
– Provide data feeds of changes to downstream operational systems
– Maintain replica databases for workloads that could impact an operational system
– Provide data feeds to MIS and data warehouse systems
– Provide an audit of who changed what and when
The uses above tend to provide feeds into batch processes
– Replication latency is not a concern
GDPS Active-Active provides a Stand-in/Query capability
– Replication latency is critical
What is GDPS Active-Active?
GDPS A-A and Q Replication
[Diagram: two sites, each with its own DB2, MQ queue managers, Q Capture and Q Apply (with apply agents Agent1..Agentn); a Workload Distributor routes the OLTP workload to Site 1 and the query workload to Site 2 over the network, with replication flowing between the sites]
Replication Tools
Replication Tools - Administration
Replication Center
– Thick client delivered with the DB2 Data Server Client
– Use to configure qmaps and subscriptions
– Operations supported for LUW
– Useful for testing/prototyping
Replication Dashboard
– Browser based (server runs on Windows, AIX, or Linux - including zLinux)
– Multi-site monitoring, troubleshooting, tuning and reporting
– Operations – start/stop/spill/resume subscriptions/queues etc
ASNCLP
– Interactive or batch scripting tool for configuration
– Runs on z/OS or workstation

CREATE QMAP "MAP1" USING ADMINQ ADMIN1 RECVQ "RCV1" SENDQ "SND1" ...;
CREATE QSUB "SUB1" USING REPLQMAP "MAP1" (SUBNAME SUB1 TABLE1 ...);
Replication Tools - Utilities
ASNTDIFF
– Compare source and target tables
– Can be large tables across large distances
– Minimal network traffic
– Can be run online
ASNQMFMT
– Display and format queued messages
Alert Monitor
– Checks status of the replication environment
– Sends alerts/notifications when certain conditions are detected

Example – compare of a 40M row table (13 columns, avg row 48 bytes)
Elapsed time = 195 seconds

1. Summary of compare
---------------------
Number of different rows: 0
Number of common rows: 40041147
---------------------
2. Difference Details (rows)
---------------------
Source total: 40041147
Target total: 40041147
Source only: 0
Target only: 0
Update: 0
Application Considerations
Application Considerations
Tables with no unique index
– Target tables must have a unique replication key
– If none is available, consider using a 'hidden' identity column to provide uniqueness
Database sequences and identity columns
– Identity columns must be defined as GENERATED BY DEFAULT
– Sequences cannot be replicated
– Need to prevent conflicts (i.e. duplicate keys) when the secondary system is being updated – may consider using odd values in Site 1 and even values in Site 2
Large Objects (LOBs)
– If not inline, LOBs are fetched by Capture at the time the log record is read
Database Load/Reorg Discard utilities
– Utility data changes are not replicated (except online Load)
– Can force automatic refresh when a Load is detected (CAPTURE_LOAD='R')
Row-level locking may be required
– Parallel apply can result in contention for large batch processes
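The odd/even key recommendation above can be sketched as follows. This is an illustrative Python sketch (not QRep or DB2 code), with a hypothetical next_key helper, showing why the two sites can never generate the same key:

```python
# Illustrative sketch (hypothetical helper, not QRep/DB2 code): avoid
# duplicate-key conflicts in an active-active pair by striping the key
# space - Site 1 generates odd keys, Site 2 generates even keys.

def next_key(last_key: int, site: int) -> int:
    """Return the next identity value for a site (1 = odd, 2 = even)."""
    if site == 1:
        # advance to the next odd value
        return last_key + 2 if last_key % 2 == 1 else last_key + 1
    # advance to the next even value
    return last_key + 2 if last_key % 2 == 0 else last_key + 1

# Site 1 yields 1, 3, 5, ... and Site 2 yields 2, 4, 6, ...
# so an insert replicated from the other site can never collide.
```

In DB2 the same effect comes from defining the GENERATED BY DEFAULT identity column with a different START WITH value at each site and an increment of 2.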
Application Considerations
Schema changes made to the operational system may need to be applied to the target system (a GDPS Active-Active requirement)
– QRep can automatically replicate some schema changes, such as ALTER TABLE ADD COLUMN
All application processes must commit regularly
– Very large transactions can result in:
• The MQ maximum message size being breached
• Latency problems, as Apply serializes when processing 'monster' transactions
Beware mass deletes
– QRep requires DATA CAPTURE CHANGES on the source table
– TRUNCATE and DELETE FROM table will result in every deleted row being logged
– Use LOAD with a DUMMY input file instead
• QRep is adding support to detect this at the full-tablespace level in December 2016 and at the partition level in 2017
Monitoring and Analysing Replication Performance
Monitoring and Troubleshooting
1. Capture and Apply program log
– Environment information – startup parms, etc
– Error, warning and informational messages
2. z/OS console/job log
– Environment information – startup parms, number of active subscriptions etc
– Error/warning messages
– Use LOGSTDOUT=Y to have all log messages written to the job log
3. DB2 tables
– TRACE tables – IBMQREP_APPLYTRACE, IBMQREP_CAPTRACE
– MONITOR tables – IBMQREP_CAPMON, IBMQREP_CAPQMON, IBMQREP_APPLYMON
– EXCEPTIONS table – IBMQREP_EXCEPTIONS
4. Run-time status queries
– E.g. F taskname,status show details (on z/OS)
5. Replication Dashboard
– Visualise and analyse information from the DB2 tables for all sites in a configuration
Replication Control Tables

                       Capture              Apply
Configuration          IBMQREP_CAPPARMS     IBMQREP_APPLYPARMS
                       IBMQREP_SENDQUEUES   IBMQREP_RECVQUEUES
Subscriptions          IBMQREP_SUBS         IBMQREP_TARGETS
Operations             IBMQREP_SIGNAL       IBMQREP_APPLYCMD (new in V10.2.1)
Monitoring             IBMQREP_CAPMON       IBMQREP_APPLYMON
                       IBMQREP_CAPQMON
Message logs & errors  IBMQREP_CAPTRACE     IBMQREP_APPLYTRACE
                                            IBMQREP_EXCEPTIONS
Steps to Identify Performance Bottlenecks
Analyse replication latencies
– High CAPTURE_LATENCY usually indicates an issue with Capture or with accessing the source DB2 log
– High QLATENCY usually indicates an issue with MQ or Q Apply
– High APPLY_LATENCY usually indicates an issue with Apply or the target database
Determine if the performance issue is caused by a lack of system resource
– Q Apply in particular is a busy DB2 application
– Includes CPU, memory, network bandwidth
Analyze performance metrics in the QRep monitor tables
– Lack of apply agents?
– Database contention at the target?
– Inefficient DB2 accesses at the target?
– Sub-optimal MQ configuration?
– etc
Key Performance Metrics
• Extensive performance metrics are located in the MONITOR tables
• TIP: Ensure the monitor interval for both Capture and Apply is suitably low – say 10 seconds
Monitor Table Column Description
IBMQREP_CAPMON
(one row per Capture monitor interval)
At the SOURCE database
CURRENT_LOG_TIME Latest transaction commit time seen by Q Capture
LOGREAD_API_TIME Time spent in DB2 API to read log records
LOGRDR_SLEEPTIME Sleep time of log reader thread in this monitor interval
CURRENT_MEMORY The amount of memory used by Q Capture to construct transactions
TRANS_SPILLED Number of transactions that are too large and spilled to disk
MQCMIT_TIME Time spent on MQCMIT calls
IBMQREP_CAPQMON
(one row per QMAP per Capture monitor interval)
At the SOURCE database
ROWS_PUBLISHED Total number of rows put into MQ by Q Capture
MQ_MESSAGES Total number of messages put into MQ by Q Capture
MQPUT_TIME Time spent on MQPUT calls
XMITQDEPTH Number of messages currently in the MQ transmit queue.
IBMQREP_APPLYMON
(one row per receive queue per apply monitor interval)
At the TARGET database
OLDEST_TRANS Q Replication synchronization point - all source transactions prior to this timestamp have been applied
ROWS_APPLIED Total number of rows applied to target database.
CURRENT_MEMORY The amount of memory used by Q Apply to read transactions
MQGET_TIME Total time spent on MQGET calls in the interval.
QDEPTH Number of messages currently in the MQ receive queue.
END2END_LATENCY Average end-to-end latency time for all transactions applied in this monitor interval - between source DB commit and target DB commit
CAPTURE_LATENCY Latency time spent in Capture – between source DB commit and source MQ commit
QLATENCY Latency time spent in MQ – between source MQ commit and target MQGET
APPLY_LATENCY Latency time spent in Apply – between target MQGET and target DB commit
DBMS_TIME Average time spent per transaction in target database for SQL processing
APPLY_SLEEP_TIME Total sleep time of all apply agents in this monitor interval
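The three stage latencies above sum (approximately) to END2END_LATENCY, so a quick way to localise a problem is to compute each stage's share. A minimal Python sketch, using hypothetical APPLYMON values:

```python
# Sketch: split end-to-end latency into its stages using the APPLYMON
# latency columns described above. Column values here are hypothetical.

def latency_shares(capture_ms, q_ms, apply_ms):
    """Return each stage's percentage share of total latency."""
    total = capture_ms + q_ms + apply_ms
    return {"capture": round(100 * capture_ms / total, 1),
            "mq": round(100 * q_ms / total, 1),
            "apply": round(100 * apply_ms / total, 1)}

# CAPTURE_LATENCY=150ms, QLATENCY=250ms, APPLY_LATENCY=600ms
shares = latency_shares(150, 250, 600)
# apply takes the largest share, so look at Q Apply / the target DB2 first
```

The same arithmetic works per monitor interval, which is why a low monitor interval (the TIP above) gives usable resolution.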
Q Replication Latency
[Diagram: End-to-end latency from source DB2 log to target apply, broken into three stages – Capture latency (LOGREAD_API_TIME, LOGRDR_SLEEPTIME, MQPUT_TIME, MQCMIT_TIME; reported in CAPMON/CAPQMON), Q latency (transport and staging time, MQGET_TIME) and Apply latency (WORKQ_WAIT_TIME, DEPENDENCY_DELAY, MCGSYNC_DELAY, RETRY_TIME, DBMS_TIME; reported in APPLYMON)]
Understanding Capture Latency
[Diagram: Q Capture internals – the LOGRD thread (SLEEP_INTERVAL) reads SOURCE1-3 changes from the DB2 log; the TransMgr (MAX_MEMORY) assembles transactions in memory (CURRENT_MEMORY); the PUBLISH thread (COMMIT_INTERVAL) puts messages on the send queues and admin queue via the MQ queue manager (MQPUT_TIME, MQCMIT_TIME, CAPTURE_IDLE, XMITQDEPTH); a restart queue holds restart information; a MONITOR thread writes notifications to an internal wait queue]
Monitor Table Column Description
IBMQREP_CAPMON (per Capture instance)
LOGREAD_API_TIME Time spent in DB2 API to read log records
LOGRDR_SLEEPTIME Sleep time of log reader thread in this monitor interval
CURRENT_MEMORY The amount of memory used by Q Capture to construct transactions
MQCMIT_TIME Time spent on MQCMIT calls
CAPTURE_IDLE Time PUBLISH is waiting for LOGRD to read transactions
IBMQREP_CAPQMON
(one row per send queue)
MQPUT_TIME Time spent on MQPUT calls
XMITQDEPTH Number of messages currently in the MQ transmit queue.
Monitoring Capture Log Read Activity
IBMQREP_CAPMON
LOGREAD_API_TIME – The number of milliseconds that the Q Capture program spent making IFI calls to retrieve log records.
NUM_LOGREAD_CALLS – The number of log read IFI calls that Q Capture made.
CURRENT_MEMORY – The amount of memory (in bytes) that Capture uses for storing log records.
ROWS_PROCESSED – The number of rows (individual insert, update, or delete operations) that the Q Capture program read from the log.
TRANS_PROCESSED – The number of transactions read from the log.
TRANS_SPILLED – The number of transactions that the Q Capture program spilled to a file after exceeding the MEMORY_LIMIT threshold ('monster' transactions).
LOGRDR_SLEEPTIME – The number of milliseconds that the Q Capture log reader thread slept because there were no changes to capture or because Q Capture is operating at its memory limit.
Monitoring Capture Log Read Activity
MAX_TRANS_SIZE – The largest transaction, in bytes, that the Q Capture program processed.
NUM_END_OF_LOGS – The number of times that Q Capture reached the end of the log.

Publisher thread activity across all send queues
CAPTURE_IDLE – The number of milliseconds that the PUBLISH thread is waiting for the LOGRD thread to produce transactions.
MQCMIT_TIME – The number of milliseconds during the monitor interval that the Q Capture program spent in MQCMIT – committing messages on all send queues.
Monitoring Capture Publishing Activity
IBMQREP_CAPQMON
ROWS_PUBLISHED – Number of rows (individual insert, update, or delete operations) that the Q Capture program put on this send queue.
TRANS_PUBLISHED – Number of transactions that Q Capture put on this send queue.
MQ_BYTES – Number of bytes put on this send queue during the monitor interval, including data from the source table and the message header.
MQ_MESSAGES – The number of messages put on the send queue during the monitor interval.
MQPUT_TIME – Number of milliseconds during the monitor interval that the Q Capture program spent putting MQ messages on this send queue.
XMITQDEPTH – Number of messages on the MQ transmission queue. If you are using parallel send queues, the value is an aggregate of all transmission queues.
Where is Capture Time Spent
Log reader
Time in DB2 + time sleeping + time processing log records
LOGREAD_API_TIME + LOGRDR_SLEEPTIME + (MONITOR_INTERVAL - LOGREAD_API_TIME - LOGRDR_SLEEPTIME)

Publisher thread
Time spent in MQ + idle time + time spent by the worker thread decoding and formatting
(MQCMIT_TIME + SUM(MQPUT_TIME)) + CAPTURE_IDLE + (MONITOR_INTERVAL - [TIME IN MQ] - CAPTURE_IDLE)

Log reader latency
MONITOR_TIME - CURRENT_LOG_TIME
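The breakdowns above are simple arithmetic over one monitor interval; a Python sketch with hypothetical values (times in milliseconds):

```python
# Sketch of the 'Where is Capture Time Spent' arithmetic above: the log
# reader's processing time and the publisher's decode/format time are
# the remainders of the monitor interval. Values are hypothetical.

def logreader_budget(interval_ms, logread_api_ms, sleep_ms):
    return {"db2_api": logread_api_ms,
            "sleeping": sleep_ms,
            "processing": interval_ms - logread_api_ms - sleep_ms}

def publisher_budget(interval_ms, mqcmit_ms, mqput_ms_total, idle_ms):
    time_in_mq = mqcmit_ms + mqput_ms_total
    return {"mq": time_in_mq,
            "idle": idle_ms,
            "decode_format": interval_ms - time_in_mq - idle_ms}

lr = logreader_budget(10_000, 3_000, 5_000)      # 10s monitor interval
pub = publisher_budget(10_000, 1_200, 2_800, 4_000)
```

If the "processing" or "decode_format" remainder dominates, Capture itself is CPU-bound rather than waiting on DB2 or MQ.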
Capture Performance
Workload: 3 updates per UOW, 3 threads on each of 6 tables; MONITOR_INTERVAL = 10 seconds
[Chart: Capture throughput over the run – annotations mark where the workload started and where a CPU constraint slowed Capture]
Capture Performance
Workload: large batch sequential delete process; MONITOR_INTERVAL = 60 seconds
[Chart: Capture metrics over the run – nothing of concern showing]
Tuning Capture
• SLEEP_INTERVAL
– Defined in IBMQREP_CAPPARMS
– How long the log reader thread sleeps when the end of the log is reached
– Default – 500ms
– For very busy systems you may not need to reduce this, as the end of the log will not be reached. However, most systems have quieter periods, so the recommendation is to reduce it from the default.
– Recommend – 50ms
• TRANS_BATCH_SZ
– Defined in IBMQREP_CAPPARMS
– How many source transactions are packaged by Capture into a single MQ message
– The aim is to avoid small MQ messages
– Default – 1 (i.e. no batching)
– For OLTP systems with small transactions and high volume, consider increasing
• MEMORY_LIMIT
– Defined in IBMQREP_CAPPARMS
– The amount of memory Capture uses to build transactions
– If the memory limit is exceeded, Capture spills to file
– Default – 0
– The default is usually acceptable on z/OS; a good alternative is 200MB
Tuning Capture
• NUM_PARALLEL_SENDQS
– Defined in IBMQREP_SENDQUEUES
– How many send queues will be used to replicate transactions for a queue map
– Default – 1
– For very high volume replication, parallel send queues will improve MQ throughput by utilising multiple send queues/transmit queues/channels
• CHANGED_COLS_ONLY
– Defined in IBMQREP_SUBS
– Determines whether Capture will publish non-key columns that have not been changed
– Affects UPDATE operations only
– Default – 'Y'
– For tables with many columns and only a few updated, this will reduce the amount of data sent to Apply
– If used, CONFLICT_ACTION = 'F' can NOT be used
– Recommend – 'Y'
Understanding Apply Latency
[Diagram: Q Apply internals – the BROWSER gets TRANSACTIONS or HEARTBEAT messages from the receive queue, builds in-memory transactions, and places them on an internal WORKQ; a pool of apply agents takes transactions from the WORKQ and executes the SQL operations over DB2 connections to the user tables; completed work moves to an internal DONEQ and DONEMSG entries are pruned; a RETRY agent re-drives failed transactions from an internal RETRY queue; control tables track state]
Monitor Table Column Description
IBMQREP_APPLYMON
(one row per receive queue)
At the TARGET database
ROWS_PROCESSED Total number of rows in the process of being applied to target database.
APPLY_LATENCY Latency time spent in Apply – between target MQGET and target DB commit
DBMS_TIME Time spent in target database for SQL processing
APPLY_SLEEP_TIME Total sleep time of all apply agents in this monitor interval
DEPENDENCY_DELAY Total time transactions waited for dependent transactions to finish
MCGSYNC_DELAY Total time transactions waited for other CG (browser) to move up applyuptopoint, when operating in synchronized apply mode
WORKQ_WAIT_TIME Total time transactions waited in the WORKQ for an agent to execute them.
RETRY_TIME Total time a failed transaction was tried again.
Interpreting Apply Latency Counters
Latency counters are an average per transaction for the monitor interval
– Q Apply adds up latency values for each transaction, and divides the final sum by the number of transactions during the monitor interval
– All counters are reset at each monitor interval
– If there are no transactions applied in the monitor interval, latency is 0
– All times are calculated using GMT
– All times are reported in milliseconds
Monitoring Apply Throughput
IBMQREP_APPLYMON
APPLY_SLEEP_TIME
– The number of milliseconds that Q Apply agents for this receive queue were idle while waiting for work, summed across all agents
– e.g. If there are 10 apply agents and they were each 50% busy for a 10-second interval, APPLY_SLEEP_TIME = 10 x 5,000ms = 50,000ms (50 seconds)
TRANS_READ
– The total number of transactions retrieved from this receive queue during the monitor interval – including those not yet committed
TRANS_APPLIED
– The total number of transactions from this receive queue that Q Apply committed to the target
ROWS_PROCESSED
– The number of rows that were read from receive queues and applied, including those not yet committed to the target
ROWS_APPLIED
– The total number of rows that were read from this receive queue and that the Q Apply program has committed to the target
SPILLED_ROWS
– The number of rows that the Q Apply program sent to temporary spill queues while targets were being loaded or while Q subscriptions were placed into a spill state by the spillsub command
SPILLEDROWSAPPLIED
– The number of spilled rows that were applied to the target
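Because APPLY_SLEEP_TIME is summed across all agents, a useful derived number is the agents' busy fraction over the monitor interval. A small sketch with hypothetical values:

```python
# Sketch: derive agent utilisation from APPLY_SLEEP_TIME. Total agent
# time available in an interval is num_agents * interval; the busy
# fraction is what is left after sleep. Values are hypothetical.

def agent_busy_fraction(apply_sleep_ms, num_agents, interval_ms):
    total_agent_ms = num_agents * interval_ms
    return 1.0 - apply_sleep_ms / total_agent_ms

# 10 agents, 10-second interval, each idle half the time:
busy = agent_busy_fraction(50_000, 10, 10_000)
# a busy fraction near 1.0 (sleep near 0) means all agents are busy -
# consider more agents, or look for a CPU constraint
```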
Monitoring Apply Serialization/Contention
TRANS_SERIALIZED
– The total number of transactions that conflicted with another transaction – often because of a row conflict or a referential integrity conflict, but also includes rows serialised as a result of a low MAXAGENTS_CORRELID. The Q Apply program suspends parallel processing and applies the row changes within the transaction in the order they were committed at the source. A high value may not necessarily be a problem.
DEPENDENCY_DELAY
– The average time in milliseconds that a transaction had to wait due to dependencies on prior transactions. A high value often means the same row is being updated repeatedly and therefore updates have to be serialised. Aim to keep this close to 0.
KEY_DEPENDENCIES
– The total number of row updates that were serialised because more than one transaction is updating the same row (based on the primary key/replication key). Will result in dependency delay.
UNIQ_DEPENDENCIES
– The total number of unique index constraints that were detected (not the primary key/replication key), forcing transactions to be serialized. Will result in dependency delay.
JOB_DEPENDENCIES
– The number of transactions that are delayed because of correlation ID dependencies. Will result in dependency delay.
MONSTER_TRANS
– The number of transactions that exceeded the MEMORY_LIMIT for the receive queue set in the IBMQREP_RECVQUEUES table. While a monster transaction is being processed, no other transactions are applied!
RI_DEPENDENCIES
– The total number of referential integrity conflicts that were detected (across transactions), forcing transactions to be serialized, e.g. an attempt to insert a child row before its parent is inserted.
Monitoring Apply Serialization/Contention
UNIQ_RETRIES
– The number of times that Q Apply tried to re-apply rows that were not applied in parallel because of unique index constraints (reported by UNIQ_DEPENDENCIES).
RI_RETRIES
– The number of times that Q Apply had to re-apply row changes because of referential integrity conflicts (reported by RI_DEPENDENCIES).
DEADLOCK_RETRIES
– The number of times that Q Apply re-applied row changes because of lock timeouts and deadlocks.
NOTE: After 3 attempts (the default) the UOW will be put on the queue to be processed by apply agent #1. If there are many deadlocks/timeouts they will therefore be serialised through a single apply agent.
ROWS_NOT_APPLIED
– The number of rows that could not be applied, and were added to the IBMQREP_EXCEPTIONS table.
OKSQLSTATE_ERRORS
– The number of row changes that caused an SQL error that is defined as acceptable in the OKSQLSTATES field of the IBMQREP_TARGETS table. The Q Apply program ignores these errors.
Monitoring Apply Latency
DEPENDENCY_DELAY
– The average time in milliseconds that a transaction had to wait due to dependencies on prior transactions.
MCGSYNC_DELAY
– When running in synchronized mode for multiple consistency groups (MCGs) with other Apply programs, this is the average time per transaction in milliseconds that Apply had to wait for another Apply to come into sync. Check IBMQREP_MCGMON.
WORKQ_WAIT_TIME
– The average time in milliseconds that each transaction waited on the in-memory work queue for an agent to apply it to the target database. A high value could mean apply agents are saturated, or there are CPU constraints.
RETRY_TIME
– The average time in milliseconds that transactions spent being retried for SQL failures due to deadlocks, lock timeouts, RI violations, or secondary unique constraint violations.
DBMS_TIME
– The average elapsed time in milliseconds spent in DB2 per transaction.

APPLY_LATENCY =
DEPENDENCY_DELAY + MCGSYNC_DELAY +
WORKQ_WAIT_TIME + RETRY_TIME + DBMS_TIME
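Since APPLY_LATENCY is the sum of the five components above, the largest term points at the bottleneck. A minimal sketch with hypothetical APPLYMON values:

```python
# Sketch: decompose APPLY_LATENCY into the five components listed above
# and report the dominant one. Values are hypothetical (milliseconds).

def apply_latency_breakdown(dependency, mcgsync, workq_wait, retry, dbms):
    parts = {"dependency_delay": dependency,
             "mcgsync_delay": mcgsync,
             "workq_wait_time": workq_wait,
             "retry_time": retry,
             "dbms_time": dbms}
    return sum(parts.values()), max(parts, key=parts.get)

total, bottleneck = apply_latency_breakdown(5, 0, 420, 0, 35)
# here workq_wait_time dominates: agents are saturated, or CPU is short
```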
Why is Apply Latency High?
APPLY_SLEEP_TIME
– Sum across all agents (in ms). E.g. 32 agents, each sleeping 10 seconds – 32 x 10,000 = 320,000
– 0 = all agents busy
• Bottleneck – in DB2, # agents, or a CPU constraint? Check workq_wait_time
• If only a few agents are working, is there a hot row? Check dependency_delay
• A hot row will not cause excessive delay if enough CPU resource is available. Check dbms_time
DBMS_TIME
– Average elapsed time per DB2 transaction (in ms)
– Q Apply gets the system timestamp before and after each DB2 insert/update/delete/commit operation
– rows_applied / trans_applied = average transaction size
– Did dbms_time increase at the time latency increased?
– If dbms_time is high:
• Look at DB2 accounting data – DB2 locking, synchronous I/O, etc
• Compare source and target system class 2 times to determine where the bottleneck may be
NOTE: Even if dbms_time appears 'small', just a modest increase may push latency up. E.g. dbms_time = 35ms during a slowdown versus 12ms before it – 35ms is still nearly 3 times slower!
The target system must deliver DB2 performance equivalent to the source system
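The two checks above – average transaction size and the dbms_time slowdown ratio – are one-liners worth scripting against the monitor tables. A sketch with hypothetical counter values:

```python
# Sketch of the DBMS_TIME diagnostics above: average transaction size
# from the APPLYMON counters, and how much slower DB2 work is now
# compared with a known-good baseline. Values are hypothetical.

def avg_trans_size(rows_applied, trans_applied):
    return rows_applied / trans_applied

def dbms_slowdown(dbms_time_now_ms, dbms_time_baseline_ms):
    return dbms_time_now_ms / dbms_time_baseline_ms

size = avg_trans_size(120_000, 40_000)  # 3.0 rows per transaction
ratio = dbms_slowdown(35, 12)           # 'small' values, yet ~3x slower
```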
Why is Apply Latency High?
DEPENDENCY_DELAY
– A high dependency delay can be a symptom of a CPU constraint
– Or could be the result of a 'hot row'
– Avoid using trans_batch_sz if there are hot rows – it may cause excessive serialization
CURRENT_MEMORY
– A value near the IBMQREP_RECVQUEUES MEMORY_LIMIT indicates a slowdown in Apply: the apply agents are not keeping up, or there is a monster transaction. Check dbms_time and monster_trans
MCGSYNC_DELAY
– 0 indicates this is the slow consistency group
DEADLOCK_RETRIES
– Can be the result of a deadlock or a timeout
– Apply retries after 2 seconds
– Apply retries up to IBMQREP_APPLYPARMS (DEADLOCK_RETRIES) times – the default is 3
– Apply serialises on apply agent 1 after attempting DEADLOCK_RETRIES times
Apply Performance
Workload: 3 updates per UOW, 3 threads on each of 6 tables; MONITOR_INTERVAL = 10 seconds
[Chart: WORKQ_WAIT_TIME ≈ APPLY_LATENCY – the number of agents is too low, or the system is CPU constrained; in this instance, CPU constrained]
Apply Performance
Workload: large batch sequential delete process; MONITOR_INTERVAL = 10 seconds
[Chart: APPLY_LATENCY ≈ DEPENDENCY_DELAY – serialisation due to job dependencies; MAXAGENTS_CORRELID is set to 1]
Why Row Locking may be Required
At the source, batch jobs control the order in which transactions are executed
– For example, thread1 may update rows with keys 1-1000000, and thread2 the rows with keys 10000001-20000000, of a given table
– Each thread will commit frequently for best performance, for example committing after every 500 row updates
At the target, Apply replays transactions in parallel
– Apply only serialises if there are dependencies
– Parallel Apply will group multiple updates with commit_count
– More than one apply agent requesting page locks together can cause a deadlock
– It is not uncommon for batch jobs that sequentially update each and every row of a table to create deadlocks in Q Apply with PAGE LEVEL locking at the target
– Reducing the number of apply agents with MAXAGENTS_CORRELID, or NUM_APPLY_AGENTS, can mitigate the problem by reducing the probability of hitting the same page – but at the expense of throughput
For these cases, row-level locking may be necessary
Monitoring MQ Activity
IBMQREP_CAPQMON (one row per QMAP)
MQ_BYTES – The number of bytes put on send queues for this queue map during the monitor interval, including message data and the message header.
MQ_MESSAGES – The number of messages put on the send queue during the monitor interval.
MQPUT_TIME – The number of milliseconds during the monitor interval that the Q Capture program spent interacting with the MQ API for putting messages to the send queues for the queue map.
XMITQDEPTH – The queue depth (number of messages) on the transmit queue.
IBMQREP_CAPMON (one row per Capture)
MQCMIT_TIME – The number of milliseconds during the monitor interval that the Q Capture program spent in MQ commit calls.
IBMQREP_APPLYMON (one row per receive queue)
QDEPTH – Number of messages on the receive queue at monitor_time.
MQ_BYTES – The number of bytes of data read from all receive queues during the monitor interval, including message data and header.
NUM_MQGETS – The number of browser MQGET calls to retrieve a message from the receive queue during the monitor interval; MQGET calls that do not retrieve a message are not counted.
QLATENCY – Average time that messages spend in MQ. It is calculated as the difference in milliseconds between the Q Capture put on the send queue and the Q Apply get on the receive queue.
MQGET_TIME – The number of milliseconds during the monitor interval that the Q Apply browser spent in MQGET calls on the receive queue. The value includes MQGET calls that do not return messages (queue is empty).
MQ Performance
Workload: large batch sequential delete process; MONITOR_INTERVAL = 60 seconds
[Chart: a high XMITQDEPTH shows an MQ bottleneck]
Where is the Bottleneck?
Performance Best Practices
Best Practice – General Configuration
Have a dedicated MQ queue manager for each Capture and Apply
– Allows bigger bufferpools (1GB limit)
– Prevents potential logging contention
Run the Capture/Apply programs with a WLM service class as high as DB2 MSTR
Run the MQ channel initiator task in WLM SYSSTC
Use parallel send queues – at least 2 per consistency group
– At very long distance (1000s of kilometres) it is essential to have more than 2 queues
Use column suppression (changed_cols_only=y) – reduces the amount of data transmitted over the network for updates and deletes
Use DB2 row-level locking (at least at the target) for tables updated by sequential-access batch jobs
Q Apply is a busy DB2 application! Must ensure the target DB2 can handle it.
Best Practice – Capture
Use DB2 log read IFI 306 filtering (V10.2.1) (requires DB2 V10 APAR PM90568 or DB2 V11)
Set sleep_interval = 50ms (default 500ms)
Memory_limit does not have a significant impact – the default of 0, or 200MB, is OK
– Increase if spilling because of monster transactions. Check IBMQREP_CAPMON (trans_spilled)
Set trans_batch_sz = 4 or more
– But only if there is no issue with 'hot row' workloads; i.e. if applymon (dependency_delay) is high, use trans_batch_sz = 1 (the default)
– Goal: get the MQ message size above 10KB for optimal throughput
– Combine with max_message_size = 1MB (large transactions will not be batched beyond 1MB)
Run Q Capture with warntsx to detect very large transactions
– Large transactions will cause high latency
– The source application must commit frequently
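The >10KB message-size goal above can be checked with simple arithmetic; a sketch (the per-message header overhead is an assumed illustrative figure, not a documented QRep constant):

```python
# Sketch: estimate MQ message size for a given trans_batch_sz and find
# the smallest batch that clears a target size. HEADER_BYTES is an
# assumption for illustration only.

HEADER_BYTES = 400  # assumed per-message overhead, illustrative only

def est_message_bytes(avg_row_bytes, rows_per_trans, trans_batch_sz):
    return trans_batch_sz * rows_per_trans * avg_row_bytes + HEADER_BYTES

def min_batch_for(target_bytes, avg_row_bytes, rows_per_trans):
    batch = 1
    while est_message_bytes(avg_row_bytes, rows_per_trans, batch) < target_bytes:
        batch += 1
    return batch

# 200-byte rows, 3 rows per transaction: what batch clears 10KB?
batch = min_batch_for(10 * 1024, 200, 3)
```

Remember the hot-row caveat above: a batch size derived this way is only safe if dependency_delay stays low.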
Best Practice – Apply
Use parallel_sendqueues=Y
– Ensures 'get by msgid' is used – even if not using parallel send queues
Set maxagents = 32 or more – check agent_sleep_time. Maximum supported = 128
– BUT reduce the number of agents if CPU is constrained at the target! 16 performs better than 32 in that case
maxagents_correlid may be needed if lock timeouts or deadlocks occur during batch jobs
– Check IBMQREP_APPLYMON (DEADLOCK_RETRIES)
Use multi-row-insert=Y (the default) – exploits DB2 multi-row insert (V10.2.1)
Set prune_batch_sz = 100
Best Practice – MQ
Set batchlim(1MB) – requires PTF PM79000/UK90868 with MQ 7.1
Set channel batchsz(800)
– Determines the maximum number of messages in a unit of work sent across the MQ channel. 800 is best for OLTP workloads. Must also set Q Capture MAX_TRANS accordingly
Use MQGET for delete (MQ 7.1), Q Rep APAR PM90110
– Improves Q Apply pruning performance, keeping the receive queue small. Requires MQ APAR PM81785
Use 1GB of bufferpool space – divided among the receive queues for the queue manager
Enable MQ bufferpool read-ahead
– RAHGET is for large messages and adds significant performance benefits for MQGET when the queue is spilling to the pageset and the message sizes are > ~30k (RECOVER QMGR(TUNE RAHGET ON))
– READAHEAD is for smaller messages (RECOVER QMGR(TUNE READAHEAD ON))
Migrate to MQ V8 or later (significant improvements for Q Rep in MQ), including:
– 64-bit bufferpool: more fast storage for queue access; allows more data on a queue before spilling to disk; potentially 64GB of (each) XMITQ entirely in storage
– Enhanced deferred write (cast-out): faster MQPUT by Capture when a queue exceeds its bufferpool
– Enhanced logging: particularly for larger messages
Best Practice – MQ
Size XMIT queues and receive queues to support 24 hours of messages
– QRep problems can occur, and can take time to resolve
– Queues can build quickly in high volume systems
Separate QRep application queues and SYSTEM queues onto different pagesets
– Queue-full conditions will prevent MQ commands being entered
Best Practice – Workload Considerations
If possible, avoid:
– Large LOB data – use inline LOBs whenever possible
– Very large purge jobs – consider running them at each site instead (if they cause increased replication latency)
– Very large transactions, e.g. millions of row changes in a single unit of work – will result in monster transactions causing Apply to serialize
Use row-level locking at the target where necessary
– May need to increase NUMLKUS accordingly
Add more QMAPs if hot-row workloads cause latency spikes
– Move the table with a hot row into its own replication queue
Session Feedback
• Please submit your feedback at http://conferences.gse.org.uk/2016/feedback/ic
• Session is IC