QRep Performance and Optimisation - GSE...
QRep Performance and Optimisation
Mark Turner, Royal Bank of Scotland
01/11/2016, Session IC
Agenda
QRep Introduction/Refresher
Replication Tools
Application Considerations
Monitoring & Analyzing Replication Performance
Performance Best Practices
Acknowledgments
Some of these materials were 'borrowed' from Serge Bourbonnais, Lead Architect for IBM InfoSphere Q Replication
QRep Introduction/Refresher
What is QRep
IBM's strategic data replication technology for the DB2 family
– Developed in close cooperation with the DB2 development teams
– Supports new DB2 features at DB2 GA time
– Supports both mainframe and distributed platforms
– A key technology of IBM's Active-Active strategy
Technical characteristics
– Asynchronous replication – not limited by geographical distance
– Application oriented – can replicate a subset of tables
– Log-based change data capture – lowest impact on the source system
– Only changed data is delivered – minimum data processing
– Data is transferred and staged in IBM WebSphere® MQ queues – excellent data recovery capability
– Target data is always transactionally consistent – target data is available at any time
– Parallel data apply – high performance
QRep Architecture
• Q Capture reads data changes using IFI (IFCID 306) and publishes each transaction to an MQ queue (usually 1 message = 1 transaction)
• Q Apply reads the messages, constructs Insert, Update and Delete statements from the message content, and applies the changes to target tables
• End-to-end latency ~1 second across any distance
[Architecture diagram: applications update the source DB2; Q Capture reads the recovery log via IFI and puts messages on MQ application data queues (plus an admin queue); at the target, Q Apply agents (Agent1..Agentn) apply changes from SOURCE1-3 to TARGET1-3; administration and monitoring via the Replication Center and the Replication Dashboard]
What is Replication Used For?
Q Replication (and other software replication solutions) has been used for many years to:
– Provide data feeds of changes to downstream operational systems
– Maintain replica databases for workloads that could impact an operational system
– Provide data feeds to MIS and data warehouse systems
– Provide an audit of who changed what and when
The uses above tend to provide feeds into batch processes
– Replication latency is not a concern
GDPS Active-Active provides a Stand-in/Query capability
– Replication latency is critical
What is GDPS Active-Active?
GDPS A-A and Q Replication
[Diagram: two sites, each with its own DB2, MQ queue managers, Q Capture and Q Apply (with apply agents Agent1..Agentn); a Workload Distributor routes the OLTP workload to Site 1 and the query workload to Site 2 over the network, with replication flowing between the sites]
Replication Tools
Replication Tools - Administration
Replication Center
– Thick client delivered with the DB2 Data Server Client
– Use to configure qmaps and subscriptions
– Operations supported for LUW
– Useful for testing/prototyping
Replication Dashboard
– Browser based (server runs on Windows, AIX, or Linux - including zLinux)
– Multi-site monitoring, troubleshooting, tuning and reporting
– Operations – start/stop/spill/resume subscriptions/queues etc
ASNCLP
– Interactive or batch scripting tool for configuration
– Runs on z/OS or workstation

CREATE QMAP "MAP1" USING ADMINQ ADMIN1 RECVQ "RCV1" SENDQ "SND1" ...;
CREATE QSUB "SUB1" USING REPLQMAP "MAP1" (SUBNAME SUB1 TABLE1 ...);
Replication Tools - Utilities
ASNTDIFF
– Compare source and target tables
– Can be large tables across large distances
– Minimal network traffic
– Can be run online
ASNQMFMT
– Display and format queued messages
Alert Monitor
– Checks status of the replication environment
– Sends alerts/notifications when certain conditions are detected

Example – compare of a 40M row table (13 columns, avg row 48 bytes)
Elapsed time = 195 seconds

1. Summary of compare
---------------------
Number of different rows: 0
Number of common rows: 40041147
---------------------
2. Difference Details (rows)
---------------------
Source total: 40041147
Target total: 40041147
Source only: 0
Target only: 0
Update: 0
Application Considerations
Application Considerations
Tables with no unique index
– Target tables must have a unique replication key
– If none is available, consider using a 'hidden' identity column to provide uniqueness
Database sequences and identity columns
– Identity columns must be defined as GENERATED BY DEFAULT
– Sequences cannot be replicated
– Need to prevent conflicts (i.e. duplicate keys) when the secondary system is being updated – may consider using odd values in Site 1 and even values in Site 2
Large Objects (LOBs)
– If not inline, LOBs are fetched by Capture at the time the log record is read
Database Load/Reorg Discard utilities
– Utility data changes are not replicated (except online Load)
– Can force automatic refresh when a Load is detected (CAPTURE_LOAD='R')
Row-level locking may be required
– Parallel apply can result in contention for large batch processes
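The odd/even key recommendation above can be sketched as follows. This is an illustrative Python sketch (not QRep or DB2 code), with a hypothetical next_key helper, showing why the two sites can never generate the same key:

```python
# Illustrative sketch (hypothetical helper, not QRep/DB2 code): avoid
# duplicate-key conflicts in an active-active pair by striping the key
# space - Site 1 generates odd keys, Site 2 generates even keys.

def next_key(last_key: int, site: int) -> int:
    """Return the next identity value for a site (1 = odd, 2 = even)."""
    if site == 1:
        # advance to the next odd value
        return last_key + 2 if last_key % 2 == 1 else last_key + 1
    # advance to the next even value
    return last_key + 2 if last_key % 2 == 0 else last_key + 1

# Site 1 yields 1, 3, 5, ... and Site 2 yields 2, 4, 6, ...
# so an insert replicated from the other site can never collide.
```

In DB2 the same effect comes from defining the GENERATED BY DEFAULT identity column with a different START WITH value at each site and an increment of 2.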
Application Considerations
Schema changes made to the operational system may need to be applied to the target system (a GDPS Active-Active requirement)
– QRep can automatically replicate some schema changes, such as ALTER TABLE ADD COLUMN
All application processes must commit regularly
– Very large transactions can result in:
• The MQ maximum message size being breached
• Latency problems, as Apply serializes when processing 'monster' transactions
Beware mass deletes
– QRep requires DATA CAPTURE CHANGES on the source table
– TRUNCATE and DELETE FROM table will result in every deleted row being logged
– Use LOAD with a DUMMY input file instead
• QRep is adding support to detect this at the full-tablespace level in December 2016 and at the partition level in 2017
Monitoring and Analysing Replication Performance
Monitoring and Troubleshooting
1. Capture and Apply program log
– Environment information – startup parms, etc
– Error, warning and informational messages
2. z/OS console/job log
– Environment information – startup parms, number of active subscriptions etc
– Error/warning messages
– Use LOGSTDOUT=Y to have all log messages written to the job log
3. DB2 tables
– TRACE tables – IBMQREP_APPLYTRACE, IBMQREP_CAPTRACE
– MONITOR tables – IBMQREP_CAPMON, IBMQREP_CAPQMON, IBMQREP_APPLYMON
– EXCEPTIONS table – IBMQREP_EXCEPTIONS
4. Run-time status queries
– E.g. F taskname,status show details (on z/OS)
5. Replication Dashboard
– Visualise and analyse information from the DB2 tables for all sites in a configuration
Replication Control Tables

                       Capture              Apply
Configuration          IBMQREP_CAPPARMS     IBMQREP_APPLYPARMS
                       IBMQREP_SENDQUEUES   IBMQREP_RECVQUEUES
Subscriptions          IBMQREP_SUBS         IBMQREP_TARGETS
Operations             IBMQREP_SIGNAL       IBMQREP_APPLYCMD (new in V10.2.1)
Monitoring             IBMQREP_CAPMON       IBMQREP_APPLYMON
                       IBMQREP_CAPQMON
Message logs & errors  IBMQREP_CAPTRACE     IBMQREP_APPLYTRACE
                                            IBMQREP_EXCEPTIONS
Steps to Identify Performance Bottlenecks
Analyse replication latencies
– High CAPTURE_LATENCY usually indicates an issue with Capture or with accessing the source DB2 log
– High QLATENCY usually indicates an issue with MQ or Q Apply
– High APPLY_LATENCY usually indicates an issue with Apply or the target database
Determine if the performance issue is caused by a lack of system resource
– Q Apply in particular is a busy DB2 application
– Includes CPU, memory, network bandwidth
Analyze performance metrics in the QRep monitor tables
– Lack of apply agents?
– Database contention at the target?
– Inefficient DB2 accesses at the target?
– Sub-optimal MQ configuration?
– etc
Key Performance Metrics
• Extensive performance metrics are located in the MONITOR tables
• TIP: Ensure the monitor interval for both Capture and Apply is suitably low – say 10 seconds
Monitor Table Column Description
IBMQREP_CAPMON
(one row per Capture monitor interval)
At the SOURCE database
CURRENT_LOG_TIME Latest transaction commit time seen by Q Capture
LOGREAD_API_TIME Time spent in DB2 API to read log records
LOGRDR_SLEEPTIME Sleep time of log reader thread in this monitor interval
CURRENT_MEMORY The amount of memory used by Q Capture to construct transactions
TRANS_SPILLED Number of transactions that are too large and spilled to disk
MQCMIT_TIME Time spent on MQCMIT calls
IBMQREP_CAPQMON
(one row per QMAP per Capture monitor interval)
At the SOURCE database
ROWS_PUBLISHED Total number of rows put into MQ by Q Capture
MQ_MESSAGES Total number of messages put into MQ by Q Capture
MQPUT_TIME Time spent on MQPUT calls
XMITQDEPTH Number of messages currently in the MQ transmit queue.
IBMQREP_APPLYMON
(one row per receive queue per apply monitor interval)
At the TARGET database
OLDEST_TRANS Q Replication synchronization point - all source transactions prior to this timestamp have been applied
ROWS_APPLIED Total number of rows applied to target database.
CURRENT_MEMORY The amount of memory used by Q Apply to read transactions
MQGET_TIME Total time spent on MQGET calls in the interval.
QDEPTH Number of messages currently in the MQ receive queue.
END2END_LATENCY Average end-to-end latency time for all transactions applied in this monitor interval - between source DB commit and target DB commit
CAPTURE_LATENCY Latency time spent in Capture – between source DB commit and source MQ commit
QLATENCY Latency time spent in MQ – between source MQ commit and target MQGET
APPLY_LATENCY Latency time spent in Apply – between target MQGET and target DB commit
DBMS_TIME Average time spent per transaction in target database for SQL processing
APPLY_SLEEP_TIME Total sleep time of all apply agents in this monitor interval
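The three stage latencies above sum (approximately) to END2END_LATENCY, so a quick way to localise a problem is to compute each stage's share. A minimal Python sketch, using hypothetical APPLYMON values:

```python
# Sketch: split end-to-end latency into its stages using the APPLYMON
# latency columns described above. Column values here are hypothetical.

def latency_shares(capture_ms, q_ms, apply_ms):
    """Return each stage's percentage share of total latency."""
    total = capture_ms + q_ms + apply_ms
    return {"capture": round(100 * capture_ms / total, 1),
            "mq": round(100 * q_ms / total, 1),
            "apply": round(100 * apply_ms / total, 1)}

# CAPTURE_LATENCY=150ms, QLATENCY=250ms, APPLY_LATENCY=600ms
shares = latency_shares(150, 250, 600)
# apply takes the largest share, so look at Q Apply / the target DB2 first
```

The same arithmetic works per monitor interval, which is why a low monitor interval (the TIP above) gives usable resolution.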
Q Replication Latency
[Diagram: End-to-end latency from source DB2 log to target apply, broken into three stages – Capture latency (LOGREAD_API_TIME, LOGRDR_SLEEPTIME, MQPUT_TIME, MQCMIT_TIME; reported in CAPMON/CAPQMON), Q latency (transport and staging time, MQGET_TIME) and Apply latency (WORKQ_WAIT_TIME, DEPENDENCY_DELAY, MCGSYNC_DELAY, RETRY_TIME, DBMS_TIME; reported in APPLYMON)]
Understanding Capture Latency
[Diagram: Q Capture internals – the LOGRD thread (SLEEP_INTERVAL) reads SOURCE1-3 changes from the DB2 log; the TransMgr (MAX_MEMORY) assembles transactions in memory (CURRENT_MEMORY); the PUBLISH thread (COMMIT_INTERVAL) puts messages on the send queues and admin queue via the MQ queue manager (MQPUT_TIME, MQCMIT_TIME, CAPTURE_IDLE, XMITQDEPTH); a restart queue holds restart information; a MONITOR thread writes notifications to an internal wait queue]
Monitor Table Column Description
IBMQREP_CAPMON (per Capture instance)
LOGREAD_API_TIME Time spent in DB2 API to read log records
LOGRDR_SLEEPTIME Sleep time of log reader thread in this monitor interval
CURRENT_MEMORY The amount of memory used by Q Capture to construct transactions
MQCMIT_TIME Time spent on MQCMIT calls
CAPTURE_IDLE Time PUBLISH is waiting for LOGRD to read transactions
IBMQREP_CAPQMON
(one row per send queue)
MQPUT_TIME Time spent on MQPUT calls
XMITQDEPTH Number of messages currently in the MQ transmit queue.
Monitoring Capture Log Read Activity
IBMQREP_CAPMON
LOGREAD_API_TIME – The number of milliseconds that the Q Capture program spent making IFI calls to retrieve log records.
NUM_LOGREAD_CALLS – The number of log read IFI calls that Q Capture made.
CURRENT_MEMORY – The amount of memory (in bytes) that Capture uses for storing log records.
ROWS_PROCESSED – The number of rows (individual insert, update, or delete operations) that the Q Capture program read from the log.
TRANS_PROCESSED – The number of transactions read from the log.
TRANS_SPILLED – The number of transactions that the Q Capture program spilled to a file after exceeding the MEMORY_LIMIT threshold ('monster' transactions).
LOGRDR_SLEEPTIME – The number of milliseconds that the Q Capture log reader thread slept because there were no changes to capture or because Q Capture is operating at its memory limit.
Monitoring Capture Log Read Activity
MAX_TRANS_SIZE – The largest transaction, in bytes, that the Q Capture program processed.
NUM_END_OF_LOGS – The number of times that Q Capture reached the end of the log.

Publisher thread activity across all send queues
CAPTURE_IDLE – The number of milliseconds that the PUBLISH thread is waiting for the LOGRD thread to produce transactions.
MQCMIT_TIME – The number of milliseconds during the monitor interval that the Q Capture program spent in MQCMIT – committing messages on all send queues.
Monitoring Capture Publishing Activity
IBMQREP_CAPQMON
ROWS_PUBLISHED – Number of rows (individual insert, update, or delete operations) that the Q Capture program put on this send queue.
TRANS_PUBLISHED – Number of transactions that Q Capture put on this send queue.
MQ_BYTES – Number of bytes put on this send queue during the monitor interval, including data from the source table and the message header.
MQ_MESSAGES – The number of messages put on the send queue during the monitor interval.
MQPUT_TIME – Number of milliseconds during the monitor interval that the Q Capture program spent putting MQ messages on this send queue.
XMITQDEPTH – Number of messages on the MQ transmission queue. If you are using parallel send queues, the value is an aggregate of all transmission queues.
Where is Capture Time Spent
Log reader
Time in DB2 + time sleeping + time processing log records
LOGREAD_API_TIME + LOGRDR_SLEEPTIME + (MONITOR_INTERVAL - LOGREAD_API_TIME - LOGRDR_SLEEPTIME)

Publisher thread
Time spent in MQ + idle time + time spent by the worker thread decoding and formatting
(MQCMIT_TIME + SUM(MQPUT_TIME)) + CAPTURE_IDLE + (MONITOR_INTERVAL - [TIME IN MQ] - CAPTURE_IDLE)

Log reader latency
MONITOR_TIME - CURRENT_LOG_TIME
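The breakdowns above are simple arithmetic over one monitor interval; a Python sketch with hypothetical values (times in milliseconds):

```python
# Sketch of the 'Where is Capture Time Spent' arithmetic above: the log
# reader's processing time and the publisher's decode/format time are
# the remainders of the monitor interval. Values are hypothetical.

def logreader_budget(interval_ms, logread_api_ms, sleep_ms):
    return {"db2_api": logread_api_ms,
            "sleeping": sleep_ms,
            "processing": interval_ms - logread_api_ms - sleep_ms}

def publisher_budget(interval_ms, mqcmit_ms, mqput_ms_total, idle_ms):
    time_in_mq = mqcmit_ms + mqput_ms_total
    return {"mq": time_in_mq,
            "idle": idle_ms,
            "decode_format": interval_ms - time_in_mq - idle_ms}

lr = logreader_budget(10_000, 3_000, 5_000)      # 10s monitor interval
pub = publisher_budget(10_000, 1_200, 2_800, 4_000)
```

If the "processing" or "decode_format" remainder dominates, Capture itself is CPU-bound rather than waiting on DB2 or MQ.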
Capture Performance
Workload: 3 updates per UOW, 3 threads on each of 6 tables; MONITOR_INTERVAL = 10 seconds
[Chart: Capture throughput over the run – annotations mark where the workload started and where a CPU constraint slowed Capture]
Capture Performance
Workload: large batch sequential delete process; MONITOR_INTERVAL = 60 seconds
[Chart: Capture metrics over the run – nothing of concern showing]
Tuning Capture
• SLEEP_INTERVAL
– Defined in IBMQREP_CAPPARMS
– How long the log reader thread sleeps when the end of the log is reached
– Default – 500ms
– For very busy systems you may not need to reduce this, as the end of the log will not be reached. However, most systems have quieter periods, so the recommendation is to reduce it from the default.
– Recommend – 50ms
• TRANS_BATCH_SZ
– Defined in IBMQREP_CAPPARMS
– How many source transactions are packaged by Capture into a single MQ message
– The aim is to avoid small MQ messages
– Default – 1 (i.e. no batching)
– For OLTP systems with small transactions and high volume, consider increasing
• MEMORY_LIMIT
– Defined in IBMQREP_CAPPARMS
– The amount of memory Capture uses to build transactions
– If the memory limit is exceeded, Capture spills to file
– Default – 0
– The default is usually acceptable on z/OS; a good alternative is 200MB
Tuning Capture
• NUM_PARALLEL_SENDQS
– Defined in IBMQREP_SENDQUEUES
– How many send queues will be used to replicate transactions for a queue map
– Default – 1
– For very high volume replication, parallel send queues will improve MQ throughput by utilising multiple send queues/transmit queues/channels
• CHANGED_COLS_ONLY
– Defined in IBMQREP_SUBS
– Determines whether Capture will publish non-key columns that have not been changed
– Affects UPDATE operations only
– Default – 'Y'
– For tables with many columns and only a few updated, this will reduce the amount of data sent to Apply
– If used, CONFLICT_ACTION = 'F' can NOT be used
– Recommend – 'Y'
Understanding Apply Latency
[Diagram: Q Apply internals – the BROWSER gets TRANSACTIONS or HEARTBEAT messages from the receive queue, builds in-memory transactions, and places them on an internal WORKQ; a pool of apply agents takes transactions from the WORKQ and executes the SQL operations over DB2 connections to the user tables; completed work moves to an internal DONEQ and DONEMSG entries are pruned; a RETRY agent re-drives failed transactions from an internal RETRY queue; control tables track state]
Monitor Table Column Description
IBMQREP_APPLYMON
(one row per receive queue)
At the TARGET database
ROWS_PROCESSED Total number of rows in the process of being applied to target database.
APPLY_LATENCY Latency time spent in Apply – between target MQGET and target DB commit
DBMS_TIME Time spent in target database for SQL processing
APPLY_SLEEP_TIME Total sleep time of all apply agents in this monitor interval
DEPENDENCY_DELAY Total time transactions waited for dependent transactions to finish
MCGSYNC_DELAY Total time transactions waited for other CG (browser) to move up applyuptopoint, when operating in synchronized apply mode
WORKQ_WAIT_TIME Total time transactions waited in the WORKQ for an agent to execute them.
RETRY_TIME Total time a failed transaction was tried again.
Interpreting Apply Latency Counters
Latency counters are an average per transaction for the monitor interval
– Q Apply adds up latency values for each transaction, and divides the final sum by the number of transactions during the monitor interval
– All counters are reset at each monitor interval
– If there are no transactions applied in the monitor interval, latency is 0
– All times are calculated using GMT
– All times are reported in milliseconds
Monitoring Apply Throughput
IBMQREP_APPLYMON
APPLY_SLEEP_TIME
– The number of milliseconds that Q Apply agents for this receive queue were idle while waiting for work, summed across all agents
– e.g. If there are 10 apply agents and they were each 50% busy for a 10-second interval, APPLY_SLEEP_TIME = 10 x 5,000ms = 50,000ms (50 seconds)
TRANS_READ
– The total number of transactions retrieved from this receive queue during the monitor interval – including those not yet committed
TRANS_APPLIED
– The total number of transactions from this receive queue that Q Apply committed to the target
ROWS_PROCESSED
– The number of rows that were read from receive queues and applied, including those not yet committed to the target
ROWS_APPLIED
– The total number of rows that were read from this receive queue and that the Q Apply program has committed to the target
SPILLED_ROWS
– The number of rows that the Q Apply program sent to temporary spill queues while targets were being loaded or while Q subscriptions were placed into a spill state by the spillsub command
SPILLEDROWSAPPLIED
– The number of spilled rows that were applied to the target
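Because APPLY_SLEEP_TIME is summed across all agents, a useful derived number is the agents' busy fraction over the monitor interval. A small sketch with hypothetical values:

```python
# Sketch: derive agent utilisation from APPLY_SLEEP_TIME. Total agent
# time available in an interval is num_agents * interval; the busy
# fraction is what is left after sleep. Values are hypothetical.

def agent_busy_fraction(apply_sleep_ms, num_agents, interval_ms):
    total_agent_ms = num_agents * interval_ms
    return 1.0 - apply_sleep_ms / total_agent_ms

# 10 agents, 10-second interval, each idle half the time:
busy = agent_busy_fraction(50_000, 10, 10_000)
# a busy fraction near 1.0 (sleep near 0) means all agents are busy -
# consider more agents, or look for a CPU constraint
```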
Monitoring Apply Serialization/Contention
TRANS_SERIALIZED
– The total number of transactions that conflicted with another transaction – often because of a row conflict or a referential integrity conflict, but also includes rows serialised as a result of a low MAXAGENTS_CORRELID. The Q Apply program suspends parallel processing and applies the row changes within the transaction in the order they were committed at the source. A high value may not necessarily be a problem.
DEPENDENCY_DELAY
– The average time in milliseconds that a transaction had to wait due to dependencies on prior transactions. A high value often means the same row is being updated repeatedly and therefore updates have to be serialised. Aim to keep this close to 0.
KEY_DEPENDENCIES
– The total number of row updates that were serialised because more than one transaction is updating the same row (based on the primary key/replication key). Will result in dependency delay.
UNIQ_DEPENDENCIES
– The total number of unique index constraints that were detected (not the primary key/replication key), forcing transactions to be serialized. Will result in dependency delay.
JOB_DEPENDENCIES
– The number of transactions that are delayed because of correlation ID dependencies. Will result in dependency delay.
MONSTER_TRANS
– The number of transactions that exceeded the MEMORY_LIMIT for the receive queue set in the IBMQREP_RECVQUEUES table. While a monster transaction is being processed, no other transactions are applied!
RI_DEPENDENCIES
– The total number of referential integrity conflicts that were detected (across transactions), forcing transactions to be serialized, e.g. an attempt to insert a child row before its parent is inserted.
Monitoring Apply Serialization/Contention
UNIQ_RETRIES
– The number of times that Q Apply tried to re-apply rows that were not applied in parallel because of unique index constraints (reported by UNIQ_DEPENDENCIES).
RI_RETRIES
– The number of times that Q Apply had to re-apply row changes because of referential integrity conflicts (reported by RI_DEPENDENCIES).
DEADLOCK_RETRIES
– The number of times that Q Apply re-applied row changes because of lock timeouts and deadlocks.
NOTE: After 3 attempts (the default) the UOW will be put on the queue to be processed by apply agent #1. If there are many deadlocks/timeouts they will therefore be serialised through a single apply agent.
ROWS_NOT_APPLIED
– The number of rows that could not be applied, and were added to the IBMQREP_EXCEPTIONS table.
OKSQLSTATE_ERRORS
– The number of row changes that caused an SQL error that is defined as acceptable in the OKSQLSTATES field of the IBMQREP_TARGETS table. The Q Apply program ignores these errors.
Monitoring Apply Latency
DEPENDENCY_DELAY
– The average time in milliseconds that a transaction had to wait due to dependencies on prior transactions.
MCGSYNC_DELAY
– When running in synchronized mode for multiple consistency groups (MCGs) with other Apply programs, this is the average time per transaction in milliseconds that Apply had to wait for another Apply to come into sync. Check IBMQREP_MCGMON.
WORKQ_WAIT_TIME
– The average time in milliseconds that each transaction waited on the in-memory work queue for an agent to apply it to the target database. A high value could mean apply agents are saturated, or there are CPU constraints.
RETRY_TIME
– The average time in milliseconds that transactions spent being retried for SQL failures due to deadlocks, lock timeouts, RI violations, or secondary unique constraint violations.
DBMS_TIME
– The average elapsed time in milliseconds spent in DB2 per transaction.

APPLY_LATENCY =
DEPENDENCY_DELAY + MCGSYNC_DELAY +
WORKQ_WAIT_TIME + RETRY_TIME + DBMS_TIME
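Since APPLY_LATENCY is the sum of the five components above, the largest term points at the bottleneck. A minimal sketch with hypothetical APPLYMON values:

```python
# Sketch: decompose APPLY_LATENCY into the five components listed above
# and report the dominant one. Values are hypothetical (milliseconds).

def apply_latency_breakdown(dependency, mcgsync, workq_wait, retry, dbms):
    parts = {"dependency_delay": dependency,
             "mcgsync_delay": mcgsync,
             "workq_wait_time": workq_wait,
             "retry_time": retry,
             "dbms_time": dbms}
    return sum(parts.values()), max(parts, key=parts.get)

total, bottleneck = apply_latency_breakdown(5, 0, 420, 0, 35)
# here workq_wait_time dominates: agents are saturated, or CPU is short
```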
Why is Apply Latency High?
APPLY_SLEEP_TIME
– Sum across all agents (in ms). E.g. 32 agents, each sleeping 10 seconds – 32 x 10,000 = 320,000
– 0 = all agents busy
• Bottleneck – in DB2, # agents, or a CPU constraint? Check workq_wait_time
• If only a few agents are working, is there a hot row? Check dependency_delay
• A hot row will not cause excessive delay if enough CPU resource is available. Check dbms_time
DBMS_TIME
– Average elapsed time per DB2 transaction (in ms)
– Q Apply gets the system timestamp before and after each DB2 insert/update/delete/commit operation
– rows_applied / trans_applied = average transaction size
– Did dbms_time increase at the time latency increased?
– If dbms_time is high:
• Look at DB2 accounting data – DB2 locking, synchronous I/O, etc
• Compare source and target system class 2 times to determine where the bottleneck may be
NOTE: Even if dbms_time appears 'small', just a modest increase may push latency up. E.g. dbms_time = 35ms during a slowdown versus 12ms before it – 35ms is still nearly 3 times slower!
The target system must deliver DB2 performance equivalent to the source system
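The two checks above – average transaction size and the dbms_time slowdown ratio – are one-liners worth scripting against the monitor tables. A sketch with hypothetical counter values:

```python
# Sketch of the DBMS_TIME diagnostics above: average transaction size
# from the APPLYMON counters, and how much slower DB2 work is now
# compared with a known-good baseline. Values are hypothetical.

def avg_trans_size(rows_applied, trans_applied):
    return rows_applied / trans_applied

def dbms_slowdown(dbms_time_now_ms, dbms_time_baseline_ms):
    return dbms_time_now_ms / dbms_time_baseline_ms

size = avg_trans_size(120_000, 40_000)  # 3.0 rows per transaction
ratio = dbms_slowdown(35, 12)           # 'small' values, yet ~3x slower
```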
Why is Apply Latency High?
DEPENDENCY_DELAY
– A high dependency delay can be a symptom of a CPU constraint
– Or could be the result of a 'hot row'
– Avoid using trans_batch_sz if there are hot rows – it may cause excessive serialization
CURRENT_MEMORY
– A value near the IBMQREP_RECVQUEUES MEMORY_LIMIT indicates a slowdown in Apply: the apply agents are not keeping up, or there is a monster transaction. Check dbms_time and monster_trans
MCGSYNC_DELAY
– 0 indicates this is the slow consistency group
DEADLOCK_RETRIES
– Can be the result of a deadlock or a timeout
– Apply retries after 2 seconds
– Apply retries up to IBMQREP_APPLYPARMS (DEADLOCK_RETRIES) times – the default is 3
– Apply serialises on apply agent 1 after attempting DEADLOCK_RETRIES times
Apply Performance
Workload: 3 updates per UOW, 3 threads on each of 6 tables; MONITOR_INTERVAL = 10 seconds
[Chart: WORKQ_WAIT_TIME ≈ APPLY_LATENCY – the number of agents is too low, or the system is CPU constrained; in this instance, CPU constrained]
Apply Performance
Workload: large batch sequential delete process; MONITOR_INTERVAL = 10 seconds
[Chart: APPLY_LATENCY ≈ DEPENDENCY_DELAY – serialisation due to job dependencies; MAXAGENTS_CORRELID is set to 1]
Why Row Locking may be Required
At the source, batch jobs control the order in which transactions are executed
– For example, thread1 may update rows with keys 1-1000000, and thread2 the rows with keys 10000001-20000000, of a given table
– Each thread will commit frequently for best performance, for example committing after every 500 row updates
At the target, Apply replays transactions in parallel
– Apply only serialises if there are dependencies
– Parallel Apply will group multiple updates with commit_count
– More than one apply agent requesting page locks together can cause a deadlock
– It is not uncommon for batch jobs that sequentially update each and every row of a table to create deadlocks in Q Apply with PAGE LEVEL locking at the target
– Reducing the number of apply agents with MAXAGENTS_CORRELID, or NUM_APPLY_AGENTS, can mitigate the problem by reducing the probability of hitting the same page – but at the expense of throughput
For these cases, row-level locking may be necessary
Monitoring MQ Activity
IBMQREP_CAPQMON (one row per QMAP)
MQ_BYTES – The number of bytes put on send queues for this queue map during the monitor interval, including message data and the message header.
MQ_MESSAGES – The number of messages put on the send queue during the monitor interval.
MQPUT_TIME – The number of milliseconds during the monitor interval that the Q Capture program spent interacting with the MQ API for putting messages to the send queues for the queue map.
XMITQDEPTH – The queue depth (number of messages) on the transmit queue.
IBMQREP_CAPMON (one row per Capture)
MQCMIT_TIME – The number of milliseconds during the monitor interval that the Q Capture program spent in MQ commit calls.
IBMQREP_APPLYMON (one row per receive queue)
QDEPTH – Number of messages on the receive queue at monitor_time.
MQ_BYTES – The number of bytes of data read from all receive queues during the monitor interval, including message data and header.
NUM_MQGETS – The number of browser MQGET calls to retrieve a message from the receive queue during the monitor interval; MQGET calls that do not retrieve a message are not counted.
QLATENCY – Average time that messages spend in MQ. It is calculated as the difference in milliseconds between the Q Capture put on the send queue and the Q Apply get on the receive queue.
MQGET_TIME – The number of milliseconds during the monitor interval that the Q Apply browser spent in MQGET calls on the receive queue. The value includes MQGET calls that do not return messages (queue is empty).
MQ Performance
Workload: large batch sequential delete process; MONITOR_INTERVAL = 60 seconds
[Chart: a high XMITQDEPTH shows an MQ bottleneck]
Where is the Bottleneck?
Performance Best Practices
Best Practice – General Configuration
Have a dedicated MQ queue manager for each Capture and Apply
– Allows bigger bufferpools (1GB limit)
– Prevents potential logging contention
Run the Capture/Apply programs with a WLM service class as high as DB2 MSTR
Run the MQ channel initiator task in WLM SYSSTC
Use parallel send queues – at least 2 per consistency group
– At very long distance (1000s of kilometres) it is essential to have more than 2 queues
Use column suppression (changed_cols_only=y) – reduces the amount of data transmitted over the network for updates and deletes
Use DB2 row-level locking (at least at the target) for tables updated by sequential-access batch jobs
Q Apply is a busy DB2 application! Must ensure the target DB2 can handle it.
Best Practice – Capture
Use DB2 log read IFI 306 filtering (V10.2.1) (requires DB2 V10 APAR PM90568 or DB2 V11)
Set sleep_interval = 50ms (default 500ms)
Memory_limit does not have a significant impact – the default of 0, or 200MB, is OK
– Increase if spilling because of monster transactions. Check IBMQREP_CAPMON (trans_spilled)
Set trans_batch_sz = 4 or more
– But only if there is no issue with 'hot row' workloads; i.e. if applymon (dependency_delay) is high, use trans_batch_sz = 1 (the default)
– Goal: get the MQ message size above 10KB for optimal throughput
– Combine with max_message_size = 1MB (large transactions will not be batched beyond 1MB)
Run Q Capture with warntsx to detect very large transactions
– Large transactions will cause high latency
– The source application must commit frequently
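The >10KB message-size goal above can be checked with simple arithmetic; a sketch (the per-message header overhead is an assumed illustrative figure, not a documented QRep constant):

```python
# Sketch: estimate MQ message size for a given trans_batch_sz and find
# the smallest batch that clears a target size. HEADER_BYTES is an
# assumption for illustration only.

HEADER_BYTES = 400  # assumed per-message overhead, illustrative only

def est_message_bytes(avg_row_bytes, rows_per_trans, trans_batch_sz):
    return trans_batch_sz * rows_per_trans * avg_row_bytes + HEADER_BYTES

def min_batch_for(target_bytes, avg_row_bytes, rows_per_trans):
    batch = 1
    while est_message_bytes(avg_row_bytes, rows_per_trans, batch) < target_bytes:
        batch += 1
    return batch

# 200-byte rows, 3 rows per transaction: what batch clears 10KB?
batch = min_batch_for(10 * 1024, 200, 3)
```

Remember the hot-row caveat above: a batch size derived this way is only safe if dependency_delay stays low.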
Best Practice – Apply
Use parallel_sendqueues=Y
– Ensures 'get by msgid' is used – even if not using parallel send queues
Set maxagents = 32 or more – check agent_sleep_time. Maximum supported = 128
– BUT reduce the number of agents if CPU is constrained at the target! 16 performs better than 32 in that case
maxagents_correlid may be needed if lock timeouts or deadlocks occur during batch jobs
– Check IBMQREP_APPLYMON (DEADLOCK_RETRIES)
Use multi-row-insert=Y (the default) – exploits DB2 multi-row insert (V10.2.1)
Set prune_batch_sz = 100
Best Practice – MQ
Set batchlim(1MB) – requires PTF PM79000/UK90868 with MQ 7.1
Set channel batchsz(800)
– Determines the maximum number of messages in a unit of work sent across the MQ channel. 800 is best for OLTP workloads. Must also set Q Capture MAX_TRANS accordingly
Use MQGET for delete (MQ 7.1), Q Rep APAR PM90110
– Improves Q Apply pruning performance, keeping the receive queue small. Requires MQ APAR PM81785
Use 1GB of bufferpool space – divided among the receive queues for the queue manager
Enable MQ bufferpool read-ahead
– RAHGET is for large messages and adds significant performance benefits for MQGET when the queue is spilling to the pageset and the message sizes are > ~30k (RECOVER QMGR(TUNE RAHGET ON))
– READAHEAD is for smaller messages (RECOVER QMGR(TUNE READAHEAD ON))
Migrate to MQ V8 or later (significant improvements for Q Rep in MQ), including:
– 64-bit bufferpool: more fast storage for queue access; allows more data on a queue before spilling to disk; potentially 64GB of (each) XMITQ entirely in storage
– Enhanced deferred write (cast-out): faster MQPUT by Capture when a queue exceeds its bufferpool
– Enhanced logging: particularly for larger messages
Best Practice – MQ
Size XMIT queues and receive queues to support 24 hours of messages
– QRep problems can occur, and can take time to resolve
– Queues can build quickly in high volume systems
Separate QRep application queues and SYSTEM queues onto different pagesets
– Queue-full conditions will prevent MQ commands being entered
Best Practice – Workload Considerations
If possible, avoid:
– Large LOB data – use inline LOBs whenever possible
– Very large purge jobs – consider running them at each site instead (if they cause increased replication latency)
– Very large transactions, e.g. millions of row changes in a single unit of work – will result in monster transactions causing Apply to serialize
Use row-level locking at the target where necessary
– May need to increase NUMLKUS accordingly
Add more QMAPs if hot-row workloads cause latency spikes
– Move the table with a hot row into its own replication queue
Session Feedback
• Please submit your feedback at http://conferences.gse.org.uk/2016/feedback/ic
• Session is IC