IBM InfoSphere Data Replication’s Change Data Capture (CDC) LUW User Exits

© 2016 IBM Corporation

Information Management Software


A Note About Code Samples

Please note that this presentation includes some code samples, presented for informational purposes to demonstrate how to implement a user exit, and to illustrate some possible use cases for user exits in IIDR’s CDC LUW engines.

As such the code is not supported by IBM, and the discussed use cases are not presented as recommendations.

Licensed Materials - Property of IBM
IBM InfoSphere Change Data Capture
5724-U70

(c) Copyright IBM Corp. 2009 All rights reserved.

The following sample of source code ("Sample") is owned by International Business Machines Corporation or one of its subsidiaries ("IBM") and is copyrighted and licensed, not sold. You may use, copy, modify, and distribute the Sample in any form without payment to IBM.

The Sample code is provided to you on an "AS IS" basis, without warranty of any kind. IBM HEREBY EXPRESSLY DISCLAIMS ALL WARRANTIES, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow for the exclusion or limitation of implied warranties, so the above limitations or exclusions may not apply to you. IBM shall not be liable for any damages you suffer as a result of using, copying, modifying or distributing the Sample, even if IBM has been advised of the possibility of such damages.


What is a User Exit?

• A CDC User Exit is a subroutine that will be invoked when a pre-defined event occurs. You can customize the replication and related functionality of CDC to fit your business requirements by utilizing User Exits (UE).

• They can be invoked, for example:

• To calculate a column value to be assigned to a column on the target side.

• Before or after database operations are applied to the target.

• On the occurrence of some CDC event, such as latency threshold exceeded or replication ends abnormally.

• Further explanations follow in this presentation along with some practical examples.

• Note, all examples provided here are included for information purposes only, to illustrate user exit concepts and possible usages. They are available on request but are given on an ‘as-is’ basis as samples which you can freely use, but for which no support can be given.


Types of User Exits

• Subscription Level – executed on the target side for subscription-level events.

• Row Level – executed on target side based on row changes.

• Derived Expressions – used to compute a derived expression value to send to the target as though it were a column value. Can be configured to execute on source or target.

• Notifications – can be triggered as a customized response to some event on the source or on the target such as latency threshold exceeded, or an abnormal shutdown of replication. The user exit would operate in place of email notification implemented within the engine. Executes on source or target depending on selection of source or target events.

• Conflict Resolution – provide customized resolution beyond out-of-the-box resolvers (e.g. Source Wins). Executes on target side.

• Custom Flat File Formatter – DataStage flat file and WebHDFS target user exit typically used to override formatting.


The Plus & Minus of User Exits

• The Plus: User Exits can be used to customize the behavior of CDC, adding functionality not available in the product.

• The Minus: As a user-added customization, user exit code is not part of the product and, being custom-developed, cannot be supported by the CDC product team. For example, a table-level or row-level user exit is invoked for every operation, which adds performance overhead. If you encountered a performance issue with CDC and raised a PMR for it, the CDC support team would not be able to assist you with tuning the user exit.

• This is a trade-off which you must consider carefully before implementing a User Exit. Many useful and interesting things can be done with user exits, but the question to ask yourself is “Is this functionality essential to my business use case?”. If the answer is ‘yes’, and you are prepared to support the implementation, then a User Exit may be the right answer for you.


User Exit Implementations

• In the CDC LUW Engine used for DB2 LUW, Oracle, MS SQL Server, Sybase, Informix, Teradata, DataStage and Event Server, user exits can be implemented as Java programs or as database stored procedures. With User Exits, you can retrieve the columns that were replicated from the source table, journal control columns and system values and manipulate their values within the user exit.

• User exits are also supported by the CDC engine for IBM i and CDC for z/OS. These are the subject of a separate presentation and are not discussed here.


The Java User Exit API Documentation

• The documentation is available in Javadoc format and is located in the CDC installation directory under docs/api.

• You should begin by reading the page help-doc.html in your browser, to understand the general structure of the Javadoc.

• A listing of the packages and classes of the API can be found by viewing the index.html file.

• Some sample user exits follow in this presentation that can be used as an aid to understanding the use of the API.


Some Guidelines for Java User Exit Development

• Keep your code as efficient as possible, doing only what absolutely must be done in the user exit.

When you invoke a user-written program at a defined exit point, it is important to realize that a call is issued each time a clear, insert, update or delete operation is applied to a target table. Therefore, when data replication activity is high, overall throughput and resource utilization are directly affected by the actions implemented in the user-written programs.

• Be aware that all user exit programs which are called from a CDC engine from CDC 6.5 to IIDR 11.3.3 will be run with an IBM Java Runtime Environment (JRE) of version 1.6.

• Although most Java Development Kits with release 1.6 or higher can be used to compile classes which use the jar files that come with the CDC engines, we suggest that you use the IBM JDK for your development environment to be in sync with the required running environment.

(Cont’d…)


Some Guidelines (cont’d)

• The classpath for compiling your user exit classes must include the ts.jar, which can be found in the lib directory of the CDC engine product.

Consider copying the ts.jar to your development environment if your development environment does not have direct access to the directories which contain the CDC product.

• In order to deploy your user exit classes, they must be available in the classpath of the CDC engine.

It is recommended practice to place these in the <cdc_home>/lib directory as the engine will include that directory in its classpath. You can place the compiled class files there in a package directory structure. If you wish to package them into a jar file, you can place it in the CDC jre ext directory where the JVM will find it.

• If you make changes to user exit programs, you will have to restart the CDC instance to ensure the JVM picks up the new version of the class.

• Note: If a new version of CDC is installed, including a new fixpack or build, the version of ts.jar will be new as well, and it may be necessary to recompile your user exits using the new version of ts.jar.


Stored Procedure User Exits

• Besides running user exits written in Java that are called from the CDC replication engine, you can also configure stored procedures. Stored procedures are compiled programs which are physically stored in a database and which, when called, are run by the database engine. If you have to execute complex operations or calculations, it may prove more efficient to have the database engine do the work, which may also provide additional scalability. Stored procedures do have a performance impact, however, which has to be considered and balanced against the need.

• When used in a table-level or row-level exit point, stored procedures are executed in line with the operations that CDC applies into the target tables. Both share the same database connection which ensures that when a stored procedure user exit is called from an after-operation exit point, the changes to the table that were made by ICDC are visible to the stored procedure user exit program.

• With CDC, you can retrieve the columns that were replicated from the source table, journal control columns and system values and pass their values to the stored procedure when it is called.

• An example of the use of a stored procedure user exit could be when you want to replicate the contents of a single table to two target tables. In a standard CDC configuration this would require you to create 2 subscriptions, each mapping the same table but having a different destination table. A stored procedure lessens the load on the source by moving the processing for the second table to the target database. Also with 2 subscriptions running on the source, since they run independently of each other, the transaction consistency between the target tables could not be guaranteed. With a stored procedure moving data to the second table, the target tables would be transactionally consistent.

• Later in the presentation, we will provide an example of a stored procedure and show how to configure CDC to call it.


Some Common Uses for User Exits

• The most common use case for InfoSphere CDC User Exit programs is execution of additional actions on the apply side of the replication process.

• A CDC subscription replicates changes from a source to a target database and based on the type of operation, a User Exit could be invoked to call a custom user exit routine.

• Example:

In an application integration scenario you could exchange the data from your source application to your target application using database tables. Then, as application transactions are applied to the target database, you might want to notify the receiving application that there is a transaction to be processed by means of a message on a queue. The target application just has to monitor the queue for new incoming messages and then when a message arrives, pick up the transaction details from the tables that have been populated by the subscription apply process.

A similar thing could be done for the same scenario by building an XML document based on the incoming operations and then posting this document on an Enterprise Service Bus when the transaction is committed.


User Exit Example: Row Filtering Use Case

Row filtering and user exits are sometimes used for scaling the replication if the target database can only handle a certain throughput of operations in a single database session. By using a row filter on the key columns you can then share the work for this single table across multiple subscriptions, thereby initiating multiple database sessions on the target side and increasing the throughput. Using a derived expression user exit provides flexibility in how you can design the row filter. For example, if the replicated table has a numeric key column, a user exit could calculate the modulo value of this numeric key with a certain divisor and allow you to use the remainder in your row selection. An example of such a user exit follows.


Row Filtering Derived Expression User Exit Example

The following is an example of a Java user exit for a derived expression, to demonstrate a possible use case, along with a simple code example.

The following UEModuloFilter class is an implementation of the ICDC derived expression interface and can be called using the %USERFUNC("JAVA") function from within ICDC, either on the source or the target.

This function could be used to do row filtering at the source. For example, if a high volume table has an integer primary key column (KEYCOLUMN) and you wanted to divide the workload of replicating this table across 3 subscriptions to have multiple database sessions apply the transactions on this table in parallel, you could create 3 subscriptions, each subscription mapping the same table. However, in the row filter condition you would specify different values as shown in the following list:

1. RF_SUB1: %USERFUNC("JAVA","UEModuloFilter",KEYCOLUMN,3,0)
2. RF_SUB2: %USERFUNC("JAVA","UEModuloFilter",KEYCOLUMN,3,1)
3. RF_SUB3: %USERFUNC("JAVA","UEModuloFilter",KEYCOLUMN,3,2)

When processing the operations for this table, each subscription would calculate KEYCOLUMN%3 and determine whether the result equals the remainder value specified as the last parameter. Effectively, the first subscription would replicate all rows where KEYCOLUMN is evenly divisible by 3 (remainder 0), the second subscription would replicate the rows where the remainder is 1, and the third those where the remainder is 2. If the KEYCOLUMN values are distributed evenly across the three remainders, the subscriptions complement each other and each takes about one-third of the workload.
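The partitioning arithmetic can be checked outside of CDC. The following is a minimal stand-alone sketch, not part of the IBM sample: the class ModuloPartitionDemo and its methods are hypothetical names, and it does not call the CDC API. It also illustrates a caveat that depends on an assumption about your data (that negative key values can occur): in Java the % operator returns a negative result for a negative dividend, so a key of -1 would match none of the three filters unless the remainder is normalized.

```java
/** Stand-alone illustration of the remainder test; it does not use the CDC API. */
class ModuloPartitionDemo {

    /** Same comparison as the row filter: dividend % divisor == remainder. */
    static boolean matches(long dividend, long divisor, long remainder) {
        return (dividend % divisor) == remainder;
    }

    /** Variant that maps negative keys onto 0..divisor-1 (assumes divisor > 0). */
    static boolean matchesNonNegative(long dividend, long divisor, long remainder) {
        long normalized = ((dividend % divisor) + divisor) % divisor;
        return normalized == remainder;
    }

    /** Counts how many of the given keys each remainder 0..divisor-1 would select. */
    static int[] countPerRemainder(long[] keys, int divisor) {
        int[] counts = new int[divisor];
        for (long key : keys) {
            for (int r = 0; r < divisor; r++) {
                if (matches(key, divisor, r)) {
                    counts[r]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] keys = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        // Prints [3, 3, 3]: each of the three filters selects one-third of the keys.
        System.out.println(java.util.Arrays.toString(countPerRemainder(keys, 3)));
        // -1 % 3 evaluates to -1 in Java, so the plain test misses remainder 2.
        System.out.println(matches(-1, 3, 2));            // prints false
        System.out.println(matchesNonNegative(-1, 3, 2)); // prints true
    }
}
```

If the key column can never be negative, the plain comparison is sufficient. Otherwise, rows with negative keys would be filtered out by every subscription, so a normalized remainder is one possible way to keep the three filters exhaustive.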

Debugging Tip

When specified in the row filter for a replicated table, CDC calls the invoke() method of the class for every log entry of that table. Should an invalid numeric value be passed as a parameter, the resulting exception is caught and written to CDC's tracing facility before a UserExitInvalidArgumentException is thrown. The trace entry is logged in the current CDC trace file under the <cdc_home>/instance/<instance>/log directory. This is generally a useful tracing technique for any user exit you might implement, as an aid to debugging.


Row Filtering Derived Expression UE Example

Description: This user exit example performs a modulo operation against a specified column and determines if it equals the specified remainder value. The parameters passed into the user exit are:

a) Dividend -> Numeric value to perform the modulo function against

b) Divisor -> Numeric value by which the Dividend will be divided

c) Remainder -> Remainder value to be tested

The comparison logic is as follows:

<Dividend> % <Divisor> == <Remainder>

Instructions for use:

1. Compile the user exit
2. Copy the UEModuloFilter65.class file to the <cdc install>/lib directory
3. Go to the mapping details for the table
4. Under Filtering, specify the following:

%USERFUNC("JAVA", "UEModuloFilter65", <Dividend column>, <Divisor value>, <Remainder>)

Return value is boolean, true or false.


Configuring the Row Filtering Derived Expression example


Row Filter Derived Expression UE Example Code

package ue.example;

import java.math.BigDecimal;

import com.datamirror.ts.derivedexpressionmanager.DEUserExitIF;
import com.datamirror.ts.derivedexpressionmanager.UserExitInvalidArgumentException;
import com.datamirror.ts.derivedexpressionmanager.UserExitInvokeException;
import com.datamirror.ts.util.Trace;

public class UEModuloFilter implements DEUserExitIF {

    // Returns Boolean.TRUE when <Dividend> % <Divisor> == <Remainder>, otherwise Boolean.FALSE
    public Object invoke(Object[] aobjList) throws UserExitInvalidArgumentException, UserExitInvokeException {
        try {
            long dividendColumn = ((BigDecimal) aobjList[0]).longValue();
            long divisorValue = ((BigDecimal) aobjList[1]).longValue();
            long remainderValue = ((BigDecimal) aobjList[2]).longValue();
            return Boolean.valueOf((dividendColumn % divisorValue) == remainderValue);
        } catch (ClassCastException e) {
            // Piggyback on the ICDC logging facility
            Trace.traceAlways(e);
            throw new UserExitInvalidArgumentException(
                "Invalid number parameter passed "
                + "to the user function, arguments passed: [0]="
                + aobjList[0] + ", [1]=" + aobjList[1] + ", [2]="
                + aobjList[2] + ", Message: " + e.getMessage());
        }
    }
}


User Exit Example: A Row-level User Exit Implementing a Soft Delete

It might be desirable to have CDC perform ‘soft deletes’ on a target table, where instead of physically deleting a row, an update of the row is performed with a column value set to a value that marks the row as deleted on the source. This is generally referred to as a soft delete. CDC does not contain functionality to do this, but it could be done with a user exit.

The following UESoftDelete user exit class example implements the UserExitIF interface and must be specified in the User Exits tab under the table mapping. Also, the before-delete exit point checkbox must be checked for this example. (For a row-level user exit, any checkbox can be selected as long as there is at least one so that the user exit will be invoked).

During the initialization of the user exit (i.e. the code provided in the init() method), the parameters that were passed, if any, are evaluated. Although not shown here, an optimization has been built into the example so that it updates only a few columns rather than all columns in the row, which optimizes the SQL statement and its execution on the target.

(Cont’d on next slide…)


A Row-level Java User Exit example (cont’d)

The example explicitly un-subscribes from all possible events and then subscribes to the BEFORE_DELETE_EVENT. By specifying this in the code, you are preventing the situation where the user exit would be called on an exit point that it cannot handle. The SoftDelete user exit must only be used in the event of a delete operation, hence the registration for this event. In your user exits you should practice defensive coding to minimize risk to the CDC process.

The processReplicationEvent() method will be invoked for every event to which the user exit is subscribed. During first time processing, the method obtains a shared connection to the target database from the CDC engine and uses the same session as CDC to update the target tables. Additionally, it obtains the list of columns that must be populated in the SET clause of the UPDATE statement and the list of key columns for the WHERE clause. When a before-delete event is detected and the processReplicationEvent() method is called, it will do an update of the existing row instead of a delete.
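To make the SET and WHERE handling concrete, here is a minimal sketch of how such a parameterized UPDATE statement could be assembled from the two column lists. It is an illustration only and not the code of the UESoftDelete sample: the class SoftDeleteSqlSketch, its method names, and the table and key column names are hypothetical, and the JDBC execution on the shared connection is left out.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds the soft-delete UPDATE text; it does not use the CDC API. */
class SoftDeleteSqlSketch {

    /** Joins "COLUMN = ?" fragments with the given separator. */
    static String placeholders(List<String> columns, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(columns.get(i)).append(" = ?");
        }
        return sb.toString();
    }

    /** Builds "UPDATE <table> SET <set columns> WHERE <key columns>" with ? markers. */
    static String buildUpdate(String table, List<String> setColumns, List<String> keyColumns) {
        if (setColumns.isEmpty() || keyColumns.isEmpty()) {
            throw new IllegalArgumentException("SET and key column lists must not be empty");
        }
        return "UPDATE " + table
                + " SET " + placeholders(setColumns, ", ")
                + " WHERE " + placeholders(keyColumns, " AND ");
    }

    public static void main(String[] args) {
        // ORDERS and ORDER_ID are made-up names used only for this illustration.
        String sql = buildUpdate("ORDERS",
                Arrays.asList("ENTRY_TYPE", "AUD_TIME"),
                Arrays.asList("ORDER_ID"));
        // Prints: UPDATE ORDERS SET ENTRY_TYPE = ?, AUD_TIME = ? WHERE ORDER_ID = ?
        System.out.println(sql);
    }
}
```

Building the statement text once and reusing it as a prepared statement for every delete keeps the per-row cost low, which matters because the user exit is invoked for each delete operation.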


UESoftDelete Example

Note that code samples hereafter are for illustration and are deliberately incomplete for display purposes.

// ICDC specific imports
import com.datamirror.ts.target.publication.userexit.DataTypeConversionException;
import com.datamirror.ts.target.publication.userexit.ReplicationEventIF;
import com.datamirror.ts.target.publication.userexit.DataRecordIF;
import com.datamirror.ts.target.publication.userexit.ReplicationEventPublisherIF;
import com.datamirror.ts.target.publication.userexit.ReplicationEventTypes;
import com.datamirror.ts.target.publication.userexit.UserExitException;
import com.datamirror.ts.target.publication.userexit.UserExitIF;

public class UESoftDelete implements UserExitIF {
    …

    public void init(ReplicationEventPublisherIF publisher) throws UserExitException {
        // Process the list of columns that will be updated on soft delete
    }

    public boolean processReplicationEvent(ReplicationEventIF event) throws UserExitException {
        DataRecordIF image = event.getBeforeData();
        // The first time the UE is invoked for this table mapping, initialize the SQL statements that will be executed
        // If the event is BEFORE-DELETE, update the row
        performUpdate(image);
        return true;
    }

    public void finish() {
        // Perform any table-specific cleanup such as closing the update statement
    }
}


SoftDelete User Exit Explanation

Overview:

This soft delete user exit will mark a row as inactive instead of performing a physical delete. If the target row does not exist, a row will be inserted into the target table.

Assumptions:

• Delete operations have been disabled for this table.

• The &ENTTYP journal control field is used to determine the value of the "deleted" column

• Table has been configured for Adaptive Apply or Conflict Detection/Resolution (Source wins)

• Table does not possess LOB/LONG columns.

Instructions:

1. Copy the class files to the <cdc install directory>/lib directory

2. In Mapping Details->Operation, set the "On Delete" setting to "Do Not Delete".

(Cont’d…)


SoftDelete User Exit Explanation (cont’d)

3. In Mapping Details->User Exits:

• Set "User Exit Type" to "Java Class"

• Set "Class Name" to "ue.example.UESoftDelete" (omit quotes)

• In the "Events and Actions" section, tick the "Delete Before" checkbox

NOTE:

By default, when the user exit updates the target row, it will update all target columns. For an optimized update operation you can specify the columns you want updated in the "Parameter" field.

Example: Set "Parameter" to "ENTRY_TYPE,AUD_TIME,SRC_AUD_TIME" (omit quotes)

As in the previous example, CDC tracing is used in various methods in the UESoftDelete class, by means of the com.datamirror.ts.util.Trace class.

Trace records are written to the ICDC trace files located in <cdc_home>/instance/<instance>/log


User Exit Example: A Subscription-level User Exit Delivering Transactions to a Message Queue

• In some cases you may have an implementation that requires that committed transactions are delivered to a non-database target such as a message queue or web service. Row-level user exits provide the ability to enhance or replace CDC's apply processing and execute custom code based on insert, update, delete or even table level events. The subscription-level user exits take your user exits a step further in the ability to execute a process based on the unit of work that was read from the source database.

• Consider the example of a shopping cart on a web site. When the first item is added to the shopping cart, an order header row is created in the underlying database. As the customer adds items to the shopping cart, rows are inserted into the order detail table and once the checkout process is started, your business application commits the transaction into the database and provides a unit of work.

• Assume that you want to place the completed transaction as a single message on a message queue (or enterprise service bus). This would pose a challenge: CDC's row-level user exits would allow you to pick up the individual database operations and execute a user exit for each operation, but you would not know when the last item had been placed in the shopping cart to complete the transaction.

• A CDC subscription-level user exit point provides a method that gets invoked when the transaction is committed by the subscription. You can implement a method to start an action based on a transaction that was prepared in the row-level user exit points. With reference to the shopping cart example your row-level user exit points would start building the message to be sent (in XML or other format) and then when the commit is executed by CDC, the subscription-level user exit would take the message that was built, append any closing tags in the case of XML and place it onto a message queue.


Subscription-Level Example Continued…

• In the CDC configuration, the Java user exit class is specified both at the subscription level and the table level. Both the subscription-level and table-level interfaces are implemented by the user exit, SubscriptionUserExitIF and UserExitIF respectively. For the functionality intended by the user exit, it must be registered at both levels. If you did not configure the subscription-level user exit, the table-level entry points would not have all information available to properly write the XML records.

• When the subscription is started, the target side instantiates a CDCTransactionFileWriter object and immediately invokes the init(SubscriptionEventPublisherIF) method (subscription-level) whose primary task is to create a subscription context object to be shared with the table-level objects. Then, for each mapped table that has the CDCTransactionFileWriter specified as the user exit, a table-level object is instantiated and the init(ReplicationEventPublisherIF) method is invoked. This method registers the after-insert, after-update and after-delete exit points for the table in question.

• Every insert, update and delete operation on the table in question will cause the processReplicationEvent() method to be invoked. This method first checks whether there already is an open file for this subscription. That is done by checking the subscriptionContext.printStream object. If there is no open file, a new file is created on the fly and opened. Subsequently, the insert, update, or delete operation for the table in question is written as an XML format structure. When a commit is received from the source side, the processSubscriptionEvent() method is called which closes the transaction tag and then closes the file.
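The interplay between the row-level calls and the commit call can be sketched without the CDC API. The following is a simplified, hypothetical illustration and not the CDCTransactionFileWriter sample itself: the class TransactionMessageSketch, the methods onRowEvent() and onCommit(), the XML element names and the table names are all assumptions made for this example, and an in-memory list stands in for the output file or message queue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the buffering logic; it does not use the CDC API. */
class TransactionMessageSketch {

    private StringBuilder current; // null while no transaction is being built
    private final List<String> delivered = new ArrayList<String>(); // stands in for a file or queue

    /** Plays the role of the table-level processReplicationEvent() call. */
    void onRowEvent(String table, String operation, String key) {
        if (current == null) {
            current = new StringBuilder("<transaction>"); // first operation opens the message
        }
        current.append("<operation type=\"").append(escape(operation))
               .append("\" table=\"").append(escape(table))
               .append("\" key=\"").append(escape(key)).append("\"/>");
    }

    /** Plays the role of the subscription-level processSubscriptionEvent() call at commit. */
    void onCommit() {
        if (current == null) {
            return; // nothing was buffered for this commit
        }
        current.append("</transaction>"); // close the transaction tag
        delivered.add(current.toString());
        current = null; // the next row event starts a new message
    }

    List<String> delivered() {
        return delivered;
    }

    /** Escapes the characters that are special inside an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        TransactionMessageSketch sketch = new TransactionMessageSketch();
        sketch.onRowEvent("ORDER_HEADER", "insert", "1001");
        sketch.onRowEvent("ORDER_DETAIL", "insert", "1001-1");
        sketch.onCommit();
        System.out.println(sketch.delivered().get(0));
    }
}
```

The key design point is that nothing is delivered until the commit arrives, so the consumer only ever sees complete units of work. In a real user exit the shared state would live in the subscription context described above rather than in a single object.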


Subscription-Level Example Code - CDCTransactionFileWriter

import java.io.*;
import java.text.SimpleDateFormat;
import java.util.*;

import com.datamirror.ts.target.publication.userexit.*;

public class CDCTransactionFileWriter implements UserExitIF, SubscriptionUserExitIF {

    /* init() is called once when the subscription is started and initializes the subscription context. Also, it
       ensures that the processSubscriptionEvent method is invoked before every commit. */
    public void init(SubscriptionEventPublisherIF publisher) throws UserExitException {
        …
    }

    public boolean processSubscriptionEvent(SubscriptionEventIF subscriptionEvent) throws UserExitException {
        …
    }

    /* Table-level initialization is called once for every mapped table at subscription startup. It first
       retrieves the subscription context and then registers the events it wants to listen to. */
    public void init(ReplicationEventPublisherIF eventPublisher) {
        …
    }

(cont’d…)


Subscription-Level Example Code (cont’d)

    /* Executed when a table-level event is detected (insert/update/delete). This method writes an XML entry for the
       table-level operation to the currently open output file. */
    public boolean processReplicationEvent(ReplicationEventIF replicationEvent) throws UserExitException {
        …
    }

}


User Exit Example: Custom Flat File Formatter

When using CDC to deliver flat files for consumption by external applications, the target engine is often CDC for DataStage. This provides the ability to generate flat files and has additional functionality to automatically close and make the flat files available based on time or number of rows. Flat files which are generated by ICDC for DataStage have the following:

• Journal control information written as the first few columns on every line

• Characters written in UTF-8 encoding

• Columns which are separated by a comma

• Columns which are delimited by a double-quote

This fixed output format may sometimes not be suitable for the targeted applications.

With a User Exit it is possible to tailor the standard CDC for DataStage flat file output to use a different column delimiter and column separator, and also to create the file using an encoding other than the default UTF-8.

Custom Flat File Formatter Description

The CDCDataStageFormat class provides an example of how the flat file output can be customized to your needs. Three main methods in the interface must be implemented to format the data: formatDataImage(), formatNullImage() and formatJournalControlFields(). The other interface methods must be implemented too, but you can choose whether to provide any code for them. When data is formatted, the formatJournalControlFields() method is called first. It provides the means to do initialization processing at the table level and formats the journal control columns. To format the data image, the formatDataImage() method is passed an argument which holds the data record object. When an update operation is processed, this method is invoked twice, once for the before image and once for the after image. The formatNullImage() method is called for insert and delete operations when the before and after images are written to a single record; it can simply return as many column separators as there are columns.

The example also fits the case in which the column delimiter is a sequence of characters rather than a single character, which is common when binary objects are to be replicated. Just change the DELIMITER to the sequence of characters that defines the column delimiter, for example "?~#".
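The formatting idea described above can be illustrated without the CDC API. The following is a minimal, standalone sketch, not the CDCDataStageFormat sample itself: the class name, the SEPARATOR and ENCODING constants and both helper methods are hypothetical and exist only for illustration. It assumes each column value arrives as a String, wraps it in a multi-character delimiter, joins the columns with a custom separator, and encodes the line in a non-default character set.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Standalone illustration only; this class is not part of the CDC user exit API.
class FlatFileLineSketch {
    // Assumed values: a multi-character column delimiter and a pipe separator.
    static final String DELIMITER = "?~#";
    static final String SEPARATOR = "|";
    // Assumed target encoding, chosen only to differ from the UTF-8 default.
    static final Charset ENCODING = StandardCharsets.ISO_8859_1;

    // Wraps every non-null value in DELIMITER, joins the columns with
    // SEPARATOR and encodes the result. A null value becomes an empty field.
    static ByteBuffer formatLine(List<String> values) {
        String line = values.stream()
                .map(v -> v == null ? "" : DELIMITER + v + DELIMITER)
                .collect(Collectors.joining(SEPARATOR));
        return ByteBuffer.wrap(line.getBytes(ENCODING));
    }

    // Null image: returns one separator per column, as described above.
    static ByteBuffer formatNullLine(int columnCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columnCount; i++) {
            sb.append(SEPARATOR);
        }
        return ByteBuffer.wrap(sb.toString().getBytes(ENCODING));
    }
}
```

In a real formatter, logic of this kind would sit inside the interface methods named above, with the column values taken from the data record object rather than from a List.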

Custom Flat File Formatter - Code...

/*User Exit to format the data suitable for a target application's sequential file reader. */

public class CDCDataStageFormat implements DataStageDataFormatIF {

public CDCDataStageFormat() {

...

}

/* Returns a ByteBuffer containing the journal control field values that are of interest.

Note that CDC journal control fields are available in a User Exit. */

public ByteBuffer formatJournalControlFields(ReplicationEventIF event, int operationType)

throws DataTypeConversionException {

UserExitJournalHeader header = (UserExitJournalHeader) event.getJournalHeader();

String journalControlString = header.getDSOutputTimestampStr() + header.getCommitID() + opChar

+ header.getUserName();

}

...

}

Custom Flat File Formatter Usage

• To activate the custom data formatter, specify its class in the Flat File properties of the table mappings. The class does not accept any parameters.

User Exit Example: Conflict Resolution User Exit

• A subscription may have Detect Conflict enabled, in which case CDC will detect conflicts, for example during bi-directional replication (two datastores serving as both source and target to each other). A conflict would arise, for instance, if both datastores have inserted a row with the same key.

• The CDC Management Console (MC) documentation contains a complete description of the types of conflicts ICDC will detect.

• Out of the box, CDC supports conflict resolution methods of source wins, target wins, largest value wins and smallest value wins.

• A user exit may be used to provide some customized behavior, for example to compare a timestamp column and select the most recent change as the winner.
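The "most recent change wins" rule from the last bullet can be sketched as a small standalone helper. This is only an illustration under stated assumptions: the class, the enum and the method are hypothetical, the timestamps are assumed to have already been read from the source and target images as java.time.Instant values, and the tie-breaking and null-handling rules (source wins) are choices made for this sketch, not CDC behavior.

```java
import java.time.Instant;

// Standalone illustration only; this class is not part of the CDC user exit API.
class LatestChangeWins {
    enum Winner { SOURCE, TARGET }

    // Returns the side whose timestamp column holds the more recent value.
    // Assumption for this sketch: a missing target timestamp or a tie
    // resolves in favor of the source.
    static Winner resolve(Instant sourceUpdated, Instant targetUpdated) {
        if (targetUpdated == null) {
            return Winner.SOURCE;
        }
        if (sourceUpdated == null) {
            return Winner.TARGET;
        }
        return targetUpdated.isAfter(sourceUpdated) ? Winner.TARGET : Winner.SOURCE;
    }
}
```

A conflict resolution user exit could call a helper like this from its resolveConflict method and then populate the desired image from whichever side wins.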

Conflict Resolution Example - Code

package ue.example;

…

/*

* This sample Conflict Resolution class is fired on detection of a conflict between a source and a target.

* The work is done in the resolveConflict method, which in this sample always uses a source wins strategy and simply writes the details of the conflict to a text file.

* This sample assumes that the source and target table columns are all numeric or character types.

*/

public class CRUserExitSample implements ConflictResolutionUserExit {

public CRUserExitSample() {

try {

writer = new BufferedWriter( new FileWriter( new File( OUTPUTFILE), true ));

} …}

public boolean resolveConflict(

ConflictResolutionUserExitControl control,

DataRecordIF beforeImage,

DataRecordIF afterImage,

DataRecordIF targetImage,

DataRecordIF desiredImage) throws ConflictResolutionUserExitException {

…}

}

User Exit Example: Notification User Exit

• CDC Notifications out of the box are designed to use email as the notification mechanism, and in most cases this should be adequate.

• Notifications may also be used to execute a user exit. One use case is an external monitoring solution that watches log files as they are appended to in order to determine which messages to act upon.

• The example notification user exit will write events that have been selected for user exit handling to the <cdc_home>/log/cdc_notifications.log file.

• When a notification has been configured for a certain category in the CDC process and an event in that category is detected, CDC will invoke the handle() method.

Notification User Exit Code Example

package ue.example;

import java.io.*;

import java.util.*;

import java.text.SimpleDateFormat;

import com.datamirror.ts.api.*;

public class NotificationToFile implements AlertHandlerIF {

…

/**

* Constructor, will be called when the object is instantiated. You could

* include activity such as creating the log file.

*/

public NotificationToFile() {

}

Notification User Exit Code Example (cont’d)

/**

* This method is invoked for every event that is defined to be handled by a USER HANDLER at the datastore or

* subscription level.

* When the method is called, it opens the cdc_notifications.log file in the <cdc_home>/log directory and

* writes the event in a format that is equivalent to the output of dmshowevents. The log file is continuously

* appended to and can be monitored by an external monitoring solution.

* @param zone - Zone of the event (not used anymore with CDC 6.5)

* @param category - Category of the event (information, error, ...)

* @param sourceOrTarget - Did the event happen on the source or the target

* @param subscriptionName - Subscription that generated the event

* @param eventID - Numeric representation of the event

* @param eventText - Message issued by CDC engine

* @param otherInfo - Other properties information (not used)

*/

public void handle( int zone, int category, String sourceOrTarget, String subscriptionName,

int eventID, String eventText, Properties otherInfo) throws Exception {

//writes the notification details to a file

}

}
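The body of handle() is elided in the sample above. As a rough idea of what "writes the notification details to a file" could involve, here is a minimal standalone sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, and the pipe-separated line layout is a placeholder rather than the actual dmshowevents format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.text.SimpleDateFormat;
import java.util.Date;

// Standalone illustration only; this class is not part of the CDC API.
class NotificationLogSketch {
    // Builds one log line from the event details. The layout is a
    // placeholder, not the dmshowevents output format.
    static String formatLine(Date when, int category, String sourceOrTarget,
                             String subscriptionName, int eventID, String eventText) {
        String timestamp = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(when);
        return timestamp + "|" + category + "|" + sourceOrTarget + "|"
                + subscriptionName + "|" + eventID + "|" + eventText;
    }

    // Appends the line to the log file, creating the file if it does not exist,
    // so that an external monitor can follow the file as it grows.
    static void append(Path logFile, String line) throws IOException {
        Files.write(logFile,
                (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

A handle() implementation could build the line from its own arguments and pass it to append() together with the path of the log file.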

Notification User Exit Configuration

• To implement the notification handling, in Configuration go to the Datastore pane, right-click on the datastore and select Notifications. If you only want to handle certain events, select only the notifications which should generate entries into this file.

• In the following screenshot, you can see where the Notification user exit is set.

Adding Traces for Java User Exits

• A Java user exit can piggy-back onto CDC tracing, so output is written to the CDC trace files. This can be useful as a means of debugging.

package ue.examples;

import com.datamirror.ts.util.Trace;

/* Tracing facility for user exit */

public class UETrace {

boolean enabled = false;

public void init( boolean p_enabled) {

this.enabled = p_enabled;

}

public void write(String message) {

if (enabled) {

Trace.traceAlways(message);

}

return;

}

public void close() {

}

}

Adding Traces to User Exit Code

To add the UETrace class to a User Exit, you would use something such as the following:

public class MyUserExit {

protected UETrace trace;

//initialize UETrace

public void init(ReplicationEventPublisherIF publisher) throws UserExitException {

// Tracing is always switched on here; it can be disabled or parameterized

trace = new UETrace();

trace.init(true);

// add trace points in your code

trace.write(this.getClass().getName() + ": some useful output");

}

}

Debugging User Exits with Eclipse

The JVM's built-in debug agent (JDWP) allows a user exit to be debugged remotely from Eclipse in a few simple steps.

• Step 1 - Enable remote debugging in the JVM by editing <CDC Install dir>/conf/dmts64.vmargs (or dmts32.vmargs if your CDC engine is 32-bit) to add the -Xdebug and -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8765 options (address is the port the debugger will listen on). The file content would look like this once those are added:

-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8765 -Dcom.datamirror.ts.instance=%TSINSTANCE% com.datamirror.ts.commandlinetools.script.Startup

Start the instance running. You should see output like the following indicating that CDC is running and the debugger is listening for connections:

dmts64 -I CDCTEST

Listening for transport dt_socket at address: 8765

IBM InfoSphere Change Data Capture is running.

• Step 2 - In Eclipse, click Run -> Debug Configurations. At the bottom of the configurations list, click Remote Java Application, and click the New Launch Configuration button. Type in or browse to the project name for your user exit code. For connection type, use Standard (Socket Attach) and give the host of the CDC engine which will execute the user exit, and the port for remote debugging as in step 1.

• Step 3 - Set a break point in your code in Eclipse and run the configuration in debug mode.

• Execution of your code will stop at the first breakpoint and you can step through executions of your code in Eclipse.

User Exit Example: Stored Procedure Example

In this example a row is added to the CDCDEMO.RISKY_CUSTOMER table only if the customer's credit limit is 10,000 or higher. The inserted row logs the customer number, the credit limit and the timestamp of the update at the source.

Here is the table and stored procedure DDL:

CREATE TABLE CDCDEMO.RISKY_CUSTOMER (

CUSTNO DECIMAL(6,0), CRLIMIT DECIMAL(7,0), UPD_TIMESTAMP TIMESTAMP)

IN USERSPACE1;

CREATE OR REPLACE PROCEDURE CDCDEMO.LOG_RISKY_CUSTOMER (

OUT result INT,

OUT returnMsg CHAR,

IN a$CUSTNO DECIMAL(6,0),

IN a$CRLIMIT DECIMAL(7,0),

IN j$TIMSTAMP VARCHAR(26)

)

BEGIN

if a$CRLIMIT>=10000 then

insert into CDCDEMO.RISKY_CUSTOMER values(a$CUSTNO, a$CRLIMIT, TIMESTAMP(j$TIMSTAMP));

end if;

set result=0;

set returnMsg='Row inserted';

END@

When running the subscription and changing a number of source rows which have or attain a credit limit of 10,000 or higher, the stored procedure starts populating the RISKY_CUSTOMER table.

Row-Level User Exit Stored Procedure Example

• Stored procedure user exit for row-level operations

• Stored procedure row-level user exits are invoked just before or after ICDC applies the operation to the target table. If you specify a stored procedure user exit and disable the operation that it is attached to, the user exit will still be invoked and can therefore be executed instead of the operation being applied by CDC.

Stored Procedure Example Configuration

In Conclusion

• CDC User Exits can provide a solution for various scenarios where there is a critical need for functionality beyond what is available in the product, or to supplement existing functionality.

• This comes at a price, in the form of user resources to develop and maintain the custom code, and this should be considered carefully before undertaking to add a user exit.

• You should note that a user exit may also have an effect upon the performance characteristics of CDC, for which the CDC product team cannot be responsible because it is a customization of the product.

• IBM Services may be engaged to develop custom user exit solutions for CDC and may be retained on an on-going basis if required resources are not available to you in-house.