
IBM InfoSphere Change Data Capture Version 6.5.2

InfoSphere Change Data Capture, Version 6.5.2
Planning and Deployment Guide

IBM Confidential


Note: Before using this information and the product it supports, read the information in “Notices” on page 107.

First edition, second revision

This edition applies to version 6, release 5, modification 2 of IBM InfoSphere Change Data Capture (product number 5724-U70), to version 6, release 5 of IBM InfoSphere Change Data Capture for z/OS (product number 5655-U96), version 10, release 1 of IBM InfoSphere Classic Change Data Capture for z/OS (product number 5655-W29), version 10, release 1, modification 2 of IBM InfoSphere Data Replication for Netezza (product number 5725-E30), and to all subsequent releases and modifications until otherwise indicated in new editions.

© Copyright IBM Corporation 2011.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.


Contents

About this guide . . . 1

Overview of InfoSphere CDC and InfoSphere CDC Management Console . . . 3

Determining your replication needs . . . 5

Assessing requirements . . . 7
  Database assessment questions . . . 8
  Server assessment questions . . . 11
  User accounts assessment questions . . . 12
  TCP/IP network assessment questions . . . 12

Database requirements and supported features . . . 15
  Supported databases and target applications . . . 16
  Supported data types . . . 19
  Supported table mappings . . . 22
  Calculating database connections required by InfoSphere CDC . . . 31
  Database logs . . . 33
    DB2 for Linux, UNIX, and Windows (LUW) – online and archive logs . . . 33
    DB2 for i - journal and journal receiver authorities . . . 33
    Informix Dynamic Server - logical logs . . . 34
    Oracle - online redo logs and archived redo logs . . . 34
    Microsoft SQL Server - online transaction logs and transaction log backups . . . 34
    Sybase - online and archive logs . . . 35
    z/OS - archive logs . . . 35
  Database clustering . . . 36
    Oracle Real Application Clusters (RAC) . . . 36
    Configuring InfoSphere CDC in a RAC environment . . . 38
    Microsoft SQL Server clustering . . . 39
    DB2 for z/OS data sharing groups . . . 39
  Operating system clustering . . . 40
  Database connection resiliency . . . 40
  Replicating multibyte (MBCS) and double-byte (DBCS) character data . . . 40
    Common encoding conversion scenarios . . . 41
    Considerations when replicating MBCS character data . . . 43
  Continuous Capture . . . 44
  Replicating XA transactions . . . 44
  DB2 for LUW . . . 45
    Enabling database log retention . . . 45
    Creating a database backup . . . 45
    Remote log reading . . . 45
    Remote target apply . . . 45
    Table-level considerations . . . 46
    Replicating data in a Database Partitioning Feature (DPF) environment . . . 47
    Configuring InfoSphere CDC for a DB2 High Availability Disaster Recovery (HADR) environment . . . 47
  DB2 for z/OS . . . 48
    Extended CSA storage (ECSA) . . . 48
    Estimating above the bar storage requirements . . . 48
    DB2 batch connections and allied threads . . . 51
    Log cache . . . 52
    DB2 log buffers . . . 52
    Code page conversion services . . . 52
    Schema evolution . . . 53
    Security Access Facility (SAF) and DB2 authorization . . . 56
  InfoSphere CDC for InfoSphere DataStage . . . 57
    Considerations for InfoSphere CDC for InfoSphere DataStage . . . 57
  Informix Dynamic Server . . . 58
    InfoSphere CDC API . . . 58
  Microsoft SQL Server . . . 58
    Transaction log backup plan . . . 58
    Remote target apply . . . 59
    SQL Server table-level considerations . . . 59
    ROWVERSION data type . . . 61
    TCP/IP and ports . . . 62
    Database services . . . 62
    SQL Server replication . . . 62
    Recovery model . . . 62
    Database backup . . . 63
  Netezza . . . 63
    Installation considerations . . . 63
    Netezza JDBC drivers . . . 63
  Oracle . . . 63
    Supplemental logging . . . 64
    ARCHIVELOG mode . . . 64
    Log parallelism . . . 65
    Log shipping . . . 65
    Log space for latency . . . 65
    Automatic Storage Management (ASM) . . . 66
    Remote log reading . . . 66
    Remote target apply . . . 66
    Tablespace for InfoSphere CDC metadata . . . 67
    Undo tablespace for transaction rollbacks . . . 67
    Read-only database connections . . . 67
    Read-only tables . . . 67
    Oracle table-level considerations . . . 67
    Disk quota for capture components . . . 68
    Bulk load refresh . . . 69
    Oracle listener . . . 69
    Database constraints . . . 69
  Oracle - Trigger . . . 69
    Tablespace for InfoSphere CDC metadata . . . 70
    Undo tablespace for transaction rollbacks . . . 70
    Bulk load refresh . . . 70
    Multiple instances and sharing the same journal table . . . 70
    Oracle listener . . . 71


    Database constraints . . . 71
  Sybase . . . 71
    Setting the LANG environment variable (UNIX) . . . 71
    Database and backup restrictions . . . 71
    Refresh performance considerations . . . 72
    Enabling the creation of a partition table . . . 73
  Teradata . . . 73
    Driver and utilities requirements . . . 73
    Directories for Teradata FastLoad files . . . 73

Server requirements . . . 75
  Supported operating systems and processors . . . 75
  CPU resource requirements . . . 83
  RAM requirements . . . 84
  Disk requirements . . . 85
    Disk space . . . 86
    Disk speed . . . 87
    Disk type . . . 87
  InfoSphere CDC metadata resiliency . . . 87

User account requirements . . . 89
  User account access requirements . . . 89

TCP/IP network requirements and supported features . . . 97
  InfoSphere CDC replication engine network requirements . . . 97
  InfoSphere CDC administration network requirements . . . 98
  Network connection resiliency . . . 100
  Data encryption considerations . . . 101

What to do next . . . 103

Troubleshooting and contacting IBM Support . . . 105

Notices . . . 107
  Trademarks . . . 109


About this guide

This guide is intended to help you understand the considerations you should take into account before installing and integrating InfoSphere® CDC into your production environment.

You will need to understand the following before deploying:
v The components of InfoSphere CDC and how the product works
v How to assess your replication needs
v How to assess your current environment
v InfoSphere CDC's database requirements and supported features
v InfoSphere CDC's server requirements
v InfoSphere CDC's user requirements
v InfoSphere CDC's network requirements


Overview of InfoSphere CDC and InfoSphere CDC Management Console

IBM® InfoSphere Change Data Capture (InfoSphere CDC) is a replication solution that captures database changes as they happen and delivers them to target databases, message queues, or an ETL solution such as InfoSphere DataStage® based on table mappings configured in the InfoSphere CDC Management Console GUI application.

InfoSphere CDC provides low impact capture and fast delivery of data changes for key information management initiatives including dynamic data warehousing, master data management, application consolidations or migrations, operational BI, and enabling SOA projects. InfoSphere CDC also helps reduce processing overheads and network traffic by only sending the data that has changed. Replication can be carried out continuously or periodically. When data is transferred from a source server, it can be remapped or transformed in the target environment.

The following diagram illustrates the key components of InfoSphere CDC.

The key components of the InfoSphere CDC architecture are described below:

v Access Server: Controls all of the non-command line access to the replication environment. When you log in to Management Console, you are connecting to Access Server. Access Server can be closed on the client workstation without affecting active data replication activities between source and target servers.

v Admin API: Operates as an optional Java-based programming interface that you can use to script operational configurations or interactions.

v Apply agent: Acts as the agent on the target that processes changes as sent by the source.

v Command line interface: Allows you to administer datastores and user accounts, as well as to perform administration scripting, independent of Management Console.

v Communication Layer (TCP/IP): Acts as the dedicated network connection between the Source and the Target.

v Source and Target Datastore: Represents the data files and InfoSphere CDC instances required for data replication. Each datastore represents a database to which you want to connect and acts as a container for your tables. Tables made available for replication are contained in a datastore.

v Management Console: Allows you to configure, monitor and manage replication on various servers, specify replication parameters, and initiate refresh and mirroring operations from a client workstation. Management Console also allows you to monitor replication operations, latency, event messages, and other statistics supported by the source or target datastore. The monitor in Management Console is intended for time-critical working environments that require continuous analysis of data movement. After you have set up replication, Management Console can be closed on the client workstation without affecting active data replication activities between source and target servers.

v Metadata: Represents the information about the relevant tables, mappings, subscriptions, notifications, events, and other particulars of a data replication instance that you set up.

v Mirror: Performs the replication of changes to the target table or accumulation of source table changes used to replicate changes to the target table at a later time. If you have implemented bidirectional replication in your environment, mirroring can occur to and from both the source and target tables.

v Refresh: Performs the initial synchronization of the tables from the source database to the target. This is read by the Refresh reader.

v Replication Engine: Serves to send and receive data. The process that sends replicated data is the Source Capture Engine and the process that receives replicated data is the Target Engine. An InfoSphere CDC instance can operate as a source capture engine and a target engine simultaneously.

v Single Scrape: Acts as a source-only log reader and a log parser component. It checks and analyzes the source database logs for all of the subscriptions on the selected datastore.

v Source transformation engine: Processes row filtering, critical columns, column filtering, encoding conversions, and other data to propagate to the target datastore engine.

v Source database logs: Maintained by the source database for its own recovery purposes. The InfoSphere CDC log reader inspects these in the mirroring process, but filters out the tables that are not in scope for replication.

v Target transformation engine: Processes data and value translations, encoding conversions, user exits, conflict detections, and other data on the target datastore engine.
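The log reader behavior described in the component list (inspect the source database logs, but pass on only changes for tables that are in scope) can be pictured with a small Python sketch. This is an illustration only, not InfoSphere CDC code: the `LogEntry` class, the `in_scope_changes` function, and the table names are all hypothetical, and real database log records are far more complex.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Set


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical, simplified stand-in for one source database log record."""
    table: str
    operation: str  # "INSERT", "UPDATE", or "DELETE"
    row: dict


def in_scope_changes(entries: Iterable[LogEntry], in_scope: Set[str]) -> Iterator[LogEntry]:
    """Yield only the entries whose table is mapped for replication."""
    for entry in entries:
        if entry.table in in_scope:
            yield entry


# Hypothetical log contents: two in-scope changes and one out-of-scope change.
log = [
    LogEntry("SALES.ORDERS", "INSERT", {"id": 1}),
    LogEntry("HR.PAYROLL", "UPDATE", {"id": 7}),
    LogEntry("SALES.ORDERS", "DELETE", {"id": 1}),
]
changes = list(in_scope_changes(log, {"SALES.ORDERS"}))
print([c.operation for c in changes])
```

Only the changes for the mapped table reach the later stages, which is also why sending only changed, in-scope data helps reduce network traffic.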

There are two types of target-only destinations for replication that are not databases:

v JMS Messages: Acts as a JMS message destination (queue or topic) for row-level operations that are created as XML documents.

v InfoSphere DataStage: Processes changes delivered from InfoSphere CDC that can be used by InfoSphere DataStage jobs.

For more information on how to install Management Console and Access Server, see Access Server and Management Console - Installation Guide. For information on how to install your source and target replication engines, see the end-user documentation for your replication engine platform.



Determining your replication needs

The first step in planning an InfoSphere CDC deployment is to establish your replication needs. You need to identify each source database for replication and its corresponding target. The InfoSphere CDC source captures changed data in your source database and sends source table changes to the target.

The following table lists all of the databases, middleware and messaging middleware providers that are supported for replication by InfoSphere CDC:

Supported source databases                    Supported target databases and middleware applications

IBM DB2® for Linux, UNIX and Windows (LUW)    IBM DB2 for Linux, UNIX and Windows (LUW)
IBM DB2 for i                                 IBM DB2 for i
IBM DB2 for z/OS®                             IBM DB2 for z/OS
IBM Informix® Dynamic Server                  IBM Informix Dynamic Server
IMS™                                          IBM InfoSphere CDC Event Server
Microsoft SQL Server                          IBM InfoSphere DataStage
Oracle                                        Microsoft SQL Server
Sybase                                        Netezza®
                                              Oracle
                                              Sybase
                                              Teradata
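When planning several source and target pairs, it can help to check each candidate pair against the two lists in the table above. The following Python sketch is one way to do that; the names `SUPPORTED_SOURCES`, `SUPPORTED_TARGETS`, and `check_pair` are hypothetical. It only confirms that each side appears in its list and does not claim that every combination of a listed source and a listed target is supported.

```python
from typing import List

# Transcribed from the table of supported sources and targets.
SUPPORTED_SOURCES = {
    "IBM DB2 for Linux, UNIX and Windows (LUW)",
    "IBM DB2 for i",
    "IBM DB2 for z/OS",
    "IBM Informix Dynamic Server",
    "IMS",
    "Microsoft SQL Server",
    "Oracle",
    "Sybase",
}

SUPPORTED_TARGETS = {
    "IBM DB2 for Linux, UNIX and Windows (LUW)",
    "IBM DB2 for i",
    "IBM DB2 for z/OS",
    "IBM Informix Dynamic Server",
    "IBM InfoSphere CDC Event Server",
    "IBM InfoSphere DataStage",
    "Microsoft SQL Server",
    "Netezza",
    "Oracle",
    "Sybase",
    "Teradata",
}


def check_pair(source: str, target: str) -> List[str]:
    """Return a list of problems; an empty list means both sides are listed."""
    problems = []
    if source not in SUPPORTED_SOURCES:
        problems.append(f"{source} is not listed as a supported source")
    if target not in SUPPORTED_TARGETS:
        problems.append(f"{target} is not listed as a supported target")
    return problems


print(check_pair("Oracle", "Teradata"))
print(check_pair("Teradata", "Oracle"))
```

The second call reports a problem because Teradata appears only in the target column.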

Each database that will act as a source for replication will need to have an instance of the InfoSphere CDC replication engine installed, as will each target for replication.

Both the Management Console and Access Server applications must also be installed in order to configure and monitor replication between the source database and the target of replication.

If you are replicating from a non-relational database using InfoSphere Classic Change Data Capture for z/OS, you will also need to install Classic Data Architect.

Related concepts

“Supported databases and target applications” on page 16



Assessing requirements

There are several areas in your enterprise that you should review in order to determine if you have the necessary resources for deploying InfoSphere CDC. These include:
v Software and database-specific requirements
v Operating system, hardware, disk and memory requirements
v User account permissions
v Communications availabilities

The following tables contain questions to help you focus on areas of potential contention. Review the questions and see the corresponding topics for more information about how a feature or option might impact your deployment.

In this section you will learn:
“Database assessment questions” on page 8
“Server assessment questions” on page 11
“User accounts assessment questions” on page 12
“TCP/IP network assessment questions” on page 12



Database assessment questions

Replication Engine | Issue | For more information, see:

General

What are the versions of your source and target databases?
v “Supported databases and target applications” on page 16

What data types do you plan to replicate?
v “Supported data types” on page 19
v “Supported table mappings” on page 22

Will your database log retention policy have to be altered in order to retain the database logs that InfoSphere CDC requires for replication?
v “DB2 for Linux, UNIX, and Windows (LUW) – online and archive logs” on page 33
v “Informix Dynamic Server - logical logs” on page 34
v “Microsoft SQL Server - online transaction logs and transaction log backups” on page 34
v “Oracle - online redo logs and archived redo logs” on page 34
v “Sybase - online and archive logs” on page 35

Is your database part of a clustered environment or a DB2 for z/OS data sharing group?
v “Database clustering” on page 36
v “Operating system clustering” on page 40

Do you have session limits on database connections?
v “Database connection resiliency” on page 40

Are you replicating multibyte character data (MBCS) or double byte character data (DBCS)?
v “Replicating multibyte (MBCS) and double-byte (DBCS) character data” on page 40

Do you want InfoSphere CDC to continue processing the source database log during periods when the target is unavailable due to maintenance, network outages, or other reasons?
v “Continuous Capture” on page 44

InfoSphere CDC for DB2 for LUW

Are you installing InfoSphere CDC on a different server from your source or target database?
v “Remote log reading” on page 45
v “Remote target apply” on page 45

Are you installing InfoSphere CDC in a DB2 HADR environment?
v “Configuring InfoSphere CDC for a DB2 High Availability Disaster Recovery (HADR) environment” on page 47

Have you enabled log retention for each DB2 LUW database being used for replication?
v “Enabling database log retention” on page 45

Have you performed a database backup and scheduled regular backups for each DB2 LUW database being used for replication?
v “Creating a database backup” on page 45

Are you deploying InfoSphere CDC in a DPF environment?
v “Replicating data in a Database Partitioning Feature (DPF) environment” on page 47

Are you targeting a DB2 pureScale environment?
v “Replicating data in a DB2 pureScale environment” on page 47




InfoSphere CDC for z/OS

Is your ECSA storage size sufficient?
v “Extended CSA storage (ECSA)” on page 48

Do you have enough storage 'above the bar'?
v “Estimating above the bar storage requirements” on page 48

What is the current setting in DB2 for the maximum number of batch connections and allied threads?
v “DB2 batch connections and allied threads” on page 51

How does InfoSphere CDC for z/OS handle code page conversion?
v “Code page conversion services” on page 52

How does InfoSphere CDC for z/OS handle Data Definition Language (DDL) changes performed on in-scope tables?
v “Schema evolution” on page 53

InfoSphere CDC for Informix

Has your Informix database been prepared to use the InfoSphere CDC API?
v “InfoSphere CDC API” on page 58

InfoSphere CDC for Microsoft SQL Server

Will InfoSphere CDC be installed on a different server than the target database?

Would you like to deploy InfoSphere CDC as part of a SQL Server clustering environment?
v “Microsoft SQL Server clustering” on page 39

Are your table indexes clustered?
v “Tables with clustered and non-clustered indexes” on page 59

Will you be replicating data to tables with computed columns?
v “Computed columns” on page 60

Will you be replicating data to tables with identity columns?
v “Identity columns” on page 60

Will you be replicating data to tables containing the rowversion data type?
v “ROWVERSION data type” on page 61

Will you be replicating columns with database defaults?
v “Columns with database defaults” on page 61

Do the tables in your database have primary keys?
v “Source tables and primary keys” on page 60

Are SQL Server TCP/IP connections enabled in the database?
v “TCP/IP and ports” on page 62

How are the SQL Server services configured?
v “Database services” on page 62

Is the server configured to be a Publisher that acts as its own Distributor? And does the Distribution database exist?
v “SQL Server replication” on page 62

How is the recovery model configured? FULL or BULK_LOGGED?
v “Recovery model” on page 62

What is your database backup plan?
v “Database backup” on page 63

Is there a transaction log backup plan? Do you use third-party tools to do this or SQL Server?
v “Transaction log backup plan” on page 58

InfoSphere CDC for Netezza databases

Are there any installation restrictions?
v “Installation considerations” on page 63

What JDBC drivers need to be installed?
v “Netezza JDBC drivers” on page 63



InfoSphere CDC for Oracle databases

Will your database log retention policy have to be altered?

Has enough tablespace been allocated?
v “Tablespace for InfoSphere CDC metadata” on page 67

Has enough undo tablespace been allocated?
v “Undo tablespace for transaction rollbacks” on page 67

Do you want to install InfoSphere CDC in a RAC environment?
v “Oracle Real Application Clusters (RAC)” on page 36

Will you be replicating index organized tables (IOT)?
v “Index Organized Tables (IOT)” on page 68

Are you using ASM storage for redo logs?
v “Automatic Storage Management (ASM)” on page 66

Do your business requirements limit access to your source database and only allow read-only users?
v “Read-only database connections” on page 67

Is log parallelism enabled?
v “Log parallelism” on page 65

Do you want to ship your database logs to a secondary system that is accessible to InfoSphere CDC?
v “Remote log reading” on page 66
v “Log shipping” on page 65

Are you installing InfoSphere CDC on a different server than the source or target database?

Do you intend to replicate compressed tables?
v “Compressed tables” on page 68

Are there encrypted tables in the database?
v “Encrypted tables” on page 68

Do you plan on creating multiple subscriptions to replicate your data?
v “Disk quota for capture components” on page 68

Do you plan on processing large database transactions with InfoSphere CDC?
v “Disk quota for capture components” on page 68

Is supplemental logging enabled?
v “Supplemental logging” on page 64

Are your online redo database logs archived and are they accessible to InfoSphere CDC?
v “ARCHIVELOG mode” on page 64

Will you be performing bulk load refreshes?
v “Bulk load refresh” on page 69

Is the Oracle listener running?
v “Oracle listener” on page 69

Are there any Oracle database constraints in place?
v “Database constraints” on page 69

InfoSphere CDC for Oracle databases (trigger)

Has enough tablespace been allocated?
v “Tablespace for InfoSphere CDC metadata” on page 70

Has enough undo tablespace been allocated?
v “Undo tablespace for transaction rollbacks” on page 70

Will you be performing bulk load refreshes?
v “Bulk load refresh” on page 70

Will multiple instances of InfoSphere CDC be sharing the same journal table?
v “Multiple instances and sharing the same journal table” on page 70

Is the Oracle listener running?
v “Oracle listener” on page 71

Are there any Oracle database constraints in place?
v “Database constraints” on page 71

InfoSphere CDC for Sybase databases

Will you be replicating multibyte character sets?
v “Setting the LANG environment variable (UNIX)” on page 71

Are you using InfoSphere CDC as a target of replication?
v “Refresh performance considerations” on page 72

Have you considered all of the database and backup restrictions when using InfoSphere CDC for Sybase databases?
v “Database and backup restrictions” on page 71

Will you be replicating range partition tables?
v “Enabling the creation of a partition table” on page 73

InfoSphere CDC for Teradata

What JDBC drivers and Teradata utilities are currently installed?
v “Driver and utilities requirements” on page 73

Which directory will be used for the FastLoad utility?
v “Directories for Teradata FastLoad files” on page 73

Related concepts

“Server assessment questions”
“User accounts assessment questions” on page 12
“TCP/IP network assessment questions” on page 12

Server assessment questions

Issue | For more information, see:

Do you meet the minimum requirements for operating systems and processors on your source and target systems?
v “Supported operating systems and processors” on page 75

Do you have enough CPU resources for InfoSphere CDC to replicate the data volume/workload of your database?
v “CPU resource requirements” on page 83

Do you have enough RAM on your source and target servers?
v “RAM requirements” on page 84

Do you have enough unallocated (and physically available) disk space?
v “Disk space” on page 86

Are there other applications running on the server where you are installing InfoSphere CDC and are they using memory and CPU resources?
v “RAM requirements” on page 84
v “CPU resource requirements” on page 83




Does the hard disk where you are installing InfoSphere CDC meet the minimum requirements?
v “Disk space” on page 86
v “Disk speed” on page 87
v “Disk type” on page 87

Do you plan to perform regular backups of InfoSphere CDC metadata?
v “InfoSphere CDC metadata resiliency” on page 87

Related concepts

“Database assessment questions” on page 8
“User accounts assessment questions”
“TCP/IP network assessment questions”

User accounts assessment questions


Do the necessary user accounts exist for the operating systems?
v “User account access requirements” on page 89

Do the necessary user accounts exist for the databases?
v “User account access requirements” on page 89

Do the user accounts have the necessary permissions?
v “User account access requirements” on page 89

Related concepts

“Database assessment questions” on page 8
“Server assessment questions” on page 11
“TCP/IP network assessment questions”

TCP/IP network assessment questions


How many ports are available on the servers?

Are there enough input and output ports available on the servers for the replication engine instance, Access Server and Management Console?
v “InfoSphere CDC replication engine network requirements” on page 97
v “InfoSphere CDC administration network requirements” on page 98

Are the ports accessible through both personal and network firewalls?
v “InfoSphere CDC replication engine network requirements” on page 97

Are they static or dynamic ports?
v “InfoSphere CDC replication engine network requirements” on page 97

Is TCP/IP enabled?
v “InfoSphere CDC replication engine network requirements” on page 97





Is there sufficient network bandwidth?
v “InfoSphere CDC replication engine network requirements” on page 97

Is there a risk of network communication interruptions or DB2 LUW deadlock or timeout errors in your deployment of InfoSphere CDC?
v “Network connection resiliency” on page 100

Do the security policies at your organization require the encryption of stored data and transmitted data?
v “Data encryption considerations” on page 101

Related concepts

“Database assessment questions” on page 8
“Server assessment questions” on page 11
“User accounts assessment questions” on page 12



Database requirements and supported features

InfoSphere CDC replication engines support a variety of databases. While some functionality is common across the different databases, each also has its own unique features.

The topics at the beginning of the section are common to all databases, while the features unique to individual databases are contained in subsections on a per-database basis.

In this section you will learn:
“Supported databases and target applications” on page 16
“Supported data types” on page 19
“Supported table mappings” on page 22
“Calculating database connections required by InfoSphere CDC” on page 31
“Database logs” on page 33
“Database clustering” on page 36
“Operating system clustering” on page 40
“Database connection resiliency” on page 40
“Replicating multibyte (MBCS) and double-byte (DBCS) character data” on page 40
“Continuous Capture” on page 44
“Replicating XA transactions” on page 44
“DB2 for LUW” on page 45
“DB2 for z/OS” on page 48
“InfoSphere CDC for InfoSphere DataStage” on page 57
“Informix Dynamic Server” on page 58
“Microsoft SQL Server” on page 58
“Netezza” on page 63
“Oracle” on page 63
“Oracle - Trigger” on page 69
“Sybase” on page 71
“Teradata” on page 73



Related concepts

“Server requirements” on page 75
“User account requirements” on page 89
“TCP/IP network requirements and supported features” on page 97

Supported databases and target applications

InfoSphere CDC replication engines support the following databases:

DB2 for Linux, UNIX, and Windows (LUW)

Database

Install DB2 LUW Client software and one of the following versions of DB2 LUW:

v IBM DB2 for Linux, UNIX and Windows, version 9.1

v IBM DB2 LUW, version 9.5. Requires InfoSphere CDC version 6.3, Fix Pack 2 or later.

v IBM DB2 LUW, version 9.7. To support this database, you must follow the bookmark update procedure outlined in this guide if you are not using InfoSphere CDC version 6.3 Fix Pack 2 or later.

v IBM DB2 pureScale®, version 9.8 or later

Note: If you are deploying InfoSphere CDC as a source in a DPF environment, the minimum supported version of DB2 for LUW is version 9.5. DB2 for LUW version 9.1 is not supported in source DPF environments.
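The version rules above, including the DPF restriction in the note, can be expressed as a short pre-installation check. The Python sketch below is illustrative only: the function and constant names are hypothetical, it compares just the major and minor version numbers, and it does not cover the InfoSphere CDC fix pack prerequisites mentioned for versions 9.5 and 9.7.

```python
from typing import Tuple

# DB2 LUW versions listed above (major, minor).
LISTED_VERSIONS = {(9, 1), (9, 5), (9, 7)}
# DB2 pureScale is listed as version 9.8 or later.
PURESCALE_MIN = (9, 8)
# Minimum DB2 LUW version when InfoSphere CDC is a source in a DPF environment.
DPF_SOURCE_MIN = (9, 5)


def parse_version(text: str) -> Tuple[int, int]:
    """Turn a string such as '9.7' or '9.7.0.4' into (major, minor)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def is_listed(version_text: str, purescale: bool = False, dpf_source: bool = False) -> bool:
    """Check a DB2 LUW version against the list and the DPF note.

    Assumption: the DPF rule is applied only to non-pureScale versions,
    because the note above does not mention pureScale.
    """
    version = parse_version(version_text)
    if purescale:
        return version >= PURESCALE_MIN
    if version not in LISTED_VERSIONS:
        return False
    if dpf_source and version < DPF_SOURCE_MIN:
        return False
    return True


print(is_listed("9.1"))
print(is_listed("9.1", dpf_source=True))
print(is_listed("9.8", purescale=True))
```

Version 9.1 passes the general check but fails once `dpf_source=True` is set, which mirrors the note.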

DB2 for i

Database

For InfoSphere CDC for DB2 for i version 6.2, Fix Pack 1:

v IBM i V5R3 or later

Note: For correct product operation, you may require Program Temporary Fixes (PTFs). Contact IBM for PTF information for your operating system.

DB2 for z/OS

Database

Install DB2 for z/OS Client software and one of the following versions of DB2:

v IBM DB2 for z/OS, version 8

v IBM DB2 for z/OS, version 9

IMS (source only deployments)



Supported IMS Databases

InfoSphere Classic CDC for z/OS requires one of the following versions of IBM IMS:

v IMS version 10.1

v IMS version 11.1

InfoSphere Classic CDC for z/OS supports most full-function IMS databases, including the following:

v Direct entry (DEDB)

v High availability large (HALDB)

v Hierarchical direct access method (HDAM)

v Hierarchical indexed direct access method (HIDAM)

v Hierarchical sequential access method (HSAM)

InfoSphere CDC for InfoSphere DataStage (target-only deployments)

Required applications

v IBM WebSphere® DataStage version 7.5 or later to use InfoSphere DataStage for InfoSphere CDC.

v IBM InfoSphere Information Server version 8.5 to use the full functionality of InfoSphere DataStage for InfoSphere CDC version 6.5 and the direct connect option.

InfoSphere CDC Event Server (target-only deployments)

Messaging middleware and other required software

All of the following software:

v Messaging solution provider that supports the JMS API version 1.1 as defined by J2EE 1.4

v JNDI, as outlined in the Java Message Service specification in order to connect to the desired queue or topic. For more information, see the JMS specification document at http://java.sun.com/products/jms/.

v An InfoSphere CDC source product so that database events can be scraped and sent to a queue or topic

If you are using IBM i, use Java Virtual Machine (JVM) 1.5, 32-bit

Informix Dynamic Server

Database

InfoSphere CDC supports the following versions of IBM Informix Dynamic Server (IDS):

v IBM Informix Dynamic Server version 11.50.xC3W1 or later

Microsoft SQL Server

Database

One of the following versions of Microsoft SQL Server:

v Microsoft SQL Server 2005: 32-bit or 64-bit. Standard or Enterprise editions, Service Pack 1 or later.

v Microsoft SQL Server 2008: 32-bit or 64-bit. Standard or Enterprise editions.

v Microsoft SQL Server 2008 R2: 32-bit or 64-bit. Standard or Enterprise editions.


Netezza databases (target-only deployments)

Database

Netezza database version 6.0.2

Oracle databases

Database

Install Oracle Client software and one of the following versions of the Oracle database:

v Oracle 9i (release 9.2.0.5 or later)

v Oracle 10g (release 10.1.0.4 or later)

v Oracle 11g (release 11gR1 and 11gR2)

If you are configuring InfoSphere CDC for an Oracle RAC environment, the following versions of Oracle are supported:


v Oracle 10g (release 10.2 or later)


Note: The following Oracle 11g features are not supported with InfoSphere CDC:

v LOBs in IOTs

v LOBs in partitioned IOTs

v IGNORE_ROW_ON_DUPKEY_INDEX hint

v DML with error logging table

v DBMS_PARALLEL_EXECUTE (for RAC and non-RAC environments)

v Oracle XA in non-RAC

v Flashback table (with focus on replicating DML following its recovery)

v Flashback table (with focus on DDL replication)

v ASM restricted mode

v Auto failover or failback for user SQL sessions

v Oracle RAC one node (that is, single-node RAC)

v Oracle XA in RAC

v Auto degree of parallelism (for RAC and non-RAC environments)

v Zero downtime patching

v OCFS for Solaris

v 4KB sector size drives

v Encrypted tablespaces

v Recovering database blocks

v Edition-based redefinition

v Flashback data archive for DDL (such as adding columns or truncating tables)

Oracle databases (trigger)



Database

Install Oracle Client software and one of the following versions of the Oracle database:


v Oracle 10g (release 10.1.0.4 or later)


Sybase databases

Database

Install Sybase Client software and one of the following versions of Sybase Adaptive Server Enterprise (ASE):

v Sybase ASE, version 12.5.3

v Sybase ASE, version 12.5.4

v Sybase ASE, version 15.0

Teradata (target-only deployments)

Database

Teradata version 12.0



Related concepts

“Database connection resiliency” on page 40

“Supported data types”

“Supported table mappings” on page 22

Supported data types

Each InfoSphere CDC replication engine supports a variety of data types, as indicated by the table below.

The following considerations and restrictions exist for the replication of data types:

v InfoSphere CDC casts data into and out of the database with VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX), TEXT, IMAGE, NTEXT, XML, and SQL_VARIANT data types. Due to this limitation imposed by the database, mirroring and refresh performance may be reduced when replicating these data types.

v In addition to the listed data types, InfoSphere CDC supports Alias Data Types (ADT) in all versions of Microsoft SQL Server.

v User-defined data types that are not aliases of standard database data types cannot be replicated by InfoSphere CDC. Tables where unsupported data types are present will not be included in the catalog and are not available for mapping.

v InfoSphere CDC does not provide support for columns of type BFILE or CFILE.

v If you are replicating the TIMEZONE data type and the source and target have a different TIMEZONE value, the data replicated will be adjusted and use the source TIMEZONE value.

v InfoSphere CDC provides support for the ROWID and XML data types, and this includes the use of SQL*Loader to load the data into the source table.

v The LOAD utility is not supported with LOB and XML columns.

v Support for CHAR and VARCHAR data types includes BIT, MIXED, and SBCS.

v InfoSphere CDC can replicate IMAGE, TEXT, NTEXT, VARBINARY(MAX), NVARCHAR(MAX), and VARCHAR(MAX) data types of unlimited length.

v Binary data in derived expressions is not supported.

v You cannot map a binary type target table column to a constant or journal control field.

v Binary data columns are not supported for value translations.

v InfoSphere CDC does not support the replication of data types INTERVAL, TIMESPAN, and TIME WITH TIMEZONE.

v You cannot map binary data types to a Netezza target table column.

Table 1. Supported data types

Data type | Supported by replication engine: DB2 LUW, DB2 for z/OS, DB2 for i, Informix Dynamic Server, Microsoft SQL Server, Netezza, Oracle Redo, Oracle Trigger, Sybase, Teradata

BIGINT U U U U U U U

BIGSERIAL U

BINARY U U U

BINARY_DOUBLE U U

BINARY_FLOAT U U

BIT U U

BLOB U U U U U U

BOOLEAN U U

BOOL U

BYTE U

BYTEINT U U

CHAR U U U U U U U U U

CHARACTER U U

CHARACTER FOR BIT DATA U U

CHARACTER VARYING U U

CLOB U U U U U U

DATE U U U U U U U U U

DATE DMY U

DATE EUR U

DATE ISO U

DATE JUL U

DATE MDY U

DATE USA U

DATE YMD U

DATETIME U U

DATETIME HOUR TO SECOND U

DATETIME YEAR TO FRACTION(5) U

DATETIME2 U

DATETIMEOFFSET U




DBCLOB U U U

DBCS EITHR U

DBCS GRAPHIC U

DBCS ONLY U

DBCS OPEN U

DECIMAL U U U U U U U

DOUBLE U U U U

DOUBLE PRECISION U

FLOAT U U U U U U U U U U

FLOAT(p) U

GRAPHIC U U U U

HEX (fixed length only) U

IMAGE U U U

INT8 U

INTEGER U U U U U U U U U U

INTERVAL U

INTERVAL DAY TO SECOND U U

INTERVAL DAY TO MONTH U U

LOBs U U U

LONG RAW U U

LONG VARCHAR U U U U U

LONG VARCHAR FOR BIT DATA U U

LONG VARGRAPHIC U U

LVARCHAR U

MONEY U U U

NCHAR U U U U U U

NCLOB U U

NTEXT U

NUMERIC U U U U U U U U

NUMERIC(p, s) U

NUMERIC(p) U

NVARCHAR U U U U

NVARCHAR2 U U

NVARCHAR(MAX) U

PACKED DECIMAL U

RAW U U

REAL U U U U U U U U

ROWID U U

ROWVERSION U

SERIAL U

SERIAL8 U

SMALLDATETIME U U

SMALLFLOAT U

SMALLINT U U U U U U U U

SMALLMONEY U U

SQL_VARIANT U

TEXT U U




TIME U U U U U U

TIMETZ U

TIMEZONE U U

TIME EUR U

TIME HMS U

TIME ISO U

TIME JIS U

TIME USA U

TIMESTAMP U U U U U U U U

TIMESTAMP WITH TIME ZONE U U

TIMESTAMP WITH LOCAL TIME ZONE U U

TINYINT U U

UNICHAR U

UNIQUEIDENTIFIER U

UNITEXT U

UNIVARCHAR U

VARBINARY U U U

VARBINARY(MAX) U

VARBYTE U

VARCHAR U U U U U U U

VARCHAR FOR BIT DATA U U

VARCHAR(MAX) U

VARCHAR2 U U

VARGRAPHIC U U U

XML U U U U

XMLTYPE U U

ZONED NUMERIC U

Related concepts

“Supported table mappings”

“Supported databases and target applications” on page 16

Supported table mappings

This section indicates the table mappings that you can create in Management Console with supported data types.

DB2 LUW

Source data types Supported table mappings

BIGINT Any numeric, binary, or BLOB data type

BLOB Any binary or BLOB data type

CHAR Any character, variable character, CLOB, binary, or BLOB data type

CHARACTER FOR BIT DATA Any binary or BLOB data type

CLOB Any character, variable character, CLOB, binary, or BLOB data type

DATE Any date data type

DBCLOB Any character, variable character, CLOB, DBCLOB, or BLOB data type

DECIMAL Any numeric data type

DOUBLE PRECISION Any numeric, binary, or BLOB data type

FLOAT Any numeric, binary, or BLOB data type

GRAPHIC Any character, variable character, CLOB, binary, or BLOB data type

INTEGER Any numeric, binary, or BLOB data type

LONG VARCHAR Any character, variable character, CLOB, binary, or BLOB data type

LONG VARCHAR FOR BIT DATA Any binary or BLOB data type

LONG VARGRAPHIC Any character, variable character, CLOB, binary, or BLOB data type

NUMERIC Any numeric, binary, or BLOB data type

REAL Any numeric, binary, or BLOB data type

SMALLINT Any numeric, binary, or BLOB data type

TIME Any time data type

TIMESTAMP Any date, time, or timestamp data type

VARCHAR Any character, variable character, CLOB, binary, or BLOB data type

VARCHAR FOR BIT DATA Any binary or BLOB data type

VARGRAPHIC Any character, variable character, CLOB, binary, or BLOB data type

XML XML, CLOB, or any character type
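A mapping table like the one for DB2 LUW can be checked programmatically before you create table mappings. The following is a minimal sketch, not part of InfoSphere CDC: the dictionary name, the category labels, and the `can_map` function are all hypothetical, and only a handful of DB2 LUW rows are encoded.

```python
# Hypothetical lookup that encodes a few DB2 LUW rows from the mapping table.
# The category labels ("numeric", "binary", ...) are illustrative names only.
DB2_LUW_MAPPINGS = {
    "BIGINT": {"numeric", "binary", "blob"},
    "BLOB": {"binary", "blob"},
    "DATE": {"date"},
    "DECIMAL": {"numeric"},
    "TIME": {"time"},
    "TIMESTAMP": {"date", "time", "timestamp"},
    "XML": {"xml", "clob", "character"},
}


def can_map(source_type, target_category):
    """Return True if the source type may be mapped to the target category."""
    allowed = DB2_LUW_MAPPINGS.get(source_type.upper())
    if allowed is None:
        raise KeyError("source type not encoded in this sketch: " + source_type)
    return target_category.lower() in allowed


print(can_map("DECIMAL", "numeric"))  # True
print(can_map("DECIMAL", "binary"))   # False
```

Keeping the rules in a plain dictionary makes it easy to extend the sketch with the remaining rows or with the tables for other replication engines.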

DB2 for z/OS

For the purposes of supported table mappings, the following text offers a definition of binary and text fields:

v A text field is any CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, BINARY, VARBINARY, BLOB, CLOB or DBCLOB column which has a CCSID (code page) or an XML column. By default, columns will have a CCSID or not, based on their definition in DB2. This CCSID, or lack of it, can be overridden using the Encoding tab in the Mapping Details view. Any text field can be mapped to any other text field.

v A binary field is any CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, BINARY, VARBINARY, BLOB, CLOB or DBCLOB column which does not have a CCSID (code page). By default, columns will have a CCSID or not, based on their definition in DB2. This CCSID, or lack of it, can be overridden using the Encoding tab in the Mapping Details view. Any binary field can be mapped to any other binary field.


Source data types

Supported table mappings when a code page is assigned

Supported table mappings when a code page is not assigned

Supported table mappings when a code page is not applicable

BIGINT N/A N/A Any numeric data type

BINARY Any text field Any binary field N/A

BLOB Any text field Any binary field N/A

CHAR (for BIT) Any text field Any binary field N/A

CHAR (for MIXED) Any text field Any binary field N/A

CHAR (for SBCS) Any text field Any binary field N/A

CLOB Any text field Any binary field N/A

DATE N/A N/A Any date data type

DBCLOB Any text field Any binary field

DECIMAL N/A N/A Any numeric data type

FLOAT N/A N/A Any numeric data type

GRAPHIC Any text field Any binary field

INTEGER N/A N/A Any numeric data type

ROWID N/A N/A For DB2 for z/OS to DB2 for z/OS mappings, a source ROWID column can only be mapped to a target ROWID column. Column mappings to any other target database permit mappings to binary data types.

SMALLINT N/A N/A Any numeric data type

TIME N/A N/A Any time data type

TIMESTAMP N/A N/A Any date, time, or timestamp data type

VARBINARY Any text field Any binary field N/A

VARCHAR (for BIT) Any text field Any binary field N/A

VARCHAR (for MIXED) Any text field Any binary field N/A

VARCHAR (for SBCS) Any text field Any binary field N/A

VARGRAPHIC Any text field Any binary field N/A

XML N/A N/A Any text field where the source field contains a valid XML document


Informix Dynamic Server


BIGINT Any numeric data type

BIGSERIAL Any numeric data type

BLOB Any BLOB data type

BOOLEAN Any Boolean data type

CHAR Any character or variable character data type

CHARACTER Any character or variable character data type

CHARACTER VARYING Any character or variable character data type

CLOB Any character, variable character, CLOB, binary, or BLOB data type

DATE Any date data type

DATETIME HOUR TO SECOND Any date data type

DATETIME YEAR TO FRACTION(5) Any date data type

FLOAT Any numeric data type

INT8 Any numeric data type

INTEGER Any numeric data type

LVARCHAR Any character or variable character data type

MONEY Any numeric data type

NCHAR Any character or variable character data type

NUMERIC Any numeric data type

NVARCHAR Any character or variable character data type

SERIAL Any numeric data type

SERIAL8 Any numeric data type

SMALLFLOAT Any numeric data type

SMALLINT Any numeric data type

VARCHAR Any character or variable character data type

Microsoft SQL Server



BINARY Any binary or BLOB data type

BIT Any numeric data type

CHAR Any character, variable character, CLOB, binary, or BLOB data type

DATETIME Any datetime, date, or time data type

DATE Any datetime, date, or time data type

DATETIME2 Any datetime, date, or time data type

DATETIMEOFFSET Any datetime, date, or time data type

IMAGE Any BLOB data type

NCHAR Any character, variable character, CLOB, binary, or BLOB data type

NTEXT Any character, variable character, CLOB, binary, or BLOB data type

NVARCHAR Any character, variable character, CLOB, binary, or BLOB data type

NVARCHAR(MAX) Any character, variable character, CLOB, binary, or BLOB data type

REAL Any numeric data type

ROWVERSION Any binary or BLOB data type

SMALLDATETIME Any datetime, date, or time data type

SMALLMONEY Any numeric data type

SQL_VARIANT sql_variant

TEXT Any character, variable character, CLOB, binary, or BLOB data type

TIME Any datetime, date, or time data type

TINYINT Any numeric data type

UNIQUEIDENTIFIER Any binary or BLOB data type

VARBINARY Any binary or BLOB data type

VARBINARY(MAX) Any binary or BLOB data type

VARCHAR(MAX) Any character, variable character, CLOB, binary, or BLOB data type

XML XML, CLOB, or any character type

Oracle


BLOB Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

CHAR Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

CLOB Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

DATE Any date or timestamp data type

INTERVAL DAY TO SECOND Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

INTERVAL DAY TO MONTH Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

LONG RAW Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

LONG VARCHAR Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

NCHAR Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

NCLOB Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

NVARCHAR2 Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

RAW Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

TIMESTAMP Any date or timestamp data type

TIMESTAMP WITH TIME ZONE Any date or timestamp data type

TIMESTAMP WITH LOCAL TIME ZONE Any date or timestamp data type

TIMEZONE Any date or timestamp data type

VARCHAR2 Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

XMLTYPE Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

Oracle Trigger


BLOB Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

CHAR Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

CLOB Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

INTERVAL DAY TO SECOND Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

INTERVAL DAY TO MONTH Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

LONG RAW Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

LONG VARCHAR Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

NCHAR Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

NCLOB Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

NVARCHAR2 Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

RAW Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

TIMESTAMP Any date or timestamp data type

TIMESTAMP WITH TIME ZONE Any date or timestamp data type

TIMESTAMP WITH LOCAL TIME ZONE Any date or timestamp data type

TIMEZONE Any date or timestamp data type

VARCHAR2 Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

XMLTYPE Any character, variable character, LOB, interval DTS, interval YTM, LONG RAW, LONG VARCHAR, NCHAR, NVARCHAR, RAW, or XML type data type

Sybase



BINARY Any binary or LOB data type

BIT Any numeric data type

CHAR Any character, variable character, or binary data type

DATE Any date or datetime data type

DATETIME Any date or datetime data type

IMAGE Any binary or LOB data type

NCHAR Any character, variable character, or binary data type

NTEXT Any character, variable character, or binary data type

NVARCHAR Any character, variable character, or binary data type

SMALLDATETIME Any date or datetime data type

SMALLMONEY Any numeric data type

TEXT Any character, variable character, or binary data type

TIME Any time data type

TIMESTAMP Any character, variable character, or binary data type

TINYINT Any numeric data type

UNICHAR Any character, variable character, or binary data type

UNITEXT Any character, variable character, or binary data type

UNIVARCHAR Any character, variable character, or binary data type

VARBINARY Any binary or LOB data type

VARCHAR Any character, variable character, or binary data type

Teradata


BYTE Any character, variable character, GRAPHIC, or VARGRAPHIC data type

BYTEINT Any numeric data type

CHAR Any character, variable character, GRAPHIC, or VARGRAPHIC data type

DOUBLE PRECISION Any numeric data type

GRAPHIC Any character, variable character, GRAPHIC, or VARGRAPHIC data type

INT (or INTEGER) Any numeric data type

INTERVAL Any numeric data type

LONG VARCHAR Any character, variable character, GRAPHIC, or VARGRAPHIC data type

TIME Any time or timestamp data type

TIMESTAMP Any date, time, or timestamp data type

VARBYTE Any character, variable character, GRAPHIC, or VARGRAPHIC data type

VARCHAR Any character, variable character, GRAPHIC, or VARGRAPHIC data type

VARGRAPHIC Any character, variable character, GRAPHIC, or VARGRAPHIC data type

Related concepts

“Supported data types” on page 19

“Supported databases and target applications” on page 16

Calculating database connections required by InfoSphere CDC

As an administrator, you may need to calculate how many database connections are required before installing InfoSphere CDC on either a source or a target database. Calculating the upper bound of database connections (both permanent and temporary) helps you plan your environment so that it can accommodate InfoSphere CDC.

If you are installing InfoSphere CDC Event Server, InfoSphere CDC for InfoSphere DataStage, InfoSphere CDC for Teradata, or InfoSphere CDC for Netezza databases, you only need to calculate database connections for the target database, because these products replicate to target-only destinations.

This topic includes formulae and examples to help you calculate the number of connections required by InfoSphere CDC versions 6.5.x or 6.3.x. Only calculations for 6.5.x are relevant for the InfoSphere CDC for Netezza databases product.

Calculating connections required by InfoSphere CDC on a source database

For InfoSphere CDC version 6.5.x:

(22 + G)*N + (4 + A)*T + 3*R + B + C

For InfoSphere CDC version 6.3.x:

20*P + A*S + G + 3*R + B + C

Where:

Note: Enter 0 for any value that does not apply to your deployment of InfoSphere CDC.

v T = number of InfoSphere CDC 6.5.x subscriptions (source datastore in Management Console is version 6.5.x) in all of your InfoSphere CDC 6.5.x instances.

v S = number of InfoSphere CDC 6.3.x subscriptions (source datastore in Management Console is version 6.3.x) in all of your InfoSphere CDC 6.3.x instances.

v N = number of InfoSphere CDC 6.5.x instances.

v P = number of InfoSphere CDC 6.3.x instances.

v G = number of Management Console GUI applications that are connected to your instances of InfoSphere CDC.

v R = number of RAC nodes if you are using ASM with Oracle RAC.

v A = number of RAC nodes if you are not using ASM with Oracle RAC.

v B = number of subscriptions that contain LOB columns.

v C = number of InfoSphere CDC command line utilities that you plan to use.

Example: How to calculate required connections for a source database (InfoSphere CDC version 6.5)

You want to set up InfoSphere CDC in the source environment as follows:

v 3 InfoSphere CDC 6.5.x subscriptions.

v 2 InfoSphere CDC 6.5.x instances.

v 1 subscription that uses LOB columns.

v 1 instance of Management Console.

v 2 RAC nodes and you are not using ASM.

v You do not plan to use any InfoSphere CDC command line utilities.

The number of connections required on the source database will be:

(22+1)*2 + (4+2)*3 + 1 = 65

You should plan for a maximum of 65 database connections before installing InfoSphere CDC on a source database.
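The source-side formulae can be scripted so that different deployment scenarios are easy to compare. The following is a minimal sketch: the function names are hypothetical (they are not InfoSphere CDC utilities), and each parameter corresponds to the variable of the same letter in the formulae, defaulting to 0 when it does not apply.

```python
def source_connections_65(n, t, g=0, a=0, r=0, b=0, c=0):
    """Upper bound for version 6.5.x: (22 + G)*N + (4 + A)*T + 3*R + B + C."""
    return (22 + g) * n + (4 + a) * t + 3 * r + b + c


def source_connections_63(p, s, g=0, a=0, r=0, b=0, c=0):
    """Upper bound for version 6.3.x: 20*P + A*S + G + 3*R + B + C."""
    return 20 * p + a * s + g + 3 * r + b + c


# The worked example: 2 instances, 3 subscriptions, 1 Management Console,
# 2 RAC nodes without ASM, 1 subscription with LOB columns, no utilities.
print(source_connections_65(n=2, t=3, g=1, a=2, b=1))  # 65
```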

Calculating connections required by InfoSphere CDC version 6.5.x or 6.3.x on a target database

For InfoSphere CDC version 6.5.x or 6.3.x:

(4+G)*N + 3*T

Where:

v T = number of InfoSphere CDC subscriptions (target datastore in Management Console is version 6.5.x or 6.3.x).

v G = number of Management Console GUI applications that are connected to your instances of InfoSphere CDC.

v N = number of InfoSphere CDC 6.5.x instances.

Example: How to calculate required connections for a target database

You want to set up InfoSphere CDC in the target environment as follows:

v 3 subscriptions.

v 2 InfoSphere CDC 6.5.x instances.

v 1 installed Management Console GUI application.

The number of connections required on the target database will be:

(4 + 1)*2 + 3*3 = 19



You should plan for a maximum of 19 database connections before installing InfoSphere CDC on the target database.
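The target-side formula is simpler and can be sketched the same way. The function name below is hypothetical, and the parameters mirror the variables N, T, and G defined for the target formula.

```python
def target_connections(n, t, g=0):
    """Upper bound on a target database: (4 + G)*N + 3*T."""
    return (4 + g) * n + 3 * t


# The worked example: 2 instances, 3 subscriptions, 1 Management Console.
print(target_connections(n=2, t=3, g=1))  # 19
```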

Database logs

InfoSphere CDC is a log-based replication product and will only replicate logged database operations. When installing and deploying the product to work with your source database, you must determine if the current log retention policy in your environment is sufficient to accommodate InfoSphere CDC. You can use the dmshowlogdependency command to determine which database logs are required by InfoSphere CDC.

Note: The dmshowlogdependency command is not available for InfoSphere CDC for z/OS, InfoSphere CDC for Netezza databases, or InfoSphere CDC for DB2 for i.

See also:

“DB2 for Linux, UNIX, and Windows (LUW) – online and archive logs”

“DB2 for i - journal and journal receiver authorities”

“Informix Dynamic Server - logical logs” on page 34

“Oracle - online redo logs and archived redo logs” on page 34

“Microsoft SQL Server - online transaction logs and transaction log backups” on page 34

“Sybase - online and archive logs” on page 35

“z/OS - archive logs” on page 35

DB2 for Linux, UNIX, and Windows (LUW) – online and archive logs

When using InfoSphere CDC with a DB2 for LUW database, you will require the following:

v Physical access to the DB2 online logs and archive logs on disk.

v DB2 log retention to be active so that logs are archived instead of overwritten.

v Retention of the DB2 online logs and archive logs for as long as InfoSphere CDC is shut down or latent.

The InfoSphere CDC log reader performs only read-only operations on your database logs and never writes directly to these files. This reduces disk contention on your log files and minimizes the impact on your database.

Related concepts

“Remote log reading” on page 45

DB2 for i - journal and journal receiver authorities

After installing InfoSphere CDC and verifying the journal and journal receiver authorities, you must also validate the following journaling settings:

v For the journals involved, MINENTDTA must be set to *NONE or *DTAARA. The product does not support minimized journal entries for database files. *FILE and *FLDBDY are not supported.

v For the journaling of specific files, IMAGES must be set to *BOTH. Both before and after images are required.
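The two journaling rules above can be expressed as a simple check. The following is a minimal sketch that assumes you have already obtained the MINENTDTA and IMAGES values as strings by some other means (how to retrieve them is not covered here); the function name is hypothetical.

```python
# Values of MINENTDTA that the product supports, per the rules above.
SUPPORTED_MINENTDTA = {"*NONE", "*DTAARA"}


def check_journal_settings(minentdta, images):
    """Return a list of problems; an empty list means both settings are valid."""
    problems = []
    if minentdta.upper() not in SUPPORTED_MINENTDTA:
        problems.append(
            "MINENTDTA is " + minentdta + "; it must be *NONE or *DTAARA"
        )
    if images.upper() != "*BOTH":
        problems.append("IMAGES is " + images + "; it must be *BOTH")
    return problems


print(check_journal_settings("*NONE", "*BOTH"))  # []
print(len(check_journal_settings("*FILE", "*AFTER")))  # 2
```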

Note that you should retain journal receivers for as long as InfoSphere CDC is shut down or latent.

For more information on how to verify journal and journal receiver authorities, refer to your InfoSphere CDC for IBM i documentation.

Informix Dynamic Server - logical logs

When using InfoSphere CDC with an Informix database, you will require the following:

v Installation of the InfoSphere CDC API to retrieve changed data from the Informix database.

v Retention of Informix logical logs for as long as InfoSphere CDC is shut down or latent.

Related concepts

“InfoSphere CDC API” on page 58

Oracle - online redo logs and archived redo logs

When using InfoSphere CDC with an Oracle database, you will require the following:

v Physical access to both redo and archive logs.

v Retention of redo and archive logs for as long as the product is shut down or latent.

v Supplemental logging that is configured to meet InfoSphere CDC requirements.

You can configure InfoSphere CDC to do the following:

v Use copies of complete Oracle archive logs that are shipped to a secondary system that is physically accessible to InfoSphere CDC. This is referred to as log shipping.

v Read database logs that are remote from your source installation of InfoSphere CDC if the database log directory is located on a shared storage device.

v Use only archive logs and not online redo logs. This type of configuration can lead to latency because the product must wait for the logs to be archived, and the wait varies depending on data volume and Oracle configuration.

Note: The InfoSphere CDC log reader performs only read-only operations on your database logs and never writes directly to these files. This reduces disk contention on your log files and minimizes the impact on your source database.

Related concepts

“ARCHIVELOG mode” on page 64

“Supplemental logging” on page 64

“Log shipping” on page 65

Microsoft SQL Server - online transaction logs and transaction log backups

When using InfoSphere CDC with a Microsoft SQL Server database, you will require the following:

v Physical access to the online transaction logs on disk. Multiple online log files are supported if you decide to split your database logs across multiple physical files.

v Physical access to the transaction log backups on disk. Transaction log backups must not be compressed or encrypted and must not be moved after they are saved. Third party tools for transaction log backups are not supported.

v Retention of the online transaction logs and transaction log backups for as long as InfoSphere CDC is shut down or latent.

It is beneficial for InfoSphere CDC performance if transaction logs and transaction log backups are kept to a size of 1 GB or less (if data volume permits).
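The 1 GB guideline can be monitored with a short script. The following is a minimal sketch that lists files in a log or backup directory that exceed a size limit. The function name is hypothetical, the directory you pass in depends on your own environment, and interpreting 1 GB as 2**30 bytes is an assumption.

```python
from pathlib import Path

ONE_GB = 1024 ** 3  # assumption: 1 GB interpreted as 2**30 bytes


def oversized_files(directory, limit_bytes=ONE_GB):
    """Return the sorted names of files in directory larger than limit_bytes."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.stat().st_size > limit_bytes
    )
```

The function only reads file metadata and does not open or modify the files.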

Note: The InfoSphere CDC log reader performs only read-only operations on your database logs and never writes directly to these files. This reduces disk contention on your log files and minimizes the impact on your source database.

Sybase - online and archive logs

When using InfoSphere CDC with a Sybase database, you will require the following:

v Physical access to the online logs on disk.

v Physical access to the archive logs on disk. All archive logs must be located in one directory and InfoSphere CDC must have read permission for this directory.

v Retention of the online logs and archive logs for as long as InfoSphere CDC is shut down or latent.

Note: To maximize performance, the online logs and archive logs can be placed on different disks.

The InfoSphere CDC log reader performs only read-only operations on your database logs and never writes directly to these files. This reduces contention on your log files and minimizes the impact on your database.

Related concepts

“Database and backup restrictions” on page 71

z/OS - archive logs

InfoSphere CDC for z/OS requires the DB2 archive logs to capture changes for a source subscription that has fallen behind (due to, for example, slowness of the target to apply changes, or the subscription being shut down for an extended time). You can determine how many archive logs to keep and how long to keep them. However, archive logs should ideally be stored on virtual tapes (DASD) rather than actual tapes. If a subscription requires reading log data from actual tapes, it may be difficult (or even impossible) for the subscription to catch up to the current Head of Log. For example, if you decide to restart mirroring on a subscription that stopped for one week, InfoSphere CDC must read through all the archive logs that were generated in the past week and capture accumulated changes. If DB2 generates an average of one archive log every hour, the subscription must read through 168 archive logs (24 logs per day for 7 days) to catch up on the week of changes it missed. If the logs are on actual tapes, the time to read through that volume of tape, along with the time required for 168 physical tape mounts, may well take more than a week, in which case the subscription will continue to fall further behind rather than catching up.
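The arithmetic in this example generalizes to other outage lengths and log generation rates. The following is a minimal sketch with hypothetical function names; the time needed to read one archive log (including any tape mount) is an input that you would have to measure in your own environment.

```python
def archive_logs_to_read(days_behind, logs_per_hour=1.0):
    """Number of archive logs generated while the subscription was stopped."""
    return round(days_behind * 24 * logs_per_hour)


def can_catch_up(minutes_to_read_one_log, logs_per_hour=1.0):
    """True if logs are read faster than DB2 generates new ones.

    New archive logs keep arriving while old ones are read, so the
    subscription only gains ground when reading one log takes less
    time than the interval between new logs.
    """
    minutes_between_new_logs = 60 / logs_per_hour
    return minutes_to_read_one_log < minutes_between_new_logs


print(archive_logs_to_read(7))  # 168
print(can_catch_up(20))         # True
print(can_catch_up(75))         # False
```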

In the case where archive logs cannot be read quickly enough to eventually catch up to the current Head of Log, it will be necessary to perform a Refresh for the subscription in order to get the subscription caught up. The DSPACT command can be used to determine if a subscription is catching up or falling behind. The standard method is to issue two DSPACT commands one hour apart and examine the output for the subscription in question. If the subscription advanced more than one hour in the DB2 log during the hour of real time, then the subscription is catching up. If the subscription advanced less than one hour in the DB2 log during the hour of real time, then the subscription is falling behind.
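The comparison of two DSPACT samples can be sketched as follows. This example does not parse DSPACT output, whose format is not described here. It assumes you have already extracted, for each sample, the wall-clock time of the command and the DB2 log time that the subscription had reached. The function name and the sample values are hypothetical.

```python
from datetime import datetime


def catch_up_status(first_sample, second_sample):
    """Compare two (wall_clock_time, log_position_time) samples."""
    real_elapsed = second_sample[0] - first_sample[0]
    log_elapsed = second_sample[1] - first_sample[1]
    if log_elapsed > real_elapsed:
        return "catching up"
    if log_elapsed < real_elapsed:
        return "falling behind"
    return "holding steady"


# Two samples taken one hour apart; the log position advanced 2.5 hours.
first = (datetime(2011, 6, 1, 9, 0), datetime(2011, 5, 28, 2, 0))
second = (datetime(2011, 6, 1, 10, 0), datetime(2011, 5, 28, 4, 30))
print(catch_up_status(first, second))  # catching up
```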

Even if the archive logs are on virtual tape, it is important that DB2 have enough virtual tape drives defined to it in order for subscriptions reading from archive logs to catch up. Ideally, have two virtual tape drives defined to DB2 for every subscription that is significantly behind the Head of Log, as each such subscription may be reading a different section of the log from the other subscriptions, and any given log read may span two archive log datasets.

Under ideal operating conditions, InfoSphere CDC will be configured with a sufficiently large Log Cache, such that all subscriptions will find the log data they require in the Log Cache. In this highly desirable situation only the Log Cache will be physically reading the DB2 log, and it will be reading at or near the Head of Log, so that no archive log datasets are being read by InfoSphere CDC.

If a subscription requires archive log datasets that are no longer available, the subscription will issue error messages and shut down. At this point a Refresh of the subscription will be required to get the subscription caught up.

Database clustering

InfoSphere CDC supports Oracle Real Application Clustering (RAC) and SQL Server Clustering.

See also:

“Oracle Real Application Clusters (RAC)”

“Configuring InfoSphere CDC in a RAC environment” on page 38

“Microsoft SQL Server clustering” on page 39

“DB2 for z/OS data sharing groups” on page 39

Oracle Real Application Clusters (RAC)

To deploy InfoSphere CDC in an Oracle RAC environment, you can install the product on one node in the cluster or on a system outside of the cluster. In both scenarios you must install InfoSphere CDC on the mount point of a SAN (Storage Area Network) or NFS (Network File System). InfoSphere CDC must have access to all archived and online redo log files generated by all nodes in the cluster. Installing the product on a system outside of your RAC environment is the optimum configuration for failover scenarios where a node may fail.

To integrate InfoSphere CDC into your RAC environment, you must first define an Oracle service for the RAC environment in the tnsnames.ora file for each RAC node. You can then select this service when creating an instance of InfoSphere CDC for your RAC environment with the InfoSphere CDC configuration tool. To complete the integration, you can create an InfoSphere CDC failover script that automates the failover process.

An overview of InfoSphere CDC configuration in a RAC environment (with Active-Passive configuration) is provided in the following diagram:

Behavior of InfoSphere CDC in a RAC environment

InfoSphere CDC queries the v$archived_log and v$log Oracle views to locate the database log files and attempts to access the files in the first log destination as defined in your Oracle database. If the log files are unavailable in the first destination, InfoSphere CDC proceeds to the next destination until it finds the required log files. To avoid latency, configure your Oracle database so that the first destination contains the required log files.
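The destination fallback described above can be pictured with a short Python sketch. It is not the product's implementation: the function name and the idea of passing destinations as an ordered list of directories are assumptions made for illustration.

```python
import os


def find_log_file(file_name, destinations):
    """Return the first existing path for file_name, trying destinations in order."""
    for directory in destinations:
        candidate = os.path.join(directory, file_name)
        if os.path.exists(candidate):
            return candidate
    # No destination holds the file.
    return None
```

Each destination that does not contain the file costs one failed lookup, which is why placing the required log files in the first destination keeps latency down.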

InfoSphere CDC replicates data from all RAC nodes at once and the stream of data from individual nodes is sorted (to account for data that is scrambled by log parallelism) and merged together for transaction consistency. InfoSphere CDC does not replicate data if the main node is closed. While replicating data, InfoSphere CDC opens connections to the source database and if these connections are closed for any reason, InfoSphere CDC ends replication.
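As a rough picture of the sort-and-merge step, the following Python sketch orders each node's records and then merges the streams. The record layout (a sequence number followed by a node label) and the use of a sequence number as the ordering key are assumptions; the guide does not state which key the product uses.

```python
import heapq


def merge_node_streams(streams):
    """Sort each node's records by sequence number, then merge all streams in order.

    Each record is a tuple whose first element is a sequence number
    (a hypothetical ordering key).
    """
    ordered = [sorted(stream, key=lambda rec: rec[0]) for stream in streams]
    return list(heapq.merge(*ordered, key=lambda rec: rec[0]))


node_a = [(3, "node_a"), (1, "node_a")]  # arrives out of order
node_b = [(2, "node_b")]
merged = merge_node_streams([node_a, node_b])
```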

InfoSphere CDC detects failed nodes in approximately 21 seconds. After the node fails, InfoSphere CDC continues to replicate data if the Oracle Cluster Ready Services (CRS) is running and recovers and finalizes the online logs from the failed node. InfoSphere CDC ensures data integrity when replicating data from all other nodes in your RAC environment. Once the node is restored, InfoSphere CDC automatically starts replicating from the restored node. If the nodes in your RAC environment have an unbalanced load, InfoSphere CDC may experience a latency of several times the Oracle checkpoint interval (3 seconds).

Configuring InfoSphere CDC in a RAC environment

InfoSphere CDC can be installed on a node that is part of the Oracle RAC, or it can be installed on a node outside of the RAC environment. In either case, you must install InfoSphere CDC on the mount point of a SAN. This configuration assures that, in the case of a failure of one of the nodes of the Oracle RAC, InfoSphere CDC will not require any configuration changes to continue to work.

If InfoSphere CDC is running on a node other than the failed one, no user intervention is required. Instead, InfoSphere CDC will detect the node failure in approximately 21 seconds and if Oracle Cluster Ready Services is running and recovers, InfoSphere CDC will continue to replicate data (including the online logs from the failed node).

If InfoSphere CDC was running on the failed node, then it must be restarted from a different node. No changes are needed because the same binaries and configuration metadata are accessible from all nodes. If this is the case, to achieve an optimal configuration to perform failover of InfoSphere CDC, consider three possible scenarios:
• Active source RAC node failure. In this case, the RAC node that hosts the active InfoSphere CDC source instance fails.
• Active target RAC node failure. In this case, the RAC node that hosts the active InfoSphere CDC target instance fails.
• Both active RAC nodes (source and target) fail.

In addition, the following requirements must be satisfied:
• Ability to restart InfoSphere CDC from a different location (that is, InfoSphere CDC binaries and configuration, and operational metadata need to be accessible).
• Reachability of InfoSphere CDC by external clients or processes (for example, for subscriptions targeting the failed InfoSphere CDC instance).

With shared location configuration, restarting InfoSphere CDC requires no special configuration changes.

With respect to reachability, consider both the accessibility of the host where InfoSphere CDC is running, and accessibility of the database. To ensure accessibility of the host, create an entry in the /etc/hosts file for each node involved in the environment using a common host name pointing to the IP address of the current node. For example, in case of a two-node RAC environment, the /etc/hosts file in the first node should have the following entry:

#cdc_host <IP address of first node>

In the second node, the /etc/hosts file should have the following entry:

#cdc_host <IP address of second node>

Thus, the cdc_host host name is invariable, but it actually points to the proper physical IP address, depending on the node on which InfoSphere CDC is running. To ensure accessibility to the database, follow a similar strategy. Create a special entry in the tnsnames.ora file using the common host name:

SID_CDC=(DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)(SERVICE_NAME=SID))
)

Using this configuration method, when InfoSphere CDC tries to connect to the database it will connect to the Oracle instance listening on port 1521 on host cdc_host, and cdc_host will point to the proper IP address depending on the node where it is running.
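The effect of the common host name can be modeled with a small Python sketch. The dictionaries stand in for each node's host name resolution, and the IP addresses are hypothetical values from a documentation address range; nothing here reads real system files.

```python
# Hypothetical per-node name resolution: the same name maps to a different address.
NODE1_HOSTS = {"cdc_host": "192.0.2.11"}
NODE2_HOSTS = {"cdc_host": "192.0.2.12"}


def connect_target(hosts_table, host="cdc_host", port=1521):
    """Resolve the fixed host name with the local table and return (address, port)."""
    return (hosts_table[host], port)


on_node1 = connect_target(NODE1_HOSTS)
on_node2 = connect_target(NODE2_HOSTS)
```

The connection settings (host name and port) are identical on both nodes; only the local resolution differs, which is why no configuration change is needed after a failover.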

With this approach, regardless of which node fails, and which InfoSphere CDC source or target has to fail over, no changes in configuration are needed. All that is required is to restart InfoSphere CDC from the new location and perform some cleanup, such as removing transaction queues and cleaning the staging store, after restarting the instance from the new location.

The same approach should be used to ensure accessibility from clients such as Management Console. When defining datastores in Access Server, use host names that, in case of failovers, can be easily changed to the new real physical location. Once the IP switch is complete, restart Access Server. No other configuration change is needed to operate InfoSphere CDC.

Microsoft SQL Server clustering

You can configure InfoSphere CDC to operate and failover in a Microsoft SQL Server clustered environment. Clustering in Microsoft SQL Server provides continuous access to resources in the event of a hardware failure, software failure, or some other interruption.

For more information on how to enable Active/Passive clustering support in InfoSphere CDC, see your InfoSphere CDC for Microsoft SQL Server documentation.

DB2 for z/OS data sharing groups

InfoSphere CDC for z/OS can operate in DB2 subsystems that are part of a data sharing group.

Limitations:

• Only one InfoSphere CDC for z/OS address space can use an assigned security identifier, within the same DB2 Subsystem or Data Sharing Group of DB2 Subsystems.

• It is not possible for multiple executing instances of InfoSphere CDC for z/OS to share the same set of metadata tables, as this could lead to the same subscription executing concurrently in two separate instances. To prohibit this from occurring, InfoSphere CDC serializes access to the metadata it is using. If another instance of InfoSphere CDC attempts to use the same metadata tables, it will be suspended until the first instance has terminated its address space. This is true even if the two contending instances are executing in different images of the operating system and attempting to access the single metadata instance within a DB2 Data Sharing Group.

• When you are installing InfoSphere CDC for z/OS in a Data Sharing Group, you should specify the data sharing group name for the database name, not the local member.

Operating system clustering

InfoSphere CDC supports active/passive two-node clusters on the UNIX and Linux operating systems. Operating system clustering provides continuous access to resources in the event of a hardware failure, software failure, or some other interruption.

InfoSphere CDC supports failover situations such as hardware and software failures as well as forced or manual failover scenarios. You must develop custom scripts for InfoSphere CDC during failover and manual failover situations.

The following InfoSphere CDC replication engines support operating system clustering:
• InfoSphere CDC for DB2 for LUW
• InfoSphere CDC for Oracle databases
• InfoSphere CDC for Sybase databases
• InfoSphere CDC for Informix

Database connection resiliency

InfoSphere CDC will make several long-running connections to your database and the product will maintain these connections while running.

As a best practice and for optimum performance, you should avoid session limits on InfoSphere CDC database connections. Database connections established by the product will terminate once session limits are reached, which will then force InfoSphere CDC to reconnect to the database and restart subscriptions. This may result in an increase in product latency.

Replicating multibyte (MBCS) and double-byte (DBCS) character data

Note: InfoSphere CDC for DB2 for i does not support MBCS table names, column names, or transformations such as row filtering expressions, derived expressions, and derived columns. Contact your IBM technical representative for a suitable solution for your environment.

InfoSphere CDC replicates character data between a wide variety of encodings and will automatically convert the data from the column encoding detected on the source to the column encoding detected on the target. For example, you can replicate multibyte character data such as Japanese, Chinese, or Korean. Character data in these languages cannot be represented in a single byte. The most common MBCS implementation is double-byte character sets (DBCS).

By default, InfoSphere CDC assumes that the data stored in a character-capable column is in the encoding associated with that column type. For instance, if your database is set to use Shift-JIS, then data stored in CHAR and VARCHAR columns is assumed to be in Shift-JIS by default. However, InfoSphere CDC is only concerned with the encoding of the data, not the encoding of the column storage type. This flexibility allows the product to deal with situations where the encoding of the data does not match the encoding specified for the column in the database. The ability to override the detected encoding is determined by InfoSphere CDC. Overriding the detected column encoding allows you to specify the actual encoding of the data as known by you.

This functionality has been extended to not only standard character-capable column types such as CHAR and VARCHAR, but also to traditionally Unicode-capable columns such as NCHAR and NVARCHAR, many traditionally binary column types, as well as many large object (LOB) column types, whether or not these are traditionally considered to be character-based. InfoSphere CDC treats all of them as being character data capable. To provide the greatest level of flexibility and where permitted by the limitations of the database, InfoSphere CDC undertakes to remove the distinction between the data themselves and the data type as known by the database that is used to contain the data.

There may be situations where you want to replicate the data exactly as is with no change to encoding. In these situations, you can designate the column as being binary and the data will be replicated as is. All binary designated column data must also be mapped to binary column data.

Encoding conversion can increase the workload for your source or target servers. InfoSphere CDC provides the ability to specify (with a subscription-level preference) where that workload will be incurred: on either the source or the target.

InfoSphere CDC also provides an upgrade process for subscriptions that use older implementations (InfoSphere CDC version 6.3 and earlier) of MBCS support. Management Console allows you to quickly convert subscriptions to the auto-encoding mode for MBCS data that is available in InfoSphere CDC version 6.5.

In this section, you will learn:
“Common encoding conversion scenarios”
“Considerations when replicating MBCS character data”

Common encoding conversion scenarios

Use the following scenarios as guidelines when you want to convert character set encodings between your source and target.

Scenario 1: Converting encoding between a DB2 z/OS source and a DB2® LUW target

In this scenario, you have a DB2 z/OS source database with data in Simplified Chinese and a default database character set encoding of CCSID 935. Your target database is a DB2 for Windows system with data in Simplified Chinese and a default character set encoding of GB18030 (CCSID 1392).

InfoSphere CDC will automatically detect the default database encoding of the source and target columns in your table mappings. If the detected encodings are appropriate for your business needs, no further configuration is required.

Note: Because of the encoding differences between these platforms, it is important to note that not all characters will convert with equivalent characters.

Scenario 2: Converting from a national language character set to Unicode

In this scenario, you have a configuration in which data needs to be converted from a national language character set to Unicode. For example, the character set encoding on the source is Traditional Chinese, while the character set encoding on the target (to which you want to convert) is Unicode. No configuration is required in Management Console.

Scenario 3: Overriding the database default encoding of a column as detected by InfoSphere CDC

The default encoding of a column in a database can be different from the encoding of the data itself. InfoSphere CDC handles these situations by allowing you to override the detected encoding of a column.

For example, you have a source database with a default encoding of Windows-1252 and you want to replicate CHAR data to your target. You also have Shift-JIS character data in CHAR columns in your source database. InfoSphere CDC will likely detect that all CHAR columns in your source database are Windows-1252 because this is the encoding of the column in your database or the default database encoding. InfoSphere CDC will determine if you can select an encoding that is more appropriate for your data. If InfoSphere CDC allows it, you can override the Windows-1252 encoding and select Shift-JIS encoding for those columns that contain Shift-JIS data.
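To see why the override matters, the following Python sketch decodes the same stored bytes twice: once with the detected encoding and once with the actual encoding. It uses Python's standard codecs (cp1252 for Windows-1252) purely as an illustration; the function name is hypothetical and this is not how InfoSphere CDC is invoked.

```python
def decode_column(raw_bytes, encoding):
    """Interpret stored column bytes using the given encoding."""
    return raw_bytes.decode(encoding)


# Bytes as they sit in the CHAR column: Shift-JIS data in a Windows-1252 database.
stored = "日本".encode("shift_jis")

as_detected = decode_column(stored, "cp1252")       # detected (wrong) encoding
as_overridden = decode_column(stored, "shift_jis")  # overridden (actual) encoding
```

With the detected encoding the bytes decode to unrelated Latin characters; with the override the original text is recovered.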

Scenario 4: Replicating mixed character data encodings on the source to multiple encodings on the target

In this scenario, your source data is a mix of different character encodings but is primarily IBM-943. Business requirements dictate that you must have the following character encodings on your target: IBM-943 and IBM-943c.

InfoSphere CDC allows you to override the detected encodings in your target columns and set them to either IBM-943 or IBM-943c.

Scenario 5: Replicating character data with no change to encoding

You can replicate data as is with no change to the encoding by designating the source and target columns in a table mapping as binary with the Override encoding as binary option in Management Console.

In this scenario, your DB2 for z/OS source has data structures in a character field. The data structure contains EBCDIC characters, dates, and packed numbers. You want to use a user exit to split the data and send it to 10-20 fields on your target. You can use the Java getBytes function in your user exit to read the data and perform the data conversion. Since the getBytes function is only allowed on a binary field or a character field that is overridden in Management Console as a binary, you can use the Override encoding as binary option for the source and target columns in the table mapping. This will allow getBytes to retrieve data from the source image as bytes.
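The kind of byte-level splitting such a user exit performs can be sketched as follows. The sketch is written in Python rather than against the Java user exit API, and the record layout (5 bytes of EBCDIC text followed by a 3-byte packed decimal) and the cp037 code page are assumptions chosen for illustration.

```python
def unpack_packed_decimal(data):
    """Decode packed decimal bytes: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in data[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(data[-1] >> 4)
    sign_nibble = data[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    # 0x0D marks a negative value; other sign nibbles are treated as positive here.
    return -value if sign_nibble == 0x0D else value


def split_record(raw):
    """Split a raw record into (name, amount) using the hypothetical layout."""
    name = raw[:5].decode("cp037").rstrip()
    amount = unpack_packed_decimal(raw[5:8])
    return name, amount


record = "SMITH".encode("cp037") + bytes([0x12, 0x34, 0x5C])
fields = split_record(record)
```

Working on raw bytes is what makes this possible: if the column were converted as character data first, the packed decimal bytes would be altered before the user exit could read them.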

The Override encoding as binary option is useful in scenarios where the character column does not contain characters only, but other data types with complex structures.

Scenario 6: Overriding a binary field with an appropriate encoding for your data

In this scenario, you have a source column with various encodings in a binary field. You use row filtering when replicating tables to the target so that you only have one type of encoding on the target. In the source columns that are detected as Binary by Management Console, you can override the detected encoding type of Binary and select the encoding that is appropriate for the actual data in each source column. In this scenario, you can have the same source table mapped to the same target table, but the encodings on the source binary columns are different from the single encoding that is found on your target.

Considerations when replicating MBCS character data

When mapping source columns to target columns that contain MBCS character data, consider the following:
• Most databases enforce NCHAR and NVARCHAR as Unicode and the encoding is not changeable.
• All binary data types such as BLOB will not have a default encoding, although InfoSphere CDC allows you to specify an encoding.
• Ensure that the target column you want to send data to is large enough to store the replicated character data.
• Your data will not be replicated if you override the encoding for graphic, vargraphic, and dbclob columns.
• For InfoSphere CDC products that support XML replication, InfoSphere CDC can only replicate XML compliant data to XML column types. Please see the XML specification for the compliance criteria.
• InfoSphere CDC will respect the mappings and apply the data according to the character set that is configured for the data. InfoSphere CDC does not validate that the character set can be inserted correctly into the column.
• Target tables must use the correct length values. Databases use character length semantics, byte length semantics, or both.
• InfoSphere CDC version 6.5 performs encoding conversion on the target by default, with an option to specify the source. This is a change in behavior from previous releases of InfoSphere CDC, which always performed encoding conversion on the source with no option to specify the target. Encoding conversion is a CPU intensive activity and you should take this into consideration when deciding where encoding conversion will take place.
• Before using MBCS functionality in InfoSphere CDC, you must ensure that your operating system, database, and all tools used to enter data (such as a terminal) are properly configured for your MBCS environment by a system administrator. Otherwise, unexpected behavior may result.
• Java class user exits in InfoSphere CDC support MBCS character data. All character data is converted to Java String objects.
• InfoSphere CDC for Teradata does not support MBCS character data on the AIX® operating system.
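The points above about column size and length semantics can be checked with a short Python sketch. It only illustrates the difference between counting characters and counting bytes; the function name and the "char"/"byte" labels are hypothetical and do not correspond to a product setting.

```python
def fits_column(text, encoding, column_length, semantics):
    """Check whether text fits a column declared with character or byte length semantics."""
    if semantics == "char":
        needed = len(text)
    elif semantics == "byte":
        needed = len(text.encode(encoding))
    else:
        raise ValueError("semantics must be 'char' or 'byte'")
    return needed <= column_length


sample = "漢字"  # two characters
fits_as_chars = fits_column(sample, "utf-8", 4, "char")
fits_as_utf8_bytes = fits_column(sample, "utf-8", 4, "byte")
fits_as_sjis_bytes = fits_column(sample, "shift_jis", 4, "byte")
```

The same two characters need 4 bytes in Shift-JIS but 6 bytes in UTF-8, so a target column sized for the source encoding may be too small after conversion.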

Continuous Capture

Continuous Capture is a product feature that is designed to accommodate those replication environments in which it is necessary to separate the reading of the database logs from the transmission of the logical database operations. This is useful when you want to continue processing log data even if replication and your subscriptions stop due to issues such as network communication failures over a fragile network, target server maintenance, or some other issue. You can enable or disable Continuous Capture without stopping subscriptions.

Continuous Capture allows you to avoid spikes in your source system CPU resource utilization by continuing to process log data (and write to disk as necessary) even when subscriptions are stopped. This feature allows you to avoid situations where the product uses no CPU when subscriptions are stopped and high CPU when you start subscriptions after a prolonged target system outage.

This functionality comes with the trade-off of additional disk utilization on the source machine in order to accumulate changes from the database log file when these are not being replicated to the target machine. These trade-offs should be evaluated and understood before deciding to use this feature in your replication environment.

For more information about enabling and using this feature, see the InfoSphere CDC documentation for your database platform.

Replicating XA transactions

InfoSphere CDC will replicate transactions involved in XA. InfoSphere CDC will maintain any data dependencies between individual branches. Oracle's DTP facility must be used in RAC environments when multiple branches of a single XA transaction will occur in that database.

Replication of XA transactions is supported on the following database versions:
• Oracle 10g and later
• IBM DB2 LUW version 9.7 and later

Replicating XA transactions is supported in DB2 pureScale and DPF environments.

DB2 for LUW

The following DB2 for LUW configuration settings should be considered prior to installing InfoSphere CDC:

“Enabling database log retention”
“Creating a database backup”
“Remote log reading”
“Remote target apply”
“Table-level considerations”
“Replicating data in a Database Partitioning Feature (DPF) environment”
“Configuring InfoSphere CDC for a DB2 High Availability Disaster Recovery (HADR) environment”

Enabling database log retention

You are required to enable log retention for each database that you intend to use for replication with the logretain parameter in DB2 LUW. For more information on how to complete this task, see the documentation for your version of DB2 for LUW. You must also enable log retention with the logretain parameter for newly added partitions in a DPF environment on your InfoSphere CDC source system.

Creating a database backup

After enabling log retention with the logretain parameter in DB2 for LUW, you must create a backup for each database that you intend to use for replication. Before issuing the database backup command at the DB2 for LUW command line, you must ensure that there are no users connected to the database. A database backup is also required if you are enabling log retention after adding a partition to a DPF environment on your InfoSphere CDC source system. For more information on how to complete this task, see the documentation for your version of DB2 for LUW.

Remote log reading

You can deploy your source installation of InfoSphere CDC on a system that is remote from your source database.

To allow communication with the source database, you must install DB2 Client on the machine where InfoSphere CDC is installed.

Related concepts
“DB2 for Linux, UNIX, and Windows (LUW) – online and archive logs”

Remote target apply

You can deploy your target installation of InfoSphere CDC on a system that is remote from your target database.

To allow communication with the target database, you must install DB2 Client on the machine where InfoSphere CDC is installed.

Table-level considerations

In this section, you will learn:
“Supported table types”
“Identity columns”
“Replicating data in a DB2 pureScale environment”

Supported table types

InfoSphere CDC can replicate regular tables with indexes in a DB2 for LUW database. Regular tables are described as "general purpose" by the DB2 for LUW documentation.

InfoSphere CDC will not replicate the following table types in a DB2 for LUW database:
• Multidimensional clustering (MDC) tables
• Range-clustered tables (RCT)
• Partitioned tables

Identity columns

There are two kinds of DB2 LUW identity columns that InfoSphere CDC for DB2 for LUW now supports: those which are GENERATED ALWAYS and those which are GENERATED BY DEFAULT. When replicating these identity columns you should consider the following.

GENERATED ALWAYS identity columns

You can map GENERATED ALWAYS identity columns in source tables to a target column in Management Console.

GENERATED ALWAYS columns in target tables are read-only and cannot be mapped.

GENERATED BY DEFAULT identity columns

InfoSphere CDC handles GENERATED BY DEFAULT identity columns in source tables similarly to columns without the identity property. The value of a source column is retained when mapped to a target identity column. Target identity columns will use the value provided by the database if you set the initial value of the column to Database Default. For more information on how to set the initial value of a target column to Database Default, see your Management Console documentation.

When replicating data to or from tables containing GENERATED BY DEFAULT identity columns, consider the following:
• Any column which has the initial value of the mapping set to Database Default is not applied by InfoSphere CDC. InfoSphere CDC skips the column when applying data and the database provides a value.
• You cannot reference a target identity column that has the initial value set to Database Default in an expression. There is no restriction on the source columns in a similar scenario.
• User exits cannot access the data for any target column that has an initial value set to Database Default in Management Console.
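The skip behavior in the first consideration can be pictured with a minimal Python sketch. The mapping representation (pairs of target column name and initial value setting) is an assumption made for illustration and is not a product data structure.

```python
def columns_to_apply(mappings):
    """Return the target columns that receive a replicated value.

    mappings is a list of (target_column, initial_value_setting) pairs
    (hypothetical representation). Columns set to "Database Default" are
    skipped so that the database supplies the value.
    """
    return [column for column, setting in mappings if setting != "Database Default"]


mappings = [
    ("ORDER_ID", "Database Default"),  # identity column filled in by the database
    ("CUSTOMER", "Source Column"),
    ("AMOUNT", "Source Column"),
]
applied = columns_to_apply(mappings)
```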

User exits and identity columns

isDataAvailable(int) in the InfoSphere CDC API can be used in a user exit to determine if there is data for a specific column. This method will return false for a target column with an initial value set to Database Default in Management Console. An exception is thrown if you attempt to retrieve the value from a column that is mapped to a target identity column.

Replicating data in a DB2 pureScale environment

When targeting a pureScale environment with InfoSphere CDC, the load directory must be accessible to all members in your pureScale environment on your InfoSphere CDC target system. For example, you can use an NFS (Network File System) mount to make the load directory accessible to all members. InfoSphere CDC prefers to use the DB2 load utility when performing a refresh on the target system.

Replicating data in a Database Partitioning Feature (DPF) environment

InfoSphere CDC for DB2 for LUW version 6.5.1 now supports the replication of change data from a DPF environment on a source system to any InfoSphere CDC version 6.5.1 target deployment.

For more information on how to source or target a DPF environment with InfoSphere CDC, see your InfoSphere CDC for DB2 for LUW – End-user documentation.

Configuring InfoSphere CDC for a DB2 High Availability Disaster Recovery (HADR) environment

You can configure InfoSphere CDC to replicate data in a DB2 HADR environment. During a failover (either manual or automatic) when the DB2 LUW workload is transferred from the primary server to the standby server, you can configure InfoSphere CDC to start replication to the new primary server.

Important: InfoSphere CDC does not support replication to a standby server.

To deploy InfoSphere CDC in a DB2 HADR environment, you must install InfoSphere CDC and create your InfoSphere CDC instances on a storage area network (SAN) device that has direct file system access to the primary server and standby server. You must also make the same ports available to InfoSphere CDC on the primary and standby servers. By default, the InfoSphere CDC for DB2 for LUW replication engine uses port 10901.

In this section, you will learn:
“Starting replication to the new primary server after a failover”

Starting replication to the new primary server after a failover

If the current primary server in your DB2 HADR configuration fails and the database workload is transferred to the standby server, you must complete several tasks before the product will resume replication to the new primary server. If the primary and standby servers in your DB2 HADR configuration use the same IP address or hostname, you can fully automate this process with a script.

You must complete the following tasks after a primary server failure:
1. Stop replication on all subscriptions with the dmendreplication command.
2. Shut down all InfoSphere CDC processes with the dmshutdown command.
3. Delete the following staging area files in the InfoSphere CDC installation directory:
   • <instance_name>/conf/txq*
   • <instance_name>/stagingstore/*
   • <instance_name>/tmp/*
4. Restart your InfoSphere CDC instance on the new primary server with the dmts32 or dmts64 commands.
5. If the IP address or hostname of your primary and standby servers are different, you must specify the IP address or hostname of the new primary server after each failover event by changing the target datastore for all subscriptions in Management Console. For more information on how to do this in Management Console, see your InfoSphere CDC for DB2 for LUW documentation.
6. Start replication to the new primary server with the dmstartmirror command.
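A failover script would need to locate the staging area files named in step 3 before deleting them. The following Python sketch only lists the matching files, as a dry run, and records the order of the steps above. The function name is hypothetical, the commands are listed by name only (their arguments are not shown in this section, so none are assumed), and nothing is executed or deleted.

```python
import glob
import os

# File patterns from step 3, relative to <instance_name> in the installation directory.
STAGING_PATTERNS = (("conf", "txq*"), ("stagingstore", "*"), ("tmp", "*"))

# Order of the tasks listed above; commands are named without arguments.
FAILOVER_STEPS = (
    "dmendreplication",
    "dmshutdown",
    "delete staging area files",
    "dmts32 or dmts64",
    "change target datastore if the address differs",
    "dmstartmirror",
)


def staging_files(install_dir, instance_name):
    """Return the sorted paths that match the step 3 patterns (dry run, no deletion)."""
    found = []
    for parts in STAGING_PATTERNS:
        pattern = os.path.join(install_dir, instance_name, *parts)
        found.extend(glob.glob(pattern))
    return sorted(found)
```

Reviewing the returned list before deleting anything is a cheap safeguard, since the stagingstore and tmp patterns match every file in those directories.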

DB2 for z/OS

The following z/OS and DB2 for z/OS configuration settings should be considered prior to installing InfoSphere CDC:

“Extended CSA storage (ECSA)”
“Estimating above the bar storage requirements”
“DB2 batch connections and allied threads”
“Log cache”
“DB2 log buffers”
“Code page conversion services”
“Schema evolution”
“Security Access Facility (SAF) and DB2 authorization”

Extended CSA storage (ECSA)

InfoSphere CDC uses the DB2 Instrumentation Facility Interface (IFI) to obtain copies of the changes made to database tables from the DB2 Log. The IFI requires that the application program (InfoSphere CDC) supply a storage buffer located in the z/OS ECSA with storage key 7. A buffer (approximately 72 KB in size or 288 KB in size if a DB2 Data Sharing Group is being used) is required for each active IFI connection. InfoSphere CDC will maintain an active IFI connection for each subscription for which it is actively mirroring data. InfoSphere CDC will also maintain one active IFI connection for the Log Cache, if it is configured. This may require a reassessment of the size of ECSA as configured using z/OS IPL parameters, and will require an IPL if the value configured needs to be changed.
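The buffer arithmetic above can be turned into a rough estimate. The following Python sketch uses the approximate per-connection sizes from the text; the function and parameter names are hypothetical, and the result is an approximation rather than an exact sizing.

```python
def estimate_ecsa_kb(active_subscriptions, log_cache_configured, data_sharing_group):
    """Estimate ECSA buffer storage in KB for active IFI connections.

    One connection per actively mirroring subscription, plus one for the
    Log Cache if it is configured. Buffer sizes are the approximate values
    given in the text.
    """
    buffer_kb = 288 if data_sharing_group else 72
    connections = active_subscriptions + (1 if log_cache_configured else 0)
    return connections * buffer_kb


standalone_kb = estimate_ecsa_kb(10, log_cache_configured=True, data_sharing_group=False)
data_sharing_kb = estimate_ecsa_kb(10, log_cache_configured=True, data_sharing_group=True)
```

For ten mirroring subscriptions with a Log Cache, this gives 11 connections: about 792 KB without data sharing and about 3168 KB in a Data Sharing Group.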

Estimating above the bar storage requirements

During ongoing mirroring activity, an active InfoSphere CDC subscription uses the DB2 IFI to access the DB2 Log and obtain the changes for source tables that are being replicated. The subscription collects the changes into commit groups based on the DB2 Unit of Recovery in which the changes occurred. When the subscription obtains an indication of a COMMIT or a ROLLBACK from the DB2 Log, it closes the associated commit group and disposes of it. If a ROLLBACK was detected, then the commit group is discarded. If a COMMIT was detected, then the commit group is transmitted to the target server for application to the mapped tables.

Before a commit group is closed, the active subscription must hold the changes in a storage area (stage them) until their method of disposal has been determined. At any point in time, the active subscription could be staging many commit groups concurrently. The staging resource used is storage above the bar (64-bit addressable storage). Each active subscription acquires 1 MB of storage above the bar when it starts, and acquires more storage above the bar as it is required. When storage is no longer required, it is released, except that once 5 MB or more are acquired, the last 5 MB are not released until the subscription ends. As changes for a commit group are being gathered, the subscription writes them into the above the bar storage. As the commit groups are closed and disposed of, the subscription removes the changes from the above the bar storage and releases the storage for use by other subscriptions.

During ongoing mirroring across multiple subscriptions, each active subscription will have a number of incomplete commit groups, each with different amounts of changes staged. As a subscription continues to process DB2 Log data, the volume of staged changes across all the commit groups will increase until a COMMIT or ROLLBACK indication is received from the DB2 Log and a commit group is closed. Then the changes contained in the commit group that has been closed will be disposed of and the volume of staged data will decrease. This produces a “sawtooth” behavior for the volume of staged data across all active subscriptions, with the volume rising gradually as data is staged, then dropping sharply when a commit group is disposed of. An estimate of the resource requirements to stage commit groups will need to consider the highest potential value the “points” of the sawtooth can attain. The practical highest resource requirement may be lower than this potential value.

In order to estimate InfoSphere CDC's requirement for storage resources, you must be familiar with the contents of the DB2 Log. Specifically, you must be familiar with the profile of DB2 units of recovery that contain changes to tables to be mirrored by InfoSphere CDC. Such tables are called “sensitive” tables, and the containing DB2 units of recovery are called “sensitive” units of recovery. You should be able to identify all sensitive units of recovery based on knowledge of the applications that update the sensitive tables. You should also be able to estimate the maximum concurrency of each sensitive unit of recovery, and the maximum number of inserts, updates and deletes against sensitive tables that are contained in the sensitive units of recovery. Once these estimates have been obtained, you can calculate the size of the staging resources required.

“Calculating staging resources”

Calculating staging resources

Assuming that M sensitive units of recovery have been identified, numbered 1 to M, and that N sensitive tables have been identified, numbered 1 to N, the values being used in the calculation are as follows:
• URConcurrency_m: The maximum number of concurrently logged (overlapping) copies of sensitive unit of recovery m (ranging from 1 to M).
• INSERTs_m_n: The maximum number of inserts into sensitive table n (ranging from 1 to N) that occur in unit of recovery m. This may be 0 for certain combinations of m and n.
• UPDATEs_m_n: The maximum number of updates to sensitive table n that occur in unit of recovery m. This may be 0 for certain combinations of m and n.
• DELETEs_m_n: The maximum number of deletes from sensitive table n that occur in unit of recovery m. This may be 0 for certain combinations of m and n.
• RecordLength_n: The record length of sensitive table n.

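For each value of m, calculate the following:

ChangeVolume_m = sum over n = 1 to N of (280 + RecordLength_n) * (INSERTs_m_n + 2 * UPDATEs_m_n + DELETEs_m_n)

where 280 is the control information overhead, and UPDATEs_m_n is doubled to account for before and after images of the updated row.

Then calculate:

TotalChangeVolume = sum over m = 1 to M of URConcurrency_m * ChangeVolume_m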

TotalChangeVolume is the outside maximum amount of storage that can be held in staging spaces in above the bar storage, and that potentially can be required of Auxiliary Storage to manage the staging of ongoing mirroring activity. As mentioned earlier, the practical maximum may be lower than TotalChangeVolume. Some of the factors that can produce lower results are:
• How well distributed (versus “clumped”) the commits and rollbacks are in the DB2 Log.
• How many changes are discarded by filtering criteria in the source environment.
• How far the estimates of INSERTs_m_n, UPDATEs_m_n, and DELETEs_m_n (which are maximums) lie above the average values for specific m and n.

Example calculation

Suppose you have 2 sensitive tables, table_1 with a row length of 1200 bytes and table_2 with a row length of 2100 bytes. Suppose further that there are three types of URs involving these tables.

UR_1 can contain up to 100 inserts and 100 updates for table_1, and there can be up to 50 concurrent URs of this type. URConcurrency_1 is 50, INSERTs_1_1 is 100, UPDATEs_1_1 is 100, DELETEs_1_1 is 0, and RecordLength_1 is 1200.

UR_2 can contain up to 1000 updates for table_2, and there can be up to 100 concurrent URs of this type. URConcurrency_2 is 100, INSERTs_2_2 is 0, UPDATEs_2_2 is 1000, DELETEs_2_2 is 0, and RecordLength_2 is 2100.

UR_3 can contain up to 200 inserts, updates, and deletes for each of table_1 and table_2, and there can be up to 25 concurrent URs of this type. URConcurrency_3 is 25, INSERTs_3_1 is 200, UPDATEs_3_1 is 200, DELETEs_3_1 is 200, INSERTs_3_2 is 200, UPDATEs_3_2 is 200, DELETEs_3_2 is 200, RecordLength_1 is 1200, and RecordLength_2 is 2100.

ChangeVolume_1 is (280 + 1200) * (100 + 2 * 100 + 0), or 444,000.

ChangeVolume_2 is (280 + 2100) * (0 + 2 * 1000 + 0), or 4,760,000.

ChangeVolume_3 is ((280 + 1200) * (200 + 2 * 200 + 200)) + ((280 + 2100) * (200 + 2 * 200 + 200)), or 1,184,000 + 1,904,000, or 3,088,000.

TotalChangeVolume is (50 * 444,000) + (100 * 4,760,000) + (25 * 3,088,000), or 575,400,000. A staging space size (as specified by the MAXSUBSCRSTAGESIZE keyword) of 600 MB would be sufficient.
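The example calculation can be reproduced with a short script. The following Python sketch is illustrative only: the function and variable names are hypothetical, and the data layout (a list of concurrency and per-table change counts) is an assumption chosen for readability. The numbers are the ones used in the example.

```python
CONTROL_OVERHEAD = 280  # control information overhead added to each record length


def change_volume(changes_by_table, record_length):
    """Bytes staged by one unit of recovery.

    changes_by_table maps table number n to (inserts, updates, deletes).
    Updates count twice: one before image plus one after image.
    """
    return sum(
        (CONTROL_OVERHEAD + record_length[n]) * (ins + 2 * upd + dele)
        for n, (ins, upd, dele) in changes_by_table.items()
    )


def total_change_volume(units_of_recovery, record_length):
    """Sum of URConcurrency_m * ChangeVolume_m over all units of recovery."""
    return sum(
        concurrency * change_volume(changes, record_length)
        for concurrency, changes in units_of_recovery
    )


# Values from the example: two tables and three types of unit of recovery.
record_length = {1: 1200, 2: 2100}
units_of_recovery = [
    (50, {1: (100, 100, 0)}),                        # UR_1
    (100, {2: (0, 1000, 0)}),                        # UR_2
    (25, {1: (200, 200, 200), 2: (200, 200, 200)}),  # UR_3
]

print(total_change_volume(units_of_recovery, record_length))  # 575400000
```

Replacing the sample values with your own estimates gives the outside maximum for your environment.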

In practice, it may be possible to reduce the size of the staging spaces below the value computed for TotalChangeVolume. The Storage Manager STATUS command displays how much storage is being used by each subscription’s staging space, and also displays the maximum amount of storage ever used by each subscription’s staging space. In addition, the Staging Space Report displays the detailed contents of each subscription’s staging space.

DB2 batch connections and allied threads

While it is executing, InfoSphere CDC opens several DB2 plans under separate sub-tasks. Each DB2 plan is opened using the DB2 batch Call Attach Facility, and so each opened DB2 plan requires a DB2 batch connection and represents a DB2 user. Running InfoSphere CDC may require an adjustment to the maximum number of batch connections and allied threads (users) that DB2 can support. In order to determine what these adjustments (if any) are, you will need to know the maximum number of open plans that InfoSphere CDC will require. InfoSphere CDC keeps its DB2 plans open only as long as they are required, and closes them as soon as they are no longer needed. Accordingly, you will need to know the maximum number of concurrent replication and support activities that can be ongoing at one time.

The following list is a set of guidelines to help you determine the number of allied threads InfoSphere CDC will require, once you know how many concurrently active replication and support activities you plan to use:
• When InfoSphere CDC starts, it opens three DB2 plans.
• If the Log Cache is configured, InfoSphere CDC opens one additional DB2 plan.
• For each agent that is connected, and for the duration of that connection, InfoSphere CDC opens one DB2 plan.
• For each ongoing describe (describes are usually very quick), InfoSphere CDC opens two DB2 plans. The first plan is closed shortly after the second plan is opened. This is true whether InfoSphere CDC is the transmitter or receiver of the described data.
• For each source subscription that is refreshing a table, InfoSphere CDC opens two DB2 plans. The first plan is closed shortly after the second plan is opened.
• For each source subscription that is mirroring a table, InfoSphere CDC opens three DB2 plans. The first plan is closed shortly after the second plan is opened. The third plan may remain open for the duration of the subscription's activity, unless there are no derived column expressions or user exits involved at the source side of the subscription.
• For each target subscription that is receiving table changes from a source environment refresh or mirror activity, InfoSphere CDC opens two plans. The first plan is closed shortly after the second plan is opened, but is briefly reopened each time the source sends a bookmark during mirroring.
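The guidelines above can be combined into a rough peak figure. The following Python sketch is illustrative only: the class, field, and function names are hypothetical, and it assumes the worst case in which every plan listed for an activity is open at the same moment. Because several of those plans are closed shortly after they are opened, the result is an upper estimate rather than a steady-state count.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Peak number of concurrent activities (all fields are estimates)."""
    log_cache_configured: bool = False
    connected_agents: int = 0
    concurrent_describes: int = 0
    refreshing_source_subscriptions: int = 0
    mirroring_source_subscriptions: int = 0
    target_subscriptions: int = 0


def peak_open_plans(w: Workload) -> int:
    """Worst-case count of simultaneously open DB2 plans."""
    return (
        3                                        # opened at startup
        + (1 if w.log_cache_configured else 0)   # Log Cache
        + 1 * w.connected_agents
        + 2 * w.concurrent_describes
        + 2 * w.refreshing_source_subscriptions
        + 3 * w.mirroring_source_subscriptions
        + 2 * w.target_subscriptions
    )


workload = Workload(
    log_cache_configured=True,
    connected_agents=2,
    concurrent_describes=1,
    refreshing_source_subscriptions=1,
    mirroring_source_subscriptions=4,
    target_subscriptions=3,
)
print(peak_open_plans(workload))  # 28
```

Compare the result with the maximum number of batch connections and allied threads that DB2 is configured to support.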

Log cache

InfoSphere CDC for z/OS uses the DB2 IFI to read data from the DB2 logs. Each subscription will use the IFI directly when a log cache has not been configured. When multiple subscriptions are replicating data concurrently without using a log cache, additional work is required from the source database to satisfy all the requests for data.

With a log cache configured, data is read from the DB2 IFI and retained in the cache. When the cache becomes full, the oldest data will be purged from the cache as room is needed for new data from the DB2 logs. Each subscription will attempt to retrieve the data it needs from the log cache. When data is no longer available from the cache (the subscription needs data after it has been purged from the cache), the subscription will call the DB2 IFI to retrieve its data.

If you have more than two subscriptions replicating data concurrently, you should consider configuring a log cache. The log cache consists of a level 1 cache in memory, and a level 2 cache.
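The lookup behavior described in this section (serve from the cache when possible, purge the oldest data when the cache is full, fall back to the DB2 IFI otherwise) can be pictured with a small model. The following Python sketch is a conceptual illustration only and does not represent the product's implementation: the class name, the use of integer log positions, and the read_from_ifi callback are all assumptions made for the example.

```python
from collections import OrderedDict


class LogCacheSketch:
    """Toy model of a log cache that purges its oldest entries when full."""

    def __init__(self, capacity, read_from_ifi):
        self._capacity = capacity            # maximum number of cached entries
        self._read_from_ifi = read_from_ifi  # fallback used on a cache miss
        self._entries = OrderedDict()        # log position -> data, oldest first

    def add(self, position, data):
        """Store newly read log data, purging the oldest data if needed."""
        self._entries[position] = data
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)

    def read(self, position):
        """Return (data, source), where source is 'cache' or 'ifi'."""
        if position in self._entries:
            return self._entries[position], "cache"
        return self._read_from_ifi(position), "ifi"


cache = LogCacheSketch(capacity=2, read_from_ifi=lambda pos: f"log data {pos}")
for pos in (1, 2, 3):
    cache.add(pos, f"log data {pos}")

print(cache.read(3))  # ('log data 3', 'cache')
print(cache.read(1))  # ('log data 1', 'ifi'): position 1 was purged
```

A subscription that stays close to the current end of the log is served from the cache, while one that has fallen far behind goes back to the IFI.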

DB2 log buffers

InfoSphere CDC uses the DB2 IFI to obtain DB2 Log data when it is mirroring table changes. The DB2 IFI will retrieve and present DB2 Log data from the archived DB2 Log data sets, from the active DB2 Log data sets, or from the DB2 Log buffers. The DB2 IFI presents data from these sources preferentially, based on the speed with which it can obtain the data. The DB2 Log buffer is the most preferred, and archived DB2 Log data sets are the least preferred. When InfoSphere CDC is mirroring table changes in real time (Continuous Mirroring), it attempts to obtain DB2 Log data as soon as it has been written. If, for whatever reason, InfoSphere CDC should fall behind, it will eventually catch up to the "end" of the DB2 Log again (assuming that table changes can be replicated faster than data is being written to the DB2 Log). The larger the number of DB2 Log buffers, the more often DB2 will find the DB2 Log data being requested within the DB2 Log buffers, and the quicker the DB2 IFI will be able to deliver DB2 Log data to InfoSphere CDC. This will improve InfoSphere CDC's ability to catch up to the "end" of the DB2 Log and maintain a low latency when it is mirroring. It is suggested that you review and possibly increase the number of DB2 Log buffers before running InfoSphere CDC for mirroring. Additionally, you should monitor the use of DB2 Log buffers during periods of active replication, both when scraping current data and scraping historical data. Optimally, monitoring should show that no requests for DB2 Log buffers were deferred until a buffer became available.

Code page conversion services

One of InfoSphere CDC's replication features is the translation of text data from the code page of the source server to the code page of the target server. You can configure InfoSphere CDC for z/OS to use either Unicode Conversion Services or the Language Environment's Code Page Conversion Services to perform text data translation.

Either way, InfoSphere CDC must have access to the Language Environment's code page conversion tables that are distributed with z/OS.

Notes:

• For better performance when converting characters between different code pages, you should configure InfoSphere CDC to use Unicode Conversion Services instead of Language Environment's Code Page Conversion Services.
• By default, InfoSphere CDC attempts to use Unicode Conversion Services to perform translation of text data from one code page to another. If Unicode Conversion Services is not initialized or configured properly, InfoSphere CDC issues warning messages during initialization to indicate that Unicode Conversion Services is not available or conversions between specific code pages are not available. In this case, Language Environment's Code Page Conversion Services are used.

Schema evolution

The behavior of InfoSphere CDC for z/OS regarding Data Definition Language (DDL) operations performed on in-scope tables is dependent on several factors:
• DDL statement type
• DB2's DATA CAPTURE CHANGES setting
• Version of InfoSphere CDC
• Version of DB2

DDL statements only have impact on InfoSphere CDC for z/OS if the types of DDL operations affect its ability to read the log. The following are examples of these types of DDL statements:
• Adding columns
• Modifying column formats (such as data type, length or precision)
• Changing column types (such as CHAR to VARCHAR)
• Dropping and recreating tables
• Turning off DATA CAPTURE CHANGES

These operations cause data to be written to the log which will not be received by InfoSphere CDC for z/OS or cause the log record format of the table (the layout of the data in the database logs) to change. For operations where the log record format changes, the log reader must be directed how to proceed when a change is encountered. InfoSphere CDC metadata must be modified to accommodate the new log record format; otherwise the log reader will fail to properly decode log records after the point of the DDL change. Changes that do not materially affect the physical structure of the table in the log or the DATA CAPTURE CHANGES setting for the table will not interrupt replication.

InfoSphere CDC for z/OS can detect most DDL changes on in-scope tables only if DATA CAPTURE CHANGES is enabled on the SYSIBM.SYSTABLES system catalog table in DB2. If it is not enabled, InfoSphere CDC for z/OS will be aware of DDL changes when a log record is encountered which does not match the definition in the metadata, but its actions will depend on the type of DDL operation and the version of DB2.
• Adding columns to tables is one type of DDL change which may not affect the ability of InfoSphere CDC for z/OS to read the log; however, this is dependent on the version of the database and the level of maintenance installed for InfoSphere CDC. The action that InfoSphere CDC for z/OS takes when new columns are added to a table depends on the version of DB2.
  – Adding columns in DB2 version 8: The new column will be placed as the last field of the log record structure. Replication will continue but the new column will not be replicated.
  – Adding columns in DB2 version 9: Reordered row format (RRF) tables were introduced. This changed the log records such that fixed length columns (example: CHAR) are recorded in the beginning section of the log record structure followed by the variable length columns (example: VARCHAR). A column added to the table may change the format of the fixed or variable sections of the log record depending on the DDL change made to the table. Replication will attempt to continue; however, it may fail at the point where the log record is interpreted. PTF UK55661 is available for InfoSphere CDC for z/OS version 6.2.1 which enables replication to detect added columns and continue by ignoring the new column.
• Changing column formats: Most column format changes will cause replication to fail at the point where the log reader attempts to interpret the log record.
• Dropping and recreating tables: The log reader will not detect that a table has been dropped and recreated. Tables are recognized by their DBID, PSID and OBID in the database, which will most likely change when the table is recreated. This will cause the log reader to treat the table as not in scope for replication. If the table is created with the same DBID, PSID and OBID then scraping will resume when subsequent changes are made to the table. There is a possibility that another table could be created with the original table's DBID, PSID and OBID. Should this occur, invalid data could be replicated although it would likely fail on format validation. Ensure that “Update Source Table Definition” is run in Management Console to avoid this possibility after dropping and recreating tables which are in-scope for replication.
• DATA CAPTURE CHANGES turned off: The log reader will not detect this and no changes for the table will be replicated as they will not be returned by the DB2 IFI when change records are requested.

If DATA CAPTURE CHANGES is enabled on the SYSIBM.SYSTABLES system catalog table in DB2, both the version of DB2 and the version of InfoSphere CDC for z/OS determine the behavior:
• If InfoSphere CDC for z/OS is version 6.2 or earlier and DB2 is version 8, the log reader will recognize when DDL changes that affect replication are performed. When changes occur which would cause the log reader to be unable to decode the record, the table will be set to IDLE and replication will continue for other in-scope tables. You will need to update the source table definition in Management Console.
• If InfoSphere CDC for z/OS is version 6.2.1 or version 6.2 with PTF UK45393 / CHC0077 applied and DB2 is version 9, the new default behavior from this release forward is to perform an “immediate shutdown” of the subscription and log an error. Optionally, the ONSCHEMACHANGE parameter can be set to cause InfoSphere CDC for z/OS to IDLE the table and continue replication.
• If InfoSphere CDC for z/OS is replicating data from DB2 version 8 and a column is added, replication will continue for the table, but the new column will not be included.
• If DATA CAPTURE CHANGES is turned off, the log reader will detect this, issue a message and end replication for the subscription.

Adding columns to RRF format tables in DB2 version 9

With InfoSphere CDC for z/OS version 6.2.1 with PTF UK55661 applied and DB2 version 9 with PTF UK31435 applied, when replication is running and a column is added to an RRF table, replication will continue, but the new column will not be included. InfoSphere CDC will maintain a history of the table versions and automatically update the table definition in its metadata. Note that when viewing a table in Management Console, you will need to refresh the view in order to see any changes to tables that have occurred during your current session.

There are some specific exceptions to the ability of InfoSphere CDC to adapt to added columns:
• Adding columns with FIELDPROCs that change the length of the data will stop replication.
• If InfoSphere CDC for z/OS is not running when you are making changes to the structure of the table, your procedure should include going into Management Console and updating the source table definition.
• If you skip forward in the log (through a REFRESH, issuing SETLOGPOS, or changing a table's status from ACTIVE to IDLE and back to ACTIVE), InfoSphere CDC will miss changes to table structure recorded in the skipped section of the log, and it may be unable to interpret a record whose version was not detected.
• If a table is removed from the InfoSphere CDC for z/OS catalog, all its versions will be removed from the history table. A REORG would be required to add the table back to the catalog.
• In versions earlier than InfoSphere CDC for z/OS version 6.2.1, or version 6.2.1 without the PTF applied, or DB2 version 9 without the PTF applied, added columns will cause replication to end for the subscription. Recovery will require a REORG and Refresh of the table.

For all changes discussed above which materially affect the structure of the log record (except those special cases noted) a REORG of the table will be required. If the subscription has shut down in error, recovery will require a refresh of the table. To avoid the need for a refresh, the best way to handle these types of changes is to ensure that InfoSphere CDC has replicated all data for the table to the point where it is taken off line for the modification. A REORG will be required to the table so that all future log records contain the correct format for modified columns. You will need to run the "Update Source Table Definition" in Management Console and re-map as necessary before restarting replication.

The tables below show a synopsis of the behavior for the database versions and level of InfoSphere CDC for z/OS installed. As noted above, there are DDL changes which may affect the log read and those that will not. “Column added” is listed separately because, while it may affect the log read, the behavior for this DDL change is different from that for others affecting the log read.

Table 2. Replication behavior with DB2 version 8

| DDL change | Version 6.2.1 | Version 6.2.1 + PTF UK45393 | Version 6.2.1 + PTF UK55661 | Version 6.5 |
|---|---|---|---|---|
| Column added | Continues (without the new column) | Continues (without the new column) | | |
| Not affecting log read | Continues | Continues | Continues | Continues |
| Affecting log read | Table is idled and replication continues | Stops unless ONSCHEMACHANGE is set to IDLE | | |

Table 3. Replication behavior with DB2 version 9

| DDL change | Version 6.2.1 | Version 6.2.1 + PTF UK45393 | Version 6.2.1 + PTF UK55661 | Version 6.5 |
|---|---|---|---|---|
| Column added | Table is idled and replication continues | Stops | Continues (without the new column) | |
| Not affecting log read | Continues | Continues | Continues | Continues |
| Affecting log read | Table is idled and replication continues | | | |

Security Access Facility (SAF) and DB2 authorization

DB2 supports two approaches to security authorization:
• GRANT and REVOKE SQL statements, the original implementation of access control in DB2
• DB2 Access Control Module (ACM), a newer method for centralizing access control of DB2 resources within a common facility (RACF®, ACF2, TopSecret, etc.).

Each address space that executes programs on a z/OS system must do so under the control of a security identifier. The installation process requires that such a security identifier be assigned for use by the InfoSphere CDC for z/OS address space. It is under this security identifier that access to DB2 resources by InfoSphere CDC will be validated.

To establish a connection with an InfoSphere CDC for z/OS replication engine running on a server, you must specify access parameters that include a valid user identifier. For security reasons, the user identifier must be created by your system administrator through a SAF-compliant security administration product. This user identifier is the InfoSphere CDC replication engine's security identifier.

The InfoSphere CDC for z/OS address space's security identifier is GRANTed SYSCTRL authority during installation. This gives the InfoSphere CDC for z/OS address space the ability to start a MONITOR TRACE and access the DB2 Catalog tables. When a Management Console user logs onto an InfoSphere CDC replication engine task, their user identifier and password are validated using a SAF call (which interrogates RACF, TopSecret, and so on). Once the user has passed validation, their primary identifier (security identifier) and secondary identifiers (connected-to groups) are used to determine whether the user can access a DB2 resource at the level required to perform a specific action.

Depending on which approach to security authorization has been chosen for DB2, InfoSphere CDC for z/OS will use the same approach to validate InfoSphere CDC's access to the DB2 resources it is attempting to access. If GRANT or REVOKE is being used by DB2, then InfoSphere CDC for z/OS interrogates the DB2 authority tables to see if InfoSphere CDC has the access at the appropriate level to the DB2 resources. If a DB2 ACM is installed, then InfoSphere CDC for z/OS will call the installed ACM for authority validation instead of querying the DB2 authority tables.

Once access to the DB2 resources at the required level has been validated for the InfoSphere CDC replication engine's security identifier, the actual access occurs under the authority of the InfoSphere CDC for z/OS address space. This means that it will be necessary to authorize access to the DB2 resources at the appropriate level to the InfoSphere CDC for z/OS security identifier for those resources that the InfoSphere CDC replication engine will access. Typically, this is done by authorizing DBADM authority over the databases containing the resources involved to the InfoSphere CDC for z/OS security identifier.

For this reason, the user identifier used to log on to InfoSphere CDC must be assigned the appropriate database authorities and privileges to perform common subscription and table operations in Management Console. The following table identifies common Management Console subscription and table operations that can be performed under different authority and privilege levels.

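| Function | SYSADM | SYSCTRL | DBADM | CREATEABAUTH | SELECT | INSERT, UPDATE, DELETE |
|---|---|---|---|---|---|---|
| Adding tables | U | X | U | X | U | X |
| Viewing subscriptions | U | X | U | X | U | U |
| Assigning tables | U | X | U | X | X | U |
| Creating subscription tables | U | U | U | U | X | X |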

Considerations:
• SYSADM authority grants full database privileges to the z/OS user identifier. It allows the identifier to work with all subscriptions and tables accessible through the replication engine.
• DBADM authority only applies to a single database. If a subscription contains one or more tables that reside in a different database and the z/OS user identifier does not have the authority to work with the tables, the subscription will not be presented through Management Console.
• INSERT, UPDATE, and DELETE privileges must all be granted to the z/OS user identifier in order to perform the supported Management Console operations indicated for the privileges in the table above.

If you are unable to perform desired operations with the z/OS user identifier that is specified as an access parameter, consult your system administrator to determine the database authority or privileges currently granted to the user identifier.

InfoSphere CDC for InfoSphere DataStage

The following configuration settings should be considered prior to installing InfoSphere CDC for InfoSphere DataStage:

“Considerations for InfoSphere CDC for InfoSphere DataStage”

Considerations for InfoSphere CDC for InfoSphere DataStage

When using the Direct Connect connection method for InfoSphere CDC for InfoSphere DataStage, you should keep the following things in mind:
• In order to use the autostart function in the Direct Connect method on a UNIX or Linux system, ensure that you have correctly set the database library directory in the dsenv file for InfoSphere Information Server. See the topics "Ensuring that InfoSphere DataStage users have the correct localization settings (Linux, UNIX)" and "Configuring the dsenv file" in the InfoSphere Information Server Planning, Installation, and Configuration Guide.
• Running with autostart enabled requires both InfoSphere CDC and InfoSphere DataStage to be installed on the same server. If autostart is not enabled, you must run jobs from InfoSphere DataStage before the Direct Connect data stream can begin. For instructions on enabling autostart, see the InfoSphere DataStage documentation.
• The InfoSphere CDC for InfoSphere DataStage Direct Connect connection method does not maintain the ordering of operations across tables, though it will maintain the transactional boundaries that existed on the source. For example, if the transaction on the source was

  INSERT TABLE1
  INSERT TABLE2
  COMMIT

  then both of the INSERT actions will be completed by InfoSphere DataStage before it performs the COMMIT action. However, because the order of operations is not maintained, there is no way to determine the order in which the INSERT actions will be done prior to being committed to the target database.

Informix Dynamic Server

The following Informix Dynamic Server configuration settings should be considered prior to installing InfoSphere CDC:

“InfoSphere CDC API”

InfoSphere CDC API

You must prepare the Informix database and database server to use the InfoSphere CDC API before installing and configuring InfoSphere CDC.

To do this, an Informix database with which you want to replicate data must exist on your system. See the IBM Informix Getting Started Guide if you have not installed the product and have not set up an Informix database.

Microsoft SQL Server

The following Microsoft SQL Server configuration settings should be considered prior to installing InfoSphere CDC:

“Transaction log backup plan”
“Remote target apply”
“SQL Server table-level considerations”
“ROWVERSION data type”
“TCP/IP and ports”
“Database services”
“SQL Server replication”
“Recovery model”
“Database backup”

Transaction log backup plan

InfoSphere CDC supports backups of your transaction logs with the Microsoft SQL Server Maintenance Plan or with a manual process in Microsoft SQL Server.

Note the following for transaction log backups:
• Each backup must create a new file and should not append to or replace an existing file.
• Compressed or encrypted backups are not supported.
• Third party tools for backups are not supported. InfoSphere CDC for Microsoft SQL Server can only read and process files generated directly by SQL Server and not altered by any other process or procedure. InfoSphere CDC for Microsoft SQL Server cannot read files which do not correspond directly to the SQL Server transaction log format.
• Backups that are larger than 1 GB may degrade InfoSphere CDC performance. InfoSphere CDC for Microsoft SQL Server pre-parses the transaction log backup file prior to processing it in order to build an internal map of the file, so large files will cause a long delay during the pre-parse phase. For best performance and reduced disk I/O, it may be preferable to store the log files on disk volumes that are cached by the operating system, and to keep the log size small enough to fit comfortably within the operating system's disk cache.
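For a manual backup process, the first two requirements above (a new file for every backup, and no compression) can be met by generating a uniquely named backup file each time. The following Python sketch is illustrative only: the function name, the file naming pattern, and the sample database name and directory are hypothetical. It assumes a SQL Server release that accepts the NO_COMPRESSION option of the BACKUP statement (SQL Server 2008 or later); on earlier releases, which do not offer backup compression, that clause would be omitted.

```python
from datetime import datetime
from pathlib import PureWindowsPath


def log_backup_statement(database: str, backup_dir: str, now: datetime) -> str:
    """Build a BACKUP LOG statement that writes to a new, timestamped file.

    The timestamp has one-second resolution, so this sketch assumes that
    backups of the same database are taken at least one second apart.
    """
    stamp = now.strftime("%Y%m%d_%H%M%S")
    target = PureWindowsPath(backup_dir) / f"{database}_{stamp}.trn"
    quoted_db = database.replace("]", "]]")      # escape for [bracketed] names
    quoted_path = str(target).replace("'", "''")  # escape for string literals
    return (
        f"BACKUP LOG [{quoted_db}] TO DISK = N'{quoted_path}' "
        "WITH NO_COMPRESSION;"
    )


print(log_backup_statement("Sales", r"D:\logbackups", datetime(2011, 6, 1, 12, 30, 0)))
# BACKUP LOG [Sales] TO DISK = N'D:\logbackups\Sales_20110601_123000.trn' WITH NO_COMPRESSION;
```

Scheduling backups frequently enough to keep each file well under 1 GB also addresses the size guideline above.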

If you need more information on how to perform this task, see your database administrator or refer to your Microsoft SQL Server documentation.

Remote target apply

If you are deploying InfoSphere CDC for Microsoft SQL Server as a target, you have the option of installing the product on a system that is remote from the target database. To allow InfoSphere CDC to communicate with the target database, you must install Microsoft SQL Server connectivity tools on the server where you install InfoSphere CDC.

Depending on your target database, you must install the following software on the server with InfoSphere CDC:
• Microsoft SQL Server 2008: SQL Server 2008 Management Objects (SMO)
• Microsoft SQL Server 2005: SQL Server 2005 Management Objects (SMO)
• Microsoft SQL Server 2000: SQL Distributed Management Objects (SQL-DMO)

For more information, see your Microsoft SQL Server documentation.

SQL Server table-level considerations

In this section, you will learn:

“Tables with clustered and non-clustered indexes”
“Source tables and primary keys”
“Compressed tables”
“Computed columns”
“Identity columns”
“Columns with database defaults”
“Partitioned tables”
“Sparse columns”

Tables with clustered and non-clustered indexes

InfoSphere CDC supports replicating data from tables containing a clustered index, tables with a mix of clustered and non-clustered indexes, and tables having non-clustered indexes only. Only one clustered index can exist per table.

Source tables and primary keys

InfoSphere CDC supports the replication of:
v Source tables with a primary key.
v Source tables without a primary key when using Microsoft SQL Server 2000: No configuration is required in your database.
v Source tables without a primary key when using Microsoft SQL Server 2008: You must configure the Change Data Capture feature in Microsoft SQL Server 2008.

Note: In Microsoft SQL Server 2008, replicating data in a table without a primary key will cause all the changed row data to be logged to a table in your SQL Server database. Depending on circumstances, this can result in a dramatic performance decrease and so should be undertaken with appropriate caution. Please consult your Microsoft SQL Server documentation for further details on how the Change Data Capture feature of Microsoft SQL Server 2008 is implemented and its likely performance impact.
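
As a rough sketch of what configuring the Change Data Capture feature can look like in Microsoft SQL Server 2008, the system stored procedures sys.sp_cdc_enable_db and sys.sp_cdc_enable_table enable the feature first for a database and then for one table. The database name pubs is borrowed from the examples later in this section, and dbo.orders_no_pk is a hypothetical table without a primary key. Any further options that your environment needs are not covered here, so confirm the details with your database administrator.

```sql
-- Enable the SQL Server Change Data Capture feature for the database.
-- "pubs" is the sample database used elsewhere in this guide.
USE pubs;
GO
EXEC sys.sp_cdc_enable_db;
GO

-- Enable the feature for one table that has no primary key.
-- dbo.orders_no_pk is a hypothetical table name used for illustration only.
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'orders_no_pk',
    @role_name     = NULL;   -- NULL means no gating role restricts access to the change data
GO
```

Because SQL Server then records every changed row for the table in its own change table, the performance caution in the note above applies as soon as the second procedure has been run.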

Compressed tables

InfoSphere CDC does not support tables with compression (either row or page level) or with vardecimal columns (a type of compression).

Computed columns

InfoSphere CDC can replicate data in computed columns.

When replicating computed columns, consider the following:
v Computed columns are read from the database at the time of replication. Therefore, only the current image of a computed column field in a source table will be sent at the time of replication.
v You cannot reference computed columns in row-filtering expressions.
v You cannot reference computed columns in an expression defined for a derived column.
v If the computed column is a LOB, then all the LOB considerations apply.

Identity columns

Identity columns in source tables are handled similarly to columns without the identity property. The value of a source column is retained when mapped to a target identity column. Target identity columns will use the value provided by the database if you set the initial value of the column in Management Console to Database Default.

When replicating data to or from tables containing identity columns, consider the following:
v Any column which has the initial value of the mapping set to Database Default is not applied by InfoSphere CDC. InfoSphere CDC skips the column when applying data and the database provides a value.
v You cannot reference a target identity column that has the initial value set to Database Default in a derived expression. There is no restriction on the source columns in a similar scenario.
v User exits cannot access the data for any target column that has an initial value set to Database Default in Management Console.
v isDataAvailable(int) in the InfoSphere CDC API can be used in a user exit to determine if there is data for a specific column. This method will return false for a target column with an initial value set to Database Default in Management Console. An exception is thrown if you attempt to retrieve the value from a column that is mapped to a target identity column.

Columns with database defaults

Columns with a database default value in the target database can either be overwritten with values coming from the source table or be set to have the initial value of Database Default in Management Console. InfoSphere CDC skips any column that is set to the initial value of Database Default when applying data on the target. The column must be nullable or have a database default value defined in the target database, otherwise SQL exceptions may occur when InfoSphere CDC applies data on the target.

When replicating data to tables containing columns having database defaults, consider the following:
v All supported data types are supported with database defaults.
v If you map a source column to a target column containing a database default, the source value will be inserted in the target column.
v You cannot reference target Database Default columns from the target in a derived expression.
v User exits cannot access the data for any target column that has an initial value set to Database Default in Management Console.
v isDataAvailable(int) in the InfoSphere CDC API can be used in a user exit to determine if there is data for a specific column. This method will return false for a target column with an initial value set to Database Default in Management Console. An exception is thrown if you attempt to retrieve the value from a column that is mapped to a target database default column.

Partitioned tables

InfoSphere CDC does not support partitioned tables.

Sparse columns

InfoSphere CDC does not support the sparse columns feature in SQL Server 2008.
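
Before you select tables for replication, it can help to check whether any of them use the unsupported features described above. The following queries are a minimal sketch that assumes Microsoft SQL Server 2008 and its standard catalog views (sys.partitions and sys.columns). They are not part of InfoSphere CDC, and they do not check for vardecimal storage.

```sql
-- Tables that use row or page compression on at least one partition.
SELECT DISTINCT
       OBJECT_SCHEMA_NAME(p.object_id) AS schema_name,
       OBJECT_NAME(p.object_id)        AS table_name,
       p.data_compression_desc
FROM sys.partitions AS p
WHERE p.data_compression <> 0
  AND OBJECTPROPERTY(p.object_id, 'IsUserTable') = 1;

-- User tables whose heap or clustered index has more than one partition.
SELECT OBJECT_SCHEMA_NAME(p.object_id) AS schema_name,
       OBJECT_NAME(p.object_id)        AS table_name,
       COUNT(*)                        AS partition_count
FROM sys.partitions AS p
WHERE p.index_id IN (0, 1)
  AND OBJECTPROPERTY(p.object_id, 'IsUserTable') = 1
GROUP BY p.object_id
HAVING COUNT(*) > 1;

-- Columns declared with the SPARSE attribute.
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name,
       OBJECT_NAME(c.object_id)        AS table_name,
       c.name                          AS column_name
FROM sys.columns AS c
WHERE c.is_sparse = 1;
```

Any table returned by one of these queries falls under the corresponding restriction in this section.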

ROWVERSION data type

InfoSphere CDC supports source and target tables that contain the ROWVERSION data type. The ROWVERSION data type in Microsoft SQL Server cannot be modified externally. If you want to retain the ROWVERSION data type in your source table, map the column in InfoSphere CDC to a BINARY(8) data type on the target. If you want the source table schema to be identical to the target table schema, set the target column to an initial value of Read Only in InfoSphere CDC.

When replicating data to tables containing the ROWVERSION data type, consider the following:
v Replicating the ROWVERSION data type from a source table when mapped to a binary data type on the target has similar restrictions to the replication of BINARY data types.
v When present in both the source and target tables, ROWVERSION columns will contain different values when values for other columns are replicated.
v ROWVERSION data types in the target table are not applied by InfoSphere CDC. InfoSphere CDC skips the column when applying data and the database provides a value.
v When mapping a column to ROWVERSION on the target, you cannot reference target ROWVERSION columns in a derived expression. There is no restriction on the source columns in a similar scenario.
v User exits cannot access the data for any target ROWVERSION columns. There is no restriction on the source columns in a similar scenario.
v isDataAvailable(int) in the InfoSphere CDC API can be used in a user exit to determine if there is data for a specific column. This method will return false for a target column having an initial value set to Read Only. isReadOnly(int) will return true for these columns. An exception is thrown if you attempt to retrieve the value from a column that is mapped to a target ROWVERSION column.

TCP/IP and ports

You must enable the TCP/IP network protocols in Microsoft SQL Server and specify a static port. You must also disable dynamic ports. You must specify a TCP/IP static port when you configure InfoSphere CDC.

If you need more information on how to perform this task in Microsoft SQL Server, see your database administrator or refer to your Microsoft SQL Server documentation.
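
After the protocol and port have been configured, one way to confirm which TCP port a connection is actually using is to query the sys.dm_exec_connections dynamic management view. This is a sketch that assumes SQL Server 2005 or later, a login with the VIEW SERVER STATE permission, and a session that was opened over TCP/IP rather than shared memory or named pipes.

```sql
-- Show the transport, server address, and TCP port used by the current session.
-- local_tcp_port is NULL when the session does not use TCP/IP.
SELECT net_transport,
       local_net_address,
       local_tcp_port
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
```

The port reported here should match the static port that you specify when you configure InfoSphere CDC.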

Database services

Microsoft SQL Server services usually start automatically but must be running during the installation and operation of InfoSphere CDC.

For information about working with the Microsoft SQL Server services, see your Microsoft SQL Server documentation.

SQL Server replication

Note: This task is required for source installations of InfoSphere CDC that are using SQL Server 2005 and SQL Server 2008. It is optional if you are using SQL Server 2000.

InfoSphere CDC requires that you enable Microsoft SQL Server replication in your database to ensure the data required by InfoSphere CDC is available in the database transaction log. InfoSphere CDC supports a local Distribution database (local Distributor) or a remote Distribution database (remote Distributor). If you are using a remote Distribution database, you are required to install InfoSphere CDC on your database server, not the Distribution database server.

After enabling SQL Server replication, SQL Server will create jobs that are used exclusively by SQL Server and are not used by InfoSphere CDC. Stopping any of the jobs may negatively affect SQL Server performance.


Recovery model

InfoSphere CDC requires your Microsoft SQL Server recovery model to be either FULL or BULK_LOGGED. The following example uses a SQL statement to set the recovery model of the 'pubs' database to FULL:

ALTER DATABASE pubs SET RECOVERY FULL
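
To confirm the recovery model that is currently in effect, you can query the database metadata. The following sketch shows two standard ways to do so for the same 'pubs' sample database. The sys.databases catalog view assumes SQL Server 2005 or later, and DATABASEPROPERTYEX is also available in SQL Server 2000.

```sql
-- SQL Server 2005 and later: read the recovery model from the catalog view.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'pubs';

-- SQL Server 2000 and later: read the same property with a metadata function.
SELECT DATABASEPROPERTYEX('pubs', 'Recovery') AS recovery_model;
```

Either query should report FULL or BULK_LOGGED before you start replicating from the database.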

Database backup

InfoSphere CDC requires a full database backup to start database logging in the FULL or BULK_LOGGED recovery model. The database effectively remains in the 'simple' recovery model until you perform a full database backup. The following example uses a SQL statement to perform a full backup for the 'pubs' database:

BACKUP DATABASE pubs TO DISK = 'c:\mssql\backup\pubs.bak'


Netezza

The following Netezza configuration settings should be considered prior to installing InfoSphere CDC:

“Installation considerations”
“Netezza JDBC drivers”

Installation considerations

Before you install InfoSphere CDC for Netezza databases, you should be aware of the following installation considerations:
v You can only install one instance of InfoSphere CDC for Netezza databases on a single Netezza database.
v InfoSphere CDC must not be installed on the Netezza appliance.

Netezza JDBC drivers

Netezza JDBC driver, version 6.0.3 or above.

Oracle

The following Oracle configuration settings should be considered prior to installing InfoSphere CDC:

“Supplemental logging” on page 64
“ARCHIVELOG mode” on page 64
“Log parallelism” on page 65
“Log shipping” on page 65
“Log space for latency” on page 65
“Automatic Storage Management (ASM)” on page 66
“Remote log reading” on page 66
“Remote target apply” on page 66
“Tablespace for InfoSphere CDC metadata” on page 67
“Undo tablespace for transaction rollbacks” on page 67
“Read-only database connections” on page 67
“Read-only tables” on page 67
“Oracle table-level considerations” on page 67
“Disk quota for capture components” on page 68
“Bulk load refresh” on page 69
“Oracle listener” on page 69
“Database constraints” on page 69

Supplemental logging

Oracle supplemental logging means that all columns or selected columns are specified for extra logging. InfoSphere CDC requires supplemental logging on the source database at both the database level and table level.
v Database level supplemental logging: This is an Oracle requirement that ensures that the Oracle redo log contains the information required to describe all data changes completely. The appropriate setting for database level supplemental logging (by database version) is:
– Oracle 9i: Enable minimal supplemental logging. This is the default level of supplemental logging in this version of Oracle.
– Oracle 10g or later: Enable minimal database level supplemental logging. You must set this value explicitly in this version of Oracle because the default level of supplemental logging is not sufficient. To check if minimal supplemental logging has been enabled at the database level, run the following SQL statement: select supplemental_log_data_min from v$database; If supplemental logging is enabled, the returned value will be YES or IMPLICIT.
v Table level supplemental logging: InfoSphere CDC also requires full supplemental logging at the table level for those tables you have selected for InfoSphere CDC to replicate using the mirroring replication method. This supplemental logging is typically handled by InfoSphere CDC, unless you are using a read-only database connection to the source database. For a read-only database environment, before configuring subscriptions that involve tables for mirroring, ensure that you have sufficient supplemental logging enabled for those tables. During InfoSphere CDC subscription configuration, the application checks to see if the required logging is enabled. If sufficient table supplemental logging is not enabled, then InfoSphere CDC returns errors and does not complete the configuration. The appropriate setting for table level supplemental logging (by database version) is:
– Oracle 9i: Enable full supplemental logging with conditional supplemental log groups.
– Oracle 10g or later: Enable full supplemental logging with conditional or unconditional supplemental log groups.

To enable supplemental logging at the database level and table level, contact your Oracle database administrator. For more information on the command that enables supplemental logging, see your Oracle documentation.
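
For orientation, the following sketch shows standard Oracle statements that a database administrator might use on Oracle 10g or later. SCOTT.EMP is Oracle's sample table and is used here only as a placeholder. Logging all columns unconditionally is one way to satisfy the table level requirement and is an assumption of this example, not the only valid setting. As noted above, InfoSphere CDC normally handles table level logging itself unless the connection is read-only.

```sql
-- Database level: enable minimal supplemental logging.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

-- Confirm the database level setting; YES or IMPLICIT means it is enabled.
SELECT supplemental_log_data_min FROM v$database;

-- Table level: unconditionally log all columns of one replicated table.
-- SCOTT.EMP is a placeholder; repeat for each table selected for mirroring.
ALTER TABLE scott.emp ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- List the supplemental log groups that now exist for the table.
SELECT log_group_name, log_group_type, always
FROM dba_log_groups
WHERE owner = 'SCOTT'
  AND table_name = 'EMP';
```

In a read-only database environment, statements like these must be run before you configure subscriptions, because InfoSphere CDC cannot add the log groups itself.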

ARCHIVELOG mode

InfoSphere CDC requires uninterrupted access to Oracle redo logs. Therefore, you must enable the archiving of Oracle redo logs. Make sure that the source database instance is operating in ARCHIVELOG mode. This lets InfoSphere CDC read data from archived Oracle redo logs instead of online redo logs, in the event that excessive latency occurs during mirroring. To set the source database instance to ARCHIVELOG mode, contact your database administrator. You can also verify if archive logging is enabled for the source database by issuing the following SQL statement: select log_mode from v$database; If archive logging is enabled, the returned value will be ARCHIVELOG. For more information, see your Oracle documentation. CAUTION: If you do not set the database instance to ARCHIVELOG mode, you may experience data loss.
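
As an illustration of what the database administrator typically does, the following SQL*Plus sketch checks the current mode and then switches a single-instance database to ARCHIVELOG mode. It assumes a session connected AS SYSDBA, an archive log destination that is already configured, and a maintenance window, because the database must be restarted. Oracle RAC environments follow a different procedure.

```sql
-- Check the current mode; the result is ARCHIVELOG or NOARCHIVELOG.
SELECT log_mode FROM v$database;

-- Switch to ARCHIVELOG mode (SQL*Plus, connected AS SYSDBA).
-- The database is unavailable to applications during these steps.
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Verify the change.
SELECT log_mode FROM v$database;
```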

InfoSphere CDC must have direct access to the archive log files. You can install InfoSphere CDC on the same node that has access to the archive log files. However, if your database instance is managed by Oracle Automatic Storage Management (ASM), then you can install InfoSphere CDC on any node. Please be aware that reading archive and redo logs from ASM can be significantly slower than reading the logs through the file system. If the database produces high volumes of logs, you should consider multiplexing the logs by configuring a log destination outside ASM.

Log parallelism

InfoSphere CDC supports the replication of Oracle databases with log parallelism starting from Oracle 10g (release 10.1.0.4 or later).

InfoSphere CDC will also replicate from Oracle RAC nodes that have log parallelism enabled.

Log shipping

InfoSphere CDC for Oracle databases can be configured to use copies of complete Oracle archive logs that are shipped to a system that is local to InfoSphere CDC for Oracle databases.

To use this feature, InfoSphere CDC for Oracle databases must be configured to only use Oracle archive logs. As a result, InfoSphere CDC for Oracle databases latency is affected by the Oracle log switch frequency and the amount of time required to physically ship the logs to the remote destination. Latency may increase if the log switch interval and the log shipping time increase.

You can ship your database logs with Oracle Data Guard log transport services or with a customized log shipping process that you develop and maintain.

Log shipping can address performance issues if the CPU on your source database server is overloaded and is unable to accommodate the additional CPU resources that InfoSphere CDC for Oracle databases requires. Log shipping improves performance by moving the processing to a different server.

This type of configuration may be used to resolve a Log Reader bottleneck.

Note: Log shipping can also be used to remove dependencies between database log retention and InfoSphere CDC for Oracle databases. The archive logs on your database server can be removed once they are shipped to the shared volume.


Log space for latency

InfoSphere CDC for Oracle databases requires sufficient disk space for log files accumulated for a latent subscription, such as a subscription set for continuous mirroring.

There are circumstances in which InfoSphere CDC reads from archive logs, thus causing increased latency. Such circumstances might include InfoSphere CDC being shut down for a period of time prior to being restarted, or an unusually high load on system resources on the source system. InfoSphere CDC requires access to the archive logs so that it can continue to read the accumulated changes. If the subscription is very latent, this can mean InfoSphere CDC must read log files that may be hours, days, or even weeks old before it can eventually catch up to the current position in the archive log files.

Automatic Storage Management (ASM)

InfoSphere CDC supports Oracle ASM and requires the following:
v An Oracle ASM connection (user name and password).
v Physical access to the underlying RAW block device storage.

Automatic Storage Management (ASM) Cluster File System

InfoSphere CDC supports database replication for an ASM Cluster File System, but InfoSphere CDC cannot be installed within an ASM Cluster File System.

Remote log reading

You can configure InfoSphere CDC for environments where a source deployment of InfoSphere CDC does not have direct access to the Oracle online redo log files and archived log files because the product is installed on a different machine from your source database.

The InfoSphere CDC log reader supports only direct access to the Oracle redo log files with a shared Storage Area Network (SAN) file system or remote access with a shared Network File System (NFS) mount. There is no support for custom options such as manually copying files.

By default, the product is configured to read both online redo log files and archived log files. This provides for low latency replication as the online log is continuously written by Oracle and read by the InfoSphere CDC log reader. However, the product can also be configured for reading archive log files only.

Related concepts

“Log shipping” on page 65
“Remote target apply”

Remote target apply

You can deploy your target installation of InfoSphere CDC on a system that is remote from your target database.

This type of configuration requires the following:
v Install an Oracle Client on the machine where InfoSphere CDC is installed to allow communication with the remote target database.
v Add a TNS entry for your remote Oracle database to the tnsnames.ora file on the machine where InfoSphere CDC is installed.
v Select the TNS entry for the remote Oracle database when configuring your instance of InfoSphere CDC.

Tablespace for InfoSphere CDC metadata

InfoSphere CDC requires a minimum tablespace of 25 MB for product metadata.

The size of the minimum required tablespace will vary depending on the size of your replication configuration. Environments with many concurrent Oracle sessions could require 100-200 MB.

Related concepts

“Undo tablespace for transaction rollbacks”

Undo tablespace for transaction rollbacks

InfoSphere CDC deployed as a target requires a sufficient amount of Oracle undo tablespace for transaction rollbacks.

Undo tablespace requirements can be determined by calculating the size of the largest database transactions that InfoSphere CDC will process in a production environment. For example, if the sum of all rows changed in the largest transaction in your production environment is 2 million and the average row size for the transaction is 2 KB, InfoSphere CDC will require an undo tablespace of 4 GB (2 million rows × 2 KB = 4 GB).
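
The sizing rule above can be compared against what is currently allocated. The first query in the sketch below repeats the arithmetic of the example (it evaluates to about 3.8 when expressed in binary gigabytes, which corresponds to the 4 GB figure in decimal units). The second query reports the size of the active undo tablespace. It assumes a single-instance database and a user who can read the v$parameter and dba_data_files views.

```sql
-- Estimate from the example: 2,000,000 changed rows at 2 KB per row, in GB.
SELECT ROUND(2000000 * 2 / 1024 / 1024, 2) AS estimated_undo_gb
FROM dual;

-- Space currently allocated to the active undo tablespace, in GB.
SELECT d.tablespace_name,
       ROUND(SUM(d.bytes) / 1024 / 1024 / 1024, 2) AS allocated_gb
FROM dba_data_files d
WHERE d.tablespace_name = (SELECT UPPER(value)
                           FROM v$parameter
                           WHERE name = 'undo_tablespace')
GROUP BY d.tablespace_name;
```

If the allocated figure is smaller than the estimate, and the data files cannot autoextend far enough to cover the difference, the largest transaction may fail on the target.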

When refreshing an entire table with InfoSphere CDC, you will also require sufficient undo tablespace for an insert of the entire table.

Insufficient undo tablespace may result in database errors.

Related concepts

“Tablespace for InfoSphere CDC metadata”

Read-only database connections

InfoSphere CDC supports read-only database connections to your Oracle database. This type of configuration is useful if your business requirements do not allow write access to your source database.

To support read-only database connections, you must create a read-only user and enable table-level supplemental logging in your database before installing and configuring InfoSphere CDC.

Read-only tables

InfoSphere CDC provides limited replication support on read-only tables. InfoSphere CDC cannot apply replicated data to a read-only table that exists in the target database and only supports replication from read-only tables on the source database.

Oracle table-level considerations

In this section, you will learn:

“Compressed tables” on page 68
“Index Organized Tables (IOT)” on page 68
“Encrypted tables” on page 68
“Partitioned tables” on page 68
“Interval partitions”
“Clustered tables”

Compressed tables

In previous versions of InfoSphere CDC, partial support for tables with compressed data could be achieved with the use of table partitioning. InfoSphere CDC now supports replication of tables with compressed data. This means that you can map compressed tables in Management Console. InfoSphere CDC supports compressed tables for Oracle databases 11.2 or later. Hybrid Columnar Compression available in Oracle Exadata version 2 is not supported at this time.

Index Organized Tables (IOT)

InfoSphere CDC supports a subset of Oracle IOT operations starting with Oracle 10g (release 10.1.0.4 or later). Examples of supported operations are individual inserts, updates, and deletes. The operations must be done one at a time on the source database and not as part of a parallel or bulk operation such as INSERT INTO ... SELECT * FROM ... or a MERGE statement.

InfoSphere CDC does not support the following IOT operations:
v An IOT comprised of character data.
v Populating an IOT using SQL*Loader.
v LOB columns with IOTs.

Encrypted tables

InfoSphere CDC supports the replication of encrypted columns to the target database as a binary but will not decode the columns. Other processes in the production environment must be in place to decode binary columns.

Partitioned tables

InfoSphere CDC for Oracle databases supports partitioned tables.

Partition changes (adding or dropping) during replication are detected by the product. No user intervention is required.

Interval partitions

The replication of tables with interval partitions is supported by InfoSphere CDC for Oracle databases only when the tables are part of a Rule-based mapping for DDL Replication.

Tables with interval partitions cannot be replicated as part of Direct mappings.

For more information on Rule-based mappings and Direct mappings, see “Working with rule sets” in the Management Console Administration Guide.

Clustered tables

InfoSphere CDC does not support Oracle clustered tables.

Disk quota for capture components

InfoSphere CDC uses specific components to capture changes from the database log. These components accumulate uncommitted transactions and data changes in memory. InfoSphere CDC dynamically allocates memory between these components as required and only persists data to disk in the following scenarios:
v An especially large uncommitted transaction is being accumulated.
v A latent subscription. For example, you may know of a subscription or set of subscriptions that will only run periodically, which will cause the change log to grow. If sufficient memory to store the data for this latent subscription is not available, the data will be persisted to disk. Also, on shutdown or when this subscription is stopped, InfoSphere CDC persists the data to disk.

You can optionally choose to bound disk space utilization by setting a quota using the mirror_global_disk_quota_mb system parameter. When the disk quota is met, InfoSphere CDC will stop all running subscriptions and generate an event in the Event Log. No data is lost in this scenario.

Related concepts

“RAM requirements” on page 84
“Disk requirements” on page 85

Bulk load refresh

If you want InfoSphere CDC to use a bulk load refresh when applying data to the target database, then you must do the following:
1. Install an Oracle Client on the same server where you have installed and configured InfoSphere CDC. The Oracle Client must be able to connect to the Oracle database.
2. Add the same TNS names entry for your database to the tnsnames.ora file and select this database when configuring InfoSphere CDC.

Oracle listener

InfoSphere CDC requires the Oracle listener to connect to the database. You must start the Oracle listener before installing InfoSphere CDC. For more information on starting the Oracle listener, see your Oracle documentation.

Database constraints

InfoSphere CDC supports constraints on target tables under the following conditions:
v Constraints on the target system must be the same or less restrictive than the source system.
v When constraint checking is deferred on the Oracle source, the deferred constraints should be removed from the target system to enable the product to run correctly.

You can defer constraints in Oracle with one of the following methods:
v A user manually defers the constraints
v An update of KEY = KEY + 1. For example, UPDATE SCOTT.EMP SET EMPNO = EMPNO + 1;
v Cascading deletes
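
To find out which source constraints can be deferred, and therefore may need to be removed from the target, you can query the Oracle data dictionary. The sketch below uses the standard ALL_CONSTRAINTS view and the SCOTT.EMP sample table from the example above. It also shows the statement a user would issue to defer constraints manually within a transaction.

```sql
-- Constraints on SCOTT.EMP that are deferrable or deferred by default.
SELECT constraint_name, constraint_type, deferrable, deferred
FROM all_constraints
WHERE owner = 'SCOTT'
  AND table_name = 'EMP'
  AND (deferrable = 'DEFERRABLE' OR deferred = 'DEFERRED');

-- Manual deferral: postpone checking of all deferrable constraints
-- until the current transaction commits.
SET CONSTRAINTS ALL DEFERRED;
```

Constraints returned by the query are the candidates to review on the target system.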

Oracle - Trigger

The following Oracle - Trigger configuration settings should be considered prior to installing InfoSphere CDC:

“Tablespace for InfoSphere CDC metadata” on page 70
“Undo tablespace for transaction rollbacks” on page 70
“Bulk load refresh” on page 70
“Multiple instances and sharing the same journal table” on page 70
“Oracle listener” on page 71
“Database constraints” on page 71

Tablespace for InfoSphere CDC metadata

InfoSphere CDC requires a minimum tablespace of 500 MB for product metadata and changed data that is stored in journal tables.

The size of the minimum required tablespace will vary depending on the size of your replication configuration. Environments with many concurrent Oracle sessions could require 100-200 MB.

Related concepts

“Undo tablespace for transaction rollbacks”

Undo tablespace for transaction rollbacks

InfoSphere CDC deployed as a target requires a sufficient amount of Oracle undo tablespace for transaction rollbacks.

Undo tablespace requirements can be determined by calculating the size of the largest database transactions that InfoSphere CDC will process in a production environment. For example, if the sum of all rows changed in the largest transaction in your production environment is 2 million and the average row size for the transaction is 2 KB, InfoSphere CDC will require an undo tablespace of 4 GB (2 million rows × 2 KB = 4 GB). As a best practice, you may want to double this requirement since InfoSphere CDC stores journal tables with product metadata and changed data in your database.

When refreshing an entire table with InfoSphere CDC, you will also require sufficient undo tablespace for an insert of the entire table.

Insufficient undo tablespace may result in database errors.

Related concepts

“Tablespace for InfoSphere CDC metadata”

Bulk load refresh

InfoSphere CDC can use a bulk load refresh when applying data to the target database.

Multiple instances and sharing the same journal table

By default, each instance of InfoSphere CDC uses its own set of journal tables in its own Oracle schema. Also, each instance creates its own set of triggers for the tables you have mapped for mirroring. To reduce the performance impact of multiple triggers on the source table, you can have multiple instances share one journal table. This may be necessary when you require a large number of subscriptions to replicate data and you want to distribute the workload across multiple instances of InfoSphere CDC. This way, InfoSphere CDC only has to create one trigger on the source table and copy data to one journal table.

You can enable the system parameter mirror_journal_schema so that these instances can share the same journal table and trigger. For example, if you were replicating from 1 database to 100 other databases, you may want to create two instances of InfoSphere CDC and create 50 subscriptions for each instance. To share the same journal table and trigger between both these instances, you can set the value of the mirror_journal_schema system parameter so that it references an existing journal table. For example, if you have created an instance INSTANCE1 that specifies the journal table SCHEMA1, and you have created another instance INSTANCE2 that specifies the journal table SCHEMA2, then you would set the value of mirror_journal_schema on INSTANCE2 to SCHEMA1.

See "mirror_journal_schema" in your InfoSphere CDC documentation.

Oracle listener

InfoSphere CDC requires the Oracle listener to connect to the database. You must start the listener before installing InfoSphere CDC. For more information on starting the Oracle listener, see your Oracle documentation.

Database constraints

InfoSphere CDC supports constraints on target tables under the following conditions:
v Constraints on the target system must be the same or less restrictive than the source system.
v When constraint checking is deferred on the Oracle source, the deferred constraints should be removed from the target system to enable the product to run correctly.

You can defer constraints in Oracle with one of the following methods:
v A user manually defers the constraints
v An update of KEY = KEY + 1. For example, UPDATE SCOTT.EMP SET EMPNO = EMPNO + 1;
v Cascading deletes

Sybase

The following Sybase configuration settings should be considered prior to installing InfoSphere CDC:

“Setting the LANG environment variable (UNIX)”
“Database and backup restrictions”
“Refresh performance considerations” on page 72
“Enabling the creation of a partition table” on page 73

Setting the LANG environment variable (UNIX)

The UNIX Language environment variable [LANG] needs to be properly set for InfoSphere CDC to open files with MBCS characters.

If your file names will contain a combination of English and Japanese characters, set the LANG parameter to the following value: LANG=ja_JP.UTF-8

Database and backup restrictions

Before you can start replicating data from a Sybase database, you must follow certain database and backup restrictions on each database you want InfoSphere CDC for Sybase databases to connect to. These restrictions apply only if the Sybase database is used as a replication source. The restrictions arise because InfoSphere CDC must read the backup (archive) logs in addition to reading the online (live) log.

See also:
“Database restrictions”
“Backup restrictions”

Related concepts

“Sybase - online and archive logs” on page 35

Database restrictions

When you use a Sybase database, consider the following restrictions.
v Ensure the truncate log option on checkpoint is disabled: If the truncate log option is enabled, the database log will be truncated automatically without backup every time a database checkpoint is performed.
v Never run truncate_only operations: Running dump transaction with the truncate_only option deletes inactive transactions from the log without creating a backup.
v Use only data or log segments, not a combination: If a database has mixed segments, log backup is not allowed and only full database backups can be performed.
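
The first two restrictions can be illustrated with standard Sybase ASE commands. In the sketch below, <db> is a placeholder for your database name, as elsewhere in this guide, and the dump file path is hypothetical. Confirm the exact procedure for your ASE version with your database administrator.

```sql
-- Disable automatic log truncation at checkpoint (run from the master database).
use master
go
sp_dboption <db>, "trunc log on chkpt", false
go

-- Make the changed option take effect in the database.
use <db>
go
checkpoint
go

-- Back up the transaction log to a file instead of using truncate_only.
-- The path below is a hypothetical location in the archive log directory.
dump transaction <db> to "/sybase/backup/<db>_log.dmp"
go
```

A log dump of this kind removes the inactive portion of the log only after it has been written to the backup file, so the changes remain available to InfoSphere CDC.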

Backup restrictions

InfoSphere CDC uses your existing Sybase backup strategy to maintain transaction logs and backup archive files.

You should consider the following backup restrictions:
v All backup archive files must be in one directory: You will have to specify the archive log directory when you configure InfoSphere CDC.

Note: Do not place the archive logs in the same directory as your online logs and do not place the archive logs for more than one Sybase database in the same directory.

v Backup files must be locally accessible: Do not back up with striping to multiple disks, to tapes, or to remote servers.
v Use only uncompressed backup files: Do not execute compressed backups. Only uncompressed backup files can be read.
v Use a single backup server: InfoSphere CDC cannot read and merge logs from multiple servers.
v Retain the backup server log: Do not delete a backup server log file. The backup server log lists all backups. Removing obsolete entries from the log can be done manually as a part of the backup log removal procedure using a command-line utility.

Refresh performance considerations

The following considerations only apply if the Sybase database is used as a targetof replication.

Enabling the bulkcopy option

To enable higher performance copying of data during a refresh to large tables,consider enabling Fastload by running:sp_dboption <db>, "select into/bulkcopy", true

IBM Confidential


Note: You will then need to run a full backup. For details see your Sybase ASEdocumentation.

Enabling the creation of a partition table

If you want to create a range partition table, the Sybase ASE installation must have the ASE Partitions License, and sp_configure must be used to set enable semantic partitioning to 1:

sp_configure 'enable semantic partitioning', 1

Teradata

The following Teradata configuration settings should be considered prior to installing InfoSphere CDC:

“Driver and utilities requirements”
“Directories for Teradata FastLoad files”

Driver and utilities requirements

Teradata JDBC driver: version 12.00.00.110
Note: If you have the system parameter mirror_td_apply_method set to JDBC, you must use Teradata JDBC driver, version 13.00.00.06.
Teradata utilities (both of the following):

v Teradata TPump, version 12.00.00.000

v Teradata FastLoad, version 12.00.00.000

Teradata JDBC driver: version 13.00.00.06
Teradata utilities (both of the following):

v Teradata TPump, version 13.00.00.000

v Teradata FastLoad, version 13.00.00.000

Directories for Teradata FastLoad files

A directory will need to be created or assigned for use with the FastLoad utility in Teradata. Both your InfoSphere CDC user and your database must have read and write permissions for this directory. Use a different directory for each instance of InfoSphere CDC.



Server requirements

Before you can install InfoSphere CDC, you need to ensure that the system you choose meets the necessary operating system, hardware, disk, and memory requirements.

In this section, you will learn:
“Supported operating systems and processors”
“CPU resource requirements” on page 83
“RAM requirements” on page 84
“Disk requirements” on page 85
“InfoSphere CDC metadata resiliency” on page 87

Related concepts

“Database requirements and supported features” on page 15
“User account requirements” on page 89
“TCP/IP network requirements and supported features” on page 97

Supported operating systems and processors

The following table lists supported operating system and processor requirements for the administration of InfoSphere CDC.

InfoSphere CDC administration    Operating systems and processors

Management Console

You will require one of the following versions of Windows:

v Microsoft Windows XP—x86/x64 processors

v Microsoft Windows Vista—x86/x64 processors

v Microsoft Windows 7—x86/x64 processors

v Microsoft Windows Server 2003—x86/x64 processors


v Microsoft Windows Server 2008 R2—x86/x64 processors


Access Server

If you are using Windows, use one of the following versions:

v Microsoft Windows XP—x86/x64 processors


v Microsoft Windows 7—x86/x64 processors




If you are using UNIX or Linux, use one of the following:

v AIX, version 5.3.0.8 or later—POWER® processor

v AIX, version 6.1—POWER processor


v HP-UX, 11i v2 (11.11)—PA-RISC processor


v Linux Red Hat version 4—x86/x64 processors


v Linux Red Hat version 5.4—x86/x64 processors

v Novell SUSE Linux (SLES) 10.0 Enterprise Server—x86/x64 processors

v Novell SUSE Linux (SLES) 11.0 Enterprise Server—x86/x64 processors

v Sun Solaris, version 2.9—SPARC processor


InfoSphere Classic CDC for z/OS Classic Data Architect

If you are using Windows, use one of the following versions:

v Microsoft Windows 2000

v Microsoft Windows XP Professional


If you are using UNIX or Linux, use one of the following:

v Red Hat Desktop 4.0—x86-32 processor

v Red Hat Enterprise Linux (RHEL) 4.0 AS/ES—x86-32 processor

v SuSE Linux (SLES) 10.0 Enterprise Server—x86-32 processor

v SuSE Linux (SLES) 9.0 Enterprise Server—x86-32 processor

v SuSE Linux (SLES) 9.0 SP2 Enterprise Server—x86-32 processor

v SUSE Linux Enterprise Desktop 10.0—x86-32 processor

The following tables list the supported operating system and processor requirements for each InfoSphere CDC replication engine.

InfoSphere CDC for DB2 for LUW



Operating system and processor


v Microsoft Windows Server 2003—x86/x64 processors (Not applicable for DB2 pureScale version 9.8 FP3)

v Microsoft Windows Server 2008—x86/x64 processors (Not applicable for DB2 pureScale version 9.8 FP3)

v Microsoft Windows Server 2008 R2—x86/x64 processors (Not applicable for DB2 pureScale version 9.8 FP3)

If you are using UNIX or Linux, use one of the following operating systems:

v AIX, version 5.3.0.8 or later—POWER processor (Not applicable for DB2 pureScale version 9.8 FP3)

v Linux Red Hat version 4—x86/x64 processors (Not applicable for DB2 pureScale version 9.8 FP3)

v Linux Red Hat version 5—x86/x64 processors (Not applicable for DB2 pureScale version 9.8 FP3)

v Linux Red Hat version 5.4—x86/x64 processors (Not applicable for DB2 pureScale version 9.8 FP3)

v Linux Red Hat on System z® version 5 (Not applicable for DB2 pureScale version 9.8 FP3)

v Novell SUSE Linux (SLES) 10.0 Enterprise Server—x86/x64 processors

v Linux Red Hat on System z version 5 (Not applicable for DB2 pureScale version 9.8 FP3)

v Novell SUSE Linux (SLES) on System z version 10 (Not applicable for DB2 pureScale version 9.8 FP3)

v Sun Solaris, version 2.9—SPARC processor (Not applicable for DB2 pureScale version 9.8 FP3)

v Sun Solaris, version 2.10—SPARC processor (Not applicable for DB2 pureScale version 9.8 FP3)

InfoSphere CDC for DB2 for i, version 6.2, Fix Pack 1

Operating system and processor Java Virtual Machine (JVM)

You will require the following operating system:

v IBM i, i5/OS® version 5.3—i5 processor

v IBM i, i5/OS version 5.4—i5 processor

Note: For correct product operation, you may require Program Temporary Fixes (PTFs). Contact IBM for PTF information for your operating system.

JVM 1.5 or later

InfoSphere CDC for z/OS


Operating system

InfoSphere CDC for z/OS supports the following versions of z/OS:

v z/OS Version 1 Release 10

v z/OS Version 1 Release 11

InfoSphere CDC Event Server


If you are using Windows, you will require the following operating system:





v AIX, version 5.3.0.8 or later—POWER processor



v HP-UX, 11i v3 (11.23)—Itanium processor








If you are using IBM i, use the following:




InfoSphere CDC for Informix


















v Linux Red Hat version 5.4—x86/x64 processors





Note: If you plan to replicate data between an Informix Dynamic Server database and a DB2 for z/OS database, contact IBM Technical Support for an important update.

InfoSphere CDC for InfoSphere DataStage













v HP-UX, 11i v2 (11.11)—PA-RISC processor (Not applicable for InfoSphere DataStage version 8.5)

v HP-UX, 11i v3 (11.23)—PA-RISC processor (Not applicable for InfoSphere DataStage version 8.5)

v Linux Red Hat version 4—x86/x64 processors (Not applicable for InfoSphere DataStage version 8.5)

v Linux Red Hat version 5.4—x86/x64 processors (Not applicable for InfoSphere DataStage version 8.5)

v Linux Red Hat on System z version 5 (Not applicable for InfoSphere DataStage version 8.5)

v Novell SUSE Linux (SLES) 11.0 Enterprise Server—x86/x64 processors (Not applicable for InfoSphere DataStage version 8.1)

v Novell SUSE Linux (SLES) on System z version 10 (Not applicable for InfoSphere DataStage version 8.1)

v Novell SUSE Linux (SLES) on System z version 11 (Not applicable for InfoSphere DataStage version 8.1)



InfoSphere CDC for Microsoft SQL Server






InfoSphere CDC for Oracle databases




One of the following:




v HP-UX, 11i v3 (11.23)—Itanium processor (Not applicable for Oracle 11gR2)


v HP-UX, 11i v2 (11.11)—PA-RISC processor (Not applicable for Oracle 11gR2)


v Linux Red Hat version 4—x86/x64 processors (Not applicable for Oracle 11gR2)

v Linux Red Hat version 5—x86/x64 processors (Not applicable for Oracle 11gR2)

v Linux Red Hat version 5.4—x86/x64 processors (Not applicable for Oracle 10g or Oracle 11gR2)

v Novell SUSE Linux (SLES) 11.0 Enterprise Server—x86/x64 processors (Not applicable for Oracle 9i or Oracle 10g)

v Linux Red Hat on System z version 5 (Not applicable for Oracle 9i)

v Novell SUSE Linux (SLES) on System z version 10 (Not applicable for Oracle 9i)

v Novell SUSE Linux (SLES) on System z version 11 (Not applicable for Oracle 9i)



InfoSphere CDC for Oracle databases (trigger)











v HP-UX, 11i v3 (11.31)—Itanium processor (Not applicable for Oracle 9i, Oracle 10g, or Oracle 11gR2)



v Linux Red Hat on System z version 5

v Novell SUSE Linux (SLES) on System z version 10



InfoSphere CDC for Sybase databases










v HP-UX, 11i v3 (11.31)—Itanium processor (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

v Linux Red Hat version 4—x86/x64 processors (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

v Linux Red Hat version 5—x86/x64 processors (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

v Linux Red Hat version 5.4—x86/x64 processors (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

v Novell SUSE Linux (SLES) 10.0 Enterprise Server—x86/x64 processors (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

v Sun Solaris, version 2.9—SPARC processor (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

v Sun Solaris, version 2.10—SPARC processor (Not applicable for Sybase ASE 12.53 or Sybase ASE 12.54)

InfoSphere CDC for Teradata






If you are using UNIX, use the following operating system:



v 32-bit Teradata Tools and Utilities

InfoSphere Classic CDC for z/OS

Operating system

InfoSphere Classic CDC for z/OS supports the following versions of z/OS:

v IBM z/OS Version 1.10 or later



Related concepts

“Supported databases and target applications” on page 16

CPU resource requirements

InfoSphere CDC source and target systems both require CPU resources, and the availability of CPU directly affects the number of rows processed and the general performance of the product. CPU resources are consumed by different processes on the source and target systems.

The following table lists CPU resource consumption for source and target systems by replication engine.

Replication engines:

v InfoSphere CDC for DB2 for LUW

v InfoSphere CDC for Informix

v InfoSphere CDC for Microsoft SQL Server

v InfoSphere CDC for Oracle databases

v InfoSphere CDC for Oracle databases (trigger)

v InfoSphere CDC for Sybase databases

Source system CPU resources are required for these processes:

v Capturing, decoding, and staging the changed data stored in the database log files. The product must process and filter the entire database log at all times, even if the majority of the log contains out-of-scope data.

v Invoking triggers and journal tables in your Oracle database which contain the data to be replicated.

v Transmitting the captured change data to the target system using TCP/IP.

v Refresh operations which query the database directly using JDBC.

v Querying the database directly for replication and refresh of LOB and LONG data types.

v Querying the database directly for replication and refresh of LOB, CLOB, XML, and SQL_VARIANT data types.

v Reading the IBM Informix database API to capture changed data.

v Querying the database directly for %GETCOL functions.

Target system CPU resources are required for these processes:

v Receiving captured change data from the source system.

v Conversion of the captured change data to database operations.

v Committing the database operations to the target database.

v Execution of SQL operations by the target database during the apply process on the target system.

v Multiple indexes on tables in the target database often require additional CPU resources.

v InfoSphere CDC for Netezza databases

InfoSphere CDC for Netezza databases can only be deployed as a target of replication and is not available as a source of replication. For more information on the CPU requirements for the source system in your replication environment, see the appropriate section in this guide.





v InfoSphere CDC Event Server

InfoSphere CDC Event Server can only be deployed as a target of replication and is not available as a source of replication. For more information on the CPU requirements for the source system in your replication environment, see the appropriate section in this guide.

Target system CPU resources are required for these processes:

v Conversion of the captured change data to XML messages.

v Sending messages to JMS destinations.


v InfoSphere CDC for InfoSphere DataStage

InfoSphere CDC for InfoSphere DataStage can only be deployed as a target of replication and is not available as a source of replication. For more information on the CPU requirements for the source system in your replication environment, see the appropriate section in this guide.

Target system CPU resources are required for these processes:

v Generating and writing flat files to disk.

v InfoSphere CDC for Teradata

InfoSphere CDC for Teradata can only be deployed as a target of replication and is not available as a source of replication. For more information on the CPU requirements for the source system in your replication environment, see the appropriate section in this guide.






v InfoSphere CDC for z/OS

Source system CPU resources are required for these processes:

v Capturing, decoding, and staging the changed data stored in the database log files. The product must process all records from the DB2 log for Units of Recovery and records for tables which have DATA CAPTURE CHANGES configured. The DB2 IFI interface must filter the entire database log at all times, even if the majority of the log contains out-of-scope data.

v Transmitting the captured change data to the target system using TCP/IP.

v Querying the database directly for %GETCOL or %SELECT functions.






Related concepts

“RAM requirements”
“Disk requirements” on page 85

RAM requirements

For information on storage requirements for InfoSphere CDC for z/OS, see “Estimating above the bar storage requirements” on page 48.

When configuring instances of InfoSphere CDC, the defaults for memory allocation are as follows:

Product                                                   Required RAM
32-bit instance of InfoSphere CDC                         512 MB
64-bit instance of InfoSphere CDC                         1024 MB
InfoSphere CDC for DB2 for i                              512 MB of available pooled memory
Access Server                                             512 MB (1024 MB is recommended)
Management Console                                        512 MB (1024 MB is recommended)
InfoSphere Classic CDC for z/OS Classic Data Architect    1 GB

Although InfoSphere CDC memory requirements will fluctuate, you must work with your system administrator to ensure the allocated memory for each instance of the product is available at all times. This may involve deployment planning, since other applications with memory requirements may be installed on the same server with InfoSphere CDC. Using values other than the defaults or allocating more RAM than is physically available on your server should only be undertaken after considering the impacts on product performance.
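The planning arithmetic described above can be sketched in a few lines of Python. This is only an illustration, assuming that every instance uses the default allocations from the table (512 MB for a 32-bit instance, 1024 MB for a 64-bit instance). The function names and the reserved_for_other_apps_mb parameter are hypothetical and are not part of InfoSphere CDC.

```python
# Default allocations taken from the table in this section (in MB).
DEFAULT_MB = {"32-bit": 512, "64-bit": 1024}


def total_instance_ram_mb(instances):
    """Sum the default allocation for each instance kind ('32-bit' or '64-bit')."""
    return sum(DEFAULT_MB[kind] for kind in instances)


def fits_in_ram(instances, physical_mb, reserved_for_other_apps_mb=0):
    """True when all instances plus other applications fit in physical RAM."""
    needed = total_instance_ram_mb(instances) + reserved_for_other_apps_mb
    return needed <= physical_mb


if __name__ == "__main__":
    planned = ["64-bit", "64-bit", "32-bit"]  # hypothetical deployment
    print(total_instance_ram_mb(planned))
    print(fits_in_ram(planned, physical_mb=4096, reserved_for_other_apps_mb=1024))
```

A check like this only covers the default allocations; instances configured with other values need their actual settings substituted.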

InfoSphere CDC source deployments may require additional RAM in the following scenarios:

v You are replicating large LOB data types with your InfoSphere CDC source deployment. These data types are sent to the target while being retrieved from the source database. The target waits until all LOBs (for each record) are received before applying a row. LOBs are stored in memory as long as there is adequate RAM; otherwise they are written to disk on the target.

v You are replicating "wide" tables with hundreds of columns.

v You are performing large batch transactions in your source database rather than online transaction processing (OLTP).

With insufficient physically available RAM, you may encounter some of the following performance-related issues:

v A possible performance degradation as the operating system on your server increasingly relies on virtual memory and disk I/O.

v Additional CPU resources may be consumed as the product spends more time cleaning up memory.

v Restricted throughput. Replication may be slow or pause due to an increased reliance on virtual memory and disk I/O.

v Replication may be slow as the product processes smaller units of work to preserve memory.

v Disk-staging of transactions by the InfoSphere CDC source deployment.

v Out-of-memory exceptions, time outs, monitoring issues, and configuration issues with command line tools or Management Console.

Related concepts

“CPU resource requirements” on page 83
“Disk requirements”

Disk requirements

This section details the disk requirements for an InfoSphere CDC deployment.

See also:
“Disk space” on page 86
“Disk speed” on page 87
“Disk type” on page 87



Disk space

Disk space for InfoSphere CDC

v 10 GB—The minimum required disk space for installation files, metadata database and backups, flat files for InfoSphere CDC for InfoSphere DataStage, LOB storage cache, transaction queues (InfoSphere CDC source only), event log, and trace files. Disk space requirements are dependent on your replication configuration. For larger systems, 100 GB of disk space is a typical requirement.

v 40 cylinders of a 3390 DASD device or equivalent for installing and maintaining InfoSphere CDC for z/OS. Additional space requirements for data sets used during execution are determined by the installer.

v 500 MB—The minimum required disk space for installation files, metadata database and backups, event log, and trace files. Disk space requirements are dependent on your replication configuration. For larger systems, 5 GB of disk space is a typical requirement.

v 60 MB—The minimum required disk space for installation files, metadata database and backups, event log, and trace files. Disk space requirements are dependent on your replication configuration. Depending on the volume of data that you are replicating, you may be required to allocate more disk space.

Disk space for Management Console

250 MB

Disk space for Access Server

250 MB

Disk space for InfoSphere Classic CDC for z/OS Classic Data Architect

150 MB for IBM Installation Manager

690 MB for Classic Data Architect

InfoSphere CDC may require disk space in addition to the stated minimum value for the following scenarios:

v You are running large batch transactions in the database on your source system.

v You are configuring multiple subscriptions and one of your subscriptions is latent. In this type of scenario, the InfoSphere CDC source may stage transaction queues on disk if available RAM is insufficient.

v You are replicating large LOB data types.

v You are replicating "wide" tables that have hundreds of columns.

If you configure InfoSphere CDC with insufficient disk space for your database workload, you may encounter some of the following performance-related issues:

v Replication may pause until disk files are released.

v Replication may shut down or pause during large batch jobs, which might have an effect on metadata or result in events not being recorded.
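As a pre-installation check, the stated minimum can be compared with the free space on the intended file system. The following Python sketch is one way to do that with the standard library. It assumes the 10 GB minimum quoted above and treats 1 GB as 2**30 bytes; the function names are hypothetical, and the other minimums (for example 500 MB) can be passed in as a fraction of a gigabyte.

```python
import shutil

GIB = 1024 ** 3  # assumption: 1 GB is counted as 2**30 bytes


def free_space_shortfall(free_bytes, required_gb):
    """Bytes still needed to reach required_gb, or 0 if already satisfied."""
    required_bytes = int(required_gb * GIB)
    return max(0, required_bytes - free_bytes)


def check_path(path, required_gb=10.0):
    """Shortfall in bytes for the file system that holds path."""
    return free_space_shortfall(shutil.disk_usage(path).free, required_gb)


if __name__ == "__main__":
    missing = check_path(".", required_gb=10.0)
    if missing:
        print("Need", missing, "more bytes of free disk space")
    else:
        print("Minimum disk space is available")
```

Because the real requirement depends on the replication configuration, a passing result against the minimum does not guarantee enough space for large batch transactions or LOB staging.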



Related concepts

“Disk requirements” on page 85

Disk speed

The InfoSphere CDC source deployment must read the database logs from disk. For optimal performance, there must be sufficient disk speed to accommodate the I/O requirements of InfoSphere CDC (reading the logs) and the database (writing the logs).

InfoSphere CDC requires a disk device with the following minimum requirements:

v 20 MB per second with concurrent reads/writes.

v A minimum of 200 file handles from your supported operating system.

Related concepts


Disk type

InfoSphere CDC operates most efficiently when installed on a low-latency, high-performance disk device.

InfoSphere CDC supports installation on the following types of disk devices:

v Non-networked file systems such as local physical drives

v Storage area networks (SAN)

Related concepts


InfoSphere CDC metadata resiliency

InfoSphere CDC metadata contains important information about your current replication configuration. The product metadata is changed every time you make changes to objects such as subscriptions or table mappings in Management Console. Metadata corrupted by a hardware or software failure may result in a number of issues within the product, such as incorrect status for your subscriptions.

It is important to protect this information by backing up your metadata every time you make a change to your replication configuration.

To ensure that InfoSphere CDC is metadata resilient, you can do the following:

v Install InfoSphere CDC on a mirrored device.

v Use the dmbackupmd command to back up product metadata every time you change your configuration.

Note: The dmbackupmd command is not available to InfoSphere CDC for z/OS. The metadata for InfoSphere CDC for z/OS is stored in your DB2 database and should be included as a part of your normal database backups.



User account requirements

When you configure InfoSphere CDC, you are prompted for the name of the database you want InfoSphere CDC to connect to and the user name and password of the user that has access to this database.

In this section, you will learn:
“User account access requirements”

Related concepts

“Server requirements” on page 75
“TCP/IP network requirements and supported features” on page 97
“Database requirements and supported features” on page 15

User account access requirements

The following tables indicate the operating system and database user account requirements necessary to successfully install, configure, and run InfoSphere CDC. User account requirements for supported middleware targets such as InfoSphere CDC Event Server and InfoSphere CDC for InfoSphere DataStage are also listed.

InfoSphere CDC administration    Operating system user account requirements    Database user account requirements

Management Console

Operating system user account requirements: You must set up a new, or decide on an existing, Windows account that you will use to install, configure, or upgrade Management Console.

Database user account requirements: N/A

Access Server

Operating system user account requirements: You must set up a new, or decide on an existing, Windows, UNIX, or Linux account that you will use to install, configure, or upgrade Access Server.

UNIX and Linux installations of Access Server require the following additional steps before you can log in to Management Console:

v Start Access Server.

v Create an Access Server user account with the dmcreateuser command.

Database user account requirements: N/A



InfoSphere CDC replication engine    Operating system user account requirements    Database user account requirements

InfoSphere CDC for DB2 for LUW

Operating system user account requirements:

v Windows—You must set up a new, or decide on an existing, Windows account that you will use to install, configure, or upgrade InfoSphere CDC.

v UNIX—You must set up a new, or decide on an existing, UNIX account that you will use to install, configure, or upgrade InfoSphere CDC. You can install InfoSphere CDC in the directory of your choice; however, it must be owned by the UNIX account.

Note the following before you install or upgrade InfoSphere CDC on Linux or UNIX:

v Do not install or upgrade InfoSphere CDC as a root user.

v The installation directory requires file system permissions of 700 if you plan on using the same user account to install the product, create and configure instances, or upgrade the product.

v The installation directory requires file system permissions of 770 if you plan on using different user accounts to install the product, create and configure instances, or upgrade the product.

Database user account requirements: You must have a DB2 user account with system administrator (SYSADM) or database administrator (DBADM) privileges for InfoSphere CDC to connect to your DB2 database.
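The installation directory rules for Linux and UNIX can be verified before an install or upgrade. The Python sketch below is an illustration only. It assumes the intended permission modes are 700 (single account) and 770 (different accounts) and that the check runs as the account that will own the installation. The function names are hypothetical and not part of the product.

```python
import os
import stat


def required_mode(shared_between_accounts):
    """Permission bits called for on the installation directory."""
    return 0o770 if shared_between_accounts else 0o700


def has_required_mode(path, shared_between_accounts):
    """True when the directory's permission bits equal the required mode."""
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return actual == required_mode(shared_between_accounts)


def owned_by_current_user(path):
    """True when the directory is owned by the account running this check."""
    return os.stat(path).st_uid == os.getuid()


def running_as_root():
    """The guide advises against installing or upgrading as root."""
    return os.geteuid() == 0
```

For example, has_required_mode("/opt/cdc", shared_between_accounts=False) would return True only for a directory with mode 700, where /opt/cdc stands in for whatever installation directory you chose.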

InfoSphere CDC Event Server







N/A





InfoSphere CDC for DB2 for i

Operating system user account requirements: When using InfoSphere CDC with a DB2 for i database, you need to verify that all user profiles used to run InfoSphere CDC and start mirroring jobs have sufficient authorities to access journals and journal receivers that are used by the product. The D_MIRROR user profile is created during product installation and is reserved for InfoSphere CDC. You should not log on with this user profile. You can also create and customize a user profile, which gives you flexibility and control over security.

The user-configured user profile can be specified as the value for parameter USER in the CHGJOBD command issued for all job description objects found in the product library. To start the product TCP listener for a non-D_MIRROR user profile, the command STRDMTCP should not be used to start the listener.

Database user account requirements: N/A

InfoSphere CDC for InfoSphere DataStage







N/A




InfoSphere CDC for Informix







Use an Informix user account that is part of the informix group to install InfoSphere CDC on a supported operating system.

InfoSphere CDC for Microsoft SQL Server

Operating system user account requirements: You must set up a new, or decide on an existing, Windows account that you will use to install, configure, or upgrade InfoSphere CDC.

Database user account requirements: If you plan on using SQL authentication to allow InfoSphere CDC to connect to your Microsoft SQL Server database, you must create a user account with SQL authentication that has the following privileges for the Microsoft SQL Server instance:

v If you are using InfoSphere CDC as a source of replicated data, you must specify sysadmin privileges for the user account.

v If you are using InfoSphere CDC as a target of replicated data, at minimum you must specify db_owner privileges for the database and bulkadmin as the server role. If you prefer, you can also specify sysadmin privileges for the user account.





InfoSphere CDC for Netezza databases

You must set up a new, or decide on an existing, Linux account that you will use to install, configure, or upgrade InfoSphere CDC. You can install InfoSphere CDC in the directory of your choice; however, it must be owned by the Linux account.





N/A





InfoSphere CDC for Oracle databases

Operating system user account requirements: You must set up a new, or decide on an existing, UNIX account that you will use to install, configure, or upgrade InfoSphere CDC. You can install InfoSphere CDC in the directory of your choice; however, it must be owned by the UNIX account.

Database user account requirements: Create a user account that has DBA privileges for the Oracle instance.

Before installing InfoSphere CDC, make sure you review the specific grants required by the Oracle DBA. Use the sample ora-createuser.sql SQL script located in the installation directory to create an Oracle user with all the necessary DBA privileges that are required.

Optionally, you can create a user account that has a read-only database connection to the source database. Specify that you want read-only access and provide this user name when installing and configuring InfoSphere CDC. A read-only database connection to the source database means that the user can only view data or mirror subscribed tables. The user cannot change any information. If you use a read-only user, you should also ensure you have enabled supplemental logging at the database table level prior to installing and configuring InfoSphere CDC.

Note: If you plan to use a read-only database connection, ensure that you have the DBMS_FLASHBACK Oracle-supplied package installed. By default, this package is installed when you create an Oracle database and run the CATPROC.SQL script. No further action is required for this package. For more information about this package, refer to your Oracle documentation.

If your database instance is managed by Oracle Automatic Storage Management (ASM), then you should already have an Oracle account for the ASM instance to which you want to connect. InfoSphere CDC requires a user name and password so that it can connect to the ASM instance that corresponds to the node in the cluster. The ASM user must have SYSDBA privileges in order to log into ASM.






You must set up a new, or decide on an existing, UNIX account that you will use to install, configure, or upgrade InfoSphere CDC. You can install InfoSphere CDC in the directory of your choice; however, it must be owned by the UNIX account.

Create a user account that has DBA privileges for the Oracle instance. Before installing InfoSphere CDC, make sure you review the specific grants required by the Oracle DBA.

Use the sample ora-createuser.sql SQL script located in the installation directory to create an Oracle user with all the necessary DBA privileges that are required.

InfoSphere CDC for Sybase databases

For InfoSphere CDC to connect to your Sybase database, you need to create a Sybase user account and assign system administrator or database administrator (DBA) privileges to this user.





InfoSphere CDC for Teradata

For InfoSphere CDC to connect to your Teradata database, you need to create a Teradata user account.

InfoSphere CDC for z/OS

Operating system user account requirements: A z/OS user ID must be created for the InfoSphere CDC for z/OS instance to run under. When defining this user ID, make sure to define an OMVS segment. This user ID will become the owner of the InfoSphere CDC for z/OS metadata in IBM DB2 for z/OS and must have the SYSCTRL privilege (member CHCGRNTA in the sample library can be used to grant this privilege).

The user account must also have:

v SELECT authority for all tables to be replicated on the source database

v INSERT, UPDATE, DELETE & SELECT authority for all tables that will be replicated on the target database.

Database user account requirements: You must have a DB2 user account with system administrator (SYSADM) or database administrator (DBADM) privileges for InfoSphere CDC to connect to your DB2 database. The user account must also have:

v SELECT authority for all tables to be replicated on the source database

v INSERT, UPDATE, DELETE & SELECT authority for all tables that will be replicated on the target database.



TCP/IP network requirements and supported features

Before you can install Management Console, you need to ensure that your network meets the necessary communications requirements for InfoSphere CDC replication engines and for the InfoSphere CDC administration components: Management Console and Access Server.

Management Console supports IPv6 networks and mixed IPv4 and IPv6 networks.

In this section you will learn:
“InfoSphere CDC replication engine network requirements”
“InfoSphere CDC administration network requirements” on page 98
“Network connection resiliency” on page 100
“Data encryption considerations” on page 101

Related concepts

“User account requirements” on page 89
“Server requirements” on page 75
“Database requirements and supported features” on page 15

InfoSphere CDC replication engine network requirements

InfoSphere CDC requires:

v A fully inclusive TCP/IP network path with adequate network bandwidth that connects the source and target installations of InfoSphere CDC, Management Console, and Access Server. InfoSphere CDC may shut down if the network is unreliable.

v Firewalls or network tools that do not interfere with or close InfoSphere CDC communication ports or, in the case of InfoSphere CDC for Netezza databases, interfere with its connection to the Netezza appliance.

v Reliable connections to the database or to other applications, such as email servers, that InfoSphere CDC user exits or notifications depend on. The product may shut down if these connections are lost.

v Reliable network connections between the source or target deployments of InfoSphere CDC and Access Server. Lost connections may result in metadata corruption during active configuration in Management Console.

v Sufficient network bandwidth when the product is configured to read remote database logs with a networked file system (NFS).

v Sufficient network bandwidth between the Linux server where InfoSphere CDC for Netezza databases is installed and the Netezza appliance.



Related concepts

“InfoSphere CDC administration network requirements”

InfoSphere CDC administration network requirements

Management Console

Management Console requires a valid TCP/IP network so that it can communicate with your installation of InfoSphere™ CDC.

Access Server

After installing Access Server, you must configure static ports if you are using a firewall or other security mechanism that requires fixed ports.

If your network uses a firewall or other security mechanism that requires static ports for communication, then you must specify the ports that other computers can use to communicate with Access Server services.

Note: In addition to a network firewall, you might have personal firewall software installed and enabled on client machines. This firewall may cause a problem when connecting to Management Console from Access Server.

To calculate the number of Access Server ports to open, use this formula:

number of ports to open = 2 * (number of users + (number of users * number of datastores) + number of datastores)

where a datastore refers to an InfoSphere CDC installation.

The following figure highlights the ports you can configure for Management Console and Access Server components. You can configure static port numbers for all or some of these ports, depending on your network requirements.

The labels in the figure above correspond to the following groups of ports:
- 1: Communication from Management Console to the Access Server service. You specify this port when you install Access Server and when you log in to Management Console. The default port is 10101 and you can set this value in Management Console.

- 2: Communication from Access Server back to Management Console for monitor updates.
- 3: Communication from Management Console to the Access Server service, per datastore (that is, per InfoSphere CDC installation). This requires two ports for each InfoSphere CDC installation.
- 4: Communication from the Access Server service to the datastore listen process. This is established for each Management Console connection.
- 5: Communication from the Access Server service to the datastore monitor process. This is a shared connection between all Management Console connections on the same datastore. This requires two ports for each datastore.

You must also configure your routers and firewalls to allow communication through the configured ports. For more information, contact your network administrator.
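Once routers and firewalls have been configured, it can be useful to confirm that a configured port is actually reachable from a client machine. The following is a minimal sketch in Python that uses only the standard library; it is not part of InfoSphere CDC. The function name `port_is_reachable` is hypothetical, and the check only shows that a TCP connection can be opened, not that the service behind the port is healthy.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A False result means the connection was refused, timed out, or the
    host name could not be resolved (for example, a firewall dropped it).
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `port_is_reachable("access-server.example.com", 10101)` would test the default Access Server port named above; the host name here is a placeholder, not a real server.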

Management Console requires:
- One input and output port to the Access Server
- One input port from the Access Server
- One input and output port per datastore (regardless of whether you connect to the datastore)

The Access Server requires:
- One input and output port per datastore, per installation of Management Console
- Two input and output ports per datastore

Additionally, you can have more than one datastore, or more than one installation of Management Console; for example:
- One installation of Management Console and one datastore
- One installation of Management Console and two datastores
- Two installations of Management Console and one datastore
- Two installations of Management Console and two datastores

Example: calculating ports required

To help determine the number of ports required, take a scenario where there are ten concurrent users and three datastores.

To calculate the number of Access Server ports to open, use this formula:

number of ports to open = 2 * (number of users + (number of users * number of datastores) + number of datastores)

where a datastore refers to an InfoSphere CDC installation.

Using the above scenario of ten concurrent users and three datastores, the number of Access Server ports required is 86. Here is the breakdown of the calculation, following the order in the figure above illustrating the ports you can configure for Management Console and Access Server components:
- Number of concurrent users that will log in to Access Server = 10
- One port per user to connect to and deliver unsolicited messages = 10
- Number of possible concurrent connections from Management Console to datastores; that is, 10 users * 3 datastores = 10 * 3
- Number of possible concurrent connections to the datastore listen process; that is, 10 users * 3 datastores = 10 * 3
- Two ports required to connect to each datastore monitor process = 2 * 3

Therefore, 10 + 10 + (10 * 3) + (10 * 3) + (2 * 3) = 86. This agrees with the formula: 2 * (10 + (10 * 3) + 3) = 86.
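The Access Server calculation above can be sketched as a small Python helper, which may be convenient when sizing firewall rules for several scenarios. The function names are hypothetical and are not part of any InfoSphere CDC tooling; the arithmetic is taken directly from the formula and breakdown in this topic.

```python
def access_server_ports(users: int, datastores: int) -> int:
    """Access Server ports to open, using the formula from this guide:
    2 * (users + (users * datastores) + datastores)."""
    if users < 0 or datastores < 0:
        raise ValueError("users and datastores must be non-negative")
    return 2 * (users + users * datastores + datastores)


def access_server_breakdown(users: int, datastores: int) -> dict:
    """Per-group port counts, in the same order as the breakdown above."""
    return {
        "user logins": users,
        "unsolicited messages": users,
        "console-to-datastore connections": users * datastores,
        "datastore listen process": users * datastores,
        "datastore monitor process": 2 * datastores,
    }
```

With ten users and three datastores, `access_server_ports(10, 3)` returns 86, and the values returned by `access_server_breakdown(10, 3)` sum to the same total.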

To calculate the number of ports to open for Management Console, use this formula:

number of ports to open = 2 + number of datastores

Using the above scenario of ten concurrent users and three datastores, the number of ports required is 5 for each Management Console. This is the breakdown of the calculation for each Management Console:
- Connection to Access Server = 1
- Connection for unsolicited updates from Access Server = 1
- One port for each connection to a datastore listen process = 1 * 3

Therefore, 1 + 1 + (1 * 3) = 5

Related concepts

“InfoSphere CDC replication engine network requirements”
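The Management Console formula from the example above (2 + number of datastores) can be sketched in Python in the same way. The function name is hypothetical; the result applies to each installation of Management Console separately.

```python
def management_console_ports(datastores: int) -> int:
    """Ports to open for one Management Console installation:
    one to Access Server, one for unsolicited updates from Access Server,
    and one per datastore."""
    if datastores < 0:
        raise ValueError("datastores must be non-negative")
    return 2 + datastores
```

For the three-datastore scenario, `management_console_ports(3)` returns 5, matching the breakdown in the example.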

Network connection resiliency

InfoSphere CDC may initiate a normal shutdown and end mirroring under the following circumstances:
- Interruptions in network communications
- DB2 LUW deadlock or timeout errors for subscriptions that target DB2 LUW

To automatically restart continuous mirroring of subscriptions after a normal shutdown of InfoSphere CDC due to the preceding scenarios, you can mark the subscriptions as persistent.

When persistency is enabled and network communications terminate, InfoSphere CDC attempts to automatically restart continuous mirroring for persistent subscriptions at regular intervals. Attempts continue until an automatic restart is successful or until the persistent subscription or the InfoSphere CDC for z/OS address space is terminated.
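As a conceptual model only, the restart behavior described above resembles a fixed-interval retry loop. The Python sketch below is not the product's implementation; the function and parameter names (`restart_until_success`, `try_restart`, `still_persistent`) are hypothetical and exist only to illustrate the two stop conditions: a successful restart, or the subscription no longer being eligible for restart.

```python
import time
from typing import Callable


def restart_until_success(
    try_restart: Callable[[], bool],
    still_persistent: Callable[[], bool],
    interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry try_restart at a fixed interval and return the attempts made.

    The loop ends when try_restart returns True, or when still_persistent
    returns False (for example, the subscription was ended).
    """
    attempts = 0
    while still_persistent():
        attempts += 1
        if try_restart():
            break
        # Wait one full interval before the next attempt.
        sleep(interval_seconds)
    return attempts
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller modelling a one-minute interval would pass `interval_seconds=60`.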

In the event of a deadlock or timeout error with subscriptions that target DB2 LUW, an event message indicates that the error is recoverable. If you have marked the subscription as persistent, it restarts automatically. When restarted, the subscription resends the data that was rolled back as a result of the error and continues.

Subscriptions will not restart automatically if you intentionally end replication for an active subscription by name. For InfoSphere CDC for z/OS subscriptions only, automatic restart will still apply if you have ended replication for a group of subscriptions by specifying the wild card character ('*') with the ENDTSMIR command.

Persistency is only relevant to subscriptions that are used for continuous mirroring. If a persistent InfoSphere CDC for z/OS subscription is used for Refresh or Scheduled End (Net Change) mirroring and network communications are interrupted, this subscription is restarted according to how the same subscription was terminated the last time it was used for continuous mirroring. For all other InfoSphere CDC replication engines, if a persistent subscription is used for Refresh or Scheduled End (Net Change) mirroring and network communications are interrupted, it will not be restarted automatically.

You can set how often InfoSphere CDC for z/OS attempts to automatically restart continuous mirroring for all persistent subscriptions by modifying the AUTORESTARTINTERVAL configuration control statement keyword. For more information, see your InfoSphere CDC for z/OS documentation.

For all other InfoSphere CDC replication engines, you can enable persistency by setting the mirror_auto_restart_interval_minutes system parameter, which determines how often InfoSphere CDC attempts to automatically restart continuous mirroring for all persistent subscriptions.

Data encryption considerations

This topic discusses encryption and InfoSphere CDC in two major contexts:
- Data transmission
- Data storage

Data storage encryption

The majority of InfoSphere CDC metadata is stored in an embedded database in your InfoSphere CDC installation directory. Data such as user names and passwords is encrypted. A small portion of InfoSphere CDC metadata is stored in your database and is not encrypted.

InfoSphere CDC trace information is not encrypted, although user-sensitive information such as user names and passwords is removed from traces.

If you are interested in higher levels of encryption for stored data, you can deploy an encrypted file system.

Data transmission encryption

InfoSphere CDC uses the DES encryption standard to encrypt user names and passwords. All other data is obfuscated using a simple obfuscation algorithm which prevents data from being transmitted as is. If you are interested in higher levels of encryption, you can use externally secured encryption channels such as SSH tunneling or use hardware VPN endpoints. Due to the volume of data transmitted by the product and the computational requirements of encryption, hardware-based encryption is optimal for most deployments of InfoSphere CDC.

Note: The IBM z/OS operating system has tools available that will encrypt your data. For more information, see your z/OS system administrator and refer to your InfoSphere CDC for z/OS documentation.
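As one illustration of the SSH tunneling option mentioned above, the following Python sketch builds an OpenSSH local port-forwarding command. It relies only on the standard `ssh` client options `-N` (do not run a remote command) and `-L` (forward a local port). The function name, host names, user name, and port numbers are hypothetical, and whether a particular InfoSphere CDC deployment can be directed through such a tunnel is an assumption to verify in your environment.

```python
import shlex
from typing import List


def ssh_tunnel_command(
    local_port: int,
    remote_host: str,
    remote_port: int,
    gateway: str,
    user: str,
) -> List[str]:
    """Build an argument list for an SSH local port forward.

    Traffic sent to localhost:local_port is carried over SSH to the
    gateway, which then connects to remote_host:remote_port.
    """
    forward = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{gateway}"]


def as_shell_string(args: List[str]) -> str:
    """Render the argument list as a shell-quoted string for display."""
    return shlex.join(args)
```

For example, `as_shell_string(ssh_tunnel_command(11001, "localhost", 11001, "target.example.com", "cdcuser"))` yields `ssh -N -L 11001:localhost:11001 cdcuser@target.example.com`. Given the data volumes noted above, a software tunnel like this is likely better suited to testing than to high-throughput production use.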


What to do next

Now that you have assessed the resource requirements of InfoSphere CDC, you have a better understanding of the areas of impact for installing, configuring and running the software.

After making any necessary adjustments to your environment, you will be ready to install InfoSphere CDC.

See the appropriate guide to learn the tasks required to install and configure InfoSphere CDC:
- IBM InfoSphere Change Data Capture Management Console - Access Server and Management Console Installation Guide
- IBM InfoSphere Change Data Capture Management Console - API and Commands Reference
- IBM InfoSphere Change Data Capture Management Console - Administration Guide
- IBM InfoSphere Change Data Capture Event Server End-User Documentation
- IBM InfoSphere Change Data Capture for DB2 for Linux, UNIX and Windows End-User Documentation
- IBM InfoSphere Change Data Capture for Informix End-User Documentation
- IBM InfoSphere Change Data Capture for InfoSphere DataStage End-User Documentation
- IBM InfoSphere Change Data Capture for Microsoft SQL Server End-User Documentation
- IBM InfoSphere Change Data Capture for Netezza databases End-User Documentation
- IBM InfoSphere Change Data Capture for Oracle databases End-User Documentation
- IBM InfoSphere Change Data Capture for Oracle databases (trigger) End-User Documentation
- IBM InfoSphere Change Data Capture for Sybase databases End-User Documentation
- IBM InfoSphere Change Data Capture for Teradata End-User Documentation
- IBM InfoSphere Change Data Capture for z/OS End-User Documentation
- IBM InfoSphere Change Data Capture for z/OS Program Directory
- IBM InfoSphere Classic Change Data Capture for z/OS End-User Documentation


Troubleshooting and contacting IBM Support

The following support page contains the latest troubleshooting information and details on how to open a service request with IBM Support:
- http://www.ibm.com/software/data/infosphere/support/change-data-capture/

For contact information in your region:
- http://www.ibm.com/planetwide/


Notices

This information was developed for products and services offered in Canada.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Canada Limited
Office of the Lab Director
8200 Warden Avenue
Markham, Ontario
L6G 1C7
CANADA

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this information and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement, or any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

All IBM prices shown are IBM's suggested retail prices, are current and are subject to change without notice. Dealer prices may vary.

This information is for planning purposes only. The information herein is subject to change before the products described become available.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows:

© (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. © Copyright IBM Corp. _enter the year or years_. All rights reserved.

If you are viewing this information softcopy, the photographs and color illustrations may not appear.

Trademarks

IBM, the IBM logo, and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Netezza is a trademark or registered trademark of Netezza Corporation, an IBM Company.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Other company, product, or service names may be trademarks or service marks of others.
