January 11, 2006 Symantec Corporation IBM Corporation


IBM® DB2® Universal Database™ (DB2 UDB) Version 8 and VERITAS Storage Foundation for DB2

Accessible Standby Database in IBM DB2 UDB HADR Environment Using VERITAS Volume Instant Snapshots

January 11, 2006 Symantec Corporation IBM Corporation


The information contained in this publication does not include any product warranties, and any statements provided in this document should not be interpreted as such. © Copyright IBM Corporation and Symantec Corporation, 2006. All rights reserved. The scripts in this document are for as-is use only without support, guarantee of completeness, or formal distribution by IBM and / or Symantec. They combine standard command line sequences of the used products to improve the readability of the document as well as simplify the steps of the test sequences themselves. The same results can be obtained by executing the atomic commands in the same sequence as in the scripts with the correct user privileges. Integration into the VERITAS Storage Foundation for DB2 or Global Cluster Option / DB2 Agent of VERITAS Cluster Server is planned for a future release.


TRADEMARKS

The following terms are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both: IBM, the IBM logo, AIX, DB2, DB2 Universal Database, iSeries, zSeries

The following terms are trademarks or registered trademarks of Symantec Corporation or its affiliates in the United States, or other countries, or both: Symantec, the Symantec Logo, VERITAS, VERITAS Storage Foundation, VERITAS Volume Manager, Quick I/O

Windows is a trademark of Microsoft Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.

The furnishing of this document does not imply giving license to any IBM or Symantec patents. References in this document to IBM products, programs, or services do not imply that IBM intends to make these available in all countries in which IBM operates.

All rights reserved


ABOUT THE AUTHORS

Ulrich Maurus

Symantec Corporation Server and Storage Management Group

Ulrich Maurus joined Symantec as part of the merger with VERITAS Software in summer 2005. He is a principal software engineer and has worked for more than ten years in the same organization. Before assuming his latest role in Engineering, Ulrich was Senior Product Specialist for clustering and replication, Product Manager for SAP Edition, Engineer for VERITAS Cluster Server development, and Senior Consultant for OpenVision and VERITAS Germany.

Nuzio Ruffolo

IBM Corporation DB2 UDB System Test, Toronto Laboratory, IBM Canada

Nuzio Ruffolo has been with IBM for twelve years as part of the DB2 Universal Database System Test Organization (DB2 UDB SVT). Nuzio is responsible for testing DB2 UDB in a complex test environment that includes Linux®, UNIX®, Windows®, iSeries™ and zSeries® servers. DB2 UDB testing is performed in this environment based on expected production environment requirements – millions of transactions and thousands of operational tasks repeatedly executed during the test cycle. On this team, Nuzio has held various team lead and management positions. Nuzio represents the entire DB2 UDB test organization on the IBM test council and has presented various testing related topics at the annual Centre for Advanced Studies Conference (http://cas.ibm.com). Recently Nuzio has been responsible for testing DB2 UDB in high-availability environments, and currently manages a system test team focused on testing DB2 UDB in complex customer-like environments.


Table of Contents

Abstract
Summary of Requirements
Overview of the Software Components
    DB2 Universal Database V8.2 for Linux, UNIX, and Windows
    VERITAS Storage Foundation for DB2
        VERITAS Volume Manager
DB2 and Hardware Configuration
    Storage Configuration
    DB2 Layout
    Test Environment Details
Snapshot Test Runs
    Preparing the Standby Volumes for Space Optimized Snapshots
        Volume Snapshot and Database Activation on the Standby
    Database State After Snapshots
Appendix A: Scripts to Administer Snapshots
    Take a Snapshot
    Description File for Snap
    Command File to Enable Snap Mount by Database Administrator
    Utility to Enable umount for Database Administrator
    Script to Remove a Complete Snapshot for Another Update
Appendix B: Scripts to Query the HADR Status


ABSTRACT

In today’s global economy there are various business drivers to maintain a second copy of a database at a second, remote location. The need to access data at any time is critical, and thus planned and unplanned outages must be minimized. Depending on the line of business, there may also be mandatory legal requirements.

An integrated feature in the IBM DB2® Universal Database™ V8.2 product with High Availability Disaster Recovery (HADR) provides an easy-to-use solution. Data is replicated from a source database, called the primary, to a target database, called the standby, by automatically shipping database log records from the primary to the standby. Protection for both partial and complete site failures is provided using this technology (e.g., hardware issues, software issues, disaster scenarios, etc.).

The current implementation of HADR does not allow reads on the standby database (i.e., clients cannot connect to this database). A fully accessible standby database would provide the business with an opportunity for off-host processing (e.g., data mining) while not impacting users currently accessing the primary database.

This paper describes the implementation and design of a solution for providing a fully accessible point-in-time copy of the standby database in a DB2 UDB HADR environment. Although this copy of the standby database is fully accessible, it is separated from the HADR pair so updates made to it are not automatically applied to the main (HADR) pair of database copies.

The test configuration was implemented on a Sun Solaris 9 platform, but the associated software packages from IBM and Symantec with identical functionality are available on IBM AIX® and Linux platforms as well. Moreover, the additional scripts that are discussed in this paper are written in Perl and as such are easily portable between open systems. Since Symantec delivers a complete Perl distribution as part of the infrastructure for the Storage Foundation, access to the Perl command processor is guaranteed for the discussed environment.

KEY BENEFIT: A fully accessible point-in-time copy of the HADR standby database is possible with VERITAS Storage Foundation for DB2.

Figure 1: Solution Overview


The solution leverages features in VERITAS Storage Foundation for DB2. Release 4.0 of VERITAS Volume Manager™ (VM), which is an integral part of VERITAS Storage Foundation for DB2, introduced a new feature called instant, space optimized volume snapshot. A single point-in-time copy of the storage can now be created with only a small amount of additional disk space, because storage is required only for changed blocks.

KEY BENEFIT: For typical workloads, a small amount of disk space is required for the copy of the standby database. With the introduction of instant, space optimized volume snapshots, only storage for changed blocks is required.

This paper will discuss how to activate a point-in-time copy of the standby database without an impact to the active HADR databases. Typically, a DB2 WRITE SUSPEND (which blocks database write activity) is required for the duration of the snapshot creation. The paper will outline a solution using the instant, space optimized volume snapshot feature where a WRITE SUSPEND on the primary database is not required during the snapshot procedure to produce a clean copy of the standby database.

KEY BENEFIT: A WRITE SUSPEND on the PRIMARY is not required during the snapshot procedure to produce a clean copy of the standby database.


SUMMARY OF REQUIREMENTS

An understanding of the business drivers and terminology of high availability computing systems is required in order to obtain maximum benefit from the information contained in this paper. In addition, familiarity with the process of installing and configuring DB2 UDB, along with a working knowledge of VERITAS Volume Manager, VERITAS File System, and TCP/IP networking protocols, is assumed.

Minimum software versions

1. IBM DB2 Universal Database V8.2
2. Release 4.0 of VERITAS Storage Foundation for DB2

Solution requirements

To avoid issuing a WRITE SUSPEND on the primary database, the snapshot must be created for all volumes at the same time. In other words, a single, atomic, snapshot command must be used to produce a clean copy of the standby database.

KEY REQUIREMENT: A single snapshot command for all volumes must be used to produce a clean copy of the standby database.
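One way to satisfy this requirement is to pass every volume to a single vxsnap make invocation instead of looping over the volumes. The sketch below only composes and prints such a command (a dry run). The disk group db2dg, the cache object db2cache, and the SNAP00 prefix come from this paper; the volume names db01vol to db03vol and the SNAP00-<volume> naming of the snapshot volumes are hypothetical, and the source=/newvol=/cache= tuple form should be checked against the vxsnap manual page of the installed release.

```shell
#!/bin/sh
# Sketch: compose ONE vxsnap invocation that covers every volume, so that all
# snapshots share a single point in time. Dry run: the command is printed,
# not executed.
DG=db2dg          # disk group name used in this paper
CACHE=db2cache    # shared cache object name used in this paper
PREFIX=SNAP00     # snapshot prefix used in this paper

build_snap_cmd() {
    # Arguments: names of the source volumes (hypothetical in the call below).
    cmd="vxsnap -g $DG make"
    for vol in "$@"; do
        # One source/newvol/cache triplet per volume, all on the same command.
        cmd="$cmd source=$vol/newvol=$PREFIX-$vol/cache=$CACHE"
    done
    printf '%s\n' "$cmd"
}

build_snap_cmd db01vol db02vol db03vol
```

Running one vxsnap command per volume in a loop would give each snapshot its own point in time, which is exactly what the requirement rules out.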

Additional storage will be required for the snapshot copy of the standby database. The amount of storage required for the copy of the standby database will vary with the nature of the workload. If 20% of your database is volatile, then up to 20% of additional storage will be required on the standby system to hold changed blocks.
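As a rough worked example of this rule of thumb, the sketch below estimates an upper bound for the changed-block storage. The 15 GB figure is the initial size of the test database described later in this paper; the 20% volatility is the illustrative value from the paragraph above, not a measured result, and the function name cache_size_mb is hypothetical.

```shell
#!/bin/sh
# Sketch: upper bound on the storage needed for changed blocks, assuming the
# volatile fraction of the database translates directly into extra storage.
cache_size_mb() {
    db_size_mb=$1      # size of the database in MB
    volatile_pct=$2    # estimated percentage of the database that changes
    # Integer arithmetic, rounded up to the next whole MB.
    echo $(( (db_size_mb * volatile_pct + 99) / 100 ))
}

# 15 GB (15360 MB) database with an assumed 20% volatility: prints 3072
cache_size_mb 15360 20
```

The result is only a starting point for sizing; a long-lived snapshot on a busy standby accumulates more changed blocks than a short-lived one.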


OVERVIEW OF THE SOFTWARE COMPONENTS

This section gives a short introduction to DB2 UDB design and the VERITAS™ software components. For a more in-depth overview of software, refer to the specific product’s documentation.

DB2 UNIVERSAL DATABASE V8.2 FOR LINUX, UNIX, AND WINDOWS

DB2 UDB V8.2 marks the next stage in the evolution of the relational database, by providing enhancements in the areas of performance, availability, scalability, and manageability. DB2 UDB is the database of choice for business-critical solutions such as e-business, business intelligence (BI), content management, enterprise resource planning (ERP), and customer relationship management (CRM).

In today’s global business environment, the ability to access data at any time is critical. DB2 UDB’s built-in capabilities to handle planned and unplanned outages ensure business applications are available whenever needed. Whether switching to a standby database server if an unexpected database failure occurs, or carrying out online maintenance (such as an online reorg), DB2 makes sure all business applications remain available. Online utilities, such as index rebuild, index create, and table load, as well as configuration parameters that can be changed without stopping the database, deliver improved performance and high availability.

DB2 UDB Version 8.2 introduces an integrated high availability solution through two new features: High Availability Disaster Recovery (HADR) and Automatic Client Reroute (ACR). HADR replicates data from a source database, called the primary, to a target database, called the standby, by shipping database log records from the primary to the standby and replaying the logged operations there. It is tightly coupled with DB2 logging and recovery. HADR provides protection for both partial and complete site failures. Combined with ACR capabilities, transparency is provided to the application regardless of the failure type (hardware or software issues or disaster scenarios). HADR provides multiple levels of protection through three synchronization modes (synchronous, near synchronous, and asynchronous), allowing flexibility in the environment.

For more information and a complete list of the features included in DB2 UDB, refer to the product documentation, available online at: http://www.ibm.com/software/data/db2/udb/support/manualsv8.html. More information about HADR and ACR can be found in Chapter 7 of the Data Recovery and High Availability Guide and Reference, available online at: ftp://ftp.software.ibm.com/ps/products/db2/info/vr82/pdf/en_US/db2hae81.pdf.

VERITAS STORAGE FOUNDATION FOR DB2

VERITAS Storage Foundation for DB2 is an integrated suite of data, storage, and system management technologies that optimize performance, availability, and manageability of DB2 UDB databases. In addition to DB2 UDB-specific utilities for simplified administration, Storage Foundation leverages the VERITAS core technologies: Volume Manager, File System (with Quick I/O™ and Cached Quick I/O), and Cluster Server. Further details of these solutions are available from the Symantec web site (www.symantec.com) and are discussed in the associated Symantec documentation available as part of the software distribution.

VERITAS Volume Manager

VERITAS Volume Manager is a storage virtualization tool that supports Solaris, HP-UX, Linux, AIX, and Windows operating systems, and allows the management of physical disks as logical devices, called volumes. A volume is a logical device that appears to data management systems as a physical disk partition device. By allowing volumes to span multiple disks, VERITAS Volume Manager enables the management of virtual storage pools rather than actual physical disks. By using VERITAS Volume Manager as an abstraction layer that virtualizes storage and makes managing that storage easier, users overcome the physical restrictions imposed by hardware disk devices.


Through the support of RAID, VERITAS Volume Manager protects against disk and hardware failure. Additionally, Volume Manager provides features that enable fault tolerance and fast recovery from disk failure. VERITAS Volume Manager also provides easy-to-use online disk storage management for tasks such as dynamically configuring disk storage while the system is active, ensuring that data remains available.

VERITAS Volume Manager 4.0 introduced a new feature called instant, space optimized snapshot, which greatly improves the ability to perform off-host application processing. The traditional third-mirror split/snapshot techniques of older Volume Manager releases had constraints on storage space and time, because a complete copy of the data set had to be synchronized before the snapshot could take place. The new release supports the creation of almost instantaneous snapshots.

Since writes to all volumes associated with a single snapshot command are frozen at a single point in time, a WRITE SUSPEND is not required on the primary database. VERITAS Volume Manager gets the complete I/O request from the application or file system layer and can guarantee that the freeze is aligned with page sizes used by the write requester. Fragmentation of data according to hardware constraints such as disk block sizes may happen at lower driver levels such as I/O controllers or RAID arrays, but this does not affect the integrity or the single point-in-time nature of VERITAS Volume Manager snapshots.

With instant, space optimized snapshot, the space requirements for the changed blocks of all data volumes being snapped are consolidated in a single data volume called a shared cache object. Even multiple snapshots of the same set of volumes, from different points in time, can be covered by a single shared cache object. This cache object needs pre-allocated space and will be empty before a snap is taken. Instant, space optimized snapshot only requires sufficient disk space to store copies of the changed blocks from the original data set. Unchanged data blocks are not copied. With instant, space optimized snapshots, multiple read/write copies of the data can be created without a large investment in additional storage.


DB2 AND HARDWARE CONFIGURATION

The tests discussed below were executed on two Sun E450 servers. The HADR network link was implemented using crossover Ethernet cable connected to separate 100 BaseT ports. This type of network link is not representative of wide area DR solutions; however, the configuration was useful to separate the HADR IP traffic from that of DB2 clients accessing the database. One server was attached to an IBM Enterprise Storage Server (Shark) utilizing 100 GB of disk space separated into 100 logical units (LUNs) of 1 GB each. The other server was connected to a Sun A5200 with 22 individual drives, each with 36 GB of storage.

Figure 2: Hardware Setup


STORAGE CONFIGURATION

Each array’s disks were added to a local disk group named db2dg. Because Volume Manager 4.0 no longer requires a disk group named rootdg, db2dg was the sole disk group on each server. For the database container and control files, 16 volumes were created on each server inside the db2dg disk group. The same names were chosen for matching volumes, to ease administration. All volumes at the primary site were then initialized with the VERITAS file system. Quick I/O files were created for the database container space, and these were accessed by DB2 as a database managed space (DMS).

One volume inside each disk group was used for the snapshots. To enable instant, multiple snapshots, a shared data cache volume was created as a striped RAID-0 device. Good write performance is important for the cache, and separating the data cache storage from the data volumes is a good practice. Because the snapshots were used only temporarily on the standby server and performed exclusively on the original volumes, higher availability was not needed. It was, therefore, unnecessary to mirror the shared cache volume of the db2dg disk group located in the A5200 disk enclosure.

The database control and log files reside on a single VM volume and VERITAS file system sized to contain a 24-hour run with a high-transaction-volume DB2 workload. The storage for the database containers was placed on separate volumes, each with a VERITAS file system and pre-allocated QIO files. Using a total of 14 separate volumes for a single database allowed for scalability testing of the snapshot environment. In particular, data consistency is harder to obtain across multiple snapshot volumes than with all data on a single file system, which would probably have been the better choice for easy administration of the small test database. However, the finer granularity of this storage configuration better matches that of large enterprise DB2 database implementations, where performance optimization down to the disk level is a requirement.

DB2 LAYOUT

A database layout was created to support a high-transaction-volume DB2 workload. This design is commonly used as part of the DB2 UDB system test cycle. Four buffer pools, with 4 KB, 8 KB, 16 KB, and 32 KB page sizes, were created. Each buffer pool was allocated 16 MB, totaling 64 MB of buffer pool space. The database layout consisted of numerous DMS table spaces using file-type containers of various page sizes.

• Four temporary table spaces of 4 KB, 8 KB, 16 KB, and 32 KB page sizes totaling 20 GB
• Four regular table spaces of 4 KB, 8 KB, 16 KB, and 32 KB page sizes to store data totaling 20 GB
• Four regular table spaces of 4 KB, 8 KB, 16 KB, and 32 KB page sizes to store index data totaling 8 GB
• One long/large table space of 4 KB page size to store LOB data columns totaling 40 GB

A variation of the testing was performed on a layout consisting solely of SMS table spaces. A random data generator seeded the tables with data, creating a database with a 15 GB initial size for use in testing.


TEST ENVIRONMENT DETAILS

Below, the specific details of the test environment are outlined and are referenced in the paper.

Primary Machine
    Hostname: sun-ha3.torolab.ibm.com
    DB2 instance: svtdbm
    Database name: absolut

Standby Machine
    Hostname: sun-ha4.torolab.ibm.com
    DB2 instance: svtdbm
    Database name: absolut
    DB2 instance used to activate the snapshot copy of the standby database: svtdbm2

Note that since svtdbm is used on both machines, the paper will refer to svtdbm on the primary machine as DB2Admin_PRI and svtdbm on the standby machine as DB2Admin_STB. For consistency, svtdbm2 on the standby will be referred to as DB2Admin_SNAP.


SNAPSHOT TEST RUNS

For the test runs, DB2 UDB was installed, and identical DB2 instance environments were created on both servers on local, non-replicated storage. Note that using the same instance name on the primary and standby is recommended for HADR but not mandatory. Then separate disk groups and data volumes for the test database were created on each host. Again, it is good practice to use the same naming scheme and volume sizes on each machine. Then the test database was created on the dedicated primary.

There are several ways to seed an HADR environment from the primary to the standby. One way is a complete database backup of the primary database followed by a database restore on the standby system. To start HADR, the necessary DB2 attributes have to be set on the databases. Some restrictions to note here:

1. The servers in the HADR pair must be running the same operating system version.
2. The DB2 UDB versions must match as well.
3. The DB2 instances must be at the same bit level.
4. Circular logging is not supported in an HADR environment. It is assumed that the appropriate steps are taken prior to executing the steps below to establish the appropriate logging for HADR, such as:

    db2 -v UPDATE DB CFG FOR absolut USING LOGRETAIN ON
    db2 -v BACKUP DATABASE absolut

LOGRETAIN is not a common production configuration but was used for simplicity in the testing of the solution. The user can choose to archive their log files for the primary database to another location. More information about HADR configuration parameters can be found in Chapter 7 of the Data Recovery and High Availability Guide and Reference, available online at: ftp://ftp.software.ibm.com/ps/products/db2/info/vr82/pdf/en_US/db2hae81.pdf.

On DB2Admin_STB

    db2 -v UPDATE DB CFG FOR absolut USING HADR_LOCAL_HOST sun-ha4.torolab.ibm.com
    db2 -v UPDATE DB CFG FOR absolut USING HADR_LOCAL_SVC 60123
    db2 -v UPDATE DB CFG FOR absolut USING HADR_REMOTE_HOST sun-ha3.torolab.ibm.com
    db2 -v UPDATE DB CFG FOR absolut USING HADR_REMOTE_SVC 60123
    db2 -v UPDATE DB CFG FOR absolut USING HADR_REMOTE_INST svtdbm
    db2 -v UPDATE DB CFG FOR absolut USING HADR_SYNCMODE SYNC
    db2 -v UPDATE DB CFG FOR absolut USING HADR_TIMEOUT 120
    db2 -v UPDATE DB CFG FOR absolut USING LOGINDEXBUILD ON
    db2 -v UPDATE DB CFG FOR absolut USING INDEXREC RESTART

On DB2Admin_PRI

    db2 -v UPDATE DB CFG FOR absolut USING HADR_LOCAL_HOST sun-ha3.torolab.ibm.com
    db2 -v UPDATE DB CFG FOR absolut USING HADR_LOCAL_SVC 60123
    db2 -v UPDATE DB CFG FOR absolut USING HADR_REMOTE_HOST sun-ha4.torolab.ibm.com
    db2 -v UPDATE DB CFG FOR absolut USING HADR_REMOTE_SVC 60123
    db2 -v UPDATE DB CFG FOR absolut USING HADR_REMOTE_INST svtdbm
    db2 -v UPDATE DB CFG FOR absolut USING HADR_SYNCMODE SYNC
    db2 -v UPDATE DB CFG FOR absolut USING HADR_TIMEOUT 120
    db2 -v UPDATE DB CFG FOR absolut USING LOGINDEXBUILD ON
    db2 -v UPDATE DB CFG FOR absolut USING INDEXREC RESTART

Although setting LOGINDEXBUILD to ON is not a mandatory operation to establish the HADR pair, it is the recommended setting. When LOGINDEXBUILD is ON, the complete information for index creation, recreation, and reorganization is logged. This means that index creation will take longer on the primary system and more log space will be required. The advantage is that these logged operations are replicated to the standby system and the indexes are rebuilt during HADR log replay; thus the indexes are immediately available after a takeover operation. Also, setting INDEXREC to RESTART is not mandatory but is the recommended setting. This will cause invalid indexes to be rebuilt after the takeover completes.

To start HADR, issue the following commands. Note that the HADR standby is to be started first.

    DB2Admin_STB> db2start
    DB2Admin_STB> db2 start hadr on database absolut as standby
    DB2Admin_PRI> db2start
    DB2Admin_PRI> db2 start hadr on database absolut as primary

The following methods can then be used to ensure the HADR pair is established.

1. Look in the db2diag.log for messages indicating the pair is in PEER state. For instance,

    DB2Admin_PRI> /export/homevx/svtdbm/sqllib/db2dump > tail -5 db2diag.log
    2005-01-31-11.16.44.508996-300 E15295439A321 LEVEL: Event
    PID : 8506 TID : 1 PROC : db2hadrp (ABSOLUT) 0
    INSTANCE: svtdbm NODE : 000
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10000
    CHANGE : HADR state set to P-Peer (was P-NearlyPeer)

2. Run a GET SNAPSHOT command and look at the HADR status, for instance using the script described in Appendix B: Scripts to Query the HADR Status

    DB2Admin_PRI> hadrstat.pl
    HADR Status
    Role = Primary
    State = Peer
    Synchronization mode = Sync
    Connection status = Connected, 01/31/2005 11:16:43.037325
    Heartbeats missed = 0
    Local host = sun-ha4.torolab.ibm.com
    Local service = 60123
    Remote host = sun-ha3.torolab.ibm.com
    Remote service = 60123
    Remote instance = svtdbm
    timeout (seconds) = 120
    Primary log position(file, page, LSN) = S0000002.LOG, 0, 0000000011940000
    Standby log position(file, page, LSN) = S0000002.LOG, 0, 0000000011940000
    Log gap running average(bytes) = 0
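The status check can also be automated. The following sketch reads HADR status text on standard input and succeeds only when the state is Peer and the connection status starts with Connected. It assumes one "Name = value" pair per input line, as DB2 prints them in the HADR Status section of the snapshot output; the function name hadr_is_peer is hypothetical, and the parsing has not been validated against every DB2 release.

```shell
#!/bin/sh
# Sketch: exit status 0 only if the HADR status text on stdin reports
# State = Peer and a connection status beginning with "Connected".
hadr_is_peer() {
    awk -F' *= *' '
        { gsub(/^[ \t]+/, "", $1) }                       # drop leading indent
        $1 == "State" && $2 ~ /^Peer[ \t]*$/          { peer = 1 }
        $1 == "Connection status" && $2 ~ /^Connected/ { conn = 1 }
        END { exit !(peer && conn) }
    '
}

# Example with three of the status lines shown in this paper.
hadr_is_peer <<'EOF' && echo "HADR pair is in peer state"
Role = Primary
State = Peer
Connection status = Connected, 01/31/2005 11:16:43.037325
EOF
```

In practice the input would come from a pipeline such as `db2 get snapshot for database on absolut | hadr_is_peer`, run as the instance owner.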


PREPARING THE STANDBY VOLUMES FOR SPACE OPTIMIZED SNAPSHOTS

For this part of the test sequence, VERITAS Storage Foundation for DB2 4.0 was used on the Solaris system to reduce the storage demands of the snapshot; 4.0 is the minimum version of Volume Manager that supports instant, space optimized volume snapshots. After configuring and loading the DB2 test database and starting HADR, the data volumes for the snapshot were initialized on the standby server. The preparation of storage for volume snapshots can be achieved using the Volume Manager CLI (see the example command syntax below) or the VEA GUI. Each volume used for the DB2 database must be initialized once: the Data Change Object (DCO), which tracks volume updates for each data region, can be used by multiple snapshots concurrently.

    DB2Admin_STB> vxsnap -g db2dg prepare <volume_name>

To support multiple, space optimized snapshots, a shared cache object was created in each Volume Manager disk group:

    DB2Admin_STB> vxassist -g db2dg make db2cach-vol <size_of_2_disks>
    DB2Admin_STB> vxcache -g db2dg att db2cach-vol db2cache

Volume Snapshot and Database Activation on the Standby

Because the current HADR implementation does not support access to the standby database, another copy of the DB2 database has to be created. The tests detailed here used the new instant, space optimized snapshot feature of Volume Manager Release 4.0, which allows the creation and use of a complete copy of data volumes in seconds. Because Volume Manager commands can be executed only by the root user (the UNIX account with user ID 0), additional scripts were created so that the DB2 UDB instance owner can administer and activate snapshots (see Appendix A: Scripts to Administer Snapshots).

The desired way to create an instant snapshot is to do so without interfering with the active database. Instant, space optimized snapshots were created at an arbitrary point in time on the standby without a WRITE SUSPEND on the primary database. During testing, the volumes created by the snapshot command were always consistent. The file systems of the snapshot copies were in a clean state and did not need recovery. In addition to quiescing the I/O to all volumes being snapped at a single point in time, the VERITAS Volume Manager snapshot also flushes file system buffers if it detects VxFS as the contents of the volumes.

The act of creating the snapshot was separated from mounting the associated file systems. With separate scripts, multiple snapshots can be created, but only one at a time can be activated (mounted).

    DB2Admin_STB> snap00.pl
    DB2Admin_STB> snap00_mnt.pl

To make the snapshot usable with an alternate instance, use the following DB2 commands:

    DB2Admin_SNAP> db2relocatedb -f reloc_absolut
    DB2Admin_SNAP> db2start
    DB2Admin_SNAP> db2 stop hadr on database absolut
    DB2Admin_SNAP> db2 rollforward database absolut to end of logs and stop
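Each of these activation steps depends on the one before it, so a wrapper that stops at the first failure can be convenient. The sketch below is a hypothetical helper, not one of the scripts used in the tests. By default it only prints the commands (DRY_RUN=1); setting DRY_RUN=0 and running it as the snapshot instance owner would execute them.

```shell
#!/bin/sh
# Sketch: run the activation commands for the snapshot copy in order and
# stop at the first failure. The function names are hypothetical.
step() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"    # dry run: show the command only
    else
        "$@"                  # real run: execute the command
    fi
}

activate_snapshot_copy() {
    step db2relocatedb -f reloc_absolut &&
    step db2start &&
    step db2 stop hadr on database absolut &&
    step db2 rollforward database absolut to end of logs and stop
}

activate_snapshot_copy
```

Chaining the steps with && means a failed db2relocatedb run is never followed by a rollforward against a database that was not relocated.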


The file reloc_absolut used by the db2relocatedb contains the configuration information necessary for relocating the database: DB_NAME=absolut DB_PATH=/db03/,/SNAP00-db03/ INSTANCE=svtdbm,svtdbm2 The database snapshot was reassigned to a different instance owner on the standby, since the original instance was still used in the HADR pair. The INSTANCE field maps the new database to a different instance (from svtdbm to svtdbm2) on the standby system. Since separate instances are being used, the same database name can be used (but is not mandatory) as indicated by the DBNAME parameter. The DBPATH parameter indicates the new location of the standby database copy (/SNAP00-db03/). The /db03/ file system is the database path of the original standby. More information about db2relocatedb can be found in Chapter 1 of the Command Reference, available online at: ftp://ftp.software.ibm.com/ps/products/db2/info/vr82/pdf/en_US/db2n0e81.pdf. DATABASE STATE AFTER SNAPSHOTS Each time the DB2 database was started, the db2diag.log files were scanned for error messages regarding data inconsistencies. This was done to prove that the used methodology was correct, but is not necessary to activate the snapshot. Other actions were taken to validate the copy of the standby database, including:

1. Running db2dart on the database.
2. Running “SELECT COUNT(*)” and “SELECT *” on every table in the database.
3. Executing a high-transaction-volume DB2 workload on the database. This step was not necessary, but it demonstrates that the copy of the standby database is fully accessible, since the workload included INSERT, UPDATE, DELETE, and SELECT statements.
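The table scan in step 2 lends itself to generated statements. The following is a minimal Python sketch, not part of the original test scripts, that builds the two queries for each table. The function name and the table names are hypothetical; the list of tables is assumed to be supplied by the caller, for example from the database catalog.

```python
def validation_statements(tables):
    """Return a COUNT(*) query and a full SELECT for each table name."""
    statements = []
    for table in tables:
        statements.append(f"SELECT COUNT(*) FROM {table};")
        statements.append(f"SELECT * FROM {table};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for stmt in validation_statements(["SVT.ORDERS", "SVT.CUSTOMERS"]):
        print(stmt)
```

The generated statements could be written to a file and executed with the DB2 command line processor against the relocated database.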

For the arbitrary point-in-time snapshots, the database was always in a roll-forward recoverable state, similar to the state after a system crash. The standard DB2 crash recovery procedures always opened the database in a consistent state for immediate user access. This illustrates the robustness of the DB2 recovery design and confirms that the Volume Snapshot mechanism did not result in incomplete I/O page writes. For multiple volumes, all snapshots were taken at exactly the same point in time, preserving consistency among the various database containers and the DB2 archive logs.

Taking the snapshot copy of the standby database required neither a WRITE SUSPEND on the primary nor a disconnect of the HADR pair. The snapshot was taken while the HADR link was active and a high-transaction-volume DB2 workload was executing against the primary database.

Because the complete update I/O stream from the database writer processes has to pass through the VxVM daemon, a temporary freeze of I/O activity makes even a snapshot spanning multiple volumes and file systems possible. During the freeze, internal volume manager buffers are flushed, which guarantees the consistency of the information on disk and thus produces a clean copy of the standby database. Because VxVM does not break a write request into smaller units (as a disk driver might), write requests by DB2 UDB, especially to the active logs, are not partially written.

APPENDIX A: SCRIPTS TO ADMINISTER SNAPSHOTS

Although a single snapshot was used for the tests, with the prefix SNAP00 for all volumes, Volume Manager supports multiple snapshots of a volume at any given point in time. It even supports full-data snapshots and hierarchical snapshots of snapshots. For multiple snapshots, an administration interface would be very desirable, because each snapshot adds its own set of objects and the output of the standard commands quickly becomes cluttered. Nevertheless, the following samples show a possible hierarchy for a multiple-snapshot approach.

For our purposes, a set of four scripts, one for each step of the snapshot procedure, was implemented in separate sub-directories. This was necessary because the setuid/setgid permission, set to make the Perl scripts executable by the DB2 database administrator, prohibited passing information on the command line. All arguments used for privileged external Perl "system" calls have to be written into the script and cannot be obtained externally, for example as input from files or other commands. For more information, see the chapter on security in the Perl documentation at www.perl.org.

In addition to the scripts for creating and mounting a snapshot (snap00.pl and snap00_mnt.pl), two more scripts were implemented. The first, snap00_umnt.pl, unmounts the file systems; the second, snap00_rm.pl, removes the snapshot objects.

Because the snapshot must be created for all volumes at the same time, it is not possible to use separate vxsnap commands for each volume inside a scripted loop. To handle this, a fifth file (snap.desc) was created; it contains the sets of triplets for the “vxsnap make” command and is supplied via the “-d” option. The volume snapshot information can also be passed to vxsnap as part of the command line, but for configurations with many volumes or complex volume names, the operating system’s maximum length for a single command line is likely to be exceeded. In addition, separating the volume information from snap00.pl makes the script easier to read.

TAKE A SNAPSHOT

The contents of the snap00.pl script used to take a snapshot are listed below:

    #!/opt/VRTSperl/bin/perl
    #
    # This script creates an instant, space-optimized snapshot for the volumes
    # defined by the description file $SnapDesc.

    # Secure environment
    $ENV{PATH} = "/usr/bin:/usr/sbin";
    $ENV{ENV} = "";
    $ENV{LANG} = "C";
    $ENV{LD_LIBRARY_PATH} = "/usr/lib";
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    my $VxSnap = "/usr/sbin/vxsnap";
    my $SnapID = "SNAP00";
    my $DGroup = "db2dg";
    my $SnapDesc = "./$SnapID/snap.desc";

    # All volumes are snapped by a single vxsnap call
    my $RetCode = system("$VxSnap -g $DGroup -d $SnapDesc make");
    if ($RetCode != 0) {
        printf("Create of snapshot from description file %s in diskgroup %s returned status %d\n",
               $SnapDesc, $DGroup, $RetCode);
        exit 1;
    }
    printf("Created new snapshot %s for diskgroup %s\n", $SnapID, $DGroup);
    exit 0;

DESCRIPTION FILE FOR SNAP

The contents of the description file snap.desc used for the snapshot are listed below:

    source=db03/new=SNAP00-db03/cache=db2cache
    source=db03log/new=SNAP00-db03log/cache=db2cache
    source=long4k/new=SNAP00-long4k/cache=db2cache
    source=data4k/new=SNAP00-data4k/cache=db2cache
    source=data8k/new=SNAP00-data8k/cache=db2cache
    source=data16k/new=SNAP00-data16k/cache=db2cache
    source=data32k/new=SNAP00-data32k/cache=db2cache
    source=index4k/new=SNAP00-index4k/cache=db2cache
    source=index8k/new=SNAP00-index8k/cache=db2cache
    source=index16k/new=SNAP00-index16k/cache=db2cache
    source=index32k/new=SNAP00-index32k/cache=db2cache
    source=temp4k/new=SNAP00-temp4k/cache=db2cache
    source=temp8k/new=SNAP00-temp8k/cache=db2cache
    source=temp16k/new=SNAP00-temp16k/cache=db2cache
    source=temp32k/new=SNAP00-temp32k/cache=db2cache
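Each line of the description file follows the same source/new/cache pattern, so the file could be generated from a volume list instead of being maintained by hand. The following is a minimal Python sketch under that assumption. The function name is hypothetical, and the defaults simply mirror the SNAP00 prefix and the db2cache cache object used in this paper.

```python
def snap_desc_lines(volumes, snap_id="SNAP00", cache="db2cache"):
    """Build one source/new/cache triplet line per volume."""
    return [f"source={vol}/new={snap_id}-{vol}/cache={cache}" for vol in volumes]


if __name__ == "__main__":
    # A subset of the volumes listed in snap.desc above.
    print("\n".join(snap_desc_lines(["db03", "db03log", "long4k"])))
```

Generating the file for a second snapshot would then only require a different snap_id value, which fits the multiple-snapshot hierarchy described in this appendix.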

COMMAND FILE TO ENABLE SNAP MOUNT BY DATABASE ADMINISTRATOR

The contents of the snap00_mnt.pl script used to mount the snapshot are listed below:

    #!/opt/VRTSperl/bin/perl
    #
    #

    # Secure environment
    $ENV{PATH} = "/usr/bin:/usr/sbin";
    $ENV{LD_LIBRARY_PATH} = "/usr/lib";
    $ENV{ENV} = "";
    $ENV{"LANG"} = "C";
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    my $MountFS = "/usr/sbin/mount";
    my $Fsck = "/usr/sbin/fsck";
    my $SnapID = "SNAP00";
    my $DGroup = "db2dg";
    my $BaseDir = "db03";
    my $OldInst = "svtdbm";
    my $NewInst = "svtdbm2";

    # Mount table: volume, mount point, file system type, mount options
    @MntTable = ();
    push(@MntTable,'db03 /db03 vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'data16k /db03/data16k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'data32k /db03/data32k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'data4k /db03/data4k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'data8k /db03/data8k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'index16k /db03/index16k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'index32k /db03/index32k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'index4k /db03/index4k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'index8k /db03/index8k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'long4k /db03/long4k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'db03log /db03/svtdbm/NODE0000/SQL00001/SQLOGDIR vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'temp16k /db03/temp16k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');

    push(@MntTable,'temp32k /db03/temp32k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'temp4k /db03/temp4k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');
    push(@MntTable,'temp8k /db03/temp8k vxfs rw,suid,delaylog,largefiles,qio,ioerror=mwdisable');

    for (@MntTable) {
        chomp;
        ($Volume,$MntDir,$FSType,$MntOpts) = split;
        # Mount the copy under /SNAP00-db03, the target DB_PATH in reloc_absolut,
        # rather than over the /db03 file systems of the original standby.
        $MntDir =~ s{^/$BaseDir}{/$SnapID-$BaseDir};
        system("$Fsck -F $FSType -y /dev/vx/rdsk/$DGroup/$SnapID-$Volume");
        system("$MountFS -F $FSType -o $MntOpts /dev/vx/dsk/$DGroup/$SnapID-$Volume $MntDir");
    }

    # Hand the copy over to the new instance owner and rename the instance directory
    system("/usr/bin/chown -R $NewInst /${SnapID}-${BaseDir}");
    system("/usr/bin/mv /${SnapID}-${BaseDir}/$OldInst /${SnapID}-${BaseDir}/$NewInst");
    exit 0;

UTILITY TO ENABLE UMOUNT FOR DATABASE ADMINISTRATOR

The contents of the snap00_umnt.pl script used to unmount the snapshot are listed below:

    #!/opt/VRTSperl/bin/perl
    #

    # Secure environment
    $ENV{PATH} = "/usr/bin:/usr/sbin";
    $ENV{LD_LIBRARY_PATH} = "/usr/lib";
    $ENV{ENV} = "";
    $ENV{"LANG"} = "C";
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    my $UMountFS = "/usr/sbin/umount";
    my $SnapID = "SNAP00";
    my $DGroup = "db2dg";

    # Volumes to unmount; db03log and db03 come last
    @MntTable = ();
    push(@MntTable,'temp8k');
    push(@MntTable,'temp4k');
    push(@MntTable,'temp32k');
    push(@MntTable,'temp16k');
    push(@MntTable,'long4k');
    push(@MntTable,'index8k');
    push(@MntTable,'index4k');
    push(@MntTable,'index32k');
    push(@MntTable,'index16k');
    push(@MntTable,'data8k');
    push(@MntTable,'data4k');

    push(@MntTable,'data32k');
    push(@MntTable,'data16k');
    push(@MntTable,'db03log');
    push(@MntTable,'db03');

    for (@MntTable) {
        chomp;
        ($Volume) = split;
        system("$UMountFS /dev/vx/dsk/$DGroup/$SnapID-$Volume");
        $RetCode = $? >> 8;
        if ($RetCode) {
            printf("Unmount for volume %s-%s failed with error code %d.\n", $SnapID, $Volume, $RetCode);
            next;
        }
    }
    exit 0;

SCRIPT TO REMOVE A COMPLETE SNAPSHOT FOR ANOTHER UPDATE

The contents of the snap00_rm.pl script used to remove a complete snapshot are listed below:

    #!/opt/VRTSperl/bin/perl
    #

    # Secure environment
    $ENV{PATH} = "/usr/bin:/usr/sbin";
    $ENV{LD_LIBRARY_PATH} = "/usr/lib";
    $ENV{ENV} = "";
    $ENV{"LANG"} = "C";
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    my $VxSnap = "/usr/sbin/vxsnap";
    my $VxEdit = "/usr/sbin/vxedit";
    my $DiskFree = "/usr/bin/df -n";
    my $SnapID = "SNAP00";
    my $DGroup = "db2dg";

    my @RepVols = ();
    push(@RepVols,'temp8k');
    push(@RepVols,'temp4k');
    push(@RepVols,'temp32k');
    push(@RepVols,'temp16k');
    push(@RepVols,'long4k');
    push(@RepVols,'index8k');
    push(@RepVols,'index4k');
    push(@RepVols,'index32k');
    push(@RepVols,'index16k');
    push(@RepVols,'data8k');

    push(@RepVols,'data4k');
    push(@RepVols,'data32k');
    push(@RepVols,'data16k');
    push(@RepVols,'db03log');
    push(@RepVols,'db03');

    for (@RepVols) {
        chomp;
        my ($Volume) = split;
        # df succeeds (status 0) only if the snapshot volume is still mounted
        $RetCode = system("$DiskFree /dev/vx/dsk/$DGroup/$SnapID-$Volume 1>/dev/null 2>&1");
        unless ( $RetCode ) {
            printf("Volume %s-%s is still mounted - skipping removal of volume.\n", $SnapID, $Volume);
            next;
        }
        # Dissociate the snapshot from its source volume, then remove it
        system("$VxSnap -g $DGroup dis $SnapID-$Volume");
        system("$VxEdit -g $DGroup -f -r rm $SnapID-$Volume");
    }
    exit 0;

APPENDIX B: SCRIPTS TO QUERY THE HADR STATUS

To make it possible to examine the current state of the HADR link, DB2 UDB has extended the output of the "db2 get snapshot" command. Since this listing can span multiple terminal pages, a small utility was developed to extract just the HADR-relevant lines. The contents of the hadrstat.pl script used to obtain the HADR status are listed below:

    #!/opt/VRTSperl/bin/perl
    #
    #

    open(DB2SNAP, "db2 get snapshot for all databases |")
        or die "Cannot run db2 get snapshot: $!\n";
    $Found = 0;
    while (<DB2SNAP>) {
        # Skip everything before the first HADR line
        next unless /^\s+HADR/ || $Found;
        $Found = 1;
        print $_;
        # Stop after the log gap line
        last if /Log gap/;
    }
    close(DB2SNAP);
    exit 0;
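The filtering logic of hadrstat.pl can also be expressed as a small function that is testable without a running DB2 instance. The following Python sketch mirrors the same two patterns (an indented line starting with HADR opens the section, a line containing "Log gap" closes it). The function name is hypothetical, and the sample lines are illustrative placeholders, not actual DB2 snapshot output.

```python
import re


def extract_hadr_lines(lines):
    """Return the lines from the first indented HADR line through the 'Log gap' line."""
    selected = []
    found = False
    for line in lines:
        # Skip everything before the first line matching the HADR pattern.
        if not found and not re.match(r"\s+HADR", line):
            continue
        found = True
        selected.append(line)
        # The line containing "Log gap" is the last one of interest.
        if "Log gap" in line:
            break
    return selected


if __name__ == "__main__":
    # Illustrative input only; real snapshot output differs in detail.
    sample = [
        "Database name = ABSOLUT",
        "  HADR Status",
        "    Role = Standby",
        "    Log gap running average (bytes) = 0",
        "Memory usage for database:",
    ]
    print("\n".join(extract_hadr_lines(sample)))
```

Keeping the filter separate from the command invocation allows the pattern to be adjusted and checked against captured output before it is used on a live system.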