Disaster Recovery of Oracle

Database Standard Edition (SE) and Standard Edition One (SE1) on Dell EqualLogic PS Series iSCSI Storage Systems

Database Solutions Engineering

Wendy Chen and Balamurugan B, Dell Product Group

Thomas Kopec, Oracle

April 2011


THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

© 2011 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.

Dell, the DELL logo, the DELL badge, PowerEdge, PowerConnect, PowerVault, and EqualLogic are trademarks of Dell Inc. Oracle and the Oracle Platinum Partner logo are registered trademarks of Oracle Corporation and/or its affiliates. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.

April 2011


Acknowledgements

The authors would like to thank the following people for their contributions during the development and review of this white paper:

Sherry Keller
James Park
Sudhansu Sekhar
Krishna Kapa
John Ellis


Contents

Executive Summary
Introduction
Dell Solutions for Oracle Database
Storage-Based Remote Data Replication Technologies
Dell EqualLogic Replication Technology
    Overview
    Volume Collection
    Crash-Consistent Images
Oracle Support for Third Party Snapshot-Based Technology
    My Oracle Support Note 604683.1
        Backup Operations
        Restore and Recovery Operations
        Integrating My Oracle Support Note 604683.1 with Dell EqualLogic Replication
Oracle Disaster Recovery using Dell EqualLogic Replication
    Test Configuration
    Preparing the secondary RAC database
    Creating Replica from a Primary Storage Group
        Creating Volume Collection
        Configuring Replication Partner
        Creating Replica on Volume Collection
    Failing Over Data Volumes
        Promoting Replica to Recovery Volume in the Secondary Storage Group
    Accessing Recovery Volumes from Standby Servers
        Shutting Down Grid Infrastructure on Secondary Servers
        Accessing Recovery Volumes from Linux Operating System
    Starting up a Standby Database
        Starting up Grid Infrastructure on Secondary Servers
        Starting up Database Instances on Secondary Servers
        Creating a New Resetlog of the Secondary RAC Database
        Updating Database Parameters LOCAL_LISTENER and REMOTE_LISTENER
    Failing Back Data Volumes
Performance Impact of Replication
Summary
References
Appendix

Figures

Figure 1. Increasing systems availability by combining high availability features
Figure 2. RAC and Dell EqualLogic storage replication for end-to-end data protection and high availability
Figure 3. Architectural overview of Oracle database replication with Dell EqualLogic PS Series iSCSI storage
Figure 4. Performance impact of Dell EqualLogic replication on Transactions per Second (TPS) during Quest Benchmark Factory TPC-C test
Figure 5. Performance impact of Dell EqualLogic replication on average transaction response time during Quest Benchmark Factory TPC-C test


Executive Summary

A reliable disaster recovery plan is essential for mission-critical Oracle® databases in order to support business continuity following a natural or human-induced disaster. Disaster recovery strategies for Oracle databases typically involve the replication of data to a remote location. A comprehensive disaster recovery plan enables continuous data access at the remote location, minimizing the financial losses caused by downtime should disaster strike the primary data center.

Among the various disaster recovery options, Oracle Data Guard is a commonly selected solution given its native integration with Oracle databases. However, Oracle Data Guard is available only as a feature of Oracle Database Enterprise Edition (EE); small and medium businesses deploying Oracle Database Standard Edition (SE) or Oracle Database Standard Edition One (SE1) cannot implement Data Guard. To address this challenge, Dell™ and Oracle have developed an alternative disaster recovery solution using Dell EqualLogic™ storage-based data replication technology, which enables the creation of an instant Oracle database copy in a remote storage group. This paper provides recommended best practices for implementing Dell EqualLogic storage replication as an Oracle disaster recovery solution.


Introduction

Every mission-critical Oracle database needs to be highly reliable and available in order to support global businesses. Any downtime for critical databases can lead to financial losses. It is essential for many enterprises to implement high availability (HA) solutions to protect mission-critical databases. To help maximize systems availability for Oracle databases, administrators can combine hardware high availability features with Oracle software high availability features. As illustrated in Figure 1, system availability improves with each additional high availability feature deployed in the Oracle database solution stack.

Figure 1. Increasing systems availability by combining high availability features

• Dell PowerEdge™ servers and Dell EqualLogic iSCSI storage systems provide redundancy at the hardware level with features such as dual power supplies, dual host bus adapters (HBAs), and dual network interface cards (NICs) for Dell PowerEdge servers, and dual storage controllers and RAID technology for Dell EqualLogic iSCSI storage systems.

• Oracle Flashback helps simplify data recovery by allowing administrators to shift data back and forth in time.

• Oracle Automatic Storage Management (ASM) helps protect against data loss by striping and mirroring data across multiple disks.

• Oracle Recovery Manager (RMAN) enables the easy management of database backup, restore, and recovery processes.

• Dell EqualLogic storage snapshot enables quick backup and recovery of Oracle databases.

• Oracle Real Application Clusters (RAC) helps protect against single-instance component failures by enabling multiple instances to share access to one Oracle database.

• Oracle Data Guard (Enterprise Edition Database only) helps protect Oracle databases against primary site failures, disasters, errors, and data corruption.

• Dell EqualLogic storage replication helps protect Oracle data from a variety of failures, ranging from destruction of a volume to a complete site disaster.

Oracle 11g SE1 can support servers with up to two CPU sockets. Oracle 11g SE1 provides enterprise-class performance and security. It is simple to manage, and can easily scale as demand increases. Oracle 11g SE1 is upwardly compatible with other database editions.


Oracle 11g SE includes Oracle Real Application Clusters (RAC) licensing for enterprise-class availability at no extra cost. Oracle 11g SE can support clustered servers with up to four CPU sockets in the cluster. With the significantly lower cost of Oracle SE together with Oracle RAC capabilities extending to four CPU sockets, Oracle SE RAC offers database customers a cost-effective option with high availability and scalability. Although Oracle SE and Oracle SE1 contain a subset of the database features included in EE, Data Guard is not among them, and customers deploying on SE or SE1 need to seek an alternative disaster recovery solution. In this white paper, we introduce an Oracle database disaster recovery solution using Dell EqualLogic storage replication technology.

The Dell EqualLogic PS Series iSCSI storage arrays provide primary and secondary storage capacity to a wide variety of applications with enterprise-class performance and low cost-of-ownership. By delivering the benefits of consolidated networked storage in a self-managing iSCSI Storage Area Network (SAN), the PS Series storage product is easy to use and affordable. Built using a patented peer storage architecture where all arrays in a storage pool are designed to work together to provide disk capacity and evenly distribute the load, the PS Series SAN offers high performance, reliability, scalability, intelligent automation, simplified deployment, and comprehensive data protection. The PS Series storage arrays include add-on software to provide snapshot, replication, and other features with no additional cost. These features allow Oracle data to be easily and readily replicated for data protection and business continuity.

The Dell EqualLogic replication feature provides protection against data loss by copying volume data from one storage group to another storage group. The two storage groups must be connected through a TCP/IP network. If a volume in the primary storage group is destroyed, applications can fail over to the secondary storage group and recover data from a replica.

As shown in Figure 2, Dell EqualLogic replication can be used to implement a disaster recovery strategy for Oracle databases. In the event of a disaster on the primary site, business applications can quickly resume on the secondary site.


Figure 2. RAC and Dell EqualLogic storage replication for end-to-end data protection and high availability

Dell Solutions for Oracle Database

Dell Solutions for Oracle products are designed to simplify operations, improve usability and provide cost-effective scalability as your needs grow over time. In addition to providing server and storage hardware, Dell solutions for Oracle include:

• Dell Configurations for Oracle―in-depth testing of Oracle configurations for high-demand solutions; documentation and tools that help simplify deployment

• Integrated Solution Management―standards-based management of Dell Solutions for Oracle that can lower operational costs through integrated hardware and software deployment, monitoring, and updating

• Oracle Licensing―licensing options that can simplify customer purchase

• Dell Enterprise Support and Infrastructure Services for Oracle―planning, deployment, and maintenance of Dell Solutions for Oracle Database tools

For more information concerning Dell Solutions for Oracle Database, visit www.dell.com/oracle.

Storage-Based Remote Data Replication Technologies

Storage-based remote data replication technologies enable data blocks on one storage array to be copied to another storage array. The three key storage-based remote data replication methods are:

• Snapshot-based remote point-in-time replication. Snapshots are point-in-time copies of source data and are usually stored in the same storage array as the source data. Snapshots can be generated as pointer-based copies of source data via a copy-on-write mechanism, or as complete copies of source data. However, snapshots cannot protect source data from storage hardware failures, because the snapshots and the source data are collocated in the same storage system. Snapshot-based remote point-in-time replication addresses this challenge by replicating snapshots across a network to an offsite storage system.


• Continuous synchronous replication. With continuous replication, each write executed on the primary storage system is executed on the remote storage system as well. With the synchronous replication method, the write on the primary storage system is not acknowledged back to the application until it has been written to the remote storage system. Continuous synchronous replication may incur a large performance overhead on the application, and requires a fast network between the local and remote sites to keep latency to a minimum.

• Continuous asynchronous replication. As with continuous synchronous replication, each write executed on the primary storage system is also executed on the remote storage system. Unlike continuous synchronous replication, continuous asynchronous replication acknowledges the write back to the application immediately, without waiting for the write to reach the remote storage system.
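For readers who want a concrete picture of the acknowledgment difference between the two continuous methods, the short Python model below simulates a primary array replicating writes to a remote array. It is a conceptual sketch only: the class, the simulated link delay, and all names are hypothetical and do not represent any particular array's implementation.

```python
import queue
import threading
import time

REMOTE_LINK_DELAY = 0.05  # simulated one-way WAN latency per remote write

class ReplicatedStore:
    """Toy model of a primary array that replicates writes to a remote array."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local, self.remote = {}, {}
        self.pending = queue.Queue()
        threading.Thread(target=self._ship_async, daemon=True).start()

    def _ship_async(self):
        # Background thread that ships queued writes for asynchronous mode.
        while True:
            key, value = self.pending.get()
            time.sleep(REMOTE_LINK_DELAY)
            self.remote[key] = value
            self.pending.task_done()

    def write(self, key, value):
        self.local[key] = value
        if self.synchronous:
            # Synchronous: acknowledge only after the remote copy is
            # written, so the application pays the link latency per write.
            time.sleep(REMOTE_LINK_DELAY)
            self.remote[key] = value
        else:
            # Asynchronous: acknowledge immediately; the remote copy lags.
            self.pending.put((key, value))

sync_store = ReplicatedStore(synchronous=True)
start = time.monotonic()
sync_store.write("block-7", "data")
sync_elapsed = time.monotonic() - start        # roughly REMOTE_LINK_DELAY

async_store = ReplicatedStore(synchronous=False)
start = time.monotonic()
async_store.write("block-7", "data")
async_elapsed = time.monotonic() - start       # near zero

print(f"sync ack: {sync_elapsed:.3f}s, async ack: {async_elapsed:.3f}s")
async_store.pending.join()                     # remote copy catches up eventually
```

The model makes the trade-off visible: synchronous mode buys a zero-lag remote copy at the cost of per-write latency, while asynchronous mode keeps application latency low and lets the remote copy converge slightly behind.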

Dell EqualLogic Replication Technology

Overview

The Dell EqualLogic PS Series storage system includes built-in software to support storage-based remote data replication at no additional cost. PS Series replication is a point-in-time, snapshot-based data replication method that provides data protection by copying volume data from the primary storage group to a remote secondary storage group. Should the primary site fail, data access can continue at the secondary site by failing over to it and recovering data from replicas, the data volumes created by replication in the secondary storage group. Replicas are similar to snapshots in that they represent the contents of the volume at a specific point-in-time. Each replicated volume has a replica set: the set of replicas created over time that delivers multiple recovery points.
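The pointer-based, copy-on-write snapshot behavior that underlies snapshot-based replication can be sketched in a few lines of Python. This is a conceptual illustration only (the class and method names are invented), not the PS Series on-array implementation.

```python
class Volume:
    """Toy block volume supporting pointer-based, copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = dict(enumerate(blocks))  # block number -> contents
        self.snapshots = []

    def snapshot(self):
        # A new snapshot stores no data at all: it is an empty overlay
        # that fills in only when source blocks are about to change.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, block_no, data):
        # Copy-on-write: save the old block into every snapshot that has
        # not yet preserved its own copy, then overwrite the source block.
        for snap in self.snapshots:
            snap.setdefault(block_no, self.blocks[block_no])
        self.blocks[block_no] = data

    def read_snapshot(self, snap, block_no):
        # Blocks unchanged since the snapshot are read from the source.
        return snap.get(block_no, self.blocks[block_no])

vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")                    # first write to block 1 copies "b" aside
print(vol.read_snapshot(snap, 1))    # prints "b": the point-in-time view
print(vol.blocks[1])                 # prints "B": the live volume
```

Because the snapshot stores only the before-images of changed blocks, it is cheap to create and consumes space proportional to the change rate, which is the property remote replication of snapshots exploits.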

Dell EqualLogic PS Series replication technology efficiently consumes storage space and network bandwidth. The first replication is a complete volume data transfer. For subsequent replication operations, the primary group copies only the data that changed since the previous replication operation started. This is network-efficient and uses less storage capacity while providing complete volume contents at the same time.
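The full-then-incremental transfer pattern described above can be sketched as follows. This is a hypothetical model that tracks volume contents as numbered blocks; it is meant only to show why subsequent replications are cheaper than the first, not how the PS Series firmware tracks changes.

```python
def replicate(primary, secondary, baseline=None):
    """Copy primary blocks to secondary; return the block numbers sent.

    `baseline` is the primary's contents when the previous replication
    started; None means no replication has happened yet (full transfer).
    """
    if baseline is None:
        changed = set(primary)                       # first pass: everything
    else:
        changed = {n for n, data in primary.items()
                   if baseline.get(n) != data}       # later passes: deltas only
    for n in changed:
        secondary[n] = primary[n]
    return changed

primary = {0: "a", 1: "b", 2: "c"}
secondary = {}

sent = replicate(primary, secondary)                 # full volume transfer
print(len(sent))                                     # 3

baseline = dict(primary)                             # state at replication start
primary[1] = "B"                                     # workload changes one block
sent = replicate(primary, secondary, baseline)       # incremental transfer
print(sorted(sent))                                  # [1]
print(secondary == primary)                          # True
```

After the incremental pass the secondary holds complete, current volume contents even though only the changed block crossed the network.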

Volume Collection

An Oracle database typically spans multiple volumes. If replication is performed on each volume individually, the contents of one volume might not be consistent with the contents of the other volumes. The EqualLogic volume collection feature helps address this issue. A volume collection is a set of volumes grouped together for replication that represents content-consistent data at a particular point-in-time. A volume collection allows two or more volumes to be replicated simultaneously by placing them in the same collection.

Crash-Consistent Images

When replicating an Oracle database, all volumes belonging to the same database should be grouped together in a volume collection, and replicated together as a volume collection. This creates a point-in-time database image at the standby site that is crash-consistent. The state of the image is equivalent to the state of the database after a sudden power failure, a server crash, or a shutdown abort. When such a database image is restarted, Oracle automatically performs crash recovery using the online redo log files and undo data. The crash recovery process rolls forward any committed transactions, and rolls back any uncommitted transactions. After crash recovery, the database is in a transactionally consistent state.
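The roll-forward/roll-back behavior can be illustrated with a deliberately simplified Python model. This is not Oracle's recovery algorithm (real redo and undo handling is far more involved); it only shows why a crash-consistent image ends up transactionally consistent after recovery.

```python
def crash_recover(datafile, redo_log, committed_txns):
    # Roll forward: reapply every change recorded in the redo log,
    # remembering the before-images so uncommitted work can be undone.
    undo = {}
    for txn, key, old_value, new_value in redo_log:
        undo.setdefault(txn, []).append((key, old_value))
        datafile[key] = new_value
    # Roll back: reverse the changes of transactions never committed.
    for txn, changes in undo.items():
        if txn not in committed_txns:
            for key, old_value in reversed(changes):
                datafile[key] = old_value
    return datafile

datafile = {"x": 1, "y": 10}
redo_log = [
    ("t1", "x", 1, 2),    # t1 committed before the crash
    ("t2", "y", 10, 99),  # t2 was still in flight at the crash
]
print(crash_recover(datafile, redo_log, committed_txns={"t1"}))
# {'x': 2, 'y': 10}: t1 rolled forward, t2 rolled back
```

The committed transaction's change survives and the in-flight transaction's change is undone, which is exactly the state guarantee a crash-consistent replica image provides once the database restarts.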

Oracle Support for Third Party Snapshot-Based Technology

My Oracle Support Note 604683.1

Oracle offers support for third-party snapshot technologies as documented in My Oracle Support Note 604683.1. In this note, Oracle announced official support for backup and recovery of Oracle databases using third-party snapshot technologies without the database in BACKUP mode. Traditionally, Oracle requires that the database be in BACKUP mode when it is backed up with a utility other than Oracle RMAN. The main reasons Oracle requires putting the associated data files or database in BACKUP mode are:

• Ensuring file header consistency prior to the backup and updating the file headers with the recovery start time

• Creating sufficient redo to recover from any fractured or inconsistent blocks

• Creating an “end backup” marker to demarcate the minimum recovery time and ensure database consistency after restoring and recovering the data files

BACKUP mode incurs a performance impact on the Oracle database. The overhead of transitioning a database in and out of BACKUP mode includes:

• Additional redo data is logged

• A complete database checkpoint is required

• More operational steps and complexity during the backup operation

In October 2010, Oracle announced its official support of using third-party snapshot technologies to create a valid database image without the database in BACKUP mode on the following operations:

Backup Operations

• Full database snapshot without the database in backup mode

Restore and Recovery Operations

• Point-in-time copy of the database. In this scenario, all control files, data files, and online redo logs are restored from a snapshot copy. Upon restarting the database on the snapshot copy, crash recovery occurs automatically, and the database is consistent to the last redo commit of the snapshot copy.

• Full database recovery with zero data loss. In this scenario, all data files are restored from a snapshot copy, and all current controlfiles, current online redo log files, and all archived redo log files must be available. Oracle is then able to perform a complete recovery and recover the database to the most recent point-in-time without the loss of any committed transactions.

• Point-in-time database recovery. In this scenario, all data files are restored from a snapshot copy, and all current controlfiles, current online redo log files, and all archived redo log files up to a particular point-in-time must be available. Oracle is then able to perform an incomplete recovery (or point-in-time recovery) and recover the database consistently to a point-in-time.

In order for Oracle to provide support on these operations, the third party snapshot technologies must meet the following prerequisites:

• Able to integrate with Oracle’s recommended restore and recovery operations above

• Database crash-consistent at the point of the snapshot

• Write ordering is preserved for each file within a snapshot

Integrating My Oracle Support Note 604683.1 with Dell EqualLogic Replication

With the release of My Oracle Support Note 604683.1, Oracle recognizes that a snapshot of an Oracle database taken without BACKUP mode is valid and supported. Dell EqualLogic replication is a remote snapshot-based technology that replicates snapshots to a secondary storage system. Dell EqualLogic replication technology fully complies with the three prerequisites listed in the previous section:

• A replica of the full primary database is created by initiating replication on all database volumes grouped in a volume collection. Replication should be repeated at a defined interval based on the RTO and RPO requirements. The replica contains point-in-time copies of all data files, controlfiles, online redo log files, and archived redo log files. Therefore, the most relevant restore and recovery scenario for Dell EqualLogic replication technology is the “Point-in-time copy of the database”.

• The database on the Dell EqualLogic replica is crash-consistent, assuming the replica is created on a volume collection. As discussed in the “Crash-Consistent Images” section above, the state of the database on the replica copy is equivalent to the state after a sudden power failure, a server crash, or a shutdown abort. When the database image is restarted, Oracle automatically performs crash recovery using the online redo log files and undo data, and brings the database to a consistent state.

• Oracle databases require write operations to occur in the correct sequences. Dell EqualLogic volume collection preserves the write ordering of each file within the volume collection by grouping volumes together for the purpose of creating simultaneous snapshots or simultaneous replicas of the volumes in the collection. A Dell EqualLogic replica represents a point-in-time database image which captures write operations completed prior to the point in time.

Oracle Disaster Recovery using Dell EqualLogic Replication

Test Configuration

The reference configuration for replicating an Oracle database to a secondary site using Dell EqualLogic replication is intended to validate the following solution components:

• Dell PowerEdge servers running Oracle Enterprise Linux 5 Update 5 x86_64, and Oracle 11g R2 Real Application Cluster (RAC) database Standard Edition (SE) version 11.2.0.2.0.


• Redundant Dell PowerConnect™ Gigabit Ethernet switches for Oracle cluster interconnect private network

• Server-storage interconnect using redundant Dell PowerConnect 10 Gigabit Ethernet switches

• Dell EqualLogic PS Series 10 Gigabit Ethernet iSCSI storage array where the physical data resides

• Redundant set of the above hardware and software at a secondary site, or a less powerful set of hardware at a secondary site based on performance requirements for the secondary database should a failover occur

• TCP/IP network connection between the primary storage group and the secondary storage group

• Primary Oracle database volumes, excluding the Grid Infrastructure volumes or the Oracle Home volume, grouped in a volume collection and replicated to the secondary storage system

• A point-in-time database copy started at the secondary site

An architecture overview of the solution is shown in Figure 3.

Figure 3. Architectural overview of Oracle database replication with Dell EqualLogic PS Series iSCSI storage

Preparing the secondary RAC database

As shown in Figure 3, five volumes (p-grid1, p-grid2, p-grid3, p-grid4 and p-grid5) in the primary storage system are used as the Oracle Grid Infrastructure cluster quorum disks. One volume (p-oraclehome) is used as the shared Oracle Home. These volumes do not need to be replicated to the secondary site. The cluster database at the secondary site runs the Grid Infrastructure and the Oracle Home on a separate set of volumes (s-grid1, s-grid2, s-grid3, s-grid4, s-grid5 and s-orahome).

Prior to starting up the point-in-time database copy at the secondary site, the following tasks need to be performed.


• Install Oracle 11g R2 Grid Infrastructure on the database servers at the secondary site.

• Configure a shared Oracle Home for the database binaries using Automatic Storage Management Cluster File System (ACFS).

• Install the Oracle 11g R2 RAC database software in the Oracle Home.

• As user oracle, copy the database configuration files, including the database instance init.ora initialization parameter files and the database instance password file, from the primary database servers to the secondary database servers in the appropriate location. For example, the following files are copied under $ORACLE_HOME/dbs with appropriate soft links created.

-rw-r----- 1 oracle oinstall   47 Jan 21 13:45 initrepltlp1.ora
-rw-r----- 1 oracle oinstall   47 Jan 21 13:45 initrepltlp2.ora
-rw-r----- 1 oracle oinstall 1536 Jan 21 13:07 orapwrepltlp
lrwxrwxrwx 1 oracle oinstall   68 Jan 21 13:07 orapwrepltlp1 -> /u01/app/oracle/acfsorahome/product/11.2.0/dbhome_1/dbs/orapwrepltlp
lrwxrwxrwx 1 oracle oinstall   68 Jan 21 13:07 orapwrepltlp2 -> /u01/app/oracle/acfsorahome/product/11.2.0/dbhome_1/dbs/orapwrepltlp

• As user oracle, register the database and database instances as cluster resources from node 1 of the secondary database servers.

srvctl add database -d db_unique_name -o oracle_home
srvctl add instance -d db_unique_name -i inst_name -n node_name

For example:

srvctl add database -d repltlp -o /u01/app/oracle/acfsorahome/product/11.2.0/dbhome_1
srvctl add instance -d repltlp -i repltlp1 -n secondary-n1
srvctl add instance -d repltlp -i repltlp2 -n secondary-n2

• If the database audit option is turned on, you must create in advance the directory to which the AUDIT_FILE_DEST database parameter is set.

Creating Replica from a Primary Storage Group

Creating Volume Collection

As discussed in the previous section, all primary database volumes, except the Grid Infrastructure volumes and the Oracle Home volume, need to be grouped in a volume collection for the purpose of performing replication simultaneously on the volumes. Refer to the PS Series Storage Arrays Group Administration guide for more details on how to create a volume collection.

Configuring Replication Partner

Before the first replication operation, you must configure the two groups as replication partners. Log into the primary storage group manager and configure the secondary group as a replication partner; likewise, log into the secondary group manager and configure the primary group as a replication partner. Volume replication between partners requires space on both the primary group and the secondary group:

• Local replication reserve. Each volume requires primary group space for use during replication and, optionally, for storing the failback snapshots.


• Delegated space. To provide space for storing replicas, the secondary group delegates space to the primary group. All primary group volumes that are replicated to the secondary group share the delegated space. Each volume is assigned a portion of delegated space, called the replica reserve. The replica reserve for a volume limits the number of replicas you can keep. When replica reserve is consumed, the oldest replicas are deleted to free space for new replicas.
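The "oldest replicas are deleted to free space" retention behavior of the replica reserve can be sketched as a simple queue. This is a hypothetical model of the policy only, not the PS Series space-accounting implementation; the class name, sizes, and replica names are invented.

```python
from collections import deque

class ReplicaReserve:
    """Toy model of a per-volume replica reserve with oldest-first eviction."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.replicas = deque()   # oldest replica on the left

    def used_mb(self):
        return sum(size for _, size in self.replicas)

    def add_replica(self, name, size_mb):
        # Evict the oldest replicas until the new replica fits.
        while self.replicas and self.used_mb() + size_mb > self.capacity_mb:
            evicted, _ = self.replicas.popleft()
            print(f"evicting oldest replica: {evicted}")
        self.replicas.append((name, size_mb))

reserve = ReplicaReserve(capacity_mb=100)
for hour in range(4):
    reserve.add_replica(f"replica-{hour:02d}", size_mb=40)

print([name for name, _ in reserve.replicas])
# ['replica-02', 'replica-03']: only the two newest replicas fit
```

The sketch shows why the replica reserve size directly limits how many recovery points a volume retains: a larger reserve keeps more (or larger) replicas before eviction begins.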

Refer to the PS Series Storage Arrays Group Administration guide for more details on replication partner configuration, and guidelines on sizing the reserve space for replication.

Creating Replica on Volume Collection

When replicating a volume collection, the resulting set of replicas is called a replica collection. A replica collection contains one replica for each volume in the collection.

You can create schedules to automatically perform replication on a regular basis. For example, you can create a schedule to create replicas of a volume collection every 10 minutes or every hour. The minimum replication interval should not be shorter than the time it takes to complete one replication. A replication will not start until the previous replication has completed.
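The guideline above, that the schedule interval should never be shorter than the time one replication takes to complete, can be captured in a small helper. The function name, the sample durations, and the 25% headroom factor are all hypothetical; this is a planning sketch, not a PS Series management command.

```python
def min_safe_interval_minutes(observed_durations_min, headroom=1.25):
    """Suggest a replication schedule interval from observed replication
    durations, padded with headroom for change-rate spikes."""
    worst_case = max(observed_durations_min)
    return worst_case * headroom

# Durations (minutes) of the last few replications of a volume collection.
durations = [6.0, 7.5, 9.2, 8.1]
suggested = min_safe_interval_minutes(durations)
print(f"schedule replicas no closer than every {suggested:.1f} minutes")
# 11.5 minutes for a worst-case duration of 9.2 minutes
```

Sizing the interval this way avoids scheduled replications queuing behind a still-running one, since a new replication cannot start until the previous one completes.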

Refer to the PS Series Storage Arrays Group Administration guide for more details on scheduling replications.

Failing Over Data Volumes

If a failure at the primary database site makes the database unavailable, you can fail over to the secondary group and resume database access. When the primary group becomes available again, database volumes can be failed back to the primary group.

Promoting Replica to Recovery Volume in the Secondary Storage Group

To temporarily fail over a volume to the secondary group, the inbound replica set can be promoted to a recovery volume. Users can then connect to the recovery volumes to resume access to the database. Refer to the PS Series Storage Arrays Group Administration guide for details on how to promote a replica set to a recovery volume.

Accessing Recovery Volumes from Standby Servers

Shutting Down Grid Infrastructure on Secondary Servers

If the Grid Infrastructure of the secondary RAC database is up, issue the following command as user root to shut down the Grid Infrastructure on all secondary database servers before accessing the recovery volumes.

crsctl stop crs

Accessing Recovery Volumes from Linux Operating System

Once the replicas are promoted to recovery volumes in the secondary storage system, the servers at the secondary site can access the recovery volumes from the operating system. This procedure consists of the following tasks on the Linux operating systems (OEL 5 and RHEL 5):

1. Discover the iSCSI recovery volumes from the secondary hosts
2. Log into the iSCSI recovery volumes from the secondary hosts
3. Configure Device Mapper Multipath to the iSCSI recovery volumes from the secondary hosts
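Performed by hand, the three tasks map to commands roughly like the following sketch. The group IP and target IQN below are placeholders, and the multipath alias is added by editing /etc/multipath.conf (as the script in the Appendix does). The echo prefixes make this a dry run; remove them to execute the commands.

```shell
GROUP_IP="10.10.10.10"                                  # secondary group IP (placeholder)
TARGET="iqn.2001-05.com.equallogic:example-recovery"    # recovery volume IQN (placeholder)

echo iscsiadm -m discovery -t sendtargets -p "$GROUP_IP"   # 1. discover recovery volumes
echo iscsiadm -m node -T "$TARGET" -l                      # 2. log into the target
echo service multipathd reload                             # 3. reload after editing /etc/multipath.conf
```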


To help speed up the failover process, Dell engineers developed a Linux shell script to automate these tasks. The content of the script is documented in the Appendix section. The script is also available for download at

http://en.community.dell.com/dell-groups/enterprise_solutions/m/mediagallery/19859656/download.aspx

A sample of the screen output of the script is shown below.

[root@secondary-n1]$ ./storageconfig.sh
----------------------------------------------------------------------------------
 Script to Discover, Login and Configure Multipath for an EqualLogic ISCSI target
----------------------------------------------------------------------------------
Do you want to continue [ y / Y / (ENTER to Exit)] : y
Automatic target Discovery in progress.....
Discovered the below EqualLogic Targets :
1 : iqn.2001-05.com.equallogic:0-8a0906-64de6a407-d900000001e4d38b-repltl-grid4
2 : iqn.2001-05.com.equallogic:0-8a0906-633e6a407-17f0000001b4d38b-repltl-grid3
3 : iqn.2001-05.com.equallogic:0-8a0906-ebcb9a407-e90000000334d307-repltlp-fra3
4 : iqn.2001-05.com.equallogic:0-8a0906-667e6a407-cff000000214d38b-repltl-grid5
5 : iqn.2001-05.com.equallogic:0-8a0906-299b9a407-d9c000000394d359-repltlp-data3
6 : iqn.2001-05.com.equallogic:0-8a0906-ac3e6a407-532000000244d39e-repltls-orahome
7 : iqn.2001-05.com.equallogic:0-8a0906-ea0b9a407-6c5000000304d307-repltlp-fra2
8 : iqn.2001-05.com.equallogic:0-8a0906-ed5b9a407-d87000000364d307-repltlp-fra4
9 : iqn.2001-05.com.equallogic:0-8a0906-e83b9a407-08f0000002d4d307-repltlp-fra1
10 : iqn.2001-05.com.equallogic:0-8a0906-619e6a407-f21000000184d38b-repltl-grid2
11 : iqn.2001-05.com.equallogic:0-8a0906-e2eb9a407-f94000000274d307-repltlp-data1
12 : iqn.2001-05.com.equallogic:0-8a0906-e60b9a407-6ce0000002a4d307-repltlp-data2
Select the iqn which you want to login from the above list [Eg. 1 or 2 ...] : 12
Found Multipaths to the storage Device from the Host
Do you want to configure the Multipath [ y / Y / (q to Exit)] : y
Enter the multipath disk name you want to assign for the above disk : repltlp-data2
-----------------------------------------------------------
Multipath device Name repltlp-data2 Configured for sdd and sde
-----------------------------------------------------------
Use the same script to Discover, Login and Configure Multipath for each of EqualLogic Targets
Configuration Successful

Once access to the recovery volumes is configured on all secondary hosts, issue the following command as user root to scan the ASM disks residing on the recovery volumes.

service oracleasm scandisks

New ASM disks should be listed in addition to the existing ASM disks that are used for Grid Infrastructure and Oracle Home on the secondary hosts. In the example below, ASM disks REPLTLS_GRID[1-5] and REPLTLS_ORAHOME are the existing ASM disks for Grid Infrastructure and Oracle Home. ASM disks REPLTLP_DATA[1-3] and REPLTLP_FRA[1-4] are the newly discovered ASM disks that reside on the recovery volumes. They contain the ASM diskgroups which store the database files.


$ service oracleasm listdisks
REPLTLP_DATA1
REPLTLP_DATA2
REPLTLP_DATA3
REPLTLP_FRA1
REPLTLP_FRA2
REPLTLP_FRA3
REPLTLP_FRA4
REPLTLS_GRID1
REPLTLS_GRID2
REPLTLS_GRID3
REPLTLS_GRID4
REPLTLS_GRID5
REPLTLS_ORAHOME
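One way to isolate which disks in a listing like the one above are newly discovered is to diff the listdisks output captured before and after the scan. The snippet below demonstrates the technique with sample disk names standing in for real `service oracleasm listdisks` output; in practice you would redirect that command's output into the two files.

```shell
# Sorted disk lists captured before and after "service oracleasm scandisks"
# (sample data; in practice redirect listdisks output into these files).
printf 'REPLTLS_GRID1\nREPLTLS_ORAHOME\n' > /tmp/disks.before
printf 'REPLTLP_DATA1\nREPLTLS_GRID1\nREPLTLS_ORAHOME\n' > /tmp/disks.after

# comm -13 suppresses lines unique to the first file and lines common to both,
# leaving only lines unique to the second file: the newly scanned disks.
comm -13 /tmp/disks.before /tmp/disks.after   # prints REPLTLP_DATA1
```

Note that comm requires both inputs to be sorted, which listdisks output normally is.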

Starting up a Standby Database

Starting up Grid Infrastructure on Secondary Servers

Issue the following command as root user on both standby servers to start up Grid Infrastructure.

crsctl start crs

Upon the startup of Grid Infrastructure, ASM instances automatically start, and all ASM diskgroups, including the ones that reside on the recovery volumes, are automatically mounted. In the example below, ASM diskgroups OCRVOTDSK and ORAHOME are the existing diskgroups used for the Grid Infrastructure and the Oracle Home. ASM diskgroups DATABASEDG and FLASHBACKDG reside on the recovery volumes which are replicated from the primary storage group.

SQL> select name, state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
OCRVOTDSK                      MOUNTED
ORAHOME                        MOUNTED
DATABASEDG                     MOUNTED
FLASHBACKDG                    MOUNTED

Starting up Database Instances on Secondary Servers

Once the Grid Infrastructure is up and running on both secondary servers, issue the following command as user oracle to start up the database instances.

srvctl start instance -d db_unique_name -i inst_name_list

For example,

srvctl start instance -d repltlp -i repltlp1

The command above starts the database instance on node 1. The alert.log file of instance 1 shows that crash recovery was automatically performed upon the startup of the first database instance.


ALTER DATABASE OPEN /* db agent *//* {1:26551:165} */
This instance was first to open
Beginning crash recovery of 2 threads
 parallel recovery started with 15 processes
Started redo scan
Wed Mar 30 12:29:30 2011
Completed redo scan
 read 262399 KB redo, 156899 data blocks need recovery
Wed Mar 30 12:29:42 2011
Started redo application at
 Thread 1: logseq 2457, block 55670
 Thread 2: logseq 3702, block 184711
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2457 Reading mem 0
  Mem# 0: +DATABASEDG/repltlp/onlinelog/group_1.258.741013685
  Mem# 1: +DATABASEDG/repltlp/onlinelog/group_1.259.741013685
Recovery of Online Redo Log: Thread 2 Group 4 Seq 3702 Reading mem 0
  Mem# 0: +DATABASEDG/repltlp/onlinelog/group_4.274.741015903
  Mem# 1: +DATABASEDG/repltlp/onlinelog/group_4.275.741015905
Recovery of Online Redo Log: Thread 2 Group 7 Seq 3703 Reading mem 0
  Mem# 0: +DATABASEDG/repltlp/onlinelog/group_7.276.741015905
  Mem# 1: +DATABASEDG/repltlp/onlinelog/group_7.277.741015907
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2458 Reading mem 0
  Mem# 0: +DATABASEDG/repltlp/onlinelog/group_2.260.741013687
  Mem# 1: +DATABASEDG/repltlp/onlinelog/group_2.261.741013687
Completed redo application of 103.28MB
Wed Mar 30 12:30:14 2011
Completed crash recovery at
 Thread 1: logseq 2458, block 32730, scn 70263169
 Thread 2: logseq 3703, block 165042, scn 70263129
 156899 data blocks read, 156846 data blocks written, 262399 redo k-bytes read

Issue another command to start up the second database instance of the secondary RAC database. For example,

srvctl start instance -d repltlp -i repltlp2

Creating a New Resetlog of the Secondary RAC Database

As documented in My Oracle Support note 604683.1, it is recommended to create a new resetlog ID of the secondary RAC database in order to differentiate the new archives from the existing archives. Follow these steps to create a new resetlog ID:

1. shutdown immediate
2. startup mount
3. recover database until cancel
4. cancel
5. alter database open resetlogs
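The steps above can be collected into a small SQL script and run in one SQL*Plus session. This is a sketch: the file name is arbitrary, and the session must run as SYSDBA against the secondary database.

```shell
# Write the five steps to a script file; you would then run it with:
#   sqlplus / as sysdba @/tmp/resetlogs.sql
cat > /tmp/resetlogs.sql <<'EOF'
shutdown immediate
startup mount
recover database until cancel
cancel
alter database open resetlogs;
EOF
```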


Updating Database Parameters LOCAL_LISTENER and REMOTE_LISTENER

In the test configuration, the secondary RAC database uses the database parameter file (spfile) stored in an ASM diskgroup. Therefore, the spfile is a copy replicated from the spfile of the primary RAC database. In this scenario, the database parameters LOCAL_LISTENER and REMOTE_LISTENER need to be manually updated to reflect the secondary RAC cluster information. LOCAL_LISTENER should point to the node VIPs of the secondary RAC cluster. REMOTE_LISTENER should point to the SCAN name of the secondary RAC cluster. The following sample commands illustrate the change.

SQL> ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.168.3)(PORT=1521))))' SCOPE=MEMORY SID='repltlp1';
SQL> ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.168.4)(PORT=1521))))' SCOPE=MEMORY SID='repltlp2';
SQL> ALTER SYSTEM SET remote_listener='repltls-scan:1521' SCOPE=BOTH SID='*';

Failing Back Data Volumes

Upon completion of the database failover procedures described in the preceding sections, the database is available for read and write operations. The application can resume data access by connecting to the secondary database while the primary database is unavailable. If the original volumes on the primary group become available again at a later point, the Dell EqualLogic Failback to Primary feature allows the recovery volumes to fail back to the primary group, returning to the original configuration. Updates made to the recovery volumes will be copied back to the original volumes. The volume collection feature is currently not available for use together with the failback operation. Therefore, it is necessary to take the secondary database offline to ensure data consistency. The high level steps for failing back the database are:

• Shut down the secondary RAC database
• Shut down the Grid Infrastructure on all secondary servers
• Log out of iSCSI sessions to the recovery volumes on all secondary servers
• Remove configurations to recovery volumes in the /etc/multipath.conf file on all secondary servers
• Re-scan ASM disks to remove ASM disks on the recovery volumes on all secondary servers
• Perform the Failback to Primary operation on each recovery volume from the secondary storage group manager
• Re-discover iSCSI targets on all primary servers
• Log into iSCSI targets on all primary servers
• Configure multipath to iSCSI volumes on all primary servers
• Re-scan ASM disks on all primary servers
• Start up Grid Infrastructure on all primary servers
• Start up the primary RAC database

Refer to the PS Series Storage Arrays Group Administration guide for details on the fail back operation.
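On the secondary side, the first few cleanup steps look roughly like the dry-run sketch below, using the paper's example database name repltlp. The target IQN is a placeholder, and the matching multipath alias must also be removed from /etc/multipath.conf by hand. Remove the echo prefixes to execute the commands.

```shell
TARGET="iqn.2001-05.com.equallogic:example-recovery"   # placeholder IQN

echo srvctl stop database -d repltlp           # shut down the secondary RAC database
echo crsctl stop crs                           # as root, on every secondary server
echo iscsiadm -m node -T "$TARGET" -u          # log out of the recovery volume session
echo iscsiadm -m node -T "$TARGET" -o delete   # remove the stale target record
echo service oracleasm scandisks               # re-scan so the removed ASM disks disappear
```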


Performance Impact of Replication

Dell engineers conducted lab testing to study the performance impact of the PS Series replication on an Oracle database. The Quest Benchmark Factory TPC-C tool was used to stress the database, simulating real database application workloads with the industry-standard TPC-C benchmark. The TPC-C benchmark measures Online Transaction Processing (OLTP) workloads, combining read-only and update-intensive transactions that simulate the activities found in a complex OLTP enterprise environment.

The Benchmark Factory TPC-C tests conducted by the team simulated loads from 100 to 3,000 concurrent users in increments of 100. During this run, replication was performed every 10 minutes at user load 200, 300, 500, 700, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600 and 2900. The test outputs include metrics such as Transaction per Second (TPS) and average transaction response time. The impact of these replications on application performance is shown with red dots in Figure 4 and Figure 5.

Figure 4 shows the Transaction per Second (TPS) of the TPC-C run as user load increases. As shown in this graph, replications of the source database volumes have no measurable impact on the TPS performance during this run.

Figure 4. Performance Impact of Dell EqualLogic replication on Transaction per Second (TPS) during Quest Benchmark Factory TPC-C test

Figure 5 shows the average transaction response time in seconds as the user load increases from 100 to 3,000 users in this TPC-C run. As illustrated by the trend of blue dots in Figure 5, the average transaction response time increases with user load. The volume of data changes also increases as user load increases. As a consequence, each replication takes longer to complete as the user load increases, and thus causes a greater impact on application performance. However, as shown by the red dots in the graph, while response times increase as user load increases, they are still well within the acceptable range of the application response time.

Figure 5. Performance impact of Dell EqualLogic replication on average transaction response time during Quest Benchmark Factory TPC-C test

This study demonstrated that the Dell EqualLogic replication imposes a minimal performance hit on an Oracle database under stress in a lab-simulated environment. The performance impact of replication depends on various factors, such as the volume of data changes and the network bandwidth between the two replication sites. Dell recommends conducting performance tests to understand the true impact on a real-world application.

Summary

This paper describes an Oracle disaster recovery solution for customers deploying Oracle Database Standard Edition (SE) or Oracle Database Standard Edition One (SE1), where Data Guard is not available. The solution uses the Dell EqualLogic PS Series iSCSI storage replication technology to create a point-in-time database copy at the secondary site with little performance impact on the primary database. This paper also provides best practices and instructions on how to implement this solution.

To learn more about Dell Oracle solutions, visit www.dell.com/oracle or contact your Dell representative for up-to-date information on Dell servers, storage, and services for Oracle solutions.

References

1. "PS Series storage arrays group administration", PS Series firmware version 5.0.

https://www.equallogic.com/support/download_file.aspx?id=930


2. "Storage-based replication options: selecting the right replication method for optimal data protection".

http://content.dell.com/us/en/enterprise/d/business~solutions~brochures~en/Documents~cb105-storage-based-replication-options.pdf.aspx

3. “Oracle Data Guard concepts and administration”, 11g Release 2 (11.2), E17022-05.

http://download.oracle.com/docs/cd/E11882_01/server.112/e17022/index.htm

4. “Supported backup, restore and recovery operations using third party snapshot technologies”, My Oracle Support document ID #604683.1. https://support.oracle.com

5. “Dell | Oracle Tested and Validated configurations”, Oracle 11g running Enterprise Linux 5.

http://content.dell.com/us/en/enterprise/d/solutions/oracle-configs-ent-linux-5.aspx

Appendix

The following script is referenced in the section "Accessing Recovery Volumes from Linux Operating System".

#!/bin/bash
#
#File Name: storageconfig.sh
#
#This script automates the failover process to the secondary Oracle
#database on the remote site. The failover process is part of the
#Disaster Recovery solution described in the Dell white paper
#"Disaster Recovery of Oracle Database Standard Edition (SE) and
#Standard Edition One (SE1) on Dell EqualLogic PS Series iSCSI
#Storage Systems". The script automates the following tasks during
#the failover process for Linux operating systems (OEL 5 and RHEL 5):
#1. Discover the iSCSI recovery volumes from the secondary hosts
#2. Log into the iSCSI recovery volumes from the secondary hosts
#3. Configure Device Mapper Multipath to the iSCSI recovery volumes
#   from the secondary hosts
#
#Usage: ./storageconfig.sh (as root user)

clear
echo "----------------------------------------------------------------------------------"
echo " Script to Discover, Login and Configure Multipath for an EqualLogic ISCSI target "
echo "----------------------------------------------------------------------------------"
iscsistatus=`service iscsi status | grep -w running | wc -l`
if [ $iscsistatus -lt 1 ]; then
    echo "iscsi service should be running before executing the script"
    echo "Start the iscsi service and try once again"
    exit
fi


iscsidisk()
{
    groupip=`iscsiadm -m session | head -n 1 | cut -d ' ' -f 3 | cut -d : -f 1`
    iscsiadm -m discovery -t sendtargets -p $groupip &> /dev/null
    is=(`iscsiadm -m session -P 3 | grep Target | cut -d - -f 6-`)
    id=(`iscsiadm -m discovery -P 1 | grep Target | cut -d - -f 6-`)
    echo " Automatic target Discovery in progress....."
    i=0
    newdsiqn=" "
    for d in "${id[@]}"
    do
        status=pos
        for s in "${is[@]}"
        do
            if [ $d = $s ]; then
                status=neg
                break
            fi
        done
        if [ $status != "neg" ]; then
            newdsiqn[$i]=$d
            i=`expr $i + 1`
        fi
    done
    if [ "$newdsiqn" != " " ]; then
        echo "Discovered the below EqualLogic Targets :"
        nl=0
        for ndiqn in "${newdsiqn[@]}"
        do
            iqnshowlist[$nl]=`iscsiadm -m discovery -P 1 | grep Target | cut -d " " -f 2 | grep -w $ndiqn`
            nl=`expr $nl + 1`
        done
    else
        echo "---------------------------------------------------------------------------"
        echo " Check the storage configuration "
        echo "---------------------------------------------------------------------------"
        echo " Note:"
        echo " 1. Check whether you have promoted the Volume."
        echo " 2. Check the Access Control records of the promoted Volume"
        echo " 3. Check the CHAP configuration (/etc/iscsi.conf) on the host"
        echo "---------------------------------------------------------------------------"
        exit
    fi
    iqnno=1
    for j in "${iqnshowlist[@]}"


    do
        echo " $iqnno : $j"
        iqnno=`expr $iqnno + 1`
    done
    #echo -n "Select the iqn which you want to login from the above list [Eg. 1 or 2 ...] : "
    #read seliqn
    selstate=yes
    while [ $selstate = "yes" ]
    do
        selstatus=yes
        echo -n "Select the iqn which you want to login from the above list [Eg. 1 or 2 ...] : "
        read seliqn
        expr "$seliqn" + 1 > /dev/null 2>&1 || selstatus=no
        if [ "$selstatus" != "no" ]
        then
            if [ $seliqn -gt 0 ] && [ $seliqn -lt $iqnno ] ; then
                selstate=no
            fi
        fi
    done
    iqnlogin=${iqnshowlist[`expr $seliqn - 1`]}
    iscsiadm -m node -T $iqnlogin -l &> /dev/null
    iscsinic=(`iscsiadm -m session | grep $iqnlogin | cut -d " " -f 4`)
    noiface=0
    for k in "${iscsinic[@]}"
    do
        noiface=`expr $noiface + 1`
    done
    if [ $noiface -gt 1 ]; then
        echo "Found Multipaths to the storage Device from the Host"
        while [ "$confmul" != 'y' ] && [ "$confmul" != 'Y' ]
        do
            echo $confmul
            echo -n "Do you want to configure the Multipath [ y / Y / (q to Exit)] : "
            read confmul
            if [ "$confmul" = 'q' ]; then
                echo "Don't Forget to Configure Multipath Manually For the selected target Device"
                exit
            fi
        done
        getdisk
    fi
    #if [ $noiface -gt 1 ]; then
    #    echo "Found Multipaths to the storage Device from the Host"
    #    echo -n "Do you want to configure the Multipath [ y / Y / (ENTER to Exit)] : "


    #    read confmul
    #    if [ "$confmul" = "y" ] || [ "$confmul" = "Y" ]; then
    #        getdisk
    #    else
    #        echo "Don't Forget to Configure Multipath Manually For the selected target Device"
    #        exit
    #    fi
    #else
    #    exit
    #fi
}

getdisk()
{
    diskname=(`iscsiadm -m session -P 3 | grep -iw 'Target\|disk' | grep -iw -A 2 "$iqnlogin" | grep -io sd[a-z]`)
    disk1=${diskname[0]}
    disk2=${diskname[1]}
    getscsiid
}

getscsiid()
{
    disk1_id=`/sbin/scsi_id -gus /block/$disk1`
    disk2_id=`/sbin/scsi_id -gus /block/$disk2`
    if [[ "$disk1_id" = "" || "$disk2_id" = "" || "$disk1_id" != "$disk2_id" ]]; then
        echo "scsi id of the disks doesn't match, please select the right disks"
        getdisk
        getscsiid
    else
        multiconf
    fi
}

multiconf()
{
    echo -n "Enter the multipath disk name you want to assign for the above disk : "
    read mdisk
    if [ "$mdisk" != "" ]; then
        echo -e "\tmultipath {" >> mul1.txt
        echo -e "\t\twwid\t$disk1_id" >> mul1.txt
        echo -e "\t\talias\t$mdisk" >> mul1.txt
        echo -e "\t }" >> mul1.txt
        sed -i '/multipaths/r mul1.txt' /etc/multipath.conf
        rm -rf mul1.txt
        echo "-----------------------------------------------------------"
        echo "Multipath device Name $mdisk Configured for $disk1 and $disk2"
        echo "-----------------------------------------------------------"
        service multipathd reload &> /dev/null
        # status
    else
        multiconf
    fi
}


#status()
#{
#    echo -n "Do you want to login to another target [ y / Y / (ENTER to Exit)] : "
#    read ans
#    if [ "$ans" = "y" ] || [ "$ans" = "Y" ]; then
#        iscsidisk
#    else
#        exit
#    fi
#}

echo -n "Do you want to continue [ y / Y / (ENTER to Exit)] : "
read confirm
if [ "$confirm" = "y" ] || [ "$confirm" = "Y" ]; then
    iscsidisk
else
    exit
fi
echo "Use the same script to Discover, Login and Configure Multipath for each of EqualLogic Targets"
echo "Configuration Successful"