Post on 07-Oct-2015
description
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 1
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery With
IBMs Enterprise Storage Server (ESS)
Sanjoy Das, Siegfried Schmidt, Jens Claussen and BalaSanni Godavari
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 2
The following terms are trademarks of International Business Machines Corporation in the United States, other countries, or both: AIX DB2 DB2 Universal Database Enterprise Storage Server (ESS) ESCON FlashCopy OS/390 Tivoli TME 10 The following terms are trademarks of SAP AG in Germany, in the United States, other countries, or both: SAP SAP Logo mySAP.com R/3 ABAP SSQJ Advanced Technology Group OSS SAP R/3 Note Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Legato NetWorker is a trademark of Legato Systems, Inc. in the United States, Other countries, or both. Windows is a trademark of Microsoft Corporation in the United States, Other countries, or both. Veritas NetBackup is a trademark of Veritas Software in the United States, Other countries, or both. Other company, product, and service names may be trademarks or service marks of others.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 3
The information provided in this document is distributed "AS IS" basis without any warranty either express or implied. IBM AND SAP DISCLAIM ALL EXPRESS AND IMPLIED WARRANTIES WITH RESPECT TO SUCH INFORMATION, INCLUDING ANY WARRANTIES OF MERCHANTIBILITY OR FITNESS FOR A PARTICULAR PURPOSE. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into their operating environment. While the information contained in this paper has been reviewed by IBM and SAP for accuracy, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. The performance data contained herein was obtained in a controlled environment based on the use of specific data. Actual results that may be obtained in other operating environments may vary significantly. These values do not constitute a guarantee of performance. References in this document to IBM and SAP products, programs, or services do not imply that IBM or SAP intend to make such products available in all countries in which each company operates. Neither IBM nor SAP warrants the others products. Each companys products are warranted in accordance with the agreements under which they are provided.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 4
Abstract.......................................................................................................................................................... 5
Preamble ........................................................................................................................................................ 6
1 Introduction .......................................................................................................................................... 6
2 Customer Requirements...................................................................................................................... 7
3 SAP Requirements ............................................................................................................................... 8
4 DB2 UDB and ESS Features to Support the Split Mirror Backup / Recovery Solution................. 10 4A DB2 UDB Features ..................................................................................................................... 11 4B FlashCopy - ESS's Advanced Local Copy Function ............................................................... 12 4C Peer-to-Peer-Remote Copy (PPRC) ESSs Advanced Remote Copy Function .................. 13
5 Split Mirror Backup & Recovery (SMBR) Setup ............................................................................... 16 5A SSQJ: R/3 Load Simulation and Testing Tool ......................................................................... 17 5B SAP/DB2 UDB File System Definition....................................................................................... 18 5C Split Mirror Infrastructure Setup............................................................................................... 21 5D ESS LUN Definition .................................................................................................................... 22 5E Volume Group Assignment... 24
6 Split Mirror Backup & Recovery Process ......................................................................................... 26 6A Start Situation............................................................................................................................. 26 6B Split Mirror Backup & Recovery (SMBR) Scenarios................................................................ 34 6C Key Observations about the Split Mirror Backup & Recovery Scenarios ............................. 37
7 Recovery.............................................................................................................................................. 38
8 Process Automation - Solution Integration in Customer Environments ....................................... 42
Acknowledgements..................................................................................................................................... 43
Appendix...................................................................................................................................................... 44 A) DB2 UDB Architectural Overview ............................................................................................. 44 B) DB2 UDB Database Growth and Impact on Storage ............................................................... 49 C) DB2 UDB Memory Management................................................................................................ 51 D) Backup and Recovery Management for DB2 UDB .................................................................. 52 E) Work Process Management in SAP R/3 with DB2 UDB and Storage..................................... 55 F) SAP R/3 data layout.................................................................................................................... 57 G) ESS Raid5 Disk Layout Considerations for SAP R/3 Environments ...................................... 59 H) Considerations for DB2 UDB database layout ......................................................................... 63 I) Sub System Device Driver (SDD) .............................................................................................. 66 J) Volume Layout for DB2 UDB ..................................................................................................... 69 K) The Role of the ESS Specialist .................................................................................................. 71
References................................................................................................................................................... 72
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 5
Abstract
Recent new products such as SAPs B2B Procurement, CRM, mySAP.com Portals and
Trading Exchange markets require an ever increasing need for continuous system
availability.
This paper provides information on an essential component of advanced infrastructure
solutions the High Availability Split Mirror Backup/Recovery (SMBR) for Systems of a mySAP landscape on IBMs DB2 UDB operating in AIX environment. In the following R/3
will be used as a synonym for SAP products.
The solution described in this paper is intended to deliver a backup process with no
impact on the live R/3 system (server less) using the advanced functions of IBMs
Enterprise Storage Server (ESS). This zero downtime for the live R/3 system means
that SAP users typically will not see a disruption of activity while the backup of the live
database takes place. No transactions typically are cancelled during the copy
process/backup process.
Instant availability of consistent copies of the database provides the ability to place an
emergency system at the users disposal while recovering the live database from a
disaster. Beyond Backup / Recovery, a consistent copy of the database may be used for
various purposes, such as creation of a reporting instance, production-fix system or
repository for building Business Warehouse (BW) system.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 6
Preamble This white paper, written from the DBA's perspective, addresses the infrastructure
design, implementation tasks and techniques required for complex Enterprise
Application Integration landscapes for high availability SAP R/3 applications consisting of
a database (DB2 UDB), an operating system (AIX) and a storage subsystem (ESS)
which all interoperate to deliver an easy-to-manage backup & recovery solution for SAP
customers. Backup & Recovery solutions are mission critical activities in todays world of
7 x 24 computing and are a major focus for IT personnel, application management and
DBAs. With the exploding growth in storage requirements for SAP application
environments, this work touches on major elements of each area of technology spanning
critical operating requirements and how this storage-centric solution delivers compelling
value for SAP customers.
1 Introduction
Service level agreements increasingly have to reflect that in case of planned downtime
such as backup, hardware / software maintenance, R/3 upgrade and unplanned
downtime such as database Data Manipulation Language (DML) error, error analysis
and restores, the system has to be available within minutes.
SAPs Advanced Technology Group has developed scenarios using live databases that
constantly copy or mirror using storage subsystems, allowing business continuation
during the split (and resynchronization) of the mirror. Once the logical database mirrors
are established, additional copies are created for backup and for use by a standby SAP
R/3 System. This solution minimizes the I/O load impact of the live environment and
offloads the backup activity away from the live database server/storage server to a
standby/backup server host. The Split Mirror solution, based on this concept, was
successfully implemented using DB2 UDB database on the AIX platform using the
Enterprise Storage Server. This solution was implemented using SAP R/3 4.6B with DB2
UDB 7.1 on AIX 4.3.3. It can be implemented for either a single or a dual ESS
configuration using the ESSs advanced functions the local copy function FlashCopyTM
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 7
and the remote synchronous copy function Peer-to-Peer-Remote-Copy (PPRC). The
core R/3 system was loaded using SAP-developed SSQJ tool for OLTP volumes.
This solution in dual ESS configuration is intended to enable customers to implement
remote data vaulting and/or to scale to a larger database size.
A solution similar to the one described in the following pages was implemented on IBMs
Enterprise Storage Server (ESS) with SAP R/3 on DB2 OS390 and first demonstrated in
November 1999 (see reference [9.,10.] for details). The Split Mirror Backup on the
DB2OS390 for both RVA and ESS storage systems used a special write suspend
function created by DB2OS390 especially for executing the Split Mirror Backup/
Recovery solution (SMBR).
This Open Systems solution for SAP R/3 with DB2 UDB on AIX takes advantage of a
similar functional capability created by IBM especially for this DB2 UDB Split Mirror
Backup /Recovery solution.
2 Customer Requirements
Increasingly, with the rapid trend towards very large databases, accompanied by the
need for high availability in a global computing environment, customers now demand
that production systems be available on a 7 x 24 basis. This also means that in case of a
disaster, the system has to be available within minutes or hardly longer than the time
needed for the physical reload of the database from a secondary or remote storage
media. This high availability requirement also implies that backup and the creation of an
emergency system may not cause any downtime of the live production system and all
procedures to achieve this must be seamlessly automated.
Customers are also very aware of the fact that software or application logical errors and
not hardware failures are the most likely causes for the need for disaster recovery
capabilities. But, for mission critical applications, customers will do everything possible to
optimally protect themselves from a hardware disaster. This means that cost effective,
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 8
intelligent fault tolerant storage subsystems are an important ingredient of high
availability advanced infrastructure solutions, helping to insure against cost-prohibitive
downtime possibilities.
Along with the demanding requirements of non-stop, web-centric computing, customers
now have to contend with business processes spread over multi-vendor platforms,
databases and file systems, where automated interaction between systems provide the
essence of competitive productivity. In these emerging environments, they need and
demand database availability with current data, fast backup/restore/recovery with no
impact on the main production (OLTP) system, all enabled with seamless automation
through well defined, user friendly management interfaces. With the emerging world of
SAP technologies, customer environments have to meet stringent requirements for
availability, scalability and flexibility that can handle changing customer requirements
based upon SAP instance strategies, data migration requirements and disparate growth
rates of databases.
3 SAP Requirements
In order to deliver a solution that matches the complex requirements for high availability
backup, recovery and performance, the SAP administrative workload should be taken off
the live production system and performed on a copy of the system. The recommended
environment is depicted in Figure 1.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 9
M irro red D ata
lo cal co p y (sp lit)
P ro d u ction D ata
rem ote copy (sp lit )
2n d m irro r 3rd m irro r
A dm inis tratio nD B C heck
U pd ate S ta tisticR /3 R epo rting S ystem
Figure 1: SAP High Availability Advanced Infrastructure Backup & Recovery Conceptual Model
While a production server is connected to a primary fault tolerant storage controller, a
mirrored remote copy of the production database is created without the computing
support of the production server, in a separate/secondary fault tolerant storage controller
located at a spatially separated site, providing high availability and disaster recovery
capability for business continuance. This copy can be accessed by a standby/backup
server should IT management procedures require its intervention in the event the
primary site experiences business interruption. A local copy function within the
secondary storage controller produces a consistent point-in-time copy of the production
database. This consistent copy, at the option of the user, can be either held inside
secondary storage controller or be transferred to tape for remote vaulting. In the event
that the primary database or its mirror is not available, the production server or the
backup server can be connected to this disk resident copy, helping to dramatically
reducing downtime.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 10
The database and SAP Basis administration can be augmented by providing a
homogeneous split mirror-based system copy for the purposes of test upgrades, support
packages and database recovery routines. Intelligent storage controllers need to deliver
this capability to create near-instant, consistent copies of the database.
Thus a number of key benefits can accrue to users from this recommended
environment:
- Backup & restore time is database size independent
- Minimum impact on production environment
- Physical and logical disaster prevention
- Enhanced database administration
- Remote data vaulting
- Offloading backup from production database server
- Reduction of backup-restore-time to minutes
In order to realize this concept, the database management system needs to cooperate
with storage subsystems to deliver the results. It should support the creation of a
consistent database copy during the application READ/WRITE processing in a manner
that exerts negligible impact on the production system, database or the live production
storage server.
4 DB2 UDB and ESS Features to Support the Split Mirror Backup / Recovery Solution
This section describes the key DB2 UDB features required to create a consistent backup
database image of the production database using a set of specially developed DB2 UDB
commands. The backup image includes SAP R/3 objects and File definitions for DB2
UDB. This backup process utilizes two advanced functions of the ESS storage sub
system such as Peer-to-Peer Remote Copy (PPRC) and FlashCopy.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 11
4A DB2 UDB Features
For very large DB2 UDB databases in SAP R/3 production environments, the Split Mirror
backup capability is essential for the creation of consistent database backup copies
without stopping the production system. In order to make this possible, DB2 UDB
created a set of special commands that enable temporary suspension of writes to the
database:
DB2 SET WRITE SUSPEND FOR DATABASE: This command suspends the I/O
writes and puts the tablespaces into a new SUSPEND_WRITE state. Writes to the
logs are also suspended by this command. However, read-only transactions are
allowed to continue. Any changes to tablespace information needs to wait till the
writes are subsequently resumed. Once the SUSPEND_WRITE state is in effect
for the database, all files and directories within the database are protected from
updates.
DB2 SET WRITE RESUME FOR DATABASE: This command resumes I/O writing
and removes the SUSPEND_WRITE state and makes the tablespaces available
for updates.
For an overview of the DB2 UDB architecture and other features that take advantage of
ESS storage management capabilities, please refer to Appendix A (DB2 UDB
Architectural Overview), Appendix B (DB2 UDB Database Growth and Impact on
Storage) and Appendix C (DB2 UDB Memory Management).
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 12
4B FlashCopy - ESS's Advanced Local Copy Function
The ESS administration tool (ESS Specialist) identifies the Logical Unit Numbers (LUNs)
by their ESS internal serial numbers. FlashCopy, ESSs near-instant local copy
function, can be used for all systems that have volumes or LUNs within the same Logical
Subsystem (LSS) of an ESS. FlashCopy is set up using the Web interface of the
StorWatch ESS Specialist. Then, task selections on the volume pair FlashCopy with
Full or No Copy, and WITHDRAW options can be made. See reference [12] for set up
details on FlashCopy.
The NO COPY option in FlashCopy is useful if the copy operation has to complete with
in a short time so that the source database/application are returned to their normal
usage from a quiesced state. WITHDRAW command enables the removal of source and
target relationships from a previously specified NOCOPY command. The relationship
between source and target will automatically end when the physical copy is completed.
Using the ESS Specialist, FlashCopy tasks - with either No Copy or Physical Copy
options are created. These tasks are initiated from the Unix command line using
Command Line Interface (CLI) rsExecuteTask. The status of those tasks can be queried
using the CLI, rsExecuteQuery. Based on the results of this query, the SMBR
automation process can capture the error code for each of the FlashCopy tasks.
FlashCopy, when set up by the StorWatch ESS Specialist, creates an identical copy of
the source volume onto target volumes when an appropriate task is initiated using CLI.
Volume identification or disk signatures need to be validated with respect to the host that
is connected to the ESS in order for that host to start using the target ESS FlashCopy
volumes. In order for a single host to mount both source and target volumes of
FlashCopy pairs, AIX provides the RECREATEVG command, which is packaged as a
PTF for AIX 4.3.3 in APAR IY10456. It is officially available in:
1. AIX 4.3.3 Recommended Maintenance Level 05 (RML05)
2. AIX 4.3.3 RML06
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 13
The RECREATEVG command overcomes the problem of duplicated LVM data
structures and identifiers caused by a disk copying process such as FlashCopy. It is
used to recreate an AIX Volume Group (VG) on a set of disks that are copied. Then, the
normal commands to bring the volume online, i.e. mounting can be attempted by the
host operating system.
4C Peer-to-Peer-Remote Copy (PPRC) ESSs Advanced Remote Copy Function PPRC is a hardware-assisted synchronous remote copy or synchronous mirroring
function that can help preserve data integrity. Synchronous mirroring means that an I/O
is only completed until after it is acknowledged from the remote site.
PPRC is set up on a volume or LUN basis in two or more ESS's, which are connected by
ESCON as in Figure 2. Updates to a PPRC volume on the local or primary site (primary
volume) go first into cache and non-volatile storage (NVS) in the primary storage. The
updates are then sent over the ESCON links via larger ESCON frame transmission to
the remote or secondary storage control. When the data is in cache and NVS on the
secondary site, the receipt of the data is acknowledged and the primary storage control
signals to the application the completion of the I/O by a Device End status.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 14
Figure 2: PPRC Connectivity using ESCON Links
The enhancements to ESCON protocol, implemented in ESS micro code as an
advanced copy function, allow an extended distance between two ESS's of up to 103
km, when using multi-mode to mono-mode ESCON converters, amplifiers and switches.
PPRC can be implemented over longer distances using channel extenders from third
party suppliers certified on the ESS.
Up to eight ESCON links are supported between two ESS storage subsystems. The
local primary storage control with PPRC source volumes and the remote secondary
storage control are connected via ESCON links. One ESS storage control can act as
primary and secondary at the same time if it has PPRC source and target volumes.
PPRC links are uni-directional, as shown in Figure 3, so that a physical ESCON link can
be used to transmit data from the primary storage control to the secondary.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 15
Figure 3: PPRC Links between two ESS Storage Sub Systems
Before PPRC pairs can be established, logical paths must be defined between the
logical control unit images also called as Logical Sub Systems (LSS). The ESS supports
up to 16 logical Fixed Block (FB) control unit images and up to 32 SCSI/Fiber controller
images. Logical paths are established between LSS of the same type over physical
ESCON links that connect suspending and resuming pairs.
During the suspension of a pair, the primary control unit maintains a bitmap in NVS (Non
Volatile Storage) with a flag bit for each track that was changed on the primary volume.
This allows for a later resynchronization (RESYNCH) of the volume pair while allowing
only cylinders flagged in the bitmap table to be copied to the remote site.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 16
As with FlashCopy tasks, PPRC tasks are created via the ESS Specialist and then
invoked from the Unix command line using CLI interface. Based on the result of the
query - rsExecuteQuery, the SMBR automation process can capture the error code for
each of the PPRC tasks.
5 Split Mirror Backup & Recovery (SMBR) Setup The SMBR set up involves the physical database design from the SAP installations
guide. This includes file systems definitions according to sizing (based upon planned
usage statistics, I/O forecasts, numbers of users etc.). The file systems requirements
follow standard SAP installation procedures.
In a full-featured high availability SMBR solution we recommend to use two physically
separated database hosts and two ESS, each containing two copies of the production
database. In our test environment we used two ESS clusters (refer to figure 6, 7) like
separated storage systems, but without reservations this can simply be extended to a
two ESS implementation.
The installation binaries for SAP kernel and DB2 along with the SSQJ (a load testing tool
developed by SAP) file systems are also setup for each of the four copies. SSQJ was
developed with ABAP4 in R/3 for benchmarking based on SQL/ABAP statements and
table manipulation for performance / throughput analysis.
The ESS LUNs design is based on logical addressing of striped physical volumes in a
RAID5 array. The LUN definitions are based on file system requirements. And are
translated into volume groups via IBMs Subsystems Device Driver (SDD) mapping of
virtual paths installed on the AIX host.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 17
5A SSQJ: R/3 Load Simulation and Testing Tool This Split Mirror Backup / Recovery solution utilized SAP designed SSQJ tool for load
simulation. This tool is a generic test measurement tool that was developed with ABAP/4
in core R/3 and it enables testing and measurement of the following functions in the R/3
Basis system such as runtimes of specific functions and their alternatives, resource
consumption, query plans in the case of DB statements and other functions. SSQJ runs
on all SAP-supported database systems with built in infrastructure for Measurement &
Analysis including environment History of measurements.
A number of test suites are included in the current version of the package:
1) SQL statements
2) ABAP statements
3) Large Tables for performance and throughput analysis
4) Archiving Systems
5) Data Conversion
6) TPC/D-Benchmark and
7) Business Warehouse
SSQJ is currently used for SAP Basis Benchmarking, by SAP Database Porting teams,
SAP Database Partners and other SAP Hardware Partners.
The SSQJ tool was used as the kernel database for the DB2 UDB SMBR testing.
SSQJ requires the creation of two new tablespaces, PSAPSSQJD for data and
PSAPSSQJI for indexes. The size of the SSQJ tablespaces depends on the required
database size. For the initial installation, we require at least 8GB for PSAPSSQJD and
4GB for PSAPSSQJI. The data files for PSAPSSQJD are in /db2/UDB/sapdata3 (total of
8 files of 2GB each), as shown in Figure 5, and the data files for PSAPSSQJI are in
/db2/UDB/sapdata4 (total of 4 files of 2GB each).
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 18
5B SAP / DB2 UDB File System Definition SAP R/3 Backup Objects and File Definitions for DB2 UDB
All objects in the R/3 environment need to be backed up. These objects are:
1. R/3 data objects:
a. R/3 archiving objects
b. R/3 Interfaces
c. SAP Executables
2. Computing center data such as:
a. The Operating System
b. Third Party Programs connected to R/3
3. Database objects
DB2 UDB supports both Online and Offline database backups. Given a properly
implemented database backup cycle, both forms of backup attain the same end goals.
The specific circumstances at each SAP R/3 installation determine which of the two
methods is appropriate. The SAP tools for database backup support both options. Figure
4 depicts the SAP Backup Objects and Figure 5 depicts the UDB File Systems for SAP.
DB2 UDB Backup Objects for SAP Directory Meaning
/$INSTDIR/db2_07_XX/ DB2 software (specified during DB2 installation)
/db2/ DB2 instance directory
/db2//sqllib/bin,tsm,doc,.. DB2 binaries, TSM control files, documentation
/db2/
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 19
DB2 UDB File System Definitions for SAP Volume Group
File system Name Function Size (in 8MB Physical Partitions)
/usr/sap/UDB Link to /sapmnt/UDB 0
/sapmnt/UDB SAP Executables 60
/usr/sap/trans Transport directory 60
/db2/UDB DB2 Executables 130
Sapvg
/db2/db2as 236
/db2/UDB/log_dir Online Log files 50
/db2/UDB/sapdatat Temporary Tablespace 50
Logvg
/db2/UDB/log_archive Archive log files 470
/db2/UDB/sapdata1 SAP R/3 data files 3810 LSS10data
/db2/UDB/sapdata2 SAP R/3 data files 3810
/db2/UDB/sapdata3 SAP R/3 data files 3810 LSS12data
/db2/UDB/sapdata4 SAP R/3 data files 3810
/db2/UDB/sapdata5 SAP R/3 data files 3810 LSS14data
/db2/UDB/sapdata6 SAP R/3 data files 3810
LSS16data Not Assigned For future data files -
/basebackup SAP backups 2800 Backupvg
/usr/sap/trans/data SSQJ data load 700
Figure 5: UDB File Systems for SAP
In Figure 5 above, the volume groups LSS10data, LSS12data, LSS14data and
LSS16data are 64GB each. The other volume groups are sapvg and logvg comprising of
1GB LUNs. sapvg is used for the SAP and DB2 UDB executables, and logvg holds the
online and archive logs.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 20
SAP delivers binary executables in order to help achieve on-line and offline backups as
a part of the kernel installation. The external data management tools like Tivoli Storage
Manager (TSM) have products that integrate into SAP R/3 as referred in Appendix D. As
show in Figure 16, the SAP R/3 objects DB2 control files, DB2 containers, offline log
files and online log files are required for a consistent database backup and restore /
recovery.
In order to achieve SAP requirements for SMBR, the ESSs advanced copy functions,
namely FlashCopy and Peer-to-Peer-Remote-Copy are incorporated into the solution.
For the initial SAP installation, the file systems are created as shown in Figure 5.
As shown in Section F (SAP R/3 Data layout for UDB) of the Appendix, consideration
must be given to sizing recommendations, SAP landscape, and data layout spread
evenly across storage systems. ESS RAID5 disk layout considerations, as detailed in
Section G (ESS RAID Disk Layout Considerations for SAP R/3 Environments) of the
Appendix, ensure that the "hot spot phenomenon common in non-RAID5 environments
is eliminated by striping all DB2 UDB tablespaces across all devices, thus utilizing the
cumulative throughput of all device adapters. Additional considerations for DB2 UDB
database layout on ESS should incorporate optimal values for PREFETCHSIZE,
DB2_PARALLEL_IO, DB2_STRIPED_CONTAINERS, tablespace EXTENTSIZE,
database page size as outlined in Section H of the Appendix.
In order to complete the file systems layout, first allocate the logical volumes by
choosing the appropriate AIX options. Then create the file system on top of these logical
volumes. At this point we complete the installation of DB2 UDB 7.1 with fix pack 2 for our AIX host
joyjoy and complete the installation of SAP Version 4.6B. The SAP and DB2 UDB
system/instance names are UDB.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 21
5C Split Mirror Infrastructure Setup
As shown in Figure 6, our live database consists of DBA for the SAP Instance UDB on
the primary ESS attached to host joyjoy. It contains all the DB2 binaries, SAP kernel,
transport directory, DB2 online / archive log directories, DB2 instance directory, DB2
data container file systems and SSQJ file systems.
DBC is the PPRC mirror of the live database DBA. DBB is the Safety FlashCopy of the live
database DBA. This safety copy is created in order to maintain point in time copy of the
live production database should there be any problems encountered with DBA during the
SMBR operation.
Figure 6: R/3 Split Mirror Backup & Recovery and Backup Host
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 22
In situations where the customers Basis or DBA group would like to mount the file
systems for verification purposes by using the FlashCopy or PPRC target volumthe
In situations where the customers Basis or DBA group would like to mount the file
systems for verification purposes by using the FlashCopy or PPRC target volumes on
the same host, the Physical Volume Identifier (PVIDs) need to be renamed using the AIX
RECREATEVG command mentioned in Section 4B.
5D ESS LUN Definition
For all references to LUN and volume group sizes, 1GB translates to 1*1000*1000*1000
bytes. All the LUNS for host joyjoy are defined only on cluster 1 for our SMBR solution
scenarios using FlashCopy and PPRC. Each of the 8 ranks on ESS1 (Cluster 1) consists
of twelve 8GB LUNs. In addition to these, two 1GB LUNs are also defined on each of the
ranks. The 8GB LUNs are used for the DB2 UDB data files, while one set of eight 1GB
LUNs is used for db2 log_dir and log archive files. The other set of eight 1GB LUNs is
used for the SAP R/3 and DB2 UDB executables and other temporary storage.
OSPL2 and OSPL6 are the production and disaster recovery ESS's in our setup.
For the initial installation, four 8GB LUNs from each rank are used to layout the SAP R/3
DB2 UDB data files. This leaves the other eight 8GB LUNs for FlashCopy (safety copy)
and PPRC.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 23
Figure 7: ESS Logical Sub Systems (LSS)
As shown in Figure 7, there are a total of eight LSS's, four on each cluster of an ESS.
Each LSS on a cluster has two ranks assigned to it. There are two sets of four 8GB
LUNs of an LSS that are assigned to one volume group.
In our case, that translates to four volume groups of 64GB each, one for each LSS. The
volume groups are LSS10data, LSS12data, LSS14data and LSS16data on ESS1
(OSPL2). Similarly, LSS11data, LSS13data, LSS15data and LSS17data are created on
ESS2 (OSPL6).
The other volume groups are sapvg and logvg comprising of 1GB LUNs. sapvg is used
for the SAP and DB2 UDB binaries, and logvg holds the online and archive log files.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 24
In an optimal database layout where the distribution of tablespaces follows the basic
principle spread the data over as many physical disk/arrays as possible (refer to [4])
tablespaces should be extended by at least one full stripe set across all arrays.
This ensures that tablespaces remain distributed across a stripe set. Allocation of a
specific number of arrays for a tablespace facilitates distribution and placement of the
data files. The more arrays that are in the allocation of arrays, the better the distribution.
Data file distribution should be based on a round-robin placement across ESS clusters
and arrays and the granularity can be achieved using AIX Logical Volume allocation.
In the Split Mirror Backup and Recovery Solution validation scenarios, the AIX Logical
Volume Manager maps the assigned ESS 8GB LUNs as hdisks. Grouping the LUNS of
a single RAID array can create volume groups. Under this traditional route for database
layouts, AIX file systems are created over AIX logical volumes that reside on a single
array. The data files of a tablespace are then distributed over file systems on different
arrays. The criterion for a data file to be placed in these file systems is still the same -
the overall I/O activity should be distributed across all available arrays. Creation of
volume groups and logical volumes should be within the constraints imposed by the AIX
Logical Storage Management.
5E Volume Group Assignment
Once the LUNS are defined on each ESS and assigned to hosts "joyjoy" and "decme",
the LUNS are made available to "joyjoy" by running the cfgmgr command on that host.
The Subsystems Device Driver (SDD) is a high availability automatic I/O Path Failover
Function that provides management of active paths to the LUNs as outlined in Section I
of the Appendix. The SDD software needs to be installed on the hosts before running
cfgmgr.
The LUNs appear as hdisks on joyjoy. If a LUN is assigned to one path, an hdisk, lets
say hdisk5 is defined on joyjoy. As joyjoy has 4 SCSI adapters, the LUNs are configured
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 25
such that they can be accessed through all four paths. For each additional path, another
hdisk is assigned. So for 4 paths, we have four hdisks all pointing to the same LUN on
the ESS.
When SDD is used, additional data path devices called vpath devices are created. Each
hdisk set (a set is based on the number of paths from the host to the LUN) is assigned a
vpath device. In our example, vpath0 will consist of hdisk5, hdisk45, hdisk89 and
hdisk125.
When a volume group is to be created on joyjoy, two options are available (in smitty)
(1) Add a Volume Group, or
(2) Add a Volume Group with Data Path Devices.
As we are using SDD, the volume groups are created with the second option.
Volume Layout for DB2 UDB The Volume layout for SAP R/3 system in DB2 UDB for ESS is depicted in Figures 25A
and 25B (Logical Sub System layout of vpath mapping to hdisks) in Section J of the
Appendix.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 26
6 Split Mirror Backup & Recovery Process 6A Start Situation The live system is in normal READ/WRITE operation state. The Log Volumes (log_dir,
log_archive) are in a constant synchronous PPRC connection between source DBA and
DBC across ESS1 and ESS2 along with Data Container Volumes as shown Figure 8.
After the prior SMBR backup activity, the DBD instance is in FlashCopy No Copy
relationship with DBC while DBB is in FlashCopy No Copy relationship with DBA.
Figure 8: Split Mirror Backup / Recovery and Standby R/3 System
The nine SMBR process steps depicted in Figure 8 are listed below:
1: FlashCopy (FC) DBA to DBB (Safety Copy) 2: Resync DBA to DBc 3: Withdraw Prior FC DBD 4: Write Suspend DBA and Freeze PPRC pairs 5: Create final FC DBD & Write Resume 6: Resync PPRC 7: Backup DBD to EBU/TSM 8: Mount FC Volumes to Backup Host 9: Start R/3 DB2 UDB on Backup Host
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 27
Backup Process
The first step in SMBR process would be to collect all the current layout details from
within the DB2 UDB database. Obtain the database structure information through the
query block ("LIST TABLESPACES SHOW DETAIL") that can determine the latest
information on the database structure including the log group, tablespaces, and
container information. During this step, the file structure information and volume group
layout along with device list from the host operating system is obtained.
On the Stand-by side shutdown all SAP and DB2 UDB processes and clean up any
semaphores that pertain to SAP/DB2 DB2 UDB on the target host decme and also on
the Safety FlashCopy target volumes (DBB) in the primary ESS. This step will confirm
that the current users in the SAP instance UDB on the decme host (probably the
Production Fix system users) are logged off and the journaled file systems on decme will
be unmounted. Unmount all the application related file systems except those that pertain
to the AIX host operating system on the decme host.
To be prepared for a new backup, the systems must be set back to the start situation.
When the DB2 suspend I/O command is issued on the primary system to stop data
volume activity, the suspension of writes on data and log volumes should prevent any
partial page writes for the next step. Read-only transactions will continue as long as they
are not requesting any resource held by the suspended I/O process.
While the database is in SUSPEND_WRITE state, any update operation to the
tablespaces, such as ALTER TABLESPACE, QUIESCE TABLESPACE, BACKUP
DATABASE ... FOR TABLESPACE, etc., will wait because they are waiting for the
tablespace write latch.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 28
The use of WRITE SUSPEND and WRITE RESUME commands exclusively developed
for the use with Hardware mirroring facilities, allows a consistent point-in-time snapshot
replica of the live database.
In the backup scenario as described above, we create a safety copy on the source
ESS1. This safety copy will enable us to recover the database with a DB2 conditional
restart on the secondary side, while the live system is available for error analysis. In this SMBR set up, the following tasks were created with task names that are indicative of the ESS CLI command sets: FCAllNoBgCpWD: Withdraw all previously held FlashCopy NoCopy relationship
between FlashCopy source/target pairs
FC10121416NoBgCp: FlashCopy source production volumes to target Safety copy volumes in LSS10, LSS12, LSS14, LSS16 in NO COPY mode.
EstALL4PATH: Establish paths and/or ensure that the PPRC paths exist between ESS1 (LSS10, LSS12, LSS14, LSS16) and ESS2 (LSS11, LSS13, LSS15, LSS17)
ResyncPPRCAll: Resynchronize PPRC source and target volumes in between ESS1 & ESS2
Frzall: Subsystem level withdrawal of paths and dataflow between PPRC source and target volumes
FC11131517NoBgCp FlashCopy PPRC target volumes ( DBC ) to DBD volumes in LSS11, LSS13, LSS15, LSS17 in NO COPY mode.
OSPL2C0 is the Cluster 0 in ESS1
OSPL6C0 is the Cluster 0 in ESS2
rsExecuteTask.sh is a CLI command to be initiated from AIX host CLI command option -v provides verbose output
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 29
Each SMBR step of the backup process is described as follows:
Step 1: Create a Safety FlashCopy of the Production Instance
i. Withdraw previous Safety Copy source/target FlashCopy relationships on the
Primary ESS1:
To Withdraw previous Safety Copy source/target FlashCopy relationships on the
Primary ESS1, as captured in a Task "FCAllNoBgCpWD", the following CLI is
initiated against OSPL2C0 from AIX.
rsExecuteTask.sh -v -s ospl2c0 FCAllNoBgCpWD
Ospl2c0 is the cluster1 of ESS1 that hosts Production Volumes.
FCAllNoBgCpWD is the predefined task set up between source Production volumes
and Target Safety Flash Copy volumes
ii. Quiesce the database write I/O's using the UDB feature
"db2 set write suspend for database"
iii. FlashCopy all DBA volumes to DBB volumes with No Copy option
rsExecuteTask.sh v s ospl2c0 FC10121416NoBgCp
iv. After the completion of the logical NoCopy, bring the database back from quiesce
to normal mode using DB2 UDB feature
db2 set write resume for database.
v. Extract all the system information pertaining to disk storage from within AIX on
joyjoy and remote copy those files to host decme.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 30
Step 2: Resynchronize DBA to DBC
i. Ensure that the Paths do exist between PPRC source (DBA) and target volumes
(DBC); establish if they do not already exist
rsExecuteTask.sh v s ospl2c0 EstALL4PATH
ii. Resynchronize all the PPRC source (DBA) and target (DBC) volumes.
rsExecuteTask.sh v s ospl2c0 ResyncPPRCAll
iii. Monitor % of resynch completions.
Step 3: Withdraw Previously held FlashCopy (source / target) Volumes on ESS2 after stopping secondary R/3 Application and DB2 UDB Database on the Backup host
In ESS2, initiate a withdrawl of previously held FlashCopy NoCopy relationship between
DBC and DBD volumes from a previous backup run.
RsExecuteTask.sh -v -s ospl6c0 FCAllNoBgCpWD
Step 4: Quiesce the Production database and Suspend all the PPRC source and target Volumes
i. Quiesce the DBA - Production database write I/O's using the DB2 UDB feature
db2 set write suspend for database
This will enable a freeze of all write activity on DBA so that PPRC targets - DBC can
function as a consistent FlashCopy source volumes for the final Split Mirror on DBD.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 31
ii. Split Log and Data Volume pairs
(a) Obtain error code and send an automated message upon successful
suspension. Freeze all the ESS subsystem level PPRC relationships
between ESS1 and ESS2. Suspend all paired data and log volumes.
rsExecuteTask.sh v s ospl2c0 frzall
(b) Verify the status of the task completion using rsExecuteQuery in a query loop.
Step 5: Create final FlashCopy that can be backed up to a backup utility
i. Verify the error codes from the PPRC Freeze.
ii. Proceed with the Final FlashCopy. Initiate FlashCopy of the source DBC to target
DBD in NoCopy mode.
rsExecuteTask.sh v s ospl6c0 FC11131517NoBgCp
iii. Verify error codes for the FlashCopy and upon error code of zero, bring back
quiesced database using
"db2 set write resume for database" command.
This step completes the tasks required to create a Split Mirror Copy of Productive
environment.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 32
Optional requirement for a complete physical copy: If there is a requirement for a complete physical copy of the final FlashCopy (logical
copy) targets DBD, that are attached to the Backup Host decme, initiate a FlashCopy
with BackGroundCopy option from DBD to DBE (not shown in the Figure 8) in ESS2.
Step 6: Resynchronize PPRC source and target volumes
After successful completion of FlashCopy (logical copy) and DB2 UDB write_resume
on the Production instance, it is time to reinitiate the PPRC links between ESS1 and
ESS2.
i. Initiate PPRC links and establish paths between ospl2 and ospl6
rsExecuteTask.sh -v -s ospl2c0 EstAll4Path
ii. Resynchronize all the delta activity in ESS1 during the period of PPRC freeze
(during the execution of step 5 above).
rsExecuteTask.sh -v -s ospl2c0 ResyncPPRCAll
Step 7: Backup the Final FlashCopy Volumes to Enterprise Backup Utility (EBU) Once the physical movement of the data from DBC to DBD is completed, then remote
copy all the file systems, volume groups, and device lists from the source Production
Host.
At this point using Tivoli Storage Manager (TSM), Legato or any compatible EBU,
backup DBD instances SAP / DB2 UDB objects to tape. We will always save all the DBD
volumes to tape media by invoking complete file system backup or by using DB2
BACKUP utility.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 33
Step 8: Mount FlashCopy Volumes to Backup Host
Rediscover all the devices that are part of the defined volume groups at the host
operating system level and (varyon) all the volume groups and mount the file systems
required for DB2 UDB & SAP on host decme.
Varying-on and mounting of all the relevant volume groups is done after rediscovering
the physical volumes and remapping those to the virtual paths as seen by the host
operating system via the data path optimizer.
Step 9: Start DB2 UDB on the Backup Host for verification purposes In order for us to verify that it is a consistent copy of the live database, this step is
optional. Startup DB2 UDB using db2inidb. Startup SAP after checking for db2 error
logs.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 34
6B Split Mirror Backup & Recovery (SMBR) Scenarios
Customer Options and Configurations for Split Mirror Backup & Recovery
Figure 9: SMBR Scenarios
Scenario 1: Off-line Backup with FlashCopy using only one ESS storage subsystem. - Stop Live UDB DBA ; FlashCopy (FC) DBA to DBB
- When "Logically Complete", start DBA, move DBB to tape; If customer elects FC
physical copy option for backup and reporting purposes, then wait for FC physical
completion, start DBA. - Move to tape with Tivoli Data Protection (TDP)/Tivoli Storage Manager (TSM).
This Scenario incurs production down time.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 35
Scenario 2: Off-line Backup using FlashCopy and PPRC using two ESS storage subsystems.
- Stop Live UDB DBA; Break/Suspend PPRC links from DBA to DBC - FC DBA to DBB (safety copy); When "logical complete, FC DBC to DBD
- Start PPRC ReSynch
- When FC DBD "logical complete", start DBA - Move DBD to tape with TDP/TS
This Scenario, while requiring production downtime for completion of the FlashCopy command (logical completion), provides the disaster recovery capability inherent in ESSs fault tolerant architecture.
Scenario 3: On-Line backup using FlashCopy and PPRC i.e. on-line implementation of Scenario 2 with two ESS subsystems. - Execute DB2 UDBs "Write SUSPEND"..Write RESUME " commands for
consistent FC.
- FC DBA to DBB (safety copy)
- When FC logical complete, Write Resume
- Start PPRC ReSynch (logs, control and container files); For high availability
considerations it is advisable to make a FlashCopy Copy (physical copy) prior to
executing PPRC with remote ESS
- FC DBC to DBD
- Write Resume DBA
- Move DBA to tape with TDP/TSM
This scenario is intended to keep the production system available during the backup while providing the flexibility to transmit a point-in-time copy to a remote Disaster Recovery site.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 36
Scenario 4: On-line, "Zero-downtime" / Serverless Backup & Recovery
This is the scenario described in section 6A.
- Primary (ESS1/Cluster1) DBA; FlashCopy (FC) DBB; Secondary Mirror
ESS2/Cluster2) DBC; FC Secondary DBD - FC DBA to DBB (safety copy before start of backup process); Execute DB2 UDBs
"Write SUSPEND", Write RESUME " commands for consistent FC.
- ReSynch PPRC (logs + data volumes) pairs to include transactions during safety
copy step and all changed data since last data files resynch process - Monitor % resynch completion; this process creates complete mirror (logs + data) of
ESS1 on ESS2 - As soon as 100% Resynch completes, execute UDB commands " Write SUSPEND " - Suspend PPRC (logs + data vols) on ESS1
- FC DBC to DBD for on-disk Point-In-Time-Copy (NoBgCp - No Background Copy) When FC DBD "logical complete", execute database " Write RESUME ",
"ReSynch" PPRC (log volumes) to resume mirroring of Log volumes for high
availability - Move DBD to tape using TDP/TSM
- Start SAP on backup server from DBD to prove restartability - Integrate all scripts for automation through backup agents like TSM, Legato
Networker etc. For this "serverless/zero downtime scenario - backup activity places no load on the Production Server or Primary Storage. This scenario conforms to SAPs recommendation for high availability, advanced infrastructure backup & recovery requirements.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 37
6C Key Observations about the Split Mirror Backup & Recovery scenarios
Scenario 1: In this single configuration implementation of Split Mirror Backup, customer expects to have downtime as the database is momentarily shut down for Point-
in-Time Copy (PITC) using FlashCopy.
Scenario 2: In this configuration using two ESSs, the customer expects to have downtime while creating Point-in-Time Copy using FlashCopy (physical copy). PPRC is
used to move this FlashCopy (physical copy) on the Primary ESS to the Secondary
ESS. The PPRC targets on the secondary can then be sent to Tape for remote vaulting.
This scenario provides a disaster recovery capability.
Scenario 3: Repeats Scenario 2 keeping DB2 UDB in on-line mode with "Write Suspend" and "Write Resume" Operations. With DB2 UDB functions, downtime is
avoided. As in scenario two above, PPRC provides disaster recovery and remote
vaulting capabilities.
Scenario 4: In this complete Split Mirror Backup solution using two - ESS's, customer has the ability to bring up four copies (Production & Mirror [2 copies] + FlashCopy on
Primary ESS + FlashCopy on Secondary ESS) of the production database. DB2 UDB
"write suspend" and "write resume" features, enable consistent image copies of live DB2
UDB database to be created while suspending and resuming I/O.
After a successful backup, perform a test restore of the database to check the physical
correctness of the data transferred, of the tape device, and it's software drivers.
Depending on the number of different databases laid out within the same LSS and
depending on the requirements for a multi-system database point-in-time concurrent split
mirror backup, individual LUN level SUSPEND can be issued with the ESS CLI instead
of the PPRC Freeze command.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 38
The backup strategy should include verification of the data that was backed up. Perform
a logical data check using db2dart (during periods of low system activity) to verify the
consistency of the DB2 database. Corrupt DB2 blocks (error SQL2412) can appear in an
R/3 database as a result of operating system or hardware errors and may make the
backup unusable. The existence of these blocks only becomes evident during the next
read access attempt to a table within the database. Since this particular access attempt
may not occur often, and corrupt DB2 blocks are not recognized during a database
backup, corrupt blocks can remain undetected in an R/3 UDB system for a long time.
7 Recovery
The backup process described above will provide a consistent copy of the live database
achieved at the moment of application write suspension. This backup file set will be on
tape media and on the DBD volumes of ESS2 (OSPL6). Recovery process in DB2 can
be achieved in 2 types:
Rollforward recovery is performed when all or parts of the persistent storage (disk) have been lost or corrupted. The process could be very lengthy and would require
manual intervention in a non-SMBR process.
Restart recovery is performed by DB2 whenever data in the non-persistent memory (RAM) has been lost, due to a system hardware failure or a power loss. Restart recovery
is done by restarting the DB2 database, which initiates a crash recovery. DB2 reruns all
transactions that were performed previously but have not been recorded in persistent
memory. To enable DB2 to perform automatic restart recovery after a crash, the
database parameter AUTORESTART is to be set to ON by the R/3 installation program.
The "DB2 log file header file" which contains the log sequence number, must be used as
a starting point for this type of recovery. In cases of a disaster that includes loss of this
file, DB2 recovery is done by restarting the database and performing a rollforward
recovery.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 39
The database restart brings the database to the latest consistent state based on the disk
image and the online log files. A restart recovery is based on the entries in the log
control file and the online log files. If a disk error occurs and the contents of the disk are
not recoverable, a restart recovery will not work. This applies to the containers of the
database, the online log files, and the log control file. Therefore, as discussed in Section
J of the Appendix, R/3 UDB Volume Layout has to use ESS RAID5's striping, which
allows a highly available disk subsystem that mirrors the data, the log files, the
containers, and the control files. The contents of DBD and DBC volumes of the SMBR process differ from each other in
that the restart of the DBD SAP UDB instance will roll back all the transactions that were
open at the time of write suspension. Also, because of the RESYNC task in Step 8 of the
SMBR process, DBC and DBA volumes are a pair of synchronous mirrors and differ from
the DBD (Point In Time) volumes.
The DBD volumes can either be left open for multiple purposes like Database validity
checks, Database reorganization dry-run tasks, testing SAP support packages, testing
and applying of OSS notes or for the purposes of serving as a production fix system or
as a reporting instance. To set up the DBD volumes to serve as a standby database:
a) mount the volumes on host "decme"
b) place the DBD in roll-forward pending and then rollforward the mirror.
This is accomplished by running the tool "db2inidb UDB as standby" (UDB is the DB
instance) to place the mirrored database in roll-forward pending state, removing the
suspended write state and rollforward the database to the end of logs.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 40
The database recovery is performed in several steps:
1) For a recovery process, the tape backup needs to be first restored on to the DBD
volumes and then the DBD volumes can be FlashCopied on to the PPRC
volumes on ESS2 (OSPL6). The recovery steps further involve executing PPRC
from source ESS2 to the target volumes on ESS1.
2) Depending on the Service Level Agreements and consistent with customer
scenarios and requirements, we can repeat the steps in reverse order of the
backup scenarios for achieving the recovery of a consistent database copy from
a tape media or from ESS2 volumes on DBD to provide a completely recovered
SAP UDB instance.
3) Transaction log processing. Setup a user exit program to retrieve the log files
that are requested in the recovery history file. The file specifies the log files
needed to maintain consistency of an online backup. In the rollforward phase,
use the DB2 Control Center to roll forward the transactions using the log files that
have been retrieved.
The database can be forwarded to a certain point in time or to the end of the logs
using Coordinated Universal Time (CUT or GMT) since the database server
calculates the timestamps based on this time zone. Applications from different
time zones may connect to the database, but internal processing is done in CUT.
However, the timestamp used for backups is based on the local time zone of the
database server.
4) During the rollforward process, all committed transactions are reapplied to the
database. At the end of the rollforward, a list of open transactions exist. These
transactions are still awaiting a commit. For data consistency reasons, open
transactions will now be rolled back, to make sure that changes that are not
committed are not applied to the database. Then the database is instructed to
"rollforward stop". With this, the database is made consistent.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 41
Crash Recovery
The time needed to perform a crash recovery depends on the amount of work DB2 has
to redo. DB2 must redo part of the transactions in the log files. That is committed
transactions are reapplied to the database and uncommitted transactions are removed
from the database. The starting point for this reapplying is found in the log control file.
This file contains the lowest log sequence number (LSN) needed for a crash recovery.
All log records that are older than the LSN are discarded because these changes to the
data in the buffer pool have already been written to disk during the soft checkpoint.
SOFTMAX (a DB2 UDB configuration parameter) specifies the percentage of a log file
used before a soft checkpoint will occur. Large databases are configured with
SOFTMAX spanning several log files.
DB2 UDB's Forward Recovery uses a backup image (DBD) and the entire subsequent
log files created after the backup are used to recover the database to a consistent state.
The design of a backup strategy should be based on how much time is to be spent on a
recovery. With ESS SMBR's advantage in providing rapid backup of VLDB's (Very Large
Data Base) made possible by the combination of PPRC and FlashCopy CopyServices
functions, customers can have multiple Point In Time backups which will enable them to
perform a faster recovery since there are fewer log files to redo.
Other Considerations during Recovery:
During the restore process, the db2agent process communicates with, and controls the
operation of the db2med and db2bm processes. During restore operations, the media
controllers transfer data from device into restore buffers. During a restore, DB2 uses
page cleaners to transfer the buffer contents to disk. The number of buffers should be
large enough to keep the media controllers and their devices busy (generally more than
2). The size of the restore buffer should be 1024, or preferably 2048 4k pages (4 MB,
8MB) or a multiple of the backup buffer size.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 42
By using the same size for the restore buffers as achieved in a good backup
performance with a certain backup buffer size, excellent restore performance can be
achieved. When a buffer is full, it is transferred to disk on FIFO (First In First Out) basis.
The number of db2med processes depends on the number of restore devices or
sessions. It is always much faster to empty the buffers to disk than to fill them from
tape/TSM. The usage of DB2 UDB parameter, PARALLELISM when reading the backup
image(s) from disk during a restore will be particularly helpful when not using multiple
tape devices or TSM sessions or when not using multiple db2bm processes.
8 Process Automation - Solution Integration in Customer
Environments In establishing this Split Mirror Backup & Recovery solution for R/3 with DB2 UDB and
ESS, IBM has created end-to-end automated procedures that will enable this solution to
seamlessly integrate into customer environments. While recognizing that each customer
will likely implement this solution uniquely, the procedures incorporate use of platform-
independent Java technology, which will execute the customized, set of Split Mirror task
routines set up for a particular backup environment. Inherent benefits of Java technology
implementation will accrue to the customer in terms of flexibility, security, and location
transparency. With this standards-driven approach, customers can integrate this solution
with customer-preferred enterprise solutions management consoles (TME10, OpenView,
BMC etc).
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 43
Acknowledgements
Our sincerest thanks to Siegfried Schmidt and Dr. Jens Claussen, of SAPs
Advanced Technology Group (ATG) for their valuable guidance and for sharing
SAPs vision and requirements with us to accomplish the work that is reported in
this paper. In particular this white paper draws considerably from the insightful
experiments, observation and publications on DB2 UDB and storage, conducted by
Dr. Claussen at the SAP ATG labs in Walldorf, Germany. Siegfrieds earlier work
with us on the Split Mirror Backup & Recovery solution implementation on the
DB2OS390 platform with IBMs RVA and ESS storage subsystems have paved the
way for this current implementation with DB2 UDB on the AIX platform.
We would also like to thank a number of IBM colleagues with whose thoughtful
guidance, this work is brought forward with the benefit of their applied knowledge in
R/3, DB2 UDB and ESS, in delivering high availability, zero downtime, and easy to
use Split Mirror Backup and Recovery Solution for mission critical SAP customer
environments. Matt Huras, Jason Racicot, Bill Micka, David Short and James Teng
have been most helpful with their knowledge of DB2 UDB, the ESS and its
advanced functions and have endured this journey with us.
Sanjoy Das, Siegfried Schmidt, sanjoy@us.ibm.com siegfried.schmidt@sap.com Jens Claussen BalaSanni Godavari jens.claussen@sap.com godavari@us.ibm.com
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 44
Appendix
A) DB2 UDB Architectural Overview
Figure 10: DB2 UDB Architectural Overview
Figure 10 shows the architecture of DB2 UDB structures. It contains data structures on
disk (extents for objects like tables and indices), in memory (buffer pools, log buffers), in
processes (prefetchers, page cleaners, loggers and sub agents) and connectivity for
applications (UDB client library, coordinator agents, for usage of shared memory,
semaphores, named pipes, protocols TCPIP, Named Pipes, SNA, NetBIOS and
IPX/SPX).
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 45
Below the level of SQL, DB2 UDB manages data objects using extents. An extent is an adjacent grouping of a fixed number of DB2 pages.
In order to optimize disk I/O and data distribution, tablespaces manage pages by
grouping them into extents per each database object. The optimal R/3 UDB extent size
ranges from 8 to 64 pages. There should be enough free space in the container of a
tablespace to accommodate the additional extents required for database object growth.
The extent size (the number of adjacent pages) is defined on a per-tablespace level. It
cannot be changed, once it has been defined during installation of the R/3 database.
Typical extent size is eight for tablespaces with many small and unused tables, but up to
64 for the tablespace containing ABAP repository tables. All the extents for a data object
are in one tablespace, but can be spread over several OS files or devices. In DB2 these
storage facilities are called containers.
DB2 UDB supports two kinds of tablespaces:
- Systems Manager Storage (SMS) tablespaces
- Database Managed Storage (DMS) tablespaces
SAP only uses DMS table spaces for data and indexes. The recommendation for R/3
Temp Space is on SMS and for BW on DMS for maximum performance. Both types of
tablespaces may be used in the same database. SMS tablespaces are called Systems
Managed because the operating systems file system manages the containers. With
DMS, DB2 itself manages the container space and allocates the space when the
container is created.
There is no set limitation on the number of extents in a tablespace, but a limit to the total
number of pages. This is reflected in the maximum table/tablespace size of 64 GB with
4K pages, 128 GB with 8K pages, 256 GB with 16K pages and 512 GB with 32K pages.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 46
The DB2 Extended Enterprise Edition (EEE) enables the distribution of a database across multiple hosts and to install multiple partitions on a single host.
The limitation of 2 24 pages per tablespace refers to a single partition of the database.
Consequently, if a tablespace is distributed across multiple partitions (nodes) of the EEE
instance either on a single host or on multiple hosts, the limit is raised to n * 2 24 pages
per tablespace (n is the number of partitions on which the table space is stored). SAP,
however, currently supports EEE only for a few mySAP components.
When a DB2 UDB database instance is started, several processes are created:
1) Dedicated Shadow Processes are created when a new SAP work process
establishes a connection to DB2.
2) Shared Processes are required by the DBMS systems to function and perform
various database management tasks.
The UDB database is stored in 4/8/16/32 KB blocks in data files on disk. The container
files, online and offline log files and control files can be stored like in our test
environment in an underlying set of RAID5 disks of a storage sub system. In order to
accelerate read/write access to the data, these data blocks are cached in the database
buffer pool in production CPU main memory.
Any changes or modification to the database data are logged in the online log files. This
procedure ensures security of data. For fail-safe database operations, without using
additional operating system utilities, the control files and the online active/retained log
(log dir/log retain) files of the database system can be like in our test environment in
constant synchronous remote copy mode.
The UDB database management system holds the executable SQL Statements in the
Shared SQL Area which is part of the shared pool, only once for all processes. The DB2
Storage and Memory structures interact as depicted in Figure 11.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 47
Figure 11: DB2 UDB Storage and Memory Structure
Each R/3 work process, as shown in Figure 12, does the following:
a. Connects to the database as one database user
b. Handles database requests for the different R/3 users
c. Communicates with a corresponding shadow process on the database
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 48
Figure 12: SAP R/3 Work Processes and their Interaction with DB2 UDB
The R/3 naming convention for tablespace names is :
PSAP. The abbreviations in the tablespace name are
part of the container name. Containers are numbered starting with 001.
Depending on the size of each data record, several data records can be stored in a DB2
page. Within a tablespace, the extents belonging to different data objects compete for
storage space. DB2 allows to separate data and index objects into different tablespaces.
In non-RAID5 storage architecture implementations, the emphasis is to distribute data
and index tablespaces to different physical devices, thus improving database
performance.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 49
With ESSs robust RAID5 architecture and its use of striping different types of data on all
RAID arrays, the performance hot-spots are eliminated while I/O performance improves
through parallel access to all members of the array.
B) DB2 UDB Database Growth and Impact on Storage
Over time, because of a gradual and continual accumulation of data or repeated
insertion and deletion operations, fragmentation occurs within a data page. Once space
has been allocated, it can be released for re-use only after table reorganization or a
physical deletion operation (DROP). This can cause storage space fragmentation within
a tablespace, which can be resolved by DB2 UDB online tablespace reorgs.
As shown in Figure 13, reorganizing tables is an operation that is performed on an
exceptional basis. R/3 System may stay up and running, with no heavy transaction load
on the tables which are to be reorganized. PSAPTEMP temporary tablespace is used for
the copy of the table while reorging, thus requiring this tablespace to hold all the data of
the copied table. In general, it is recommended to allow for 2 times the physical size of
the table according to transaction DB02. Very large tables should not be reorganized
using the online method.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 50
Figure 13: DB2 UDB Usage Patterns and Impact on Storage
In order to accommodate growing tablespace requirements, each time a new container
is added to the tablespace, the rebalancing process is started. When a new container is
added, DB2 rebalances the tablespace:
1. It creates the container, that is, it allocates the specified amount of pages in a file
system.
2. When the container is available, a process or a thread is started to rebalance the
extents over all containers of the tablespace. This helps ensure that extents are
distributed evenly among all the containers, assisting data distribution for existing
objects and proper striping for newly allocated extents.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 51
C) DB2 UDB Memory Management
DB2 UDB provides Buffer Pool, Package Cache, and Catalog Cache for memory
management. Figure 14 depicts how R/3 work processes working through DB2 agents
utilize these three memory management functions to execute logical and physical (disk)
access.
Buffer pool(s) - provide the storage for data and index pages. The buffer pool functions
as an optimized cache for the database to improve database system performance. Since
this cache optimizes its strategy for database use, it is better to use the buffer pool(s)
than to use a large file system cache. The SAP architecture design goal is to minimize
disk accesses (physical reads) and to maximize buffer pool access (logical reads). While
the size of the primary buffer pool is controlled by DB2, buffer pools can also be defined
at tablespace level with at least one buffer pool per database.
Package cache: It stores the DB2 UDB database access plans as part of the dynamic
statement cache function, thus helping to reduce the time/cost per connection to the
database.
Catalog cache: It stores binary and compressed descriptors for tables, views, and
aliases to avoid reading directly from disk.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 52
Figure 14: DB2 UDB Memory Management for each SAP work Process
D) Backup and Recovery Management for DB2 UDB
IBM DB2 provides the following tools for performing data backups and restores (Figure
15):
- DB2 Backup backs up the container files and the control information
- DB2 Restore allows data retrieval from the backup image.
- The DB2 backup tools store information in the DB2 History file, which is then read by
R/3 tools.
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 53
Figure 15: Backup & Recovery Tools on UDB for R/3
External data management systems like TSM (Tivoli Storage Manager from IBM) or
Networker (from Legato) enable backup procedures to be integrated with the existing
operations of a computer center. DB2 UDB supports Legato Networker and IBM TDP
R/3 as hierarchical storage management systems for DB2 backup/restore. These
systems improve backup throughput and allow easy management of large number of
tapes. R/3 on DB2 UDB currently supports IBM TDP R/3 as a hierarchical storage
system for log file archiving in brarchive.
SAP delivers several executables to manage database backups and log files archived
offline:
1) db2adutl (Administration utility for TSM, described in SAP R/3 note: 67789)
- Query, selects backup images in TSM
- Extract, extracts backups from TSM to disk
- Delete, marks database backup in TSM as inactive, deletion will take place
according to retention policy
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 54
2) BRARCHIVE (Stores offline retained log files into TSM). BRARCHIVE backs up
the database log files from the offline log directory. BRARCHIVE can be invoked
through the command line, but is usually called from the R/3 DB2 Control Center
Extensions DB2Admin.
3) BRRESTORE (Restores offline archived log files from TSM. BRRESTORE is
used to retrieve log files into a log-retrieval directory (log_retrieve) during
recovery. It is typically used from the DB2 Control Center Extensions.
4) BRDB6QRY (Query for offline archived log files in TSM)
5) BRDBSDSD (Deletes offline archived log files from TSM)
As shown in the following Figure 16, the SAP delivered executables perform backup of
DB2 UDB Control Files, DB2 UDB Containers, Offline Log Files, On Line Log Files and
SAP objects.
Figure 16: SAP R/3 Backup Objects
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 55
E) Work Process Management in SAP R/3 with DB2 UDB and
Storage An SAP R/3 work process enables a user to connect to the UDB database. It is only
active for one user during a dialog step i.e. between screen input and screen output.
Different users use the same work process and database connection consecutively
because the work process issues a commit to the database before screen output.
Figure 17: SAP R/3 Work Process Management with DB2 UDB
From the storage subsystems perspective, it is important to note how the SAP R/3
processes, as shown in Figure 17 above, handle database SQL statements and logging:
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 56
1. A SQL statement is sent from the R/3 work process to its associated agent.
2. The DB2 Optimizer function in the DB2 agent transforms the SQL statement into an
UDB access plan, which is usually cached as part of a UDB function called dynamic
statement cache.
3. DB2 parallel I/O servers obtain information about the required buffer pool pages from
the data containers with the help of db2agents that execute in parallel, performing
the tasks defined in the DB2 access plan. The db2 agents also obtain
UPDATE/INSERT SQL statements from SAP R/3 application.
4. Data changes that occur in the buffer pool and not transferred to disk are written as
log records in the log buffer. The log file header (LFH) is moved forward to the first
log record, which contains data about pages that are changed and not on disk.
Changed pages from buffer pool(s) are copied asynchronously to containers. If the
LFH moves out of a log file, this file is not required for a crash recovery. If the file
sqlogctl.lfh containing the log file header is not readable, crash recovery is not
possible and the database must be restored.
5. A COMMIT / ROLLBACK SQL statement at the completion of the transaction causes
the contents of the log buffer to be synchronously written to the current log file in
log_dir directory.
6. Asynchronous I/O cleaners or page cleaners destage updated pages onto disk when
triggered by the following:
The maximum amount of log space that should be read during crash
recovery has been reached
The maximum percentage of changed pages has been reached
There are no pages available during insert/update
Storage Management for SAP and DB2 UDB: Split Mirror Backup / Recovery with IBMs Enterprise Storage Server (ESS)
Page 57
F) SAP R/3 data layout
In preparation for SAP R/3 UDB data layout of the ESS, the following sequence of
activities should be considered:
- Find Number and types of SAP systems to be installed in the system landscape
that typically includes Development, Quality Assurance and Production system
instances.
- Start with the Planning of an SAP landscape topology, estimating hardware
requirements for storage, network and server machines. This creates a
hardware requirements list: ESS boxes, number of SCSI/FC connections, server
machines, network connectivity etc.
- Sizing (done by HW partners): identify requirements for CPU
- Storage space, I/O Access
- Network bandwidth
- System infrastructure/landscape (inter system communication)
- SAP installation guide
- Assignment of SAP systems to hosts
- Estimations for storage needs of individual database systems / tablespaces
- Collect space requirements from SAP applications, their databases
(tablespaces, logging), 3rd party software (bolt-ons)
- Consider ESS / Host / DB/ Application restrictions
- Map requirements to AIX logical volumes
- Group the logical volumes to volume groups
- Design ESS configuration and perform configuration steps with ESS Specialist - Create AIX volume groups and logical volumes
Customer exper