International Technical Support Organization
System/390 MVS Parallel Sysplex
Continuous Availability SE Guide
December 1995
SG24-4503-00
Take Note!
Before using this information and the product it supports, be sure to read the general information under
Special Notices on page xvii.
First Edition (December 1995)
This edition applies to Version 5 Release 2 of MVS/ESA System Product (5655-068 or 5655-069).
Order publications through your IBM representative or the IBM branch office serving your locality. Publications
are not stocked at the address given below.
An ITSO Technical Bulletin Evaluation Form for reader's feedback appears facing Chapter 1. If the form has been
removed, comments may be addressed to:
IBM Corporation, International Technical Support Organization
Dept. HYJF Mail Station P099
522 South Road
Poughkeepsie, New York 12601-5400
When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any
way it believes appropriate without incurring any obligation to you.
Copyright International Business Machines Corporation 1995. All rights reserved.
Note to U.S. Government Users -- Documentation related to restricted rights -- Use, duplication or disclosure is
subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
Abstract
This document discusses how the parallel sysplex can help an installation get
closer to a goal of continuous availability.
It is intended for customer systems and operations personnel responsible for
implementing parallel sysplex, and the IBM Systems Engineers who assist them.
It will also be useful to technical managers who want to assess the benefits they
can expect from parallel sysplex in this area.
The book describes how to configure both the hardware and software in order to
eliminate planned outages and minimize the impact of unplanned outages.
It describes how you can make hardware and software changes to the sysplex
without disrupting the running of the applications.
It also discusses how to handle unplanned hardware or software failures, and to
recover from error situations with minimal impact to the applications.
A knowledge of parallel sysplex is assumed.
(296 pages)
Contents
Abstract . . . iii
Special Notices . . . xvii
Preface . . . xix
How This Document Is Organized . . . xix
Related Publications . . . xx
International Technical Support Organization Publications . . . xxi
ITSO Redbooks on the World Wide Web (WWW) . . . xxii
Acknowledgments . . . xxiii
Part 1. Configuring for Continuous Availability . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 1. Hardware Configuration . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 What Is Continuous Availability? . . . 3
1.1.1 Parallel Sysplex and Continuous Availability . . . 3
1.1.2 Why N+1? . . . 4
1.2 Processors . . . 6
1.3 Coupling Facilities . . . 7
1.3.1 Separate Machines . . . 7
1.3.2 How Many? . . . 7
1.3.3 CF Links . . . 7
1.3.4 Coupling Facility Structures . . . 7
1.3.5 Coupling Facility Volatility/Nonvolatility . . . 8
1.4 Sysplex Timers . . . 10
1.4.1 Duplicating . . . 10
1.4.2 Distance . . . 11
1.4.3 Setting the Time in MVS . . . 11
1.4.4 Protection . . . 11
1.5 I/O Configuration . . . 11
1.5.1 ESCON Logical Paths . . . 12
1.6 CTCs . . . 13
1.6.1 3088 and ESCON CTC . . . 13
1.6.2 Alternate CTC Configuration . . . 14
1.6.3 Sharing CTC Paths . . . 14
1.6.4 IOCP Coding . . . 14
1.6.5 3088 Maintenance . . . 14
1.7 XCF Signalling Paths . . . 14
1.8 Data Placement . . . 15
1.9 DASD Configuration . . . 16
1.9.1 RAMAC and RAMAC 2 Array Subsystems . . . 17
1.9.2 3990 Model 6 . . . 17
1.9.3 3990 Model 3 . . . 17
1.9.4 DASD Path Recommendations . . . 17
1.9.5 3990 Model 6 ESCON Logical Path Report . . . 18
1.10 ESCON Directors . . . 18
1.10.1 ESCON Manager . . . 19
1.10.2 ESCON Director Switch Matrix . . . 19
1.11 Fiber . . . 20
1.11.1 9729 . . . 20
1.12 Consoles . . . 21
1.12.1 Hardware Management Console (HMC) . . . . . . . . . . . . . . . . . 21
1.12.2 How Many HMCs? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.12.3 Using HMC As an MVS Console . . . . . . . . . . . . . . . . . . . . . . 21
1.12.4 MVS Consoles . . . 21
1.12.5 Master Console Considerations . . . . . . . . . . . . . . . . . . . . . . 22
1.12.6 Console Configuration Considerations . . . . . . . . . . . . . . . . . . 23
1.13 Tape . . . 25
1.13.1 3490 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.14 Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.14.1 VTAM CTCs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.14.2 3745s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.14.3 CF Structure . . . 26
1.15 Environmental . . . 26
1.15.1 Uninterruptible Power Supply (UPS) . . . . . . . . . . . . . . . . . . . 26
1.15.2 9672/9674 Protection against Power Disturbances . . . . . . . . . . . 27
Chapter 2. System Software Configuration . . . . . . . . . . . . . . . . . . . . . 29
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 N, N+1 in a Software Environment . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Shared SYSRES . . . 29
2.3.1 Shared SYSRES Design . . . 30
2.3.2 Indirect Catalog Function . . . 30
2.4 Master Catalog . . . 32
2.5 Dynamic I/O Reconfiguration . . . 33
2.5.1 Exceptions . . . 34
2.6 I/O Definition File . . . 35
2.7 Couple Data Sets . . . 35
2.8 JES2 Checkpoint . . . 38
2.8.1 JES2 Checkpoint Reconfiguration . . . 39
2.9 RACF Database . . . 40
2.10 PARMLIB Considerations . . . 40
2.10.1 Developing Naming Conventions . . . 40
2.10.2 MVS/ESA SP V5.2 Enhancements . . . 41
2.10.3 MVS Consoles . . . 43
2.11 System Logger . . . 46
2.11.1 Logstream and Structure Allocation . . . 46
2.11.2 DASD Log Data Sets . . . 46
2.11.3 Duplexing Coupling Facility Log Data . . . 47
2.11.4 DASD Staging Data Sets . . . 49
2.12 System Managed Storage Considerations . . . 50
2.12.1 SMSplex . . . 50
2.12.2 DFSMShsm Considerations . . . 52
2.12.3 Continuous Availability Considerations . . . 52
2.12.4 RESERVE Activity . . . 53
2.13 Shared Tape Support . . . 54
2.13.1 Planning . . . 54
2.13.2 Implementing Automatic Tape Switching . . . 54
2.14 Exploiting Dynamic Functions . . . 55
2.14.1 Dynamic Exits . . . 55
2.14.2 Dynamic Subsystem Interface (SSI) . . . 56
2.14.3 Dynamic Reconfiguration of XES . . . 57
2.15 Automating Sysplex Failure Management . . . 57
2.15.1 Planning for SFM . . . 58
2.15.2 The SFM Isolate Function . . . 59
2.15.3 SFM Parameters . . . 63
2.15.4 SFM Activation . . . 69
2.15.5 Stopping SFM . . . 72
2.15.6 SFM Utilization . . . 72
2.16 Planning the Time Detection Intervals . . . . . . . . . . . . . . . . . . . . . 73
2.16.2 Synchronous WTO(R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.17 ARM: MVS Automatic Restart Manager . . . . . . . . . . . . . . . . . . . . 79
2.17.1 ARM Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.17.2 ARM Processing Requirements . . . . . . . . . . . . . . . . . . . . . . 80
2.17.3 Program Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.17.4 ARM and Subsystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.18 JES3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
2.18.1 Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
2.18.2 JES3 Sysplex Considerations . . . . . . . . . . . . . . . . . . . . . . . 89
2.18.3 JES3 Parallel Sysplex Requirements . . . . . . . . . . . . . . . . . . . 90
2.18.4 JES3 Configurations . . . 91
2.18.5 Additional JES3 Planning Information . . . . . . . . . . . . . . . . . . 93
Chapter 3. Subsystem Software Configuration . . . . . . . . . . . . . . . . . . . 95
3.1 CICS V4 Transaction Subsystem . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.1.1 CICS Topology . . . 96
3.1.2 CICS Affinities . . . 97
3.1.3 File-Owning Regions . . . 97
3.1.4 Resource Definition Online (RDO) . . . 97
3.1.5 CSD Considerations . . . 97
3.1.6 Subsystem Storage Protection . . . 98
3.1.7 Transaction Isolation . . . 98
3.2 CICSPlex SM V1 . . . 98
3.2.1 CICSPlex SM Configuration . . . 99
3.3 IMS Transaction Subsystem . . . 100
3.3.1 IMS Topology . . . 100
3.3.2 IMS RESLIB . . . 101
3.3.3 IMSIDs . . . 101
3.3.4 Terminal Definitions . . . 101
3.3.5 Data Set Sharing . . . 102
3.3.6 IRLM Definitions . . . 102
3.3.7 Coupling Facility Structures . . . 102
3.3.8 Dynamic Update of IMS Type 2 SVC . . . 102
3.3.9 Cloning Inhibitors . . . 103
3.4 DB2 Subsystem . . . 103
3.4.1 DB2 Environment . . . 103
3.4.2 DB2 Structures . . . 104
3.4.3 Changing Structure Sizes . . . 105
3.4.4 DB2 Data Availability . . . 105
3.4.5 IEFSSNXX Considerations . . . 105
3.4.6 DB2 Subsystem Parameters . . . 105
3.5 VSAM RLS . . . 106
3.5.1 Control Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
3.5.2 Defining the Database . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
3.5.3 Defining the SMSVSAM Structures . . . . . . . . . . . . . . . . . . . . 108
3.5.4 CICS Use of System Logger . . . . . . . . . . . . . . . . . . . . . . . . 109
3.6 TSO in a Parallel Sysplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
3.7 System Automation Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.7.1 NetView . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.7.2 AOC/MVS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.7.3 OPC/ESA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
3.8 VTAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
3.8.1 Configuration . . . 112
Part 2. Making Planned Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Chapter 4. Systems Management in a Parallel Sysplex . . . . . . . . . . . . . 115
4.1 The Importance of Systems Management in Parallel Sysplex . . . 115
4.1.1 Change Management . . . 115
4.1.2 Problem Management . . . 115
4.1.3 Operations Management . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.1.4 The Other System Management Disciplines . . . . . . . . . . . . . . 116
4.1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Chapter 5. Coupling Facility Changes . . . . . . . . . . . . . . . . . . . . . . . 117
5.1 Structure Attributes and Allocation . . . 117
5.2 Structure and Connection Disposition . . . 118
5.2.1 Structure Disposition . . . 118
5.2.2 Connection State and Disposition . . . 119
5.3 Structure Dependence on Dumps . . . 120
5.4 To Move a Structure . . . 120
5.4.1 The Structure Rebuild Process . . . 121
5.5 Altering the Size of a Structure . . . 123
5.6 Changing the Active CFRM Policy . . . 125
5.7 Reformatting the CFRM Couple Data Set . . . 126
5.8 Adding a Coupling Facility . . . 127
5.8.1 To Define the Coupling Facility LPAR and Connections . . . 127
5.8.2 To Prepare the New CFRM Policy . . . 127
5.8.3 Setting Up the Structure Exploiters . . . 128
5.9 Servicing the Coupling Facility . . . 132
5.9.1 Concurrent Hardware Upgrades . . . 132
5.9.2 Concurrent LIC Upgrades . . . 133
5.10 Removing a Coupling Facility . . . 133
5.11 Coupling Facility Shutdown Procedure . . . 134
5.11.1 Coupling Facility Exploiter Considerations . . . 138
5.11.2 Shutting Down the Only Coupling Facility . . . 141
5.12 Putting a Coupling Facility Back Online . . . 142
Chapter 6. Hardware Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.1 Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.1.1 Adding a Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.1.2 Removing a Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.1.3 Changing a Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
6.2 Logical Partitions (LPARs) . . . 144
6.2.1 Adding an LPAR . . . 145
6.2.2 Removing an LPAR . . . 145
6.2.3 Changing an LPAR . . . 145
6.3 I/O Devices . . . 145
6.4 ESCON Directors . . . 146
6.5 Changing the Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.5.1 Using the Sysplex Timer . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.5.2 Time Changes and IMS . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.5.3 Time Changes and SMF . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.5.4 Changing Time in the 9672 HMC and SE . . . . . . . . . . . . . . . . 148
Chapter 7. Software Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.1 Adding a New MVS Image . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.1.1 Adding a New JES3 Main . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.2 Adding a New SYSRES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.2.1 Example JCL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.3 Implementing System Software Changes . . . . . . . . . . . . . . . . . . 154
7.4 Adding Subsystems . . . 155
7.4.1 CICS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
7.4.2 IMS Subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
7.4.3 DB2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.4.4 TSO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
7.5 Starting the Subsystems . . . 159
7.5.1 CICS . . . 159
7.5.2 DB2 . . . 160
7.5.3 IMS . . . 160
7.6 Changing Subsystems . . . 160
7.7 Moving the Workload . . . 161
7.7.1 CICS . . . 161
7.7.2 IMS . . . 163
7.7.3 DB2 . . . 163
7.7.4 TSO . . . 164
7.7.5 Batch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.7.6 DFSMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
7.8 Closing Down the Subsystems . . . . . . . . . . . . . . . . . . . . . . . . . 165
7.8.1 CICS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.8.2 IMS . . . 166
7.8.3 DB2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.8.4 System Automation Shutdown . . . . . . . . . . . . . . . . . . . . . . 169
7.9 Removing an MVS Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Chapter 8. Database Availability . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.1 VSAM . . . 171
8.1.1 Batch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.1.2 Backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.1.3 Reorg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.2 IMS/DB . . . 172
8.2.1 Batch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.2.2 Backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.2.3 Reorg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.3 DB2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.3.1 Batch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.3.2 Backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8.3.3 Reorg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Part 3. Handling Unplanned Outages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Chapter 9. Parallel Sysplex Recovery . . . . . . . . . . . . . . . . . . . . . . . 179
9.1 System Recovery . . . 179
9.1.1 Sysplex Failure Management (SFM) . . . . . . . . . . . . . . . . . . . 179
9.1.2 Automatic Restart Management (ARM) . . . . . . . . . . . . . . . . . 179
9.1.3 What Needs to Be Done? . . . . . . . . . . . . . . . . . . . . . . . . . 179
9.2 Coupling Facility Failure Recovery . . . 180
9.3 Assessment of the Failure Condition . . . . . . . . . . . . . . . . . . . . . 185
9.3.1 To Recognize a Structure Failure . . . . . . . . . . . . . . . . . . . . 185
9.3.2 To Recognize a Connectivity Failure . . . 186
9.3.3 To Recognize When a Coupling Facility Becomes Volatile . . . 186
9.3.4 Recovery from a Connectivity Failure . . . 187
9.3.5 Recovery from a Structure Failure . . . 188
9.4 DB2 V4 Recovery from a Coupling Facility Failure . . . 189
9.4.1 DB2 V4 Built-In Recovery from Connectivity Failure . . . 189
9.4.2 DB2 V4 Built-In Recovery from a Structure Failure . . . 190
9.4.3 Coupling Facility Becoming Volatile . . . 190
9.4.4 Manual Structure Rebuild . . . 190
9.4.5 To Manually Deallocate and Reallocate a Group Buffer Pool . . . 190
9.4.6 To Manually Deallocate a DB2 Lock Structure . . . 191
9.4.7 To Manually Deallocate a DB2 SCA Structure . . . 192
9.5 XCF Recovery from a Coupling Facility Failure . . . 192
9.5.1 XCF Built-In Recovery from Connectivity or Structure Failure . . . 192
9.5.2 Coupling Facility Becoming Volatile . . . 193
9.5.3 Manual Invocation of Structure Rebuild . . . 193
9.5.4 Manual Deallocation of the XCF Signalling Structures . . . 193
9.5.5 Partitioning the Sysplex . . . 193
9.6 RACF Recovery from a Coupling Facility Failure . . . 194
9.6.1 RACF Built-In Recovery from Connectivity or Structure Failure . . . 194
9.6.2 Coupling Facility Becoming Volatile . . . 195
9.6.3 Manual Invocation of Structure Rebuild . . . 195
9.6.4 Manual Deallocation of RACF Structures . . . 196
9.7 VTAM Recovery from a Coupling Facility Failure . . . 196
9.7.1 VTAM Built-In Recovery from Connectivity Failure . . . 196
9.7.2 VTAM Built-In Recovery from a Structure Failure . . . 196
9.7.3 The Coupling Facility Becomes Volatile . . . 196
9.7.4 Manual Invocation of Structure Rebuild . . . 196
9.7.5 Manual Deallocation of the VTAM GRN Structure . . . 197
9.8 IMS/DB Recovery from a Coupling Facility Failure . . . 197
9.8.1 IMS/DB Built-In Recovery from a Connectivity Failure . . . 197
9.8.2 IMS/DB Built-In Recovery from a Structure Failure . . . 198
9.8.3 Coupling Facility Becoming Volatile . . . 198
9.8.4 Manual Invocation of Structure Rebuild . . . 198
9.8.5 Manual Deallocation of an IRLM Lock Structure . . . 199
9.8.6 Manual Deallocation of an OSAM/VSAM Cache Structure . . . 199
9.9 JES2 Recovery from a Coupling Facility Failure . . . 199
9.9.1 Connectivity Failure to a Checkpoint Structure . . . 199
9.9.2 Structure Failure in a Checkpoint Structure . . . 202
9.9.3 The Coupling Facility Becomes Volatile . . . 203
9.9.4 To Manually Move a JES2 Checkpoint . . . 203
9.10 System Logger Recovery from a Coupling Facility Failure . . . 203
9.10.1 System Logger Built-In Recovery from a Connectivity Failure . . . 203
9.10.2 System Logger Built-In Recovery from a Structure Failure . . . 203
9.10.3 Coupling Facility Becoming Volatile . . . 203
9.10.4 Manual Invocation of Structure Rebuild . . . 204
9.10.5 Manual Deallocation of Logstreams Structure . . . 204
9.11 Automatic Tape Switching Recovery from a Coupling Facility Failure . . . 204
9.11.1 Automatic Tape Switching Recovery from a Connectivity Failure . . . 204
9.11.2 Automatic Tape Switching Built-In Recovery from a Structure Failure . . . 204
9.11.3 Coupling Facility Becoming Volatile . . . 204
9.11.4 Manual Invocation of Structure Rebuild . . . 204
9.11.5 Consequences of Failing to Rebuild the IEFAUTOS Structure . . . 205
9.11.6 Manual Deallocation of IEFAUTOS Structure . . . 205
9.12 VSAM RLS Recovery from a Coupling Facility Failure . . . 205
9.12.1 SMSVSAM Built-In Recovery from a Connectivity Failure . . . 205
9.12.2 SMSVSAM Built-In Recovery from a Structure Failure . . . 205
9.12.3 Coupling Facility Becoming Volatile . . . 206
9.12.4 Manual Invocation of Structure Rebuild . . . 206
9.12.5 Manual Deallocation of SMSVSAM Structures . . . 206
9.13 Couple Data Set Failure . . . 206
9.13.1 Sysplex (XCF) Couple Data Set Failure . . . 206
9.13.2 Coupling Facility Resource Manager (CFRM) Couple Data Set Failure . . . 207
9.13.3 Sysplex Failure Management (SFM) Couple Data Set Failure . . . 207
9.13.4 Workload Manager (WLM) Couple Data Set Failure . . . 207
9.13.5 Automatic Restart Manager (ARM) Couple Data Set Failure . . . 207
9.13.6 System Logger (LOGR) Couple Data Set Failure . . . 208
9.14 Sysplex Timer Failures . . . 209
9.15 Restarting IMS . . . 210
9.15.1 IMS/IRLM Failures Within a System . . . 210
9.15.2 CEC or MVS Failure . . . 210
9.15.3 Automating Recovery . . . 211
9.16 Restarting DB2 . . . 211
9.17 Restarting CICS . . . 211
9.17.1 CICS TOR Failure . . . 211
9.17.2 CICS AOR Failure . . . 212
9.18 Recovering Logs . . . 212
9.18.1 Recovering an Application Failure . . . 212
9.18.2 Recovering an MVS Failure . . . 213
9.18.3 Recovering from a Sysplex Failure . . . 213
9.18.4 Recovering from System Logger Address Space Failure . . . 213
9.18.5 Recovering OPERLOG Failure . . . 213
9.19 Restarting an OPC/ESA Controller . . . 213
9.20 Recovering Batch Jobs under OPC/ESA Control . . . 214
9.20.1 Status of Jobs on Failing CPU . . . 214
9.20.2 Recovery of Jobs on a Failing CPU . . . 214
Chapter 10. Disaster Recovery Considerations . . . . . . . . . . . . . . . . . 215
10.1 Disasters and Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
10.2 Disaster Recovery Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
10.2.1 3990 Remote Copy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
10.2.2 IMS Remote Site Recovery . . . . . . . . . . . . . . . . . . . . . . . . 216
10.2.3 CICS Recovery with CICSPlex SM . . . . . . . . . . . . . . . . . . . 217
10.2.4 DB2 Disaster Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Appendix A. Sample Parallel Sysplex MVS Image Members . . . . . . . . . 221
A.1 Example Parallel Sysplex Configuration . . . . . . . . . . . . . . . . . . . 221
A.2 IPLPARM Members . . . 222
A.2.1 LOADAA . . . 222
A.3 PARMLIB Members . . . 222
A.3.1 IEASYMAA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
A.3.2 IEASYS00 and IEASYSAA . . . . . . . . . . . . . . . . . . . . . . . . . 224
A.3.3 COUPLE00 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
A.3.4 JES2 Startup Procedure in SYS1.PROCLIB . . . . . . . . . . . . . . . 227
A.3.5 J2G . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
A.3.6 J2L42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
A.4 VTAMLST Members . . . 232
A.4.1 ATCSTR42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
A.4.2 ATCCON42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
A.4.3 APCIC42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
A.4.4 APNJE42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
A.4.5 CDRM42 . . . 236
A.4.6 MPC03 . . . 236
A.4.7 TRL03 . . . 236
A.4.8 APAPPCAA . . . 237
A.5 Allocating Data Sets . . . 238
A.5.1 ALLOCJCL . . . 238
Appendix B. Structures, How to ... . . . . . . . . . . . . . . . . . . . . . . . . 241
B.1 To Gather Information on a Coupling Facility . . . 241
B.2 To Gather Information on Structure and Connections . . . 243
B.3 To Deallocate a Structure with a Disposition of DELETE . . . 245
B.4 To Deallocate a Structure with a Disposition of KEEP . . . 245
B.5 To Suppress a Connection in Active State . . . 245
B.6 To Suppress a Connection in Failed-persistent State . . . 246
B.7 To Monitor a Structure Rebuild . . . 246
B.8 To Stop a Structure Rebuild . . . 248
B.9 To Recover from a Hang in Structure Rebuild . . . . . . . . . . . . . . . 248
Appendix C. Examples of CFRM Policy Transitioning . . . . . . . . . . . . . . 249
C.1 Changing the Structure Definition . . . 249
C.2 Changing the Coupling Facility Definition . . . 255
Appendix D. Examples of Sysplex Partitioning . . . . . . . . . . . . . . . . . . 259
D.1 Partitioning on Operator Request . . . 259
D.2 System in Missing Status Update Condition . . . . . . . . . . . . . . . . . 260
Appendix E. Spin Loop Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Appendix F. Dynamic I/O Reconfiguration Procedures . . . . . . . . . . . . . 267
F.1 Procedure to Make the System Dynamic I/O Capable . . . . . . . . . . . 267
F.2 Procedure for Dynamic Changes . . . . . . . . . . . . . . . . . . . . . . . 270
F.3 Hardware System Area Considerations . . . 271
F.4 Hardware System Area Expansion Factors . . . . . . . . . . . . . . . . . 272
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Figures
1. Sample Parallel Sysplex Continuous Availability Configuration . . . 5
2. ESCON Logical Paths Configuration . . . 13
3. CTC Configuration . . . 15
4. Recommended XCF Signalling Path Configuration . . . 16
5. Recommended DASD Path Configuration . . . 19
6. ICKDSF R16 ESCON Logical Path Report . . . 20
7. Console Environment . . . 23
8. Recommended Console Configuration . . . 25
9. 9910 Local UPS and 9672 Rx2 and Rx3 . . . 28
10. Indirect Catalog Function with SYSRESA . . . 31
11. Indirect Catalog Function with SYSRESB . . . 32
12. Alternate Consoles . . . 44
13. Example of Failure Dependent Connection . . . 48
14. Example of Failure Dependent/Independence Connections . . . 49
15. Basic Relationship between Sysplex Name and System Group . . . 51
16. SMSplex Consisting of System Group and Individual System Name . . . 51
17. Isolating a Failing MVS . . . 59
18. INTERVAL and ISOLATETIME Relationship . . . 61
19. SFM Policy with the ISOLATETIME Parameter . . . 62
20. SFM LPARs Actions Timings . . . 67
21. Sample JCL to Delete a SFM Policy . . . 72
22. Figure to Show Timing Relationships . . . 74
23. JES3 *I S Display Showing Non-Existent Systems . . . 88
24. JES3-Managed and Auto-Switchable Tape . . . 90
25. NJE Node Definitions Portion of JES3 Init Stream . . . 91
26. Sample JES3 Proc for Use by Multiple Globals . . . 92
27. Cloned CICSplex . . . 96
28. CICSPlex SM . . . 99
29. Sample IMS 5.1 Configuration . . . 100
30. Sample DB2 Data Sharing Configuration . . . 104
31. Sample VSAM RLS Data Sharing Configuration . . . 107
32. START Command When Adding a New JES3 Global . . . 151
33. Volume Initialization . . . 152
34. Copy SYSRESA . . . 152
35. SMP/E ZONEEDIT . . . 153
36. Add IPL Text . . . 153
37. Example parallel sysplex Environment . . . 154
38. Introducing a New Software Level into the parallel sysplex . . . 155
39. Redistributing Workload on TORs . . . 162
40. Redistributing Workload on AORs . . . 163
41. DB2 Data Sharing Availability . . . 168
42. Sample Checkpoint Definition . . . 200
43. 3990-6 Peer-to-Peer Remote Copy Configuration . . . 217
44. 3990-6 Extended Remote Copy Configuration . . . 218
45. IMS Remote Site Recovery Configuration . . . 219
46. DB2 Data Sharing Disaster Recovery Configuration . . . 220
47. Example Parallel Sysplex Configuration . . . 221
48. LOADAA Member . . . 222
49. IEASYMAA . . . 223
50. IEASYS00 . . . 224
51. IEASYSAA . . . 225
52. COUPLE00 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
53. JES2 Member in SYS1.PROCLIB . . . . . . . . . . . . . . . . . . . . . . . 227
54. J2G . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
55. J2L42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
56. ATCSTR42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
57. ATCCON42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
58. APCIC42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
59. APNJE42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
60. CDRM42 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
61. MPC03 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
62. TRL03 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
63. APAPPCAA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
64. Allocating System Specific Data Sets . . . 238
65. Coupling Facility Display . . . 241
66. Structures and Connections Display . . . 243
67. Monitoring Structure Rebuild through Exploiters' Messages . . . 246
68. Monitoring Structure Rebuild by Displaying Structure Status . . . 247
69. CFRM Policy Sample . . . 250
70. JCL to Install a New CFRM Policy . . . 252
71. Original CFRM Policy . . . 256
72. New CFRM Policy . . . 256
73. VARY OFF a System without SFM Policy Active . . . 259
74. VARY OFF a System with an SFM Policy Active . . . 260
75. System in Missing Status Update Condition and No Active SFM Policy . . . 260
76. System in Missing Status Update with an Active SFM Policy and CONNFAIL(YES) . . . 261
77. Resolution of a Spin Loop Condition . . . 264
78. HCD Panel . . . 268
79. CONFIG Frame Fragment . . . 268
80. HCD Panel . . . 269
81. Dynamic I/O Customization . . . 270
Tables
1. Couple Data Set Placement Recommendations . . . 37
2. JES2 Checkpoint Placement Recommendations . . . 39
3. References Containing Information on the Use of System Symbols . . . 42
4. Summary of SFM Keywords and Parameters . . . 63
5. IMS Data Sets in Sysplex . . . 102
6. Automation Recommendations . . . 116
7. Support of REBUILD by IBM Exploiters . . . 123
8. Support of ALTER by IBM Exploiters . . . 124
9. DB2 Changes . . . 158
10. Subsystem Recovery Summary - Part 1 . . . 182
11. Subsystem Recovery Summary - Part 2 . . . 184
12. Summary of Couple Data Sets . . . 209
Special Notices
This publication is intended to help customer systems and operations personnel
and IBM systems engineers to plan, implement and use a parallel sysplex in
order to get closer to a goal of continuous availability. It is not intended to be a
guide to implementing or using parallel sysplex as such. It only covers topics
related to continuous availability.
The information in this publication is not intended as the specification of any
programming interfaces that are provided by MVS Version 5 or any other product
mentioned in this redbook. See the PUBLICATIONS section of the IBM
Programming Announcement for MVS Version 5, or other products, for more
information about what publications are considered to be product documentation.
References in this publication to IBM products, programs or services do not
imply that IBM intends to make these available in all countries in which IBM
operates. Any reference to an IBM product, program, or service is not intended
to state or imply that only IBM's product, program, or service may be used. Any
functionally equivalent program that does not infringe any of IBM's intellectual
property rights may be used instead of the IBM product, program or service.
Information in this book was developed in conjunction with use of the equipment
specified, and is limited in application to those specific hardware and software
products and levels.
IBM may have patents or pending patent applications covering subject matter in
this document. The furnishing of this document does not give you any license to
these patents. You can send license inquiries, in writing, to the IBM Director of
Licensing, IBM Corporation, 500 Columbus Avenue, Thornwood, NY 10594 USA.
The information contained in this document has not been submitted to any
formal IBM test and is distributed AS IS. The information about non-IBM
(VENDOR) products in this manual has been supplied by the vendor and IBM
assumes no responsibility for its accuracy or completeness. The use of this
information or the implementation of any of these techniques is a customer
responsibility and depends on the customer's ability to evaluate and integrate
them into the customer's operational environment. While each item may have
been reviewed by IBM for accuracy in a specific situation, there is no guarantee
that the same or similar results will be obtained elsewhere. Customers
attempting to adapt these techniques to their own environments do so at their
own risk.
Reference to PTF numbers that have not been released through the normal
distribution process does not imply general availability. The purpose of
including these reference numbers is to alert IBM customers to specific
information relative to the implementation of the PTF when it becomes available
to each customer according to the normal IBM PTF distribution process.
The following terms are trademarks of the International Business Machines
Corporation in the United States and/or other countries:
ACF/VTAM, Advanced Peer-to-Peer Networking, AIX, APPN, CICS, CICS/ESA,
CICS/MVS, CUA, DATABASE 2, DB2, DFSMS, DFSMS/MVS, DFSMSdfp, DFSMSdss,
DFSMShsm, DFSORT, Enterprise Systems Connection Architecture, ES/3090,
ES/9000, ESA/370, ESA/390, ESCON XDF, ESCON, GDDM, Hardware Configuration
Definition, IBM, IMS, IMS/ESA, IPDS, LPDA, Magstar, MVS/DFP, MVS/ESA,
MVS/SP, MVS/XA, NetView, PR/SM, Processor Resource/Systems Manager, PS/2,
RACF, RAMAC, RETAIN, RMF, S/370, S/390, SAA, SQL/DS, Sysplex Timer,
System/360, System/370, System/390, Systems Application Architecture,
SystemView, Virtual Machine/Enterprise Systems Architecture, Virtual
Machine/Extended Architecture, VM/ESA, VM/XA, VSE/ESA, VTAM
The following terms are trademarks of other companies:
C-bus is a trademark of Corollary, Inc.
PC Direct is a trademark of Ziff Communications Company and is
used by IBM Corporation under license.
UNIX is a registered trademark in the United States and other
countries licensed exclusively through X/Open Company Limited.
Windows is a trademark of Microsoft Corporation.
Other trademarks are trademarks of their respective companies.
Preface
This document discusses how the parallel sysplex can help an installation get
closer to a goal of Continuous Availability.
This document is intended for customer systems and operations personnel
responsible for implementing parallel sysplex, and the IBM Systems Engineers
who assist them. It will also be useful to technical managers who want to
assess the benefits they can expect from parallel sysplex in this area.
How This Document Is Organized
The document is in 3 parts:
Part 1, Configuring for Continuous Availability
This part describes how to configure both the hardware and software in
order to eliminate planned outages and minimize the impact of unplanned
outages.
Chapter 1, Hardware Configuration
This chapter discusses how to design a hardware configuration for
continuous availability.
Chapter 2, System Software Configuration
This chapter describes how to configure the system to support
continuous availability and minimize the effort needed to maintain and
run it.
Chapter 3, Subsystem Software Configuration
This chapter deals with configuring the various subsystems to provide an
environment that will support the goal of continuous availability.
Part 2, Making Planned Changes
This part describes how you can make changes to the sysplex without
disrupting the running of the applications.
Chapter 4, Systems Management in a Parallel Sysplex
This chapter discusses the importance of maintaining good systems
management disciplines in a parallel sysplex environment.
Chapter 5, Coupling Facility Changes
This chapter deals with changes that can be made to the coupling
environment, for installation, planned or unplanned maintenance.
Chapter 6, Hardware Changes
This chapter discusses how to add, change or remove hardware
elements of the sysplex in a non-disruptive way.
Chapter 7, Software Changes
This chapter discusses how to make changes such as adding, modifying
or removing system images and subsystems.
Chapter 8, Database Availability
This chapter discusses subsystem (CICS, IMS, DB2) configuration options
to minimize the impact of making database changes.
Part 3, Handl ing Unplanned Outages
This part describes how to handle unplanned outages and recover from error
situations with minimal impact to the applications.
Chapter 9, Parallel Sysplex Recovery
This chapter discusses how to recover from unplanned hardware and
software failures.
Chapter 10, Disaster Recovery Considerations
This chapter contains a discussion of disaster recovery considerations
specific to the parallel sysplex environment.
Related Publications
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this document.
The publications listed are sorted in alphabetical order.
CICS/ESA Release Guide, GC33-0655
CICS VSAM Recovery Guide, SH19-6709
CICS/ESA Dynamic Transaction Routing in a CICSPlex, SC33-1012
CICS/ESA Version 4 Intercommunication Guide, SC33-1181
CICS/ESA Version 4 Recovery and Restart Guide, SC33-1182
CICS/ESA Version 4 CICS-IMS Database Control Guide, SC33-1184
Concurrent Copy Overview, GG24-3936
DB2 Version 4 Data Sharing: Planning and Administration, SC26-3269
DB2 Version 4 Release Guide, SC26-3394
DCAF V1.2.1 Installation and Using Guide, SH19-6838
DFSMS/MVS V1 R3 DFSMSdfp Storage Administration Reference, SC26-4920
ES/9000 and ES/3090 PR/SM Planning Guide, GA22-7123
ES/9000 9021 711-based Models Functional Characteristics, GA22-7144
ES/9000 9121 511-based Models Functional Characteristics, GA24-4358
Hardware Management Console Application Programming Interfaces,
SC28-8141
Hardware Management Console Guide, GC38-0453
IBM CICS Transaction Affinities Utility User's Guide, SC33-1159
IBM CICSPlex Systems Manager for MVS/ESA Concepts and Planning,
GC33-0786
IBM Token-Ring Network Introduction and Planning Guide, GA27-3677
IBM 3990 Storage Control Reference for Model 6, GA32-0274
IBM 9037 Sysplex Timer and System/390 Time Management, GG66-3264
Implementing Concurrent Copy, GG24-3990
IMS/ESA Version 5 Administration Guide: Data Base, SC26-8012
IMS/ESA Version 5 Administration Guide: System, SC26-8013
IMS/ESA Version 5 Administration Guide: Transaction Manager, SC26-8014
IMS/ESA V5 Operations Guide, SC26-8029
IMS/ESA Version 5 Sample Operating Procedures, SC26-8032
JES2 Multi-Access Spool in a Sysplex Environment, GG66-3263
Large System Performance Reference Document, SC28-1187
LPAR Dynamic Storage Reconfiguration, GG66-3262
MVS/ESA Hardware Configuration Definition: Planning, GC28-1445
MVS/ESA RMF User's Guide, GC33-6483
MVS/ESA RMF V5 Getting Started on Performance Management, LY33-9176
MVS/ESA SML: Implementing System-Managed Storage, SC26-3123
MVS/ESA SP V5 Hardware Configuration Definition: User's Guide, SC33-6468
MVS/ESA SP V5 Assembler Services Guide, GC28-1466
MVS/ESA SP V5 Authorized Assembler Services Guide, GC28-1467
MVS/ESA SP V5 Authorized Assembler Services Reference, Volume 2,
GC28-1476
MVS/ESA SP V5 Conversion Notebook, GC28-1436
MVS/ESA SP V5 Initialization and Tuning Guide, SC28-1451
MVS/ESA SP V5 Initialization and Tuning Reference, SC28-1452
MVS/ESA SP V5 Installation Exits, SC28-1459
MVS/ESA SP V5 JCL Reference, GC28-1479
MVS/ESA SP V5 JES2 Initialization and Tuning Reference, SC28-1454
MVS/ESA SP V5 JES2 Commands, GC28-1443
MVS/ESA SP V5 JES3 Commands, GC28-1444
MVS/ESA SP V5 Planning: Global Resource Serialization, GC28-1450
MVS/ESA SP V5 Planning: Security, GC28-1439
MVS/ESA SP V5 Planning: Operations, GC28-1441
MVS/ESA SP V5 Planning: Workload Management, GC28-1493
MVS/ESA SP V5 Programming: Assembler Services References, GC28-1474
MVS/ESA SP V5 Programming: Sysplex Services Guide, GC28-1495
MVS/ESA SP V5 Programming: Sysplex Services Reference, GC28-1496
MVS/ESA SP V5 Setting Up a Sysplex, GC28-1449
MVS/ESA SP V5 System Commands, GC28-1442
MVS/ESA SP V5 Sysplex Migration Guide, SG24-4581
MVS/ESA SP V5 System Management Facilities (SMF), GC28-1457
S/390 MVS Sysplex Application Migration, GC28-1211
S/390 MVS Sysplex Hardware and Software Migration, GC28-1210
S/390 MVS Sysplex Overview: An Introduction to Data Sharing and
Parallelism, GC28-1208
S/390 MVS Sysplex Systems Management, GC28-1209
S/390 9672/9674 Managing Your Processors, GC38-0452
S/390 9672/9674 System Overview, GA22-7148
SMP/E R8 Reference, SC28-1107
Sysplex Timer Planning, GA23-0365
TSO/E V2 User's Guide, SC28-1880
TSO/E V2 CLISTs, SC28-1876
TSO/E V2 Customization, SC28-1872
VTAM for MVS/ESA Version 4 Release 3 Migration Guide, GC31-6547
International Technical Support Organization Publications
Automating CICS/ESA Operations with CICSPlex SM and NetView, GG24-4424
Batch Performance, SG24-2557
CICS Workload Management Using CICSPlex SM And the MVS/ESA Workload
Manager, GG24-4286
CICS/ESA and IMS/ESA: DBCTL Migration For CICS Users, GG24-3484
DFSMS/MVS Version 1 Release 3.0 Presentation Guide, GG24-4391
DFSORT Release 13 Benchmark Guide, GG24-4476
Disaster Recovery Library: Planning Guide, GG24-4210
MVS/ESA Software Management Cookbook, GG24-3481
MVS/ESA SP-JES2 Version 5 Implementation Guide, SG24-4583
MVS/ESA SP-JES3 Version 5 Implementation Guide, SG24-4582
MVS/ESA Version 5 Sysplex Migration Guide, SG24-4581
MVS/ESA Sysplex Migration Guide, GG24-3925
Planning for CICS Continuous Availability in an MVS/ESA Environment,
SG24-4593
RACF Version 2 Release 1 Installation and Implementation Guide, GG2
RACF Version 2 Release 2 Technical Presentation Guide, GG24-2539
Sysplex Automation and Consoles, GG24-3854
S/390 Microprocessor Models R2 and R3 Overview, SG24-4575
S/390 MVS Parallel Sysplex Continuous Availability Presentation Guide,
SG24-4502
S/390 MVS Parallel Sysplex Performance, GG24-4356
S/390 MVS/ESA Version 5 WLM Performance Studies, SG24-4352
Storage Performance Tools and Techniques for MVS/ESA, GG24-4045
A complete list of International Technical Support Organization publications,
known as redbooks, with a brief description of each, may be found in:
International Technical Support Organization Bibliography of Redbooks,
GG24-3070.
To get a catalog of ITSO redbooks, VNET users may type:
TOOLS SENDTO WTSCPOK TOOLS REDBOOKS GET REDBOOKS CATALOG
A listing of all redbooks, sorted by category, may also be found on MKTTOOLS
as ITSOCAT TXT. This package is updated monthly.
How to Order ITSO Redbooks
IBM employees in the USA may order ITSO books and CD-ROMs using
PUBORDER. Customers in the USA may order by calling 1-800-879-2755 or by
faxing 1-800-445-9269. Most major credit cards are accepted. Outside the
USA, customers should contact their local IBM office. For guidance on
ordering, send a PROFS note to BOOKSHOP at DKIBMVM1 or E-mail [email protected].
Customers may order hardcopy ITSO books individually or in customized
sets, called BOFs, which relate to specific functions of interest. IBM
employees and customers may also order ITSO books in online format on
CD-ROM collections, which contain redbooks on a variety of products.
ITSO Redbooks on the World Wide Web (WWW)
Internet users may find information about redbooks on the ITSO World Wide Web
home page. To access the ITSO Web pages, point your Web browser to the following URL:
http://www.redbooks.ibm.com/redbooks
IBM employees may access LIST3820s of redbooks as well. The internal
Redbooks home page may be found at the following URL:
http://w3.itsc.pok.ibm.com/redbooks/redbooks.html
Acknowledgments
This publication is the result of a residency conducted at the International
Technical Support Organization, Poughkeepsie Center.
The advisor for this project was:

G. Tom Russell
International Technical Support Organization, Poughkeepsie

The authors of this document are:

Paola Bari
IBM Italy

Margaret Beal
IBM Australia

Horace Dyke
IBM Canada

Patrick Kappeler
IBM France

Paul O'Neill
IBM Nordic

Ian Waite
IBM UK
Part 1. Configuring for Continuous Availability
This part describes how to configure both the hardware and software in order to:
Eliminate planned outages
Minimize the impact of unplanned outages
Chapter 1. Hardware Configuration
This chapter discusses how to design a hardware configuration for continuous
availability. This means eliminating all single points of failure, and making it
possible to change both hardware and software without disrupting the
running of the applications.
1.1 What Is Continuous Availability?
When we speak about continuous availability we are really dealing with two
different but interrelated topics: high availability and continuous operations.
High availability has to do with keeping the applications running, without any
breakdown, during the planned opening hours. We achieve this by a combination
of high reliability in the individual components of the system and redundancy of
components, so that even if a component fails there is another one there that
can replace it.
Continuous operations on the other hand is about keeping the applications and
systems running without any planned stops. This in itself would not be too big a
problem if it were not for the opposing but equally urgent need for
responsiveness to changing business requirements. So the simplistic solution of
freezing all changes just will not do.
What the end users increasingly require is that the applications are kept running
without any planned or unplanned stops, and this is what we mean by continuous
availability.
Up to now the only real solution to these requirements has been redundancy at
the system level. This is a costly solution, but organizations such as airl ines that
have these requirements often have two complete systems, where one runs the
production and the other is a hot standby, and they can switch the production
from one system to the other quickly. Then if they have an unplanned
breakdown on the production system the standby one takes over with a
minimum delay. Having a second system also allows them to make planned
changes to the standby system, and then switch the production over to it when
they are ready to bring the change into operation.
1.1.1 Parallel Sysplex and Continuous Availability
The parallel sysplex was designed to:
Provide a single system image to the end-user of the application
Support multiple copies of the applications, and provide services for dynamic
balancing of the workload over the multiple copies
Provide locking facilities to allow data to be shared among the multiple
copies of the applications with integrity
Provide services to facilitate communication between the multiple copies
From the perspective of continuous availability, the two most important functions
provided by a parallel sysplex are:
Data Sharing
Which allows multiple instances of an application running on multiple
systems to work on the same databases simultaneously.
Workload Balancing
Which means that the workload can be distributed evenly across these
multiple application instances. This is made possible by the fact that they
can share data.
These radically new possibilities provided by parallel sysplex change the way we
approach continuous availability.
Today, a specific system provides the infrastructure for a major customer
application. The loss or degradation of that system can severely impact the
customer's business.
In the parallel sysplex environment, where multiple cooperating systems provide
the infrastructure, the loss or degradation of one of the many identical systems
has little impact.
This means that we can now design a system that is fault-tolerant from both a
hardware and software perspective, giving us the possibility of the following:
Very High Availability
With redundancy in both hardware and software we can eliminate
points-of-failure, and workload balancing can ensure that the work being
done on a lost component will be distributed across the remaining ones.
Nondisruptive Change
Hardware changes can be made by removing the system that needs to be
changed from the sysplex while the applications continue to run on the
remaining systems, making the change, and then returning the system to the
sysplex.
Software changes can be achieved in a similar way, provided that the
changed version of the software in question can co-exist with the current
ones in the sysplex. This coexistence (at level N and N+1) is a design
objective of the IBM systems and subsystems that support parallel sysplex.
This shift in philosophy changes the way we think about designing the
configuration in a parallel sysplex. To take advantage of (or exploit) the
parallel sysplex, there must be more than one of each hardware component, and
the software must be designed for cloning.
If the application requires N images in order to provide the processing capacity,
then the system designer should provide N+1 images in the sysplex.
1.1.2 Why N+1?
When designing systems for high availability we must always consider the
possibility that a component can fail. If we build the system with redundant
components such that, even if any component does fail, the system will continue
to function, then we have a fault-tolerant system. We can also say that we have
no single point of failure.
Obviously this component redundancy has a cost. The simplest, but most
expensive, solution is to duplicate everything. This is often not an economically
viable alternative. Fortunately there are others.
If we assume that the individual components of the system are inherently
reliable, that is that the probability of failure is very low for each component,
then the probability of more than one failing at any one time is extremely low,
and can be ignored. So, if we need a number of components (N) to do a
particular job, all we need to do is allocate one extra to allow for the possibility
of failure, and these N+1 components give us the redundancy we need. The
larger the number of components (N) sharing the work, the less the relative cost
of this redundancy.
In other words, if we are flying in a two-engined plane and want to be safe in the
case of an engine failure, then one engine must be able to f ly the plane. This
means one of the two engines (50%) is redundant. If it is a four-engined plane
then we want to be able to continue with three engines, so the fourth one (25%)
is redundant.
In the same way, we have been building hardware redundancy into computer
systems for some time: multiple channels to I/O units, duplicate power supplies
in the processor, and so on.
Now with parallel sysplex we can take this concept one step further, and
introduce N+1 redundancy in the number of machines or system images in the
system. This allows us to configure for the failure of entire machines or system
images and still keep the system on the air.
Figure 1. Sample Parallel Sysplex Continuous Availability Configuration. The coupling facilities, sysplex timers,
and all the links are duplicated to eliminate single points of failure.
1.2 Processors
The first prerequisite is that we have multiple processors following the N+1
philosophy outlined above.
1.2.1.1 CMOS-Only Sysplexes
If we are designing a configuration from scratch, using CMOS processors, then
this is just a matter of deciding what the optimal processor size is and then
configuring N+1 identical machines, where N of these are sufficient to run the
workload.
In theory, the larger N is (that is, the smaller the individual machines), the
lower the cost of the redundant N+1 machine.
In practice there are counterbalancing reasons, such as the following:
The performance overhead on the sysplex (between 0.5% and 1% for each
extra machine).
The extra human effort in managing more machines (which will depend on
how well the systems management procedures and tools can handle multiple
machines).
The extra work involved in maintaining more system images (which will
depend on how well the clones are replicated and on how well the naming
and other installation standards support this).
How useful small machines are in handling the workload. If there are
components in the workload that require larger machines to perform
satisfactorily then this will tend to reduce the number of ways we can split
the sysplex.
1.2.1.2 Mixed Sysplexes
Very often a sysplex will be a mixture of large bipolar and smaller CMOS
machines. This is for many installations a natural evolution from their current
bipolar configurations and allows these machines to continue their useful life into
the parallel sysplex world. It may also be necessary to keep these larger
machines because parts of the workload need either the larger system image or
the more powerful engines that these provide.
In many cases it is not realistic to adopt a simplistic N+1 approach to these
configurations with large machines due to the high cost of having a redundant
large processor. In any event we are often dealing here with a transition state,
where not all of the work can be partitioned on a sysplex. What we need to
consider from an availability viewpoint is the effect of the failure of each machine
in the configuration, and particularly the larger ones. We must ensure that there
is reserve capacity available to take over the essential work from that machine.
This may involve removing or reducing the priority of some other nonessential
work.
1.3 Coupling Facilities
The recommended configuration of coupling facilities for availability is to have at
least two of them, and as separate 9674s, not partitions in processors doing
other work.
1.3.1 Separate Machines
The reason for having them as separate machines is that if a coupling facility
fails then the structures it contains will have to be rebuilt in another coupling
facility, and this rebuild will be done using data from the coupled MVS systems.
If you run the coupling facility in a partition in a machine which is also running
one of the systems in the sysplex, then a hardware failure on this machine will
not only bring down the coupling facility but also one of the sources needed to
rebuild it. The only way to recover from this situation is to restart the whole
sysplex.
1.3.2 How Many?
In deciding how many coupling facilities you need, the same N+1 considerations
apply as we have seen for processors. If one fails, we need to have sufficient
processor capacity and memory available in the remaining ones to rebuild the
structures and handle the load.
The simplest design is where we have two coupling facilities, each of which has
enough processor power and memory to handle the entire sysplex. In normal
production we can then distribute the structures over these, and for each
structure specify the other CF as the alternate for rebuild in case of a failure.
1.3.3 CF Links
The recommended number of CF links to each machine in the sysplex is at least
two, for availability reasons. You may need more for performance; see Parallel
Sysplex Performance, GG24-4356. Note that each of these receiver links (at the
CF end) is separate. Sender links (at the MVS end) can be shared between
partitions in a fashion similar to EMIF, so even if you have several partitions you
will only need two links per machine for each CF you need to connect to. If you
have an MP-machine which you plan to partition for any reason, then this means
two links per CF on each side of the machine.
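As a minimal sketch of this sharing (the CHPID numbers are placeholders, and the exact IOCP syntax depends on your processor and IOCP level), the two sender links on an MVS machine might be defined as shared channel paths in the IOCDS:

*  TWO CF SENDER CHPIDS, SHARED BY ALL LOGICAL PARTITIONS
   CHPID PATH=(10),TYPE=CFS,SHARED
   CHPID PATH=(11),TYPE=CFS,SHARED

With both links defined as shared, every partition on the machine reaches the coupling facility through the same two physical links, so no partition needs dedicated links of its own.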
In the coupling facility, one Intersystem Channel Adapter (fc #0014) is required
for every two coupling links (#0007 or #0008). The Intersystem Channel Adapter
is not hot pluggable, but the coupling links are. If you do not have a redundant
9674 to switch the coupling load to, you may want to consider installing
additional Intersystem Channel Adapters to allow for additional coupling links to
be installed without an outage in the future. For details on hot plugging, refer to
the 9672/9674 System Overview, GA22-7148.
1.3.4 Coupling Facility Structures
There could be some planned activities that require a coupling facility shutdown.
A coupling facility cannot be treated as a normal device. It requires a particular
procedure to be deallocated by the subsystems, and the shutdown may or may not be
disruptive, depending on the initial coupling facility settings and the usage
made by each different user. Here we will go through some considerations that
can be useful in designing the coupling facility environment and making it
possible to remove structures.
While designing the coupling facility environment, you should consider which
structures must be relocated to an alternate coupling facility. Some subsystems
can continue to operate without their coupling facility structure, although there
may be a loss of performance. For example, the JES2 checkpoint can be
relocated to DASD and the RACF structure can simply be deallocated while
coupling facility maintenance is being performed. For the remaining structures,
you must ensure that enough capacity (storage, CPU cycles, link connections,
structure IDs, etc.) exists on an alternate coupling facility to allow structures to
be rebuilt there.
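As an illustration of the JES2 case (a hedged sketch only; the structure, data set, and volume names are invented), the checkpoint definition in the JES2 initialization deck could place the primary checkpoint in a coupling facility structure and the alternate on DASD:

   CKPTDEF CKPT1=(STRNAME=JES2CKPT,INUSE=YES),
           CKPT2=(DSN=SYS1.JES2.CKPT2,VOLSER=SPOOL1,INUSE=YES),
           MODE=DUPLEX,DUPLEX=ON

With a definition along these lines, the checkpoint can be forwarded to the DASD copy while the coupling facility is taken down for maintenance, and moved back afterwards.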
When you set up your coupling facility configuration you should provide
definitions that enable the structures to be moved or rebuilt; structures being
moved to the alternate coupling facility must have the alternate coupling facility
name in the PREFLIST statement. The following is an example of how to define
a structure that can be rebuilt:
STRUCTURE NAME(IEFAUTOS)
          SIZE(640)
          REBUILDPERCENT(20)
          PREFLIST(CF01,CF02)
For structures that will be moved (rebuilt) from the outgoing coupling facility to
an alternate coupling facility, ensure that all systems using the structures have
connectivity to the alternate coupling facility.
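As a sketch of the operational side (using the structure and coupling facility names from the example above), the rebuilds for a planned shutdown can then be driven with operator commands:

   SETXCF START,REBUILD,STRNAME=IEFAUTOS,LOC=OTHER
   SETXCF START,REBUILD,CFNAME=CF01,LOC=OTHER

The first command rebuilds a single structure in the other coupling facility in its preference list; the second requests a rebuild of all rebuild-capable structures currently allocated in CF01.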
1.3.5 Coupling Facility Volatility/Nonvolatility
Planning a coupling facility configuration for continuous availability requires
particular attention to the storage volatility of the coupling facility where shared
data resides. The advantages of a nonvolatile coupling facility are that if you
lose power to a coupling facility that is configured to be nonvolatile, the coupling
facility enters power save mode, saving the data contained in the structures.
Continuous availability of structures can be provided by making the coupling
facility storage contents nonvolatile.
This can be done in different ways, depending on how long a power loss we want
to allow for:
With a UPS
With an optional battery backup feature
With a UPS plus a battery backup feature
For more details on this see 1.15.2, 9672/9674 Protection against Power
Disturbances on page 27.
The volatility or nonvolatility of the coupling facility is reflected by the volatility
attribute, and can be monitored by the system and subsystems to decide on
recovery actions in the case of power failure.
There are some subsystems, such as the system logger, that are very sensitive to
the status of this coupling facility attribute, and they can behave in different ways
depending on the volatility status. To set the volatility attribute, you should use
one of the following coupling facility control code (CFCC) commands:
Mode Powersave
This is the default setup and automatically determines the volatility status of
the coupling facility based on the presence of the battery backup feature. If
the battery backup is installed and working, the CFCC sets its status to
nonvolatile. The battery backup feature will preserve coupling facility
storage contents across a certain time interval (the default is 10 seconds).
Mode Non-Volatile
This command should be used to inform the CFCC to set non-volatile status
for its storage because a UPS is installed.
Mode Volatile
This command informs the CFCC to put its storage in volatile status
irrespective of whether there is a battery or not.
There are considerations in coupling facility planning depending on the
sensitivity of subsystem users to coupling facility volatile/nonvolatile status:
JES2
JES2 can use a coupling facility structure for primary checkpoint data set,
and its alternate checkpoint data set can either be in a coupling facility or on
DASD. Depending on the volatility of the coupling facility, JES2 will or will
not allow you to have both primary and secondary checkpoint data sets on
the coupling facility.
Logger
The system logger can be sensitive to the volatile/nonvolatile status of the
coupling facility where the LOGSTREAM structures are allocated.
In particular, depending on the coupling facility status, the system logger is
able to protect its data against a double failure (an MVS failure together with
a coupling facility failure). When you define a LOGSTREAM you can specify the
following parameters (a sample logstream definition follows this list):
STG_DUPLEX(NO/YES)
Specifies whether the coupling facility logstream data should be
duplexed on DASD staging data sets. You can use this specification
together with the DUPLEXMODE parameter to be configuration
independent.
DUPLEXMODE(COND/UNCOND)
Specifies the conditions under which the coupling facility log data will be
duplexed in DASD staging data sets. COND means that duplexing will be
done only if the logstream contains a single point of failure and is
therefore vulnerable to permanent log data loss:
- The logstream is allocated to a volatile coupling facility residing on the
same machine as the MVS system.
- Duplexing will not be done if the coupling facility for the logstream is
nonvolatile and resides on a different machine than the MVS system.
DB2
DB2 requests that MVS allocate its structures in a nonvolatile coupling
facility; however, this does not prevent allocation in a volatile coupling facility.
DB2 does issue a warning message if allocation occurs into a volatile
coupling facility. A change in volatility after allocation does not have an
effect on your existing structures.
The advantages of a nonvolatile coupling facility are that if you lose power to
a coupling facility that is configured to be nonvolatile, the coupling facility
enters power save mode, saving the data contained in the structures. When
power is returned, there is no need to do a group restart, and there is no
need to recover the data from the group buffer pools. For DB2 systems
requiring high availability, nonvolatile coupling facilities are recommended.
SMSVSAM Lock
The coupling facility IGWLOCK00 lock structure is recommended to be
allocated in a nonvolatile coupling facility. This lock structure is used to
enforce the protocol restrictions for VSAM RLS data sets and maintain the
record level locks. The support requires a single CF lock structure.
IRLM Lock
The lock structures for IMS or DB2 locks are recommended to be allocated in
a nonvolatile coupling facility. Recovery after a power failure is faster if the
locks are still available.
IMS Cache Directory
The cache directory structure for VSAM or OSAM databases can be
allocated in a nonvolatile or volatile coupling facility.
VTAM
The VTAM Generic Resources structure ISTGENERIC can be allocated in
either a nonvolatile or a volatile coupling facility. VTAM has no special
processing for handling a coupling facility volatility change.
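To illustrate the logger parameters discussed above, here is a minimal sketch of a logstream definition using the IXCMIAPU administrative data utility; the logstream and structure names are placeholders, and other required parameters (structure definition, sizes, and so on) are omitted:

   //LOGRPOL  EXEC PGM=IXCMIAPU
   //SYSPRINT DD SYSOUT=*
   //SYSIN    DD *
     DATA TYPE(LOGR)
     DEFINE LOGSTREAM NAME(SYSPLEX.OPERLOG)
            STRUCTNAME(OPERLOG)
            STG_DUPLEX(YES)
            DUPLEXMODE(COND)
   /*

With DUPLEXMODE(COND), staging data sets are used only while the logstream data would otherwise contain a single point of failure, so the duplexing overhead is paid only when the configuration actually requires it.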
1.4 Sysplex Timers
In a multi-system sysplex it is necessary to synchronize the Time-of-Day (TOD)
clocks in all the systems very accurately in order to maintain data integrity. If all
the systems are in the same CPC, under PR/SM, then this is no problem as they
are all using the same TOD clock. If the systems are spread across more than
one CPC then the TOD clocks in all these CPCs must be synchronized using a
single external time source, the sysplex timer.
The IBM Sysplex Timer (9037) is a table-top unit that can synchronize the TOD
clocks in up to 16 processors or processor sides, which are connected to it by
fiber-optic links. For full details see IBM 9037 Sysplex Timer and System/390
Time Management, GG66-3264-00.
The sysplex cannot continue to function without the sysplex timer. If any system
loses the timer signal, it will be fenced from the sysplex and put in an
unrestartable wait state.
1.4.1 Duplicating
When the Expanded Availability Feature is installed, two 9037 devices, linked to
one another, provide a synchronized, redundant configuration. This ensures that
the failure of one 9037, or a fiber optic cable, will not cause loss of time
synchronization. It is recommended that each 9037 have its own AC power
source, so that if one source fails, both devices are not affected.
Note that these two timers must be within 2.2 meters of one another.
The sysplex timer attaches to the processor via the processor's Sysplex Timer
Attachment Feature. Dual ports on the attachment feature permit redundant
connections, so that there is no single point of failure.
1.4.2 Distance
The processors are connected to the timer by multi-mode fiber, and can be up
to three kilometers from the timer, depending on the fiber. Distances between the
sysplex timer and CECs beyond 3,000 meters are supported by RPQ 8K1919.
RPQ 8K1919 allows the use of single mode fiber optic (laser) links between the
processor and the 9037. To support single mode fiber on the 9037, a special
LED/laser converter, called the 9036 Model 003, has been designed. The 9036-003
is designed for use only with a 9037, and is available only as RPQ 8K1919. Two
9036-003 extenders (two RPQs) are required between the 9037 and each sysplex
timer attachment port on the processor.
The si