Implementing Control-M in a Distributed Environment under MC/ServiceGuard

Presentation Overview

Quick note on the installation environment

Business pressures on IT infrastructure and the need for automation

Selection process (taken from the case study) and getting the right tool for the job without losing track of the needs of the business

Installation and Configuration; Mainframe standards on Unix?

How it works in the real world

Future development (Control-M takes over!)

Any questions/suggestions?

Notes on the Installation Environment

One Control-M Server version 2.2.4 running on HP-UX 11.0 (currently moving up to 11i)

Enterprise Controlstation Release 5.0.0 running on HP-UX 11.0 (Exceed/Motif Version)

CONTROL-M Agents v2.2.4, running on:
NT 4.0 Servers (3 servers)
Windows 2000 Servers (15 servers)
HP-UX 10.20 and 11.0 (approximately 20)
HP-UX 11i (4 servers)

Installation Environment

Part One - IT Infrastructure Changes; A Typical Story

Traditional Mainframe site
Nightly batch
Attended operations

Mainframe declines as part of the move to Distributed Computing -

Perceived as cheaper & more flexible
'User-centric'

Mainframe (and associated staff) are Decommissioned

However ...

The Distributed Enterprise still requires:
'Unseen' operational tasks to be completed
Some method of centralised batch control/reporting needed
Specified tasks to execute reliably (i.e. more reliably than 'cron' table entries and in-house written scripts)

Additional Developments:
Management commit to run services for Far East Office (+7 hours) on existing platforms (over Citrix MetaFrame)
Increasing need for integrated cross-platform tasks (especially as M/F applications are migrated)
Staff headcount under pressure

The Scenario

The Company's Golden IT Rules

No risks taken with core IT systems

Therefore, avoid single points of failure and build redundancy into hardware infrastructure

Extremely high priority given to achieving rigid security standards

Reconciling the Pressures and the Required Standards

Accept that the Distributed Computing model is established and needs to be embraced

Business requires reliably integrated cross platform tasks – the Enterprise needs to supply this service

“Give me something of Mainframe quality, but on Unix instead”

Consider a job scheduler

Part Two – The Selection Process

Selection Process – Initial Steps

After establishing the “Enterprise-Wide” need, ask for input from Sys Administrators, Application Support & major users

Build matrix of technical requirements and “preferred features”

Define product “pot-holes” to avoid

Draw up shortlist of potential software

Product “Wish List”

Compatible with -

Unix
NT and Windows 2000
NetWare
MC/ServiceGuard
VantagePoint Operations (ITO OpenView)
PeopleSoft
Standard Command Line Interface

Preferred Features

Including -

“Easy use” colour coded GUIs
Extensive batch administration options
Plug-ins for widest possible range of systems
SNMP
Email messaging
Batch modelling

Avoid the Pot Holes!

Hold a Contest

Check sources (Gartner, GIGA, search web)

Invite vendors to make presentation

Rate each product (build scoring table)

Select a winner to come forward for a test installation, possibly testing the top two onsite
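The scoring table mentioned above can be kept as a simple weighted matrix. The following is a minimal sketch only: the criteria, weights, product names and scores are all hypothetical placeholders, not figures from the case study.

```python
# Illustrative only: criteria, weights, product names and scores are hypothetical.
from typing import Dict, List, Tuple

# Weight of each requirement (higher = more important to the business).
WEIGHTS: Dict[str, int] = {
    "cluster_failover_support": 5,
    "cross_platform_agents": 4,
    "gui_usability": 2,
    "snmp_and_email_alerts": 3,
}

# Scores from 0 (absent) to 5 (excellent), e.g. agreed after each vendor presentation.
SCORES: Dict[str, Dict[str, int]] = {
    "Product A": {"cluster_failover_support": 5, "cross_platform_agents": 4,
                  "gui_usability": 3, "snmp_and_email_alerts": 4},
    "Product B": {"cluster_failover_support": 2, "cross_platform_agents": 5,
                  "gui_usability": 5, "snmp_and_email_alerts": 3},
    "Product C": {"cluster_failover_support": 3, "cross_platform_agents": 3,
                  "gui_usability": 4, "snmp_and_email_alerts": 2},
}


def weighted_total(scores: Dict[str, int], weights: Dict[str, int]) -> int:
    """Sum of weight * score over every criterion in the weight table."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_products(all_scores: Dict[str, Dict[str, int]],
                  weights: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (product, total) pairs, best first."""
    totals = {name: weighted_total(s, weights) for name, s in all_scores.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    ranking = rank_products(SCORES, WEIGHTS)
    for name, total in ranking:
        print(f"{name}: {total}")
    # The top two entries are the candidates for an onsite test installation.
    print("Shortlist:", [name for name, _ in ranking[:2]])
```

Weighting the "must have" items heavily keeps an attractive GUI from outscoring a product that actually meets the golden rules.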

And the Winner is ...

Return to the Original Issues

Does the product do the vital tasks?

How does the test installation perform and what do the test users think?

Does the cost of the product outweigh the benefits to be gained from installation?

Can I be sure that the product will not become a 'White Elephant'?

Take your time before coming to a decision ...

Part Three – Installation and Configuration

The challenge = installing the product and obtaining the desired standards

Consider the issues behind loading all your mission critical jobs into a single system for execution

Consider failure scenarios

Uphold your golden rules

Implementing High Availability

Threats to system availability:
Hardware Failure or System Error = 44%
Human Error = 32%
Application Software = 14%
Viruses = 7%
Natural Disasters = 3%
(Source: Contingency Planning Research, Livingston, NJ, USA)

Put the job scheduler into context (make it as redundant as the core systems)

Consider MC/ServiceGuard as the HP-UX Tool for automatic failover for clustered HP 9000 Enterprise Servers

What Is MC/ServiceGuard?

MC (Multi-Computer) ServiceGuard is HP's High Availability solution, similar to HACMP for AIX/R6k & MS Cluster Server Software (aka Wolfpack)

Under ServiceGuard applications are seen as 'packages' with their own DNS entry

Redundant/mirrored storage required

ServiceGuard monitors for software failures or SPU/Disk/LAN component failures & coordinates the transfer between failing and redundant components

MC/ServiceGuard in Action

The Important Issues for Control-M

The package names are used in Control-M job definitions (not underlying IP addresses) and are synonymous with the DNS entries for each node

If the applications are packaged themselves (possibly together with other apps) then any outages will be minimised

Packages can be manually failed over via operator commands, thus allowing rolling maintenance on production platforms
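Why package names matter can be shown with a small simulation. This is a sketch of the idea only, assuming a two-node cluster; the class names, node names and package name are hypothetical, and nothing here calls ServiceGuard or Control-M.

```python
# Hypothetical names throughout; this models the idea, not any ServiceGuard or Control-M API.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    nodes: List[str]
    # package name -> node currently hosting that package
    placement: Dict[str, str] = field(default_factory=dict)

    def start_package(self, package: str, node: str) -> None:
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.placement[package] = node

    def fail_over(self, package: str) -> str:
        """Move a package to the next node (operator command or automatic failover)."""
        current = self.placement[package]
        target = self.nodes[(self.nodes.index(current) + 1) % len(self.nodes)]
        self.placement[package] = target
        return target

    def resolve(self, package: str) -> str:
        """Stand-in for the DNS lookup of the package name."""
        return self.placement[package]


@dataclass(frozen=True)
class JobDefinition:
    name: str
    target: str  # a package name, never a physical node or IP address


def dispatch(job: JobDefinition, cluster: Cluster) -> str:
    """Resolve the package at run time and report where the job would execute."""
    node = cluster.resolve(job.target)
    return f"{job.name} -> {job.target} (on {node})"


if __name__ == "__main__":
    cluster = Cluster(nodes=["node1", "node2"])
    cluster.start_package("payments_pkg", "node1")
    job = JobDefinition(name="nightly_extract", target="payments_pkg")

    print(dispatch(job, cluster))      # runs on node1
    cluster.fail_over("payments_pkg")  # e.g. rolling maintenance on node1
    print(dispatch(job, cluster))      # same job definition, now on node2
```

The job definition is never edited during the failover; only the package's location changes, which is what makes rolling maintenance practical.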

Installation Issues

Control-M installation needs to be fully planned

Consider creating separate Sybase server for Control-M/ECS

Underlying Sybase databases (used by Control-M) also need to be defined as ServiceGuard packages

Only the Exceed/Motif version of ECS (i.e. not NT) is available when installing under ServiceGuard with ECS version 500 (addressed in ECS 600)

Control-M Has Built-in Failover

Control-M server has options to internally create a mirrored database and backup server, but:

This will have to be defined separately by the Control-M administrator

Failover is not automatic; it requires intervention

Failover is designed as short-term contingency (i.e. get the original server or DB fixed ASAP!)

Clustered hardware, redundant storage, high availability systems

Control-M fully integrated into environment

'Intelligent' scheduling deployed

Introduce naming standards and conventions

Consider how best to implement your security policy

Part Four – The Real World

For Example – Before Control-M

08:00 support onsite, backups have failed

For Example – Before Control-M

Both backups completed by 06:30, support

Now Under Control-M

Control Resources & AutoEdit Variables

Definable Quantitative Resources

Shout Messages (for various situations)

From/to windows, maximum reruns & cyclical

'On' conditions for return codes & standard out

User defined calendars or set pattern of days

Critical path jobs, priority settings
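Of the features listed, quantitative resources are perhaps the easiest to picture in code. The sketch below is a generic counting-resource model, not Control-M's implementation; the resource name "tape_drives" and the job names are assumptions chosen to echo the backup example.

```python
# Generic model of quantitative (counted) resources; names are hypothetical.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Job:
    name: str
    needs: Dict[str, int]  # resource name -> units required while the job runs


class ResourcePool:
    def __init__(self, totals: Dict[str, int]) -> None:
        self.totals = dict(totals)
        self.in_use = {name: 0 for name in totals}

    def can_run(self, job: Job) -> bool:
        """True only if every resource the job needs has enough free units."""
        return all(
            self.in_use.get(res, 0) + units <= self.totals.get(res, 0)
            for res, units in job.needs.items()
        )

    def acquire(self, job: Job) -> bool:
        """Reserve the job's units if all are available; otherwise reserve nothing."""
        if not self.can_run(job):
            return False
        for res, units in job.needs.items():
            self.in_use[res] = self.in_use.get(res, 0) + units
        return True

    def release(self, job: Job) -> None:
        """Give the job's units back when it finishes."""
        for res, units in job.needs.items():
            self.in_use[res] -= units


def start_eligible(jobs: List[Job], pool: ResourcePool) -> List[str]:
    """Start jobs in order, skipping any that would exceed a resource limit."""
    return [job.name for job in jobs if pool.acquire(job)]


if __name__ == "__main__":
    pool = ResourcePool({"tape_drives": 2})
    backups = [Job(f"backup_{s}", {"tape_drives": 1}) for s in ("a", "b", "c")]
    print(start_eligible(backups, pool))       # the third backup has to wait
    pool.release(backups[0])                   # backup_a finishes
    print(start_eligible([backups[2]], pool))  # now backup_c can start
```

Declaring the limit once in the pool, rather than staggering start times by hand, is what stops two backups from colliding on the same device.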

Other Features Under Control-M

Roll out to large number of Agents on W2k and HP-UX 11i

Backup strategy migrating from Legato to Omniback

Control-M SDK to be released in 2002 and possibly used for bespoke banking applications

ECS version 600 to be installed

Future Development Under Control-M

The End

Questions and Suggestions

Thank you for listening

Mark Francome
Globetech AG, Basel
061 263 1360
[email protected]