Network and Voice Management Green Book ENU
8/3/2019 Network and Voice Management Green Book ENU
http://slidepdf.com/reader/full/network-and-voice-management-green-book-enu 1/182
CA GREEN BOOKS
Network and Voice Management
An Integrated Solution for Network Fault and Performance Management
OVERVIEW OF CONVERGED NETWORK CHANGES AND MANAGEMENT NEEDS
BEST PRACTICES FOR DEPLOYING CA’S INTEGRATED SOLUTION FOR NETWORK AND VOICE MANAGEMENT
LEGAL NOTICE
This publication is based on current information and resource allocations as of its date of publication and
is subject to change or withdrawal by CA at any time without notice. The information in this publication
could include typographical errors or technical inaccuracies. CA may make modifications to any CA
product, software program, method or procedure described in this publication at any time without
notice.
Any references in this publication to non-CA products and non-CA websites are provided for convenience
only and shall not serve as CA’s endorsement of such products or websites. Your use of such products,
websites, and any information regarding such products or any materials provided with such products or
at such websites shall be at your own risk.
Notwithstanding anything in this publication to the contrary, this publication shall not (i) constitute
product documentation or specifications under any existing or future written license agreement or
services agreement relating to any CA software product, or be subject to any warranty set forth in any
such written agreement; (ii) serve to affect the rights and/or obligations of CA or its licensees under
any existing or future written license agreement or services agreement relating to any CA software
product; or (iii) serve to amend any product documentation or specifications for any CA software
product. The development, release and timing of any features or functionality described in this
publication remain at CA’s sole discretion.
The information in this publication is based upon CA’s experiences with the referenced software
products in a variety of development and customer environments. Past performance of the software
products in such development and customer environments is not indicative of the future performance of
such software products in identical, similar or different environments. CA does not warrant that the
software products will operate as specifically set forth in this publication. CA will support only the
referenced products in accordance with (i) the documentation and specifications provided with the
referenced product, and (ii) CA’s then-current maintenance and support policy for the referenced
product.
Certain information in this publication may outline CA’s general product direction. All information in this
publication is for your informational purposes only and may not be incorporated into any contract. CA
assumes no responsibility for the accuracy or completeness of the information. To the extent permitted
by applicable law, CA provides this document “AS IS” without warranty of any kind, including, without
limitation, any implied warranties of merchantability, fitness for a particular purpose, or non-
infringement. In no event will CA be liable for any loss or damage, direct or indirect, from the use of
this document, including, without limitation, lost profits, lost investment, business interruption, goodwill
or lost data, even if CA is expressly advised of the possibility of such damages.
COPYRIGHT LICENSE AND NOTICE:
This publication contains sample application programming code and/or language which illustrate
programming techniques on various operating systems. Notwithstanding anything to the contrary
contained in this publication, such sample code does not constitute licensed products or software under
any CA license or services agreement. You may copy, modify and use this sample code for the
purposes of performing the installation methods and routines described in this document. These
samples have not been tested. CA does not make, and you may not rely on, any promise, express or
implied, of reliability, serviceability or function of the sample code.
Copyright © 2007 CA. All rights reserved. All trademarks, trade names, service marks and logos
referenced herein belong to their respective companies.
ACKNOWLEDGEMENTS
CA thanks the following people for their contributions to this CA Green Book:
Principal Authors
Don LeClair
Sue Andersen
Jason Bryk
Roger Craig
Bill Donoghue
Justin Gagnon
Brian Gollaher
Andrew Haigh
Kathleen Hickey
Mark Hounslow
John Kane
Michael Marks
John Murdough
Barbara O’Toole
Pete Oliveira
Jason Warfield
Dianne Weiss
The principal authors and CA would like to thank the following contributors:
Ajei Gopal
Tricia Bancroft
Lynn Beck
Gregory Buonaiuto
Curtis Lehman
Peter Clairmont
Dan Lewis
Anders Magnusson
Alexandre Moscoso
Joe Pennachio
David Soares
Peter Skotny
Cheryl Stauffer
Tom Wilson
CA PRODUCT REFERENCES
CA Network and Voice Management
eHealth®
eHealth® for Voice
SPECTRUM®
eHealth® for Voice Policy Manager
eHealth® E2E Console
eHealth® Live Health
eHealth® Traffic Accountant
eHealth® Universal Workflow Integration Modules
eHealth® Universal Data Integration Modules
eHealth® Universal Wireless Integration Modules
SPECTRUM® Infinity
SPECTRUM® Integrity
SPECTRUM® Xsight
SPECTRUM® OneClick
SPECTRUM® Service Manager
SPECTRUM® Report Manager
SPECTRUM® Alarm Notification Manager
SPECTRUM® ATM Circuit Manager
SPECTRUM® Configuration Manager
SPECTRUM® Secure Domain Manager
SPECTRUM® Frame Relay Manager
SPECTRUM® Microsoft Operations Manager Connector
SPECTRUM® Multicast Manager
SPECTRUM® OSS Integrations
SPECTRUM® QoS Manager
SPECTRUM® Remedy ARS Gateway
SPECTRUM® SNMPv3 Support
SPECTRUM® VPN Manager
SPECTRUM® Watch Editor
SPECTRUM® Service Performance Manager
SPECTRUM® Assurance Server Xsight
SPECTRUM® Assurance Server Integrity
SPECTRUM® Assurance Server Infinity
Contents
Chapter 1: Introduction
About This Book
Executive Summary
Evolving Requirements for Network and Voice Management
CA’s Network and Voice Management Solution
Chapter 2: Challenges of Network and Voice Management
Evolution of the Network-to-Service Delivery Platform
Impact on Network Operations Teams
Impact on Network Management Software Requirements
Chapter 3: CA’s Network and Voice Management Solution
EITM: CA’s Vision
Enterprise Systems Management
The Value of CA’s Network and Voice Management Solution
A Key Part of CA’s EITM Vision
Network and Voice Management for Key Vertical Markets
Telecommunication Service Providers
Government
Enterprise
The Components of the Solution
eHealth
eHealth Components
The Benefits of eHealth
SPECTRUM
SPECTRUM Components
The Benefits of SPECTRUM
Integration between eHealth and SPECTRUM
eHealth for Voice
The Benefits of eHealth for Voice
CA Technology Services Network and Voice Management Service Offerings
Assessment – Understanding the Gaps
CA Maturity Models
Design – Building the Right Solution
Implementation – The Bottom Line of Solution Success
Optimization – Anticipating Change
Why Trust Your Service Availability to CA Technology Services?
How the Solution Delivers the Key Points of Value
Effective Service Level Management
Proactive Service Assurance
Rapid Problem Resolution
Predictive Capacity Planning
Chapter 4: Deployment Architecture for Network and Voice Management
Network Performance Components
E2E Console
Live Health
Integration Modules
Distributed eHealth
Remote Poller
Report Center
Traffic Accountant
Network Fault Management Components
Assurance Server
OneClick
Watch Editor
Alarm Notification Manager
Frame Relay Manager
ATM Circuit Manager
Multicast Manager
QOS Manager
VPN Manager
SNMPv3
Secure Domain Manager
Configuration Manager
Report Manager
Service Performance Manager
Service Manager
Voice Management Components
eHealth for Voice
eHealth for Voice Policy Manager
Deployment Architectures
Small-to-Medium Enterprise Deployment
Large Service Provider Deployment
Network Performance Hardware and Software Requirements/Sizing
Network Fault Management Hardware and Software Requirements/Sizing
eHealth for Voice Single PC or Database Server Hardware and Software Requirements
Chapter 5: Setting Up and Configuring the Integrated Solution
Installing the CA Network and Voice Management Solution Software
Installation Prerequisites
Installation Steps
How You Install SPECTRUM
How You Install SPECTRUM OneClick and Report Manager
How You Install eHealth
How You Install eHealth for Voice
Configuring the Integrated Solution
Best Practices
Identify Resources and Use SPECTRUM to Discover Them as Global Collections
Import Global Collections into eHealth
Organize Your Resources by Creating eHealth Groups
Schedule eHealth Discoveries of Global Collections
Network and Voice Monitoring
Set Up Live Health
Forward Live Health Traps to SPECTRUM
Customize and Schedule Health Reports to Forward Traps
Configure eHealth for Voice to Send Alerts to SPECTRUM
Configure SPECTRUM to Recognize the eHealth Server
Configure SPECTRUM to View eHealth Alarms
System Maintenance
System Backup Archives
Data Recovery Best Practices
Chapter 6: Gathering System Information from Agents
Deployment and Administration of System Agents
Best Practices
Supported Agents
Prerequisites
How You Add System Agents in SPECTRUM
Unicenter NSM Agents
How You Add System Agents in eHealth
Performance Reporting On System Agents
At-a-Glance Reports
MyHealth Reports for Systems
Health Reports for Systems
Using Live Trend
How You Run Trend Reports for Systems
Top N Reports
What-If Capacity Trend Reports for Systems
Chapter 7: Service Level Management
Interview Procedures
Interview Questions
General Questions
Analysis and Mapping Procedures
How You Organize the Resource Information
How You Illustrate the Relationships of Resources to Each Other
How You Decompose the Information and Mapping to Service Models
Example of a Business Service Map to Service Models
Creating Service Models and Relationships
Key Concepts
How You Create Service Models
Example 1: A Customer Account Access Service
Example 2: Extend the Service to Monitor Critical Processes
Implement Example 2 in SPECTRUM
Example 3: Extend the Service to Include a Response Time Element
Create SLAs
Key Concepts
Create SLAs and Guarantees
Example 4: An SLA for the Customer Account Access Service
How You Implement the A to Z Account Access SLA in SPECTRUM
Service and SLA Reporting
Run SPECTRUM Service Manager Customer-Facing Reports
Service Availability by: Name, Customer, Owner
Service Availability Variable Health Level
Service Summary by: Name, Customer, Owner
Service Summary Variable Health Level
SLA Detail By Customer
SLA Inventory by Customer
SPECTRUM Service Manager: Internal Reports
Service Health by Service Name
Service Inventory
Top N Worst Performing Services
Top N Worst Performing Services Including All Outage Types
Top N Worst Service Outages
Top N Worst Service Resources by Total Downtime
SLA Status Current and Recent by Customer
SLA Summary by: Name, Customer, Status
SLA Summary Warned or Violated
SLA Detail By: SLA Name, Time Range, Last N Periods
SLA Detail with Resource Outages
Customer SLA Summary
Chapter 1: Introduction
About This Book
The CA Green Book for Network and Voice Management describes how to manage the
performance and availability of converged networks. The CA solution provides proactive
management of voice and data services, ensures that bandwidth and system capacity are
sufficient, and supports business-driven service levels. The solution also provides integrated
network fault and performance management to support the network as a service delivery
platform.
The information contained in this CA Green Book is designed for network operators,
engineering, and technical staff charged with managing voice and data networks. The
deployment examples highlighted in this book present the views of a small enterprise and a
large service provider. This information may be useful for many other network
deployments, but may not meet all of their specific requirements.
This CA Green Book provides an understanding of capabilities that you can deploy today to
manage your converged network. The opening sections provide a strategic view of the
trends toward converged networks, and the subsequent sections present best practices for
deploying and using the CA Network and Voice Management solution to manage the
converged network.
This CA Green Book contains only information about network and voice management. This
is one of a series of CA Green Books designed to help define the capabilities of CA’s key
solutions and provide best practices on how to manage and secure them. Other CA Green
Books will present solutions across a wide range of IT management topics including
systems management, database management, and workload automation.
This Network and Voice Management Green Book is targeted toward CIOs, network
management teams, and technical teams. The book is structured as follows:
Chapters 1-3: Provide CIOs and network managers with an overview of the challenges
of managing converged networks and the value of CA’s Network and Voice Management
solution.
Chapters 4-10: Deliver sample deployments for the products comprising CA’s Network
and Voice Management solution to network managers and other technical personnel.
Provide best practices for planning, deploying, and configuring this solution to speed the
time-to-value for investments made in optimizing the network.
This CA Green Book also covers the following topics:
Technical descriptions of the components that comprise the recommended solution
Best practices for setting up and configuring the components of the solution
Defining and managing network service levels
Best practices for enabling proactive service assurance and resolving problems quickly
Performing capacity planning and management of voice and data networks
Executive Summary
Evolving Requirements for Network and Voice Management
The entire computing infrastructure has been dependent upon the network. Recently, the
use of the network has changed enormously, with the convergence of data, voice,
business-critical applications, and video content traveling over the same network. Network faults and
performance problems have immediate and negative business consequences for
productivity, cost, and revenue.
As a result, interest in and awareness of network fault and performance issues have
expanded to a wider audience of business-oriented and non-technical users who want and
need real-time information. In addition, the network operations team must rapidly learn
new technologies and expand their management responsibilities to support converged data
and voice networks.
In this environment, network management solutions must provide critical information and
management capabilities appropriate to both technical and business users. Network
management solutions need to provide configurable alerts, dashboards, and analytical
capabilities to all users in addition to delivering traditional fault and performance
management.
Today’s network and voice management solutions must provide the following support:
Heterogeneous Networks: They must support data technologies such as internet
protocol (IP), asynchronous transfer mode (ATM), frame relay (FR), and broadband, as
well as voice infrastructures comprised of legacy time-division multiplexing (TDM)
infrastructures, pure-play IP telephony (IPT) infrastructures, and hybrid infrastructures.
Scale: They must support vast networks distributed across countries and continents, and
enable central or regional network management teams.
Integration: They must be able to use a single solution to manage data and voice
infrastructures, and critical systems.
Role-Based Service Information: They must be able to communicate to external
customers with Service Level Agreements (SLAs) and communicate to internal customers
with either Operational Level Agreements (OLAs) or less formal mechanisms. They must
help both technical and business-minded audiences assess the network’s ability to
support the business, and identify and pinpoint the root cause of problems.
CA’s Network and Voice Management Solution
CA’s Network and Voice Management solution provides converged voice and data
management. It supports the need for IT to proactively manage end-to-end voice and data
services, ensure adequate capacity of bandwidth and systems, and support service levels
defined by the business. This solution is a key part of Enterprise IT Management (EITM),
which is CA’s vision for how to dynamically manage and secure IT environments, enabling
organizations to fully realize the potential of IT.
CA’s Network and Voice Management solution spans both IP and legacy voice technologies,
enabling companies to migrate to IP telephony at their own pace, and reduce the
complexity of managing heterogeneous infrastructures.
Optimization – Anticipating Change
Optimization services evaluate ways in which your existing eHealth and SPECTRUM
solutions can be further utilized or fine-tuned. Health check services can include tuning
and reconfiguration, upgrades, and migrations, as well as training and certifications.
With CA’s converged network and voice management solution, the IT organization becomes
proactive rather than reactive in its approach to managing voice and data. Network
operations teams have the tools to quickly determine the cause of problems. The IT
planning or engineering group can determine if resources are underutilized or reaching a
capacity threshold. The IT department can manage its relationships with key
constituencies with formal service levels. The ability to monitor and report on grade of
service (GoS) and quality of service (QoS) for calls in the voice network is essential to
successful service level management. CA’s Network and Voice Management solution makes
this all possible.
Chapter 2: Challenges of
Network and Voice Management
All computing, whether mainframe, client-server, distributed, grid, or web services computing,
depends on the function of the network on which it resides. Today’s businesses recognize
the dependence of business critical services, such as financial applications and voice, on the
network infrastructure. If the infrastructure is down or slow, the resulting impact on
business-critical applications and services, and the end users who rely on them, creates
loss of revenue and productivity, while increasing costs. On average, Infonetics estimates
that infrastructure downtime and degradation cost enterprises up to 3.6% of revenue
annually.1
Evolution of the Network-to-Service Delivery Platform
Over the past couple of years, the nature of the network itself has changed, and this
change has created significant implications for network management software and for the
operations and IT team members who use it. Not long ago, the main function of the
network was strictly maintaining data connectivity.
Today, the network is considered to be more of a service delivery platform. The network
supports real-time services such as Voice over IP (VoIP), IP Television (IPTV), and video
teleconferencing, all of which have evolved from early adoption and are now approaching
mainstream adoption. Enterprises are increasingly reliant on applications distributed over
wide geographic areas to provide any-time access to employees, customers, and partners
to accomplish critical business functions. Furthermore, the equipment comprising today’s
networks is now embedded with services such as security, high availability, and storage,
which were previously provided by infrastructures found outside of the network.
Impact on Network Operations Teams
The impact of this evolution on the roles and responsibilities of network operations teams
has been significant. As companies rely on real-time services to improve revenue, raise
productivity, and cut costs, any degradation in service has an immediate impact on
customer satisfaction and the business.
Because the network is carrying applications that directly impact the company’s bottom
line, the range of internal constituents who want to understand the performance of the
network has expanded from technical groups to business-oriented, non-technical, line-of-
business managers. Network operations team members now have a whole new set of
internal and external customers to whom they need to communicate their ability to meet
service level commitments. Because this group is non-technical, network operations teams
need to communicate with them in business terms rather than technical terms.
1 Infonetics Research, The Cost of Enterprise Downtime, North America 2004.
http://www.infonetics.com. Used with permission.
In addition, because of the tight inter-dependence between the network infrastructure and
the applications and services that are provided through it, network operations teams need
to extend their oversight from purely network infrastructure to applications, voice, and
other services as well. This extension requires a deeper understanding of new technologies
formerly managed by other teams. The embedding of services into network equipment has
also increased the range and complexity of devices that must be managed by the network
teams, which further complicates their roles.
Impact on Network Management Software Requirements
The evolution of converged networks and network operations challenges requires
management solutions able to maintain the connection between the business and both
internal and external customers. Strategic use of network management capabilities can
minimize potential accountability problems between network managers and application/IT
managers. One group needs to account for network performance issues, while the other is
responsible for the health of the applications deployed across the network and for satisfying
internal customers through the application user experience.
The ability to provide a multitude of reports and statistics on network status is essential.
However, simple and user-friendly network management tools, with high-level alert
dashboards and features that enable users to drill down to application and network issues,
are gaining acceptance and will ultimately be accounted for in the IT budget.
Converging networks in both enterprise and service providers will force network and IT
managers to view network conditions on a per-application, per-flow basis. New
opportunities are already in action in wireless local area network (WLAN) management,
security management, VoIP management, and network configuration. Automation and
visibility tools for managed service providers will be critical in offering services across
multiple networks to multiple offices. This need is further amplified by the tightly knit
supply chain within information networks and the increasing trend of distributed and mobile
workforces.
Operations teams charged with the responsibility to maintain converged networks face the
following critical challenges:
Heterogeneous Networks: Networks are now composed of a broad range of
technologies and vendors, including data technologies like IP, ATM, FR, and broadband,
as well as voice infrastructures comprising legacy TDM infrastructures, pure-play
IPT infrastructures, and hybrid infrastructures. Most migration to VoIP occurs gradually;
therefore, the need to simultaneously manage both legacy TDM and IP telephony
environments still exists.
Scale: Today’s voice and data infrastructures are vast and span wide geographic areas,
across time zones, countries, and even continents. The management system must be
able to scale to support these very large infrastructures and provide the required
information to the operations teams, regardless of whether they are located centrally or
regionally.
Required to Manage Data Infrastructures, Voice Infrastructures, and Critical
Systems: The management system must span domains to allow IT team members to
smoothly manage technical domains which were previously managed as silos.
Required to Communicate QoS to a Variety of Constituents: Operations teams need
to be able to communicate to external customers through SLAs, and internal constituents
through OLAs or less formal mechanisms. Therefore, management software must contain
the intelligence and capabilities to do the following for both technical and business-
minded audiences:
› Assess the infrastructure’s ability to support the business.
› Identify problems as they occur.
› Pinpoint the source or cause of the problems.
In response to these trends, the worldwide network availability market is growing rapidly.
Delivering effective network management software will provide organizations with the tools
that they need to evolve into more efficient structures.
Chapter 3: CA’s Network and
Voice Management Solution
EITM: CA’s Vision
EITM is CA’s vision for how to dynamically manage and secure IT environments, enabling
organizations to realize the full potential of IT as a source of business value. EITM provides
a common foundation for the integration and sharing of services and data that allows for
the orchestration of all IT assets and resources in unison (infrastructure, applications, and
business processes). This business-oriented approach also makes it possible to integrate
the management of networks, systems, storage, databases, applications, and security as
well as to provide a way to measure, optimize, and demonstrate the impact of IT on the
organization’s goals as never before.
CA is the worldwide leader in management software solutions. We have been in the
management software business for three decades, and have been focused on providing
solutions for all areas of infrastructure management. We are committed to helping
organizations achieve their goals by reducing IT costs to optimize capital and operating
expenses, mitigating risk, achieving compliance, and helping to ensure that the
infrastructure is always available and performing optimally.
The core of CA’s approach is to deliver management solutions that provide a unified view of
all assets and operations of the organization as they relate to business activities and needs.
This view enables organizations to align IT with business, enabling them to make better,
more informed business decisions about how to direct business activities and utilize assets.
CA’s management solutions include Service Availability, which helps IT departments deliver
consistently superior IT services by implementing proactive, integrated management that
provides insight into the health of all systems and applications on which each business
service depends.
The Value of CA’s Network and Voice Management Solution
For today's businesses to be able to compete, high-performance voice and data solutions
are essential. CA’s vision for converged voice and data management supports the need for
IT to proactively manage end-to-end voice and data services, ensure adequate capacity of
bandwidth and systems, and support service levels defined by the business. CA’s Network
and Voice Management solution spans both IP and legacy technologies, enabling companies
to migrate to IP telephony at their own pace, and reduce the complexity of managing
heterogeneous infrastructures. Our product strategy supports this vision through
continuous updating, innovation, and integration as defined by our customers.
CA’s Network and Voice Management solution provides the following key points of value:
Effective Service Level Management: This solution enables you to baseline, assess,
and track services through the network and communicate adherence to SLAs and OLAs to
business and technical audiences.
Proactive Service Assurance: It provides a policy-based approach to monitoring
degradations in service which gives you the ability to identify degradations before
customers are impacted, and to account for these degradations in RCA and event
correlation.
Rapid Problem Detection: This solution focuses on resolving the true cause of the
problem, not the symptom, through a combination of event correlation, RCA, and
linkage with real-time and historical reporting.
Predictive Capacity Planning: The foundation of this solution is intelligent, embedded
algorithms that inform network operations teams exactly when to upgrade or downgrade
circuits or other hardware based on past usage trends and tailored thresholds.
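As a rough illustration of capacity planning from past usage trends and thresholds, the following Python sketch fits a straight line to historical utilization samples and estimates how long until a threshold is reached. It is not the algorithm embedded in the CA products, which this document does not describe; the function name, the one-sample-per-day interval, and the 80% threshold used in the example are assumptions.

```python
from typing import Optional, Sequence


def days_until_threshold(utilization: Sequence[float],
                         threshold: float) -> Optional[float]:
    """Estimate days from the last sample until utilization hits threshold.

    utilization: one sample per day (percent), oldest first (assumed interval).
    Returns None when the fitted trend is flat or declining.
    """
    n = len(utilization)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    # Ordinary least-squares fit of y = intercept + slope * day_index.
    mean_x = (n - 1) / 2.0
    mean_y = sum(utilization) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(utilization))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    if slope <= 0:
        return None  # no growth, so no projected crossing

    crossing_day = (threshold - intercept) / slope
    # Clamp to zero if the fitted line is already past the threshold.
    return max(0.0, crossing_day - (n - 1))


if __name__ == "__main__":
    history = [50.0, 52.0, 54.0, 56.0, 58.0]  # hypothetical trunk utilization
    print(days_until_threshold(history, threshold=80.0))
```

A linear fit is the simplest possible trend model; real traffic often shows weekly or seasonal cycles, so a projection like this is best treated as a first approximation.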
Our network and voice management strategy is four-fold:
Enable our customers to manage their business processes, as well as the data, voice, and
multimedia services consistent with their competitive strategy.
Enable our customers to manage their converged networks, including business critical
applications and the transition from traditional to IP telephony.
Provide end-to-end fault and performance management for data, voice (TDM and IP
telephony), and system and application infrastructures.
Extend management to the voice and multimedia resources as well as the network
infrastructure.
CA is the only vendor who can provide the following:
An integrated, proactive management solution
A solution that spans both IP and legacy technologies
Solutions that help ensure voice network performance before, during, and after a
migration to VoIP
With CA’s Network and Voice Management solution, the IT organization becomes more
proactive — not reactive — in their approach to managing voice and data. For example,
instead of waiting to receive customer complaints about poor voice quality before acting, IT
staff will be alerted when policies detect excessive jitter, low mean opinion scores (MOSs), or
hardware problems (for example, on T1 circuits) that exceed user-defined thresholds. When a
problem occurs in an IP telephony environment, it is sometimes difficult to determine the
cause. For example, a user may complain about not being able to make a call, but many
alarm events can be generated by routers, IPT systems, switches, etc. CA’s Network and
Voice Management solution provides a proactive approach to managing IP telephony and
VoIP.
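The policy-based alerting described above can be sketched in a few lines of Python. This is an illustration only, not CA product code: the class and function names and the threshold values are hypothetical. The MOS estimate uses the R-factor-to-MOS mapping from the ITU-T G.107 E-model, which is one common way to derive a MOS; this document does not state how the products compute it.

```python
from dataclasses import dataclass
from typing import List


def r_factor_to_mos(r: float) -> float:
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


@dataclass
class VoicePolicy:
    """User-defined thresholds (values here are hypothetical examples)."""
    max_jitter_ms: float = 30.0
    min_mos: float = 3.6


def check_call(jitter_ms: float, r_factor: float,
               policy: VoicePolicy) -> List[str]:
    """Return one alert message per threshold the call violates."""
    alerts = []
    if jitter_ms > policy.max_jitter_ms:
        alerts.append(f"jitter {jitter_ms:.1f} ms exceeds "
                      f"{policy.max_jitter_ms:.1f} ms")
    mos = r_factor_to_mos(r_factor)
    if mos < policy.min_mos:
        alerts.append(f"MOS {mos:.2f} below {policy.min_mos:.2f}")
    return alerts


if __name__ == "__main__":
    policy = VoicePolicy()
    print(check_call(jitter_ms=10.0, r_factor=93.2, policy=policy))  # healthy
    print(check_call(jitter_ms=45.0, r_factor=50.0, policy=policy))  # degraded
```

Evaluating each measurement against a policy object, rather than hard-coding limits, is what lets operators tune thresholds per site or per service without changing the checking logic.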
Capacity planning is essential to converged voice and data management. Data collected
from voice systems, whether legacy or IPT, and from the network (including trunks and
port capacity) will help the IT engineering team determine if resources are underutilized or
reaching a capacity threshold. This information can also be used as the basis for predictive
capacity planning for the network. IT departments are tasked with providing service levels
to key constituencies, especially revenue-generating business units such as contact centers.
The ability to monitor and report on GoS and QoS for calls in the voice network is essential
to service level management.
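In telephony, GoS is commonly expressed as the probability that a call is blocked because all trunks are busy, and the classic Erlang B formula relates that probability to offered traffic and trunk count. The Python sketch below shows the calculation as background for trunk capacity planning; it is a textbook formula, not a description of how eHealth for Voice measures GoS, and the function names are chosen for this example.

```python
def erlang_b(offered_erlangs: float, trunks: int) -> float:
    """Blocking probability for the given offered load and trunk count.

    Uses the numerically stable recurrence
    B(0) = 1, B(n) = A * B(n-1) / (n + A * B(n-1)).
    """
    if offered_erlangs < 0 or trunks < 0:
        raise ValueError("load and trunk count must be non-negative")
    blocking = 1.0
    for n in range(1, trunks + 1):
        blocking = (offered_erlangs * blocking) / (n + offered_erlangs * blocking)
    return blocking


def trunks_for_gos(offered_erlangs: float, target_blocking: float) -> int:
    """Smallest trunk count whose blocking probability meets the target."""
    if not 0.0 < target_blocking < 1.0:
        raise ValueError("target_blocking must be between 0 and 1")
    trunks = 0
    while erlang_b(offered_erlangs, trunks) > target_blocking:
        trunks += 1
    return trunks


if __name__ == "__main__":
    # Hypothetical busy-hour load of 10 erlangs with a 1% blocking target.
    print(trunks_for_gos(10.0, 0.01))
```

Erlang B assumes blocked calls are dropped rather than queued or retried, so it suits trunk groups better than contact center agent queues.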
A Key Part of CA’s EITM Vision
CA’s Network and Voice management solution fits within CA’s company-wide EITM strategy.
The solution addresses the four major CIO imperatives in the following ways:
Improves service by providing proactive service assurance to detect problems before they
impact end users. Ensures reliability and responsiveness of the network infrastructure
with powerful RCA, event correlation, and impact analysis.
Manages risk by assuring business continuity as it enables organizations to comply with
regulatory and governance requirements.
Manages costs by significantly reducing cost of downtime (outage or degradation) as it
minimizes the number of occurrences and the duration of downtime.
Aligns IT with business by giving the IT team a business view that provides status of end-
to-end business service.
Network and Voice Management for Key Vertical Markets
CA provides powerful and unique management software for managing increasingly complex
network services within traditional enterprise and government environments, as well as
telecommunications, cable, mobile wireless, and other service provider industries. It
enables technical teams to improve service, control costs, reduce risk, increase revenue,
and drive efficiency when managing IT infrastructure as a business service.
Telecommunication Service Providers
CA views the telecommunication service-provider industries as an important vertical
market, the members of which leverage their operational environment as a key part of
controlling costs and delivering product/service differentiation — crucial factors to
remaining competitive in today’s challenging communications marketplace.
Most service providers select a variety of management tools specific to element
management, service provisioning, billing, customer care, and service assurance. Within
large carrier and service provider environments, this creates many disparate applications
and data stores. They must somehow be integrated to form efficient workflow processes
that ensure services are delivered reliably while maintaining operational efficiencies. CA’s
Network and Voice Management solution provides flexible integration points to allow it to
function successfully in heterogeneous Operational Support System (OSS) software
environments, while also reducing deployment time and complexity, as well as
The Components of the Solution
CA’s solution for this market is comprised of three primary product lines, all of which
provide a consistent set of capabilities across the voice and data infrastructure:
eHealth assesses the health of your network and determines if it can accommodate
voice. It tells you how well the network is performing, allows you to compare voice MOS
with QoS statistics, and identifies trends in performance.
SPECTRUM provides fault management, RCA, and voice modeling. It also enables you to
model the components of your voice services to ensure that services are operating
smoothly.
eHealth for Voice offers system performance management for communication systems
(IP and Traditional TDM), messaging systems, and management from a telephony
perspective.
These applications can work as standalone systems or integrated, as shown below.
eHealth
eHealth helps you take control of network performance and ensure QoS across the entire
network infrastructure. It enables you to successfully accomplish a multitude of tasks such
as ensuring the availability and performance of the network, documenting service levels,
managing capacity, and accurately planning for growth. This solution allows you to face a
number of challenges including managing a diverse collection of devices from numerous
vendors, isolating the source of performance degradation throughout the network,
minimizing recurring wide area network (WAN) expenses, and providing consistent
reporting across your heterogeneous network infrastructure.
This component of the solution enables you to achieve the following goals:
Improve the service availability of your network.
Reduce cost of downtime and end-user impact caused by downtime.
Identify and resolve problems faster.
Plan for capacity before it is needed.
Improve QoS.
Meet and prove committed service levels.
eHealth Components
eHealth includes the following:
eHealth E2E Console
eHealth Live Health
eHealth Traffic Accountant
Report Center
Distributed eHealth
eHealth SPECTRUM Integration
eHealth Universal Workflow Integration Modules (HP OpenView, IBM (Micromuse)
Netcool, Cisco CIC)
eHealth Universal Data Integration Modules (Cisco WAN Manager, Cisco IP Solution
Center, Lucent, Nortel, Alcatel)
eHealth Universal Wireless Integration Modules (Nortel, Starent)
The Benefits of eHealth
eHealth can be differentiated from other solutions in the following ways:
eHealth offers best-in-class proactive management so that IT can correct problems before
they become revenue-impacting issues.
It has the broadest multi-vendor support — over 1000 devices from 100 different
vendors.
Its reports have built-in intelligence to troubleshoot without requiring intimate knowledge
of every component of the service.
It provides auto-baseline; that is, eHealth will learn the normal behavior for each
management device. The “deviation from normal” algorithm offers a more reliable
threshold relative to history because the window of comparison is continuous.
Unlike CA, many vendors have a limited portfolio of device and technology support, which
poses a problem to companies trying to reduce the number of management software
vendors.
› Customers want greater accountability from their vendors as IT becomes more of
a service.
› Fewer vendors results in less complexity, better operating costs, and less risk in
rolling out new business initiatives.
The value of an integrated management platform is significant because IT operations
spend a large amount of time and money identifying the cause of problems, and
downtime is extremely expensive.
The Benefits of eHealth for Voice
eHealth for Voice offers the following benefits:
Enables you to manage your business processes, including voice and multimedia services,
consistent with your competitive strategy.
Enables you to manage your transition from traditional to IP telephony.
Provides end-to-end fault and performance management for IP telephony networks.
Manages voice and multimedia resources to reduce risk, manage costs, enable new
services, and align your investments with your IT objectives.
CA Technology Services Network and Voice Management Service
Offerings
CA Technology Services has specialists in eHealth and SPECTRUM to help organizations
assess, design, implement, and optimize network and voice availability and performance
solutions across the enterprise. From financial services companies to telecommunications
companies to government organizations and beyond, CA experts help you establish best-practices workflows, integrate your network management solutions, and combine your
network and voice availability and performance solutions with your service desk for a
consolidated network event management system.
The focus of CA network and voice availability and performance experts is to help you
achieve the following:
Improve business alignment by mapping the network infrastructure to critical IT
services that support the business, and ensuring that your network team is focused on
delivering the organization’s most important services.
Increase business planning capabilities by delivering full visibility across the network
infrastructure through consolidated consoles, reports, and metrics analysis.
Reduce risk by defining and implementing automatic repair responses that avoid the
possibility of human error and guarantee problem repair consistency.
Reduce cost by consolidating network event management into a central point of control
which decreases staffing demands.
Leverage value from existing network management systems by integrating and
building upon your prevailing tools and workflows.
Optimize IT service delivery by applying International Organization for
Standardization (ISO) and CobiT standards, IT Infrastructure Library (ITIL) best practices,
and proven network management processes.
LIFECYCLE APPROACH FOR NETWORK AND VOICE AVAILABILITY AND PERFORMANCE
SOLUTIONS
The needs of every organization are unique, but network management yields common
themes in workflow processes and monitoring, and management instrumentation. A CA
solution is deployed and optimized to the particular needs of your organization through a
lifecycle of best practices services offerings.
Assessment – Understanding the Gaps
Comprehensive assessments validate the maturity and efficiency of network and voice
availability and performance management. CA experts conduct a comprehensive analysis of
your network management capabilities including the following:
Network management goals, objectives, capabilities, and strategies
Network operations organization structure, and personnel roles and responsibilities
Network monitoring, configuration, and integration software
Network design and topology
Voice traffic simulation
Data analysis
Associated security constraints (firewalls, access lists, and so on)
Alarm/event severity definitions
Existing business, technical, and environmental challenges and issues
Change control processes
CA and third-party product integration requirements
Your current management capabilities are compared to the CA maturity model for people,
processes, and technology and the assessment results in a Solution Architecture Overview
(SAO). The SAO is a blueprint that defines achievable solution phases to maximize problem
determination and response workflows, apply automation, and integrate service desk
operations. CA consultants and architects also research and map the network infrastructure
to IT services, propose recommendations, and furnish business justifications to help you
secure funding.
Implementation – The Bottom Line of Solution Success
Using the SAS as a guide, CA consultants prepare the environment; install, configure, and
customize eHealth and SPECTRUM; verify and document your eHealth and SPECTRUM
solutions on test, QA, and production environments; and provide knowledge transfer to
your staff. Implementation services also include the development and deployment of
integration components between eHealth and SPECTRUM, your other IT management
applications, and your service desk.
To ensure that implementation efforts are tightly managed, PMP-certified Project Managers
track and report on progress, questions, issues, and roadblocks. CA Technology Services
uses PMP-certified Project Managers and highly trained architects, consultants, and
partners. On an annual basis, CA Technology Services invests 50% more in training our
professionals than the industry average.
Optimization – Anticipating Change
Optimization services evaluate ways in which your existing eHealth and SPECTRUM
solutions can be further utilized or fine-tuned. Healthcheck services can include tuning and
reconfiguration, upgrades, and migrations. Other services include training and certifications
that focus on increasing staff efficiency. Past experience has found that staff training results
in more efficient operations. Training is offered as onsite or offsite instructor-led courses,
self-paced courses, or web-based courses. Instructors and course developers are certified
experts dedicated to network and voice availability and performance.
Why Trust Your Service Availability to CA Technology Services?
Experience: CA has 30 years of enterprise systems management services experience.
Proven Process: A dedicated assessment team plans, designs, and provides business
justification for network and workflow recommendations and builds best practices into
every customer blueprint.
Expertise: A vibrant community of worldwide professionals focused on network, voice
availability, and performance shares their solutions knowledge and continually contributes
proven best-practice workflows and solution models.
Focus: CA Technology Services is comprised of a team of Solution Managers and
dedicated architects who are devoted exclusively to the assessment, design, delivery, and
workflow methodologies offered around eHealth and SPECTRUM services and solutions.
Proactive Service Assurance
CA’s Network and Voice Management products are used together for proactive service
assurance through the embedded algorithms within all of the product lines. This helps
operations teams identify potential problems BEFORE they impact customer service.
Within eHealth, this is accomplished primarily through the Time over Threshold and
Deviation from Normal algorithms within Live Health, which allow an intelligent
performance-based alert to be sent when current performance violates either a fixed
threshold, or what is considered “normal” behavior (based on past history) for a particular
length of time within a given analysis window. Similarly, eHealth for Voice sends alerts
when violations of QoS or GoS are experienced. These alerts are fed into SPECTRUM, which
applies its intelligence on policy, models, and rules to identify the severity of the problem
and to provide alarm integration and correlation, taking advantage of the SPECTRUM Service
Management and voice modeling capabilities.
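The two alerting ideas named above can be illustrated with a short Python sketch. It is a simplified stand-in, not the Live Health implementation, whose internals this document does not describe. Here "time over threshold" counts samples above a limit within the analysis window, and "deviation from normal" derives that limit from the mean and standard deviation of past samples; the function names, the sigma multiplier, and the sample counts are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def time_over_threshold(window: Sequence[float], threshold: float,
                        min_samples_over: int) -> bool:
    """Alert when enough samples in the analysis window exceed the threshold.

    With a fixed polling interval, the sample count stands in for a
    length of time.
    """
    samples_over = sum(1 for value in window if value > threshold)
    return samples_over >= min_samples_over


def deviation_from_normal(history: Sequence[float], window: Sequence[float],
                          k_sigma: float, min_samples_over: int) -> bool:
    """Alert when the window stays above a baseline learned from history.

    The "normal" upper bound is assumed to be mean + k_sigma * stdev of
    past samples; history must contain at least two values.
    """
    upper_bound = mean(history) + k_sigma * stdev(history)
    return time_over_threshold(window, upper_bound, min_samples_over)


if __name__ == "__main__":
    history = [50, 52, 48, 51, 49, 50]   # hypothetical past utilization (%)
    window = [50, 80, 82, 85]            # hypothetical current window
    print(time_over_threshold(window, threshold=90, min_samples_over=3))
    print(deviation_from_normal(history, window, k_sigma=3.0,
                                min_samples_over=3))
```

In this example the fixed 90% threshold never fires, while the learned baseline does, which shows why a history-based limit can catch abnormal behavior that a static threshold misses.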
Chapter 4: Deployment
Architecture for Network and
Voice Management
This chapter provides information to prepare for the installation and configuration of CA’s
Network and Voice Management solution. The following key topics are presented:
Network performance components
Network fault management components
Voice management components
Deployment architectures
Sizing recommendations
Network Performance Components
eHealth is comprised of the following components:
Required Components: E2E Console
Optional Components: Live Health, Integration Modules, Distributed eHealth, Remote Poller,
Report Center, Traffic Accountant
E2E Console
The E2E Console is the core of an eHealth implementation and is required to operate
eHealth. The E2E Console includes database, discovery, and poller functionality along with
administration GUIs, reporting GUIs, and so on. eHealth licenses (universal and system)
enable the eHealth Console to poll and collect data from certified devices with an embedded
management software agent, and are required to operate eHealth. An element represents
the eHealth model, or representation, for any part of an infrastructure that eHealth can
analyze. eHealth can analyze a physical element, such as a specific port on a specific card
of a specific router. It can also analyze a logical element, which refers to the logical purpose
for a device or component, such as a network link. To determine if a device is certified for
use with eHealth, log on to the Certification pages at http://support.concord.com.
Note: You must have a Support account to access the http://support.concord.com site. You
obtain an account with the purchase of the eHealth or SPECTRUM products.
Live Health
Live Health is the real-time performance monitoring engine that analyzes performance data
collected with eHealth for deviations from normal behavior and threshold violations. Live
Health includes three components:
Live Exceptions gives you the ability to generate and display performance-based alarms.
Live Status provides a single end-to-end view of the status of your infrastructure.
Live Trend provides a real-time reporting capability.
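The two kinds of checks described above (threshold violations and deviations from normal behavior) can be illustrated with a short Python sketch. This is not how Live Health is implemented; the function name, the three-standard-deviation rule, and the sample values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def evaluate_sample(history, value, threshold, sigmas=3.0):
    """Return the conditions that a new sample triggers.

    history   -- earlier samples that define "normal" behavior (made-up data)
    value     -- the newest sample
    threshold -- fixed upper limit for the metric
    sigmas    -- allowed distance from the baseline, in standard deviations
    """
    conditions = []
    if value > threshold:
        conditions.append("threshold violation")
    if len(history) >= 2:
        baseline = mean(history)
        spread = stdev(history)
        if spread > 0 and abs(value - baseline) > sigmas * spread:
            conditions.append("deviation from normal")
    return conditions

# Illustrative utilization samples (percent) for one element
history = [40, 42, 41, 39, 43, 40, 41]
for sample in (41, 70, 95):
    print(sample, evaluate_sample(history, sample, threshold=80))
```

Note that a sample can deviate from normal behavior without crossing the fixed threshold, which is why the two checks are kept separate in this sketch.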
Integration Modules
eHealth provides a set of integration modules (IMs) that enable you to use eHealth to
report on the data that various network management systems (NMSs) collect. When you
license and install an IM, you can use it to tap into data already collected by an existing NMS.
You can then import that data en masse into eHealth, providing critical data quickly and
eliminating the need for redundant data gathering via duplicate polling.
Universal Workflow IMs enable customers to drill back from supported fault
management systems to eHealth. This type of IM is supported on the following systems: SPECTRUM, IBM (Micromuse) Netcool, Cisco Information Center (CIC), and HP OpenView
Network Node Manager.
Universal Data IMs enable the import of configuration and performance data. This type
of IM is supported on the following systems: Cisco WAN Manager, Cisco ISC, Lucent,
Alcatel, and Nortel.
Universal Wireless Data IMs enable import of configuration and performance data
from other wireless element management systems into the eHealth E2E Console. This
type of IM is supported on the following systems: Nortel Shasta SCS GGSN and Starent
ST-16 Bulk Stats.
Distributed eHealth
If you have a large infrastructure, you could deploy multiple Distributed eHealth Systems
across different physical locations or alternatively co-locate them in a central configuration
referred to as a cluster. The cluster contains several eHealth systems that manage specific
sets of resources, and share the information with each other. By using Distributed eHealth,
you can distribute the workload of collecting and processing data across multiple eHealth
systems that work in parallel. Report users can access reports for any element or group in
the cluster from Distributed eHealth Consoles, which are reporting front-ends to the cluster.
You would typically choose a Distributed eHealth site when you want to run reports for
more elements than a standalone eHealth system can support. You might also choose a
Distributed eHealth site if you want to place an eHealth web server system outside the
firewall and insulate the Distributed eHealth Systems within the firewall of your
infrastructure. Depending on the number of Distributed eHealth Systems that you have, and the system performance of the Distributed eHealth Console, a Distributed eHealth site
could support reports for up to one million elements.
The Distributed eHealth Package software contains all software for Distributed eHealth
Consoles and the software required to turn a standalone eHealth System into a Distributed
eHealth System. You must purchase all console software, elements, and agents for the
standalone eHealth systems separately. For complete instructions on administering a
cluster, see the Distributed eHealth Administration Guide.
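As a rough planning aid, the following Python sketch estimates how many Distributed eHealth Systems a site would need and how evenly the elements could be spread across them. Only the one-million-element ceiling comes from the text above; the per-system capacity passed in the example call is a placeholder, and the function name is hypothetical. Use the sizing guidance for your own hardware in place of the sample figure.

```python
import math

SITE_REPORTING_LIMIT = 1_000_000  # "up to one million elements", per the text above

def plan_cluster(total_elements, elements_per_system):
    """Return the number of elements assigned to each system in the cluster.

    elements_per_system is an assumed capacity, not a published figure.
    """
    if total_elements <= 0 or elements_per_system <= 0:
        raise ValueError("element counts must be positive")
    if total_elements > SITE_REPORTING_LIMIT:
        raise ValueError("exceeds the one-million-element figure quoted for a site")
    systems = math.ceil(total_elements / elements_per_system)
    # Spread the elements as evenly as possible across the systems
    base, extra = divmod(total_elements, systems)
    return [base + 1 if i < extra else base for i in range(systems)]

# Example: 230,000 elements with an assumed capacity of 50,000 per system
print(plan_cluster(230_000, 50_000))
```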
Network Fault Management Components
SPECTRUM is comprised of the following components:
Required Components Optional Components
Assurance Server
Assurance Server Xsight
Assurance Server Integrity
Assurance Server Infinity
OneClick
Watch Editor
SNMP V3
Alarm Notification Manager
Secure Domain Manager
Frame Relay Manager
Configuration Manager
VPN Manager
ATM Circuit Manager
Report Manager
Multicast Manager
Service Performance Manager
QoS Manager
Service Manager
Assurance Server
SPECTRUM offers three types of Assurance Servers designed for different types of
customers:
Assurance Server Xsight (for emerging enterprises)
Assurance Server Integrity (for larger enterprises)
Assurance Server Infinity (for Service Providers)
ASSURANCE SERVER XSIGHT
The Assurance Server Xsight delivers the capabilities of core SPECTRUM technologies to a broader array of small businesses. With the introduction of SPECTRUM Xsight, CA extended
the support of multi-vendor IP fault and performance management in a solution that is
competitively priced and packaged to help you become operational quickly. This component
provides support for most vendor devices found in today’s enterprise networks. It supports
single-server deployment only; it does not allow for a distributed deployment.
The Assurance Server Xsight includes the following key features:
Root cause analysis
Impact analysis
Auto-discovery of multi-vendor and multi-technology networks
Standards-based integrations
One concurrent administrator license (fault-tolerant license not included)
ASSURANCE SERVER INTEGRITY
SPECTRUM’s roots lie in serving large enterprise customers whose businesses change,
merge, and scale rapidly. The SPECTRUM Integrity solution has transformed CA’s patented
technologies and combined them with new features and functionality to offer today’s
evolving enterprises the power to manage business-critical services across the hall or
around the world. This component provides support for most vendor devices found in today’s enterprise networks.
The Assurance Server Integrity includes the following key features:
Root cause analysis
Impact analysis
Auto-discovery of multi-vendor and multi-technology networks
Standards-based integrations
One concurrent administrator license and a fault-tolerant license
ASSURANCE SERVER INFINITY
SPECTRUM Infinity is specifically focused on the needs of today’s service providers. It
provides specific functionality with significant performance improvements dedicated to
accelerating new service rollouts and exceeding customer quality expectations, while
allowing them to manage a growing infrastructure with existing resources. This component
provides Integrity device support and the Advanced Management Module pack that provides
support for high-end devices typically found only in service provider networks.
The Assurance Server Infinity includes the following key features:
Root cause analysis
Impact analysis
Auto-discovery of multi-vendor and multi-technology networks
Standards-based integrations
One concurrent administrator license, a fault-tolerant license, and two Southbound
Gateway integration licenses
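The licensing differences among the three Assurance Server editions can be summarized as a small lookup table. The Python sketch below restates only what the feature lists above say; the class, field, and function names are hypothetical, and a gateway count of 0 simply means that no Southbound Gateway license is listed as included for that edition.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edition:
    name: str
    audience: str
    admin_licenses: int
    fault_tolerant_license: bool
    southbound_gateway_licenses: int

# Values transcribed from the feature lists in this section
EDITIONS = [
    Edition("Xsight", "emerging enterprises", 1, False, 0),
    Edition("Integrity", "larger enterprises", 1, True, 0),
    Edition("Infinity", "service providers", 1, True, 2),
]

def editions_with(fault_tolerant=False, min_gateways=0):
    """List the editions whose bundled licenses meet the stated needs."""
    return [
        e.name
        for e in EDITIONS
        if (e.fault_tolerant_license or not fault_tolerant)
        and e.southbound_gateway_licenses >= min_gateways
    ]

print(editions_with(fault_tolerant=True))
print(editions_with(min_gateways=1))
```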
OneClick
SPECTRUM OneClick is a three-tier, web-based console. The central component is a web
server that connects directly to SPECTRUM Assurance Servers and delivers information to
distributed Java clients. The feature-rich Java clients are downloaded, installed, and
updated from the OneClick web server to ease implementation, administration, and
maintenance. The SPECTRUM OneClick console combines the anywhere/anytime access and
reduced training requirements of standard web-based applications with the scalability and responsiveness of a full desktop client application.
The Alarm Notification Manager includes the following key features:
Alarm consolidation
Alarm filtering
Policy-based alarm forwarding
Alarm notification
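To make the ideas of consolidation and policy-based forwarding concrete, the following Python sketch collapses repeated alarms and then routes the survivors by severity. It is an illustration only: the alarm fields, severity names, and policy format are assumptions and do not reflect the Alarm Notification Manager's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alarm:
    device: str
    severity: str  # "minor", "major", or "critical" in this sketch
    cause: str

SEVERITY_RANK = {"minor": 1, "major": 2, "critical": 3}

def consolidate(alarms):
    """Keep one alarm per (device, cause), retaining the highest severity."""
    seen = {}
    for alarm in alarms:
        key = (alarm.device, alarm.cause)
        current = seen.get(key)
        if current is None or SEVERITY_RANK[alarm.severity] > SEVERITY_RANK[current.severity]:
            seen[key] = alarm
    return list(seen.values())

def forward(alarms, policy):
    """Route each alarm to every destination whose minimum severity it meets."""
    routed = {destination: [] for destination in policy}
    for alarm in alarms:
        for destination, minimum in policy.items():
            if SEVERITY_RANK[alarm.severity] >= SEVERITY_RANK[minimum]:
                routed[destination].append(alarm)
    return routed

alarms = [
    Alarm("rtr1", "minor", "link down"),
    Alarm("rtr1", "critical", "link down"),
    Alarm("sw2", "major", "fan failure"),
]
merged = consolidate(alarms)
routed = forward(merged, {"pager": "critical", "email": "major"})
print({destination: [a.device for a in items] for destination, items in routed.items()})
```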
Frame Relay Manager
SPECTRUM Frame Relay Manager delivers precise monitoring and performance thresholding
of committed information rates (CIR), bandwidth utilization, and circuit congestion. Root
cause analysis and fault isolation are provided per data link connection identifier (DLCI), with
impact analysis to prioritize response and corrective action. Patented intelligent auto-
discovery techniques leverage remote IP address information and traffic statistics to map
DLCI connectivity and present an integrated topology view. Several large enterprises use
SPECTRUM Frame Relay Manager to document SLA violations with their service providers.
SPECTRUM Frame Relay Manager can also determine whether an enterprise has purchased too much or
too little bandwidth on a per-circuit basis. This can result in cost savings of thousands to tens of thousands of dollars per month in WAN connectivity charges. Service providers have
used SPECTRUM Frame Relay Manager to ensure SLA compliance, improve customer
service quality, and deliver differentiated service offerings. Using this component, one
service provider is able to identify Frame Relay problems in 97 to 99% of cases before its
customers do, and is working to fix them before the customer’s business is impacted.
This component provides a cost-effective way to improve service quality, deliver end-to-end
visibility, and reduce operating costs.
The Frame Relay Manager includes the following key features:
Proactive communication with Frame Relay equipment that supports RFC 1315 or RFC
2115 Frame Relay MIBs with vendor extensions for Cisco and Nortel
Fast, accurate modeling of physical and logical DLCI port connectivity with IP address,
subnet mask, and remote IP address information
Out-of-box performance views show CIR throughput, congestion statistics, and data
terminal equipment (DTE) changes
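The per-circuit bandwidth check described above comes down to comparing measured throughput with the committed information rate. The Python sketch below shows that arithmetic under stated assumptions: the octet counts, polling interval, and the 30% and 90% cut-offs are illustrative values, and the function names are hypothetical rather than part of SPECTRUM.

```python
def cir_utilization(octets_delta, interval_seconds, cir_bps):
    """Average throughput over one polling interval as a fraction of CIR."""
    throughput_bps = octets_delta * 8 / interval_seconds
    return throughput_bps / cir_bps

def classify_circuit(peak_utilization, low=0.30, high=0.90):
    """Label a circuit by its peak utilization; the cut-offs are assumptions."""
    if peak_utilization > high:
        return "possibly under-provisioned"
    if peak_utilization < low:
        return "possibly over-provisioned"
    return "sized appropriately"

# 2,400,000 octets in a 300-second interval on a 128 kbit/s CIR circuit
utilization = cir_utilization(2_400_000, 300, 128_000)
print(utilization, classify_circuit(utilization))
```

In practice the classification would be based on utilization observed over weeks of samples, not a single interval.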
ATM Circuit Manager
SPECTRUM ATM Circuit Manager delivers precise monitoring and performance thresholding
of ATM throughput, bandwidth utilization, and circuit congestion. Root cause analysis and
fault isolation are provided per virtual path link (VPL)/virtual channel link (VCL), with
impact analysis to prioritize response and corrective action. Patented intelligent auto-
discovery techniques leverage remote IP address information and traffic statistics to map
virtual path identifier (VPI)/virtual channel identifier (VCI) connectivity and present an integrated topology view. The ATM circuit path view displays the endpoint-to-endpoint
mapping for each device, physical port, and logical interface traversed.
Enterprises can also import a list of permanent virtual circuits (PVCs) provided by their
service provider to accurately model all ATM WAN links. Several large enterprises already
use SPECTRUM ATM Circuit Manager to document SLA violations with their service provider.
In addition to this capability, SPECTRUM ATM Circuit Manager can also determine if an enterprise has
purchased too much or too little bandwidth on a per-circuit basis. This can result in a cost
savings of several thousands of dollars per month in WAN connectivity charges. Service
providers have used SPECTRUM ATM Circuit Manager to ensure SLA compliance and deliver
differentiated service offerings. This component provides a cost-effective way to improve
service quality, deliver end-to-end visibility, and reduce operating costs.
The ATM Circuit Manager includes the following key features:
Proactive communication with ATM equipment that supports RFC 1695 with private
management information bases (MIBs)
Fast, accurate modeling of physical and logical VPL/VCL port connectivity with IP address,
subnet mask, and remote IP address information
Out-of-box performance views showing cells-per-second throughput and ATM QoS
information
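When reading cells-per-second figures, it can help to convert them to bit rates. An ATM cell is 53 bytes long, of which 48 bytes are payload, so the conversion is simple arithmetic. The Python sketch below shows it; the function name is hypothetical and the sample rate is illustrative.

```python
ATM_CELL_BYTES = 53      # 5-byte header plus 48-byte payload
ATM_PAYLOAD_BYTES = 48

def cell_rate_to_bps(cells_per_second, payload_only=False):
    """Convert an ATM cell rate to bits per second."""
    size = ATM_PAYLOAD_BYTES if payload_only else ATM_CELL_BYTES
    return cells_per_second * size * 8

print(cell_rate_to_bps(1000))                     # line rate consumed
print(cell_rate_to_bps(1000, payload_only=True))  # payload carried
```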
Multicast Manager
SPECTRUM Multicast Manager provides multi-vendor visibility into logical multicast network
sessions — proactively monitoring key performance indicators while highlighting the impact
of infrastructure outages on multicast services. All logical multicast overlay services are
automatically discovered and modeled within the SPECTRUM Assurance Server. Multicast
session models maintain complete knowledge of the multicast feed including its source,
distribution tree, and receivers.
SPECTRUM Multicast Manager presents the user with an easy-to-use interface for topology
navigation and alarm monitoring. This results in lower training and administration costs as
users have at-a-glance access to actionable information. Multicast enhancements allow the
user to view the per-group multicast topology and the associated routers, switches, and
ports that comprise the IP multicast group. SPECTRUM Multicast Manager also monitors
multicast group health. If a resource in a multicast group (source, routers, switches, ports)
experiences a reliability problem, SPECTRUM Multicast Manager will automatically
understand the impact on the overall group. This component provides a cost-effective way
to manage your multicast infrastructure as a business service.
The Multicast Manager includes the following key features:
Multi-vendor view of IP services with a detailed understanding of the elements that
comprise a multicast group
An intuitive interface for multicast topology navigation and alarm monitoring of groups,
sources, receivers, and rendezvous point (RP) devices
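The impact analysis described above, working out which multicast groups are affected when a resource fails, can be pictured as a membership lookup. The Python sketch below is a simplified illustration; the group addresses, resource names, and function name are invented, and the real product models distribution trees in far more detail.

```python
def impacted_groups(groups, failed_resource):
    """Return the multicast groups that include the failed resource."""
    return sorted(
        name for name, members in groups.items() if failed_resource in members
    )

# Hypothetical groups, each listing the source, routers, switches, and ports it uses
groups = {
    "239.1.1.1": {"src-a", "rtr1", "rtr2", "sw1:port4"},
    "239.1.1.2": {"src-b", "rtr2", "rtr3", "sw2:port7"},
    "239.1.1.3": {"src-c", "rtr3", "sw3:port1"},
}
print(impacted_groups(groups, "rtr2"))
```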
QoS Manager
The SPECTRUM QoS Manager enables enterprises and service providers to verify and
validate the configuration and effectiveness of QoS Policies and Traffic Classes throughout
the IT infrastructure. Technology Relationship Mapping and web-based reporting discover and document the health and performance for each CoS configured across the network.
Patented SPECTRUM analytics intelligently integrate and automate modeling of your QoS
Policies and Traffic Classes to deliver RCA and impact prioritization.
Secure Domain Manager
Today’s complex and distributed networks are being driven by security policies that inhibit
the use of insecure management protocols such as Simple Network Management Protocol
(SNMP) v1 or Internet Control Message Protocol (ICMP) for management of those networks.
For example, a Demilitarized Zone (DMZ) separates a set of elements from the intranet
through a firewall for security purposes. Still, the businesses, processes, and customers need to be supported by IT services. This requires visibility into the complete infrastructure.
SPECTRUM Secure Domain Manager (SDM) enables customers to manage those domains by
securely tunneling SNMP and ICMP traffic through a secure sockets layer (SSL) connection.
Only a single port needs to be opened in the firewall, allowing for extended
manageability without impacting security policies in place. This solution is
transparent to the end user and all client applications, eliminating the need to perform
additional administrative tasks.
Note: This feature is not available for the Assurance Server Xsight.
The Secure Domain Manager includes the following key features:
Multiple secure domain connectors
SNMP and ICMP traffic forwarding
Securely tunneled traffic via XML/SSL over Transmission Control Protocol (TCP)
Transparency to users and client applications
Configuration Manager
Managing today’s complex infrastructures involves maintaining hundreds or thousands of
business-critical devices. Being able to keep track of how they are all configured — and
making sure that configurations are accurate — can be overwhelming. SPECTRUM
Configuration Manager is an intelligent, integrated application that automates management
of critical device configurations to keep your business operational. SPECTRUM Configuration
Manager provides the tools that you need to capture, modify, load, and verify configurations for thousands of multi-vendor devices. With its unique design, SPECTRUM
Configuration Manager allows users to perform device administration on configuration files,
MIB object identifiers (OIDs), and SNMP attributes. Each configuration is time-stamped and
identified by the revision number. SPECTRUM-specific values such as polling interval,
community name, or security string can be edited.
SPECTRUM Configuration Manager can quickly load any stored configuration to single or
multiple devices simultaneously — tracking all changes, scheduling automatic uploads
during maintenance windows, or rolling back configurations to their last known good state.
Automatically scheduled configuration comparisons deliver immediate notification of
unauthorized changes. This component provides cost-effective configuration management
to ensure business continuity.
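The configuration comparison described above can be approximated with a line-by-line diff of two stored configurations. The Python sketch below uses the standard difflib module to list lines that were removed from or added to an approved configuration. It is a generic illustration, not the Configuration Manager's comparison logic, and the sample configuration text is invented.

```python
import difflib

def config_changes(approved, running):
    """Return lines removed from ("- ") or added to ("+ ") the approved config."""
    diff = difflib.ndiff(approved.splitlines(), running.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep only real changes
    return [line for line in diff if line.startswith(("+ ", "- "))]

approved = """hostname rtr1
snmp-server community secret RO
interface Serial0
"""
running = """hostname rtr1
snmp-server community public RO
interface Serial0
"""

changes = config_changes(approved, running)
if changes:
    print("Configuration differs from the approved version:")
    for line in changes:
        print(line)
```

An empty result means the two configurations match line for line, so a scheduled job built on this idea would notify someone only when the list is non-empty.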
Voice Management Components
The eHealth for Voice product is comprised of the following components:
Required Components
eHealth for Voice
Right to Use License (per PBX/message system)
Node License (1 per call/message system)
Optional Components
eHealth for Voice Policy Manager
eHealth for Voice
eHealth for Voice is a multi-vendor, multi-system (call management and voice messaging)
and multi-technology (traditional PBX (TDM) and IPT) performance management solution
that greatly simplifies management of voice networks. eHealth for Voice eliminates the
manual collection of data and the labor-intensive effort of report compilation and telephony
GoS determination. This translates to improved voice system performance and availability
delivered at a lower cost. Furthermore, eHealth for Voice is an agent-less solution that does
not require any software to be installed on the voice systems, simplifying installation and
greatly reducing time-to-value.
You can run a wide variety of reports for delivery to printers, email recipients, or a
corporate intranet. With eHealth for Voice, accurate current and historical system
information is always available for trending and analysis. Instead of fragmented data
snapshots, true performance measurements are delivered to a desktop or printer, every
day, automatically.
The eHealth for Voice architecture allows maximum scalability by modularizing functions.
For smaller installations, a single server may contain the database and the data collection
module. For larger applications, any number of additional servers in different locations may
act as data collection agents, downloading data from clusters and sending it to the one
central database. Data can be collected according to a user-defined schedule, twenty-four
hours a day, seven days a week, so that all data is retrieved before it is overwritten. The
central database may be accessed by any number of client machines over an IP network to
provide access to data and reports.
You can purchase eHealth for Voice by ordering the eHealth for Voice Right to Use license
for the PBX, call system, or messaging system to be monitored (for example, purchase CA
eHealth for Voice – Nortel CS-1000 and Meridian to monitor Nortel PBXs) and then order the
appropriate number of node licenses. One node license is required for each call system or
messaging system monitored.
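A quick way to work out an order is to tally the systems you plan to monitor. The Python sketch below counts one node license per system, as stated above, and lists the platform types that need a Right to Use license. Treating the Right to Use license as one per platform type is an interpretation of the text, so confirm it with your CA sales representative. The function name and inventory entries are hypothetical.

```python
from collections import Counter

def license_order(systems):
    """Summarize licenses for a list of (platform, system_name) pairs."""
    per_platform = Counter(platform for platform, _ in systems)
    return {
        # Assumption: one Right to Use license per platform type
        "right_to_use": sorted(per_platform),
        # One node license per call or messaging system monitored
        "node_licenses": sum(per_platform.values()),
    }

inventory = [
    ("Nortel CS-1000 and Meridian", "pbx-boston"),
    ("Nortel CS-1000 and Meridian", "pbx-dallas"),
    ("Example voice messaging platform", "vm-boston"),
]
print(license_order(inventory))
```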
eHealth for Voice Policy Manager
eHealth for Voice Policy Manager is a component that plugs into the eHealth for Voice
engine to monitor all data activity against user-defined criteria and provide automatic
notification when those criteria are met. The module allows you to set specific thresholds
and conditions at the node, platform, or system-wide level and to set notification actions
including sending e-mail, console, and pager messages, SNMP traps to SPECTRUM,
Unicenter NSM, or third-party monitoring systems, and invoking customized commands and
Technical Specifications for SPECTRUM Report Manager and OneClick Servers
Minimum System Requirements
  UNIX/Linux: Sun SPARCstation; Linux – Pentium Xeon
  Windows: Pentium Xeon
Operating Systems
  UNIX/Linux: Solaris 9, 10 (see installation guide for required patches); Linux Red Hat Ver 3, update 6 or greater
  Windows: Windows 2000, Windows XP Professional, or Windows 2003 Server
  Note: Business Objects XI supports a maximum of 10 users on XP.
Memory
  UNIX/Linux: 1 GB (with 2 GB swap space for Solaris)
  Windows: 1 GB
Free Disk Space
  UNIX/Linux: 4 GB
  Windows: 4 GB
Applications
  UNIX/Linux: Linux Update 6 or greater; Solaris packages SUNWeu8os and SUNWeuluf
  Windows: See Microsoft Support for information about updates for your Windows version; Business Objects XI Service Pack 1
Technical Specifications for SPECTRUM OneClick Servers (without Report Manager)
Minimum System Requirements
  UNIX/Linux: Sun SPARCstation; Linux – Pentium Xeon
  Windows: Pentium Xeon
Operating Systems
  UNIX/Linux: Solaris 9, 10 (see installation guide for required patches); Linux Red Hat Ver 3, update 6 or greater
  Windows: Windows 2000, Windows XP Professional, or Windows 2003 Server
Memory
  UNIX/Linux: 1 GB
  Windows: 1 GB
Free Disk Space
  UNIX/Linux: 230 MB
  Windows: 230 MB
Applications
  UNIX/Linux: Linux Update 6 or greater; Java 2 SDK, Standard Edition, version 1.5.0_06 or later
  Windows: Windows 2000 - Service Pack 2 or later; Java 2 SDK, Standard Edition, version 1.5.0_06 or later
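Before installing, you may want to confirm that a candidate server meets the free disk space figures listed above. The Python sketch below compares available space with those minimums. It assumes that "GB" and "MB" in the tables mean binary units, and the role names and function name are hypothetical. It checks only the file system that contains the current directory.

```python
import shutil

GIB = 1024 ** 3
MIB = 1024 ** 2

# Free disk space minimums taken from the two tables above
MIN_FREE_DISK = {
    "oneclick_with_report_manager": 4 * GIB,
    "oneclick_only": 230 * MIB,
}

def disk_ok(free_bytes, role):
    """Return True when the free space meets the minimum for the given role."""
    return free_bytes >= MIN_FREE_DISK[role]

free = shutil.disk_usage(".").free
for role in MIN_FREE_DISK:
    print(role, "OK" if disk_ok(free, role) else "insufficient free disk space")
```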
Installation Steps
You can install the software applications in any order; as a best practice, install SPECTRUM
and SPECTRUM OneClick/Report Manager first, then eHealth, and, finally, eHealth for Voice.
Important: The CA network and voice management applications require you to use
systems that are dedicated to each application. Do not use those systems for other
applications or services. While anti-virus and security software are recommended for any
server system in your environment, disable the anti-virus software during installation to
ensure that the applications install completely.
You need a minimum of four systems for the following basic configuration:
SPECTRUM (SpectroSERVER system)
SPECTRUM OneClick and Report Manager server
eHealth
eHealth for Voice
BEST PRACTICES
To facilitate the successful installation and setup of these components, review the following
best practices:
Ensure that the systems on which you plan to install the software have fixed IP
addresses.
Obtain and test login account privileges to the systems. For Windows systems, you need
an account with Administrator privileges. For UNIX systems, you need access to the root
user account.
How You Install SPECTRUM
On the system that you have designated as the SpectroSERVER for your environment,
install SPECTRUM Release 8.0. Log on to http://support.concord.com to obtain the latest Service Pack for the release from the Software Downloads page.
Follow the instructions in the SPECTRUM Installation Guide to complete the following tasks:
1. Confirm SPECTRUM prerequisites.
2. Prepare the operating system and optimize the system for best performance.
3. Make sure that you have SPECTRUM license and extraction keys. You obtain the keys
from your CA sales representative when you purchase the software.
4. Install the software and perform any necessary troubleshooting for the installation.
5. Start the SPECTRUM software.
6. Enable access to the SPECTRUM system.
Following the SPECTRUM installation, proceed to the SPECTRUM OneClick installation.
Install OneClick so that you have full administrative access to SPECTRUM.
Note: OneClick is the primary administration interface to SPECTRUM. Use OneClick, rather
than the legacy SpectroGRAPH interface, to perform administrative functions.
How You Install SPECTRUM OneClick and Report Manager
On the system that you have designated as the OneClick and Report Manager server for
your environment, install SPECTRUM OneClick and Report Manager for Release 8.0.
Follow the steps in the Report Manager Installation and Administration Guide to complete
the following tasks:
Important: To install the OneClick and Business Objects software, follow the
documentation carefully. Installation failures typically result if you diverge from the
documented steps. Install Business Objects first and select the option to use an existing
Java application server. In the event of a failure, you will have to remove and reinstall
OneClick and Report Manager.
1. Confirm OneClick prerequisites and system requirements.
2. Prepare the operating system and optimize the system parameters for best
performance.
3. Install the software and perform any necessary troubleshooting for the installation.
4. Install the OneClick client to run the application and confirm that you can connect to
the SpectroSERVER system.
5. On the OneClick interface, click the Report Manager tab on the OneClick index page to
confirm that Report Manager installed correctly.
How You Install eHealth
On the system that you have designated as the eHealth server for your environment, install
eHealth Release 6.0. Make sure that you log on to http://support.concord.com to obtain the
latest InstallPlus kit for the release from the Software Downloads page.
Follow the instructions in the New Installations of eHealth 6.0 Guide for your system
platform (Windows or UNIX) to complete the following tasks:
1. Confirm system prerequisites and locations for the eHealth and embedded Oracle
software.
2. Install the eHealth and Oracle software, and perform any necessary troubleshooting for
the installation.
3. Make sure that you have the eHealth licenses for the features that you will be using.
You obtain these licenses with the purchase of the eHealth products. For eHealth
release 6.0 GA, note that you need an eHealth SPECTRUM Integration license to
configure and use the integrated solution.
Important: After you complete the eHealth installation, follow the instructions provided in
the section “Configuring the Integrated Solution” in this chapter. Do not follow the
instructions to start the eHealth console and begin discovering your resources as elements.
For the integrated solution, you will discover eHealth elements by importing the SPECTRUM
configuration from the SpectroSERVER system. This simplifies the administration tasks for
eHealth discovery. For a description of the eHealth administration tasks and interfaces, see
the eHealth Administration Overview Guide.
How You Install eHealth for Voice
On the system that you have designated as the eHealth for Voice server for your
environment, install eHealth for Voice Release 4. You can install eHealth for Voice in its
entirety on one PC. You can also install the software in a distributed configuration on
several PCs where one is the database server and the others are client systems that can
access the database server for reports and administration tasks.
The Database Manager server requires a Microsoft SQL Server database engine. You must
purchase and install Microsoft SQL Server before installing eHealth for Voice. You can
typically accomplish this by installing Microsoft SQL Server 2000 on the PC that will hold
the eHealth for Voice database.
Note: Microsoft SQL Server is required only on the PC that will contain the eHealth for
Voice database (the Database Manager installation); it is not required on agent-only or
client-only machines.
Follow the instructions provided in the eHealth for Voice Operations Guide to complete the
following tasks:
1. Confirm system prerequisites.
2. Install the software and perform any necessary troubleshooting for the installation.
3. Optionally, install client-only servers to access the eHealth for Voice database server.
4. Start the Program Console and define your voice environment:
a. Install the licenses for the platforms to be supported.
b. Start the following services (at minimum):
› Task Scheduler
› Data Collector
› Data Loader
› Policy Manager
c. Define the following:
› Company
› Group
› Collector
› Platform
d. Check the Data Collection queue to verify the scheduled data collection.
5. Set up the SPECTRUM integration by following the instructions provided in the eHealth
for Voice Integration for SPECTRUM Guide.
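Step 4b lists the services that must be running at minimum. A simple checklist comparison, sketched below in Python, can flag any that are missing. The sketch does not query the Program Console or the operating system; the list of running services is supplied by hand, and the function name is hypothetical.

```python
# Minimum services named in step 4b above
REQUIRED_SERVICES = {"Task Scheduler", "Data Collector", "Data Loader", "Policy Manager"}

def missing_services(running):
    """Return the required services that are absent from the running list."""
    return sorted(REQUIRED_SERVICES - set(running))

# Example: services observed as started (entered manually for this sketch)
observed = ["Task Scheduler", "Data Collector"]
print(missing_services(observed))
```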
When you complete the eHealth for Voice installation, follow the instructions to start the
Program Console and define your voice environment; then set up your SPECTRUM
integration by following the instructions provided in the eHealth for Voice Integration for
SPECTRUM Guide.
Configuring the Integrated Solution
Using SPECTRUM, SPECTRUM OneClick, eHealth, and eHealth for Voice, you can deploy an
integrated solution for managing your network and voice resources.
SPECTRUM provides the top-level management interface for resource identification, fault
management, and IT network problem resolution. SPECTRUM reduces alarm noise and
detects root causes of problems.
eHealth provides performance management by collecting detailed statistics on your
resources and analyzing that data to detect growing problems and changes in behavior.
eHealth Live Health compares performance to thresholds and service rules, and raises
alarms when resource performance starts to degrade. Health reports and Live Health can
send alarms (traps) to SPECTRUM to reflect these problems in OneClick views.
eHealth for Voice manages the end-to-end service for traditional voice networks as well
as Voice over IP converged networks. It can detect service policy violations and capacity
problems, and send alarms to SPECTRUM to alert network managers through their
OneClick views.
While these products can be used separately to manage and report on network
performance and faults, their combined capabilities provide network managers with a single
top-level view of possible problems and changes in network performance, and the
capabilities to drill down to reports for more information and troubleshooting.
Best Practices
The following sections describe the best practices for configuring CA’s integrated Network
and Voice Management solution. These practices streamline common administration tasks,
and reduce time devoted to managing and maintaining the software configurations.
To configure the integrated solution, follow these primary steps:
1. Identify the network resources that you want to manage using SPECTRUM discovery;
then create Global Collections to organize those resources.
2. Import the SPECTRUM-discovered resources into eHealth using eHealth’s discover
process.
3. Facilitate reporting and management of your resources by organizing related elements
into groups and group lists based on the relationships such as the geographic region,
customer, organization, or department that they support.
4. Schedule eHealth discoveries of Global Collections to maintain the poller configuration.
The following sections describe these steps in more detail, and provide references to
product documentation that provides complete information.
Identify Resources and Use SPECTRUM to Discover Them as Global
Collections
Use SPECTRUM discovery to identify the network resources that you want to manage, and
then create Global Collections to organize those resources into topology views. These views
help network operators track various collections of network entities, organizations, or
services that comprise your infrastructure.
eHealth can detect changes in behavior, identify potential problems in service degradation or
capacity, and provide insight into performance trends over time.
SET UP THE SPECTRUM INTEGRATION
Before you can import global collections to eHealth, run the SPECTRUM setup program on
the eHealth system.
To run the eHealth SPECTRUM Integration setup program
1. Log in to the eHealth system as the eHealth administrator.
2. Open a terminal window and change to the eHealth directory by entering the following
command, where ehealth is the full pathname:
cd ehealth
3. Run the setup program by entering the following command:
./bin/nhSpectrumSetup
The SPECTRUM Import Setup dialog box opens.
4. Enter the following information when prompted by the setup program:
› Hostname or IP address of the SPECTRUM OneClick server
› Port number for OneClick server Web requests
› Path where OneClick is installed on the server
› Username used to log in to the OneClick server
› Password for the specified user name
5. Click OK. eHealth verifies your settings and displays a message notifying you if they are
valid.
Note: The validation process may take a few seconds.
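Before you run the setup program, it can help to gather the five values it asks for and check them for obvious mistakes. The following is a minimal sketch only; the class name OneClickSettings and its problems() method are hypothetical and are not part of eHealth or SPECTRUM. It checks only for empty fields and an out-of-range port, and does not replace the validation that eHealth performs in Step 5.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OneClickSettings:
    """Hypothetical container for the values the setup program prompts for."""
    host: str          # hostname or IP address of the SPECTRUM OneClick server
    port: int          # port number for OneClick server Web requests
    install_path: str  # path where OneClick is installed on the server
    username: str      # user name used to log in to the OneClick server
    password: str      # password for that user name

    def problems(self) -> List[str]:
        """Return readable problems; an empty list means nothing obvious is wrong."""
        found = []
        if not self.host.strip():
            found.append("host is empty")
        if not 1 <= self.port <= 65535:
            found.append(f"port {self.port} is outside 1-65535")
        if not self.install_path.strip():
            found.append("install path is empty")
        if not self.username.strip():
            found.append("username is empty")
        return found
```

An empty list from problems() means only that the values are plausible; the setup program still decides whether they are valid.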
DISCOVER SPECTRUM GLOBAL COLLECTIONS
Use the eHealth discover process to import the SPECTRUM configuration into eHealth.
To discover a SPECTRUM Global Collection
1. Log in to the eHealth console.
2. Select Setup, Discover.
3. In the Discover dialog, do the following:
a. In the Mode list, select the technology types associated with the resources that you
want to discover.
b. Select SPECTRUM Import and specify the SPECTRUM Global Collection that you
want to import.
c. Click Discover.
eHealth connects to the OneClick server, extracts the information from the SPECTRUM
collection, and discovers the appropriate elements.
4. Save the discovered elements to the poller configuration. eHealth automatically begins
polling them to collect performance data.
Organize Your Resources by Creating eHealth Groups
eHealth provides a grouping capability that helps you to organize your elements effectively,
facilitate administration, and simplify reporting. By focusing on a subset of elements —
rather than all elements in your infrastructure — you can manage them more easily as well
as create effective reports that address specific needs. To manage your infrastructure, you
can organize related elements into groups based on geographic regions, customers,
organizations, or departments that they support. To organize your groups, you can
associate them to group lists.
For example, if you wanted to monitor the systems supporting your business within Europe,
you could create a group called England (composed of resources that support offices in that
country), and other groups for each country in which you operate. You could then add those
groups to a group list called Europe and generate reports for the entire group list. To
simplify reporting and administration, you can also filter your element lists based on your
grouping strategy. Before grouping your resources, review the eHealth best practices for
grouping outlined in the eHealth Element and Poller Management Guide.
To create a new group
1. Log in to the OneClick for eHealth console as an administrator who has permission to
manage groups.
2. Select Find Elements in the Managed Resources folder.
3. Select the elements that you want to include. Select Element Chooser to filter the list.
Include a wildcard such as an asterisk (*) to match any number of characters, or a question
mark (?) to match a single character.
4. Right-click and select Create Group with Selected Elements.
5. Specify the first group name and a description. If SmartTree is enabled, append a label
to the group name that reflects the location of the elements and use the selected
delimiter (for example: England-1, Germany-1, or Spain-1).
6. Click OK. The group immediately appears under By Group.
7. Repeat Steps 2 through 6 to create other groups with a suffix. For example: England-2,
Germany-2, Spain-2.
8. Under Managed Resources, select By Group. If SmartTree is enabled, the element tree
displays two separate tiers in an alphabetical hierarchy based on that naming
convention.
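The wildcard filter in Step 3 and the delimiter-based naming in Steps 5 through 8 can be sketched as follows. This is an illustration only, not the Element Chooser or SmartTree implementation: it assumes that the wildcards behave like shell-style patterns (Python's fnmatch also treats square brackets as special, which the console may not), and the element names and both function names are made up for the example.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def filter_elements(names, pattern):
    """Keep names matching a pattern: * matches any run of characters, ? exactly one."""
    return [name for name in names if fnmatchcase(name, pattern)]


def tiers_by_delimiter(group_names, delimiter="-"):
    """Split each group name at the first delimiter and list the suffixes under each prefix."""
    tree = defaultdict(list)
    for name in sorted(group_names):
        prefix, _, suffix = name.partition(delimiter)
        tree[prefix].append(suffix)
    return dict(tree)


# Hypothetical element names, used only to exercise the two helpers.
elements = ["lon-rtr-1", "lon-rtr-2", "lon-sw-10", "ber-rtr-1"]
print(filter_elements(elements, "lon-rtr-?"))
print(tiers_by_delimiter(["England-1", "Germany-1", "England-2", "Spain-1"]))
```

The second helper shows why a consistent delimiter matters: every group that shares a prefix lands under the same first tier.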
Network and Voice Monitoring
Using CA’s integrated solution, you can closely monitor the performance of your network
and voice resources. The eHealth Live Health application provides instantaneous feedback
on trouble spots — telling you where the problems are, when they started, and their
severity. It also identifies growing problems before they become failures, allowing you to
take action and keep your business running smoothly. Live Exceptions sends alarm
notifications to the SPECTRUM interface. You can then run reports from the SPECTRUM
OneClick interface to review eHealth’s analysis of the problems.
To configure the integrated solution to monitor performance, follow these primary steps:
1. Set up Live Health monitoring of eHealth groups and group lists.
2. Forward Live Health traps to SPECTRUM.
3. Customize and schedule Health reports to send traps to SPECTRUM.
4. Configure eHealth for Voice to send alerts to SPECTRUM.
5. Configure SPECTRUM to recognize the eHealth server.
6. Configure SPECTRUM to view eHealth alarms.
The following sections describe these steps in more detail, and reference the product
documentation for complete information.
Set Up Live Health
After you discover the resources that you want to monitor and group them, you can
associate them to a Live Health profile to indicate when performance problems are
occurring. A Live Health profile is a set of alarm rules that eHealth applies to groups or
group lists of elements. Alarm rules define the types of elements and conditions to monitor,
the problem thresholds and duration, and the problem severity.
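To make the parts of an alarm rule concrete, the following sketch models a rule as a small data structure and evaluates it against recent samples. It is a simplified illustration under stated assumptions, not the way Live Exceptions evaluates rules: the field names are hypothetical, and "duration" is approximated as a count of samples above the threshold within a window of recent samples.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlarmRule:
    """Hypothetical stand-in for one alarm rule in a profile."""
    element_type: str         # type of element the rule applies to
    variable: str             # condition being monitored
    threshold: float          # value above which a sample counts as a problem
    min_problem_samples: int  # samples in the window that must exceed the threshold
    window: int               # number of most recent samples to examine
    severity: str             # severity assigned to the resulting alarm


def rule_is_violated(rule: AlarmRule, samples: Sequence[float]) -> bool:
    """Return True when enough of the most recent samples exceed the threshold."""
    recent = samples[-rule.window:]
    over = sum(1 for value in recent if value > rule.threshold)
    return over >= rule.min_problem_samples


# Example values are invented for illustration.
cpu_rule = AlarmRule("Router/Switch CPU", "cpu_utilization", 90.0, 3, 6, "Major")
print(rule_is_violated(cpu_rule, [50, 95, 96, 40, 97, 30]))
```

Requiring several samples above the threshold, rather than one, is what keeps a brief spike from raising an alarm.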
eHealth provides hundreds of technology-specific profiles for managing your network
resources. For each technology, eHealth offers the following types of Live Health profiles:
Profile Name           Description of Purpose
Failure                Identifies problems with availability, errors, or other device
                       failures.
Delay                  Warns of overutilization or congestion problems that could
                       cause network delay.
Unusual workload       Indicates when an element’s capacity or volume is outside its
                       typical performance for the baseline.
Latency                Identifies when network latency is increasing. Latency
                       is usually measured between the eHealth system and the
                       device itself.
Configuration change   Detects when a device’s configuration has changed, such as
                       module/card insertions to a switch.
Security               Warns of problems such as a firewall detecting a “ping of
                       death” attack, login failures, or unauthorized accesses.
Once you assign a profile to a group or group list of elements, Live Exceptions monitors the
group or group list to look for any activity that violates the specified rules, and produces
alarms when activity triggers any of the rules in the profile. With this integrated solution,
you can configure Live Health to send alerts to the SPECTRUM interface when problems
occur. Carefully review the Live Exceptions web help available with the product to ensure
that you understand performance and how the rules identify performance problems.
FIND LIVE EXCEPTIONS PROFILES
The Live Exceptions feature has hundreds of default profiles that you can use to monitor
your resources. To search and review the profiles that apply to your types of resources, use
the Live Health Profiles tool on the eHealth Certification support site.
To review available Live Exceptions profiles
1. Using a web browser, log on to http://support.concord.com.
2. Click Certification.
3. On the Certification page, click Live Health Profile Descriptions under Certification
Information.
4. Click Element Types to display the various types of resources that eHealth can monitor.
5. Scroll through the list of elements to locate the types that you are currently monitoring.
For example, if you are monitoring CPUs, click CPU, Router/Switch CPU (1), Generic
Router/Switch CPU (2).
6. Click a profile name, and review the profile description to determine the types of
problems for which it will raise alarms.
You can also create custom profiles and rules. For a description of how to create rules and
profiles, see the Live Exceptions web help.
START LIVE EXCEPTIONS AND ASSOCIATE PROFILES
To use Live Exceptions, you must log in to the eHealth Web interface and download the Live
Health client application to your local PC or workstation. Install the Live Exceptions client
following the instructions provided on the download page.
To access the eHealth web interface, use a web browser to navigate to
http://hostname:port, where hostname is the name or IP address of your eHealth system,
and port is the HTTP port used by the web server. If your Web server uses the default port
80, you can omit the port number. You must have an eHealth Web user account to log on
to the eHealth web interface.
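The URL rule described above (omit the port when the web server uses the default port 80) can be expressed as a short helper. This is a sketch for illustration; the function name ehealth_url and the host names in the usage lines are assumptions, not part of the product.

```python
def ehealth_url(hostname: str, port: int = 80) -> str:
    """Build the web interface URL, leaving out the port when it is the HTTP default."""
    if port == 80:
        return f"http://{hostname}"
    return f"http://{hostname}:{port}"


# Hypothetical hosts: one on the default port, one on a custom port.
print(ehealth_url("ehealth.example.com"))
print(ehealth_url("192.0.2.20", 8080))
```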
4. Specify the following information for the SpectroSERVER under Edit Trap Destination:
› Hostname
› IP address
› Port number
5. Click Add.
6. Confirm that the name of the SpectroSERVER appears in the Existing Trap Destinations
list; then click OK.
7. Select Setup, Notifier Rules.
8. In the Notifier Manager dialog, click New.
9. In the Notifier Rule Editor dialog, do the following:
a. In the Name field, enter SPECTRUM.
b. In the Action list, select Send Trap.
c. In the To NMS list, select the SpectroSERVER that you specified in Step 4.
d. Under When an alarm is, select both Raised and Cleared.
e. Under Elements within, specify either a specific technology type or All
Tech/Subjects.
f. Click OK to save your Notifier rule.
10. Confirm that the Notifier rule appears; then, close the window.
Customize and Schedule Health Reports to Forward Traps
A Health report evaluates the health of a group of elements by comparing current
performance to historical performance over the course of a day, week, or month. The report
identifies errors, unusual utilization rates, or shifts in volume that warrant investigation.
This report helps you evaluate the health of your resources by monitoring how efficiently
those resources are running, checking for availability of critical resources, and detecting
whether they are beginning to experience problems. The report analyzes trends based on
historical data and calculates averages using a service profile.
You can configure individual Health reports to forward traps for Health exceptions to the
SpectroSERVER. When a scheduled Health report runs, eHealth sends an SNMP trap to the
SpectroSERVER for the top problem of each element in the Exceptions section of the Health
report.
Note: Only scheduled Health reports forward exceptions. If you manually run a Health
report, it will not forward exceptions.
To forward exceptions from Health reports
1. Log in to the eHealth Console.
2. Select Reports, Customize, Health Reports.
3. Select the report from which you want to forward Health exceptions.
4. In the Presentation Attributes drop-down list, select General.
5. Select NMS IP and Port Trap Address in the Attribute table.
6. In the Value field, specify the SpectroSERVER IP address and SNMP port number,
separated by a colon. For example:
001.02.03.004:162
7. Click Apply to save.
8. In the Presentation Attributes drop-down list, select Exceptions.
9. Select Send Exceptions SNMP Trap in the Attribute table.
10. Select Yes in the value field.
11. Click OK.
12. Click Save to save the custom report.
13. Select Setup, Schedule Jobs.
14. Select Add Health Report from the list.
15. In the Add Scheduled Report dialog, do the following:
a. Select the report.
b. For the subject, select the technology type and group for the report.
c. Specify a time range for the report, and optionally, a time zone.
d. Select the format in which you would like to output the report.
e. Set the schedule for the job.
f. Click OK.
16. Click OK.
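The value entered in Step 6 combines an IP address and an SNMP port number, separated by a colon. The following sketch builds and checks such a value before you type it into the console. The function name trap_address is hypothetical, the default of 162 follows the example in Step 6, and the address in the usage line is a documentation placeholder. Depending on the Python version, the ipaddress module may reject zero-padded octets, so the sketch uses an unpadded address.

```python
import ipaddress


def trap_address(ip: str, port: int = 162) -> str:
    """Return 'ip:port' after checking that both parts are well formed."""
    parsed = ipaddress.IPv4Address(ip)  # raises ValueError for a malformed address
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside 1-65535")
    return f"{parsed}:{port}"


# Placeholder SpectroSERVER address, for illustration only.
print(trap_address("192.0.2.10"))
```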
Configure eHealth for Voice to Send Alerts to SPECTRUM
To allow SPECTRUM OneClick to show the voice-specific problems in PBXs, messaging
systems, and other voice infrastructure monitored by eHealth for Voice, configure the
eHealth for Voice Policy Manager to send alerts (SNMP traps) to SPECTRUM when a
particular condition occurs. To configure the Policy Manager, you create a policy based on a
defined action plan (the responses assigned to policies) and conditions.
To configure eHealth for Voice to send alerts to SPECTRUM
1. On the system on which eHealth for Voice is installed, select Start, Programs, eHealth
for Voice, eHealth for Voice. The eHealth for Voice Program Console appears.
2. Select Tools, Service Setup to configure and start the Policy Manager service.
3. Click Configuration, Servers to configure the Email, SNMP, Web, and SPECTRUM
servers.
4. Define the actions to include in the action plan:
a. Click Templates in the Policy Manager group of the console tree.
b. Click Actions.
c. Right-click in the right pane and click New.
d. Complete the details for the action type. Specify information under the Properties
and the Configure tabs.
e. Click Save.
f. Click Cancel to close the window.
5. Create an action plan template to define the responses that you want to assign to the
policy:
a. Click Templates in the Policy Manager group of the console tree.
b. Click Action Plans.
c. Right-click in the right pane and select New.
d. Specify a name, description, time zone, and actions.
e. Click Save.
6. Create a policy based on that action plan:
a. Click Policies in the Policy Manager group of the console tree.
b. Click Global to create a policy based on eHealth for Voice global data.
c. Right-click in the blank area of the right pane, and select New from the menu.
d. Select Blank Policy, and click Next.
e. Specify a name and description.
7. Click Add.
8. Define the condition:
a. Specify the name, platform for the element, and data table.
b. Specify the build criteria.
c. Click Apply to save the condition.
9. Define the policy:
a. Select the time zone, operating interval, and timeframe.
b. Specify the number of times the condition should match the policy before triggering
the action plan.
c. Specify the severity level.
d. Select an action plan.
e. Click Save.
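Step 9b sets how many times the condition must match before the action plan is triggered. The following sketch shows one way to reason about that setting. It is not the Policy Manager's implementation: the class name PolicyCounter is hypothetical, and the sketch assumes that matches must be consecutive and that the count restarts after the action plan runs. Check the eHealth for Voice documentation for the actual behavior.

```python
class PolicyCounter:
    """Hypothetical model of 'match N times before triggering the action plan'."""

    def __init__(self, required_matches: int):
        if required_matches < 1:
            raise ValueError("required_matches must be at least 1")
        self.required_matches = required_matches
        self.matches = 0

    def observe(self, condition_met: bool) -> bool:
        """Record one evaluation; return True when the action plan should run."""
        if not condition_met:
            self.matches = 0  # assumption: a miss resets the count
            return False
        self.matches += 1
        if self.matches >= self.required_matches:
            self.matches = 0  # assumption: counting restarts after triggering
            return True
        return False


counter = PolicyCounter(required_matches=3)
results = [counter.observe(hit) for hit in [True, True, False, True, True, True, True]]
print(results)
```

A higher match count makes the policy slower to react but less likely to raise an alert on a single transient reading.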
Configure SPECTRUM to Recognize the eHealth Server
After completing the eHealth setup, you must also configure SPECTRUM to recognize the
eHealth server. This allows you to drill down to eHealth reports, as well as to clear alarms
from the OneClick console.
To enable SPECTRUM to recognize the eHealth server
1. Log on to the SPECTRUM OneClick homepage using your SPECTRUM credentials and
click Administration at the top of the page.
2. From the Administration menu, select eHealth Configuration.
3. In the eHealth Configuration window, enter the following information:
› Hostname or IP address of the eHealth server
› Port number on which eHealth listens for web requests
› eHealth web administrator user name
› eHealth web administrator password
4. Select Started in the Alarm Notifier Status section to enable SPECTRUM to clear Live
Health alarms.
Note: If you configure eHealth to forward alarms to SPECTRUM, and configure
SPECTRUM to view eHealth alarms, the alarm notifier enables you to clear those alarms
directly from the OneClick console.
5. Click Save.
Configure SPECTRUM to View eHealth Alarms
If you configured eHealth to forward Live Health alarms or Health exceptions to a
SpectroSERVER, you must also configure SPECTRUM to receive the alarms.
To enable SPECTRUM to view eHealth alarms
1. Log in as a SPECTRUM administrator.
2. Select Start Console at the top of the OneClick page to launch the OneClick Console.
3. In the Explorer tab of the OneClick Navigation panel, select your SpectroSERVER, and
then select Universe.
Note: If you are monitoring multiple SpectroSERVERs, select Universe under the landscape
for the Trap Director SpectroSERVER.
4. In the Contents panel, select the Topology tab.
5. In the Topology tab toolbar area, click the Create a new model by type icon. The Select
Model Type dialog appears.
6. Select the All Model Types tab.
7. Select EventAdmin, and then click OK. The Create Model of Type dialog appears.
8. Specify the name and IP address of the eHealth server, and click OK. The eHealth
server appears in the topology as an EventAdmin model.
Note: For more information on creating a model in OneClick, see the Modeling Your IT
Infrastructure Administrator Guide.
9. Select the EventAdmin model in the OneClick Topology.
10. Right-click the EventAdmin model; then select Utilities, Attribute Editor. The Attribute
Editor dialog appears.
11. In the Attributes tree, select User Defined and click add. The Attribute Selector dialog
appears.
12. In the Select Model Type window, select Other, EventAdmin.
13. In the Attributes for EventAdmin window, select
map_traps_to_this_model_using_IP_header, and click OK. The attribute appears in the
User Defined list in the Attribute Editor.
14. Click the arrow that points to the right. The attribute moves to the right window.
15. In the right window, select map_traps_to_this_model_using_IP_header, and select Yes.
16. Click Apply. SPECTRUM applies the attributes to the model, and the Attribute Edit
Results dialog appears.
17. Confirm your changes in the Attribute Edit Results window, and click Close.
18. Click OK in the Attribute Editor.
Chapter 6: Gathering System Information from Agents
Systems are important components of the network. They typically contain your critical
business applications, such as web servers, database applications, e-mail applications, and
other company-critical applications. When their performance degrades, users are unable to
run their applications and perform tasks that utilize those servers. This chapter discusses
how to gather system monitoring information, and outlines best practices for configuring
SPECTRUM and eHealth to manage Unicenter NSM, SystemEDGE, and third-party system
agents.
Deployment and Administration of System Agents
SPECTRUM and eHealth leverage installed system monitoring agents for fault and
performance information. This chapter describes three types of system agents:
› CA Unicenter NSM agents
› CA SystemEDGE agents
› Third-party agents
Important: The installation procedures for the Unicenter NSM and SystemEDGE agents are
described in detail in product-specific installation guides. You must use those guides to
correctly install the agents. After you complete the software installation, review this
chapter to obtain the best practices for configuring SPECTRUM and eHealth to leverage
these agents.
Best Practices
To facilitate the monitoring and management of system agents, follow these best practices:
› Ensure that the system agent has been successfully installed.
› Configure an SNMP read-only or read-write community string on the system.
› Configure the system agent to send traps to the SpectroSERVER.
› Confirm that you have specified the correct IP address and community string to discover
the agents.
› Confirm that your systems have only one management agent enabled and running on
them. Systems can sometimes have multiple SNMP agents. For example, they could have
the Microsoft SNMP agent and a CA SystemEDGE agent. If multiple agents are running
and responding to SNMP queries, SPECTRUM and eHealth could model both agents for the
one system. For more information, see the Unicenter NSM Agents section later in this
chapter.
Supported Agents
The following table highlights the Unicenter NSM agents supported by eHealth and
SPECTRUM based on release. The Active Directory and Performance agents are supported
only for eHealth reporting.
Unicenter NSM r11 Systems Agents             Unicenter NSM 3.1 Systems Agents
UNIX System Agent (caiUxsA2)                 UNIX System Agent (caiUxOs)
Windows System Agent (caiWinA3)              Windows System Agent (caiW2kOs)
Active Directory Services Agent (caiAdsA2)   Active Directory Services Agent (caiAdsA2)
Log Agent (caiLogA2)                         Log Agent (caiLogA2)
Performance Agent (hpxAgent)                 Performance Agent (hpxAgent)
SPECTRUM and eHealth also support all SystemEDGE agents as well as a variety of third-
party agents provided by vendors such as Microsoft, Dell, Sun, HP, and IBM. In addition,
these applications also support any MIB-II or RFC 2790-compliant agents. These
applications provide out-of-box automated fault management, trap support, and
performance reporting and trending.
Note: For agents that support and use the RFC 2790 extensions of MIB-II, SPECTRUM can
perform process, file system, and log file monitoring in addition to basic host systems
performance monitoring. If you discover agents that do not have the RFC 2790 extensions,
only basic host systems performance and log file monitoring may be possible.
Prerequisites
Before you begin, do the following:
› Confirm that you have administrator account access to both the SPECTRUM OneClick
console and the eHealth console.
› If you are not familiar with the SPECTRUM OneClick console, see the OneClick
Administration Guide for more information.
› If you are not familiar with the eHealth interfaces, review the descriptions of the eHealth
console and the OneClick for eHealth (OneClickEH) console provided in the eHealth
Administration Overview Guide.
How You Add System Agents in SPECTRUM
You can use either of these methods to add the system agents to SPECTRUM:
› Automatically discover the system agents using SPECTRUM’s AutoDiscovery application.
› Manually add the system agents to SPECTRUM.
AUTOMATICALLY DISCOVER SYSTEM AGENTS
SPECTRUM can automatically discover and model your system resources using auto-
discovery capabilities.
To automatically discover your systems
1. In the SPECTRUM OneClick console, select Tools, Utilities, Discovery, New Discovery.
The Discover dialog appears.
2. In the Discovery window, do the following:
› Specify a configuration name.
› Specify an IP range or list, or select Import to import an IP list file.
› Specify a valid community string. If you specify more than one, OneClick uses
the entry at the top first.
› Select Discover Only in Modeling Options.
› Click Advanced Options and specify port 6665 to discover Unicenter NSM agents.
3. Click Discover. The Discovery dialog appears.
4. (Optional) After the results set appears, exclude entries by right-clicking them and
selecting Exclude. This prevents those devices from being “modeled” in the SPECTRUM
database.
5. Click Model to add the systems to SPECTRUM.
6. Under Model Options, deselect Create Wide Area Link Models and Create LANs. When
these options are deselected, SPECTRUM does not automatically create “subnet”
containers. For more information about discovery and modeling options, see the
Modeling Your IT Infrastructure Administrator Guide.
7. Click OK.
8. After the systems are modeled, click Close in the Discovery dialog.
9. (Optional) Click the paper and pencil icon in the left corner of the Tools menu.
10. Edit the topology by moving icons and adding background images, as desired.
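Two of the discovery settings in Step 2 lend themselves to a quick check before you start: the IP range, and the order in which community strings are tried. The following sketch expands a range into individual addresses and walks a community list from the top, as the step describes. It is an illustration only. The function names are hypothetical, and the probe argument is a caller-supplied function (for example, one that wraps an SNMP query on port 6665); the sketch itself sends no SNMP traffic.

```python
import ipaddress
from typing import Callable, Iterator, Optional, Sequence


def expand_range(first: str, last: str) -> Iterator[str]:
    """Yield every IPv4 address from first to last, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end < start:
        raise ValueError("last address precedes first address")
    for value in range(start, end + 1):
        yield str(ipaddress.IPv4Address(value))


def first_working_community(
    host: str,
    communities: Sequence[str],
    probe: Callable[[str, str], bool],
) -> Optional[str]:
    """Try community strings from the top of the list; return the first one accepted."""
    for community in communities:
        if probe(host, community):
            return community
    return None


# Placeholder range, for illustration only.
print(list(expand_range("192.0.2.254", "192.0.3.1")))
```

Listing the most widely used community string first reduces the number of failed attempts per host.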
MANUALLY ADD A DEVICE USING CREATE MODEL BY IP ADDRESS
As an alternative to automatically discovering your systems, you can manually model them.
This procedure works for systems that do not respond to discovery.
To model your system resources manually
1. Using the Explorer tab of the OneClick navigational panel, navigate to the Universe
topology view in which you want the new device to appear. The selected Universe
topology view appears under the Topology tab of the Contents panel. Tip: If you want
to place the new device inside a network group container, double-click the container
icon to display the topology view for that container.
2. In the Topology tab toolbar area, click the Create model by IP address button. The
Create Model by IP Address dialog appears.
Note: To remove a modeled element from a view, select the element and click
Delete (X).
Tips: To move or enhance the appearance of the recently modeled device icon, click the
Edit mode button in the Topology tab toolbar. You can edit and arrange the modeled devices
using the following techniques:
› To copy and paste the modeled device icon to a topology view other than the
Universe topology, use the copy and paste functions in the Topology tab toolbar area.
› To change configuration parameters of a modeled device (for example, community name,
polling interval, logging interval, security string, and so on), select the modeled device
and change the appropriate settings in the Component Detail panel.
Unicenter NSM Agents
You can discover and model Unicenter NSM agents automatically using SPECTRUM
discovery, or you can manually model them. Because Unicenter NSM agents use UDP port
6665 for SNMP communications by default, rather than the standard SNMP port 161,
SPECTRUM can discover and model other agents running on the host device. For example, if
a Windows workstation is running a Unicenter NSM agent bound to port 6665, as well as
the Microsoft SNMP agent bound to port 161, SPECTRUM will create two models for the
device: a Unicenter NSM System Host device model and a Windows Host device model, as
shown in the following figure.
This scenario can cause poor performance for the following reasons:
› It creates unnecessary duplicate models in SPECTRUM.
› It causes redundant SNMP traffic and polling which can reduce network and SPECTRUM
performance.
› It reduces performance of the agent host machine because multiple management agents
are providing performance data.
To avoid this scenario, do the following:
1. Before discovering and modeling, stop and/or remove all management agents except
the one that you want to use to manage the system. By doing this, you can avoid
creating and managing multiple models in SPECTRUM for the same host. Remember to
use the correct SNMP port for the discovery.
2. If you must run more than one agent on a given host system, consider manually
modeling only the agent that you want to manage with SPECTRUM.
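To find hosts that are at risk of duplicate models, you can compare which SNMP ports answered on each address. The sketch below assumes that you already have a list of (IP address, UDP port) pairs from your own checks; how that list is gathered is outside the sketch, and the function name is hypothetical. The ports 161 and 6665 come from the description above; the addresses are placeholders.

```python
from collections import defaultdict


def hosts_with_multiple_agents(responses):
    """Map each host that answered on more than one SNMP port to its sorted ports.

    responses: iterable of (ip_address, udp_port) pairs, one per agent that answered.
    """
    ports_by_host = defaultdict(set)
    for ip, port in responses:
        ports_by_host[ip].add(port)
    return {ip: sorted(ports) for ip, ports in ports_by_host.items() if len(ports) > 1}


# Placeholder results: the first host runs two agents, the second runs one.
answers = [("192.0.2.10", 161), ("192.0.2.10", 6665), ("192.0.2.11", 6665)]
print(hosts_with_multiple_agents(answers))
```

Any host in the result is a candidate for stopping one agent before discovery, or for manual modeling of the agent that you want to keep.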
How You Add System Agents in eHealth
After you discover and model systems using SPECTRUM, import a SPECTRUM Global
Collection to add those system resources to eHealth for reporting and Live Health
monitoring. You could add the systems using eHealth discovery as well, but as a best
practice for the integrated solution, import the systems from SPECTRUM as described in
Chapter 5. After a few eHealth poll cycles, you can run At-a-Glance reports and Trend
reports from the OneClick interface.
Performance Reporting On System Agents
eHealth normalizes common performance data across all managed system agents
(Unicenter NSM, SystemEDGE, and third-party). Presenting all performance data in a
common, understandable format minimizes the learning curve for all users who
access real-time and historical trending reports.
At-a-Glance Reports
An eHealth At-a-Glance report for system elements provides summary capacity statistics for
the specified system including CPU, interface, and partition utilization; disk faults and I/O;
and system availability. With these reports, you can quickly isolate busy CPUs or full disks
and compare groups of systems. A sample At-a-Glance report for a system element follows.
HOW YOU RUN AT-A-GLANCE REPORTS
You can run At-a-Glance reports using one of these methods:
In SPECTRUM OneClick, right-click a device and click At-a-Glance Reports. The integrated
At-a-Glance report runs in the background and appears automatically in a web browser
on your system.
With a web browser, log in to the eHealth Web interface at the URL
http://hostname:port, where hostname is the name or IP address of your eHealth
system, and port is the HTTP port used by the web server. If your Web server uses the
default port 80, you can omit the port number. You must have an eHealth Web user
account to log in to the eHealth web interface. Navigate to the Run Reports tab and run
an At-a-Glance report on demand.
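As a minimal sketch of the URL rule described above, the following Python helper builds the address of the eHealth Web interface and omits the port number when the web server uses the default port 80. The function name and the sample host names are hypothetical and are not part of eHealth.

```python
def ehealth_web_url(hostname: str, port: int = 80) -> str:
    """Build the web interface URL (illustrative helper, not an eHealth API)."""
    if port == 80:
        # Default HTTP port: the port number can be omitted from the URL.
        return f"http://{hostname}"
    return f"http://{hostname}:{port}"


if __name__ == "__main__":
    # Hypothetical host names, used only to show both URL forms.
    print(ehealth_web_url("ehealth-server"))
    print(ehealth_web_url("ehealth-server", 8080))
```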
MyHealth Reports for Systems
The MyHealth report page on the eHealth Web interface contains a series of charts that are
tailored to your particular interest. MyHealth provides eHealth web users with one or more
customized reports on the elements and groups that they consider critical. A MyHealth
report page contains one or more panels, and each panel contains a separate chart.
Health Reports for Systems
A Health report contains information about the performance of a group of elements for a
report period and alerts you to situations that require your attention. The report also
identifies situations to investigate because of errors, unusual utilization rates, or excessive
volume.
You can use a Health report to do the following:
Identify normal and exceptional system behavior.
Compare the performance of a group of elements during a report period to their
performance over a baseline period.
Detect changes in behavior that indicate imminent or existing problems.
Identify trends in volume.
Identify systems that require further investigation.
Using Live Trend
You can use Live Trend to create charts that monitor statistics for elements that you are polling
using eHealth. You can create a single chart or multiple charts in various styles to represent
element trends (a single element with multiple variables) or variable trends (a single
variable for multiple elements). The following chart shows a Live Trend chart for four
variables on a system called atlanta.
START LIVE TREND
To use Live Trend, you must log in to the eHealth Web interface and download the Live
Health client application to your local PC or workstation. Install the Live Health client
following the instructions provided on the download page. You can then start the Live Trend
application to run real-time performance charts for your systems and resources.
To start the Live Trend application
1. Make sure that you have downloaded and installed the Live Health client software from
the eHealth Web interface.
2. Do one of the following to open the Live Trend application:
› If your system is a Windows system, select Start, Programs, eHealth, Live Trend.
Your program group name will vary, depending on the name that you used when
you installed the Live Health client.
› On a UNIX system, change to the Live Health client installation directory and run
the command nhLiveTrend.
3. In the eHealth System field in the Live Trend application window, specify the name of
the system to which you want to connect, and then specify your user name and
password. The Live Trend Chart Definition Manager appears.
You can create your own charts through the Live Trend Chart Definition Editor to specify the
elements and variables for which you want to view data. For more information, see the Live
Trend web help that is accessible from the eHealth Web interface.
How You Run Trend Reports for Systems
You can use Trend reports to determine the value of one or more variables for your
systems over a specified report period. This can help you to track the values of the
variables to determine when values might have changed radically or when a particular
event, such as a reboot or missed poll, occurred.
The Trend variables differ for each element type. You can run reports for the following
types of systems and system components:
CPU
Disk
LAN
Process and process set
User or system partition
WAN
Each of these types includes specific variables on which you can run reports. For example,
server disk elements have variables for disk reads and writes, storage capacity, and storage
utilization. You can select up to ten variables at a time on which to run a Trend report. For
a complete list of system Trend variables, see the eHealth web help.
The following sample Trend report shows several common system variables:
Total Bytes
Total Incoming Bytes
Total Outgoing Bytes
System Calls
RUN A TREND REPORT
You can run Trend reports from the eHealth web interface.
To create a Trend report similar to the example above
1. Use a web browser to log on to the eHealth web interface.
2. Click the Run Reports tab.
3. Scroll the Available Reports frame to the Trend reports section.
4. Click Standard.
5. Select the System Element Type.
6. Scroll the Elements list and select the target system element.
7. Scroll the Variables list and select variables; the sample report shows the four variables
Total Bytes, Total Incoming Bytes, Total Outgoing Bytes, and System Calls.
8. Select the chart type such as a stacked line chart.
9. Scroll the right frame and click More Options.
10. Select Show Summary Statistics in the General tab to show the tabular data below the
chart.
11. Click Generate Report. eHealth processes the report data and displays the Trend report.
Top N Reports
A Top N report lists all of the elements in a group that exceed or fall below the report
criteria goals that you specify. You can also specify the goal for each variable. eHealth
calculates the difference between the actual value for that variable and the goal that you
have set.
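The following Python sketch illustrates the idea behind a Top N listing: for each element, compute the difference between the actual value and the goal, keep the elements on the chosen side of the goal, and list the largest differences first. This is only an illustration of the arithmetic, not how eHealth implements the report; the function name, the element names other than atlanta, and the sample values are assumptions.

```python
from typing import Dict, List, Tuple


def top_n(values: Dict[str, float], goal: float, n: int,
          exceed: bool = True) -> List[Tuple[str, float, float]]:
    """Return up to n (element, actual, difference) rows, largest gap first."""
    rows = []
    for element, actual in values.items():
        difference = actual - goal
        # Keep elements above the goal, or below it when exceed is False.
        if (exceed and difference > 0) or (not exceed and difference < 0):
            rows.append((element, actual, difference))
    rows.sort(key=lambda row: abs(row[2]), reverse=True)
    return rows[:n]


if __name__ == "__main__":
    # Hypothetical CPU utilization values (percent) for three systems.
    cpu = {"atlanta": 92.0, "boston": 71.0, "chicago": 85.0}
    for row in top_n(cpu, goal=80.0, n=2):
        print(row)
```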
What-If Capacity Trend Reports for Systems
The eHealth What-If Capacity Trend report enables you to perform capacity planning by
adjusting factors for capacity and demand until you have devised an appropriate what-if
solution. By giving you the capability to illustrate possible future scenarios, this report helps
you prepare for problems before they occur.
Chapter 7: Service Level Management
At the core of CA’s solution is a concept called Business Service Intelligence (BSI), a
methodology for understanding the relationships and impact of IT infrastructure on
business services. BSI delivers Technology Relationship Mapping, impact analysis, and RCA
that enables our customers to evolve their IT organizations from being tactically reactive to
strategically proactive, while improving IT service quality from a customer and business
perspective.
BSI provides adaptive analytics that communicate bi-directionally with thousands of multi-
vendor, multi-technology devices to identify, verify, and solve complex problems using
model-based, rule-based, and policy-based correlation engines. Business service definition
and on-going maintenance issues are eased through automation, while asset, availability,
capacity planning, change management, performance, and trend analysis validate SLA
compliance. BSI provides a bottom-up approach to Business Service Management (BSM)
that is practical, achievable, and delivers rapid time-to-value.
BSM provides the most obvious value when basic fault management data is insufficient
and additional correlation is required to determine the impact that may have occurred as
the result of a fault and to identify the business services that may be impacted. The
SPECTRUM Service Management module features the ability to organize, analyze, and
control all aspects of this area. It also provides a dashboard view as an extension of
OneClick that focuses directly on service health and hides the complexity of topology that is
normally seen using OneClick.
In general, the approach to service management can be described as “top-down” to identify
the relationships and dependencies of devices, systems, applications, or performance
measurements. Within SPECTRUM, they are referred to as resources (models or data) and
relationships. These resources and relationships are organized into service or subservice
models. You should define a service from the bottom up to permit the future reuse of
common services or subservices. You can configure SLAs to dynamically measure violations
and send alerts.
This chapter describes an approach to designing and implementing a service management
system within the SPECTRUM application. Unlike most of the functions within SPECTRUM,
preparing for service management may involve considerable planning to determine all of
the required information and implications.
The SPECTRUM methodology is designed to evolve over time. As more information becomes
available as the implementation proceeds, a more granular representation and
measurement of service modeling and management typically emerges.
Additional References:
Service Manager User Guide
Report Manager User Guide
Service Performance Manager User Guide
Interview Procedures
This section describes an approach to designing and implementing the interview process for
service management.
Interview Questions
To organize and implement service management and service level management, document
and collect responses to the following interview questions:
Which business services do you want to monitor?
Which particular resources support those services?
› Processes?
› Software applications?
› IT devices?
How can conditions and faults that affect services be detected?
Which resource attributes should be monitored to determine the health of a service?
Who should be notified if a given service fails?
What are the SLAs, and how should they be quantified (metrics)?
What is the criticality of a given service relative to other services or subservices?
General Questions
To organize and implement service management and service level management, document
and collect responses to the following general questions:
Which WAN and LAN technologies support the service?
Are QoS classes of service (CoS) currently set up?
Are the MPLS-based VPNs that are currently in use being monitored?
Are all of the critical network devices and servers manageable and being managed?
Are all elements being properly discovered and mapped down to layer 2 or layer 3 as
appropriate?
Are thresholds configured on your critical interfaces?
› Error rate?
› Discard rate?
› Load, etc?
Do any environmental monitors need to be monitored (temperature or humidity)?
Do any power systems or battery backup systems need to be monitored?
Are the critical log files or Windows event logs being monitored?
Are the critical processes or Windows services being monitored?
Are application ports being tested?
› File Transfer Protocol (FTP)?
› Hypertext Transfer Protocol (HTTP)?
› Domain Name System (DNS)?
For more advanced needs, are custom thresholds/alarms configured?
Can any model attributes be used to determine the health of a resource?
Have unique alarms been created using event or condition correlation?
Are any existing or custom integrations enabled with alarm data that can be used?
Of the IT resources listed, who is responsible for the proper operation of each?
Does each individual have access to the correct tools, and do they have their contact
information (email, phone, etc) available for distribution?
Does each of the resources have a corresponding troubleshooter, or are troubleshooters
added for proper notification?
Which users benefit from the IT resources listed?
Rate each user relative to each other:
› Low
› Medium-low
› Medium
› Medium-high
› High
Do logical groups of users exist for ordering purposes?
› Department
› Function
› Role, etc
Is the device criticality defined and/or measured for all of the network devices and
servers?
Analysis and Mapping Procedures
This section describes an approach to analyzing and mapping the results of the interview
process for service management.
How You Organize the Resource Information
Follow these best practices to organize the information that you collected for your services.
Sort the information by common information types such as the following:
› Application names
› Server names
› Device names and/or types
› Metrics measurement and sources of that data
Identify logical groupings of these common resources to avoid duplication.
How You Illustrate the Relationships of Resources to Each Other
Create a diagram that shows how the resources relate to each other. The diagram can help
you to map the way in which resources depend on or impact each other.
How You Decompose the Information and Mapping to Service
Models
The information that you gather and prepare can help you to build the service models that
you need to monitor with SPECTRUM.
Take a bottom-up approach to create the most common resource models first.
Create a service model by creating a relationship to the proper subservice models.
Add service-specific resources not available via subservices.
Example of a Business Service Map to Service Models
Customer ABC has identified a critical business process. When their clients place phone
orders, operators enter these orders into a web-based order processing system. These
orders are stored and processed from an Oracle database. Because many problems can
occur throughout this process, customer ABC wants to build a service that will indicate
when the order processing is adversely impacted. In the interview process, some of the
critical items were identified and then grouped in the following hierarchy:
Web server (WEBORDER1)
› Dell hardware, running SNMP agent (RFC 2790 or equivalent)
› Microsoft Internet Information Services (IIS) web server
› Log file with critical data flow entries
› CPU and memory need to be monitored
› APC uninterruptible power supply (UPS) battery backup
› Proper response from web server required
Oracle Database Server (WEBDB1)
› Dell hardware, running SNMP agent (RFC 2790 or equivalent)
› Oracle Database with Oracle Intelligent Agent
› CPU and memory need to be monitored
› APC UPS battery backup
Cisco 6509 Catalyst switches
› DATASW1 responsible for Server connections
› DATASW2 responsible for Operator Workstation connections
25 Operator Workstations
› DNS service monitoring is required
› Dynamic Host Configuration Protocol (DHCP) service must function
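One way to capture such a hierarchy before building models is a simple dependency map. The Python sketch below records which items depend on which resources and lists everything that a single resource failure would touch. The dependency edges are an interpretation of the hierarchy above (servers connect through DATASW1, operator workstations through DATASW2), and the function and variable names are hypothetical; none of this is a SPECTRUM API.

```python
from typing import Dict, List, Set

# Each key depends on the resources listed as its value (assumed edges).
DEPENDS_ON: Dict[str, List[str]] = {
    "Web Order Processing": ["WEBORDER1", "WEBDB1", "Operator Workstations"],
    "WEBORDER1": ["DATASW1"],
    "WEBDB1": ["DATASW1"],
    "Operator Workstations": ["DATASW2"],
}


def impacted_by(resource: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every item that directly or indirectly depends on resource."""
    impacted: Set[str] = set()
    frontier = [resource]
    while frontier:
        current = frontier.pop()
        for item, dependencies in depends_on.items():
            if current in dependencies and item not in impacted:
                impacted.add(item)
                frontier.append(item)
    return impacted


if __name__ == "__main__":
    print(sorted(impacted_by("DATASW1", DEPENDS_ON)))
```

Walking the map in this direction answers the question "what is affected if this resource fails," which is the same question the relationship diagram is meant to answer.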
By posing more questions such as the following, you can discover the criticality of items by
possible faults:
What are the most catastrophic failures that could occur?
Is it possible to measure all of the items chosen?
What would be the criticality of each item relative to every other item?
Can any items within the list be reused, or are any necessary as a generic service to
other IT business processes?
What are the processes by which you want to manage problems?
What would be more critical, losing 25% of the workstations, or losing the switch used to
connect the servers?
Start by grouping the most critical assets and the most critical outages. Most certainly, the
loss of the servers or switches would be the most catastrophic failure to occur, so begin by
grouping items as follows:
SERVICE: Web Order Processing
› Components – WEBORDER1, WEBDB1, DATASW1, DATASW2
If any component is down, service is down.
Ports on switches with server connections
If either port is down, service is down.
› Components – operator workstations
If 75% of the workstations are down, service is down.
If 50% of the workstations are down, service is degraded.
If 25% of the workstations are down, service is slightly degraded.
› Performance – web response time, TCP port for Oracle
If both are critical, service is down.
If one or the other is critical, service is degraded.
If one or the other is violated, service is slightly degraded.
› Alarm condition of all four resources
General criticality for alarm conditions (minor, major, or critical)
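The workstation rules in this grouping amount to a percentage threshold check. The following Python sketch shows one way to express them, assuming that each percentage means "at least this share of workstations is down" and that the most severe rule is checked first. The function name is hypothetical, and the sketch does not represent how SPECTRUM evaluates the rule internally.

```python
def workstation_health(down: int, total: int) -> str:
    """Map the share of workstations that are down to a service health value."""
    if total <= 0:
        raise ValueError("total must be positive")
    fraction_down = down / total
    # Check the most severe threshold first so that it takes precedence.
    if fraction_down >= 0.75:
        return "down"
    if fraction_down >= 0.50:
        return "degraded"
    if fraction_down >= 0.25:
        return "slightly degraded"
    return "up"


if __name__ == "__main__":
    # The example environment has 25 operator workstations.
    for down in (0, 7, 13, 19):
        print(down, workstation_health(down, 25))
```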
It is also necessary to determine how users are affected when these business services are
impacted. To put it simply: who is affected when your business service is impacted, and
how critical is that person? A customer who cannot access your sales website will be very
inconvenienced; therefore, that customer is very likely a critical (very important) user.
If your internal users cannot access an internal web server that is not very important for
their day-to-day tasks, assign a much lower criticality to that problem. Answer the following
questions to help ascertain the impact of your business process:
Of the server users listed, can you sort the list of users by relative importance?
Once listed, can you organize these users or customers by company, organization,
department, or role?
You also need to consider the network services. Although more general, network services
such as the following do affect the business service: DNS, DHCP, and e-mail. You can set
up response time tests and use a service to monitor the servers providing the service;
however, you should treat them slightly differently. Since these services may be a common
dependency for other services, you should create them with reuse and modularity in mind.
An example of a DNS and DHCP service might look like this:
SUBSERVICE: DNS
› Components – DNS servers SERVER-DNS1 and SERVER-DNS2
If both servers are down, the service is down.
If one server is down, service is slightly degraded.
› Response time tests – test DNS response time
If response time is violated, service is slightly degraded.
› Alarm condition of both resources
General criticality for alarm condition (minor, major, or critical)
Creating Service Models and Relationships
This section introduces service modeling concepts and techniques. Before creating service
models, you should gain an understanding of a few key concepts.
Key Concepts
Resource Monitoring Every service model is a resource monitor that actively monitors
its resources to determine its own service health. Service resources are SPECTRUM
models, and virtually any model could be a service resource. Service resources might
consist of device models, interface models, SPM tests, process models, and even other
service models. To monitor a resource, the service watches specific attributes of the
resource model. A service model can monitor any attribute whose values are whole
numbers. This behavior of a service watching the attribute values of its resources is called
resource monitoring.
Service Health Service health is represented by a small set of values: up, down,
degraded, and slightly degraded. Each resource monitor determines its own service
health based on attribute values from its resources. Specifically, a service health policy is
applied to the collective attribute values from all resources. A policy is essentially a
formula which calculates a service health value based on one or more resource attribute
values. The logic applied by the policy is encapsulated into a set of policy rules. Each rule
is a statement which, when evaluated, will be labeled as true or false. When a policy is
evaluated, the first rule that is found to be true, or the first rule satisfied, determines the
service health taken on by the service or resource monitor.
Root Cause and Service Impact Considering that a service determines its own health
by monitoring its resources, a logical relationship exists between resource outages and
service health. This relationship is expressed in terms of root cause and service impact.
When a resource outage results in a change in the health of a service, that outage is the
root cause of the service health change. Likewise, when a resource outage affects service
health, the outage has a service impact. These concepts become very important for users
who must address service outages.
Hierarchical Service Modeling As mentioned above, each service is a resource
monitor that determines its own health by applying a policy to a set of attribute values
from its resources. It is important to note that a service can monitor resources that are
actually other services. This allows for the creation of service hierarchies; thus, a user
can build services with components of other services. This allows for service modeling to
extend from very low-level fundamental services to high-level conceptual service models.
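The policy evaluation described under Service Health (the first rule found to be true determines the health) can be sketched in a few lines of Python. This is an illustration of first-match rule evaluation only. The numeric encoding of the attribute values, the two sample rules, and all names are assumptions, not SPECTRUM definitions.

```python
from typing import Callable, List, Sequence, Tuple

# A rule pairs a true/false test on attribute values with a health value.
Rule = Tuple[Callable[[Sequence[int]], bool], str]

# Hypothetical whole-number encoding of a monitored attribute.
ESTABLISHED, LOST = 0, 1


def evaluate_policy(rules: List[Rule], values: Sequence[int],
                    default: str = "up") -> str:
    """Return the health of the first satisfied rule, or the default."""
    for is_satisfied, health in rules:
        if is_satisfied(values):
            return health
    return default


# Two hypothetical rules, ordered from most to least severe.
example_policy: List[Rule] = [
    (lambda v: all(x == LOST for x in v), "down"),
    (lambda v: any(x == LOST for x in v), "degraded"),
]

if __name__ == "__main__":
    print(evaluate_policy(example_policy, [LOST, ESTABLISHED]))
```

Rule order matters in this sketch: when every value is LOST, both rules are true, and the first one decides the result.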
How You Create Service Models
The process of creating a service model is composed of two primary steps:
1. Select resources.
2. Select the policy that monitors the resources.
The following examples show how a user might create service models representing a web-
based service. For more information about creating service models, see the Service
Manager User Guide.
Example 1: A Customer Account Access Service
Determining the resources of a particular service can seem like a daunting task. In many
cases, it is not possible for you to consider all possible components of a service, and then
map how each component might impact a given service. One distinct advantage of
SPECTRUM Service Manager is that you can start with small, simple models, and continually
refine your service modeling as you gain a better understanding of the service components
and how each one impacts the overall service.
Although understanding all components of a service is difficult, it is usually easy to identify
some of the most critical components, which provides an effective starting point. For
example, consider a simple web service used by a phone support organization to access
customer account data. This service will be referred to as the Customer Account Access
Service.
With just this basic, general description, you can begin to identify some of the service
components. As this is a web-based service, it must be supported by one or more web
servers. In addition, the service is providing access to information from a database of
customer accounts. This implies that the database is likely hosted on one or more systems.
For this example, consider an environment with two web servers, and two database
servers. This provides a starting point for modeling the service. If both web servers, or both
database servers, are down, the entire service will not work; as long as one web server and
one database server is up, the service will run, even though it will likely experience some
degradation. This very simple description provides the basis for creating the Customer
Account Access Service.
To begin device modeling in SPECTRUM, consider each web server and database host to
be resource models. Monitoring the contact status of these device models will determine
if the systems are up. As mentioned in the Key Concepts section, each service is a
resource monitor.
SPECTRUM offers a basic formula for service health which provides a general
understanding of the availability of these four service resources. The following table
presents a matrix containing each component and how its status (up/down) would affect
the service relative to the status of the other resources.
Service Health Matrix Table
Web Server 1   Web Server 2   DB Server 1    DB Server 2    Service
ESTABLISHED    ESTABLISHED    ESTABLISHED    ESTABLISHED    UP
LOST           ESTABLISHED    ESTABLISHED    ESTABLISHED    SLIGHTLY DEGRADED
ESTABLISHED    LOST           ESTABLISHED    ESTABLISHED    SLIGHTLY DEGRADED
ESTABLISHED    ESTABLISHED    LOST           ESTABLISHED    SLIGHTLY DEGRADED
ESTABLISHED    ESTABLISHED    ESTABLISHED    LOST           SLIGHTLY DEGRADED
LOST           ESTABLISHED    LOST           ESTABLISHED    DEGRADED
LOST           ESTABLISHED    ESTABLISHED    LOST           DEGRADED
ESTABLISHED    LOST           LOST           ESTABLISHED    DEGRADED
ESTABLISHED    LOST           ESTABLISHED    LOST           DEGRADED
LOST           LOST           ESTABLISHED    ESTABLISHED    DOWN
LOST           ESTABLISHED    LOST           LOST           DOWN
ESTABLISHED    LOST           LOST           LOST           DOWN
ESTABLISHED    ESTABLISHED    LOST           LOST           DOWN
LOST           LOST           LOST           LOST           DOWN
This table indicates that if both web servers or both database servers are down, the service
is down. If one web server is down and one database server is down, the service is
degraded. If any one server is down, the service is slightly degraded. This is a very
simplified approach, but it demonstrates a good starting point.
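The three statements above can be written as a small function that takes one row of the matrix and returns the service health. The Python sketch below is a direct transcription of those statements for checking the table by hand; the function name is hypothetical, and it is not the formula that SPECTRUM itself applies.

```python
def service_health(web1: str, web2: str, db1: str, db2: str) -> str:
    """Return the service health for one row of contact status values."""
    web_lost = [web1, web2].count("LOST")
    db_lost = [db1, db2].count("LOST")
    # Both web servers or both database servers down: service is down.
    if web_lost == 2 or db_lost == 2:
        return "DOWN"
    # One web server and one database server down: service is degraded.
    if web_lost == 1 and db_lost == 1:
        return "DEGRADED"
    # Any one server down: service is slightly degraded.
    if web_lost == 1 or db_lost == 1:
        return "SLIGHTLY DEGRADED"
    return "UP"


if __name__ == "__main__":
    print(service_health("LOST", "ESTABLISHED", "LOST", "ESTABLISHED"))
```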
From this, you can consider how to monitor each component. The table shows that
particular combinations of status values result in specific levels of service degradation.
Essentially, you can classify the resources into web server components and database
components, and think of the grouped resources as services within a service. To enable the
Customer Account Access Service to function, the web server components and database
server components must be functioning. Within the Customer Account Access Service,
small, more discrete, subservices may exist.
Considering that each service is also a resource monitor, this example is a good case for
creating resource monitors within the Customer Account Access Service, as shown below.
Resource monitors allow you to organize resources, and monitor them based on specific
criteria with knowledge of how they will impact the service. The resource monitor becomes an
abstraction of multiple resources, and reports a health value based on the collective status
of the resources it monitors. That knowledge is the basis for a service resource monitoring
policy.
In this example, it was established that the contact status of each device model can be
monitored to determine its availability. In addition, the table indicated how different
combinations of contact status values impact the service. Looking first at the web servers, a
policy can be produced which will adequately report the status of the web server
components as a whole. These statements can be called the web servers redundancy
policy.
WEB SERVERS REDUNDANCY POLICY
When the contact status of all web servers is lost, the web server component of the
service is down.
When the contact status of any one web server is lost, the web server component of the
service is degraded.
The web components and database components are described as services within a service.
That concept is important when dealing with groups of resources that support a specific
aspect of a service. In all cases, if both web server machines are down, the Customer
Account Access Service is down. However, just one web server down does not necessarily
indicate that the Customer Account Access Service is down, or even degraded. You can
think of the web servers, collectively, as a component of the service in that when one of
those servers is down, that component of the service is degraded. This might not be
completely clear yet, but as the service model evolves, it will become apparent why this
approach should be taken.
The impact of loss of contact with the database servers mirrors that of the web servers.
These statements can be referred to as the database servers redundancy policy.
DATABASE SERVERS REDUNDANCY POLICY
When the contact status of all database servers is lost, the database component of the
service is down.
When the contact status of any one database server is lost, the database component of
the service is degraded.
The web servers and database servers have been described collectively as a web server
component and a database component. Consider how the web server and database server
components impact the Customer Account Access Service.
If both web servers — or both database servers — are down, the service is down. These
were organized into two groups: a web server component and a database component.
These can be labeled as the Web Servers resource monitor and the Database Servers
resource monitor. In review, each resource monitor determines its own health value, based
on the resources that it is monitoring. The Web Servers resource monitor determines its
health based on the contact status of both web server 1 and web server 2. The Database
Servers resource monitor determines its health based on the contact status of database
server 1 and database server 2.
With the web server and database server systems encapsulated into resource monitors,
consider these statements, which will be called the Standard Account Access Policy.
STANDARD ACCOUNT ACCESS POLICY
When any resource monitor is down, the Customer Account Access Service is down.
When all resource monitors are degraded, the Customer Account Access Service is
degraded.
When any one resource monitor is degraded, the Customer Account Access Service is
slightly degraded.
Although redundancy exists within each resource monitor, if either resource monitor is
down, the overall service is down. Looking back at the table of Contact Status and Service
Health values, this design can be validated. You can use the following three scenarios to
test the design.
DESIGN TEST SCENARIOS
Web Server 1 is down. This will cause the web server resource monitor to become
degraded; the database servers are not affected, so the database servers resource
monitor is up. To apply the rules defined in the account access policy:
› The first rule is not satisfied because neither of the resource monitors is down.
› The second rule is not satisfied because the Database Server resource monitor is
up, and not degraded.
› The third rule, however, is satisfied, because the Web Server resource monitor is
degraded.
In this scenario, the Customer Account Access Service will report slightly degraded.
Looking back at the matrix, when web server 1 was down and all other devices were up,
the overall service health should be considered slightly degraded. The design works for
this scenario.
Web Server 1 is down and Database Server 1 is down. Based on the implementation
described above, this would result in both the Web Servers resource monitor and the
Database Servers resource monitor becoming degraded. By evaluating the account access
policy, the second rule is satisfied and the Customer Account Access Service will be
degraded. By reviewing the matrix, when web server 1 and database server 1 are both
down, the overall service health should be degraded, so, again, this design works
correctly.
Database Server 1 and Database Server 2 are down. In this case, the Database Servers
resource monitor would be down. Under the Standard Account Access Policy, the first
rule is satisfied; thus, it produces a result of down. By reviewing the matrix, when both
database servers 1 and 2 are down, the overall health of the service should be down.
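The rule evaluation exercised by these three scenarios can be sketched in Python. This is a minimal illustration, not SPECTRUM code: the function names and string constants are hypothetical, and the sketch assumes that the rules in a rule set are evaluated in order and that the first satisfied rule determines the result.

```python
UP, SLIGHTLY_DEGRADED, DEGRADED, DOWN = "up", "slightly degraded", "degraded", "down"


def redundancy_policy(device_up):
    """Resource monitor health from a list of booleans (True = device is up).

    Assumed rules: all devices down -> down; any device down -> degraded.
    """
    if not any(device_up):
        return DOWN
    if not all(device_up):
        return DEGRADED
    return UP


def account_access_policy(monitor_health):
    """Service health from resource monitor healths; first matching rule wins."""
    if any(h == DOWN for h in monitor_health):
        return DOWN
    if all(h == DEGRADED for h in monitor_health):
        return DEGRADED
    if any(h == DEGRADED for h in monitor_health):
        return SLIGHTLY_DEGRADED
    return UP


def service_health(web_up, db_up):
    """Two-tier evaluation: devices -> resource monitors -> service."""
    return account_access_policy(
        [redundancy_policy(web_up), redundancy_policy(db_up)]
    )


# The three design test scenarios (True = up, False = down).
scenario_1 = service_health([False, True], [True, True])    # Web Server 1 down
scenario_2 = service_health([False, True], [False, True])   # Web 1 and DB 1 down
scenario_3 = service_health([True, True], [False, False])   # DB 1 and DB 2 down
```

Under these assumptions, the three scenarios evaluate to slightly degraded, degraded, and down, matching the results described above.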
Although it is a very simple example, this process has identified the resources of a service
and how to monitor them. Despite its simplicity, this implementation provides the
knowledge to correctly report the health of the Customer Account Access Service for
thirteen different fault scenarios involving the systems that host the web server and
database server applications. This implementation is not yet very robust, however, because
it monitors the four systems only as up or down. Before extending the Customer
Account Access Service, review the following steps to implement this design using
SPECTRUM Service Manager.
IMPLEMENT EXAMPLE 1 IN SPECTRUM
You create Service Models in SPECTRUM using the Service Editor, which you launch from
the Tools, Utilities menu of the OneClick Console.
To start building the service model
1. In the OneClick Console, select Tools, Utilities, Service Editor.
2. Click Create.
3. Specify the policy name Web Server Contact Monitor, and a description and security
string.
4. Click the Locate resources and containers button (binoculars).
5. In the left pane of the Locate Resources dialog, click Devices, Devices, By Model Name
(or By IP Address). Launch the selected search (binoculars). Specify search criteria
(leading and trailing wildcards are implicit for model name) and click OK.
6. In the right pane, select all server models that you would like to associate with this
service model.
7. Click Add Selected to Monitored Resources.
8. Click Close.
9. Click Select to display the Select Policy dialog. The resource monitor will use the Web
Servers Redundancy Policy described previously in this chapter.
10. In the left pane, select Contact Status as the Value Map.
11. Click New in the Rule Set, and name the rule set Web Server Redundancy Rules.
12. Click Add to create the first rule: Rule Type All, When all are Down, the service is
Down.
13. Click OK.
14. Click Add to create the second rule: Rule Type Any, When any 1 are Down, the service
is Degraded.
15. Click OK.
16. Click Create in the Create Rule Set dialog.
17. Click OK.
18. Click Create in the Create Service dialog.
19. Repeat Steps 2 through 10 to start creating the Database Server Contact Monitor.
Note: This policy will be identical to the Web Server Redundancy Rules.
20. In the right pane, select Web Server Redundancy Rules and click Copy.
21. Name the new rule set Database Server Redundancy Rules.
22. Click Create.
23. Click OK to close the Select Policy dialog.
24. Click Create in the Create Service dialog.
25. Click Create to start creating the top-level service (Customer Account Access Service)
of the hierarchical structure.
26. Specify the service name Customer Account Access Service, and, optionally, a
description and security string.
27. Click the binoculars, and then click Locator, Services, Services, All. Launch the selected
search (binoculars).
28. Select the Landscape, if it appears.
29. Select the Web Server Contact Monitor and Database Server Contact Monitor Services.
30. Click Add Selected to Monitored Resources.
31. Click Close.
32. Click Select to display the Select Policy dialog.
33. In the left pane, select Service Health as the Value Map.
34. Click New in the Rule Set, and name the rule set Standard Account Access Policy, which
is based on the following set of rules:
› Rule Type Any: When any 1 are Down, the service is Down.
› Rule Type All: When all are Degraded, the service is Degraded.
› Rule Type Any: When any 1 are Degraded, the service is Slightly Degraded.
35. Click Create in the Rule Set dialog.
36. Click OK.
37. Click Create in the Create Service dialog.
38. Close the window.
REVIEW EXAMPLE 1
The design for Example 1 includes one Service monitoring two Resource Monitors. This is a
two-tiered approach in which each Resource Monitor consolidates the status of its own
resources, and then reports the result as its service health. The Customer Account Access
Service then determines its own service health, based on the collective service health of the
two resource monitors. This pattern encompasses an important abstraction that is essential
to understanding service management.
Each service and resource monitor performs two tasks:
Monitors the “resources” to which it is related.
Determines its own service health by applying values from those resources to a policy.
Consider these questions regarding the implementation of Example 1:
Does the Customer Account Access Service have any knowledge of database server 2?
The three test scenarios do not mention the Customer Account Access Service monitoring
database server 2; however, in scenario 3, when both database servers 1 and 2 are down,
the Customer Account Access Service correctly determined that its service health should be
down.
How did it work?
The Database Servers Resource Monitor determined that its own health was down. The
Customer Account Access Service, which monitors the Web Servers Resource Monitor and
the Database Servers Resource Monitor, determined that it, too, should be down. When
evaluating its Account Access Policy, it found that one of the resource monitors was down
and, therefore, its own health should be down. Database servers 1 and 2 are “resources” of
the Database Servers Resource Monitor, and the Database Servers Resource Monitor is a
“resource” of the Customer Account Access Service. Each component determines its own
health based on its resources.
Example 2: Extend the Service to Monitor Critical Processes
Example 1 describes how to design and implement a very simple service using two resource
monitors. Although this is a legitimate service, it is not a very complete one. In revisiting
the Customer Account Access Service, you could expand the monitoring of service
components in several ways. So far, only the Contact Status of those devices hosting the
web servers and database servers has been incorporated into the service. Device
availability alone does not ensure that you will be able to obtain customer account
information.
You also need to consider that a web server is an application that supports web
transactions. This application must be running for customer account access
requests to be processed. Considering the criticality of these web server systems, it is
logical that they will also host an agent that supports process monitoring or a host
information MIB, such as the Host Resources MIB defined by RFC 2790. This allows a
user to monitor the web server process itself.
You can use a process model to determine if a particular process is actually running on a
device. Because the web server system might be up while the web server application
is not running, additional monitoring of the web server application processes is
important to correctly determine the overall health of the Customer Account Access Service.
At first, it might appear simple enough to just add another resource monitor to watch the
Condition of the web server process model and treat the availability of each process
redundantly, in the same way as the device availability is monitored. Consider the following
table, which shows a breakdown of potential fault scenarios and how each combination
affects the availability of the web servers in terms of being able to process a request. This
table demonstrates what is often called a “high-sensitivity policy.”
Service Health Matrix – Servers and Processes
Web Server 1   Web Server 2   Process 1   Process 2   Web Service Health
ESTABLISHED    ESTABLISHED    NORMAL      NORMAL      UP
ESTABLISHED    ESTABLISHED    CRITICAL    NORMAL      DEGRADED
ESTABLISHED    ESTABLISHED    NORMAL      CRITICAL    DEGRADED
ESTABLISHED    ESTABLISHED    CRITICAL    CRITICAL    DOWN
LOST           ESTABLISHED    CRITICAL    NORMAL      DEGRADED
LOST           ESTABLISHED    CRITICAL    CRITICAL    DOWN
LOST           LOST           CRITICAL    CRITICAL    DOWN
The following table replaces the individual devices and processes with the resource
monitors that could be used to monitor them.
Service Health Matrix – Devices and Processes
SERVER DEVICES   SERVER PROCESSES   WEB SERVICE
UP               UP                 UP
UP               DEGRADED           DEGRADED
UP               DOWN               DOWN
DEGRADED         DEGRADED           DEGRADED
DEGRADED         DOWN               DOWN
DOWN             DOWN               DOWN
DEGRADED         UP                 DEGRADED
DOWN             UP                 DOWN
DOWN             DEGRADED           DOWN
Note: The three rows at the end of this table typically would not happen since a system
that is reported as down should not report that it has running processes. However, the
rules should handle these situations to avoid the possibility of getting into unknown states.
Looking at the devices and processes collectively, the table indicates that this is not a case
for a redundancy policy, which appeared to be the first choice when evaluating the
resources. As in the case of the overall service, the relationship between the web server
hosts and the web server processes implies a high-sensitivity rule set similar to this one:
When any resource is down, the service is down.
When any resource is degraded, the service is degraded.
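As a quick check, the two-rule high-sensitivity rule set can be compared against every row of the Devices and Processes matrix. The following Python sketch is illustrative only; the function name and the representation of the matrix as tuples are assumptions.

```python
UP, DEGRADED, DOWN = "UP", "DEGRADED", "DOWN"


def high_sensitivity(resource_health):
    """Any resource down -> DOWN; otherwise any degraded -> DEGRADED; else UP."""
    if DOWN in resource_health:
        return DOWN
    if DEGRADED in resource_health:
        return DEGRADED
    return UP


# (server devices, server processes, expected web service health),
# transcribed from the Devices and Processes matrix.
MATRIX = [
    (UP, UP, UP),
    (UP, DEGRADED, DEGRADED),
    (UP, DOWN, DOWN),
    (DEGRADED, DEGRADED, DEGRADED),
    (DEGRADED, DOWN, DOWN),
    (DOWN, DOWN, DOWN),
    (DEGRADED, UP, DEGRADED),
    (DOWN, UP, DOWN),
    (DOWN, DEGRADED, DOWN),
]

# Rows where the rule set disagrees with the matrix (expected to be empty).
mismatches = [row for row in MATRIX if high_sensitivity(row[:2]) != row[2]]
```

Because the rule set covers every combination, including the three unlikely rows at the end of the matrix, the service never lands in an unknown state.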
After evaluating this relationship between the server devices and processes, it would seem
that you cannot easily extend the design in Example 1 to include supplemental monitoring
such as including the new process resource models. Because the initial design tried to
encompass a high level service with multiple components, it did not recognize that there
are subservices within the Customer Account Access Service. After extending the
monitoring to the process level, it becomes apparent that a web subservice and a database
subservice do exist. As with the web servers, you can monitor the database server
application using process models. A hierarchy begins to appear as the resource
monitoring is extended to the process level.
Service Hierarchy
CUSTOMER ACCOUNT ACCESS SERVICE
    WEB SERVICE
        DEVICES
        PROCESSES
    DATABASE SERVICE
        DEVICES
        PROCESSES
It is very typical to discover lower level services which at first did not appear to be
significant enough to warrant a service model. In general, the service modeling process is
an iterative process. Each revision adds additional precision and extends the total number
of fault scenarios that can be correctly reported.
This iterative approach can be summarized in different ways. One way is to consider that
the goal of each revision is to enrich the root cause information which will be available in
the event of a service fault. Looking back to Example 1, if both web server devices were
available, but one web server process was down, the service would not have reported a
fault although service users would have experienced some performance degradation. By
extending the monitoring to the process level, the service would now report the
degradation and the process failure as the root cause. The next section shows how Example
2 can be implemented in SPECTRUM.
Implement Example 2 in SPECTRUM
The design for Example 2 includes the creation of four process models. Two of these
process models will monitor the web server application and the other two will monitor the
database server application. It is likely that a user may identify additional processes which
impact the availability of a particular service component. This approach can be extended to
include those processes as well.
To create the process models, you should locate the host model representing the server
machine on which the process is running. In the example, this would be a web server or
database server device model. If the agent on the device supports RFC 2790, you can
create process models for each process that you want to monitor.
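RFC 2790 defines the Host Resources MIB, whose hrSWRunTable lists the software running on a host, with process names in the hrSWRunName column. The following Python sketch shows the decision a process model makes once those names are available. It is a simplified illustration: the process snapshots are hard-coded rather than retrieved through SNMP, and the function name and host data are hypothetical.

```python
def process_condition(monitored_name, running_names):
    """Return 'NORMAL' if the monitored process appears in the snapshot of
    running process names, otherwise 'CRITICAL' (the process has stopped)."""
    return "NORMAL" if monitored_name in running_names else "CRITICAL"


# Hypothetical snapshots of process names, as an agent's hrSWRunName column
# might report them for two web server hosts.
web_server_1 = ["init", "sshd", "httpd"]
web_server_2 = ["init", "sshd"]  # httpd has stopped on this host

conditions = {
    "Web Server 1": process_condition("httpd", web_server_1),
    "Web Server 2": process_condition("httpd", web_server_2),
}
```

The NORMAL and CRITICAL values correspond to the process conditions used in the Servers and Processes matrix.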
To create a process model for each process
1. In the SPECTRUM OneClick console, list the host or hosts in the OneClick Contents
panel.
2. Select the host for which you want to create a monitoring rule.
3. Expand the System Resources section within the OneClick Component Detail view. A
subsection named Running and Monitored Processes appears.
4. Expand the Running and Monitored Processes view to show a section for Running
Processes, which, in turn, reveals a table of processes.
Note: If the text (RFC 2790) does not appear in the section names, the agent does not
support the RFC 2790 extension to MIB-II. You will not be able to monitor processes on
that host and raise alarms when processes start or stop.
a. Right-click a process in the table and select Monitor this process. The Add
Monitored Process dialog appears.
b. Select Alarm on Stop and click OK. Using this setting, the process model will
experience a critical alarm if the corresponding process is stopped. The process
appears in the Monitored Processes view.
5. After creating the appropriate process models, launch the Service Editor by selecting
Tools, Utilities, Service Editor, or by right-clicking a process and selecting Utilities,
Service Editor. The goal is to modify the service that was created in Example 1 to
handle this more complex situation. The steps are similar to those outlined for
Example 1, but the hierarchy is deeper: you add a new middle layer and logic for
the processes.
› Using the Condition value map and Redundancy rule set policy, create the service
Web Servers Redundancy Monitor, which watches the web server process
models.
› Using the Service Health High Sensitivity policy, create the Web Service, which
watches the Web Servers Contact Monitor and the Web Servers Redundancy
Monitor. This will require the reparenting of the Web Servers Contact Monitor
from the Customer Account Access Service to this service.
› Duplicate these tasks for the Database Server Redundancy Monitor and the
Database Service.
The Customer Account Access Service will now monitor the Web Service and Database
Service with the Standard Account Access Policy described in the implementation of
Example 1.
REVIEW EXAMPLE 2
Example 2 expanded the monitoring of the Customer Account Access Service to include
monitoring the actual web server process. This example also reveals that two distinct
subservices exist within the Customer Account Access Service. Each of these subservices
consists of multiple resources which are monitored in different ways, as shown in the
Service Editor Hierarchy view below.
The table below displays the ever-increasing set of fault scenarios which can be supported
by the existing service modeling.
Service Health Matrix Fault Scenarios
Legend:
WSD Web server device
WSP Web server process
DBD Database device
DBP Database process
CAAS Customer Account Access Service
DG Degraded service health
SD Slightly degraded service health
DN Down service health
WSD1 WSD2 WSP1 WSP2 DBD1 DBD2 DBP1 DBP2 CAAS
UP UP UP UP UP UP UP UP UP
UP DN UP DN UP UP UP UP SD
DN UP DN UP UP UP UP UP SD
UP UP DN UP UP UP UP UP SD
UP UP UP DN UP UP UP UP SD
UP UP UP UP DN UP DN UP SD
UP UP UP UP UP DN UP DN SD
UP UP UP UP UP UP DN UP SD
UP UP UP UP UP UP UP DN SD
DN UP DN UP DN UP DN UP DG
DN UP DN UP UP DN UP DN DG
UP DN UP DN DN UP DN UP DG
UP DN UP DN UP DN UP DN DG
UP UP DN UP UP UP DN UP DG
UP UP DN UP UP UP UP DN DG
UP UP UP DN UP UP DN UP DG
UP UP UP DN UP UP UP DN DG
DN DN DN DN UP UP UP UP DN
DN DN DN DN DN UP DN UP DN
DN DN DN DN UP DN UP DN DN
UP UP DN DN UP UP UP UP DN
UP UP DN DN DN UP DN UP DN
UP UP DN DN UP DN UP DN DN
UP UP DN DN UP UP DN UP DN
UP UP DN DN UP UP DN DN DN
DN DN DN DN DN DN DN DN DN
The table indicates 25 different fault scenarios that can be reported with the
implementation of Example 2. Note the scenario in which WSP1, WSP2, DBP1, and DBP2
are all down while every device is up. In this scenario, all critical processes have
failed. The service is down, but the implementation of Example 1 would not have
reported it as down.
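The Example 2 hierarchy can be checked against the fault scenario table with a short Python sketch. The code is an illustration under stated assumptions rather than SPECTRUM logic: the function names are hypothetical, rules are assumed to be evaluated in order with the first match winning, and only a sample of rows from the table is transcribed.

```python
UP, SD, DG, DN = "UP", "SD", "DG", "DN"


def redundancy(states):
    """All resources down -> DN; any resource down -> DG; otherwise UP."""
    if all(s == DN for s in states):
        return DN
    if any(s == DN for s in states):
        return DG
    return UP


def high_sensitivity(states):
    """Any resource down -> DN; any resource degraded -> DG; otherwise UP."""
    if DN in states:
        return DN
    if DG in states:
        return DG
    return UP


def account_access(states):
    """Any down -> DN; all degraded -> DG; any degraded -> SD; otherwise UP."""
    if DN in states:
        return DN
    if all(s == DG for s in states):
        return DG
    if DG in states:
        return SD
    return UP


def subservice(devices, processes):
    """A Web Service or Database Service: two redundancy monitors combined."""
    return high_sensitivity([redundancy(devices), redundancy(processes)])


def caas(row):
    """row = [WSD1, WSD2, WSP1, WSP2, DBD1, DBD2, DBP1, DBP2]."""
    web = subservice(row[0:2], row[2:4])
    database = subservice(row[4:6], row[6:8])
    return account_access([web, database])


# Sample rows transcribed from the table; the last column is the expected CAAS.
ROWS = """
UP UP UP UP UP UP UP UP UP
UP DN UP DN UP UP UP UP SD
UP UP UP UP UP UP DN UP SD
DN UP DN UP DN UP DN UP DG
UP UP DN UP UP UP UP DN DG
DN DN DN DN UP UP UP UP DN
UP UP DN DN UP UP DN DN DN
"""

# Pairs of (computed health, expected health) for each sample row.
results = []
for line in ROWS.split("\n"):
    cells = line.split()
    if cells:
        results.append((caas(cells[:8]), cells[8]))
```

The last sample row is the case in which every device is up but every process is down. The device-only design of Example 1 has no way to see that fault.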
Example 3: Extend the Service to Include a Response Time Element
Example 2 enhanced the Customer Account Access service by extending visibility to the
process level. In some situations, the devices may be up and the processes are running,
but the service is not performing optimally. It is often useful to include some level of
performance monitoring as a resource of service components. This is particularly important
when the service health is intended to reflect what an end user is experiencing when using
a service.
In this example, you add a response time element to the Web Service component of the
Customer Account Access Service. Adding the performance element will not only enhance
the service monitoring, it will also test the modularity of the design produced in Example 2.
One goal of service design should be to produce services that you can easily enhance as
you gain more insight into how each service resource can be monitored.
Adding the response time component involves creating Response Time Test models in
SPECTRUM. Many devices and system agents are capable of supporting response time
tests. Since this example is intended to enhance the monitoring of the Web Service
component, you will be creating HTTP response time tests.
The number of tests can vary based on your design. It is generally a good idea to build at
least one HTTP test for each web server. For example, you could select two SPM test
hosts and create two HTTP tests on each, so that each test host issues requests to each web
server. This would provide multiple request points to each individual server. The four new
response time tests will collectively comprise a new set of resources within the Web
Service.
You can take two typical approaches to monitoring response time tests:
Monitor the latest error status of each response time test model.
Monitor the aggregate result values of each test model.
The second approach is discussed in more detail later. For this example, you will monitor
the latest error status of each test model. The following table maps Latest Error Status
(Response Time) values to equivalent service health values. This process is used
extensively by the SPECTRUM Service Manager. The goal is to normalize pure attribute
values to comparable service health values which can easily be applied to various rule sets.
Service Health Matrix – Response Tests
Response Time Value   Equivalent Service Health
OK                    UP
TIMEOUT               CRITICAL
THRESHOLD CRITICAL    CRITICAL
THRESHOLD MAJOR       DEGRADED
THRESHOLD MINOR       SLIGHTLY DEGRADED
Under some circumstances, documentation might indicate acceptable response time levels.
If this is not the case, a useful approach is to create response time tests without thresholds
and review the latency results over a period of time. This will help you to establish baseline
threshold values to ensure that an unusual latency value would result in a threshold
violation.
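One possible way to turn a period of observed latency into baseline thresholds is sketched below in Python. The choice of the 95th percentile and the multipliers for the minor, major, and critical levels are assumptions made for illustration; they are not SPECTRUM defaults, and the sample data is hypothetical.

```python
import statistics


def baseline_thresholds(latencies_ms, minor=1.5, major=2.0, critical=4.0):
    """Derive threshold values (ms) as multiples of the 95th percentile.

    The 95th percentile and the multipliers are illustrative assumptions;
    tune them against what users of the service consider acceptable.
    """
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    p95 = statistics.quantiles(latencies_ms, n=20)[-1]
    return {
        "p95": p95,
        "minor": p95 * minor,
        "major": p95 * major,
        "critical": p95 * critical,
    }


# Hypothetical latency samples (ms) collected from a test with no thresholds.
samples = [110, 120, 115, 130, 125, 118, 122, 140, 119, 121,
           117, 123, 128, 116, 124, 135, 113, 126, 120, 129]

thresholds = baseline_thresholds(samples)
```

Anchoring the thresholds to a high percentile rather than to the mean keeps occasional slow responses from triggering violations while still flagging latency that is unusual for the service.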
In Example 2, the Web Service was developed to include a resource monitor for the contact
status of the web server devices and a resource monitor for the condition of the web server
process models. It may be possible to extend the monitoring of the Web Service to include
the response time component by simply adding a third resource monitor which monitors the
response time test models.
When monitoring these response time test models, the following rule set might be
appropriate:
When all resources are down, the service is down.
When any one resource is down, the service is degraded.
When all resources are degraded, the service is degraded.
Consider how this rule set would apply to a set of response time tests as described above.
If all response time tests experienced a timeout or critical threshold violation, it would
indicate that neither web server was capable of responding. Clearly, this is a critical
scenario and should indicate a down service health.
If any one response time test timed out or violated a critical threshold, it would indicate
that one of the web servers was impacted to such a degree that it could not adequately
handle requests. Considering that some of the other response time tests are succeeding,
it can be surmised that the service is not entirely down, but it is degraded.
If none of the tests were timing out or violating a critical threshold, but all were violating
a major threshold, you could assume that the service health is degraded.
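The mapping table and the three-rule set can be combined in a short Python sketch. It is illustrative only and rests on assumptions: a CRITICAL equivalent health is treated as "down" for the purposes of the rules, as the explanations above suggest, and any combination the three rules do not cover falls through to UP. The function name is hypothetical.

```python
# Latest Error Status -> equivalent health, following the Response Tests table.
STATUS_TO_HEALTH = {
    "OK": "UP",
    "TIMEOUT": "CRITICAL",
    "THRESHOLD CRITICAL": "CRITICAL",
    "THRESHOLD MAJOR": "DEGRADED",
    "THRESHOLD MINOR": "SLIGHTLY DEGRADED",
}


def response_monitor_health(latest_statuses):
    """Apply the three rules in order; CRITICAL is treated as 'down' here."""
    health = [STATUS_TO_HEALTH[s] for s in latest_statuses]
    if all(h == "CRITICAL" for h in health):
        return "DOWN"
    if any(h == "CRITICAL" for h in health):
        return "DEGRADED"
    if all(h == "DEGRADED" for h in health):
        return "DEGRADED"
    # Assumption: combinations not covered by the rule set report UP.
    return "UP"


all_failing = response_monitor_health(["TIMEOUT"] * 4)
one_failing = response_monitor_health(["THRESHOLD CRITICAL", "OK", "OK", "OK"])
all_slow = response_monitor_health(["THRESHOLD MAJOR"] * 4)
```

With four tests, these three calls correspond to the three situations described above and yield down, degraded, and degraded respectively.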
Based on the configuration established above, you could enhance the web service by adding
a new resource monitor for the response time tests. The web service component would
function correctly under any of the scenarios described above. By monitoring its resources
with the Service Health High Sensitivity policy, any resource monitor that is down would
cause the Web Service to go down. Likewise, any resource that is degraded would cause
the Web Service to also degrade. It turns out that the design produced in Example 2 can
easily be extended to include the response time element. The following is an example of
what the service hierarchy would look like after the addition of the response time
component to the Web Service.
CUSTOMER ACCOUNT ACCESS SERVICE
    WEB SERVICE
        DEVICES
        PROCESSES
        RESPONSE TIME
    DATABASE SERVICE
        DEVICES
        PROCESSES
IMPLEMENT EXAMPLE 3 IN SPECTRUM
For this example, you will create four HTTP response time tests. You can locate response
time test hosts using the Locator tab in the OneClick Console. The Locator menu has a set
of pre-configured SPM Searches.
Note: To run an HTTP test, you must discover test sources such as SystemEDGE Service
Availability agents, Cisco IP SLA-enabled routers, and Network Harmoni agents using read-
write community strings. For details about response testing and supported agents, refer to
the Service Performance Manager User Guide.
To create response time tests
1. Use the All Test Hosts search to locate test host models that can measure HTTP
response time to the web servers. (In the Contents panel, expand SPM Searches and
Test Hosts By; then right-click All Test Hosts and select Launch the selected search.)
2. From each designated test host, create new HTTP tests: right-click the host in the
table, choose New Test, and then select HTTP.
3. Specify the threshold data. Configure the thresholds to ensure that a critical threshold
is generated when the response time is too slow to be usable, and a major threshold is
generated when response time is usable but very slow. Add the destination for the test,
which would be one of the web server hosts.
4. To add the response time tests, use the Service Editor to add a new resource monitor,
Web Server Response Monitor, which uses the Response Time High Sensitivity policy.
The resources can be located by expanding SPM Searches. You can then add the four
response time tests to the new resource monitor. Finally, attach this new resource to
the Web Service.
REVIEW EXAMPLE 3
Example 3 describes how you can extend an existing service implementation to include
more sophisticated resource monitoring without altering the service hierarchy. This
flexibility in the service hierarchy design makes it very easy for users to continually
enhance their service models. In addition, Example 3 outlines how to incorporate a
response time component within a service to greatly enhance the accuracy of service health
reporting. Again, this iteration expanded the set of supported fault scenarios and enriched the
set of potential root causes of service impact.
Create SLAs
This section introduces SLA modeling concepts and techniques that you should understand
before modeling SLAs.
Note: The following sections up to and including Example 4 provide instructions for tracking
SLAs based on business hours. This functionality is included in SPECTRUM Release 8.1. For a
complete description of the available capabilities, see the Service Performance Manager
User Guide.
Key Concepts
SLA Periods: SLAs consist of a set of service level objectives or guarantees that are
measured for a given period of time. Commonly, this period of time coincides with a well-
defined billing cycle or a reporting cycle. Frequently, an SLA period will be monthly (that
is, the compliance of an SLA is evaluated on a month-to-month basis). Typically, the
compliance or violation of an SLA will be expressed in terms of a particular period. For
example, you might consider an SLA compliant for the month of January. If the period
was weekly, you might consider an SLA violated for the week of November 5-11.
SLA Guarantees or Service Level Objectives: Among other stipulations, an SLA will
include a set of guarantees or service level objectives. In particular, many of these
guarantees relate to the availability and performance of a particular service or set of
services. In typical service provider environments, SLAs often state very specific
guarantees. Users may find stipulations similar to the following: “… certifies uptime at
99.9% monthly…” or “…will credit the customer 1/30th of the monthly service fee in the
event that the customer reports a service outage of 30 minutes or more…” These
statements represent guarantees given by the provider of a particular service. Within the
enterprise environment, SLAs also exist, although an enterprise SLA may be less formal.
It is very common to find SLAs such as “…the IT department guarantees no more than 30
minutes of web access down time per week…” In either case, it is these guarantees or
service level objectives which provide the basis for determining SLA compliance with
SPECTRUM.
Active SLA Monitoring: Unlike other SLA management products, SPECTRUM Service
Manager provides active SLA management. This means that within a given period, you
are able to determine the status of the SLA for that period. Based on outage trends, you
are provided a projected status for the overall period. At the beginning of each SLA
period, an SLA is considered unaffected. The unaffected status will persist until some
form of outage causes the SLA to record outage time for the period. An SLA which has
recorded outage time, but is not at a significant risk for a violation, is considered to have
a compliant status.
If additional outage time occurs within the period and the outage time accumulates to
levels where the SLA is approaching violation, the SLA will transition to a status of
warned. If outage time for the period continues and specific guarantee thresholds are
reached, the SLA will transition to a status of violated. These transitions from unaffected
to violated happen in real time. If the SLA period is monthly, and the SLA is violated
on the fifth day, you will be aware as soon as the violation occurs rather than waiting
for a report at the end of the period. Consequently, this active SLA
monitoring allows service providers to take action before the SLA becomes violated.
SLA Time of Enforcement or Business Hours: Within the SLA period, a particular
guarantee is frequently enforced only at certain times. The SLA may contain statements such as
“…guarantees no outage exceeding 30 minutes between the hours of 8AM and 5PM on
Monday through Friday…” A statement such as this is commonly known as a business-
hour guarantee. Sometimes, multiple guarantees will be based on particular timeframes.
For example, “…guarantees 97.5% availability on a 7x24 basis, with 99.9% availability
between the hours of 8AM to 5PM Monday-Friday, and 8AM to 12 PM on Saturday…”
Although the same service is being measured, this statement actually includes two
guarantees: one guarantee with a 7x24 timeframe, and a second guarantee for specific
hours during the week.
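The status progression described under Active SLA Monitoring can be sketched as a function of the outage time accumulated so far in the period. This Python sketch is a simplified illustration: the function name is hypothetical, and the 80 percent warning level is an assumption, since the text says only that an SLA becomes warned when it is approaching violation.

```python
def sla_status(outage_minutes, allowed_minutes, warn_fraction=0.8):
    """Classify an SLA period from the outage time accumulated so far.

    warn_fraction is a hypothetical setting: the share of the allowed
    outage time at which the SLA is treated as approaching violation.
    """
    if outage_minutes <= 0:
        return "unaffected"
    if outage_minutes > allowed_minutes:
        return "violated"
    if outage_minutes >= warn_fraction * allowed_minutes:
        return "warned"
    return "compliant"


# Example guarantee from the text: no more than 30 minutes of downtime per week.
history = [sla_status(m, allowed_minutes=30) for m in (0, 10, 25, 31)]
```

Evaluating the function each time outage is recorded, rather than once at the end of the period, is what makes the warned status actionable.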
Create SLAs and Guarantees
The first step in creating an SLA is to understand the particular service with which it is
associated and the period during which the SLA will be in effect. The service modeling
hierarchy often has some top-level service model which is logically associated to an SLA.
For example, in the service provider environment, a high-level service such as Customer A
High Speed Data may exist.
Logically, the SLA is a binding of the high-speed data service which is being provided to
Customer A. The particular period may be stipulated in an SLA document or may be
determined arbitrarily, but it must be a timeframe which is agreed upon by both the service
provider and service customer. Monthly SLA periods are very common as they frequently
coincide with a service billing cycle. For example, an SLA period may be in effect from the
first of the month with guarantees based on availability and performance for that month.
Commonly, an SLA will specify restitution guarantees if a customer contacts the service
provider regarding a dispute within a certain number of days from the end of a given
period.
Once the top level service and SLA period have been determined, the user should identify
the SLA guarantees or service level objectives related to the availability and performance of
the service being provided. Often, you can find these guarantees within the SLA document
among other stipulations that are not within the scope of measuring the availability or
performance of a service. You should look for those statements which specify a level of
availability, a guaranteed response time, acceptable level of latency, and so on. In addition,
you should determine if those statements are accompanied by statements that dictate
specific times within the SLA period as to when they are guaranteed.
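When a guarantee applies only during specific hours, only the part of an outage that falls inside those hours counts against it. The following Python sketch computes that overlap for a window of 8AM to 5PM, Monday through Friday. The function name and the sample outage are hypothetical, and the sketch ignores time zones and holidays.

```python
from datetime import datetime, time, timedelta


def business_outage_minutes(start, end, open_at=time(8, 0), close_at=time(17, 0)):
    """Minutes of the outage [start, end) that fall between open_at and
    close_at on Monday through Friday."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            window_start = datetime.combine(day, open_at)
            window_end = datetime.combine(day, close_at)
            overlap_start = max(start, window_start)
            overlap_end = min(end, window_end)
            if overlap_end > overlap_start:
                total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total.total_seconds() / 60


# Hypothetical outage: Friday 16:30 until the following Monday 08:45.
outage_start = datetime(2024, 1, 5, 16, 30)
outage_end = datetime(2024, 1, 8, 8, 45)
counted = business_outage_minutes(outage_start, outage_end)
```

In this example the outage spans a weekend, but only the 30 minutes on Friday afternoon and the 45 minutes on Monday morning count toward the business-hours guarantee.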
Having identified the guarantees within an SLA, you should categorize them into availability
and response time guarantees. Availability guarantees within the SLA are frequently
specified as a percentage of availability. However, availability guarantees may be described
in terms of downtime. For example, “…no more than 1 hour of outage time…” Response
time guarantees can be identified either by specific statements such as “…2000 ms or
better response time…” or “…latency not exceeding 5000 ms for more than 30 minutes…”
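Because availability guarantees appear both as percentages and as downtime limits, it helps to convert between the two forms. The Python sketch below shows the arithmetic. The function names are hypothetical, and the 30-day month is an assumption.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Downtime budget implied by an availability percentage over a period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100 - availability_percent) / 100


def measured_availability(downtime_minutes, period_days):
    """Availability percentage implied by recorded downtime over a period."""
    period_minutes = period_days * 24 * 60
    return 100 * (1 - downtime_minutes / period_minutes)


# 99.9% over a 30-day month, the monthly uptime figure quoted earlier.
monthly_budget = allowed_downtime_minutes(99.9, 30)
# One hour of outage in the same month, expressed as a percentage.
after_one_hour = measured_availability(60, 30)
```

A 30-day month contains 43,200 minutes, so a 99.9% guarantee leaves a budget of roughly 43.2 minutes. A single one-hour outage therefore exceeds it.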
Availability can be described in a couple of different ways. Previous sections of this
document discussed service health. Typically, you could describe availability as a service
being available when it is not down or a service being unavailable when it is down.
However, an availability guarantee might also be described as a service being unavailable
when it is not responsive. This second description can be very important when building
guarantee models.
Response time guarantees measure services which utilize response time components as
their resources. An interesting point regarding response time guarantees is that
components within a service hierarchy may monitor response time specifically for the
purpose of providing a way to support an SLA’s response-time guarantee. This is actually a
very common scenario.
Frequently, a service hierarchy is built on the foundation of resources that actually comprise
the physical devices and applications providing a user-consumable service. Response-time
tests, despite providing an excellent way to report service health, are often not identified as
service resources until an SLA is applied that stipulates response time guarantees. As
mentioned in previous sections, you can use response-time monitors to identify high
latency or service degradation. The response time tests should report a major threshold
violation when latency exceeds an acceptable level. In addition, you can also use response-
time monitors to report a critical condition when latency reaches an unusable level or
response-time requests time out. Response time monitors that are built with both
thresholds in mind can therefore support monitoring both latency and availability.
When a service is considered unavailable because it is not responsive, a service designed
to report response time can also be used to measure availability. The reverse does not
hold: a service designed to report availability will never be guaranteed for response time.
Example 4: An SLA for the Customer Account Access Service
This section contains an example based on the Customer Account Access Service from the
previous section. It includes an SLA and several guarantees.
In the “Creating Service Models” section of this chapter, you implemented the Customer
Account Access Service. For this example, the Customer Account Access service will
represent the service being provided by a fictional company called Northeast Data Solutions
(hereafter, referred to as Northeast).
Northeast maintains customer account information for a large number of small businesses.
Each small business is responsible for creating and maintaining its own customer data.
Northeast takes responsibility for supporting and securing the customer account data. In
addition to supporting the databases and web access, Northeast also negotiates with
various Internet Service Providers (ISPs) to provide a local routing device for the remote
customer site to ensure that customers will have reliable internet access to their customer
account information. The relationship with the ISP is transparent to Northeast customers.
They pay Northeast directly for service.
The following items are segments of an SLA provided to each Northeast Data Solutions
customer:
Northeast Data Solutions provides access to customer account data guaranteeing that
account access for each customer location will be available 99% of each month excluding
those periods of scheduled system maintenance to be conducted between the hours of
12AM to 3AM on each Sunday.
Service availability to be restored such that the average outage resolution time is 30
minutes or less, with no individual outage exceeding 1 hour; outages are guaranteed to
not exceed a rate of two or more outages within any 24-hour period.
A standard business hours timeframe to be defined as the hours of 7AM to 6PM Eastern
Standard Time on the days of Monday through Friday of each week.
Within standard business hours, account access to be guaranteed available at 99.5% with
no individual outage exceeding 20 minutes.
Average transaction time for initial account access is not to exceed five seconds for more
than 5% of the standard business hours timeframe, with successful transaction
completion to be guaranteed at 99% for standard business hours. With a transaction
deemed successful if completed within 15 seconds, no period of transaction failure shall
persist for more than 20 minutes.
Transaction monitoring average based on a sampling of five queries to be delivered
randomly within a five-minute interval during standard business hours, each query
originating from the customer access point device.
Northeast assumes responsibility for an access device assuming the device is operational
with the exception of power failure or an act of nature deemed beyond the control of
Northeast.
In this example, the SLA text includes a variety of guarantee metrics and terminology,
written to be representative of the statements found within an actual SLA.
Despite its confusing terminology, this SLA actually includes some very precise guarantee
information, including how response time will actually be measured.
This SLA would be provided for each Northeast customer, but this example will focus on the
SLA between Northeast and a customer called A to Z Performance Components, which has
offices in Atlanta and Savannah, Georgia.
As mentioned above, the first step to designing an SLA implementation is to determine
which service supports the SLA and identify the period. The hierarchy below represents the
Customer Account Access Service.
[Figure: Customer Account Access Service hierarchy]
Top level: Customer Account Access Service
Second level: Web Service, Database Service
Third level: Devices, Processes, Devices, Processes, Response Time
Many components are required to monitor A to Z’s service availability. In addition to
providing web access and database access, Northeast must now build service components
that monitor availability and response time specific to A to Z’s Atlanta and Savannah
offices.
These new service components will monitor access routers at each site and response time
for newly created response time tests that are hosted on the access routers at each site.
The following figure shows how you might extend the hierarchy to support A to Z.
[Figure: service hierarchy extended to support A to Z]
A to Z Account Access
    A to Z Site Access
        Atlanta Routing
        Savannah Routing
    A to Z Site Response Time
        Atlanta Response Time
        Savannah Response Time
    Customer Account Access
Evaluation of the SLA implies that a variety of guarantees exist and that some new service
models will be required. The chart above represents one possible configuration to use. You
should review each SLA implementation carefully to determine the best way to organize
services.
Among the new services is a hierarchy called A to Z Site Access. A to Z Site Access has two
subservices called Atlanta Routing and Savannah Routing. These services are designed to
monitor the on-site router which provides access to the Customer Account Access Service.
You can break down each one of these subservices into a set of resource monitors,
producing a hierarchy similar to the figure below.
[Figure: Routing service and its resource monitors]
Routing
    Router
    Interfaces
One resource monitor watches the contact status of the router device model, while the
other resource monitor watches the port status of interfaces on the router which are critical
for providing access for the office. The routing service is considered down if the router is
down or if all required interfaces are disabled. A similar service would be implemented for
both Atlanta and Savannah. In reference to the SLA, the following statements are related to
routing components of each site:
Northeast provides access to customer account data guaranteeing that account access for
each customer location will be available 99% of each month excluding those periods of
scheduled system maintenance to be conducted between the hours of 12AM to 3AM on
each Sunday.
Service availability to be restored so that the average outage resolution time is 30
minutes or less, without an individual outage exceeding one hour; outages are
guaranteed to not exceed a rate of two or more outages within any 24-hour period.
Guarantees apply on a per-site basis. From the perspective of the service manager user,
consider a guarantee of 99% availability for the month of November. Because November
has 30 days (43,200 minutes), this implies that 432 minutes of downtime are
allowed. When building the SLA, carefully consider the service or services to which this
should be applied. If this guarantee was applied to the A to Z Site Access service, and
Atlanta experienced 300 minutes of downtime and Savannah experienced 200 minutes of
downtime (for a total of 500 minutes of downtime), the SLA would be violated. However,
the wording in the SLA states “…each customer location…”, so a guarantee should be applied at
each site. By applying the guarantees in this manner, the SLA would not be violated as
neither site experienced more than 432 minutes of downtime. With regard to the
availability of the Routing service, two separate guarantees of 99% apply:
Atlanta availability 99%
Savannah availability 99%
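The per-site arithmetic above can be checked with a short calculation. The following Python sketch is illustrative only: the function name is hypothetical, the downtime figures come from this example, and the maintenance-window exclusion is ignored for simplicity.

```python
from fractions import Fraction

def allowed_downtime_minutes(days_in_period, availability_pct):
    """Downtime budget in minutes for a target such as "99" or "99.5"."""
    total_minutes = days_in_period * 24 * 60
    # Fraction avoids floating-point rounding on targets such as 99.5.
    return total_minutes * (1 - Fraction(availability_pct) / 100)

# November has 30 days, so a 99% guarantee allows 432 minutes of downtime.
budget = allowed_downtime_minutes(30, "99")

# Downtime figures from the example above.
site_downtime = {"Atlanta": 300, "Savannah": 200}

# A single guarantee on A to Z Site Access would see 500 minutes: violated.
combined_violated = sum(site_downtime.values()) > budget

# One guarantee per site: neither 300 nor 200 minutes exceeds the budget.
per_site_violated = {site: minutes > budget
                     for site, minutes in site_downtime.items()}

print(budget, combined_violated, per_site_violated)
```

The same function gives the budget for any other period length or availability target named in an SLA.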
The SLA also states “…average outage resolution time is 30 minutes or less...” In addition
to the 99% availability guarantee, a supplemental guarantee which specifies an average
outage time of 30 minutes or less is necessary. This component can be added to the
availability guarantees as an MTTR supplement. The availability guarantees should now
include the MTTR component:
Atlanta availability 99%, MTTR 30 minutes
Savannah availability 99%, MTTR 30 minutes
In addition to the MTTR component, the SLA states “…outages are guaranteed to not
exceed a rate of 2 or more outages within any 24-hour period…” This statement is referred
to as a Mean-Time-Between-Failures (MTBF) clause. The MTBF clause states that more than
one outage per day cannot occur. The availability guarantees should now include the MTBF
component:
Atlanta availability 99%, MTTR 30 minutes, MTBF 24 hours
Savannah availability 99%, MTTR 30 minutes, MTBF 24 hours
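To make the MTTR and MTBF supplements concrete, the following Python sketch evaluates both against a list of outages. It is a sketch under stated assumptions, not a description of how SPECTRUM evaluates these clauses: the function names and the outage timestamps are hypothetical, and the MTBF clause is interpreted as "fewer than two outage starts within any 24-hour window."

```python
from datetime import datetime, timedelta

def mean_time_to_repair(outages):
    """Average outage duration; outages is a list of (start, end) datetimes."""
    if not outages:
        return timedelta(0)
    total = sum((end - start for start, end in outages), timedelta(0))
    return total / len(outages)

def most_outages_in_window(outages, window=timedelta(hours=24)):
    """Largest number of outage starts that fall inside any sliding window."""
    starts = sorted(start for start, _ in outages)
    most, left = 0, 0
    for right, start in enumerate(starts):
        # Advance the left edge until the span is shorter than the window.
        while start - starts[left] >= window:
            left += 1
        most = max(most, right - left + 1)
    return most

# Hypothetical outages for one site during the SLA period.
outages = [
    (datetime(2019, 11, 3, 9, 0), datetime(2019, 11, 3, 9, 20)),
    (datetime(2019, 11, 3, 22, 0), datetime(2019, 11, 3, 22, 40)),
    (datetime(2019, 11, 10, 14, 0), datetime(2019, 11, 10, 14, 30)),
]

mttr_ok = mean_time_to_repair(outages) <= timedelta(minutes=30)
mtbf_ok = most_outages_in_window(outages) < 2
print(mttr_ok, mtbf_ok)
```

In this sample the three outages average exactly 30 minutes, so the MTTR supplement holds, but two of them start within 13 hours of each other, so the MTBF supplement does not.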
A similar guarantee should be applied to the Customer Account Access Service; however,
this guarantee will be independent of either customer site:
Atlanta availability 99%, MTTR 30 minutes, MTBF 24 hours
Savannah availability 99%, MTTR 30 minutes, MTBF 24 hours
Customer Account Access availability 99%, MTTR 30 minutes, MTBF 24 hours
In addition to the 99% overall availability guarantee, consider these additional availability
specifications:
A standard business hours timeframe is to be defined as the hours of 7AM to 6PM Eastern
Standard Time Monday through Friday of each week.
Within standard business hours, account access will be guaranteed available at 99.5%
with no individual outage exceeding 20 minutes.
Business-hour guarantees can be created by applying a schedule during creation. A weekly
schedule for the days Monday through Friday from 7AM to 6PM will be applied to new
guarantees ensuring 99.5% availability throughout the scheduled period.
The new guarantees should be applied to each customer Routing Service and the Customer
Account Access Service:
Atlanta availability 99%, MTTR 30 minutes, MTBF 24 hours
Savannah availability 99%, MTTR 30 minutes, MTBF 24 hours
Customer Account Access availability 99%, MTTR 30 minutes, MTBF 24 hours
Atlanta availability 99.5% M-F 7AM-6PM
Savannah availability 99.5% M-F 7AM-6PM
Customer Account Access availability 99.5% M-F 7AM-6PM
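A scheduled guarantee is measured only against the time inside its schedule, so the downtime budget shrinks accordingly. The following Python sketch estimates that budget for a Monday through Friday, 7AM to 6PM schedule. The function name and the choice of November 2019 are assumptions made for illustration, and holidays and the maintenance window are ignored.

```python
import calendar
from fractions import Fraction

def business_minutes(year, month, start_hour=7, end_hour=18):
    """Minutes inside a Monday-Friday schedule for one calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    weekdays = sum(
        1 for day in range(1, days_in_month + 1)
        if calendar.weekday(year, month, day) < 5  # 0-4 are Monday-Friday
    )
    return weekdays * (end_hour - start_hour) * 60

# The year is an arbitrary choice for illustration.
scheduled = business_minutes(2019, 11)

# Downtime budget at 99.5% of the scheduled time, kept exact with Fraction.
budget = scheduled * (1 - Fraction("99.5") / 100)
print(scheduled, float(budget))
```

The business-hours budget is much smaller than the monthly one, which is why the scheduled guarantee is tracked separately from the overall 99% guarantee.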
An additional stipulation to the business-hours guarantee must be accounted for: “…no
individual outage exceeding 20 minutes…” This stipulation is referred to as a Maximum Outage Time
(MOT) clause. The business-hours guarantees should also include the MOT component:
Atlanta availability 99%, MTTR 30 minutes, MTBF 24 hours
Savannah availability 99%, MTTR 30 minutes, MTBF 24 hours
Customer Account Access availability 99%, MTTR 30 minutes, MTBF 24 hours
Atlanta availability 99.5% M-F 7AM-6PM, MOT 20 minutes
Savannah availability 99.5% M-F 7AM-6PM, MOT 20 minutes
Customer Account Access availability 99.5% M-F 7AM-6PM, MOT 20 minutes
To this point, six different availability guarantees have been identified, but none of these
guarantees account for the response time element within the SLA:
Average transaction time for initial account access is not to exceed five seconds for more
than 5% of the standard business hours timeframe, with successful transaction
completion to be guaranteed at 99% for standard business hours. A transaction will be
deemed successful if completed within 15 seconds, and no period of transaction failure
shall persist for more than 20 minutes.
Transaction monitoring average based on a sampling of five queries to be delivered
randomly within a 5-minute interval during standard business hours, each query
originating from the customer access point device.
The response time stipulations are very thorough and dictate how response time will be
measured. To support this component of the SLA, you need to create two additional
services using the response time tests as monitored resources.
Create an Atlanta Response Time Service to monitor five new response time test models
which will be hosted on the Atlanta access router and run at five-minute intervals. The SLA
specifies 5 seconds and 15 seconds as major and critical thresholds. At first, you might
consider that each response time test (SPM test) should be configured with a 5-second
major threshold and a 15-second critical threshold; however, the SLA wording suggests
that this would not be appropriate. Note the wording “…Average transaction time for initial
account access is not to exceed five seconds…” The average response time should be monitored,
rather than the individual response time of each test.
Imagine a response time result set of 4 seconds, 3 seconds, 3 seconds, 3 seconds, and 6
seconds. The 6-second result is in violation of the 5-second threshold. However, the
average response time is less than 4 seconds, which is not in violation. To support this
behavior, you should not set thresholds on the individual response time tests. Instead,
create a new Response Time Service with a policy to monitor the latency of the response
time tests. The new service policy should be created to monitor the Latest Result attribute
on the response time test models. The policy should apply an aggregate rule set when
evaluating response times as follows:
When the average for all resources is greater than 15000, the service is down.
When the average for all resources is greater than 5000, the service is degraded.
The Latest Result attribute value of a response time test model is the number of
milliseconds that the most recent test took to complete. Therefore, the values in the service
policy above are expressed in terms of milliseconds. The Atlanta Response Time and
Savannah Response Time Services should both utilize this policy to monitor five response
time tests hosted by the respective site access router.
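The aggregate rule set can be illustrated with a few lines of Python. This is a minimal sketch of the averaging logic only, not SPECTRUM configuration: the function name and the "up" label for a healthy service are assumptions, while the thresholds and the sample values come from the text above.

```python
from statistics import mean

DEGRADED_MS = 5000   # average above this value: service is degraded
DOWN_MS = 15000      # average above this value: service is down

def service_health(latest_results_ms):
    """Apply the aggregate rule set to the Latest Result values (milliseconds)."""
    average = mean(latest_results_ms)
    if average > DOWN_MS:
        return "down"
    if average > DEGRADED_MS:
        return "degraded"
    return "up"

# Result set from the example: one test exceeds 5 seconds, the average does not.
sample = [4000, 3000, 3000, 3000, 6000]
print(mean(sample), service_health(sample))
```

Evaluating the average rather than each test keeps a single slow query from degrading the service, which matches the SLA wording.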
Referring back to the response time specifications in the SLA, two business-hours response
time guarantees are needed to track response time of the services created above:
Atlanta response time 95% M-F 7AM-6PM
Savannah response time 95% M-F 7AM-6PM
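One way to read the 95% response time guarantee is as the share of sampling intervals during business hours in which the average transaction time stays at or below five seconds. The Python sketch below assumes that reading. The function name and sample values are hypothetical, and the product's own calculation may differ.

```python
def response_time_compliance(interval_averages_ms, threshold_ms=5000):
    """Fraction of sampling intervals whose average is within the threshold."""
    if not interval_averages_ms:
        return 1.0
    within = sum(1 for average in interval_averages_ms if average <= threshold_ms)
    return within / len(interval_averages_ms)

# One business day (7AM to 6PM) holds 132 five-minute intervals.
intervals_per_day = 11 * 60 // 5

# Hypothetical day: six intervals average 6200 ms, the rest average 3800 ms.
day = [3800] * (intervals_per_day - 6) + [6200] * 6

compliance = response_time_compliance(day)
guarantee_met = compliance >= 0.95
print(round(compliance, 4), guarantee_met)
```

With six slow intervals out of 132 the guarantee still holds; a seventh slow interval would drop compliance below 95%.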
An additional availability component is included with the response time stipulation:
…successful transaction completion to be guaranteed at 99% for standard business hours.
A transaction is deemed successful if completed within 15 seconds, and no period of
transaction failure shall persist for more than 20 minutes.
Recalling the second definition for an availability guarantee, “a service is unavailable if it is
not responsive”, the specification in the SLA above requires two additional 99% availability
guarantees with a supplemental MOT component:
Atlanta response time 95% M-F 7AM-6PM
Savannah response time 95% M-F 7AM-6PM
Atlanta availability 99% M-F 7AM-6PM, MOT 20 minutes
Savannah availability 99% M-F 7AM-6PM, MOT 20 minutes
A Maintenance window clause within the SLA should also be considered:
…excluding those periods of scheduled system maintenance to be conducted between the
hours of 12AM to 3AM on each Sunday…
To account for maintenance windows, modify each service to include a maintenance
schedule for that time period.
The following table lists all SLA components that have been accounted for in this design:
SLA Design Components
SERVICE                    SLA COMPONENT
A to Z Account Access      Monthly SLA
Customer Account Access    Availability 99%, MTTR 30 minutes, MTBF 24 hours
                           Availability 99.5% M-F 7AM-6PM, MOT 20 minutes
Atlanta Routing            Availability 99%, MTTR 30 minutes, MTBF 24 hours
                           Availability 99.5% M-F 7AM-6PM, MOT 20 minutes
Savannah Routing           Availability 99%, MTTR 30 minutes, MTBF 24 hours
                           Availability 99.5% M-F 7AM-6PM, MOT 20 minutes
Atlanta Response Time      Response time 95% M-F 7AM-6PM
                           Availability 99% M-F 7AM-6PM, MOT 20 minutes
Savannah Response Time     Response time 95% M-F 7AM-6PM
                           Availability 99% M-F 7AM-6PM, MOT 20 minutes
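Before entering the guarantees in the product, it can help to write the design down as data and check the totals. The following Python sketch does that for the components listed above. The dictionary keys and function names are hypothetical and do not correspond to any SPECTRUM API.

```python
from collections import Counter

BUSINESS_HOURS = "M-F 7AM-6PM"

def availability_guarantees(service):
    """Monthly and business-hours availability guarantees for one service."""
    return [
        {"service": service, "kind": "availability", "target_pct": 99.0,
         "mttr_minutes": 30, "mtbf_hours": 24},
        {"service": service, "kind": "availability", "target_pct": 99.5,
         "schedule": BUSINESS_HOURS, "mot_minutes": 20},
    ]

def response_time_guarantees(service):
    """Business-hours response time and availability guarantees for one service."""
    return [
        {"service": service, "kind": "response_time", "target_pct": 95.0,
         "schedule": BUSINESS_HOURS},
        {"service": service, "kind": "availability", "target_pct": 99.0,
         "schedule": BUSINESS_HOURS, "mot_minutes": 20},
    ]

guarantees = []
for name in ("Customer Account Access", "Atlanta Routing", "Savannah Routing"):
    guarantees += availability_guarantees(name)
for name in ("Atlanta Response Time", "Savannah Response Time"):
    guarantees += response_time_guarantees(name)

# Tally the guarantees by kind as a cross-check on the design.
counts = Counter(g["kind"] for g in guarantees)
print(len(guarantees), dict(counts))
```

The tally gives ten guarantees in total: eight availability guarantees and two response time guarantees.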
How You Implement the A to Z Account Access SLA in SPECTRUM
To implement the SLA design described in the previous section, follow these high-level
steps:
1. Create the two Routing Services and their resource monitors for Routers and
Interfaces. This results in 6 services grouped into two hierarchies.
› For each site use the Service editor to configure the first resource monitor
(Atlanta Router and Savannah Router) to watch the contact status of the access
router device model with the contact status high-sensitivity policy.
› For each site configure the second resource monitor (Atlanta Interfaces and
Savannah Interfaces) to watch the port status of any critical router interface
using a Port Status policy. Consider using either the Low Sensitivity or
Percentage rule set for the service policy depending on the number of interface
models required to provide access.
› Use the service editor to create the Atlanta Routing and Savannah Routing
Services, which monitor the service health of their two resource monitors
(defined above) using the Service Health High Sensitivity Policy.
2. Create the Atlanta and Savannah Response Time Services and their individual response
time test (SPM) models.
› Use the OneClick Console’s Locater tab to configure 5 SPM test models for each
site with the following settings:
A 5-minute (300 seconds) schedule interval and thresholds disabled
A timeout value of 25-30 seconds
Filter Timeout Data set to FALSE to configure the test models to
have the timeout value written to the Latest Result
› Use the Service Editor to create the Atlanta Response Time and Savannah
Response Time Services, and ensure that they monitor the newly created
response time test (SPM) models.
› Using the Service Policy Editor, configure each Response Time Service to use a
new service policy which monitors the response time test’s Latest Result attribute
and the following aggregate rule set:
When the average for all resources is greater than 15000, the
service is down.
When the average for all resources is greater than 5000, the service
is degraded.
3. Consolidate the Routing, Response Time, and the previously created Customer Account
Access (from Example 2) services into higher-level services.
› Consolidate the Routing and Response Time Services at both customer sites
under two new services: A to Z Site Access and A to Z Response Time. The two
new services should monitor their respective components with a Service Health
High Sensitivity policy.
› Create the A to Z Account Access Service to monitor the A to Z Site Access, A to
Z Response Time, and Customer Account Access services. The new service
should monitor its components with a Service Health High Sensitivity policy.
4. Set up the SLA rules.
› After completing the changes made to the service hierarchy, navigate to the SLA
tab within the Service Editor. Create an SLA against the A to Z Account Access
service using a monthly SLA period. Do not at this time create guarantees
against the A to Z Account Access Service.
› Launch the Guarantee Editor with the new SLA highlighted.
› Use the Guarantee Editor to create each of the ten guarantees (8 availability
guarantees and 2 Response Time guarantees) that are identified in the previous
section.
› Apply the MTTR, MTBF and MOT specification to the appropriate guarantees.
Note: The functionality to associate business hours to these guarantees is planned
for SPECTRUM 8.1. Disregard the Business Hour restriction for releases prior to 8.1.
An SLA guarantee will be violated if the following occurs:
The threshold for any guarantee is violated.
The threshold for any supplemental guarantee is violated (that is, MOT, MTTR, MTBF).
When the MOT threshold is violated, the supplemental guarantee will immediately violate
the SLA. If the MTTR or MTBF threshold is violated, the guarantee will transition the SLA to
a state of “at risk” because the final determination of whether MTTR or MTBF has been
violated cannot be made until the end of the SLA period.
The SLA status is equivalent to the status of its worst guarantee. If any guarantee is
violated, the SLA will likewise be violated for the SLA period. When the SLA period rolls
over, the SLA will transition back to a state of “unaffected.”
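The "worst guarantee wins" rule can be sketched in Python as follows. The state names are taken from this chapter, but their numeric ordering, the treatment of "warned" as the "at risk" state, and the function name are assumptions made for illustration.

```python
from enum import IntEnum

class Status(IntEnum):
    # Ordered from best to worst; the ordering is an assumption.
    UNAFFECTED = 0
    COMPLIANT = 1
    WARNED = 2      # assumed to correspond to the "at risk" state
    VIOLATED = 3

def sla_status(guarantee_statuses):
    """The SLA takes the status of its worst guarantee."""
    return max(guarantee_statuses, default=Status.UNAFFECTED)

# Hypothetical mid-period snapshot: one MTTR supplement is at risk.
snapshot = [Status.COMPLIANT, Status.WARNED, Status.COMPLIANT]
print(sla_status(snapshot).name)

# A single violated guarantee violates the SLA for the period.
print(sla_status(snapshot + [Status.VIOLATED]).name)
```

Using an ordered enumeration keeps the roll-up to a single `max` call, and an SLA with no guarantees defaults to unaffected.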
Service and SLA Reporting
Service Availability and SLA Reports are a major component of the Service Management
solution. These reports complement the service and SLA modeling process and provide
insight into the performance of service components over a variety of time periods. Service
and SLA reports can be categorized into two groupings: customer-facing and internal.
Customer-facing reports provide service availability and SLA status information, and can be
delivered to service customers. Frequently, SLAs will stipulate that customers will receive
Service and SLA reports for each SLA period. Customer-facing reports tend to summarize
status. For example, a customer-facing availability report would only show two metrics:
available time or down time. Likewise, a customer-facing SLA report would only show two
metrics: compliance or violation.
Internal reports are designed to provide a rich set of detailed data for use by the service
provider or enterprise customer. In contrast to the customer-facing service availability
report, an internal service availability report would display maintenance time, loss of
management time, etc. Similarly, an internal SLA report would include all possible SLA
states including unaffected, compliant, warned, and violated.
Other internal reports may summarize services with the greatest downtime or service
resources which contribute the most downtime. Internal reports are intended to give service
providers insight into the health and performance of their services and SLAs over a period of time.
Run SPECTRUM Service Manager Customer-Facing Reports
Several different customer-facing reports within the Service Availability and SLA category
are available to Service Management users. You use the SPECTRUM Report Manager
application to generate and manage your reports. You can access the application from any
computer that can connect via a web session to the OneClick server on which Report
Manager is installed.
To access Report Manager and run reports
1. Point a web browser to the Report Manager web page using the URL
http://hostname/spectrum/repmgr, where hostname is the name of the OneClick and
Report Manager system.
2. Log in to the application by specifying your username and password in the OneClick
login window.
3. Click the Begin Session link on the Report Manager Welcome window.
The Report Manager main window appears. The main window provides access to all report
and report management options for your account. It lists any scheduled reports that have
been generated for your account, reports that are scheduled to be generated for your
account, and any messages in the Message of the Day and What’s New text boxes posted
by a Report Manager administrator. For a complete description of Report Manager and how
to use it, see the Report Manager User Guide.
The following sections describe the customer-facing reports.
Service Availability by: Name, Customer, Owner
This report includes a pie chart showing service up/down time and availability percentage
based on the period for which the report was run. A table listing all outages shows start time,
end time, duration, and outage notes. In addition, a subreport with detailed outage
information is available for any outage within the table. You can generate multiple service
availability reports by service name, service customer, or service owner.
Service Availability Variable Health Level
This report is similar to the Service Availability report, but allows you to include degraded
and slightly degraded time if you choose. A pie chart including all service health types is
shown with availability percentage calculation based on the period for which the report was
run. All included outages are listed in the subreport showing detailed outage information.
Service Summary by: Name, Customer, Owner
This report lists multiple services based on service name, service customer, or service
owner with outage times and percentage of availability.
Service Summary Variable Health Level
This report provides a table of services with columns that display summarized data for each
service health level that you choose to include in the report, similar to the previous report.
You can choose to display down only; down and degraded; or down, degraded, and slightly
degraded. For each service listed in the table, a subreport with more detailed outage
information is available.
SLA Detail By Customer
You can generate this report for one or more SLA periods. The report includes a pie chart
which displays the percentage of guarantees for all reported periods which are compliant or
violated. Below the chart, each of the guarantees is reported, including the status for each
period. For any period, you can open a subreport showing detailed outage information for
the particular guarantee, including any outage exemptions. If you run the report based on
customer, a separate report will be generated for each of the customer’s SLAs. You can
provide the report to the customer at the end of each period.
SLA Inventory by Customer
This report shows the configuration of each SLA and guarantee for a particular customer.
This is a useful report to generate for a customer when the SLA and guarantee models are
first created. The user should be able to compare the configuration with the SLA document
to verify that all guarantees or service level objectives are addressed.
SPECTRUM Service Manager: Internal Reports
Several different internal reports within the Service Availability and SLA category are
available to Service Management users. The following sections describe the internal reports.
Service Health by Service Name
This report is very similar to a service availability report, but includes all service health
levels including maintenance and loss of management. The report can be run for both
services and resource monitors. A pie chart showing the percentage of each service health
value is shown. A table showing outage of all service health types, including outage notes
and links to detailed outage information, is also included in this report. The service health
report provides service manager users with very detailed information regarding the
performance of a service over a given period of time.
Service Inventory
This report shows a breakdown of all services, resource monitors, and resources which are
modeled in the system. It can be used to preserve a “snapshot” of service inventory
configuration for the current time.
and links to detailed service availability reports are available for each service model. The
report provides very detailed information for the service manager user indicating which
services experienced the most overall outage time for a particular period.
Top N Worst Service Outages
This report allows you to view the top N worst service outages which resulted in service
downtime. This report is a useful tool for summarizing the worst outages for a period of time
and may highlight areas within the service hierarchy which are lacking the redundancy to
prevent service downtime.
Top N Worst Service Resources by Total Downtime
This report shows summarized information regarding the total service downtime caused by
individual resources. This highlights the cumulative effect of each individual resource
outage which results in downtime for one or more services. This report can be an important
tool for identifying service resources which are chronic problem areas within the service
modeling hierarchy.
SLA Status Current and Recent by Customer
This report provides you with a quick way of obtaining summarized SLA status for the
current and recent periods. Status includes unaffected, compliant, warned, and violated
SLAs with detailed subreports showing results for specific guarantees. This report can be
run for selected SLAs or SLAs for a specific customer. This report can provide a quick
review of the status of many SLAs for any customer.
SLA Summary by: Name, Customer, Status
This report produces a table of summarized SLA status for one or more periods. The report
can be generated by SLA name, customer name, or simply organized by status. The report
provides a summarized reference for multiple SLAs or multiple periods. You can access
detailed subreports showing results for specific guarantees.
SLA Summary Warned or Violated
This report produces a table of all SLAs that are currently in the warned or violated state.
The table also provides access to a subreport showing detailed guarantee outage
information for the current period. This is a useful report for the service manager user to
view SLAs that are not performing well for the current period.
SLA Detail By: SLA Name, Time Range, Last N Periods
This report is similar to the SLA Detail By SLA Customer report except that it displays all
SLA Status values including unaffected, compliant, warned, and violated states. Detailed
information is provided in a subreport which includes guarantee outages for the particular
period. This is useful for obtaining detailed information about individual SLAs for one or
more periods.
PERIOD DETAIL SUBREPORTS
SLA Detail with Resource Outages
This is a complex report that brings together SLA status and the associated resource
outages which ultimately impacted the SLA for a specific period. This report is useful when
used in conjunction with the Top N Worst Service Resources by Total Downtime report. You can
use this report to show the impact of a particular resource at a very high level. Because it
provides a great deal of information, it may generate many pages of data for SLAs with a
high number of resource outages.
Customer SLA Summary
This report shows the status of the last six SLA periods for all customers’ SLAs. The status
includes all four values. For each SLA, a chart summarizing six periods of status information
is shown within a table providing summarized information for each period and a link to
more detailed guarantee outage information. This report provides service managers with a
quick view of SLA performance for a specific customer over the last six periods. The report
may also be used by the sales organization to verify if a customer’s SLAs have been met for
recent periods.
Chapter 8: Proactive Service Assurance
The CA Network and Voice Management solution has embedded algorithms that help
operations teams to identify growing problem areas within the infrastructure before they
impact customer service. Problems rarely occur instantaneously; often, warning signs occur
such as subtle but growing service degradations, increasing errors, and delays. These
problems might not be serious enough for users to notice or to warrant the opening of
service calls, but they are growing.
With tools that can analyze and detect growing problems and raise warnings, operations
teams can proactively fix the problems before they result in outages and interrupted or lost
service. This capability is particularly important for SLA enforcement. If you can resolve SLA
troubles before the SLA is violated — and without requiring additional network resources or
servers — you can avoid excessive effort and expense.
Prerequisites: The procedures in this chapter assume that you have installed SPECTRUM
and eHealth, and that you have configured Live Health to send traps to SPECTRUM. For
more information about configuring the integrated product solution, see Chapter 5. For
details on the Live Health application and how to create monitoring rules, see the Live
Health web help. The eHealth web help is installed on the eHealth system and is also
available on the TotalDoc online documentation CD.
How You Identify Potential Problems
The proactive analysis of the eHealth Live Health application and the Health report
exceptions analysis are the key tools that warn you about growing problems in your
network. For converged networks, the eHealth for Voice Policy Manager identifies when
voice and messaging problems are starting. All of these tools provide configurable
thresholds and settings so that you can define when a problem is serious enough to merit
proactive attention.
You can configure these tools to automatically watch for these growing problems and send
alarms to SPECTRUM when the problems require attention. In addition, you can define how
long the behavior must be occurring before alarms are raised so that you can reduce the
“false” alarms of simple threshold violations and focus on the real, continuing situations.
For example, the Live Exceptions application of the Live Health product family provides
notifications of potential delay, failure, and unusual workload problems within networks,
systems, and applications. It uses the historical data that eHealth gathers and maintains to
assess potential problems over time. When Live Exceptions detects a condition that merits
operator attention, it raises an alarm and sends it to SPECTRUM.
Configure Live Health to Watch for Growing Problems
For proactive service assurance, you use the Time over Threshold and Deviation from
Normal algorithms within Live Exceptions to watch for growing problems in service. When
performance changes from what is considered “normal” behavior (based on past history) for
a particular length of time, Live Health raises an alarm and can send that alarm to
SPECTRUM.
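The two alarm styles named above can be illustrated with a minimal Python sketch. It is not the Live Exceptions implementation, whose internals this guide does not describe. The function names, the sample windows, and the use of a mean and standard deviation as the "normal" baseline are all assumptions made for illustration.

```python
from statistics import mean, stdev


def time_over_threshold(samples, threshold, min_count, window):
    """Return True if at least min_count of the last `window` samples
    exceed the threshold (a persistent condition, not a single spike)."""
    recent = samples[-window:]
    return sum(1 for value in recent if value > threshold) >= min_count


def deviation_from_normal(history, recent, max_sigma, min_count):
    """Return True if at least min_count recent samples lie more than
    max_sigma standard deviations from the historical mean.
    Assumption: 'normal' is modeled as mean +/- max_sigma * stdev."""
    baseline = mean(history)
    spread = stdev(history)  # needs at least two historical samples
    deviating = [v for v in recent if abs(v - baseline) > max_sigma * spread]
    return len(deviating) >= min_count


# Hypothetical utilization samples (percent), one per polling interval.
history = [40, 42, 38, 41, 39, 40, 43, 37]
print(time_over_threshold([10, 90, 95, 20, 92], 80, 3, 5))
print(deviation_from_normal(history, [48, 50, 47, 45], 3, 3))
```

Requiring several samples inside a window, rather than one, is what suppresses the "false" alarms of simple threshold violations described earlier in this chapter.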
To configure Live Health for proactive service assurance
1. Use the Live Exceptions Browser to associate the applicable Unusual Workload default
profiles to groups or group lists of your managed resources. Use the Live Health profile
descriptions tool on http://support.concord.com/devices to identify the correct profiles
for the element types that you have discovered. For more information about associating
profiles to groups, see Chapter 5.
2. If you have custom SLAs, you can create custom profiles with Time over Threshold and
Deviation from Normal alarms to reflect your service thresholds. Make sure that your
rules are configured to warn you when the service degradations require attention,
which will typically be at a threshold that is lower than your service agreement
thresholds. For instructions on creating custom profiles, see the Live Health web help.
Configure Health Reports to Send Traps for Growing Problems
The Exceptions section of a Health report contains information about elements that have
experienced unusual events or that may not have sufficient resources to accommodate the
demand that is placed on them. This section of a Health report identifies elements that
have accumulated a high number of exception points as the result of errors, high utilization
and divergence from trends. Elements appear in the report only when their accumulated
exception points exceed a minimum number. eHealth administrators can specify this
number in the service profile for the report.
As an additional means to proactively monitor service, you can configure Health reports to
forward traps for Health exceptions to the SpectroSERVER. When the scheduled Health
report runs, eHealth sends an SNMP trap to the SpectroSERVER for the leading problem for
each element in the Exceptions section of the Health report. Trap-forwarding is not enabled
by default for eHealth; you must create a custom Health report to enable this feature, and
then schedule that Health report to run automatically.
Note: Only scheduled Health reports forward exceptions. If you manually run a Health
report, it will not forward exceptions.
For instructions on creating a custom Health report that forwards exceptions as traps to
Live Health, see Chapter 5.
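The selection logic described in this section (elements appear only when their accumulated exception points exceed a minimum, and one trap is sent for the leading problem of each element) can be sketched as follows. This is an illustration only: the tuple layout, the function name, and the sample data are hypothetical, and how eHealth actually assigns exception points is not covered here.

```python
from collections import defaultdict


def leading_exceptions(exception_points, minimum_points):
    """exception_points: iterable of (element, problem, points) tuples.
    Returns (element, total_points, leading_problem) rows for elements
    whose accumulated points exceed minimum_points, worst element first."""
    totals = defaultdict(int)
    worst = {}
    for element, problem, points in exception_points:
        totals[element] += points
        # Track the single problem that contributed the most points.
        if element not in worst or points > worst[element][1]:
            worst[element] = (problem, points)
    rows = [
        (element, total, worst[element][0])
        for element, total in totals.items()
        if total > minimum_points
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical data: element names and point values are made up.
sample = [
    ("Virginia-FR", "Utilization", 30),
    ("Virginia-FR", "Errors", 10),
    ("Miami", "Errors", 5),
]
for element, total, problem in leading_exceptions(sample, minimum_points=20):
    print(f"{element}: {total} points, leading problem: {problem}")
```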
Send Voice Alerts to SPECTRUM
eHealth for Voice Release 4.0 provides alarm integration and correlation with SPECTRUM,
taking advantage of the SPECTRUM Service Management and voice modeling capability. You
can configure the eHealth for Voice Policy Manager application to send SNMP traps to
SPECTRUM when violations of QoS or GoS occur. SPECTRUM applies its intelligence on
policy, models, and rules to identify the severity of the problem.
Policy Manager monitors all data voice and PBX activity, and then reviews that data against
pre-defined criteria. With Policy Manager, you can define rules or policies against any data
— configuration changes, system traffic, individual usage, alarms, historical events, and so
on. For instructions on configuring eHealth for Voice to send performance traps to
SPECTRUM, see Chapter 5.
How You Respond to Alarm Actions in SPECTRUM
Using the SPECTRUM OneClick console, network operators and managers can view the
models (or resources) in their topology and watch for events or status changes that indicate
growing problems in their network. When SPECTRUM receives a trap from eHealth, the
model that represents the element changes color to represent the alarm severity of the trap
that was received. For example, when critical problems occur, the device icon changes to
red, while minor problems cause it to change to yellow.
Operators can right-click the icon and take the following actions to troubleshoot or
investigate the problems:
Drill down to an eHealth Alarm Detail report to obtain a picture of the performance trends
that caused Live Health to detect performance problems that required an alarm to be
raised. For example, if a device has performed outside its normal operating thresholds for
more than 15 minutes, Live Health Alarm Detail reports can show you the performance
trend line for the element.
Run an At-a-Glance or Trend eHealth report to review the performance history of the
resource. While a Trend report shows you the performance of the specific problem
variable, the At-a-Glance shows you a set of common performance variables for that
element type. Using this data, you can identify contributing causes or the root of the
problem.
Clear the alarm. If the operator knows that the alarm is related to a known problem or
situation, the operator could clear the alarm and return the device status to normal.
Open Service Desk tickets to record the problem as a work task and assign it to
personnel to fix. With the Unicenter Service Desk integration, SPECTRUM can open,
update, and close Service Desk tickets that track work to address problems in the
network. Operators at the OneClick console can drill down to the Service Desk ticket
details to determine the latest status and assigned troubleshooter for the tickets.
Chapter 9: Predictive Capacity Planning
Employee productivity and customer satisfaction both depend on the availability and
performance of mission-critical applications. The applications depend on the IT
infrastructure running smoothly and efficiently.
Ensuring that IT resources meet the needs of your users requires more than just
responding to problems. To keep your infrastructure running efficiently, you must obtain
real-life data about the current status of your network, identify congestion and trouble
spots before they affect users, and plan effectively for the future. These tasks are all part of
predictive capacity planning.
Capacity planning is a complex and critical part of managing IT resources. It helps you to
use your current resources efficiently, evaluate trends in demand, and project future
resource needs. Effective capacity planning allows you to achieve the following:
Reduce costs through the reduction or elimination of underused leased lines.
Improve performance through identification of both overused and underused elements,
and rebalancing of capacity with demand.
Reduce server and network downtime by anticipating overloads before they occur, and
ensuring adequate capacity is in place.
Improve budget predictability by tracking trends and modeling the effects of new services
or infrastructure, allowing you to avoid emergency purchases and ensure you get the best
prices.
This chapter describes how eHealth can help you to perform four major capacity planning
tasks:
Identify underutilized resources to find existing devices or resources that are underused,
resulting in unnecessary costs for leased lines and systems that are sitting idle.
Identify overutilized resources to find existing devices or resources that are overused,
resulting in performance degradation or penalty charges from overuse.
Plan future capacity needs to project capacity needs based on current demand trends or
anticipated business changes, allowing you to plan purchases and install upgrades as
needed.
Perform voice capacity planning to find over- and underutilization problems in your Telco
or converged networks.
Prerequisites: To use the best practices in this chapter, your eHealth database must have
at least a week of collected data. With more performance data and longer history, these
reports perform better for highlighting capacity trends and utilization problems.
These examples also assume that you are viewing the reports from the eHealth Web
interface. Reports on the Web interface have interactive “hot-spots,” which you can click to
drill down to other reports and closer detail. Drilldowns are not available from reports that
are in PDF format.
Additional References: Procedures for running, scheduling, and customizing reports are
described in detail in the eHealth Report Management Guide and the eHealth web help. For
details on the eHealth reports and how they work, see the eHealth web help. The best
practices in this section are taken from the Capacity Planning with eHealth topic, which is
available on the eHealth Support web site at http://support.concord.com.
How You Identify Underutilized Resources
To identify underused resources, follow these steps:
1. Locate underutilized resources.
2. Confirm underutilization.
3. Address underutilized resources.
4. Show ROI.
5. Update your configuration.
Locate Underutilized Resources
eHealth provides an Underutilized Elements report that allows you to quickly identify
elements that may be underutilized. This report is an optional Supplemental report in
eHealth Health reports. To view this report, you must customize a Health report to include
it.
To locate the underutilized elements in your network
1. Log in to the eHealth Web interface, and select the Run Reports tab.
2. Click the Standard Health report link. The Run Health Report page appears.
3. Specify the report subjects (for example, LAN/WAN technology, and a group of
elements).
4. Click More Options, and select Supplemental under Presentation Attributes.
5. In the list of Supplemental reports, select Underutilized Elements to include that report.
6. Save the report with a unique name, such as Underutilization_Report.
7. Click Generate Report to run the report. Because this report can take several minutes
to run on demand, the recommended best practice is to schedule the report to run from
the eHealth console so that the report runs overnight or during a time when the
eHealth system is not very busy.
8. Review the Underutilized Elements supplemental report. The report lists elements that
meet the following criteria for the past 8 days:
› Never reached 50% utilization
› Did not reach 10% utilization more than 5% of the time
9. In the report, look for leased lines, routers, switches, and systems that have
underutilized bandwidth, CPU capacity, memory, or disk space. For example, the
following report shows several high-speed OC-3 lines that have very low usage and
should be investigated further.
When you run an Underutilized Elements report for only LAN/WAN elements, the
elements are sorted first by speed (since faster WAN links are more expensive), and
then by the percentage of time that they were underutilized.
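The two criteria listed in step 8 can be expressed as a small Python check. This is a sketch of the stated rule, not eHealth code: the function name and the idea of passing in a flat list of utilization samples for the 8-day window are assumptions.

```python
def is_underutilized(samples, peak_limit=50.0, low_mark=10.0, max_fraction=0.05):
    """samples: utilization percentages polled over the report window
    (the report uses the past 8 days).
    An element qualifies when it never reached peak_limit and reached
    low_mark in no more than max_fraction of the samples."""
    if not samples:
        return False  # no data, so no conclusion
    if max(samples) >= peak_limit:
        return False  # criterion 1: never reached 50% utilization
    reached_low_mark = sum(1 for value in samples if value >= low_mark)
    # criterion 2: did not reach 10% utilization more than 5% of the time
    return reached_low_mark / len(samples) <= max_fraction


# Hypothetical sample sets: 100 polls each.
quiet_line = [2.0] * 97 + [12.0] * 3
busy_line = [2.0] * 90 + [12.0] * 10
print(is_underutilized(quiet_line), is_underutilized(busy_line))
```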
BEST PRACTICES
As you use the Underutilized Elements report, consider the following best practices that can
help to make the report more meaningful for your environment:
When you first install eHealth, run it weekly to identify resources that are not being used.
After this initial period, you can run it less frequently (monthly or quarterly) to identify
usage changes in your network.
Since the Underutilized Elements report looks at data from the past 8 days, you should
schedule the report to run on Sunday so that you get data for an entire business week.
Depending on how your network is used, you can edit the service profile so that the
report includes data from only certain days or times, to eliminate periods of low network
usage such as nights or weekends.
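The effect of restricting a report to certain days and times can be pictured with the following sketch. In eHealth this is done by editing the service profile, not with code. The function name and the 08:00 to 18:00, Monday to Friday window are assumptions chosen for illustration.

```python
from datetime import datetime


def business_hours_only(samples, start_hour=8, end_hour=18):
    """samples: list of (timestamp, utilization) pairs.
    Keeps Monday-Friday samples whose hour falls in [start_hour, end_hour)."""
    return [
        (timestamp, value)
        for timestamp, value in samples
        if timestamp.weekday() < 5 and start_hour <= timestamp.hour < end_hour
    ]


# Hypothetical polls: a weekday morning, a weekday night, and a Saturday.
polls = [
    (datetime(2019, 8, 5, 10, 0), 35.0),  # Monday 10:00, kept
    (datetime(2019, 8, 5, 23, 0), 1.0),   # Monday 23:00, dropped
    (datetime(2019, 8, 3, 10, 0), 0.5),   # Saturday 10:00, dropped
]
print(business_hours_only(polls))
```

Filtering out idle periods before applying the underutilization criteria keeps nights and weekends from making a busy daytime link look unused.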
Confirm Underutilization
After you find underutilized elements, analyze the purpose of each element and run reports
to confirm that it is actually underused. Run a monthly Health Report to confirm that it has
been underutilized for at least a month.
Important: Check unused network links to determine if they are backups. Since backups
are used only when the primary fails, they often do not have any usage.
To confirm that an element is underutilized
1. In the Health report (such as the one from the previous section), examine the
Bandwidth Utilization chart on the Element Detail page of the report.
The Bandwidth Utilization chart shows the load on each of the network interfaces over
the report period. For example, the bar for Helium 5734 is completely gray, indicating
that it did not have any usage during the month. Several other bars, such as Helium
7839, Miami, and Atlanta are all dark green, indicating they never exceeded 10%
usage. All of these interfaces appear underutilized.
2. Run a Bandwidth Trend Report by clicking the bar for the element that you suspect is
underused. The Bandwidth Trend Report shows the utilization for that element for the
same time period as the Health Report.
                   Net-Link     Chicago T1    Paxton T1
Current Speed      100 Mbps     1.54 Mbps     1.54 Mbps
New Speed          10 Mbps      512 Kbps      128 Kbps
Current Cost       $3,300       $2,500        $5,000
New Cost           $2,500       $2,000        $3,200
Switching Cost     $5,000       $1,000        $1,200
Monthly Savings    $800         $500          $1,800
4. Based on your ROI calculations, determine whether making the proposed changes
makes sense. For example, the table shows that downgrading the Net-Link from a 100
Mbps line to a 10 Mbps line would save $800 each month, but the high switching cost
means that you would not break even for over six months. Downgrading the Paxton T1
to a 128 Kbps line, on the other hand, would give you an ROI in less than one month.
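The break-even reasoning in step 4 is simple arithmetic: divide the one-time switching cost by the monthly savings. The sketch below applies it to the figures from the table. The function name and the dictionary layout are illustrative only.

```python
def break_even_months(switching_cost, monthly_savings):
    """Months needed for monthly savings to repay the one-time switching cost."""
    if monthly_savings <= 0:
        return float("inf")  # the change never pays for itself
    return switching_cost / monthly_savings


# (current monthly cost, new monthly cost, switching cost) from the table.
lines = {
    "Net-Link": (3300, 2500, 5000),
    "Chicago T1": (2500, 2000, 1000),
    "Paxton T1": (5000, 3200, 1200),
}
for name, (current_cost, new_cost, switching_cost) in lines.items():
    savings = current_cost - new_cost
    months = break_even_months(switching_cost, savings)
    print(f"{name}: saves ${savings}/month, breaks even in {months:.2f} months")
```

With these figures the Net-Link change breaks even after 6.25 months, the Chicago T1 after 2 months, and the Paxton T1 after roughly two thirds of a month.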
Update Your Configuration
After you change your configuration to resolve capacity issues, update your SPECTRUM and
eHealth environments to ensure that they reflect updated speeds and perhaps any
resources that have been retired.
To update your configuration
1. Update your SPECTRUM views using rediscovery to ensure that they reflect the latest
device information.
2. Update the eHealth polling configuration and element lists by re-importing the element
information from SPECTRUM and rediscovering your elements. Future reports for the
time ranges when element speeds changed may show unusual utilization percentages.
3. If you decreased capacity or added demand to existing resources, run Trend and Health
reports on those resources. Look for any Health exceptions or other utilization problems
that may result from the increased traffic.
4. If you eliminated an element, disable polling and retire the element in the eHealth
database. Retiring the element allows you to continue reporting on it until its data ages
out of the database.
How You Identify Overutilized Resources
To identify overutilized resources, follow these steps:
1. Locate overutilized resources.
2. Confirm overutilization.
3. Address overutilized resources.
Locate Overutilized Resources
eHealth’s capacity planning tools can help you to identify overutilized resources before they
start causing problems. By examining a single Health Report each week, you can identify
network elements that are reaching their capacity. You can then consult other reports to
analyze problems, and solve the issues before they become fires for your IT team.
To locate overutilized resources
1. On the eHealth web interface, run a daily Health report for the busiest day of your
week.
2. In the left pane of the Health report window, click Exceptions Summary to open the
Exceptions Summary Report. The Exceptions Summary report identifies elements that
have experienced unusual events or whose resources are consistently inadequate for
the demand on them. The elements are ranked by exception points, so that those
elements experiencing the worst problems are listed first.
3. Look for elements in the report that list Utilization Health Index or Congestion Health
Index in the Leading Exception column. These elements are experiencing high volume
and may be overutilized.
For example, the Frame Relay link to the Virginia office is listed first, and has Utilization
Health Index as its leading exception. This link is likely overutilized and should be
investigated further.
Confirm Overutilized Resources
The Situations to Watch chart identifies elements that are predicted to exceed, reach, or
come close to reaching their trend thresholds. The chart shows you how close each element
is to its threshold, how fast utilization is growing, and how long until demand exceeds
capacity.
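To make the idea of "how long until demand exceeds capacity" concrete, the following sketch fits a straight line to daily utilization values and projects when it crosses a threshold. The way eHealth actually computes its trend is not described in this guide. The least-squares fit, the function name, and the sample values are assumptions.

```python
import math


def days_to_threshold(daily_utilization, threshold):
    """Fit a least-squares line to daily utilization values and return the
    number of days until the line reaches the threshold.
    Returns 0 if the latest value is already at or over the threshold,
    and None if utilization is flat or falling."""
    n = len(daily_utilization)
    if n < 2:
        raise ValueError("need at least two daily values to fit a trend")
    if daily_utilization[-1] >= threshold:
        return 0
    mean_x = (n - 1) / 2
    mean_y = sum(daily_utilization) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(daily_utilization))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    # Days are counted from the most recent sample (index n - 1).
    return max(0, math.ceil(crossing_day - (n - 1)))


# Hypothetical week of daily averages growing by two points per day.
print(days_to_threshold([50, 52, 54, 56, 58], threshold=70))
```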
To confirm that resources are overutilized
1. On the eHealth web interface, run a Health report for the previous week.
2. Review the Situations to Watch chart in the Summary section of the Health report.
3. Review the elements listed in the chart, looking for those that have exceeded their
threshold or are growing fast enough to soon reach it. For example, the first element
(Virginia) has already exceeded its threshold for two days, while the next two are
predicted to reach threshold in the next week. All of these are likely to be overutilized
elements. Demand on the final two elements listed is increasing, but both are still at
less than 20% capacity, and do not represent a problem.
4. Select Element Detail in the Health Report, and examine the Bandwidth Utilization chart
for the elements that you suspect to be overutilized.
The Bandwidth Utilization chart shows the percentage of time that each element was in
each usage range. Generally, purple and red colors indicate an overutilized resource.
Purple indicates greater than 100% utilization, meaning that the element is probably a
leased line exceeding its contracted bandwidth, and, therefore, incurring overage
charges.
5. Examine the chart to see how often a suspected element was overutilized during the
course of the week. Some elements may show consistently high demand (such as the
Virginia line), but since demand varies over time, most elements will show significant
periods of low usage. Depending on your network activity, an element may not have
any usage at certain times (overnight for example), but still be overutilized because
demand exceeds capacity at peak times.
For example, the Vermont line in the chart does not show any usage a third of the time
(possibly overnight), and is under 20% usage most of the time. However, since it
exceeds 100% utilization at peak demand, it could be incurring overage charges, and,
therefore, be considered overutilized.
6. To obtain more details about an element’s performance, create an At-a-Glance report
by clicking that element in the Bandwidth Utilization chart.
7. Review the Bandwidth Utilization charts in the At-a-Glance report to determine how
frequently the element was overused, and during what time periods. The sample charts
show that the element had 50% utilization most of the week, but peaked near 100%
several times. Depending on your business needs, an element that reaches its capacity
for only one hour per week may be acceptable, or that one hour of overutilization could
be a critical problem if it occurs at a key business time.
8. Review the other charts in the At-a-Glance report for any anomalies, including high
error rates or signs of congestion (forward explicit congestion notifications (FECNs),
backward explicit congestion notifications (BECNs), discards). Use this information to
determine the conditions that might be affecting the element such as the following:
› Insufficient capacity
› Inefficient or misconfigured applications consuming excessive bandwidth
› Too many or too few stations overloading a WAN link
› A highly repeated or bridged domain that should be routed
9. Establish a report trail to document evidence of high usage. In addition to the reports
described here, you can select specific elements in the Exceptions Summary Report and
Situations to Watch chart to run detail reports for those elements. You can also run
Bandwidth Trend reports on specific elements to show the long-term utilization of a
resource.
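The Bandwidth Utilization chart used in steps 4 and 5 shows the share of time an element spent in each usage range. A minimal sketch of that bucketing follows. The range boundaries, function name, and sample data are assumptions, and the chart's actual ranges and colors may differ.

```python
def usage_distribution(samples, upper_bounds=(0, 10, 20, 50, 100)):
    """Return the percentage of samples that fall in each utilization range.
    Ranges are labeled by their inclusive upper bound, and anything above
    the last bound is reported separately as overage."""
    if not samples:
        raise ValueError("no samples to summarize")
    labels = [f"<= {bound}%" for bound in upper_bounds]
    labels.append(f"> {upper_bounds[-1]}%")
    counts = [0] * len(labels)
    for value in samples:
        for index, bound in enumerate(upper_bounds):
            if value <= bound:
                counts[index] += 1
                break
        else:
            counts[-1] += 1  # above every bound
    return {label: 100.0 * count / len(samples)
            for label, count in zip(labels, counts)}


# Hypothetical hourly samples for one day: idle overnight, light use most
# of the day, and two peak hours above the contracted bandwidth.
day = [0] * 8 + [15] * 14 + [105] * 2
for label, share in usage_distribution(day).items():
    print(f"{label}: {share:.1f}% of the time")
```

A nonzero share in the last range is the pattern described for the Vermont line: mostly idle or lightly used, yet overutilized at peak demand.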
How You Address Overutilized Resources
After you have identified and documented overutilized resources, consider taking these
typical actions to resolve the problem:
Upgrade the element to a higher capacity.
Relocate demand to other resources.
Add additional elements to share the workload.
BEST PRACTICE
Use Capacity Trend What-If reports to visualize the effects of higher capacity or lower
demand on the overutilized element, and determine the optimal capacity of any new
resources. When you run the report, you specify an element, a capacity variable, and a
time range. The report shows the value of that performance variable during that historical
range.
The What-If report is very similar to the eHealth Trend report; however, you can change
the capacity of the resource, the demand placed on the resource during that time, or both;
and then update the report to model the effects of possible changes.
Note: When you enter values for capacity and demand, note that you must specify
percentage values. For example, 100% causes the report to use the current values; 50%
causes the report to show half the current values (dividing the capacity or demand by 2);
and 200% causes the report to double the current values.
This report shows that by increasing the capacity of the Virginia line by 50% (capacity =
125%), peak utilization would be reduced to about 60% of capacity. This capacity should be
sufficient to meet expected demand.
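The percentage inputs described in the note scale the current capacity and demand before utilization is recalculated. The sketch below shows that arithmetic. It is not the What-If report's implementation, and the function name and the sample figures are hypothetical.

```python
def what_if_utilization(demand, capacity, demand_pct=100.0, capacity_pct=100.0):
    """Return utilization (percent) after scaling demand and capacity.
    100 keeps the current value, 50 halves it, and 200 doubles it."""
    scaled_demand = demand * demand_pct / 100.0
    scaled_capacity = capacity * capacity_pct / 100.0
    return 100.0 * scaled_demand / scaled_capacity


# Hypothetical link: peak demand of 75 units on a capacity of 100 units.
print(what_if_utilization(75, 100))                    # current peak utilization
print(what_if_utilization(75, 100, capacity_pct=125))  # capacity raised to 125%
print(what_if_utilization(75, 100, demand_pct=200))    # demand doubled
```

Because utilization is a ratio, raising capacity and lowering demand by the same factor have the same effect on the result, which is why the report lets you vary either value or both.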
How You Plan Future Capacity Changes
To plan and predict capacity changes, follow these steps:
1. Identify potential capacity changes.
2. Analyze capacity trends.
3. Visualize capacity changes.
4. Address capacity changes.
Identify Potential Capacity Changes
eHealth provides capacity planning reports that enable you to analyze the behavior of your
resources under varying conditions, and predict where and when you’ll need to add
capacity.
To identify potential capacity issues
1. Schedule a Health report to run every Sunday to ensure that you obtain data for an
entire business week. For instructions, refer to the section on customizing and
scheduling Health reports in Chapter 5 of this guide.
2. Examine the Situations to Watch chart in the Summary section of the Health report.
The Situations to Watch chart shows the top 10 elements (network interfaces, CPUs,
disk partitions) that are nearing their capacity. The chart shows how close each
element is to its threshold, how fast utilization is growing, and how long until demand
exceeds capacity.
This report shows several user partitions that are nearing their thresholds. In the Days
To Threshold column, System-Orange shows 0, meaning that utilization has reached
the Trend threshold. System-Green shows 20 days to threshold, and System-Pink
shows Increasing, indicating utilization is growing, but will not reach threshold for a
long period of time.
Each of the systems at or near their threshold merit further investigation. For example,
System-Orange could already be overutilized, or it could be a system partition designed
to operate near capacity. System-Green, on the other hand, is 20 days from meeting
its threshold, but could be a good candidate for upgrade if it is showing a steady
increase in demand.
3. To drill down to more information for each reported situation, click the element name to
run a Situations to Watch Detail report for the partition.
4. Examine the trend line to see how quickly the trend is approaching the threshold. If the
line is rising at a steady rate, as in this example, consider adjusting capacity by
increasing the size of the partition, deleting unneeded directories and files, or buying a
new system.
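The Days To Threshold idea described above can be sketched as a straight-line projection. This is a minimal illustration that assumes one utilization sample per day and a least-squares trend line; the function names are hypothetical, and the source does not state which algorithm eHealth actually uses.

```python
from typing import Optional, Sequence, Tuple


def linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Fit a least-squares line to daily samples; return (slope_per_day, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_to_threshold(samples: Sequence[float], threshold: float) -> Optional[float]:
    """Days after the last sample until the trend line reaches the threshold.

    Returns 0.0 if the trend is already at or above the threshold, and None
    if the trend is flat or falling (the threshold is never reached).
    """
    slope, intercept = linear_trend(samples)
    current = intercept + slope * (len(samples) - 1)
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - current) / slope
```

For example, a partition that grows from 70% to 74% over five daily samples is trending up by one point per day, so `days_to_threshold([70, 71, 72, 73, 74], 90)` projects 16 days until a 90% threshold.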
Analyze Capacity Trends
After identifying potential upgrade candidates, run Capacity Projection and Capacity
Provisioning reports to forecast volume changes over the upcoming weeks and months, and
predict when elements need to be upgraded.
To run Capacity Projection and Capacity Provisioning reports
1. Log in to the eHealth Web interface, and select the Run Reports tab.
2. Click the Standard Health report link. The Run Health Report page appears.
3. Specify the report subjects (for example, System technology, and a group of
elements).
4. Click More Options, and do the following:
a. Under Presentation Attributes, select Capacity.
b. Select Capacity Projection and Capacity Provisioning to include those reports.
c. Specify 20 in the Capacity Provisioning Minimum Lead-Time field.
d. Specify 90 in the Capacity Provisioning Maximum Lead-Time field.
5. Save the report as a template with a unique name, such as Capacity_Report.
6. Click Generate Report to run the report. The report can take a few minutes to run on
demand. As a best practice, you can schedule the report to run from the eHealth
console so that the report is automatically generated during off-peak hours and is
ready for review when you need it.
7. Review the Capacity Projection report. The report forecasts how the capacity of a
particular variable (partition utilization, for example) will change in the future. You can
run the report based on peak, average, or percentile capacity values. eHealth measures
the predicted capacity values against a threshold that you specify, and displays those
elements predicted to exceed the threshold.
This report displays the percentage of partition capacity that will be consumed on each
system at 30 days, 90 days, and nine months into the future. You can see that demand
on System-Orange is near threshold, but not increasing very much. Demand on
System-Purple, however, is quickly increasing and will soon exceed capacity.
System-Purple, therefore, may be in greatest need of upgrade.
8. To project when these elements will need to be upgraded, review the Capacity
Provisioning report. The Capacity Provisioning report compares projected capacity
values against an upgrade threshold, and displays those elements predicted to exceed
the threshold, along with the number of days until an upgrade is required.
As with the Capacity Projection report, you can run this report based on peak, average, or
percentile capacity values. You can set both the upgrade threshold and an upgrade
lead-time window by customizing the Presentation Attributes for the Health report.
The report shows elements that are predicted to meet a 90% capacity upgrade point
within the next 20 to 90 days. System-Green is most in need of upgrade, and should be
addressed in the next 20 days.
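The lead-time window used in these steps (20 to 90 days) can be sketched as a simple classification. This is an illustrative assumption about how such a window might be applied, not eHealth's documented logic; the function name and the three labels are hypothetical.

```python
from typing import Optional


def provisioning_status(days_to_upgrade_point: Optional[float],
                        min_lead_days: float = 20,
                        max_lead_days: float = 90) -> str:
    """Classify an element by when it is projected to reach the upgrade point.

    days_to_upgrade_point is None when no upgrade is projected.
    The labels are illustrative, not eHealth terminology.
    """
    if days_to_upgrade_point is None or days_to_upgrade_point > max_lead_days:
        return "watch"             # beyond the window; no order needed yet
    if days_to_upgrade_point < min_lead_days:
        return "inside lead time"  # upgrade point arrives before capacity could be delivered
    return "order now"             # falls within the lead-time window
```

With the default window, an element projected to reach its upgrade point in 45 days is classified as "order now", while one projected at 120 days is classified as "watch".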
BEST PRACTICES
For the Capacity Projection and Capacity Provisioning reports, it is important to know how much lead
time you need to bring new capacity online. For example, some service providers may
require 90 days to provide a new T1 line. For systems, you might need 30 days to order
and add new disk space or memory. Therefore, for the types of resources you manage, you
need to know when you must order additional capacity so that it is available (installed,
tested, and turned over) before the upgrade point is reached.
These reports identify those locations for which additional capacity needs to be ordered
today to avoid reaching the threshold. The examples in this section show a 20-90 day lead
time, and all three locations are projected to require an upgrade within that window. If it
takes 90 days to add disk space or memory, it is very likely that the 90% upgrade
threshold will be violated during this time period. When you first start to use eHealth to
monitor your resources, you may find that some of your resources need upgrades sooner
than your lead times might allow; but over time, these reports will help you to isolate
problems earlier and avoid threshold violations before your lead time windows expire.
Visualize Capacity Changes
The Capacity Trend What-If report shows how resources perform as your infrastructure
changes and grows. These reports allow you to leverage historical data to predict future
patterns, model changes in capacity or demand, and determine the effect on resources.
To help visualize the impact of changes in demand
1. Run a Capacity Trend What-If report to analyze potential solutions:
a. On the Run Reports page in the Available Reports column under What-If, select
CapacityTrend or another template name. The Run a Capacity Trend What-If
Report page appears.
b. Select an element type from the Element Type list, and then select an element
from the Available elements list.
c. Select a variable for your report.
d. Under Chart type, select the chart format.
e. Under Divide by, specify how you want to graph the selected variable.
f. Optionally, select a time interval during which the data is aggregated.
g. Select a sample size based on the time range for your report. The As Is sample size
uses the most granular data available and does not aggregate the values.
h. Under Report Time, select the report period. You can specify the values now,
today, or yesterday, or an actual date or time value.
i. Select More Options to specify the hours and days that the report will show.
j. Optionally, customize the report by setting presentation attributes.
k. Click Generate Report.
2. Use the fields at the top of the report to adjust the capacity and/or demand for the
resource, and run the report again to model the change.
3. Use the report to model and determine whether an existing resource can support
anticipated changes and, if not, how much capacity must be added. You can also
illustrate potential problems so that you can propose requests for new equipment or
upgrades.
For example, this report shows that by doubling CPU capacity, demand on the server
will be well under the trend threshold of 80%, even with a 50% increase in demand.
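The arithmetic behind a what-if adjustment can be sketched as follows. This sketch assumes that utilization scales linearly with demand and inversely with capacity, which is a simplification and not necessarily how the What-If report computes its results; the function name and the 90% starting utilization are hypothetical.

```python
def what_if_utilization(current_utilization_pct: float,
                        demand_change_pct: float = 0.0,
                        capacity_change_pct: float = 0.0) -> float:
    """Estimate utilization after changing demand and capacity by the given percentages."""
    demand_factor = 1 + demand_change_pct / 100
    capacity_factor = 1 + capacity_change_pct / 100
    if capacity_factor <= 0:
        raise ValueError("capacity must remain positive")
    return current_utilization_pct * demand_factor / capacity_factor


# Hypothetical server at 90% CPU utilization: double the capacity (+100%)
# and assume a 50% increase in demand, as in the scenario described above.
projected = what_if_utilization(90, demand_change_pct=50, capacity_change_pct=100)
within_trend_threshold = projected < 80
```

Under these assumptions, the projected utilization is 67.5%, which stays under an 80% trend threshold.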
How You Address Capacity Changes
After you have identified possible capacity issues or improvements, consider these typical
actions to resolve the problem:
Upgrade the element to a higher capacity.
Replace the element with a larger or faster device (such as a larger disk, faster interface,
or faster CPU system).
The What-If report can help you to model, or visualize, how the proposed changes will
improve performance. After making any changes to your devices or resources, update your
configuration as described in Chapter 5.
Voice Capacity Planning
For networks with traditional or IP voice telephony devices or voice messaging systems,
eHealth for Voice can help you to identify capacity problems and monitor GoS during the
peak hours of the network. For voice devices, capacity problems can include trunk/port
utilization and voice mail messaging disk space. When these resources are overutilized, service
degrades and customer satisfaction suffers. When these resources are
underutilized, it is important to identify where your devices might be overprovisioned so
that you can take some steps to reduce costs or reallocate resources to resolve congestion
in other areas of the network.
Effective capacity planning enables you to achieve the following:
Reduce costs through the reduction or elimination of underused leased lines, as well as
the reduction of maintenance costs for unused or unnecessary ports.
Improve performance through identification of overused and underused ports or trunks
and rebalancing of capacity with demand.
Improve budget predictability by tracking trends, which helps you to avoid emergency
purchases and to ensure that you can research and plan for the best service costs.
To understand traffic patterns, you need to collect information from the PBX that details
peak traffic for each trunk group for at least a few weeks, preferably months. This
information is available on the switch. eHealth for Voice automates the collection of this
information, making it easier to run quarterly and on-demand maintenance assessments of
your voice capacity and usage patterns.
Analyze Voice Capacity
Once you have collected traffic for the desired period, you can use the Capacity Analyzer
tool to determine how well your voice devices are servicing customers during the busiest
hour. From this dialog, you can quickly calculate GoS, view disk space capacity for message
servers, and view process capacity for communications servers.
To access the Capacity Analyzer
1. On the system on which eHealth for Voice is installed, select Start, Programs, eHealth
for Voice, eHealth for Voice. The eHealth for Voice Program Console appears.
2. Select Measurements, Reports in the left navigation console tree.
3. Double-click the Capacity Analyzer icon in the right pane of the console. The Capacity
Analyzer dialog appears.
Analyze GoS
eHealth for Voice calculates a GoS to determine how callers are serviced (calls answered,
busy, or ring no answer) during the busiest hour of the period. This can help you to
determine additional bandwidth needed to carry voice traffic on the network.
To analyze the GoS
1. In the Capacity Analyzer dialog, select the Port Analysis tab to access the grade of
service calculation tool. The target grade of service determines the percentage of callers
that will be serviced (calls answered) during the busiest hour. A GoS of .001 means
that 99.9% of callers will get through.
2. Select the target GoS and click Apply. The dialog shows the number of trunks that you
need to add to support that GoS.
3. Use the horizontal scroll bar to scroll to the right side of the dialog.
4. Review the Add/(Delete) Trunks column to determine the number of trunks that you
will need to provide that GoS. A number in parentheses shows the number of trunks
that you could remove and still be able to support the GoS during the peak hour, which
can help you to detect underutilized resources.
5. Review the Erlangs column to determine the actual peak traffic. An Erlang is a
measurement of voice traffic. It expresses how many hours of voice traffic
occur during an hour of time, so 60 minutes of calls in one hour equals 1 Erlang. If 10
users each make one 10-minute call in a given hour, the hour had 100 minutes of calls,
or 1.67 Erlangs of traffic. This information
helps you to identify how much additional bandwidth you need on the network to
support voice.
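The relationship between Erlangs, GoS, and trunk counts can be sketched with the Erlang B formula, a widely used model for blocked calls. The source does not state which traffic model the Capacity Analyzer uses, so treating GoS as an Erlang B blocking probability is an assumption here, and the function names are hypothetical.

```python
def erlangs(call_minutes_in_hour: float) -> float:
    """Convert total call minutes observed in one hour to Erlangs."""
    return call_minutes_in_hour / 60.0


def erlang_b(traffic_erlangs: float, trunks: int) -> float:
    """Blocking probability for the offered traffic on the given number of trunks.

    Uses the standard Erlang B recurrence:
    B(E, 0) = 1;  B(E, n) = E * B(E, n-1) / (n + E * B(E, n-1)).
    """
    blocking = 1.0
    for n in range(1, trunks + 1):
        blocking = traffic_erlangs * blocking / (n + traffic_erlangs * blocking)
    return blocking


def trunks_needed(traffic_erlangs: float, target_gos: float) -> int:
    """Smallest trunk count whose blocking probability does not exceed target_gos."""
    if not 0 < target_gos <= 1:
        raise ValueError("target_gos must be in the range (0, 1]")
    if traffic_erlangs < 0:
        raise ValueError("traffic must not be negative")
    n = 0
    while erlang_b(traffic_erlangs, n) > target_gos:
        n += 1
    return n
```

For the example in step 5, `erlangs(100)` returns approximately 1.67. Passing that value and a target GoS to `trunks_needed` gives a trunk count that can be compared with the Add/(Delete) Trunks column, keeping in mind that the tool may apply a different model.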
BEST PRACTICES
As you use the Capacity Analyzer, consider the following best practices that can make
results more meaningful for your environment:
When you first install eHealth for Voice, run the Capacity Analyzer weekly to identify
resources that are not being used. After this initial period, you can run it less frequently
(monthly or quarterly) to identify usage changes in your network.
Trunk or port lines with zero or very low traffic could be backups or overflow lines. Before
proceeding with detailed service change plans, always confer with the person responsible
for PBX/IP-PBX engineering to ensure that you understand the purpose of any trunks or
ports.
The level of over- or underutilization varies, depending upon the GoS selected. As the
GoS value decreases (that is, as fewer blocked calls are tolerated), the need for additional
resources increases. Company service levels will
help define the GoS needed for your environment.
How You Address Underutilized Resources
After you have identified and documented underused resources, consider eliminating
unused trunks, ports, or PBXs to reduce the service charges and/or maintenance charges
for your network.
Show ROI
After you have identified capacity changes that you could make, you can calculate and
show potential monthly savings from eliminated trunks and the difference in maintenance
costs between current and future configurations.
To estimate the ROI
1. Review your monthly usage fees to identify the cost of leased lines that may be
underutilized.
2. Review your port maintenance fees to identify costs for unused ports.
3. Contact your service providers to identify possible costs for changing service or
reducing the number of ports. If you have internal costs for changing service, take
those costs into consideration as well.
4. Calculate the ROI for making changes using the following equation:
ROI = (service-change fees + port-change fees) / monthly savings
5. Based on your ROI calculations, determine whether making the proposed changes is
wise.
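Because the equation in step 4 divides one-time fees by monthly savings, its result can be read as the number of months needed to recover the cost of the change. The following sketch applies the equation as written; the function name and the dollar amounts are hypothetical.

```python
def payback_months(service_change_fees: float,
                   port_change_fees: float,
                   monthly_savings: float) -> float:
    """Apply the equation from step 4: one-time change fees divided by monthly savings."""
    if monthly_savings <= 0:
        raise ValueError("monthly savings must be positive for the change to pay back")
    return (service_change_fees + port_change_fees) / monthly_savings


# Hypothetical figures: 1200 in service-change fees, 300 in port-change fees,
# and 500 saved per month after the change.
months = payback_months(1200, 300, 500)
```

With these figures, the change pays for itself in three months; a smaller result indicates a faster return.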
How You Address and Confirm Overutilized Resources
The Capacity Analyzer provides the peak traffic for a given timeframe as well as the
required number of trunks or ports to handle the traffic load for the desired grade of
service.
To confirm the results of overutilization, do the following:
Run the Capacity Analyzer again and select a more granular date range. For example, if a
quarterly-report peak hour shows utilizations that seem unusually out of range, evaluate
each month to see the pattern or trends of busy-hour data. This can help you to
investigate whether the busy hour is an anomaly, or if the traffic is growing in your
network. If the busy hour is related to a one-time event, you can ignore this atypical
activity in your capacity planning.
Run Voice traffic reports for the platform to see trends in trunk or port usage. In
this way, you can see if there is any overflow to another trunk group.
Verify the GoS selected with the person responsible for PBX/IP-PBX engineering for this
trunk group. Confirm that your analysis for each trunk group uses the GoS originally
intended or planned for that group.
Contact your service provider to add capacity, for example by adding trunks to the
hunt group or, if the line is a fractional T1, by increasing its capacity.
Chapter 10: Rapid Problem Resolution
When even the smallest problem occurs in a network, a wide range of services and
capabilities can be affected. Network management systems detect these problems and can
often send streams of events to report slowdowns, outages, and impacted services. This
barrage of information, though accurate, often hinders troubleshooting efforts simply
because of the amount of data that operators must filter through.
CA’s Network and Voice Management solution helps you to direct your troubleshooting
efforts to the source of the problem. SPECTRUM software performs event correlation,
impact analysis, and RCA for multiple vendors and technologies across network, system,
voice, and application infrastructures. Combined with eHealth’s ability to find and report on
performance behavior changes, and eHealth for Voice’s ability to monitor the policies and
capacities of voice networks, CA offers a key solution for identifying problems, quickly
targeting the real source of the problem, and providing deeper insights into historical trends
and reports.
This chapter describes how SPECTRUM’s problem resolution and root cause identification
processes work.
Problem-Solving Techniques
SPECTRUM offers three intelligent, automated, and integrated approaches to problem
solving:
Model-based Inductive Modeling Technology (IMT)
Rules-based Event Management System (EMS)
Policy-based Condition Correlation Technology (CCT)
SPECTRUM is fundamentally a model-based system. Model-based systems are adaptable to
changes that regularly occur in a real-time, on-demand, IT infrastructure. Rules-based
systems are flexible in allowing customers to add their own intelligence without requiring
programming skills. SPECTRUM combines the best of both approaches, using models to
keep up with changes while leveraging easy-to-create rules running against the models to
avoid the need for constant rule editing. Policy-based systems are automated means of
connecting seemingly unrelated pieces of information to determine condition and state of
physical devices and logical services. This condition correlation engine combines with
SPECTRUM’s modeling engine and rules engine to deliver a higher level of cross-silo service
analysis.
You can place almost every service delivery infrastructure problem into one of three
categories: availability, performance, or threshold exceeded. Infrastructure faults occur
when things break, whether they are related to LAN/WAN, server, storage, database,
application or security. Infrastructure performance problems often result in brown-out
conditions in which services are available but are performing poorly. From the user’s
perspective, a slow infrastructure is a broken infrastructure. The final category is abnormal
behavior conditions in which performance, utilization, or capacity thresholds have been
exceeded as demand/load factors fall significantly above or below observed baselines.
SPECTRUM, eHealth, and eHealth for Voice can detect these problems in your network and
raise alarms when problems occur. By sending all alarms to SPECTRUM, you can pinpoint
the cause of problems.
Model-based, rules-based, and policy-based analytics in SPECTRUM understand
relationships between IT infrastructure elements and the customers or business processes
that they are designed to support. It is through this understanding of relationships that
SPECTRUM has been shown to deliver a 70% reduction in downtime while resolving 90% of
availability or performance problems from a central location. SPECTRUM’s RCA has been
able to reduce the number of alarms by several orders of magnitude while reducing MTTR
from hours to minutes. SPECTRUM’s distributed management architecture has also proven
effective at performing RCA for over 5 million devices (20+ million ports) in a single
environment with fully meshed and redundant core and distribution network layers. Our
integrated approach to fault and performance management has enabled enterprise,
government, and service provider organizations around the world to manage what matters
through service level intelligence.
Complex Problems and Powerful Solutions
IT infrastructure operations management is a difficult and resource-intensive, yet
necessary, undertaking. When the infrastructure fails or slows down, tools are required to
quickly pinpoint the root cause, suppress all symptomatic faults, prioritize based on
business impact, and aid in the troubleshooting and repair process to accelerate service
restoration.
To ensure the performance and availability of the infrastructure, most companies employ a
dual approach of highly available, fault-tolerant, load-balancing designs for infrastructure
devices and communication paths, and a management solution to ensure proper operation.
In fact, the job of the management solution is further complicated by today’s high-
availability environments. The management solution must understand the load-balancing
capacity; it must be able to track primary and fault-tolerant backup paths; and it must
understand when redundant systems are active. The investment in the management solution is as
important as the investment in the infrastructure itself.
Problem Prediction and Prevention
Management software should help predict or prevent problems. CA’s out-of-the-box
utilization, performance, and response time thresholds can be used to act as an early
warning system when a problem is about to happen or when a service level guarantee is
about to be violated. While these thresholds can obviously be tuned for a specific customer
environment, it is also important to have out-of-the-box thresholds that are relevant from
the start of your monitoring baselines.
Before you can begin the true task of troubleshooting, you must isolate the problem.
Simply being aware of the problem and collecting the data is not sufficient. To effectively
triage the issue, you need to determine the location or source of the problem (and where
the problem does not exist). If multiple problems are occurring simultaneously, you should
be able to automatically prioritize issues based on impacted customers, services, or
infrastructure devices. It is far too costly to rely on human intervention to determine the
root cause of problems, and to sift through an unending stream of symptomatic problems.
Every minute that you devote to isolating the problem is a minute lost to solving the
problem.
status, parameter-based threshold violations, response time measurement threshold
violations, deviations from historical performance, and health analysis.
SPECTRUM’s RCA is the automated process of troubleshooting the infrastructure and
identifying the managed elements that have failed to perform their function. The goal of
SPECTRUM’s RCA is straightforward: identify a single source of failure, the Root Cause, and
generate the appropriate actionable alarm for the failed managed element.
Inductive Modeling Technology
The core of SPECTRUM’s RCA solution is its patented IMT. IMT uses a powerful object-
oriented modeling paradigm with model-based reasoning analytics. In SPECTRUM, IMT is
most often used for physical and logical topology analysis as SPECTRUM can automatically
map topological relationships through its auto-discovery engine.
In SPECTRUM, a “model” is the software representation of a real-world managed element,
or a component of that managed element. This representation not only allows SPECTRUM to
investigate and query an individual element within the network, but also provides the
means to establish relationships between elements to recognize them as part of a larger
system. IMT’s RCA is based on a sophisticated system of models, relationships and
behaviors that create a software representation of the infrastructure. Decisions concerning
which element is at fault are not determined by looking at a single element alone. Instead,
the relationship between the elements is understood and the conditions of related managed
elements are factored into the analysis. Models are in direct communication with their real-
world counterparts, enabling SPECTRUM to not only listen, but proactively query for health
status or additional diagnostic information. Models are described by their attributes,
behaviors, relationships to other models, and algorithmic intelligence.
Intelligent analysis is enabled through the collaboration of models in a system. This
collaboration enables correlation of the symptoms, suppression of unnecessary alarms, and
impact analysis of affected users, customers, and services. Collaboration includes the ability
to exchange information and initiate processing between any models within the modeling
system. A model that is making a request to another model may, in turn, trigger that model
to make requests to other models, and so on. Relationships between models provide a
context for collaboration.
Collaboration between models enables the following:
Correlation of the symptoms
Suppression of unnecessary/symptomatic alarms
Impact analysis
A simple example of IMT in action can be demonstrated by a network router port transition
from UP to DOWN. If a port model receives a LINK DOWN trap, it has intelligence to react
by performing a status query to determine if the port is actually down. If it is, in fact,
DOWN, it consults the system of models to determine if the port has lower layer sub-
interfaces. If any of the lower layer sub-interfaces are also DOWN, only the condition of the
lower layer port will be raised as an alarm. An application of this example can be described
by several Frame Relay DLCIs transitioning to INACTIVE. If the Frame Relay port is DOWN,
IMT will suppress the symptomatic DLCI INACTIVE conditions and raise an alarm on the
Frame Relay port model. Additionally, when the port transitions to DOWN, IMT will query
the status of the connected Network Elements (NEs) and if those are also DOWN, those
conditions will be considered symptomatic of the port DOWN, will be suppressed, and will
be identified as impacted by the port DOWN alarm. Root cause and impact are determined
through IMT’s ability to both listen and talk to the infrastructure.
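The port example above can be sketched as a small model graph. This is a simplified illustration of the verify, correlate, and suppress steps, not SPECTRUM's implementation: the `Model` and `Analysis` classes, their fields, and `handle_link_down` are hypothetical names, and the `is_down` flag stands in for a live status query.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Model:
    """Software stand-in for a managed element (hypothetical structure)."""
    name: str
    is_down: bool = False  # stands in for the result of a live status query
    lower_layer: List["Model"] = field(default_factory=list)
    connected: List["Model"] = field(default_factory=list)


@dataclass
class Analysis:
    root_cause: Optional[str] = None
    suppressed: List[str] = field(default_factory=list)


def handle_link_down(port: Model) -> Analysis:
    """React to a LINK DOWN trap received for the given port model."""
    result = Analysis()
    if not port.is_down:
        return result  # the status query contradicts the trap; raise no alarm
    root = port
    down_lower = [m for m in root.lower_layer if m.is_down]
    while down_lower:
        # A lower layer is also down, so this model's condition is symptomatic.
        result.suppressed.append(root.name)
        root = down_lower[0]
        down_lower = [m for m in root.lower_layer if m.is_down]
    result.root_cause = root.name
    # Connected elements that are also down are suppressed as impacted symptoms.
    result.suppressed.extend(m.name for m in root.connected if m.is_down)
    return result
```

For the Frame Relay case, a DLCI model whose lower layer is a Frame Relay port that is DOWN yields the port as the root cause and the DLCI condition as suppressed.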
Event Management System
At times, event streams local to a specific source are the only source of management
information. Any one event may or may not be a significant occurrence — but in the
context of other events, information, or time, it may be an actionable condition. Event
Rules in SPECTRUM’s Event Management System provide a more complex decision-making
system to indicate how events should be processed. You can apply Event Rules to look for a
series of events to occur on a model in a certain pattern, within a specific timeframe, or
with certain data value ranges. You can use Event Rules to generate other events or even
alarms.
If events occur that meet the preconditions of a rule, SPECTRUM may do the following:
Generate another event, allowing cascading events.
Log the event for later reporting/troubleshooting purposes.
Promote the event into an actionable alarm.
SPECTRUM provides six customizable Event Rule types that form the basis of the Event
Management System rules-based engine.
These rule types are building blocks that can be used individually or cooperatively to effect
an alarm on the most simple or sophisticated event-oriented scenarios. This Event
Management System rules engine allows for the correlation of event frequency/duration,
event sequence and event coincidence.
The Event Rule types are as follows:
Event Pair (Event Coincidence): This rule type generates a new event when the second of two
events that you define does not follow the first. If the second event in a series does not
occur, this may indicate a problem. The Event Pair rule type creates a more relevant
event based on this scenario. Event rules based on the Event Pair rule type generate a
new event when an event occurs without its paired event. It is possible for other events
to occur between the specified event pair without affecting this event rule.
Event Rate Counter (Event Frequency): This rule type generates a new event based
on events that occur at a specified rate in a specified time span. A few events of a certain
type might not be a problem, but if the number of these events reaches a certain
threshold within a specified time period, notification is required. SPECTRUM does not
generate additional events if the rate stays at or above the threshold. If the rate drops
below the threshold and then subsequently rises above the threshold, it generates
another event. The Event Rate Counter type is best suited for detecting a long, sustained
burst of events.
Event Rate Window (Event Frequency): This rule type generates a new event when a
number of the same events are generated in a specified time period. The Event Rate
Window type is best suited for accurately detecting shorter bursts of events. It monitors
an event that is not significant if it occurs occasionally, but is significant if it happens
frequently within a short period of time. If an event occurs a few times during the day, a
problem may not exist. If an event occurs five times in one minute, perhaps that is a
condition for which you want to be notified. If the event occurs above a certain rate,
SPECTRUM generates another event. SPECTRUM will not generate additional events if the
rate stays at or above the threshold. If the rate drops below the threshold and then
subsequently rises above the threshold, it generates another event.
Event Sequence (Event Sequence): This rule type identifies a
particular order of events that might be significant in your environment. This
sequence can include any number and any type of events. When the sequence is detected
in the given period of time, SPECTRUM generates a new event.
Event Combo (Event Coincidence): This rule type generates a new event when a
certain combination of events occurs in any order. The combination can include any
number and type of events. When the combination is detected within a given time period,
SPECTRUM generates a new event.
Event Condition (Event Coincidence): This rule type generates an event based on a
conditional expression. As part of SPECTRUM’s “trust but verify” methodology, a series of
conditional expressions can be listed within the event rule, and the first expression that is
found to be TRUE generates the event specified with the condition. You can construct
rules to provide correlation through a combination of evaluating event data with IMT
model data (including attributes which can be read directly from the remote managed
element). For example, if a trap is received notifying the management system of memory
buffer overload, to validate that an alarm condition has occurred, an Event Condition rule
can initiate a request to the device to check actual memory utilization.
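The behavior described for the Event Rate Window rule type can be sketched as a sliding window over event timestamps. This is a minimal illustration, not SPECTRUM's rule engine or its configuration syntax; the class and method names are hypothetical, and the rate is re-evaluated only when an event arrives.

```python
from collections import deque


class EventRateWindow:
    """Signal once when `count` events fall within `window_seconds` (hypothetical sketch)."""

    def __init__(self, count: int, window_seconds: float):
        self.count = count
        self.window_seconds = window_seconds
        self._timestamps = deque()
        self._above_threshold = False

    def observe(self, timestamp: float) -> bool:
        """Record one event; return True if a new event should be generated."""
        self._timestamps.append(timestamp)
        # Discard events that have aged out of the window.
        while timestamp - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        above = len(self._timestamps) >= self.count
        # Generate only on the transition from below to at-or-above the threshold.
        generate = above and not self._above_threshold
        self._above_threshold = above
        return generate
```

With `EventRateWindow(5, 60)`, the fifth event inside one minute generates a new event, further events at the same rate generate nothing, and a later burst generates again only after the rate has dropped below the threshold.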
SPECTRUM implements a number of event rules out-of-box by applying one or more of the
event rule types to event streams. You can create or customize event rules using any of the
rule types and apply these Event Rules on other event streams. Further implementation of
event rules using the Event Management System is discussed later in this paper.
Condition Correlation
To perform more complex user-defined or user-controlled correlations, SPECTRUM offers a
policy-based CCT that enables the following:
Creation of correlation policies
Creation of correlation domains
Correlation of seemingly disparate event streams or conditions
Correlation across sets of managed elements
Correlation within managed domains
Correlation across sets of managed domains
Correlation of component conditions as they map to higher order concepts such as
business services or customer access
Several important concepts relate to condition correlation:
Conditions: A condition is similar to a state. An event or action can set a condition and can
also clear it. It is also possible to have an event set a condition but require a user-based
action to clear the condition. The condition exists from the time it is set until the time it is
cleared. A very simple example of a condition is a “port down” condition. The “port down”
condition will exist for a particular interface from the time that the LINK DOWN trap or set
event (such as a failed status poll) is received until the time the LINK UP trap or clear
event (such as a successful status poll) is received. A number of conditions that may be
useful for establishing domain level correlations are defined out-of-box in SPECTRUM, and
you can add more.
Seemingly Disparate Conditions: Many devices in an IT infrastructure provide a
specific function. The device-level function is often without context as it relates to the
functions of other devices/components. Most managed elements can emit event streams,
but those event streams are local to each component. A simple example is when a
Response Time Management system identifies a condition of a test result exceeding a
threshold. At the same time, an Element Management System may identify a condition of
a router port exceeding a transmit bandwidth threshold. These conditions are seemingly
disparate as they are created independently and without context or knowledge of each
other. In reality, the two are often closely related; that is, an overutilized port could be
the cause of the response degradation.
Rule Patterns: Rule Patterns associate conditions when specific criteria are met. A
simple example is a “port down” condition caused by a “board pulled” condition. The two
conditions are likely related if the port and board have the same slot number. The
following diagram illustrates this rule pattern. A rule pattern can result in the creation of
an actionable alarm or the suppression of symptomatic alarms.
Correlation Domains: You can use a Correlation Domain to both define and limit the
scope of one or more Correlation Policies. You can apply it to a specific Service. For
example, in the Cable Broadband environment, a return path monitoring system may
detect a return path failure in a certain geographic service area. This “return path failure”
condition is causing subscribers’ high-speed cable modems to become unreachable and
Video on Demand (VoD) pay-per-view streams to fail. The knowledge that the return path
failure, the modem problems, and the failed video streams are all in the same correlation
domain is essential to correlating the events and ultimately identifying the root cause.
However, it is also important to have the ability to distinguish that a “return path failure”
condition occurring in one correlation domain (Philadelphia) should not be correlated with
VoD stream failure conditions occurring in a different correlation domain (New York).
Correlation Policies: You can bundle multiple Rule Patterns into Correlation Policies. You
can then apply Correlation Policies to a Service or Correlation Domain. For example, you
can create a bundle of rule patterns applicable to OSPF and label them the OSPF
Correlation Policy. You can apply the OSPF Correlation Policy to each Correlation Domain,
where each autonomous OSPF region and the supporting routers in that region define the
Correlation Domain. As another example, you could define a Correlation Policy based on a
set of rule patterns that operate within the confines of an MPLS/BGP VPN, labeled as the
Intra-VPN Policy, and apply it to all modeled VPNs. Whenever you add a rule to a
Correlation Policy, or delete one from it, SPECTRUM automatically updates all related
Correlation Domains immediately. You can apply multiple Correlation Policies to any
Correlation Domain, and apply a Correlation Policy to many Correlation Domains.
Condition-based correlations are very powerful and provide a mechanism to develop
Correlation Policies and apply them to Correlation Domains. When you apply them to
Service Level Management, Correlation Policies are similar to metrics of an SLA, and
Correlation Domains are similar to service, customer, or geographical groupings.
Occasionally, the only way to infer a causal relationship between two or more seemingly
disparate conditions is when those conditions occur in a common Correlation Domain. These
mechanisms are necessary when SPECTRUM cannot discover causal relationships
through interrogations.
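The relationship between conditions, rule patterns, Correlation Policies, and Correlation Domains can be sketched in Python. This is a minimal model under stated assumptions, not SPECTRUM's CCT interface: the `Condition` fields, the function names, and the extra requirement that both conditions sit on the same model are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Condition:
    name: str    # for example "port down"
    model: str   # managed element the condition is set on
    slot: int    # slot number reported with the condition
    domain: str  # correlation domain the model belongs to

def board_pulled_causes_port_down(cause: Condition, symptom: Condition) -> bool:
    """Rule pattern: "port down" is caused by "board pulled" in the same slot."""
    return (cause.name == "board pulled"
            and symptom.name == "port down"
            and cause.model == symptom.model  # assumption: same device
            and cause.slot == symptom.slot)

# A correlation policy is a bundle of rule patterns (only one pattern here).
POLICY = [board_pulled_causes_port_down]

def correlate(active: list[Condition], policy=POLICY) -> dict[Condition, list[Condition]]:
    """Map each cause to the symptoms it explains, never crossing domains."""
    result: dict[Condition, list[Condition]] = {}
    for cause in active:
        for symptom in active:
            if cause.domain != symptom.domain:
                continue  # the correlation domain limits the policy's scope
            if any(pattern(cause, symptom) for pattern in policy):
                result.setdefault(cause, []).append(symptom)
    return result

active = [
    Condition("board pulled", "switch-1", 3, "Philadelphia"),
    Condition("port down", "switch-1", 3, "Philadelphia"),
    Condition("port down", "switch-1", 5, "Philadelphia"),
]
correlated = correlate(active)
```

In this sketch the "port down" in slot 3 becomes a symptom of the "board pulled" condition, while the "port down" in slot 5 remains uncorrelated and would still need its own alarm.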
Fault Scenarios
Out-of-box, SPECTRUM addresses a wide range of fault scenarios for which it can
perform RCA. This section provides specific scenarios where the techniques described in the
previous section are employed to determine root cause and impact. For the sake of
simplicity and brevity, the detail will be limited to the basic processing. Also, for the
purpose of the discussion and figures, the following table shows the color of alarms that are
associated with the icon status of SPECTRUM models at any given time.
Communication Outages and Impacts
Communication outages are types of faults often described as “black-outs” or “hard faults.”
With these types of faults, one or more communication paths are degraded to the point that
traffic can no longer pass. The fault could be caused by many situations including broken
copper/fiber cables/connections, improperly configured routers/switches, hardware failures,
severe performance problems, security attacks, and so on. With these hard communication
failures, limited information is available to the management system as it is unable to
exchange information with one or more managed elements. With SPECTRUM’s sophisticated
system of models, relationships, and behaviors available through IMT, SPECTRUM can infer
the fault and impact. IMT inference algorithms are also called Inference Handlers. A set of
Inference Handlers designed for a purpose is referred to as an Intelligence Circuit, or simply
Intelligence.
How SPECTRUM’s Intelligence Isolates Communication Outages
SPECTRUM offers powerful capabilities that can help you to identify the real sources of
problems in the network. For many management solutions, the steps to achieve this
capability are often manual and very time-intensive. With SPECTRUM, however, many of
these steps are performed automatically by the SPECTRUM software.
The process that SPECTRUM uses to identify and isolate outages is as follows:
1. Use SPECTRUM discovery to build a model of your infrastructure that shows the
resources in your network and how they are connected.
2. Upon receipt of a problem event, SPECTRUM checks the status of closely-connected
resources to determine whether they have problems.
3. SPECTRUM analyzes the status of the resources to identify the likely root cause of the
problem.
4. SPECTRUM suppresses alarms that are symptoms of the root cause, but not the cause
itself.
5. SPECTRUM evaluates the severity of the problem to help prioritize the problem among
any other reported problems in the network.
The following sections describe these SPECTRUM capabilities in more detail.
BUILD THE MODEL WITH AUTODISCOVERY
An accurate representation of the infrastructure is critical for determining the fault and the
impact of the fault. SPECTRUM’s modeling system can represent not only a wide array of
multi-vendor equipment, but also a wide range of technologies and connections that can
exist between various infrastructure elements. SPECTRUM has specific solutions for
discovering multi-path networks over a variety of technologies supporting many different
architectures. SPECTRUM offers support for meshed and redundant, physical and logical
topologies based on ATM, Ethernet, Frame Relay, HSRP, ISDN, ISL, MPLS, Multicast, PPP,
VoIP, VPN, VLAN and 802.11 Wireless environments — even legacy technologies such as
FDDI and Token Ring. SPECTRUM’s modeling is extremely extensible and can be used to
model OSI Layers 1-7 in a communication infrastructure.
SPECTRUM provides four different methods for building the physical and logical topology
connectivity model for any given infrastructure:
SPECTRUM’s AutoDiscovery application automatically and dynamically interrogates the
managed infrastructure about its physical and logical relationships. This approach to
AutoDiscovery was patented in 1996, and SPECTRUM was the industry’s first product to
discover Layer 2 switch connectivity. SPECTRUM’s AutoDiscovery application works in two
distinct phases (although there are many different stages within each phase that are not
covered here). The first phase is Discovery. When initiated (as described in Chapter 5),
AutoDiscovery automatically discovers the elements that exist in the infrastructure. This
provides SPECTRUM with an inventory of elements that could be managed. The second
phase is Modeling. AutoDiscovery uses management and discovery protocols to query the
elements it has found to gain information that will be used to determine the Layer 2 and
Layer 3 connectivity between managed elements. For example, AutoDiscovery uses SNMP
to examine route tables, bridge tables, and interface tables, but also uses traffic analysis
and vendor proprietary discovery protocols such as Cisco’s CDP. AutoDiscovery is a very
thorough and automated mechanism for building the infrastructure model.
The Modeling Gateway imports a description of the entire infrastructure’s components, as
well as physical and logical connectivity information from external sources, such as
Provisioning systems or Network Topology databases.
You can use the command line interface or programmatic APIs to build a custom integration or
application to import information from external sources.
Graphical user interfaces allow users to quickly point, click, and drag and drop to
manually build the model.
SPECTRUM’s modeling scheme allows a single managed element to be logically divided into
any number of sub-models. This collection of models and the relationships between them is
often referred to as the semantic data model for that type of managed element. Thus, a
typical semantic data model for a networking device may include a chassis model with
board models related to the chassis. Physical interface models would be associated to the
board models. Each physical interface model may have a set of subinterface models
associated below them.
SPECTRUM has a set of well-defined associations that define how different semantic data
model sets interact with one another. When SPECTRUM represents the connectivity between two
devices, a relationship is established not only between the two ports that form the link
between them, but also between device models and to the corresponding interface and port
models of other devices, as shown in the following figure.
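The chassis, board, interface, and subinterface hierarchy described above can be sketched as a small Python data structure. The class, field, and function names are hypothetical and are not SPECTRUM model types. For brevity, the sketch records only the port-to-port relationship, not the additional device-level relationships.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare models by identity; links are cyclic
class Model:
    name: str
    kind: str  # "chassis", "board", "interface", or "subinterface"
    children: list["Model"] = field(default_factory=list)
    connected_to: list["Model"] = field(default_factory=list)

    def add(self, child: "Model") -> "Model":
        """Attach a sub-model and return it so calls can be chained."""
        self.children.append(child)
        return child

def link(port_a: Model, port_b: Model) -> None:
    """Record a connection between two port models in both directions."""
    port_a.connected_to.append(port_b)
    port_b.connected_to.append(port_a)

def build_device(name: str) -> tuple[Model, Model]:
    """Build chassis -> board -> interface -> subinterface; return (chassis, interface)."""
    chassis = Model(name, "chassis")
    board = chassis.add(Model(f"{name}/board1", "board"))
    port = board.add(Model(f"{name}/board1/if1", "interface"))
    port.add(Model(f"{name}/board1/if1.100", "subinterface"))
    return chassis, port

router_a, port_a = build_device("routerA")
router_b, port_b = build_device("routerB")
link(port_a, port_b)
```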
START THE PROBLEM ANALYSIS
SPECTRUM can begin to solve a problem proactively upon receipt of a single symptom.
Many problems share the same set of symptoms, so SPECTRUM must perform further
analysis to determine the root cause. For communication outages, the analysis begins when
a model in SPECTRUM recognizes the communications failures through failed polling, traps,
events, performance threshold violations, or lack of response. SPECTRUM automatically
validates the communication failures through retries, alternative protocols, and alternative
path checking as part of its “trust but verify” methodology. The model that raised the
problem which started the intelligence is called the initiator, although more than one model
can trigger the intelligence.
The initiator model intelligence requests a list of other models that are directly connected to
it. These connected models are referred to as the initiator model’s neighbors. For example,
the following figure shows five models, where Model B is the initiator, and models A, C, D,
and E are neighbors.
With a list of neighbors identified, the intelligence directs each neighbor model to check its
current status. This check is referred to as the “Are You OK?” check. “OK” is a relative term
and a unique set of attributes related to performance and availability will vary from model
to model based on the real-world capabilities of the device that the model is representing.
When a model is asked “Are You OK?”, the model can initiate a variety of tests/checks to
verify its current operational status. For example, with most SNMP-managed elements, the
check is typically a combination of SNMP requests but could be more involved by
interrogating an Element Management System or as simple as an ICMP Ping. A
comprehensive check could include threshold performance calculations or execution of response time tests.
Each neighbor model returns an answer to “Are You OK?”.
LOCATE THE ROOT CAUSE: FAULT ISOLATION
If the initiator model has a neighbor that responds that it is “OK”, such as Model A in the
previous figure, SPECTRUM can infer that the problem lies between the unaffected neighbor
and the affected initiator (Model B). In this case, the initiator model that triggered the
intelligence is a likely culprit for this particular infrastructure failure. As a result, SPECTRUM
raises a critical alarm on the initiator model, which is considered the “Root Cause” alarm, as
shown in the next figure.
HIDE THE NOISE OF SYMPTOMATIC PROBLEMS WITH ALARM SUPPRESSION
As the analysis continues beyond isolating the device at fault (Model B), the next step is to
analyze and suppress reporting of the effects of the fault. This is the goal of intelligent
Alarm Suppression. If a neighbor (such as Models C, D, or E) of the initiator model
responds that it is not OK, this neighbor is considered to be affected by the failure occurring
elsewhere in the infrastructure. As a result, SPECTRUM places these models into a
suppressed condition (Grey Color) because the alarms are symptomatic of a problem
elsewhere. While these resources are experiencing problems, they are not the root cause
problem; they will likely be fixed when operators have addressed the problems that are
affecting Model B.
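The neighbor check, fault isolation, and alarm suppression steps can be sketched together in Python. This is a simplified illustration, not SPECTRUM's Inference Handler logic: the function and status names are hypothetical, and the choice to suppress the initiator when no neighbor answers "OK" is an assumption that the source does not spell out.

```python
from typing import Callable

def isolate_fault(initiator: str,
                  neighbors: dict[str, list[str]],
                  are_you_ok: Callable[[str], bool]) -> dict[str, str]:
    """Classify the initiator and its neighbors after a communication failure."""
    status: dict[str, str] = {}
    answers = {n: are_you_ok(n) for n in neighbors[initiator]}
    # A healthy neighbor implies the fault lies between it and the initiator,
    # so the initiator is the likely root cause.
    if any(answers.values()):
        status[initiator] = "critical (root cause)"
    else:
        # Assumption: with no healthy neighbor, the cause lies elsewhere.
        status[initiator] = "suppressed"
    for name, ok in answers.items():
        status[name] = "ok" if ok else "suppressed"
    return status

# Model B is the initiator; only Model A still answers "Are You OK?".
topology = {"B": ["A", "C", "D", "E"]}
reachable = {"A"}
result = isolate_fault("B", topology, lambda model: model in reachable)
```

Here Model B receives the root cause alarm, Model A stays normal, and Models C, D, and E are suppressed as symptoms.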
PRIORITIZE THE PROBLEM: IMPACT ANALYSIS
SPECTRUM continues to analyze the total impact of the fault because of its ability to
understand that the individual models exist as part of a larger network of models
representing the managed infrastructure.
As such, the intelligence will analyze each Fault Domain, which is the collection of models
with suppressed alarms related to the same failure. These impacted models are linked to
the root fault for presentation and analysis. The intelligence provides a measurement of the
impact that this fault is having by examining the models that are included within this Fault
Domain and calculating a measurement that serves as the impact severity. The impact
severity value provides a ranking system so that operators can quickly assess the relative
impact of each particular infrastructure fault in order to prioritize their corrective actions.
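The source does not describe how the impact severity value is calculated. The following Python sketch assumes one simple possibility, a weighted count of the models in each Fault Domain, purely to show how such a value lets operators rank faults. The weights, names, and formula are hypothetical.

```python
# Hypothetical per-kind weights; the actual weighting is not documented here.
WEIGHTS = {"router": 10, "switch": 5, "server": 3, "workstation": 1}

def impact_severity(fault_domain: list[str]) -> int:
    """Sum the weights of the suppressed models linked to one root fault."""
    return sum(WEIGHTS.get(kind, 1) for kind in fault_domain)

# Each root fault maps to the kinds of models in its Fault Domain.
faults = {
    "access-switch-7": ["workstation", "workstation"],
    "core-router-1": ["router", "switch", "switch", "server"],
}

# Rank faults so the one with the widest impact is handled first.
ranked = sorted(faults, key=lambda root: impact_severity(faults[root]), reverse=True)
```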
Event Management System
Event Rules provide even more processing and correlation of event streams. Event Rule
processing is required for situations in which the event stream is the only source of
management information. For example, SPECTRUM’s Southbound Gateway enables
SPECTRUM to accept event streams from devices and applications not directly monitored by
SPECTRUM, such as the eHealth for Voice PBX devices and message servers. You can also
apply event rules to perform intelligent processing of events within certain contexts:
frequency, sequence, and combination. As described earlier in this chapter, you can build
event rules from six event rule types:
Event Pair: Expected pair event or missing pair event in specified time span.
Event Rate Counter: Events at specified rate in specified time span.
Event Rate Window: Number of events in specified time span.
Event Sequence: Ordered sequence of events in specified time span.
Event Combo: Two or more events, any order in specified time span.
Event Condition: Events parsed for specific data to allow creation of new events based on
comparisons of variable bindings, attributes, constants, etc.
SPECTRUM provides many out-of-the-box event rules, but also provides easy-to-use
methods for creating new rules using one or more of the event rule types. This section
highlights a couple of out-of-box event rules and also a few customer examples of event
rule applications.
OUT-OF-BOX EVENT PAIR RULE
SPECTRUM has the ability to interpret Cisco syslog messages as event streams. Each syslog
message is generated on behalf of a managed switch or router and is directed to the
SPECTRUM model representing that managed element. One of the many Cisco syslog
messages indicates a new configuration has been loaded into the router. The Reload
message should always be followed by a Restart message, indicating the device has been
restarted to adopt the newly loaded configuration. If not, a failure during reload is probable.
SPECTRUM uses an event rule based on the Event Pair rule type to raise an alarm with
cause “ERROR DURING ROUTER RELOAD” if it does not receive the Restart message within
15 minutes of the Reload message. The following diagram illustrates the events and timing.
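The missing-pair logic of this rule can be sketched in Python. This is a batch-style illustration with hypothetical event names and function signature, not the way SPECTRUM evaluates the rule internally.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)

def missing_pair_alarms(events: list[tuple[datetime, str, str]],
                        now: datetime) -> list[str]:
    """Return devices whose Reload was not followed by a Restart within WINDOW.

    Each event is (timestamp, device, kind) with kind "RELOAD" or "RESTART".
    """
    pending: dict[str, datetime] = {}
    alarms: list[str] = []
    for stamp, device, kind in sorted(events):
        if kind == "RELOAD":
            pending[device] = stamp
        elif kind == "RESTART" and device in pending:
            if stamp - pending.pop(device) > WINDOW:
                alarms.append(device)  # the Restart arrived, but too late
    # Reloads still unmatched after the window has expired also raise the alarm.
    alarms += [d for d, t in pending.items() if now - t > WINDOW]
    return alarms

t0 = datetime(2019, 8, 3, 12, 0)
events = [
    (t0, "router-1", "RELOAD"),
    (t0 + timedelta(minutes=2), "router-1", "RESTART"),
    (t0, "router-2", "RELOAD"),
]
alarms = missing_pair_alarms(events, now=t0 + timedelta(minutes=20))
```

In this example router-1 restarts two minutes after its reload and raises nothing, while router-2 never restarts and would receive the "ERROR DURING ROUTER RELOAD" alarm.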
MANAGING SECURITY EVENTS USING AN EVENT RATE COUNTER RULE
SPECTRUM is able to collect event feeds from many sources. Some customers send events
from security devices such as Intrusion Detection Systems (IDSs) and firewalls. These
types of devices can generate millions of log file entries. These customers could use an
Event Rate Counter rule to distinguish between sporadic client connection rejections and
real security attacks. The rule generates a critical alarm if 20 or more connection failures
occurred in less than one minute, as shown in the following figure.
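A sliding-window counter is one way to express this rule. The Python sketch below uses the thresholds from the example (20 events, one minute); the function name and the representation of events as timestamps in seconds are assumptions.

```python
from collections import deque

def rate_exceeded(timestamps: list[float], threshold: int = 20,
                  window_seconds: float = 60.0) -> bool:
    """True if `threshold` or more events fall within less than the window."""
    recent: deque[float] = deque()
    for stamp in sorted(timestamps):
        recent.append(stamp)
        # Drop events that are a full window or more older than the newest one.
        while stamp - recent[0] >= window_seconds:
            recent.popleft()
        if len(recent) >= threshold:
            return True
    return False

sporadic = [i * 30.0 for i in range(40)]  # one rejection every 30 seconds
attack = [i * 2.0 for i in range(25)]     # 25 rejections in under a minute
```

With these inputs, `rate_exceeded(sporadic)` is False and `rate_exceeded(attack)` is True, which separates occasional client rejections from a burst that warrants a critical alarm.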
MANAGING SERVER MEMORY GROWTH USING AN EVENT SEQUENCE RULE
A common problem with some applications is the inability to manage memory usage. Some
applications will use system memory and never free it again for other applications to use.
This can degrade performance on the host machine, and eventually the “memory leaking”
application will fail. As one example, if you have a web server application with a history of
slow memory leak problems, you might schedule a reboot once a week during a planned
maintenance window to compensate for the memory consumption. However, if the memory
leak occurs more quickly than usual, which is a deviation from normal behavior, you might
want to perform an emergency reboot before the scheduled maintenance. You can employ
a combination of progressive SPECTRUM thresholds with an Event Sequence rule to monitor
for abnormal behavior, or you could use eHealth Live Health analysis to report the deviation
from normal memory consumption. Using the SPECTRUM thresholds as an example, you
could set monitoring to create events as the memory usage passed threshold points of
50%, 75% and 90%. If those threshold points are reached in a period of less than one
week, SPECTRUM generates an alarm to provide notification to reboot the server prior to
the scheduled maintenance window, as shown in the following diagram.
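The sequence check in this example can be sketched in Python. The event names `MEM_50`, `MEM_75`, and `MEM_90` are hypothetical labels for the three threshold events, and the function is an illustration rather than SPECTRUM's implementation.

```python
from datetime import datetime, timedelta

SEQUENCE = ["MEM_50", "MEM_75", "MEM_90"]  # hypothetical threshold event names
PERIOD = timedelta(weeks=1)

def sequence_detected(events: list[tuple[datetime, str]]) -> bool:
    """True if SEQUENCE occurs in order within less than PERIOD."""
    ordered = sorted(events)
    for i, (start, kind) in enumerate(ordered):
        if kind != SEQUENCE[0]:
            continue
        position = 1  # index of the next event expected in the sequence
        for stamp, later_kind in ordered[i + 1:]:
            if stamp - start >= PERIOD:
                break  # the period expired before the sequence completed
            if later_kind == SEQUENCE[position]:
                position += 1
                if position == len(SEQUENCE):
                    return True
    return False

monday = datetime(2019, 8, 5)
fast_leak = [(monday, "MEM_50"),
             (monday + timedelta(days=2), "MEM_75"),
             (monday + timedelta(days=4), "MEM_90")]
slow_leak = [(monday, "MEM_50"),
             (monday + timedelta(days=5), "MEM_75"),
             (monday + timedelta(days=9), "MEM_90")]
```

The fast leak crosses all three thresholds in four days and triggers the rule. The slow leak takes nine days, so the weekly reboot is expected to handle it.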
AN OUT-OF-BOX EVENT CONDITION RULE COMBINED WITH AN EVENT PAIR RULE
RFC 2668 (MIB for IEEE 802.3 Medium Attachment Units) provides management definitions
for Ethernet hubs. The RFC defines an SNMP trap used to notify a
management system when the “jabber state” of an interface changes. Jabber occurs when
a device that is experiencing circuitry or logic failure continuously sends random (garbage)
data. The trap identifier simply indicates a change in condition and the variable data portion
of the trap indicates whether “jabbering” has started or stopped. SPECTRUM applies an
Event Condition rule to create distinct start/stop events by looking at the variable portion of
the trap, and uses an Event Pair rule to create an alarm if the “jabbering start” is not
closely followed by a “jabbering stop” event.
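The combination of the two rules can be sketched in Python. The numeric jabber-state values are assumed from the MAU MIB and should be verified against the RFC. The 60-second timeout is a hypothetical reading of "closely followed", and the function names are illustrative.

```python
from typing import Optional

# Assumed enumeration values for the jabber-state varbind; check the MAU MIB.
NO_JABBER, JABBERING = 3, 4
PAIR_TIMEOUT = 60.0  # seconds; hypothetical value for "closely followed"

def classify(jabber_state: int) -> Optional[str]:
    """Event Condition step: turn one trap into a distinct start or stop event."""
    if jabber_state == JABBERING:
        return "JABBER_START"
    if jabber_state == NO_JABBER:
        return "JABBER_STOP"
    return None

def jabber_alarm(traps: list[tuple[float, int]], now: float) -> bool:
    """Event Pair step: alarm if a start is not followed by a stop in time.

    traps holds (timestamp in seconds, jabber-state) pairs for one interface.
    """
    started_at: Optional[float] = None
    for stamp, state in sorted(traps):
        event = classify(state)
        if event == "JABBER_START" and started_at is None:
            started_at = stamp
        elif event == "JABBER_STOP" and started_at is not None:
            if stamp - started_at > PAIR_TIMEOUT:
                return True
            started_at = None
    return started_at is not None and now - started_at > PAIR_TIMEOUT
```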
CONDITION CORRELATION TECHNOLOGY
The CA SPECTRUM CCT offers advanced customization capabilities for defining event
relationships to isolate root causes of problems. For example, consider the complexities of
managing an IP network that provides VPN connectivity across an MPLS backbone with
intra-area routing maintained by Intermediate System-to-Intermediate System (IS-IS) and
inter-area routing maintained by BGP. Any physical link or protocol failure could cause
dozens of events from multiple devices. Without carefully applied, sophisticated correlation,
the network troubleshooters could spend most of their time chasing after symptoms, rather
than fixing the root cause.
AN IS-IS ROUTING FAILURE EXAMPLE
The following example illustrates the range of capabilities for Condition Correlation. A core
router, labeled in the figure as R1, has lost IS-IS adjacencies to all neighbors (labeled in
the figure as R2, R3, and R4). This also causes the BGP session with the route reflector
(labeled in the figure as RR) to be lost. This condition, if it persists, will result in routes
aging out of R1 and adjacent edge routers R3 and R4. Eventually, the customer VPN sites
serviced by these edge routers will be unable to reach their peer sites (labeled in the figure
as CPE1, CPE2, CPE3).
This failure causes the routers to generate a series of syslog error messages and traps. The
following table shows the messages and traps that SPECTRUM would receive:
The root cause of all these messages is the IS-IS routing problem related to R1. For many
management systems, the operator or troubleshooter would see each of these messages
and traps as seemingly disparate events on the event/alarm console. A trained operator or
experienced troubleshooter may be able to deduce, after some careful thought, that an R1
routing problem has occurred. However, in a large environment, these events/alarms will
likely be interspersed with other events/alarms cluttering the console. Even if the operator
or troubleshooter had the experience to identify the correlation manually, effort and time
would be devoted to doing so. That time translates directly into higher costs, lower user
satisfaction, and lost revenue.
Without condition correlation, SPECTRUM would send the alarm console users notification of
ten or more events. However, using a combination of an Event Rule and Condition
Correlation, you can apply a set of rule patterns to a Correlation Domain consisting of all
core (LSR) routers, enabling SPECTRUM to produce a single actionable alarm. This alarm
will indicate that R1 has an IS-IS routing problem, and a network outage may result if this
is not corrected. The seemingly disparate conditions that SPECTRUM correlates to produce
this alarm appear in the “symptoms” panel of the alarm console. The correlation is defined
as follows:
1. A local Event Rate Counter rule was used to define multiple ‘IS-IS adjacency change’
syslog messages reported by the same source as a routing problem for that source.
2. A rule pattern was used to make an IS-IS adjacency lost event “caused by” an IS-IS
routing problem when the neighbor of the adjacency lost event is equal to the source of
the routing problem event.
3. A rule pattern was used to make a BGP adjacency down event “caused by” an IS-IS
routing problem when the neighbor of the adjacency down event is equal to the source
of the routing problem event.
4. A rule pattern was used to make a BGP backward transition trap event “caused by” an
IS-IS routing problem when the neighbor of the backward transition event is equal to
the source of the routing problem event.
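Rule patterns 2 through 4 share one shape: a symptom event is "caused by" a routing problem when the symptom's neighbor equals the problem's source. The Python sketch below illustrates that matching under stated assumptions. The event kind names and fields are hypothetical, the routing problem event is taken as already produced by the Event Rate Counter rule in step 1, and the route reflector is assumed to be part of the correlation domain.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Event:
    kind: str                       # for example "ISIS_ADJ_LOST"
    source: str                     # router that reported the event
    neighbor: Optional[str] = None  # peer named in the event, if any

SYMPTOM_KINDS = {"ISIS_ADJ_LOST", "BGP_ADJ_DOWN", "BGP_BACKWARD_TRANSITION"}

def correlate(events: list[Event], domain: set[str]) -> dict[str, list[Event]]:
    """Attach symptom events to the routing problem whose source is their neighbor."""
    problems = {e.source for e in events
                if e.kind == "ISIS_ROUTING_PROBLEM" and e.source in domain}
    symptoms: dict[str, list[Event]] = {router: [] for router in problems}
    for e in events:
        if e.kind in SYMPTOM_KINDS and e.source in domain and e.neighbor in problems:
            symptoms[e.neighbor].append(e)
    return symptoms

domain = {"R1", "R2", "R3", "R4", "RR"}  # assumed domain membership
events = [
    Event("ISIS_ROUTING_PROBLEM", "R1"),
    Event("ISIS_ADJ_LOST", "R2", "R1"),
    Event("ISIS_ADJ_LOST", "R3", "R1"),
    Event("BGP_ADJ_DOWN", "RR", "R1"),
    Event("ISIS_ADJ_LOST", "R9", "R8"),  # outside the domain, left uncorrelated
]
result = correlate(events, domain)
```

The result contains a single entry for R1 with three symptoms attached, which corresponds to one actionable alarm instead of several unrelated ones.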
AN HSRP/VRRP ROUTING FAILURE EXAMPLE
Condition Correlation can also provide interesting and useful correlation of events when a
link is lost to a router in a Hot Standby Router Protocol (HSRP) or Virtual Router
Redundancy Protocol (VRRP) environment. In the following example, a site has two
redundant routers that provide access via HSRP. For this case, the primary router
experiences a failure, but the redundant router is still servicing the customer’s site. You
might want an alarm that reports the redundant failover and distinguishes it from a
total site outage. Knowledge from IMT, EMS and CCT can help to provide the RCA.
The following table outlines the syslog error messages and trap sequences for the HSRP
failover.
The seemingly disparate conditions that SPECTRUM correlated to create this alarm appear
in the “symptoms” panel of the alarm console. The correlation is defined as follows:
1. A correlation domain consisting only of the two CPE HSRP routers and the PE router
interfaces that connect to these sites.
2. A rule pattern correlating the coincidence of an HSRPGrpStandByState event with a
state of active and a Device Contact Lost event to infer a Primary Connection Lost
condition.
3. A rule pattern that defines a Bad Link event caused by a Primary Connection Lost
event.
SPECTRUM applies these rule patterns to the HSRP correlation domains to prevent any correlations
outside of that scope.
Without these rules, SPECTRUM would have raised a critical alarm on the lost CPE device,
and on the connected port model. With these rules, it raises a major (Orange) alarm on the
CPE device indicating that the primary connection to the customer is lost. The other
conditions will appear in the symptoms table of this alarm.
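The coincidence rule in this example can be sketched in Python. The event names are hypothetical labels for the events described above, and the function is an illustration of the decision, not SPECTRUM's rule pattern syntax.

```python
def classify_site(events: set[str]) -> tuple[str, str]:
    """Return (alarm severity, cause) for the events seen in one HSRP domain."""
    primary_lost = {"HSRP_STANDBY_NOW_ACTIVE", "DEVICE_CONTACT_LOST"} <= events
    if primary_lost:
        # The standby router took over, so the site is still being served.
        return ("major", "PRIMARY_CONNECTION_LOST")
    if "DEVICE_CONTACT_LOST" in events:
        # No failover was observed, so treat the loss as an outage.
        return ("critical", "DEVICE_CONTACT_LOST")
    return ("none", "OK")

failover = classify_site({"HSRP_STANDBY_NOW_ACTIVE", "DEVICE_CONTACT_LOST", "BAD_LINK"})
outage = classify_site({"DEVICE_CONTACT_LOST"})
```

In the failover case the Bad Link event is left as a symptom of the Primary Connection Lost condition rather than raising an alarm of its own.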
Apply Condition Correlation to Service Correlation
Typically, networks carry and support more than one service. As an example, in the cable
industry, telephone service (VoIP), internet access (High Speed Data), VoD and digital
cable are delivered over the same physical data network. Managing this network can be
quite a challenge. Inside the network (cable plant), the video transport equipment, video
subscription services, and the Cable Modem Termination System (CMTS) all work together to
put data on the cable network at the correct frequencies. Uncounted miles of cable along
with thousands of amplifiers and power supplies must carry the signals to the homes of
millions of subscribers.
If the network lines are cut in one area, as shown in the following diagram, the return path
monitoring system and the head end controller would report return path and power
problems in that area. The CMTS would provide the number of cable modems off-line for
the node. The video transport system would generate tune errors for video subscriptions in
that area. Lastly, the management system will lose contact with any business customer
modems that it is managing. With the flood of events and error messages from the
managed elements, it will be very obvious that problems exist with the service; the
challenge is to translate all that data into root cause and service impact actionable
information.
SPECTRUM can interpret the resulting deluge of events by using the service area of the
seemingly disparate events as a factor in the Condition Correlation. If the service areas and
services are modeled in SPECTRUM, it can use Condition Correlation to determine which
services in which areas are affected and the root cause or causes.
Service impact relevance goes beyond understanding what is impacted; it is also important
to identify what is not impacted. It is possible for the video subscription service to fail to
deliver VoD content to a single service area, and yet all other services to that area could be
operating normally. In another case, a return path problem in one area could cause
Internet, VoIP, and VoD services to fail and digital cable to degrade, yet analog cable would
still function normally. With the SPECTRUM capabilities and views of your infrastructure,
you can more quickly and easily detect the root cause and focus on addressing that
problem first.
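Using the service area as a correlation factor can be sketched in Python as a grouping step that reports both what is impacted and what is not. The service list and function name are hypothetical, and the input is assumed to be a list of events already tagged with a service area and an affected service.

```python
from collections import defaultdict

# Services delivered over the same network, taken from the example above.
SERVICES = {"Internet", "VoIP", "VoD", "digital cable", "analog cable"}

def impact_by_area(events: list[tuple[str, str]]) -> dict[str, dict[str, set[str]]]:
    """Group (service_area, affected_service) events per service area."""
    impacted: dict[str, set[str]] = defaultdict(set)
    for area, service in events:
        impacted[area].add(service)
    # Report what is not impacted as well, since that is equally relevant.
    return {area: {"impacted": hit, "not_impacted": SERVICES - hit}
            for area, hit in impacted.items()}

report = impact_by_area([
    ("Philadelphia", "VoD"),
    ("Philadelphia", "Internet"),
    ("New York", "VoD"),
])
```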
Leverage the Integrated Solution
After SPECTRUM has identified the root cause problems, operations staff can quickly obtain
details about the problem history and troubleshooting information by drilling down from the
alarms in the OneClick browser to the eHealth and eHealth for Voice reports and tools. For
example, operators could right-click an alarm in OneClick and do any of the following:
Drill down to Trend reports for the problem variable.
Drill down to At-a-Glance reports for a snapshot of several key performance variables for
that resource.
Drill down to the eHealth for Voice console or the eHealth web reporting interface to view
more reports and details about problem voice PBX systems and call message servers.
Launch a browser to the eHealth web reporting interface to view more reports and details
about problem resources.
Exceptions section, sending traps, 136
Exceptions Summary Report, 145
F
fault scenarios, 103
Frame Relay Manager, 41
G
Global Collections, 58
governments, 21
Grade of service (GoS), 155
group lists, purpose of, 60
groups
  adding to group list, 61
  creating, 60
  purpose of, 60
H
Health reports, 65
  forwarding traps, 65
  systems, 82
Healthcheck services, 29
heterogeneous networks, 10
I
Inductive Modeling Technology (IMT), 162
installation
  prerequisites, 53
  steps, 54
InstallPlus kit, 55
integrated solution, configuring, 57
integration
  eHealth SPECTRUM, 25
  modules (IMs), 36
  value, 11
IT resources, 18
L
lifecycle of best practices, 26
Live Exceptions
  profiles, 63
  service alarm situations, 136
  starting, 63
Live Health, 36
  forwarding traps to SPECTRUM, 64
  profiles, 62
Live Trend, 82
M
Model by IP Address settings, 78
Modeling Gateway, 168
models, SPECTRUM, 162
Multicast Manager, 42
MyHealth reports, 82
N
neighbors, 169
Network and Voice Management solution, eHealth, 10
network and voice management strategy, 19
network evolution, 13
network fault and performance issues, 10
Network Fault Management, components, 38
network management solutions, 10
Network Performance Management, components, 35
network support, 10
node licenses, 48
O
OneClick
  clearing alarms, 137
  SPECTRUM, 39
OneClick for eHealth console, 60
Operational Support System (OSS), 20
overutilized resources
  confirming, 145
  documenting history of, 147
  locating, 145
  modeling changes for, 148
  resolving, 147
P
predictive capacity planning, 33, 139
proactive service assurance, 31, 135
process rules, 102
Q
QoS Manager, 42
R
rapid problem resolution, 32, 159
Remote Poller, 37
Report Center, 37
Report Manager
  accessing, 118
  SPECTRUM, 45
reports
  At-a-Glance, 80
  Health, 65, 82
  MyHealth, 82
    Top N, 86
    Trend, 84
    What-If Capacity Trend, 86
resource monitoring, 92
response time tests, 105
return on investment (ROI), 143
RFC 2790
    extensions, 72
    support for process modeling, 102
RMON2 probes, 37
role-based service information, 10
root cause, 92

S
scale, 10
Secure Domain Manager (SDM), 44
service assurance, 135
Service Availability, 17
    reports, 118
service dashboard, 46
service delivery platform, 13
Service Desk tickets, 137
Service Editor, 103
Service health, 92
Service Health Matrix table, 94
service hierarchies, 93
service level management, 30
service management
    approach, 87
    interview process, 88
    mapping procedures, 89
Service Management module, 87
Service Manager, 46
service modeling, 92
    creating, 93
    fault scenarios, 103
    health table, 94
    policy design, 95
    process rule, 102
    response time, 105
Service Performance Manager (SPM), 45
service providers, 20
Situations to Watch chart, 149
Situations to Watch Detail report, 150
Sizing Wizard, 50
SLA
    business hours, 109
    components, 115
    guarantees, 108
    implementing, 116
    monitoring, 108
    periods, 108
    reports, 118
SLA modeling concepts, 108
SNMPv3 support, 43
software and hardware requirements
    eHealth, 50
    OneClick, 51
    SPECTRUM, 50
    Voice, 52
Solution Architecture Overview (SAO), 27
Solution Architecture Specification (SAS), 28
SpectroSERVER, 54
SPECTRUM
    backing up, 70
    benefits, 25
    components, 24
    configuring eHealth, 68
    discovery, 57
    eHealth integration, 25
    OneClick, 39
    Service Manager, 30
    viewing alarms, 69
    Watch Editor, 40
SPECTRUM Alarm Notification Manager (SANM), 40
SPECTRUM Integrity, 39
SPECTRUM Report Manager, 45
syslog messages, 171
system agents, monitoring best practices, 71
system monitoring, 71
system requirements, 54
SystemEDGE Agents, 71

T
telecommunication service provider, 20
third-party agents, 71
Time over Threshold alarms, 31
Time over Threshold rules, 136
Top N reports, 86
topology, 77, 167
Traffic Accountant, 37
traps
    forwarding from Health reports, 65
    forwarding from Voice, 66
    forwarding to SPECTRUM from eHealth, 64
Trend reports, 84

U
underused resources
    confirming, 141
    finding, 140
    resolving, 143
    return on investment, 143
Underutilized Elements report, 140
    best practices, 141
Unicenter NSM Agents, 71
    discovering in SPECTRUM, 79

V
voice message disk capacity, 158
voice services, network, 25
Voice, capacity planning, 154
VPN Manager, 43

W
Watch Editor, 40
watches, SPECTRUM, 40
What-If reports, 86
    running, 152