RB ITSMbyException WP
7/27/2019 RB ITSMbyException WP
http://slidepdf.com/reader/full/rb-itsmbyexception-wp 1/11
Table of Contents
I. The Layers of Enterprise Management
II. Integrating Between Management Layers
SNMP-Managed Networks
Systems Management Monitoring
and Notification
III. 12 Unique Things to Manage on IBM i
IV. Easier Ways to Meet SLAs
Adaptive Analytics
Enterprise Monitoring Agents
Systems Management Solutions
V. Robot vs. Enterprise Monitoring Agents
VI. The Benefits of Robot Systems
Management Solutions
VII. Conclusions
I. The Layers of Enterprise Management
Lots of things come in layers: cold weather clothes,
cakes, even companies. A layered approach toward
managing all key areas of operation in complex
environments ensures that there is sufficient
coordination and escalation between different groups
of stakeholders when providing IT service. Many small
and midsized enterprises (SMEs) on IBM Power Systems
running IBM i accomplish this effectively and efficiently
using a single layer of systems management solutions.
Larger enterprises, however, require more layers of
management, with each layer optimized for a specific
purpose and notifying up.
While SNMP is often used to exchange data between
layers, tickets generated at the top layers record and
track problems to their resolution. This paper explores
the continuing role of systems management software
for stopping tickets at their source and managing
operations by exception within multi-layered enterprise
management environments.
First, let’s take a look at the top layer in most large
enterprises, Business Services Management (BSM). BSM
is a set of software tools, processes, and methods that
help manage IT from a business-centric approach. This
layer interfaces with customers and business units to
ensure that IT always remains aligned with business
objectives.
The middle layer of management is IT Services
Management (ITSM), which focuses on the quality of IT
services across all IT domains in the enterprise, as well
as the relationship between IT and customers. ITSM
takes its priorities from BSM with a shared purpose of
ensuring that IT is supporting business objectives. In
some organizations, ITSM and BSM are combined into
a single layer.
Within large enterprises, IBM i is typically considered a
domain, which can be either platforms or applications.
Each domain has a common set of functional areas to
monitor, terminology, and functionality. You’ll find our
final layer, systems management software such as the
Robot solution, deployed on each IBM i domain to allow
improvements to quality of service and cost reductions,
as well as monitoring on IBM i.
While only certain details such as active/inactive, busy/
not busy, problem/no problem, and full/how full can be
monitored when viewed from higher levels (i.e., ITSM/
BSM), the systems management layer is an expert in
the IBM i domain. Monitoring your IBM i domain using
systems management tools gives you a much more
granular and proactive view, plus the ability to notify up
to the ITSM/BSM layers. The most sophisticated systems
management tools also have management capabilities
that proactively improve quality at the system level.
ITSM by Exception: It’s Just the Ticket!
© 2014 HelpSystems. All trademarks and registered
trademarks are the property of their respective owners
Robot | A Division of HelpSystems | www.helpsystems.com
United States: +1 952-933-0609 | Outside the US: +44 (0) 870 120 3148
Within the top layers of enterprise management, tickets
are used as running reports on particular problems,
their status, and other relevant data that can be used to
help resolve the problem. In the lowest layer, however,
there typically are no tickets because the priority
here is to prevent tickets from happening in the first
place. Notification up through the layers occurs only if
problems cannot be solved at this lowest layer, and then
result in tickets.
II. Integrating Between Management Layers
While monitoring takes a more reactive approach
to system events—you wait until there is a negative
event and then receive notification—it’s an integral
part in communication between the layers of enterprise
management. Monitoring alone does not implement
solutions on your system. In fact, the administrator often
has to work from experience in order to know whether
the monitoring tool is showing a problem or a normal
situation. But thanks to monitoring, stakeholders within
each layer can view status information on dashboards
or receive immediate automated alerts via email/SMS.
Any event or condition that affects quality of service
(QoS) and cannot be resolved automatically by systems
management software gets notified up to the ITSM layer
where a ticket is created for manual corrective actions.
ITSM in turn notifies BSM of unresolved events within a
certain time window. The notification is most frequently
achieved with SNMP, and all good monitoring tools
support SNMP integration.
SNMP-Managed Networks
Simple network management protocol (SNMP) is an
internet standard protocol originally designed to help
automate the management of devices on IP networks.
SNMP has become a standard way for ITSM/BSM
solutions to receive information about negative events
occurring in the SNMP-managed network.
A managed IBM i domain is an IBM i server or application
being monitored in an SNMP-managed network. Events
occur constantly on each IBM i managed domain.
Systems management solutions like Robot must be used
to convert IBM i events to SNMP traps; the OS does not
do this automatically.
Figure 1: Large enterprises need multiple management layers whereas SMEs
typically do not.
Figure 2: Any unresolved event that could affect QoS is notified up to the
layer above.
[Figure diagram: on the IBM i domain (IBM Power Systems), Robot Systems
Management (SM) proactively monitors the functional areas Messages, Resources,
Logs, Batch Job Schedules, Agent Jobs, Multi-System, Interactive Jobs, Storage,
Backup & Recovery, Spooled Output, Performance, and SNMP Traps, and applies
automated corrective actions. Events which could impact quality of IT service
are sent as domain-specific notifications (SNMP) to IT Service Management
(ITSM), where IT technicians take manual corrective actions. Events which could
impact quality of business service are sent as IT-specific notifications (SNMP)
to Business Service Management (BSM), where IT management and customer business
units take manual corrective actions. Each layer has monitor dashboards and
email/SMS alerts. Other domains shown: Windows, UNIX, Mainframe.]
An event is anything that happens on the system and can be
monitored; it may indicate a problem or be purely informational
(e.g., a file fills up and needs a response to a message
to allow it to grow, causing part of the business ERP
to freeze pending a response). Software components
called enterprise monitoring agents (EMAs) run on
the managed IBM i domains and periodically collect
information about events. The EMAs then initiate
SNMP messages called trap messages as a response to
each event.
The data in an SNMP trap message can be read as a
set of variables using an external structure called a
management information base (MIB). For example, the
MIB may describe that the first 10 characters of the
SNMP trap message refer to an IBM i system name. A
manager is a server used to collect and process SNMP
trap messages. A software component called a network
management system (NMS) runs on the manager,
receives SNMP trap messages, and executes applications
that monitor and control the managed domains (i.e.,
they execute components of the ITSM application). The
ITSM application receives the information about each
event from the NMS, triages it, and generates tickets
that are then routed to the right operator, manager, or
subject matter expert for manual resolution.
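
To make the MIB idea concrete, here is a minimal Python sketch of the fixed-width reading described above. Only the 10-character system name comes from the text; the other field names, their widths, and the sample payload are hypothetical. Real SNMP traps carry variable bindings identified by object identifiers rather than character offsets, so treat this as an illustration of the simplified picture used in this paper, not of the protocol encoding.

```python
from typing import Dict, List, Tuple

# Hypothetical layout standing in for a MIB: (field name, width in characters).
# Only the 10-character system name is taken from the text; the rest is assumed.
TRAP_LAYOUT: List[Tuple[str, int]] = [
    ("system_name", 10),
    ("message_id", 7),
    ("severity", 2),
]


def decode_trap(payload: str,
                layout: List[Tuple[str, int]] = TRAP_LAYOUT) -> Dict[str, str]:
    """Split a fixed-width payload into named variables using the layout."""
    fields: Dict[str, str] = {}
    offset = 0
    for name, width in layout:
        fields[name] = payload[offset:offset + width].strip()
        offset += width
    # Whatever follows the described fields is kept as free text.
    fields["text"] = payload[offset:].strip()
    return fields


# Hypothetical payload: system name padded to 10 characters, then ID, severity, text.
sample = "PRODSYS01 MSG000180File full, reply needed"
print(decode_trap(sample))
```

The layout is kept separate from the decoding function for the same reason a MIB is external to the trap: the receiving side can change how it interprets the data without touching the sender.
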
Systems Management Monitoring
and Notification
The most sophisticated systems management tools
offer multiple ways to display information about
system events and raise awareness. For monitoring,
the Robot solution features message centers. Much
more efficient than message queues, these message
centers are graphical displays optimized for different
types of users. Messages in message centers have been
filtered, summarized, suppressed, and color-coded and
automated responses have been applied to most.
The Robot systems management solution also offers
different centers for performance, a map center for high-
level status, and a product metric center to show the
level of automation achieved. Operators and managers
look at these centers regularly to get large volumes of
information quickly.
Notifications from Robot systems management tools
find the right technical person when a message or
event appears. It doesn’t matter if you’re not looking at
a message center; they’ll find you online. Robot alerts
arrive as emails or SMS, delivered to correctly identified
operators and managers. Messages become alerts
if they are very important, if an event needs manual
reactions, or if an event has not been acknowledged
within a certain time. Alerts can include attachments
to help explain the cause and allow the recipient to
respond by email/SMS. You can send alerts to broadcast
lists, or to escalation lists that move on to the next recipient
if a response is not received from the first within a certain
timeframe.
Alerts assume that you do not have staff just staring at
consoles watching for errors.
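
The escalation behavior described above can be sketched as follows. This is an illustrative Python outline, not Robot’s implementation: the function name, the recipient names, and the 15-minute timeout in the usage lines are assumptions.

```python
from typing import List, Optional


def current_recipient(escalation_list: List[str],
                      minutes_since_alert: float,
                      timeout_minutes: float) -> Optional[str]:
    """Return who should hold the alert now, assuming nobody has responded.

    Each recipient gets `timeout_minutes` to respond before the alert moves
    to the next name. Returns None once the list is exhausted.
    """
    if timeout_minutes <= 0:
        raise ValueError("timeout_minutes must be positive")
    if minutes_since_alert < 0:
        raise ValueError("minutes_since_alert cannot be negative")
    step = int(minutes_since_alert // timeout_minutes)
    if step >= len(escalation_list):
        return None
    return escalation_list[step]


# Hypothetical escalation list with a 15-minute response window per person.
on_call = ["operator", "shift_manager", "it_manager"]
print(current_recipient(on_call, 5, 15))
print(current_recipient(on_call, 20, 15))
```

A None result means every recipient on the list has had a turn, which in a layered environment is a natural point to notify up to the next layer.
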
Figure 3: SNMP has become the standard way for ITSM/BSM solutions to receive
information about unresolved events.
III. 12 Unique Things to Manage on IBM i
Where monitoring is reactive, management implies
taking a more proactive approach on your system.
By incorporating a systems management layer into
your enterprise management strategy, you help
prevent negative events from happening. Invariably,
some negative events will occur, but the best
systems management tools adopt a solve-at-source
approach. These tools maintain a repository of pre-
defined, automated, and adaptive solutions that
are matched to negative events automatically and
in real time. Significant functionality is devoted to
anticipating commonplace, recurring negative events
and implementing automated and adaptive solutions
to minimize escalation. Customizable rules provide
filtering and automation to align with and uphold Service
Level Agreements (SLAs) and business needs.
IBM i has a unique set of functional areas to manage,
plus unique terminology and functionality that differ
from Windows, UNIX, Linux, or mainframe. IBM i
is often a cornerstone domain for the provision of
business services, running mission-critical business
applications such as accounting, banking, logistics,
warehousing, manufacturing, insurance, sales, retail,
and telecommunications. Clearly, it should be a high
priority to manage this domain well.
So, what’s unique to manage on each IBM i?
1. Messages
IBM i is constantly producing messages. Many of these
messages are a proactive way for the operating
system to tell operators what’s happening. However, it’s
not always clear just by looking at a single message
whether the event is negative. A message stating that
a job finished normally may seem like a good thing, but
if that job finished after five minutes when it normally
runs for an hour, it could be bad. It’s typically necessary
to analyze more than just one message to understand
an event. Some messages demand a response and slow
or incorrect responses could lead to problems. If a file
has reached its maximum size, for example, a message
is issued to the operator asking what to do now. Type
the wrong response and you may end up canceling an
important job accidentally or causing an unimportant
job to continue, filling the disk with unnecessary data.
For many large enterprises, there are simply too many
messages to handle manually, which is where message
management tools like Robot/CONSOLE come in to
automate message monitoring, filtering, responses,
and escalation.
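
The point that a single message rarely tells the whole story can be illustrated with a small Python sketch. It assumes you already have a history of run times for a job; the function name and the 50 percent tolerance are hypothetical choices, not Robot/CONSOLE settings.

```python
from statistics import mean
from typing import Sequence


def runtime_is_suspicious(history_minutes: Sequence[float],
                          latest_minutes: float,
                          tolerance: float = 0.5) -> bool:
    """Flag a 'completed normally' job whose run time is far from its usual value.

    `tolerance` is the allowed fractional deviation from the historical mean
    (0.5 means anything outside 50% to 150% of the mean is flagged).
    """
    if not history_minutes:
        return False  # nothing to compare against
    typical = mean(history_minutes)
    return abs(latest_minutes - typical) > tolerance * typical


# The job from the text: it usually runs about an hour but finished in five minutes.
print(runtime_is_suspicious([60, 58, 62], 5))
```

The completion message alone looks like good news; only the comparison with earlier runs shows that something may have gone wrong.
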
2. Resources
There are hundreds of resources on most enterprise
servers, including subsystems, jobs, job queues, devices,
lines, WebSphere MQ servers, Domino servers, and
more. Many of these resources must be in the correct
state at the correct time for business applications to
function properly, but most resources do not generate a
message saying they are not in the correct state—you
don’t get a message when a subsystem isn’t active or a
job didn’t run.
It’s necessary to understand what status every critical
resource must have, at what time, and to proactively
verify that it is so. This area is referred to as work
management on IBM i, and it is a major area that ITSM/
BSM tools have no clue about. Here again, resource
monitoring tools like Robot/CONSOLE keep an eye on
things around the clock to keep your critical systems
and applications available.
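
As a rough illustration of verifying status at the right time, the Python sketch below compares observed resource statuses with a table of expectations. The resource name, statuses, and time window are hypothetical, and the code says nothing about how Robot/CONSOLE is configured; it only shows why the check has to be proactive, since a missing or inactive resource produces no message of its own.

```python
from datetime import time
from typing import Dict, List, NamedTuple


class Expectation(NamedTuple):
    resource: str  # e.g., a subsystem or job queue name (hypothetical)
    status: str    # status the resource must have
    start: time    # start of the window in which that status is required
    end: time      # end of the window


def out_of_state(expectations: List[Expectation],
                 observed: Dict[str, str],
                 now: time) -> List[str]:
    """List resources that should be in a given status right now but are not.

    A resource missing from `observed` counts as out of state.
    Windows that cross midnight are not handled in this sketch.
    """
    problems = []
    for exp in expectations:
        in_window = exp.start <= now <= exp.end
        if in_window and observed.get(exp.resource) != exp.status:
            problems.append(exp.resource)
    return problems


rules = [Expectation("ORDERSBS", "ACTIVE", time(8, 0), time(18, 0))]
print(out_of_state(rules, {"ORDERSBS": "INACTIVE"}, time(9, 0)))
```
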
3. Logs
The history log, system audit log, and communications
logs contain important information that could give
warning when a negative event is happening. Log
entries typically do not generate messages and manually
looking at logs can be a mind-numbing task. Systems
management tools proactively and continuously scan
logs for important entries as they occur and take
immediate action.
4. Batch Job Schedules
Much of the workload on an IBM i server is batch.
Traditional time-based job schedulers such as the
IBM Job Scheduler do not anticipate problems so the
schedules are vulnerable to relatively small hiccups.
Something as simple as a job running too long can
upset several jobs scheduled after it. Implementing
event-based schedulers like Robot/SCHEDULE enables your
schedules to automatically adapt when the unexpected
happens, minimizing the risk of a negative event.
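
The difference between time-based and event-based scheduling can be sketched in a few lines of Python. In this generic illustration (not a description of how Robot/SCHEDULE works internally), a job becomes eligible when its prerequisite jobs have completed, not when the clock reaches a fixed time. The job names are hypothetical.

```python
from typing import Dict, List, Set


def ready_jobs(dependencies: Dict[str, List[str]], completed: Set[str]) -> List[str]:
    """Return jobs that have not run yet and whose prerequisites have all completed."""
    return sorted(
        job for job, prereqs in dependencies.items()
        if job not in completed and all(p in completed for p in prereqs)
    )


# Hypothetical nightly chain: POST waits for EXTRACT, REPORT waits for POST.
chain = {"EXTRACT": [], "POST": ["EXTRACT"], "REPORT": ["POST"]}
print(ready_jobs(chain, set()))
print(ready_jobs(chain, {"EXTRACT"}))
```

If EXTRACT runs long, POST simply starts later; nothing starts against incomplete data, which is the failure a fixed-time schedule risks.
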
5. Agents
Jobs running on IBM i can have dependencies on other
systems. Those dependencies could be to other IBM i
systems or Windows, UNIX, or Linux systems and most
often involve file sharing. Natively, IBM i does not see
much outside of its own, single system environment.
Without native multi-system capabilities, it can’t be
sure if a dependent job has run or not. This could lead to
some serious business problems unless you implement
a system of agents that run on remote systems,
communicate with jobs on IBM i, and coordinate
activities in a harmonious way. Robust job scheduling
software like Robot/SCHEDULE Enterprise automates
file transfers and helps you manage your entire cross-platform job schedule centrally from your IBM i.
6. Multi-System Environments
Some business applications require that multiple logical
partitions (LPARs) and even multiple distributed IBM
i systems behave as if they were one homogenous
environment. This is increasingly necessary in large
enterprises where departments, companies, and
countries find themselves as a subcomponent working
within one LPAR of a multi-LPAR and multi-system
environment. Managing such an extensive environment
manually is complicated, but implementing a multi-
system management technology hides this complexity.
Robot/NETWORK, for example, consolidates performance
data across one or multiple IBM i partitions and readily
displays it on your mobile device.
7. Resource-Heavy Interactive Jobs
Some jobs that require high levels of CPU and IO
(report jobs, for example) are not designed within
an application to run in batch. Allowing users to run
these interactively and on an ad hoc basis can lead to
performance problems. You may work at a company
with an application developed in-house that allows
your sales team to produce sales orders interactively.
At peak times of day you may have multiple users on
the system producing resource-heavy sales orders
interactively, most of them a duplicate effort. Invariably,
this leads to complaints from all departments that the
system is running too slow.
You can solve the issue by implementing a system that
intercepts such requests and converts the job types
from interactive to batch and schedules them at regular,
off-peak times to prevent performance problems. Smart
systems management tools like Robot/REPLAY can also
save you time and eliminate errors in interactive jobs
by capturing your keystrokes in dynamic fields and
automatically repeating your processes in future jobs.
8. Storage
Storage fills up naturally over time with unnecessary or
oversized objects that impact performance and could,
in severe cases, crash the system. Anecdotally, let’s
say Company A is complaining that their system runs
slowly and periodically crashes without reason. When
they restart the system, it runs okay for a while and
then crashes again. A quick look at the storage shows
that the disk utilization is constantly running above
95 percent full. Apparently, Company A feels they have
5 percent left. After running some storage analysis
tools, you discover that they never deleted sales invoice
spooled files—ever! They have over 15 years of invoices
on disk that they’re keeping just in case they need to do
a reprint, which they admit happens rarely.
Collecting data about how your storage is being
consumed, including temporary storage, is essential to
smart systems management. Products like Robot/SPACE
even produce graphs and projections to make sure
storage stays at healthy levels and run regular cleanup
routines to reduce storage utilization automatically.
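
A storage projection of the kind mentioned above can be approximated with a straight-line fit. The Python sketch below is a simplified stand-in, not the method used by Robot/SPACE; the 90 percent threshold and the sample readings are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def days_until_threshold(daily_percent_used: Sequence[float],
                         threshold: float = 90.0) -> Optional[float]:
    """Estimate days until disk utilization reaches `threshold`.

    Fits an ordinary least-squares line to equally spaced daily samples.
    Returns 0.0 if utilization is already at or above the threshold, and
    None if there are too few samples or usage is flat or falling.
    """
    if not daily_percent_used:
        return None
    latest = daily_percent_used[-1]
    if latest >= threshold:
        return 0.0
    n = len(daily_percent_used)
    if n < 2:
        return None
    x_mean = (n - 1) / 2
    y_mean = sum(daily_percent_used) / n
    covariance = sum((i - x_mean) * (y - y_mean)
                     for i, y in enumerate(daily_percent_used))
    variance = sum((i - x_mean) ** 2 for i in range(n))
    slope = covariance / variance  # percentage points gained per day
    if slope <= 0:
        return None
    return (threshold - latest) / slope


# Hypothetical readings: utilization grows by one percentage point per day.
print(days_until_threshold([80, 81, 82, 83]))
```

Knowing roughly how many days remain turns storage from a surprise outage into a scheduled cleanup.
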
9. Backup and Recovery
Saving to tape or virtual tape is your last line of defense
against a disaster. Manual methods involve a lot of work
after business hours with high potential for human error.
Tapes contain your company’s most valuable asset—
its data—and are typically taken off-site. Automated
backup and recovery solutions can eliminate long, ugly
hours for operators, even during restricted state saves.
Robot/SAVE even includes encryption so your data is
protected when it’s off-site.
10. Spooled Output
Users rely on spooled output to run the business, and
they are getting pickier about how they receive spooled
output—everyone wants it a different way and most
want to view it on their laptops or tablets. Spooled
files contain sensitive company information and are
difficult to reproduce once they are lost. It’s important
to automate spooled output management to ensure
the information is automatically archived to the proper
media or distributed securely. Robot/REPORTS also adds
flexibility to report handling processes by automatically
bursting IBM i reports into segments for distribution to
the end users that need them.
11. Performance
Business application performance often impacts business
performance. Systems can be tuned manually to maximize
response times during the day and batch throughput
overnight, but this is a time-consuming, specialized task.
Get it wrong and an immediate negative event may result
in many tickets. Luckily, smart systems management tools
like Robot/AUTOTUNE automatically and dynamically
tune systems based on the actual workloads at any given
time and the business priorities that have been set.
12. SNMP Traps
Simple network management protocol (SNMP)
allows the capture of messages from any connected
network devices, including routers, switches, servers,
workstations, printers, modem racks, time clocks, and
more. It’s also a way for IBM i to communicate events
out. Communicating with all devices essential to the
correct functioning of your business applications is
vital, and SNMP is a good way of doing that. This area
helps to tie domain-level monitoring with ITSM/BSM,
provided only exceptions are sent forward.
Managing by exception is where Robot solutions excel.
Combining Robot/CONSOLE with Robot/NETWORK
and Robot/ALERT allows you to centralize message
management and resource monitoring, distribute
product instructions and objects (such as automation
instructions), and monitor resources and system logs
throughout your network of IBM i systems. And of course,
you can notify up through your layers of enterprise
management via text, email, or SNMP messages.
IV. Easier Ways to Meet SLAs
In recent years, large enterprises have faced new
challenges when it comes to meeting Service Level
Agreements (SLAs). Fueled by innovations such as
cloud-as-a-service computing, sophisticated enterprise
environments are increasingly dependent on virtualized
environments managed by multiple entities (internal
and external), thus creating silos. IT resources are
increasingly scarce in proportion to the challenges
at hand.
As a result, the number of tickets is overwhelming IT
operations. Individual tickets don’t provide sufficient
insight into service issues. Too often tickets fail to notify
IT about problems before there is a business impact.
Escalations keep IT dependent on subject matter
experts. Since the silo model is growing in these larger
organizations, tickets no longer facilitate essential
collaboration across IT and its external partners.
The consequence may be that end users are the first
to identify the existence of a problem that affects
the business.
SLAs are part of a service contract between the
customer and the service provider(s) where a service
is formally defined. An SLA referring to an IBM i
domain will typically have a technical definition that
outlines measurable details for any of these unique
IBM i management categories and others. While every
organization has its own methods for meeting SLAs,
we’ll discuss potential options.
Adaptive Analytics
To tackle growing issues, an innovative new approach
is being pioneered that looks set to disrupt the way
ITSM/BSM traditionally works. This technique is called
adaptive analytics. Adaptive analytics uses advanced
probability algorithms combined with natural language
processing techniques to infer the existence of business-
impacting problems by analyzing tickets. Adaptive
analytics uses natural language searches to link tickets
together across different silos and identify which tickets
are related and which business areas are going to be
impacted. This means various IT stakeholders from
different silos, support staff, subject matter experts, and
customers can come together faster to tackle a problem
collaboratively before there is a business impact.
Adaptive analytics also promises to help consolidate
and reduce the overall number of tickets within ITSM/
BSM systems.
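
To give a feel for what linking tickets by their wording involves, the Python sketch below groups two tickets when their descriptions share enough words. It is a deliberately simple stand-in (word-overlap, or Jaccard, similarity) and not the probability algorithms or natural language processing that adaptive analytics products use; the threshold and the ticket texts are hypothetical.

```python
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lowercase words longer than two characters, as a crude vocabulary."""
    return {word for word in text.lower().split() if len(word) > 2}


def related(ticket_a: str, ticket_b: str, threshold: float = 0.3) -> bool:
    """Treat two tickets as related if their word overlap (Jaccard) meets the threshold."""
    a, b = tokens(ticket_a), tokens(ticket_b)
    if not a or not b:
        return False
    return len(a & b) / len(a | b) >= threshold


# Hypothetical tickets raised in two different silos about the same system.
print(related("Order entry screen slow on PRODSYS01",
              "PRODSYS01 order entry timeouts reported"))
```
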
Although this is an important development, it does
not change the need to tackle all events on the IBM
i domain at the source and in real time in order to
prevent them from becoming tickets in the first place.
The fewer tickets adaptive analytics has to analyze in
the ITSM system, the lower the chance for problems that
impact business.
Enterprise Monitoring Agents
The enterprise monitoring agent (EMA) approach
collects event statuses periodically, which means many
events can happen before they are captured. As there
is no automated resolution, all collected events are
escalated, which means support teams get a lot of
tickets that are already delayed by the time they arrive.
Each ticket must then be escalated to the right operator
for a manual solution, which introduces yet another
delay. These delays and manual actions make it difficult
to meet SLAs, particularly when ticket volumes increase
and support staff headcount is limited. The EMA approach
gives management a good idea of when and why SLAs were
compromised but contributes little toward preventing
the compromise in the first place.
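
The delay introduced by periodic collection is easy to quantify. The Python sketch below assumes, purely for illustration, that an agent collects at fixed multiples of its interval (time 0, one interval, two intervals, and so on); the five-minute interval in the usage line is hypothetical.

```python
import math


def detection_delay(event_time: float, poll_interval: float) -> float:
    """Seconds between an event and the next periodic collection that can see it.

    Assumes collections run at 0, poll_interval, 2 * poll_interval, and so on.
    """
    if poll_interval <= 0:
        raise ValueError("poll_interval must be positive")
    next_poll = math.ceil(event_time / poll_interval) * poll_interval
    return next_poll - event_time


# An event 61 seconds after a collection, with a hypothetical 5-minute interval.
print(detection_delay(61, 300))
```

Under this assumption the worst case approaches one full interval and the average is about half of it, all before any ticket is created or routed.
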
Figure 4: Adaptive analytics promises to consolidate related tickets into groups.
Robot automatically solves issues and stops tickets from being generated in the
first place.
Systems Management Solutions
Systems management solutions like Robot differ
from EMAs in that they trap events and react in real
time (e.g., a job queue that usually has only two jobs
waiting suddenly has ten, causing an unacceptable delay
to processing key business ERP functions). There is no
delay between the event happening and an automated
solution. Non-essential events are suppressed
immediately, and then pre-defined, automated
solutions are applied to any recognizable, unresolved
events. Robot escalates only those events
that require human action. As a result, fewer events
are escalated and they are escalated sooner. This
means fewer tickets; better, faster, and more consistent
solutions; and less work for the over-worked support
teams, making it easier to meet SLAs.
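
The suppress, solve, and escalate sequence described above can be outlined in Python. This is a conceptual sketch under stated assumptions, not Robot code: the event types, the shape of an event, and the fix functions are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Event = Dict[str, str]


def handle_events(events: List[Event],
                  suppressed_types: Set[str],
                  auto_fixes: Dict[str, Callable[[Event], bool]]
                  ) -> Tuple[int, int, List[Event]]:
    """Suppress noise, try a pre-defined fix, and escalate only what is left.

    Returns (suppressed count, auto-resolved count, events to escalate).
    A fix function returns True when it resolved the event.
    """
    suppressed = 0
    resolved = 0
    escalate: List[Event] = []
    for event in events:
        kind = event["type"]
        if kind in suppressed_types:
            suppressed += 1
        elif kind in auto_fixes and auto_fixes[kind](event):
            resolved += 1
        else:
            escalate.append(event)  # the only events that would become tickets
    return suppressed, resolved, escalate


# Hypothetical events: one informational, one with a known fix, one unknown.
incoming = [{"type": "info"}, {"type": "file_full"}, {"type": "hardware"}]
print(handle_events(incoming, {"info"}, {"file_full": lambda e: True}))
```

Only the last element of the result would travel up to the ITSM layer, which is what managing by exception means in practice.
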
IBM i domains can be managed from a central
repository, better known as console consolidation.
This management by exception approach creates a
single status center for all IBM i partitions but still
converts events to an SNMP trap for the ITSM/BSM.
Many large organizations need stepping stones in
their organizations so that the domain-level expert can
immediately be notified and respond to events before
escalation to ITSM/BSM.
V. Robot vs. Enterprise Monitoring Agents
Consider how the enterprise monitoring agent approach
and the Robot approach might affect your SLAs. EMAs
are periodic collectors. They collect event data and
then they pass it to the ITSM layer, which generates a
ticket for a subject matter expert. EMAs do not apply
automated solutions, but sometimes EMAs are used in
place of automation software like Robot on IBM i.
Robot captures events and applies automated and
adaptive solutions in real time. Many events that
happen on IBM i have been seen before so automated
solutions can be defined as sets of rules. For example, if
too many jobs are waiting in a job queue, automatically
move some of them to another queue.
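
The job queue rule in this example can be written down directly. The Python sketch below is only an illustration of the rule as stated; the function name, the job names, and the limit of two waiting jobs are hypothetical, and it does not represent how Robot defines or stores its rules.

```python
from typing import List, Tuple


def rebalance(primary: List[str],
              overflow: List[str],
              max_waiting: int) -> Tuple[List[str], List[str]]:
    """If more than `max_waiting` jobs wait in `primary`, move the excess to `overflow`.

    Jobs at the front of the queue stay where they are; the most recently
    queued jobs are moved. Returns new lists and leaves the inputs untouched.
    """
    if max_waiting < 0:
        raise ValueError("max_waiting cannot be negative")
    if len(primary) <= max_waiting:
        return list(primary), list(overflow)
    return primary[:max_waiting], overflow + primary[max_waiting:]


# Hypothetical queues: four jobs waiting where at most two are acceptable.
print(rebalance(["J1", "J2", "J3", "J4"], ["X1"], 2))
```
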
Both approaches to monitoring have advantages
and disadvantages, and these differ from platform to
platform. IBM i is more complex to monitor than some
other domains because it has so many important and
unique functional areas. We listed 12 of them already in
this paper. IBM i also runs multiple applications in the
same server so it has more processes and the potential
for more events than other servers.
Figure 5: It’s easier to meet your SLAs with Robot than with ITSM/BSM alone.
[Figure 5 diagram: two timelines on a fast-to-slow axis labeled "Time Taken to
Implement Solution". Enterprise Monitoring Agent (EMA): event happens, delay,
collect status, delay, filter, escalate 100%, create ticket, manual solution;
many escalations. Robot Systems Management: event happens, trigger, filter,
auto solution, escalate <10%; fewer escalations.]
Enterprise Monitoring Agent
Custom code – Most EMAs for IBM i are not a ready-to-
deploy solution. Instead you get a tool that enables you to
build a solution by writing custom code. This is expensive,
time-consuming, and often requires an external specialist.
High maintenance – Custom code takes time to maintain.
The more granular the information you are monitoring, the
more likely you will need to maintain/update/modify your EMA
when there are small changes to your environment. This can
lead to frequent, inconvenient, and expensive maintenance
activity, often through external experts. Installing new agents
on new partitions is often time-consuming.
Heavy overhead – EMAs are collectors. Collection typically
requires a lot of resource on the system. The more granular
the collected information, the larger the potential overhead.
You may need to oversize your systems to accommodate this
extra workload.
Low on detail – It’s not easy to get good, granular status
information from IBM i with an EMA unless you write a lot
of custom code. It’s impractical to do that very often, so you
end up with non-detailed statuses (e.g., it’s active/inactive or
a percent of how busy). This does not give you much detail on
what caused a problem. You get alerted only when a big event
has already happened.
No corrective actions – EMAs are not designed to manage
IBM i, just to monitor it. Even after investing in building
EMAs, you’re still left to manually manage your IBM i and
correct problems yourself. EMAs require staff that has the skills
to build scripts to automate, which is a burden on change
management and the application development team.
Delivery lag – For practical reasons, most EMA collections can
happen only on a periodic check. The more frequently they
check the more overhead they consume, so there is usually a
lag between the time an event happens and the time it gets
captured. This means many important escalations are late.
Robot
No custom code – Robot does not require custom code
to monitor an environment. All key IBM i components have
already been considered and configuration is relatively quick
using a combination of a GUI and self-verification reports that
check how well you are doing.
Low maintenance – Robot uses sharable configuration
objects that simplify and accelerate maintenance. Monitoring
logic is configured rather than programmed. You can change a
configuration object in one place and it can have a large effect.
Low overhead – Events are triggered rather than collected.
Triggering does not add significantly to the overhead because
these events are happening anyway. Also, there is no searching
for a status.
Lots of detail – All key functional areas to monitor are
built in. It’s easy to escalate an event, for example
when a high availability replication job is not active and
compromises recovery objectives. This type of granular
monitoring detail provides early warnings of coming problems
so it’s possible to take action before they impact business.
Automated corrective actions – Automated corrective
actions take place instantly and escalations only take place
if automated corrective actions are not possible. This means
management by exception. Managing fewer events makes
a big difference. If you can correct nine out of ten events
automatically, you’re more likely to meet SLAs—and you’ll
have less work to do!
Real-time – Most events are triggered when a message
arrives in a message queue or when a job starts/ends. There is no need to search for these events; you can react to them
instantly. Notification is quicker so potential corrective actions
can start sooner.
Table 1: Robot handles events better than enterprise monitoring agents (EMAs).
This table summarizes the differences between the EMA approach and the Robot approach to monitoring.
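The triggered, correct-at-source pattern in the Robot column can be illustrated with a short sketch. This is not Robot's implementation or API. The class `ExceptionManager`, the message IDs, and the two stand-in actions are hypothetical names chosen for illustration. The sketch only shows the idea: a handler runs when an event arrives, a configured corrective action is tried first, and only events that cannot be corrected are escalated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    message_id: str  # hypothetical identifier, e.g. "REPLICATION_INACTIVE"
    detail: str


@dataclass
class ExceptionManager:
    # Maps a message ID to a corrective action that returns True on success.
    actions: Dict[str, Callable[[Event], bool]] = field(default_factory=dict)
    escalated: List[Event] = field(default_factory=list)

    def on_event(self, event: Event) -> bool:
        """Called when an event arrives (triggered, not polled).

        Returns True if the event was corrected at the source.
        """
        action = self.actions.get(event.message_id)
        if action is not None and action(event):
            return True
        # No rule, or the rule failed: this is the exception a person must see.
        self.escalated.append(event)
        return False


def restart_replication(event: Event) -> bool:
    # Stand-in for a real corrective action; always succeeds in this sketch.
    return True


def free_disk_space(event: Event) -> bool:
    # Stand-in for an action that cannot fix the problem automatically.
    return False


manager = ExceptionManager(actions={
    "REPLICATION_INACTIVE": restart_replication,
    "DISK_THRESHOLD": free_disk_space,
})

events = [
    Event("REPLICATION_INACTIVE", "HA replication job not active"),
    Event("DISK_THRESHOLD", "disk usage above threshold"),
    Event("UNKNOWN_MSG", "no rule configured"),
]
corrected = sum(manager.on_event(e) for e in events)
print(f"corrected={corrected}, escalated={len(manager.escalated)}")
```

In this example one of the three events is corrected automatically and two are escalated. The rules live in a dictionary rather than in program logic, which mirrors the configured-not-programmed point made in the table.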
© 2014 HelpSystems. All trademarks and registered
trademarks are the property of their respective owners.
Robot | A Division of HelpSystems | www.helpsystems.com
United States: +1 952-933-0609 | Outside the US: +44 (0) 870 120 3148
VI. The Benefits of Robot Systems
Management Solutions
Large enterprises with multiple platforms and
increasingly virtualized environments typically need
multiple layers of management with each layer performing a different function and interacting with a
different group of people. While the main role of ITSM/
BSM layers is to collect incoming events from multiple
silos, create tickets, triage and assign them to subject
matter experts, and then track their manual resolution,
Robot systems management solutions automate and
manage IBM i operations and prevent tickets from
being generated in the first place.
For over 30 years, Robot has been the standard for systems management on IBM i. Used by over 2,100
companies in all industries, ranging from SMEs to
large enterprises, the Robot solution includes over 15
products that provide systems management capabilities
for all key areas of IBM i operations.
Robot systems management solutions automate IBM i
operations to reduce the occurrence of negative events
and to automatically correct repeat negative events at
the source, escalating only a handful of notifications for human response. Another benefit of the Robot systems
management solution is that the tools do not require
programmers for implementation. Their fill-in-the-
blank rules can be administered by the team that was
previously managing events manually.
Robot solutions are designed to anticipate problems
and have built-in logic to prevent minor events from
becoming major events. Robot includes sophisticated,
graphical alert consoles that allow managers to monitor
the status of multiple systems at a high level with direct drill-down to the underlying events. Every effort is made
to avoid an eyes-on-glass approach when tackling
individual events. If Robot needs an operator to get
involved, it automatically locates the right person based
on work schedules and subject areas and sends them
an email in real time. Messages can also be sent via
text message (SMS), and some manual fixes can be completed
remotely. If the operator does not acknowledge, Robot
automatically escalates to the next person in line.
Robot also automates SNMP notification so any alert
can notify your ITSM/BSM via an SNMP trap. This is in
addition to or instead of an email or SMS, making it
easy to integrate Robot in an ITSM/BSM environment.
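The routing and escalation behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not Robot's actual logic: the `Operator` record, the `on_call` and `notify_with_escalation` functions, and the sample names and shifts are all hypothetical. The `send` callback stands in for whichever channel is used (email, SMS, or an SNMP trap) and is assumed to report whether the recipient acknowledged.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Operator:
    name: str
    subjects: frozenset  # subject areas this person covers
    shift_start: int     # hour of day, inclusive (0-23)
    shift_end: int       # hour of day, exclusive; shifts do not wrap midnight here


def on_call(operators: List[Operator], subject: str, hour: int) -> List[Operator]:
    """Return operators covering `subject` who are on shift at `hour`, in list order."""
    return [
        op for op in operators
        if subject in op.subjects and op.shift_start <= hour < op.shift_end
    ]


def notify_with_escalation(
    chain: List[Operator],
    send: Callable[[Operator], bool],
) -> Optional[Operator]:
    """Notify each operator in turn and stop at the first who acknowledges.

    `send` delivers the notification and returns True if it was acknowledged.
    """
    for op in chain:
        if send(op):
            return op
    return None  # nobody acknowledged


operators = [
    Operator("alice", frozenset({"backup", "messages"}), 8, 16),
    Operator("bob", frozenset({"backup"}), 8, 16),
    Operator("carol", frozenset({"backup"}), 16, 24),
]

attempts: List[str] = []


def fake_send(op: Operator) -> bool:
    attempts.append(op.name)
    return op.name != "alice"  # alice does not acknowledge in this example


chain = on_call(operators, "backup", hour=10)
handler = notify_with_escalation(chain, fake_send)
print(attempts, handler.name if handler else None)
```

Here the backup alert at 10:00 goes to alice first, she does not acknowledge, and it escalates to bob, who does. A production version would add an acknowledgement timeout and handle shifts that cross midnight.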
For more information about Robot, please visit
www.helpsystems.com.
[Figure 6 labels: Automated Job Scheduling; Multi-System Environment Support; Reports Management; Advanced Message Notification; Disaster Recovery; Message Management; Performance Management]
Figure 6: Robot covers all essential areas of systems management.
VII. Conclusions
On IBM i, Robot is faster and more effective at solving
negative events than ITSM/BSM alone. This is because
ITSM/BSM alone simply collect and forward event data
to be solved later, manually. Robot helps to avoid such events and, when they do occur, solves most of them
automatically and immediately at the source.
• SNMP messages are an industry-standard way
of communicating between systems management,
ITSM, and BSM.
• Many SMEs can manage their IBM i systems
effectively with Robot alone, particularly if their
environment includes Robot/NETWORK, which helps
managers see an overview of systems management
activity across their entire IBM i network.
• Because IBM i has unique management
requirements that set it apart from Windows, UNIX,
Linux, and mainframe, Robot systems management
solutions are a critical part of enterprise management
architecture within large enterprises for all
IBM i domains.
• While adaptive analytics will become useful over
time in helping to consolidate tickets into logically
related groups, helping stakeholders collaborate on
manual solutions, it does nothing to prevent negative
events from happening on IBM i in the first place.
• Robot handles events better than enterprise
monitoring agents (EMAs). It exposes more detail,
consumes less resource, and avoids custom code,
making it easier to manage.
• Robot is the most effective way to tackle events
with a solve-at-source approach so fewer tickets
appear in ITSM/BSM, allowing you to truly manage
by exception.
© 2014 HelpSystems. All trademarks and registered trademarks are the property of their respective owners. (R021IE4)
About HelpSystems
HelpSystems is a leading provider of systems management,
business intelligence, and security and compliance software.
We help businesses reduce data center costs by improving
operational control and delivery of IT services.
The increasing popularity of
cloud-as-a-service solutions may require a multi-layered
enterprise management
response. But beware. Skip
the systems management
layer and you could find
yourself drowning in tickets.
Let’s Get Started
To set up a personal consultation, call 1 800-328-1000
or email [email protected]. We’ll review your
current setup and see how Robot products can help you
achieve your automation goals.