Implementing ITIL for Incident Management


8/7/2019 Implementing ITIL for Incident Management

http://slidepdf.com/reader/full/implementing-itil-for-incident-management 1/17

 

Implementing ITIL for Incident Management 

Hi-Tech HTTD Storage CoE

Brijesh Das M.K.

[email protected]

Date Month Year

CONFIDENTIAL


Table of Contents

Why use ITIL
Incident Management Process
Customer
Challenge
Implementation
Incident Management Goal
Who Reports an Incident
Escalation Policy
Reporting an Incident
Incident Management Activity Workflow
Evaluation Process
Escalation Process
Resolution Process
Inputs
Activities
Outputs
Roles & Responsibilities
Incident Record Keeping
Incident Report
The Result


Why Use ITIL?

Organizations are increasingly dependent upon IT to satisfy their corporate aims and meet their business needs. This growing dependency leads to growing needs for quality IT services: quality that is matched to business needs and user requirements as they emerge.

ITIL provides a comprehensive, consistent and coherent set of best practices for IT Service Management processes, promoting a quality approach to achieving business effectiveness and efficiency in the use of information systems. ITIL processes are intended to be implemented so that they underpin, but do not dictate, the business processes of an organization.

ITIL codifies IT service management best practices. Among the benefits associated with adopting the library's best practices, clients have identified improved customer satisfaction with IT services, better communications and information flows between IT staff and customers, and reduced costs in developing procedures and practices within an enterprise.

ITIL can help with IT provision by providing:

•  Better Customer Service: ITIL helps deliver services that are tailored to the needs of the customer.

•  Better Cost Effectiveness: ITIL assists organizations in providing a quality IT service within a business environment affected by budgetary constraints but also growing user expectations.

•  Better Motivation and Productivity: ITIL encourages staff to view IT Service Management as a recognized professional skill, ultimately increasing effective performance.


Incident Management Process

An incident is an unplanned interruption to an IT service, or a reduction in the quality of an IT service. Failure of a configuration item that has not yet impacted service is also an incident.

The purpose of Incident Management is to restore normal service as quickly as possible, and to minimize the adverse impact on business operations. Incidents are often detected by event management, or by users contacting the service desk. Incidents are categorized to identify who should work on them and for trend analysis, and they are prioritized according to urgency and business impact.
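Prioritizing by urgency and business impact can be illustrated with a small lookup. The following is a minimal sketch only: the three-level scale, the 1 to 5 priority range, and the additive rule are assumptions chosen for illustration and are not taken from this case study.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step rating used for both urgency and impact."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


def priority(urgency: Level, impact: Level) -> int:
    """Return a priority from 1 (most pressing) to 5 (least pressing).

    Assumed rule: add the two ratings and shift the sum onto a 1-5 scale.
    """
    return int(urgency) + int(impact) - 1


# An outage that blocks production right now and affects a whole plant.
top = priority(Level.HIGH, Level.HIGH)
# A minor defect affecting one user with no deadline.
bottom = priority(Level.LOW, Level.LOW)
print(top, bottom)
```

A real service desk would normally agree the matrix with the business and record it in the SLA rather than derive it from a formula.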

If an incident cannot be resolved quickly, it may be escalated. Functional escalation passes the incident to a technical support team with appropriate skills; hierarchical escalation engages appropriate levels of management.

After the incident has been investigated and diagnosed, and the resolution has been tested, the Service Desk should ensure that the user is satisfied before the incident is closed.

An Incident Management tool is essential for recording and managing incident information.

Incident Management links with other processes, activities and functions:

•  Configuration Management Process
•  Change Management Process
•  Problem Management Process
•  Process, procedures and documentation for Training and Knowledge Management

Incident Management is most closely tied to Change Management and Problem Management. Change Management tries to limit the incidents that happen as a result of any change, whereas Problem Management eliminates repetitive incidents by making the known error database available.


Incident Management Goal:

The primary goal of the Incident Management process was to restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. ‘Normal service operation’ is defined here as service operation within Service Level Agreement (SLA) limits.

Incident Management Process Overview


Who Reports an Incident?

•  Customer: Customers are the individuals who commission, pay for, and own the IT Services. A customer is likely to report a service deficiency within the SLA.

•  User: Users are the people who use IT Services on a daily basis. They are likely to report a software application incident or a printer malfunction.

Escalation Policy

‘Escalation’ is the mechanism that assists timely resolution of an incident. It can take place during any activity in the resolution process and is of the following two types:

•  Functional

Transferring an Incident from 1st Level to 2nd Level support groups or further is called ‘functional escalation’. It primarily takes place because of a lack of knowledge or expertise. Functional escalation was also to be followed when agreed time intervals elapsed, so that the (SLA) agreed resolution times were not exceeded.

•  Hierarchical

‘Hierarchical escalation’ would take place at any moment during the resolution process when it was likely that an incident would not be resolved in time. In case of lack of knowledge or expertise, hierarchical escalation was performed manually (by the Service Desk or other support staff). Automatic hierarchical escalation could be considered after a certain critical time interval, when it was likely that a timely resolution would fail. Preferably, this takes place long enough before the (SLA) agreed resolution time is exceeded so that corrective actions by authorized line management can be carried out.
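The distinction between the two escalation types can be sketched as a small decision function. This is an illustration under stated assumptions, not the rule set used in the engagement: the `IncidentState` fields and both threshold fractions are hypothetical, since the text only says that escalation follows "agreed time intervals" and should happen well before the SLA resolution time is exceeded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentState:
    minutes_open: float            # time since the incident was registered
    sla_resolution_minutes: float  # agreed resolution time from the SLA
    handler_has_expertise: bool    # can the current support level resolve it?


# Hypothetical thresholds, expressed as fractions of the SLA resolution time.
FUNCTIONAL_FRACTION = 0.5    # hand over to the next support level
HIERARCHICAL_FRACTION = 0.8  # involve line management before the SLA is breached


def escalation_needed(state: IncidentState) -> Optional[str]:
    """Return 'hierarchical', 'functional', or None for the given state."""
    used = state.minutes_open / state.sla_resolution_minutes
    if used >= HIERARCHICAL_FRACTION:
        # Timely resolution is at risk: management needs time to act.
        return "hierarchical"
    if not state.handler_has_expertise or used >= FUNCTIONAL_FRACTION:
        # Lack of knowledge, or the agreed interval has elapsed.
        return "functional"
    return None


print(escalation_needed(IncidentState(200, 240, True)))
```

Checking the hierarchical condition first reflects the point made above that management should be engaged early enough for corrective action to be possible.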

Reporting an Incident

End Users could report incidents to the Service Desk through the following channels:

• Phone / Fax / Walk-in

• Monitoring Tool (HP OpenView)

• Messaging Services (E-Mail / Messenger Services)


Evaluation Process 

Incident captured through monitoring or reported by the plant

Customer Impacted

•  Determine which systems are impacted
•  Determine the level of the impact (customer down?)
•  Determine if a workaround exists and is being implemented
•  Immediately escalate to
    o  2nd level Business Analyst
•  Inform
    o  SD Manager
    o  Site Analyst
    o  Plant Manager
    o  Regional IT Manager

Plant Production Impacted

•  Determine extent of the impact
•  Determine the types of system impacted
    o  Database: Immediately escalate to 2nd level DBA
    o  SAP: Utilize 1st level SAP troubleshooting matrix
    o  E-Mail: Utilize 1st level email troubleshooting matrix
    o  WAN: Utilize 1st level WAN troubleshooting matrix
    o  LAN: Utilize 1st level LAN troubleshooting matrix
    o  WLAN: Utilize 1st level WLAN troubleshooting matrix
    o  Hardware: Utilize 1st level hardware troubleshooting matrix
•  Problem resolved using the 1st level troubleshooting matrix?
    o  No: escalate to appropriate 2nd level support
    o  Yes:
        -  log incident in HPOV
        -  assign incident to SD Mgr for review
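The "Plant Production Impacted" branch above is essentially a routing table, which can be sketched as a dictionary lookup. The function and variable names are hypothetical; the system types and actions are transcribed from the list above.

```python
from typing import Dict, List

# First action per impacted system type, as listed in the evaluation process.
FIRST_ACTION: Dict[str, str] = {
    "Database": "Immediately escalate to 2nd level DBA",
    "SAP": "Utilize 1st level SAP troubleshooting matrix",
    "E-Mail": "Utilize 1st level email troubleshooting matrix",
    "WAN": "Utilize 1st level WAN troubleshooting matrix",
    "LAN": "Utilize 1st level LAN troubleshooting matrix",
    "WLAN": "Utilize 1st level WLAN troubleshooting matrix",
    "Hardware": "Utilize 1st level hardware troubleshooting matrix",
}


def first_action(system_type: str) -> str:
    """Look up the first step for a system type; unknown types raise KeyError."""
    return FIRST_ACTION[system_type]


def after_troubleshooting(resolved: bool) -> List[str]:
    """Follow-up steps once the 1st level troubleshooting matrix has been tried."""
    if resolved:
        return ["Log incident in HPOV", "Assign incident to SD Mgr for review"]
    return ["Escalate to appropriate 2nd level support"]


print(first_action("SAP"))
print(after_troubleshooting(resolved=False))
```

Raising on an unknown system type is a deliberate choice in this sketch: an unlisted type should be noticed by the agent rather than silently routed.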

Escalation Process 

During all incidents, the service desk representative would do the following:

Track important events during the incident
1.  Ask for clarification as required
2.  Include the root cause if identified during the incident
3.  Include immediate problem resolution activities
4.  Include any permanent problem resolutions that may be identified
5.  Coordinate resources as required during the incident

< 10 Minutes

Initial call or monitoring notification
1.  Determine the cause of the incident
2.  Determine if production is down or impaired
3.  Escalate to Tier II (BA, Infrastructure, DBA) using the on-call matrix
4.  Track incident in HPOV
5.  Open the conference bridge if required by the Tier II support
6.  Transfer the phones to another SD team member if available

@ 10 Minutes

1.  Telephonically notify the Site Analyst
    a.  For SAP or E-Mail notify all affected sites
2.  Telephonically notify the OSC
3.  For SAP or E-Mail notify all affected sites

<= 15 Minutes (Incident resolved, no plant/customer downtime)

1.  Log incident in HPOV
2.  Continue to follow incident management process

@ 15 Minutes

1.  Track the incident using the incident record
2.  Contact the following telephonically
    a.  SD Manager
        i.  If unavailable, contact the Director of Infrastructure
            1.  If unavailable, contact the V.P. of Software Development
    b.  Site Analyst
    c.  OSC or Site contact
        i.  If unable to reach the OSC or site contact, call the plant manager
3.  Send the Incident Notification E-Mail to the following
    a.  CIO
    b.  Plant Manager
    c.  Site Analyst
    d.  OSC
    e.  Regional IT Manager
    f.  Vice President of Application Software
    g.  Technical Services Director
    h.  Software Development Manager
    i.  Application and Software Analysis Manager
    j.  Database and Application Manager
    k.  Service Desk Manager


@ 20 Minutes

1.  Telephonically contact:
    a.  Regional IT Director
    b.  Plant Manager
    c.  Database Manager (as required)
    d.  Software Manager (as required)

@ 25 Minutes

1.  Telephonically contact
    a.  V.P. Software
    b.  Director Infrastructure

@ 30 Minutes

1.  Telephonically contact CIO

@ 45 Minutes (and every 30 minutes thereafter)

1.  Send the Incident Notification E-Mail to the following
    a.  CIO
    b.  Plant Manager
    c.  Site Analyst
    d.  OSC
    e.  Regional IT Manager
    f.  Vice President of Application Software
    g.  Technical Services Director
    h.  Software Development Manager
    i.  Application and Software Analysis Manager
    j.  Database and Application Manager
    k.  Service Desk Manager
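The time-based escalation schedule above can be expressed as data plus a small lookup, which makes the repeating 30-minute e-mail rule explicit. This is a sketch that assumes the incident is still unresolved at each checkpoint; the names `PHONE_STEPS`, `email_minutes` and `actions_due` are hypothetical, and the phone steps are abbreviated from the lists above.

```python
from typing import List, Tuple

# (minutes since first notification, action), abbreviated from the timeline.
PHONE_STEPS: List[Tuple[int, str]] = [
    (10, "Phone the Site Analyst and the OSC"),
    (15, "Phone the SD Manager, Site Analyst, and OSC or site contact"),
    (20, "Phone the Regional IT Director and Plant Manager"),
    (25, "Phone the V.P. Software and Director Infrastructure"),
    (30, "Phone the CIO"),
]

FIRST_EMAIL_MINUTE = 15     # first Incident Notification E-Mail
REPEAT_EMAIL_START = 45     # "@ 45 Minutes (and every 30 minutes thereafter)"
REPEAT_EMAIL_INTERVAL = 30


def email_minutes(elapsed: int) -> List[int]:
    """Minutes at which a notification e-mail is due, up to `elapsed`."""
    due = [FIRST_EMAIL_MINUTE] if elapsed >= FIRST_EMAIL_MINUTE else []
    due.extend(range(REPEAT_EMAIL_START, elapsed + 1, REPEAT_EMAIL_INTERVAL))
    return due


def actions_due(elapsed: int) -> List[str]:
    """All escalation actions that should have happened by `elapsed` minutes."""
    steps = [(minute, action) for minute, action in PHONE_STEPS if minute <= elapsed]
    steps += [(minute, "Send the Incident Notification E-Mail")
              for minute in email_minutes(elapsed)]
    return [f"@{minute} min: {action}" for minute, action in sorted(steps)]


for line in actions_due(50):
    print(line)
```

Keeping the schedule in a table rather than in nested conditionals means a change to the on-call policy only touches the data.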


Resolution Process 

Send the Incident Resolved email to:

1.  CIO
2.  Plant Manager
3.  Site Analyst
4.  OSC
5.  Regional IT Manager
6.  Vice President of Application Software
7.  Technical Services Director
8.  Software Development Manager
9.  Application and Software Analysis Manager
10. Database and Application Manager
11. Service Desk Manager

Perform a debrief of the individuals involved to ensure that the following have been tracked in the Incident Report:

1.  The confirmed timeline
2.  The root cause
3.  Immediate corrective actions
4.  Permanent corrective actions
5.  The incident owner (the SD Manager will assign one as required)
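The debrief checklist lends itself to a simple completeness check before an incident is closed. The sketch below assumes the report is available as a dictionary of text fields; the field names are hypothetical stand-ins for the five debrief items and do not come from the HP OpenView schema.

```python
from typing import Dict, List

# Hypothetical field names, one per debrief item listed above.
REQUIRED_FIELDS: List[str] = [
    "timeline",
    "root_cause",
    "immediate_corrective_actions",
    "permanent_corrective_actions",
    "incident_owner",
]


def missing_debrief_fields(report: Dict[str, str]) -> List[str]:
    """Return the debrief items that are absent or blank in the report."""
    return [name for name in REQUIRED_FIELDS if not report.get(name, "").strip()]


draft = {"timeline": "08:00 alert raised, 08:40 service restored", "root_cause": " "}
print(missing_debrief_fields(draft))
```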


Inputs

• Incident details were sourced from End Users via the Service Desk, and from networks or computer operations via monitoring tools and manual detection during defined operational hours (Service Catalogue)

• Configuration item (CI) details from the Configuration Management Database (CMDB)

• Response from Incident matching against Problems and Known Errors

• Resolution details

• Response on RFC to effect resolution for Incident(s)

Activities

• Incident detection, recording and alerting

• Interrogation, classification, prioritization and initial support

• Investigation and diagnosis: a resolution or workaround had to be established as quickly as possible in order to restore the service to End Users with minimum disruption to their work

• Resolution and recovery: resolution of the Incident and restoration of the agreed service

• Closure

• Incident ownership, monitoring, tracking and communication

Outputs

• Resolved (via Workarounds or Known Errors) and closed Incidents

• RFC for Incident resolution

• Incident record information (including linkages to resolutions and/or Workarounds and/or CI data)

• Communication to Clients and End Users

• Management information (reports and procedural information)


Roles & Responsibilities

1st Level - Service Desk

The Service Desk was responsible for monitoring the resolution process of all registered incidents; in effect, the Service Desk was the owner of all incidents. The Service Desk played an important role in the Incident Management process, as follows:

• The Service Desk was an independent function, monitoring the resolution progress of all registered and reported incidents.

• All Incidents were reported to and registered by the Service Desk. Where detected Incidents were generated automatically, the process still included registration of the incident by the Service Desk (automatically or manually).

• The primary goal of the Service Desk was to resolve the majority of issues at the 1st level itself.

On receipt of an incident notification, the responsibilities and main actions carried out by the Service Desk were:

• Incident detection and recording: record basic details, including timing data and details of the symptoms observed.

• Routing service requests to support groups when incidents were not closed within the time stipulated in the SLA. If a service request had been made, the request was handled in conformance with the organization’s standard procedures.

• Initial support, classification and prioritization: the Configuration Item (CI) reported as the cause of an Incident was selected from the CMDB to complete the Incident record, the appropriate priority was derived, and the End User was given a unique system-generated Incident number for all future communications.

• Tracking of incidents assigned to 2nd level support following unsuccessful resolution at 1st level: the history was updated and the incident was assigned to 2nd level with the relevant details, then assigned back to the Service Desk, which notified the End User.

• Closure of incidents: following the review of classification, the incident record was closed, and details of the resolution action and the appropriate category code were added.

• Ownership, monitoring, tracking and communication.


Specialist Support Groups

The IT department within the company had specialist groups that contributed to the handling and investigation of incidents at critical times. Incidents that could not be resolved immediately by the Service Desk were assigned to specialists within 2nd and 3rd Level Support groups. Support would be involved in tasks such as:

• Monitoring Incident details, including the Configuration Items affected

• Incident investigation and diagnosis (including resolution where possible)

• Detection of possible Problems and the assignment of them to the appropriate Problem Management team for them to raise Problem records

• The resolution and recovery of assigned Incidents.

2nd Level and 3rd Level support were defined as follows:

•  2nd Level Support

IT 2nd Level Support included the Network, Database, Application and Systems teams, which were part of the internal workforce. When an incident required additional 2nd Level resources from internal support teams to assist with investigation and resolution of the error, the Service Desk was responsible for engaging the help of other 2nd Level resources as required.

•  3rd Level Support

IT 3rd Level Support referred to support personnel who were external to the organization, i.e. they worked for an external company, supplier or vendor. When an incident required 3rd Level resources from external support to assist with investigation of the error, the 2nd Level support group assigned to the incident was responsible for engaging the help of those extra resources.


Service Desk Manager

The Service Desk Manager played an important role in Incident Management and had the prime responsibility for ensuring compliance with the process and ensuring the highest standards for ongoing delivery of 1st Level support services. The Service Desk Manager also had the following responsibilities:

• Ownership, monitoring, and keeping effective records of the incident.

• Monitoring the status and progress towards resolution of all open Incidents.

• Keeping affected End Users informed about progress.

• Following the escalation procedure as and when required.

Incident Manager

The Incident Manager had the responsibility for:

• Driving the efficiency and effectiveness of the Incident Management process

• Producing management information

• Managing the workflow of the Incident Management Process

• Monitoring the effectiveness of Incident Management and making recommendations for improvement

• Developing and maintaining the Incident Management systems.


Incident Record Keeping

The record must be maintained throughout the Incident lifecycle so that Service Desk agents can provide an End User with the most up-to-date progress report. Such activities would include:

• Modify status (e.g. ‘new’ to ‘work-in-progress’ or ‘pending’)

• Modify business impact/priority

• Monitor escalation status

• Update history details

• Enter time spent and costs

HP OpenView was used as the authoritative tool to record this information.
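The record-keeping activities above map naturally onto a small record type whose every change is appended to a history. The sketch below is illustrative only and is not the HP OpenView data model or API: the class name, field names, default values and the identifier format are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class IncidentRecord:
    incident_id: str                 # hypothetical identifier format
    status: str = "new"              # e.g. 'new', 'work-in-progress', 'pending'
    priority: int = 3                # assumed numeric scale, lower is more pressing
    minutes_spent: int = 0
    cost: float = 0.0
    history: List[Tuple[datetime, str]] = field(default_factory=list)

    def _log(self, note: str) -> None:
        """Append a timestamped note so the progress report stays current."""
        self.history.append((datetime.now(), note))

    def set_status(self, status: str) -> None:
        self._log(f"status: {self.status} -> {status}")
        self.status = status

    def set_priority(self, priority: int) -> None:
        self._log(f"priority: {self.priority} -> {priority}")
        self.priority = priority

    def add_effort(self, minutes: int, cost: float = 0.0) -> None:
        self.minutes_spent += minutes
        self.cost += cost
        self._log(f"effort: +{minutes} min, +{cost:.2f} cost")


record = IncidentRecord("INC-0001")
record.set_status("work-in-progress")
record.add_effort(20, 15.0)
for _, note in record.history:
    print(note)
```

Routing every change through `_log` keeps the history complete without relying on agents to remember a separate update step.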

Incident Report

The Incident Report showed the entire life cycle of the case, so it was one of the most important aspects of an Incident to keep up to date. Without an incident report, ongoing process improvements would not have been possible. The report had a field for Immediate Corrective Actions, which enumerated the steps taken to resolve the issue, and a Permanent Corrective Action field, which described the future course of action to prevent the incident from recurring. This report was made available to the Problem Management team during the Corrective Action meeting.

The Results

Implementing Incident Management delivered the following benefits:

a) Timelier incident resolution, resulting in reduced business impact

b) Improved user productivity

c) SLA-focused production information

d) Independent, customer-focused incident monitoring