HP & WIND Hellas IT

©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Session ID: BTOT-WE-1630/4 Speaker: Charis Tsevis, Senior Operations Mgr WIND Hellas SA Twitter hashtag #HPSWU

description

Overview of HP IT implementation at Greek telecom operator WIND Hellas

Transcript of HP & WIND Hellas IT

Page 1: HP & WIND Hellas IT

©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

Session ID: BTOT-WE-1630/4
Speaker: Charis Tsevis, Senior Operations Mgr, WIND Hellas SA
Twitter hashtag: #HPSWU

Page 2: HP & WIND Hellas IT

WIND Hellas Company

History

• Established in 1992 as Telestet, the first mobile Telecom operator in Greece

• In 1993, placed the first mobile call in Greece

• In Jan 2006 acquired Q-Telecom, the 4th mobile Telecom operator in the country

• In summer 2007, changed the brand name to WIND

• Oct 2007, acquired Tellas, fixed telephone and broadband Internet provider

• Tellas fully integrated in early 2010

Today

• Only Telecom Operator in Greece that offers all-in-one: Mobile, Fixed, Internet

• More than 5.5 million customers

• 99.6% network coverage across the country

• 400 WIND stores nationwide

• EUR 1.07 billion turnover in 2009

Page 3: HP & WIND Hellas IT

WIND IT Organization - I

• Business Support Services - BSS

Develop, support and implement all core business applications, such as:

- Postpaid Billing

- Prepaid Services

- Service Provisioning

- Point Of Sales

- CRM

- ERP

- Data Warehouse

- Fraud

• Business Solutions by strong vendors, such as:

- Ericsson, LHS

- Oracle (Siebel, BEA-WebLogic)

- SAP

- HP

- Microsoft

Page 4: HP & WIND Hellas IT

WIND IT Organization - II

• Operations Support Services - OSS

- Proactive systems support & maintenance

- 24x7 Monitoring & Reactive Support

- WIND Retail Chain support (400 shops across Greece)

- Provisioning of core infrastructure services

- Infrastructure software implementation & configuration (OpenView, Legato, EMC sw, etc.)

- 3 Data Centers, 300 Enterprise Systems, 250TB Storage Area Network

- Real time systems, mission critical applications

- Use of high-availability options, advanced technologies, Virtualization

• Solutions by strong vendors, such as:

- HP

- Oracle, Sun

- EMC

- CISCO

- Microsoft

Page 5: HP & WIND Hellas IT

WIND IT Approach

Page 6: HP & WIND Hellas IT

Initiative – The 2004 story

• WIND decided in 2002 to introduce a new Prepaid platform

• Provide innovative services to our customer base

• Over 4.5M investment

• Revenue booster

• Real time system

• Platform went into production in Feb 2004

• 11 clustered systems in an AlphaServer / Tru64 environment

• Served 1.5M customers

• 12M monthly revenues

Page 7: HP & WIND Hellas IT

Daily Operational Challenges

We have to…

• Act as a service provider towards internal and external customers

• Meet tight SLAs (99.999% for Prepaid platform)

• Operate as a full 24x7 organization

• Manage Real time systems

So, we must…

• Maintain a robust and reliable event notification mechanism

• Utilize performance statistics and trend analysis results

• Focus on proactive activities and notification

• Extend monitoring to a wider range of activities and events, up to the application level

• Keep a high-level of expertise

• Have efficient support contracts with vendors

Special challenges of the 2004 initiative

• Full manual monitoring cycle of the entire platform took 3hrs

• The Athens 2004 Olympics were due to start 6 months later
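
To put the 99.999% SLA for the Prepaid platform in perspective, the short sketch below converts an availability percentage into a downtime budget. It assumes the SLA is measured over a calendar year (or a 30-day month); the slides do not state the measurement period, and the function name is only illustrative.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the minutes of downtime allowed by an availability target.

    Assumption: the SLA is evaluated over `period_days` of continuous
    24x7 operation; the measurement period is not given in the slides.
    """
    unavailable_fraction = 1.0 - availability_pct / 100.0
    return unavailable_fraction * period_days * 24 * 60


yearly = downtime_budget_minutes(99.999)
monthly = downtime_budget_minutes(99.999, period_days=30)

print(f"99.999% over 365 days: {yearly:.2f} minutes of downtime allowed")
print(f"99.999% over 30 days : {monthly:.2f} minutes of downtime allowed")
```

Under these assumptions the yearly budget is a little over five minutes, which is far shorter than the 3 hours a full manual monitoring cycle took.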

Page 8: HP & WIND Hellas IT

2004 Initiative – Implementation

• HP Operations Manager was chosen as the core monitoring platform

• HP Network Node Manager was chosen as network monitoring tool

• SPIs for Oracle, WebLogic, Tuxedo

• 250 processes to monitor

• 150 logfiles to check / filter

• MIBs for monitoring GSM network connectivity (SS7)

• Event notification through GUI, e-mail and SMS

• Define and classify events with severity types

• Define message groups and recipients

• Service Tree definition

Page 9: HP & WIND Hellas IT

Nowadays

• Significant growth of WIND environment

Year  Systems  %Growth  Virtual  Storage Capacity (TB)  %Growth  Monitors
2004  104      -        0        21                     -        400
2005  128      23.1     0        37                     76.2     -
2006  175      36.7     0        48                     29.7     -
2007  188      7.4      0        75                     56.2     -
2008  202      7.4      0        130                    73.3     -
2009  253      25.2     30       210                    61.5     -
2010  295      16.6     86       250                    19.0     3500

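• Systems from 2004 to 2010: 104 to 295, i.e. the 2010 count is 283% of the 2004 count (about 2.8x)

• Data growth from 2004 to 2010: 21 TB to 250 TB, i.e. 2010 capacity is 1,190% of 2004 capacity (about 11.9x)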

• Evolution of advanced technologies (Oracle RAC, Virtualization, J2EE)

• Rapid launch of new commercial products (3G / VAS services, SmartPhones evolution)

• No tolerance for service unavailability
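
The year-on-year percentages in the table above can be re-derived from the system counts and storage capacities. The sketch below does that and also prints the overall 2004-to-2010 ratios. The data values are copied from the table; the variable and function names are illustrative only.

```python
# (year, number of systems, storage capacity in TB), copied from the table above
ROWS = [
    (2004, 104, 21),
    (2005, 128, 37),
    (2006, 175, 48),
    (2007, 188, 75),
    (2008, 202, 130),
    (2009, 253, 210),
    (2010, 295, 250),
]


def pct_growth(previous: float, current: float) -> float:
    """Percentage increase from `previous` to `current`."""
    return (current - previous) / previous * 100.0


# Year-on-year growth for each consecutive pair of rows
for (_, sys_prev, tb_prev), (year, sys_cur, tb_cur) in zip(ROWS, ROWS[1:]):
    print(f"{year}: systems {pct_growth(sys_prev, sys_cur):+.1f}%, "
          f"storage {pct_growth(tb_prev, tb_cur):+.1f}%")

# Overall change between the first and last year
systems_ratio = ROWS[-1][1] / ROWS[0][1]
storage_ratio = ROWS[-1][2] / ROWS[0][2]
print(f"Systems 2010 vs 2004: {systems_ratio:.2f}x")
print(f"Storage 2010 vs 2004: {storage_ratio:.2f}x")
```

The slide's two headline percentages match these ratios expressed as percentages of the 2004 value; the increase over the 2004 base is 100 percentage points lower in each case.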

Page 10: HP & WIND Hellas IT

Adapt IT Operations to new commercial needs

• Enhance and enrich the event notification environment

• Define better communication and escalation paths for incident management

• Measure the results and redefine the environment

• Next Steps : Monitor & Measure the core Business

Page 11: HP & WIND Hellas IT

Enhance & Enrich the event notification environment

• Not just plain monitoring (“System up”, “System down”)

• Introduce application monitoring based on

- Business Rules (e.g. specific action on MSISDN failed)

- Flow control (e.g. check for wrong delivery of Call Detail Records, flow inactivity over time etc.)

- Control smooth application execution (message queues, concurrency of application processes)

• Strict event classification and escalation path per event

• Integrate foreign monitoring systems (Ericsson SLM, Building Management System)

• Immediate checks, Smart checks (“no-login” checks for standby engineer assistance)

• Agent health check

• Service Tree Enhancement

Page 12: HP & WIND Hellas IT

Enhance & Enrich the event notification environment – Real Examples

• An alarm is raised with message “Action for MSISDN xxx Failed”

- If the message appears once, the customer performed an illegal action; inform the Customer Service dept so they can contact the customer

- If the message appears more than 100 times in a minute, raise CRITICAL and check the Provisioning platform

• Flow control check

- Check under directory </dir>

- Filenames with pattern “WSDP_*”

- If fewer than 2 files are received per 3 minutes, raise WARNING; on 3 WARNINGS raise CRITICAL

OR

- If file size is less than 2KB, raise CRITICAL alarm

• Message queue check

- Check provisioning queue length

- If length > threshold1, raise WARNING

- If length > threshold2, raise MAJOR

- If length > threshold3, raise CRITICAL
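
The three rules above can be sketched as plain functions. This is only an illustration of the decision logic, not the Operations Manager policies used at WIND. Several details are assumptions: counts between 2 and 100 per minute are treated like a single occurrence, the three WARNINGS are counted as consecutive 3-minute windows, 2KB is taken as 2048 bytes, and all names and threshold values are hypothetical.

```python
from typing import Optional, Sequence


def msisdn_failure_severity(count_last_minute: int) -> Optional[str]:
    """Rule 1: number of 'Action for MSISDN xxx Failed' messages in one minute."""
    if count_last_minute > 100:
        return "CRITICAL"  # check the Provisioning platform
    if count_last_minute >= 1:
        # Assumption: 2..100 occurrences are handled like a single one.
        return "NOTIFY_CUSTOMER_SERVICE"
    return None


class FlowCheck:
    """Rule 2: evaluated once per 3-minute window over files matching WSDP_*."""

    def __init__(self, min_files: int = 2, min_size_bytes: int = 2 * 1024,
                 warnings_to_critical: int = 3) -> None:
        self.min_files = min_files
        self.min_size_bytes = min_size_bytes
        self.warnings_to_critical = warnings_to_critical
        self.warnings = 0  # assumption: consecutive windows, reset when healthy

    def evaluate(self, file_sizes: Sequence[int]) -> Optional[str]:
        """`file_sizes` holds the size in bytes of each file seen in the window."""
        if any(size < self.min_size_bytes for size in file_sizes):
            return "CRITICAL"
        if len(file_sizes) < self.min_files:
            self.warnings += 1
            if self.warnings >= self.warnings_to_critical:
                return "CRITICAL"
            return "WARNING"
        self.warnings = 0
        return None


def queue_severity(length: int, warning: int, major: int, critical: int) -> Optional[str]:
    """Rule 3: map the provisioning queue length to a severity."""
    if length > critical:
        return "CRITICAL"
    if length > major:
        return "MAJOR"
    if length > warning:
        return "WARNING"
    return None


if __name__ == "__main__":
    flow = FlowCheck()
    for window in ([4096], [4096], [4096], [4096, 8192]):
        print("flow:", flow.evaluate(window))
    # Threshold values here are made up for the example.
    print("queue:", queue_severity(250, warning=100, major=200, critical=500))
```

Keeping each rule as a small pure function (or a small stateful object for the flow check) makes the thresholds easy to tune per application without touching the alerting path.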

Page 13: HP & WIND Hellas IT

Immediate Checks / Smart Checks

• When an incident occurs, the operator calls the standby engineer

• Before logging in, the standby engineer asks the operator to run some predefined checks / scripts

Page 14: HP & WIND Hellas IT

Build Service Tree

• Helps the standby engineer understand the service impact

Page 15: HP & WIND Hellas IT

Define better communication & escalation paths - I

[Diagram: communication and escalation paths]

• Monitored landscape: Systems A, B and C (each with Database, Application 1 and Application 2) and Network Devices A, B and C, connected over the Corporate LAN/WAN

• OpenView Server (OM & NNM) collects events / messages with severity Critical, Major, Warning or Minor

• 24 x 7 Shift Operators: Sys & DB console, Network console, Application console

• 8 x 5 & Standby Engineers: System & Database Admins, Network Engineers, Application Engineers A / B / C

• Notification channels: all messages in the group console GUI; Mail / SMS for Critical & Major messages; Phone Call

Page 16: HP & WIND Hellas IT

Define better communication & escalation paths - Example

[Diagram: escalation example over the landscape of Systems A, B and C, Network Devices A, B and C and the Corporate LAN/WAN, monitored by the OpenView Server (OM & NNM)]

• Events / messages: Critical, Major, Warning, Minor

• 24 x 7 Shift Operators: Sys & DB console, Network console, Application console

• 8 x 5 & Standby Engineers: System & Database Admins, Network Engineers, Application Engineers A / B / C

• Phone call to the Standby engineer (non-working hours)

• Phone call to the Responsible engineer (working hours)
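
The notification and escalation paths of the last two slides can be sketched as a small routing function. The severities, the console / mail / SMS / phone channels and the working-hours split come from the slides. Everything else is an assumption made for illustration: that only CRITICAL events trigger a phone call, that "8 x 5" means Monday to Friday 09:00-17:00, and all function and label names.

```python
from datetime import datetime
from typing import List

SEVERITIES = ("CRITICAL", "MAJOR", "WARNING", "MINOR")


def is_working_hours(ts: datetime) -> bool:
    """Assumption: '8 x 5' is Monday-Friday, 09:00-17:00 local time."""
    return ts.weekday() < 5 and 9 <= ts.hour < 17


def route_event(severity: str, group: str, ts: datetime) -> List[str]:
    """Return the notification actions for one event.

    `group` is the message group (e.g. 'sysdb', 'network', 'application');
    the group names are hypothetical.
    """
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")

    # All messages appear in the console GUI of their group.
    actions = [f"console:{group}"]

    # Mail / SMS only for Critical and Major messages.
    if severity in ("CRITICAL", "MAJOR"):
        actions.append(f"mail_sms:{group}")

    # Assumption: the operator phones an engineer for CRITICAL events only.
    if severity == "CRITICAL":
        target = "responsible" if is_working_hours(ts) else "standby"
        actions.append(f"phone:{target}:{group}")

    return actions


if __name__ == "__main__":
    weekday_morning = datetime(2010, 6, 16, 10, 0)  # a Wednesday
    saturday_night = datetime(2010, 6, 19, 23, 0)   # a Saturday
    print(route_event("CRITICAL", "network", weekday_morning))
    print(route_event("CRITICAL", "network", saturday_night))
    print(route_event("WARNING", "application", weekday_morning))
```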

Page 17: HP & WIND Hellas IT

Measure the results and redefine the environment

• Average number of events per day (Critical, Major, Warning): 11,700

• Average response time of engineer to take over, Working hours : 5min

• Average response time of engineer to take over, Non-Working hours : 15min

[Chart: percentage of daily events per severity]

Page 18: HP & WIND Hellas IT

Next Steps : Monitor & Measure the core Business

• Beta testing of OMi9

- Selected by the HP-Boeblingen team as candidates for beta testing

- Provide feedback to HP for improvements

• Pilot run, proof of concept for EUM

- Improve quality of service towards our retail chain

- Infrastructure monitoring already in place through OM

- Focus on end-user-experience and measure core business transactions

Page 19: HP & WIND Hellas IT

OMi9 - Feedback

Main features, areas of interest

• Unified environment, common look and feel for the entire BSM product suite (APM, EUM, OMi, etc)

• Intuitive and well-designed GUI

• UCMDB, CIs

• Dynamic update of CIs

• Association of events with CIs

• Correlation Rules -> Focus on root cause of a problem

• Flexible and efficient graphical representation of landscape on top of the Event Management Foundation (the well-known Operations Manager):

- Site Map

- Topology View, TBEC

- Health Perspective View, same as in BAC

- Performance Perspective

• Integration with BAC, SiteScope, Service Desk

Page 20: HP & WIND Hellas IT

OMi9 – GUI environment

Page 21: HP & WIND Hellas IT

EUM – Proof Of Concept

Scope

• Measure core-business transactions exactly as the end-users experience them

• Increase proactive actions

• Reduce service unavailability

• Take corrective actions, based on findings, to improve quality of service

Implementation

• Selected 3 different shops across Greece

• Defined core transactions to monitor

• Environment setup and configuration

• Record business transactions (BPM)

• Deploy monitors

• Create baseline

• Start real monitoring & measurement
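
The "create baseline, then measure" steps above can be illustrated with a minimal sketch. It is not how BPM or BAC computes baselines. It assumes the baseline is the median response time of the recorded transaction runs and that a measurement is flagged when it exceeds the baseline by a fixed tolerance factor. The sample values, the factor and all names are made up for the example.

```python
from statistics import median
from typing import Sequence


def build_baseline(samples_s: Sequence[float]) -> float:
    """Baseline response time in seconds (assumption: median of the samples)."""
    if not samples_s:
        raise ValueError("at least one sample is required")
    return median(samples_s)


def is_degraded(measured_s: float, baseline_s: float, tolerance: float = 1.5) -> bool:
    """Flag a measurement that exceeds the baseline by the tolerance factor."""
    return measured_s > baseline_s * tolerance


# Hypothetical response times (seconds) of one core transaction from one shop
recorded = [2.1, 1.9, 2.0, 2.4, 2.2]
baseline = build_baseline(recorded)

for measurement in (2.5, 4.0):
    status = "DEGRADED" if is_degraded(measurement, baseline) else "ok"
    print(f"measured {measurement:.1f}s against baseline {baseline:.1f}s: {status}")
```

A median is used here because it is less sensitive to a single slow run than a mean; a percentile-based baseline would be an equally reasonable choice.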

Page 22: HP & WIND Hellas IT

BAC Results - I

Page 23: HP & WIND Hellas IT

BAC Results - II

Page 24: HP & WIND Hellas IT

BAC Results - III

Page 25: HP & WIND Hellas IT

Benefits

• Increased availability times

• More proactive than reactive

• Kept low headcount in Operations

• Reduced emergency overtime of standby Operations personnel

• Better systems resource utilization according to trend analysis through OVPM

• Decreased Fraud cases due to increased availability times (revenue gain)

• Decreased negative prepaid balances due to increased availability of Prepaid systems

• Improved company image due to increased availability times

• EUM results convinced the company to invest in this platform

Page 26: HP & WIND Hellas IT

Roadmap

• Enhance application monitoring to include more business logic

• Upgrade to OMi9/NNMi

• Invest in a bundle of HP software products such as

- Service Manager, Service Discovery, Asset Manager, Program Manager, BPM/RUM

- Service Level Management, Change Management, Server Automation

- VMware Infrastructure SPI

• In negotiations with HP on an Enterprise License Agreement for the above products

Page 27: HP & WIND Hellas IT

Continue the conversation with your peers at the HP Software Community hp.com/go/swcommunity