INSURANCE INVESTMENTS LOANS MORTGAGES PENSIONS SAVINGS BANKING CREDIT CARDS
DSI Investigation & Practical Health Modelling
Kev Robinson
&
Rob Morgan
MICROSOFT INFRASTRUCTURE ARCHITECT FORUM : October 2005
Agenda
• Introduction & Background
• Elements of DSI
• Aims
• Build & Deployment
• Monitoring & Creating a Health Model
• Dynamic Systems Initiative
– Availability & Resource on demand
• Conclusions
Background
Nationwide Building Society
Nationwide
• The World’s largest Building Society
• Seventh largest financial organisation in the UK
• 9% par share of the UK retail savings balances – 2nd largest
• 11.8% par share of the UK residential mortgage lending - 4th largest
• 11 million customers
• 1st On-line banking offering in UK
• 1 in 4 UK households have a relationship with Nationwide
• 16,000 employees
• Around 880 Retail Outlets
• Over 2,350 ATMs
• £112 billion assets
Nationwide – Technology
• Technology Division
– Approx. 1350 employees
• www.nationwide.co.uk
• Business Systems & Services include:
– Online Banking
– Payments processing
– Mortgage and loans systems
– Customer Relationship Management
– Point of sales systems
– Call centre technologies
– Regulatory systems, e.g. BASEL II
– … Total 130+ systems
Background
Why Investigate DSI?
Background to investigating DSI
• Why were we interested in doing this?
• History
– A service was a server or group of servers
– Shared environments
– More complex relationships between systems
• Improved delivery of infrastructure
• Improved Enterprise Systems Management
• Service monitoring was previously “Server Monitoring”
• Options to reduce the TCO of Technology Infrastructure
Aims
How to best deploy and manage new systems and technologies?
Challenges
• Builds
– Manual builds aren’t sustainable
– Lack of consistency
• Monitoring
– Eventlog monitoring & performance counters aren’t sufficient
– Systems Management needs to be part of the development process
• Load Balancing & Clustering
– So what is running where?
– When is a service reduced or unavailable?
• Adapting to a changing business
– Additional processing required at key times of the year
Proof points
• Provisioning and de-provisioning
• Automate wherever possible
• Building an effective health-model
• End to end systems management
• Prove high-availability technologies
• Review published best practices
• Demonstrate DSI principles
Best position Nationwide with technology and skills to run the core infrastructure of future application delivery.
With technology that is available today
Technologies Involved
• Automating the build of servers
– Automated Deployment Services (ADS)
• Automating the deployment of standard components
– Automated Purposing Framework (APF)
• Deploy new releases of applications
– Standard approach to application install (MSI technology)
• Monitor & manage systems – reporting of problems
– Microsoft Operations Manager 2005 (MOM)
• Review how to improve infrastructure availability
– Database Clustering, Load-balancing, etc.
Automated Builds
Why automate builds?
• Automated Builds
– Manually building servers and associated installs can take hours
– Consistency of builds and between environments
– Reduce the time taken to resolve problems
– Improve configuration management
– Knowledge management – knowledge held in scripts rather than in people’s heads
• Provision on demand & provision on failure
• Availability: Rapid deployment and reaction to events in a controlled and managed way
• Business: Ability to deploy systems when business process or demand changes
Server build
[Diagram: the layers of a server build and the technology used to deploy each]
• Windows Server (Standard Windows O/S) – ADS
• Standard Components – APF
– AntiVirus
– Backup
– MOM
– Custom Settings
– Security settings
– Scheduling agent
• System Products – APF
– Role: BizTalk, SQL Server, Web Server, Web Services, Application Server
• Business Applications – APF & MSI
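The build layers on this slide could be held as data, so that a server is assembled from its role and not from memory. What follows is a minimal Python sketch under that assumption. It is not the ADS or APF interface: `BuildStep`, `build_plan` and the example application name are hypothetical, and only the layer, component, role and tool names come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildStep:
    layer: str  # build layer the step belongs to
    item: str   # what gets installed
    tool: str   # deployment technology named on the slide


# Standard components applied to every server (names from the slide).
STANDARD_COMPONENTS = [
    "AntiVirus", "Backup", "MOM",
    "Custom Settings", "Security settings", "Scheduling agent",
]

# Server roles listed on the slide.
ROLES = {"BizTalk", "SQL Server", "Web Server", "Web Services", "Application Server"}


def build_plan(role: str, applications: list[str]) -> list[BuildStep]:
    """Return ordered steps: O/S, standard components, role, then applications."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    plan = [BuildStep("Windows Server", "Standard Windows O/S", "ADS")]
    plan += [BuildStep("Standard Components", c, "APF") for c in STANDARD_COMPONENTS]
    plan.append(BuildStep("System Products", role, "APF"))
    plan += [BuildStep("Business Applications", a, "APF & MSI") for a in applications]
    return plan


# Hypothetical example: a web server hosting one business application.
plan = build_plan("Web Server", ["Example Application"])
for step in plan:
    print(f"{step.tool:10} {step.layer}: {step.item}")
```

Keeping the plan as an ordered list means the same definition can drive every environment, which is the consistency argument made for automated builds.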
Automated Builds
• What do you do?
– Manual builds
– Deploy just the base Operating System
– Operating System & Standard Components
– Automate complete system builds
Release Deployment
• Consistent delivery from development into production
• Scheduled release rather than manually intensive
• Capability for Operations to release changes
– Including the ability to regress (roll back) a release
• Help reduce the deployment manpower
– Improve speed to deliver
– Reduce complexity
– Provide consistency
– Better configuration & change management
• Availability: Slick release and regression
• Business: Improved reliability of releases
System Monitoring
[Diagram: MOM architecture – Agents, MOM Server, DB, System Center Data Warehouse, Reporting, Ops Console, Admin Console, Web Console]
System Monitoring & Management
• Traditional Alert Management
– Monitor event logs
– Monitor performance statistics
– Event correlation
• But…
– What about load-balanced systems?
– What about differentiating between degraded service and service outage?
– Is all OK when there are no alerts?
– Is the service down when there are dozens of events?
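The questions above turn on the difference between server state and service state. As a rough illustration, the Python sketch below derives one service state from the health of its load-balanced nodes. The rule used (all nodes healthy is up, none healthy is down, anything else is degraded) is an assumption made for the example, not a rule taken from the slides or from MOM.

```python
from enum import Enum


class ServiceState(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


def service_state(node_healthy: list[bool]) -> ServiceState:
    """Derive a single service state from its load-balanced nodes."""
    if not node_healthy:
        raise ValueError("a service needs at least one node")
    healthy = sum(node_healthy)
    if healthy == len(node_healthy):
        return ServiceState.UP
    if healthy == 0:
        return ServiceState.DOWN
    # Some nodes are still serving requests: reduced, not unavailable.
    return ServiceState.DEGRADED


print(service_state([True, False]))   # one of two nodes lost
print(service_state([False, False]))  # both nodes lost
```

A single failed node raises server alerts in both cases, but only the second case is a service outage.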
Health Modelling
Rob Morgan
Stay Aware
Effectively Respond
Be Accountable
Health/Task Model – Background
• During development, applications and services are coded to produce numerous alerts and messages.
– No consistent understanding of how these map to service availability.
– No view on application coverage
• Task Model
– No consistent delivery of required processes
• To recover service should an outage occur
• Administer the service
Service Composition
• Server estate contains hundreds of interconnecting:
– Servers
– Services
– Components
• Elements are complex and not consistently recorded
• Entity Hierarchy
– Host
• The location on which components are hosted or accessed
– Components
• Applications or services installed on a logical host
– Service
• Highest level description of the end to end solution
Component Relationships
• Relationships
– Consumes
• The component that receives information
– Consumed
• The component that provides information
• Impact
– Green
• Outage of the component has no impact
– Amber
• Outage of the component will degrade the operation of the dependent components
– Red
• Outage of the component will stop the operation of the dependent component
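One way to make these relationships usable is to record each provider and consumer pair with its impact rating, then query the table when a component fails. The Python sketch below assumes that representation. The component names are borrowed from the example mapping in this deck, but the direction of each relationship and every rating shown are invented for illustration.

```python
from enum import Enum


class Impact(Enum):
    GREEN = 0  # outage has no impact on the consumer
    AMBER = 1  # outage degrades the consumer
    RED = 2    # outage stops the consumer


# (provider, consumer) -> impact on the consumer if the provider is out.
# Hypothetical ratings: the slides do not state them.
RELATIONSHIPS = {
    ("Payment Submit", "Payment Workflow"): Impact.RED,
    ("Work Allocation", "Payment Workflow"): Impact.AMBER,
    ("Payment Batch Status", "Payment Workflow"): Impact.GREEN,
}


def impact_of_outage(failed: str) -> dict[str, Impact]:
    """Direct impact on every component that consumes from `failed`."""
    return {
        consumer: impact
        for (provider, consumer), impact in RELATIONSHIPS.items()
        if provider == failed
    }


def worst_impact(failed: list[str], consumer: str) -> Impact:
    """Combine several outages for one consumer: the most severe rating wins."""
    ratings = [RELATIONSHIPS.get((p, consumer), Impact.GREEN) for p in failed]
    return max(ratings, key=lambda rating: rating.value, default=Impact.GREEN)


print(impact_of_outage("Payment Submit"))
print(worst_impact(["Work Allocation", "Payment Batch Status"], "Payment Workflow"))
```

Taking the worst rating is one simple aggregation choice; a real model might weigh redundancy between providers differently.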
Developing the model
An Example Component Relationship Mapping
[Diagram: Server01, Server02 and Server03; Payment Submit, Payment Workflow, Work Allocation and Payment Batch Status; Workflow System and Payment System]
Health Model Hierarchy
Payment Workflow
Work Allocation
Payments System
Workflow System
Payment Batch Status
Payment Submit
Operational Conditions
Managed Entity – Payment Submit
Aspect – Queue Lengths
Up
Down
Degraded
• High level condition
• Impact to the end user
Method of Collection
Managed Entity – Payment Submit
Aspect – Queue Lengths
Up
Down
Degraded
T3
T2
T1
Example items:
• Eventlog
• Standard Info
• Component
• Blame component
• State before
• State afterwards
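The transitions (T1 to T3) between Up, Degraded and Down can be treated as records that carry the example items listed above. The Python sketch below shows one possible shape for that. The queue-length thresholds are hypothetical (the slides give no figures), and reading "Blame component" as the component held responsible for the change is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds for the "Queue Lengths" aspect.
DEGRADED_AT = 100
DOWN_AT = 1000


def condition(queue_length: int) -> str:
    """Map a queue length onto the three operational conditions."""
    if queue_length >= DOWN_AT:
        return "Down"
    if queue_length >= DEGRADED_AT:
        return "Degraded"
    return "Up"


@dataclass
class Transition:
    component: str
    blame_component: str
    state_before: str
    state_after: str


def detect_transition(component: str, blame_component: str,
                      previous_state: str, queue_length: int) -> Optional[Transition]:
    """Return a Transition when the condition changes, otherwise None."""
    new_state = condition(queue_length)
    if new_state == previous_state:
        return None
    return Transition(component, blame_component, previous_state, new_state)


# Payment Submit's queue grows past the (assumed) degraded threshold.
print(detect_transition("Payment Submit", "Payment Workflow", "Up", 150))
```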
Managed Entity – Payment Submit
Aspect – Queue Lengths
State Detector
Provider
NT event log
Perfmon data
WMI
SNMP
Log files
Syslog
Criteria
Where source=DCOM and Event ID=1006
Detector – What has gone wrong
Managed Entity – Payment Submit
Aspect – Queue Lengths
State Diagnoser
Response
Alert
Script
SNMP trap
Pager
Task
Managed Code
File Transfer
Diagnose – Identify Root Cause
Diagnoser
Resolver
Response
Alert
Script
SNMP trap
Pager
Task
Managed Code
File Transfer
Resolver – How do we fix it?
Managed Entity – Payment Submit
Aspect – Queue Lengths
State Verifier
Provider
NT event log
Perfmon data
WMI
SNMP
Log files
Syslog
Criteria
Where source=DCOM and Event ID=1006
Verifier – Has the problem been resolved?
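The four stages (Detector, Diagnoser, Resolver, Verifier) read as a loop: detect with a provider and criteria, diagnose, respond, then re-apply the same criteria to verify. The Python sketch below walks that loop with plain functions. It does not use or imitate the MOM 2005 interfaces: events are simple dictionaries, and the provider, diagnosis text and fix are stand-ins supplied by the caller.

```python
from typing import Callable, Iterable

Event = dict  # e.g. {"source": "DCOM", "event_id": 1006}


def detector(events: Iterable[Event]) -> bool:
    """Detector: apply the slide's criteria (source=DCOM and Event ID=1006)."""
    return any(e.get("source") == "DCOM" and e.get("event_id") == 1006
               for e in events)


def run_health_check(read_events: Callable[[], list[Event]],
                     diagnose: Callable[[], str],
                     resolve: Callable[[str], None]) -> str:
    """Detect, diagnose, resolve, then verify by reading the provider again."""
    if not detector(read_events()):
        return "healthy"
    cause = diagnose()   # Diagnoser: identify root cause
    resolve(cause)       # Resolver: how do we fix it?
    # Verifier: has the problem been resolved?
    return "unresolved" if detector(read_events()) else "resolved"


# Stand-in provider and response, for illustration only.
event_log = [{"source": "DCOM", "event_id": 1006}]
actions_taken = []


def fake_resolve(cause: str) -> None:
    actions_taken.append(cause)
    event_log.clear()  # pretend the fix stopped the errors


result = run_health_check(lambda: list(event_log),
                          lambda: "hypothetical root cause",
                          fake_resolve)
print(result, actions_taken)
```

Reusing the detector as the verifier mirrors the slides, where both stages share the same provider list and criteria.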
Method of collection – Task Model
Managed Entity – Payment Submit
Aspect – Queue Lengths
Up
Down
Degraded
T3
T2
T1
Example items:
• Description
• Operator Instructions & procedures
• Roles
• Frequency
Health Model
Health Model & Components
[Diagram: a Service View over Business Applications and their Components]
Limitations
• How to implement within MOM 2005
• Confined to physical servers
• No service view
• Server role versus component
System Monitoring – Summary
Business & Information Monitoring
Kev Robinson
Additional Monitoring & Reporting
• Business reporting
– Reporting business state rather than technical
• Business events
– e.g. suspicious payment data being processed – data driven alerts
• Operator tasks
– Restore service using common tasks
– Improve 1st level support
• Availability: Full proactive management. Automated responses. Visible “health-state”
• Business: Reduced downtime. Informed of what is happening. Reporting on service level exceptions
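A data-driven alert of the kind mentioned under business events inspects the data being processed, not the server processing it. The Python sketch below is one hedged illustration. The `Payment` fields, the amount limit and the duplicate rule are all invented for the example; the slides do not say what makes payment data suspicious.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payment:
    reference: str
    account: str
    amount_pence: int


def suspicious_payments(batch, limit_pence=1_000_000):
    """Yield (payment, reason) for payments that break a simple data rule."""
    seen = set()
    for payment in batch:
        key = (payment.account, payment.amount_pence)
        if payment.amount_pence > limit_pence:
            yield payment, "amount over limit"
        elif key in seen:
            yield payment, "possible duplicate"
        seen.add(key)


batch = [
    Payment("P1", "A", 500),
    Payment("P2", "A", 500),        # same account and amount as P1
    Payment("P3", "B", 2_000_000),  # above the assumed limit
]
for payment, reason in suspicious_payments(batch):
    print(payment.reference, reason)
```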
DSI
Dynamic Systems Initiative
DSI – Building Blocks
• Model based development tools
– System Definition Model
– Coming with Visual Studio 2005
• Operationally aware applications
– Management packs for MOM
– Application instrumentation
• Model-based Management
– State view & Service availability reporting
– Health models
• Dynamic Resource Availability
– Automated builds and deployment
– Performance governed deployment
– Improved hardware utilisation
DSI
Resilience & Dynamic Resource
Why Dynamic Resource?
• Meeting performance needs
– Year ends
– Business growth
– Internet adoption
– “Hidden internet threats”
• Biggest “hidden threats” to performance?
– Is it security?
– Is it hacking?
– Is it phishing?
– Clue: Insider threat…
[Illustration: MARKETING]
– Safe advert: XXX
– Unsafe advert: £10 voucher for every 10,000th eBanking sign-on
Resilience
[Diagram: Users connect to a Technology Service made up of two sets of Web, BizTalk and Database servers, with a SQL Cluster. Panels show: 1 server unavailable; 1 database taken down for maintenance; both servers unavailable.]
Dynamic Resourcing
Pool of Servers.
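A pool of servers supports both provision on demand and provision on failure. The Python sketch below models only the bookkeeping, assuming an in-memory pool. `ServerPool` and its methods are hypothetical names, and in practice the provisioning step would hand over to the automated build, which is not shown here.

```python
class ServerPool:
    """Spare servers that can be assigned to services and returned later."""

    def __init__(self, servers):
        self.free = list(servers)
        self.assigned = {}  # server name -> service name

    def provision(self, service):
        """Provision on demand: take the next spare server for a service."""
        if not self.free:
            raise RuntimeError("no spare servers in the pool")
        server = self.free.pop(0)
        self.assigned[server] = service
        return server

    def deprovision(self, server):
        """Return a server to the pool once demand has passed."""
        del self.assigned[server]
        self.free.append(server)

    def replace_failed(self, failed_server):
        """Provision on failure: assign a spare to the failed server's service."""
        service = self.assigned.pop(failed_server)
        return self.provision(service)


pool = ServerPool(["Server01", "Server02", "Server03"])
first = pool.provision("Online Banking")   # extra capacity at a key time of year
second = pool.replace_failed(first)        # the first server fails
print(first, second, pool.assigned, pool.free)
```

The failed server is deliberately not returned to the free list, so it cannot be handed out again until someone has rebuilt it.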
Conclusions
• Automate builds and deployments where possible
• Develop a health-model for business systems
– Needs to be factored in with the application development
• Improved monitoring & reporting
• Automated recovery and pro-active responses
• Make better use of hardware, reducing TCO
• Thank you
• Kev Robinson [email protected]
• Rob Morgan [email protected]
• Any questions?
© Nationwide Building Society, 2005. Some images © Microsoft Ltd.