Enhancing Catastrophic Risk Analysis with IBM Puredata for Analytics


Transcript of Enhancing Catastrophic Risk Analysis with IBM Puredata for Analytics

Page 1: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics

© 2012 IBM Corporation

Enhancing Catastrophic Risk Analysis with IBM Puredata for Analytics

Page 2: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Agenda

Leveraging IBM Puredata in Catastrophic Risk Analysis

IBM Puredata Success Stories in Catastrophic Risk Analysis

IBM Puredata In-database Analytics

IBM Puredata User Defined Extensions (UDX)

Migration of a Catastrophic Risk Application to IBM Puredata

Page 3: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


IBM Big Data Platform

InfoSphere BigInsights

Hadoop-based low latency analytics for variety and volume

IBM Puredata 2000

BI + Ad Hoc Analytics on Structured Data

IBM Smart Analytics System

Operational Analytics on Structured Data

IBM InfoSphere Warehouse

Large volume structured data analytics

InfoSphere Streams

Low Latency Analytics for streaming data

MPP Data Appliances

Stream Computing

Information Integration

Hadoop (NoSQL)

InfoSphere Information Server

High volume data integration and transformation

Page 4: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


On-Demand Catastrophe Risk Analysis with IBM Puredata for Analytics

Page 5: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Who is interested in Catastrophe Risk Models?

Catastrophic Risk Models

Insurers – Managing their exposure and filing for rates

Brokers – Assessing risk management strategies for clients

Reinsurers – Pricing reinsurance

Capital markets – Pricing cat bonds

Rating agencies – Evaluating a company’s capital requirements

Page 6: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Leveraging Catastrophe Risk Modeling

Reduce the risk that an insurer is unable to meet claims
- Reduce policyholder loss if the firm is unable to fully meet all claims
- Provide an early warning system if capital falls below a required level

Promote confidence in financial stability
- Evaluate the company's risk profile and related reinsurance and investment strategies
- Discuss capital management with other external parties (ratings)

Evaluate returns on risk-adjusted capital for strategy development and implementation for individual business segments

Understand the relative contribution of the major risk categories to the overall risk profile (non-cat losses, catastrophes, reserve, credit and market)

Page 7: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Catastrophe Risk Modeling

Treaty Conditions

Standard Models

Scenario Based Models

Value at Risk

Underwriting

Re-Insurance

Policy Pricing

Policyholder Loss

Loss Estimating

Sensitivity Analysis

Capital Management

Geospatial Peril Models – Historical/Forecasted, Temporal/Real Time

Performance Improvement by Understanding Risk

Simulations

Temporal Correlation

Likelihood/Probability

Page 8: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Changing the Game in Catastrophe Risk Modeling

Back-office Applications

Downstream Analytics

Catastrophe Modeling Workflow Control

Policy Demographics

IBM Puredata Analytic Appliance

Netezza High-Speed Spatial Data Loader (AIR, RMS data)

SPSS

Workflow Management

Faster – Near-Real-Time Data Ingestion, Shortened Analytic Cycles

New Methods – Comprehensive Risk Analysis, In-process Risk Analysis

Flexibility & Understanding – What-if Modeling, High-Speed Risk Analysis

Increased Depth – Increased Analytic Dimensionality, Expanded Peril Models

Cognos

Treaty Conditions

Standard Models

Scenario Models

Simulations

Temporal Correlation

Likelihood/Probability

IBM Netezza In-Database Analytics

Embedded Customer Algorithms (SQL & UDX)

Stat & Treaty Engine

Value at Risk

Underwriting

Re-Insurance

Policy Pricing

Policyholder Loss

Loss Estimates

Sensitivity Analysis

Capital Management

Ad hoc Query

Data Mining

Page 9: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Complementing AIR & RMS with IBM Puredata for Analytics

Data Extraction & Grouping

Simulation

Recovery

Sort on Year

SQL Export

Stats Module

Sorted on Yearly Max Loss

Sorted on Yearly Total Loss

Calculation Engine

Pre-Cat Stats

Recovery Stats

Post-Cat Stats

Apply Treaty Data

Calc net losses

Report generation / Ad hoc query

Pre-Cat data

Initial Scope

Upstream RMS & AIR application

In-database Analytics

EP Definition

Expanded Capability by moving to in-database Analytics
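As a rough illustration of the "Stats Module / EP Definition" step above, the sort-on-year and exceedance-probability logic can be expressed directly in SQL once simulated losses live in the appliance. This is a minimal sketch only; yearly_losses and its columns are illustrative names, not the actual AIR/RMS export schema.

```sql
-- Minimal sketch: occurrence EP curve from simulated event losses.
-- yearly_losses(sim_year, event_id, gross_loss) is an assumed, illustrative table.
WITH yearly_max AS (
    SELECT sim_year, MAX(gross_loss) AS max_loss   -- largest single-event loss per simulated year
    FROM yearly_losses
    GROUP BY sim_year
),
n AS (
    SELECT COUNT(*) AS n_years FROM yearly_max
)
SELECT m.max_loss,
       RANK() OVER (ORDER BY m.max_loss DESC) * 1.0 / n.n_years AS exceedance_prob
FROM yearly_max m
CROSS JOIN n
ORDER BY m.max_loss DESC;
```

The same pattern with SUM(gross_loss) per year gives the aggregate curve sorted on yearly total loss.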

Page 10: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Key Points for Migrating to IBM Puredata for Analytics

Database Migration
- IBM Puredata is a SQL-92-compliant database
- If you are using SQL Server proprietary extensions there will be some migration effort
- Initial review indicates we may not want to use the existing UDFs, but rather optimize the SQL for IBM Puredata

Analytic Applications
- The Netezza Analytics UDX framework essentially allows a wrapper to be put around typical “file-in – file-out” applications so that they run in-database
- We may want to alter some of the existing application for improved parallelism (non-serial) as well as set-based logic (a small sketch of the set-based approach follows below)

Long-term Simplicity
- IBM Puredata essentially eliminates the need for database tuning and the performance issues associated with analytics
- Consolidation of analytics into the database simplifies the entire architecture

Only the IBM Puredata analytic performance is proprietary
- Again, IBM Puredata is SQL-92 compliant
- Our UDX wrappers are similar to those on every other database platform
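As a concrete example of the set-based point, a per-row treaty UDF can often be replaced by plain SQL-92. The table and column names (gross_losses, treaty_terms, deductible, limit_amt) are assumptions for illustration, not the customer's schema.

```sql
-- Minimal sketch: apply a simple per-risk deductible and limit as set-based SQL
-- instead of calling a scalar UDF row by row.  All names are illustrative.
SELECT l.policy_id,
       l.event_id,
       CASE
           WHEN l.gross_loss <= t.deductible                THEN 0
           WHEN l.gross_loss - t.deductible >= t.limit_amt  THEN t.limit_amt
           ELSE l.gross_loss - t.deductible
       END AS net_loss
FROM gross_losses l
JOIN treaty_terms t
  ON t.policy_id = l.policy_id;
```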

Page 11: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


IBM Puredata Advanced Analytics: Improved Analytics for Catastrophe Risk

Page 12: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Up-to-the-minute Risk Modeling – Guy Carpenter

Large reinsurance company

Exposure management application calculates risk on insured properties

Risk data changes constantly as a hurricane is approaching

4 million insured properties, tens of thousands of risk polygons

Previously analysis took 45 minutes using Oracle Spatial

Now takes 5 seconds using IBM Puredata
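The exposure-management query behind a result like this is essentially a point-in-polygon join plus aggregation. The sketch below is illustrative only: it assumes an ST_Within-style predicate from the spatial package (exact function names and schema qualification vary by release) and hypothetical table names.

```sql
-- Minimal sketch: insured exposure falling inside each risk polygon.
-- insured_properties / risk_polygons and ST_Within are assumptions for illustration.
SELECT p.polygon_id,
       COUNT(*)             AS exposed_properties,
       SUM(i.insured_value) AS total_insured_value
FROM insured_properties i
JOIN risk_polygons p
  ON ST_Within(i.location, p.shape)   -- spatial predicate, assumed to return a boolean
GROUP BY p.polygon_id;
```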

Page 13: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


National Fire Station Alignment

Determine the 5 nearest fire stations to each household (sketched below)
- 41,000 US fire stations
- 114,000,000 ZIP 12 points (parcels) for the entire US
- Calculated all scenarios in 30 minutes!
- Analysis was never possible on Oracle!
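A nearest-k spatial query of this kind can be written as a ranked cross join. This is a minimal sketch, assuming an ST_Distance-style function and hypothetical parcels / fire_stations tables; it is not the production SQL.

```sql
-- Minimal sketch: the 5 nearest fire stations for every parcel.
SELECT parcel_id, station_id, dist
FROM (
    SELECT p.parcel_id,
           f.station_id,
           ST_Distance(p.location, f.location) AS dist,   -- assumed spatial function
           ROW_NUMBER() OVER (PARTITION BY p.parcel_id
                              ORDER BY ST_Distance(p.location, f.location)) AS rn
    FROM parcels p
    CROSS JOIN fire_stations f
) ranked
WHERE rn <= 5;
```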

Page 14: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Proximity to Coast

Shortest distance to coast: Florida
- 14,700 coast segments (each defined by 300 vertices on average)
- 8,500,000 ZIP 12 points
- Cartesian join (sketched below)

Netezza: 3 hours, 42 minutes

In-house GIS: 3 weeks!

(100x+ improvement)
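The Cartesian join mentioned above reduces to a grouped minimum distance. This is a hedged sketch with assumed table names and an assumed ST_Distance function, not the exact SQL used for the benchmark.

```sql
-- Minimal sketch: shortest distance from each ZIP 12 point to any coast segment.
SELECT z.point_id,
       MIN(ST_Distance(z.location, c.segment_geom)) AS dist_to_coast
FROM zip12_points z
CROSS JOIN coast_segments c    -- the Cartesian join: every point against every segment
GROUP BY z.point_id;
```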

Page 15: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Policy Accumulation – Total Insured Value

Define a “buffer” around each insured property

Sum all the insured properties in each buffer

Page 16: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Calculate Total Insured Value

Sample data – Miami, Florida (Miami-Dade County)
- 939,000 properties
- Sum each value within a buffer centered around each point
- 1 km radius search; on average 600 properties summed into each calculation
- Individual calculation: < 1 second
- Bulk calculation: 2 hours
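The buffer accumulation can be sketched as a distance-constrained self-join. properties, location and insured_value are illustrative names, and an ST_Distance function returning meters is assumed from the spatial package.

```sql
-- Minimal sketch: total insured value within a 1 km buffer around each property.
SELECT a.property_id,
       COUNT(*)             AS properties_in_buffer,
       SUM(b.insured_value) AS total_insured_value
FROM properties a
JOIN properties b
  ON ST_Distance(a.location, b.location) <= 1000   -- 1 km radius search
GROUP BY a.property_id;
```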

Page 17: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Determining Portfolio Value-at-Risk In-Database

CHALLENGE: Evaluate massive portfolios as fast as possible to minimize future losses and risk exposure

BENEFITS: Real-time, high-performance, scalable in-database analytics enables faster risk analysis

“This technology will allow us to revolutionize our risk calculation environment... we will be able to completely change the way that we look at and calculate risk.”

- Risk Quant at a Top 3 Bank

SOLUTION: In-database analytics moves the complex calculations next to the data, harnessing the power of up to 920 CPU cores to attack one of the most challenging trading analytic processes, Value-at-Risk, which uses statistical simulations to compute forward-looking portfolio values, running in minutes as opposed to hours.

Page 18: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Calculating Value-at-Risk In-Database

Determine the Value-at-Risk for an equity options desk
- 200,000 positions – different instruments and maturities
- 1,000 underlying stocks

Required to do the following:
- Calculate daily returns on underlying stocks using historical prices
- Calculate the correlation of daily returns
- Perform Singular Value Decomposition (SVD)
- Simulate correlated returns for all underlying stocks using SVD for the next 1 year
- Perform 10,000 simulations and calculate the 95th-percentile loss on each day for the entire portfolio (the percentile step is sketched below)
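The simulation and SVD steps would run through the in-database analytics/UDX layer; the final percentile step, though, is plain SQL. The sketch below assumes a hypothetical intermediate table sim_pnl(day_no, sim_no, portfolio_loss) holding the simulated daily portfolio losses.

```sql
-- Minimal sketch: 95th-percentile (VaR) daily loss across 10,000 simulated paths.
SELECT day_no,
       MIN(portfolio_loss) AS var_95    -- boundary value of the worst 5% of simulations
FROM (
    SELECT day_no,
           portfolio_loss,
           ROW_NUMBER() OVER (PARTITION BY day_no ORDER BY portfolio_loss DESC) AS rn,
           COUNT(*)     OVER (PARTITION BY day_no)                              AS n_sims
    FROM sim_pnl
) t
WHERE rn <= CEIL(0.05 * n_sims)
GROUP BY day_no
ORDER BY day_no;
```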

                                  Puredata TF-6                Puredata TF-12
Nodes                             12 CPU / 48 Core             24 CPU / 96 Core
Storage                           60 TB                        120 TB
#rows (data volume)               200,000 Positions            200,000 Positions
#columns (dimensions, features)   10,000 Simulations,          10,000 Simulations,
                                  1,000 Stocks, 250 Days       1,000 Stocks, 250 Days
Total Simulations                 2.5 Billion – 3 minutes      2.5 Billion – 1.5 minutes
Calculations                      200 Thousand – 7 minutes     200 Thousand – 3.5 minutes
Total Elapsed Time                < 10 minutes                 5 minutes

Page 19: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


OpenRisk uses In-Database Scoring and Spatial Analytics on Netezza

CHALLENGE: Quickly and on demand, determine combined risk across all portfolios of any size (1M+) for all insured catastrophic events

BENEFITS: Real-time, high-performance, scalable in-database analytics enables broader risk analysis

“Because of Netezza, we were able to launch a new business model – an on-demand, software-as-a-service large scale catastrophic risk modeling – that radically reduces the exposure for insurance companies.”

- Shajy Mathai, CTO, OpenRisk

SOLUTION: In-database analytics eliminates data movement and executes 500B+ complex calculations in minutes to determine risk across portfolios

Page 20: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


OpenRisk Natural Disaster Portfolio Loss Estimate

Statistical model with a stochastic set of hurricane events that is applied to a portfolio of properties to generate loss estimates over time
- 1M policies assessed for the entire state of Rhode Island

Required to do the following:
- Compute the nearest “surface roughness” coefficient
- Find the nearest GID for every impacted site (lat/long accuracy of 0.2 minute)
- Interpolate on continuous distribution functions (sketched below)
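For the interpolation step, one workable in-database pattern is to bracket each site's hazard intensity between adjacent points of the stored distribution and interpolate linearly. damage_cdf, sites and their columns are assumed, illustrative names, not the OpenRisk model's actual tables.

```sql
-- Minimal sketch: linear interpolation on a discretely stored distribution function.
WITH brackets AS (
    SELECT intensity                                    AS x0,
           damage_ratio                                 AS y0,
           LEAD(intensity)    OVER (ORDER BY intensity) AS x1,
           LEAD(damage_ratio) OVER (ORDER BY intensity) AS y1
    FROM damage_cdf
)
SELECT s.site_id,
       b.y0 + (b.y1 - b.y0) * (s.wind_speed - b.x0) / (b.x1 - b.x0) AS damage_ratio
FROM sites s
JOIN brackets b
  ON s.wind_speed >= b.x0
 AND s.wind_speed <  b.x1;
```

The nearest-GID and surface-roughness lookups follow the same ranked-distance pattern shown earlier for the fire-station query.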

                                  Puredata TF-6                 Puredata TF-12
Nodes                             12 CPU / 48 Core              24 CPU / 96 Core
Storage                           60 TB                         120 TB
#rows (data volume)               1 Million Policies            1 Million Policies
#columns (dimensions, features)   100K Events, 1M Locations,    100K Events, 1M Locations,
                                  40K Geographic Bins           40K Geographic Bins
Loss Matrix                       > 1 TB                        > 1 TB
Total Elapsed Time                45 minutes                    20 minutes

Page 21: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Optimizing Your Own Advanced Analytics: OpenRisk Hurricane Risk Model

Page 22: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Use Case Summary – Hurricane Risk Assessment

Catastrophe modelers run various models which simulate hazard and vulnerability over extremely large time periods (thousands of years) for portfolios of property risk.
- This process generates terabytes of data, which in turn is analyzed to make loss estimates.

Challenge: To develop a framework for implementing a hurricane model that will:
- Improve performance from days to hours
- Reduce data movement
- Increase integration flexibility
- Reduce operational footprint by integrating the database with the analysis grid

Page 23: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Technical Architecture Imperatives

IBM Netezza Analytics as a SaaS platform

Facilitate rapid porting of existing hurricane insurance risk models

Maximum performance & scalability
- Millions of sites affected by a disaster event, e.g. a hurricane

Simplicity of a SQL call to run a sophisticated hurricane model

Leverage the flexibility of IBM Netezza Analytics to implement a hurricane risk model
- UDX: User Defined Extensions to incorporate legacy code
- Geospatial Analytics: Run risk for sites impacted in the hurricane polygon

Facilitate rich, high-performance reporting and 3D map rendering
- Accurately forecast damage assessment to property
- Report discrepancy between coverage and damage assessment

Page 24: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


The Existing Solution

Database → (ODBC) → Fortran Program → Results to Files → Bulk load to database

The Fortran program, for each site:
• Process a site if in the hurricane
• Gather building structural characteristics
• Gather terrain data
• Apply mathematical modeling to score risk
• Compute predicted losses to the site in $

The existing solution:
- Single-threaded processing, very slow
- Potent risk modeling intellectual property locked away in Fortran
- Difficult to apply parallel processing
- Lots of infrastructure
- Bulk movement of data

Challenges
– How to leverage existing code without significant rewrite?
– How to apply parallel processing in a simple way?
– How to avoid massive data shipping?

Page 25: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Analytics Computing Grid

C++/Fortran UDX, Geospatial Analytics

Client Company 1 – Proprietary Risk Model

Client Company n – Proprietary Risk Model

IBM Netezza Solution: Multi-tenant Solution for Applying Advanced Analytics In-Database

Simplicity of SQL!!! – Two Steps:
1. Run Models on Demand!
2. Execute Reports!

Massively parallel!!!
- Speed!
- Optimal distribution of site, building, terrain, and physics data

In-database Analytics
- Geospatial analytics applies latitude & longitude appropriately.
- C++/Fortran UDX implement the model.
- 1 thread per Shared-Nothing node.
- Elimination of DATA SHIPPING…
- Emphasis on FUNCTION SHIPPING
- True Multi-Tenant SaaS

Page 26: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


Running the Model on Demand

Reporting Layer

ETL Preprocessing
• SQL
• ETL Tools

Run Model
• Simple & elegant single SQL statement (sketched below)
• Use C++/Fortran UDXs that execute for a site:
  • Determine building characteristics
  • Determine terrain factors
  • Determine physical forces in effect
  • Use proprietary mathematics
  • Output data in a complex, proprietary data structure

Populate input tables
• Simple SQL insert statements using pure C++ UDX

Process reporting tables
• A SQL stored procedure

One elegant, master stored procedure
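To make the "single SQL statement" idea concrete, the sketch below shows what the run-model step could look like. estimate_site_loss stands in for the registered C++/Fortran UDX, and every table and column name is an assumption; the real model's inputs and outputs are proprietary.

```sql
-- Minimal sketch: one INSERT ... SELECT drives the per-site UDX across the MPP nodes.
INSERT INTO site_losses (site_id, event_id, estimated_loss)
SELECT s.site_id,
       e.event_id,
       estimate_site_loss(s.building_class,      -- hypothetical UDX wrapping the
                          s.construction_year,   -- legacy Fortran/C++ model
                          t.roughness,
                          e.wind_field) AS estimated_loss
FROM sites s
JOIN terrain_grid t ON t.gid = s.nearest_gid
JOIN events e
  ON ST_Within(s.location, e.impact_polygon);    -- spatial predicate, assumed
```

Because the statement is ordinary SQL, the master stored procedure can simply chain the populate, run-model, and reporting steps, and the appliance parallelizes the UDX calls across its shared-nothing nodes.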

Page 27: Enhancing  Catastrophic  Risk Analysis with IBM  Puredata  for Analytics


THANKS! QUESTIONS?