IFRS9 Survival Analysis with an Application in Apache Spark · 07/09/2017 Public IFRS 9 Survival...

IFRS 9 Survival Analysis with an Application in Apache Spark
Credit Scoring and Credit Control XV Conference
Hristiana Vidinova, August 2017

2 © Experian

Introduction

07/09/2017 Public IFRS 9 Survival Analysis with an Application in Apache Spark

IFRS 9 aspects:

• Expected credit losses
• Macroeconomic modelling
• Probability of default

IFRS 9 Expected credit losses

• Lifetime forecasts
• Forward-looking information
• Probability-weighted estimates

5.5.11: If reasonable and supportable forward-looking information is available without undue cost or effort, an entity cannot rely solely on past due information when determining whether credit risk has increased significantly since initial recognition.

5.5.17: An entity shall measure expected credit losses of a financial instrument in a way that reflects:
a) an unbiased and probability-weighted amount that is determined by evaluating a range of possible outcomes;
b) the time value of money; and
c) reasonable and supportable information that is available without undue cost or effort at the reporting date about past events, current conditions and forecasts of future economic conditions.

5.5.3: …an entity shall measure the loss allowance for a financial instrument at an amount equal to the lifetime expected credit losses if the credit risk on that financial instrument has increased significantly since initial recognition.

Goal

• Produce ECL forecasts
• Include macroeconomic data
• Build upon existing models
• Calibrate behaviour scores
• Account classification

Methodology

Building blocks

• Survival model
• Maturity effects
• Error correction

Survival analysis

Account-level model: PD prediction for each account (similarly for other events)
Models not if but when an account will enter default
• Better estimates of expected losses
• Strong framework for IFRS 9
Duration (t): time to default

Easily accommodates IFRS 9 framework

Inputs to expected credit losses:

• Constant (or structural) account characteristics
• Time-varying account characteristics (e.g. maturity, current status, behaviour data and scores)
• Time-varying internal factors (policy changes, etc.)
• Time-varying external factors (economics, etc.)

Time specifications

Continuous-time

• Often rely on the proportional hazards assumption: individual hazard functions differ proportionately based on a function of observed covariates
• Parametric estimation: assume a distribution (e.g. log-logistic hazard) and estimate by maximum likelihood
• Non-parametric estimation: Cox PH

Discrete-time

• Applying continuous-time models to discrete survival data is not recommended because of the large number of ties that result
• More natural in social and behavioural applications where time is measured discretely
• Easily accommodate time-varying covariates
• Do not require a hazard-related proportionality assumption
• Allow unstructured as well as structured estimation of the hazard function at each discrete time point
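
As an illustration of unstructured discrete-time hazard estimation, the sketch below computes the empirical hazard at each discrete time point as defaults divided by accounts at risk. The function name, the input layout (a duration in periods plus an event flag) and the toy data are assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter

def discrete_hazard(durations, events):
    """Empirical hazard per discrete period: defaults at t / accounts at risk at t.

    durations[i] is the last observed period of account i (1-based);
    events[i] is 1 if the account defaulted in that period, 0 if censored.
    """
    defaults = Counter(d for d, e in zip(durations, events) if e == 1)
    exits = Counter(durations)  # every account leaves the risk set after its last period
    at_risk = len(durations)
    hazard = {}
    for t in range(1, max(durations) + 1):
        hazard[t] = defaults[t] / at_risk if at_risk else 0.0
        at_risk -= exits[t]
    return hazard

# Toy data (hypothetical): five accounts observed for up to three periods.
h = discrete_hazard(durations=[1, 2, 2, 3, 3], events=[1, 0, 1, 0, 0])
```

A structured alternative would replace the per-period ratio with a regression on covariates, which is the direction the following slides take.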

Objective

Estimate

$$P\left(y_{t+\tau} \mid y_t = 0,\ S_t,\ A_{t+\tau},\ \{E_i\}_{i=-\infty}^{t+\tau}\right)$$

where
• $y_t$ - account status at time t
• $S_t$ - behaviour score at time t (behaviour component)
• $A_t$ - age of account at time t (age component)
• $E_t$ - economic variables at time t (economic component)

Recursive representation

Define

$$p_{t,k,S,A,E} = P\left(y_{t+k} \mid y_{t+k-1} = 0,\ S_t,\ A_t,\ \{E_i\}_{i=-\infty}^{t+k}\right)$$

Then

$$P\left(y_{t+\tau} \mid y_t = 0,\ S_t,\ A_t,\ \{E_i\}_{i=-\infty}^{t+\tau}\right) = \sum_{i=1}^{\tau} p_{t,i,S,A,E} \prod_{j=1}^{i-1} \left(1 - p_{t,j,S,A,E}\right)$$

Therefore it is enough to calculate $p_{t,k,S,A,E}$ for $k = 1, \dots, \tau$.
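
The recursion can be sketched in a few lines: each per-period conditional probability is weighted by the probability of having survived all earlier periods. The function name and the example values are illustrative assumptions.

```python
def cumulative_pd(hazards):
    """Cumulative default probability from per-period conditional PDs p_1..p_tau."""
    survival = 1.0    # probability of no default so far
    cumulative = 0.0
    for p in hazards:
        cumulative += p * survival  # default in this period, having survived before
        survival *= 1.0 - p
    return cumulative

# Hypothetical conditional PDs for two periods: 0.10 + 0.20 * 0.90
pd_2 = cumulative_pd([0.10, 0.20])
```

The same value is obtained as one minus the product of the per-period survival probabilities, which gives a quick consistency check.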

Probability decomposition

We will focus on the following form:

$$p_{t,k,S,A,E} = h\left(g\left(p_{t,k,A},\ p_{t,k,S},\ p_{t,k,E}\right)\right)$$

where
• $p_{t,k,S} = P\left(y_{t+k} \mid y_{t+k-1} = 0,\ S_t\right)$ - account behaviour component
• $p_{t,k,A} = P\left(y_{t+k} \mid y_{t+k-1} = 0,\ A_{t+k}\right)$ - account maturity component
• $p_{t,k,E} = P\left(y_{t+k} \mid y_{t+k-1} = 0,\ \{E_i\}_{i=-\infty}^{t+k}\right)$ - economic environment component
• $h(x) = \dfrac{1}{1 + e^{-x}}$
• $g(x, y, z) = \alpha_1 + \alpha_2 x + \alpha_3 y + \alpha_4 z$
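
A minimal sketch of the combination step, assuming the three component probabilities are already available. The coefficient values are hypothetical placeholders; the slides do not give estimates, and in practice the coefficients would presumably be fitted, for example by logistic regression.

```python
import math

def combine_components(p_age, p_score, p_econ, alphas):
    """Evaluate h(g(x, y, z)) with g linear and h the logistic function."""
    a1, a2, a3, a4 = alphas
    g = a1 + a2 * p_age + a3 * p_score + a4 * p_econ
    return 1.0 / (1.0 + math.exp(-g))

# Hypothetical coefficients and component probabilities, for illustration only.
p = combine_components(p_age=0.02, p_score=0.05, p_econ=0.03,
                       alphas=(-4.0, 10.0, 15.0, 8.0))
```

Because h is the logistic function, the combined value always lies strictly between 0 and 1, whatever the coefficients.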

Account behaviour component

Calibrate the behaviour score to behaviour probabilities
Standard scorecard development approach:
• Fine/coarse classing per score and forecasting window
• Incorporate interaction terms as well
• Logistic regression estimates
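
The classing step can be illustrated as follows: scores are grouped into bands and the empirical log-odds of default are computed per band. This is a sketch of the classing only, not of the logistic regression fit; the band edges, the smoothing constants and the toy data are assumptions chosen for illustration.

```python
import math
from bisect import bisect_right

def class_log_odds(scores, defaults, edges):
    """Return (count, defaults, smoothed log-odds of default) for each score band.

    edges are hypothetical band boundaries in ascending order; a score equal
    to an edge falls into the higher band.
    """
    counts = [0] * (len(edges) + 1)
    bads = [0] * (len(edges) + 1)
    for score, default in zip(scores, defaults):
        band = bisect_right(edges, score)
        counts[band] += 1
        bads[band] += default
    result = []
    for n, bad in zip(counts, bads):
        rate = (bad + 0.5) / (n + 1.0)  # smoothing avoids log(0) in pure bands
        result.append((n, bad, math.log(rate / (1.0 - rate))))
    return result

# Toy data (hypothetical): six accounts, three score bands.
bands = class_log_odds(
    scores=[500, 520, 610, 640, 700, 720],
    defaults=[1, 1, 1, 0, 0, 0],
    edges=[600, 680],
)
```

In a per-horizon setup the same classing would be repeated for each forecasting window, since the relationship between score and default typically weakens as the horizon grows.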

How to create a forecast

Modelling stage: $PD_{t+1} = f(\text{behaviour score}_t)$
Forecasting stage: the behaviour score needs to be forecasted, because PDs more than one period ahead depend on scores that are not yet observed

How to create a forecast: multiple-horizon model

Modelling stage:
$PD_{t+1} = f(\text{behaviour score}_t)$
$PD_{t+2} = f(\text{behaviour score}_t)$
…
$PD_{t+s} = f(\text{behaviour score}_t)$

Forecasting stage: the behaviour score does not need to be forecasted

Multiple-horizon model: exploded panel

Modelling stage:
$PD_{t+1} = f(\text{behaviour score}_t)$
$PD_{t+2} = f(\text{behaviour score}_t)$
…
$PD_{t+s} = f(\text{behaviour score}_t)$

[Figure: exploded panel]
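
One way to read the exploded panel is that every monthly observation of an account is paired with each forecast horizon, so that a single dataset holds the targets for all horizons. The sketch below builds such rows for one account in plain Python. The record layout, the field names and the rule for dropping rows after default are assumptions, not details taken from the slides.

```python
def explode_panel(account_id, history, max_horizon):
    """Stack one row per (observation month t, horizon k) for a single account.

    history is a list of monthly dicts with keys "score", "age" and "status"
    (0 = not in default, 1 = default). A row is created only while the
    account is still not in default at t + k - 1.
    """
    rows = []
    for t, obs in enumerate(history):
        for k in range(1, max_horizon + 1):
            if t + k >= len(history) or history[t + k - 1]["status"] != 0:
                break
            rows.append({
                "account_id": account_id,
                "t": t,
                "horizon": k,
                "score_t": obs["score"],             # behaviour score known at t
                "age_t_plus_k": history[t + k]["age"],
                "target": history[t + k]["status"],  # y_{t+k}
            })
    return rows

# Toy history (hypothetical): four months, default in the last one.
history = [{"score": 600 - 10 * m, "age": 12 + m, "status": int(m == 3)}
           for m in range(4)]
rows = explode_panel("A1", history, max_horizon=3)
```

Each row carries the score observed at t and the target at t + k, which is why no score forecast is needed at the forecasting stage.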

Account maturity component

Motivated by age-cohort analysis / EMV models

[Figure: maturity curve; x-axis: Maturity, 0 to 132 in steps of 6]

Account maturity component

Standard scorecard development approach:
• Fine/coarse classing per age of account
• Logistic regression estimates
• No problem in forecasting: the future age of an account is known in advance

[Figure: maturity curve; x-axis: Maturity, 0 to 132 in steps of 6]

Economics component: error-correction models

Short-term dynamics
• “Errors” in the short term
• Transitory deviations from the long-term relationship due to shocks

Long-term dynamics
• Equilibrium state: no inherent tendency to change
• The system tends to return to this state after deviations
• Entails a systematic co-movement among economic variables

Economics component

Error correction equation:

$$\Delta y_t = \alpha + \beta \Delta x_t - \gamma \left(y_{t-1} - \beta x_{t-1}\right)$$

• $\gamma$ - speed of error correction
• Include more lags
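
To show how the error-correction term pulls the series back towards its long-term relationship, the sketch below iterates the equation along a path of the macro driver x. The function name, the parameter values and the flat scenario path are hypothetical; no lags beyond the first are included.

```python
def ecm_forecast(y0, x_path, alpha, beta, gamma):
    """Iterate dy_t = alpha + beta*dx_t - gamma*(y_{t-1} - beta*x_{t-1}).

    y0 is the response at time 0; x_path[0] is x at time 0 and the remaining
    entries are the scenario path of the macro driver. Returns the path of y.
    """
    y_path = [y0]
    for t in range(1, len(x_path)):
        dx = x_path[t] - x_path[t - 1]
        dy = alpha + beta * dx - gamma * (y_path[-1] - beta * x_path[t - 1])
        y_path.append(y_path[-1] + dy)
    return y_path

# Hypothetical example: y starts above its equilibrium value beta * x = 1.0
# and x stays flat, so only the error-correction term moves y.
path = ecm_forecast(y0=2.0, x_path=[1.0, 1.0, 1.0, 1.0],
                    alpha=0.0, beta=1.0, gamma=0.5)
```

With gamma = 0.5 half of the remaining gap to equilibrium is closed in each period; a larger gamma means faster correction.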

Account states

• Good
• Bad
• Cured
• Written off
• Prepaid

Marginal probabilities are estimated using the approach from the previous section.
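
Once per-period probabilities of moving between states are available, the share of accounts in each state can be rolled forward period by period. The sketch below does this with a constant transition table. The slides only name the states, so the allowed transitions, the treatment of "Written off" and "Prepaid" as absorbing, and all numbers are assumptions for illustration.

```python
STATES = ["Good", "Bad", "Cured", "Written off", "Prepaid"]

def propagate(dist, transition, periods):
    """Apply a one-period transition table repeatedly to a state distribution."""
    for _ in range(periods):
        dist = {
            target: sum(dist[source] * transition[source].get(target, 0.0)
                        for source in STATES)
            for target in STATES
        }
    return dist

# Hypothetical one-period transition probabilities; each row sums to 1.
TRANSITION = {
    "Good": {"Good": 0.97, "Bad": 0.02, "Prepaid": 0.01},
    "Bad": {"Bad": 0.60, "Cured": 0.25, "Written off": 0.15},
    "Cured": {"Cured": 0.90, "Bad": 0.08, "Prepaid": 0.02},
    "Written off": {"Written off": 1.0},
    "Prepaid": {"Prepaid": 1.0},
}

start = {state: 0.0 for state in STATES}
start["Good"] = 1.0
after_12 = propagate(start, TRANSITION, periods=12)
```

In the approach described on the slides the transition probabilities would vary with score, age and economics rather than stay constant; the constant table only keeps the example short.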

Technical aspects

Account level models

• Millions of observations
• Hundreds of variables
• 10-20 years of history
• Multiple-horizon forecast

This leads to:
• Huge datasets
• Difficult to process

Sample size

Time series
• 100 historic months

Accounts    Observations    Exploded panel
1           32              704
100 000     32 mn           704 mn
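
The growth from observations to exploded panel can be checked with simple arithmetic. The per-account figures (32 observations, 704 exploded rows) are taken from the table; the linear scaling and the function name are assumptions.

```python
OBS_PER_ACCOUNT = 32         # observations for one account (from the table)
EXPLODED_PER_ACCOUNT = 704   # exploded-panel rows for one account (from the table)

def panel_sizes(n_accounts):
    """Return (observations, exploded rows), assuming every account scales linearly."""
    return n_accounts * OBS_PER_ACCOUNT, n_accounts * EXPLODED_PER_ACCOUNT

rows_per_observation = EXPLODED_PER_ACCOUNT // OBS_PER_ACCOUNT  # 22
observations, exploded = panel_sizes(1_000_000)
```

Each observation therefore expands into 22 exploded rows. Under linear scaling, totals of 32 mn observations and 704 mn exploded rows are reached at 1 000 000 accounts, while 100 000 accounts would give 3.2 mn and 70.4 mn.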

What is Apache Spark?

1. Powerful open source engine for large-scale data processing
2. Started as a research project at the UC Berkeley AMPLab in 2009
3. The largest Apache Software Foundation project
4. Sorts 100 TB of data in 23 minutes (2014 Daytona GraySort benchmark)

Spark Architecture

Why Spark?

1. Fast in-memory and disk computing
2. Runs everywhere: on a cluster or standalone
3. Write Spark applications in Python, R, Java, Scala
4. A rich stack of libraries: SQL, machine learning, graph analytics, streaming data
5. Batch jobs and real-time analytics
6. Lazy evaluation and the Catalyst optimizer
7. Fault tolerant

Python application

Use

Components of the ECL process:

• Triggers / stage allocation
• PD model development, PD forecasts
• LGD model development, LGD forecasts
• EAD
• Lifetime / prepayment
• ECL

The process is iterated for each macro scenario to produce a probability-weighted ECL.
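
A minimal sketch of the final aggregation step, assuming that per-scenario PD, LGD and EAD forecasts have already been produced. The ECL formula used here (marginal PD times LGD times EAD, discounted and summed over periods) is a common simplification rather than the method stated in the slides, and the scenario names, weights and all figures are hypothetical.

```python
def scenario_ecl(marginal_pd, lgd, ead, rate):
    """ECL for one scenario: sum over periods of PD_t * LGD * EAD_t * discount factor.

    marginal_pd[t-1] is the unconditional probability of default in period t,
    ead[t-1] the exposure in that period, and rate the per-period discount rate.
    """
    return sum(
        pd_t * lgd * ead_t / (1.0 + rate) ** t
        for t, (pd_t, ead_t) in enumerate(zip(marginal_pd, ead), start=1)
    )

def probability_weighted_ecl(scenarios):
    """Weight scenario ECLs by scenario probabilities, which must sum to 1."""
    total_weight = sum(s["weight"] for s in scenarios)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("scenario weights must sum to 1")
    return sum(
        s["weight"] * scenario_ecl(s["marginal_pd"], s["lgd"], s["ead"], s["rate"])
        for s in scenarios
    )

# Hypothetical two-scenario example with no discounting.
scenarios = [
    {"name": "base", "weight": 0.6, "marginal_pd": [0.01, 0.01],
     "lgd": 0.4, "ead": [1000.0, 900.0], "rate": 0.0},
    {"name": "downside", "weight": 0.4, "marginal_pd": [0.03, 0.03],
     "lgd": 0.5, "ead": [1000.0, 900.0], "rate": 0.0},
]
ecl = probability_weighted_ecl(scenarios)
```

Running the whole PD, LGD and EAD chain once per scenario and weighting only at the end keeps each scenario internally consistent.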

Significant increase in risk

[Figure: PD curves; y-axis: cumulative PD; annotation: remaining PD]

[Figure: PD curves; y-axis: cumulative PD; annotations: remaining PD, PD estimate at reporting date]
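
One possible reading of the figure is that the PD estimated at the reporting date is compared with the remaining lifetime PD implied by the curve set at origination. The sketch below follows that reading. The definition of remaining PD as a conditional probability, the ratio test and the threshold value are assumptions; the slides do not state the trigger rule.

```python
def remaining_pd(cumulative_curve, t):
    """Remaining lifetime PD at period t, conditional on no default up to t.

    cumulative_curve[k] is the cumulative PD after k periods, with
    cumulative_curve[0] == 0 and the last entry equal to the lifetime PD.
    """
    survived = 1.0 - cumulative_curve[t]
    if survived <= 0.0:
        return 1.0
    return (cumulative_curve[-1] - cumulative_curve[t]) / survived

def significant_increase(origination_curve, t, pd_at_reporting, ratio_threshold=2.0):
    """Flag a significant increase in risk when the PD estimated at the reporting
    date exceeds a multiple of the remaining PD expected at origination.
    The threshold is purely illustrative."""
    return pd_at_reporting > ratio_threshold * remaining_pd(origination_curve, t)

# Hypothetical origination curve over two periods, assessed after one period.
curve = [0.0, 0.10, 0.19]
expected = remaining_pd(curve, 1)
flagged = significant_increase(curve, 1, pd_at_reporting=0.25)
```

An account flagged this way would move to lifetime expected credit losses under paragraph 5.5.3 quoted earlier.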
