Post on 12-Apr-2020

Credit Scoring with Deep Learning

Håvard Kvamme

1

Kjersti Aas

Ørnulf Borgan

Håvard Kvamme

Nikolai Sellereite

Steffen Sjursen

2

3

Credit Scoring

7

Credit Scoring

➢ Determine loan eligibility

➢ Evaluate existing loans

○ Easier / more information

■ Payment history

➢ Mortgage

➢ Car loan

➢ Credit Card

➢ etc.

9

➢ Estimate default probabilities

➢ Statistical prediction / machine learning

➢ Objective:

○ Default within one year (2, 3, etc.)

11

Credit Scoring

➢ Current balance on accounts

➢ Loan balances

➢ Previous delinquencies

➢ Age

➢ Profession

➢ Salary

➢ etc.

12

Credit Scoring

So…

What is new?

13

In 2012, the average Norwegian made 323 card transactions, where 71% of the value transferred was through debit payments.

(Norges-Bank, 2012)

14

Customer transaction time series

➢ Checking

➢ Savings

➢ Credit card

➢ Checking: Number of transactions

➢ Checking: Into checking

15

Time series classification

➢ Create features

○ Mean, std, max, min, etc.

○ TS analysis features

■ E.g. parameters from ARIMA

○ DFT, DWT

➢ Data-driven features

○ MLP

○ Convolutional Neural Nets

○ Recurrent Neural Nets

17
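The hand-crafted route listed above (summary statistics plus DFT coefficients) can be sketched as follows. This is only an illustration, not the authors' feature set: the function name `summary_features`, the number of DFT magnitudes kept, and the synthetic balance series are all assumptions made for the example.

```python
import numpy as np

def summary_features(series, n_freq=3):
    """Map one 1-D series to a fixed-length feature vector (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    # Basic summary statistics: mean, std, max, min.
    stats = [series.mean(), series.std(), series.max(), series.min()]
    # Magnitudes of the lowest non-constant DFT frequencies.
    spectrum = np.abs(np.fft.rfft(series - series.mean()))
    dft = spectrum[1:1 + n_freq]
    return np.concatenate([stats, dft])

rng = np.random.default_rng(0)
days = np.arange(365)
# Synthetic stand-in for a daily account balance: weekly pattern plus noise.
balance = 1000 + 200 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 20, size=365)

features = summary_features(balance)
print(features.shape)
```

Every series, whatever its length, ends up as the same small vector, which is what lets a standard classifier consume it.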

Deep Learning

18

Neural Networks

➢ Differentiable transformations

○ E.g. f(x) = x * w

➢ Differentiable loss function

➢ Backpropagation (chain rule) -> LEGO

19
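As a small worked case of the chain rule for the example transformation f(x) = x * w: the squared-error loss and the target y below are assumptions chosen for illustration, since the slide only requires that the loss be differentiable.

```latex
\begin{align*}
f(x) &= x \cdot w, \qquad \ell = \tfrac{1}{2}\bigl(f(x) - y\bigr)^2 \\
\frac{\partial \ell}{\partial f} &= f(x) - y \\
\frac{\partial \ell}{\partial w} &= \frac{\partial \ell}{\partial f}\cdot\frac{\partial f}{\partial w} = \bigl(f(x) - y\bigr)\, x \\
\frac{\partial \ell}{\partial x} &= \frac{\partial \ell}{\partial f}\cdot\frac{\partial f}{\partial x} = \bigl(f(x) - y\bigr)\, w
\end{align*}
```

The last line is what a block hands back to whatever block produced x, which is why transformations can be stacked freely.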

Neural Networks

➢ Each block (transform) needs:

○ Forward pass (perform transformation)

○ Calculate gradient of its own weights from the incoming BP-loss

○ Calculate BP-loss to pass on to the previous block

➢ Extremely flexible

○ Classification / regression

○ Encoding

○ Generation (image, text, sound)

○ etc.

20
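The three requirements on a block can be made concrete with a minimal NumPy sketch. The class names `Linear` and `Sigmoid`, the helper `run`, and the squared-error loss are assumptions for illustration; the talk does not show its implementation.

```python
import numpy as np

class Linear:
    """Block computing f(x) = x @ w."""
    def __init__(self, w):
        self.w = np.asarray(w, dtype=float)

    def forward(self, x):
        self.x = x                           # cache the input for the backward pass
        return x @ self.w

    def backward(self, grad_out):
        self.grad_w = self.x.T @ grad_out    # gradient of this block's own weights
        return grad_out @ self.w.T           # gradient passed to the previous block

class Sigmoid:
    """Parameter-free block: only forward and the passed-back gradient."""
    def forward(self, x):
        self.out = 1.0 / (1.0 + np.exp(-x))
        return self.out

    def backward(self, grad_out):
        return grad_out * self.out * (1.0 - self.out)

def run(blocks, x, y):
    """Forward through all blocks, then backpropagate a squared-error loss."""
    h = x
    for block in blocks:
        h = block.forward(h)
    loss = 0.5 * np.sum((h - y) ** 2)
    grad = h - y                             # derivative of the loss w.r.t. the output
    for block in reversed(blocks):
        grad = block.backward(grad)
    return loss

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
y = rng.uniform(size=(4, 1))
net = [Linear(rng.normal(size=(3, 5))), Sigmoid(), Linear(rng.normal(size=(5, 1)))]
loss = run(net, x, y)
```

Because each block only needs the gradient arriving from the block after it, blocks can be reordered or swapped without touching the others.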

Colorization

http://richzhang.github.io/colorization/

21

Translation

https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html

22

Playing Games

https://deepmind.com/research/alphago/

23

MLP

[Figure: weights w1 applied to an input series of length 365]

MLP vs. Convolutions

[Figure sequence: the same input series of length 365 shown for an MLP and for a convolution, with weights w1 and w2]

37
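One way to read the MLP and convolution comparison above is in terms of weight sharing. The sketch below is an interpretation rather than a reproduction of the slides: the kernel width of 7 and the random input are assumptions, and only the series length of 365 comes from the figures.

```python
import numpy as np

rng = np.random.default_rng(0)
series = rng.normal(size=365)      # stand-in for one input series of length 365

# Dense (MLP) unit: one weight per time step, so 365 weights for a single output.
w_dense = rng.normal(size=365)
dense_out = series @ w_dense

# Convolution: one short filter reused at every position along the series.
kernel = rng.normal(size=7)        # kernel width is an assumption
conv_out = np.correlate(series, kernel, mode="valid")

print(w_dense.size, kernel.size)   # number of weights in each case
print(conv_out.shape)              # one output per filter position
```

The dense unit ties each weight to a fixed day, while the filter responds to the same local pattern wherever it occurs in the series.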

Our Architecture

MLP

Logistic regression

40

41

http://yosinski.com/deepvis

42

Data

46

Housing prices in Norway have generally increased steadily since 2003, and thus, the mortgage market has seen few defaults.

(Finanstilsynet, 2016)

47

Data

48

Results

49

➢ Increase the “low risk” group from 80% to 95%.

➢ 50% of defaults can be found in the 1% highest risk group.

➢ Not restricted to mortgages.

➢ Is only one part of the full mortgage risk model!

50
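A statement such as "50% of defaults can be found in the 1% highest risk group" is a capture rate: rank customers by predicted risk and count how many of all defaults land in the top slice. A minimal sketch follows; the function name `capture_rate` and the ten hand-made scores are hypothetical and are not the data behind the reported figures.

```python
import numpy as np

def capture_rate(scores, defaults, top_fraction):
    """Share of all defaults that fall among the top_fraction highest scores."""
    scores = np.asarray(scores, dtype=float)
    defaults = np.asarray(defaults, dtype=bool)
    n_top = max(1, int(round(top_fraction * len(scores))))
    top_idx = np.argsort(-scores)[:n_top]    # indices of the highest-risk customers
    return defaults[top_idx].sum() / defaults.sum()

# Hand-made toy data: 10 customers, 3 of whom default.
scores = np.array([0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.05, 0.6, 0.15, 0.4])
defaults = np.array([1, 0, 0, 1, 0, 0, 0, 0, 1, 0])

print(capture_rate(scores, defaults, 0.2))   # top 2 customers
print(capture_rate(scores, defaults, 0.4))   # top 4 customers
```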

ROC Curve

➢ TP rate: True Positive rate

○ TP / P

➢ FP rate: False Positive rate

○ FP / N

➢ AUC: Area Under Curve

[Figure: ROC curve, TP rate (y-axis) against FP rate (x-axis)]

Example: 60% of defaults classified as default, 20% of non-defaults classified as default

52
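The definitions above can be checked on a toy example. In this sketch the ten scores and the 0.5 threshold are made up so that the resulting point matches the slide's example (TP rate 60%, FP rate 20%); the helper names `rates` and `auc` are hypothetical. AUC is computed through its rank interpretation, the probability that a random default is scored above a random non-default.

```python
import numpy as np

def rates(pred_default, is_default):
    """Return (TP rate, FP rate) for binary predictions."""
    pred = np.asarray(pred_default, dtype=bool)
    truth = np.asarray(is_default, dtype=bool)
    tpr = (pred & truth).sum() / truth.sum()        # TP / P
    fpr = (pred & ~truth).sum() / (~truth).sum()    # FP / N
    return tpr, fpr

def auc(scores, is_default):
    """Fraction of (default, non-default) pairs ranked correctly; ties count half."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(is_default, dtype=bool)
    diff = scores[truth][:, None] - scores[~truth][None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

# Toy data: 5 defaults followed by 5 non-defaults.
is_default = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.6, 0.4, 0.1, 0.1, 0.05])

tpr, fpr = rates(scores >= 0.5, is_default)   # one threshold gives one ROC point
area = auc(scores, is_default)
print(tpr, fpr, area)
```

Sweeping the threshold from high to low traces out the full curve; the single threshold here gives just one point on it.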

Architectures

DNB current risk model: 0.866

54

55

Size of Data

56

Random Forests

➢ Min

➢ Max

➢ Avg

➢ Std

➢ Missing

➢ Scaled versions

➢ Combinations

57

Questions?

60