Measuring the Quality of Credit Scoring Models
Transcript of Measuring the Quality of Credit Scoring Models
Measuring the Quality of Credit Scoring Models
Martin ŘezáčDept. of Mathematics and Statistics, Faculty of Science,
Masaryk University
CSCC XI, EdinburghAugust 2009
2/30
1. Introduction 3
2. Good/bad client definition 4
3. Measuring the quality 6
4. Indexes based on distribution function 75. Indexes based on density function 17
6. Some results for normally distributed scores 24
7. Conclusions 30
Content
3/30
Introductionq It is impossible to use scoring model effectively without knowing how good it is.
q Usually one has several scoring models and needs to select just one. The best one.
q Before measuring the quality of models one should know (among other things):Ø good/bad definitionØ expected reject rate
4/30
Good/bad client definition
Ø GoodØ BadØ IndeterminateØ InsufficientØ ExcludedØ Rejected.
q Good definition is the basic condition of effective scoring model.q The definition usually depends on:
q Generally we consider following types of client:
Ø days past due (DPD)Ø amount past dueØ time horizon
5/30
Customer
Default(60 or 90 DPD)
Not default
Fraud (first delayed payment, 90 DPD)
Early default(2-4 delayed payment, 60 DPD)
Late default (5+ delayed payment, 60 DPD)
Good/bad client definition
Rejected
Accepted
Insufficient
GOOD
BAD
INDETERMINATE
6/30
Measuring the qualityq Once the definition of good / bad client and client's score is available, it is possible to evaluate the quality of this score. If the score is an output of a predictive model (scoring function), then we evaluate the quality of this model. We can consider two basic types of quality indexes. First, indexes based on cumulative distribution function like
The second, indexes based on likelihood density function like
Ø Kolmogorov-Smirnov statistics (KS)Ø Gini indexØ C-statisticsØ Lift.
Ø Mean difference (Mahalanobis distance)Ø Informational statistics/value (IVal).
7/30
Indexes based on distribution function
=.,0
,1otherwise
goodisclientDK
( )11)(1
. =∧≤= ∑=
Ki
n
iGOODn DasI
naF
( )01)(1
. =∧≤= ∑=
Ki
m
iBADm DasI
maF
[ ]HLa ,∈( )asIN
aF i
N
iALLN ≤= ∑
=1.
1)(
Ø Empirical distribution functions:
( )
=otherwise
trueisAAI
01
mnmpB +
=,mn
npG +=
Number of good clients:Number of bad clients:Proportions of good/bad clients:
nm
Ø Kolmogorov-Smirnov statistics (KS)
[ ])()(max ,,,
aFaFKS GOODnBADmHLa−=
∈
8/30
Ø Lorenz curve (LC)
Ø Gini index
ABA
AGini 2=+
=
[ ].,),()(
.
.
HLaaFyaFx
GOODn
BADm
∈==
( ) ( )∑+
=−− +⋅−−=
mn
kkGOODnGOODnkBADmkBADm FFFFGini
k2
1..1..1
kBADmF . kGOODnF .where ( ) is k-th vector value of empirical distribution function of bad (good) clients
Indexes based on distribution function
9/30
21 GiniCAstatc +
=+=−
( )012121 =∧=≥=− KK DDssPstatc
It represents the likelihood that randomly selected good client has higher score than randomly selected bad client, i.e.
c
Indexes based on distribution function
Ø C-statistics:
10/30
q Another possible indicator of the quality of scoring model can be cumulative Lift, which says, how many times, at a given level of rejection, is the scoring model better than random selection (random model). More precisely, the ratio indicates the proportion of bad clients with less than a score a, , to the proportion of bad clients in the general population. Formally, it can be expressed by:
[ ]HLa ,∈
( )
( )
( )
( )
( )
( )
Nn
asI
YasI
YYI
YI
asI
YasI
BadRateaCumBadRateaLift
i
mn
i
i
mn
i
mn
i
mn
i
i
mn
i
i
mn
i
≤
=∧≤
=
=∨=
=
≤
=∧≤
==∑
∑
∑
∑
∑
∑+
=
+
=
+
=
+
=
+
=
+
=
1
1
1
1
1
10
10
0
0
)()(
Indexes based on distribution function
BadRateaBadRateaabsLift )()( =
11/30
# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 16 16,0% 3,20 16 16,0% 3,20 2 100 12 12,0% 2,40 28 14,0% 2,80 3 100 8 8,0% 1,60 36 12,0% 2,40 4 100 5 5,0% 1,00 41 10,3% 2,05 5 100 3 3,0% 0,60 44 8,8% 1,76 6 100 2 2,0% 0,40 46 7,7% 1,53 7 100 1 1,0% 0,20 47 6,7% 1,34 8 100 1 1,0% 0,20 48 6,0% 1,20 9 100 1 1,0% 0,20 49 5,4% 1,09 10 100 1 1,0% 0,20 50 5,0% 1,00 All 1000 50 5,0%
absolutely cumulativelydecile # cleints
Indexes based on distribution function
-
0,50
1,00
1,50
2,00
2,50
3,00
3,50
1 2 3 4 5 6 7 8 9 10
decile
Lift
valu
e
abs. Liftcum. Lift
Gini=0,55
0
0,2
0,4
0,6
0,8
1
0 0,2 0,4 0,6 0,8 1
Lornz curveBase line
q Usually it is computed using table with numbers of all and bad clients in some bands (deciles).
12/30
# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 8 8,0% 1,60 8 8,0% 1,60 2 100 12 12,0% 2,40 20 10,0% 2,00 3 100 16 16,0% 3,20 36 12,0% 2,40 4 100 5 5,0% 1,00 41 10,3% 2,05 5 100 3 3,0% 0,60 44 8,8% 1,76 6 100 2 2,0% 0,40 46 7,7% 1,53 7 100 1 1,0% 0,20 47 6,7% 1,34 8 100 1 1,0% 0,20 48 6,0% 1,20 9 100 1 1,0% 0,20 49 5,4% 1,09 10 100 1 1,0% 0,20 50 5,0% 1,00 All 1000 50 5,0%
decile # cleintsabsolutely cumulatively
Indexes based on distribution function
-
0,50
1,00
1,50
2,00
2,50
3,00
3,50
1 2 3 4 5 6 7 8 9 10
decile
Lift
valu
e
abs. Liftcum. LiftGini=0,48
0
0,2
0,4
0,6
0,8
1
0 0,2 0,4 0,6 0,8 1
Lornz curveBase line
q When bad rates are not monotone:
Ø LC looks fineØ Gini is slightly loweredØ Lift looks strange
13/30
# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 16 16,0% 3,20 16 16,0% 3,20 2 100 12 12,0% 2,40 28 14,0% 2,80 3 100 8 8,0% 1,60 36 12,0% 2,40 4 100 5 5,0% 1,00 41 10,3% 2,05 5 100 3 3,0% 0,60 44 8,8% 1,76 6 100 2 2,0% 0,40 46 7,7% 1,53 7 100 1 1,0% 0,20 47 6,7% 1,34 8 100 1 1,0% 0,20 48 6,0% 1,20 9 100 1 1,0% 0,20 49 5,4% 1,09 10 100 1 1,0% 0,20 50 5,0% 1,00 All 1000 50 5,0%
absolutely cumulativelydecile # cleints
Indexes based on distribution function
# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 1 1,0% 0,20 1 1,0% 0,20 2 100 1 1,0% 0,20 2 1,0% 0,20 3 100 1 1,0% 0,20 3 1,0% 0,20 4 100 1 1,0% 0,20 4 1,0% 0,20 5 100 2 2,0% 0,40 6 1,2% 0,24 6 100 3 3,0% 0,60 9 1,5% 0,30 7 100 5 5,0% 1,00 14 2,0% 0,40 8 100 8 8,0% 1,60 22 2,8% 0,55 9 100 12 12,0% 2,40 34 3,8% 0,76 10 100 16 16,0% 3,20 50 5,0% 1,00 All 1000 50 5,0%
absolutely cumulativelydecile # cleints
Gini= - 0,55
0
0,2
0,4
0,6
0,8
1
0 0,2 0,4 0,6 0,8 1
Lornz curveBase line
-
0,50
1,00
1,50
2,00
2,50
3,00
3,50
1 2 3 4 5 6 7 8 9 10
decile
Lift
valu
e
abs. Liftcum. Lift
q When score is reversed, we obtain reversed figures.
14/30
Gini = 0.42
0
0,2
0,4
0,6
0,8
1
0 0,2 0,4 0,6 0,8 1
Lornz curveBase line
Indexes based on distribution function
# bad clients Bad rate1 100 35 35,0%2 100 16 16,0%3 100 8 8,0%4 100 8 8,0%5 100 7 7,0%6 100 6 6,0%7 100 6 6,0%8 100 5 5,0%9 100 5 5,0%10 100 4 4,0%All 1000 100 10,0%
decile # cleints
# bad clients Bad rate1 100 20 20,0%2 100 18 18,0%3 100 17 17,0%4 100 15 15,0%5 100 12 12,0%6 100 6 6,0%7 100 4 4,0%8 100 3 3,0%9 100 3 3,0%10 100 2 2,0%All 1000 100 10,0%
decile # cleints
Ø SC 1:
K-S = 0.36
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1
goodbad
K-S = 0.34
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1
goodbad
Gini= 0,42
0
0,2
0,4
0,6
0,8
1
0 0,2 0,4 0,6 0,8 1
Lornz curveBase line
Ø SC 2:
q The Gini is not enough!!!
15/30
Indexes based on distribution function
-
0,50
1,00
1,50
2,00
2,50
3,00
3,50
4,00
1 2 3 4 5 6 7 8 9 10
decile
Lift
valu
e
abs. Liftcum. Lift
-
0,50
1,00
1,50
2,00
2,50
1 2 3 4 5 6 7 8 9 10
decile
Lift
valu
e
abs. Liftcum. Lift
Ø SC 1: Ø SC 2:
Lift20% = 2.55 >Lift50% = 1.48 <
Lift20% = 1.90Lift50% = 1.64
SC 2 is better if reject rate is expected around 50%.SC 1 is much more better if reject rate is expected by 20%.
16/30
)()(
)(.
.
aFaF
aLiftALLN
BADn= [ ]HLa ,∈
( ))(1))(())(( 1
..1..
1.. qFF
qqFFqFFLift ALLNBADn
ALLNALLN
ALLNBADnq
−−
−
==
{ }qaFHLaqF ALLNALLN ≥∈=− )(],,[min)( .1.
( ).)1.0(10 1..%10
−⋅= ALLNBADn FFLift
Indexes based on distribution function
q Lift can be expressed and computed by formulae:
17/30
Indexes based on density function
)(xfGOOD )(xf BAD
( )∑=
=
−=n
Di
ihGOOD
K
sxKn
hxf1,1
1),(~ ( )∑=
=
−=m
Di
ihBAD
k
sxKm
hxf0
1
1),(~
12112
1
23
,~
)!32()52()!12( +
++
⋅⋅
+++
= kkk
kOS nk
kkkh σ
Ø Likelihood density functions:
Ø Kernel estimates:
Ø Optimal bandwidth (maximal smoothing):
where: is the order of kernel function(e.g. 2 for Epanechnikov kernel)is number of actual casesis an estimate of standard deviation
k
nσ~
18/30
Ø Mean difference (Mahalanobis distance):
21
22
+
+=
mnmSnS
S bg
SMM
D bg −=
Indexes based on density function
where S is pooled standard deviation:
gM bM
gS bS, are standard deviations of good (bad) clients
, are means of good (bad) clients
19/30
Ø Information value (Ival) – continuous case (Divergence):
( ) dxxfxfxfxfI
BAD
GOODBADGOODval
−= ∫
∞
∞− )()(
ln)()(
)()()( xfxfxf BADGOODdiff −=
=
)()(ln)(
xfxfxf
BAD
GOODLR
Indexes based on density function
20/30
• Replace density functions by their kernel estimates and compute integral numerically (e.g. by composite trapezoidal rule).
• Using Epanechnikov kernel, given byand optimal bandwidth we have
• For given M+1 points we obtain
Ø Information value (Ival) – discretized continuous case:
( ) [ ]( )1,1143)( 2 −∈⋅−= xIxxK
++
−= ∑
−
=
1
10
0 )(~)(~2)(~2
M
iMIViIVIV
Mval xfxfxf
MxxI
( )
−=
),(~),(~
ln),(~),(~)(~
2,
2,2,2,
OSBAD
OSGOODOSBADOSGOODIV hxf
hxfhxfhxfxf
Indexes based on density function
kOSh ,
Mxx ,,0 K
0x Mx
21/30
• Create intervals of score – typically deciles. Number of goods (bads) in i-th interval is marked by .
• It must holds
• Then we have ∑
−=
i i
iiival nb
mgmb
ngI ln
Indexes based on density function
score int. # bad clients #good clients % bad [1] % good [2] [3] = [2] - [1] [4] = [2] / [1] [5] = ln[4] [6] = [3] * [5]1 1 10 2,0% 1,1% -0,01 0,53 -0,64 0,012 2 15 4,0% 1,6% -0,02 0,39 -0,93 0,023 8 52 16,0% 5,5% -0,11 0,34 -1,07 0,114 14 93 28,0% 9,8% -0,18 0,35 -1,05 0,195 10 146 20,0% 15,4% -0,05 0,77 -0,26 0,016 6 247 12,0% 26,0% 0,14 2,17 0,77 0,117 4 137 8,0% 14,4% 0,06 1,80 0,59 0,048 3 105 6,0% 11,1% 0,05 1,84 0,61 0,039 1 97 2,0% 10,2% 0,08 5,11 1,63 0,1310 1 48 2,0% 5,1% 0,03 2,53 0,93 0,03All 50 950 Info. Value 0,68
Ø Information statistics/value (Ival) – discrete case:
( )ii bg
ibg ii ∀>> 0,0
22/30
q Information value for our example of two scorecards:
Indexes based on density function
Ø SC 1:
ØSC 2:
decile # cleints # bad clients #good % bad [1] % good [2] [3] = [2] - [1] [4] = [2] / [1] [5] = ln[4] [6] = [3] * [5] cum. [6]1 100 35 65 35,0% 7,2% -0,28 0,21 -1,58 0,44 0,442 100 16 84 16,0% 9,3% -0,07 0,58 -0,54 0,04 0,473 100 8 92 8,0% 10,2% 0,02 1,28 0,25 0,01 0,484 100 8 92 8,0% 10,2% 0,02 1,28 0,25 0,01 0,495 100 7 93 7,0% 10,3% 0,03 1,48 0,39 0,01 0,506 100 6 94 6,0% 10,4% 0,04 1,74 0,55 0,02 0,527 100 6 94 6,0% 10,4% 0,04 1,74 0,55 0,02 0,558 100 5 95 5,0% 10,6% 0,06 2,11 0,75 0,04 0,599 100 5 95 5,0% 10,6% 0,06 2,11 0,75 0,04 0,63
10 100 4 96 4,0% 10,7% 0,07 2,67 0,98 0,07 0,70All 1000 100 900 Info. Value 0,70
decile # cleints # bad clients #good % bad [1] % good [2] [3] = [2] - [1] [4] = [2] / [1] [5] = ln[4] [6] = [3] * [5] cum. [6]1 100 20 80 20,0% 8,9% -0,11 0,44 -0,81 0,09 0,092 100 18 82 18,0% 9,1% -0,09 0,51 -0,68 0,06 0,153 100 17 83 17,0% 9,2% -0,08 0,54 -0,61 0,05 0,204 100 15 85 15,0% 9,4% -0,06 0,63 -0,46 0,03 0,225 100 12 88 12,0% 9,8% -0,02 0,81 -0,20 0,00 0,236 100 6 94 6,0% 10,4% 0,04 1,74 0,55 0,02 0,257 100 4 96 4,0% 10,7% 0,07 2,67 0,98 0,07 0,328 100 3 97 3,0% 10,8% 0,08 3,59 1,28 0,10 0,429 100 3 97 3,0% 10,8% 0,08 3,59 1,28 0,10 0,52
10 100 2 98 2,0% 10,9% 0,09 5,44 1,69 0,15 0,67All 1000 100 900 Info. Value 0,67
23/30
q Using markings we have:
−=
mb
ngI ii
diffi
Indexes based on density function
=
nbmgI
i
iLRi
ln
Ø SC 1:
ØSC 2:
-0,30
-0,25
-0,20
-0,15
-0,10
-0,05
0,00
0,05
0,10
1 2 3 4 5 6 7 8 9 10
-2,00
-1,50
-1,00
-0,50
0,00
0,50
1,00
1,50I_diffI_LR
0,00
0,05
0,10
0,15
0,20
0,25
0,30
0,35
0,40
0,45
0,50
1 2 3 4 5 6 7 8 9 100,00
0,10
0,20
0,30
0,40
0,50
0,60
0,70
0,80I_diif * I_LRcum. I_diff * I_LR
-0,15
-0,10
-0,05
0,00
0,05
0,10
1 2 3 4 5 6 7 8 9 10
-1,00
-0,50
0,00
0,50
1,00
1,50
2,00I_diffI_LR
0,00
0,02
0,04
0,06
0,08
0,10
0,12
0,14
0,16
1 2 3 4 5 6 7 8 9 100,00
0,10
0,20
0,30
0,40
0,50
0,60
0,70
0,80I_diif * I_LR
cum. I_diff * I_LR
K-S = 0.34Gini = 0.42Lift20% = 2.55Lift50% = 1.48Ival = 0.70Ival20% = 0.47Ival50% = 0.50
K-S = 0.36Gini = 0.42Lift20% = 1.90Lift50% = 1.64Ival = 0.67Ival20% = 0.15Ival50% = 0.23
24/30
Ø Assume that the scores of good and bad clients are normally distributed, i.e. we can write their densities as
Ø Estimates of parameters and :
Ø Pooled standard deviation:
Ø Estimates of mean and standard dev. of scores for all clients :
Some results for normally distributed scores
( )2
2
2
21)( g
gx
gGOOD exf σ
µ
πσ
−−
=
( )2
2
2
21)( b
bx
bBAD exf σ
µ
πσ
−−
=
gbb σµµ ,, bσ
,
.
gM bM
gS bS, are standard deviations of good (bad) clients
, are means of good (bad) clients
21
22
+
+=
mnmSnS
S bg
mnmMnM
MM bgALL +
+==
( )21
2
2222
+
+=
mnSmSn
S bgALL
ALLALL σµ ,
25/30
12
222
−
Φ⋅=
−
Φ−
Φ=
DDDKS
Where is the standardized normal distribution function, the normal distribution function with parameters , and is the standard quantile function.
σµµ bgD
−=
( )
⋅+Φ⋅Φ= − Dpq
qLift G
ALLq
11σ
σ
2DIval =
12
2 −
Φ⋅=DGini
SMM
D bg −=
Some results for normally distributed scores
Ø Assume that standard deviations are equal to a common value :σ
( )⋅Φ)(1 ⋅Φ −
)(2,⋅Φ
σµµ 2σ
( )
⋅+ΦΦ= − Dpq
SS
qLift G
ALLq
11
26/30
Ø Generally (i.e. without assumption of equality of standard deviations):
⋅+−⋅Φ−
⋅+−⋅Φ= cbDa
bD
bacbDa
bD
baKS bggb 2121 22 *2**2* σσσσ
Some results for normally distributed scores
,22gba σσ +=
22
*
bg
bgDσσ
µµ
+
−= 22
*
bg
bg
SS
MMD
+
−=
where
=
b
gcσσ
ln,22gbb σσ −=
( ) ( )
( ) ( )
−⋅++
−−⋅
−
+Φ−
−⋅++
−−⋅
−
+Φ=
b
ggbgbb
gbg
gb
gb
b
ggbgbg
gbb
gb
gb
SS
SSDSSSSS
DSSSSS
SS
SSDSSSSS
DSSSSS
KS
ln21
ln21
22*2222
*22
22
22*2222
*22
22
2
2
27/30
Ø Generally (i.e. without assumption of equality of standard deviations):
( ) 12 * −Φ⋅= DGini
Some results for normally distributed scores
( )( ) ( )
−+Φ⋅Φ=Φ⋅+Φ=
−−
b
bALLALLALLALLq
Liftbb σ
µµσσµ
σµ
11
,
112
+=−++= 2
2
2
22*
21,1)1(
b
g
g
bval AADAI
σσ
σσ
+=−++= 2
2
2
22*
21,1)1(
b
g
g
bval S
SSSAADAI
( )
−+Φ⋅Φ=
−
b
bALLq S
MMqSq
Lift11
28/30
q KS and the Gini react much more to change of
and are almost unchanged in the direction of .
Ø Gini ,
Some results for normally distributed scores
0=bµ 12 =bσØ KS: ,
0=bµ 12 =bσ
• Gini > KS
gµ
2gσ
29/30
Ø Lift10%: ,
Some results for normally distributed scores
0=bµ 12 =bσ
Ø Ival: ,0=bµ 12 =bσ
q In case of Lift10%it is evident strong dependence on and significantly higher dependence on than in case of KS and Gini.
q Again strong dependence on . Furthermore value of Ival rises very quickly to infinity when tends to zero.
gµ
2gσ
gµ
2gσ
30/30
Conclusions
q It is impossible to use scoring model effectively without knowing how good it is.q It is necessary to judge scoring models according to their strength in score range where cutoff is expected. q The Gini is not enough!q Results concerning Lift and Information value can be used to obtain the best available scoring model.q Results for normally distributed scores can help with computation of referred indexes. Furthermore they can help to understanding how those indexes behave.