The Bayesian Approach to Default Risk: A Guide
Michael Jacobs, Jr.
Credit Risk Analysis Division
Office of the Comptroller of the Currency
Nicholas M. Kiefer
Cornell University
Departments of Economics and Statistical Science
March 2010. Forthcoming: "Rethinking Risk Measurement & Reporting", Risk Books, Ed. Klaus Böcker
The views expressed herein are those of the authors and do not necessarily represent the views of the Office of the Comptroller of the Currency or the Department of the Treasury.
Outline
• Introduction
• Elicitation of Expert Information
• Statistical Models for Defaults
• Elicitation: Example
• Inference
• Conclusions & Directions for Future Research
Introduction
• All competent statistical analyses involve subjective inputs, the importance of which is often minimized in a quest for objectivity
• Justification of these inputs is an important part of supervisory expectations for model validation (OCC 2000, BCBS 2009b)
• But to appear objective, estimation with such judgments typically proceeds ignoring qualitative information on parameters
• However, subject-matter experts typically have information about parameter values and model specification
• The Bayesian approach allows formal incorporation of this information by combining "hard" & "soft" data using the rules of probability
• Another advantage is the availability of powerful computational techniques such as Markov Chain Monte Carlo (MCMC)
• A difficulty in Bayesian analysis is the elicitation & representation of expert information in the form of a probability distribution
Introduction (continued)
• While it may not be important in "large" samples, expert information is of value when data are scarce, costly, or unreliable
• Herein we illustrate the practical steps in the Bayesian analysis of PD estimation for a group of homogeneous assets
• This is required for determining minimum regulatory capital under the Basel II (B2) framework (BCBS, 2006)
• This also has implications for BCBS (2009a), which stresses the continuing importance of quantitative risk management
• Focus is 1st on elicitation & representation of expert information, then on Bayesian inference in nested simple models of default
• As we do not know in advance whether default occurs or not, we model this uncertain event with a probability distribution
• We assert that uncertainty about the default probability should be modeled the same way as uncertainty about defaults - i.e., represented in a probability distribution
Introduction (concluded)
• There is information available to model the PD distribution - the fact that loans are made shows that risk assessment occurs!
• The information from this should be organized & used in the analysis in a sensible, transparent and structured way
• We first discuss the process for elicitation of expert information and later show a particular example using the maximum entropy approach
• We present a sequence of simple models of default generating likelihood functions (within the class of generalized linear mixed models):
– The binomial model, the 2-factor ASRM of B2, and an extension with an autocorrelated systematic risk factor
• We sketch the Markov Chain Monte Carlo approach to calculating the posterior distribution, combining data and expert information coherently using the rules of probability
• We illustrate all of these steps using annual Moody's corporate Ba default rates for the period 1999-2009
Elicitation of Expert Information
• General definition: a structured algorithm for transforming an expert's beliefs on an uncertain phenomenon into a probability distribution
• Here, a method for specifying a prior distribution of unknown parameters governing a model of credit risk (e.g., PD, LGD)
• While our focus is inference regarding unknown parameters, elicitation arises in almost all scientific applications involving complex systems
• The expert may be an experienced statistician or a somewhat quantitatively oriented risk specialist (e.g., a loan officer, PM)
• The situation also arises where decision-making under uncertainty needs to be expressed as such to maximize an objective
• Useful framework: identify a model developer or econometrician as a facilitator to transform the soft data into probabilistic form
• The facilitator should be multi-faceted, having also business knowledge and strong communication skills
Elicitation of Expert Information (continued)
• In setting criteria for the quality of an elicitation, we distinguish between the quality of an expert's knowledge & the elicitation
• By no means a straightforward task, even if it is beliefs regarding only a single event or hypothesis (e.g., PD)
• We seek assessment of probabilities, but it is possible that the expert is not familiar with their meaning or cannot think in these terms
• Even if the expert is comfortable with these, it is challenging to accurately assess numerical probabilities of a rare event
• In eliciting a distribution for a continuous parameter it is not practical to try eliciting an infinite collection of probabilities
• Practically, an expert can make only a finite (usually limited) number of statements of belief (quantiles or modes)
• Given these formidable difficulties an observer may question if it is worth the effort to even attempt this!
Elicitation of Expert Information (continued)
• Often a sensible objective is to measure salient features of the expert's opinion - exact details may not be of highest relevance
• Similarity to specification of a likelihood function: infinite number of probabilities expressed as a function of small parameter sets
• Even if the business decision is sensitive to the exact shape, it may be that another metric is of paramount importance
• Elicitation promotes a careful consideration by both the expert and facilitator regarding the meaning of the parameters
• Two benefits: results in an analysis that is closer to the subject of the model & gives rise to a meaningful posterior distribution
• A natural interpretation is as part of the statistical modeling process - this is a step in the hierarchy & the usual rules apply
• Stylized representation has 4 stages: preparation, specific summaries, vetting of the distribution & overall assessment
Elicitation of Expert Information (concluded)
• Non-technical considerations include quality of the expert and quality of the arguments made
• While the choice of an expert must be justified, it is usually not that hard to identify one: individuals making risk management decisions
• It is useful to have a summary of what the expert knowledge is based on and be wary of any conflicts of interests
• It is important that, if needed, training be offered on whatever concepts will be required in the elicitation (a "dry-run")
• The elicitation should be well documented: set out all questions & responses, and the process of fitting the distribution
• Documentation requirements here fit well into supervisory expectations with respect to developmental evidence of models
• Further discussion of this can be found in Kadane and Wolfson (1998), Garthwaite et al (2005) and O'Hagan et al (2006)
Statistical Models for Defaults
• The simplest probability model for defaults for a homogeneous portfolio segment is binomial, which assumes independence across assets & time, with common probability $\theta$
• As in Basel 2 IRB and the rating agencies, this is marginal with respect to conditions (e.g., through taking long-term averages)
• Suppose the value of the ith asset in time t is:
$v_{it} = \epsilon_{it}, \quad \epsilon_{it} \sim NID(0,1)$
• Default occurs if asset value falls below a common predetermined threshold $T^*$, so that:
$d_i = 1(v_{it} < T^*), \quad \theta = \Pr(d_i = 1) = \Pr(v_{it} < T^*) = \Phi(T^*) \in (0,1)$
• It follows that default on asset i is distributed Bernoulli:
$d_i \sim p(d_i \mid \theta) = \theta^{d_i}(1-\theta)^{1-d_i}$
• Denote the defaults in the data $D = \{d_i,\ i = 1, \ldots, n\}$ and the total count of defaults $r = \sum_{i=1}^{n} d_i$
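As a quick numerical check of Model 1, the binomial log-likelihood and its MLE can be sketched in a few lines of Python (the function name is ours; the counts r = 24 and n = 2642 are the Moody's Ba figures used later in the deck):

```python
import math

def binom_loglik(theta, r, n):
    """Log-likelihood of Model 1: r defaults among n independent exposures."""
    if not 0.0 < theta < 1.0:
        return float("-inf")
    return r * math.log(theta) + (n - r) * math.log(1.0 - theta)

# The MLE is simply the observed default rate r/n
r, n = 24, 2642          # Moody's Ba sample used later in the deck
theta_hat = r / n        # ~0.00908
```

The log-likelihood is maximized at the observed rate, as a comparison against nearby values of theta confirms.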
Statistical Models for Defaults (continued)
• The distribution of the data and the default count is:
$p(D \mid \theta) = \prod_{i=1}^{n} \theta^{d_i}(1-\theta)^{1-d_i} = \theta^{r}(1-\theta)^{n-r}, \quad p(r \mid \theta) = \binom{n}{r}\theta^{r}(1-\theta)^{n-r}, \quad r \sim Bin(n, \theta)$
• This is Model 1, which underlies rating agency estimation of default rates, where the MLE estimator is simply $\hat{\theta} = r/n$
• Basel II guidance suggests there may be heterogeneity due to systematic temporal changes in asset characteristics or to changing macroeconomic conditions, giving rise to our Model 2:
$v_{it} = \rho^{1/2} x_t + (1-\rho)^{1/2} \epsilon_{it}$
• Where $x_t \sim NID(0,1)$ is a common time-specific shock or systematic factor and $\rho \in (0,1)$ is the asset value correlation
Statistical Models for Defaults (continued)
• We have that $v_{it} \mid x_t \sim N(\rho^{1/2} x_t,\ 1-\rho)$ and the conditional (or period-t) default probability is (4):
$\theta_t = \theta(x_t) = \Pr(v_{it} < T^* \mid x_t) = \Phi\left(\frac{\Phi^{-1}(\theta) - \rho^{1/2} x_t}{(1-\rho)^{1/2}}\right)$
• We can invert this for the distribution function of the year-t default rate for $A \in (0,1)$ (5):
$\Pr(\theta_t \le A) = \Pr\left(x_t \ge \frac{\Phi^{-1}(\theta) - (1-\rho)^{1/2}\Phi^{-1}(A)}{\rho^{1/2}}\right) = \Phi\left(\frac{(1-\rho)^{1/2}\Phi^{-1}(A) - \Phi^{-1}(\theta)}{\rho^{1/2}}\right)$
• Differentiating with respect to A yields the well-known Vasicek distribution (5.1):
$p(\theta_t \mid \theta, \rho) = \left(\frac{1-\rho}{\rho}\right)^{1/2} \exp\left(\frac{1}{2}\Phi^{-1}(\theta_t)^2 - \frac{1}{2\rho}\left[(1-\rho)^{1/2}\Phi^{-1}(\theta_t) - \Phi^{-1}(\theta)\right]^2\right)$
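The conditional PD (4) and the Vasicek distribution (5)-(5.1) can be sketched directly with the standard-normal CDF and quantile functions from the Python standard library (the helper names are ours):

```python
import math
from statistics import NormalDist

N01 = NormalDist()  # standard normal: .cdf() and .inv_cdf()

def cond_pd(theta, rho, x):
    """Eq. (4): period-t default probability theta(x_t) given systematic factor x."""
    return N01.cdf((N01.inv_cdf(theta) - math.sqrt(rho) * x) / math.sqrt(1.0 - rho))

def vasicek_cdf(a, theta, rho):
    """Eq. (5): Pr(theta_t <= a) for the period default rate."""
    num = math.sqrt(1.0 - rho) * N01.inv_cdf(a) - N01.inv_cdf(theta)
    return N01.cdf(num / math.sqrt(rho))

def vasicek_pdf(a, theta, rho):
    """Eq. (5.1): Vasicek density of the period default rate."""
    z = N01.inv_cdf(a)
    q = math.sqrt(1.0 - rho) * z - N01.inv_cdf(theta)
    return math.sqrt((1.0 - rho) / rho) * math.exp(0.5 * z * z - q * q / (2.0 * rho))
```

A useful self-consistency check: the density (5.1) should equal the numerical derivative of the CDF (5), and the median of the Vasicek distribution is Φ(Φ⁻¹(θ)/√(1−ρ)).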
Statistical Models for Defaults (continued)
• The conditional distribution of the number of defaults in each period is (6):
$p(r_t \mid \theta_t) = \binom{n_t}{r_t}\theta_t^{r_t}(1-\theta_t)^{n_t - r_t}, \quad r_t \sim Bin(n_t, \theta_t)$
• From which we obtain the distribution of defaults conditional on the underlying parameters by integration over the default rate distribution (6.1):
$p(r_t \mid \theta, \rho) = \int_0^1 p(r_t \mid z)\, p(z \mid \theta, \rho)\, dz$
• By intertemporal independence we have the data likelihood across all years (7):
$p(R \mid \theta, \rho) = \prod_{t=1}^{T} p(r_t \mid \theta, \rho), \quad R = (r_1, \ldots, r_T)$
• Model II allows clumping of defaults within time periods, but not correlation across time periods, so the next natural extension lets the systematic risk factor $x_t$ follow an AR(1) process:
$x_t = \tau x_{t-1} + \eta_t, \quad \eta_t \sim NID(0,1), \quad \tau \in (-1,1)$
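The mixture integral (6.1) has no closed form, but it is easy to approximate. A Monte Carlo sketch (the helper names, draw count and seed are ours) that draws the period default rate through the systematic factor and averages the binomial pmf:

```python
import math
import random
from statistics import NormalDist

N01 = NormalDist()

def binom_pmf(r, n, p):
    """Binomial pmf, eq. (6)."""
    return math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)

def p_count_given_params(r, n, theta, rho, draws=20000, seed=1):
    """Eq. (6.1) by Monte Carlo: average the binomial pmf over draws of
    theta_t = Phi((Phi^-1(theta) - sqrt(rho) x_t) / sqrt(1 - rho))."""
    rng = random.Random(seed)
    c = N01.inv_cdf(theta)
    total = 0.0
    for _ in range(draws):
        x = rng.gauss(0.0, 1.0)  # systematic factor x_t ~ N(0,1)
        z = N01.cdf((c - math.sqrt(rho) * x) / math.sqrt(1.0 - rho))
        total += binom_pmf(r, n, z)
    return total / draws
```

Because the fixed seed reuses the same draws of θ_t for every count r, the probabilities sum to one exactly, and the implied mean count equals n·θ up to Monte Carlo error.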
Statistical Models for Defaults (concluded)
• The formula for the conditional PD (4) still holds, but we do not get the Vasicek distribution of the default rate (5), and (6)-(6.1) becomes the conditional likelihood without the Vasicek-distributed default rate:
$p(R \mid x_1, \ldots, x_T, \theta, \rho) = \prod_{t=1}^{T} p(r_t \mid x_t, \theta, \rho)$
• Now the unconditional distribution is given by a T-dimensional integration, as the likelihood can no longer be broken up period-by-period (8):
$p(R \mid \theta, \rho, \tau) = \int \prod_{t=1}^{T} p(r_t \mid x_t, \theta, \rho)\, p(x_1, \ldots, x_T \mid \tau)\, dx_1 \cdots dx_T$
• Where $p(x_1, \ldots, x_T \mid \tau)$ is the joint density of a zero-mean random variable following an AR(1) process
• While Model 1 is a very simple example of a Generalized Linear Model - GLM (McCullagh and Nelder, 1989), Models II & III are Generalized Linear Mixed Models - GLMMs, a parametric mixture (McNeil and Wendin, 2007; Kiefer, 2009)
Elicitation: Example
• We asked an expert to consider a portfolio of middle market loans in a bank's portfolio, typically commercial loans to un-rated companies (if rated, these would be about Moody's Ba-Baa)
• This is an experienced banking professional in credit portfolio risk management and business analytics, having seen many portfolios of this type in different institutions
• The expert thought the median value was 0.01, minimum of 0.0001, that a value above 0.035 would occur with probability less than 10%, and an absolute upper bound was 0.3
• Quantiles were assessed by asking the expert to consider the value at which larger or smaller values would be equiprobable given the value was less or greater than the median
• The 0.25 (0.75) quantile was assessed at 0.0075 (0.0125), and he added a 0.99 quantile at 0.2, splitting up the long upper tail from 0.035 to 0.3
Elicitation: Example (continued)
• How should we mathematically express the expert information?
• Commonly we specify a parametric distribution, assuming standard identification properties (i.e., K conditions can determine a K-parameter distribution - see Kiefer, 2010a)
• Disadvantage: there is rarely good guidance other than convenience of functional form, and this can insert extraneous information
• We prefer the nonparametric maximum entropy (ME) approach (Cover & Thomas, 1991), where we choose a probability distribution p that maximizes the entropy H subject to K constraints:
$\max_p H(p) = -\int p(x) \ln p(x)\, dx$
$s.t.\ \int p(x)\, c_k(x)\, dx = 0 \ \ for\ k = 1, \ldots, K$
$\int p(x)\, dx = 1 \ \ and\ \ p(x) \ge 0$
Elicitation: Example (continued)
• Our constraints are the values of the quantiles $q_k$, and we can express the solution in terms of the Lagrange multipliers $\lambda_k$ chosen so that the constraints are satisfied; from the 1st-order conditions:
$p_{ME}(x) = \exp\left(\sum_{k=1}^{K} \lambda_k I(x \le q_k)\right)$
• This is a piecewise uniform distribution, which we decide to smooth with an Epanechnikov kernel, under the assumption that discontinuities are unlikely to reflect the expert's view:
$p_S(u) = \frac{1}{h}\int K\left(\frac{u-v}{h}\right) p_{ME}(v)\, dv, \quad K(u) = \frac{3}{4}(1-u^2) \ for\ u \in [-1, 1]$
• Where h is the bandwidth, chosen such that the expert was satisfied with the final product
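The fitted ME prior can be sketched directly from the elicited quantiles. A minimal Python version (the function names and grid sizes are ours; the quantiles and cumulative probabilities are those stated by the expert):

```python
def piecewise_uniform_pdf(x, qs, ps):
    """Maximum-entropy density matching elicited quantiles: uniform between
    successive quantiles (qs ascending, ps the cumulative probabilities)."""
    for (q0, p0), (q1, p1) in zip(zip(qs, ps), zip(qs[1:], ps[1:])):
        if q0 <= x <= q1:
            return (p1 - p0) / (q1 - q0)
    return 0.0

def epanechnikov_smooth(x, qs, ps, h, m=2000):
    """Kernel-smoothed prior p_S(x) = (1/h) * integral of K((x-v)/h) p_ME(v) dv,
    with K(u) = 0.75*(1 - u^2) on [-1, 1], computed by the trapezoid rule."""
    lo, hi = x - h, x + h
    step = (hi - lo) / m
    s = 0.0
    for i in range(m + 1):
        v = lo + i * step
        u = (x - v) / h
        w = 0.75 * (1.0 - u * u)  # Epanechnikov kernel weight
        s += w * piecewise_uniform_pdf(v, qs, ps) * (0.5 if i in (0, m) else 1.0)
    return s * step / h

# Elicited quantiles: min, 0.25, median, 0.75, 0.90, 0.99 quantile, max
qs = [0.0001, 0.0075, 0.01, 0.0125, 0.035, 0.2, 0.3]
ps = [0.0, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0]
```

Between the 0.25 and 0.50 quantiles the density is (0.5 − 0.25)/(0.01 − 0.0075) = 100, and well inside a constant piece the smoothed density reproduces that height since the kernel integrates to one.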
Elicitation: Example (continued)
• We address the boundary problem - the smoothed density $p_S$ has larger support than $p_{ME}$ - using the reflection technique (Schuster, 1985):
$p_{SM}(u) = p_S(u) + p_S(-u) \ for\ u \in [0, h)$
$p_{SM}(u) = p_S(u) \ for\ u \in [h, 1-h]$
$p_{SM}(u) = p_S(u) + p_S(2-u) \ for\ u \in (1-h, 1]$
• For asset correlation in Models 2 & 3, B2 recommends a value of about 0.20 for this segment, so with little expert information on this, we choose a Beta(12.6, 50.4) prior centered at 0.20
• With even less guidance on the autocorrelation in Model 3, other than from the asset pricing literature that it is likely to be positive, we chose a uniform prior on [-1, 1], with the B2 value of 0 as its mean
Elicitation: Example (continued)
[Figure: smoothed prior density for θ, supported on roughly (0.000, 0.030)]
Elicitation: Example (concluded)
[Figure: prior for ρ, the Beta(12.6, 50.4) density on (0, 1)]
Inference: Bayesian Framework
• Let us write the likelihood function of the data generically (with θ standing in for all model parameters):
$p(R \mid \theta), \quad \theta \in \Theta$
• The joint distribution of the data R and the parameter under the prior $p(\theta)$ is:
$p(R, \theta) = p(R \mid \theta)\, p(\theta)$
• The marginal (predictive) distribution of R is:
$p(R) = \int p(R \mid \theta)\, p(\theta)\, d\theta$
• Finally, we obtain the posterior (conditional) distribution of the parameter as:
$p(\theta \mid R) = \frac{p(R \mid \theta)\, p(\theta)}{p(R)}$
• Perhaps take a summary statistic like $E(\theta \mid R)$, the posterior expectation, for B2 or other purposes, which is (asymptotically) optimal under (bowl-shaped) quadratic loss
• Computationally, high-dimensional numerical integration may be hard and inference a problem; we therefore turn to simulation techniques
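For Model 1 these formulas are available in closed form when the prior is a Beta density: the posterior is again Beta, which gives a handy benchmark for checking simulation output. A sketch under an illustrative flat Beta(1, 1) prior (not the elicited ME prior; the function name is ours):

```python
def beta_binomial_posterior(a, b, r, n):
    """Beta(a, b) prior + binomial likelihood (r defaults in n exposures)
    gives a Beta(a + r, b + n - r) posterior by conjugacy."""
    return a + r, b + n - r

# Flat Beta(1, 1) prior with the Moody's Ba counts
a_post, b_post = beta_binomial_posterior(1.0, 1.0, 24, 2642)
post_mean = a_post / (a_post + b_post)   # E(theta | R) = (r + 1) / (n + 2)
```

Under the flat prior the posterior mean is 25/2644 ≈ 0.00946, a useful reference point for the MCMC estimates reported later.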
Inference: Computation by Markov Chain Monte Carlo
• MCMC methods are a wide class of procedures for sampling from a distribution when the normalizing constant is unknown
• In the simple case, the Metropolis method, we sample from our posterior distribution that is only known up to a constant:
$\theta^{(0)}, \ldots, \theta^{(N)} \ from\ p(\theta \mid R)$
• We construct a Markov Chain which has this as its stationary distribution by starting with a (symmetric) proposal distribution:
$\theta' = q(\theta, \epsilon) = \theta + \epsilon, \quad \epsilon \sim NID(0, V); \quad q(\theta' \mid \theta) = q(\theta \mid \theta')$
• The new parameter depends upon the old one stochastically, and the diagonal covariance matrix V of the normal error is chosen specially to make the algorithm work
• We draw from this distribution and accept the new draw according to the ratio of joint likelihoods of the data and the parameter, known as the acceptance rate
Inference: Computation by MCMC (concluded)
• Note that the normalizing constant p(R) cancels, and therefore the acceptance probability is easy to calculate:
$\alpha(\theta_n, \theta') = \min\left(1, \frac{p(\theta' \mid R)}{p(\theta_n \mid R)}\right) = \min\left(1, \frac{p(R \mid \theta')\, p(\theta')}{p(R \mid \theta_n)\, p(\theta_n)}\right)$
$\theta_{n+1} = \theta' \ with\ probability\ \alpha(\theta_n, \theta'); \quad \theta_{n+1} = \theta_n \ with\ probability\ 1 - \alpha(\theta_n, \theta')$
• The resulting sample is a Markov Chain with this equilibrium distribution, and moments calculated from it approximate the target
• We use the mcmc package (Geyer, 2009) in the R programming language (R Development Core Team, 2009)
• The package takes into account that the draws are not independent when computing standard errors and confidence bounds
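A minimal random-walk Metropolis sketch for the 1-parameter model, using a flat prior on (0, 1) for illustration rather than the elicited ME prior (the step size, chain length and seed are ours, not the paper's settings):

```python
import math
import random

def log_post(theta, r=24, n=2642):
    """Log posterior of Model 1 under a flat prior on (0, 1), up to a constant."""
    if theta <= 0.0 or theta >= 1.0:
        return float("-inf")
    return r * math.log(theta) + (n - r) * math.log(1.0 - theta)

def metropolis(n_draws=20000, start=0.01, step=0.002, seed=42):
    """Random-walk Metropolis: accept theta' with probability
    min(1, p(theta'|R) / p(theta_n|R))."""
    rng = random.Random(seed)
    theta, lp = start, log_post(start)
    draws, accepted = [], 0
    for _ in range(n_draws):
        prop = theta + rng.gauss(0.0, step)   # symmetric normal proposal
        lp_prop = log_post(prop)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            theta, lp = prop, lp_prop
            accepted += 1
        draws.append(theta)
    return draws, accepted / n_draws

draws, acc_rate = metropolis()
post_mean = sum(draws[2000:]) / len(draws[2000:])  # discard burn-in
```

Under the flat prior the exact posterior mean is (r + 1)/(n + 2) ≈ 0.0095, close to the 1-parameter estimate of 0.00977 reported below under the elicited prior.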
Inference: Data
• We construct a segment of upper tier high-yield corporate bonds of firms rated Ba by Moody's Investors Service
• We use the Moody's Default Risk Service™ (DRS™) database (release date 1-8-2010)
• These are restricted to U.S. domiciled, non-financial and non-sovereign entities
• Default rates were computed for annual cohorts of firms starting in January 1999 and running through January 2009
• We use the Moody's adjustment for withdrawals (i.e., remove ½ from the beginning count)
• In total there are 2642 firm-years of data and 24 defaults, for an overall empirical rate of 0.00908
Inference: Data (continued)
[Figure: time series of Moody's annual Ba default rate, 1999-2009]
[Figure: kernel density of Moody's Ba annual default rate, 1999-2009]
Inference: Empirical Results
• PD estimates in the 2- & 3-parameter models are only very slightly higher than in the 1-parameter model
• There is a higher degree of variability of the AVC estimate ρ relative to its mean, as compared to the PD
Markov Chain Monte Carlo Estimation: 1-, 2- and 3-Parameter Default Models (Moody's Ba Rated Default Rates 1999-2009)

                                  1-Parameter Model    2-Parameter Model    3-Parameter Model
E(θ|R)                            0.00977              0.0105               0.0100
σθ                                0.00174              0.00175              0.00176
95% Credible Interval (θ)         (0.00662, 0.0134)    (0.00732, 0.0140)    (0.0069, 0.0139)
E(ρ|R)                            -                    0.0770               0.0812
σρ                                -                    0.0194               0.0185
95% Credible Interval (ρ)         -                    (0.0435, 0.119)      (0.043, 0.132)
E(τ|R)                            -                    -                    0.162
στ                                -                    -                    0.0732
95% Credible Interval (τ)         -                    -                    (-0.006, 0.293)
Acceptance Rate                   0.245                0.228                0.239
Stressed Regulatory Capital (θ)¹  6.53%                6.72%                6.69%
Minimum Regulatory Capital²       5.29%                5.55%                5.38%
Stressed Reg. Capital Markup      23.49%               21.06%               24.52%

1 - Using the 95th percentile of the posterior distribution of PD, an LGD of 40%, an asset value correlation of 20% and unit EAD in the supervisory formula
2 - The same as the above but using the mean of the posterior distribution of PD
Inference: Empirical Results (continued)
• The relatively low estimate of ρ is consistent with various previous calibrations of structural credit models to default vs. equity data
• But the prior mean (0.2) is well outside the posterior 95% credible interval for the AVC ρ - why?
• θ = 0.01 & ρ = 0.2 in the Vasicek distribution imply an intertemporal standard deviation in default rates of 0.015
• But with ρ = 0.077, the posterior mean, the implied standard deviation is 0.008, which better matches that in our sample of 0.0063
• This aspect of the data is moving the posterior to the left of the prior
• There is evidence that the autocorrelation parameter τ of the systematic risk factor may be mildly positive
• Estimates of stressed regulatory capital are 6.53% (6.7%) for Model 1 (Models 2 & 3), and the mark-up over the base ranges over 21-25%
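The implied standard deviations quoted above can be reproduced by integrating the conditional PD (4) against the standard normal density of the systematic factor (the function name and quadrature grid are ours):

```python
import math
from statistics import NormalDist

N01 = NormalDist()

def implied_sd(theta, rho, lo=-8.0, hi=8.0, m=4000):
    """Intertemporal standard deviation of the period default rate theta_t,
    via trapezoid integration of theta(x) and theta(x)^2 over x ~ N(0,1)."""
    c = N01.inv_cdf(theta)
    s, t = math.sqrt(rho), math.sqrt(1.0 - rho)
    step = (hi - lo) / m
    m1 = m2 = 0.0
    for i in range(m + 1):
        x = lo + i * step
        w = 0.5 if i in (0, m) else 1.0      # trapezoid end-point weights
        pd = N01.cdf((c - s * x) / t)        # conditional PD theta(x), eq. (4)
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        m1 += w * pd * phi
        m2 += w * pd * pd * phi
    m1 *= step
    m2 *= step
    return math.sqrt(m2 - m1 * m1)
```

With θ = 0.01 this gives roughly 0.015 at ρ = 0.2 and roughly 0.008 at ρ = 0.077, matching the figures cited above.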
Inference: Empirical Results (continued)
[Figure: MCMC posterior density of the probability of default, 1-parameter model (Moody's Ba default rates, annual cohorts 1999-2009)]
Empirical Results (continued)
[Figure: posterior density of the probability of default, 2-parameter model (Moody's Ba default rates, annual cohorts 1999-2009)]
[Figure: posterior density of the asset value correlation, 2-parameter model (Moody's Ba default rates, annual cohorts 1999-2009)]
Empirical Results (concluded)
Conclusion & Directions for Future Research
• Modeling the data distribution & expert information statistically increases the range of applicability of econometric methods
• We have gone through the steps of a formal Bayesian analysis for PD, required under B2 for many institutions worldwide
• We concluded with posterior distributions for the parameters of a nested sequence of models with summary statistics
• The mean PD is a natural estimator for minimum regulatory capital requirements, but such distributions have many uses
– E.g., stressing IRB models, economic capital or credit pricing
• More general models provide insight into the extent to which default rates over time are predictable & the extent to which risk calculations should look ahead over a number of years
• Analysis of LGD or economic capital using Bayesian methods (jointly with PD?) would be useful (substantial experience here)
• Many other possible analyses could build on these methods