IBM TJ Watson Research Center
Collateralized Debt Obligation Pricing on the Cell/B.E. -- A Preliminary Result
Lurng-Kuo Liu, Virat Agarwal
© 2007 IBM Corporation
Outline
Objective
Collateralized Debt Obligation Basics
CDO on the Cell/B.E. – A preliminary result
Conclusion
Objective
– Demonstrate the competitive edge of the Cell/B.E. on CDO pricing using Monte Carlo simulation with a Gaussian copula.
– No intention to develop new models for CDO pricing.
Why CDO?
– The fastest-growing sector of the asset-backed securities market. According to SIFMA, global CDO issuance increased to $488.6 billion in 2006, nearly twice the $249.3 billion issued in 2005.
– CDOs are challenging to price. Monte Carlo simulation has been the most popular method for CDO valuation, and it can be very resource intensive for large CDOs.
– Seems to be a good fit for the Cell/B.E.
CDO Basics
A Collateralized Debt Obligation (CDO) is an asset-backed security backed by a diversified pool of defaultable instruments such as loans, junk bonds, and mortgages. If the portfolio contains only credit default swaps (CDS), it is called a synthetic CDO.
It is structured as multiple tranches and sold to investors. Each tranche has a different priority of claim on the principal.
Tranching separates out the risks by prioritizing the receipt of principal among the investors.
[Figure: CDO structure. The originating bank sells assets to the SPV in exchange for cash funding. The SPV passes principal & interest to the tranches: Senior (30-70%), Mezzanine (5-30%), Equity (0-5%). On the loss axis, each tranche is bounded by an attachment point a and a detachment point d.]
Distribution of Losses
Loss given default amount of the i-th reference obligation:
$$L_i = (1 - R_i)\,N_i$$
where $N_i$ is the notional amount and $R_i$ is the recovery rate.
The accumulated portfolio loss is
$$L(t) = \sum_{i=1}^{n} L_i\,\mathbf{1}_{\{\tau_i \le t\}}$$
where $\mathbf{1}_{\{\tau_i \le t\}}$ is a default indicator.
Cumulative loss on a given tranche with attachment point a and detachment point d:
$$L_{a,d}(t) = (L(t) - a)^+ - (L(t) - d)^+, \quad \text{where } (x)^+ \equiv \max(x, 0)$$
[Figure: portfolio loss axis divided at a and d into Equity, Mezzanine, and Senior regions.]
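The loss definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the three-name portfolio are hypothetical, and the attachment and detachment points are expressed in the same currency units as the notionals.

```python
def portfolio_loss(notionals, recoveries, default_times, t):
    """L(t): sum of (1 - R_i) * N_i over names that defaulted by time t."""
    return sum((1.0 - r) * n
               for n, r, tau in zip(notionals, recoveries, default_times)
               if tau <= t)


def tranche_loss(loss, a, d):
    """L_{a,d} = (L - a)^+ - (L - d)^+ for attachment a and detachment d."""
    return max(loss - a, 0.0) - max(loss - d, 0.0)


# Hypothetical portfolio: three names, 50% recovery, two defaults before year 5.
notionals = [100.0, 100.0, 100.0]
recoveries = [0.5, 0.5, 0.5]
default_times = [1.5, 7.0, 3.2]

loss_at_5y = portfolio_loss(notionals, recoveries, default_times, 5.0)
mezzanine_loss = tranche_loss(loss_at_5y, a=15.0, d=90.0)
print(loss_at_5y, mezzanine_loss)
```

The tranche loss is zero until the portfolio loss reaches a and is capped at d - a once the portfolio loss exceeds d.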
CDO Pricing
Losses due to defaults (the issuer fails to satisfy the terms of the obligation) are the main source of risk to the payoffs.
Estimate the present value of tranche losses due to defaults – default leg (floating leg)
$$DL_{a,d} = E\left[\int_0^T e^{-\int_0^t r(u)\,du}\, dL_{a,d}(t)\right]$$
Calculate the present value of the premium payments weighted by the outstanding capital – premium leg (fixed leg)
$$PL_{a,d}(s) = E\left[\sum_{i=1}^{w} s\,\delta_i\, e^{-\int_0^{t_i} r(u)\,du}\, \min\{\max[d - L(t_i), 0],\, d - a\}\right]$$
The fair price of the CDO tranche is defined to be the spread such that the expected values of both legs are equal.
$$s^*_{a,d} = \frac{E\left[\int_0^T e^{-\int_0^t r(u)\,du}\, dL_{a,d}(t)\right]}{E\left[\sum_{i=1}^{w} \delta_i\, e^{-\int_0^{t_i} r(u)\,du}\, \min\{\max[d - L(t_i), 0],\, d - a\}\right]}$$
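A minimal sketch of how the two legs and the fair spread might be estimated from simulated paths. It rests on assumptions that are not in the slides: a constant short rate r (so the discount factor is exp(-r t)), payment dates t_1 < ... < t_w with the accrual period δ_i taken as the gap between consecutive dates, and the hypothetical function names `path_legs` and `fair_spread`.

```python
import math


def tranche_loss(loss, a, d):
    """L_{a,d} = (L - a)^+ - (L - d)^+."""
    return max(loss - a, 0.0) - max(loss - d, 0.0)


def path_legs(defaults, a, d, r, payment_dates):
    """Legs for ONE simulated path.

    defaults: list of (default_time, loss_given_default) pairs.
    Returns (default_leg, premium_leg_per_unit_spread).
    Assumes a constant short rate r.
    """
    maturity = payment_dates[-1]
    events = sorted(e for e in defaults if e[0] <= maturity)

    # Default leg: discounted jumps of the tranche loss at each default time.
    default_leg, cumulative = 0.0, 0.0
    for tau, lgd in events:
        before = tranche_loss(cumulative, a, d)
        cumulative += lgd
        default_leg += math.exp(-r * tau) * (tranche_loss(cumulative, a, d) - before)

    # Premium leg per unit spread: accrual * discount * outstanding notional.
    premium_leg, previous = 0.0, 0.0
    for t in payment_dates:
        loss_t = sum(lgd for tau, lgd in events if tau <= t)
        outstanding = min(max(d - loss_t, 0.0), d - a)
        premium_leg += (t - previous) * math.exp(-r * t) * outstanding
        previous = t
    return default_leg, premium_leg


def fair_spread(paths, a, d, r, payment_dates):
    """s* = E[default leg] / E[premium leg per unit spread] over all paths."""
    legs = [path_legs(p, a, d, r, payment_dates) for p in paths]
    return sum(dl for dl, _ in legs) / sum(pl for _, pl in legs)
```

The premium leg uses the identity min{max[d - L, 0], d - a} = (d - a) - L_{a,d}, so the premium is paid only on the part of the tranche that has not yet been lost.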
Modeling Default Times – Marginal Distributions
Default time τ for a single firm is modeled as the first jump in a Cox process.
$$p(\tau > t) = E\left[e^{-\int_0^t \lambda(u)\,du}\right] \;\Rightarrow\; p(\tau \le t) = 1 - E\left[e^{-\int_0^t \lambda(u)\,du}\right]$$
The default intensity, or hazard rate, λ of a given firm determines its default time.
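As a simple illustration of the marginal distribution, assume a constant, deterministic hazard rate λ. This is a simplification of the Cox process, in which the intensity may itself be stochastic. The survival probability is then exp(-λt), and a default time can be drawn by inverting F(t) = 1 - exp(-λt). The function names are hypothetical.

```python
import math


def default_probability(hazard, t):
    """F(t) = p(tau <= t) = 1 - exp(-hazard * t) for a constant hazard rate."""
    return 1.0 - math.exp(-hazard * t)


def default_time_from_uniform(u, hazard):
    """Invert F: tau = -ln(1 - u) / hazard, for u in [0, 1)."""
    return -math.log(1.0 - u) / hazard


# A 2% hazard rate and a uniform draw of 0.10 give a default time of about 5.3 years.
tau = default_time_from_uniform(0.10, 0.02)
```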
Modeling Default Times – Joint Distributions
The primary driver of loss distributions is default co-dependence – correlation sensitivity.
– The higher the correlation, the more likely extreme loss events (multiple defaults) become, which increases the spread of a senior tranche.
Need to model the joint distribution of the default times (τ_1, …, τ_N) of the obligations in the portfolio.
The Gaussian copula is one of the first to be used for modeling the dependence structure in a credit portfolio.
$$p(\tau_1 \le t_1, \ldots, \tau_N \le t_N) = \Phi_\Sigma\left[\Phi^{-1}(F_1(t_1)), \ldots, \Phi^{-1}(F_N(t_N))\right]$$
Monte Carlo Simulation with Gaussian Copula
Draw a sample Z = (Z_1, …, Z_N) from an N-dimensional Gaussian distribution with correlation matrix R
– Generate independent uniform random numbers
– Convert them into normal random numbers (W), e.g. by using the Box-Muller transformation
– Perform Cholesky decomposition on the correlation matrix: R = C C^T
– Generate correlated normal random numbers with X = CW
Convert this sample to a correlated N-dimensional uniform vector U = (U_1, …, U_N) = Φ(X)
Turn each of these uniforms into a default time sample by inversion: τ_i = F_i^{-1}(U_i)
Sort the N-dimensional vector of default times in ascending order and select the default times that happen before the maturity date.
Use the random default times to generate the cash flows for the fixed leg and floating leg.
Discount these cash flows to get their present values.
Repeat the process m times for the m-path Monte Carlo estimation.
* For simplicity, the calibration process is not included in this work.
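The sampling steps above can be sketched with the Python standard library. Several things here are assumptions rather than the authors' implementation: the Cholesky factor is passed in precomputed (the 2x2 factor below is written out analytically for a hypothetical correlation of 0.3), normals come from `random.gauss` instead of a Mersenne Twister plus Box-Muller pipeline, and each marginal F_i is exponential with a constant hazard rate.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_default_times(chol, hazards, rng):
    """One path of correlated default times under a Gaussian copula."""
    n = len(hazards)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]          # independent normals W
    x = [sum(chol[i][j] * w[j] for j in range(n))         # correlated normals X = C W
         for i in range(n)]
    # U_i = Phi(X_i) and tau_i = F_i^{-1}(U_i) = -ln(1 - U_i) / hazard_i.
    # Since 1 - Phi(x) = Phi(-x), the log is taken of Phi(-X_i) directly.
    return [-math.log(STD_NORMAL.cdf(-xi)) / h for xi, h in zip(x, hazards)]


def defaults_before_maturity(default_times, maturity):
    """Sort ascending and keep only defaults that occur before maturity."""
    return sorted(t for t in default_times if t <= maturity)


# Hypothetical two-name example: correlation 0.3, hazard rates 2% and 5%.
rho = 0.3
chol = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho * rho)]]
rng = random.Random(2007)
paths = [defaults_before_maturity(simulate_default_times(chol, [0.02, 0.05], rng), 5.0)
         for _ in range(1000)]
```

Each entry of `paths` is the sorted list of default times for one path, ready to be turned into fixed-leg and floating-leg cash flows.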
Introducing Cell/B.E. v1.0
Cell/B.E. is an accelerator extension to 64b Power
– Built on a Power ecosystem
– Used best known system practices for processor design
Sets a new performance standard
– Exploits parallelism while achieving high frequency
– Supercomputer attributes with extreme floating point capabilities
– Sustains high memory bandwidth with smart DMA controllers
Designed for natural human interaction
– Photo-realistic effects
– Predictable real-time response
– Virtualized resources for concurrent activities
Designed for flexibility
– Wide variety of application domains
– Highly abstracted to highly exploitable programming models
– Reconfigurable I/O interfaces
– Virtual trusted computing environment for security
Cell/B.E. is the chip powering the Sony PS3
– (Shipped in volume in the US in Nov ’06)
First Generation Cell/B.E.
90 nm
241M transistors
235 mm2
9 cores, 10 threads
>200 GFlops (SP)
>20 GFlops (DP)
Up to 25 GB/s memory B/W
Up to 75 GB/s I/O B/W
>300 GB/s EIB
Top frequency >4GHz (observed in lab)
Cell/B.E. Features
Heterogeneous multi-core system architecture
– Power Processor Element for control tasks
– Synergistic Processor Elements for data-intensive processing
Synergistic Processor Element (SPE) consists of
– Synergistic Processor Unit (SPU)
– Synergistic Memory Flow Control (MFC)
• Data movement and synchronization
• Interface to high-performance Element Interconnect Bus
[Figure: Cell/B.E. block diagram. Eight SPEs, each with an SPU (SXU), local store (LS), and MFC, connect at 16B/cycle to the EIB (up to 96B/cycle). The PPE (PPU with PXU, L1, and L2; 64-bit Power Architecture with VMX) also sits on the EIB. The MIC connects to Dual XDR memory and the BIC to FlexIO.]
Profiling results of the CDO pricing algorithm
[Figure: running time of various stages in CDO pricing, using 100 firms and 100,000 paths. Stages shown: Generate Normals, Cholesky Decomposition, Generate Correlated Random Numbers, Generate Default Times, Sorting, Calculate Payments, Sum Payments, Statistics.]
Computational complexity of various stages (N firms, p paths):
Generate Normals: O(Np)
Cholesky Decomposition: O(N^3)
Generate Correlated Random Numbers: O(N^2 p)
Generate Default Times: O(Np)
Sort: O(pN log N)
Calculate Payments: O(Np)
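To get a feel for the relative sizes of these terms, the snippet below plugs the profiled configuration (N = 100 firms, p = 100,000 paths) into the complexity expressions. These are order-of-magnitude operation counts with all constant factors ignored, not measured running times, and the dictionary keys are only labels.

```python
import math

N, p = 100, 100_000  # firms and Monte Carlo paths, as in the profiling run

costs = {
    "Generate normals": N * p,
    "Cholesky decomposition": N ** 3,
    "Generate correlated random numbers": N ** 2 * p,
    "Generate default times": N * p,
    "Sort": p * N * math.log2(N),
    "Calculate payments": N * p,
}

# Print stages from largest to smallest estimated operation count.
for stage, ops in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(f"{stage:38s} ~{ops:.1e}")
```

On these counts the O(N^2 p) matrix-vector step is the largest term, while the one-time O(N^3) Cholesky factorization is small by comparison.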
Random Numbers: Mersenne Twister
Astronomical period of 2^19937 − 1, suitable for Monte Carlo
Algorithm
– series of shift operations on x_{k+n} generates the output random number
2 different parallelization strategies
– Optimize for a single SPE, use different (random) seeds.
– Fine-grain parallelism for generating a single stream.
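For reference, here is a scalar sketch of the standard 32-bit MT19937 generator. It is not the authors' SPE-vectorized code. The state update follows the usual recurrence x_{k+n} = x_{k+m} XOR ((upper bit of x_k | lower 31 bits of x_{k+1}) A), and the output is produced by a short series of shift-and-mask (tempering) operations. The class name is hypothetical; the constants are those of the reference MT19937.

```python
N, M = 624, 397
MATRIX_A = 0x9908B0DF
UPPER_MASK = 0x80000000   # most significant bit
LOWER_MASK = 0x7FFFFFFF   # least significant 31 bits


class MT19937:
    def __init__(self, seed=5489):
        # Standard linear-congruential initialization of the 624-word state.
        self.state = [0] * N
        self.state[0] = seed & 0xFFFFFFFF
        for i in range(1, N):
            prev = self.state[i - 1]
            self.state[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = N

    def _twist(self):
        # Regenerate the whole state in place. Later entries read entries that
        # were already updated in this pass, which is the data dependence that
        # complicates vectorization.
        s = self.state
        for k in range(N):
            y = (s[k] & UPPER_MASK) | (s[(k + 1) % N] & LOWER_MASK)
            s[k] = s[(k + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)
        self.index = 0

    def next_u32(self):
        if self.index >= N:
            self._twist()
        y = self.state[self.index]
        self.index += 1
        # Tempering: shifts and masks applied to the state word.
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF
```

Running independent generators with different seeds corresponds to the first parallelization strategy listed above; the second would instead vectorize the loop inside `_twist`.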
Optimization for the SPE
N = 624, M = 397
Vector starting from location (i+1) or (i+M) may not be quadword aligned.
Computation of the latter part of the array requires updated data from the first M entries
– Data dependence
Normalized Random Numbers: Polar Method
1. Generate random numbers a & b
2. V1 ← 2a-1, V2 ← 2b-1
3. R ← V1^2 + V2^2
4. If R > 1, continue from STEP 1
- R1 ← sqrt(-2 log R / R)
- X ← V1 R1
- Y ← V2 R1
Optimization on SPE
– Use two random number vectors a & b
– Redo if condition fails for any pair of random numbers
• Overhead due to skipping of perfectly normal random numbers
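A scalar sketch of the polar method steps listed above. It uses Python's `random.Random` as the uniform source rather than the authors' generator, and it additionally rejects R = 0 to avoid taking log(0), a guard that is not on the slide. The function name is hypothetical.

```python
import math
import random


def polar_pair(rng):
    """Return two independent standard normal samples (X, Y)."""
    while True:
        v1 = 2.0 * rng.random() - 1.0   # uniform on [-1, 1)
        v2 = 2.0 * rng.random() - 1.0
        r = v1 * v1 + v2 * v2
        if 0.0 < r < 1.0:               # accept only points inside the unit circle
            r1 = math.sqrt(-2.0 * math.log(r) / r)
            return v1 * r1, v2 * r1


rng = random.Random(42)
samples = [z for _ in range(20000) for z in polar_pair(rng)]
mean = sum(samples) / len(samples)
variance = sum((z - mean) ** 2 for z in samples) / len(samples)
```

About 21% of candidate pairs fall outside the unit circle (1 - π/4). In a vectorized version that redoes a whole vector whenever any lane fails, valid pairs in the other lanes are discarded too, which is the overhead noted above.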
Performance Comparison of RNG (MT) with other architectures
Time (in seconds) to generate 100 million random numbers in sequential and block pattern on various architectures.
* Source: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/speed.html
Performance Comparison of RNG (MT) with other architectures
[Figure: bar chart, "Performance comparison of RNG (Mersenne Twister) on various architectures". Y-axis: Time (seconds). Series: Block, Sequential. X-axis: Intel_1.4, Intel_3.0, AMD_2.4, PPC_1.33, Cell.]
Performance compared with other Cell/B.E. implementations
[Figure: bar chart, "Performance comparison of our optimized RNG (Mersenne Twister) as compared with other Cell/B.E. implementations". Y-axis: Running Time (seconds). Bars: Another Cell RNG (MT), SDK RNG*, Our RNG.]
[Figure: bar chart, "Performance comparison of our optimized RNG (with Normalization) as compared with other Cell/B.E. implementations". Y-axis: Running Time (seconds). Series: 32-bit, 64-bit. Bars: Another Cell RNG w/N, Our RNG w/N.]
Time (in seconds) to generate 100 million normalized random numbers on a single SPE.
* Vectorized Random Number generation available with Cell SDK 2.1
Correlation Matrix: Cholesky Decomposition
Cholesky decomposition on correlation matrix
– C → L L^T, where L is an N×N lower triangular matrix
– Modified version of the Gauss algorithm
Initial optimized version for a single SPE
– Analyzing ways to further optimize and parallelize on the Cell.
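For readers unfamiliar with the factorization, here is a textbook column-by-column Cholesky decomposition in plain Python. It illustrates C = L L^T only and is not the SPE-optimized variant described on the slide. The function name and the 3x3 example matrix are hypothetical.

```python
import math


def cholesky(c):
    """Return lower-triangular L with L L^T = c, for a symmetric positive definite c."""
    n = len(c)
    lower = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of the columns already computed.
        pivot = c[j][j] - sum(lower[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        lower[j][j] = math.sqrt(pivot)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = (c[i][j] - partial) / lower[j][j]
    return lower


# Hypothetical 3x3 correlation matrix.
corr = [[1.0, 0.5, 0.2],
        [0.5, 1.0, 0.3],
        [0.2, 0.3, 1.0]]
L = cholesky(corr)
```

The factorization is done once per correlation matrix, so its O(N^3) cost is amortized over all Monte Carlo paths.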
Generating Correlated Random Numbers
Compute N (number of firms) normalized random numbers.
– Vector V[0 .. N-1].
Calculate V’ = LV, where L is a lower triangular matrix.
Cell Optimization:
– Branch mispredicts compromise performance for small N.
– 2 load instructions (6 cycles) for each madd (6 cycles), inefficient use of the even pipeline.
– Initial performance results.
Generating Correlated Random Numbers
Also working on utilizing the lower triangular property of the matrix L to achieve better performance.
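One way to use the lower triangular property is to skip the entries above the diagonal, which are known to be zero. The sketch below is a scalar illustration, not the authors' SPE code; it computes V' = LV and counts multiply-adds so the saving is visible. The function name is hypothetical.

```python
def lower_triangular_matvec(lower, v):
    """Compute V' = L V using only entries on or below the diagonal.

    Returns (result, number_of_multiply_adds).
    """
    n = len(v)
    result = [0.0] * n
    madds = 0
    for i in range(n):
        acc = 0.0
        for j in range(i + 1):          # columns j > i hold zeros, so skip them
            acc += lower[i][j] * v[j]
            madds += 1
        result[i] = acc
    return result, madds


out, madds = lower_triangular_matvec([[1.0, 0.0], [0.5, 2.0]], [2.0, 3.0])
# For N = 100 firms this needs N(N+1)/2 = 5050 multiply-adds instead of N^2 = 10000.
```

The row lengths now vary from 1 to N, so a vectorized version has to handle short and unaligned rows, which may matter for small N.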
Conclusions
CDO pricing is computationally intensive rather than communication intensive.
We use Monte Carlo simulation
– Highly scalable across SPEs
Initial performance results
– Show substantial speedup for Mersenne Twister and Normalization as compared to other architectures
– Initial results for Cholesky decomposition and generating correlated random numbers.
– Cell is a good fit for financial workloads.
Double precision is essential for FSS workloads
Thank you
Questions?