May 29, 2003 74.757 Final Presentation Sajib Barua
Development of a Parallel Fast Fourier Transform Algorithm for Derivative Pricing Using MPI
Sajib Barua, CS Dept., University of Manitoba
Course Professor: Dr. Ruppa K. Thulasiram
Outline
Introduction
Background and related work
Problem Statement
Solution Strategy and Implementation
Experimental Platform
Results
Conclusions and Future Work
Introduction
Options are a common class of derivatives.
Two basic types of option:
– Call option
– Put option
Two types of option depending on the exercise time:
– American option
– European option
Determining the optimum exercise policy is a key issue in pricing American options to achieve maximum profit.
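As a small illustration of the two basic option types, the payoffs at exercise can be sketched as follows. The function and parameter names are illustrative assumptions and do not come from the presentation.

```python
def call_payoff(spot, strike):
    """Payoff of a call at exercise: the right to buy at the strike price."""
    return max(spot - strike, 0.0)


def put_payoff(spot, strike):
    """Payoff of a put at exercise: the right to sell at the strike price."""
    return max(strike - spot, 0.0)
```

A call finishes in-the-money when the spot price is above the strike, and a put when it is below.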
Methods used for option pricing
Binomial lattice method
Monte Carlo approach
Finite-difference method
Fast Fourier transform (FFT)
Finite-element method
FFT has recently been used to study multifactor models for option pricing problems, and it is highly suitable for parallel computing.
Parallel FFT Algorithm
Most common parallel FFT algorithms:
– Cooley-Tukey algorithm
– Gentleman-Sande algorithm
Two schemes for the FFT algorithm:
– Recursive scheme
– Iterative scheme
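A minimal sketch of the recursive scheme, assuming a radix-2 Cooley-Tukey decimation in time and the sign convention w = e^(-2πi/n). It is written in Python for brevity and is not the C/MPI code used in the project; the function name is hypothetical.

```python
import cmath


def fft_recursive(x):
    """Radix-2 recursive FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_recursive(x[0::2])  # transform of the even-indexed samples
    odd = fft_recursive(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

The iterative scheme performs the same butterflies stage by stage without recursion, which makes the data movement between stages explicit.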
Problem Statement
There are two distinct features in this project:
– Mathematical treatment of the option pricing problem
– Its computation
Design, development, and implementation of an FFT algorithm with improved data locality.
Use of this algorithm in the option pricing problem with an appropriate mapping.
Challenges
Computation:
– In a parallel system, two types of latency are incurred:
• Communication latency
• Synchronization latency
– FFT is inherently a synchronous algorithm.
– Communication latency is reduced by providing good data locality.
Mathematical treatment:
– Applying an appropriate risk-neutral probability of finishing in-the-money
– Obtaining the Fourier transform of the call price function with a known risk-neutral density function
– etc.
Solution Strategy and Implementation
Parallel FFT Algorithm
FFT in Option Pricing
A New Parallel FFT Algorithm
How can the FFT equation be parallelized?
A sequence X of length n:
– X = <X[0], X[1], …, X[n-1]>
The DFT of this sequence is:
– Y = <Y[0], Y[1], …, Y[n-1]>
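The transform that maps X to Y is not reproduced in the transcript. Assuming the usual definition Y[j] = Σ_k X[k]·w^(jk) with w = e^(-2πi/n) (the sign of the exponent is an assumption), a direct Θ(n²) evaluation can be sketched as follows. The function name is hypothetical.

```python
import cmath


def dft(x):
    """Direct evaluation of the DFT definition; costs Theta(n^2) operations."""
    n = len(x)
    if n == 0:
        return []
    w = cmath.exp(-2j * cmath.pi / n)  # primitive n-th root of unity (assumed sign)
    return [sum(x[k] * w ** (j * k) for k in range(n)) for j in range(n)]
```

Each output Y[j] depends on every input X[k], which is why the way data is distributed across processors matters so much for a parallel version.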
Contd.
Overall sequential complexity is Θ(n log n).
The FFT calculation can be parallelized.
A parallel FFT calculation requires log N stages.
A parallel FFT algorithm with block data distribution of N data points over P processors requires:
– (log N - log P) iterations of local calculation, called the local algorithm
– log P stages of communication among processors, called the remote algorithm
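The split between local and remote stages can be checked with a few lines of arithmetic. This is a sketch that assumes N and P are powers of two with P ≤ N; the function name is hypothetical.

```python
def stage_split(n_points, n_procs):
    """Return (local_stages, remote_stages) for a block-distributed FFT."""
    for value in (n_points, n_procs):
        if value < 1 or value & (value - 1):
            raise ValueError("sizes must be powers of two")
    if n_procs > n_points:
        raise ValueError("need at least one data point per processor")
    log_n = n_points.bit_length() - 1  # log2 N
    log_p = n_procs.bit_length() - 1   # log2 P
    return log_n - log_p, log_p
```

For example, `stage_split(2**20, 8)` gives 17 local stages and 3 remote stages, so only 3 of the 20 stages involve communication.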
Fourier Analysis in Option Pricing
Call value is a function of the log of its strike price.
This function is not square integrable, so its Fourier transform cannot be calculated.
To get an analytically solvable Fourier transform, this function is multiplied by an exponential to make it square integrable.
The call price function is then written in terms of Ψ_T(v), the Fourier transform of the call price.
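The slide's own equation is not in the transcript. In the standard Carr-Madan formulation, which the wording on this slide appears to follow, the dampening and inversion take the form below. The symbols C_T (call price), c_T (dampened call price), α (dampening coefficient), and the sign conventions are assumptions taken from that formulation and may differ from the presenter's.

```latex
\begin{align}
c_T(k) &= e^{\alpha k}\, C_T(k), \qquad \alpha > 0 \\
\Psi_T(v) &= \int_{-\infty}^{\infty} e^{ivk}\, c_T(k)\, dk \\
C_T(k) &= \frac{e^{-\alpha k}}{2\pi} \int_{-\infty}^{\infty} e^{-ivk}\, \Psi_T(v)\, dv
\end{align}
```

Here k is the log of the strike price, and Ψ_T is the transform of the dampened price c_T rather than of C_T itself.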
Contd.
With M = e^{αk}/(2π) and ω = e^{i}, the call price equation is rewritten in terms of M and ω.
The discrete form of this equation is a DFT, which can be parallelized.
The call price can be calculated accurately and efficiently with a good parallel FFT algorithm and an efficient data distribution.
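To see why the discretized pricing integral is a DFT, the sketch below assumes a Carr-Madan style grid: v_j = η·j for the integration variable and k_u = -b + λ·u for the log-strike, with λ·η = 2π/N. The grid, the sign of the exponent, and all names are assumptions; the prefactor M is left out, and `toy_psi` is a placeholder integrand rather than the transform of any real pricing model.

```python
import cmath
import math


def direct_sum(psi, n, eta, b, u):
    """Riemann sum of exp(-i*v*k_u) * psi(v) * eta over the grid v_j = eta * j."""
    lam = 2 * math.pi / (n * eta)  # log-strike spacing tied to eta
    k_u = -b + lam * u
    return sum(cmath.exp(-1j * (eta * j) * k_u) * psi(eta * j) * eta
               for j in range(n))


def dft_form(psi, n, eta, b, u):
    """The same sum written as entry u of an n-point DFT."""
    return sum(
        cmath.exp(-2j * math.pi * j * u / n)                 # DFT kernel
        * cmath.exp(1j * b * eta * j) * psi(eta * j) * eta   # DFT input x_j
        for j in range(n)
    )


def toy_psi(v):
    # Placeholder integrand for illustration only.
    return 1.0 / (1.0 + v * v)
```

The two functions agree because v_j·k_u = -b·v_j + 2π·j·u/N on this grid, so all N log-strikes can be priced with a single FFT.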
A New Parallel FFT Algorithm
Improving Data Locality Approach
Cooley-Tukey Algorithm
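[Figure: 8-point Cooley-Tukey butterfly network over Iteration 0, Iteration 1, and Iteration 2, distributed over processors P0 and P1. Inputs in the order x0, x4, x2, x6, x1, x5, x3, x7; outputs y0 to y7. Twiddle factors: w8^0 in Iteration 0; w8^0 and w8^2 in Iteration 1; w8^0, w8^1, w8^2, w8^3 in Iteration 2.]
Butterfly Operation: inputs a and b with twiddle factor w produce a + bw and a - bw.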
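The butterfly network in the figure corresponds to an iterative radix-2 FFT with bit-reversed input order (x0, x4, x2, x6, x1, x5, x3, x7 for eight points). A sequential sketch follows, assuming w_n = e^(-2πi/n). It is a Python illustration and not the C/MPI implementation from the project, and the function names are hypothetical.

```python
import cmath


def bit_reverse_permute(x):
    """Reorder x so that element idx moves to the bit-reversed index."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for idx, value in enumerate(x):
        rev = int(format(idx, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = value
    return out


def fft_iterative(x):
    """Iterative Cooley-Tukey FFT: log2(n) iterations of butterflies."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    y = bit_reverse_permute([complex(v) for v in x])
    half = 1  # distance between butterfly partners in the current iteration
    while half < n:
        step = cmath.exp(-1j * cmath.pi / half)  # twiddle increment
        for start in range(0, n, 2 * half):
            w = 1 + 0j
            for k in range(half):
                a = y[start + k]
                b = y[start + k + half]
                y[start + k] = a + b * w         # butterfly: a + bw
                y[start + k + half] = a - b * w  # butterfly: a - bw
                w *= step
        half *= 2
    return y
```

The partner distance doubles every iteration, so under a block distribution the later iterations pair elements that live on different processors.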
Parallel FFT Algorithm With Improved Data Locality
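[Figure: butterfly network for 8 nodes on processors P0 and P1. Node order in each iteration, with butterfly partners adjacent:]
Iteration 0: 0 1 2 3 4 5 6 7
Iteration 1: 0 2 1 3 4 6 5 7
Iteration 2: 0 4 1 5 2 6 3 7
Iteration number = i, 1st node index = j, 2nd node index = 2^i + j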
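The pairing rule (first node j, second node 2^i + j) can be sketched as follows. The sketch assumes j ranges over the indices whose bit i is 0, which reproduces the node orders shown for each iteration. The function names are hypothetical, and the assignment of pairs to processors is not modelled.

```python
def butterfly_pairs(n_nodes, iteration):
    """List the (first, second) node pairs combined in the given iteration."""
    stride = 1 << iteration  # 2**i
    if n_nodes < 2 or n_nodes & (n_nodes - 1) or 2 * stride > n_nodes:
        raise ValueError("n_nodes must be a power of two with 2**(i+1) <= n_nodes")
    # j runs over the indices whose bit i is clear; its partner has that bit set.
    return [(j, j + stride) for j in range(n_nodes) if not j & stride]


def node_order(n_nodes, iteration):
    """Flatten the pairs so that butterfly partners sit next to each other."""
    return [node for pair in butterfly_pairs(n_nodes, iteration) for node in pair]


# node_order(8, 1) -> [0, 2, 1, 3, 4, 6, 5, 7]
# node_order(8, 2) -> [0, 4, 1, 5, 2, 6, 3, 7]
```

Placing partners next to each other is what lets a block of consecutive positions be processed on one processor.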
Parallel FFT Algorithm With 16 Nodes
[Figure: butterfly network for 16 nodes on processors p0, p1, p2, and p3. Node order in each iteration, with butterfly partners adjacent:]
Iteration 0: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Iteration 1: 0 2 1 3 4 6 5 7 8 10 9 11 12 14 13 15
Iteration 2: 0 4 1 5 2 6 3 7 8 12 9 13 10 14 11 15
Iteration 3: 0 8 1 9 2 10 3 11 4 12 5 13 6 14 7 15
Experimental Testbed or Environment
Programming language:
– Unix C
– Message Passing Interface (MPI)
Single processor
Beowulf cluster (distributed architecture):
– 2 processors
– 4 processors
– 8 processors
Performance results with a varying number of processors
[Figure: execution time (in seconds) versus number of processors, one series per data size from 2^12 to 2^20.]
Execution time (in seconds) by data size and number of processors:
Data size   1            2            4            8
2^12        0.02535093   0.02060401   0.01717699   0.01517296
2^13        0.05539596   0.04329801   0.02915406   0.02441597
2^15        0.2773631    0.183461     0.1334109    0.10645
2^16        0.615428     0.392648     0.265748     0.2157681
2^17        1.332534     0.915802     0.5701129    0.4130991
2^18        2.87687      1.789844     1.19493      0.8592221
2^19        6.18839      3.80007      2.502013     1.786862
2^20        11.83851     7.009985     3.72315
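Speedup and parallel efficiency are not reported on the slide, but they follow directly from the execution times in the table. The sketch below uses the standard definitions speedup = T(1)/T(P) and efficiency = speedup/P. These metrics and the names in the code are additions for illustration. The 2^20 row is left out because only three of its four values appear in the transcript and their columns cannot be assigned with certainty.

```python
# Execution times in seconds for 1, 2, 4, and 8 processors, copied from the table.
PROCESSORS = (1, 2, 4, 8)
TIMES = {
    "2^12": (0.02535093, 0.02060401, 0.01717699, 0.01517296),
    "2^13": (0.05539596, 0.04329801, 0.02915406, 0.02441597),
    "2^15": (0.2773631, 0.183461, 0.1334109, 0.10645),
    "2^16": (0.615428, 0.392648, 0.265748, 0.2157681),
    "2^17": (1.332534, 0.915802, 0.5701129, 0.4130991),
    "2^18": (2.87687, 1.789844, 1.19493, 0.8592221),
    "2^19": (6.18839, 3.80007, 2.502013, 1.786862),
}


def speedup_and_efficiency(times, processors=PROCESSORS):
    """Return (processors, speedup, efficiency) tuples relative to one processor."""
    base = times[0]
    return [(p, base / t, base / (t * p)) for p, t in zip(processors, times)]
```

Calling `speedup_and_efficiency(TIMES["2^19"])` shows how the speedup on eight processors compares with the ideal value of 8 for the largest complete data set.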
Performance results with varying data size
[Figure: execution time (in seconds) versus data size (2^12 to 2^20), one series each for 1, 2, 4, and 8 processors.]
Comparison of Cooley-Tukey and Parallel FFT Algorithm
[Figure: execution time (in seconds) versus data size (2^12 to 2^20) for the Parallel FFT and Cooley-Tukey algorithms, with 8 processors.]
Contribution:
1. A new parallel FFT algorithm (with improved data locality)
– Reduces communication overhead
– Performance is more than 15% better than the Cooley-Tukey algorithm with larger data sets
2. Mathematical contribution
– Applying an appropriate risk-neutral density function
– Specification of the dampening coefficient
– Mathematical derivation of the dampened call price function and its Fourier transform
Contd.
Rapid solution and quick information processing
Fast algorithm with high-performance computing capability
Better performance and accuracy
Market forecasting
FFT is a data- and communication-intensive problem.
Implementation of FFT is a serious challenge in scientific computing.
Future Work
Better risk-neutral density function
For now, the results are presented only in terms of the performance improvement of the FFT algorithm. The results need to be presented in terms of the option prices.
Implementation of our algorithm on a shared-memory architecture (OpenMP)
More efficient node distribution technique and tuning of the parallel FFT algorithm