Case Studies in Acceleration of
Heston’s Stochastic Volatility
Financial Engineering Model:
GPU, Cloud and FPGA Implementations
by
Christos Delivorias
Supervised by
Dr. Peter Richtárik
and Martin Takáč
Dissertation presented for the Degree of
MSc in Operational Research
The School of Mathematics
August 2012
I want to thank first and foremost Mr. Erik Vynckier of Scottish Widows Investment Partnership (SWIP) in Edinburgh, both for giving me the opportunity to work on this project and for his support and guidance throughout it. He has been an invaluable mentor to me.
I also would like to thank my academic supervisor Dr. Peter Richtárik for his help, and for believing in me and suggesting me for this project. I especially want to thank Mr. Martin Takáč in his role as co-supervisor. His experience with aspects of GPU parallelisation has been invaluable, as has his instant availability to answer my questions. Many aspects of this thesis might have been left uncompleted without his assistance and guidance.
I would also like to thank Messrs Robin Bruce and Steven Hutt at Maxeler Technologies for their hospitality in London and their support in getting to grips with Dataflow processing. On the same note, I'd like to acknowledge the help of Messrs Tuomas Eerola and Rainer Wehkamp at Techila Technologies for extending to myself and SWIP the use of their grid processing platform. Acknowledgments also go out to Mr. John Holden for providing a trial license for the Numerical Algorithms Group (NAG) libraries for Matlab to use in this project. Similarly, to Mr. John Ashley at NVIDIA Corp for his input regarding the physical setup of the GPU server, to Mathworks for providing a trial license for Matlab to run on the GPU server, and to Boston Limited for providing the GPU test platform.
Finally, and most importantly for myself, I want to thank my girlfriend and partner in life Eugenia Metzaki. My love, I would not have made it this far without your help, your support, and your faith in me.
Thank you.
Declaration
I declare that this thesis was composed by myself and that the work contained therein is my own, except where explicitly stated otherwise in the text.
(Christos Delivorias)
Abstract
Here we present a comparative insight into the performance of the Heston stochastic volatility model on different acceleration platforms. The model was tested on a MacBook's CPU, a Techila grid server hosted on Microsoft's Azure cloud, a GPU node hosted by Boston Ltd, and an FPGA node hosted by Maxeler Technologies Ltd. Timing data was collected and compared against the CPU baseline to quantify the acceleration benefits of each platform.
Table of Contents

List of Tables
List of Figures
Acronyms and Abbreviations

Chapter 1 Introduction
1.1 Motivation
1.2 Outline of the Thesis

Chapter 2 Option Pricing Theory
2.1 Arbitrage-free and complete market
2.2 Risk-neutral valuation
2.3 Feynman-Kac theorem
2.4 Black & Scholes (B&S) model

Chapter 3 Underlying Price Volatility
3.1 Stochastic Volatility
3.2 Heston's Model
3.2.1 Cox-Ingersoll-Ross Process
3.2.2 Heston Partial Differential Equation (PDE)
3.2.3 Heston's closed-form analytical solution
3.2.4 Double Heston Model
3.3 Bates' Model
3.3.1 Affine Jump Diffusion Functions

Chapter 4 Monte Carlo Simulation
4.1 Introduction
4.2 Simple Monte Carlo (MC) algorithm
4.3 Variance reduction
4.3.1 Control variate technique
4.3.2 Antithetic variates technique
4.3.3 Quasi-random simulation
4.4 Discretisation Schemes for Stochastic Differential Equations
4.4.1 Euler-Maruyama scheme
4.4.2 Milstein scheme
4.4.3 Kahl-Jäckel scheme
4.4.4 Broadie-Kaya exact calculation scheme
4.4.5 Quadratic Exponential (QE) scheme
4.5 Generating random realisations of variates
4.5.1 General considerations
4.5.2 Sampling methods
4.5.3 Uniform random samples
4.5.4 Normal random samples

Chapter 5 Dataflow programming on Field Programmable Gate Arrays (FPGAs)
5.1 Applications of FPGAs in computational finance
5.2 FPGA versus Intel multicore
5.3 Scope for FPGAs
5.4 Application to the Heston model

Chapter 6 Implementation on Graphical Processor Units (GPUs)
6.1 Using GPUs to solve large parallel problems
6.2 MATLAB implementation of the Heston model for the General Purpose computing on GPU (GPGPU)

Chapter 7 Implementation on the Techila cloud
7.1 The Techila cloud platform
7.2 Description of platform implementation

Chapter 8 Efficiency and Accuracy
8.1 Performance of Monte Carlo Simulation
8.2 Accuracy of Simulation
8.2.1 Bias and Variance of Results
8.2.2 Standard Error

Chapter 9 Results and conclusions
9.1 Acceleration results
9.2 Research Outcomes
9.3 Future Work

Appendix A Code
List of Tables

6.1 GPGPU specifications for the dual Kepler cards
6.2 GPGPU specifications from NVIDIA's website
9.1 Acceleration results for all platforms
List of Figures

3.1 Prices of 3 indices from 1990 to 2012
3.2 GARCH volatility of 3 indices from 1990 to 2012
3.3 A plot of the Cox-Ingersoll-Ross process
4.1 Monte Carlo (MC) simulation procedure
4.2 Sobol sequence for quasi-random variates
4.3 Niederreiter sequence for quasi-random variates
4.4 Faure sequence for quasi-random variates
5.1 Central Processing Unit (CPU) Architecture
5.2 The FPGA architecture
5.3 Code use with Maxeler Technologies
6.1 Memory space allocation on a GPU
6.2 Schematic of the multiple cores in the NVIDIA Fermi GPU
7.1 Parallel problem architecture
7.2 Embarrassingly parallel problem architecture
7.3 Techila Architecture
8.1 Matlab figures for the Monte Carlo simulation of the Heston model
8.2 Call option prices with multiple models
8.3 Monte Carlo standard error
9.1 Comparison of accelerations
List of Algorithms

1 MC Simulation(N)
2 Monte Carlo Simulation with control variate(t, N)
3 Monte Carlo Simulation with antithetic variates(N)
4 Quadratic Exponential Variance Reduction
Acronyms and Abbreviations

AJD Affine Jump Diffusion
ALU Arithmetic Logic Unit
ATM At the Money
B&S Black & Scholes
BLAST Basic Local Alignment Search Tool
CI Confidence Interval
CIR Cox-Ingersoll-Ross
CPU Central Processing Unit
CUDA Compute Unified Device Architecture
DoF Degrees of Freedom
FPGA Field Programmable Gate Array
GARCH Generalized AutoRegressive Conditional Heteroskedasticity
GBM Geometric Brownian Motion
GPGPU General Purpose computing on GPU
GPU Graphical Processor Unit
HDL Hardware Description Language
HFT High Frequency Trading
HW Half-Width
i.i.d. independent and identically distributed
KS Kolmogorov-Smirnov
LCG Linear Congruential Generator
MC Monte Carlo
MCMC Markov Chain Monte Carlo
MT Mersenne Twister
NAG Numerical Algorithms Group
OS Operating System
PCIe Peripheral Component Interconnect express
PCT Parallel Computing Toolbox
PDE Partial Differential Equation
PDF Probability Density Function
PRNG Pseudo-Random Number Generator
QE Quadratic Exponential
RV Random Variable
SDE Stochastic Differential Equation
TRNG True-Random Number Generator
Chapter 1
Introduction
The purpose of this thesis is to explore different ways to accelerate a certain financial model, specifically the Heston stochastic volatility model. The platforms examined are, at the present time, the cutting edge in the parallelisation of algorithms. For me this was completely novel and uncharted territory. Coming from a background in Computer Science, and having taken a risk-management perspective in the MSc in Operational Research, this project combined aspects of financial mathematics that challenged me to come to grips with them quickly. It has been a steep learning curve, but I've thoroughly enjoyed mastering this new domain of knowledge.
1.1 Motivation
For a long time now, even before the 2008 financial crisis, the derivatives sector has been in need of a fast and efficient way to calculate the prices of options given certain market data. The use of this is twofold: first, the ability to price an option on a given underlying is important in a fast-paced market environment. The second use is in a relatively new sector of the market that deals with High Frequency Trading (HFT). This sector relies on extremely fast computations in order to make algorithmic decisions based on the current market status. This thesis will focus on the pricing of options rather than the aspects of HFT. The informed reader may wish to look into some of the negative aspects of this practice as explained by Easley et al. [ELO].
This joint project between the University of Edinburgh and Scottish Widows Investment Partnership aims to explore the possibilities in accelerating financial engineering models, with real-life applications in the markets. The goal of this thesis was to test the same model on different platforms and assess the benefits of acceleration that each platform provided.
1.2 Outline of the Thesis
Chapter 2 introduces the pricing theory behind options and states some of the shortcomings of earlier models. This chapter links Stochastic Differential Equations (SDEs) with Partial Differential Equations (PDEs). The introduction of stochastic volatility to combat the issues raised in Chapter 2 is addressed in Chapter 3. Chapter 4 explains the theory and the implementation behind evaluating an SDE using a Monte Carlo simulation. Chapters 5, 6, and 7 go into more detail on the acceleration platforms of Field Programmable Gate Arrays (FPGAs), Graphical Processor Units (GPUs), and the Techila Cloud, respectively. Chapter 8 details the efficiency and the accuracy of the implementation of the Heston model in Matlab, and finally Chapter 9 presents the experimental results and concludes the thesis.
Chapter 2
Option Pricing Theory
This chapter will touch upon the fundamentals of arbitrage theory in pricing the fair value of an option. An option provides the bearer with the opportunity, but not the obligation, to sell (put) or buy (call) the underlying at a fixed price (the strike). If the option is not exercised nothing happens, aside from the loss of the premium paid for the option. There are currently two ways to achieve this pricing: one is via the option's characteristic PDE and the other is Monte Carlo simulation.
The PDE approach will be presented in Section 2.4 and will be used to analytically evaluate the price of the Heston model in Section 3.2. The risk-neutral approach of Section 2.2 will be used to numerically simulate the price of the underlying with a Monte Carlo simulation, as presented in Chapter 4.
2.1 Arbitrage-free and complete market

Arbitrage is the practice of identifying and taking advantage of mispriced assets that are trading in multiple exchanges. In effect this is taking advantage of someone selling an asset for a lower price than another party is willing to buy it for, and vice versa. By acting in between the price spread, the arbitrageurs can make a profit. The act of arbitrage itself changes the underlying's value and thus pushes it towards an equilibrium. An arbitrage-free market is one where such imbalances do not exist. This idea is crucial when performing the risk-neutral valuation of an option.
Given a portfolio with M underlying assets, excluding the risk-free asset, and R the number of sources of randomness, it is possible to define a "meta-theorem" to determine if the portfolio represents a complete and arbitrage-free market. As stated in [Bj9, p.122], we can define the following rule of thumb.

Theorem 2.1. Let $M$ denote the number of underlying traded assets in the model excluding the risk-free asset, and let $R$ denote the number of random sources. Generically we then have the following relations:
1. The model is arbitrage free if and only if $M \leq R$.
2. The model is complete if and only if $M \geq R$.
3. The model is complete and arbitrage free if and only if $M = R$.
Remark. The practical implication of a complete market is that options are hedgeable and can therefore be priced. On the other hand, an incomplete market is one where derivative securities cannot be perfectly hedged, and the price of an option cannot be derived from the prices of its underlying assets.
For the purposes of this thesis we will assume a frictionless market, with no taxes or transaction costs and premiums. For simplicity of the modelling we will also assume a non-dividend-paying underlying. We will proceed to examine the relationship between two assets: the option and the underlying.
2.2 Risk-neutral valuation

The idea of risk-neutral valuation states that a derivative's price is calculated as the expectation of the discounted price at maturity, under a risk-neutral measure. What this implies is that the discounted option processes are indeed Martingales under the risk-neutral measure in a no-arbitrage market. In this case the value of the derivative is the expected value at maturity of the payoff, and the expected return of the underlying asset equals the risk-free rate.
2.3 Feynman-Kac theorem

The Feynman-Kac theorem is fundamental to financial modelling as it links the PDE to the SDE domain. What follows is a brief presentation of the theorem. The theorem is explained in detail in [Kle05].

Theorem 2.2. Let $x_t$ follow the SDE
$$dx_t = \mu(x_t,t)\,dt + \sigma(x_t,t)\,dW_t^Q, \qquad (2.3.1)$$
where $W_t^Q$ is a Wiener process (Brownian motion) under the probability measure $Q$. Now let $f(x_t,t)$ be a differentiable function of $x_t$ and $t$ that follows the PDE
$$\frac{\partial f}{\partial t} + \mu(x_t,t)\frac{\partial f}{\partial x} + \frac{1}{2}\sigma(x_t,t)^2\frac{\partial^2 f}{\partial x^2} - r(x_t,t)\,f(x_t,t) = 0,$$
subject to the condition at the boundary $f(X_T,T)$. Under the provisions of the theorem, $f$ has the solution
$$f(x_t,t) = E^Q\!\left[\, e^{-\int_t^T r(x_u,u)\,du}\, f(X_T,T) \,\Big|\, \mathcal{F}_t \right]. \qquad (2.3.2)$$
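As a quick illustration (not spelled out in the source), taking $\mu(x_t,t) = r x_t$ and $\sigma(x_t,t) = \sigma x_t$ with a constant discount rate $r$ in Theorem 2.2 recovers the Black & Scholes equation of the next section:
$$\frac{\partial f}{\partial t} + r x\frac{\partial f}{\partial x} + \frac{1}{2}\sigma^2 x^2\frac{\partial^2 f}{\partial x^2} - r f = 0.$$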
2.4 Black & Scholes (B&S) model
The B&S model rests on the assumption that the price of an asset follows a GBM,
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t, \qquad (2.4.1)$$
where the drift $\mu$ and the volatility $\sigma$ are assumed to be constant. This means that the asset prices are lognormally distributed. This, however, is at odds with recent empirical observations, which show volatility to be non-Gaussian and in fact stochastic.
The price $C_T$ of a vanilla call option according to B&S is
$$C_T = e^{-rT}\, E\!\left[\left(S_0\,e^{(r-\frac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,N(0,1)} - K\right)^+\right], \qquad (2.4.2)$$
where $r$ is the risk-free interest rate, $T$ the time to maturity, $K$ the strike price, $S_0$ the spot price at $t = 0$, $\sigma$ the volatility, and $N(0,1)$ a Gaussian Random Variable (RV); in this example the payoff $f$ of a call option is
$$f = \left(S_0\,e^{(r-\frac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,N(0,1)} - K\right)^+.$$
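For reference, the expectation in Equation (2.4.2) has the familiar closed-form evaluation via the standard normal CDF. A minimal Matlab/Octave sketch (the function name is an illustrative assumption, not part of the thesis listings):

% Black & Scholes closed-form price of a European vanilla call.
% S0: spot, K: strike, r: risk-free rate, sigma: volatility, T: maturity.
function C = bs_call(S0, K, r, sigma, T)
  d1 = (log(S0/K) + (r + 0.5*sigma^2)*T) / (sigma*sqrt(T));
  d2 = d1 - sigma*sqrt(T);
  Phi = @(x) 0.5*erfc(-x/sqrt(2));   % standard normal CDF via erfc
  C = S0*Phi(d1) - K*exp(-r*T)*Phi(d2);
end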
Chapter 3
Underlying Price Volatility
For many years the B&S model was the de facto method for pricing derivative securities. The purpose of the B&S model was to create a risk-neutral asset that could be perfectly hedged against market volatility. In their seminal paper, Fischer Black and Myron Scholes [BS73] described the partial differential equation that governs the price of a derivative over time. The key insight of this approach was to hedge the security by buying or selling the underlying at a certain ratio, calculated from the B&S equation. This kind of strategy is also referred to as dynamic delta hedging.
However, more recent criticism [HT09] has highlighted the weaknesses and limitations of the model. One of the most prominent remarks is the incapacity of the Gaussian distribution to properly address the probability of events in the tails of the distribution. Suggestions have been made to use the Student's t-distribution, which is a fat-tailed distribution for low Degrees of Freedom (DoF).
3.1 Stochastic Volatility

As is suggested in [Kwo08], there is an apparent clustering of volatility with regard to an asset's value. The clustering is characterised by a mean-reverting behaviour of the volatility. The assumption made by B&S was that of a constant volatility. However, empirical evidence has shown that the volatility of an asset price is not described very well by the Normal distribution, when in fact the volatility follows a leptokurtic distribution¹. Figure 3.1 shows the prices of 3 indices (FTSE100, S&P500, and NIKKEI225) from 1990 until today. Figure 3.2 shows the volatility of each index modelled with GARCH(1,1)² since 1990. It is apparent that the volatility is not normally distributed.

[Figure 3.1: The FTSE100, S&P500, and NIKKEI225 index values from 1990 to 2012. The Matlab code for these plots can be found in Listing A.3.]

Since volatility drives the asset price, it became apparent that a more realistic model was required to represent volatility. Heston [Hes93] showed that the B&S model could be extended by taking into account a stochastically moving volatility.

¹ Essentially the distribution has higher peaks around the mean and fatter tails than a Gaussian distribution.
² The Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) model aims to describe more accurately the phenomenon of volatility clustering and related effects such as kurtosis. Volatility clustering is the effect that large changes in prices tend to be followed by large changes, of either sign, and vice versa.
[Figure 3.2: The corresponding conditional variance (in red), calculated with GARCH(1,1), overlaid on the daily returns of each index (FTSE100, S&P500, NIKKEI225). The Matlab code for these plots can be found in Listing A.3.]
3.2 Heston's Model

The Heston model extends the B&S model by taking into account a stochastic volatility that is mean-reverting and is also correlated with the asset price. It assumes that both the asset price and its variance are determined by a joint stochastic process.

Definition 3.1. The Heston model that defines the price of the underlying $S$ and its stochastic variance $v$ is given by
$$dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S,$$
$$dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_t^v,$$
$$\mathrm{Cov}\langle dW_t^S,\, dW_t^v \rangle = \rho\,dt, \qquad (3.2.1)$$
where the two Brownian motions $(W^S, W^v)$ are correlated by a coefficient $\rho$, $\kappa$ is the rate of reversion of the variance to $\theta$ – the long-term variance –, $\sigma$ is the volatility of variance, and $\mu$ is the rate of return of the asset.

The Heston model is a superior choice to the B&S model since it provides a model that has a dynamic stochastic volatility, as described in Equation (3.2.1). This model has a semi-analytic formula that can be exploited to derive an integral solution, which is described in §3.2.2. Additionally, if the Feller condition [Fel51] is upheld, the process will only produce strictly positive variance with probability 1; this is described in Lemma 3.2.

Lemma 3.2. If the parameters of the Heston model obey the condition
$$2\kappa\theta \geq \sigma^2,$$
then the stochastic process will produce variance such that $\Pr(v_t > 0) = 1$, since the upward drift is large enough to strongly reflect away from the origin.
3.2.1 Cox-Ingersoll-Ross Process

Cox et al. [CIR85] described a Markov process with continuous paths defined by the SDE
$$dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_t^v, \qquad (3.2.2)$$
where $\theta$ is the equilibrium supported by the fundamentals, $\kappa$ is the rate at which the shocks dissipate and the variance returns to $\theta$, $\sigma$ is the degree of volatility around it caused by shocks, and $dW_t$ is a Brownian motion. Figure 3.3 shows the plot of a Cox-Ingersoll-Ross (CIR) process.
This process is mean-reverting, ergodic³ and can be thought of as an Ornstein-Uhlenbeck⁴ process [see [Gil96] for more on implementation, and the Matlab code in Listing A.2].
The probability density function of the variance at time $s$ in the future, conditional on its value at the current time $t \leq s$, as described in [CIR85, p.391], is
$$f(v_s, s;\, v_t, t) = c\, e^{-u-\nu} \left(\frac{\nu}{u}\right)^{q/2} I_q\!\left(2(u\nu)^{1/2}\right),$$

³ An ergodic process is one whose statistical properties, e.g. mean and variance, can be deduced from a single, sufficiently long realisation of the process.
⁴ Describes the velocity of a Brownian particle under friction. It is stationary, Gaussian, Markovian, and tends to drift towards its long-term mean.
[Figure 3.3: Two CIR processes for 5 periods, with different sampling distributions: Euler discretisation sampling from $N(0,1)$ (top) and from the terminal non-central $\chi^2$ distribution (bottom), for 1 year over 250 business days, with $dt = 1$, mean-reversion rate $\kappa = 3.0$, long-term mean $\theta = 1$, and a volatility of $\sigma = 0.6$. The process starts at $S(0) = 5$, reverting to a long-term mean of 1 (the mean is marked in the figure with a horizontal black dashed line). Note how the top process breaches the lower barrier of 0 and becomes negative, while the second one is highly reflective at 0.]
where
$$c = \frac{2\kappa}{\sigma^2\left(1 - e^{-\kappa(s-t)}\right)}, \qquad u = c\,v_t\,e^{-\kappa(s-t)}, \qquad \nu = c\,v_s, \qquad q = \frac{2\kappa\theta}{\sigma^2} - 1,$$
and $I_q(\cdot)$ is the modified Bessel function of the first kind of order $q$. This conditional distribution function is a non-central $\chi^2$, i.e. $\chi^2(2q + 2,\, 2u)$, with $2q + 2 = \frac{4\kappa\theta}{\sigma^2}$ DoF and non-centrality parameter $2u = 2c\,v_t\,e^{-\kappa(s-t)}$. The terminal distribution as $s \to \infty$ is a gamma distribution, i.e. $f \sim \Gamma\!\left(\frac{2\kappa\theta}{\sigma^2},\, \frac{\sigma^2}{2\kappa}\right)$⁵. The CIR process itself is a special case of the basic affine jump diffusion functions, which are described in §3.3.1.

⁵ A Gamma distribution $\Gamma(\alpha, \beta)$ is defined by two numerical parameters: $\alpha$, the shape parameter – which dictates the overall shape of the distribution – and $\beta$, the scale parameter, which defines by how much the distribution is stretched/shrunk.
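To make the dynamics concrete, the following is a minimal Matlab/Octave sketch of an Euler-discretised CIR path, complementary to the code in Listing A.2; the parameter values mirror Figure 3.3, and truncating negative variance is one simple way to keep the square root real:

% Euler discretisation of dv = kappa*(theta - v)dt + sigma*sqrt(v)dW.
kappa = 3.0; theta = 1.0; sigma = 0.6;   % values as in Figure 3.3
T = 1; m = 250; dt = T/m;                % one year of business days
v = zeros(1, m+1); v(1) = 5;             % start above the long-term mean
for i = 1:m
  vp = max(v(i), 0);                     % truncate so sqrt stays real
  v(i+1) = v(i) + kappa*(theta - vp)*dt + sigma*sqrt(vp)*sqrt(dt)*randn;
end
plot(0:m, v);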
3.2.2 Heston PDE

The process followed by the variance, as shown in Equation (3.2.1), is a CIR process. These kinds of processes, as presented in §3.2.1, are a subset of Affine Jump Diffusion (AJD) processes, in that they present no jumps. The following derivation of the Heston model's PDE is a special case of the PDE for general stochastic volatility models as described in [Gat06]. Given the Heston model as we defined it in §3.2, there are two random sources – the price and variance Wiener processes – and only one traded asset, the underlying (since the volatility is not a traded asset⁶); then according to [Bj9, p.118] this model is arbitrage-free but incomplete. In order to make it complete we need to add another traded asset with associated stochastic risk, so that the portfolio can be hedged.

Theorem 3.3. Let a portfolio $\Pi(t)$ consist of an option $V(S,v,t,T)$ with maturity $T$, $\Delta$ units of the underlying stock $S$, and $\Delta_1$ units of another hedging option $U(S,v,t,T)$ with maturity $T$. The PDE is then described by the equation
$$0 = \frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \rho\sigma v S\frac{\partial^2 V}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 V}{\partial v^2} - rV + rS\frac{\partial V}{\partial S} + \left[\kappa(\theta - v) - \lambda(S,v,t)\right]\frac{\partial V}{\partial v},$$
where all the variables are as defined in §3.2, and $\lambda$ is the market price of volatility risk.

Proof. Let $\Pi(t)$ be a self-financing⁷ hedged portfolio, consisting of an option on an underlying asset $S$, the asset $S$ itself, and another option $U$,
$$\Pi(t) = \Delta S + V(S,v,t,T) + \Delta_1 U(S,v,t,T), \qquad (3.2.3)$$
where $t \in [0,T]$, and $v$ is the variance as defined in the Heston model. We drop the dependent parameters for brevity; the change in the value of the portfolio during a quantum of time $dt$ is
$$d\Pi = \Delta\,dS + dV + \Delta_1\,dU. \qquad (3.2.4)$$

⁶ However, recent instruments have emerged that treat volatility as an asset class [see ??].
⁷ i.e. there is no inwards/outwards cash flow.
From Itô's lemma we have that the changes in the prices of the options $V$ and $U$ are
$$dV = \left(\frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \rho\sigma S v\frac{\partial^2 V}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 V}{\partial v^2}\right)dt + \frac{\partial V}{\partial S}\,dS + \frac{\partial V}{\partial v}\,dv, \qquad (3.2.5)$$
$$dU = \left(\frac{\partial U}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 U}{\partial S^2} + \rho\sigma S v\frac{\partial^2 U}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 U}{\partial v^2}\right)dt + \frac{\partial U}{\partial S}\,dS + \frac{\partial U}{\partial v}\,dv. \qquad (3.2.6)$$
We substitute Equations 3.2.5 and 3.2.6 into 3.2.4 to get
$$d\Pi = \alpha\,dS + \beta\,dv + \gamma\,dt, \qquad (3.2.7)$$
where
$$\alpha = \frac{\partial V}{\partial S} + \Delta + \Delta_1\frac{\partial U}{\partial S},$$
$$\beta = \frac{\partial V}{\partial v} + \Delta_1\frac{\partial U}{\partial v},$$
$$\gamma = \frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \rho\sigma v S\frac{\partial^2 V}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 V}{\partial v^2} + \Delta_1\left(\frac{\partial U}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 U}{\partial S^2} + \rho\sigma S v\frac{\partial^2 U}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 U}{\partial v^2}\right),$$
and where $\rho$ and $\sigma$ above are as defined in Equation (3.2.1).
In order to provide a risk-neutral valuation it is necessary to eliminate the stochastic components of Equation (3.2.7). Hence we set $\alpha = \beta = 0$ and derive the following hedging parameters:
$$\Delta_1 = -\frac{\partial V/\partial v}{\partial U/\partial v}, \qquad \Delta = -\frac{\partial V}{\partial S} - \Delta_1\frac{\partial U}{\partial S}. \qquad (3.2.9)$$
If no arbitrage is to exist, then the rate of return on the portfolio's value must equal the risk-free interest rate $r$ – which we will assume is deterministic for this thesis – leaving us with
$$d\Pi = \gamma\,dt = r\Pi\,dt = r(\Delta S + V + \Delta_1 U)\,dt. \qquad (3.2.10)$$
Equation (3.2.10) yields that
$$\gamma = r\Pi,$$
and hence
$$r(\Delta S + V + \Delta_1 U) = \frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \rho\sigma v S\frac{\partial^2 V}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 V}{\partial v^2} + \Delta_1\left(\frac{\partial U}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 U}{\partial S^2} + \rho\sigma S v\frac{\partial^2 U}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 U}{\partial v^2}\right),$$
which we can abstract by writing
$$A + \Delta_1 B = r(\Delta S + V + \Delta_1 U).$$
When we substitute $\Delta, \Delta_1$ from Equations 3.2.9 and rearrange, we get
$$\frac{A - rV + rS\frac{\partial V}{\partial S}}{\frac{\partial V}{\partial v}} = \frac{B - rU + rS\frac{\partial U}{\partial S}}{\frac{\partial U}{\partial v}}. \qquad (3.2.11)$$
The LHS of Equation 3.2.11 is a function of $V$ alone, as the RHS is of $U$ alone. Hence there must be some function $f(S,v,t)$ for which the equality of Equation 3.2.11 holds. We define this function as $f(S,v,t) = -\kappa(\theta - v) + \lambda(S,v,t)$, where $\lambda$ is the market price of volatility risk, following the proposal by [Hes93]. Thus Equation 3.2.11 can be written as
$$\frac{A - rV + rS\frac{\partial V}{\partial S}}{\frac{\partial V}{\partial v}} = -\kappa(\theta - v) + \lambda(S,v,t).$$
By rearrangement and substitution of $A$ we get
$$0 = \frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \rho\sigma v S\frac{\partial^2 V}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v\frac{\partial^2 V}{\partial v^2} - rV + rS\frac{\partial V}{\partial S} + \left[\kappa(\theta - v) - \lambda(S,v,t)\right]\frac{\partial V}{\partial v}.$$
3.2.3 Heston's closed-form analytical solution

The Heston model's closed-form price $C$ – as described in [Hes93] and defined in Definition 3.1 – for a European vanilla call option with strike price $K$, spot price $S$, and time to maturity $T$, on a non-dividend-paying underlying, satisfies Equation (3.2.12):
$$C(S,v,t) = S P_1 - K e^{-r(T-t)} P_2, \qquad (3.2.12)$$
where, conditional on the strike price $K$,
$$P_j(x,v,T;\ln[K]) = \frac{1}{2} + \frac{1}{\pi}\int_0^\infty \mathrm{Re}\!\left[\frac{e^{-i\phi\ln[K]}\, f_j(x,v,t;\phi)}{i\phi}\right]d\phi,$$
$$x = \ln[S_t],$$
$$f_j(x,v,t;\phi) = e^{C_j(T-t;\phi) + D_j(T-t;\phi)\,v + i\phi x},$$
$$C_j(\tau;\phi) = r\phi i\tau + \frac{a}{\sigma^2}\left[(b_j - \rho\sigma\phi i + d_j)\tau - 2\ln\!\left(\frac{1 - g_j e^{d_j\tau}}{1 - g_j}\right)\right],$$
$$D_j(\tau;\phi) = \frac{b_j - \rho\sigma\phi i + d_j}{\sigma^2}\left(\frac{1 - e^{d_j\tau}}{1 - g_j e^{d_j\tau}}\right),$$
and
$$g_j = \frac{b_j - \rho\sigma\phi i + d_j}{b_j - \rho\sigma\phi i - d_j},$$
$$d_j = \sqrt{(\rho\sigma\phi i - b_j)^2 - \sigma^2(2u_j\phi i - \phi^2)},$$
where $j = \{1,2\}$, $\tau = T - t$, $\phi$ is the integration parameter, $i = \sqrt{-1}$, $\lambda$ is the price of volatility risk⁸,
$$u_1 = \frac{1}{2}, \quad u_2 = -\frac{1}{2}, \quad a = \kappa\theta, \quad b_1 = \kappa + \lambda - \rho\sigma, \quad b_2 = \kappa + \lambda,$$
and the rest of the parameters are as defined in Definition 3.1. The Octave/Matlab code to implement Equation (3.2.12) is shown in Listing A.1 in Appendix A.

⁸ The choice of $\lambda(S,v,t)$ can be assumed as $\lambda(S,v,t) = \gamma\,\mathrm{Cov}[dv,\, dC/C]$, where $C(t)$ is the consumption rate and $\gamma$ is the relative-risk aversion of an investor [see [Bre79]].
3.2.4 Double Heston Model

Another approach to the Heston model is to add a second variance process [GP09], independent of the first one:
$$dS_t = \mu S_t\,dt + \sum_{\iota=1}^{2}\sqrt{v_t^{\iota}}\,S_t\,dW_t^{S_\iota}, \qquad (3.2.13a)$$
$$dv_t^{\iota} = \kappa_\iota(\theta_\iota - v_t^{\iota})\,dt + \sigma_\iota\sqrt{v_t^{\iota}}\,dW_t^{v_\iota}, \quad \iota = \{1,2\}, \qquad (3.2.13b)$$
with correlation structure
$$\mathrm{Corr}\begin{pmatrix} dW^{S_1} \\ dW^{v_1} \\ dW^{S_2} \\ dW^{v_2} \end{pmatrix} = \begin{pmatrix} 1 & \rho_1 & 0 & 0 \\ \rho_1 & 1 & 0 & 0 \\ 0 & 0 & 1 & \rho_2 \\ 0 & 0 & \rho_2 & 1 \end{pmatrix}. \qquad (3.2.13c)$$
This approach provides a more flexible model by increasing the DoF in the stochastic process. [CH09] showed that the introduction of a second, uncorrelated, variance process to the Heston model allowed more flexible modelling of the time variation in the smirk⁹, leading to a better fit quality with comparable computational time. Equation (3.2.13c) shows the correlation structure between the Brownian motions.

⁹ A smirk is a skewed volatility smile, which in its turn is the long-observed pattern whereby at-the-money options tend to have lower implied volatilities than in-/out-of-the-money options.
3.3 Bates' Model

Critical information regarding a company's performance arrives at random points in time and has an acute impact on the company's stock price. Because the information arrives at a random time, it can be treated and modelled as a random variable. To account for this behaviour, Merton [Mer76] introduced a model for the discontinuous paths of asset prices. The behaviour of this model is very similar to the AJD functions described in §3.3.1.
The Merton and Heston models were combined by Bates [Bat96], who merged the stochastic volatility with the jump diffusion augmentation. The stock price follows the SDE
$$dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S + (e^{\alpha + \delta\varepsilon} - 1)\,S\,dq_t,$$
$$dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_t^v,$$
$$\mathrm{Cov}\langle dW_t^S,\, dW_t^v\rangle = \rho\,dt,$$
$$dq = \begin{cases} 0 & \text{with probability } 1 - \lambda\,dt, \\ 1 & \text{with probability } \lambda\,dt, \end{cases}$$
where $dq$ is a compound Poisson process with a constant intensity $\lambda$, and the jump size is lognormally distributed with mean log-jump $\alpha$ and standard deviation $\delta$, where $\varepsilon \sim N(0,1)$. The jump size distribution is not correlated with the Wiener processes.
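As an illustration of how the jump term enters a simulation, the sketch below performs one Euler time step of the Bates dynamics in Matlab/Octave; a Poisson draw decides how many jumps arrive in the interval, and each jump multiplies the price by a lognormal factor. All parameter values are illustrative assumptions, and poissrnd is from the Statistics Toolbox:

% One Euler step of the Bates model: Heston diffusion plus lognormal jumps.
mu = 0.05; kappa = 2; theta = 0.04; sigma = 0.3; rho = -0.7;  % diffusion part
lambda = 0.5; alpha = -0.1; delta = 0.15;                     % jump part
S = 100; v = 0.04; dt = 1/250;
Zv = randn; Zs = rho*Zv + sqrt(1 - rho^2)*randn;   % correlated normals
vp = max(v, 0);                                    % keep sqrt real
v = v + kappa*(theta - vp)*dt + sigma*sqrt(vp)*sqrt(dt)*Zv;
S = S*exp((mu - 0.5*vp)*dt + sqrt(vp)*sqrt(dt)*Zs);
nJ = poissrnd(lambda*dt);                          % jumps in this interval
if nJ > 0
  S = S*exp(nJ*alpha + delta*sqrt(nJ)*randn);      % compound lognormal jump
end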
3.3.1 Affine Jump Diffusion Functions

The basic AJD functions are of the following form:
$$dx(t) = \mu(x,t)\,dt + \sigma(x,t)\,dW(t) + dZ(t),$$
$$\mu(x,t) = \mu_0 + \mu_1 x(t),$$
$$\sigma^2(x,t) = H_0 + H_1 x(t),$$
where $\mu(x,t)$ and $\sigma^2(x,t)$ are affine functions, $dW$ is a Brownian motion, and $dZ(t)$ is a compound Poisson process with intensity $\lambda$ and independent jumps, which is independent of $W$. The log-jump sizes $Y_i \sim N(\mu_Y, \sigma_Y^2)$ are independent and identically distributed (i.i.d.) random variables with mean $\mu_Y$ and variance $\sigma_Y^2$, which are uncorrelated with both $Z(t)$ and $W(t)$. Jumps add mass to the returns distribution, and thus address the missing "fat tails" that the B&S model suffers from. By increasing $\sigma_Y$ we can add mass to both tails, while a negative/positive $\mu_Y$ adds mass to the left/right tail respectively.
More recent explorations into jump diffusions can be seen in [Lip02], where log-exponential jumps are investigated to price exotic options.
Chapter 4
Monte Carlo Simulation
As mentioned in Chapter 2, the numerical method used to evaluate the value of a PDE is Monte Carlo (MC) simulation. In this chapter we set out the basics of this method and define the inputs and the intermediate processes required for this simulation to work.
4.1 Introduction

The name Monte Carlo is due to the homonymous city's affiliation with games of luck. The premise of luck is utilised within the simulation in order to provide a random sample from the overall probability space. If the random sample is as truly random as possible, then the subset of the random samples is taken as an estimate of the volume of the entire probability space. The law of large numbers guarantees [RW] that this estimation will converge to the true likelihood as the number of random draws increases. Given a certain number of random draws, the likely magnitude of the error can be derived from the central limit theorem.
As stated in §2.3, the Feynman-Kac theorem is the connective tissue between the PDE form of the stochastic model and its approximation by a Monte Carlo simulation. By this theorem it is possible to approximate certain forms of PDEs by simulating random paths and taking their expectation as the solution of the original PDE.
As an example we can return to the B&S Equation (2.4.2), where the price of the option depends on the expected value of the payoff. In order to calculate the expected value of $f$ it is possible to run a MC simulation with $N$ paths¹, in order to approximate the actual price of the call option $C$ with the simulated price $\hat{C}_N$,
$$\hat{C}_N = \frac{e^{-rT}}{N}\sum_{\iota=1}^{N}\left(S_0\,e^{(r-\frac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,N_\iota(0,1)} - K\right)^+, \qquad (4.1.1)$$
where $r$ is the risk-free interest rate, $T$ is the time to maturity of the option, $\sigma$ is the volatility, $K$ is the strike price at maturity date $T$, $S_0$ is the spot price at $t = 0$, and the $N_\iota(0,1)$ are Gaussian RVs. By the "strong law of large numbers" we have
$$\Pr\left(\hat{C}_N \to C\right) = 1, \quad \text{as } N \to \infty. \qquad (4.1.2)$$

¹ A path in this context is a discrete time-series of prices for the option, one for each discrete quantum of time.
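Equation (4.1.1) translates directly into a few vectorised lines of Matlab/Octave. The sketch below, with illustrative parameter values, also reports the confidence half-width discussed in Section 4.2:

% Monte Carlo estimate of a European call under B&S, Equation (4.1.1).
S0 = 100; K = 100; r = 0.05; sigma = 0.2; T = 1; N = 1e6;
Z = randn(N, 1);                                   % N(0,1) variates
ST = S0*exp((r - 0.5*sigma^2)*T + sigma*sqrt(T)*Z);
Y = exp(-r*T)*max(ST - K, 0);                      % discounted payoffs
C_hat = mean(Y);
hw = 1.96*std(Y)/sqrt(N);                          % 95% half-width
fprintf('C = %.4f +/- %.4f\n', C_hat, hw);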
4.2 Simple MC algorithm

Suppose we wish to estimate the call option price $C := E[h(X)]$; the algorithm to simulate the expected value is then:

Algorithm 1 MC Simulation(N)
Require: Generate $X_1, \dots, X_N$
Ensure: $E[C] = \hat{C}_N$
1: for $i = 1$ to $N$ do
2:   $Y_i := h(X_i)$
3:   $S_i \leftarrow S_{i-1} + Y_i$
4: end for
5: return $\hat{C}_N \leftarrow \frac{S_N}{N}$

In order to determine the efficiency of the simulation, we need to calculate the confidence interval for this simulation. The approximate $100(1-\alpha)\%$ confidence interval for Algorithm 1 is defined as
$$\left(\hat{C}_N - z_{1-\frac{\alpha}{2}}\frac{\hat{\sigma}_N}{\sqrt{N}},\;\; \hat{C}_N + z_{1-\frac{\alpha}{2}}\frac{\hat{\sigma}_N}{\sqrt{N}}\right),$$
where $\hat{\sigma}_N^2$ is the estimate of $Var(Y)$ based on $Y_1, \dots, Y_N$. We can use the Half-Width (HW) of the confidence interval to measure the quality of the estimator $\hat{C}_N$,
$$HW = z_{1-\frac{\alpha}{2}}\sqrt{\frac{Var(Y)}{N}}. \qquad (4.2.1)$$
The smaller the HW, the more accurate the simulation of the expected value will be. There are two ways this can be achieved: either increase $N$ or reduce $Var(Y)$. The first option runs into limitations from the computational complexity of calculating the $Y_i$ in a reasonable amount of time. We can, however, impact the variance of the generated $Y_i$ values. There are multiple techniques to do the latter, and these are referred to as variance reduction.
4.3 Variance reduction

There are two major avenues for reducing variance in a MC simulation: one is to take advantage of specific features of the problem domain to adjust or correct simulation outputs; the other is to directly reduce the variability of the simulation inputs. In this section we'll introduce control variates, antithetic variates, the absorbing and full-truncation techniques, and the case of quasi-random simulation.
4.3.1 Control variate technique

In order to improve the accuracy of the estimated variate, it is essential to use as much information as possible. We can derive very important information about the estimated variate if we can define another variate – one whose expected value is not difficult to calculate – that is highly correlated with the estimated one. We can then use this new variate as a control mechanism to improve the accuracy of the simulation by minimising the errors between the two variates.
Suppose that, as in Section 4.2, we wish to estimate the call option price $C := E[Y]$, where $Y := h(X)$ is for instance the discounted payoff of a call option for which we have no closed-form evaluation. We do however know of another variate $Z$, which we can easily generate, and whose expected value $E[Z]$ we can also quickly calculate. Now suppose that for each replication of $Y_i$ we also generate a value of $Z_i$, and that the pairs $(Y_i, Z_i)$ are i.i.d.; then for any fixed $c$ we can calculate
$$Y_i^c = Y_i + c\,(Z_i - E[Z])$$
from the $i$-th path and thus calculate the sample mean
$$\hat{Y}^c = \hat{Y} + c\,(\hat{Z} - E[Z]) = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i + c\,(Z_i - E[Z])\right).$$
This is an unbiased estimator of $E[Y]$² that can act as a control variate estimator; that is, the observed residual $Z - E[Z]$ acts as a control when estimating $E[Y]$. If we now compute the per-replication variance of $\hat{Y}^c$ we get
$$Var\left[Y_i + c\,(Z_i - E[Z])\right] = \sigma_Y^2 + 2c\,\sigma_Z\sigma_Y\rho_{ZY} + c^2\sigma_Z^2 =: \sigma_c^2, \qquad (4.3.1)$$
where $\sigma_Z^2 = Var[Z]$, $\sigma_Y^2 = Var[Y]$, and $\rho_{ZY}$ is the correlation factor between $Z$ and $Y$. In order to minimise the variance of the replication we choose a $c^*$ that minimises Equation 4.3.1, expressed as
$$c^* = -\frac{\sigma_Y}{\sigma_Z}\rho_{ZY} = -\frac{Cov[Z,Y]}{Var[Z]}. \qquad (4.3.2)$$
Substituting Equation 4.3.2 into 4.3.1 we get
$$Var\left[\hat{Y}^{c^*}\right] = Var[\hat{Y}] - 2\frac{Cov[Z,Y]}{Var[Z]}Cov[Z,Y] + \frac{Cov[Z,Y]^2}{Var[Z]^2}Var[Z] = Var[\hat{Y}] - \frac{Cov[Z,Y]^2}{Var[Z]}; \qquad (4.3.3)$$
therefore, as long as $Cov[Z,Y] \neq 0$, we achieve a reduction in the variance of the estimated variate. In fact, the higher the correlation $\rho_{ZY}$ is, the higher the reduction in variance will be.
All that remains is to find an appropriate control variate that has non-zero correlation with the estimated variate. Since we are trying to find the discounted payoff of a call option, we could use the Black & Scholes SDE as the $Z$ control variate of the above example.
Remark. If we examine the ratio of the controlled estimator's variance to that of the uncontrolled estimator, we can derive that
$$\frac{Var\left[\hat{Y} + c^*(\hat{Z} - E[Z])\right]}{Var[\hat{Y}]} = 1 - \rho_{ZY}^2, \qquad (4.3.4)$$
which implies that the stronger the correlation between the estimated variate $Y$ and the control variate $Z$, the more effective the control variate is. N.B. this effect is independent of the sign of the correlation, since the sign is cancelled by the square in Equation 4.3.4.
The question remains, however, of how to estimate the correlation of the two variates. This can be achieved by doing $t$ training simulation runs in which the correlation coefficient is calculated, first by calculating the covariance of the two variates as
$$\widehat{Cov}[Y,Z] = \frac{\sum_{\tau=1}^{t}\left(Y_\tau - \hat{Y}_t\right)\left(Z_\tau - E[Z]\right)}{t - 1}.$$
And by selecting an appropriate control variable, we are able to calculate the expected value as well as the sample variance of the replications. For the variance we get
$$\widehat{Var}[Z] = \frac{\sum_{\tau=1}^{t}\left(Z_\tau - E[Z]\right)^2}{t - 1},$$
and hence we can derive from Equation 4.3.2 the optimal constant $\hat{c}^*$ to use for the control variate simulation. Once we have this constant we can use it to perform a simulation and reduce the variance with a control variate.
A simple algorithm to simulate the variate $V$ with a control variate $Z$ would look like Algorithm 2.

² Since $E[Y_i^c] = E[Y_i + c(Z_i - E[Z])] = E[\hat{Y}] = E[Y]$.
Algorithm 2 Monte Carlo Simulation with control variate(t, N)
Require: $t > 1$, $N > 0$.
Ensure: $E[Y] = \hat{V}_N$.
1: for $i = 1$ to $t$ do
2:   generate $(Y_i, Z_i)$  {training run to calculate the constant factor $c^*$}
3: end for
4: $\hat{c}^* \leftarrow$ calculate $c^*$
5: for $i = 1$ to $N$ do
6:   generate $(Y_i, Z_i)$  {actual simulation}
7:   $V_i \leftarrow Y_i + \hat{c}^*(Z_i - E[Z])$
8: end for
9: $\hat{V}_N \leftarrow \frac{\sum_{i=1}^N V_i}{N}$
10: $\widehat{Var}[V]_N \leftarrow \frac{\sum_{i=1}^N (V_i - \hat{V}_N)^2}{N-1}$
11: $100(1-\alpha)\%\,CI \leftarrow \left(\hat{V}_N - z_{1-\frac{\alpha}{2}}\sqrt{\frac{\widehat{Var}[V]_N}{N}},\;\; \hat{V}_N + z_{1-\frac{\alpha}{2}}\sqrt{\frac{\widehat{Var}[V]_N}{N}}\right)$
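As a concrete instance of Algorithm 2, the sketch below uses the discounted terminal stock price as the control variate Z, since under the risk-neutral measure its expectation E[Z] = S0 is known exactly. This is a minimal illustration with assumed parameter values, not the thesis implementation:

% Control variate MC: Y = discounted call payoff, Z = discounted S_T, E[Z] = S0.
S0 = 100; K = 100; r = 0.05; sigma = 0.2; T = 1; t = 1e4; N = 1e6;
sim = @(n) S0*exp((r - 0.5*sigma^2)*T + sigma*sqrt(T)*randn(n,1));
% Training run: estimate c* = -Cov[Z,Y]/Var[Z], Equation (4.3.2).
ST = sim(t); Y = exp(-r*T)*max(ST - K, 0); Z = exp(-r*T)*ST;
Cm = cov(Z, Y); c_star = -Cm(1,2)/Cm(1,1);
% Actual simulation with the control applied.
ST = sim(N); Y = exp(-r*T)*max(ST - K, 0); Z = exp(-r*T)*ST;
V = Y + c_star*(Z - S0);
fprintf('controlled %.4f (var %.2e), plain %.4f (var %.2e)\n', ...
        mean(V), var(V)/N, mean(Y), var(Y)/N);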
4.3.2 Antithetic variates technique

This method reduces variance by introducing a negative dependence between pairs of replications. We present the case where the replications are sampled from a Uniform distribution; however, this method can take various forms. As [Gla04, p.205] mentions, the method extends to other distributions via the inverse transform method, where $F^{-1}(U)$ and $F^{-1}(1-U)$ both have distribution $F$ but are antithetic to each other because $F^{-1}$ is monotonic. As an example, it is possible to use pairs of replications from the normal distribution by pairing a sequence $Z_1, Z_2, \dots$ of i.i.d. $N(0,1)$ random variables with the antithetic sequence $-Z_1, -Z_2, \dots$ of i.i.d. $N(0,1)$ random variables.
Let us now extend the paradigm presented in Section 4.3.1 for the price of a call option. In this case we generate pairs of antithetic replications from a Uniform distribution and use them in the unbiased estimator
$$Z_i = \frac{Y_i + \tilde{Y}_i}{2}, \qquad (4.3.5)$$
where $Y_i = h(U_i)$ is the payoff of the call option sampled on a Uniform distribution $U_i \sim U(0,1)$, and $\tilde{Y}_i = h(1 - U_i)$ is the payoff of the call option sampled on the antithetic Uniform distribution $1 - U_i \sim U(0,1)$.
Since $E[Y] = E[\tilde{Y}] = \hat{Y}$, from Equation (4.3.5) we deduce that $Z_i$ is an unbiased estimator of $\hat{Y}$, and because the $U_i$'s are i.i.d. we can use the $Z_i$ to construct confidence intervals. Algorithm 3 shows how the MC simulation implements antithetic variates.
Algorithm 3 Monte Carlo Simulation with antithetic variates(N)
Require: $N > 0$.
Ensure: $E[Y] = \hat{Z}_N$.
1: for $i = 1$ to $N$ do
2:   generate $U_i$
3:   $Y_i \leftarrow h(U_i)$
4:   $\tilde{Y}_i \leftarrow h(1 - U_i)$
5:   $Z_i \leftarrow \frac{Y_i + \tilde{Y}_i}{2}$
6: end for
7: $\hat{Z}_N \leftarrow \frac{\sum_{i=1}^N Z_i}{N}$
8: $\widehat{Var}[Z]_N \leftarrow \frac{\sum_{i=1}^N (Z_i - \hat{Z}_N)^2}{N-1}$
9: $100(1-\alpha)\%\,CI \leftarrow \left(\hat{Z}_N - z_{1-\frac{\alpha}{2}}\sqrt{\frac{\widehat{Var}[Z]_N}{N}},\;\; \hat{Z}_N + z_{1-\frac{\alpha}{2}}\sqrt{\frac{\widehat{Var}[Z]_N}{N}}\right)$
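A minimal Matlab/Octave sketch of Algorithm 3 for the same call-option payoff, pairing each normal draw with its negation (equivalent, through the inverse transform, to pairing U with 1 − U); parameter values are illustrative:

% Antithetic variates: average the payoffs from Z and -Z path by path.
S0 = 100; K = 100; r = 0.05; sigma = 0.2; T = 1; N = 5e5;
Zn = randn(N, 1);
pay = @(z) exp(-r*T)*max(S0*exp((r - 0.5*sigma^2)*T + sigma*sqrt(T)*z) - K, 0);
Zi = 0.5*(pay(Zn) + pay(-Zn));          % Equation (4.3.5)
fprintf('estimate %.4f, variance %.2e\n', mean(Zi), var(Zi)/N);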
4.3.3 Quasi-random simulation

One more procedure for reducing the variance of the simulation is to sample variates from the so-called "low-discrepancy" sequences. A sequence's discrepancy is a measure of its uniformity, defined as follows [see [Lev02]].

Definition 4.1. Given a set of points $x_1, x_2, \dots, x_N \in I^S$ and a subset $G \subset I^S$, define the counting function $S_N(G)$ as the number of points $x_i \in G$. For each $x = (x_1, x_2, \dots, x_S) \in I^S$, let $G_x$ be the rectangular $S$-dimensional region
$$G_x = [0, x_1) \times [0, x_2) \times \cdots \times [0, x_S),$$
with volume $x_1 x_2 \cdots x_S$. Then the discrepancy of the points $x_1, x_2, \dots, x_N$ is given by
$$D_N(x_1, x_2, \dots, x_N) = \sup_{x \in I^S}\left|\,S_N(G_x) - N\,x_1 x_2 \cdots x_S\,\right|.$$

The discrepancy of the sequence compares the number of sample points found in the volume of a multi-dimensional space against the number of points that should be in that volume if the distribution were uniform.
There are a few sequences that are commonly used to generate quasi-random variates. The Numerical Algorithms Group (NAG) libraries provide three sequence generators: the Niederreiter [Nid92], the Sobol [Sob67], and the Faure [FAU81] sequences are implemented in MATLAB with the functions g05yl and g05ym.
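Outside the NAG routines, recent Matlab releases expose similar generators through the Statistics Toolbox. Assuming that toolbox is available, a Sobol sequence mapped to normal variates looks like this sketch:

% Quasi-random normals from a scrambled Sobol sequence (Statistics Toolbox).
d = 1; N = 2^14;
p = scramble(sobolset(d), 'MatousekAffineOwen');   % scrambling also avoids
U = net(p, N);                                     % the degenerate point at 0
Z = norminv(U);                                    % inverse-transform to N(0,1)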
4.4 Discretisation Schemes for Stochastic Differential Equations

In §4.3 we described methods of reducing the variance of the Monte Carlo simulation, thus increasing the precision of the estimated value. However, there is one more source of error that needs to be taken into account and addressed, and that is the simulation bias due to the discretisation of the SDE. One way to think of this is by shooting arrows at a bullseye target: high-precision shots form a tight cluster of arrows, but the cluster could lie completely outside the circles due to high bias. We now present the Euler scheme, which is the simplest and most common form of discretisation, before we proceed to refine and extend it to alternative schemes.
4.4.1 Euler-Maruyama scheme

Let us consider the case of the following SDE:
$$dX(t) = \mu(X(t))\,dt + \sigma(X(t))\,dW(t). \qquad (4.4.1)$$
Also let $\hat{X}$ be the discretised approximation of $X$. The Euler-Maruyama [Mar55] approximation of $X$, on the temporal grid $0 = t_0 < t_1 < \cdots < t_m$, is
$$\hat{X}(t_{i+1}) = \hat{X}(t_i) + \mu(\hat{X}(t_i))\,\Delta t + \sigma(\hat{X}(t_i))\sqrt{\Delta t}\,Z_{i+1}, \qquad (4.4.2)$$
for $i = 0, \dots, m-1$, with $Z_i$ i.i.d. normal variates and $\Delta t = t_{i+1} - t_i$.
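The scheme of Equation (4.4.2) translates directly into a loop. A minimal Matlab/Octave sketch, taking GBM as an illustrative choice of drift and diffusion:

% Euler-Maruyama path for dX = mu(X)dt + sigma(X)dW, Equation (4.4.2).
mu = @(x) 0.05*x; sigma = @(x) 0.2*x;    % GBM as an example
T = 1; m = 250; dt = T/m;
X = zeros(1, m+1); X(1) = 100;
for i = 1:m
  X(i+1) = X(i) + mu(X(i))*dt + sigma(X(i))*sqrt(dt)*randn;
end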
Since the discretisation is an approximation process, it is imperative to measure how accurate this approximation ultimately is. To do this we need to evaluate the discrepancy between the SDE and its discrete approximation conditional on the size of $\Delta t$. In essence we want to verify that the error
$$\epsilon = E\left[\,\|X(t) - \hat{X}(t)\|\,\right] \to 0$$
when $\Delta t \to 0$. There are two accepted metrics for this discrepancy: weak convergence, which measures the error of the mean, and strong convergence, which measures the mean of the error.
A typical weak convergence error has the form
$$\epsilon^{weak}_{\Delta t} := \sup_{0 \leq t_i \leq T}\left|\,E[f(X(t))] - E\!\left[f(\hat{X}(t))\right]\right|,$$
where $f$ is a smooth polynomial of some order $k$. We say that the discretised approximation $\hat{X}(t)$ converges weakly if $\epsilon^{weak}_{\Delta t} \to 0$ when $\Delta t \to 0$. The order of the weak convergence is $\beta > 0$ when
$$\epsilon^{weak}_{\Delta t} \leq C\,\Delta t^{\beta},$$
for some scalar $C$ and for all sufficiently small $\Delta t$.
Conversely, the discretised approximation $\hat{X}(t)$ converges strongly if the strong error
$$\epsilon^{strong}_{\Delta t} := \sup_{0 \leq t_i \leq T} E\left[\,\left|X(t) - \hat{X}(t)\right|\,\right]$$
satisfies $\epsilon^{strong}_{\Delta t} \to 0$ when $\Delta t \to 0$. Similarly, the order of the strong convergence is $\beta > 0$ when
$$\epsilon^{strong}_{\Delta t} \leq C\,\Delta t^{\beta},$$
for some scalar $C$ and for all sufficiently small $\Delta t$. According to [Gla04, p.345], the Euler scheme typically has a strong order of $\frac{1}{2}$, but often achieves a weak order of 1.
4.4.2 Milstein scheme

This scheme was first proposed by Milstein [Mil95], and is explained in detail by Glasserman [Gla04, p.340-344], by Kloeden and Platen [KPS94] for more general processes, and by Kahl and Jäckel [KJ06] for stochastic volatility processes. The scheme works for SDEs in which the drift and diffusion terms do not depend on time directly. Let us take the case of a stochastic process as defined in Equation (4.4.1). The Milstein discretisation scheme can then be expressed as
$$\hat{X}(i+1) = \hat{X}(i) + \mu(\hat{X}(i))\,\Delta t + \sigma(\hat{X}(i))\sqrt{\Delta t}\,Z_{i+1} + \frac{1}{2}\sigma'(\hat{X}(i))\,\sigma(\hat{X}(i))\,\Delta t\,(Z_{i+1}^2 - 1). \qquad (4.4.3)$$
The discretisation Equation 4.4.3 is composed of a deterministic part defined by the $\mu$ term, a stochastic part defined by the $\sigma$ term, and an Itô correction term.
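For GBM, where σ(x) = σx and hence σ′(x)σ(x) = σ²x, the Milstein correction of Equation (4.4.3) adds a single extra term to the Euler step of the previous sketch (a minimal illustration with assumed parameters):

% Milstein step for GBM: the Euler step plus the Ito correction term.
mu = 0.05; sig = 0.2; dt = 1/250; X = 100;
Z = randn;
X = X + mu*X*dt + sig*X*sqrt(dt)*Z + 0.5*sig^2*X*dt*(Z^2 - 1);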
4.4.3 Kahl-Jäckel scheme

Kahl and Jäckel [KJ06, p.24] propose an implicit Milstein scheme for the variance in combination with an alternative discretisation for the underlying's price process. Specifically, they refer to this stochastic volatility scheme as IJK and define it as
$$\hat{V}(t + \Delta t) = \frac{\hat{V}(t) + \kappa\theta\,\Delta t + \sigma\sqrt{\hat{V}(t)}\,Z_V\sqrt{\Delta t} + \frac{1}{4}\sigma^2\Delta t\,(Z_V^2 - 1)}{1 + \kappa\,\Delta t}, \qquad (4.4.4)$$
$$\ln \hat{X}(t + \Delta t) = \ln \hat{X}(t) - \frac{\Delta t}{4}\left(\hat{V}(t + \Delta t) + \hat{V}(t)\right) + \rho\sqrt{\hat{V}(t)}\,Z_V\sqrt{\Delta t} + \frac{1}{2}\left(\sqrt{\hat{V}(t + \Delta t)} + \sqrt{\hat{V}(t)}\right)\left(Z_X\sqrt{\Delta t} - \rho Z_V\sqrt{\Delta t}\right) + \frac{1}{4}\rho\sigma\,\Delta t\,(Z_V^2 - 1), \qquad (4.4.5)$$
where $\theta$ is the equilibrium supported by the fundamentals, $\kappa$ is the rate at which the shocks dissipate and the variance returns to $\theta$, $\sigma$ is the degree of volatility around it caused by shocks, and $Z_X, Z_V$ are Normal variates.
By finding the minimum of the variance function and forcing its value to be positive, we can easily derive that the variance is guaranteed to be strictly positive if $4\kappa\theta > \sigma^2$. In reality, as Andersen [And07] mentions, it is unrealistic to uphold this constraint in practical situations; hence this scheme can and will produce negative variance. Andersen then proposes a full truncation to 0 when a value is negative. This means that Equation (4.4.4) will substitute $\hat{V}(\cdot)^+ = \max(\hat{V}(\cdot), 0)$ wherever $\hat{V}(\cdot)$ appears.
4.4.4 Broadie-Kaya exact calculation scheme

Broadie and Kaya [BK06] presented a discretisation process that is completely bias-free. This exact scheme, though, has limitations³ and suboptimal performance even against the "simple" Euler scheme, as Lord et al. [LKvD06] showed in their numerical comparisons⁴.
To obtain the bias-free scheme, begin with the consecutive application of Itô's Lemma, first to get the explicit form and then to pass to a Cholesky decomposition (see [And07, p.7] for a detailed derivation). What is finally obtained is
$$V(t + \Delta t) = V(t) + \int_t^{t+\Delta t}\kappa(\theta - V(u))\,du + \sigma\int_t^{t+\Delta t}\sqrt{V(u)}\,dW_V(u), \qquad (4.4.6)$$
$$\ln X(t + \Delta t) = \ln X(t) + \frac{\rho}{\sigma}\left(V(t + \Delta t) - V(t) - \kappa\theta\,\Delta t\right) + \left(\frac{\kappa\rho}{\sigma} - \frac{1}{2}\right)\int_t^{t+\Delta t} V(u)\,du + \sqrt{1 - \rho^2}\int_t^{t+\Delta t}\sqrt{V(u)}\,dW(u). \qquad (4.4.7)$$
The distribution of $\ln X(t + \Delta t)$ is clearly Normal, and after sampling $V(t + \Delta t)$ from a non-central $\chi^2$, with degrees of freedom defined from a Poisson sampling, we can draw a sample of $\int_t^{t+\Delta t} V(u)\,du \,\big|\, V(t + \Delta t)$ and calculate the next log value of $X$. Since this last sampled distribution is conditional on the next value of the variance, we can identify it as a Brownian Bridge.

³ Due to lack of speed and high complexity of implementation.
⁴ Most likely due to the reliance on acceptance-rejection sampling, which induces a performance penalty.
4.4.5 Quadratic Exponential (QE) scheme

In 2005 Andersen [And07] proposed a new scheme to discretise the stochastic volatility and the price of an underlying asset. This scheme takes advantage of the fact that a non-central $\chi^2$ sampled variate can be approximated by a related distribution that is moment-matched to the conditional first and second moments of the non-central $\chi^2$ distribution.
As Andersen points out, although the cubic transformation of a Normal RV is a more accurate representation of the distribution close to 0, it introduces negative values of variance. Thus the quadratic representation is adopted, with a special case for low values of $V(t)$. Therefore, when $V(t)$ is sufficiently large, we get
$$\hat{V}(t + \Delta t) = a\,(b + Z_V)^2, \qquad (4.4.8)$$
where $Z_V$ is an $N(0,1)$ Gaussian RV, and $a$, $b$ are scalars that will be determined by moment-matching. Now, for the complementary low values of $V(t)$, the distribution can – asymptotically – be approximated by
$$P\left(\hat{V}(t + \Delta t) \in [x, x + dx]\right) \approx \left(p\,\delta(0) + \beta(1 - p)e^{-\beta x}\right)dx, \quad x \geq 0, \qquad (4.4.9)$$
where $\delta$ is the, strongly reflective at 0, Dirac delta-function, and $p$ and $\beta$ are positive scalars to be calculated. The scalars $a, b, p, \beta$ depend on the parameters of the Heston model and the time granulation $\Delta t$, and will be calculated by moment-matching the exact distribution.
To sample from these distributions there are two cases to take into account:
• Sample from the $N(0,1)$ Gaussian RV and calculate $\hat{V}(t + \Delta t)$ from Equation (4.4.8).
• To sample the small values of $V$, the inverse of Equation (4.4.9) will be used. The inverse of the distribution function is
$$\Psi^{-1}(u) = \Psi^{-1}(u; p, \beta) = \begin{cases} 0 & \text{if } 0 \leq u \leq p, \\ \beta^{-1}\ln\!\left(\frac{1-p}{1-u}\right) & \text{if } p < u \leq 1. \end{cases} \qquad (4.4.10)$$
The value of $V$ can then be sampled from
$$\hat{V}(t + \Delta t) = \Psi^{-1}(U_V; p, \beta), \qquad (4.4.11)$$
where $U_V$ is a uniform RV.
The rule for deciding which discretisation of $V$ to use depends on the non-centrality of the distribution, and can be triaged based on the value of $\psi$. The value of $\psi$ is
$$\psi := \frac{s^2}{m^2} = \frac{\frac{\hat{V}(t)\,\sigma^2 e^{-\kappa\Delta t}}{\kappa}\left(1 - e^{-\kappa\Delta t}\right) + \frac{\theta\sigma^2}{2\kappa}\left(1 - e^{-\kappa\Delta t}\right)^2}{\left(\theta + (\hat{V}(t) - \theta)\,e^{-\kappa\Delta t}\right)^2}, \qquad (4.4.12)$$
where $m, s^2$ are the conditional mean and variance of the exact distribution we are matching. Andersen showed that the quadratic scheme of Equation 4.4.8 can only be moment-matched for $\psi \leq 2$, and similarly the exponential scheme of Equation 4.4.11 can only be moment-matched for $\psi \geq 1$. It emerges then that there is an interval $\psi \in [1, 2]$ where the two schemes overlap. Intuitively, Andersen chooses the midpoint of this interval as the cutoff point between the schemes; thus the cutoff is $\psi_c = 1.5$.
Since we have defined the discretisation process for the QE scheme, with Equations 4.4.8 and 4.4.11, and the cutoff discriminator, what is left is to calculate the remaining parameters $a, b, p, \beta$ for each case. The algorithm for this process is detailed in Algorithm 4 below.
This algorithm is implemented in MATLAB [see Listing A.4 for details] and is used for the numerical comparisons of acceleration.
Algorithm 4 Quadratic Exponential Variance Reduction
Require: The present value of the variance, $\hat{V}(t)$
Ensure: The value of the variance at the subsequent timestep, $\hat{V}(t + \Delta t)$
1: Compute $m \leftarrow \theta + (\hat{V}(t) - \theta)\,e^{-\kappa\Delta t}$
2: Compute $s^2 \leftarrow \frac{\hat{V}(t)\,\sigma^2 e^{-\kappa\Delta t}}{\kappa}(1 - e^{-\kappa\Delta t}) + \frac{\theta\sigma^2}{2\kappa}(1 - e^{-\kappa\Delta t})^2$
3: Compute $\psi \leftarrow \frac{s^2}{m^2}$
4: if $\psi \leq \psi_c$ then
5:   Compute $b^2 \leftarrow 2\psi^{-1} - 1 + \sqrt{2\psi^{-1}}\sqrt{2\psi^{-1} - 1}$
6:   Compute $a \leftarrow \frac{m}{1 + b^2}$
7:   Generate Normal random variate $Z_V$
8:   return $\hat{V}(t + \Delta t) \leftarrow a\,(b + Z_V)^2$
9: else
10:   Compute $p \leftarrow \frac{\psi - 1}{\psi + 1} \in [0, 1)$
11:   Compute $\beta \leftarrow \frac{1 - p}{m}$
12:   Generate Uniform random variate $U_V$
13:   if $0 \leq U_V \leq p$ then
14:     return $\hat{V}(t + \Delta t) \leftarrow 0$
15:   else
16:     return $\hat{V}(t + \Delta t) \leftarrow \beta^{-1}\ln\!\left(\frac{1 - p}{1 - U_V}\right)$
17:   end if
18: end if
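A compact Matlab/Octave transcription of Algorithm 4 for a single time step might look as follows; this is a sketch of Andersen's sampling rule with ψ_c = 1.5, independent of the thesis's Listing A.4:

% One QE update of the variance, following Algorithm 4.
function vNext = qe_step(v, kappa, theta, sigma, dt)
  e  = exp(-kappa*dt);
  m  = theta + (v - theta)*e;                      % conditional mean
  s2 = v*sigma^2*e*(1 - e)/kappa + theta*sigma^2*(1 - e)^2/(2*kappa);
  psi = s2/m^2;
  if psi <= 1.5                                    % quadratic scheme
    b2 = 2/psi - 1 + sqrt(2/psi)*sqrt(2/psi - 1);
    a  = m/(1 + b2);
    vNext = a*(sqrt(b2) + randn)^2;
  else                                             % exponential scheme
    p    = (psi - 1)/(psi + 1);
    beta = (1 - p)/m;
    u = rand;
    if u <= p
      vNext = 0;
    else
      vNext = log((1 - p)/(1 - u))/beta;
    end
  end
end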
4.5 Generating random realisations of variates

4.5.1 General considerations

A pseudo-random number sequence is one that tries to approximate a truly random process by using a process that relies on a specific initial condition, called a seed, and a deterministic algorithm for generating continuous variates. Since the process is deterministic, the generated sequence cannot be truly random; however, it will not fail all tests of randomness. It is not possible to create a Pseudo-Random Number Generator (PRNG) that passes all empirical statistical testing (see [Knu97] for more on this matter); however, we can accept empirically that good PRNGs will pass all simple tests, while probably failing more complex ones. Another property of PRNGs is that the generated sequences are periodic, meaning that after a certain number of variates has been generated, the same variate as the very first one will be generated again. However, since the deterministic generation relies on a specific mathematical function to generate the random variate, when implemented on a CPU it is significantly more efficient at generating very large sequences, in comparison to generating truly random numbers using True-Random Number Generators (TRNGs)⁵.
There are two qualities that can qualify a sequence's randomness: its uniformity and its independence. The most common tests of uniformity are the Kolmogorov-Smirnov (KS) test and the $\chi^2$ test. The autocorrelation test would test for independence, by identifying periodicity in the sequences.
There are quite a number of algorithms to generate pseudo-random numbers. Some of the most commonly used are mentioned in the sections below.

⁵ The simplest example of a truly random number would be a sequence of numbers recorded by throwing a fair die a large enough number of times. More optimised generators might use natural phenomena or chaotic sources that, once observed, can produce true randomness (e.g. lava lamps [http://www.lavarnd.org], atmospheric white noise, CCD chip white noise, etc.)
4.5.2 Sampling methods

4.5.2.1 Inversion method

Inverse transform sampling relies on the following theorem.

Theorem 4.2. Let $X$ be a random variate from a continuous distribution with a strictly increasing cumulative distribution function $F(x)$. Let $U$ be a uniform$(0,1)$ random variate; then $F^{-1}(U) \sim F(x)$.

Recall that for $y \in [0,1]$, $F_U(y) = P(U \leq y) = y$. Then
$$P(F^{-1}(U) \leq x) = P(U \leq F(x)) = F(x) = P(X \leq x).$$
The inversion process is then:
1. Generate $U \sim U(0,1)$.
2. Output $X = F^{-1}(U)$.
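For example, the exponential distribution with rate λ has F(x) = 1 − e^(−λx), so F^(−1)(u) = −ln(1 − u)/λ; a short Matlab/Octave sketch:

% Inverse-transform sampling of Exp(lambda): X = -log(1-U)/lambda.
lambda = 2; N = 1e5;
U = rand(N, 1);
X = -log(1 - U)/lambda;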
4.5.2.2 Acceptance-rejection method

Consider a random variate $X$ with a Probability Density Function (PDF) $f(x)$ and support $(a,b)$, i.e. $f(x) = 0$ when $x \notin [a,b]$. The function also has an upper bound $K$ for which $f(x) \leq K, \forall x$. The acceptance-rejection process is:
1. Generate uniform random variates $U_1 \sim U(a,b)$ and $U_2 \sim U(0,K)$.
2. If $U_2 \leq f(U_1)$, then output $X = U_1$; else go to step 1.
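As a concrete instance, the density f(x) = 6x(1 − x) on [0, 1] (a Beta(2,2) density) is bounded by K = 1.5, and can be sampled with the loop below (a minimal sketch):

% Acceptance-rejection sampling of f(x) = 6x(1-x) on [0,1], bound K = 1.5.
K = 1.5; f = @(x) 6*x.*(1 - x);
N = 1e4; X = zeros(N, 1);
for i = 1:N
  while true
    U1 = rand;        % candidate, uniform on the support (0,1)
    U2 = K*rand;      % uniform on (0, K)
    if U2 <= f(U1), X(i) = U1; break; end
  end
end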
4.5.3 Uniform random samples

The continuous uniform distribution is part of a family of distributions where all intervals of the same length are equally probable. What follows is a presentation of three procedures for sampling from a Uniform distribution: the Linear Congruential Generator (LCG), the Mersenne twister, and L'Ecuyer's methods.
4.5.3.1 Linear Congruential Generator (LCG)

The LCG method is one of the oldest and best-known methods of generating Uniform random numbers. It is a recursive algorithm that takes advantage of some properties of prime numbers. Like most PRNGs, it too relies on an initial value to seed it; this becomes the starting value of the pseudo-random sequence. The next value is deterministically derived from
$$Z_{i+1} = (a Z_i + c) \pmod{m},$$
where $m$ is the modulus, $a$ is the multiplier, and $c$ is the increment. Since the next variate in the sequence is reduced modulo $m$, it is not possible to get a number higher than $m$. Thus we can derive uniform numbers $U_i \in [0,1]$ by $U_i = \frac{Z_i}{m}$.
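A minimal Matlab/Octave sketch of an LCG, using the classic "minimal standard" parameters m = 2³¹ − 1, a = 16807, c = 0 (an illustrative choice, not the generator used in this thesis):

% Linear congruential generator: Z(i+1) = (a*Z(i) + c) mod m, U = Z/m.
m = 2^31 - 1; a = 16807; c = 0;
N = 1e4; Z = zeros(N, 1); Z(1) = 12345;   % 12345 is an arbitrary seed
for i = 1:N-1
  Z(i+1) = mod(a*Z(i) + c, m);            % exact in double precision here
end
U = Z/m;                                  % uniforms in [0,1)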
4.5.3.2 Mersenne twister

The Mersenne Twister (MT) is an exceptionally high-quality, fast random-number generator. It was developed by Makoto Matsumoto and Takuji Nishimura in 1996 and further extended in 2002; the current implementation is MT19937, with 32-bit word length. This algorithm has a period length of approximately $2^{19937} - 1$ and has been shown to be uniformly distributed in 623 dimensions (see [MN98]). The function g05kg(genid, subid) from the NAG library was used to generate the uniform random variates on $[0,1]$. When genid = 3, the MT algorithm is used to generate them.
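In base Matlab the same generator is also available without NAG; a one-line sketch:

% Mersenne Twister is Matlab's default generator; seed it explicitly.
rng(42, 'twister');
U = rand(1, 10);    % uniforms on (0,1) from MT19937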
4.5.4 Normal random samples
The normal or Gaussian distribution has a characteristic bell-like shape, which gives it the empirical name of the bell curve. It is considered the most prominent distribution in statistics and arises in a multitude of natural phenomena. The Gaussian function that describes the bell curve is

$$f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2},$$

where $\mu$ is the expected value of the distribution and $\sigma^2$ is its variance; the standard normal takes $\mu = 0$ and $\sigma^2 = 1$.
The most common and efficient way to generate normal variates $N(0,1)$ is the Box-Muller transformation. The method relies on first sampling two uniform random variates and then performing a polar transformation on them.

Lemma 4.3. Let $U_1, U_2$ be independent random variates, uniformly distributed on the interval $(0,1)$. We can then generate two normal random variates,

$$Z_0 = \sqrt{-2 \ln U_1} \, \cos(2\pi U_2),$$
$$Z_1 = \sqrt{-2 \ln U_1} \, \sin(2\pi U_2);$$

then $Z_0, Z_1$ are independent random variates, normally distributed with $\mu = 0$ and $\sigma = 1$.
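Before the NAG-based implementation below, the transform itself is only a few lines of MATLAB (sample size arbitrary):

% Box-Muller: two independent uniforms -> two independent N(0,1) variates.
n  = 1e5;
U1 = rand(n, 1);
U2 = rand(n, 1);
R     = sqrt(-2*log(U1));      % radius
Theta = 2*pi*U2;               % angle
Z0 = R .* cos(Theta);
Z1 = R .* sin(Theta);
% Check the first two moments of each output stream.
fprintf('means %.3f / %.3f, variances %.3f / %.3f\n', mean(Z0), mean(Z1), var(Z0), var(Z1));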
In the case of this implementation the NAG library was used to generate the random variates. An example of how to generate them is given in Listing 4.1.

function [x, iseedOut, ifail] = GenerateNormalNAG(mu, var, n, gen)
n    = int64(n);
igen = int64(gen);
[igenOut, iseed] = g05kc(igen);          % initialise the generator state
[igen, iseed]    = g05kb(igen, iseed);   % re-initialise to a repeatable sequence
[x, iseedOut, ifail] = g05la(mu, var, n, igen, iseed);   % n normal variates

Listing 4.1: Generate a 1 x n vector of N(0,1) random variates.
Figure 4.1: Monte Carlo (MC) simulation procedure. [Flowchart: in the static MC branch, a uniform generator produces random variates U, transformation rules turn them into simulated random variates X, and a simulation loop with variance reduction yields sample paths; in the dynamic MC branch, an arbitrage-free model gives a risk-neutral SDE, which a discretisation scheme turns into an approximate random walk, or a series expansion into an approximate random path; both branches feed, via the law of large numbers, the Monte Carlo estimate.]
Figure 4.2: The Sobol quasi-random uniform variates on the left, versus the pseudo-random variates on the right. The NAG library was used for the quasi-random variates. [Scatter plots on the unit square.]
Figure 4.3: The Niederreiter quasi-random uniform variates on the left, versus the pseudo-random variates on the right. The NAG library was used for the quasi-random variates. [Scatter plots on the unit square.]
Figure 4.4: The Faure quasi-random uniform variates on the left, versus the pseudo-random variates on the right. The NAG library was used for the quasi-random variates. [Scatter plots on the unit square.]
Chapter 5
Dataflow programming on FPGAs
This approach to accelerating code takes advantage of the fact that, most of the time, the Central Processing Unit (CPU) is busy figuring out the scheduling of instructions and the branch prediction of the program. The purpose of an FPGA is to provide a customisable, "field-programmable" chip that can be optimised to perform calculations for a specific problem domain. This is achieved by allowing the logic blocks on the chip to be rewired, so that even after a board has been shipped it can be rewired and repurposed.
This rewiring is achieved via a Hardware Description Language (HDL). The language offers the ability to interconnect the logic blocks into different combinations, cater for complex combinatorial functions, and also manage the on-chip memory (mostly in the form of either flip-flops or more structured blocks of memory).
5.1 Applications of FPGAs in computational finance
One company that has made breakthroughs in accelerating financial models on FPGAs is Maxeler Technologies. Their paradigm shift, from the von Neumann control-flow architecture to the dataflow architecture, allows for much higher computational specialisation and acceleration. The control-flow architecture can be likened to a mechanic's workshop, where one person carries out all stages of constructing a product, e.g. a motorcycle. This work does not have to proceed in any particular order; while building the motorcycle, the mechanic can also sidestep and work on a part of a car before returning to the motorcycle according to a schedule. The antipode of this paradigm is a motorcycle production line, where each station on the line is optimised to perform one action, and the overall throughput is set by the flow through the line. What FPGAs provide is a way to create workers (kernels, in Maxeler's nomenclature) that are highly specialised and extremely quick at conducting one specialised operation. The kernels are large synchronous dataflow pipelines that implement the mathematics and the control of the problem. They are asynchronously coupled to other kernels and to I/O sources and sinks (DRAM, Peripheral Component Interconnect express (PCIe), inter-chip links, etc.) by the manager.
An additional benefit of this technology is its low power footprint compared with a standard CPU. Since the clock rate of a CPU is much higher than that of an FPGA, the electrical resistance of its transistors releases more energy in the form of heat. This heat accumulates in a server room and needs to be dissipated; in large server clusters, cooling can amount to a significant cost. Typically the power consumption of a cluster splits roughly half and half between the servers themselves and their cooling systems. FPGA chips therefore tend to offer a significant reduction in the electricity costs of maintenance.
5.2 FPGA versus Intel multicore
Thus far, CPU development has adhered to Moore's law (Moore stated in 1965 that the transistor density of semiconductor chips would double roughly every 18 months). The prediction has held up to this point; however, the limits of transistor scaling, coupled with power consumption increasing as more transistors are fitted onto a chip, cast doubt on the continued relevance of Moore's law. What may happen instead is that transistor counts keep doubling every 18 months, but mainly because the number of cores in each chip doubles. This means that Operating Systems (OSs) will be able to take advantage of multiple cores within a CPU and, via efficient scheduling, maximise performance while minimising power consumption.
The benefit of the Intel multi-core approach is that the current programming paradigm can persist, and most existing code can be ported to the many-core architecture much more quickly than to the more exotic implementations on GPUs and FPGAs.
On the one hand, the FPGA can leverage two advantages over the CPU approach. First, it has more silicon dedicated to calculations than the CPU. Second, it relies on the dataflow architecture to do away with the taxing aspects of instruction scheduling and branch prediction. This way the calculation pipeline is always full and a result is produced every clock cycle (see Figure 5.2 for more details). On the other hand, the CPU, as shown in Figure 5.1, needs to handle concurrent threads vying for their turn on the Arithmetic Logic Unit (ALU) in order to progress their calculations.

Figure 5.1: The process architecture on a CPU, where the ALU is referred to as the Function Unit. Data has to be moved into the Function Unit from memory and then moved back into memory for storage. (Photo used by permission of Maxeler Technologies.)
5.3 Scope for FPGAs
The FPGA lends itself most aptly to problems of a difference-engine nature. For instance, it has been used successfully in the seismic acquisition industry to perform finite-difference modelling of geophysical models, reverse-time migrations, and CRS stacking. Lately, implementations in credit derivatives pricing have been appearing as well [WSRM12]. Essentially, any mathematical process that can be decomposed into distinguishable, self-sufficient, sequential calculations can achieve high acceleration on the FPGA architecture.
Figure 5.2: The FPGA architecture as implemented by Maxeler Technologies. The MaxCompiler constructs the dataflow tree, which defines the circuit architecture on the FPGA chip. From then on, data from memory is piped through the different dataflow cores until it exits the calculation pipe and is committed back to memory. (Photo used by permission of Maxeler Technologies.)
5.4 Application to the Heston model
The implementation of Heston's stochastic volatility model has two aspects to it: first, the code that runs on the host; and second, the code that defines the circuit architecture of the FPGA and performs the necessary calculations.
Since only repetitive calculations can benefit from the dataflow architecture, certain elements need to run on the host and others on the FPGA card. Maxeler Technologies use the nomenclature of a kernel and a manager. The kernel comprises a set of calculations that produce a distinct result, e.g. a 3-value moving average. The manager's responsibility is to instantiate and administer the life cycle and functions of each kernel assigned to it. For this implementation the manager creates numerous pipes within a given MaxCard (the latest models of Maxeler's FPGA cards provide an ever-increasing amount of resources on board the chip) to handle different operations. The more pipes that can be fitted into the available silicon, the better the overall performance of the FPGA. The manager is responsible for creating the pipes and populating them with kernels that generate random variates from the Gamma distribution, and with kernels that calculate the next values of the variance and of the price of the underlying. Once all the prices of the underlying have been generated for every timestep, the results are aggregated back on the host's CPU.

Figure 5.3: This figure illustrates how code interacts between the host CPU and the FPGA kernels. (Photo used by permission of Maxeler Technologies.)
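To make the kernel idea concrete, the 3-value moving average mentioned above is sketched below in MATLAB (the function name and boundary handling are illustrative choices, not Maxeler code). Because each output depends only on a fixed window of the input stream, the same recurrence maps directly onto a dataflow pipe producing one output per clock cycle.

function y = movAvg3(x)
% Three-point moving average: an archetypal dataflow kernel.
n = numel(x);
y = x;                                 % boundary samples pass through unchanged
for i = 2:n-1
    y(i) = (x(i-1) + x(i) + x(i+1)) / 3;
end
end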
Chapter 6
Implementation on GPUs
6.1 Using GPUs to solve large parallel problems
The GPU is a highly specialised parallel processor for accelerating graphical computations. The first GPU card to consolidate the entire graphics-processing pipeline off-chip for 3D and 2D acceleration was the NVIDIA GeForce 256, released in 1999. The main function of the card was to render a graphical image on the GPU chip, turning a model into an image; the strength of the chip was its specialised computational capacity for floating-point arithmetic.
The card has onboard memory whose different access layers are used internally by the different threads running on the GPU blocks. It is the developer's duty to orchestrate the migration of state from host CPU memory to worker GPU memory, create the work kernels in the blocks, aggregate the results in the worker memory, and finally return the result to host CPU memory (see Figure 6.1).
The initial use for this acceleration card was in computer video games, whose high visualisation demands spearheaded the graphics-card industry into ever-improving GPU cards. As a result, GPU performance on parallel calculations started to outperform high-end CPUs because of this specialisation. This was when clients with increased processing needs started to take notice: academic institutions, companies in the oil & gas sector, and others began to realise that code could be parallelised on a GPU instead of relying on grids of CPUs. This need gave rise to the niche market of General Purpose computing on GPUs (GPGPUs) for dealing with embarrassingly parallel problems in certain domains. Tables 6.1 and 6.2 detail the most current GPGPU cards available on the market and their specifications.
Figure 6.1: Memory-space architecture on a GPU. Source: "NVIDIA CUDA Programming Guide", version 1.1.
Features                                              Tesla K10
Number and type of GPU                                2 Kepler GK104s
Peak double-precision floating-point (FP) performance 190 Gigaflops
Peak single-precision floating-point (FP) performance 4577 Gigaflops
Memory bandwidth (ECC off)                            320 GB/sec
Memory size (GDDR5)                                   8 GB
CUDA cores                                            3072

Table 6.1: GPGPU specifications for the dual-Kepler Tesla K10 card.
Features                       Tesla M2090      Tesla M2075      Tesla M2070Q
Number and type of GPU         1 Fermi GPU      1 Fermi GPU      1 Fermi GPU
Peak double-precision FP       665 Gigaflops    515 Gigaflops    515 Gigaflops
Peak single-precision FP       1331 Gigaflops   1030 Gigaflops   1030 Gigaflops
Memory bandwidth (ECC off)     177 GB/sec       150 GB/sec       150 GB/sec
Memory size (GDDR5)            6 GB             6 GB             6 GB
CUDA cores                     512              448              448

Table 6.2: GPGPU specifications from NVIDIA's website.
http://www.nvidia.com/object/teslaservers.html
There has been a flurry of different problems implemented on GPUs across disciplines: problems in chemical engineering [ZVS11], iterative algorithms for linear systems [ESV10], finance [RS09], and bioinformatics [VS11]. These implementations can be rapidly prototyped in MATLAB, using the Parallel Computing Toolbox, and then deployed on heterogeneous platforms including GPUs. There is also the option to develop highly specialised code in C/C++ with the Compute Unified Device Architecture (CUDA) toolkit, in order to increase the performance of the bundled executables or to deploy API libraries exposing the modelled functionality.
The development friction with the latter approach arises because writing CUDA code is cumbersome and rigid, and niche knowledge of the underlying platform is essential. Recently, however, there have been attempts to bridge the gap between existing, fully tested and well-understood code and the parallelisation offered by GPUs. One such approach is a Java-to-GPU compiler: the serialisation of the Java state (e.g. Java objects), the kernel code generation and launch, and the deserialisation of state back into CPU memory are all done automatically. The developer only writes the model logic in Java and the compiler automates the rest of the process (see [PSFW12] for more information).
6.2 MATLAB implementation of the Heston model for the GPGPU
Using the existing Heston model shown in Listing A.4, it is possible to augment the model to take advantage of MATLAB's Parallel Computing Toolbox and deploy it on multiple GPUs. There are several ways to do this, taking advantage of a single GPU card, of multiple cards per node, and finally of multiple nodes in a grid.
The easiest form of access to the GPU is via the overloaded functions in the toolbox that act on the gpuArray class. This method effectively transfers data over the PCIe bus into GPU memory; transformations, e.g. addition, subtraction, multiplication, etc., are then applied on the GPU rather than the CPU. The calculated data is then collected from the GPU into CPU memory via the gather method. This high-level approach is useful when quick matrix calculations are needed without much fine-grained control. However, when the problem dimensionality increases, fine-grained control is required in order to increase the performance of the acceleration.
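A minimal sketch of this high-level route (matrix sizes arbitrary):

% High-level GPU offload via gpuArray.
A = gpuArray(rand(4096));    % transfer data over PCIe into GPU memory
B = gpuArray(rand(4096));
C = A*B + 2*A - B;           % overloaded operators execute on the GPU
Chost = gather(C);           % collect the result back into CPU memory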
For this reason it is important to identify the independent calculations in the model and take advantage of multiple GPU cards to accelerate them. The Monte Carlo simulation lends itself quite fittingly, because the individual path calculations are independent of each other and can hence be executed on multiple GPU cards. This is achieved by executing a parfor loop whose constituent iterations are distributed over multiple GPUs, and even across multiple nodes.
In order to access multiple GPU cards it is necessary to bind them to specific MATLAB workers. This is achieved via the spmd...end construct; Listing 6.1 shows this in more detail.
N = 1000;
A = gpuArray(magic(N));

spmd
    gpuDevice(labindex);          % bind each worker to its own GPU
end

parfor ix = 1:N                   % distribute the for loop to the workers
    % do the GPU calculation
    X = myGPUFunction(ix, A);
    % gather the data back to host memory
    Xtotal(ix, :) = gather(X);
end

Listing 6.1: Example of multiple-GPU assignment, distributing a MATLAB function to multiple workers.
The actual implementation of the present Heston model for the GPU platform can be viewed in Listing A.5 in Appendix A. Martin Takáč offered significant implementation guidance, in particular on the use of the arrayfun function.
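Listing A.5 itself is not reproduced here, but the following minimal sketch (with illustrative parameter values and function names, not those of the thesis experiments) shows the pattern: arrayfun fuses the element-wise Euler update of the Heston variance and price into a single GPU kernel applied across all paths at once.

function S = hestonEulerGPUSketch(nPaths, nSteps)
% Illustrative Heston parameters (arbitrary for this sketch).
S0 = 100; v0 = 0.04; kappa = 2; theta = 0.04;
sigma = 0.3; rho = -0.7; r = 0.03; T = 1;
dt = T / nSteps;

S = gpuArray.ones(nPaths, 1) * S0;     % price paths on the GPU
v = gpuArray.ones(nPaths, 1) * v0;     % variance paths on the GPU

for k = 1:nSteps
    Z1 = gpuArray.randn(nPaths, 1);    % driver of the variance process
    Z2 = rho*Z1 + sqrt(1 - rho^2)*gpuArray.randn(nPaths, 1);  % correlated price driver
    % arrayfun compiles the scalar update into one GPU kernel over all paths.
    [S, v] = arrayfun(@eulerStep, S, v, Z1, Z2, r, kappa, theta, sigma, dt);
end
S = gather(S);                         % terminal prices back on the host
end

function [Snew, vnew] = eulerStep(S, v, Z1, Z2, r, kappa, theta, sigma, dt)
vp   = max(v, 0);                      % full truncation guards against negative variance
Snew = S * exp((r - 0.5*vp)*dt + sqrt(vp*dt)*Z2);
vnew = v + kappa*(theta - vp)*dt + sigma*sqrt(vp*dt)*Z1;
end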
Figure 6.2: Schematic of the multiple cores in the NVIDIA Fermi GPU.
Chapter 7
Implementation on the Techila cloud
7.1 The Techila cloud platform
Techila is a Finnish company dealing in the domain of distributed computational problems. These are characterised by two kinds of problems. The first are traditional parallel problems, where the individual distributed processes run on different hosts but require some information to be passed between them; fluid dynamics and finite-element models are examples of parallel problems.
Figure 7.1: Parallel problem. Each computational job accepts input data and uses it to compute results. Communication between jobs is required in order for the individual jobs to be completed. (Photo used by permission of Techila.)
On the other hand are the embarrassingly parallel problems. These do not require communication between the distributed calculations, since each calculation is independent of all others. Examples include Monte Carlo simulations, computer-graphics rendering, genetic algorithms, brute-force methods in cryptography, Basic Local Alignment Search Tool (BLAST) searches in bioinformatics, and machine-learning analytics [DEW+12].
Figure 7.2: Embarrassingly parallel problem. Each process accepts input data and uses it to compute results. Communication is limited to receiving input data and transferring output data; no inter-process communication takes place. (Photo used by permission of Techila.)
Techila has produced a computational platform for heterogeneous distributed computing of embarrassingly parallel problems. The benefit of such an approach is the ability to utilise multiple hardware and operating systems in order to leverage all available computing power for problem solving. The computational resources can be a plethora of instances: laptops, unused machines in close proximity, local servers, and even on-demand instances from cloud providers. In other words, Techila ports and manages the code on a hybrid set of computational resources.
Each computational resource is assigned a worker instance that performs an independent computation. All the workers are administered by a central manager, which interacts with the end user to collect the problem state. As shown
Figure 7.3: Techila's three-tiered system structure. The end user interacts with the Techila server, which distributes computations to the workers. All network communication in the Techila environment is secured using SSL (Secure Sockets Layer), meaning all transferred data is encrypted. (Photo used by permission of Techila.)
in Figure 7.3, the end user communicates solely with the server, whose job in turn is to segment the problem into its distributed parts and transfer them to the heterogeneous constituency of computational resources.
7.2 Description of platform implementation
For the present case of the Heston model the cloud implementation was achieved