Matematicas FINANCIERAS CIFF dob.pdf

ITS30001 34 views 161 slides Jun 29, 2024
Slide 1
Slide 1 of 161
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69
Slide 70
70
Slide 71
71
Slide 72
72
Slide 73
73
Slide 74
74
Slide 75
75
Slide 76
76
Slide 77
77
Slide 78
78
Slide 79
79
Slide 80
80
Slide 81
81
Slide 82
82
Slide 83
83
Slide 84
84
Slide 85
85
Slide 86
86
Slide 87
87
Slide 88
88
Slide 89
89
Slide 90
90
Slide 91
91
Slide 92
92
Slide 93
93
Slide 94
94
Slide 95
95
Slide 96
96
Slide 97
97
Slide 98
98
Slide 99
99
Slide 100
100
Slide 101
101
Slide 102
102
Slide 103
103
Slide 104
104
Slide 105
105
Slide 106
106
Slide 107
107
Slide 108
108
Slide 109
109
Slide 110
110
Slide 111
111
Slide 112
112
Slide 113
113
Slide 114
114
Slide 115
115
Slide 116
116
Slide 117
117
Slide 118
118
Slide 119
119
Slide 120
120
Slide 121
121
Slide 122
122
Slide 123
123
Slide 124
124
Slide 125
125
Slide 126
126
Slide 127
127
Slide 128
128
Slide 129
129
Slide 130
130
Slide 131
131
Slide 132
132
Slide 133
133
Slide 134
134
Slide 135
135
Slide 136
136
Slide 137
137
Slide 138
138
Slide 139
139
Slide 140
140
Slide 141
141
Slide 142
142
Slide 143
143
Slide 144
144
Slide 145
145
Slide 146
146
Slide 147
147
Slide 148
148
Slide 149
149
Slide 150
150
Slide 151
151
Slide 152
152
Slide 153
153
Slide 154
154
Slide 155
155
Slide 156
156
Slide 157
157
Slide 158
158
Slide 159
159
Slide 160
160
Slide 161
161

About This Presentation


Slide Content

GLOBAL STANDARD IN FINANCIAL ENGINEERING
CERTIFICATE IN FINANCECQF
Certificate in Quantitative Finance
Subtext to go here
A4 PowerPoint cover landscape2.indd 1
21/10/2011 10:53
June 2012 MathsPrim
er
This is a revision course designed to act as a mathematics refresher. The
volume of work covered is signi…cantly large so the emphasis is on working
through the notes and problem sheets. The four topics covered are
Calculus
Linear Algebra
Di¤erential Equations
Probability & StatisticsPage 1 1 Introduction to Calculus
1.1 Basic Terminology
We begin by de…ning some mathematical shorthand and number systems
9there exists
8for all
)therefore
*because
!which gives
s.t such that
:such that
i¤ if and only if
equivalent
similar
2an element of
!xa uniquexPage 2 NaturalNumbersN=f0;1;2;3;::::: g
Integers(N)Z=f0;1;2;3; ::::: g
Rationals
p
q
:p;q2Z;Q=
n
1
2
;0:76;2:25;0:3333333::::
o
Irrationals
Q=
np
2;0:01001000100
001:::; ; e
o
RealsRall the above
ComplexC=
n
x+iy:i=
p
1
oPage 3

(a;b) =a< x < bopen interval
[a;b] =axbclosed interval
(a;b] =a < xbsemi- open/closed interval
[a;b) =ax < bsemi- open/closed interval
So typically we would writex2(a;b):
Examples
1< x <1 (1;1)
1< xb(1;b]
ax <1 [a;1)Page 4 1.2 Functions
This is a term we use very loosely, but what is a function? Clearly it is a type of
black box with some input and a corresponding output. As long as the correct
result comes out we usually are not too concerned with what happens ’inside’.
Afunctiondenotedf(x)of a single variable xis a rule that assigns each ele-
mentofasetX(writtenx2X)toexactlyoneelementyofasetY(y2Y):
A function is denoted by the formy=f(x)orx7!f(x):
We can also writef:X!Y;which is saying thatfis a mapping such that
all members of the input setXare mapped to elements of the output set Y:
So clearly there are a number of ways to describe the workings of a function.
For example, iff(x) =x
3
;thenf(2) =2
3
=8:Page 5 ­30
­20
­10
0
10
20
30
­4 ­3 ­2 ­1 0 1 2 3 4 We ofte
n writey=f(x)whereyis thedependent variable andxis the
independent variable.Page 6 The setXiscalled thedomainoffand the setYis called theimage(or
range), writtenDomfandImf;in turn. For a given value of xthere should
be at mostone
value ofy. S
o the role of a function is to operate on the
domain and map it across uniquely to the range.
So we have seen two notations for the same operation.
The …rsty=f(x)suggests a graphical representation whilst the second
f:X!Yestablishes the idea of a mapping.Page 7

There arethree types of mapping:
1. For eachx2X;9oney2Y:This is a one to one mapping (or11
function) e.g.y= 3x+1:
2. More than onex2X;gets mapped onto oney2Y:This is a many to
one mapping (or many1function) e.g.y= 2x
2
+1;becausex=2
yields oney:
3. For eachx2X;9more than oney2Y;e.g.y=
p
x:This is a
many
to one mapping. Clearly it is multivalued, and has two branches. We will
assumethatonlythepositivevalueisbeingconsideredforconsistencywith
the de…nition of a function. A one to many mapping is not
a func
tion.Page 8 The function maps the domain across to the range. What about a process
which does the reverse? Such an operation is due to the inverse functionwhich
maps the image of the original function to the domain. The function y=f(x)
has inversex=f
1
(y):Interchange ofxandyleads to consideration of
y=f
1
(x):
The inverse functionf
1
(x)is de…ned so that
f

f
1
(x)

=xandf
1
(f(x)) =x:
Thusx
2
and
p
xare
inverse functions and we say they are mutually inverse.
Note the inverse
p
xis multivalue
d unless we de…ne it such that only non-
negative values are considered.
Example 1:What is the inverse ofy= 2x
2
1:Page 9 i.e. wewanty
1
:One way this can be done is to write the function above as
x= 2y
2
1
and now rearrange to havey=::::so
y=
s
x+1
2
:
Hencey
1
(x) =
s
x+1
2
:Check:
y
y
1
(x) = 2
0
@
s
x+1
2
1 A
2
1 =x=y
1
y(
x)
Example 2:Considerf(x) = 1 =x;thereforef
1
(x) = 1 =x
Domf= (1;0)[(0;1)orRf0gPage 10 Returningto the earlier example
y= 2x
2
1
clearlyDomf=R(clearly) and for
y
1
(x) =
s
x+1
2
to exist
we require the term inside the square root sign to be non-negative, i.e.
x+1
2
0 =)x >
1;thereforeDomf=f[1;1)g:
Aneven functionis one which has the property
f(x) =f(x)
e.g.f(x) =x
2
:
f(x) =x
3
is an example of anodd functionbecause
f(x) =f(x):
Most functions are neither even nor odd but every function can be expressed
as the sum of an even and odd function.Page 11

1.2.1 Explicit/Implicit Representation
When we express a function asy=f(x);then we can obtainycorresponding
to a (known) value ofx:We sayyis anexplicitfunction. All known terms
are on the right hand side (rhs) and unknown on the left hand side (lhs). For
example
y= 2x
2
+4x16 = 0
Occasionally we may write a function in an implicitformf(x;y) = 0 ;al-
though in general there is no guarantee that for each xthere is a uniquey.
A trivial example isyx
2
= 0;which in its current form is implicit. Simple
rearranging givesy=x
2
which is explicit :
A more complex example is4y
4
2y
2
x
2
yx
2
+x
2
+3 = 0:
This can neither be expressed as y=f(x)orx=g(y):Page 12 So wesee all known and unknown variables are bundled together. An implicit
form which does not give rise to a function is
y
2
+x
2
16 = 0 :
This can be written as
y=
q
16x
2
:
and e.
g. forx= 0we can have eithery= 4ory=4;i.e. one to many.Page 13 1.2.2 Types of functionf(x)
Polynomialsare functions which involve powers of x;
y=f(x) =a0+a1x+a2x
2
+:::::
::+an1x
n1
+anx
n
:
The highest power is called the degreeof the polynomial - sof(x)is ann
th
degree polynomial. We can express this more compactly as
f(x) =
n
X
k=0
a
kx
k
where the coe¢ cients ofxare constants.
Polynomial equations are writtenf(x) = 0 ;so ann
th
degree polynomial
equation is
anx
n
+an1x
n1
+::::::+a2x
2
+a1x+a0= 0:Page 14 k= 1;2givesa linear and quadratic in turn. The most general form of
quadratic equation is
ax
2
+bx+c= 0:
To solve we can complete the square which gives

x+
b
2a

2

b
2
4a
2
+
c
a
= 0

x+
b
2a

2
=
b
2
4a
2

c
a
=
b
2
4ac
4a
2
x+
b
2a
=

p
b
2
4ac 2a
and …n
ally we get the well known formula for x
x=
b
p
b
2
4ac
2a
:
There are
three cases to consider:
(1)b
2
4ac >0!x16=x22R:2 distinct real rootsPage 15

(2)b
2
4ac= 0!x=x1=x2=
b
2a
2R:one t
wo fold root
(3)b
2
4ac <0!x16=x22CComplex conjugate pairPage 16 1.2.3 TheModulus Function
Sometimes we wish to obtain the absolute value of a number, i.e. positive part.
For example the absolute value of3:9is3:9:In maths there is a function
which gives us the absolute value of a variable xcalled themodulus function ,
writtenjxjand de…ned as
y=jxj=
(
x x >0
x x <0
;
although most de…nitions included equality in the positive quadrant.modulus function
0
0.5
1
1.5
2
2.5
3
3.5
­4 ­3 ­2 ­1 0 1 2 3 4 Page 17 This is anexample of apiecewise function.
The name is given because they are functions that comprise of ’pieces’, each
piece of the function de…nition depends on the value of x.
So, for the modulus, the …rst de…nition is used when xis non-negative and the
second ifxis negative.Page 18 1.3 Limits
Choose a pointx0and functionf(x):Suppose we are interested in this
function near the pointx=x0:The function need not be de…ned atx=x0:
We writef(x)!lasx!x0;"iff(x)gets closer and closer to lasx
gets close tox0". Mathematically we write this as
lim
x!x0
f(x)!l;
if9a numberlsuch that
Wheneverxis close tox0
f(x)is close tol:Page 19

The limit onlyexists if
f(x)!lasx!x

0
f(x)!lasx!x
+
0
Let us have a look at a few basic examples and corresponding "tricks" to
evaluate them
Example 1:
lim
x!0

x
2
+2x+3

!0+0+3!3;Page 20 Example 2:
li
m
x!1
x
2
+2x+2
3x
2
+4
= li
m
x!1
x
2
x
2
+
2x
x
2
+
2
x
2
3x
2 x
2
+
4
x
2
=
lim
x!
1
1+
2
x
+
2
x
2
3+
4
x
2
!
1
3
:
Example 3:
li
m
x!3
x
2
9
x3
= lim
x
!3
(x+3)( x3)
(x3)
= lim
x
!3
(x+3)!6Page 21 A functionf(x)iscontinuousatx0if
lim
x!x0
f(x) =f(x0):
That is, ’we can draw its graph without taking the pen o¤ the paper’.Page 22 1.3.1 Theexponential and log functions
Thelogarithm(or simplylog) was introduced to solve equations of the form
a
p
=N
and we saypislogofNto basea:That is we take logs of both sides ( log
a)
log
aa
p
= log
aN
which gives
p= log
aN:
By de…nitionlog
aa= 1(important).
We will often need the exponential function e
x
and the (natural) logarithm
log
exor(lnx):Page 23

Here
e= 2:718
281828::::
which is the approximation to
lim
n!1

1+
1
n

n
whennis very la
rge. Similarly the exponential function can be approximated
from
lim
n!1

1+
x
n

n
lnxande
x
are
mutual inverses:
log( e
x
) =e
logx
=x:Page 24 Also
1
e
x
=e
x
:
Here we
have used the property(x
a
)
b
=x
ab
;which allowed us to write
1
e
x= (e
x
)
1
=e

x
:
Their graphs look like this:Exponential Functions
0
1
2
3
4
5
6
7
8
­2.5 ­2 ­1.5 ­1 ­0.5 0 0.5 1 1.5 2 2.5
x
exp (x)  an d   (exp(­x) lo g x  a n d  ln x
­2.5
­2
­1.5
­1
­0.5
0
0.5
1
1.5
2
0 1 2 3 4 5
x Page 25 Note thate
x
is always strictly positive. It tends to zero as xbecomes very
large and negative, and to in…nity as xbecomes large and positive. To get
an idea of how quicklye
x
grows, note the approximatione
5
t150:
Later we will also seee
x
2
=2
;which is particularly useful in probability :This
function decays particularly rapidly as jxjincreases.
Note:
e
x
e
y
=e
x+y
; e
0
= 1
(recallx
a
:x
b
=x
a+b
) and
log(xy) = logx+logy;log(1 =x) =logx;log1 = 0 :Page 26 log

x
y
!
= logxlo
gy:
Dom(e
x
)=R;Im(e
x
)= (0 ;1)
Dom(lnx) = (0 ;1);Im(lnx) =R
Example:
lim
x!1
e
x
!0; lim
x!1
e
x
! 1; lim
x!0
e
x
!e
0
= 1:Page 27

1.3.2 Trigonometric/Circular Functionssinx and cosx
­1.5
­1
­0.5
0
0.5
1
1.5
­8 ­6 ­4 ­2 0 2 4 6 8 sinxis a
noddfunction, i.e.sin(x) =sinx:
It isperiodicwith period2:sin(x+2) = sinx. This means that after
every360

it repeats itself.
sinx= 0()x=n8n2ZPage 28 Dom(sinx)=RandIm(sinx)=[1;1]
cosxis anevenfunction, i.e.cos( x) = cosx:
It isperiodicwith period2:cos( x+2) = cosx.
cosx= 0()x=(2n+1)

2
8n2Z
Dom(cosx
)=RandIm(cosx) =[ 1;1]
tanx=
sinx
cosx
This is an
odd function:tan( x) = tanx
Periodic:tan( x+) = tanxPage 29 Dom=fx:
cosx6= 0g=
n
x:x6=(2n+1)

2
;n2Z
o
=R
n
(2n+1)

2
;n2Z
o
Trigonometri
c Identities:
cos
2
x+sin
2
x= 1; sin(xy) = sinxcosycosxsiny
cos( xy) = cosxcosysinxsiny; tan( x+y) =
tanx+tany
1tanxtany
Exercise:V
erify the followingsin

x+

2

= cosx
; cos


2
x

= sinx:
The
reciprocal trigonometric functions are de…ned by
secx=
1
cosx
; cscx=
1
sinx
; cotx=
1
tanxPage 30 Moreexamples on limiting:
lim
x!0
sinx!0; lim
x!0
sinx
x
!1; li
m
x!0
jxj !0
What aboutlim
x!0
jxj
x
?
lim
x!
0
+
jxj
x
= 1
lim
x
!0

jxj
x
=1
therefo
re
jxj
x
does
not tend to a limit asx!0:Page 31

Hyperbolic Functions
sinhx=
1
2

e
x
e
x

Odd f
unction:sinh( x) =sinhx
Dom(sinhx)=R; Im(sinhx) =R Page 32 coshx=
1
2

e
x
+e
x

Even fu
nction:cosh(x) = coshx
Dom(coshx)=R;Im(coshx) = [1 ;1) Page 33 tanhx=
sinhx
coshx
Dom(tanhx
)=R; Im(tanhx) = ( 1;1)
Identities:
cosh
2
xsi
nh
2
x= 1
sinh(x+y) = sinhxcoshy+coshxsinhy
cosh(x+y) = coshxcoshy+sinhxsinhyPage 34 Inverse Hyperbolic Functions
y= sinh
1
x!x= sinhy=
expyexp(y)
2
;
2x= ex
pyexp(y)
multiply both sides byexpyto obtain2xe
y
=e
2y
1which can be written
as
(e
y
)
2
2x(e
y
)1 = 0 :
This gives us a quadratic in e
y
therefore
e
y
=
2x
p
4x
2
+4 2
=x
q
x
2
+1
Now
p
x
2
+1> x=)x
p
x
2
+1<0and w
e know thate
y
>0therefore
we havee
y
=x+
p
x
2
+1:Hence takin
g logs of both sides gives us
sinh
1
x= ln




x+
q
x
2
+1



Page 35

Dom

sinh
1
x

=R; Im

sinh
1
x

=R Similarlyy=
cosh
1
x!x= coshy=
expy+exp( y)
2
;
2x= ex
py+exp( y)and again multiply both sides by expyto obtain
(e
y
)
2
2x(e
y
)+1 = 0 :
and
e
y
=x+
q
x
2
1Page 36 We take the positive root (not both) to ensure this is a function.
cosh
1
x= ln




x+
q
x
2
1




Dom

cosh

1
x

=[1;1); Im

cosh
1
x

= [0 ;1)
We …
nish o¤ by obtaining an expression for tanh
1
x:Puty= tanh
1
x!
x= tanhy=
expyexp(y)
expy+ex
p(y)
;
xexpy+xexp(y) = expyexp(y)Page 37 and as before multiply through bye
y
xexp2y+x= exp2y1
exp2y(1x) = 1+x!exp2y=
1+x
1x
taking logs
gives
2y= ln




1+x
1x




=)tanh
1
x=
1
2
ln




1+x
1x



Dom

tanh
1
x

=(
1;1); Im

tanh
1
x

=R Page 38 1.4 Di¤erentiation
A basic question asked is how fast does a function f(x)change withx? The
derivativeoff(x);written
df
dx
:Leibniz notation
o
r
f
0
(x) :Lagrange notation,
is de…ned for eachxas
f
0
(x) = lim
x!0
f(x+x)f(x)
x
assu
ming the limit exists (it may not) and is unique.Page 39

The term on the right hand side
f(x+x)f(x)
x
is c
alledNewton quotient .
Di¤erentiability implies continuity but converse does not always hold.
There is another notation for a derivative due to Newton, if a function varies
with time, i.e.y=y(t)then a dot is used

y
We can also de…ne operator notation due to Euler. Write
D
d
dx
:
ThenDoperates
on a function to produce its derivative, i.e. Df
df
dx
:Page 40 The earlier form of the derivative given is also called a forward derivative.
Other possible de…nitions of the derivative are
f
0
(x) = lim
x!0
1
x
(f(x
)f(xx))backward
f
0
(x) = lim
x!0
1
2x
(f(x+
x)f(xx))centred
Example:Di¤erentiatingx
3
from …rst principles:
f(x) =x
3
f(x+x) = ( x+x)
3
=x
3
+x
3
+3xx(x+x)
f(x+x)f(x)
x
=
x
3
+
3xx(x+x)
x
=x
2
+
3x
2
+3xx
!3x
2
asx!0;Page 41 d
dx
x
n
=nx
n1
;
d
dx
e
x
=e
x
;
d
dx
e
ax
=ae
ax
;
d
dx
logx=
1
x
;
d dx
cosx=sinx;
d
dx
sinx= c
osx;
d
dx
tanx= sec
2
x
and
so on. Take these as de…ned (standard results).
Examples:
f(x) =x
5
!f
0
(x) = 5 x
4
g(x) =e
3x
!g
0
(x) = 3 e
3x
= 3g(x)Page 42 Linearity:Ifandare constants andy=f(x)+g(x)then
dy
dx
=
d
dx
(f(x)+g(x
)) =f
0
(x)+g
0
(x):
Thus ify= 3x
2
6e
2x
then
dy=dx= 6x+12e
2x
:Page 43

1.4.1 Product Rule
Ify=f(x)g(x)then
dy
dx
=f
0
(x)g(x)+f(x
)g
0
(x):
Thus ify=x
3
e
3x
then
dy=dx= 3x
2
e
3x
+x
3

3e
3x

= 3x
2
(1+x)e
3x
:Page 44 1.4.2 Function of a Function Rule
Di¤erentiation is often a matter of breaking a complicated problem up into
simpler components. The function of a function rule is one of the main ways
of doing this.
Ify=f(g(x))then
dy
dx
=f
0
(g(x))g
0
(x):
Thus ify=e
4
x
2
then
dy=dx=e
4x
2
4:2x= 8xe
4x
2
:Page 45 So di¤erentiate the whole function, then multiply by the derivative of the
"inside"(g(x)):
Another way to think of this is in terms of the chain rule.
Writey=f(g(x))as
y=f(u); u=g(x):
Then
dy
dx
=
d
dx
f(u) =
du
dx
d du
f(u) =g
0
(x
)f
0
(u)
=g
0
(x)f
0
(g(x)):
Symbolically, we write this asPage 46 dy dx
=
du
dx
dy du
provideduis
a function ofxalone.
Thus fory=e
4x
2
;writeu= 4x
2
; y=e
u
:Then
dy
dx
=
du
dx
dy du
= 8xe
4x
2
:
F
urther examples:
y= sinx
3
y= sinu;whereu=x
3
y
0
= cosu:3x
2
!y
0
= 3x
2
cosx
3
y= tan
2
x:this is how we write(tanx)
2
so put
y=u
2
whereu= tanx
y
0
= 2u:sec
2
x!y
0
= 2tanxsec
2
xPage 47

y= lnsinx:Putu= sinx!y= lnu
dy
du
=
1
u
;
du
dx
= cosx
hen
cey
0
= cotx:
Exercise:Di¤erentiatey= logtan
2
xto show
dy
dx
= 2secxcs
cxPage 48 1.4.3 Quotient Rule
Ify=
f(x)
g(x)
then
dy
dx
=
g(x)f
0
(x)f(x)g
0
(x)
(g(x))
2
:
Thus ify=e
3
x
=x
2
;
dy
dx
=
x
2
3e
3x
2xe
3x
x
4
=
3x2
x
3
e
3x
:
This is a
combination of the product rule and the function of a function (or
chain) rule. It is very simple to derive:Page 49 Startingwithy=
f(x)
g(x)
and writing
asy=f(x)(g(x))
1
we apply the
product rule
dy
dx
=
df
dx
(g(x))
1
+f(x)
d
dx
(g(x))
1
Now u
se the chain rule on(g(x))
1
;i.e. writeu=g(x)so
d
dx
(g(x))
1
=
du
dx
d du
u
1
=g
0
(x)

u
2

=
g
0
(x)
g(x)
2
:
Then
dy
dx
=
1
g(x)
df
dx
f(x)
g
0
(x)
g(x)
2
=
f
0
(x)
g(x)

f(x)g
0
(x)
g(x)
2
:Page 50 To simplifywe note that the common denominator isg(x)
2
hence
dy
dx
=
g(x)f
0
(x)f(x)g
0
(x)
g(x)
2
:
Examples:
d
dx
(xe
x
)=x
d
dx
(e
x
)+e
x
d
dx
(x)
=xe
x
+e
x
=e
x
(x+1)
;
d
dx
(e
x
=x) =
x(e
x
)
0
e
x
(x
)
0
(x)
2
=
xe
x
e
x
x
2
=
e
x
x
2
(x1);
d
dx

e
x
2

=
d
dx
(e
u
)whereu=x
2
)du=2xdx
= (
2x)e
x
2
:Page 51

1.4.4 Implicit Di¤erentiation
Consider the function
y=a
x
whereais a constant. If we take natural logof both sides
lny=xlna
and now di¤erentiate both sides by applying the chain rule to the left hand
side
1
y
dy
dx
= lna
dy
dx
=ylna
and re
placeybya
x
to give
dy
dx
=a
x
lna:Page 52 This is anexample ofimplicit di¤erentiation.
We could have obtained the same solution by initially writing a
x
as a combi-
nation of alogandexp
y= exp(lna
x
)= exp( xlna)
y
0
=
d
dx

e
xlna

=e
xlna
d
dx
(xlna)
=a
x
lna:
Consider th
e earlier implicit function given by
4y
4
2y
2
x
2
yx
2
+x
2
+3 = 0:
The resulting derivative will also be an implicit function. Di¤erentiating gives
16y
3
y
0
2

2yy
0
x
2
+2y
2
x



y
0
x
2
+2xy

=2x

16y
3
2yx
2
x
2

y
0
=2x+4y
2
x+2xy
y
0
=
2x+4y
2
x+2xy
16y
3
2yx
2
x
2Page 53 1.4.5 Higher Derivatives
These are de…ned recursively;
f
00
(x) =
d
2
f
dx
2
=
d
dx

df
dx

f
000
(x) =
d
3
f
dx
3
=
d
dx

d
2
f
dx
2
!
and so on
. For example:
f(x) = 4 x
3
!f
0
(x) = 12 x
2
!f
00
(x) = 24 x
f
000
(x) = 24!f
(iv)
(x) = 0 :
so for anyn
th
degree polynomial
f(x) =anx
n
+an1x
n1
+:::::::+a1x+a0
we havef
(n+1)
(x) = 0 :Page 54 Consider another two examples
f(x) =e
x
f
0
(x) =e
x
!f
00
(x) =e
x
.. .
f
(n)
(x) =e
x
=f(x):
g(x) = logx!g
0
(x) = 1 =x
g
00
(x) =1=x
2
!g
000
(x) = 2 =x
3
:
Warning
Not all functions are di¤erentiable everywhere. For example, 1=xhas the
derivative1=x
2
but only forx6= 0:
Easy way is to "look for a hole", e.g. f(x) =
1
x2
does
not exist atx= 2:
x= 2is called asingularityfor this function. We sayf(x)issingularat the
pointx= 2:Page 55

1.4.6 Leibniz Rule
This is the …rst of two rules due to Leibniz. Here it is used to obtain the n
th
derivative of a producty=uv, by starting with the product rule.
dy
dx
=u
dv
dx
+v
du
dx
uDv+vD
u
then
y
00
=uD
2
v+2DuDv+vD
2
u
y
000
=uD
3
v+3DuD
2
v+3D
2
uDv+vD
3
u
and so on. This suggests (can be proved by induction)
D
n
(uv) =uD
n
v+

n
1

DuD
n1
v+

n
2

D
2
uD
n2
v+:::+

n
r

D
r
uD
nr
v+:::+vD
n
u
where

n
r

=
n!
r!(nr)!
:Page 56 Example:Find then
th
derivative ofy=x
3
e
ax
:
Putu=x
3
andv=e
ax
andD
n
(uv)(uv)
n
;so
(uv)
n
=uvn+

n
1

u1vn1+

n
2

u2vn2+

n
3

u3vn3+:::::::
u=x
3
;u1= 3x
2
;u2= 6x;u3= 6;u4= 0
v=e
ax
;v1=ae
ax
;v2=a
2
e
ax
;:::::::: ;vn=a
n
e
ax
thereforeD
n

x
3
e
ax

=
x
3
a
n
e
ax
+

n
1

3x
2
a
n1
e
ax
+

n
2

6xa
n2
e
ax
+

n
3

6a
n3
e
ax
=e
ax

x
3
a
n
+n3x
2
a
n1
+n(n1)a
n2
3x+n(n1)(n2)a
n3
Page 57 1.4.7 Further Limits
This will be an application of di¤erentiation. Consider the limiting case
lim
x!a
f(x)
g(x)

0
0
or
1
1
This is called
anindeterminate form.ThenL’Hospitals rulestates
lim
x!a
f(x)
g(x)
= lim
x
!a
f
0
(x)
g
0
(x)
=:::::::= l
im
x!a
f
(r)
(x)
g
(r)
(x)
forrsuch th
at we have the indeterminate form0=0:If forr+1we have
lim
x!a
f
(r+1)
(x)
g
(r+1)
(x)
!A
whereAis
not of the form0=0then
lim
x!a
f(x)
g(x)
lim
x!
a
f
(r+1)
(x)
g
(r+1)
(x
)
:Page 58 Note:Very important to verify quotient has this indeterminate form before
using L’Hospitals rule. Else we end up with an incorrect solution.
Examples:
1.
lim
x!0
cosx+2x1
3x

0
0
So di¤
erentiate both numerator and denominator!
lim
x!0
d
dx
(cosx+2x1)
d
dx
(3x)
= lim
x
!0
sinx+2
3
6=
0
0
!
2
3
2.lim
x!0
e
x
+e

x
2
1cos2 x
;qu
otient has form0=0:By L’Hospital’s rule we have
lim
x!0
e
x
e
x
2sin
2x
;which has indeterminate form0=0again for 2nd time, soPage 59

we app
ly L’Hospital’s rule again
lim
x!0
e
x
+e
x
4cos
2x
=
1
2
:
3.lim
x!1
x
2
lnx

1
1
)use L’Hosp
ital , solim
x!1
2x
1=x
! 1
4.lim
x
!1
e
3x
lnx

1
1
)lim
x!
1
3xe
3x
! 1
5.lim
x!1
x
2
e
3x
0:1;so we convert to form1=1by writinglim
x!1
x
2
e
3x
;
and now
use L’Hospital (di¤erentiate twice), which gives lim
x!1
2
9e
3x
!0Page 60 6.lim
x!0
sinx
x
lim
x!
0
cosx1
What is example6:saying?
Whenxis very close to0thensinxx:That issinxcan be approximated
with the functionxfor small values.Page 61 1.5 Taylor Series
Many functions are so complicated that it is not easy to see what they look
like. If we only want to know what a function looks like locally, we can
approximate it by simpler functions: polynomials. The crudest approximation
is by a constant: iff(x)is continuous atx0;
f(x)tf(x0)
forxnearx0:
Before we consider this in a more formal manner we start by looking at a simple
motivating example:
Considerf(x) =e
x
:Page 62 Supposewe wish to approximate this function for very small values of x(i.e.
x!0). We know atx= 0;
df
dx
= 1:So th
is is the gradient atx= 0:We
can …nd the equation of the line that passes through a point (x0;y0)using
yy0=m(xx0):
Herem=
df
dx
= 1; x0=
0; y0= 1;soy= 1 +x;is a polynomial. What
information have we ascertained from this?
Ifx!0then the point(x;1+x)on the tangent is close to the point
(x;e
x
)on the graphf(x)and hencePage 63

e
x
1+x­5
0
5
10
15
20
25
­4 ­3 ­2 ­1 0 1 2 3 4 Page 64 Supposenow that we are not that close to0:We look for a second degree
polynomial (i.e. quadratic)
g(x) =ax
2
+bx+c!g
0
= 2ax+b!g
00
= 2a
If we want this parabolag(x)to have
(i)sameyintercept asf:
g(0) =f(0) =)c= 1
(ii)same tangent asf
g
0
(0) =f
0
(0) =)b= 1
(iii)same curvature asf
g
00
(0) =f
00
(0) =)2a= 1Page 65 This gives
e
x
g(x
) =
1
2
x
2
+x+10
5
10
15
20
25
­4 ­3 ­2 ­1 0 1 2 3 4 Page 66 Moving further away we would look at a third order polynomial h(x)which
gives
e
x
h(x) =
1
3!
x
3
+
1
2!
x
2
+x+1­5
0
5
10
15
20
25
­4 ­3 ­2 ­1 0 1 2 3 4
and so on
.Page 67

Better is to approximate by the tangent at x0:This makes the approximation
andits derivative agree with the function:
f(x)tf(x0)+(xx0)f
0
(x0):
Better still is by the best …t parabola (quadratic), which makes the …rst two
derivatives agree:
f(x)tf(x0)+(xx0)f
0
(x0)+
1
2
(xx0)
2
f
00
(x0):
This pro
cess can be continued inde…nitely as long as fcan be di¤erentiated
often enough.
Then
th
term is
1
n!
f
(n)
(x0)(xx0)
n
;Page 68 wheref
(n)
means then
th
derivative offandn! =n:(n1):::2:1is
the factorial.
x0= 0is the special case, called Maclaurin Series.
Examples:
Expanding about the originx0= 0;
e
x
= 1+x+
x
2
2!
+
x
3
3!
+:::+
x
n
n!
Near0;the
logarithm looks like
log(1+x) =x
x
2
2
+
x
3
3

x
4
4
+:::+(1)
n
x
n
+1
(n+1)!Page 69 How canwe obtain this? Putf(x) = log(1+x);thenf(0) = 0
f
0
(x) =
1
1+x
f
0
(0) = 1
f
00
(
x) =
1
(1+x)
2
f
00
(0) =1
f
000
(
x) =
2
(1+x)
3
f
000
(0) = 2
f
(4)
(x
) =
6
(1+x)
4
f
(4)
(0) =6
Thu
s
f(x) =
1
X
n=0
f
(n)
(0)
n!
x
n
= 0+
1
1!
x+
(1)
2!
x
2
+
1
3!
:2x
3
+
(6)
4!
x
4
+:::::
=x
x
2
2
+
x
3
3

x
4
4
+:::Page 70 Taylor’s theorem, in general, is this : If f(x)and its …rstnderivatives exist
(and are continuous) on some interval containing the point x0then
f(x) =f(x0)+
1
1!
f
0
(x0)(xx0)
+
1
2!
f
00
(x0)(xx0)
2
+:::
+
1
(n1)!
f
(n1)
(x0)(xx0)
n
1
+Rn(x)
whereRn(x) = (1 =n!)f
(n)
()(xx0)
n
; is some (usually unknown)
number betweenx0andxandf
(n)
is then
th
derivative off.
We can expand about any pointx=a;and shift this point to the origin, i.e.
xx00and we express in powers of(xx0)
n
:Page 71

So forf(x) = sinxaboutx==4we will have
f(x) =
1
X
n=0
f
(n)


4

n!
(x=4)
n
wheref
(
n)


4

is then
th
deriva
tive ofsinxatx0==4:
As another example suppose we wish to expand log(1+x)aboutx0= 2;i.e.
x2 = 0then
f(x) =
1
X
n=0
1
n!
f
(n)
(2)( x2)
n
wheref
(
n)
(2)is then
th
derivative oflog(1+x)evaluated at the point
x= 2:
Note thatlog(1+x)does not exist forx=1:Page 72 1.5.1 TheBinomial Expansion
TheBinomial Theoremis the Taylor expansion of(1+x)
n
wherenis a
positive integer. It reads:
(1+x)
n
= 1+nx+
n(n1)
2!
x
2
+
n(n1)(n2)
3!
x
3
+::: :
We
can extend this to expressions of the form
(1+ax)
n
= 1+n(ax)+
n(n1)
2!
(ax)
2
+
n(n1)(n2)
3!
(ax)
3
+::: :
(p+ax)
n
=
"
p

1
+
a
p
x
!#
n
=p
n
"
1+n

a
p
x
!
+::::::::
#Page 73 The binomial coe¢ cients are found in Pascal’s triangle:
1 (n=0) (1+x)
0
1 1 ( n=1) (1+x)
1
1 2 1(n=2) (1+x)
2
1 3 3 1 ( n=3) (1+x)
3
1 4 6 4 1 (n=4) (1+x)
4
1 5 10 10 5 1(n=5) (1+x)
5
and so on ...Page 74 As an example consider:
(1+x)
3
n= 3)1 3 3 1)(1+x)
3
= 1+3x+3x
2
+x
3
(1+x)
5
n= 5!(1+x)
5
= 1+5x+10x
2
+10x
3
+5x
4
+x
5
:
Ifnis not an integer the theorem still holds but the coe¢ cients are no longer
integers. For example,
(1+x)
1
= 1x+x
2
x
3
+::: :
and
(1+x)
1=2
= 1+
1
2
x+

1
2


1
2

x
2 2!
::: :Page 75

(a+b)
k
=a
k
h
1+
b
a
i
k
=
a
k

1+kb
a
1
+
k(k1)
2!
b
2
a
2
+
k(k1)(k2)
3!
b
3
a
3
+::

=a
k
+kba
k
1
+
k(k1)
2
b
2
a
k2
+
k(k1)(k2)
3!
b
3
a
k3
+::
Example:We
looked atlim
x!0
sinx
x
!1(by L
’Hospital). We can also do this
using Taylor series:
lim
x!0
sinx
x
lim
x!
0
xx
3
=3!+x
5
=5!+::::
x
lim
x!
0

1x
2
=3!+x
4
=5!+::::

!1:Page 76 1.6 Integration
1.6.
1 The Inde…nite Integral
The inde…nite integral off(x);
Z
f(x)dx;
is any functionF(x)whose derivative equalsf(x). Thus if
F(x) =
Z
f(x)dxthen
dF
dx
(x) =f(x
):
Since the derivative of a constant, C;is zero(dC=dx= 0);the inde…nite
integral off(x)is only determined up to an arbitrary constant.Page 77 If
dF
dx
=f(x)then
d
dx
(F(x)+C) =
dF
dx
(x)+
dC
dx
=
dF
dx
(x) =f(x
):
Thuswemustalwaysincludeanarbitraryconstantofintegrationinaninde…nite
integral.
Simple examples are
Z
x
n
dx=
1
n+1
x
n+1
+C(
n6=1);
Z
dx
x
= log
(x)+C;
Z
e
ax
dx=
1
a
e
ax
+C(a6= 0);
Z
cosaxdx=
1
a
sinax+C;
Z
si
naxdx=
1
a
cosax+CPage 78 Linearity
Integration is linear:
Z
(f(x)+g(x))dx=
Z
f(x)dx+
Z
g(x)dx
for constantsAandB:Thus, for example
Z

Ax
2
+Bx
3

dx=A
Z
x
2
dx+B
Z
x
3
dx
=
A
3
x
3
+
B
4
x
4
+C;
Z
(3e
x
+
2=x)dx= 3
Z
e
x
dx+2
Z
dx
x
= 3e
x
+2
log( x)+C;
and so forth.Page 79

1.6.2 TheDe…nite Integral
Thede…nite integral,
Z
b
a
f(x)dx;
is the area under the graph off(x);betweenx=aandx=b;with
positive values off(x)giving positive area and negative values of f(x)
contributing negative area. It can be computed if the inde…nite integral is
known. For example
Z
3
1
x
3
dx=

1
4
x
4

3
1
=
1
4

3
4
1
4

= 20
;
Z
1
1
e
x
dx=[e
x
]
1
1
=e1=e:
Note that the de…nite integral is also linear in the sense that
Z
b
a
(Af(x)+Bg(x))dx=A
Z
b
a
f(x)dx+B
Z
b
a
g(x)dx:Page 80 Note also thata de…nite integral
Z
b
a
f(x)dx
does not depend on the variable of integration, xin the above, it only depends
on the functionfand the limits of integration ( aandbin this case); the
area under a curve does not depend on what we choose to call the horizontal
axis.
So
Z
b
a
f(x)dx=
Z
b
a
f(y)dy=
Z
b
a
f(z)dz:
We should never confuse the variable of integration with the limits of integra-
tion; a de…nite integral of the form
Z
x
a
f(x)dx;
use dummy variable.Page 81 Ifa < b< cthen
Z
c
a
f(x)dx=
Z
b
a
f(x)dx+
Z
c
b
f(x)dx:
Also
Z
a
c
f(x)dx=
Z
c
a
f(x)dx:Page 82 1.6.3 Integration by Substitution
This involves the change of variable and used to evaluate integrals of the form
Z
g(f(x))f
0
(x)dx;
and can be evaluated by writingz=f(x)so thatdz=dx=f
0
(x)or
dz=f
0
(x)dx:Then the integral becomes
Z
g(z)dz:
Examples:
Z
x
1+x
2
dx:z= 1
+x
2
!dz= 2xdx
Z
x
1+x
2
dx=
1
2
log(
z)+C=
1
2
log

1+x
2

+C
=
log
q
1+x
2

+CPage 83

R
xe
x
2
dx:z=x
2
!dz=2
xdx
Z
xe
x
2
dx=
1
2
Z
e
z
dz
=
1
2
e
z
+C=
1
2
e
x
2
+C
Z
1
x
log(
x)dx=
Z
z dz=
1
2
z
2
+C
=
1
2
(log(
x))
2
+C
withz= log( x)sodz=dx=xand
Z
e
x+e
x
dx=
Z
e
x
e
e
x
dx=
Z
e
z
dz
=e
z
+C=e
e
x
+C
withz=e
x
sodz=e
x
dx:Page 84 Themethodcanbeusedforde…niteintegralstoo. Inthiscaseitisusuallymore
convenient to change the limits of integration at the same time as changing
the variable; this is not strictly necessary, but it can save a lot of time.
For example, consider
Z
2
1
e
x
2
2xdx:
Writez=x
2
;sodz= 2xdx:Now consider the limits of integration; when
x= 2; z=x
2
= 4and whenx= 1; z=x
2
= 1:Thus
Z
x=2
x=1
e
x
2
2xdx=
Z
z=4
z=1
e
z
dz
= [e
z
]
z=4
z=1
=e
4
e
1
:Page 85 Further examples: consider
Z
x=2
x=1
2xdx
1+x
2
:
In this
case we could writez= 1 +x
2
;sodz= 2xdxandx= 1
corresponds toz= 2,x= 2corresponds toz= 5;and
Z
x=2
x=1
2x
1+x
2
dx=
Z
z=5
z=2
dz
z
= [ln(
z)]
z=5
z=2
= log(5)ln(2)
= ln(5=2)
We can solve the same problem without change of limit, i.e.
n
ln


1+x
2



o
x=2
x=1
!ln5ln2 = ln5 =2:Page 86 Or consider
Z
x=e
x=1
2
log( x)
x
dx
in which cas
e we should choosez= log(x)sodz=dx=xandx= 1
givesz= 0; x=egivesz= 1and so
Z
x=e
x=1
2
log( x)
x
dx=
Z
z=1
z=0
2zdz=
h
z
2
i
z=1
z=0
= 1:Page 87

When wemake a substitution likez=f(x)we are implicitly assuming that
dz=dx=f
0
(x)is neither in…nite nor zero. It is important to remember this
implicit assumption.
Consider the integral
Z
1
1
x
2
dx=
1
3
h
x
3
i
x=1
x=1
=
1
3
(1(1)) =
2
3
:
Now p
utz=x
2
sodz= 2xdxordz= 2
p
z dxand whenx=
1;
z=x
2
= 1and whenx= 1; z=x
2
= 1;so
Z
x=1
x=1
x
2
dx=
1
2
Z
z=1
z=1
dz
p
z
= 0
as th
e area under the curve1=
p
zbet
weenz= 1andz= 1is obviously
zero.Page 88 It is clear thatx
2
>0except atx= 0and therefore that
Z
1
1
x
2
dx=
2
3
must be
the correct answer. The substitution z=x
2
gave
Z
x=1
x=1
x
2
dx=
1
2
Z
z=1
z=1
dz
p
z
= 0
which i
s obviously wrong. So why did the substitution fail?
It failed becausef
0
(x) =dz=dx= 2xchanged signs betweenx=1
andx= 1:In particular,dz=dx= 0atx= 0;the functionz=x
2
is
not invertible for1x1:
Moral: when making a substitution make sure that dz=dx6= 0:Page 89 1.6.4 Integration by Parts
This is based on the product rule. In usual notation, if y=u(x)v(x)then
dy
dx
=
du
dx
v+u
dv
dx
so that
du
dx
v=
dy
dx
u
dv
dx
and he
nce integrating
Z
du
dx
vdx=
Z
dy
dx
dx
Z
u
dv
dx
dx=y(x)
Z
u
dv
dx
dx+C
or
Z
du
dx
vdx=u(x
)v(x)
Z
u(x)
dv
dx
dx+C
i.e.
Z
u
0
vdx=uv
Z
uv
0
dx+CPage 90 This is useful, for instance, if v(x)is a polynomial andu(x)is an exponential.
How can we use this formula? Consider the example
Z
xe
x
dx
Put
v=x u
0
=e
x
v
0
= 1u=e
x
hence
Z
xe
x
dx=uv
Z
u
dv
dx
dx
=xe
x

Z
e
x
:1dx=e
x
(x1)+C
The f
ormula we are using is the same as
Z
vdu=uv
Z
udv+CPage 91

Now using the same example
R
xe
x
dx
v=x du=e
x
dx
dv=dx u=e
x
and
Z
vdu=uv
Z
udv=xe
x

Z
e
x
dx
=e
x
(x1)+C
Another example
Z
x
2
|{z}
v(x)
e
2x
|{z}
u
0
dx=
1
2
x
2
e
2x
|
{z
}
uv

Z
xe
2x
|
{z
}
uv
0
dx+C
and using
integration by parts again
Z
xe
2x
dx=
1
2
xe
2x

1
2
Z
e
2x
dx=
1
4
(2x1)e
2x
+D
so
Z
x
2
e
2x
dx=
1
4

2x
2
2x+1

e
2x
+E
:Page 92 1.6.5 Reduction Formula
Consider the de…nite integral problem
Z
1
0
e
t
t
n
dt=In
putv=t
n
andu
0
=e
t
!v
0
=nt
n1
andu=e
t

h
e
t
t
n
i
1
0
+n
Z
1
0
e
t
t
n1
dt
=
h
e
t
t
n
i
1
0
+nIn1
In=nIn1
=n(n1)In2=:::::::=n!I0
whereI0=
Z
1
0
e
t
dt= 1
)In=n!;n2Z
+
Inis called theGamma Function.Page 93 1.6.6 Other Results
Z
f
0
(x)
f(x)
dx= lnj
f(x)j+C
e.g.
Z
3
1+3
x
dx= lnj1+3xj+C
Z
1
2+7
x
dx=
1
7
Z
7
2+7
x
dx=
1
7
lnj2+7
xj+C
This allows us to state a standard result
Z
1
a+bx
dx=
1
b
lnja+bxj+C
How can
we re-do the earlier example
Z
x
1+x
2
dx;
which w
as initially treated by substitution?Page 94 Partial FractionsConsider a fraction where both numerator and denomina-
tor are polynomial functions, i.e.
h(x) =
f(x)
g(x)

NP
n=0
anx
n
MP
n=0
bnx
n
where degf(x
)<degg(x), i.e.N < M:Thenh(x)is called apartial
fraction.Suppose
c
(x+a)(x+b)

A
(x+a)
+
B
(x+b)
then writing
c=A(
x+b)+B(x+a)
and solving forAandBallows us to obtain partial fractions.Page 95

The simplest way to achieve this is by setting x=bto obtain the value of
B;then puttingx=ayieldsA:
Example:
1
(x2)(x+
3)
:Now write
1
(x2)(x+3
)

A
x2
+
B
x+3
which b
ecomes
1 =A(x+3)+B(x2)
Settingx=3!B=1=5;x= 2!A= 1=5:So
1
(x2)(x+
3)

1
5(x2)

1
5(x+
3)
:Page 96 There is another quicker and simpler method to obtain partial fractions, called
the " cover-up" rule. As an example consider
x
(x2)(x+3
)

A
x2
+
B
x+3
:
Firstly,
look at the term
A
x2
:The denomin
ator vanishes forx= 2;so
take the expression on the LHS and "cover-up"(x2):Now evaluate the
remaining expression, i.e.
x
(x+3)
fo
rx= 2;which gives2=5:SoA= 2=5:
Now repeat this, by noting that
B
x+3
doe
s not exist atx=3:So cover
up(x+3)on the LHS and evaluate
x
(x2)
forx=3
;which gives
B= 3=5:Page 97 Any rational expression
f(x)
g(x)
(with degree off(
x)<degree ofg(x)) such
as above can be written
f(x)
g(x)
F1+F2+::::::::+F
k
where ea
chFihas form
A
(px+q)
m
or
Cx+D

ax
2
+bx+c

n
where
A
(px+q)
m
is written as
A1
(px+q)
+
A2
(px+q)
2
+::::::+
A
(px+q)
mPage 98 and
Cx+D

ax
2
+bx+c

nbe
comes
C1x+D1
ax
2
+bx+c
+::::::+
Cnx+Dn

ax
2
+bx+c

nPage 99

Examples:
3x2
(4x3)(2 x+
5)
3

A
4x3
+
B
2x+5
+
C
(2x+5)
2
+
D
(2x+5)
3
4
x
2
+13x9
x(x+3)
(x1)

A
x
+
B
x+3
+
C
(x1)
3x
3
18x
2
+29
x4
(x+1)
(x2)
3

A
x+1
+
B
x2
+
C
(x2)
2
+
D
(x2)
3
5x
2
x+2

x
2
+2x+4

2
(
x1)

Ax+B
x
2
+2x+4
+
C
x+D

x
2
+2x+4

2
+
E
x1
x
2
x21

x
2
+4

2
(2x1)

Ax+B
x
2
+4
+
Cx+D

x
2
+4

2
+
E
2x1Page 100 1.7 Complex Numbers
A complex numberzis de…ned byz=x+iywherex; y2Rand
i=
p
1:It follows
thati
2
=1:
We call thexaxis the real line and the yaxis the imaginary line.
zmay also be expressed in polar co-ordinate form as
z=r(cos+isin)
whereris always positive andcounter-clockwise fromOx:
Sox=rcos; y=rsinPage 101 x
y
r
q
z  = x +i y modulu
sofzdenotedjzjis de…nedjzj=r=
+
q
x
2
+y
2
;argument=
arctan
y
x
The set of
all complex numbers is denotedC;and for any complex numberz
we writez2C:We can think ofRC:
We de…ne thecomplex conjugateofzby
_
zwhere
_
z=xiy:
zis the re‡
ection ofzin the real line. So for example if z= 12i;then
z= 1+
2i:Page 102 1.7.1 Arithmetic
Given any two complex numbersz1=a+ib; z2=c+idthe following
de…nitions hold:
Addition & Subtractionz1z2= (ac)+i(bd)
Multiplicationz1z2=(acbd)+i(ad+bc)
Division
z1
z2
=
a+ib
c+id
=
(ac+bd)+i(b
cad)
c
2
+d
2
=
(ac+bd)
c
2
+d
2
+i
(bcad)
c
2
+d
2
here w
e have simply multiplied by
cid
cid
and note
that(c+id)(cid) =
c
2
+d
2Page 103

Examples z1= 1+
2i; z2= 3i
z1+z2= (1+3)+i(21) = 4+i;z1z2= (13)i(2(1)) =
2+3i
z1z2= (1 :32:1)+i(1:1+2:3) = 5+5 i
z1
z2
=
1+2
i
3i
:
3+i
3+i
=
1
10
+i
7
10Page 104 1.7.2 Complex Conjugate Identities
1.
_
z

=z
2.
(z1+z2) =
z1+
_
z2
3.
(z1z2) =
_
z1
_
z2
4.z+
_
z= 2x=
2Rez)Rez=
z+
_
z
2
5.z
_
z= 2iy= 2iImz)Imz=
z
_
z
2i
6.z:
_
z=(x+iy)
(xiy) =jzj
2Page 105 7.j
zj
2
=
z
(
z) =
zz=jzj
2
)
j
zj=jzj
8.
z1
z2
=
z1
z2
:
z2 z2
=
z1
z2
j
z2j
2
9.jz1z2j
2
=jz1j
2
jz2j
2Page 106 1.7.3 Polar Form
We return to the polar form representation of complex numbers. We now
introduce a new notation. Ifz2C;then
z=r(cos+isin) =re
i
:
Hence
e
i
= cos+isin;
which is a special relationship called Euler’s Identity.Knowingsinis an odd
function givese
i
= cosisin:Referring to the earlier polar coordinate
…gure, we have:
jzj=r;argz=
If
z1=r1e
i1andz2=r2e
i2Page 107

then
z1z2=r1r2e
i(1+2)
) jz1z2j=r1r2=j
z1jjz2j
arg( z1z2) =1+2= arg( z1)+arg( z2):
Ifz26= 0then
z1
z2
=
r1e
i1
r2e
i2
=
r1
r2
e
i(12)
and he
nce





z1
z2



=
jz1j
jz2j
=
r1
r2
arg

z1
z2
!
=12= arg(
z1)arg( z2)Page 108 Euler’sFormulaLetbe any a1ngle, then
exp(i) = cos+isin:
We can prove this by considering the Taylor series for exp(x);sinx;cosx
e
x
= 1+x+
x
2
2!
+
x
3
3!
+::::::::::::: +
x
n
n!
(a)
sinx=x
x
3
3!
+
x
5
5!
::::::::::::: +(
1)
n
x
2n+1
(2n+1)!
(b
)
cosx= 1
x
2
2!
+
x
4
4!
::::::::::::: +(
1)
n
x
2n
(2n)!
(c)Page 109 Replacingxby the purely imaginary quantityiin(a), we obtain
e
i
= 1+i+
(i)
2
2!
+
(i)
3
3!
+::::::::::::: +
(i)
n
n!
=

1

2
2!
+

4
4!


6
6!
+::::::::::::
!
+
i



3
3!
+

5
5!
:::::::::
!
= cos+isi
n
Note: When=thenexpi=1and==2givesexp(i=2) =i:Page 110 We can apply Euler’s formula to integral problems. Consider the problem
Z
e
x
sinxdx
which was simpli…ed using the integration by parts method. We know Ree
i
=
cos;so the above becomes
Z
e
x
Ime
ix
dx=
Z
Ime
(i+1)x
dx= Im
1
1+i
e
(i+1)x
=e
x
Im
1
1+i

e
ix

=e
x
Im
1i
(1+i)(1i)

e
ix

=
1
2
e
x
Im(1i)

e
ix

=
1
2
e
x
Im

e
ix
ie
ix

=
1
2
e
x
Im(cosx+isi
nxicosx+sinx)
=
1
2
e
x
(sinxcosx)
Exercise
: Apply this method to solving
Z
e
x
cosxdx.Page 111

1.8 Functions of Several Variables: Multivariate Calculus
A function can depend on more than one variable. For example, the value of
an option depends on the underlying asset price S(for ’spot’or ’share’) and
timet:We can write its value asV(S;t):
The value also depends on other parameters such as the exercise price E;
interest raterand so on. Although we could writeV(S;t;E;r;::: );it is
usually clearer to leave these other variables out.
Depending on the application, the independent variables may be xandtfor
space and time, or two space variables xandy;orSandtfor price and
time, and so on.Page 112 Consider afunctionz=f(x;y);which can be thought of as a surface in
x; y; zspace. We can think ofxandyas positions on a two dimensional
grid (or as spacial variables) and zas the height of a surface above the (x;y)
grid.
How do we di¤erentiate a functionf(x;y)oftwovariables? What if there
are more independent variables?
Thepartial derivativeoff(x;y)with respect toxis written
@f
@x
(note@and
notd). It is thexderivative offwithyheld …xed:
@f
@x
= li
m
x!0
f(x+x;y)f(x;y)
x
:Page 113 The otherpartial derivative,@f=@y;is de…ned similarly but nowxis held
…xed:
@f
@y
= l
im
y!0
f(x;y+y)f(x;y)
y
:
@f
@x
and
@f
@y
ar
e sometimes written asfxandfy:
Examples
If
f(x;y) =x+y
2
+xe
y
2
then
@f
@x
=fx= 1
+0+1e
y
2Page 114 @f @y
=fy= 0
+2y+x(2y)e
y
2
:
The convention is, treat the other variable like a constant.Page 115

HigherDerivatives
Like ordinary derivatives, these are de…ned recursively:
@
2
f
@x
2
=fxx=
@
@x

@f
@x

;
@
2
f
@y
2
=fyy=
@
@y

@f
@y
!
:
and
@
2
f
@x@y
=fxy=
@
@y

@f
@x

;
@
2
f
@y@
x
=fyx=
@
@x

@f
@y
!
:Page 116 Iffis well-behaved, the ’mixed’partial derivatives are equal:
fxy=fyx:
i.e. the second order derivatives exist and are continuous.
Example:
Withf(x;y) =x+y
2
+xe
y
2
as above,
fx= 1+e
y
2
so
fxx= 0;fxy=2ye
y
2Page 117 Also
fy= 2y2
xye
y
2
so
fyx=2ye
y
2
;fyy= 22xe
y
2
+4xy
2
e
y
2
Note thatfxy=fyx:Page 118 1.8.1 TheChain RuleI
Suppose thatx=x(s)andy=y(s)andF(s) =f(x(s);y(s)):
Then
dF
ds
(s) =
@f
@x
(x(
s);y(s))
dx
ds
(s)+
@f
@y
(x(
s);y(s))
dy
ds
(s)
Thus i
ff(x:y) =x
2
+y
2
andx(s) = cos( s); y(s) = sin( s)we …nd
thatF(s) =f(x(s);y(s))has derivative
dF
ds
=sin(s
)2cos( s)+cos( s)2sin( s) = 0
which is what it should be, since F(s) = cos
2
(s)+sin
2
(s) = 1 ;
i.e. a constant.Page 119

Example:Calculate
dz
dt
att==2where
z=
exp

xy
2

x=tcost; y=tsint:
Chain rule gives
dz
dt
=
@z
@x
dx dt
+
@z
@y
dy dt
=y
2
exp

xy
2

(tsint+c
ost)+
2xyexp

xy
2

(sint+tcost):
Att==2x= 0; y==2)
dz
dt




t==2
=

3
8
:Page 120 1.8.2 TheChain Rule II
Supposethatx=x(u;v); y=y(u;v)andthatF(u;v) =f(x(u;v); y(u;v)):
Then
@F
@u
=
@x
@u
@f @x
+
@y
@u
@f @y
and
@F
@v
=
@x
@v
@f @x
+
@y
@v
@f @y
:
This is
sometimes written as
@
@u
=
@x
@u
@ @x
+
@y
@u
@ @y
;
@
@v
=
@x
@v
@ @x
+
@y
@v
@ @y
:
so is
essentially a di¤erential operator.Page 121 Example:
T=x
3
xy+y
3
wherex=rcos;
y=rsin
@T
@r
=
@T
@x
@x @r
+
@T
@y
@y @r
= c
os

3x
2
y

+sin

3y
2
x

= cos

3r
2
cos
2
rsin

+
sin

3r
2
sin
2
rcos

= 3r
2

cos
3
+sin
3


2rcossin
= 3r
2

cos
3
+sin
3


rsin2:Page 122 @T @
=
@T
@x
@x @
+
@T
@y
@y @
=rsi
n

3x
2
y

+rcos

3y
2
x

=rsin

3r
2
cos
2
rsin

+
rcos

3r
2
sin
2
rcos

= 3r
3
cossin(sincos)+
r
2

sin
2
cos
2


:
=r
2
(sincos)(3rcossin+sin+cos)Page 123

1.8.3 Taylor for two Variables
Assuming that a functionf(x;t)is di¤erentiable enough, nearx=x0;
t=t0;
f(x;t) =f(x0;t0)+(xx0)fx(x0;t0)+
(tt0)ft(x0;t0)
+
1
2
2
6
4
(xx0)
2
fxx(x0;t0)
+2(
xx0)(tt0)fxt(x0;t0)
+(tt0)
2
ftt(x0;t0)
3
7
5+::::
That is,
f(x;t) =constant+linear+quadratic
+::::
The error in truncating this series after the second order terms tends to zero
faster than the included terms. This result is particularly important for Itô’s
lemma in Stochastic Calculus.Page 124 Supposea functionf=f(x;y)and bothx;ychange by a small amount, so
x!x+xandy!y+y;then we can examine the change infusing
a two dimensional form of Taylor
f(x+x;y+y) =f(x;y)+fxx+fyy+
1
2
fxxx
2
+
1
2
fyyy
2
+
fxy
xy+O

x
2
;y
2

:
By takingf(x;y)to the lhs, writing
df=f(x+x;y+y)f(x;y)
and considering only linear terms, i.e.
df=
@f
@x
x+
@
f
@y
y
w
e obtain a formula for thedi¤erentialortotal changeinf:Page 125 2 Introduction to Linear Algebra
2.1 Properties of Vectors
We consider realndimensional vectors belonging to the setR
n
. An
ntuple
v
= (v1;
v2; :::::::::; v n)2R
n
is a vector of dimensionn:The elementsvi(i= 1;::::;n )are called
components ofv
:
Any pairu; v
2R
n
are
equal i¤the corresponding componentsui’s andvi’s
are equalPage 126 Examples:
u
1= (1
;0); u
2=

1; e;
p
3;6

; u
3= (3
;4); u
4= (;ln
3;2;1)
1.u
1; u
32R
2
andu
2; u
42R
4
2.(x+y; xz
;2z1) = (3 ;2;5):For equality to hold correspond-
ing components are equal, so
x+y= 3
xz=2
2z1 = 5
9
>
=
>
;
)x= 1;y= 2;z= 3Page 127

2.1.1 Vector Arithmetic
Letu
; v
2R
n
:Thenvector
additionis de…ned as
u
+v
= (u1+v1; u2+v2;
:::::::::::; u n+vn)
Ifk2Ris any scalar then
ku
= (ku1;
ku2; :::::::::::; ku n)
Note:vector addition only holds if the dimensions of each are identical.
Examples:
u
=(3;1;2;0); v
=(5;5;1;2); w
=(0;5;3;1)
1.u
+v
= (3
+5;15;2+1;0+2) = (8 ;4;1;2)Page 128 2.2w
= (2
:0;2:(5);2:3;2:1) = (0 ;10;6;2)
3.u
+v
2w
=(8;4;1;2)(0;10;6;2) = (8
;6;7;0)
0
=(0;0; :::::;0)is thezero
vector .
Vectors can also be multiplied together using the dot product. Ifu
; v
2R
n
then the
dot product denoted byu
:v
is
u
:v
=u1v1+u2v2+:::::::::::: +unvn2R
which
is clearly a scalar quantity. The operation is commutative , i.e.
u
:v
=v
:u
If a p
air of vectors have a scalar product which is zero, they are said to be
orthogonal.
Geometrically this means that the two vectors are perpendicular to each other.Page 129 2.1.2 Concept of Length inR
n
Recall in 2-Du
=(x1;y1)x
y
x1
y1
q
u
The len
gth ormagnitudeofu
;writtenjujis given b
y Pythagoras
juj=
q
(x1)
2
+(y1)
2Page 130 and theanglethe vector makes with the horizontal is
= arctan
y1
x1
:
Any vecto
ru
can b
e expressed as
u
=jujbu
wherebu
is theunit
vectorbecausejbu
j= 1:
Given
any two vectorsu
; v
2R
2
;we c
an calculate the distance between them
jv
u
j=j(v1; v2)(u1;
u2)j
=
q
(v1u1)
2
+(v2u2)
2Page 131

x
y
u
v
uv In 3D (
orR
3
) a vectorv
=(x1; y1; z1)has
length/magnitude
jvj=
q
(x1)
2
+(y1)
2
+(z1)
2
:
To e
xtend this toR
n
;is similar.
Considerv
= (v1; v2;
:::::::::; v n)2R
n
:The length ofv
is called
thenormPage 132 and denotedkv
k;where
kv
k=
q
(v1)
2
+(v2)
2
+::::::::+
(vn)
2
Ifu
; v
2R
n
then th
e distance betweenu
andv
is can
be obtained in a similar
fashion
kv
u
k=
q
(v1u1)
2
+(v2u2)
2
+::::::::+
(vnun)
2
Wementionedearlierthattwovectorsu
andv
intw
odimensionareorthogonal
ifu
:v
= 0:
The idea
comes from the de…nition
u
:v
=juj:jvjcos:Page 133 Re-arranginggivestheanglebetweenthetwovectors. Notewhen==2,u
:v
=
0:
Ifu
; v
2R
n
we wri
te
u
:v
=jju
jj:jjvjjcos
Examples:Consider the following vectors
u
= (2 ;
1;0;3); v
= (1
;1;1;3);
w
=(1;3;2;2)
ku
k=
q
(2)
2
+(1)
2
+
(0)
2
+(3)
2
=
p
14Page 134 Distancebetweenv
&w
=kw
v
k=
q
(11)
2
+(3(
1))
2
+(2(1))
2
+(23)
2
= 3
p
2
The an
gle betweenu
&w
can b
e obtained from
cos=
u
:v
jju
jj jj vjj
:
Hence
cos=
(2;1;0;3):(1;1;1;3)
2
p
3
p
14
=
s
3
14
!
= cos
1


q
3
14
Page 135

2.2 Matrices
Amatrixis a rectangular arrayA=

ai j

fori= 1;:::;m;j= 1;:::;n
written
A=
0
B
B
B
B
B
B
B
B
@
a11a12:: :: :: a1n
a21a22:: :: :: a2n
::
::
::: :: :: :::
am1am2:: :: :: amn
1
C
C
C
C
C
C
C
C
A
and is an(mn)matrix, i.e.mrows andncolumns.
Ifm=nthe matrix is calledsquare . The productmngives the number of
elements in the matrix.Page 136 2.2.1 MatrixArithmetic
LetA; B2
m
R
n
A+B=
0
B
B
B
B
B
B
B
B
@
a11a12:::: a1n
a21a22:::: a2n
: : : :
: : : :
::: :::::
am1am2:::: amn
1
C
C
C
C
C
C
C
C
A
+
0
B
B
B
B
B
B
B
B
@
b11b12:: :: b1n
b21b22:: :: b2n
: : : : :
: : : : :
::: :: :::
bm1bm2:: :: bmn
1
C
C
C
C
C
C
C
C
A
and the corresponding elements are added to give
0
B
B
B
B
B
B
B
B
@
a11+b11a12+b12:::: a1n+b1n
a21+b21a22++b22:::: a2n+b2n
:: : :
:: : :
::: :::::
am1+bm1am2+bm2:::: amn+bmn
1
C
C
C
C
C
C
C
C
A
=B+APage 137 Matrices can onlyadded if they are of the same form.
Examples:
A=

11 2
0 3 4
!
; B=

4 03
12 3
!
;
C=
0
B
@
23 1
51 2
1 0 3
1
C
A; D=
0
B
@
1 0 0
0 1 0
0 0 1
1
C
A
A+B=

511
1 1 7
!
;C+D=
0
B
@
33 1
5 0 2
1 0 4
1
C
A
We cannot perform any other combination of addition asAandBare
(23)andCandDare(33):Page 138 2.2.2 MatrixMultiplication
To multiply two square matricesAandB;so thatC=AB;the elements of
Care found from the recipe
Cij=
N
X
k=1
A
ikB
kj:
That is, thei
th
row ofAis dotted with thej
th
column ofB:For example,

a b
c d
!
e f
g h
!
=

ae+bg af+bh
ce+dg cf+dh
!
:
Note that in generalAB6=BA:The general rule for multiplication is
ApnBnm!CpmPage 139

Example:

2
1 0
2 0 2
!
0
B
@
1 2
0 3
1 2
1
C
A
=

2:1+1:0+0:1 2:2+1:3+0:2
2:1+0:0+2:1 2:2+0:3+2:2
!
=

2 7
4 8
!Page 140 2.2.3 Transpose
Thetransposeof a matrix with entries Aijis the matrix with entries Aji;the
entries are ’re‡ected’across the leading diagonal, i.e. rows become columns.
The transpose ofAis writtenA
T
:IfA=A
T
thenAissymmetric . For
example, of the matrices
A=

1 2
3 4
!
;B=

1 3
2 4
!
;C=ix

1 2
2 1
!
;
we haveB=A
T
andC=C
T
:Note that for any matrixAandB
(i)(A+B)
T
=A
T
+B
T
(ii)

A
T

T
=APage 141 (iii)(kA)
T
=kA
T
; kis a scalar
(iv)(AB)
T
=B
T
A
T
Example:
A=
0
B
@
2 1
1 2
2 2
1
C
A!A
T
=

2 1 2
1 2 2
!
Askew-symmetricmatrix has the propertyaij=ajiwithaii= 0:For
example
0
B
@
0 34
3 0 1
41 0
1
C
APage 142 2.2.4 MatrixRepresentation of Linear Equations
We begin by considering a two-by-two set of equations for the unknowns x
andy:
ax+by=p
cx+dy=q
The solution is easily found. To get x;multiply the …rst equation byd;the
second byb;and subtract to eliminatey:
(adbc)x=dpbq:
Then …ndy:
(adbc)y=aqcp:
This works and gives a unique solution as long asadbc6= 0:
Ifadbc= 0;the situation is more complicated: there may be no solution
at all, or there may be many.Page 143

Examples:
Here is a system with a unique solution:
xy= 0
x+y= 2
The solution isx=y= 1:
Now try
xy= 0
2x2y= 2
Obviously there is no solution: from the …rst equation x=y;and putting this
into the second gives0 = 2 :Hereadbc= 1( 2)(1)2 = 0 :
Also note what is being said:
x=y
x= 1+y
)
Impossible.Page 144 Lastly try
xy= 1
2
x2y= 2:
The second equation is twice the …rst so gives no new information. Any x
andysatisfying the …rst equation satisfy the second. This system has many
solutions.
Note: If we have one equation for two unknowns the system is undetermined
and has many solutions. If we havethreeequations for two unknowns, it is
over-determined and in general has no solutions at all.
Then the general(22)system is written

a b
c d
!
x
y
!
=

p
q
!Page 145 or
Ax
=p
:
The eq
uations can be solved if the matrix Aisinvertible.This is the same
as saying that itsdeterminant





a b
c d





=adbc
is not zero.
These concepts generalise to systems of Nequations inNunknowns. Now
the matrixAisNNand the vectorsxandphaveNentries.Page 146 Here aretwo special forms forA:One is thennidentity matrix,
I =
0
B
B
B
B
B
B
@
1 0 0:::0
0 1 0:::
0 0 1:::
...
.. .
...0
0:::0 1
1
C
C
C
C
C
C
A
:
The other is thetridiagonal form.This is common in …nite di¤erence
numerical schemes.
A=
0
B
B
B
B
B
B
B
B
@
0 0

...
...
...
.. .
0
...
...
...
...
. . .
. . .
...
...
...
...0
. . .
...
...
...
0 0
1
C
C
C
C
C
C
C
C
A
There is amain diagonal , one above and below called thesuper diagonaland
sub-diagonalin turn.Page 147

To conclude:System of Linear Equations
In con sist ent ConsistentConsistent
NoSolution
UniqueSolution ManySolutions
nE>
nE=
 variablefree
nE< whereE=number
of equations andn=unknowns.
The theory and numerical analysis of linear systems accounts for quite a large
branch of mathematics.Page 148 2.3 Using Matrix Notation For Solving Linear Systems
The usual notation for systems of linear equations is that of matrices and
vectors. Consider the system
ax+by+cz=p()
dx+ey+fz=q
gx+hy+iz=r
for the unknown variablesx; y; z. We gather the unknownsx; yandzand
the givenp; qandrinto vectors:
0
B
@
x
y
z
1
C
A;
0
B
@
p
q
r
1
C
A
and put the coe¢ cients into a matrix
A=
0
B
@
a b c
d e f
g h i
1
C
A:Page 149 Ais called thecoe¢ cient matrixof the linear system()and the special
matrix formed by
0
B
@
a b c
d e f
g h i







p
q
r
1
C
A
is called theaugmented matrix.Page 150 Now considera generallinearsystem consistingof nequationsinnunknowns
which can be written in augmented form as
0
B
B
B
B
B
B
B
B
@
a11a12:: :: :: a1n
a21a22:: :: :: a2n
::
::
::: :: :: :::
an1an2 ann














b1
b2
:
:
:
bn
1
C
C
C
C
C
C
C
C
A
:
We can perform a series of row operations on this matrix and reduce it to a
simpli…ed matrix of the form
0
B
B
B
B
B
B
B
B
@
a11a12:: :: :: a1n
0a22:: :: :: a2n
0 0:
0 0 0 :
::: :: :: :::
0 0 0ann














b1
b2
:
:
:
bn
1
C
C
C
C
C
C
C
C
A
:
Such a matrix is said to be of echelon formif the number of zeros precedingPage 151

the …rstnon-zero entry of each row increases row by row.
A matrixAis said to berow equivalentto a matrixB;writtenAB
ifBcan be obtained fromAfrom a …nite sequence of operations called
elementary row operationsof the form:
[ER1]: Interchange thei
th
andj
th
rows:Ri$Rj
[ER2]:Replace thei
th
row by itself multiplied by a non-zero constant k:
Ri!kRi
[ER3]:Replacethei
th
rowbyitselfplusktimesthej
th
row:Ri!Ri+kRj
These have no a¤ect on the solution of the of the linear system which gives
the augmented matrix.Page 152 Examples:
Solve the following linear systems
1.
2x+y2z= 10
3x+2y+2z= 1
5x+4y+3z= 4
9
>
=
>
;
Ax
=b
with
A=
0
B
@
2 12
3
2 2
5 4 3
1
C
Aandb
=
0
B
@
10
1
4
1
C
A
The au
gmented matrix for this system is
0
B
@
2 12
3 2 2
5 4 3







10
1
4
1
C
A
R2!2R23R1

R3!2R35R1
0
B
@
2 12
0 1 10
0 3 16







10
28
42
1
C
APage 153 R3!R33R2

R1!R1R2
0
B
@
2 012
0
1 10
0 014







38
28
42
1
C
A
14z= 42!z=3
y+10z=28!y=28+30 = 2
x6z= 19!x= 1918 = 1
Therefore solution is unique with
x
=
0
B
@
1
2
3
1
C
APage 154 2.
x+2y3z=
6
2xy+4z= 2
4x+3y2z= 14
9
>
=
>
;
0
B
@
1 23
21 4
4 32







6
2
14
1
C
A
R2!R22R1

R3!R34R1
0
B
@
1 23
05 10
05 10







6
10
10
1
C
A
R3!R3R2

R2!0:5R2
0
B
@
1 23
0 12
0 0 0







6
2
0
1
C
A
Number of equations is less than number of unknowns.
y2z= 2soz=ais a free variable )y= 2(1+a)
x+2y3z= 6!x= 62y+3z= 2aPage 155

)x= 2a;y=
2(1+a);z=a
Therefore there are many solutions
x
=
0
B
@
2a
2(1+a)
a
1
C
APage 156 x+2y3z=
1
3xy+2z= 7
5x+3y4z= 2
9
>
=
>
;
0
B
@
1 23
31 2
5 34







1
7
2
1
C
A
R2!R23R1

R3!R35R1
0
B
@
1 23
07 11
07 11







1
10
7
1
C
A
R3!R3R2

0
B
@
1 23
07 11
0 0 0







1
10
3
1
C
A
The last line reads0 =3:Also middle iteration shows that the second
and third equations are inconsistent.
Hence no solution exists.Page 157 2.4 Matrix Inverse
Theinverseof a matrixA, writtenA
1
;satis…es
AA
1
=A
1
A=I:
It may not always exist, but if it does, the solution of the system
Ax=p
is
x=A
1
p:
The inverse of the matrix for the special case of a 22matrix

a b
c d
!
=
1
adbc

db
c a
!
pr
ovided thatadbc6= 0:Page 158 The inverse of anynnmatrixAis de…ned as
A
1
=
1
jAj
adjA
where adjA=
h
(
1)
i+j


M
ij



i
T
is the adjoint, i.e. we form the matrix of A’s
cofactors and transpose it.
M
ijisthesquaresub-matrixobtainedby"coveringthei
th
rowandj
th
column",
and its determinant is called the Minorof the elementa
ij. The termA
ij=
(1)
i+j


M
ij


is then called thecofactorofa
ij:
Consider the following example with
A=
0
B
@
1 1 0
1 2 1
0 1 3
1
C
A
So the determinant is given byjAj=Page 159

(1)
1+1
A11jM11j+(1)
1+2
A12jM12j+(1)
1+3
A13jM13j
= 1





2 1
1 3





1





1 1
0 3





+0





1 2
0 1





= (2311)(1310)+0 = 53
= 2
Here we have expanded about the 1
st
row - we can do this about any row. If
we expand about the 2
nd
row - we should still get jAj= 2:
We now calculate the adjoint:
(1)
1+1
M11= +





2 1
1 3





(1)
1+2
M12=





1 1
0 3





(1)
1+3
M13= +





1 2
0 1





(1)
2+1
M21=





1 0
1 3





(1)
2+2
M22= +





1 0
0 3





(1)
2+3
M23=





1 1
0 1





(1)
3+1
M31= +





1 0
2 1





(1)
3+2
M32=





1 0
1 1





(1)
3+3
M33= +





1 1
1 2




Page 160 adjA=
0
B
@
53 1
3
31
11 1
1
C
A
T
We can now write the inverse ofA(which is symmetric)
A
1
=
1
2
0
B
@
53 1
3
31
11 1
1
C
A
Elementary row operations (as mentioned above) can be used to simplify a
determinant, as increased numbers of zero entries present, requires less calcu-
lation. There are two important points, however. Suppose the value of the
determinant isjAj;then:
[ER1]:Ri$Rj) jAj ! j Aj
[ER2]:Ri!kRi) jAj !kjAjPage 161 2.5 Orthogonal Matrices
A matrixPisorthogonalif
PP
T
=P
T
P=I:
This means that the rows and columns ofPare orthogonal and have unit
length. It also means that
P
1
=P
T
:
In two dimensions, orthogonal matrices have the form

cossin
sincos
!
or

cossin
sincos
!
for some angleand they correspond to rotations or re‡ections.Page 162 So rows and columns being orthogonal means rowirowj= 0;i.e. they are
perpendicular to each other.
(cos;sin)(sin;cos) =
cossin+sincos= 0
(cos;sin)(sin;cos) =
cossinsincos= 0
v
= (cos
;sin)
T
! jv
j= cos
2
+
(sin)
2
= 1
Finally, ifP=

cossin
sincos
!
then
P
1
=
1
cos
2


sin
2


|
{z
}
=1

cossin
sincos
!
=P
T
:Page 163

2.6 Eigenvalues and Eigenvectors
IfAis a square matrix, the problem is to …nd values of (eigenvalue ) for
which
Av
=v
has a non
-trivial vector solution v
(eigenvector). W
e can write the above as
(AI)v
=0:
AnNNmatrix
has exactlyNeigenvalues, not all necessarily real or
distinct; they are the roots of the characteristic equation
det(AI) = 0 :
Each solution has a corresponding eigenvector v
:det(A
I)is thechar-
acteristic polynomial.Page 164 The eigenvectors can be regarded as special directions for the matrix A:In
complete generality this is a vast topic. Many Boundary-Value Problems can
be reduced to eigenvalue problems.
We will just look at real symmetric matrices for which A=A
T
. For these
matrices
The eigenvalues are real;
The eigenvectors corresponding to distinct eigenvalues are orthogonal;
The matrix can bediagonalised:that is, there is an orthogonal matrix
Psuch that
A=PDP
T
orP
T
AP=DPage 165 whereDisdiagonal , that i
s only the entries on the leading diagonal are
non-zero, and these are equal to the eigenvalues of A:
D=
0
B
B
B
B
B
B
@
10 0 0 0
0
...
...
...0
0
...
...
...0
0
...
...
...0
0 0 0 0n
1
C
C
C
C
C
C
A
Example:
A=
0
B
@
3 3 3
31 1
3 11
1
C
A
then so that the eigenvalues, i.e. the roots of this equation, are 1=3;
2=2and3= 6:Page 166 Eigenvectors are now obtained from
0
B
@
3i3 3
31i1
3 11i
1
C
Av
i=
0
B
@
0
0
0
1
C
Ai= 1;2
;3
1=3 :
0
B
@
6 3 3
3 2 1
3 1 2
1
C
A
0
B
@
x
y
z
1
C
A=
0
B
@
0
0
0
1
C
A
Upon row reduction we have
0
B
@
2 1 1
0 11
0 0 0







0
0
0
1
C
A!y=z;so putz=a
and2x=yz!x=)v
1=
0
B
@
1
1
1
1
C
A
SimilarlyPage 167

2=2 :v
2=
0
B
@
0
1
1
1
C
A; 3= 6
:v
3=
0
B
@
2
1
1
1
C
A
If we
take=== 1the corresponding eigenvectors are
v
1=
0
B
@
1
1
1
1
C
A;v
2=
0
B
@
0
1
1
1
C
A;v
3=
0
B
@
2
1
1
1
C
A
Now n
ormalise these, i.e.jv
j= 1:Us
ebv
=v
=jv
jfor no
rmalised eigen-
vectors
bv
1=
1
p
3
0
B
@
1
1
1
1
C
A;bv
2=
1
p
2
0
B
@
0
1
1
1
C
A;bv
3=
1
p
6
0
B
@
2
1
1
1
C
A
Hence
P=
0
B
B
B
@
1
p
3
0
2
p
6

1
p
3
1
p
2
1
p
6

1
p
3

1
p
2
1
p
6
1
C
C
C
A
!P
T
=
0
B
B
B
@
1
p
3

1
p
3

1
p
3
0
1
p
2

1
p
2
2
p
6
1
p
6
1
p
6
1
C
C
C
APage 168 so that
P
T
AP=
0
B
@
3
0 0
02 0
0 0 6
1
C
A
=D:Page 169 2.6.1 Criteria for invertibility
A system of linear equations is uniquely solvable if and only if the matrix $A$ is invertible. This in turn is true under any of the following criteria:

1. If and only if the determinant is non-zero;
2. If and only if all the eigenvalues are non-zero;
3. If (but not only if) it is strictly diagonally dominant.

In practice it takes far too long to work out the determinant. The second criterion is often useful though, and there are quite quick methods for working out the eigenvalues. The third criterion is explained below.

A matrix $A$ with entries $A_{ij}$ is strictly diagonally dominant (s.d.d.) if
$$|A_{ii}| > \sum_{j \ne i}|A_{ij}|.$$
That is, the diagonal element in each row is bigger in modulus than the sum of the moduli of the off-diagonal elements in that row. Consider the following examples:
$$\begin{pmatrix} 2 & 0 & 1 \\ 1 & 4 & 2 \\ 1 & 3 & 6 \end{pmatrix} \text{ is s.d.d. and so invertible;} \quad \begin{pmatrix} 1 & 0 & 2 \\ 2 & 5 & 1 \\ 3 & 2 & 13 \end{pmatrix} \text{ is not s.d.d. but still invertible;} \quad \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \text{ is neither s.d.d. nor invertible.}$$
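A short NumPy sketch of the s.d.d. test, run against the three example matrices alongside a determinant check:

```python
import numpy as np

def is_sdd(A):
    """Strict diagonal dominance: |A_ii| > sum of |off-diagonal| in each row."""
    d = np.abs(np.diag(A))
    off = np.abs(A).sum(axis=1) - d
    return bool(np.all(d > off))

for M in (np.array([[2, 0, 1], [1, 4, 2], [1, 3, 6]]),
          np.array([[1, 0, 2], [2, 5, 1], [3, 2, 13]]),
          np.array([[1, 1], [1, 1]])):
    invertible = not np.isclose(np.linalg.det(M), 0)
    print(is_sdd(M), invertible)   # (True, True), (False, True), (False, False)
```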
3 Differential Equations

3.1 Introduction

There are two types of differential equation (DE):

(i) Ordinary Differential Equation (ODE)

An equation involving (ordinary) derivatives
$$x, \; y, \; \frac{dy}{dx}, \; \frac{d^2y}{dx^2}, \; \dots, \; \frac{d^ny}{dx^n} \quad (\text{some fixed } n),$$
where $y$ is some unknown function of $x$, together with its derivatives, i.e.
$$F\left(x, y, y', y'', \dots, y^{(n)}\right) = 0 \qquad (1)$$
Note $y^4 \ne y^{(4)}$.

Also, if $y = y(t)$, where $t$ is time, then we often write
$$\dot y = \frac{dy}{dt}, \quad \ddot y = \frac{d^2y}{dt^2}, \quad \dots$$

(ii) Partial Differential Equation (PDE)
These involve partial derivatives, i.e. the unknown function depends on two or more variables, e.g.
$$\frac{\partial u}{\partial t} + \frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial u}{\partial z} - u = 0.$$
So here we are solving for the unknown function $u(x, y, z, t)$.

PDEs are more complicated to solve, but better for modelling real-life situations, e.g. finance, engineering & science. In quant finance there is no concept of spatial variables, unlike other branches of mathematics.

The order of the highest derivative is the order of the DE. An ODE is of degree $r$ if $\frac{d^ny}{dx^n}$ (where $n$ is the order of the derivative) appears with power $r$ ($r \in \mathbb{Z}^+$); the definitions of $n$ and $r$ are distinct. Assume that any ODE has the property that each $\frac{d^\ell y}{dx^\ell}$ appears in the form $\left(\frac{d^\ell y}{dx^\ell}\right)^r$, so
$$\left(\frac{d^ny}{dx^n}\right)^r \;\to\; \text{order } n \text{ and degree } r.$$
Examples:

    DE                                                   order   degree
(1) y' = 3y                                                1       1
(2) (y')^3 + 4 sin y = x^3                                 1       3
(3) (y^(4))^2 + x^2 (y^(2))^5 + (y')^6 + y = 0             4       2
(4) y'' = sqrt(y' + y + x)                                 2       2
(5) y'' + x(y')^3 - xy = 0                                 2       1

Note - example (4) can be written as $(y'')^2 = y' + y + x$.

We will consider ODEs of degree one, and of the form
$$a_n(x)\frac{d^ny}{dx^n} + a_{n-1}(x)\frac{d^{n-1}y}{dx^{n-1}} + \dots + a_1(x)\frac{dy}{dx} + a_0(x)y = g(x)$$
$$\equiv \sum_{i=0}^{n} a_i(x)\,y^{(i)}(x) = g(x) \quad \text{(more pedantic)}.$$
Note: $y^{(0)}(x)$ is the zeroth derivative, i.e. $y(x)$.

This is a Linear ODE of order $n$, i.e. $r = 1$ $\forall$ (for all) terms. Linear also because $a_i(x)$ is not a function of $y^{(i)}(x)$ - else the equation is Non-linear.

Examples:
    DE                                        Nature of DE
(1) 2xy'' + x^2 y' - (x+1)y = x^2             Linear
(2) yy'' + xy' + y = 2                        Non-Linear  (a_2 = y)
(3) y'' + sqrt(y') + y = x^2                  Non-Linear  (because of (y')^(1/2))
(4) d^4y/dx^4 + y^4 = 0                       Non-Linear  (because of y^4)

Our aim is to solve our ODE either explicitly, by finding the most general $y(x)$ satisfying it, or implicitly, by finding the function $y$ implicitly in terms of $x$ via the most general function $g$ s.t. $g(x, y) = 0$.

Suppose that $y$ is given in terms of $x$ and $n$ arbitrary constants of integration $c_1, c_2, \dots, c_n$, so $\tilde g(x, y; c_1, c_2, \dots, c_n) = 0$. Differentiating $\tilde g$ $n$ times gives $(n+1)$ equations involving $c_1, \dots, c_n, x, y, y', y'', \dots, y^{(n)}$. Eliminating $c_1, \dots, c_n$ we get an ODE
$$\tilde f\left(x, y, y', y'', \dots, y^{(n)}\right) = 0.$$
Examples:

(1) $y = x^3 + c\,e^{-3x}$ (so 1 constant $c$)
$$\Rightarrow \frac{dy}{dx} = 3x^2 - 3c\,e^{-3x},$$
so eliminate $c$ by taking $3y + y' = 3x^3 + 3x^2$, i.e.
$$-3x^2(x + 1) + 3y + y' = 0.$$

(2) $y = c_1e^{-x} + c_2e^{2x}$ (2 constants, so differentiate twice)
$$y' = -c_1e^{-x} + 2c_2e^{2x} \Rightarrow y'' = c_1e^{-x} + 4c_2e^{2x}$$
Now
$$y + y' = 3c_2e^{2x} \quad \text{(a)}, \qquad y' + y'' = 6c_2e^{2x} \quad \text{(b)},$$
and $2\text{(a)} = \text{(b)} \Rightarrow 2(y + y') = y' + y'' \to y'' - y' - 2y = 0$.

Conversely it can be shown (under suitable conditions) that the general solution of an $n$th order ODE will involve $n$ arbitrary constants. If we specify values (i.e. boundary values) of $y, y', \dots, y^{(n-1)}$ for values of $x$, then the constants involved may be determined.

A solution $y = y(x)$ of (1) is a function that produces zero upon substitution
into the lhs of (1).

Example:

$y'' - 3y' + 2y = 0$ is a 2nd order equation and $y = e^x$ is a solution: $y = y' = y'' = e^x$, and substituting into the equation gives $e^x - 3e^x + 2e^x = 0$. So we can verify that a function is a solution of a DE simply by substitution.

Exercise:

(1) Is $y(x) = c_1\sin 2x + c_2\cos 2x$ ($c_1$, $c_2$ arbitrary constants) a solution of $y'' + 4y = 0$?

(2) Determine whether $y = x^2 - 1$ is a solution of $\left(\frac{dy}{dx}\right)^4 + y^2 = -1$.

3.1.1 Initial & Boundary Value Problems
A DE together with conditions on the unknown function $y(x)$ and its derivatives, all given at the same value of the independent variable $x$, is called an Initial Value Problem (IVP).

e.g. $y'' + 2y' = e^x$; $y(\pi) = 1$, $y'(\pi) = 2$ is an IVP because both conditions are given at the same value $x = \pi$.

A Boundary Value Problem (BVP) is a DE together with conditions given at different values of $x$, i.e. $y'' + 2y' = e^x$; $y(0) = 1$, $y(1) = 1$. Here conditions are defined at different values $x = 0$ and $x = 1$.

A solution to an IVP or BVP is a function $y(x)$ that both solves the DE and satisfies all given initial or boundary conditions.

Exercise: Determine whether any of the following functions
(a) $y_1 = \sin 2x$  (b) $y_2 = x$  (c) $y_3 = \frac{1}{2}\sin 2x$
is a solution of the IVP
$$y'' + 4y = 0; \quad y(0) = 0, \; y'(0) = 1.$$

3.2 First Order Ordinary Differential Equations
The standard form for a first order DE (in the unknown function $y(x)$) is
$$y' = f(x, y) \qquad (2)$$
so a given 1st order ODE
$$F\left(x, y, y'\right) = 0$$
can often be rearranged into the form (2), e.g.
$$xy' + 2xy - y = 0 \Rightarrow y' = \frac{y - 2xy}{x}.$$

3.2.1 One Variable Missing
This is the simplest case.

$y$ missing:
$$y' = f(x), \quad \text{solution is} \quad y = \int f(x)\,dx$$
$x$ missing:
$$y' = f(y), \quad \text{solution is} \quad x = \int \frac{1}{f(y)}\,dy$$

Example:

$y' = \cos^2 y$, with $y = \frac{\pi}{4}$ when $x = 2$
$$\Rightarrow x = \int \frac{1}{\cos^2 y}\,dy = \int \sec^2 y\,dy \Rightarrow x = \tan y + c,$$
$c$ a constant of integration. This is the general solution. To obtain a particular solution use
$$y(2) = \frac{\pi}{4} \to 2 = \tan\frac{\pi}{4} + c \Rightarrow c = 1,$$
so rearranging gives
$$y = \arctan(x - 1).$$
3.2.2 Variable Separable

$$y' = g(x)h(y) \qquad (3)$$
So $f(x, y) = g(x)h(y)$ where $g$ and $h$ are functions of $x$ only and $y$ only, in turn. So
$$\frac{dy}{dx} = g(x)h(y) \to \int \frac{dy}{h(y)} = \int g(x)\,dx + c,$$
$c$ an arbitrary constant. Two examples follow:

$$\frac{dy}{dx} = \frac{x^2 + 2}{y}: \quad \int y\,dy = \int\left(x^2 + 2\right)dx \to \frac{y^2}{2} = \frac{x^3}{3} + 2x + c$$

$$\frac{dy}{dx} = y\ln x \quad \text{subject to } y = 1 \text{ at } x = e \quad (y(e) = 1):$$
$$\int \frac{dy}{y} = \int \ln x\,dx. \quad \text{Recall: } \int \ln x\,dx = x(\ln x - 1)$$
$$\ln y = x(\ln x - 1) + c \to y = A\exp(x\ln x - x), \quad A \text{ an arbitrary constant.}$$
Now putting $x = e$, $y = 1$ gives $A = 1$. So the solution becomes
$$y = \exp\left(\ln x^x\right)\exp(-x) \to y = x^xe^{-x} \Rightarrow y = \left(\frac{x}{e}\right)^x.$$
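A quick check of this separable example with SymPy (a sketch; dsolve handles the separation and applies the boundary condition y(e) = 1):

```python
import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')
sol = sp.dsolve(sp.Eq(y(x).diff(x), y(x)*sp.log(x)), y(x),
                ics={y(sp.E): 1})
print(sp.simplify(sol.rhs))   # exp(x*log(x) - x), i.e. (x/e)**x
```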
3.2.3 Linear Equations
These are equations of the form
$$y' + P(x)y = Q(x) \qquad (4)$$
which are similar to (3), but the presence of $Q(x)$ renders this no longer separable. We look for a function $R(x)$, called an Integrating Factor (I.F.), so that
$$R(x)y' + R(x)P(x)y = \frac{d}{dx}\left(R(x)y\right).$$
So upon multiplying the lhs of (4) by $R$, it becomes a derivative of $R(x)y$, i.e.
$$Ry' + RPy = Ry' + R'y.$$
This gives $RPy = R'y \Rightarrow R(x)P(x) = \frac{dR}{dx}$, which is a DE for $R$ which is separable, hence
$$\int \frac{dR}{R} = \int P\,dx + c \to \ln R = \int P\,dx + c.$$
So $R(x) = K\exp\left(\int P\,dx\right)$; hence there exists a function $R(x)$ with the required property. Multiply (4) through by $R(x)$:
$$\underbrace{R(x)\left[y' + P(x)y\right]}_{=\frac{d}{dx}(R(x)y)} = R(x)Q(x)$$
$$\frac{d}{dx}(Ry) = R(x)Q(x) \to R(x)y = \int R(x)Q(x)\,dx + B,$$
$B$ an arbitrary constant. We also know the form of $R(x)$, so
$$yK\exp\left(\int P\,dx\right) = \int K\exp\left(\int P\,dx\right)Q(x)\,dx + B.$$
Divide through by $K$ to give
$$y\exp\left(\int P\,dx\right) = \int \exp\left(\int P\,dx\right)Q(x)\,dx + \text{constant},$$
so we can take $K = 1$ in the expression for $R(x)$.

To solve $y' + P(x)y = Q(x)$, calculate $R(x) = \exp\left(\int P\,dx\right)$, which is the I.F.

Examples:
1. Solve $y' - \frac{1}{x}y = x^2$

In this case, comparing with (4), $P(x) \equiv -\frac{1}{x}$ and $Q(x) \equiv x^2$, therefore the I.F. is
$$R(x) = \exp\left(\int -\frac{1}{x}\,dx\right) = \exp(-\ln x) = \frac{1}{x}.$$
Multiply the DE by $\frac{1}{x}$:
$$\frac{1}{x}\left(y' - \frac{1}{x}y\right) = x \Rightarrow \frac{d}{dx}\left(\frac{y}{x}\right) = x \to \int d\left(x^{-1}y\right) = \int x\,dx + c$$
$$\Rightarrow \frac{y}{x} = \frac{x^2}{2} + c \Rightarrow \text{GS is } y = \frac{x^3}{2} + cx.$$

2. Obtain the general solution of $(1 + ye^x)\frac{dx}{dy} = e^x$:
$$\frac{dy}{dx} = (1 + ye^x)e^{-x} = e^{-x} + y \Rightarrow \frac{dy}{dx} - y = e^{-x},$$
which is a linear equation, with $P = -1$, $Q = e^{-x}$:
$$\text{I.F. } R(x) = \exp\left(-\int dx\right) = e^{-x},$$
so, multiplying the DE by the I.F.,
$$e^{-x}\left(y' - y\right) = e^{-2x} \to \frac{d}{dx}\left(ye^{-x}\right) = e^{-2x} \Rightarrow \int d\left(ye^{-x}\right) = \int e^{-2x}\,dx$$
$$ye^{-x} = -\frac{1}{2}e^{-2x} + c \Rightarrow y = ce^x - \frac{1}{2}e^{-x} \text{ is the GS.}
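As a sketch, SymPy's dsolve reproduces Example 1 via the same integrating-factor route:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')
sol = sp.dsolve(sp.Eq(y(x).diff(x) - y(x)/x, x**2), y(x))
print(sp.expand(sol.rhs))   # C1*x + x**3/2
```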
3.3 Second Order ODEs
A typical second order ODE (degree 1) is
$$y'' = f\left(x, y, y'\right);$$
the solution involves two arbitrary constants.

3.3.1 Simplest Cases

A: $y'$, $y$ missing, so
$$y'' = f(x).$$
Integrate wrt $x$ (twice): $y = \int\left(\int f(x)\,dx\right)dx$

Example: $y'' = 4x$
$$\text{GS } y = \int\left(\int 4x\,dx\right)dx = \int\left[2x^2 + C\right]dx = \frac{2x^3}{3} + Cx + D$$

B: $y$ missing, so
$$y'' = f\left(y', x\right).$$
Put $P = y' \to y'' = \frac{dP}{dx} = f(P, x)$, i.e. $P' = f(P, x)$ - a first order ODE. Solve once $\to P(x)$; solve again $\to y(x)$.

Example: Solve $x\frac{d^2y}{dx^2} + 2\frac{dy}{dx} = x^3$ (a sketch of this example appears at the end of this subsection).

Note: A is a special case of B.

C: $y'$ and $x$ missing, so
$$y'' = f(y).$$
Put $p = y'$, then
$$\frac{d^2y}{dx^2} = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy} = f(y).$$
So solve the 1st order ODE
$$p\frac{dp}{dy} = f(y),$$
which is separable, so
$$\int p\,dp = \int f(y)\,dy \to \frac{1}{2}p^2 = \int f(y)\,dy + \text{const.}$$

Example: Solve $y^3y'' = 4$
$$\Rightarrow y'' = \frac{4}{y^3}. \text{ Put } p = y' \to \frac{d^2y}{dx^2} = p\frac{dp}{dy} = \frac{4}{y^3}$$
$$\Rightarrow \int p\,dp = \int \frac{4}{y^3}\,dy \Rightarrow p^2 = -\frac{4}{y^2} + D \Rightarrow p = \pm\frac{\sqrt{Dy^2 - 4}}{y},$$
so from our definition of $p$,
$$\frac{dy}{dx} = \pm\frac{\sqrt{Dy^2 - 4}}{y} \Rightarrow \int dx = \pm\int \frac{y}{\sqrt{Dy^2 - 4}}\,dy.$$
Integrate the rhs by substitution (i.e. $u = Dy^2 - 4$) to give
$$x = \pm\frac{\sqrt{Dy^2 - 4}}{D} + E \to D^2(x - E)^2 = Dy^2 - 4 \Rightarrow \text{GS is } Dy^2 - D^2(x - E)^2 = 4.$$

D: $x$ missing: $y'' = f\left(y', y\right)$. Put $P = y'$, so
$$\frac{d^2y}{dx^2} = P\frac{dP}{dy} = f(P, y) \; \text{- a 1st order ODE.}$$
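A sketch of the Case B example above, done in two SymPy steps exactly as the method prescribes: with P = y', the equation x y'' + 2y' = x^3 becomes a first order linear ODE for P, which is solved and then integrated once more.

```python
import sympy as sp

x = sp.symbols('x', positive=True)
P = sp.Function('P')

# Step 1: x P' + 2P = x^3  (first order linear in P = y')
sol_P = sp.dsolve(sp.Eq(x*P(x).diff(x) + 2*P(x), x**3), P(x))
print(sol_P)                       # P(x) = C1/x**2 + x**3/5

# Step 2: integrate P once more to recover y (a second constant appears)
print(sp.integrate(sol_P.rhs, x))  # -C1/x + x**4/20
```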
3.3.2 Linear ODEs of Order at least 2

The general $n$th order linear ODE is of the form
$$a_n(x)y^{(n)} + a_{n-1}(x)y^{(n-1)} + \dots + a_1(x)y' + a_0(x)y = g(x).$$
Use symbolic notation:
$$D \equiv \frac{d}{dx}; \quad D^r \equiv \frac{d^r}{dx^r} \quad \text{so} \quad D^ry \equiv \frac{d^ry}{dx^r} \Rightarrow a_rD^r \equiv a_r(x)\frac{d^r}{dx^r}, \quad a_rD^ry = a_r(x)\frac{d^ry}{dx^r}.$$
Now introduce
$$L = a_nD^n + a_{n-1}D^{n-1} + a_{n-2}D^{n-2} + \dots + a_1D + a_0,$$
so we can write a linear ODE in the form
$$L\,y = g.$$
$L$ is a Linear Differential Operator of order $n$, and its definition will be used throughout.

If $g(x) = 0$ $\forall x$, then $L\,y = 0$ is said to be HOMOGENEOUS. $L\,y = 0$ is said to be the homogeneous part of $L\,y = g$.

$L$ is a linear operator because, as is trivially verified:
(1) $L(y_1 + y_2) = L(y_1) + L(y_2)$
(2) $L(cy) = cL(y)$, $c \in \mathbb{R}$

The GS of $Ly = g$ is given by
$$y = y_c + y_p$$
where $y_c$ is the Complementary Function and $y_p$ the Particular Integral (or Particular Solution):
$$y_c \text{ is a solution of } Ly = 0, \quad y_p \text{ is a solution of } Ly = g \;\Rightarrow\; \text{GS } y = y_c + y_p.$$

Look at the homogeneous case $Ly = 0$. Put $s$ = the set of all solutions of $Ly = 0$; then $s$ forms a vector space of dimension $n$. Functions $y_1(x), \dots, y_n(x)$ are LINEARLY DEPENDENT if $\exists\,\alpha_1, \dots, \alpha_n \in \mathbb{R}$ (not all zero) s.t.
$$\alpha_1y_1(x) + \alpha_2y_2(x) + \dots + \alpha_ny_n(x) = 0.$$
Otherwise the $y_i$'s ($i = 1, \dots, n$) are said to be LINEARLY INDEPENDENT (Lin. Indep.): whenever
$$\alpha_1y_1(x) + \alpha_2y_2(x) + \dots + \alpha_ny_n(x) = 0 \;\forall x,$$
then $\alpha_1 = \alpha_2 = \dots = \alpha_n = 0$.

FACT: (1) If $L$ is an $n$th order linear operator, then $\exists$ $n$ Lin. Indep. solutions $y_1, \dots, y_n$ of $Ly = 0$ s.t. the GS of $Ly = 0$ is given by
$$y = \alpha_1y_1 + \alpha_2y_2 + \dots + \alpha_ny_n, \quad \alpha_i \in \mathbb{R}, \; 1 \le i \le n.$$
(2) Any $n$ Lin. Indep. solutions of $Ly = 0$ have this property.

To solve $Ly = 0$ we need only find, by "hook or by crook", $n$ Lin. Indep. solutions.
3.3.3 Linear ODEs with Constant Coefficients

Consider the homogeneous case $Ly = 0$. All the basic features appear for the case $n = 2$, so we analyse this:
$$L\,y = a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0, \quad a, b, c \in \mathbb{R}.$$
Try a solution of the form $y = \exp(\lambda x)$:
$$L\left(e^{\lambda x}\right) = \left(aD^2 + bD + c\right)e^{\lambda x} = \left(a\lambda^2 + b\lambda + c\right)e^{\lambda x},$$
hence $a\lambda^2 + b\lambda + c = 0$, and so $\lambda$ is a root of the quadratic equation
$$a\lambda^2 + b\lambda + c = 0 \quad \text{- the AUXILIARY EQUATION (A.E.)}$$

There are three cases to consider:

(1) $b^2 - 4ac > 0$

So $\lambda_1 \ne \lambda_2 \in \mathbb{R}$, so the GS is
$$y = c_1\exp(\lambda_1x) + c_2\exp(\lambda_2x), \quad c_1, c_2 \text{ arb. const.}$$

(2) $b^2 - 4ac = 0$

So $\lambda = \lambda_1 = \lambda_2 = -\frac{b}{2a}$. Clearly $e^{\lambda x}$ is a solution of $L\,y = 0$ - but theory tells us there exist two solutions for a 2nd order ODE. So now try $y = x\exp(\lambda x)$:
$$L\left(xe^{\lambda x}\right) = \left(aD^2 + bD + c\right)\left(xe^{\lambda x}\right) = \underbrace{\left(a\lambda^2 + b\lambda + c\right)}_{=0}\left(xe^{\lambda x}\right) + \underbrace{(2a\lambda + b)}_{=0}\left(e^{\lambda x}\right) = 0.$$
This gives a 2nd solution $\Rightarrow$ GS is $y = c_1\exp(\lambda x) + c_2x\exp(\lambda x)$, hence
$$y = (c_1 + c_2x)\exp(\lambda x).$$

(3) $b^2 - 4ac < 0$

So $\lambda_1 \ne \lambda_2 \in \mathbb{C}$ - a complex conjugate pair $\lambda = p \pm iq$, where
$$p = -\frac{b}{2a}, \quad q = \frac{1}{2a}\sqrt{4ac - b^2} \quad (\ne 0).$$
Hence
$$y = c_1e^{(p + iq)x} + c_2e^{(p - iq)x} = c_1e^{px}e^{iqx} + c_2e^{px}e^{-iqx} = e^{px}\left(c_1e^{iqx} + c_2e^{-iqx}\right).$$
Euler's identity gives $\exp(\pm i\theta) = \cos\theta \pm i\sin\theta$. Simplifying (using Euler) then gives the GS
$$y(x) = e^{px}(A\cos qx + B\sin qx).$$

Examples:

(1) $y'' - 3y' - 4y = 0$

Put $y = e^{\lambda x}$ to obtain the A.E.:
$$\lambda^2 - 3\lambda - 4 = 0 \to (\lambda - 4)(\lambda + 1) = 0 \Rightarrow \lambda = 4 \text{ \& } -1 \text{ - 2 distinct real roots}$$
$$\text{GS } y(x) = Ae^{4x} + Be^{-x}$$
(2) $y'' - 8y' + 16y = 0$

A.E. $\lambda^2 - 8\lambda + 16 = 0 \to (\lambda - 4)^2 = 0 \Rightarrow \lambda = 4, 4$ (2-fold root). 'Go up one', i.e. instead of $y = e^{\lambda x}$, take $y = xe^{\lambda x}$:
$$\text{GS } y(x) = (C + Dx)e^{4x}$$

(3) $y'' - 3y' + 4y = 0$

A.E. $\lambda^2 - 3\lambda + 4 = 0 \to \lambda = \frac{3 \pm \sqrt{9 - 16}}{2} = \frac{3 \pm i\sqrt 7}{2} \equiv p \pm iq$
$$\left(p = \frac{3}{2}, \; q = \frac{\sqrt 7}{2}\right) \to y = e^{\frac{3}{2}x}\left(a\cos\frac{\sqrt 7}{2}x + b\sin\frac{\sqrt 7}{2}x\right)$$
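A sketch checking examples (1) and (3) with SymPy, which solves constant-coefficient equations via the same auxiliary-equation machinery:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
print(sp.dsolve(y(x).diff(x, 2) - 3*y(x).diff(x) - 4*y(x), y(x)))
# y(x) = C1*exp(-x) + C2*exp(4*x)
print(sp.dsolve(y(x).diff(x, 2) - 3*y(x).diff(x) + 4*y(x), y(x)))
# y(x) = (C1*sin(sqrt(7)*x/2) + C2*cos(sqrt(7)*x/2))*exp(3*x/2)
```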
3.4 General nth Order Equation
Consider
$$Ly = a_ny^{(n)} + a_{n-1}y^{(n-1)} + \dots + a_1y' + a_0y = 0,$$
then
$$L \equiv a_nD^n + a_{n-1}D^{n-1} + a_{n-2}D^{n-2} + \dots + a_1D + a_0,$$
so $Ly = 0$ and the A.E. becomes
$$a_n\lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1\lambda + a_0 = 0.$$

Case 1 (Basic)

$n$ distinct roots $\lambda_1, \dots, \lambda_n$; then $e^{\lambda_1x}, e^{\lambda_2x}, \dots, e^{\lambda_nx}$ are $n$ Lin. Indep. solutions, giving a GS
$$y = \alpha_1e^{\lambda_1x} + \alpha_2e^{\lambda_2x} + \dots + \alpha_ne^{\lambda_nx}, \quad \alpha_i \text{ arb.}$$

Case 2

If $\lambda$ is a real $r$-fold root of the A.E. then $e^{\lambda x}, xe^{\lambda x}, x^2e^{\lambda x}, \dots, x^{r-1}e^{\lambda x}$ are $r$ Lin. Indep. solutions of $Ly = 0$, i.e.
$$y = e^{\lambda x}\left(\alpha_1 + \alpha_2x + \alpha_3x^2 + \dots + \alpha_rx^{r-1}\right), \quad \alpha_i \text{ arb.}$$

Case 3
If $\lambda = p + iq$ is an $r$-fold root of the A.E. then so is $p - iq$:
$$e^{px}\cos qx, \; xe^{px}\cos qx, \; \dots, \; x^{r-1}e^{px}\cos qx$$
$$e^{px}\sin qx, \; xe^{px}\sin qx, \; \dots, \; x^{r-1}e^{px}\sin qx$$
$\to 2r$ Lin. Indep. solutions of $L\,y = 0$:
$$\text{GS } y = e^{px}\left(c_1 + c_2x + c_3x^2 + \dots\right)\cos qx + e^{px}\left(C_1 + C_2x + C_3x^2 + \dots\right)\sin qx$$
Examples: Find the GS of each ODE

(1) $y^{(4)} - 5y'' + 6y = 0$

A.E. $\lambda^4 - 5\lambda^2 + 6 = 0 \to \left(\lambda^2 - 2\right)\left(\lambda^2 - 3\right) = 0$. So $\lambda = \pm\sqrt 2$, $\lambda = \pm\sqrt 3$ - four distinct roots
$$\Rightarrow \text{GS } y = Ae^{\sqrt 2 x} + Be^{-\sqrt 2 x} + Ce^{\sqrt 3 x} + De^{-\sqrt 3 x} \quad \text{(Case 1)}$$

(2) $\frac{d^6y}{dx^6} - 5\frac{d^4y}{dx^4} = 0$

A.E. $\lambda^6 - 5\lambda^4 = 0$; roots: $0, 0, 0, 0, \pm\sqrt 5$
$$\text{GS } y = Ae^{\sqrt 5 x} + Be^{-\sqrt 5 x} + \left(C + Dx + Ex^2 + Fx^3\right) \quad (\because \exp(0) = 1)$$

(3) $\frac{d^4y}{dx^4} + 2\frac{d^2y}{dx^2} + y = 0$

A.E. $\lambda^4 + 2\lambda^2 + 1 = \left(\lambda^2 + 1\right)^2 = 0$; $\lambda = \pm i$ is a 2-fold root. Example of Case 3:
$$y = A\cos x + Bx\cos x + C\sin x + Dx\sin x$$

3.5 Non-Homogeneous Case - Method of Undetermined Coefficients
GS $y$ = C.F + P.I

The C.F comes from the roots of the A.E. There are three methods for finding the P.I:

(a) "Guesswork" - which we are interested in
(b) Annihilator
(c) D-operator Method

(a) Guesswork Method

If the rhs of the ODE, $g(x)$, is of a certain type, we can guess the form of the P.I. We then try it out and determine the numerical coefficients. The method will work when $g(x)$ has the following forms:

i. A polynomial in $x$: $g(x) = p_0 + p_1x + p_2x^2 + \dots + p_mx^m$.
ii. An exponential: $g(x) = Ce^{kx}$ (provided $k$ is not a root of the A.E.).
iii. Trigonometric terms: $g(x)$ has the form $\sin ax$, $\cos ax$ (provided $ia$ is not a root of the A.E.).
iv. $g(x)$ is a combination of i., ii., iii., provided $g(x)$ does not contain part of the C.F (in which case use other methods).
Examples:

(1) $y'' + 3y' + 2y = 3e^{5x}$

For the homogeneous part the A.E. is $\lambda^2 + 3\lambda + 2 = 0$, so $y_c = Ae^{-x} + Be^{-2x}$. For the non-homogeneous part we note that $g(x)$ has the form $e^{kx}$, and $k = 5$ is not a solution of the A.E., so try $y_p = Ce^{5x}$. Substituting $y_p$ into the DE gives
$$C\left(5^2 + 15 + 2\right)e^{5x} = 3e^{5x} \to C = \frac{1}{14}$$
$$\Rightarrow y = Ae^{-x} + Be^{-2x} + \frac{1}{14}e^{5x}$$

(2) $y'' + 3y' + 2y = x^2$
GS $y$ = C.F + P.I = $y_c + y_p$

C.F: the A.E. gives $\lambda^2 + 3\lambda + 2 = 0 \Rightarrow \lambda = -1, -2 \Rightarrow y_c = ae^{-x} + be^{-2x}$

P.I: Now $g(x) = x^2$, so try $y_p = p_0 + p_1x + p_2x^2 \to y_p' = p_1 + 2p_2x \to y_p'' = 2p_2$. Now substitute these into the DE, i.e.
$$2p_2 + 3(p_1 + 2p_2x) + 2\left(p_0 + p_1x + p_2x^2\right) = x^2,$$
and equate coefficients of $x^n$:
$$O\left(x^2\right): \; 2p_2 = 1 \Rightarrow p_2 = \frac{1}{2}$$
$$O(x): \; 6p_2 + 2p_1 = 0 \Rightarrow p_1 = -\frac{3}{2}$$
$$O\left(x^0\right): \; 2p_2 + 3p_1 + 2p_0 = 0 \Rightarrow p_0 = \frac{7}{4}$$
$$\Rightarrow \text{GS } y = ae^{-x} + be^{-2x} + \frac{7}{4} - \frac{3}{2}x + \frac{1}{2}x^2$$

(3) $y'' - 5y' - 6y = \cos 3x$
A.E. $\lambda^2 - 5\lambda - 6 = 0 \Rightarrow \lambda = -1, 6 \Rightarrow y_c = \alpha e^{-x} + \beta e^{6x}$

Guided by the rhs, i.e. $g(x)$ is a trigonometric term, we can try $y_p = A\cos 3x + B\sin 3x$ and calculate the coefficients $A$ and $B$.

How about a more sublime approach? Put $y_p = \operatorname{Re} Ke^{i3x}$ for the unknown coefficient $K$:
$$\to y_p' = 3\operatorname{Re} iKe^{i3x} \to y_p'' = -9\operatorname{Re} Ke^{i3x},$$
and substitute into the DE, dropping $\operatorname{Re}$:
$$(-9 - 15i - 6)Ke^{i3x} = e^{i3x}$$
$$-15(1 + i)K = 1 \Rightarrow K = -\frac{1}{15}\cdot\frac{1}{1 + i} = -\frac{1}{15}\cdot\frac{1}{2}(1 - i).$$
Hence $K = -\frac{1}{30}(1 - i)$, to give
$$y_p = -\frac{1}{30}\operatorname{Re}\left[(1 - i)(\cos 3x + i\sin 3x)\right] = -\frac{1}{30}\left(\cos 3x + \sin 3x\right),$$
so the general solution becomes
$$y = \alpha e^{-x} + \beta e^{6x} - \frac{1}{30}(\cos 3x + \sin 3x).
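A one-line SymPy sketch verifying that this particular integral satisfies the equation (the residual simplifies to zero):

```python
import sympy as sp

x = sp.symbols('x')
yp = -(sp.cos(3*x) + sp.sin(3*x))/30
residual = yp.diff(x, 2) - 5*yp.diff(x) - 6*yp - sp.cos(3*x)
print(sp.simplify(residual))   # 0
```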
3.5.1 Failure Case
Consider the DE $y'' - 5y' + 6y = e^{2x}$, which has a C.F given by $y(x) = \alpha e^{2x} + \beta e^{3x}$. To find a P.I, if we try $y_p = Ae^{2x}$, we have upon substitution
$$Ae^{2x}[4 - 10 + 6] = e^{2x},$$
so when $k$ ($= 2$) is also a solution of the C.F, the trial solution $y_p = Ae^{kx}$ fails, and we must seek an alternative solution.

For $Ly = y'' + ay' + by = e^{kx}$ the trial function is normally $y_p = Ce^{kx}$. If $k$ is a root of the A.E. then $L\left(Ce^{kx}\right) = 0$, so this substitution does not work. In this case, we try $y_p = Cxe^{kx}$ - so 'go one up'. This works provided $k$ is not a repeated root of the A.E.; if it is, try $y_p = Cx^2e^{kx}$, and so forth.

3.6 Linear ODEs with Variable Coefficients - Euler Equation
In the previous sections we have looked at various second order DEs with constant coefficients. We now introduce a 2nd order equation in which the coefficients are variable in $x$. An equation of the form
$$L\,y = ax^2\frac{d^2y}{dx^2} + bx\frac{dy}{dx} + cy = g(x)$$
is called a Cauchy-Euler equation. Note the relationship between each coefficient and the corresponding derivative term, i.e. $a_n(x) = ax^n$ multiplies $\frac{d^ny}{dx^n}$: both the power and the order of the derivative are $n$.
The equation is still linear. To solve the homogeneous part, we look for a solution of the form
$$y = x^\lambda.$$
So $y' = \lambda x^{\lambda - 1} \to y'' = \lambda(\lambda - 1)x^{\lambda - 2}$, which upon substitution yields the quadratic A.E.
$$a\lambda^2 + \tilde b\lambda + c = 0 \quad [\text{where } \tilde b = (b - a)],$$
which can be solved in the usual way - there are 3 cases to consider, depending upon the nature of $\tilde b^2 - 4ac$:

Case 1: $\tilde b^2 - 4ac > 0 \to \lambda_1, \lambda_2 \in \mathbb{R}$ - 2 real distinct roots
$$\text{GS } y = Ax^{\lambda_1} + Bx^{\lambda_2}$$

Case 2: $\tilde b^2 - 4ac = 0 \to \lambda = \lambda_1 = \lambda_2 \in \mathbb{R}$ - 1 real (double) root
$$\text{GS } y = x^\lambda(A + B\ln x)$$

Case 3: $\tilde b^2 - 4ac < 0 \to \lambda = \alpha \pm i\beta \in \mathbb{C}$ - a pair of complex conjugate roots
$$\text{GS } y = x^\alpha\left(A\cos(\beta\ln x) + B\sin(\beta\ln x)\right)$$

Example 1
Solve $x^2y'' - 2xy' - 4y = 0$

Put $y = x^\lambda \Rightarrow y' = \lambda x^{\lambda - 1} \Rightarrow y'' = \lambda(\lambda - 1)x^{\lambda - 2}$ and substitute into the DE to obtain (upon simplification) the A.E.
$$\lambda^2 - 3\lambda - 4 = 0 \to (\lambda - 4)(\lambda + 1) = 0 \Rightarrow \lambda = 4 \text{ \& } -1: \text{ 2 distinct real roots.}$$
So the GS is
$$y(x) = Ax^4 + Bx^{-1}$$

Example 2

Solve $x^2y'' - 7xy' + 16y = 0$

So assume $y = x^\lambda$; A.E. $\lambda^2 - 8\lambda + 16 = 0 \Rightarrow \lambda = 4, 4$ (2-fold root). 'Go up one', i.e. instead of $y = x^\lambda$, take $y = x^\lambda\ln x$ to give
$$y(x) = x^4(A + B\ln x)$$

Example 3
Solve $x^2y'' - 3xy' + 13y = 0$

Assume the existence of a solution of the form $y = x^\lambda$. The A.E. becomes
$$\lambda^2 - 4\lambda + 13 = 0 \to \lambda = \frac{4 \pm \sqrt{16 - 52}}{2} = \frac{4 \pm 6i}{2}$$
$$\lambda_1 = 2 + 3i, \quad \lambda_2 = 2 - 3i \equiv \alpha \pm i\beta \quad (\alpha = 2, \beta = 3)$$
$$y = x^2\left(A\cos(3\ln x) + B\sin(3\ln x)\right)$$
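A SymPy sketch of Example 1 (dsolve recognises Euler equations and should return the same power-law solutions):

```python
import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')
ode = sp.Eq(x**2*y(x).diff(x, 2) - 2*x*y(x).diff(x) - 4*y(x), 0)
print(sp.dsolve(ode))   # expect y(x) = C1/x + C2*x**4
```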
3.6.1 Reduction to constant coefficient

The Euler equation considered above can be reduced to the constant coefficient problem discussed earlier by use of a suitable transform. To illustrate this simple technique we use a specific example.

Solve $x^2y'' - xy' + y = \ln x$

Use the substitution $x = e^t$, i.e. $t = \ln x$. We now rewrite the equation in terms of the variable $t$, so require new expressions for the derivatives (chain rule):
$$\frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx} = \frac{1}{x}\frac{dy}{dt}$$
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{1}{x}\frac{dy}{dt}\right) = \frac{1}{x}\frac{dt}{dx}\frac{d}{dt}\left(\frac{dy}{dt}\right) - \frac{1}{x^2}\frac{dy}{dt} = \frac{1}{x^2}\frac{d^2y}{dt^2} - \frac{1}{x^2}\frac{dy}{dt}$$
$\Rightarrow$ the Euler equation becomes
$$x^2\left(\frac{1}{x^2}\frac{d^2y}{dt^2} - \frac{1}{x^2}\frac{dy}{dt}\right) - x\left(\frac{1}{x}\frac{dy}{dt}\right) + y = t \to y''(t) - 2y'(t) + y = t.$$
The solution of the homogeneous part, i.e. the C.F., is $y_c = e^t(A + Bt)$. The particular integral (P.I.) is obtained by using $y_p = p_0 + p_1t$, to give
$$y_p = 2 + t.$$
The GS of this equation becomes
$$y(t) = e^t(A + Bt) + 2 + t,$$
which is a function of $t$. The original problem was $y = y(x)$, so we use our transformation $t = \ln x$ to get the GS
$$y = x(A + B\ln x) + 2 + \ln x.$$

3.7 Partial Differential Equations
The formation (and solution) of PDEs forms the basis of a large number of mathematical models used to study physical situations arising in science, engineering and medicine. More recently their use has extended to the modelling of problems in finance and economics.

We now look at the second type of DE, i.e. PDEs. These have partial derivatives instead of ordinary derivatives.

One of the underlying equations in finance, the Black-Scholes equation for the price of an option $V(S, t)$, is an example of a linear PDE:
$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2S^2\frac{\partial^2V}{\partial S^2} + (r - D)S\frac{\partial V}{\partial S} - rV = 0,$$
provided $\sigma, D, r$ are not functions of $V$ or any of its derivatives.

If we let $u = u(x, y)$, then the general form of a linear 2nd order PDE is
$$A\frac{\partial^2u}{\partial x^2} + B\frac{\partial^2u}{\partial x\,\partial y} + C\frac{\partial^2u}{\partial y^2} + D\frac{\partial u}{\partial x} + E\frac{\partial u}{\partial y} + Fu = G \qquad (1)$$
where the coefficients $A, \dots, G$ are functions of $x$ & $y$. When $G(x, y) = 0$, (1) is homogeneous; when $G$ is non-zero, (1) is non-homogeneous. The equation is classified as:

- hyperbolic if $B^2 - 4AC > 0$
- parabolic if $B^2 - 4AC = 0$
- elliptic if $B^2 - 4AC < 0$

In the context of mathematical finance we are only interested in the second type, i.e. parabolic. There are several methods for obtaining solutions of PDEs. We look at a simple (but useful) technique:

3.7.1 Method of Separation of Variables
Without loss of generality, we solve the one-dimensional heat equation
$$\frac{\partial u}{\partial t} = c^2\frac{\partial^2u}{\partial x^2} \qquad (*)$$
for the unknown function $u(x, t)$.

In this method we assume the existence of a solution which is a product of a function of $x$ (only) and a function of $t$ (only). So the form is
$$u(x, t) = X(x)T(t).$$
We substitute this in (*), so
$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial t}(XT) = XT', \qquad \frac{\partial^2u}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial}{\partial x}(XT)\right) = \frac{\partial}{\partial x}\left(X'T\right) = X''T.$$
Therefore (*) becomes
$$XT' = c^2X''T;$$
dividing through by $c^2XT$ gives
$$\frac{T'}{c^2T} = \frac{X''}{X}.$$
The LHS is independent of $x$ and the RHS is independent of $t$, so each side must equal a constant. The convention is to write this constant as $\lambda^2$ or $-\lambda^2$. There are three possible cases:

Case 1: $\lambda^2 > 0$
$$\frac{T'}{c^2T} = \frac{X''}{X} = \lambda^2 \quad \text{leading to} \quad T' - \lambda^2c^2T = 0, \quad X'' - \lambda^2X = 0,$$
which have solutions, in turn,
$$T(t) = k\exp\left(c^2\lambda^2t\right), \qquad X(x) = A\cosh(\lambda x) + B\sinh(\lambda x).$$
So the solution is
$$u(x, t) = XT = k\exp\left(c^2\lambda^2t\right)\left\{A\cosh(\lambda x) + B\sinh(\lambda x)\right\}.$$
Therefore $u = \exp\left(c^2\lambda^2t\right)\left\{\alpha\cosh(\lambda x) + \beta\sinh(\lambda x)\right\}$ $(\alpha = Ak, \beta = Bk)$.

Case 2: $-\lambda^2 < 0$
$$\frac{T'}{c^2T} = \frac{X''}{X} = -\lambda^2,$$
which gives
$$T' + \lambda^2c^2T = 0, \quad X'' + \lambda^2X = 0,$$
resulting in the solutions
$$T = k\exp\left(-c^2\lambda^2t\right), \qquad X = A\cos(\lambda x) + B\sin(\lambda x),$$
respectively. Hence
$$u(x, t) = \exp\left(-c^2\lambda^2t\right)\left\{\alpha\cos(\lambda x) + \beta\sin(\lambda x)\right\}$$
where $(\alpha = kA, \beta = kB)$.

Case 3: $\lambda^2 = 0$
$$T' = 0, \quad X'' = 0 \to T(t) = \tilde A, \quad X = \tilde Bx + \tilde C,$$
which gives the simple solution
$$u(x, t) = \hat Ax + \hat C$$
where $\left(\hat A = \tilde A\tilde B, \; \hat C = \tilde A\tilde C\right)$.
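A SymPy sketch confirming that the Case 2 product solution satisfies (*) identically:

```python
import sympy as sp

x, t, c, lam, a, b = sp.symbols('x t c lam alpha beta')
u = sp.exp(-c**2*lam**2*t)*(a*sp.cos(lam*x) + b*sp.sin(lam*x))
print(sp.simplify(u.diff(t) - c**2*u.diff(x, 2)))   # 0
```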
1 Probability
1.1 Preliminaries

An experiment is a repeatable process that gives rise to a number of outcomes.

An event is a collection (or set) of one or more outcomes.

A sample space is the set of all possible outcomes of an experiment, often denoted $\Omega$.

Example

In an experiment a dice is rolled and the number appearing on top is recorded. Thus
$$\Omega = \{1, 2, 3, 4, 5, 6\}$$
If $E_1, E_2, E_3$ are the events even, odd and prime occurring, then
$$E_1 = \{2, 4, 6\}, \quad E_2 = \{1, 3, 5\}, \quad E_3 = \{2, 3, 5\}$$
1.1.1 Probability Scale

The probability of an event $E$ occurring, i.e. $P(E)$, is less than or equal to 1 and greater than or equal to 0:
$$0 \le P(E) \le 1$$

1.1.2 Probability of an Event

The probability of an event occurring is defined as:
$$P(E) = \frac{\text{The number of ways the event can occur}}{\text{Total number of outcomes}}$$

Example

A fair dice is tossed. The event A is defined as 'the number obtained is a multiple of 3'. Determine $P(A)$.
$$\Omega = \{1, 2, 3, 4, 5, 6\}, \quad A = \{3, 6\} \Rightarrow P(A) = \frac{2}{6}$$

1.1.3 The Complementary Event $E'$

An event $E$ occurs or it does not. If $E$ is the event then $E'$ is the complementary event, i.e. not $E$, where
$$P(E') = 1 - P(E)$$
1.2 Probability Diagrams
It is useful to represent problems diagrammatically. Three useful diagrams are:

- Sample space or two way table
- Tree diagram
- Venn diagram

Example

Two dice are thrown and their numbers added together. What is the probability of achieving a total of 8?
$$P(8) = \frac{5}{36}$$

Example

A bag contains 4 red, 5 yellow and 11 blue balls. A ball is pulled out at random, its colour noted and then replaced. What is the probability of picking a red and a blue ball in any order?
$$P(\text{Red and Blue}) \text{ or } P(\text{Blue and Red}) = \left(\frac{4}{20}\right)\left(\frac{11}{20}\right) + \left(\frac{11}{20}\right)\left(\frac{4}{20}\right) = \frac{11}{50}$$
Venn Diagram

A Venn diagram is a way of representing data sets or events. Consider two events A and B; a Venn diagram represents them via the regions $A \cup B$ ("A or B") and $A \cap B$ ("A and B").

Addition Rule:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
or
$$P(A \cap B) = P(A) + P(B) - P(A \cup B)$$
Example

In a class of 30 students, 7 are in the choir, 5 are in the school band and 2 students are in both the choir and the school band. A student is chosen at random from the class. Find:

a) The probability the student is not in the band
b) The probability the student is neither in the choir nor in the band

$$P(\text{not in band}) = \frac{5 + 20}{30} = \frac{25}{30} = \frac{5}{6}$$
$$P(\text{not in either}) = \frac{20}{30} = \frac{2}{3}$$
Example

A vet surveys 100 of her clients; she finds that:

(i) 25 own dogs
(ii) 53 own cats
(iii) 40 own tropical fish
(iv) 15 own dogs and cats
(v) 10 own cats and tropical fish
(vi) 11 own dogs and tropical fish
(vii) 7 own dogs, cats and tropical fish

If she picks a client at random, find:

a) P(Owns dogs only)
b) P(Does not own tropical fish)
c) P(Does not own dogs, cats or tropical fish)

$$P(\text{Dogs only}) = \frac{6}{100}$$
$$P(\text{Does not own tropical fish}) = \frac{6 + 8 + 35 + 11}{100} = \frac{60}{100}$$
$$P(\text{Does not own dogs, cats or tropical fish}) = \frac{11}{100}$$
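The Venn-diagram counts above can be reproduced by inclusion-exclusion; a small Python sketch:

```python
dogs, cats, fish = 25, 53, 40
dc, cf, df, dcf = 15, 10, 11, 7

dogs_only = dogs - dc - df + dcf                    # 6
any_pet = dogs + cats + fish - dc - cf - df + dcf   # 89 own at least one pet
print(dogs_only / 100)         # a) 0.06
print((100 - fish) / 100)      # b) 0.60
print((100 - any_pet) / 100)   # c) 0.11
```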
1.3 Conditional Probability

The probability of an event B may be different if you know that a dependent event A has already occurred.

Example

Consider a school which has 100 students in its sixth form. 50 students study mathematics, 29 study biology and 13 study both subjects. You walk into a biology class and select a student at random. What is the probability that this student also studies mathematics?
$$P(\text{study maths given they study biology}) = P(M|B) = \frac{13}{29}$$
In general, we have:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
or, the Multiplication Rule:
$$P(A \cap B) = P(A|B)P(B)$$

Example

You are dealt exactly two playing cards from a well shuffled standard 52 card deck. What is the probability that both your cards are Kings?

Tree Diagram $\to$
$$P(K \cap K) = \frac{4}{52}\times\frac{3}{51} = \frac{1}{221} \approx 0.5\%$$
or
$$P(K \cap K) = P(\text{2nd is King}|\text{first is King})\,P(\text{first is King}) = \frac{3}{51}\times\frac{4}{52}$$
We know
$$P(A \cap B) = P(B \cap A),$$
so
$$P(A \cap B) = P(A|B)P(B), \qquad P(B \cap A) = P(B|A)P(A),$$
i.e.
$$P(A|B)P(B) = P(B|A)P(A),$$
or

Bayes' Theorem:
$$P(B|A) = \frac{P(A|B)P(B)}{P(A)}$$

Example

You have 10 coins in a bag; 9 are fair and 1 is double headed. You pull out a coin from the bag and do not examine it. Find:

1. The probability of getting 5 heads in a row
2. The probability that, if you get 5 heads, you picked the double headed coin
$$P(5\text{ heads}) = P(5\text{ heads}|N)P(N) + P(5\text{ heads}|H)P(H) = \left(\frac{1}{32}\right)\left(\frac{9}{10}\right) + (1)\left(\frac{1}{10}\right) = \frac{41}{320} \approx 13\%$$
$$P(H|5\text{ heads}) = \frac{P(5\text{ heads}|H)P(H)}{P(5\text{ heads})} = \frac{1\times\frac{1}{10}}{\frac{41}{320}} = \frac{32}{41} \approx 78\%$$
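A Monte Carlo sketch of this Bayes example (pure standard library; the estimates converge on 41/320 and 32/41):

```python
import random

trials = 1_000_000
five_heads = heads_and_double = 0
for _ in range(trials):
    double = random.randrange(10) == 0          # 1-in-10 chance of the trick coin
    p_head = 1.0 if double else 0.5
    if all(random.random() < p_head for _ in range(5)):
        five_heads += 1
        heads_and_double += double
print(five_heads / trials)              # ~0.128 = 41/320
print(heads_and_double / five_heads)    # ~0.78  = 32/41
```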
1.4 Mutually exclusive and Independent events

When events cannot happen at the same time, i.e. they have no outcomes in common, they are called mutually exclusive. If this is the case, then
$$P(A \cap B) = 0$$
and the addition rule becomes
$$P(A \cup B) = P(A) + P(B)$$

Example

Two dice are rolled; event A is 'the sum of the outcomes on both dice is 5' and event B is 'the outcome on each dice is the same'.

When one event has no effect on another event, the two events are said to be independent, i.e.
$$P(A|B) = P(A)$$
and the multiplication rule becomes
$$P(A \cap B) = P(A)P(B)$$
Example

A red dice and a blue dice are rolled; if event A is 'the outcome on the red dice is 3' and event B is 'the outcome on the blue dice is 3', then events A and B are said to be independent.

1.5 Two famous problems

- Birthday Problem - What is the probability that at least 2 people share the same birthday?
- Monty Hall Game Show - Would you swap?
1.6 Random Variables

1.6.1 Notation

Random Variables: $X, Y, Z$. Observed Variables: $x, y, z$.

1.6.2 Definition

Outcomes of experiments are not always numbers, e.g. two heads appearing; picking an ace from a deck of cards. We need some way of assigning real numbers to each random event. Random variables assign numbers to events. Thus a random variable (RV) $X$ is a function which maps from the sample space to the number line.

Example

Let $X$ = the number facing up when a fair dice is rolled, or let $X$ represent the outcome of a coin toss, where
$$X = \begin{cases} 1 & \text{if heads} \\ 0 & \text{if tails} \end{cases}$$

1.6.3 Types of Random Variable

1. Discrete - Countable outcomes, e.g. roll of a dice, rain or no rain
2. Continuous - Infinite number of outcomes, e.g. exact amount of rain in mm
1.7 Probability Distributions
Whether you are dealing with a discrete or a continuous random variable determines how you define your probability distribution.

1.7.1 Discrete distributions

When dealing with a discrete random variable we define the probability distribution using a probability mass function, or simply a probability function.

Example

The RV $X$ is defined as 'the sum of scores shown by two fair six sided dice'. Find the probability distribution of $X$.

From a sample space diagram for the experiment, the distribution can be tabulated as:

x        2     3     4     5     6     7     8     9     10    11    12
P(X=x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

or can be represented on a graph.
1.7.2 Continuous Distributions

As continuous random variables can take any value, i.e. an infinite number of values, we must define our probability distribution differently. For a continuous RV the probability of getting a specific value is zero, i.e.
$$P(X = x) = 0,$$
and so, just as we go from bar charts to histograms when representing discrete and continuous data, we must use a probability density function (PDF) when describing the probability distribution of a continuous RV:
$$P(a < X < b) = \int_a^b f(x)\,dx$$
Properties of a PDF:

- $f(x) \ge 0$ since probabilities are always positive
- $\int_{-\infty}^{+\infty} f(x)\,dx = 1$
- $P(a < X < b) = \int_a^b f(x)\,dx$

Example

The random variable $X$ has the probability density function:
$$f(x) = \begin{cases} k & 1 < x < 2 \\ k(x - 1) & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
a) Find $k$ and sketch the probability distribution
b) Find $P(X \le 1.5)$

a)
$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$
$$1 = \int_1^2 k\,dx + \int_2^4 k(x - 1)\,dx = [kx]_1^2 + \left[\frac{kx^2}{2} - kx\right]_2^4$$
$$1 = 2k - k + [(8k - 4k) - (2k - 2k)] = 5k \Rightarrow k = \frac{1}{5}$$

b)
$$P(X \le 1.5) = \int_1^{1.5} \frac{1}{5}\,dx = \left[\frac{x}{5}\right]_1^{1.5} = \frac{1}{10}$$
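A SymPy sketch of parts a) and b), solving the normalisation condition for k and then integrating:

```python
import sympy as sp

x, k = sp.symbols('x k')
total = sp.integrate(k, (x, 1, 2)) + sp.integrate(k*(x - 1), (x, 2, 4))
k_val = sp.solve(sp.Eq(total, 1), k)[0]
print(k_val)                                            # 1/5
print(sp.integrate(k_val, (x, 1, sp.Rational(3, 2))))   # 1/10
```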
1.8 Cumulative Distribution Function

The CDF is an alternative function for summarising a probability distribution. It provides a formula for $P(X \le x)$, i.e.
$$F(x) = P(X \le x)$$

1.8.1 Discrete Random Variables

Example

Consider the probability distribution

x        1    2    3    4     5     6
P(X=x)  1/2  1/4  1/8  1/16  1/32  1/32

With $F(x) = P(X \le x)$, find: a) $F(2)$ and b) $F(4.5)$

a)
$$F(2) = P(X \le 2) = P(X = 1) + P(X = 2) = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$$
b)
$$F(4.5) = P(X \le 4.5) = P(X \le 4) = \frac{1}{16} + \frac{1}{8} + \frac{1}{4} + \frac{1}{2} = \frac{15}{16}$$
1.8.2 Continuous Random Variable
For continuous random variables
$$F(x) = P(X \le x) = \int_{-\infty}^x f(s)\,ds$$
or
$$f(x) = \frac{d}{dx}F(x)$$

Example

A PDF is defined as
$$f(x) = \begin{cases} \frac{3}{11}\left(4 - x^2\right) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Find the CDF.

Consider: from $-\infty$ to 0, $F(x) = 0$; from 1 to $\infty$, $F(x) = 1$; from 0 to 1:
$$F(x) = \int_0^x \frac{3}{11}\left(4 - s^2\right)ds = \frac{3}{11}\left[4s - \frac{s^3}{3}\right]_0^x = \frac{3}{11}\left(4x - \frac{x^3}{3}\right)$$
i.e.
$$F(x) = \begin{cases} 0 & x < 0 \\ \frac{3}{11}\left[4x - \frac{x^3}{3}\right] & 0 \le x \le 1 \\ 1 & x > 1 \end{cases}$$
Example

A CDF is defined as:
$$F(x) = \begin{cases} 0 & x < 1 \\ \frac{1}{12}\left(x^2 + 2x - 3\right) & 1 \le x \le 3 \\ 1 & x > 3 \end{cases}$$
a) Find $P(1.5 \le x \le 2.5)$
b) Find $f(x)$

a)
$$P(1.5 \le x \le 2.5) = F(2.5) - F(1.5) = \frac{1}{12}\left(2.5^2 + 2(2.5) - 3\right) - \frac{1}{12}\left(1.5^2 + 2(1.5) - 3\right) = 0.5$$
b)
$$f(x) = \frac{d}{dx}F(x) = \begin{cases} \frac{1}{6}(x + 1) & 1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$
1.9 Expectation and Variance

The expectation or expected value of a random variable $X$ is the mean (a measure of centre), i.e.
$$E(X) = \mu$$
The variance of a random variable $X$ is a measure of dispersion and is labelled $\sigma^2$, i.e.
$$Var(X) = \sigma^2$$

1.9.1 Discrete Random Variables

For a discrete random variable
$$E(X) = \sum_{\text{all } x} xP(X = x)$$

Example

Consider the probability distribution

x        1    2    3    4
P(X=x)  1/2  1/4  1/8  1/8

then
$$E(X) = \left(1\times\frac{1}{2}\right) + \left(2\times\frac{1}{4}\right) + \left(3\times\frac{1}{8}\right) + \left(4\times\frac{1}{8}\right) = \frac{15}{8}$$
Aside

What is Variance?
$$\text{Variance} = \frac{\sum(x - \mu)^2}{n} = \frac{\sum x^2}{n} - \mu^2$$
$$\text{Standard deviation} = \sqrt{\frac{\sum(x - \mu)^2}{n}} = \sqrt{\frac{\sum x^2}{n} - \mu^2}$$
1.9 Expectation and Variance 1 PROBABILITY
For a discrete random variable
V ar(X) =E(X
2
)[E(X)]
2
Now, for previous example
E(X
2
) = 1
2

1
2
+ 2
2

1
4
+ 3
2
18 + 4
2

1
8
E(X) =
15
18
)V ar(X) =
71
64
= 1:10937:::
Standard Deviation = 1:05(3s.f)
1.9.2 Continuous Random Variables

For a continuous random variable
$$E(X) = \int_{\text{all } x} xf(x)\,dx$$
and
$$Var(X) = E\left(X^2\right) - [E(X)]^2 = \int_{\text{all } x} x^2f(x)\,dx - \left(\int_{\text{all } x} xf(x)\,dx\right)^2$$

Example

If
$$f(x) = \begin{cases} \frac{3}{32}\left(4x - x^2\right) & 0 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
find $E(X)$ and $Var(X)$.
$$E(X) = \int_0^4 x\cdot\frac{3}{32}\left(4x - x^2\right)dx = \frac{3}{32}\int_0^4 \left(4x^2 - x^3\right)dx = \frac{3}{32}\left[\frac{4x^3}{3} - \frac{x^4}{4}\right]_0^4 = \frac{3}{32}\left(\frac{4(4)^3}{3} - \frac{4^4}{4} - (0)\right) = 2$$
$$Var(X) = E\left(X^2\right) - [E(X)]^2 = \int_0^4 x^2\cdot\frac{3}{32}\left(4x - x^2\right)dx - 2^2 = \frac{3}{32}\left[\frac{4x^4}{4} - \frac{x^5}{5}\right]_0^4 - 4 = \frac{3}{32}\left(4^4 - \frac{4^5}{5}\right) - 4 = \frac{4}{5}$$
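A quick SymPy confirmation of both moments:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Rational(3, 32)*(4*x - x**2)
EX = sp.integrate(x*f, (x, 0, 4))
VarX = sp.integrate(x**2*f, (x, 0, 4)) - EX**2
print(EX, VarX)   # 2 4/5
```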
1.10 Expectation Algebra

Suppose $X$ and $Y$ are random variables and $a$, $b$ and $c$ are constants. Then:
$$E(X + a) = E(X) + a$$
$$E(aX) = aE(X)$$
$$E(X + Y) = E(X) + E(Y)$$
$$Var(X + a) = Var(X)$$
$$Var(aX) = a^2Var(X)$$
$$Var(b) = 0$$
If $X$ and $Y$ are independent, then
$$E(XY) = E(X)E(Y)$$
$$Var(X + Y) = Var(X) + Var(Y)$$
1.11 Moments

The first moment is $E(X) = \mu$. The $n$th moment is
$$E(X^n) = \int_{\text{all } x} x^nf(x)\,dx.$$
We are often interested in the moments about the mean, i.e. central moments. The 2nd central moment about the mean is called the variance: $E[(X - \mu)^2] = \sigma^2$. The 3rd central moment is $E[(X - \mu)^3]$. So that we can compare with other distributions, we scale with $\sigma^3$ and define Skewness:
$$\text{Skewness} = \frac{E[(X - \mu)^3]}{\sigma^3}$$
This is a measure of the asymmetry of a distribution. A distribution which is symmetric has skew of 0. Negative values of the skewness indicate data that are skewed to the left, while positive values of skewness indicate data skewed to the right.
The 4th normalised central moment is called Kurtosis and is defined as
$$\text{Kurtosis} = \frac{E[(X - \mu)^4]}{\sigma^4}$$
A normal random variable has Kurtosis of 3 irrespective of its mean and standard deviation. Often, when comparing a distribution to the normal distribution, the measure of excess Kurtosis is used, i.e. Kurtosis of distribution $- 3$.

Intuition to help understand Kurtosis

Consider the following data and the effect on the Kurtosis of a continuous distribution.

Points within 1 standard deviation of the mean: the contribution to the Kurtosis from all such data points is low, since
$$\frac{(x_i - \mu)^4}{\sigma^4} < 1;$$
e.g. consider $x_1 = \mu + \frac{1}{2}\sigma$, then
$$\frac{(x_1 - \mu)^4}{\sigma^4} = \frac{\left(\frac{1}{2}\sigma\right)^4}{\sigma^4} = \left(\frac{1}{2}\right)^4 = \frac{1}{16}$$

Points more than 1 standard deviation from the mean: the contribution to the Kurtosis will be greater the further they are from the mean:
$$\frac{(x_i - \mu)^4}{\sigma^4} > 1;$$
e.g. consider $x_1 = \mu + 3\sigma$, then
$$\frac{(x_1 - \mu)^4}{\sigma^4} = \frac{(3\sigma)^4}{\sigma^4} = 81$$
This shows that a data point 3 standard deviations from the mean has a much greater effect on the Kurtosis than data close to the mean value. Therefore, if the distribution has more data in the tails, i.e. fat tails, then it will have a larger Kurtosis. Thus Kurtosis is often seen as a measure of how 'fat' the tails of a distribution are.

If a random variable has Kurtosis greater than 3 it is called leptokurtic; if it has Kurtosis less than 3 it is called platykurtic. Leptokurtic is associated with PDFs that are simultaneously peaked and have fat tails.
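An illustrative sketch with SciPy (note that scipy.stats.kurtosis reports excess kurtosis by default, i.e. kurtosis minus 3):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(200_000)
print(stats.skew(z), stats.kurtosis(z))   # both ~0 for a normal sample

t = rng.standard_t(df=5, size=200_000)    # fat-tailed distribution
print(stats.kurtosis(t))                  # clearly > 0 (leptokurtic)
```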
1.12 Covariance

The covariance is useful in studying the statistical dependence between two random variables. If $X$ and $Y$ are random variables, then their covariance is defined as:
$$Cov(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$$

Intuition

Imagine we have a single sample of X and Y, so that: $X = 1$, $E(X) = 0$; $Y = 3$, $E(Y) = 4$. Now $X - E(X) = 1$ and $Y - E(Y) = -1$, i.e.
$$Cov(X, Y) = -1.$$
So in this sample, when $X$ was above its expected value and $Y$ was below its expected value, we get a negative number. Now if we do this for every $X$ and $Y$ and average this product, we should find the Covariance is negative.

What about if: $X = 4$, $E(X) = 0$; $Y = 7$, $E(Y) = 4$? Now $X - E(X) = 4$ and $Y - E(Y) = 3$, i.e.
$$Cov(X, Y) = 12,$$
i.e. positive.

We can now define an important dimensionless quantity (used in finance) called the correlation coefficient, denoted $\rho_{XY} \equiv \rho(X, Y)$, where
$$\rho_{XY} = \frac{Cov(X, Y)}{\sigma_X\sigma_Y}, \quad -1 \le \rho_{XY} \le 1.$$
If $\rho_{XY} = -1 \Rightarrow$ perfect negative correlation; if $\rho_{XY} = 1 \Rightarrow$ perfect positive correlation; if $\rho_{XY} = 0 \Rightarrow$ uncorrelated.
1.13 Important Distributions

1.13.1 Binomial Distribution

The Binomial distribution is a discrete distribution and can be used if the following are true:

- A fixed number of trials, $n$
- Trials are independent
- Probability of success is a constant $p$

We say $X \sim B(n, p)$ and
$$P(X = x) = \binom{n}{x}p^x(1 - p)^{n - x}, \quad \text{where} \quad \binom{n}{x} = \frac{n!}{x!(n - x)!}$$

Example

If $X \sim B(10, 0.23)$, find a) $P(X = 3)$, b) $P(X < 4)$.

a)
$$P(X = 3) = \binom{10}{3}(0.23)^3(1 - 0.23)^7 = 0.2343$$
b)
$$P(X < 4) = P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$$
$$= \binom{10}{0}(0.23)^0(0.77)^{10} + \binom{10}{1}(0.23)^1(0.77)^9 + \binom{10}{2}(0.23)^2(0.77)^8 + \binom{10}{3}(0.23)^3(0.77)^7 = 0.821 \text{ (3 d.p.)}$$

Example

Paul rolls a standard fair cubical die 8 times. What is the probability that he gets 2 sixes?

Let X be the random variable equal to the number of 6's obtained, i.e. $X \sim B\left(8, \frac{1}{6}\right)$:
$$P(X = 2) = \binom{8}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right)^6 = 0.2604 \text{ (4 d.p.)}$$
It can be shown that for a binomial distribution where $X \sim B(n, p)$,
$$E(X) = np \quad \text{and} \quad Var(X) = np(1 - p).$$
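These binomial probabilities can be checked with scipy.stats (a quick sketch):

```python
from scipy import stats

X = stats.binom(n=10, p=0.23)
print(X.pmf(3))                     # ~0.2343
print(X.cdf(3))                     # P(X < 4) = P(X <= 3) ~ 0.821
print(stats.binom(8, 1/6).pmf(2))   # Paul's two sixes ~ 0.2604
```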
1.13.2 Poisson Distribution

The Poisson distribution is a discrete distribution where the random variable X represents the number of events that occur 'at random' in any interval. If X is to have a Poisson distribution then events must occur

- Singly, i.e. no chance of two events occurring at the same time
- Independently of each other
- Such that the probability of an event occurring is the same at all points in time

We say $X \sim Po(\lambda)$. The Poisson distribution has probability function:
$$P(X = r) = \frac{e^{-\lambda}\lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$
It can be shown that:
$$E(X) = \lambda, \quad Var(X) = \lambda$$

Example

Between 6pm and 7pm, directory enquiries receives calls at the rate of 2 per minute. Find the probability that:

(i) 4 calls arrive in a randomly chosen minute
(ii) 6 calls arrive in a randomly chosen two minute period
(i) Let $X$ be the number of calls in 1 minute, so $\lambda = 2$, i.e. $E(X) = 2$, and $X \sim Po(2)$:
$$P(X = 4) = \frac{e^{-2}2^4}{4!} = 0.090 \text{ (3 d.p.)}$$
(ii) Let $Y$ be the number of calls in 2 minutes, so $\lambda = 4$, i.e. $E(Y) = 4$, and
$$P(Y = 6) = \frac{e^{-4}4^6}{6!} = 0.104 \text{ (3 d.p.)}$$
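Again checkable in a couple of lines with scipy.stats:

```python
from scipy import stats

print(stats.poisson(2).pmf(4))   # ~0.090
print(stats.poisson(4).pmf(6))   # ~0.104
```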
1.13.3 Normal Distribution

The Normal distribution is a continuous distribution. This is the most important distribution. If X is a random variable that follows the normal distribution we say:
$$X \sim N\left(\mu, \sigma^2\right)$$
where
$$E(X) = \mu, \quad Var(X) = \sigma^2,$$
and the PDF is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x - \mu)^2}{2\sigma^2}},$$
i.e.
$$P(X \le x) = \int_{-\infty}^x \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(s - \mu)^2}{2\sigma^2}}\,ds.$$
The Normal distribution is symmetric and the area under the graph equals 1, i.e.
$$\int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx = 1.$$
To find the probabilities we must integrate under $f(x)$; this is not easy to do and requires numerical methods. In order to avoid this numerical calculation we define a standard normal distribution, for which values have already been documented. The Standard Normal distribution is just a transformation of the Normal distribution.

1.13.4 Standard Normal distribution

We define a standard normal random variable by $Z$, where $Z \sim N(0, 1)$, i.e.
$$E(Z) = 0, \quad Var(Z) = 1;$$
thus the PDF is
$$\phi(z) = \frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}$$
and
$$\Phi(z) = \int_{-\infty}^z \frac{1}{\sqrt{2\pi}}e^{-\frac{s^2}{2}}\,ds.$$
To transform a Normal distribution into a Standard Normal distribution, we use:
$$Z = \frac{X - \mu}{\sigma}$$

Example

Given $X \sim N(12, 16)$ find: a) $P(X < 14)$, b) $P(X > 11)$, c) $P(13 < X < 15)$.

a)
$$Z = \frac{X - \mu}{\sigma} = \frac{14 - 12}{4} = 0.5$$
Therefore we want
$$P(Z \le 0.5) = \Phi(0.5) = 0.6915 \quad \text{(from tables)}$$
b)
$$Z = \frac{11 - 12}{4} = -0.25$$
Therefore we want $P(Z > -0.25)$, but this is not in the tables. By symmetry this is the same as $P(Z < 0.25)$, i.e. $\Phi(0.25)$; thus
$$P(Z > -0.25) = \Phi(0.25) = 0.5987$$
c)
$$Z_1 = \frac{13 - 12}{4} = 0.25, \quad Z_2 = \frac{15 - 12}{4} = 0.75$$
Therefore
$$P(0.25 < Z < 0.75) = \Phi(0.75) - \Phi(0.25) = 0.7734 - 0.5987 = 0.1747$$
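The table lookups can be reproduced with scipy.stats.norm (note scale is the standard deviation, 4, not the variance 16):

```python
from scipy import stats

X = stats.norm(loc=12, scale=4)
print(X.cdf(14))                # a) ~0.6915
print(1 - X.cdf(11))            # b) ~0.5987
print(X.cdf(15) - X.cdf(13))    # c) ~0.1747
```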
1.13.5 Common regions

The percentages of the Normal Distribution lying within the given number of standard deviations either side of the mean are approximately:

One Standard Deviation: 68%
Two Standard Deviations: 95%
Three Standard Deviations: 99.7%
1.14 Central Limit Theorem

The Central Limit Theorem states:

Suppose $X_1, X_2, \ldots, X_n$ are $n$ independent random variables, each having the same distribution. Then as $n$ increases, the distributions of
$$X_1 + X_2 + \cdots + X_n \quad \text{and of} \quad \frac{X_1 + X_2 + \cdots + X_n}{n}$$
come increasingly to resemble normal distributions.

Why is this important? The importance lies in the facts:

(i) The common distribution of X is not stated - it can be any distribution
(ii) The resemblance to a normal distribution holds for remarkably small $n$
(iii) Totals and means are quantities of interest

If X is a random variable with mean $\mu$ and standard deviation $\sigma$ from an unknown distribution, the central limit theorem states that the distribution of the sample means is Normal. But what are its mean and variance?

Let us consider the sample mean as another random variable, which we will denote $\bar X$. We know that
$$\bar X = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n.$$
We want $E(\bar X)$ and $Var(\bar X)$:
$$E(\bar X) = E\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n\right) = \frac{1}{n}E(X_1) + \frac{1}{n}E(X_2) + \cdots + \frac{1}{n}E(X_n) = \frac{1}{n}\mu + \frac{1}{n}\mu + \cdots + \frac{1}{n}\mu = n\left(\frac{1}{n}\right)\mu = \mu,$$
i.e. the expectation of the sample mean is the population mean!
$$Var(\bar X) = Var\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n\right) = Var\left(\frac{1}{n}X_1\right) + Var\left(\frac{1}{n}X_2\right) + \cdots + Var\left(\frac{1}{n}X_n\right)$$
$$= \left(\frac{1}{n}\right)^2Var(X_1) + \left(\frac{1}{n}\right)^2Var(X_2) + \cdots + \left(\frac{1}{n}\right)^2Var(X_n) = \left(\frac{1}{n}\right)^2\sigma^2 + \left(\frac{1}{n}\right)^2\sigma^2 + \cdots + \left(\frac{1}{n}\right)^2\sigma^2 = n\left(\frac{1}{n}\right)^2\sigma^2 = \frac{\sigma^2}{n}.$$
Thus the CLT tells us that, where $n$ is a sufficiently large
number of samples,
$$\bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$
Standardising, we get the equivalent result that
$$\frac{\bar X - \mu}{\sigma/\sqrt n} \sim N(0, 1).$$
This analysis could be repeated for the sum $S_n = X_1 + X_2 + \cdots + X_n$, and we would find that
$$\frac{S_n - n\mu}{\sigma\sqrt n} \sim N(0, 1).$$

Example

Consider a 6 sided fair dice. We know that $E(X) = 3.5$ and $Var(X) = \frac{35}{12}$.

Let us now consider an experiment. The experiment consists of rolling the dice $n$ times and calculating the average for the experiment. We will run 500 such experiments and record the results in a histogram (a simulation of this kind is sketched below).

n = 1: In each experiment the dice is rolled once only; this experiment is then repeated 500 times. The resulting frequency chart clearly resembles a uniform distribution (as expected).

Let us now increase the number of rolls, but continue to carry out 500 experiments each time, and see what happens to the distribution of $\bar X$ for n = 5, n = 10 and n = 30.

We can see that even for small sample sizes (number of dice rolls), the resulting distribution begins to look more like a Normal distribution. We can also note that as $n$ increases the distribution begins to narrow, i.e. the variance becomes smaller $\left(\frac{\sigma^2}{n}\right)$, but the mean remains the same.
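A NumPy sketch of the dice experiment (500 experiments of n rolls; the mean of each experiment is recorded, and its spread shrinks like sigma^2/n):

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (1, 5, 10, 30):
    means = rng.integers(1, 7, size=(500, n)).mean(axis=1)
    # sample mean stays near 3.5; variance shrinks towards (35/12)/n
    print(n, round(means.mean(), 3), round(means.var(), 3))
```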
2 Statistics

2.1 Sampling

So far we have been dealing with populations; however, sometimes the population is too large to analyse and we need to use a sample in order to estimate the population parameters, i.e. the mean and variance.

Consider a population of $N$ data points and a sample taken from this population of $n$ data points. We know that the mean and variance of a population are given by:
$$\text{population mean: } \mu = \frac{\sum_{i=1}^N x_i}{N}$$
and
$$\text{population variance: } \sigma^2 = \frac{\sum_{i=1}^N (x_i - \mu)^2}{N}$$
But how can we use the sample to estimate our population parameters?

First we define an unbiased estimator. An estimator is unbiased when its expected value is exactly equal to the corresponding population parameter. So if $\bar x$ is the sample mean, it is an unbiased estimator because
$$E(\bar x) = \mu,$$
where the sample mean is given by:
$$\bar x = \frac{\sum_{i=1}^n x_i}{n}.$$
If $S^2$ is the sample variance, it is an unbiased estimator because
$$E\left(S^2\right) = \sigma^2,$$
where the sample variance is given by:
$$S^2 = \frac{\sum_{i=1}^n (x_i - \bar x)^2}{n - 1}$$
2.1.1 Proof

From the CLT, we know:
$$E(\bar X) = \mu \quad \text{and} \quad Var(\bar X) = \frac{\sigma^2}{n}.$$
Also
$$Var(\bar X) = E\left(\bar X^2\right) - \left[E(\bar X)\right]^2,$$
i.e.
$$\frac{\sigma^2}{n} = E\left(\bar X^2\right) - \mu^2 \quad \text{or} \quad E\left(\bar X^2\right) = \frac{\sigma^2}{n} + \mu^2.$$
For a single piece of data $n = 1$, so
$$E\left(X_i^2\right) = \sigma^2 + \mu^2.$$
Now
$$E\left[\sum(X_i - \bar X)^2\right] = E\left[\sum X_i^2 - n\bar X^2\right] = \sum E\left(X_i^2\right) - nE\left(\bar X^2\right)$$
$$= n\sigma^2 + n\mu^2 - n\left(\frac{\sigma^2}{n} + \mu^2\right) = n\sigma^2 + n\mu^2 - \sigma^2 - n\mu^2 = (n - 1)\sigma^2$$
$$\Rightarrow \sigma^2 = \frac{E\left[\sum(X_i - \bar X)^2\right]}{n - 1}$$
2.2 Maximum Likelihood Estimation

Maximum Likelihood Estimation (MLE) is a statistical method used for fitting data to a model (data analysis). We are asking the question: "Given the set of data, which model parameters are most likely to have given this data?"

The MLE is well defined for the standard distributions; however, in complex problems the MLE may be unsuitable or even fail to exist.

Note: when using the MLE model we must first assume a distribution, i.e. a parametric model, after which we can try to determine the model parameters.

2.2.1 Motivating example

Consider data from a Binomial distribution with random variable $X$ and parameters $n = 10$ and $p = p_0$. The parameter $p_0$ is fixed and unknown to us. That is:
$$f(x; p_0) = P(X = x) = \binom{10}{x}p_0^x(1 - p_0)^{10 - x}$$
Now suppose we observe some data: $X = 3$. Our goal is to estimate the actual parameter value $p_0$ based on the data.
Thought Experiments:

Let us assume $p_0 = 0.5$, so the probability of generating the data we saw is
$$f(3; 0.5) = P(X = 3) = \binom{10}{3}(0.5)^3(0.5)^7 \approx 0.117.$$
Not very high! How about $p_0 = 0.4$? Again,
$$f(3; 0.4) = P(X = 3) = \binom{10}{3}(0.4)^3(0.6)^7 \approx 0.215.$$
Better...

So in general let $p_0 = p$, and we want to maximise $f(3; p)$, i.e.
$$f(3; p) = P(X = 3) = \binom{10}{3}p^3(1 - p)^7.$$
Let us define a new function called the likelihood function, $\ell(p; 3)$, such that $\ell(p; 3) = f(3; p)$. Now we want to maximise this function. Maximising this function is the same as maximising the log of this function (we will explain why we do this later!), so let
$$L(p; 3) = \log\ell(p; 3);$$
therefore
$$L(p; 3) = 3\log p + 7\log(1 - p) + \log\binom{10}{3}.$$
To maximise we need to find where $\frac{dL}{dp} = 0$:
$$\frac{dL}{dp} = \frac{3}{p} - \frac{7}{1 - p} = 0 \to 3(1 - p) - 7p = 0 \to p = \frac{3}{10}.$$
Thus the value of $p_0$ that maximises $L(p; 3)$ is $p = \frac{3}{10}$. This is called the Maximum Likelihood estimate of $p_0$.
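The same maximiser can be found numerically; a sketch with SciPy:

```python
from scipy import stats, optimize

neg_log_lik = lambda p: -stats.binom.logpmf(3, 10, p)
res = optimize.minimize_scalar(neg_log_lik, bounds=(0.01, 0.99),
                               method='bounded')
print(res.x)   # ~0.3, the analytic MLE 3/10
```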
2.2.2 In General

If we have $n$ pieces of iid data $x_1, x_2, x_3, \ldots, x_n$ with probability density (or mass) function $f(x_1, x_2, x_3, \ldots, x_n; \theta)$, where $\theta$ are the unknown parameter(s), then the maximum likelihood function is defined as
$$\ell(\theta; x_1, x_2, x_3, \ldots, x_n) = f(x_1, x_2, x_3, \ldots, x_n; \theta)$$
and the log-likelihood function can be defined as
$$L(\theta; x_1, x_2, x_3, \ldots, x_n) = \log\ell(\theta; x_1, x_2, x_3, \ldots, x_n),$$
where the maximum likelihood estimate of the parameter(s) $\theta_0$ can be obtained by maximising $L(\theta; x_1, x_2, x_3, \ldots, x_n)$.
2.2.3 Normal Distribution

Consider a random variable $X$ such that $X \sim N\left(\mu, \sigma^2\right)$. Let $x_1, x_2, x_3, \ldots, x_n$ be a random sample of iid observations. To find the maximum likelihood estimators of $\mu$ and $\sigma^2$ we need to maximise the log-likelihood function.
$$f(x_1, x_2, x_3, \ldots, x_n; \mu, \sigma) = f(x_1; \mu, \sigma)\,f(x_2; \mu, \sigma)\cdots f(x_n; \mu, \sigma)$$
$$\ell(\mu, \sigma; x_1, x_2, x_3, \ldots, x_n) = f(x_1; \mu, \sigma)\,f(x_2; \mu, \sigma)\cdots f(x_n; \mu, \sigma)$$
$$\Rightarrow L(\mu, \sigma; x_1, x_2, x_3, \ldots, x_n) = \log\ell = \log f(x_1; \mu, \sigma) + \log f(x_2; \mu, \sigma) + \cdots + \log f(x_n; \mu, \sigma) = \sum_{i=1}^n \log f(x_i; \mu, \sigma)$$
For the Normal distribution
$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x - \mu)^2}{2\sigma^2}},$$
so
$$L(\mu, \sigma; x_1, x_2, x_3, \ldots, x_n) = \sum_{i=1}^n \log\left[\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}\right] = -\frac{n}{2}\log(2\pi) - n\log(\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu)^2.$$
To maximise we differentiate partially with respect to $\mu$ and $\sigma$, set the derivatives to zero and solve. If we were to do this, we would get:
$$\hat\mu = \frac{1}{n}\sum_{i=1}^n x_i \quad \text{and} \quad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \hat\mu)^2.$$
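A numerical sketch: for simulated normal data the formulas above recover the true parameters (note the 1/n factor, not 1/(n-1)):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=1.5, scale=2.0, size=100_000)
mu_hat = x.mean()
sigma_hat = np.sqrt(((x - mu_hat)**2).mean())   # MLE uses 1/n
print(mu_hat, sigma_hat)                        # ~1.5, ~2.0
```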
2.3 Regression and Correlation

2.3.1 Linear regression

We are often interested in looking at the relationship between two variables (bivariate data). If we can model this relationship then we can use our model to make predictions.

A sensible first step is to plot the data on a scatter diagram, i.e. pairs of values $(x_i, y_i)$. Now we can try to fit a straight line through the data. We would like to fit the straight line so as to minimise the sum of the squared distances of the points from the line. The difference between a data value and the fitted line is called the residual or error, and the technique is often referred to as the method of least squares.
If the equation of the line is given by
$$y = bx + a$$
then the error in $y$, i.e. the residual of the $i$th data point $(x_i, y_i)$, would be
$$r_i = y_i - y = y_i - (bx_i + a).$$
We want to minimise $\sum_{i=1}^n r_i^2$, i.e.
$$S.R = \sum_{i=1}^n r_i^2 = \sum_{i=1}^n \left[y_i - (bx_i + a)\right]^2.$$
We want to find the $b$ and $a$ that minimise $\sum_{i=1}^n r_i^2$:
$$S.R = \sum\left[y_i^2 - 2y_i(bx_i + a) + (bx_i + a)^2\right] = \sum\left[y_i^2 - 2bx_iy_i - 2ay_i + b^2x_i^2 + 2abx_i + a^2\right]$$
or, writing the sums in terms of averages,
$$= n\overline{y^2} - 2bn\overline{xy} - 2an\bar y + b^2n\overline{x^2} + 2abn\bar x + na^2.$$
To minimise, we want
$$\text{(i)} \; \frac{\partial(S.R)}{\partial b} = 0, \qquad \text{(ii)} \; \frac{\partial(S.R)}{\partial a} = 0:$$
$$\text{(i)} \; \frac{\partial(S.R)}{\partial b} = -2n\overline{xy} + 2bn\overline{x^2} + 2an\bar x = 0$$
$$\text{(ii)} \; \frac{\partial(S.R)}{\partial a} = -2n\bar y + 2bn\bar x + 2an = 0$$
These are linear simultaneous equations in $b$ and $a$, and can be solved to get
$$b = \frac{S_{xy}}{S_{xx}},$$
where
$$S_{xx} = \sum(x_i - \bar x)^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$$
and
$$S_{xy} = \sum(x_i - \bar x)(y_i - \bar y) = \sum x_iy_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n},$$
$$a = \bar y - b\bar x.$$
Example

x   5  10  15  20  25  30  35  40
y  98  90  81  66  61  47  39  34

$$\sum x_i = 180, \quad \sum y_i = 516, \quad \sum x_i^2 = 5100, \quad \sum y_i^2 = 37228, \quad \sum x_iy_i = 9585$$
$$S_{xy} = 9585 - \frac{180\times516}{8} = -2025, \qquad S_{xx} = 5100 - \frac{180^2}{8} = 1050$$
$$\Rightarrow b = \frac{-2025}{1050} = -1.929$$
$$\bar x = \frac{180}{8} = 22.5, \quad \bar y = \frac{516}{8} = 64.5$$
$$\Rightarrow a = 64.5 - (-1.929\times22.5) = 107.9,$$
i.e.
$$y = -1.929x + 107.9$$
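A NumPy sketch reproducing b and a, with np.polyfit as an independent cross-check:

```python
import numpy as np

x = np.array([5, 10, 15, 20, 25, 30, 35, 40])
y = np.array([98, 90, 81, 66, 61, 47, 39, 34])
Sxy = (x*y).sum() - x.sum()*y.sum()/len(x)
Sxx = (x**2).sum() - x.sum()**2/len(x)
b = Sxy/Sxx
a = y.mean() - b*x.mean()
print(b, a)                  # -1.929..., 107.9...
print(np.polyfit(x, y, 1))   # same slope and intercept
```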
2.3.2 Correlation

A measure of how two variables are dependent is their correlation. When viewing scatter graphs we can often determine by sight whether there is any correlation. It is often advantageous to try to quantify the correlation between two variables; this can be done in a number of ways, and two such methods are described below.

2.3.3 Pearson Product-Moment Correlation Coefficient

A measure often used within statistics to quantify this is the Pearson product-moment correlation coefficient. This correlation coefficient is a measure of the linear dependence between two variables, giving a value between $+1$ and $-1$:
$$\text{PMCC: } r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$$

Example

Consider the previous example, i.e.

x   5  10  15  20  25  30  35  40
y  98  90  81  66  61  47  39  34

We calculated
$$S_{xy} = -2025 \quad \text{and} \quad S_{xx} = 1050;$$
also
$$S_{yy} = \sum(y_i - \bar y)^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}, \quad \text{i.e.} \quad S_{yy} = 37228 - \frac{516^2}{8} = 3946;$$
therefore
$$r = \frac{-2025}{\sqrt{1050\times3946}} = -0.995.$$
This shows a strong negative correlation, and if we were to plot this using a scatter diagram we could see it visually.
2.3.4 Spearman's Rank Correlation Coefficient

Another method of measuring the relationship between two variables is to use Spearman's rank correlation coefficient. Instead of dealing with the values of the variables, as in the product moment correlation coefficient, we assign a number (rank) to each variable. We then calculate a correlation coefficient based on the ranks. The calculated value is called the Spearman's Rank Correlation Coefficient, $r_s$, and is an approximation to the PMCC:
$$r_s = 1 - \frac{6\sum d_i^2}{n\left(n^2 - 1\right)},$$
where $d$ is the difference in ranks and $n$ is the number of pairs.

Example

Consider two judges who score a dancing championship and are tasked with ranking the competitors in order. The following table shows the rankings that the judges gave the competitors.

Competitor   A  B  C  D  E  F  G  H
Judge X      3  1  6  7  5  4  8  2
Judge Y      2  1  5  8  4  3  7  6

Calculating $d^2$, we get

difference d   1  0  1  1  1  1  1  4
d^2            1  0  1  1  1  1  1  16

$$\Rightarrow \sum d_i^2 = 22 \text{ and } n = 8$$
$$r_s = 1 - \frac{6\times22}{8\left(8^2 - 1\right)} = 0.738,$$
i.e. strong positive correlation.
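scipy.stats gives the same value directly (a sketch):

```python
from scipy import stats

X = [3, 1, 6, 7, 5, 4, 8, 2]
Y = [2, 1, 5, 8, 4, 3, 7, 6]
print(stats.spearmanr(X, Y).correlation)   # ~0.738
```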
2.4 Time Series
A time series is a sequence of data points, measured typically at successive times spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones index or the annual flow volume of the Nile River at Aswan. Time series analysis comprises methods for analysing time series data in order to extract meaningful statistics and other characteristics of the data.

Two methods for modelling time series data are (i) Moving average models (MA) and (ii) Autoregressive models.

2.4.1 Moving Average

The moving average model is a common approach to modelling univariate data. Moving averages smooth the price data to form a trend following indicator. They do not predict price direction, but rather define the current direction, with a lag.

Moving averages lag because they are based on past prices. Despite this lag, moving averages help smooth price action and filter out the noise. The two most popular types of moving averages are the Simple Moving Average (SMA) and the Exponential Moving Average (EMA).
Simple moving average

A simple moving average is formed by computing the average over a specific number of periods. Consider a 5-day simple moving average for closing prices of a stock: this is the five day sum of closing prices divided by five. As its name implies, a moving average is an average that moves. Old data is dropped as new data becomes available, causing the average to move along the time scale. As an example, consider a 5-day moving average evolving over three days.

The first day of the moving average simply covers the last five days. The second day of the moving average drops the first data point (11) and adds the new data point (16). The third day of the moving average continues by dropping the first data point (12) and adding the new data point (17). In this example, prices gradually increase from 11 to 17 over a total of seven days. Notice that the moving average also rises, from 13 to 15, over the three day calculation period. Also notice that each moving average value is just below the last price. For example, the moving average for day one equals 13 and the last price is 15. Prices over the prior four days were lower, and this causes the moving average to lag.
Exponential moving average

Exponential moving averages reduce the lag by applying more weight to recent prices. The weighting applied to the most recent price depends on the number of periods in the moving average. There are three steps to calculating an exponential moving average. First, calculate the simple moving average: an exponential moving average (EMA) has to start somewhere, so a simple moving average is used as the previous period's EMA in the first calculation. Second, calculate the weighting multiplier. Third, calculate the exponential moving average. The formula below is for an $n$-period EMA:
$$E_{i+1} = \frac{2}{n + 1}\left(P_{i+1} - E_i\right) + E_i$$
A 10-period exponential moving average applies an 18.18% weighting to the most recent price $\left(\frac{2}{10+1} = 0.1818\right)$; a 10-period EMA can also be called an 18.18% EMA. A 20-period EMA applies a 9.52% weighting to the most recent price $\left(\frac{2}{20+1} = 0.0952\right)$. Notice that the weighting for the shorter time period is more than the weighting for the longer time period. In fact, the weighting drops by half every time the moving average period doubles.
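A small Python sketch of the three-step EMA calculation described above (seeded with the SMA of the first n prices):

```python
def ema(prices, n):
    k = 2 / (n + 1)              # step 2: weighting multiplier
    e = sum(prices[:n]) / n      # step 1: seed with the simple moving average
    out = [e]
    for p in prices[n:]:
        e = k * (p - e) + e      # step 3: E_{i+1} = k (P_{i+1} - E_i) + E_i
        out.append(e)
    return out

print(ema([11, 12, 13, 14, 15, 16, 17], n=5))
```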
2.4.2 Autoregressive models

Autoregressive models are models that describe random processes (denoted here $e_t$) that can be described by a weighted sum of previous values and a white noise error. An AR(1) process is a first-order process, meaning that only the immediately previous value has a direct effect on the current value:
$$e_t = re_{t-1} + u_t,$$
where $r$ is a constant that has absolute value less than one, and $u_t$ is a white noise process drawn from a distribution with mean zero and finite variance, often a normal distribution. An AR(2) would have the form
$$e_t = r_1e_{t-1} + r_2e_{t-2} + u_t,$$
and so on. In theory a process might be represented by an AR($\infty$).
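A NumPy sketch simulating the AR(1) recursion above:

```python
import numpy as np

rng = np.random.default_rng(3)
r, T = 0.8, 500
e = np.zeros(T)
for t in range(1, T):
    e[t] = r*e[t-1] + rng.standard_normal()   # e_t = r e_{t-1} + u_t
print(e[:5])
```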
Mathematical Preliminaries
Introduction to Probability
Preliminaries
Randomness lies at the heart of finance, and whether the terms uncertainty or risk are used, they refer to the random nature of the financial markets. Probability theory provides the necessary structure to model the uncertainty that is central to finance. We begin by defining some basic mathematical tools.

The set $\Omega$ of all possible outcomes of some given experiment is called the sample space. A particular outcome $\omega \in \Omega$ is called a sample point. An event is a set of outcomes, i.e. a subset of $\Omega$. To a set of basic outcomes $\omega_i$ we assign real numbers called probabilities, written $P(\omega_i) = p_i$. Then for any event $E$,
$$P(E) = \sum_{\omega_i \in E} p_i$$
Example 1

Experiment: A dice is rolled and the number appearing on top is observed. The sample space consists of the 6 possible numbers:
$$\Omega = \{1, 2, 3, 4, 5, 6\}$$
If the number 4 appears then $\omega = 4$ is a sample point; clearly $4 \in \Omega$.

Let $\Omega_1, \Omega_2, \Omega_3$ = events that an even, odd, prime number occurs respectively. So
$$\Omega_1 = \{2, 4, 6\}, \quad \Omega_2 = \{1, 3, 5\}, \quad \Omega_3 = \{2, 3, 5\}$$
$\Omega_1 \cup \Omega_3 = \{2, 3, 4, 5, 6\}$: the event that an even or prime number occurs.
$\Omega_2 \cap \Omega_3 = \{3, 5\}$: the event that an odd and prime number occurs.
$\Omega_3^c = \{1, 4, 6\}$: the event that a prime number does not occur (complement of the event).

Example 2

Toss a coin twice and observe the sequence of heads (H) and tails (T) that appears. Sample space
$$\Omega = \{\text{HH, TT, HT, TH}\}$$
Let $\Omega_1$ be the event that at least one head appears, and $\Omega_2$ be the event that both tosses are the same:
$$\Omega_1 = \{\text{HH, HT, TH}\}, \quad \Omega_2 = \{\text{HH, TT}\}, \quad \Omega_1 \cap \Omega_2 = \{\text{HH}\}$$
Events are subsets of $\Omega$, but not all subsets of $\Omega$ are events.

The basic properties of probabilities are

1. $0 \le p_i \le 1$
2. $P(\Omega) = \sum_i p_i = 1$ (the sum of the probabilities is always 1).
Random Variables

Outcomes of experiments are not always numbers, e.g. 2 heads appearing; picking an ace from a deck of cards. We need some way of assigning real numbers to each random event. Random variables assign numbers to events.

Thus a random variable (RV) $X$ is a function which maps from the sample space $\Omega$ to the set of real numbers,
$$X : \omega \in \Omega \to \mathbb{R},$$
i.e. it associates a number $X(\omega)$ with each outcome $\omega$.

Consider the example of tossing a coin, and suppose we are paid £1 for each head and we lose £1 each time a tail appears. We know that $P(H) = P(T) = \frac{1}{2}$, so now we can assign the following outcomes:
$$P(1) = \frac{1}{2}, \quad P(-1) = \frac{1}{2}.$$
Mathematically, if our random variable is $X$, then
$$X = \begin{cases} +1 & \text{if H} \\ -1 & \text{if T} \end{cases}$$
or, using the notation above, $X : \omega \in \{\text{H, T}\} \to \{-1, 1\}$.

The probability that the RV takes on each possible value is called the probability distribution. If $X$ is a RV then
$$P(X = a) = P(\{\omega \in \Omega : X(\omega) = a\})$$
is the probability that $a$ occurs (or $X$ maps onto $a$), and
$$P(a \le X \le b) = \text{probability that } X \text{ lies in } [a, b] = P(\{\omega \in \Omega : a \le X(\omega) \le b\}).$$
$$X : \underbrace{\Omega}_{\text{Domain}} \to \mathbb{R}, \quad \text{Range (finite): } X(\Omega) = \{x_1, \ldots, x_n\} = \{x_i\}_{1 \le i \le n},$$
$$P[x_i] = P[X = x_i] = f(x_i) \; \forall i.$$
So the earlier coin tossing example gives
$$P(X = 1) = \frac{1}{2}, \quad P(X = -1) = \frac{1}{2}.$$
$f(x_i)$ is the probability distribution of $X$. This is called a discrete probability distribution:

$x_i$:      $x_1$      $x_2$      $\cdots$      $x_n$
$f(x_i)$:   $f(x_1)$   $f(x_2)$   $\cdots$      $f(x_n)$

There are two properties of the distribution $f(x_i)$:

(i) $f(x_i) \ge 0 \;\; \forall i \in [1,n]$
(ii) $\sum_{i=1}^{n} f(x_i) = 1$, i.e. the sum of all probabilities is one.
Mean/Expectation

The mean measures the centre (average) of the distribution:

$$\mu = E[X] = \sum_{i=1}^{n} x_i f(x_i) = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_n f(x_n),$$

which is the weighted average of all possible values of $X$ together with their associated probabilities. This is also called the first moment.

Example:

$x_i$:      2      3      8
$f(x_i)$:   1/4    1/2    1/4

$$\mu = E[X] = \sum_{i=1}^{3} x_i f(x_i) = 2\left(\tfrac{1}{4}\right) + 3\left(\tfrac{1}{2}\right) + 8\left(\tfrac{1}{4}\right) = 4.$$
Variance/Standard Deviation

This measures the spread (dispersion) of $X$ about the mean:

$$\text{Variance } V[X] = E\left[(X-\mu)^2\right] = E\left[X^2\right] - \mu^2 = \sum_{i=1}^{n} x_i^2 f(x_i) - \mu^2 = \sigma^2.$$

$E\left[(X-\mu)^2\right]$ is also called the second moment about the mean.

From the previous example we have $\mu = 4$, therefore

$$V[X] = 2^2\left(\tfrac{1}{4}\right) + 3^2\left(\tfrac{1}{2}\right) + 8^2\left(\tfrac{1}{4}\right) - 16 = 5.5 = \sigma^2 \;\Rightarrow\; \sigma \approx 2.34.$$
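A quick numerical check of this example (plain Python; the values are those of the table above):

```python
xs = [2, 3, 8]
ps = [0.25, 0.5, 0.25]

mean = sum(x * p for x, p in zip(xs, ps))               # first moment
var = sum(x**2 * p for x, p in zip(xs, ps)) - mean**2   # E[X^2] - mu^2

print(mean, var, var ** 0.5)   # 4.0, 5.5, 2.345...
```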
Rules for Manipulating Expectations

Suppose $X, Y$ are random variables and $\alpha, \beta \in \mathbb{R}$ are constant scalar quantities. Then

$$E[\alpha X] = \alpha E[X]$$
$$E[\alpha X + \beta Y] = \alpha E[X] + \beta E[Y] \quad \text{(linearity)}$$
$$V[\alpha X + \beta] = \alpha^2 V[X]$$
$$E[XY] = E[X]\,E[Y]$$
$$V[X + Y] = V[X] + V[Y]$$

The last two hold provided $X, Y$ are independent.
Continuous Random Variables

As the number of discrete events becomes very large, the individual probabilities $f(x_i) \to 0$. Now look at the continuous case.

Instead of $f(x_i)$ we now have $p(x)$, a continuous distribution called a probability density function (PDF):

$$P(a \le X \le b) = \int_a^b p(x)\,dx.$$

The cumulative distribution function $F(x)$ of a RV $X$ is

$$F(x) = P(X \le x) = \int_{-\infty}^{x} p(s)\,ds.$$

$F(x)$ is related to the PDF by

$$p(x) = \frac{dF}{dx}$$

(fundamental theorem of calculus), provided $F(x)$ is differentiable. However, unlike $F(x)$, $p(x)$ may have singularities (and may be unbounded).
Special Expectations

Given any PDF $p(x)$ of $X$:

Mean: $\mu = E[X] = \int_{\mathbb{R}} x\,p(x)\,dx$.

Variance: $\sigma^2 = V[X] = E\left[(X-\mu)^2\right] = \int_{\mathbb{R}} x^2 p(x)\,dx - \mu^2$ (2nd moment about the mean).

The $n$th moment about zero is defined as

$$\mu_n = E[X^n] = \int_{\mathbb{R}} x^n p(x)\,dx.$$

In general, for any function $h$,

$$E[h(X)] = \int_{\mathbb{R}} h(x)\,p(x)\,dx,$$

where $X$ is a RV following the distribution given by $p(x)$.

Moments about the mean are given by

$$E\left[(X-\mu)^n\right], \quad n = 2, 3, \ldots$$

The special case $n = 2$ gives the variance $\sigma^2$.
Skewness and Kurtosis

Having looked at the variance as the second moment about the mean, we now discuss two further moments centred about $\mu$ that provide further important information about the probability distribution.

Skewness is a measure of the asymmetry of a distribution (i.e. lack of symmetry) about its mean. A distribution that is identical to the left and right of a centre point is symmetric.

The skew is the third central moment, i.e. the third moment about the mean, scaled by $\sigma^3$ (this scaling allows us to compare with other distributions):

$$\text{Skew} = \frac{E\left[(X-\mu)^3\right]}{\sigma^3}.$$

A non-symmetric distribution is called skewed. Any distribution which is symmetric about the mean has a skew of zero. Negative values of the skewness indicate data that are skewed left, and positive values indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail; similarly, skewed right means that the right tail is long relative to the left tail.

The fourth central moment scaled by the square of the variance is called the kurtosis:

$$\text{Kurtosis} = \frac{E\left[(X-\mu)^4\right]}{\sigma^4}.$$

This is a measure of how much of the distribution is out in the tails, at large negative and positive values of $X$.

A normal random variable has kurtosis of 3 irrespective of its mean and standard deviation. Often, when comparing a distribution to the normal distribution, the measure of excess kurtosis is used, i.e. the kurtosis of the distribution minus 3.

If a random variable has kurtosis greater than 3 it is called leptokurtic; if it has kurtosis less than 3 it is called platykurtic. Leptokurtic behaviour is associated with PDFs that are simultaneously peaked and fat-tailed.
Normal Distribution

The normal (or Gaussian) distribution $N(\mu, \sigma^2)$, with mean $\mu$, standard deviation $\sigma$ and variance $\sigma^2$, is defined in terms of its density function

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$

For the special case $\mu = 0$ and $\sigma = 1$ it is called the standard normal distribution $N(0,1)$. This is verified by making the substitution $\phi = \frac{x-\mu}{\sigma}$ in $p(x)$, which gives

$$p(\phi) = \frac{1}{\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}\phi^2\right),$$

and $\phi$ clearly has zero mean and unit variance:

$$E\left[\frac{X-\mu}{\sigma}\right] = \frac{1}{\sigma}\left(E[X] - \mu\right) = 0,$$

$$V\left[\frac{X-\mu}{\sigma}\right] = V\left[\frac{X}{\sigma} - \frac{\mu}{\sigma}\right] = \frac{1}{\sigma^2}V[X] = \frac{1}{\sigma^2}\cdot\sigma^2 = 1,$$

using the standard result $V[\alpha X + \beta] = \alpha^2 V[X]$.

Its cumulative distribution function is

$$F(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{1}{2}\phi^2}\,d\phi = P(-\infty \le X \le x).$$

The skewness of $N(0,1)$ is zero and its kurtosis is 3.
Correlation

The covariance is useful in studying the statistical dependence between two random variables. If $X, Y$ are RVs, then their covariance is defined as

$$\text{Cov}(X,Y) = E\left[\Big(X - \underbrace{E(X)}_{=\mu_x}\Big)\Big(Y - \underbrace{E(Y)}_{=\mu_y}\Big)\right] = E[XY] - \mu_x\mu_y,$$

which we denote $\sigma_{XY}$. Note:

$$\text{Cov}(X,X) = E\left[(X-\mu_x)^2\right] = \sigma^2.$$

$X, Y$ are correlated if

$$E\left[(X-\mu_x)(Y-\mu_y)\right] \ne 0.$$

We can then define an important dimensionless quantity (used in finance) called the correlation coefficient, denoted $\rho_{XY}$ (or $\rho(X,Y)$), where

$$\rho_{XY} = \frac{\text{Cov}(X,Y)}{\sigma_x\sigma_y}.$$

The correlation can be thought of as a normalised covariance, as $|\rho_{XY}| \le 1$, and it has the following properties:

i. $\rho(X,Y) = \rho(Y,X)$
ii. $\rho(X,X) = 1$
iii. $-1 \le \rho \le 1$

$\rho_{XY} = -1 \Rightarrow$ perfect negative correlation
$\rho_{XY} = 1 \Rightarrow$ perfect correlation
$\rho_{XY} = 0 \Rightarrow X, Y$ uncorrelated
Why is the correlation coefficient bounded by 1? Justifying this requires a result called the Cauchy-Schwarz inequality, a theorem most students first encounter in linear algebra (although we have not discussed it there). Let's start with the version for random variables (RVs) $X$ and $Y$: the Cauchy-Schwarz inequality is

$$\left(E[XY]\right)^2 \le E\left[X^2\right]E\left[Y^2\right].$$

We know that the covariance of $X, Y$ is

$$\sigma_{XY} = E\left[(X-\mu_X)(Y-\mu_Y)\right].$$

If we put

$$V[X] = \sigma_X^2 = E\left[(X-\mu_X)^2\right], \qquad V[Y] = \sigma_Y^2 = E\left[(Y-\mu_Y)^2\right],$$

then from Cauchy-Schwarz we have

$$\left(E\left[(X-\mu_X)(Y-\mu_Y)\right]\right)^2 \le E\left[(X-\mu_X)^2\right]E\left[(Y-\mu_Y)^2\right],$$

or we can write

$$\sigma_{XY}^2 \le \sigma_X^2\sigma_Y^2.$$

Divide through by $\sigma_X^2\sigma_Y^2$:

$$\frac{\sigma_{XY}^2}{\sigma_X^2\sigma_Y^2} \le 1,$$

and we know that the left-hand side above is $\rho_{XY}^2$, hence

$$\rho_{XY}^2 = \frac{\sigma_{XY}^2}{\sigma_X^2\sigma_Y^2} \le 1,$$

and since $\rho_{XY}$ is a real number, this implies $|\rho_{XY}| \le 1$, which is the same as

$$-1 \le \rho_{XY} \le +1.$$
Central Limit Theorem

This concept is fundamental to the whole subject of finance.

Let the $X_i$ be independent identically distributed (i.i.d.) random variables with mean $\mu$ and variance $\sigma^2$, i.e. $X \sim D(\mu, \sigma^2)$, where $D$ is some distribution. If we put

$$S_n = \sum_{i=1}^{n} X_i,$$

then

$$\frac{S_n - n\mu}{\sigma\sqrt{n}}$$

has a distribution that approaches the standard normal distribution as $n \to \infty$.

The distribution of the sum of a large number of independent identically distributed variables will be approximately normal, regardless of the underlying distribution. That is the beauty of this result.

Conditions: the normal distribution is the limiting behaviour when you add many random numbers from any basic building-block distribution, provided the following are satisfied:

1. The mean of the distribution must be finite and constant.
2. The standard deviation of the distribution must be finite and constant.
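A small simulation sketch of this result (numpy assumed; exponential summands are an illustrative non-normal building block with mean and standard deviation both 1):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 1000, 50_000
mu, sigma = 1.0, 1.0                       # mean/sd of Exp(1) summands

S = rng.exponential(mu, size=(trials, n)).sum(axis=1)
Z = (S - n * mu) / (sigma * np.sqrt(n))    # standardised sums

print(Z.mean(), Z.std())                   # close to 0 and 1
```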
Moment Generating Function

The moment generating function of $X$, denoted $M_X(\theta)$, is given by

$$M_X(\theta) = E\left[e^{\theta X}\right] = \int_{\mathbb{R}} e^{\theta x}p(x)\,dx,$$

provided the expectation exists. We can expand as a power series to obtain

$$M_X(\theta) = \sum_{n=0}^{\infty}\frac{\theta^n E(X^n)}{n!},$$

so the $n$th moment is the coefficient of $\theta^n/n!$, or the $n$th derivative evaluated at zero.

How do we arrive at this result? We use the Taylor series expansion for the exponential function:

$$\int_{\mathbb{R}} e^{\theta x}p(x)\,dx = \int_{\mathbb{R}}\left(1 + \theta x + \frac{(\theta x)^2}{2!} + \frac{(\theta x)^3}{3!} + \cdots\right)p(x)\,dx$$

$$= \underbrace{\int_{\mathbb{R}} p(x)\,dx}_{1} + \theta\underbrace{\int_{\mathbb{R}} x\,p(x)\,dx}_{E(X)} + \frac{\theta^2}{2!}\underbrace{\int_{\mathbb{R}} x^2 p(x)\,dx}_{E(X^2)} + \frac{\theta^3}{3!}\underbrace{\int_{\mathbb{R}} x^3 p(x)\,dx}_{E(X^3)} + \cdots$$

$$= 1 + \theta E(X) + \frac{\theta^2}{2!}E\left[X^2\right] + \frac{\theta^3}{3!}E\left[X^3\right] + \cdots = \sum_{n=0}^{\infty}\frac{\theta^n E(X^n)}{n!}.$$
Calculating Moments

The $k$th moment $m_k$ of the random variable $X$ can now be obtained by differentiating, i.e.

$$m_k = M_X^{(k)}(0), \quad k = 0, 1, 2, \ldots, \qquad M_X^{(k)}(0) = \left.\frac{d^k}{d\theta^k}M_X(\theta)\right|_{\theta=0}.$$

So what is this result saying? Consider

$$M_X(\theta) = \sum_{n=0}^{\infty}\frac{\theta^n E(X^n)}{n!} = 1 + \theta E[X] + \frac{\theta^2}{2!}E\left[X^2\right] + \frac{\theta^3}{3!}E\left[X^3\right] + \cdots + \frac{\theta^n}{n!}E[X^n] + \cdots$$

As an example, suppose we wish to obtain the second moment; differentiate twice with respect to $\theta$:

$$\frac{d}{d\theta}M_X(\theta) = E[X] + \theta E\left[X^2\right] + \frac{\theta^2}{2}E\left[X^3\right] + \cdots + \frac{\theta^{n-1}}{(n-1)!}E[X^n] + \cdots$$

and a second time:

$$\frac{d^2}{d\theta^2}M_X(\theta) = E\left[X^2\right] + \theta E\left[X^3\right] + \cdots + \frac{\theta^{n-2}}{(n-2)!}E[X^n] + \cdots$$

Setting $\theta = 0$ gives

$$\frac{d^2}{d\theta^2}M_X(0) = E\left[X^2\right],$$

which captures the second moment $E[X^2]$. Remember, we will already have an expression for $M_X(\theta)$.
A useful result in finance is the MGF for the normal distribution. If $X \sim N(\mu, \sigma^2)$, then we can construct a standard normal $\phi \sim N(0,1)$ by setting

$$\phi = \frac{X-\mu}{\sigma} \;\Longrightarrow\; X = \mu + \sigma\phi.$$

The MGF is

$$M_X(\theta) = E\left[e^{\theta X}\right] = E\left[e^{\theta(\mu+\sigma\phi)}\right] = e^{\theta\mu}E\left[e^{\theta\sigma\phi}\right].$$

So the MGF of $X$ is equal to the MGF of $\phi$ but with $\theta$ replaced by $\theta\sigma$. This is much nicer than trying to calculate the MGF of $X \sim N(\mu, \sigma^2)$ directly.

$$E\left[e^{\theta\sigma\phi}\right] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{\theta\sigma x}e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{\theta\sigma x - x^2/2}\,dx$$

$$= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(x^2 - 2\theta\sigma x + \theta^2\sigma^2 - \theta^2\sigma^2\right)}\,dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x-\theta\sigma)^2 + \frac{1}{2}\theta^2\sigma^2}\,dx$$

$$= e^{\frac{1}{2}\theta^2\sigma^2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x-\theta\sigma)^2}\,dx.$$

Now do a change of variable: put $u = x - \theta\sigma$:

$$E\left[e^{\theta\sigma\phi}\right] = e^{\frac{1}{2}\theta^2\sigma^2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}u^2}\,du = e^{\frac{1}{2}\theta^2\sigma^2}.$$

Thus

$$M_X(\theta) = e^{\theta\mu}E\left[e^{\theta\sigma\phi}\right] = e^{\theta\mu + \frac{1}{2}\theta^2\sigma^2}.$$

To get the simpler formula for a standard normal distribution, put $\mu = 0, \sigma = 1$ to get $M_X(\theta) = e^{\frac{1}{2}\theta^2}$.
We can now obtain the first four moments for a standard normal:

$$m_1 = \left.\frac{d}{d\theta}e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = \left.\theta\,e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = 0$$

$$m_2 = \left.\frac{d^2}{d\theta^2}e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = \left.\left(\theta^2 + 1\right)e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = 1$$

$$m_3 = \left.\frac{d^3}{d\theta^3}e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = \left.\left(\theta^3 + 3\theta\right)e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = 0$$

$$m_4 = \left.\frac{d^4}{d\theta^4}e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = \left.\left(\theta^4 + 6\theta^2 + 3\right)e^{\frac{1}{2}\theta^2}\right|_{\theta=0} = 3$$

The latter two are particularly useful in calculating the skew and kurtosis.
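These derivatives can be checked symbolically; a minimal sketch, assuming sympy is available:

```python
import sympy as sp

theta = sp.symbols('theta')
M = sp.exp(theta**2 / 2)           # MGF of the standard normal

moments = [sp.diff(M, theta, k).subs(theta, 0) for k in range(1, 5)]
print(moments)                     # [0, 1, 0, 3]
```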
If $X$ and $Y$ are independent random variables then

$$M_{X+Y}(\theta) = E\left[e^{\theta(X+Y)}\right] = E\left[e^{\theta X}e^{\theta Y}\right] = E\left[e^{\theta X}\right]E\left[e^{\theta Y}\right] = M_X(\theta)\,M_Y(\theta).$$
Calculus Refresher

Taylor for Two Variables

Assuming that a function $f(x,t)$ is differentiable enough, near $x = x_0, t = t_0$:

$$f(x,t) = f(x_0,t_0) + (x-x_0)f_x(x_0,t_0) + (t-t_0)f_t(x_0,t_0)$$
$$+ \frac{1}{2}\left[(x-x_0)^2 f_{xx}(x_0,t_0) + 2(x-x_0)(t-t_0)f_{xt}(x_0,t_0) + (t-t_0)^2 f_{tt}(x_0,t_0)\right] + \cdots$$

That is,

$$f(x,t) = \text{constant} + \text{linear} + \text{quadratic} + \cdots$$

The error in truncating this series after the second-order terms tends to zero faster than the included terms. This result is particularly important for Itô's lemma in stochastic calculus.

Suppose a function $f = f(x,y)$ and both $x, y$ change by a small amount, so $x \to x + \delta x$ and $y \to y + \delta y$; then we can examine the change in $f$ using a two-dimensional form of Taylor:

$$f(x+\delta x, y+\delta y) = f(x,y) + f_x\,\delta x + f_y\,\delta y + \frac{1}{2}f_{xx}\,\delta x^2 + \frac{1}{2}f_{yy}\,\delta y^2 + f_{xy}\,\delta x\,\delta y + \cdots$$

By taking $f(x,y)$ to the lhs, writing

$$df = f(x+\delta x, y+\delta y) - f(x,y),$$

and considering only linear terms, i.e.

$$df = \frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial y}\delta y,$$

we obtain a formula for the differential or total change in $f$.
Integration

There are two ways to show the following important result:

$$\int_{\mathbb{R}} e^{-x^2}\,dx = \sqrt{\pi}.$$

The first can be thought of as the 'poor man's' derivation. The CDF for the normal distribution is

$$N(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-s^2/2}\,ds.$$

If $x \to \infty$ then we know (by the fact that the area under a PDF has to sum to unity) that

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-s^2/2}\,ds = 1.$$

Make the substitution $x = s/\sqrt{2}$, to give $dx = ds/\sqrt{2}$; hence the integral becomes

$$\sqrt{2}\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{2\pi},$$

and hence we obtain

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$

From this we also note that $\int_0^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}$, because $e^{-x^2}$ is an even function.
The second requires double integration. Put $I = \int_{\mathbb{R}} e^{-x^2}\,dx$, so that

$$I^2 = \int_{\mathbb{R}} e^{-x^2}\,dx\int_{\mathbb{R}} e^{-y^2}\,dy = \int_{\mathbb{R}}\int_{\mathbb{R}} e^{-(x^2+y^2)}\,dx\,dy.$$

The region of integration is a square centred at the origin, of infinite dimension:

$$x \in (-\infty, \infty), \qquad y \in (-\infty, \infty),$$

i.e. the complete 2D plane. Introduce plane polars:

$$x = r\cos\theta, \quad y = r\sin\theta, \qquad dx\,dy \to r\,dr\,d\theta.$$

The region of integration is now a circle centred at the origin, of infinite radius:

$$0 \le r < \infty, \qquad 0 \le \theta \le 2\pi,$$

so the problem becomes

$$I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-r^2}r\,dr\,d\theta = \frac{1}{2}\int_0^{2\pi}d\theta = \pi.$$

Hence

$$I = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$
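A one-line numerical sanity check of this integral (scipy assumed):

```python
import numpy as np
from scipy.integrate import quad

val, err = quad(lambda x: np.exp(-x**2), -np.inf, np.inf)
print(val, np.sqrt(np.pi))   # both ~1.7724538509
```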
Review of Differential Equations

Cauchy-Euler Equation

An equation of the form

$$Ly = ax^2\frac{d^2y}{dx^2} + bx\frac{dy}{dx} + cy = g(x)$$

is called a Cauchy-Euler equation. To solve the homogeneous part, we look for a solution of the form

$$y = x^\lambda.$$

So $y' = \lambda x^{\lambda-1} \to y'' = \lambda(\lambda-1)x^{\lambda-2}$, which upon substitution yields the quadratic auxiliary equation (A.E.)

$$a\lambda^2 + \tilde{b}\lambda + c = 0,$$

where $\tilde{b} = b - a$. This can be solved in the usual way; there are 3 cases to consider, depending upon the nature of $\tilde{b}^2 - 4ac$.

Case 1: $\tilde{b}^2 - 4ac > 0 \to \lambda_1, \lambda_2 \in \mathbb{R}$: 2 real distinct roots.
GS: $y = Ax^{\lambda_1} + Bx^{\lambda_2}$

Case 2: $\tilde{b}^2 - 4ac = 0 \to \lambda = \lambda_1 = \lambda_2 \in \mathbb{R}$: 1 real (double) root.
GS: $y = x^\lambda(A + B\ln x)$

Case 3: $\tilde{b}^2 - 4ac < 0 \to \lambda = \alpha \pm i\beta \in \mathbb{C}$: a pair of complex conjugate roots.
GS: $y = x^\alpha\left(A\cos(\beta\ln x) + B\sin(\beta\ln x)\right)$
Example

Consider the following Euler-type problem:

$$\frac{1}{2}\sigma^2 S^2\frac{d^2V}{dS^2} + rS\frac{dV}{dS} - rV = 0, \qquad V(0) = 0, \quad V(S^*) = S^* - E,$$

where the constants $E, S^*, \sigma, r > 0$. We are given that the roots $m_\pm$ of the A.E. are real, with $m_- < 0 < m_+$.

The general solution is

$$V(S) = AS^{m_+} + BS^{m_-}.$$

$V(0) = 0 \Longrightarrow B = 0$ (else we have division by zero), so

$$V(S) = AS^{m_+}.$$

To find $A$, use the second condition $V(S^*) = S^* - E$:

$$V(S^*) = A(S^*)^{m_+} = S^* - E \;\to\; A = \frac{S^* - E}{(S^*)^{m_+}},$$

hence

$$V(S) = \frac{S^* - E}{(S^*)^{m_+}}\,S^{m_+} = (S^* - E)\left(\frac{S}{S^*}\right)^{m_+}.$$
Similarity Methods

$f(x,y)$ is homogeneous of degree $t \ge 0$ if $f(\lambda x, \lambda y) = \lambda^t f(x,y)$.

1. $f(x,y) = \sqrt{x^2 + y^2}$:

$$f(\lambda x, \lambda y) = \sqrt{(\lambda x)^2 + (\lambda y)^2} = \lambda\sqrt{x^2 + y^2} = \lambda f(x,y).$$

For $g(x,y) = \dfrac{x+y}{x-y}$:

$$g(\lambda x, \lambda y) = \frac{\lambda x + \lambda y}{\lambda x - \lambda y} = \lambda^0\left(\frac{x+y}{x-y}\right) = \lambda^0 g(x,y).$$

2. $h(x,y) = x^2 + y^3$:

$$h(\lambda x, \lambda y) = (\lambda x)^2 + (\lambda y)^3 = \lambda^2 x^2 + \lambda^3 y^3 \ne \lambda^t\left(x^2 + y^3\right)$$

for any $t$. So $h$ is not homogeneous.
Consider the function

$$F(x,y) = \frac{x^2}{x^2 + y^2}.$$

If for any $\lambda > 0$ we write $x' = \lambda x,\; y' = \lambda y$, then

$$\frac{dy'}{dx'} = \frac{dy}{dx}, \qquad \frac{x^2}{x^2 + y^2} = \frac{x'^2}{x'^2 + y'^2}.$$

We see that the equation is invariant under the change of variables. It also makes sense to look for a solution which is itself invariant under the transformation. One choice is to write

$$v = \frac{y}{x} = \frac{y'}{x'},$$

so write $y = vx$.

Definition. The differential equation $\dfrac{dy}{dx} = f(x,y)$ is said to be homogeneous when $f(x,y)$ is homogeneous of degree $t$ for some $t$.
Method of Solution

Put $y = vx$, where $v$ is some (as yet) unknown function. Hence we have

$$\frac{dy}{dx} = \frac{d}{dx}(vx) = x\frac{dv}{dx} + v\frac{dx}{dx} = v'x + v.$$

Hence

$$f(x,y) = f(x,vx).$$

Now $f$ is homogeneous of degree $t$, so

$$f(x,vx) = x^t f(1,v).$$

The differential equation now becomes

$$v'x + v = x^t f(1,v),$$

which is not always solvable: the method may not work. But when $t = 0$ (homogeneous of degree zero), then $x^t = 1$. Hence

$$v'x + v = f(1,v)$$

or

$$x\frac{dv}{dx} = f(1,v) - v,$$

which is separable, i.e.

$$\int\frac{dv}{f(1,v) - v} = \int\frac{dx}{x} + c,$$

and the method is guaranteed to work.
Example

$$\frac{dy}{dx} = \frac{y-x}{y+x}$$

First we check:

$$\frac{\lambda y - \lambda x}{\lambda y + \lambda x} = \lambda^0\left(\frac{y-x}{y+x}\right),$$

which is homogeneous of degree zero. So put $y = vx$:

$$v'x + v = f(x,vx) = \frac{vx - x}{vx + x} = \frac{v-1}{v+1} = f(1,v),$$

therefore

$$v'x = \frac{v-1}{v+1} - v = \frac{-(1+v^2)}{v+1},$$

and the D.E. is now separable:

$$\int\frac{v+1}{v^2+1}\,dv = -\int\frac{1}{x}\,dx$$

$$\int\frac{v}{v^2+1}\,dv + \int\frac{1}{v^2+1}\,dv = -\int\frac{1}{x}\,dx$$

$$\frac{1}{2}\ln\left(1+v^2\right) + \arctan v = -\ln x + c$$

$$\frac{1}{2}\ln x^2\left(1+v^2\right) + \arctan v = c.$$

Now we turn to the original problem, so put $v = \frac{y}{x}$:

$$\frac{1}{2}\ln x^2\left(1 + \frac{y^2}{x^2}\right) + \arctan\left(\frac{y}{x}\right) = c,$$

which simplifies to

$$\frac{1}{2}\ln\left(x^2 + y^2\right) + \arctan\left(\frac{y}{x}\right) = c.$$
The Error Function

We begin by solving the following initial value problem (IVP):

$$\frac{dy}{dx} - 2xy = 2, \qquad y(0) = 1,$$

which is clearly a linear equation. The integrating factor is $R(x) = \exp(-x^2)$, which, multiplying through, gives

$$e^{-x^2}\left(\frac{dy}{dx} - 2xy\right) = 2e^{-x^2}$$

$$\frac{d}{dx}\left(e^{-x^2}y\right) = 2e^{-x^2}$$

$$\int_0^x d\left(e^{-t^2}y\right) = 2\int_0^x e^{-t^2}\,dt.$$

Concentrate on the lhs and note the IC $y(0) = 1$:

$$\left.e^{-t^2}y\right|_0^x = e^{-x^2}y(x) - y(0) = e^{-x^2}y(x) - 1,$$

hence

$$e^{-x^2}y(x) - 1 = 2\int_0^x e^{-t^2}\,dt$$

$$y(x) = e^{x^2}\left(1 + 2\int_0^x e^{-t^2}\,dt\right).$$

We cannot simplify the integral on the rhs any further if we wish this to remain a closed-form solution. However, we note the following non-elementary integrals:

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-s^2}\,ds, \qquad \operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}}\int_x^{\infty} e^{-s^2}\,ds.$$

These are the error function and complementary error function, in turn. The solution to the IVP can now be written

$$y(x) = e^{x^2}\left(1 + \sqrt{\pi}\operatorname{erf}(x)\right).$$
So, for example,

$$\int_{x_0}^{x_1} e^{-x^2}\,dx = \int_0^{x_1} e^{-x^2}\,dx - \int_0^{x_0} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\left(\operatorname{erf}(x_1) - \operatorname{erf}(x_0)\right).$$

Working: we are using $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-s^2}\,ds$, which rearranges to give

$$\int_0^x e^{-s^2}\,ds = \frac{\sqrt{\pi}}{2}\operatorname{erf}(x);$$

then

$$\int_{x_0}^{x_1} \equiv \int_{x_0}^{0} + \int_0^{x_1} = -\int_0^{x_0} + \int_0^{x_1} = \int_0^{x_1} e^{-x^2}\,dx - \int_0^{x_0} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\left(\operatorname{erf}(x_1) - \operatorname{erf}(x_0)\right).$$
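A quick numerical check of the IVP solution above, $y(x) = e^{x^2}\left(1 + \sqrt{\pi}\operatorname{erf}(x)\right)$, against the ODE (scipy assumed; the test point is arbitrary):

```python
import numpy as np
from scipy.special import erf

def y(x):
    return np.exp(x**2) * (1 + np.sqrt(np.pi) * erf(x))

# check dy/dx - 2 x y = 2 by central finite differences
x, h = 0.7, 1e-6
dydx = (y(x + h) - y(x - h)) / (2 * h)
print(dydx - 2 * x * y(x))   # ~2, as the ODE requires
```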
The Dirac Delta Function

The delta function, denoted $\delta(x)$, is a very useful 'object' in applied maths and, more recently, in quant finance. It is the mathematical representation of a point source, e.g. a force or a payment. Although labelled a function, it is more of a distribution or generalised function. Consider the following definition of a piecewise function:

$$f_\epsilon(x) = \begin{cases} \dfrac{1}{\epsilon}, & x \in \left(-\dfrac{\epsilon}{2}, \dfrac{\epsilon}{2}\right) \\ 0, & \text{otherwise} \end{cases}$$

Now put the delta function equal to the above in the limit:

$$\delta(x) = \lim_{\epsilon \to 0} f_\epsilon(x).$$

What is happening here? As $\epsilon$ decreases, we note the 'hat' narrows whilst becoming taller, eventually becoming a spike. By the definition, the area under the curve (i.e. the rectangle) is fixed at $\epsilon\cdot\frac{1}{\epsilon} = 1$, independent of the value of $\epsilon$. So mathematically we can write, in integral terms,

$$\int_{-\infty}^{\infty} f_\epsilon(x)\,dx = \int_{-\infty}^{-\epsilon/2} f_\epsilon\,dx + \int_{-\epsilon/2}^{\epsilon/2} f_\epsilon\,dx + \int_{\epsilon/2}^{\infty} f_\epsilon\,dx = \epsilon\cdot\frac{1}{\epsilon} = 1 \quad \text{for all } \epsilon.$$

Looking at what happens in the limit $\epsilon \to 0$, the spike-like (singular) behaviour at the origin gives the following definition:

$$\delta(x) = \begin{cases} \infty & x = 0 \\ 0 & x \ne 0 \end{cases}$$

with the property

$$\int_{-\infty}^{\infty}\delta(x)\,dx = 1.$$
There are many ways to define $\delta(x)$. Consider the Gaussian/normal distribution with pdf

$$G_\epsilon(x) = \frac{1}{\epsilon\sqrt{2\pi}}\exp\left(-\frac{x^2}{2\epsilon^2}\right).$$

The function takes its highest value at $x = 0$; as $|x| \to \infty$ there is exponential decay away from the origin. If we stay at the origin, then as $\epsilon$ decreases, $G_\epsilon(x)$ exhibits the earlier spike (it shoots up to infinity), so

$$\lim_{\epsilon \to 0} G_\epsilon(x) = \delta(x).$$

The normalising constant $\frac{1}{\epsilon\sqrt{2\pi}}$ ensures that the area under the curve will always be unity.

[Figure: $G_\epsilon(x)$ plotted for $\epsilon = 2.0, 1.0, 0.5, 0.25, 0.125$; the Gaussian curve becomes slimmer and more peaked as $\epsilon$ decreases. A second plot shows $G_\epsilon(x)$ for $\epsilon = 0.01$.]
Now generalise this definition by centring the function $f_\epsilon(x)$ at any point $x_0$. So

$$\delta(x - x_0) = \lim_{\epsilon \to 0} f_\epsilon(x - x_0), \qquad \int_{-\infty}^{\infty}\delta(x - x_0)\,dx = 1.$$

The figure will be as before, except now centred at $x_0$ rather than at the origin. So we have seen two definitions of $\delta(x)$. Another is based on the Cauchy distribution,

$$L_\epsilon(x) = \frac{1}{\pi}\,\frac{\epsilon}{x^2 + \epsilon^2},$$

so here

$$\delta(x) = \lim_{\epsilon \to 0}\frac{1}{\pi}\,\frac{\epsilon}{x^2 + \epsilon^2}.$$

Now suppose we have a smooth function $g(x)$ and consider the following integral:

$$\int_{-\infty}^{\infty} g(x)\,\delta(x - x_0)\,dx = g(x_0).$$

This sifting property of the delta function is a very important one.
Heaviside Function

The Heaviside function, denoted by $H(\cdot)$, is a discontinuous function whose value is zero for negative arguments and one for positive arguments:

$$H(x) = \begin{cases} 1 & x > 0 \\ 0 & x < 0 \end{cases}$$

Some definitions have

$$H(x) = \begin{cases} 1 & x > 0 \\ \tfrac{1}{2} & x = 0 \\ 0 & x < 0 \end{cases}$$

and others

$$H(x) = \begin{cases} 1 & x > 0 \\ 0 & x \le 0 \end{cases}$$

It is an example of the general class of step functions.
Probability Distributions

At the heart of modern finance theory lies the uncertain movement of financial quantities. For modelling purposes we are concerned with the evolution of random events through time.

A diffusion process is one that is continuous in space, while a random walk is a process that is discrete. The random path followed by the process is called a realization; hence the path traced out by a financial variable will be termed an asset price realization.

The mathematics is achieved via the concept of a transition density function, which is the connection between probability theory and differential equations.

Trinomial Random Walk

A trinomial random walk models the dynamics of a random variable with value $y$ at time $t$. Here $\alpha$ is a probability and $\delta y$ is the size of the move in $y$.
The Transition Probability Density Function

The transition pdf is denoted by

$$p(y, t; y', t').$$

We can gain information, such as the centre of the distribution and where the random variable might be in the long run, by studying its probabilistic properties. It is the density of particles diffusing from $(y,t)$ to $(y',t')$. Think of $(y,t)$ as current (or backward) variables and $(y',t')$ as future ones.

The most basic assistance it gives is with

$$P(a < y' < b \text{ at } t' \mid y \text{ at } t) = \int_a^b p(y, t; y', t')\,dy',$$

i.e. the probability that the random variable $y'$ lies between $a$ and $b$ at a future time $t'$, given that it started out at time $t$ with value $y$.
$p(y, t; y', t')$ satisfies two equations:

The forward equation involves derivatives with respect to the future state $(y',t')$. Here $(y,t)$ is a starting point and is 'fixed'.

The backward equation involves derivatives with respect to the current state $(y,t)$. Here $(y',t')$ is a future point and is 'fixed'. The backward equation tells us the probability that we were at $(y,t)$ given that we are now at $(y',t')$, which is fixed.

The mathematics: start out at a point $(y,t)$. We want to answer the question: what is the probability density function of the position $y'$ of the diffusion at a later time $t'$? This is known as the transition density function, written $p(y, t; y', t')$, and represents the density of particles diffusing from $(y,t)$ to $(y',t')$. How can we find $p$?
Forward Equation

Starting with a trinomial random walk, which is discrete, we can pass to a continuous-time process to obtain a partial differential equation for the transition probability density function (i.e. a time-dependent PDF). The random variable can either rise or fall with equal probability $\alpha < \frac{1}{2}$, and remain at the same location with probability $1 - 2\alpha$.

Suppose we are at $(y', t')$; how did we get there? At the previous time step we must have been at one of $(y' + \delta y, t' - \delta t)$, $(y' - \delta y, t' - \delta t)$ or $(y', t' - \delta t)$. So

$$p(y, t; y', t') = \alpha\,p(y, t; y' + \delta y, t' - \delta t) + (1 - 2\alpha)\,p(y, t; y', t' - \delta t) + \alpha\,p(y, t; y' - \delta y, t' - \delta t).$$
Taylor series expansion gives (omit the dependence on $(y,t)$ in your working, as these do not change):

$$p(y' + \delta y, t' - \delta t) = p(y', t') - \frac{\partial p}{\partial t'}\delta t + \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2 + \cdots$$

$$p(y', t' - \delta t) = p(y', t') - \frac{\partial p}{\partial t'}\delta t + \cdots$$

$$p(y' - \delta y, t' - \delta t) = p(y', t') - \frac{\partial p}{\partial t'}\delta t - \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2 + \cdots$$

Substituting into the above:

$$p(y', t') = \alpha\left(p(y', t') - \frac{\partial p}{\partial t'}\delta t + \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2\right) + (1-2\alpha)\left(p(y', t') - \frac{\partial p}{\partial t'}\delta t\right) + \alpha\left(p(y', t') - \frac{\partial p}{\partial t'}\delta t - \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2\right)$$

$$0 = -\frac{\partial p}{\partial t'}\delta t + \alpha\frac{\partial^2 p}{\partial y'^2}\delta y^2$$

$$\frac{\partial p}{\partial t'} = \alpha\frac{\delta y^2}{\delta t}\frac{\partial^2 p}{\partial y'^2}$$

Now take limits. This only makes sense if $\frac{\delta y^2}{\delta t}$ is $O(1)$, i.e. $\delta y^2 \sim O(\delta t)$; letting $\delta y, \delta t \to 0$ gives the equation

$$\frac{\partial p}{\partial t'} = \alpha c^2\frac{\partial^2 p}{\partial y'^2},$$

where $c^2 = \frac{\delta y^2}{\delta t}$. This is called the forward Kolmogorov equation, also called the Fokker-Planck equation. It shows how the probability density of future states evolves, starting from $(y,t)$.
The Backward Equation

The backward equation is particularly important in the context of finance, but is also a source of much confusion. We illustrate with the 'real life' example that Wilmott uses, based on a trinomial random walk, so there are 3 possible states at the next time step; here $\alpha < 1/2$.

At 7pm you are at the office: this is the point $(y,t)$. At 8pm you will be at one of three places:

- The pub: the point $(y + \delta y, t + \delta t)$;
- Still at the office: the point $(y, t + \delta t)$;
- Madame Jojo's: the point $(y - \delta y, t + \delta t)$.

We are interested in the probability of being tucked up in bed at midnight, $(y', t')$, given that we were at the office at 7pm.

Looking at the earlier figure, we can only get to bed at midnight via the pub, the office or Madame Jojo's at 8pm. What happens after 8pm doesn't matter; we don't care, you may not even remember! We are only concerned with being in bed at midnight. The earlier figure shows many different paths; only the ones ending up in 'our' bed are of interest to us.

In words: the probability of going from the office at 7pm to bed at midnight is

- the probability of going to the pub from the office and then to bed at midnight, plus
- the probability of staying in the office and then going to bed at midnight, plus
- the probability of going to Madame Jojo's from the office and then to bed at midnight.

The above can be expressed mathematically as

$$p(y, t; y', t') = \alpha\,p(y + \delta y, t + \delta t; y', t') + (1 - 2\alpha)\,p(y, t + \delta t; y', t') + \alpha\,p(y - \delta y, t + \delta t; y', t').$$
Performing a Taylor expansion (dropping $y', t'$) gives

$$p(y,t) = \alpha\left(p + \frac{\partial p}{\partial t}\delta t + \frac{\partial p}{\partial y}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y^2}\delta y^2 + \cdots\right) + (1-2\alpha)\left(p + \frac{\partial p}{\partial t}\delta t + \cdots\right) + \alpha\left(p + \frac{\partial p}{\partial t}\delta t - \frac{\partial p}{\partial y}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y^2}\delta y^2 + \cdots\right).$$

Most of the terms cancel and leave

$$0 = \delta t\frac{\partial p}{\partial t} + \alpha\,\delta y^2\frac{\partial^2 p}{\partial y^2} + \cdots,$$

which becomes

$$0 = \frac{\partial p}{\partial t} + \alpha\frac{\delta y^2}{\delta t}\frac{\partial^2 p}{\partial y^2} + \cdots,$$

and letting $\frac{\delta y^2}{\delta t} = c^2$, where $c$ is non-zero and finite as $\delta t, \delta y \to 0$, we have

$$\frac{\partial p}{\partial t} + \alpha c^2\frac{\partial^2 p}{\partial y^2} = 0.$$
Solving the Forward Equation

The equation is

$$\frac{\partial p}{\partial t'} = c^2\frac{\partial^2 p}{\partial y'^2}$$

(absorbing the constant $\alpha$ into $c^2$) for the unknown function $p = p(y', t')$. The idea is to obtain a solution in terms of Gaussian curves. Let's drop the primed notation.

We assume a solution of the following form exists:

$$p(y,t) = t^a f\left(\frac{y}{t^b}\right),$$

where $a, b$ are constants to be determined. So put

$$\xi = \frac{y}{t^b} = yt^{-b},$$

which is a dimensionless variable. We have the following derivatives:

$$\frac{\partial\xi}{\partial y} = t^{-b}, \qquad \frac{\partial\xi}{\partial t} = -byt^{-b-1}.$$

We can now write $p(y,t) = t^a f(\xi)$, therefore

$$\frac{\partial p}{\partial y} = \frac{\partial p}{\partial\xi}\frac{\partial\xi}{\partial y} = t^a f'(\xi)\,t^{-b} = t^{a-b}f'(\xi),$$

$$\frac{\partial^2 p}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial p}{\partial y}\right) = \frac{\partial\xi}{\partial y}\frac{\partial}{\partial\xi}\left(t^{a-b}f'(\xi)\right) = t^{a-b}\,t^{-b}f''(\xi) = t^{a-2b}f''(\xi),$$

$$\frac{\partial p}{\partial t} = t^a\frac{\partial}{\partial t}f(\xi) + at^{a-1}f(\xi).$$

We can use the chain rule to write

$$\frac{\partial}{\partial t}f(\xi) = \frac{\partial f}{\partial\xi}\cdot\frac{\partial\xi}{\partial t} = -byt^{-b-1}f'(\xi),$$

so we have

$$\frac{\partial p}{\partial t} = at^{a-1}f(\xi) - byt^{a-b-1}f'(\xi).$$
Substituting these expressions into the PDE gives

$$at^{a-1}f(\xi) - byt^{a-b-1}f'(\xi) = c^2 t^{a-2b}f''(\xi).$$

We know from the definition of $\xi$ that $y = t^b\xi$, hence the equation above becomes

$$at^{a-1}f(\xi) - b\xi t^{a-1}f'(\xi) = c^2 t^{a-2b}f''(\xi).$$

For the similarity solution to exist we require the equation to be independent of $t$, i.e. $a - 1 = a - 2b \Longrightarrow b = 1/2$; therefore

$$af - \frac{1}{2}\xi f' = c^2 f''.$$

Thus we have, so far,

$$p = t^a f\left(\frac{y}{\sqrt{t}}\right),$$

which gives us a whole family of solutions dependent upon the choice of $a$.

We know that $p$ represents a pdf, hence

$$\int_{\mathbb{R}} p(y,t)\,dy = 1 = \int_{\mathbb{R}} t^a f\left(\frac{y}{\sqrt{t}}\right)dy.$$

Changing variables, $u = y/\sqrt{t} \to du = dy/\sqrt{t}$, the integral becomes

$$t^{a+1/2}\int_{-\infty}^{\infty} f(u)\,du = 1,$$

which we need to normalise independently of time $t$. This is only possible if $a = -1/2$.

So the D.E. becomes

$$-\frac{1}{2}\left(f + \xi f'\right) = c^2 f''.$$

We have an exact derivative on the lhs, i.e. $\frac{d}{d\xi}(\xi f) = f + \xi f'$, hence

$$-\frac{1}{2}\frac{d}{d\xi}(\xi f) = c^2 f'',$$

and we can integrate once to get

$$-\frac{1}{2}\xi f = c^2 f' + K.$$

We obtain $K$ from the following information about a probability density: as $\xi \to \pm\infty$,

$$f(\xi) \to 0, \qquad f'(\xi) \to 0,$$

hence $K = 0$ in order to get the correct solution, i.e.

$$-\frac{1}{2}\xi f = c^2 f',$$

which can be solved as a simple first-order variable-separable equation:

$$f(\xi) = A\exp\left(-\frac{1}{4c^2}\xi^2\right).$$
$A$ is a normalizing constant, so write

$$A\int_{\mathbb{R}}\exp\left(-\frac{1}{4c^2}\xi^2\right)d\xi = 1.$$

Now substitute $x = \xi/2c$, so $2c\,dx = d\xi$:

$$2cA\underbrace{\int_{\mathbb{R}}\exp\left(-x^2\right)dx}_{=\sqrt{\pi}} = 1,$$

which gives $A = 1/\left(2c\sqrt{\pi}\right)$. Returning to

$$p(y,t) = t^{-1/2}f(\xi),$$

this becomes

$$p(y', t') = \frac{1}{2c\sqrt{\pi t'}}\exp\left(-\frac{y'^2}{4t'c^2}\right).$$
This is the pdf of a variable $y'$ that is normally distributed with mean zero and standard deviation $c\sqrt{2t'}$, which we ascertain by the following comparison:

$$-\frac{1}{2}\frac{y'^2}{2t'c^2} \;:\; -\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2},$$

i.e. $\mu = 0$ and $\sigma^2 = 2t'c^2$. This solution is also called the source solution or fundamental solution.

If the random variable $y'$ has value $y$ at time $t$, then we can generalize to

$$p(y, t; y', t') = \frac{1}{2c\sqrt{\pi(t' - t)}}\exp\left(-\frac{(y' - y)^2}{4c^2(t' - t)}\right).$$

At $t' = t$ this is a Dirac delta function $\delta(y' - y)$. The particle is known to start from $(y,t)$ and diffuses out to $(y',t')$ with mean $y$ and variance $2c^2(t' - t)$.

Recall that this behaviour of decay away from the one point $y$, unbounded growth at that point, and constant area means that $p(y, t; y', t')$ turns into a Dirac delta function $\delta(y' - y)$ as $t' \to t$.
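A small numerical illustration of this transition density (numpy assumed; $c$ and the times are illustrative values), checking that it integrates to one and has the stated variance:

```python
import numpy as np

c, t, t1, y = 1.0, 0.0, 0.5, 0.0
yp = np.linspace(-10, 10, 20001)
p = np.exp(-(yp - y)**2 / (4 * c**2 * (t1 - t))) \
    / (2 * c * np.sqrt(np.pi * (t1 - t)))

mass = np.trapz(p, yp)
var = np.trapz((yp - y)**2 * p, yp)
print(mass, var, 2 * c**2 * (t1 - t))   # ~1, and the two variances agree
```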
Using a Binomial Random Walk

The earlier results can also be obtained using a symmetric random walk. Consider the following two-state binomial random walk, in which the random variable can either rise or fall with equal probability. Here $y$ is the random variable, $\delta t$ is a time step and $\delta y$ is the size of the move in $y$:

$$P[\delta y] = P[-\delta y] = 1/2.$$

Suppose we are at $(y', t')$; how did we get there? At the previous time step we must have been at one of $(y' + \delta y, t' - \delta t)$ or $(y' - \delta y, t' - \delta t)$. So

$$p(y', t') = \frac{1}{2}p(y' + \delta y, t' - \delta t) + \frac{1}{2}p(y' - \delta y, t' - \delta t).$$

Taylor series expansion gives

$$p(y' + \delta y, t' - \delta t) = p(y', t') - \frac{\partial p}{\partial t'}\delta t + \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2 + \cdots$$

$$p(y' - \delta y, t' - \delta t) = p(y', t') - \frac{\partial p}{\partial t'}\delta t - \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2 + \cdots$$

Substituting into the above:

$$p(y', t') = \frac{1}{2}\left(p(y', t') - \frac{\partial p}{\partial t'}\delta t + \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2\right) + \frac{1}{2}\left(p(y', t') - \frac{\partial p}{\partial t'}\delta t - \frac{\partial p}{\partial y'}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2\right)$$
$$0 = -\frac{\partial p}{\partial t'}\delta t + \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}\delta y^2$$

$$\frac{\partial p}{\partial t'} = \frac{1}{2}\frac{\delta y^2}{\delta t}\frac{\partial^2 p}{\partial y'^2}$$

Now take limits. This only makes sense if $\frac{\delta y^2}{\delta t}$ is $O(1)$, i.e. $\delta y^2 \sim O(\delta t)$; letting $\delta y, \delta t \to 0$ with $\frac{\delta y^2}{\delta t} = 1$ gives the equation

$$\frac{\partial p}{\partial t'} = \frac{1}{2}\frac{\partial^2 p}{\partial y'^2}.$$

This is the forward Kolmogorov equation, also called the Fokker-Planck equation. It shows how the probability density of future states evolves, starting from $(y,t)$.
A particular solution of this is

$$p(y, t; y', t') = \frac{1}{\sqrt{2\pi(t' - t)}}\exp\left(-\frac{(y' - y)^2}{2(t' - t)}\right).$$

At $t' = t$ this is equal to $\delta(y' - y)$. The particle is known to start from $(y,t)$ and its density is normal with mean $y$ and variance $t' - t$.
The backward equation tells us the probability that we are at $(y,t)$ given that we are at $(y',t')$ in the future. So $(y',t')$ are now fixed and $(y,t)$ are variables, and the probability of being at $(y,t)$, given that we are at $y'$ at $t'$, is linked to the probabilities of being at $(y+\delta y, t+\delta t)$ and $(y-\delta y, t+\delta t)$:

$$p(y, t; y', t') = \frac{1}{2}p(y+\delta y, t+\delta t; y', t') + \frac{1}{2}p(y-\delta y, t+\delta t; y', t').$$

Since $(y',t')$ do not change, drop these for the time being and use a TSE on the right-hand side:

$$p(y,t) = \frac{1}{2}\left(p(y,t) + \frac{\partial p}{\partial t}\delta t + \frac{\partial p}{\partial y}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y^2}\delta y^2 + \cdots\right) + \frac{1}{2}\left(p(y,t) + \frac{\partial p}{\partial t}\delta t - \frac{\partial p}{\partial y}\delta y + \frac{1}{2}\frac{\partial^2 p}{\partial y^2}\delta y^2 + \cdots\right),$$

which simplifies to

$$0 = \frac{\partial p}{\partial t} + \frac{1}{2}\frac{\delta y^2}{\delta t}\frac{\partial^2 p}{\partial y^2}.$$

Putting $\frac{\delta y^2}{\delta t} = c^2 = O(1)$ and taking the limit gives the backward equation

$$-\frac{\partial p}{\partial t} = \frac{1}{2}c^2\frac{\partial^2 p}{\partial y^2},$$

commonly written (with $c = 1$) as

$$\frac{\partial p}{\partial t} + \frac{1}{2}\frac{\partial^2 p}{\partial y^2} = 0.$$
Further Solutions of the Heat Equation

We know the one-dimensional heat/diffusion equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$

can be solved by seeking a solution of the form $u(x,t) = t^\alpha\phi\left(\frac{x}{t^\beta}\right)$. The corresponding solution derived using the similarity-reduction technique is the fundamental solution

$$u(x,t) = \frac{1}{2\sqrt{\pi t}}\exp\left(-\frac{x^2}{4t}\right).$$

Some books refer to this as a source solution.
Some books refer to this as asource solution.
Let’s consider the following integral
lim
t!0
Z
1
1
u(y; t)f(y)dy
which can be simpli…ed by the substitution
s=
y
2
p
t
=)2
p
tds=dy
to give
lim
t!0
1
2
p
t
Z
1
1
exp

s
2

f

2
p
ts

2
p
tds:
In the limiting process we get
f(0)
1
p

Z
1
1
exp

s
2

ds=f(0)
1
p

p

=f(0):
Hence
lim
t!0
Z
1
1
u(y; t)f(y)dy=f(0):
A slight extension of the above shows that

$$\lim_{t \to 0}\int_{-\infty}^{\infty} u(x-y, t)f(y)\,dy = f(x),$$

where

$$u(x-y, t) = \frac{1}{2\sqrt{\pi t}}\exp\left(-\frac{(x-y)^2}{4t}\right).$$

Let's derive the result above. As earlier, we begin by writing $s = \frac{x-y}{2\sqrt{t}} \Longrightarrow y = x - 2\sqrt{t}\,s$, and hence $dy = -2\sqrt{t}\,ds$. Under this transformation the limits are

$$y = -\infty \to s = \infty, \qquad y = \infty \to s = -\infty,$$

so the integral becomes

$$\frac{1}{2\sqrt{\pi t}}\int_{-\infty}^{\infty}\exp\left(-s^2\right)f\left(x - 2\sqrt{t}\,s\right)2\sqrt{t}\,ds.$$
$$\lim_{t \to 0}\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\exp\left(-s^2\right)f\left(x - 2\sqrt{t}\,s\right)ds = f(x)\,\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\exp\left(-s^2\right)ds = f(x)\,\frac{1}{\sqrt{\pi}}\sqrt{\pi} = f(x),$$

and

$$\lim_{t \to 0}\int_{-\infty}^{\infty} u(x-y, t)f(y)\,dy = f(x).$$

Since the heat equation is a constant-coefficient PDE, if $u(x,t)$ satisfies it, then $u(x-y, t)$ is also a solution for any $y$.

Recall what it means for an equation to be linear. Since the heat equation is linear:

1. if $u(x-y, t)$ is a solution, so is a multiple $f(y)\,u(x-y, t)$;
2. we can add up solutions, and adding can be done in terms of an integral: since $f(y)\,u(x-y, t)$ is a solution for any $y$, so too is the integral

$$\int_{-\infty}^{\infty} u(x-y, t)f(y)\,dy.$$
So we can summarize by specifying the following initial value problem:

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad u(x,0) = f(x),$$

which has the solution

$$u(x,t) = \frac{1}{2\sqrt{\pi t}}\int_{-\infty}^{\infty}\exp\left(-\frac{(x-y)^2}{4t}\right)f(y)\,dy.$$

This satisfies the initial condition at $t = 0$ because we have shown that at that point the value of this integral is $f(x)$. Putting $t < 0$ gives a non-existent solution, i.e. the integrand will blow up.
Example 1. Consider the IVP

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad u(x,0) = \begin{cases} 0 & \text{if } x > 0 \\ 1 & \text{if } x < 0 \end{cases}$$

We can write down the solution as

$$u(x,t) = \frac{1}{2\sqrt{\pi t}}\int_{-\infty}^{\infty}\exp\left(-\frac{(x-y)^2}{4t}\right)\underbrace{u(y,0)}_{=f(y)}\,dy = \frac{1}{2\sqrt{\pi t}}\int_{-\infty}^{0}\exp\left(-\frac{(x-y)^2}{4t}\right)\cdot 1\,dy.$$

Put

$$s = \frac{y-x}{\sqrt{2t}}, \qquad \int_{-\infty}^{0} \text{ becomes } \int_{-\infty}^{-x/\sqrt{2t}},$$

so

$$\frac{1}{2\sqrt{\pi t}}\int_{-\infty}^{-x/\sqrt{2t}}\exp\left(-s^2/2\right)\sqrt{2t}\,ds = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-x/\sqrt{2t}}\exp\left(-s^2/2\right)ds = N\left(-\frac{x}{\sqrt{2t}}\right).$$

So we have expressed the solution in terms of the CDF of the standard normal.
This can also be solved by using the substitution

$$\hat{s} = \frac{y-x}{2\sqrt{t}} \;\to\; dy = 2\sqrt{t}\,d\hat{s}, \qquad \int_{-\infty}^{0} \text{ becomes } \int_{-\infty}^{-x/(2\sqrt{t})},$$

so

$$\frac{1}{2\sqrt{\pi t}}\int_{-\infty}^{-x/(2\sqrt{t})}\exp\left(-\hat{s}^2\right)2\sqrt{t}\,d\hat{s} = \frac{1}{2}\cdot\frac{2}{\sqrt{\pi}}\int_{x/(2\sqrt{t})}^{\infty}\exp\left(-\hat{s}^2\right)d\hat{s} = \frac{1}{2}\operatorname{erfc}\left(\frac{x}{2\sqrt{t}}\right),$$

so now we have the solution in terms of the complementary error function.
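The two closed forms agree, as a quick check shows (scipy assumed; the evaluation point is arbitrary):

```python
import numpy as np
from scipy.special import erfc
from scipy.stats import norm

x, t = 0.3, 0.8
print(norm.cdf(-x / np.sqrt(2 * t)))     # CDF form of the solution
print(0.5 * erfc(x / (2 * np.sqrt(t))))  # erfc form; identical value
```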
Diffusion Process

$G$ is called a diffusion process if

$$dG(t) = A(G,t)\,dt + B(G,t)\,dW(t). \tag{1}$$

This is also an example of a stochastic differential equation (SDE) for the process $G$ and consists of two components:

1. $A(G,t)\,dt$ is deterministic: the coefficient of $dt$ is known as the drift of the process.
2. $B(G,t)\,dW$ is random: the coefficient of $dW$ is known as the diffusion or volatility of the process.

We say $G$ evolves according to (or follows) this process. For example,

$$dG(t) = (G(t) + G(t-1))\,dt + dW(t)$$

is not a diffusion (although it is an SDE).

$A \equiv 0$ and $B \equiv 1$ reverts the process back to Brownian motion. The process is called time-homogeneous if $A$ and $B$ do not depend on $t$. Note also that $dG^2 = B^2\,dt$.

We say (1) is an SDE for the process $G$, or a random walk for $dG$. The diffusion (1) can be written in integral form as

$$G(t) = G(0) + \int_0^t A(G,\tau)\,d\tau + \int_0^t B(G,\tau)\,dW(\tau).$$

Remark: a diffusion $G$ is a Markov process: once the present state $G(t) = g$ is given, the past $\{G(\tau), \tau < t\}$ is irrelevant to the future dynamics.

We have seen that Brownian motion can take on negative values, so its direct use for modelling stock prices is unsuitable. Instead a non-negative variation of Brownian motion, called geometric Brownian motion (GBM), is used. If for example we have a diffusion $G(t)$ with

$$dG = \mu G\,dt + \sigma G\,dW, \tag{2}$$

then the drift is $A(G,t) = \mu G$ and the diffusion is $B(G,t) = \sigma G$. The process (2) is called geometric Brownian motion (GBM).

Brownian motion $W(t)$ is used as a basis for a wide variety of models. Consider a pricing process $\{S(t) : t \in \mathbb{R}^+\}$: we can model its instantaneous change $dS$ by an SDE

$$dS = a(S,t)\,dt + b(S,t)\,dW. \tag{3}$$

By choosing different coefficients $a$ and $b$ we can obtain various properties for the diffusion process. A very popular finance model for generating asset prices is the GBM model given by (2): the instantaneous return on a stock $S(t)$ follows the constant-coefficient SDE

$$\frac{dS}{S} = \mu\,dt + \sigma\,dW, \tag{4}$$

where $\mu$ and $\sigma$ are the return's drift and volatility, respectively.
An Extension of Itô's Lemma (2D)

Now suppose we have a function $V = V(S,t)$, where $S$ is a process which evolves according to (4). If $S \to S + dS,\ t \to t + dt$, then a natural question to ask is "what is the jump in $V$?" To answer this we return to Taylor, which gives

$$V(S+dS, t+dt) = V(S,t) + \frac{\partial V}{\partial t}dt + \frac{\partial V}{\partial S}dS + \frac{1}{2}\frac{\partial^2 V}{\partial S^2}dS^2 + O\left(dS^3, dt^2\right).$$

So $S$ follows

$$dS = \mu S\,dt + \sigma S\,dW.$$

Remember that

$$E(dW) = 0, \qquad dW^2 = dt;$$

we only work to $O(dt)$ (anything smaller we ignore), and we also know that

$$dS^2 = \sigma^2 S^2\,dt.$$

So the change $dV$ when $V(S,t) \to V(S+dS, t+dt)$ is given by

$$dV = \frac{\partial V}{\partial t}dt + \frac{\partial V}{\partial S}\left[\mu S\,dt + \sigma S\,dW\right] + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}dt.$$

Rearranging into the standard form of an SDE, $dG = a(G,t)\,dt + b(G,t)\,dW$, gives

$$dV = \left(\frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S\frac{\partial V}{\partial S}dW. \tag{5}$$

This is Itô's formula in two dimensions. Naturally, if $V = V(S)$ then (5) simplifies to the shorter version

$$dV = \left(\mu S\frac{dV}{dS} + \frac{1}{2}\sigma^2 S^2\frac{d^2V}{dS^2}\right)dt + \sigma S\frac{dV}{dS}dW. \tag{6}$$
Examples. In the following cases $S$ evolves according to GBM.

Given $V = t^2S^3$, obtain the SDE for $V$, i.e. $dV$. We calculate the following terms:

$$\frac{\partial V}{\partial t} = 2tS^3, \qquad \frac{\partial V}{\partial S} = 3t^2S^2 \;\to\; \frac{\partial^2 V}{\partial S^2} = 6t^2S.$$

We now substitute these into (5) to obtain

$$dV = \left(2tS^3 + 3\mu t^2S^3 + 3\sigma^2 S^3 t^2\right)dt + 3\sigma t^2S^3\,dW.$$

Now consider the example $V = \exp(tS)$; again a function of two variables. So

$$\frac{\partial V}{\partial t} = S\exp(tS) = SV, \qquad \frac{\partial V}{\partial S} = t\exp(tS) = tV, \qquad \frac{\partial^2 V}{\partial S^2} = t^2V.$$

Substitute into (5) to get

$$dV = V\left(S + \mu tS + \frac{1}{2}\sigma^2 S^2 t^2\right)dt + \sigma StV\,dW.$$

It is not usually possible to write the SDE in terms of $V$; if you can do so, do, but do not struggle to find a relation if it does not exist. It always works for exponentials.
One more example: $S(t)$ evolves according to GBM and $V = V(S) = S^n$. So use

$$dV = \left(\mu S\frac{dV}{dS} + \frac{1}{2}\sigma^2 S^2\frac{d^2V}{dS^2}\right)dt + \sigma S\frac{dV}{dS}dW,$$

with

$$V'(S) = nS^{n-1} \;\to\; V''(S) = n(n-1)S^{n-2}.$$

Therefore Itô gives us

$$dV = \left(\mu S\,nS^{n-1} + \frac{1}{2}\sigma^2 S^2\,n(n-1)S^{n-2}\right)dt + \sigma S\,nS^{n-1}\,dW$$

$$dV = \left(\mu nS^n + \frac{1}{2}\sigma^2 n(n-1)S^n\right)dt + \sigma nS^n\,dW.$$

Now we know $V(S) = S^n$, which allows us to write

$$dV = V\left(\mu n + \frac{1}{2}\sigma^2 n(n-1)\right)dt + \sigma nV\,dW,$$

with drift $= V\left(\mu n + \frac{1}{2}\sigma^2 n(n-1)\right)$ and diffusion $= \sigma nV$.
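A small Monte Carlo sketch of this last result (numpy assumed; the parameter values are illustrative): for GBM, $E[S_T^n] = S_0^n e^{(\mu n + \frac{1}{2}\sigma^2 n(n-1))T}$, which is exactly the drift rate found above.

```python
import numpy as np

rng = np.random.default_rng(0)
S0, mu, sigma, n, T, paths = 1.0, 0.05, 0.2, 3, 1.0, 200_000

W = rng.normal(0.0, np.sqrt(T), paths)
ST = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W)  # exact GBM terminal values

mc = (ST**n).mean()
theory = S0**n * np.exp((mu * n + 0.5 * sigma**2 * n * (n - 1)) * T)
print(mc, theory)   # close agreement
```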
Important Cases: Equities and Interest Rates

If we now consider $S$ following a lognormal random walk, i.e. $V = \log S$, then substituting into (6) gives

$$d(\log S) = \left(\mu - \frac{1}{2}\sigma^2\right)dt + \sigma\,dW.$$

Integrating both sides over a given time horizon (between $t_0$ and $T$),

$$\int_{t_0}^{T} d(\log S) = \int_{t_0}^{T}\left(\mu - \frac{1}{2}\sigma^2\right)dt + \int_{t_0}^{T}\sigma\,dW \quad (T > t_0),$$

we obtain

$$\log\frac{S(T)}{S(t_0)} = \left(\mu - \frac{1}{2}\sigma^2\right)(T - t_0) + \sigma\left(W(T) - W(t_0)\right).$$

Assuming at $t_0 = 0$ that $W(0) = 0$ and $S(0) = S_0$, the exact solution becomes

$$S_T = S_0\exp\left(\left(\mu - \frac{1}{2}\sigma^2\right)T + \sigma\sqrt{T}\,\phi\right). \tag{7}$$

Equation (7) is of particular interest when considering the pricing of a simple European option, due to its non-path-dependence. Stock prices cannot become negative, so we allow $S$, a non-dividend-paying stock, to evolve according to the lognormal process given above; this acts as the starting point for the Black-Scholes framework. However, $\mu$ is replaced by the risk-free interest rate $r$ in (7) upon the introduction of the risk-neutral measure, in particular for the Monte Carlo method for option pricing.
Interest rates exhibit a variety of dynamics that are distinct from stock prices, requiring the development of specific models to capture behaviour such as return to equilibrium, boundedness and positivity. Here we consider another important example of an SDE, put forward by Vasicek in 1977. This model has a mean-reverting Ornstein-Uhlenbeck process for the short rate and is used for generating interest rates; it is given by

$$dr_t = (\eta - \gamma r_t)\,dt + \sigma\,dW_t. \tag{8}$$

So drift $= (\eta - \gamma r_t)$ and volatility $= \sigma$. $\gamma$ refers to the reversion rate and $\eta/\gamma = \bar{r}$ denotes the mean rate, and we can rewrite this random walk (8) for $dr$ as

$$dr_t = \gamma(\bar{r} - r_t)\,dt + \sigma\,dW_t.$$

By setting $\theta_t = r_t - \bar{r}$, $\theta_t$ is a solution of

$$d\theta_t = -\gamma\theta_t\,dt + \sigma\,dW_t, \qquad \theta_0 = \theta; \tag{9}$$

hence it follows that $\theta_t$ is an Ornstein-Uhlenbeck process, and an analytic solution for this equation exists.

(9) can be written as $d\theta_t + \gamma\theta_t\,dt = \sigma\,dW_t$. Multiply both sides by an integrating factor $e^{\gamma t}$:

$$e^{\gamma t}\left(d\theta_t + \gamma\theta_t\,dt\right) = \sigma e^{\gamma t}\,dW_t$$

$$d\left(e^{\gamma t}\theta_t\right) = \sigma e^{\gamma t}\,dW_t.$$

Integrating over $[0,t]$ gives

$$\int_0^t d\left(e^{\gamma s}\theta_s\right) = \sigma\int_0^t e^{\gamma s}\,dW_s$$

$$\left.e^{\gamma s}\theta_s\right|_0^t = \sigma\int_0^t e^{\gamma s}\,dW_s \;\to\; e^{\gamma t}\theta_t - \theta_0 = \sigma\int_0^t e^{\gamma s}\,dW_s$$

$$\theta_t = \theta e^{-\gamma t} + \sigma\int_0^t e^{\gamma(s-t)}\,dW_s. \tag{10}$$

By using integration by parts, i.e. $\int v\,du = uv - \int u\,dv$, we can simplify (10). With

$$u = W_s, \qquad v = e^{\gamma(s-t)} \;\to\; dv = \gamma e^{\gamma(s-t)}\,ds,$$

we have

$$\int_0^t e^{\gamma(s-t)}\,dW_s = W_t - \gamma\int_0^t e^{\gamma(s-t)}W_s\,ds,$$

and we can write (10) as

$$\theta_t = \theta e^{-\gamma t} + \sigma\left(W_t - \gamma\int_0^t e^{\gamma(s-t)}W_s\,ds\right),$$

allowing numerical treatment of the integral term.
Higher-Dimensional Itô

Consider the case where $N$ shares follow the usual geometric Brownian motions, i.e.

$$dS_i = \mu_i S_i\,dt + \sigma_i S_i\,dW_i,$$

for $1 \le i \le N$. The share price changes are correlated, with correlation coefficients $\rho_{ij}$. Starting with a Taylor series expansion,

$$V(t+\delta t, S_1+\delta S_1, \ldots, S_N+\delta S_N) = V(t, S_1, \ldots, S_N) + \frac{\partial V}{\partial t}\delta t + \sum_{i=1}^{N}\frac{\partial V}{\partial S_i}dS_i + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{\partial^2 V}{\partial S_i\partial S_j}dS_i\,dS_j + \cdots,$$

which becomes, using $dW_i\,dW_j = \rho_{ij}\,dt$,

$$dV = \left(\frac{\partial V}{\partial t} + \sum_{i=1}^{N}\mu_i S_i\frac{\partial V}{\partial S_i} + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\rho_{ij}\sigma_i\sigma_j S_iS_j\frac{\partial^2 V}{\partial S_i\partial S_j}\right)dt + \sum_{i=1}^{N}\sigma_i S_i\frac{\partial V}{\partial S_i}dW_i.$$

We can integrate both sides over $0$ and $t$ to give

$$V(t, S_1, \ldots, S_N) = V(0, S_1, \ldots, S_N) + \int_0^t\left(\frac{\partial V}{\partial\tau} + \sum_{i=1}^{N}\mu_i S_i\frac{\partial V}{\partial S_i} + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\rho_{ij}\sigma_i\sigma_j S_iS_j\frac{\partial^2 V}{\partial S_i\partial S_j}\right)d\tau + \int_0^t\sum_{i=1}^{N}\sigma_i S_i\frac{\partial V}{\partial S_i}dW_i.$$
Discrete-Time Random Walks

When simulating a random walk we write the asset price SDE (4) (with risk-neutral drift $r$) in discrete form:

$$\delta S = S_{i+1} - S_i = rS_i\,\delta t + \sigma S_i\phi\sqrt{\delta t},$$

which becomes

$$S_{i+1} = S_i\left(1 + r\,\delta t + \sigma\phi\sqrt{\delta t}\right). \tag{11}$$

This gives us a time-stepping scheme for generating an asset price realization if we know $S_0$, i.e. $S(t)$ at $t = 0$. Here $\phi \sim N(0,1)$ is a random variable with a standard normal distribution.

Alternatively, we can use the discrete form of the analytical expression (7):

$$S_{i+1} = S_i\exp\left(\left(r - \frac{1}{2}\sigma^2\right)\delta t + \sigma\phi\sqrt{\delta t}\right).$$

So we now start generating random numbers. In C++ we produce uniformly distributed random variables and then use the Box-Muller transformation (or the Polar Marsaglia method) to convert them to Gaussians. This can also be generated on an Excel spreadsheet using the built-in random generator function RAND(). A crude (but useful) approximation for $\phi$ can be obtained from

$$\phi \approx \sum_{i=1}^{12}\text{RAND()} - 6,$$

where RAND() $\sim U[0,1]$. A more accurate (but slower) $\phi$ can be computed using NORMSINV(RAND()).
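A minimal sketch of the time-stepping scheme (11) in Python (numpy assumed; parameter values are illustrative):

```python
import numpy as np

def gbm_path(S0=100.0, r=0.05, sigma=0.2, T=1.0, steps=252, seed=0):
    """Euler scheme: S_{i+1} = S_i (1 + r dt + sigma phi sqrt(dt))."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    S = np.empty(steps + 1)
    S[0] = S0
    for i in range(steps):
        phi = rng.standard_normal()
        S[i + 1] = S[i] * (1 + r * dt + sigma * phi * np.sqrt(dt))
    return S

print(gbm_path()[-1])   # one terminal value of an asset price realization
```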
Dynamics of the Vasicek Model

The Vasicek model

$$dr_t = \gamma(\bar{r} - r_t)\,dt + \sigma\,dW_t$$

is an example of a mean-reverting process, an important property of interest rates. $\gamma$ refers to the reversion rate (also called the speed of reversion) and $\bar{r}$ denotes the mean rate.

$\gamma$ acts like a "spring": mean reversion means that when the process increases it acquires a negative trend (pulling it down to the mean level $\bar{r}$), and when $r_t$ decreases, on average $\gamma$ pulls it back up to $\bar{r}$.

In discrete time we can approximate this by writing (as earlier)

$$r_{i+1} = r_i + \gamma(\bar{r} - r_i)\,\delta t + \sigma\phi\sqrt{\delta t}.$$

[Figure: a simulated mean-reverting short-rate path over $t \in [0, 3.5]$.]

To gain an understanding of the properties of this model, look at $dr$ in the absence of randomness:

$$dr = \gamma(\bar{r} - r)\,dt \;\to\; \int\frac{dr}{\bar{r} - r} = \int\gamma\,dt \;\to\; r(t) = \bar{r} + k\exp(-\gamma t),$$

so $\gamma$ controls the rate of exponential decay.

One of the disadvantages of the Vasicek model is that interest rates can become negative. The Cox-Ingersoll-Ross (CIR) model is similar to the above SDE, but the diffusion is scaled with the square root of the interest rate:

$$dr_t = \gamma(\bar{r} - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t.$$

If $r_t$ ever gets close to zero, the amount of randomness decreases, i.e. the diffusion $\to 0$; therefore the drift dominates, in particular the mean rate.
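A short simulation sketch of the discrete Vasicek scheme above (numpy assumed; parameter values are illustrative):

```python
import numpy as np

def vasicek_path(r0=0.05, gamma=2.0, rbar=0.03, sigma=0.01,
                 T=3.5, steps=700, seed=1):
    """r_{i+1} = r_i + gamma (rbar - r_i) dt + sigma phi sqrt(dt)."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    r = np.empty(steps + 1)
    r[0] = r0
    for i in range(steps):
        r[i + 1] = r[i] + gamma * (rbar - r[i]) * dt \
                   + sigma * np.sqrt(dt) * rng.standard_normal()
    return r

print(vasicek_path()[-1])   # the path hovers near rbar = 0.03
```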
Producing Standardized Normal Random Variables

Consider the RAND() function in Excel, which produces a uniformly distributed random number over 0 and 1, written Unif[0,1]. We can show that for a large number $N$,

$$\lim_{N \to \infty}\sqrt{\frac{12}{N}}\left(\sum_{1}^{N}\text{Unif}[0,1] - \frac{N}{2}\right) \sim N(0,1).$$

Introduce $U_i$ to denote a uniformly distributed random variable over $[0,1]$ and sum up. Recall that

$$E[U_i] = \frac{1}{2}, \qquad V[U_i] = \frac{1}{12}.$$

The mean is then

$$E\left[\sum_{i=1}^{N}U_i\right] = N/2,$$

so subtract off $N/2$ and examine the variance of $\left(\sum_{1}^{N}U_i - \frac{N}{2}\right)$:

$$V\left[\sum_{1}^{N}U_i - \frac{N}{2}\right] = \sum_{1}^{N}V[U_i] = N/12.$$

As the variance is not 1, write

$$V\left[\alpha\left(\sum_{1}^{N}U_i - \frac{N}{2}\right)\right]$$

for some $\alpha \in \mathbb{R}$. Hence $\frac{\alpha^2 N}{12} = 1$, which gives $\alpha = \sqrt{12/N}$, which normalises the variance. Then we achieve the result

$$\sqrt{\frac{12}{N}}\left(\sum_{1}^{N}U_i - \frac{N}{2}\right).$$

Rewriting as

$$\frac{\sum_{1}^{N}U_i - N\cdot\frac{1}{2}}{\sqrt{\frac{1}{12}}\sqrt{N}},$$

for $N \to \infty$, by the Central Limit Theorem, we get $N(0,1)$.
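The $N = 12$ special case is the crude approximation quoted earlier ($\sqrt{12/12} = 1$, so the sum of 12 uniforms minus 6). A quick empirical check (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
samples = rng.uniform(0, 1, size=(100_000, 12)).sum(axis=1) - 6

print(samples.mean(), samples.std())   # ~0 and ~1
```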
Generating Correlated Normal Variables

Consider two uncorrelated standard normal variables $\varepsilon_1$ and $\varepsilon_2$, from which we wish to form a correlated pair $\zeta_1, \zeta_2 \sim N(0,1)$ such that $E[\zeta_1\zeta_2] = \rho$. The following scheme can be used:

1. $E[\varepsilon_1] = E[\varepsilon_2] = 0$, $E[\varepsilon_1^2] = E[\varepsilon_2^2] = 1$ and $E[\varepsilon_1\varepsilon_2] = 0$ (since $\varepsilon_1, \varepsilon_2$ are uncorrelated).

2. Set $\zeta_1 = \varepsilon_1$ and $\zeta_2 = \alpha\varepsilon_1 + \beta\varepsilon_2$ (i.e. a linear combination).

3. Now

$$E[\zeta_1\zeta_2] = \rho = E[\varepsilon_1(\alpha\varepsilon_1 + \beta\varepsilon_2)] = \alpha E\left[\varepsilon_1^2\right] + \beta E[\varepsilon_1\varepsilon_2] = \alpha \;\to\; \alpha = \rho,$$

$$E\left[\zeta_2^2\right] = 1 = E\left[(\alpha\varepsilon_1 + \beta\varepsilon_2)^2\right] = \alpha^2E\left[\varepsilon_1^2\right] + \beta^2E\left[\varepsilon_2^2\right] + 2\alpha\beta E[\varepsilon_1\varepsilon_2] = \alpha^2 + \beta^2 = 1 \;\to\; \beta = \sqrt{1-\rho^2}.$$

4. This gives $\zeta_1 = \varepsilon_1$ and $\zeta_2 = \rho\varepsilon_1 + \sqrt{1-\rho^2}\,\varepsilon_2$, which are correlated standardized normal variables.
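A minimal sketch of step 4 (numpy assumed; the value of $\rho$ is illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
rho, n = 0.7, 100_000

e1, e2 = rng.standard_normal(n), rng.standard_normal(n)
z1 = e1
z2 = rho * e1 + np.sqrt(1 - rho**2) * e2   # correlated pair

print(np.corrcoef(z1, z2)[0, 1])           # ~0.7
```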
Transition Probability Density Functions for Stochastic Differential Equations

To match the mean and standard deviation of the trinomial model with the continuous-time random walk, we choose the following definitions for the probabilities:

$$\alpha^+(y,t) = \frac{1}{2}\frac{\delta t}{\delta y^2}\left(B^2(y,t) + A(y,t)\,\delta y\right),$$

$$\alpha^-(y,t) = \frac{1}{2}\frac{\delta t}{\delta y^2}\left(B^2(y,t) - A(y,t)\,\delta y\right).$$

We first note that the expected value of the move is

$$\alpha^+(\delta y) + \alpha^-(-\delta y) + \left(1 - \alpha^+ - \alpha^-\right)(0) = \left(\alpha^+ - \alpha^-\right)\delta y.$$

We already know that the mean and variance of the continuous-time random walk given by

$$dy = A(y,t)\,dt + B(y,t)\,dW$$

are, in turn,

$$E[dy] = A\,dt, \qquad V[dy] = B^2\,dt.$$

So to match the mean requires

$$\left(\alpha^+ - \alpha^-\right)\delta y = A\,\delta t.$$

The variance of the trinomial move $u$ is $E[u^2] - E^2[u]$ and hence becomes

$$(\delta y)^2\left(\alpha^+ + \alpha^-\right) - \left(\left(\alpha^+ - \alpha^-\right)\delta y\right)^2 = (\delta y)^2\left(\alpha^+ + \alpha^- - \left(\alpha^+ - \alpha^-\right)^2\right).$$

We now match the variances to get

$$(\delta y)^2\left(\alpha^+ + \alpha^- - \left(\alpha^+ - \alpha^-\right)^2\right) = B^2\,\delta t.$$

The first equation gives

$$\alpha^+ = \alpha^- + A\frac{\delta t}{\delta y},$$

which, upon substituting into the second equation, gives

$$(\delta y)^2\left(2\alpha^- + \epsilon - \epsilon^2\right) = B^2\,\delta t,$$

where $\epsilon = A\frac{\delta t}{\delta y}$. This simplifies to

$$2\alpha^- + \epsilon - \epsilon^2 = \frac{B^2\,\delta t}{(\delta y)^2},$$

which rearranges to give

$$\alpha^- = \frac{1}{2}\left(\frac{B^2\,\delta t}{(\delta y)^2} + \epsilon^2 - \epsilon\right) = \frac{1}{2}\left(\frac{B^2\,\delta t}{(\delta y)^2} + \left(A\frac{\delta t}{\delta y}\right)^2 - A\frac{\delta t}{\delta y}\right) = \frac{1}{2}\frac{\delta t}{(\delta y)^2}\left(B^2 + A^2\,\delta t - A\,\delta y\right).$$

$\delta t$ is small compared with $\delta y$, and so

$$\alpha^- = \frac{1}{2}\frac{\delta t}{(\delta y)^2}\left(B^2 - A\,\delta y\right).$$

Then

$$\alpha^+ = \alpha^- + A\frac{\delta t}{\delta y} = \frac{1}{2}\frac{\delta t}{(\delta y)^2}\left(B^2 + A\,\delta y\right).$$

Note

$$\left(\alpha^+ + \alpha^-\right)(\delta y)^2 = B^2\,\delta t.$$
Derivation of the Fokker-Planck/Forward Kolmogorov Equation

Recall that $y', t'$ are future states. We have

$$p(y, t; y', t') = \alpha^-(y'+\delta y, t'-\delta t)\,p(y, t; y'+\delta y, t'-\delta t)$$
$$\quad + \left(1 - \alpha^-(y', t'-\delta t) - \alpha^+(y', t'-\delta t)\right)p(y, t; y', t'-\delta t)$$
$$\quad + \alpha^+(y'-\delta y, t'-\delta t)\,p(y, t; y'-\delta y, t'-\delta t).$$

Expand each of the terms in a Taylor series about the point $(y', t')$ to find:

$$p(y, t; y'+\delta y, t'-\delta t) = p(y, t; y', t') + \delta y\frac{\partial p}{\partial y'} + \frac{1}{2}\delta y^2\frac{\partial^2 p}{\partial y'^2} - \delta t\frac{\partial p}{\partial t'} + \cdots$$

$$p(y, t; y', t'-\delta t) = p(y, t; y', t') - \delta t\frac{\partial p}{\partial t'} + \cdots$$

$$p(y, t; y'-\delta y, t'-\delta t) = p(y, t; y', t') - \delta y\frac{\partial p}{\partial y'} + \frac{1}{2}\delta y^2\frac{\partial^2 p}{\partial y'^2} - \delta t\frac{\partial p}{\partial t'} + \cdots$$

$$\alpha^+(y'-\delta y, t'-\delta t) = \alpha^+(y', t') - \delta y\frac{\partial\alpha^+}{\partial y'} + \frac{1}{2}\delta y^2\frac{\partial^2\alpha^+}{\partial y'^2} - \delta t\frac{\partial\alpha^+}{\partial t'} + \cdots$$

$$\alpha^+(y', t'-\delta t) = \alpha^+(y', t') - \delta t\frac{\partial\alpha^+}{\partial t'} + \cdots$$

$$\alpha^-(y'+\delta y, t'-\delta t) = \alpha^-(y', t') + \delta y\frac{\partial\alpha^-}{\partial y'} + \frac{1}{2}\delta y^2\frac{\partial^2\alpha^-}{\partial y'^2} - \delta t\frac{\partial\alpha^-}{\partial t'} + \cdots$$

Substituting into our equation for $p(y, t; y', t')$, ignoring terms smaller than $\delta t$, and noting that $\delta y \sim O\left(\sqrt{\delta t}\right)$, gives

$$\frac{\partial p}{\partial t'} = -\frac{\partial}{\partial y'}\left[\frac{(\delta y)^2}{\delta t}\frac{1}{\delta y}\left(\alpha^+ - \alpha^-\right)p\right] + \frac{1}{2}\frac{\partial^2}{\partial y'^2}\left[\frac{(\delta y)^2}{\delta t}\left(\alpha^+ + \alpha^-\right)p\right].$$

Noting the earlier results

$$A = \frac{(\delta y)^2}{\delta t}\cdot\frac{1}{\delta y}\left(\alpha^+ - \alpha^-\right), \qquad B^2 = \frac{(\delta y)^2}{\delta t}\left(\alpha^+ + \alpha^-\right)$$

gives the forward equation

$$\frac{\partial p}{\partial t'} = \frac{1}{2}\frac{\partial^2}{\partial y'^2}\left(B^2(y', t')\,p\right) - \frac{\partial}{\partial y'}\left(A(y', t')\,p\right).$$

The initial condition used is

$$p(y, t; y', t')\big|_{t'=t} = \delta(y' - y).$$
As an example, consider the important case of the distribution of stock prices. Given the random walk for equities, i.e. geometric Brownian motion,

$$\frac{dS}{S} = \mu\,dt + \sigma\,dW,$$

we have $A(S', t') = \mu S'$ and $B(S', t') = \sigma S'$. Hence the forward equation becomes

$$\frac{\partial p}{\partial t'} = \frac{1}{2}\frac{\partial^2}{\partial S'^2}\left(\sigma^2 S'^2 p\right) - \frac{\partial}{\partial S'}\left(\mu S'p\right).$$

This can be solved with the starting condition $S' = S$ at $t' = t$ to give the transition pdf

$$p(S, t; S', t') = \frac{1}{\sigma S'\sqrt{2\pi(t'-t)}}\,e^{-\left(\log(S/S') + \left(\mu - \frac{1}{2}\sigma^2\right)(t'-t)\right)^2/\left(2\sigma^2(t'-t)\right)}.$$

More on this and the solution technique later, but note that a transformation reduces this to the one-dimensional heat equation, and the similarity reduction method which follows is used.
The Steady-State Distribution

As the name suggests, 'steady state' refers to time independence. Random walks for interest rates and volatility can be modelled with stochastic differential equations which have steady-state distributions. So in the long run, i.e. as $t' \to \infty$, the distribution $p(y, t; y', t')$ settles down and becomes independent of the starting state $y$ and $t$. The partial derivatives in the forward equation now become ordinary ones, and the unsteady term $\frac{\partial p}{\partial t'}$ vanishes.

The resulting forward equation for the steady-state distribution $p_\infty(y')$ is the ordinary differential equation

$$\frac{1}{2}\frac{d^2}{dy'^2}\left(B^2 p_\infty\right) - \frac{d}{dy'}\left(Ap_\infty\right) = 0.$$
Example. The Vasicek model for the spot rate $r$ evolves according to the stochastic differential equation

$$dr = \gamma(\bar{r} - r)\,dt + \sigma\,dW.$$

Write down the Fokker-Planck equation for the transition probability density function for the interest rate $r$ in this model. Then, using the steady-state version of the forward equation, solve this to find the steady-state probability distribution $p_\infty(r')$, given by

$$p_\infty = \frac{1}{\sigma}\sqrt{\frac{\gamma}{\pi}}\exp\left(-\frac{\gamma}{\sigma^2}\left(r' - \bar{r}\right)^2\right).$$

Solution:

For the SDE $dr = \gamma(\bar{r} - r)\,dt + \sigma\,dW$, where the drift is $\gamma(\bar{r} - r)$ and the diffusion is $\sigma$, the Fokker-Planck equation becomes

$$\frac{\partial p}{\partial t'} = \frac{1}{2}\sigma^2\frac{\partial^2 p}{\partial r'^2} - \frac{\partial}{\partial r'}\left(\gamma(\bar{r} - r')\,p\right),$$

where $p = p(r', t')$ is the transition PDF and the variables refer to future states. In the steady-state case there is no time dependency, hence the Fokker-Planck PDE becomes an ODE:

$$\frac{1}{2}\sigma^2\frac{d^2 p_\infty}{dr^2} - \gamma\frac{d}{dr}\left((\bar{r} - r)\,p_\infty\right) = 0,$$

with $p_\infty = p_\infty(r)$; the prime notation and subscript have been dropped simply for convenience at this stage.

To solve the steady-state equation, integrate with respect to $r$:

$$\frac{1}{2}\sigma^2\frac{dp}{dr} - \gamma(\bar{r} - r)\,p = k,$$

where $k$ is a constant of integration and can be calculated from the conditions that, as $r \to \pm\infty$,

$$\frac{dp}{dr} \to 0, \quad p \to 0 \;\Longrightarrow\; k = 0,$$

which gives

$$\frac{1}{2}\sigma^2\frac{dp}{dr} = \gamma(\bar{r} - r)\,p,$$

a first-order variable-separable equation. So

$$\frac{1}{2}\sigma^2\int\frac{dp}{p} = \gamma\int(\bar{r} - r)\,dr \;\to\; \frac{1}{2}\sigma^2\ln p = \gamma\left(\bar{r}r - \frac{r^2}{2}\right) + C, \quad C \text{ arbitrary}.$$

Rearranging and taking exponentials of both sides gives

$$p = \exp\left(\frac{2\gamma}{\sigma^2}\left(\bar{r}r - \frac{r^2}{2}\right) + D\right) = E\exp\left(-\frac{2\gamma}{\sigma^2}\left(\frac{r^2}{2} - \bar{r}r\right)\right).$$

Complete the square to get

$$p = E\exp\left(-\frac{\gamma}{\sigma^2}\left((r - \bar{r})^2 - \bar{r}^2\right)\right) \;\to\; p_\infty = A\exp\left(-\frac{\gamma}{\sigma^2}\left(r' - \bar{r}\right)^2\right).$$

There is another way of performing the integration on the rhs: if we go back to $\gamma\int(r - \bar{r})\,dr$ and write it as

$$\gamma\int\frac{1}{2}\frac{d}{dr}(r - \bar{r})^2\,dr = \frac{\gamma}{2}(r - \bar{r})^2,$$

to give

$$\frac{1}{2}\sigma^2\ln p = -\frac{\gamma}{2}(r - \bar{r})^2 + C.$$
Now, we know that since $p_\infty$ is a PDF,

$$\int_{-\infty}^{\infty} p_\infty\,dr' = 1 \;\to\; A\int_{-\infty}^{\infty}\exp\left(-\frac{\gamma}{\sigma^2}\left(r' - \bar{r}\right)^2\right)dr' = 1.$$

There are a few (related) ways to calculate $A$. Here, use the standard Gaussian integral

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$

So put

$$x = \frac{\sqrt{\gamma}}{\sigma}\left(r' - \bar{r}\right) \;\to\; dx = \frac{\sqrt{\gamma}}{\sigma}\,dr',$$

which transforms the integral above:

$$A\,\frac{\sigma}{\sqrt{\gamma}}\int_{-\infty}^{\infty} e^{-x^2}\,dx = 1 \;\to\; A\,\sigma\sqrt{\frac{\pi}{\gamma}} = 1,$$

therefore

$$A = \frac{1}{\sigma}\sqrt{\frac{\gamma}{\pi}}.$$

This allows us to finally write the steady-state transition PDF as

$$p_\infty = \frac{1}{\sigma}\sqrt{\frac{\gamma}{\pi}}\exp\left(-\frac{\gamma}{\sigma^2}\left(r' - \bar{r}\right)^2\right).$$
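A simulation sketch checking this steady-state density against a long Vasicek path (numpy assumed; parameters are illustrative). The density above is Gaussian with mean $\bar{r}$ and variance $\sigma^2/(2\gamma)$:

```python
import numpy as np

rng = np.random.default_rng(4)
gamma, rbar, sigma, dt, n = 1.0, 0.05, 0.02, 0.01, 500_000

r = np.empty(n); r[0] = rbar
for i in range(n - 1):
    r[i + 1] = r[i] + gamma * (rbar - r[i]) * dt \
               + sigma * np.sqrt(dt) * rng.standard_normal()

# discard the first half as burn-in, compare with theory
print(r[n // 2:].mean(), rbar)                       # ~0.05
print(r[n // 2:].var(), sigma**2 / (2 * gamma))      # ~2e-4
```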
The backward equation is obtained in a similar way to the forward:

$$p(y, t; y', t') = \alpha^+(y,t)\,p(y+\delta y, t+\delta t; y', t') + \left(1 - \alpha^-(y,t) - \alpha^+(y,t)\right)p(y, t+\delta t; y', t') + \alpha^-(y,t)\,p(y-\delta y, t+\delta t; y', t'),$$

and expand using Taylor. The resulting PDE is

$$\frac{\partial p}{\partial t} + \frac{1}{2}B^2(y,t)\frac{\partial^2 p}{\partial y^2} + A(y,t)\frac{\partial p}{\partial y} = 0.$$
Review of Module 1

The Binomial Model

The model has made option pricing accessible to MBA students and finance practitioners preparing for the CFA®. It is a very useful tool for conveying the ideas of delta hedging and no-arbitrage, in addition to the subtle concepts of risk neutrality and option pricing. Here the model is considered in a slightly more mathematical way.

The basic assumptions in option pricing theory come in two forms. Key:

- Short selling allowed
- No arbitrage opportunities

and relaxable:

- Frictionless markets
- Perfect liquidity
- Known volatility and interest rates
- No dividends on the underlying

The key assumptions underlying the binomial model are:

- an asset value changes only at discrete time intervals;
- an asset's worth can change to one of only two possible new values at each time step.

The One-Period Model: Replication

Another way of looking at the binomial model is in terms of replication: we can replicate the option using only cash (or bonds) and the asset. That is, mathematically, simply a rearrangement of the earlier equations. It is, nevertheless, a very important interpretation.

In one time step:

1. The asset moves from $S_0 = s$ to $S_1 = su$ or $S_1 = sd$.
2. An option $X$ pays off $x_u$ if the asset price is $su$ and $x_d$ if the price is $sd$.
3. There is a bond market in which a pound invested today is continuously compounded at a constant (risk-free) rate $r$ and becomes $e^{r\delta t}$ one time-step later.

Now consider a portfolio of $\psi$ bonds and $\phi$ assets, which at time $t = 0$ has initial value

$$V_0 = \phi S_0 + \psi.$$

With this money we can buy or sell bonds or stocks in order to obtain a new portfolio at time-step 1. Can we construct a hedging strategy which will guarantee to pay off the option, whatever happens to the asset price?
The Hedging Strategy

We arrange the portfolio so that its value is exactly that of the required option pay-out at the terminal time, regardless of whether the stock moves up or down. This is possible because we have two unknowns, $\phi, \psi$ (the amounts of stock and bond), and we wish to match the two possible terminal values $x_u, x_d$ (the option payoffs). Thus we need

$$x_u = \phi su + \psi e^{r\delta t},$$
$$x_d = \phi sd + \psi e^{r\delta t}.$$

Solving for $\phi, \psi$ we have

$$\phi = \frac{x_u - x_d}{su - sd}, \qquad \psi = e^{-r\delta t}\,\frac{x_d\,su - x_u\,sd}{su - sd}.$$

This is a hedging strategy. At time step 1, the value of the portfolio is

$$V_1 = \begin{cases} x_u & \text{if } S_1 = su \\ x_d & \text{if } S_1 = sd \end{cases}$$

This is the option payoff. Thus, given $V_0 = \phi S_0 + \psi$, we can construct the above portfolio, which has the same payoff as the option. Hence the price of the option must be $V_0$: any other price would allow arbitrage, as you could play this hedging strategy, either buying or selling the option, and make a guaranteed profit.
Thus the fair, arbitrage-free price for the option is given by

$$V_0 = \phi S_0 + \psi = \frac{x_u - x_d}{su - sd}\,s + e^{-r\delta t}\,\frac{x_d\,su - x_u\,sd}{su - sd} = e^{-r\delta t}\left(\frac{e^{r\delta t}s - sd}{su - sd}\,x_u + \frac{su - e^{r\delta t}s}{su - sd}\,x_d\right).$$

Define

$$q = \frac{e^{r\delta t}s - sd}{su - sd};$$

then we conclude that

$$V_0 = e^{-r\delta t}\left(q\,x_u + (1-q)\,x_d\right),$$

where $0 \le q \le 1$.

We can think of $q$ as a probability induced by the insistence on no-arbitrage, i.e. the so-called risk-neutral probability. It has nothing to do with the real probabilities of $su$ and $sd$ occurring; those are $p$ and $1-p$, in turn.

The option price can be viewed as the discounted expected value of the option pay-off with respect to the probabilities $q$:

$$V_0 = e^{-r\delta t}\left(q\,x_u + (1-q)\,x_d\right) = E_q\left[e^{-r\delta t}X\right].$$
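A tiny sketch of this one-period replication and risk-neutral price (plain Python; the numbers are illustrative, here for a call struck at $K$):

```python
import math

s, u, d, r, dt, K = 100.0, 1.1, 0.9, 0.05, 1.0, 100.0
xu, xd = max(s * u - K, 0), max(s * d - K, 0)        # call payoffs

phi = (xu - xd) / (s * u - s * d)                    # stock holding
psi = math.exp(-r * dt) * (xd * s * u - xu * s * d) / (s * u - s * d)
q = (math.exp(r * dt) * s - s * d) / (s * u - s * d) # risk-neutral prob

print(phi * s + psi)                                  # replication price
print(math.exp(-r * dt) * (q * xu + (1 - q) * xd))    # identical value
```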
The fact that the risk-neutral/fair value (or $q$-value) of a call is less than the expected value of the call (under the real probability $p$) is not a puzzle. Pricing a call using the real probability $p$, you will probably make a profit, but you might also make a loss. Pricing an option using the risk-neutral probability $q$, you will certainly make neither a profit nor a loss.

Assume an asset which has value $S$ and during a time step $\delta t$ can either rise to $uS$ or fall to $vS$, with $0 < v < 1 < u$; as earlier, the probabilities of a rise and a fall are $p$ and $1-p$, in turn. Correspondingly, the option value $V$ at $S$ moves to $V^+$ (at $uS$) or $V^-$ (at $vS$) over the step. Also set $uv = 1$, so that after an up and a down move the asset returns to $S$: hence a recombining tree.

To implement the binomial model we need a model of asset price evolution to predict future possible spot prices, so use

$$\delta S = \mu S\,\delta t + \sigma S\phi\sqrt{\delta t},$$

i.e. the discrete version of GBM. The three constants $u, v, p$ are chosen to give the binomial model the same drift and diffusion as the SDE. For the correct drift, choose

$$pu + (1-p)v = e^{\mu\delta t} \tag{a}$$

and for the correct standard deviation set

$$pu^2 + (1-p)v^2 = e^{(2\mu+\sigma^2)\delta t}. \tag{b}$$

Forming $u\cdot(a) + v\cdot(a)$ gives

$$(u+v)\,e^{\mu\delta t} = pu^2 + uv - puv + pvu + v^2 - pv^2.$$

Rearrange to get

$$(u+v)\,e^{\mu\delta t} = pu^2 + (1-p)v^2 + uv,$$

and we know from (b) that $pu^2 + (1-p)v^2 = e^{(2\mu+\sigma^2)\delta t}$, and that $uv = 1$. Hence we have

$$(u+v)\,e^{\mu\delta t} = e^{(2\mu+\sigma^2)\delta t} + 1 \;\Longrightarrow\; (u+v) = e^{-\mu\delta t} + e^{(\mu+\sigma^2)\delta t}.$$

Now recall that the quadratic equation $ax^2 + bx + c = 0$ with roots $\alpha$ and $\beta$ has

$$\alpha + \beta = -\frac{b}{a}, \qquad \alpha\beta = \frac{c}{a}.$$

We have

$$(u+v) = e^{-\mu\delta t} + e^{(\mu+\sigma^2)\delta t} \equiv -\frac{b}{a}, \qquad uv = 1 \equiv \frac{c}{a},$$

hence $u$ and $v$ satisfy

$$(x-u)(x-v) = 0,$$

giving the quadratic

$$x^2 - (u+v)x + uv = 0 \;\Longrightarrow\; x = \frac{(u+v) \pm \sqrt{(u+v)^2 - 4uv}}{2},$$

so, with $u > 1$,

$$u = \frac{1}{2}\left(e^{-\mu\delta t} + e^{(\mu+\sigma^2)\delta t}\right) + \frac{1}{2}\sqrt{\left(e^{-\mu\delta t} + e^{(\mu+\sigma^2)\delta t}\right)^2 - 4}.$$

In this model, the hedging argument gives

$$V^+ - \Delta uS = V^- - \Delta vS,$$

which leads to

$$\Delta = \frac{V^+ - V^-}{(u-v)S}.$$

Because all other terms are known, choose $\Delta$ to eliminate risk. We know tomorrow's option value, therefore today's price is tomorrow's value discounted for interest rates:

$$V - \Delta S = \frac{1}{1 + r\delta t}\left(V^+ - \Delta uS\right),$$

so $(1 + r\delta t)(V - \Delta S) = V^+ - \Delta uS$, and replacing $\Delta$ using the definition above,

$$(1 + r\delta t)V = V^+\left(\frac{-v + 1 + r\delta t}{u-v}\right) + V^-\left(\frac{u - 1 - r\delta t}{u-v}\right),$$

where the risk-neutral probabilities are

$$q = \frac{-v + 1 + r\delta t}{u-v}, \qquad 1-q = \frac{u - 1 - r\delta t}{u-v}.$$

So $(1 + r\delta t)V = V^+q + V^-(1-q)$. Finally we have

$$V = \frac{V^+ - V^-}{u-v} + \frac{uV^- - vV^+}{(1 + r\delta t)(u-v)}, \qquad q = \frac{e^{r\delta t} - v}{u-v}.$$
The Continuous Time Limit
Performing a Taylor expansion around $\delta t = 0$ we have
$$u \approx \tfrac{1}{2}\Big( (1 - \mu\delta t + \cdots) + \big(1 + (\mu + \sigma^2)\delta t + \cdots\big) \Big) + \tfrac{1}{2}\left( e^{-2\mu\delta t} + 2e^{\sigma^2 \delta t} + e^{2(\mu+\sigma^2)\delta t} - 4 \right)^{1/2}$$
$$= \left( 1 + \tfrac{1}{2}\sigma^2 \delta t + \cdots \right) + \tfrac{1}{2}\left( 1 - 2\mu\delta t + 2 + 2\sigma^2\delta t + 1 + 2\mu\delta t + 2\sigma^2\delta t - 4 + \cdots \right)^{1/2}$$
$$= \left( 1 + \tfrac{1}{2}\sigma^2 \delta t + \cdots \right) + \tfrac{1}{2}\left( 4\sigma^2 \delta t + \cdots \right)^{1/2}.$$
Ignoring the terms of order $\delta t^{3/2}$ and higher we get the result
$$u = 1 + \sigma\,\delta t^{1/2} + \tfrac{1}{2}\sigma^2 \delta t + \cdots$$
Since $uv = 1$, this implies that $v = u^{-1}$. Using the expansion for $u$ obtained earlier we have
$$v = \left( 1 + \sigma\,\delta t^{1/2} + \tfrac{1}{2}\sigma^2 \delta t + \cdots \right)^{-1} = \left( 1 + \sigma\,\delta t^{1/2}\big( 1 + \tfrac{1}{2}\sigma\,\delta t^{1/2} \big) \right)^{-1}$$
$$= 1 - \sigma\,\delta t^{1/2}\big( 1 + \tfrac{1}{2}\sigma\,\delta t^{1/2} \big) + \left( \sigma\,\delta t^{1/2}\big( 1 + \tfrac{1}{2}\sigma\,\delta t^{1/2} \big) \right)^{2} + \cdots$$
$$= 1 - \sigma\,\delta t^{1/2} - \tfrac{1}{2}\sigma^2 \delta t + \sigma^2 \delta t + \cdots = 1 - \sigma\,\delta t^{1/2} + \tfrac{1}{2}\sigma^2 \delta t + \cdots$$
So we have
$$u \approx 1 + \sigma\sqrt{\delta t} + \tfrac{1}{2}\sigma^2 \delta t, \qquad v \approx 1 - \sigma\sqrt{\delta t} + \tfrac{1}{2}\sigma^2 \delta t.$$
So to summarise we can write
$$u = e^{\sigma\sqrt{\delta t}}, \qquad v = e^{-\sigma\sqrt{\delta t}}, \qquad q = \frac{e^{r\delta t} - v}{u-v},$$
and use these to build the asset price tree using $u$ and $v$, and then value the option backwards from expiry $T$ using
$$e^{r\delta t}\, V(S, t) = q\, V(uS, t+\delta t) + (1-q)\, V(vS, t+\delta t),$$
and at each stage the hedge ratio $\Delta$ is obtained using
$$\Delta = \frac{V^{+} - V^{-}}{(u-v)S} = \frac{V(uS, t+\delta t) - V(vS, t+\delta t)}{(u-v)S}.$$
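The backward-valuation recipe above is easy to implement. Below is a minimal Python sketch (not part of the original notes; parameter values are illustrative) that builds the recombining tree with $u = e^{\sigma\sqrt{\delta t}}$, $v = e^{-\sigma\sqrt{\delta t}}$ and $q = (e^{r\delta t} - v)/(u - v)$, and steps a European call value back from expiry:

```python
import numpy as np

def binomial_call(S0, K, r, sigma, T, N):
    """European call valued on a recombining binomial tree (uv = 1)."""
    dt = T / N
    u = np.exp(sigma * np.sqrt(dt))
    v = 1.0 / u                            # uv = 1 gives a recombining tree
    q = (np.exp(r * dt) - v) / (u - v)     # risk-neutral probability
    # terminal asset prices S0 * u^j * v^(N-j), j = 0, ..., N
    S = S0 * u ** np.arange(N + 1) * v ** np.arange(N, -1, -1)
    V = np.maximum(S - K, 0.0)             # call payoff at expiry
    # V(S, t) = e^{-r dt} [ q V(uS, t+dt) + (1-q) V(vS, t+dt) ]
    for _ in range(N):
        V = np.exp(-r * dt) * (q * V[1:] + (1 - q) * V[:-1])
    return V[0]

print(binomial_call(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, N=200))
```

As $N$ grows the value converges to the Black-Scholes price (about 10.45 for these inputs).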
Note that
$$\Delta = \frac{V^{+} - V^{-}}{(u-v)S} \approx \frac{2\sigma\sqrt{\delta t}\, S\, \dfrac{\partial V}{\partial S}}{2\sigma\sqrt{\delta t}\, S} = \frac{\partial V}{\partial S}.$$
Now expand
$$V^{+} = V(uS, t+\delta t) \approx V + \delta t\,\frac{\partial V}{\partial t} + \sigma\sqrt{\delta t}\, S\,\frac{\partial V}{\partial S} + \tfrac{1}{2}\sigma^2 \delta t\, S^2\,\frac{\partial^2 V}{\partial S^2},$$
$$V^{-} = V(vS, t+\delta t) \approx V + \delta t\,\frac{\partial V}{\partial t} - \sigma\sqrt{\delta t}\, S\,\frac{\partial V}{\partial S} + \tfrac{1}{2}\sigma^2 \delta t\, S^2\,\frac{\partial^2 V}{\partial S^2}.$$
Then
$$V = \frac{V^{+} - V^{-}}{u-v} + \frac{uV^{-} - vV^{+}}{(1 + r\delta t)(u-v)} = \frac{2\sigma\sqrt{\delta t}\,S}{2\sigma\sqrt{\delta t}}\,\frac{\partial V}{\partial S} + \frac{\big( 1 + \sigma\sqrt{\delta t} \big)V^{-} - \big( 1 - \sigma\sqrt{\delta t} \big)V^{+}}{(1 + r\delta t)\, 2\sigma\sqrt{\delta t}}.$$
Rearranging to give
$$(1 + r\delta t)\, 2\sigma\sqrt{\delta t}\, V = 2\sigma\sqrt{\delta t}\, S (1 + r\delta t)\frac{\partial V}{\partial S} + \big( V^{-} - V^{+} \big) + \sigma\sqrt{\delta t}\big( V^{-} + V^{+} \big),$$
and so
$$(1 + r\delta t)\, 2\sigma\sqrt{\delta t}\, V = 2\sigma\sqrt{\delta t}\, S (1 + r\delta t)\frac{\partial V}{\partial S} - 2\sigma\sqrt{\delta t}\, S\frac{\partial V}{\partial S} + 2\sigma\sqrt{\delta t}\left( V + \tfrac{1}{2}\sigma^2 \delta t\, S^2 \frac{\partial^2 V}{\partial S^2} + \delta t\,\frac{\partial V}{\partial t} \right),$$
$$(1 + r\delta t)V = S(1 + r\delta t)\frac{\partial V}{\partial S} - S\frac{\partial V}{\partial S} + V + \tfrac{1}{2}\sigma^2 \delta t\, S^2 \frac{\partial^2 V}{\partial S^2} + \delta t\,\frac{\partial V}{\partial t};$$
divide through by $\delta t$ and allow $\delta t \to 0$:
$$rV = rS\frac{\partial V}{\partial S} + \tfrac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + \frac{\partial V}{\partial t},$$
and hence the Black-Scholes Equation.
Probability
Probability theory provides the structure needed to model the uncertainty that is central to finance, which is the chief reason for its powerful influence in mathematical finance. Any formal discussion of random variables requires defining the triple $(\Omega, \mathcal{F}, \mathbb{P})$, as it forms the foundation of the probabilistic universe. This three-tuple is called a probability space and comprises
1. the sample space $\Omega$
2. the filtration $\mathcal{F}$
3. the probability measure $\mathbb{P}$
Basic set-theoretic notions have special interpretations in probability theory. Here are some:
The complement in $\Omega$ of the event $A$, written $A^c$, is interpreted as "not $A$" and occurs iff $A$ does not occur.
The union $A \cup B$ of two events $A$ and $B$ is the event "at least one of $A$ or $B$ occurs".
The intersection $A \cap B$ of two events $A$ and $B$ is the event "both $A$ and $B$ occur". Events $A$ and $B$ are said to be mutually exclusive if they are disjoint, $A \cap B = \emptyset$, and so both cannot occur together.
The inclusion relation $A \subseteq B$ means "the occurrence of $A$ implies the occurrence of $B$".
Example The daily closing price of a risky asset, e.g. a share price on the FTSE 100. Over the course of a year (252 business days)
$$\Omega = \{ S_1, S_2, S_3, \ldots, S_{252} \}.$$
We could define an event, e.g. $A = \{ S_i : S_i \ge 110 \}$.
Outcomes of experiments are not always numbers, e.g. 2 heads appearing, picking an ace from a deck of cards, or the coin-flipping example above. We need some way of assigning real numbers to each random event. Random variables assign numbers to events. Thus a random variable (RV) $X$ is a function which maps from the sample space $\Omega$ to the set of real numbers,
$$X : \omega \in \Omega \to \mathbb{R},$$
i.e. it associates a number $X(\omega)$ with each outcome $\omega$. A more robust definition will follow.
Consider the example of tossing a coin and suppose we are paid £1 for each head and we lose £1 each time a tail appears. We know that $\mathbb{P}(H) = \mathbb{P}(T) = \tfrac{1}{2}$. So now we can assign the following outcomes:
$$\mathbb{P}(1) = \tfrac{1}{2}, \qquad \mathbb{P}(-1) = \tfrac{1}{2}.$$
Mathematically, if our random variable is $X$, then
$$X = \begin{cases} +1 & \text{if } H \\ -1 & \text{if } T \end{cases}$$
or, using the notation above, $X : \omega \in \{H, T\} \to \{-1, 1\}$.
Returning to the coin tossing game we see the sample space $\Omega$ has two events: $\omega_1 = \text{Head}$, $\omega_2 = \text{Tail}$. So now
$$\Omega = \{ \omega_1, \omega_2 \}$$
and the P&L from this game is a RV $X$ defined by
$$X(\omega_1) = +1, \qquad X(\omega_2) = -1,$$
$$\Omega = \{ \omega_1, \omega_2 \} \;\Rightarrow\; 2^{\Omega} = \{ \emptyset, \{-1\}, \{+1\}, \{-1, +1\} \}.$$
In a multi-period market, information about the market is revealed in stages. The $n$-period binomial model demonstrates the way this information becomes available.
Some events may be completely determined by the end of the first trading period, others by the end of the second or third, and others will only be determined at the termination of all trading. These events can be classified in the following way: consider time $t \le T$ and define
$$\mathcal{F}_t = \{ \text{all events determined in the first } t \text{ trading periods} \}.$$
The binomial stock price model is a discrete-time stochastic model of a stock price process in which a fictitious coin is tossed and the stock price dynamics depend on the outcome of the coin tosses, e.g. a head means the stock rises by one unit, a tail means the stock falls by that same amount. Start by introducing some new probabilistic terminology and concepts.
Suppose $\mathbb{T} := \{0, 1, 2, \ldots, n\}$ represents a discrete time set.
The sample space $\Omega = \Omega_n$ is the set of all outcomes of $n$ coin tosses; each sample point $\omega \in \Omega$ is of length $n$, written as $\omega = \omega_1\omega_2\ldots\omega_n$, where each $\omega_t$, $t \in \mathbb{T}$, is either $U$ (due to a head) or $D$ (due to a tail), representing the outcome of the $t$-th coin toss. So e.g. three coin tosses would give a sample path $\omega = \omega_1\omega_2\omega_3$ of length 3.
We are interested in a stochastic process due to the dynamic nature of asset prices.
Suppose before the markets open we guess the possible outcomes of the stock price; this will give us our sample space. The sample path will tell us what just happened. Consider a stock price which over the next time step can go up $U$ or go down $D$.
$$\Omega_1 = \{U, D\}: \quad 2^1 \text{ outcomes}, \quad \omega = \omega_1 \text{ of length } 1.$$
Then, for a two-time-period model, the sample space at the end of two time periods is
$$\Omega_2 = \{UU, UD, DU, DD\}: \quad 2^2 \text{ outcomes}, \quad \omega = \omega_1\omega_2 \text{ of length } 2.$$
For this experiment a sample path or trajectory would be one realisation, e.g. $DU$ or $DD$. Generally in probability theory, the sample space is of greater interest. As the number of time periods becomes larger and larger it becomes increasingly difficult to track all of the possible outcomes and the corresponding sample space generated through time, i.e. $\Omega_1, \Omega_2, \Omega_3, \ldots, \Omega_t, \Omega_{t+1}, \ldots$
The filtration, $\mathbb{F}$, is an indication of how an increasing family of events builds up over time as more results become available; it is much more than just a family of events. The filtration $\mathbb{F}$ is a set formed of all possible combinations of events $A$, their unions and complements. So, for example, if we want to know what events can occur, we are also interested in what cannot happen. The filtration $\mathbb{F}$ is an object in Measure Theory called a $\sigma$-algebra (also called a $\sigma$-field). $\sigma$-algebras can be interpreted as records of information. Measure theory was brought to probability by Kolmogorov.
Now let $\mathcal{F}$ be a non-empty set of subsets of $\Omega$; then $\mathcal{F}$ ($\subseteq 2^{\Omega}$) is a $\sigma$-algebra (also called a $\sigma$-field), that is, a collection of subsets of $\Omega$ with the properties:
1. $\emptyset \in \mathcal{F}$
2. If $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$ (closed under complements)
3. If $A_i \in \mathcal{F}$ $\forall i \in \mathbb{N}$ then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$ (closed under countable unions)
The second property also implies that $\Omega \in \mathcal{F}$. In addition $\bigcap_{i=1}^{\infty} A_i \in \mathcal{F}$. The pair $(\Omega, \mathcal{F})$ is called a measurable space.
Key Fact: For $0 \le t_1 \le t_2 \le \cdots \le T$,
$$\mathcal{F}_{t_1} \subseteq \mathcal{F}_{t_2} \subseteq \cdots \subseteq \mathcal{F}_{T} \subseteq \mathcal{F}.$$
Since we consider that information gets constantly recorded and accumulates up until the end of the experiment $T$, without ever getting lost or forgotten, it is only logical that with the passage of time the filtration increases.
In general it is very difficult to describe the filtration explicitly. In the case of (say) the binomial model, this can be done.
Example Consider a 3-period binomial model. At the end of each period, new information becomes available to help us predict the actual stock trajectory. So take $n = 3$; $\Omega = \Omega_3$ is given by the finite set
$$\Omega_3 = \{UUU, UUD, UDU, UDD, DUU, DUD, DDU, DDD\},$$
the set of all possible outcomes of three coin tosses. At time $t = 0$, before the start of trading, we only have the trivial filtration
$$\mathcal{F}_0 = \{\emptyset, \Omega\},$$
since we do not have any information regarding the trajectory of the stock. The trivial $\sigma$-algebra $\mathcal{F}_0$ contains no information: knowing whether the outcome $\omega$ of the three tosses is in $\emptyset$ (it is not) and whether it is in $\Omega$ (it is) tells you nothing about $\omega$, in accordance with the idea that at time zero one knows nothing about the eventual outcome $\omega$ of the three coin tosses. All one can say is that $\omega \notin \emptyset$ and $\omega \in \Omega$, and so $\mathcal{F}_0 = \{\emptyset, \Omega\}$.
Now define the following two subsets of $\Omega$:
$$A_U = \{UUU, UUD, UDU, UDD\}, \qquad A_D = \{DUU, DUD, DDU, DDD\}.$$
We see $A_U$ is the subset of outcomes where a head appears on the first throw, and $A_D$ is the subset of outcomes where a tail lands on the first throw. After the first trading period $t = 1$ (11 am) we know whether the initial move was an up move or a down move. Hence
$$\mathcal{F}_1 = \{\emptyset, \Omega, A_U, A_D\}.$$
Define also
$$A_{UU} = \{UUU, UUD\}, \quad A_{UD} = \{UDU, UDD\}, \quad A_{DU} = \{DUU, DUD\}, \quad A_{DD} = \{DDU, DDD\},$$
corresponding to the events that the first two coin tosses result in $HH, HT, TH, TT$ respectively. This is the information we have at the end of the 2nd trading period $t = 2$ (1 pm). This means at the end of the second trading period we have accumulated increasing information. Hence
$$\mathcal{F}_2 = \{\emptyset, \Omega, A_U, A_D, A_{UU}, A_{UD}, A_{DU}, A_{DD} + \text{all unions of these}\},$$
which can be written as follows:
$$\mathcal{F}_2 = \{\emptyset, \Omega, A_U, A_D, A_{UU}, A_{UD}, A_{DU}, A_{DD}, A_{UU} \cup A_{DU},\, A_{UU} \cup A_{DD},\, A_{UD} \cup A_{DU},\, A_{UD} \cup A_{DD},\, A_{UU}^c, A_{UD}^c, A_{DU}^c, A_{DD}^c\}.$$
We see
$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \mathcal{F}_2.$$
Then $\mathcal{F}_2$ is a $\sigma$-algebra which contains the "information of the first two tosses" or "the information up to time 2". This is because, if you know the outcome of the first two tosses, you can say whether the outcome $\omega$ of all three tosses satisfies $\omega \in A$ or $\omega \notin A$ for each $A \in \mathcal{F}_2$.
Similarly, $\mathcal{F}_3 = \mathcal{F}$, the set of all subsets of $\Omega$, contains full information about the outcome of all three tosses. The sequence of increasing $\sigma$-algebras $\mathbb{F} = \{\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \mathcal{F}_3\}$ is a filtration.
Adapted Process A stochastic process $S_t$ is said to be adapted to the filtration $\mathcal{F}_t$ (or $\mathcal{F}_t$-measurable, or $\mathcal{F}_t$-adapted) if the value of $S$ at time $t$ is known given the information set $\mathcal{F}_t$.
We place a probability measure $\mathbb{P}$ on $(\Omega, \mathcal{F})$. $\mathbb{P}$ is a special type of "function", called a measure, which assigns probabilities to subsets (i.e. the outcomes); the theory also comes from Measure Theory. Whereas cumulative distribution functions (CDFs) are defined on intervals such as $\mathbb{R}$, probability measures are defined on general sets, giving greater power, generalisation and flexibility. A probability measure $\mathbb{P}$ is a function mapping $\mathbb{P} : \mathcal{F} \to [0, 1]$ with the properties
(i) $\mathbb{P}(\Omega) = 1$;
(ii) if $A_1, A_2, \ldots$ is a sequence of disjoint sets in $\mathcal{F}$, then
$$\mathbb{P}\left( \bigcup_{k=1}^{\infty} A_k \right) = \sum_{k=1}^{\infty} \mathbb{P}(A_k).$$
Example Recall the usual coin toss game with the earlier defined results. As the outcomes are equiprobable, the probability measure is defined as $\mathbb{P}(\omega_1) = \tfrac{1}{2} = \mathbb{P}(\omega_2)$.
The interpretation is that for a set $A \in \mathcal{F}$ there is a probability in $[0, 1]$ that the outcome of a random experiment will lie in the set $A$. We think of $\mathbb{P}(A)$ as this probability. The set $A \in \mathcal{F}$ is called an event. For $A \in \mathcal{F}$ we can define
$$\mathbb{P}(A) := \sum_{\omega \in A} \mathbb{P}(\omega), \tag{*}$$
as $A$ has finitely many elements. Let the probability of $H$ on each coin toss be $p \in (0, 1)$, so that the probability of $T$ is $q = 1 - p$. For each $\omega = (\omega_1, \omega_2, \ldots, \omega_n) \in \Omega$ we define
$$\mathbb{P}(\omega) := p^{\text{Number of } H \text{ in } \omega}\; q^{\text{Number of } T \text{ in } \omega}.$$
Then for each $A \in \mathcal{F}$ we define $\mathbb{P}(A)$ according to (*).
In the finite coin toss space, for each $t \in \mathbb{T}$ let $\mathcal{F}_t$ be the $\sigma$-algebra generated by the first $t$ coin tosses. This is a $\sigma$-algebra which encapsulates the information one has if one observes the outcome of the first $t$ coin tosses (but not the full outcome $\omega$ of all $n$ coin tosses). Then $\mathcal{F}_t$ is composed of all the sets $A$ such that $\mathcal{F}_t$ is indeed a $\sigma$-algebra, and such that if the outcome of the first $t$ coin tosses is known, then we can say whether $\omega \in A$ or $\omega \notin A$, for each $A \in \mathcal{F}_t$. The increasing sequence of $\sigma$-algebras $(\mathcal{F}_t)_{t \in \mathbb{T}}$ is an example of a filtration. We use this notation when working in continuous time, where we will write $(\mathcal{F}_t)_{t \in [0,T]}$.
If we were developing a more rigorous, measure-theoretic approach, then working with structures such as $\sigma$-algebras would become more important; we do not need to worry too much about this in our financial mathematics setting.
We can compute the probability of any event. For instance,
$$\mathbb{P}(A_U) = \mathbb{P}(H \text{ on first toss}) = \mathbb{P}\{UUU, UUD, UDU, UDD\} = p^3 + 2p^2 q + pq^2 = p,$$
and similarly $\mathbb{P}(A_D) = q$. This agrees with the mathematics and our intuition.
Explanation of probability measure: If the number of basic events is very large we may prefer to think of a continuous probability distribution. As the number of discrete events tends to infinity, the probability of any individual event usually tends to zero. In terms of random variables, the probability that the random variable $X$ takes a given value tends to zero.
So the individual probabilities $p_i$ are no longer useful. Instead we have a probability density function $p(x)$ with the property that
$$\Pr(x \le X \le x + dx) = p(x)\,dx$$
for any infinitesimal interval of length $dx$ (think of this as a limiting process starting with a small interval whose length tends to zero). It is called a density because it is the probability of finding $X$ on an interval of length $dx$ divided by the length of the interval. Recall that the following are analogous:
$$\int_{-\infty}^{\infty} p(x)\,dx = 1 \qquad \longleftrightarrow \qquad \sum_i p_i = 1.$$
The (cumulative) distribution function of a random variable is defined by
$$P(x) = \Pr(X \le x).$$
It is an increasing function of $x$ with $P(-\infty) = 0$ and $P(\infty) = 1$; note that $0 \le P(x) \le 1$. It is related to the density function by
$$p(x) = \frac{dP(x)}{dx},$$
provided that $P(x)$ is differentiable. Unlike $P(x)$, $p(x)$ may be unbounded or have singularities such as delta functions.
$\mathbb{P}$ is the probability measure, a special type of "function", called a measure, assigning probabilities to subsets (i.e. the outcomes); the mathematics emanates from Measure Theory. Probability measures are similar to cumulative distribution functions (CDFs); the chief difference is that where CDFs are defined on intervals (e.g. $\mathbb{R}$), probability measures are defined on general sets. We are now concerned with mapping subsets onto $[0, 1]$. The following definition of the expectation has been used:
$$\mathbb{E}[h(X)] = \int_{\mathbb{R}} h(x)\,p(x)\,dx = \int_{\mathbb{R}} h(x)\,dP(x).$$
We now write this as a Lebesgue integral with respect to the measure $\mathbb{P}$:
$$\mathbb{E}^{\mathbb{P}}[h(X(\omega))] = \int_{\Omega} h(\omega)\,\mathbb{P}(d\omega).$$
So integration is now done over the sample space (and not over intervals).
If $\{W_t : t \in [0, T]\}$ is a Brownian motion, or for any general stochastic process $\{S_n : n = 0, \ldots, N\}$, the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ has $\Omega$ the set of all paths (continuous functions), and $\mathbb{P}$ gives the probability of each path.
There is a very powerful relation between expectations and probabilities. In our formula for the expectation, choose $h(X)$ to be the indicator function $\mathbf{1}_{x \in A}$ for a subset $A$, defined
$$\mathbf{1}_{x \in A} = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$$
i.e. when we are in $A$, the indicator function returns 1.
The expectation of the indicator function of an event is the probability associated with this event:
$$\mathbb{E}[\mathbf{1}_{X \in A}] = \int_{\Omega} \mathbf{1}_{x \in A}\,d\mathbb{P} = \int_{A} d\mathbb{P} + \int_{\Omega \setminus A} 0\,d\mathbb{P} = \int_{A} d\mathbb{P} = \mathbb{P}(A),$$
which is simply the probability that the outcome $X \in A$.
Conditional Expectations
What makes a conditional expectation different (from an unconditional one) is information (just as in the case of conditional probability). In our probability space $(\Omega, \mathcal{F}, \mathbb{P})$, information is represented by the filtration $\mathcal{F}$; hence a conditional expectation with respect to the (usual information) filtration seems a natural choice.
$$Y = \mathbb{E}[X \mid \mathcal{F}]$$
is the expected value of the random variable conditional upon the filtration set $\mathcal{F}$. In general, $Y$ will be a random variable, and $Y$ will be adapted to the filtration $\mathcal{F}$.
Conditional expectations have the following useful properties. If $X, Y$ are integrable random variables and $a, b$ are constants then:
1. Linearity:
$$\mathbb{E}[aX + bY \mid \mathcal{F}] = a\,\mathbb{E}[X \mid \mathcal{F}] + b\,\mathbb{E}[Y \mid \mathcal{F}].$$
2. Tower Property (i.e. Iterated Expectations): if $\mathcal{G} \subseteq \mathcal{F}$,
$$\mathbb{E}\big[ \mathbb{E}[X \mid \mathcal{F}] \mid \mathcal{G} \big] = \mathbb{E}[X \mid \mathcal{G}].$$
This property states that if taking iterated expectations with respect to several levels of information, we may as well take a single expectation subject to the smallest set of available information. The special case is
$$\mathbb{E}\big[ \mathbb{E}[X \mid \mathcal{F}] \big] = \mathbb{E}[X].$$
3. Taking out what is known: if $X$ is $\mathcal{F}$-measurable, then the value of $X$ is known once we know $\mathcal{F}$. Therefore
$$\mathbb{E}[X \mid \mathcal{F}] = X,$$
and hence by extension, if $X$ is $\mathcal{F}$-measurable but $Y$ is not, then
$$\mathbb{E}[XY \mid \mathcal{F}] = X\,\mathbb{E}[Y \mid \mathcal{F}].$$
4. Independence: if $X$ is independent of $\mathcal{F}$, then knowing $\mathcal{F}$ is of no use in predicting $X$:
$$\mathbb{E}[X \mid \mathcal{F}] = \mathbb{E}[X].$$
5. Positivity: If $X \ge 0$ then $\mathbb{E}[X \mid \mathcal{F}] \ge 0$.
6. Jensen's inequality: Let $f$ be a convex function; then
$$f\big( \mathbb{E}[X \mid \mathcal{F}] \big) \le \mathbb{E}\big[ f(X) \mid \mathcal{F} \big].$$
Solving the Diffusion Equation
The Heat/Diffusion equation
Consider the equation
$$\frac{\partial u}{\partial t} = c^2 \frac{\partial^2 u}{\partial x^2}$$
for the unknown function $u = u(x, t)$; $c^2$ is a positive constant. The idea is to obtain a solution in terms of Gaussian curves, so that $u(x, t)$ represents a probability density.
We assume a solution of the following form exists:
$$u(x, t) = t^{-1/2} f\!\left( \frac{x}{t^{1/2}} \right)$$
with the non-dimensional variable
$$\xi = \frac{x}{t^{1/2}},$$
which allows us to obtain the following derivatives:
$$\frac{\partial \xi}{\partial x} = t^{-1/2}, \qquad \frac{\partial \xi}{\partial t} = -\tfrac{1}{2}\, x\, t^{-3/2}.$$
We can now say $u(x, t) = t^{-1/2} f(\xi)$, therefore
$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial \xi}\frac{\partial \xi}{\partial x} = t^{-1/2} f'(\xi)\, t^{-1/2} = t^{-1} f'(\xi),$$
$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\left( \frac{\partial u}{\partial x} \right) = \frac{\partial}{\partial x}\left( t^{-1} f'(\xi) \right) = t^{-3/2} f''(\xi),$$
$$\frac{\partial u}{\partial t} = t^{-1/2} \frac{\partial}{\partial t} f(\xi) - \tfrac{1}{2} t^{-3/2} f(\xi) = t^{-1/2}\left( -\tfrac{1}{2}\, x\, t^{-3/2} \right) f'(\xi) - \tfrac{1}{2} t^{-3/2} f(\xi) = -\tfrac{1}{2} t^{-3/2}\, \xi f'(\xi) - \tfrac{1}{2} t^{-3/2} f(\xi).$$
Then substituting
$$\frac{\partial u}{\partial t} = -\tfrac{1}{2} t^{-3/2}\left( \xi f'(\xi) + f(\xi) \right), \qquad \frac{\partial^2 u}{\partial x^2} = t^{-3/2} f''(\xi)$$
gives
$$-\tfrac{1}{2} t^{-3/2}\left( \xi f'(\xi) + f(\xi) \right) = c^2\, t^{-3/2} f''(\xi),$$
simplifying to the ODE
$$-\tfrac{1}{2}\left( f + \xi f' \right) = c^2 f''.$$
We have an exact derivative on the left hand side, i.e. $\dfrac{d}{d\xi}(\xi f) = f + \xi f'$, hence
$$-\tfrac{1}{2} \frac{d}{d\xi}(\xi f) = c^2 f'',$$
and we can integrate once to get
$$-\tfrac{1}{2}\, \xi f = c^2 f' + K.$$
We set $K = 0$ in order to get the correct solution, i.e.
$$-\tfrac{1}{2}\, \xi f = c^2 f',$$
which can be solved as a simple first-order variable-separable equation:
$$f(\xi) = A \exp\left( -\frac{\xi^2}{4c^2} \right).$$
$A$ is a normalizing constant, so write
$$A \int_{\mathbb{R}} \exp\left( -\frac{\xi^2}{4c^2} \right) d\xi = 1.$$
Now substitute $s = \xi/2c$, so $2c\,ds = d\xi$:
$$2cA \underbrace{\int_{\mathbb{R}} \exp\left( -s^2 \right) ds}_{=\sqrt{\pi}} = 1,$$
which gives $A = \dfrac{1}{2c\sqrt{\pi}}$.
Returning to
$$u(x, t) = t^{-1/2} f(\xi),$$
this becomes
$$u(x, t) = \frac{1}{2c\sqrt{\pi t}} \exp\left( -\frac{x^2}{4tc^2} \right).$$
Hence the random variable $x$ is Normally distributed with mean zero and standard deviation $c\sqrt{2t}$.
Applied Stochastic Calculus
Stochastic Process
The evolution of financial assets is random and depends on time. Asset prices are examples of stochastic processes, which are random variables indexed (parameterized) by time.
If the movement of an asset is discrete it is called a random walk. A continuous movement is called a diffusion process. We will consider the asset price dynamics to exhibit continuous behaviour, and each random path traced out is called a realization.
We need a definition and set of properties for the randomness observed in an asset price realization, which will be Brownian Motion.
This is named after the Scottish botanist Robert Brown, who in 1827, while examining grains of pollen of the plant Clarkia pulchella suspended in water under a microscope, observed minute particles, ejected from the pollen grains, executing a continuous fidgety motion. In 1900 Louis Bachelier was the first person to model share price movement using Brownian motion, as part of his PhD. Five years later Einstein used Brownian motion to study diffusions. In the 1920s Norbert Wiener, a mathematician at MIT, provided a mathematical construction of Brownian motion together with numerous results about its properties - in fact he was the first to show that Brownian motion exists and is a well-defined entity! Hence Wiener process is also used as a name for it.
Construction of Brownian Motion and properties
We construct Brownian motion using a simple symmetric random walk. Define a random variable
$$Z_i = \begin{cases} +1 & \text{if } H \\ -1 & \text{if } T \end{cases}$$
and let
$$X_n = \sum_{i=1}^{n} Z_i,$$
which defines the marker's position after the $n$-th toss of the game. This is conditional upon the marker starting at position $X = 0$, so at each time step it moves one unit either to the left or right with equal probability. Each step has
$$\text{mean} = \tfrac{1}{2}(+1) + \tfrac{1}{2}(-1) = 0, \qquad \text{variance} = \tfrac{1}{2}(+1)^2 + \tfrac{1}{2}(-1)^2 = 1,$$
so the distribution of $X_n$ is binomial, and can be approximated by a Normal distribution due to the Central Limit Theorem.
Is there a continuous-time limit to this discrete random walk? Let's introduce time dependency. Take a time period for our walk, say $[0, t]$, and perform $N$ steps. So we partition $[0, t]$ into $N$ time intervals, and each step takes
$$\delta t = t / N.$$
Speed up this random walk by letting $N \to \infty$. The problem is that keeping the original step sizes of $\pm 1$ gives a variance that becomes infinite. We rescale the space step, keeping in mind the central limit theorem. Let
$$Y = \delta_N Z$$
for some $\delta_N$ to be found, and let $\left\{ X_n^N,\; n = 0, \ldots, N \right\}$, with $X_0^N = 0$, be the path/trajectory of the random walk with steps of size $\delta_N$.
Thus we now have
$$\mathbb{E}\left[ X_n^N \right] = 0 \quad \forall n,$$
and
$$\mathbb{V}\left[ X_N^N \right] = \mathbb{E}\left[ \left( X_N^N \right)^2 \right] = N\,\mathbb{E}\left[ Y^2 \right] = N \delta_N^2\, \mathbb{E}\left[ Z^2 \right] = N \delta_N^2 \left( \tfrac{1}{2} + \tfrac{1}{2} \right) = \left( \frac{t}{\delta t} \right) \delta_N^2.$$
Obviously we must have $\delta_N^2 / \delta t = O(1)$. Choosing $\delta_N^2 / \delta t = 1$ gives
$$\mathbb{E}\left[ X^2 \right] = \mathbb{V}[X] = t.$$
As $N \to \infty$, the symmetric random walk $\left\{ X^N_{[tN]},\; t \in [0, \infty) \right\}$ converges to a standard Brownian motion $\{ W_t,\; t \in [0, \infty) \}$, with $W_t \sim N(0, t)$.
With $t = n\,\delta t$ we have
$$\frac{dW_t}{dt} = \lim_{\delta t \to 0} \frac{W_{t+\delta t} - W_t}{\delta t} \to \infty.$$
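The rescaling argument can be checked numerically. A minimal sketch (illustrative, not from the notes): build the walk with space step $\sqrt{\delta t}$ and confirm that the terminal position has mean $\approx 0$ and variance $\approx t$:

```python
import numpy as np

rng = np.random.default_rng(0)
t, N, paths = 1.0, 1_000, 10_000
dt = t / N
# Z_i = +/-1 with equal probability, rescaled by sqrt(dt)
Z = rng.choice([-1.0, 1.0], size=(paths, N)) * np.sqrt(dt)
X = Z.sum(axis=1)              # X_N, approximately W_t
print(X.mean(), X.var())       # approximately 0 and t = 1
```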
Quadratic Variation
Consider a function $f(t)$ on the interval $[0, T]$. Discretising by writing $t_i = i\,dt$ and $dt = T/N$, we can define the variation $V^n$ of $f$ for $n = 1, 2, \ldots$ as
$$V^n[f] = \lim_{N \to \infty} \sum_{i=0}^{N-1} \left| f_{t_{i+1}} - f_{t_i} \right|^n.$$
Of interest is the quadratic variation
$$Q[f] = \lim_{N \to \infty} \sum_{i=0}^{N-1} \left( f_{t_{i+1}} - f_{t_i} \right)^2.$$
If $f(t)$ has more than a finite number of jumps or a singularity then $Q[f] = \infty$.
For a Brownian motion on $[0, T]$ we have
$$Q[W_t] = \lim_{N \to \infty} \sum_{i=0}^{N-1} \left( W_{t_{i+1}} - W_{t_i} \right)^2 = \lim_{N \to \infty} \sum_{i=0}^{N-1} \big( W((i+1)\,dt) - W(i\,dt) \big)^2 = \lim_{N \to \infty} \sum_{i=0}^{N-1} dt = \lim_{N \to \infty} \sum_{i=0}^{N-1} \frac{T}{N} = T.$$
Suppose that $f(t)$ is a differentiable function on $[0, T]$. Then to leading order we have
$$f_{t_{i+1}} - f_{t_i} = f((i+1)\,dt) - f(i\,dt) \approx f'(t_i)\,dt,$$
so
$$Q[f] \approx \lim_{N \to \infty} \sum_{i=0}^{N-1} \left( f'(t_i)\,dt \right)^2 \approx \lim_{N \to \infty} dt \sum_{i=0}^{N-1} \left( f'(t_i) \right)^2 dt \approx \lim_{N \to \infty} \frac{T}{N} \int_{0}^{T} \left( f'(t) \right)^2 dt = 0.$$
The quadratic variation of a differentiable $f(t)$ is zero. This argument remains valid even if $f'(t)$ has a finite number of jump discontinuities. Since $Q[W_t] = T \neq 0$, a Brownian motion $W_t$ cannot be differentiable, even allowing a derivative $W_t'$ with a finite number of jumps: it is continuous but nowhere differentiable. For us the important result is
$$dW_t^2 = dt,$$
or, more precisely, we can write (up to the mean square limit)
$$\mathbb{E}\left[ dW_t^2 \right] = dt.$$
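A short simulation (illustrative, not from the notes) shows the quadratic variation of a Brownian path over $[0, T]$ settling at $T$, while that of a smooth function is of order $dt$:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 2.0, 1_000_000
dt = T / N
tgrid = np.linspace(0.0, T, N + 1)

W = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), N))])
f = np.sin(tgrid)                        # a smooth, differentiable function

qv = lambda g: np.sum(np.diff(g) ** 2)   # sum of squared increments
print(qv(W))   # approximately T = 2
print(qv(f))   # approximately 0 (order dt)
```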
Properties of a Wiener Process/Brownian motion
A stochastic process $\{W_t : t \in \mathbb{R}^+\}$ is defined to be Brownian motion (or a Wiener process) if:
Brownian motion starts at zero, i.e. $W_0 = 0$ with probability one: $\mathbb{P}(W_0 = 0) = 1$.
Continuity - paths of $W_t$ are continuous (no jumps) with probability 1, yet differentiable nowhere.
Brownian motion has independent Gaussian increments, with zero mean and variance equal to the temporal extension of the increment. That is, for each $t > 0$ and $s > 0$, $W_t - W_s$ is normal with mean $0$ and variance $|t - s|$, i.e.
$$W_t - W_s \sim N(0, |t-s|).$$
Coin tosses are Binomial, but for a large number of tosses the Central Limit Theorem gives a distribution that is normal. $W_t - W_s$ has a pdf given by
$$p(x) = \frac{1}{\sqrt{2\pi |t-s|}} \exp\left( -\frac{x^2}{2|t-s|} \right).$$
More specifically, $W_{t+s} - W_t$ is independent of $W_t$. This means that if $0 \le t_0 \le t_1 \le t_2 \le \cdots$ then
$$dW_1 = W_1 - W_0 \text{ is independent of } dW_2 = W_2 - W_1,$$
$$dW_3 = W_3 - W_2 \text{ is independent of } dW_4 = W_4 - W_3,$$
and so on.
The process is also called standard Brownian motion if the above properties hold. More important (in stochastic differential equations) is the result
$$dW = W_{t+dt} - W_t \sim N(0, dt).$$
Brownian motion has stationary increments. A stochastic process $(X_t)_{t \ge 0}$ is said to be stationary if $X_t$ has the same distribution as $X_{t+h}$ for any $h > 0$. This can be checked by defining the increment process $I = (I_t)_{t \ge 0}$ by
$$I_t := W_{t+h} - W_t.$$
Then $I_t \sim N(0, h)$ and $I_{t+h} = W_{t+2h} - W_{t+h} \sim N(0, h)$ have the same distribution. This is equivalent to saying that the process $(W_{t+h} - W_t)_{h \ge 0}$ has the same distribution $\forall t$.
If we want to be a little more pedantic then we can write some of the properties above as
$$W_t \sim N^{\mathbb{P}}(0, t),$$
i.e. $W_t$ is normally distributed under the probability measure $\mathbb{P}$.
The covariance function for a Brownian motion at different times can be calculated as follows. If $t > s$,
$$\mathbb{E}[W_t W_s] = \mathbb{E}\left[ (W_t - W_s)W_s + W_s^2 \right] = \underbrace{\mathbb{E}[W_t - W_s]}_{N(0, |t-s|)}\, \mathbb{E}[W_s] + \mathbb{E}\left[ W_s^2 \right] = (0)\,0 + \mathbb{E}\left[ W_s^2 \right] = s.$$
The factorisation of the first term follows from the independence of increments. Similarly, if $s > t$ then $\mathbb{E}[W_t W_s] = t$, and it follows that
$$\mathbb{E}[W_t W_s] = \min\{t, s\}.$$
Brownian motion is a Martingale. Martingales are very important in finance.
Think back to the way the betting game has been constructed. Martingales are essentially stochastic processes that are meant to capture the concept of a fair game in the setting of a gambling environment, and thus there exists a rich history in the modelling of gambling games. Although this is a key example area for us, they are nevertheless present in numerous application areas of stochastic processes.
Before discussing the Martingale property of Brownian motion formally, some general background information.
A stochastic process $\{X_n : 0 \le n < \infty\}$ is called a $\mathbb{P}$-martingale with respect to the information filtration $\mathcal{F}_n$ and probability distribution $\mathbb{P}$ if the following two properties are satisfied:
$$\text{P1:} \qquad \mathbb{E}^{\mathbb{P}}\left[ |X_n| \right] < \infty \quad \forall n \ge 0$$
$$\text{P2:} \qquad \mathbb{E}^{\mathbb{P}}\left[ X_{n+m} \mid \mathcal{F}_n \right] = X_n \quad \forall n, m \ge 0$$
The first property is simply a technical integrability condition (fine print), i.e. the expected value of the absolute value of $X_n$ must be finite for all $n$. Such a finiteness condition appears whenever integrals defined over $\mathbb{R}$ are used (think back to the properties of the Fourier Transform, for example).
The second property is the one of key importance. This is another expectation result and states that the expected value of $X_{n+m}$ given $\mathcal{F}_n$ is equal to $X_n$ for all non-negative $n$ and $m$.
The symbol $\mathcal{F}_n$ denotes the information set, called a filtration, and is the flow of information associated with a stochastic process. This is simply the information we have in our model at time $n$, recognising that at time $n$ we have already observed all the information $\mathcal{F}_n = \sigma(X_0, X_1, \ldots, X_n)$.
So the expected value at any time in the future is equal to the current value - the information held at this point gives the best forecast. Hence the importance of Martingales in modelling fair games: this property models a fair game, where our expected future payoff is equal to the current wealth.
It is also common to use $t$ to denote time:
$$\mathbb{E}^{\mathbb{P}}\left[ M_T \mid \mathcal{F}_t \right] = M_t, \qquad t < T.$$
Taking expectations of both sides gives
$$\mathbb{E}[M_T] = \mathbb{E}[M_t], \qquad t < T,$$
so martingales have constant mean.
Now replacing the equality in P2 with an inequality, two further important definitions are obtained. A process $M_t$ which has
$$\mathbb{E}^{\mathbb{P}}\left[ M_T \mid \mathcal{F}_t \right] \ge M_t$$
is called a submartingale, and if it has
$$\mathbb{E}^{\mathbb{P}}\left[ M_T \mid \mathcal{F}_t \right] \le M_t$$
it is called a supermartingale.
Using the earlier betting game as an example (where the probability of a win or a loss was $\tfrac{1}{2}$):
submartingale - the gambler wins money on average, $\mathbb{P}(H) > \tfrac{1}{2}$;
supermartingale - the gambler loses money on average, $\mathbb{P}(H) < \tfrac{1}{2}$.
The above definitions tell us that every martingale is both a submartingale and a supermartingale. Conversely, a process that is both a submartingale and a supermartingale is a martingale.
For a Brownian motion, again where $t < T$,
$$\mathbb{E}^{\mathbb{P}}_t[W_T] = \mathbb{E}^{\mathbb{P}}_t[W_T - W_t + W_t] = \underbrace{\mathbb{E}^{\mathbb{P}}_t[W_T - W_t]}_{N(0, |T-t|)} + \mathbb{E}^{\mathbb{P}}_t[W_t].$$
The next step is important - and requires a little subtlety.
The first term is zero. We are taking expectations at time $t$, hence $W_t$ is known, i.e. $\mathbb{E}^{\mathbb{P}}_t[W_t] = W_t$. So
$$\mathbb{E}^{\mathbb{P}}_t[W_T] = W_t.$$
Another important property of Brownian motion is that it is a Markov process. That is, if you observe the path of the Brownian motion from $0$ to $t$ and want to estimate $W_T$, where $T > t$, then the only relevant information for predicting the future dynamics is the value of $W_t$. That is, the past history is fully reflected in the present value. So the conditional distribution of $W_T$, given the path up to $t < T$, depends only on what we know at $t$ (the latest information).
A Markov process is also called memoryless, as it is a stochastic process in which the distribution of future states depends only on the present state and not on how it arrived there. "It doesn't matter how you arrived at your destination."
Let us look at an example. Consider the earlier random walk $S_n$ given by
$$S_n = \sum_{i=1}^{n} X_i,$$
which defined the winnings after $n$ flips of the coin. The $X_i$'s are IID with mean $\mu$. Now define
$$M_n = S_n - n\mu.$$
We will demonstrate that $M_n$ is a Martingale.
Start by writing
$$\mathbb{E}_n[M_{n+m} \mid \mathcal{F}_n] = \mathbb{E}_n[S_{n+m} - (n+m)\mu].$$
So this is an expectation conditional on information at time $n$. Now work on the right-hand side:
$$= \mathbb{E}_n\left[ \sum_{i=1}^{n+m} X_i \right] - (n+m)\mu = \mathbb{E}_n\left[ \sum_{i=1}^{n} X_i + \sum_{i=n+1}^{n+m} X_i \right] - (n+m)\mu$$
$$= \sum_{i=1}^{n} X_i + \mathbb{E}_n\left[ \sum_{i=n+1}^{n+m} X_i \right] - (n+m)\mu = \sum_{i=1}^{n} X_i + m\,\mathbb{E}_n[X_i] - (n+m)\mu = \sum_{i=1}^{n} X_i + m\mu - (n+m)\mu$$
$$= \sum_{i=1}^{n} X_i - n\mu = S_n - n\mu,$$
i.e. $\mathbb{E}_n[M_{n+m}] = M_n$.
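A quick simulation (illustrative, not from the notes) of $M_n = S_n - n\mu$, with the $X_i$ taking the values $0$ and $2$ with equal probability (so $\mu = 1$), confirms the constant-mean property of a martingale:

```python
import numpy as np

rng = np.random.default_rng(2)
paths, n = 100_000, 50
X = rng.choice([0.0, 2.0], size=(paths, n))   # IID steps with mean mu = 1
S = X.cumsum(axis=1)                           # S_n = X_1 + ... + X_n
M = S - np.arange(1, n + 1)                    # M_n = S_n - n * mu
print(M.mean(axis=0)[[0, 9, 49]])              # all approximately 0
```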
Functions of a stochastic variable and Stochastic Differential Equations
In continuous-time models, changes are (infinitesimally) small. Calculus is used to analyse small changes, hence we need an extension of 'ordinary' deterministic calculus to variables governed by a diffusion process.
Start by recalling a Taylor series expansion, i.e. Taylor's theorem: let $f(x)$ be a sufficiently differentiable function of $x$; for small $\delta x$,
$$f(x + \delta x) = f(x) + f'(x)\,\delta x + \tfrac{1}{2} f''(x)\,\delta x^2 + O\left( \delta x^3 \right).$$
So we are approximating using the tangent or a quadratic. The infinitesimal version is
$$df = f'(x)\,dx,$$
where we have defined
$$df = f(x + \delta x) - f(x),$$
with $\delta x \ll 1$; hence $\delta x^2 \ll \delta x$, and
$$df \approx \frac{df}{dx}\,\delta x + \cdots$$
How does this work for functions of a stochastic variable?
Suppose that $x = W(t)$ is Brownian motion, so $f = f(W)$:
$$df \approx \frac{df}{dW}\,dW + \tfrac{1}{2} \frac{d^2 f}{dW^2} (dW)^2 + \cdots \approx \frac{df}{dW}\,dW + \tfrac{1}{2} \frac{d^2 f}{dW^2}\,dt + \cdots$$
This is the most basic version of Itô's lemma: for a function of a Wiener process (or Brownian motion) $W(t)$ or $W_t$, it is given by
$$df = \frac{df}{dW}\,dW + \tfrac{1}{2} \frac{d^2 f}{dW^2}\,dt.$$
Now consider a simple example, $f = W^2$. Then
$$d\left( W^2 \right) = 2W\,dW + \tfrac{1}{2}(2)\,dt = 2W\,dW + dt,$$
which is a consequence of Brownian motion and stochastic calculus. In normal calculus the $+dt$ term would not be present.
More generally, suppose $F = F(t, W)$ is a function of time and Brownian motion. Then Taylor's theorem is
$$dF(t, W) = \frac{\partial F}{\partial t}\,dt + \frac{\partial F}{\partial W}\,dW + \tfrac{1}{2} \frac{\partial^2 F}{\partial W^2}\,(dW)^2 + O(dW)^3,$$
where we know $dW^2 = dt$, so Itô's lemma becomes
$$dF(t, W) = \left( \frac{\partial F}{\partial t} + \tfrac{1}{2} \frac{\partial^2 F}{\partial W^2} \right) dt + \frac{\partial F}{\partial W}\,dW.$$
Two important examples of Itô's lemma are:
$f(W(t)) = \log W(t)$, for which Itô gives
$$d \log W(t) = \frac{dW}{W} - \frac{dt}{2W^2};$$
$g(W(t)) = e^{W(t)}$, for which Itô implies
$$de^{W(t)} = e^{W(t)}\,dW + \tfrac{1}{2} e^{W(t)}\,dt.$$
If we write $S = e^{W(t)}$ then this becomes
$$dS = S\,dW + \tfrac{1}{2} S\,dt \qquad \text{or} \qquad \frac{dS}{S} = \tfrac{1}{2}\,dt + dW.$$
Geometric Brownian motion
In the Black-Scholes model for option prices, we denote the (risky) underlying (equity) asset price by $S(t)$ or $S_t$. It is typical to also suppress the $t$ and simply write the stock price as $S$. We model the instantaneous return during time $dt$,
$$\frac{dS}{S} = \frac{dS(t)}{S(t)} = \frac{S(t + dt) - S(t)}{S(t)},$$
as a Normally distributed random variable,
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW,$$
where $\mu\,dt$ is the expected return over $dt$ and $\sigma^2\,dt$ is the variance of returns (about the expected return). We can think of $\mu$ as a measure of the exponential growth of the expected asset price in time, and $\sigma$ is a measure of the size of the random fluctuations about that exponential trend, or a measure of the risk.
If we have
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW,$$
or more conveniently
$$dS = \mu S\,dt + \sigma S\,dW,$$
then, as $dW^2 = dt$,
$$dS^2 = \left( \mu S\,dt + \sigma S\,dW \right)^2 = \sigma^2 S^2\,dW^2 + 2\mu\sigma S^2\,dt\,dW + \mu^2 S^2\,dt^2 = \sigma^2 S^2\,dt + \cdots$$
In the limit $dt \to 0$,
$$dS^2 = \sigma^2 S^2\,dt.$$
This leads to Itô's lemma for Geometric Brownian motion (GBM).
If $V = V(t, S)$ is a function of $S$ and $t$, then Taylor's theorem states
$$dV = \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS + \tfrac{1}{2} \frac{\partial^2 V}{\partial S^2}\,dS^2,$$
so if $S$ follows GBM,
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW,$$
then $dS^2 = \sigma^2 S^2\,dt$ and we obtain Itô's lemma for Geometric Brownian motion:
$$dV = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \tfrac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S \frac{\partial V}{\partial S}\,dW,$$
where the partial derivatives are evaluated at $S$ and $t$.
If $V = V(S)$ then we obtain the shortened version of Itô:
$$dV = \left( \mu S \frac{dV}{dS} + \tfrac{1}{2} \sigma^2 S^2 \frac{d^2 V}{dS^2} \right) dt + \sigma S \frac{dV}{dS}\,dW.$$
Following on from the earlier example $S(t) = e^{W(t)}$, for which
$$dS = \tfrac{1}{2} S\,dt + S\,dW,$$
we find that we can solve the SDE
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW.$$
If we put $S(t) = A e^{at + bW(t)}$ then from the earlier form of Itô's lemma we have
$$dS = \left( aS + \tfrac{1}{2} b^2 S \right) dt + bS\,dW, \qquad \text{or} \qquad \frac{dS}{S} = \left( a + \tfrac{1}{2} b^2 \right) dt + b\,dW;$$
comparing with
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW$$
gives
$$b = \sigma, \qquad a = \mu - \tfrac{1}{2}\sigma^2.$$
Another way to arrive at the same result is to use Itô for GBM. Using $f(S) = \log S(t)$ with
$$df = \left( \mu S \frac{\partial f}{\partial S} + \tfrac{1}{2} \sigma^2 S^2 \frac{\partial^2 f}{\partial S^2} \right) dt + \sigma S \frac{\partial f}{\partial S}\,dW$$
gives
$$d(\log S) = \left( \mu S \frac{\partial}{\partial S}(\log S) + \tfrac{1}{2} \sigma^2 S^2 \frac{\partial^2}{\partial S^2}(\log S) \right) dt + \sigma S \frac{\partial}{\partial S}(\log S)\,dW = \left( \mu - \tfrac{1}{2}\sigma^2 \right) dt + \sigma\,dW,$$
and hence
$$\int_0^t d(\log S(\tau)) = \int_0^t \left( \mu - \tfrac{1}{2}\sigma^2 \right) d\tau + \int_0^t \sigma\,dW(\tau),$$
$$\log \frac{S(t)}{S(0)} = \left( \mu - \tfrac{1}{2}\sigma^2 \right) t + \sigma W(t).$$
Taking exponentials and rearranging gives the earlier result. We have also used $W(0) = 0$.
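Because the solution is in closed form, $S(t)$ can be simulated exactly, with no time-stepping. A minimal sketch (illustrative parameters, not from the notes), checking the lognormal mean $\mathbb{E}[S(t)] = S(0)e^{\mu t}$:

```python
import numpy as np

rng = np.random.default_rng(3)
S0, mu, sigma, t, paths = 100.0, 0.08, 0.25, 1.0, 1_000_000
W = rng.normal(0.0, np.sqrt(t), paths)                  # W(t) ~ N(0, t)
S = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)  # exact solution
print(S.mean(), S0 * np.exp(mu * t))                    # both approx 108.33
```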
Itô multiplication table:
$$dt \cdot dt = 0, \qquad dt \cdot dW = 0, \qquad dW \cdot dt = 0, \qquad dW \cdot dW = dt.$$
Exercise: Consider the Itô integral of the form
$$\int_0^T f(t, W(t))\,dW(t) = \lim_{N \to \infty} \sum_{i=0}^{N-1} f(t_i, W_i)\left( W_{i+1} - W_i \right).$$
The interval $[0, T]$ is divided into $N$ partitions with end points
$$t_0 = 0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = T,$$
where the length of an interval $t_{i+1} - t_i$ tends to zero as $N \to \infty$.
We know from Itô's lemma that
$$4 \int_0^T W^3(t)\,dW(t) = W^4(T) - W^4(0) - 6 \int_0^T W^2(t)\,dt.$$
Show from the definition of the Itô integral that the result can also be found by initially writing the integral as
$$4 \int_0^T W^3\,dW = \lim_{N \to \infty} 4 \sum_{i=0}^{N-1} W_i^3 \left( W_{i+1} - W_i \right).$$
Hint: use $4b^3(a - b) = a^4 - b^4 - 4b(a - b)^3 - 6b^2(a - b)^2 - (a - b)^4$.
Diffusion Process
$G$ is called a diffusion process if
$$dG(t) = A(G, t)\,dt + B(G, t)\,dW(t). \tag{1}$$
This is also an example of a Stochastic Differential Equation (SDE) for the process $G$ and consists of two components:
1. $A(G, t)\,dt$ is deterministic - the coefficient of $dt$ is known as the drift of the process.
2. $B(G, t)\,dW$ is random - the coefficient of $dW$ is known as the diffusion or volatility of the process.
We say $G$ evolves according to (or follows) this process.
For example,
$$dG(t) = (G(t) + G(t-1))\,dt + dW(t)$$
is not a diffusion (although it is a SDE), because the drift depends on the past value $G(t-1)$.
$A \equiv 0$ and $B \equiv 1$ reverts the process back to Brownian motion.
The process is called time-homogeneous if $A$ and $B$ do not depend on $t$. Note that
$$dG^2 = B^2\,dt.$$
We say (1) is a SDE for the process $G$, or a Random Walk for $dG$.
The diffusion (1) can be written in integral form as
$$G(t) = G(0) + \int_0^t A(G, \tau)\,d\tau + \int_0^t B(G, \tau)\,dW(\tau).$$
Remark: A diffusion $G$ is a Markov process: once the present state $G(t) = g$ is given, the past $\{G(\tau) : \tau < t\}$ is irrelevant to the future dynamics.
We have seen that Brownian motion can take negative values, so its direct use for modelling stock prices is unsuitable. Instead a non-negative variation of Brownian motion, called geometric Brownian motion (GBM), is used.
If, for example, we have a diffusion $G(t)$ with
$$dG = \mu G\,dt + \sigma G\,dW, \tag{2}$$
then the drift is $A(G, t) = \mu G$ and the diffusion is $B(G, t) = \sigma G$. The process (2) is also called Geometric Brownian Motion (GBM).
Brownian motion $W(t)$ is used as a basis for a wide variety of models. Consider a pricing process $\{S(t) : t \in \mathbb{R}^+\}$: we can model its instantaneous change $dS$ by a SDE
$$dS = a(S, t)\,dt + b(S, t)\,dW. \tag{3}$$
By choosing different coefficients $a$ and $b$ we can obtain various properties for the diffusion process.
A very popular finance model for generating asset prices is the GBM model given by (2). The instantaneous return on a stock $S(t)$ is then the constant-coefficient SDE
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW, \tag{4}$$
where $\mu$ and $\sigma$ are the return's drift and volatility, respectively.
An Extension of Itô's Lemma (2D)
Now suppose we have a function $V = V(S, t)$, where $S$ is a process which evolves according to (4). If $S \to S + dS$, $t \to t + dt$, then a natural question to ask is "what is the jump in $V$?" To answer this we return to Taylor, which gives
$$V(S + dS, t + dt) = V(S, t) + \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS + \tfrac{1}{2} \frac{\partial^2 V}{\partial S^2}\,dS^2 + O\left( dS^3, dt^2 \right).$$
So $S$ follows
$$dS = \mu S\,dt + \sigma S\,dW.$$
Remember that
$$\mathbb{E}(dW) = 0, \qquad dW^2 = dt;$$
we only work to $O(dt)$ - anything smaller we ignore - and we also know that
$$dS^2 = \sigma^2 S^2\,dt.$$
So the change $dV$ when $V(S, t) \to V(S + dS, t + dt)$ is given by
$$dV = \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\left[ \mu S\,dt + \sigma S\,dW \right] + \tfrac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\,dt.$$
Rearranging to have the standard form of a SDE, $dG = a(G, t)\,dt + b(G, t)\,dW$, gives
$$dV = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \tfrac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S \frac{\partial V}{\partial S}\,dW. \tag{5}$$
This is Itô's formula in two dimensions.
Naturally, if $V = V(S)$ then (5) simplifies to the shorter version
$$dV = \left( \mu S \frac{dV}{dS} + \tfrac{1}{2} \sigma^2 S^2 \frac{d^2 V}{dS^2} \right) dt + \sigma S \frac{dV}{dS}\,dW. \tag{6}$$
Further Examples
In the following cases $S$ evolves according to GBM.
Given $V = t^2 S^3$, obtain the SDE for $V$, i.e. $dV$. We calculate the following terms:
$$\frac{\partial V}{\partial t} = 2tS^3, \qquad \frac{\partial V}{\partial S} = 3t^2 S^2 \;\Rightarrow\; \frac{\partial^2 V}{\partial S^2} = 6t^2 S.$$
We now substitute these into (5) to obtain
$$dV = \left( 2tS^3 + 3\mu t^2 S^3 + 3\sigma^2 S^3 t^2 \right) dt + 3\sigma t^2 S^3\,dW.$$
Now consider the example $V = \exp(tS)$. Again, this is a function of 2 variables. So
$$\frac{\partial V}{\partial t} = S \exp(tS) = SV, \qquad \frac{\partial V}{\partial S} = t \exp(tS) = tV, \qquad \frac{\partial^2 V}{\partial S^2} = t^2 V.$$
Substitute into (5) to get
$$dV = V\left( S + \mu t S + \tfrac{1}{2} \sigma^2 S^2 t^2 \right) dt + \sigma S t V\,dW.$$
It is not usually possible to write the SDE in terms of $V$ alone, but if you can, do so - do not struggle to find a relation if it does not exist. It always works for exponentials.
One more example: $S(t)$ evolves according to GBM and $V = V(S) = S^n$. So use
$$dV = \left( \mu S \frac{dV}{dS} + \tfrac{1}{2} \sigma^2 S^2 \frac{d^2 V}{dS^2} \right) dt + \sigma S \frac{dV}{dS}\,dW,$$
with
$$V'(S) = nS^{n-1} \;\Rightarrow\; V''(S) = n(n-1)S^{n-2}.$$
Therefore Itô gives us
$$dV = \left( \mu S\, nS^{n-1} + \tfrac{1}{2} \sigma^2 S^2\, n(n-1) S^{n-2} \right) dt + \sigma S\, nS^{n-1}\,dW = \left( \mu n S^n + \tfrac{1}{2} \sigma^2 n(n-1) S^n \right) dt + \sigma n S^n\,dW.$$
Now we know $V(S) = S^n$, which allows us to write
$$dV = V\left( \mu n + \tfrac{1}{2} \sigma^2 n(n-1) \right) dt + \sigma n V\,dW,$$
with drift $= V\left( \mu n + \tfrac{1}{2}\sigma^2 n(n-1) \right)$ and diffusion $= \sigma n V$.
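Coefficient manipulations like these are easy to get wrong by hand; a sympy check (not part of the notes) confirms the drift and diffusion just obtained for $V = S^n$:

```python
import sympy as sp

S, mu, sigma, n = sp.symbols('S mu sigma n', positive=True)
V = S**n

# Ito drift and diffusion for V(S) under dS = mu*S dt + sigma*S dW
drift = mu * S * sp.diff(V, S) + sp.Rational(1, 2) * sigma**2 * S**2 * sp.diff(V, S, 2)
diffusion = sigma * S * sp.diff(V, S)

print(sp.simplify(drift - V * (mu * n + sp.Rational(1, 2) * sigma**2 * n * (n - 1))))  # 0
print(sp.simplify(diffusion - sigma * n * V))                                          # 0
```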
Important Cases - Equities and Interest Rates
If we now consider $S$ which follows a lognormal random walk, i.e. $V = \log S$, then substituting into (6) gives
$$d(\log S) = \left( \mu - \tfrac{1}{2}\sigma^2 \right) dt + \sigma\,dW.$$
Integrating both sides over a given time horizon (between $t_0$ and $T$),
$$\int_{t_0}^{T} d(\log S) = \int_{t_0}^{T} \left( \mu - \tfrac{1}{2}\sigma^2 \right) dt + \int_{t_0}^{T} \sigma\,dW \qquad (T > t_0),$$
we obtain
$$\log \frac{S(T)}{S(t_0)} = \left( \mu - \tfrac{1}{2}\sigma^2 \right)(T - t_0) + \sigma\left( W(T) - W(t_0) \right).$$
Assuming at $t_0 = 0$ that $W(0) = 0$ and $S(0) = S_0$, the exact solution becomes
$$S_T = S_0 \exp\left( \left( \mu - \tfrac{1}{2}\sigma^2 \right) T + \sigma\sqrt{T}\,\phi \right), \qquad \phi \sim N(0, 1). \tag{7}$$
(7) is of particular interest when considering the pricing of a simple European option, due to its non-path-dependence. Stock prices cannot become negative, so we allow $S$, a non-dividend-paying stock, to evolve according to the lognormal process given above - and this acts as the starting point for the Black-Scholes framework.
In that framework $\mu$ is replaced by the risk-free interest rate $r$ in (7) with the introduction of the risk-neutral measure; this form is used in particular in the Monte Carlo method for option pricing.
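With $\mu$ replaced by $r$, (7) gives the standard Monte Carlo pricer for a European call in one step. A minimal sketch (illustrative parameters, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(4)
S0, K, r, sigma, T, paths = 100.0, 100.0, 0.05, 0.2, 1.0, 2_000_000
phi = rng.normal(size=paths)                                    # phi ~ N(0, 1)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * phi)
price = np.exp(-r * T) * np.maximum(ST - K, 0.0).mean()         # discounted payoff
print(price)   # approx 10.45, the Black-Scholes value
```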
Interest rates exhibit a variety of dynamics that are distinct from stock prices, requiring the development of specific models to include behaviour such as reversion to equilibrium, boundedness and positivity. Here we consider another important example of a SDE, put forward by Vasicek in 1977. This model has a mean-reverting Ornstein-Uhlenbeck process for the short rate and is used for generating interest rates, given by
$$dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t. \tag{8}$$
So the drift is $\kappa(\theta - r_t)$ and the volatility is given by $\sigma$. $\kappa$ refers to the speed of reversion, or simply the speed; $\theta$ ($= \bar{r}$) denotes the mean rate, and we can rewrite this random walk (8) for $dr_t$ as
$$dr_t = -\kappa(r_t - \bar{r})\,dt + \sigma\,dW_t.$$
The dimensions of $\kappa$ are $1/\text{time}$, hence $1/\kappa$ has the dimensions of time (years). For example, a rate that has speed $\kappa = 3$ takes one third of a year to revert back to the mean, i.e. 4 months; $\kappa = 52$ means $1/\kappa = 1/52$ years, i.e. 1 week to mean-revert (hence very rapid).
By setting $X_t = r_t - \bar{r}$, $X_t$ is a solution of
$$dX_t = -\kappa X_t\,dt + \sigma\,dW_t, \qquad X_0 = r_0 - \bar{r}; \tag{9}$$
hence it follows that $X_t$ is an Ornstein-Uhlenbeck process, and an analytic solution for this equation exists. (9) can be written as $dX_t + \kappa X_t\,dt = \sigma\,dW_t$.
Multiply both sides by an integrating factor $e^{\kappa t}$:
$$e^{\kappa t}\left( dX_t + \kappa X_t\,dt \right) = \sigma e^{\kappa t}\,dW_t \;\Rightarrow\; d\left( e^{\kappa t} X_t \right) = \sigma e^{\kappa t}\,dW_t.$$
Integrating over $[0, t]$ gives
$$\int_0^t d\left( e^{\kappa s} X_s \right) = \sigma \int_0^t e^{\kappa s}\,dW_s,$$
$$e^{\kappa s} X_s \Big|_0^t = \sigma \int_0^t e^{\kappa s}\,dW_s \;\Rightarrow\; e^{\kappa t} X_t - X_0 = \sigma \int_0^t e^{\kappa s}\,dW_s,$$
$$X_t = X_0\, e^{-\kappa t} + \sigma \int_0^t e^{\kappa(s - t)}\,dW_s. \tag{10}$$
By using integration by parts, i.e. $\int v\,du = uv - \int u\,dv$, we can simplify (10). Take
$$u = W_s, \qquad v = e^{\kappa(s-t)} \;\Rightarrow\; dv = \kappa e^{\kappa(s-t)}\,ds.$$
Therefore
$$\int_0^t e^{\kappa(s-t)}\,dW_s = W_t - \kappa \int_0^t e^{\kappa(s-t)}\, W_s\,ds,$$
and we can write (10) as
$$X_t = X_0\, e^{-\kappa t} + \sigma \left( W_t - \kappa \int_0^t e^{\kappa(s-t)}\, W_s\,ds \right),$$
allowing numerical treatment of the integral term.
Leaving the result in the form of (10) allows the calculation of the mean, variance and other moments. Start with the expected value $\mathbb{E}[X_t]$:
$$\mathbb{E}[X_t] = \mathbb{E}\left[ X_0\, e^{-\kappa t} + \sigma \int_0^t e^{\kappa(s-t)}\,dW_s \right] = X_0\, e^{-\kappa t} + \sigma \int_0^t e^{\kappa(s-t)}\, \mathbb{E}[dW_s].$$
Recall that Brownian motion is a Martingale; the Itô integral is a Martingale, hence
$$\mathbb{E}[X_t] = X_0\, e^{-\kappa t}.$$
To calculate the variance we have $\mathbb{V}[X_t] = \mathbb{E}[X_t^2] - \mathbb{E}^2[X_t]$:
$$\mathbb{V}[X_t] = \mathbb{E}\left[ \left( X_0\, e^{-\kappa t} + \sigma \int_0^t e^{\kappa(s-t)}\,dW_s \right)^2 \right] - X_0^2\, e^{-2\kappa t}$$
$$= \mathbb{E}\left[ X_0^2\, e^{-2\kappa t} \right] + \sigma^2\, \mathbb{E}\left[ \left( \int_0^t e^{\kappa(s-t)}\,dW_s \right)^2 \right] + 2\sigma X_0\, e^{-\kappa t}\, \underbrace{\mathbb{E}\left[ \int_0^t e^{\kappa(s-t)}\,dW_s \right]}_{\text{Itô integral, } = 0} - X_0^2\, e^{-2\kappa t}$$
$$= \sigma^2\, \mathbb{E}\left[ \left( \int_0^t e^{\kappa(s-t)}\,dW_s \right)^2 \right].$$
Now use Itô's Isometry:
$$\mathbb{E}\left[ \left( \int_0^t Y_s\,dW_s \right)^2 \right] = \mathbb{E}\left[ \int_0^t Y_s^2\,ds \right].$$
So
$$\mathbb{V}[X_t] = \sigma^2\, \mathbb{E}\left[ \int_0^t e^{2\kappa(s-t)}\,ds \right] = \frac{\sigma^2}{2\kappa}\, e^{2\kappa(s-t)} \Big|_0^t = \frac{\sigma^2}{2\kappa}\left( 1 - e^{-2\kappa t} \right).$$
Returning to the integral in (10),
$$\int_0^t e^{\kappa(s-t)}\,dW_s,$$
let's use the stochastic integral formula to verify the result. Recall
$$\int_0^t \frac{\partial f}{\partial W}\,dW = f(t, W_t) - f(0, W_0) - \int_0^t \left( \frac{\partial f}{\partial s} + \tfrac{1}{2} \frac{\partial^2 f}{\partial W^2} \right) ds,$$
so with
$$\frac{\partial f}{\partial W} = e^{\kappa(s-t)} \;\Rightarrow\; f = e^{\kappa(s-t)}\, W_s, \qquad \frac{\partial f}{\partial s} = \kappa e^{\kappa(s-t)}\, W_s, \qquad \frac{\partial^2 f}{\partial W^2} = 0,$$
$$\int_0^t e^{\kappa(s-t)}\,dW_s = W_t - 0 - \int_0^t \left( \kappa e^{\kappa(s-t)}\, W_s + \tfrac{1}{2}\cdot 0 \right) ds = W_t - \kappa \int_0^t e^{\kappa(s-t)}\, W_s\,ds.$$
We have used an integrating factor to obtain a solution of the Ornstein-Uhlenbeck process. Let's look at $d(e^{\kappa t} U_t)$ by using Itô. Consider a function $V(t, U_t)$ where $dU_t = -\kappa U_t\,dt + \sigma\,dW_t$; then
$$dV = \left( \frac{\partial V}{\partial t} - \kappa U \frac{\partial V}{\partial U} + \tfrac{1}{2} \sigma^2 \frac{\partial^2 V}{\partial U^2} \right) dt + \sigma \frac{\partial V}{\partial U}\,dW,$$
$$d\left( e^{\kappa t} U \right) = \left( \frac{\partial}{\partial t}\left( e^{\kappa t} U \right) - \kappa U \frac{\partial}{\partial U}\left( e^{\kappa t} U \right) + \tfrac{1}{2} \sigma^2 \frac{\partial^2}{\partial U^2}\left( e^{\kappa t} U \right) \right) dt + \sigma \frac{\partial}{\partial U}\left( e^{\kappa t} U \right) dW$$
$$= \left( \kappa e^{\kappa t} U - \kappa U e^{\kappa t} \right) dt + \sigma e^{\kappa t}\,dW = \sigma e^{\kappa t}\,dW.$$
Example: The Ornstein-Uhlenbeck process satisfies the spot rate SDE given by
$$dX_t = \kappa(\theta - X_t)\,dt + \sigma\,dW_t, \qquad X_0 = x,$$
where $\kappa, \theta$ and $\sigma$ are constants. Solve this SDE by setting $Y_t = e^{\kappa t} X_t$ and using Itô's lemma to show that
$$X_t = \theta + (x - \theta)e^{-\kappa t} + \sigma \int_0^t e^{-\kappa(t - s)}\,dW_s.$$
First write Itô for $Y_t$, given $dX_t = A(X_t, t)\,dt + B(X_t, t)\,dW_t$:
$$dY_t = \left( \frac{\partial Y_t}{\partial t} + A(X_t, t)\frac{\partial Y_t}{\partial X_t} + \tfrac{1}{2} B^2(X_t, t)\frac{\partial^2 Y_t}{\partial X_t^2} \right) dt + B(X_t, t)\frac{\partial Y_t}{\partial X_t}\,dW_t$$
$$= \left( \frac{\partial Y_t}{\partial t} + \kappa(\theta - X_t)\frac{\partial Y_t}{\partial X_t} + \tfrac{1}{2}\sigma^2 \frac{\partial^2 Y_t}{\partial X_t^2} \right) dt + \sigma \frac{\partial Y_t}{\partial X_t}\,dW_t,$$
with
$$\frac{\partial Y_t}{\partial t} = \kappa e^{\kappa t} X_t, \qquad \frac{\partial Y_t}{\partial X_t} = e^{\kappa t}, \qquad \frac{\partial^2 Y_t}{\partial X_t^2} = 0.$$
So
$$d\left( e^{\kappa t} X_t \right) = \left( \kappa e^{\kappa t} X_t + \kappa(\theta - X_t)e^{\kappa t} \right) dt + \sigma e^{\kappa t}\,dW_t = \kappa\theta\, e^{\kappa t}\,dt + \sigma e^{\kappa t}\,dW_t.$$
Integrating,
$$\int_0^t d\left( e^{\kappa s} X_s \right) = \kappa\theta \int_0^t e^{\kappa s}\,ds + \sigma \int_0^t e^{\kappa s}\,dW_s,$$
$$e^{\kappa t} X_t - x = \theta\left( e^{\kappa t} - 1 \right) + \sigma \int_0^t e^{\kappa s}\,dW_s,$$
$$X_t = x e^{-\kappa t} + \theta - \theta e^{-\kappa t} + \sigma e^{-\kappa t} \int_0^t e^{\kappa s}\,dW_s = \theta + (x - \theta)e^{-\kappa t} + \sigma \int_0^t e^{-\kappa(t - s)}\,dW_s.$$
Consider
$$dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t,$$
and show by suitable integration that for $s < t$
$$r_t = r_s\, e^{-\kappa(t-s)} + \theta\left( 1 - e^{-\kappa(t-s)} \right) + \sigma \int_s^t e^{-\kappa(t-u)}\,dW_u.$$
The lower limit gives us an initial condition at time $s < t$. Expand $d(e^{\kappa t} r_t)$:
$$d\left( e^{\kappa t} r_t \right) = \kappa e^{\kappa t} r_t\,dt + e^{\kappa t}\,dr_t = e^{\kappa t}\left( \kappa\theta\,dt + \sigma\,dW_t \right).$$
Now integrate both sides over $[s, t]$, to give for each $s < t$
$$\int_s^t d\left( e^{\kappa u} r_u \right) = \kappa\theta \int_s^t e^{\kappa u}\,du + \sigma \int_s^t e^{\kappa u}\,dW_u,$$
$$e^{\kappa t} r_t - e^{\kappa s} r_s = \theta\left( e^{\kappa t} - e^{\kappa s} \right) + \sigma \int_s^t e^{\kappa u}\,dW_u;$$
rearranging and dividing through by $e^{\kappa t}$,
$$r_t = e^{-\kappa(t-s)}\, r_s + \theta\left( 1 - e^{-\kappa(t-s)} \right) + \sigma \int_s^t e^{-\kappa(t-u)}\,dW_u,$$
so that $r_t$ conditional upon $r_s$ is normally distributed with mean and variance given by
$$\mathbb{E}[r_t \mid r_s] = e^{-\kappa(t-s)}\, r_s + \theta\left( 1 - e^{-\kappa(t-s)} \right), \qquad \mathbb{V}[r_t \mid r_s] = \frac{\sigma^2}{2\kappa}\left( 1 - e^{-2\kappa(t-s)} \right).$$
We note that as $t \to \infty$, the mean and variance become, in turn,
$$\mathbb{E}[r_t \mid r_s] = \theta, \qquad \mathbb{V}[r_t \mid r_s] = \frac{\sigma^2}{2\kappa}.$$
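An Euler-Maruyama simulation of the Vasicek dynamics (a sketch with illustrative parameters, not from the notes) reproduces the conditional mean and variance just derived:

```python
import numpy as np

rng = np.random.default_rng(5)
kappa, theta, sigma, r0 = 3.0, 0.04, 0.01, 0.06
T, N, paths = 2.0, 1_000, 100_000
dt = T / N

r = np.full(paths, r0)
for _ in range(N):   # dr = kappa*(theta - r) dt + sigma dW
    r += kappa * (theta - r) * dt + sigma * np.sqrt(dt) * rng.normal(size=paths)

print(r.mean(), theta + (r0 - theta) * np.exp(-kappa * T))              # mean
print(r.var(), sigma**2 / (2 * kappa) * (1 - np.exp(-2 * kappa * T)))   # variance
```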
Example: Given $U = \log Y$, where $Y$ satisfies the diffusion process
$$dY = \frac{1}{2Y}\,dt + dW, \qquad Y(0) = Y_0,$$
use Itô's lemma to find the SDE satisfied by $U$.
Since $U = U(Y)$ with $dY = a(Y, t)\,dt + b(Y, t)\,dW$, we can write
$$dU = \left( a(Y, t)\frac{dU}{dY} + \tfrac{1}{2}\, b^2(Y, t)\frac{d^2 U}{dY^2} \right) dt + b(Y, t)\frac{dU}{dY}\,dW.$$
Now $U = \log Y$, so
$$\frac{dU}{dY} = \frac{1}{Y}, \qquad \frac{d^2 U}{dY^2} = -\frac{1}{Y^2},$$
and substituting in,
$$dU = \left( \frac{1}{2Y}\cdot\frac{1}{Y} + \tfrac{1}{2}(1)^2 \left( -\frac{1}{Y^2} \right) \right) dt + \frac{1}{Y}\,dW = \frac{1}{Y}\,dW,$$
i.e.
$$dU = e^{-U}\,dW.$$
Example: Consider the stochastic volatility model
$$d\sqrt{v} = \alpha\left( \beta - \sqrt{v} \right) dt + \gamma\,dW,$$
where $v$ is the variance. Show that
$$dv = \left( \gamma^2 + 2\alpha\beta\sqrt{v} - 2\alpha v \right) dt + 2\gamma\sqrt{v}\,dW.$$
Setting the variable $X = \sqrt{v}$ gives
$$dX = \underbrace{\alpha(\beta - X)}_{A}\,dt + \underbrace{\gamma}_{B}\,dW.$$
We now require a SDE for $dY$, where $Y = X^2$. So $dv =$
$$dY = \left( A\frac{dY}{dX} + \tfrac{1}{2} B^2 \frac{d^2 Y}{dX^2} \right) dt + B\frac{dY}{dX}\,dW = \left( \alpha(\beta - X)(2X) + \tfrac{1}{2}\gamma^2\cdot 2 \right) dt + 2\gamma X\,dW$$
$$= \left( 2\alpha\beta X - 2\alpha X^2 + \gamma^2 \right) dt + 2\gamma\sqrt{v}\,dW = \left( \gamma^2 + 2\alpha\beta\sqrt{v} - 2\alpha v \right) dt + 2\gamma\sqrt{v}\,dW.$$
(Harder) Example: Consider the dynamics of a non-traded asset $S_t$ given by
$$\frac{dS_t}{S_t} = \kappa(\theta - \log S_t)\,dt + \sigma\,dW_t,$$
where the constants $\kappa, \sigma > 0$. If $T > t$, show that
$$\log S_T = e^{-\kappa(T-t)} \log S_t + \left( \theta - \frac{\sigma^2}{2\kappa} \right)\left( 1 - e^{-\kappa(T-t)} \right) + \sigma \int_t^T e^{-\kappa(T-s)}\,dW_s.$$
Hence show that
$$\log S_T \sim N\!\left( e^{-\kappa(T-t)} \log S_t + \left( \theta - \frac{\sigma^2}{2\kappa} \right)\left( 1 - e^{-\kappa(T-t)} \right),\; \frac{\sigma^2\left( 1 - e^{-2\kappa(T-t)} \right)}{2\kappa} \right).$$
Writing Itô for the SDE, where $f = f(S_t)$, gives
$$df = \left( \kappa(\theta - \log S_t) S_t \frac{df}{dS} + \tfrac{1}{2}\sigma^2 S_t^2 \frac{d^2 f}{dS^2} \right) dt + \sigma S_t \frac{df}{dS}\,dW_t.$$
Hence if $f(S_t) = \log S_t$ then
$$d(\log S_t) = \left( \kappa(\theta - \log S_t) - \tfrac{1}{2}\sigma^2 \right) dt + \sigma\,dW_t = \kappa\left( \theta - \frac{\sigma^2}{2\kappa} - \log S_t \right) dt + \sigma\,dW_t = \kappa\left( \hat{\theta} - \log S_t \right) dt + \sigma\,dW_t,$$
where $\hat{\theta} = \theta - \dfrac{\sigma^2}{2\kappa}$.
Going back to
$$df = \kappa(\hat{\theta} - f)\,dt + \sigma\,dW_t,$$
now write $x_t = f - \hat{\theta}$, which gives $dx_t = df$, and we are left with an Ornstein-Uhlenbeck process
$$dx_t = -\kappa x_t\,dt + \sigma\,dW_t.$$
Following the earlier integrating-factor method gives
$$d\left( e^{\kappa t} x_t \right) = \sigma e^{\kappa t}\,dW_t, \qquad \int_t^T d\left( e^{\kappa s} x_s \right) = \sigma \int_t^T e^{\kappa s}\,dW_s,$$
$$x_T = e^{-\kappa(T-t)}\, x_t + \sigma \int_t^T e^{-\kappa(T-s)}\,dW_s.$$
Now replace these terms with the original variables and parameters:
$$\log S_T - \left( \theta - \frac{\sigma^2}{2\kappa} \right) = e^{-\kappa(T-t)}\left( \log S_t - \left( \theta - \frac{\sigma^2}{2\kappa} \right) \right) + \sigma \int_t^T e^{-\kappa(T-s)}\,dW_s,$$
which upon rearranging and factorising gives
$$\log S_T = e^{-\kappa(T-t)} \log S_t + \left( \theta - \frac{\sigma^2}{2\kappa} \right)\left( 1 - e^{-\kappa(T-t)} \right) + \sigma \int_t^T e^{-\kappa(T-s)}\,dW_s.$$
Now consider
$$\mathbb{E}[\log S_T] = e^{-\kappa(T-t)} \log S_t + \left( \theta - \frac{\sigma^2}{2\kappa} \right)\left( 1 - e^{-\kappa(T-t)} \right) + \sigma\,\mathbb{E}\left[ \int_t^T e^{-\kappa(T-s)}\,dW_s \right]$$
$$= e^{-\kappa(T-t)} \log S_t + \left( \theta - \frac{\sigma^2}{2\kappa} \right)\left( 1 - e^{-\kappa(T-t)} \right),$$
since the Itô integral has zero expectation.
Recall $\mathbb{V}[aX + b] = a^2\,\mathbb{V}[X]$. So write
$$\mathbb{V}[\log S_T] = \mathbb{V}\left[ \underbrace{e^{-\kappa(T-t)} \log S_t + \left( \theta - \frac{\sigma^2}{2\kappa} \right)\left( 1 - e^{-\kappa(T-t)} \right)}_{\text{deterministic, variance } 0} + \sigma \int_t^T e^{-\kappa(T-s)}\,dW_s \right]$$
$$= \sigma^2\, \mathbb{V}\left[ \int_t^T e^{-\kappa(T-s)}\,dW_s \right] = \sigma^2\, \mathbb{E}\left[ \left( \int_t^T e^{-\kappa(T-s)}\,dW_s \right)^2 \right],$$
because we have already obtained from the expectation that $\mathbb{E}\left[ \int_t^T e^{-\kappa(T-s)}\,dW_s \right] = 0$.
Now use Itô's Isometry, i.e.
$$\mathbb{E}\left[ \left( \int_0^t Y_s\,dX_s \right)^2 \right] = \mathbb{E}\left[ \int_0^t Y_s^2\,ds \right]:$$
$$\mathbb{V}[\log S_T] = \sigma^2\, \mathbb{E}\left[ \left( \int_t^T e^{-\kappa(T-s)}\,dW_s \right)^2 \right] = \sigma^2 \int_t^T e^{-2\kappa(T-s)}\,ds = \sigma^2\left[ \frac{1}{2\kappa}\, e^{-2\kappa(T-s)} \right]_t^T = \frac{\sigma^2}{2\kappa}\left( 1 - e^{-2\kappa(T-t)} \right).$$
Hence verified.
Example: Consider the SDE for the variance process $v$,
$$dv = \varepsilon(m - v)\,dt + \lambda\sqrt{v}\,dW_t,$$
where $v = \sigma^2$ and $\varepsilon, \lambda, m$ are constants. Using Itô's lemma, show that the volatility $\sigma$ satisfies the SDE
$$d\sigma = a(\sigma, t)\,dt + b(\sigma, t)\,dW_t,$$
where the precise form of $a(\sigma, t)$ and $b(\sigma, t)$ should be given.
If $F = F(v)$ then Itô gives
$$dF = \left( \varepsilon(m - v)\frac{dF}{dv} + \tfrac{1}{2}\lambda^2 v \frac{d^2 F}{dv^2} \right) dt + \lambda\sqrt{v}\,\frac{dF}{dv}\,dW_t.$$
For $F(v) = v^{1/2}$,
$$\frac{dF}{dv} = \tfrac{1}{2}\, v^{-1/2}, \qquad \frac{d^2 F}{dv^2} = -\tfrac{1}{4}\, v^{-3/2},$$
$$dF = d\sigma = \left( \frac{\varepsilon}{2}(m - v)\,v^{-1/2} - \frac{1}{8}\lambda^2\, v^{-1/2} \right) dt + \frac{\lambda}{2}\,dW_t = \left( \frac{\varepsilon}{2}\left( \frac{m}{\sigma} - \sigma \right) - \frac{\lambda^2}{8\sigma} \right) dt + \frac{\lambda}{2}\,dW_t,$$
so
$$a(\sigma, t) = \frac{\varepsilon}{2}\left( \frac{m}{\sigma} - \sigma \right) - \frac{\lambda^2}{8\sigma}, \qquad b(\sigma, t) = \frac{\lambda}{2}.$$
Higher Dimensional Itô
There is a multi-dimensional form of Itô's lemma. Let us consider the two-dimensional version initially, as this can be generalised nicely to the $N$-dimensional case, driven by a Brownian motion of any number (not necessarily the same number) of dimensions. Let
$$\mathbf{W}_t := \left( W_t^{(1)}, W_t^{(2)} \right)$$
be a two-dimensional Brownian motion, where $W_t^{(1)}, W_t^{(2)}$ are independent Brownian motions, and define the two-dimensional Itô process
$$\mathbf{X}_t := \left( X_t^{(1)}, X_t^{(2)} \right)$$
whose components follow, for example, the usual Geometric Brownian Motions
$$dS_i = \mu_i S_i\,dt + \sigma_i S_i\,dW_i.$$
dSi=
iSidt+iSidWi;
for1iN:The share price changes are correlated with correlation coe¢ cient
ij:By starting with a
Taylor series expansion
V(t+t; S1+S1; S2+S2; :::::; SN+SN) =
V(t; S1; S2; :::::; SN) +
@V
@t
+
NP
i=1
@V
@Si
dSi+
1
2
NP
i=1
NP
j=i
@
2
V
@Si@Sj
+::::
which becomes, usingdWidWj=
ijdt
dV=

@V
@t
+
NP
i=1

iSi
@V
@Si
+
1
2
NP
i=1
NP
j=1
ij
ijSiSj
@
2
V
@Si@Sj
!
dt+
NP
i=1
iSi
@V
@Si
dWi:
We can integrate both sides over $0$ and $t$ to give
$$V(t, S_1, \ldots, S_N) = V(0, S_1, \ldots, S_N) + \int_0^t \left( \frac{\partial V}{\partial \tau} + \sum_{i=1}^{N} \mu_i S_i \frac{\partial V}{\partial S_i} + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \rho_{ij}\sigma_i\sigma_j S_i S_j \frac{\partial^2 V}{\partial S_i \partial S_j} \right) d\tau + \int_0^t \sum_{i=1}^{N} \sigma_i S_i \frac{\partial V}{\partial S_i}\,dW_i.$$
The Itô product rule
Let $X_t, Y_t$ be two one-dimensional Itô processes, where
$$dX_t = a(t, X_t)\,dt + b(t, X_t)\,dW_t^{(1)}, \qquad dY_t = c(t, Y_t)\,dt + d(t, Y_t)\,dW_t^{(2)}.$$
Apply the two-dimensional form of Itô's lemma with $f(t, x, y) = xy$:
$$df = \frac{\partial f}{\partial t}\,\delta t + \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \tfrac{1}{2}\frac{\partial^2 f}{\partial x^2}\,\delta x^2 + \tfrac{1}{2}\frac{\partial^2 f}{\partial y^2}\,\delta y^2 + \frac{\partial^2 f}{\partial x \partial y}\,\delta x\,\delta y,$$
with
$$\frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial x} = y, \quad \frac{\partial f}{\partial y} = x, \quad \frac{\partial^2 f}{\partial x^2} = 0, \quad \frac{\partial^2 f}{\partial y^2} = 0, \quad \frac{\partial^2 f}{\partial x \partial y} = 1,$$
which gives
$$df = y\,\delta x + x\,\delta y + \delta x\,\delta y,$$
to give
$$d(X_t Y_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t.$$
Now consider a pair of standard Brownian motions $W_t^{(1)}, W_t^{(2)}$ with $dW_t^{(1)}\,dW_t^{(2)} = \rho\,dt$, and set $Z_t = W_t^{(1)} W_t^{(2)}$; then
$$d(Z_t) = W_t^{(1)}\,dW_t^{(2)} + W_t^{(2)}\,dW_t^{(1)} + \rho\,dt.$$
In particular, if the two Brownian motions are independent then $\rho = 0$ and the $dt$ term vanishes.
The Itô rule for ratios
Let $X_t, Y_t$ be two one-dimensional Itô processes, where
$$dX_t = \mu_X(t, X_t)\,dt + \sigma_X(t, X_t)\,dW_t^{(1)}, \qquad dY_t = \mu_Y(t, Y_t)\,dt + \sigma_Y(t, Y_t)\,dW_t^{(2)},$$
and suppose
$$dW_t^{(1)}\,dW_t^{(2)} = \rho\,dt.$$
Apply the two-dimensional form of Itô's lemma with $f(X, Y) = X/Y$. We already know that for $f(t, X, Y)$
$$df = \frac{\partial f}{\partial X}\,dX + \frac{\partial f}{\partial Y}\,dY + \tfrac{1}{2}\frac{\partial^2 f}{\partial X^2}\,dX^2 + \tfrac{1}{2}\frac{\partial^2 f}{\partial Y^2}\,dY^2 + \frac{\partial^2 f}{\partial X \partial Y}\,dX\,dY$$
$$= \left( \mu_X \frac{\partial f}{\partial X} + \mu_Y \frac{\partial f}{\partial Y} + \tfrac{1}{2}\sigma_X^2 \frac{\partial^2 f}{\partial X^2} + \tfrac{1}{2}\sigma_Y^2 \frac{\partial^2 f}{\partial Y^2} + \rho\sigma_X\sigma_Y \frac{\partial^2 f}{\partial X \partial Y} \right) dt + \sigma_X \frac{\partial f}{\partial X}\,dW_t^{(1)} + \sigma_Y \frac{\partial f}{\partial Y}\,dW_t^{(2)},$$
with
$$\frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial X} = \frac{1}{Y}, \quad \frac{\partial f}{\partial Y} = -\frac{X}{Y^2}, \quad \frac{\partial^2 f}{\partial X^2} = 0, \quad \frac{\partial^2 f}{\partial Y^2} = \frac{2X}{Y^3}, \quad \frac{\partial^2 f}{\partial X \partial Y} = -\frac{1}{Y^2},$$
which gives
$$df = \left( \mu_X \frac{1}{Y} - \mu_Y \frac{X}{Y^2} + \sigma_Y^2 \frac{X}{Y^3} - \rho\sigma_X\sigma_Y \frac{1}{Y^2} \right) dt + \sigma_X \frac{1}{Y}\,dW_t^{(1)} - \sigma_Y \frac{X}{Y^2}\,dW_t^{(2)},$$
or, dividing through by $f = X/Y$,
$$\frac{df}{f} = \left( \frac{\mu_X}{X} - \frac{\mu_Y}{Y} + \frac{\sigma_Y^2}{Y^2} - \frac{\rho\sigma_X\sigma_Y}{XY} \right) dt + \frac{\sigma_X}{X}\,dW_t^{(1)} - \frac{\sigma_Y}{Y}\,dW_t^{(2)}.$$
Another common form is
$$d\left( \frac{X}{Y} \right) = \frac{X}{Y}\left( \frac{dX}{X} - \frac{dY}{Y} - \frac{dX\,dY}{XY} + \left( \frac{dY}{Y} \right)^2 \right).$$
As an example suppose we have
$$dS_1 = 0.1\,dt + 0.2\,dW_t^{(1)}, \qquad dS_2 = 0.05\,dt + 0.1\,dW_t^{(2)}, \qquad \rho = 0.4.$$
Using
$$d\left( \frac{S_1}{S_2} \right) = \left( \mu_X \frac{1}{Y} - \mu_Y \frac{X}{Y^2} + \sigma_Y^2 \frac{X}{Y^3} - \rho\sigma_X\sigma_Y \frac{1}{Y^2} \right) dt + \sigma_X \frac{1}{Y}\,dW_t^{(1)} - \sigma_Y \frac{X}{Y^2}\,dW_t^{(2)},$$
where
$$\mu_X = 0.1, \quad \mu_Y = 0.05, \quad \sigma_X = 0.2, \quad \sigma_Y = 0.1,$$
we get
$$d\left( \frac{S_1}{S_2} \right) = \left( \frac{0.1}{S_2} - 0.05\,\frac{S_1}{S_2^2} + 0.01\,\frac{S_1}{S_2^3} - 0.008\,\frac{1}{S_2^2} \right) dt + 0.2\,\frac{1}{S_2}\,dW_t^{(1)} - 0.1\,\frac{S_1}{S_2^2}\,dW_t^{(2)}.$$
Producing Standardized Normal Random Variables
Consider the RAND() function in Excel that produces a uniformly distributed random number over $0$ and $1$, written $\text{Unif}[0, 1]$. We can show that for a large number $N$,
$$\lim_{N \to \infty} \sqrt{\frac{12}{N}} \left( \sum_{1}^{N} U(0,1) - \frac{N}{2} \right) \sim N(0, 1).$$
Introduce $U_i$ to denote a uniformly distributed random variable over $[0, 1]$ and sum up. Recall that
$$\mathbb{E}[U_i] = \tfrac{1}{2}, \qquad \mathbb{V}[U_i] = \tfrac{1}{12}.$$
The mean is then
$$\mathbb{E}\left[ \sum_{i=1}^{N} U_i \right] = N/2,$$
so subtract off $N/2$ and examine the variance of $\left( \sum_{1}^{N} U_i - \frac{N}{2} \right)$:
$$\mathbb{V}\left[ \sum_{1}^{N} U_i - \frac{N}{2} \right] = \sum_{1}^{N} \mathbb{V}[U_i] = N/12.$$
As the variance is not $1$, write
$$\mathbb{V}\left[ \alpha\left( \sum_{1}^{N} U_i - \frac{N}{2} \right) \right]$$
for some $\alpha \in \mathbb{R}$. Hence $\dfrac{\alpha^2 N}{12} = 1$, which gives $\alpha = \sqrt{12/N}$, which normalises the variance. Then we achieve the result
$$\sqrt{\frac{12}{N}} \left( \sum_{1}^{N} U_i - \frac{N}{2} \right).$$
Rewrite this as
$$\frac{\sum_{1}^{N} U_i - N\cdot\tfrac{1}{2}}{\sqrt{\tfrac{1}{12}}\,\sqrt{N}},$$
and for $N \to \infty$, by the Central Limit Theorem, we get $N(0, 1)$.
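For example, with $N = 12$ the scale factor is exactly $1$, giving the classic "sum twelve uniforms and subtract 6" generator. A minimal sketch (not from the notes):

```python
import numpy as np

rng = np.random.default_rng(7)
N, samples = 12, 1_000_000
U = rng.uniform(size=(samples, N))
Z = np.sqrt(12 / N) * (U.sum(axis=1) - N / 2)   # with N = 12: sum - 6
print(Z.mean(), Z.std())                        # approximately 0 and 1
```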
Generating Correlated Normal Variables
Consider two uncorrelated standard Normal variables $\varepsilon_1$ and $\varepsilon_2$, from which we wish to form a correlated pair $\zeta_1, \zeta_2$ ($\sim N(0, 1)$) such that $\mathbb{E}[\zeta_1 \zeta_2] = \rho$. The following scheme can be used:
1. $\mathbb{E}[\varepsilon_1] = \mathbb{E}[\varepsilon_2] = 0$; $\mathbb{E}[\varepsilon_1^2] = \mathbb{E}[\varepsilon_2^2] = 1$ and $\mathbb{E}[\varepsilon_1 \varepsilon_2] = 0$ (because $\varepsilon_1, \varepsilon_2$ are uncorrelated).
2. Set $\zeta_1 = \varepsilon_1$ and $\zeta_2 = \alpha\varepsilon_1 + \beta\varepsilon_2$ (i.e. a linear combination).
3. Now
$$\mathbb{E}[\zeta_1 \zeta_2] = \rho = \mathbb{E}[\varepsilon_1(\alpha\varepsilon_1 + \beta\varepsilon_2)] = \alpha\,\mathbb{E}\left[ \varepsilon_1^2 \right] + \beta\,\mathbb{E}[\varepsilon_1\varepsilon_2] \;\Rightarrow\; \alpha = \rho,$$
$$\mathbb{E}\left[ \zeta_2^2 \right] = 1 = \mathbb{E}\left[ (\alpha\varepsilon_1 + \beta\varepsilon_2)^2 \right] = \alpha^2\,\mathbb{E}\left[ \varepsilon_1^2 \right] + \beta^2\,\mathbb{E}\left[ \varepsilon_2^2 \right] + 2\alpha\beta\,\mathbb{E}[\varepsilon_1\varepsilon_2] = \alpha^2 + \beta^2 \;\Rightarrow\; \beta = \sqrt{1 - \rho^2}.$$
4. This gives $\zeta_1 = \varepsilon_1$ and $\zeta_2 = \rho\varepsilon_1 + \sqrt{1 - \rho^2}\,\varepsilon_2$, which are correlated standardized Normal variables.
1
MMAATTEEMMÁÁTTIICCAA DDIISSCCRREETTAA
Índice
Unidad 1: Lógica y teoría de conjuntos ....................................................................................................... 2
1. Definiciones ......................................................................................................................................... 2
2. Leyes de la lógica ............................................................................................................................... 2
3. Reglas de inferencia ........................................................................................................................... 3
4. Lógica de predicados ......................................................................................................................... 3
5. Teoría de conjuntos ............................................................................................................................ 3
Unidad 2: Inducción matemática .................................................................................................................. 4
1. Métodos para demostrar la verdad de una implicación ................................................................ 4
2. Inducción matemática ........................................................................................................................ 4
Unidad 3: Relaciones de recurrencia........................................................................................................... 4
1. Ecuaciones de recurrencia homogéneas ........................................................................................ 5
2. Ecuaciones de recurrencia no homogéneas .................................................................................. 5
3. Sucesiones importantes..................................................................................................................... 5
Unidad 4: Relaciones ..................................................................................................................................... 6
1. Definiciones ......................................................................................................................................... 6
2. Propiedades de las relaciones .......................................................................................................... 6
3. Matriz de una relación ........................................................................................................................ 6
4. Relaciones de equivalencia y de orden ........................................................................................... 6
5. Elementos particulares....................................................................................................................... 7
Unidad 5: Álgebras de Boole ........................................................................................................................ 7
1. Definiciones y axiomas ...................................................................................................................... 7
2. Funciones booleanas ......................................................................................................................... 8
3. Propiedades de los átomos ............................................................................................................... 9
4. Mapa de Karnaugh ............................................................................................................................. 9
5. Isomorfismos entre álgebras de Boole .......................................................................................... 10
Unidad 6: Teoría de grafos .......................................................................................................................... 10
1. Definiciones de grafos y digrafos ................................................................................................... 10
2. Aristas, vértices, caminos y grafos ................................................................................................. 10
3. Grafos de Euler ................................................................................................................................. 12
5. Representación de grafos por matrices ........................................................................................ 13
6. Niveles ................................................................................................................................................ 14
7. Algoritmos de camino mínimo......................................................................................................... 14
Unidad 7: Árboles ......................................................................................................................................... 15
1. Definiciones ....................................................................................................................................... 15
2. Árboles generadores ........................................................................................................................ 16
3. Algoritmos para hallar un árbol generador mínimo ..................................................................... 16
Unidad 8: Redes de transporte ................................................................................................................... 16
1. Definiciones ....................................................................................................................................... 16
2. Algoritmo de Ford-Foulkerson ........................................................................................................ 17

2
Unidad 1: Lógica y teoría de conjuntos

1. Definiciones

Lógica: estudio de las formas correctas de pensar o razonar.
Proposición: afirmación que es verdadera o falsa, pero no ambas.
Proposición primitiva: proposición que no se puede descomponer en otras dos o más proposiciones.
Siempre son afirmativas.
Proposición compuesta: proposición formada por dos o más proposiciones relacionadas mediante
conectivas lógicas.
Tablas de verdad:

p q p
(NOT)
p q
(AND)
p q
(OR)
p q
(XOR)
p  q
(IF)
p  q
(IIF)
p  q
(NOR)
p | q
(NAND)
V V F V V F V V F F
V F F F V V F F F V
F V V F V V V F F V
F F V F F F V V V V
Nota: proposiciones 

líneas de tabla.

Negación: no, nunca, jamás, no es cierto que.
Conjunción: y, e, pero, como, aunque, sin embargo, mientras.
Disyunción: o, a menos que.
Disyunción excluyente: o bien.
Implicación: cuando, siempre que.
Doble implicación: si y sólo si (sii), cuando y solo cuando.

{|} y {} son los únicos conjuntos adecuados de un solo conectivo diádico.

“p q” “p q”
 Si p, entonces q.
 p implica q.
 p solo si q.
 p es el antecedente, q es el consecuente.
 q es necesario para p.
 p es suficiente para q.

 p es necesario y suficiente para q.
 p si y solo si q.


Tautología: proposición que es verdadera siempre.
Contradicción: proposición que es falsa siempre.
Contingencia: proposición que puede ser verdadera o falsa, dependiendo de los valores de las
proposiciones que la componen.







2. Leyes de la lógica

1) Ley de la doble negación p  p
2) Ley de conmutatividad a) p  q  q  p
b) p  q  q  p
3) Ley de asociatividad a) p  (q  r)  (p  q)  r
 p  q  p  q
 p  q  (p  q)  (q  p)
 (p  q)  (p  q)  (p  q)
 a  (b  c)  (a  b)  (a  c)
 (p  q)  t  (p  t)  (q  t)

3
b) p  (q  r)  (p  q)  r
4) Ley de distributividad a) p  (q  r)  (p  q)  (p  r)
c) p  (q  r)  (p  q)  (p  r)
5) Ley de idempotencia a) p  p  p
b) p  p  p
6) Ley del elemento neutro a) p  F
0  p
b) p  T
0  p
7) Leyes de De Morgan a) (p  q)  p  q
b) (p  q)  p  q
8) Ley del inverso a) p  p  T
0
b) p  p  F
0
9) Ley de dominancia a) p  T
0  T
0
b) p  F
0  F
0
10) Ley de absorción a) p  (p  q)  p
b) p  (p  q)  p

Dual de S: Sea S una proposición. Si S no contiene conectivas lógicas distintas de  y  entonces el dual de
S (S
d
), se obtiene de reemplazar en S todos los  () por  () y todas las T0 (F0) por F0 (T0).
Sean s y t dos proposiciones tales que s  t, entonces s
d
 t
d
.

 Recíproca: (q  p) es la recíproca de (p  q)
 Contra-recíproca: (q  p) es la contra-recíproca de (p  q)
 Inversa: (p  q) es la inversa de (p  q)

3. Reglas de inferencia


Modus ponens o Modus ponendo ponens
p  q
p
 q

Modus tollens o Modus tollendo tollens
p  q
q
 p

4. Predicate logic

Propositional function: an expression containing one or more variables that becomes a proposition when the variables are replaced by elements of the universe.
Universe: the set of admissible values that may be substituted for the variable.
Universal quantifier (∀): ∀x p(x) is true when p(x) holds for every value of x in the universe.
Existential quantifier (∃): ∃x p(x) is true when there exists at least one element of the universe for which p(x) holds.








5. Set theory

Power set: given a set A, p(A) is the set formed by all subsets of A, including A itself and ∅. If A has n elements, p(A) has 2ⁿ elements.
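As a worked example (in Python; the example that appeared in the original notes was lost): the power set can be generated with itertools.combinations, and its size checked against 2ⁿ:

from itertools import chain, combinations

def power_set(a):
    # All subsets of a, from the empty set up to a itself.
    items = list(a)
    return list(chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)))

subsets = power_set({1, 2, 3})
print(subsets)               # (), (1,), (2,), (3,), (1,2), (1,3), (2,3), (1,2,3)
print(len(subsets) == 2**3)  # True: |p(A)| = 2^n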

Negation of quantified propositions:
- ¬[∀x p(x)] ≡ ∃x ¬p(x)
- ¬[∃x p(x)] ≡ ∀x ¬p(x)

- ∃x [p(x) ∨ q(x)] ≡ ∃x p(x) ∨ ∃x q(x)
- ∀x [p(x) ∧ q(x)] ≡ ∀x p(x) ∧ ∀x q(x)
- ∃x [p(x) ∧ q(x)] ⇒ ∃x p(x) ∧ ∃x q(x), but the converse does not hold
- ∀x p(x) ∨ ∀x q(x) ⇒ ∀x [p(x) ∨ q(x)], but the converse does not hold






Membership: an element "belongs to" a set.
Inclusion: a set is "included in" a set.

Operations on sets:
Union: A ∪ B = {x : x ∈ A ∨ x ∈ B}
Intersection: A ∩ B = {x : x ∈ A ∧ x ∈ B}
Difference: A − B = {x : x ∈ A ∧ x ∉ B}
Symmetric difference: A △ B = (A − B) ∪ (B − A)
Complement: Aᶜ = {x ∈ U : x ∉ A}

Laws of the algebra of sets. For any A, B, C ⊆ U:

Commutative laws: A ∪ B = B ∪ A; A ∩ B = B ∩ A
Associative laws: A ∪ (B ∪ C) = (A ∪ B) ∪ C; A ∩ (B ∩ C) = (A ∩ B) ∩ C
Distributive laws: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Idempotent laws: A ∪ A = A; A ∩ A = A
Identity laws: A ∪ ∅ = A; A ∩ U = A; A ∪ U = U; A ∩ ∅ = ∅
Double complementation: (Aᶜ)ᶜ = A
Complement laws: A ∪ Aᶜ = U; A ∩ Aᶜ = ∅; Uᶜ = ∅; ∅ᶜ = U
De Morgan's laws: (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ; (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ


Unit 2: Mathematical induction

1. Methods for proving the truth of an implication

1) Direct method: assume the antecedent true and derive the consequent (T ⇒ T).
2) Indirect methods:
   a) By the contrapositive: show ¬q ⇒ ¬p (F ⇒ F).
   b) By contradiction: assume the antecedent true and the consequent false, and derive a contradiction.

2. Mathematical induction

To prove that p(n) holds for every n ≥ n₀:
I) Base case: verify p(n₀).
II) Inductive step: assume p(k) for an arbitrary k ≥ n₀ (inductive hypothesis) and prove p(k+1).

Unit 3: Recurrence relations

Order of a recurrence relation: largest subscript minus smallest subscript (e.g., aₙ = aₙ₋₁ + aₙ₋₂ has order n − (n−2) = 2).

1. Homogeneous recurrence equations

Consider a linear homogeneous recurrence with constant coefficients,

aₙ = c₁aₙ₋₁ + c₂aₙ₋₂   (*).

Solving it means:
I) Finding the roots of the characteristic equation of (*): r² = c₁r + c₂, i.e. r² − c₁r − c₂ = 0.
II) Using the following theorems to build the solution.

Theorem 1: if xₙ and yₙ are solutions of (*), then any combination c·xₙ + d·yₙ is also a solution of (*).
Theorem 2: if r is a root of the characteristic equation, then rⁿ is a solution of (*).
Theorem 3: if r₁ and r₂ (r₁ ≠ r₂) are roots of the characteristic equation, then aₙ = k₁·r₁ⁿ + k₂·r₂ⁿ is a solution of (*), and the constants k₁, k₂ are determined from the initial conditions.
Theorem 4: if r is a double root of the characteristic equation, then n·rⁿ is a solution of (*).
Theorem 5: if r is a double root of the characteristic equation, then aₙ = (k₁ + k₂·n)·rⁿ is a solution of (*), with k₁, k₂ determined from the initial conditions.
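A short worked example of the recipe above (mine, following the theorems; not from the original notes): solve aₙ = 5aₙ₋₁ − 6aₙ₋₂ with a₀ = 2, a₁ = 5. The characteristic equation is r² = 5r − 6, i.e. r² − 5r + 6 = 0, with roots r₁ = 2, r₂ = 3. By Theorem 3, aₙ = k₁·2ⁿ + k₂·3ⁿ. From a₀ = k₁ + k₂ = 2 and a₁ = 2k₁ + 3k₂ = 5 we get k₁ = k₂ = 1, so aₙ = 2ⁿ + 3ⁿ (check: a₂ = 4 + 9 = 13 = 5·5 − 6·2).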











2. Non-homogeneous recurrence equations

Consider the equation

aₙ = c₁aₙ₋₁ + c₂aₙ₋₂ + f(n)   (*), with f(n) ≠ 0. Solving it means:

I) Solving the associated homogeneous equation to obtain aₙ⁽ʰ⁾.
II) Finding a particular solution of equation (*), aₙ⁽ᵖ⁾.
III) The general solution is aₙ = aₙ⁽ʰ⁾ + aₙ⁽ᵖ⁾.

Note: the proposed particular solution must not contain summands that already appear in the solution of the homogeneous equation.

Proposed particular solutions:
- f(n) = k·aⁿ, with a not a root of the characteristic equation → propose A·aⁿ.
- f(n) = k·aⁿ, with a a root of multiplicity t of the characteristic equation → propose A·nᵗ·aⁿ.
- f(n) a polynomial of degree k, and 1 is not a root of the characteristic equation → propose a generic polynomial of degree k.
- f(n) a polynomial of degree k, and 1 is a root of multiplicity t of the characteristic equation → propose a generic polynomial of degree k multiplied by nᵗ.




Special case 1: f(n) = f₁(n) + f₂(n).
I) Propose a solution yₙ⁽¹⁾ for the equation with right-hand side f₁(n).
II) Propose a solution yₙ⁽²⁾ for the equation with right-hand side f₂(n).
III) The particular solution is aₙ⁽ᵖ⁾ = yₙ⁽¹⁾ + yₙ⁽²⁾.

Special case 2: f(n) = f₁(n) + f₂(n), where a proposed summand collides with the homogeneous solution.
I) Propose a solution yₙ⁽¹⁾ for the equation with right-hand side f₁(n).
II) Propose a solution yₙ⁽²⁾ for the equation with right-hand side f₂(n).
III) The solution is aₙ⁽ᵖ⁾ = yₙ⁽¹⁾ + yₙ⁽²⁾. Then compare it with the solution of the homogeneous equation and adjust (multiply the offending summands by n) if necessary.

3. Important sequences

Interest:         aₙ = 1.12·aₙ₋₁
Fibonacci:        Fₙ = Fₙ₋₁ + Fₙ₋₂
Towers of Hanoi:  hₙ = 2hₙ₋₁ + 1
Derangements:     dₙ = (n − 1)·(dₙ₋₁ + dₙ₋₂)
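These recurrences are easy to tabulate directly. A small sketch; the initial values used here (F₀ = 0, F₁ = 1, h₁ = 1, d₁ = 0, d₂ = 1) are the usual ones and are an assumption, since the notes do not state them:

def fib(n):        # F(n) = F(n-1) + F(n-2), with F0 = 0, F1 = 1
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def hanoi(n):      # h(n) = 2*h(n-1) + 1, h1 = 1  (closed form: 2**n - 1)
    h = 1
    for _ in range(n - 1):
        h = 2 * h + 1
    return h

def derangements(n):  # d(n) = (n-1)*(d(n-1) + d(n-2)), d1 = 0, d2 = 1
    if n == 1:
        return 0
    a, b = 0, 1        # d1, d2
    for k in range(3, n + 1):
        a, b = b, (k - 1) * (a + b)
    return b

print([fib(n) for n in range(8)])                  # [0, 1, 1, 2, 3, 5, 8, 13]
print(hanoi(5), 2**5 - 1)                          # 31 31
print([derangements(n) for n in (1, 2, 3, 4, 5)])  # [0, 1, 2, 9, 44]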
Unit 4: Relations

1. Definitions

Cartesian product: A × B = {(x, y) : x ∈ A ∧ y ∈ B}.
Relation on a set: given a set A, a relation R on A is any R ⊆ A × A. A relation can be defined by extension (listing all its elements) or by comprehension (giving a property of its elements).
The relation 'R': for x ∈ A, y ∈ A, we write xRy ⇔ (x, y) ∈ R.
Inverse relation: given R, the inverse relation R⁻¹ is such that R⁻¹ = {(y, x) : (x, y) ∈ R}.

2. Properties of relations

Let R be a relation on the set A.
1) R is reflexive ⇔ ∀x ∈ A: xRx
2) R is symmetric ⇔ ∀x, y ∈ A: (xRy ⇒ yRx)
3) R is transitive ⇔ ∀x, y, z ∈ A: (xRy ∧ yRz) ⇒ xRz
4) R is antisymmetric ⇔ ∀x, y ∈ A: (xRy ∧ yRx) ⇒ x = y
Note: every element satisfies the first three with itself. Beware of 4): "not symmetric" does not imply "antisymmetric".

3. Matrix of a relation

Let R be a relation on a finite set A = {a₁, ..., aₙ}. It can be represented by the boolean matrix M_R ∈ {0,1}ⁿˣⁿ, with n = |A|, defined by

mᵢⱼ = 1 if aᵢRaⱼ, and mᵢⱼ = 0 otherwise.

Order relation between boolean matrices: C ≤ D ⇔ cᵢⱼ ≤ dᵢⱼ for all i, j. That is, a matrix C is less than or equal to D if D has a 1 at least in every position where C does.

Let I be the n × n identity matrix. Then:
- R is reflexive ⇔ I ≤ M_R
- R is symmetric ⇔ M_R = M_Rᵀ
- R is antisymmetric ⇔ M_R ⊙ M_Rᵀ ≤ I (the product taken position by position)
- R is transitive ⇔ M_R² ≤ M_R (boolean matrix product)
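These matrix criteria translate directly into code. A small checker, with boolean matrices as Python lists of 0/1 (the function names are mine, not from the notes):

def bool_mul(M, N):
    # Boolean matrix product: result[i][j] = OR over k of (M[i][k] AND N[k][j]).
    n = len(M)
    return [[int(any(M[i][k] and N[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def leq(M, N):   # M <= N position by position
    return all(m <= x for rm, rn in zip(M, N) for m, x in zip(rm, rn))

def properties(M):
    n = len(M)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    T = [list(col) for col in zip(*M)]                     # transpose
    H = [[M[i][j] * T[i][j] for j in range(n)] for i in range(n)]
    return {"reflexive":     leq(I, M),
            "symmetric":     M == T,
            "antisymmetric": leq(H, I),
            "transitive":    leq(bool_mul(M, M), M)}

# R = "divides" on A = {1, 2, 3, 4}: reflexive, antisymmetric, transitive.
A = [1, 2, 3, 4]
M = [[int(y % x == 0) for y in A] for x in A]
print(properties(M))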



4. Equivalence relations and order relations

Equivalence relation (~): reflexivity, symmetry, transitivity.
Order relation (≤): reflexivity, antisymmetry, transitivity.

- Total order: ∀x, y ∈ A: (xRy ∨ yRx). Its Hasse diagram is a single line (a chain).
- Partial order: ∃x, y ∈ A: ¬(xRy) ∧ ¬(yRx). (If the order is not total, it is partial.)

Review of functions
Let A and B be two sets. A relation f ⊆ A × B is a function if:
∄a ∈ A / f(a) = b₀ ∧ f(a) = b₁ with b₀, b₁ ∈ B, b₀ ≠ b₁ (no element of the domain has two images).

Let f: A → B be a function, a ∈ A, b ∈ B:
- f is injective ⇔ a₁ ≠ a₂ ⇒ f(a₁) ≠ f(a₂) (distinct points of the domain have distinct images)
- f is surjective ⇔ ∀b ∈ B, ∃a ∈ A / f(a) = b (the image of A is all of B)
- f is bijective ⇔ f is injective and surjective (if f is bijective, the inverse exists)

Equivalence class: let R be an equivalence relation on A. The equivalence class of a ∈ A is the set
[a] = {x ∈ A : xRa}.

Theorem: let R be an equivalence relation on A. Then:
1) a ∈ [a], so every class is nonempty.
2) aRb ⇔ [a] = [b].
3) [a] ≠ [b] ⇒ [a] ∩ [b] = ∅.
4) The union of all the equivalence classes is A.

Quotient set: A/R = {[a] : a ∈ A}. The quotient set is a partition of A.

Partition: {A₁, A₂, ..., Aₖ} is a partition of the set A if and only if:
1) Aᵢ ⊆ A for every i
2) Aᵢ ≠ ∅ for every i
3) Aᵢ ∩ Aⱼ = ∅ for i ≠ j
4) A₁ ∪ A₂ ∪ ... ∪ Aₖ = A

Congruence modulo n: in ℤ, and for n ∈ ℕ, one defines the relation x ≡ y (mod n) ⇔ n | (x − y).

Hasse diagram: a simplified graphical representation of a (finite) partially ordered set. Reflexive loops and transitive shortcuts are omitted. If two elements are related, say aRb, then b is drawn at a higher level than a.
Example: let A = {1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60} (all the divisors of 60), partially ordered by the divisibility relation. (Its Hasse diagram, drawn in the original notes, is omitted here.)
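The covering pairs that make up that Hasse diagram (edges a—b with a | b and nothing strictly in between) can be computed by brute force; a quick sketch:

A = [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60]  # divisors of 60

def covers(a, b):
    # b covers a in the divisibility order: a|b, a != b, nothing in between.
    return (a != b and b % a == 0 and
            not any(c % a == 0 and b % c == 0 for c in A if c not in (a, b)))

edges = [(a, b) for a in A for b in A if covers(a, b)]
print(edges)  # e.g. (1,2), (1,3), (1,5), (2,4), (2,6), (2,10), ..., (30,60)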


5. Particular elements

Let R be an order relation on A:

Maximal: x₀ is a maximal element of A ⇔ ∄x ∈ A (x ≠ x₀): x₀Rx (x₀ is related to no other element).
Minimal: x₀ is a minimal element of A ⇔ ∄x ∈ A (x ≠ x₀): xRx₀ (no other element is related to x₀).

Let X be a subset of A:

Upper bound: x₀ ∈ A is an upper bound of X ⇔ ∀x ∈ X: xRx₀.
Lower bound: x₀ ∈ A is a lower bound of X ⇔ ∀x ∈ X: x₀Rx.

Supremum: s ∈ A is the supremum of X ⇔ s is the least of all the upper bounds; ∀x ∈ X: xRs.
Infimum: i ∈ A is the infimum of X ⇔ i is the greatest of all the lower bounds; ∀x ∈ X: iRx.

Maximum: M ∈ A is the maximum of X ⇔ M is the supremum of X and M ∈ X.
Minimum: m ∈ A is the minimum of X ⇔ m is the infimum of X and m ∈ X.

Unit 5: Boolean algebras

1. Definitions and axioms

Boolean algebra: let K be a nonempty set containing two special elements, 0 (zero, the neutral element) and 1 (one, the unit element), on which we define the closed operations +, · and the complement '. Then B = (K, 0, 1, +, ·, ') is a Boolean algebra if it satisfies the following conditions:

A1) Commutativity: x + y = y + x
    x·y = y·x
A2) Associativity: (x + y) + z = x + (y + z) = x + y + z
    (x·y)·z = x·(y·z) = x·y·z
A3) Double distributivity: x·(y + z) = x·y + x·z
    x + (y·z) = (x + y)·(x + z)
A4) Existence of neutral elements: x + 0 = x
    x·1 = x
A5) Existence of complements: x + x' = 1
    x·x' = 0

Dual expression: obtained by replacing every + (·) by · (+) and every 0 (1) by 1 (0).
Principle of duality: in any Boolean algebra, if an expression is valid, its dual expression is valid too.

1) Double complement law: (x')' = x
2) De Morgan's laws: a) (x + y)' = x'·y'
                     b) (x·y)' = x' + y'
3) Commutative laws: a) x + y = y + x
                     b) x·y = y·x
4) Associative laws: a) x + (y + z) = (x + y) + z
                     b) x·(y·z) = (x·y)·z
5) Distributive laws: a) x + (y·z) = (x + y)·(x + z)
                      b) x·(y + z) = x·y + x·z
6) Idempotent laws: a) x + x = x
                    b) x·x = x
7) Identity laws: a) x + 0 = x
                  b) x·1 = x
8) Inverse laws: a) x + x' = 1
                 b) x·x' = 0
9) Bounding laws: a) x + 1 = 1
                  b) x·0 = 0
10) Absorption laws: a) x + x·y = x;  x + x'·y = x + y
                     b) x·(x + y) = x;  x·(x' + y) = x·y

Allowed:
- x + y = 0 ⇒ (x = 0) ∧ (y = 0)
- x·y = 1 ⇒ (x = 1) ∧ (y = 1)
- x + y = x·y ⇒ x = y

Prohibited (not valid in general):
- x + y = z + y ⇒ x = z (no cancellation)
- x·y = 0 ⇒ (x = 0) ∨ (y = 0)
- x + y = y + z ⇒ x = z

2. Boolean functions

Boolean function: f: Bⁿ → B, with B = {0, 1}. Given n variables, there are 2^(2ⁿ) possible Boolean functions.

Workflow: PROBLEM → TRUTH TABLE → EXPRESSION for f → SIMPLIFIED EXPRESSION → CIRCUIT.


MINTERMS (value 1 in exactly one row)        | MAXTERMS (value 0 in exactly one row)
m = x·y·z                                    | M = x + y + z
Canonical / disjunctive normal form SP:      | Canonical / conjunctive normal form PS:
Boolean sum of minterms.                     | Boolean product of maxterms.
f(x,y,z) ≡ sum of the minterms that give 1   | f(x,y,z) ≡ product of the maxterms that give 0
Coding: x → 1, x' → 0                        | Coding: x → 0, x' → 1
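Deriving the SP form from a truth table is mechanical: take the rows where f = 1 and translate each one into a minterm. A sketch for three variables; the function used here is an arbitrary illustration, not one from the notes:

from itertools import product

def sp_form(f, names=("x", "y", "z")):
    # Sum-of-minterms expression for a boolean function of len(names) variables.
    terms = []
    for values in product([1, 0], repeat=len(names)):
        if f(*values):  # one minterm per row where f = 1
            lits = [v if bit else v + "'" for v, bit in zip(names, values)]
            terms.append(".".join(lits))
    return " + ".join(terms)

f = lambda x, y, z: (x and y) or (not z)   # example function
print(sp_form(f))
# x.y.z + x.y.z' + x.y'.z' + x'.y.z' + x'.y'.z'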






Order in a Boolean algebra: let B = (K, +, ·, 0, 1, ') be a Boolean algebra. On K one defines:
a ≤ b ⇔ aRb ⇔ a·b = a ⇔ a + b = b.
Theorem: ∀x ∈ K: 0 ≤ x ≤ 1. Every Boolean algebra is bounded.

Atom of a Boolean algebra: x ≠ 0 is an atom of B ⇔ ∀y ∈ B: (y ≤ x ⇒ y = 0 ∨ y = x).

Note: if B has n atoms, then B has 2ⁿ elements.

Logic circuits: (the gate diagrams from the original notes are omitted here.)



3. Properties of atoms

1) x an atom ⇒ ∀y ∈ B: x·y = 0 ∨ x·y = x (the product of any element of B with an atom is 0 or the atom itself).
2) x₀, x₁ distinct atoms ⇒ x₀·x₁ = 0 (the product of two distinct atoms is 0).
3) Let a₁, ..., aₖ be the atoms of B. If x·aᵢ = 0 for every atom aᵢ, then x = 0.

Theorem: let a₁, ..., aₖ be the atoms of B. Then for every x ≠ 0 there exist atoms aᵢ₁, ..., aᵢⱼ such that x = aᵢ₁ + ... + aᵢⱼ, and this decomposition is unique.
Theorem: x equals the sum of the atoms a of B such that a ≤ x.

Note: if n is the number of variables of f, the maximum number of terms is 2ⁿ.

4. Karnaugh maps

Used to simplify a Boolean function. Shade the squares of the corresponding minterms and read off one term per block of adjacent shaded squares (neighbours above, below, left or right, with wrap-around): any variable that changes value inside a block is dropped from that block's term.

Cell numbering (variables x, y, z, w):

xy\zw | 00  01  11  10
  00  |  0   1   3   2
  01  |  4   5   7   6
  11  | 12  13  15  14
  10  |  8   9  11  10

Example: f = Σ m(1, 3, 9, 11, 14, 6)
Simplified: f = y'·w + y·z·w'

Observation: the sum of the minterms of a function equals the product of the maxterms that do not appear in the SP form.
Σ m(0, 1, 3, 5, 7) = Π M(2, 4, 6)
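The simplification in the example can be confirmed by enumeration. A sketch that checks it, using minterm index 8x + 4y + 2z + w to match the map above:

from itertools import product

minterms = {1, 3, 9, 11, 14, 6}

def f_table(x, y, z, w):            # f as the sum of its minterms
    return (8*x + 4*y + 2*z + w) in minterms

def f_simpl(x, y, z, w):            # f = y'·w + y·z·w'
    return ((not y) and w) or (y and z and (not w))

print(all(f_table(*v) == f_simpl(*v)
          for v in product([0, 1], repeat=4)))   # True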
5. Isomorphisms between Boolean algebras

Isomorphism between two Boolean algebras: let B₁ = (K₁, +₁, ·₁, 0₁, 1₁, '₁) and B₂ = (K₂, +₂, ·₂, 0₂, 1₂, '₂) be two Boolean algebras. B₁ and B₂ (with #B₁ = #B₂) are isomorphic ⇔ there exists a bijective f: K₁ → K₂ such that:

f(x +₁ y) = f(x) +₂ f(y)
f(x ·₁ y) = f(x) ·₂ f(y)
f(x'₁) = f(x)'₂

The number of possible isomorphisms is (number of atoms of B₁)!, since an isomorphism is determined by a permutation of the atoms.

Properties:
1) f(0₁) = 0₂
2) f(1₁) = 1₂
3) f(atom of B₁) = atom of B₂
4) x R₁ y ⇔ f(x) R₂ f(y)

Unit 6: Graph theory

1. Definitions of graphs and digraphs

Undirected graph: a triple G = (V, A, φ) representing a relation between a finite set of vertices (V ≠ ∅) and a finite set of edges (A), where φ is the incidence function
φ: A → X(V), with X(V) = {X ⊆ V : |X| = 1 or 2}.

If φ(a) = {u, v} then:
- u and v are the endpoints of a;
- u and v are adjacent vertices;
- a is incident on u and v.

Directed graph / digraph: a triple D = (V, A, φ) with V ≠ ∅, representing a relation between a finite set of vertices and a finite set of edges, where φ is the incidence function
φ: A → V × V.
If φ(a) = (v, w) then:
- v is the initial endpoint and w is the final endpoint of a;
- v and w are adjacent vertices;
- a is incident positively on w and negatively on v.



2. Edges, vertices, paths and graphs

Edges
Adjacent edges: edges that share exactly one endpoint.
Parallel (multiple) edges: a₁ and a₂ are parallel ⇔ a₁ ≠ a₂ and φ(a₁) = φ(a₂); that is, iff φ is not injective.
Loop: an edge joining a vertex to itself.
Incident edge: "e is incident on v" if v is one of the endpoints of the edge e.
Endpoints (for digraphs): an endpoint is initial (final) if it is the first (last) vertex of the edge.
Parallel edges (for digraphs): a and b are parallel if E.I(a) = E.I(b) ∧ E.F(a) = E.F(b); otherwise they are antiparallel.
Bridge: an edge whose removal makes the graph stop being connected.

Vertices
Adjacent vertices: "v and w are adjacent" if there is an edge between the two vertices.
- A vertex is adjacent to itself if it has a loop.
Degree of a vertex: gr(v) is the number of edges incident on it; loops count twice.
- A vertex is called "even" or "odd" according to its degree.
- Σ_{v∈V} gr(v) = 2·#A.
- The number of vertices of odd degree is an even number.
- If gr(v) = 0, v is an isolated vertex.
Positive degree (for digraphs): gr⁺(v) is the number of times the vertex is used as a final endpoint.
Negative degree (for digraphs): gr⁻(v) is the number of times the vertex is used as an initial endpoint.

Note: if ∀v ∈ V, gr(v) ≥ 2, then the graph has a circuit.
- Σ gr⁺(v) = Σ gr⁻(v) = #A
- gr_total(v) = gr⁺(v) + gr⁻(v)
- gr_net(v) = gr⁺(v) − gr⁻(v)
- A loop counts as an edge incident both positively and negatively on its vertex.
Vertex with multiple edges: a vertex on which more than one edge is incident.

Paths
Path: a finite nonempty sequence of distinct edges containing vx and vy in its first and last terms:
{vx,v1}, {v1,v2}, ..., {vn,vy}
Length of a path: its number of edges.
Circuit (closed path): a path in which v₀ = vₙ.
Simple path: a path that repeats no vertices.
- ∀v, w: if there is a path from v to w, there is a simple path from v to w.
Simple circuit: a circuit that repeats no vertices except the first and last.
Cycle: a simple circuit that repeats no edges.
- A simple circuit of length ≥ 3 in graphs (≥ 2 in digraphs) is a cycle.



Graphs
Order of a graph: its number of vertices.
Acyclic graph: a graph with no cycles.
Connected graph: a graph in which, given any 2 distinct vertices, a path between them can be found (∀u, v ∈ V, ∃ a path from u to v).
Simple graph: a graph with no parallel edges and no loops.
Regular graph: one in which every vertex has the same degree.
k-regular graph: G = (V, A, φ) is k-regular ⇔ ∀v ∈ V: gr(v) = k.
Bipartite graph: one whose vertices can be split into two disjoint sets such that there are no adjacencies between vertices of the same set.

Graph Kn,m: the simple bipartite graph with the largest number of edges.
- #A(Kn,m) = n·m
Graph Kn: the simple graph with n vertices and the largest number of edges.
- #A(Kn) = n(n − 1)/2
Complete graph: a simple graph with the largest possible number of edges; every vertex is connected to every other.
- ∀v ∈ V, gr(v) = #V − 1.
- If G = (V, A) is complete ⇒ G is regular (the converse does not hold).
- Two complete graphs with the same #V are isomorphic.

Complement graph: given a simple G = (V_G, A_G), the complement graph is G' = (V_G, A') such that {u, v} ∈ A' ⇔ {u, v} ∉ A_G (u ≠ v). It is the graph G' that connects the vertices not connected in G and disconnects the vertices connected in G.
- G ∪ G' = complete graph.
- If two graphs are complementary, so are their isomorphic copies.
- If gr_G(v) = k, then gr_G'(v) = #V − k − 1.

(Figure: a graph G on vertices v1-v5 and its complement G'; omitted here.)
Planar graph: one that admits a two-dimensional drawing in which no edges cross.
Weighted graph: a graph in which every edge is assigned a positive real number called its weight.
Digraph: a graph whose edges are all directed; the vertex pairs defining the edges are therefore ordered pairs.
Connected digraph: a digraph whose associated (underlying) graph is connected.
Strongly connected digraph: ∀v ∈ V there is a directed path from v to every other vertex.
k-regular digraph: D = (V, A, φ) is k-regular ⇔ ∀v: gr⁺(v) = gr⁻(v) = k.

Subgraph of G: given G = (V, A), G' = (V', A') is a subgraph of G if V' ⊆ V and A' ⊆ A.
Partial (spanning) graph of G: given G = (V, A), G' = (V', A') is a partial graph of G if V' = V and A' ⊆ A.
Multigraph: a graph with at least one multiple edge.
- A multigraph becomes a graph by adding a vertex in the middle of each multiple edge.
Pseudograph: a graph with at least one loop.

3. Euler graphs

Euler graph: a graph in which an Euler circuit or an Euler path can be found.

- Euler path: a path that uses every edge of the graph exactly once.
- Euler circuit: a circuit that uses every edge of the graph exactly once.

Euler's theorem:

For connected graphs:
- G has an Euler path (and no Euler circuit) ⇔ G has exactly 2 vertices of odd degree.
- G has an Euler circuit ⇔ G has exactly 0 vertices of odd degree.

For (connected) digraphs:
- G has an Euler path ⇔ ∃u, w ∈ V (u ≠ w) with
  gr⁻(u) = gr⁺(u) + 1,
  gr⁺(w) = gr⁻(w) + 1,
  and gr⁺(v) = gr⁻(v) for every other vertex v.
- G has an Euler circuit ⇔ ∀v ∈ V: gr⁺(v) = gr⁻(v).

Hamilton graph: a graph in which a Hamilton path or circuit can be found.

- Hamilton path: a path that visits every vertex exactly once (it need not use all the edges).
- Hamilton circuit: a circuit that visits every vertex exactly once, except that the first and last vertices coincide (it need not use all the edges).

Ore's theorem: if G is a simple connected graph with n ≥ 3 vertices and gr(u) + gr(v) ≥ n for every pair of non-adjacent vertices u, v, then G is a Hamiltonian graph.

Dirac's theorem: a simple graph with n ≥ 3 vertices is Hamiltonian if gr(v) ≥ n/2 for every vertex v.




4. Graph isomorphisms

Given G = (V, A) and G' = (V', A'), an isomorphism from G to G' is a bijective map f: V → V' such that for a, b ∈ V: {a, b} ∈ A ⇔ {f(a), f(b)} ∈ A'. That is, it pairs up the vertices bijectively so that connected vertices remain connected.
- #V = #V' and #A = #A'.
- Degrees are preserved: gr(a) = gr(f(a)).
- If two graphs are isomorphic, so are their complements.
- G and G' have the same number of isolated vertices.
- G and G' have the same number of loops.
- Paths are preserved.
- Cycles are preserved.
- If two complementary graphs are isomorphic, they are called self-complementary.
- Two simple graphs G1 and G2 are isomorphic ⇔ for some ordering of their vertices, their adjacency matrices (MA) are equal.

Automorphism: an isomorphism from a graph to itself.

5. Matrix representation of graphs

Adjacency matrix.
- Graphs: M_A ∈ ℕⁿˣⁿ (n = #V), with aᵢⱼ = number of edges with endpoints vᵢ and vⱼ.
  - Symmetric matrix.
  - gr(vᵢ) = Σ_{j≠i} aᵢⱼ + 2·aᵢᵢ (loops count twice).
- Digraphs: M_A ∈ ℕⁿˣⁿ, with aᵢⱼ = number of edges with initial endpoint vᵢ and final endpoint vⱼ.
  - Not necessarily symmetric.

Incidence matrix.
- Graphs: M_I ∈ ℕⁿˣᵐ (n = #V, m = #A), with bᵢⱼ = number of times edge aⱼ is incident on vᵢ (2 if aⱼ is a loop at vᵢ).
  - Each column sums to 2; gr(vᵢ) is the sum of row i.
- Digraphs: M_I ∈ ℤⁿˣᵐ, with bᵢⱼ = +1 if vᵢ is the final endpoint of aⱼ, −1 if vᵢ is the initial endpoint, 0 otherwise.
  - Each column sums to 0.

























Property: in the matrix (M_G)ᵏ, each coefficient aᵢⱼ gives the number of paths of length k between vᵢ and vⱼ.
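A direct check of this property with integer matrix powers (plain Python, no external libraries; the example graph is mine):

def mat_mul(M, N):
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Adjacency matrix of the 4-cycle v1-v2-v3-v4-v1.
M = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]

M2 = mat_mul(M, M)
print(M2[0][0])  # 2: two paths of length 2 from v1 back to v1 (via v2, via v4)
print(M2[0][2])  # 2: two paths of length 2 from v1 to v3
print(M2[0][1])  # 0: no path of length 2 from v1 to v2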

Connection (reachability) matrix: given G = (V, A, φ) with V = {v₁, ..., vₙ} and A = {a₁, ..., aₘ}, one defines the relation vᵢRvⱼ ⇔ there is a path from vᵢ to vⱼ. Its boolean matrix can be obtained as the boolean sum M ∨ M² ∨ ... ∨ Mⁿ, computed with boolean matrix products.










Boolean adjacency matrix: let G = (V, A, φ) be a graph with V = {v₁, ..., vₙ} and A = {a₁, ..., aₘ}. The boolean adjacency matrix of G is the n × n boolean matrix M_G = (mᵢⱼ) such that:
mᵢⱼ = 1 if vᵢ is adjacent to vⱼ, and mᵢⱼ = 0 if vᵢ is not adjacent to vⱼ.

Boolean incidence matrix: for the same G, the boolean incidence matrix is the boolean matrix M = (mᵢⱼ) such that:
mᵢⱼ = 1 if aᵢ is incident on vⱼ, and mᵢⱼ = 0 if aᵢ is not incident on vⱼ.

Worked example from the notes (the drawings of the graph and the digraph on v₁, ..., v₅ with edges a₁, ..., a₆ are omitted here).

Adjacency matrix of the graph (gr(v₁) is the sum of its row; the loop a₂ at v₂ appears on the diagonal):

      v1 v2 v3 v4 v5
  v1   0  1  1  0  0
  v2   1  1  0  1  0
  v3   1  0  0  2  0
  v4   0  1  2  0  0
  v5   0  0  0  0  0

Incidence matrix of the graph (each column sums to 2; the loop a₂ contributes 2 in v₂'s row):

      a1 a2 a3 a4 a5 a6
  v1   1  0  0  0  0  1
  v2   1  2  1  0  0  0
  v3   0  0  0  1  1  1
  v4   0  0  1  1  1  0
  v5   0  0  0  0  0  0

Adjacency matrix of the digraph (boolean, no longer symmetric):

      v1 v2 v3 v4 v5
  v1   0  0  1  0  0
  v2   1  1  0  0  0
  v3   0  0  0  1  0
  v4   0  1  1  0  0
  v5   0  0  0  0  0

Incidence matrix of the digraph (each column sums to 0; gr⁺(v₁) = number of entries > 0 in its row, gr⁻(v₁) = number of entries < 0):

      a1 a2 a3 a4 a5 a6
  v1   1  0  0  0  0 -1
  v2  -1  1  1  0  0  0
  v3   0  0  0 -1 -1  1
  v4   0  0 -1  1  1  0
  v5   0  0  0  0  0  0



6. Levels

Reachable vertex: let D = (V, A) be a digraph. We say that w is reachable from v ⇔ there is a directed path from v to w.
Levels of a digraph: a set of vertices N is at a higher level than another set of vertices K if no vertex of N is reachable from any vertex of K.

Level-finding procedure (MA = adjacency matrix):

Draw MA
i = 1
while MA is nonempty:
    Level i = the vertices whose columns in MA are null (no incoming edges from the remaining vertices)
    MA = MA with those vertices' rows and columns removed
    i = i + 1

Example (for the digraph drawn in the notes, figure omitted; between levels only descending arrows remain):
Level 1: A, G
Level 2: B
Level 3: E
Level 4: C
Level 5: F
Level 6: D
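A runnable version of the levelling loop, on a successor-list representation. The digraph below is hypothetical, chosen only to reproduce the levels above (it assumes an acyclic digraph; with a cycle, some step would peel off nothing):

def levels(vertices, edges):
    # Peel off, at each step, the vertices with no incoming edges left.
    remaining = set(vertices)
    result = []
    while remaining:
        targets = {w for (v, w) in edges if v in remaining and w in remaining}
        level = sorted(remaining - targets)   # no incoming edge -> this level
        result.append(level)
        remaining -= set(level)
    return result

V = "ABCDEFG"
E = [("A", "B"), ("G", "B"), ("B", "E"), ("E", "C"), ("C", "F"), ("F", "D")]
print(levels(V, E))
# [['A', 'G'], ['B'], ['E'], ['C'], ['F'], ['D']]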



7. Shortest-path algorithms

Goal: find the shortest path from S to L.
- λ(v) is the label of vertex v.
- i is a counter.

Moore's algorithm, or BFS (Breadth-First Search)
- Given an unweighted graph or digraph, computes the distance between two vertices.

λ(S) = 0
i = 0
while (there are unlabelled vertices adjacent to vertices labelled i):
    label each such vertex: λ(v) = i + 1
    if L is labelled: break
    i = i + 1

Dijkstra's algorithm
- Given a graph or digraph with non-negative weights, computes shortest paths from the vertex S to all vertices.

λ(S) = 0
for v in V, v ≠ S:
    λ(v) = ∞
T = V
while (L ∈ T):
    choose v ∈ T with minimum λ(v)
    for each x adjacent to v:
        λ(x) = min{λ(x), λ(v) + a(v,x)}    (a(v,x) = weight of edge (v,x))
    T = T − {v}
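A compact runnable Dijkstra; Python's heapq supplies the "choose v with minimum λ(v)" step (the graph format and names here are mine, not from the notes):

import heapq

def dijkstra(graph, source):
    # graph: {v: [(w, weight), ...]}. Returns the map of distances (the labels).
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]                      # entries (λ(v), v)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                          # stale entry: v already settled
        for x, w in graph[v]:
            if d + w < dist[x]:               # λ(x) = min{λ(x), λ(v)+a(v,x)}
                dist[x] = d + w
                heapq.heappush(heap, (dist[x], x))
    return dist

g = {"S": [("A", 2), ("B", 5)],
     "A": [("B", 1), ("L", 6)],
     "B": [("L", 2)],
     "L": []}
print(dijkstra(g, "S"))   # {'S': 0, 'A': 2, 'B': 3, 'L': 5}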

Ford's algorithm
- For digraphs only; accepts negative weights and detects negative circuits.

λ(S) = 0
for v in V, v ≠ S:
    λ(v) = ∞
j = 1
while (j ≠ |V|):
    for every edge (x, v) ∈ A:
        λ(v) = min{λ(v), λ(x) + a(x,v)}
    if no label changed: break
    else: j = j + 1
return λ


Unit 7: Trees

1. Definitions

Tree: G = (V, A) is a tree ⇔ ∀u, v ∈ V (u ≠ v → there is a unique simple path from u to v).

Theorem 1: given a graph G = (V, A), the following statements are equivalent:
a) G is connected and acyclic
b) G is acyclic, and adding any edge makes it stop being so
c) G is connected, and removing any edge makes it stop being so
d) G is a tree

Theorem 2: given a graph G = (V, A), the following statements are equivalent:
a) G is connected and acyclic
b) G is connected and #A = #V − 1
c) G is acyclic and #A = #V − 1

Property: if G is a tree with #V ≥ 2, there are at least 2 vertices of degree 1.

Forest: a graph G = (V, A) is a forest ⇔ G is acyclic.
- Forests are (possibly non-connected) graphs whose connected components are trees.
- #A = #V − t, where t is the number of trees in the forest.

Rooted tree: a connected digraph G = (V, A) is a rooted tree ⇔ there is exactly one vertex (the root) with no incoming edges, and every other vertex has exactly one incoming edge.

Leaf (terminal vertex): a vertex with no children.
Internal vertex: a vertex with children.
n-ary tree: every node has at most n children.
Complete n-ary tree: every node has 0 or n children.
Level of a vertex: the number of edges separating it from the root. The root has level 0.
Height of a tree: the maximum level of its vertices.
Balanced tree: all leaves reach the same level.
Theorem: if T = (V, A) is a complete m-ary tree with i internal vertices, then T has n = m·i + 1 vertices in total and h = (m − 1)·i + 1 leaves.
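A quick sanity check of those formulas (mine, not from the notes): a complete binary tree (m = 2) with i = 3 internal vertices has n = 2·3 + 1 = 7 vertices and h = (2 − 1)·3 + 1 = 4 leaves, which matches the perfect binary tree of height 2 (1 root, 2 internal vertices, 4 leaves).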





2. Spanning trees
Spanning tree: T = (V_T, A_T) is a spanning tree of G = (V, A) ⇔
- T is a tree,
- V_T = V,
- A_T ⊆ A.

Minimal spanning tree: a spanning tree of minimum weight. It is not unique.

Theorem: G is an undirected connected graph ⇔ G has a spanning tree.

3. Algorithms for finding a minimum spanning tree

Let G = (V, A) be a connected weighted graph. There are two algorithms for finding a minimum spanning tree of G.

Prim's algorithm

v = any vertex of G
T = {v}
while (|T| ≠ |V|):
    a = minimum-weight edge incident on some v ∈ T and some w ∉ T
    T = T + {w}   (and keep the edge a)
return T

Kruskal's algorithm

a = minimum-weight edge of G
T = {a}
while (|T| < |V| − 1):
    b = minimum-weight edge such that b ∉ T and T + {b} is acyclic
    T = T + {b}
return T
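A runnable Kruskal, using the classic union-find structure for the "T + {b} is acyclic" test (a standard implementation sketch, not taken from the notes):

def kruskal(vertices, edges):
    # edges: list of (weight, u, v). Returns the edges of a minimum spanning tree.
    parent = {v: v for v in vertices}

    def find(v):                       # root of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):      # edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # adding (u, v) keeps T acyclic
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

E = [(1, "A", "B"), (4, "A", "C"), (3, "B", "C"), (2, "C", "D"), (5, "B", "D")]
print(kruskal("ABCD", E))   # [('A', 'B', 1), ('C', 'D', 2), ('B', 'C', 3)]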

Unit 8: Transport networks

1. Definitions

Transport network: let G = (V, A) be a connected digraph without loops. G is a transport network if:

1) Source vertex: there is a unique vertex f ∈ V with gr⁺(f) = 0 (no arrows arrive at it).
2) Sink vertex: there is a unique vertex s ∈ V with gr⁻(s) = 0 (no arrows leave it).
3) Edge capacity: there is a function C: A → ℕ₀ such that if a = (vi, vj) ∈ A, then C(a) = Cij.

Flow in a network: if G = (V, A) is a transport network, a flow in G is a function F: A → ℕ₀ such that:

1) ∀a ∈ A: F(a) ≤ C(a). (If F(a) = C(a), the edge is said to be saturated.)
2) ∀v ∈ V (v ≠ f, v ≠ s): the sum of F over the edges into v equals the sum of F over the edges out of v (incoming flow = outgoing flow).

Theorem 1: if F is a flow in a transport network, then Σ_{a out of f} F(a) = Σ_{a into s} F(a) (everything that leaves the source reaches the sink).

Value of the flow: the sum of the flows of all the edges leaving the source vertex:
val(F) = Σ_{a out of f} F(a).

Cut of a network: a cut (P, P̄) in a transport network G = (V, A) is a set P ⊆ V such that f ∈ P and s ∈ P̄ (where P̄ = V − P).

Capacity of a cut: the number C(P, P̄) = Σ C(a) over all edges a = (v, w) with v ∈ P and w ∈ P̄ (the edges the cut passes through).

Theorem 2: let F be a flow in the network G = (V, A) and (P, P̄) a cut of G. Then val(F) ≤ C(P, P̄).

Theorem 3 (maximum flow and minimal cut): if C(P, P̄) = val(F), then the flow is maximum and the cut is minimal.

Theorem 4: C(P, P̄) = val(F) ⇔ every edge from P to P̄ is saturated and every edge from P̄ to P carries zero flow.






2. Ford-Fulkerson algorithm

Used to find the maximum flow in a transport network.

Given a transport network G = (V, A), with f (source) and s (sink):
- λ(v) is the labelling function of v.
- e_k is the residual capacity at v_k.

1) Put a compatible flow on the network (F = 0 always works).
2) Label the source: λ(f) = (−, e_f = ∞).
3) For any vertex x adjacent to f:
   a) if F(f, x) < C(f, x), label x with λ(x) = (f⁺, e_x = C(f, x) − F(f, x));
   b) if F(f, x) = C(f, x), do not label x.
4) While there is a labelled vertex x (x ≠ f) and an edge (x, y) with y unlabelled:
   a) if F(x, y) < C(x, y), label y with λ(y) = (x⁺, e_y = min{e_x, C(x, y) − F(x, y)});
   b) if F(x, y) = C(x, y), do not label y.
5) While there is a labelled vertex x (x ≠ f) and an edge (y, x) with y unlabelled:
   a) if F(y, x) > 0, label y with λ(y) = (x⁻, e_y = min{e_x, F(y, x)});
   b) if F(y, x) = 0, do not label y.
If the sink s gets labelled, augment the flow by e_s along the labelled path (adding on forward edges, subtracting on backward edges), erase the labels and repeat from step 2. If s can no longer be labelled, the flow is maximum and the labelled vertices form the side P of a minimal cut.

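The labelling procedure above is an augmenting-path search; a compact runnable version (Edmonds-Karp: BFS chooses the augmenting paths; the matrix format and names are mine, not from the notes):

from collections import deque

def max_flow(C, f, s):
    # C: capacity matrix (C[u][v] >= 0). Returns the value of a maximum flow.
    n = len(C)
    F = [[0] * n for _ in range(n)]           # current flow
    total = 0
    while True:
        # BFS from the source through edges with residual capacity.
        parent = [None] * n
        parent[f] = f
        queue = deque([f])
        while queue and parent[s] is None:
            u = queue.popleft()
            for v in range(n):
                # residual = unused forward capacity + cancellable backward flow
                if parent[v] is None and C[u][v] - F[u][v] + F[v][u] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[s] is None:                 # sink unlabelled: flow is maximum
            return total
        # Residual capacity e along the labelled path, then augment.
        path, v = [], s
        while v != f:
            path.append((parent[v], v))
            v = parent[v]
        e = min(C[u][v] - F[u][v] + F[v][u] for u, v in path)
        for u, v in path:
            back = min(e, F[v][u])            # first cancel backward flow
            F[v][u] -= back
            F[u][v] += e - back
        total += e

# Source 0, sink 3.
C = [[0, 3, 2, 0],
     [0, 0, 1, 2],
     [0, 0, 0, 2],
     [0, 0, 0, 0]]
print(max_flow(C, 0, 3))   # 4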