Previous topic

5. Thoughts on finance

6. Mathematical notes¶

Here mathematical notes are kept.

6.1. Using LaTeX syntax for mathematics in RestructuredText and Sphinx¶

Abstract:

This document is about using Latex in reStructuredText and Sphinx. At the link below there is a introduction to mathematical symbols in Latex.

ftp://ftp.ams.org/pub/tex/doc/amsmath/short-math-guide.pdf

6.1.1. Math/Latex in RestructuredText and Sphinx¶

There are 2 ways of writing math in restructuredText:

Inline :math:SomeLatex
To insert relatively simple latex for mathematical expressions like in the text write :math:\psi(r) = \exp(-2r) Inside the back-tics () any Latex code can be written. There must be no space between :math: and the first back-tic.
Complex Latex like eg equation

For producing more complex math like eg an equation* environment in a LaTeX document write:

.. math::
\psi(r) = e^{-2r}

you will get:

Warning

The math markup can be used within reST files (to be parsed by Sphinx) but within your python’s docstring, the slashes need to be escaped ! :math:\alpha should therefore be written :math:\\alpha or put an “r” before the docstring

6.1.2. Examples¶

Below are some examples of Latex commands and symbols:

To do square roots like eg use :math:\sqrt{x^2-1}.

To do fractions like eg use :math:\frac{1}{2}.

In order to insert text into a formula use :math:k_{\text{B}}T to get like in

Use \left and \right for before brackets like in :math:\left(\frac{1}{2}\right)^n to get .

Displayed math can use \\ and & for line shifts and alignments, respectively.

.. math::
a & = (x + y)^2 \\
& = x^2 + 2xy + y^2

The result is:

The matrix environment can also contain \\ and &. To get like:

write:

.. math::
\left(\begin{matrix} a & b \\
c & d \end{matrix}\right)

Equations are labeled with the label option and referred to using:

:eq:label

E.g:

(1)

See equation (1)

is written like:

.. math::
:label: pythag

a^2 + b^2 = c^2

See equation :eq:pythag

Note that:

1. No spaces in label name
2. There should be a blank line before the actual latex code
3. There should be a space between :label: and the label name
4. When referred to label name should be surrounded by

6.2. Splines at Scipy¶

Abstract:

This chapter is about splines in Scipy and how they can be used for yield curve calculations.

It is shown that splines in scipy is probably b-splines. These splines gives different (but not nessacerily wrong) values when interpolating.

This information is important when using splines from scipy.

6.2.1. Natural cubic splines¶

The natural cubic spline is already implemented in decimalpy

In order to redo the calculations below you need to install and import the package decimalpy.

>>> import decimalpy as dp

The example will be taken from [Kiusalaas] p. 119. The data are:

>>> x_data = dp.Vector([1, 2, 3, 4, 5])
>>> y_data = dp.Vector([0, 1, 0, 1, 0])

And the natural cubic spline function is instantiated by:

>>> f = dp.NaturalCubicSpline(x_data, y_data)

And the it is possible to do calculations like function values:

>>> print f(1.5), f(4.5)
0.7678571428571428571428571427 0.7678571428571428571428571427

Slopes:

>>> print f(1.5, 1), f(4.5, 1)
1.178571428571428571428571429 -1.178571428571428571428571428

and curvatures:

>>> print f(1.5, 2), f(4.5, 2)
-2.142857142857142857142857142 -2.142857142857142857142857143

and it is quite easy to do a plot:

>>> import matplotlib.pyplot as plt
>>> linspace = lambda start, stop, steps: dp.Vector([(stop - start) / (steps - 1) * i + start for i in range(steps)])
>>> xnew = linspace(1, 5, 40)
>>> x0, spread  = 2, 0.5
>>> f0x0, f1x0, f2x0 = f(x0, 0), f(x0, 1), f(x0, 2)
>>> f0Lst = dp.Vector([f(x) for x in xnew])
>>> f1Lst = dp.Vector([f0x0 + f1x0 * (x - x0) for x in x2new])
>>> f2Lst = dp.Vector([f0x0 + f1x0 * (x - x0) + f2x0 * (x - x0)**2 / 2 for x in x2new])
>>> plt.plot(x_data, y_data, 'o', xnew, f0Lst, '-', x2new, f1Lst, '-', x2new, f2Lst, '-')
>>> plt.legend(['data', 'fitted f', 'tangent', '2. order'], loc='best')
>>> plt.show()

like:

6.2.2. Splines at Scipy¶

On the other hand [scipy] is well equipped with spline functionality

First set up scipy and numpy:

>>> from numpy import linspace, array, float64
>>> from scipy.interpolate import InterpolatedUnivariateSpline as spline

Use the same data as before:

>>> x_data = array([1, 2, 3, 4, 5], float64)
>>> y_data = array([0, 1, 0, 1, 0], float64)

Creating the cubic spline function:

>>> cubic = spline(x_data, y_data, k=3)

And calculate the cubic spline at 1.5 and 4.5:

>>> cubic(1.5)
array(1.1250000000000002)
>>> cubic(4.5)
array(1.1250000000000002)

This isn’t the same value as before (0.767857142857).

And if one uses interp1d from scipy.interpolate to get the function cubic2:

>>> from numpy import linspace, array, float64
>>> from scipy.interpolate import interp1d
>>> cubic2 = interp1d(x_data, y_data, kind='cubic')

Using cubic2 one gets:

>>> cubic2(1.5)
array(1.1944444444444462)
>>> cubic2(4.5)
array(1.1944444444444466)

Again a new set of values. What is a bit surprising is that a third set of values appears.

So to check the functions in a graph I do:

>>> xnew = linspace(1, 5, 40)
>>> import matplotlib.pyplot as plt
>>> plt.plot(x_data, y_data, 'o', xnew, f0Lst, '-', xnew, cubic(xnew), '--', xnew, cubic2(xnew), '+-')
>>> plt.legend(['data', 'natural spline', 'cubic', 'cubic2'], loc='best')
>>> plt.draw()

to get the graph:

And now it is obvious that scipy isn’t using the natural spline as cubic spline.

What kind of spline is used in scipy? Their code is build upon FITPACK by P. Dierckx.

The main reference is [Dierckx] which is all about b-splines. So a fair guess is that scipy also is b-splines.

This isn’t necessarily bad. Actually I haven’t looked into the differences between the 2 scipy interpolations.

Conclusion:

It matters what kind of interpolation one uses.

The difference in values is probably due to different derivations of (cubic) splines where scipy interpolation is based on b-splines.

6.3. Derivation of Cubic Spline¶

Abstract:

This chapter is about derivation of cubic splines and the use of cubic splines for yield curve calculation.

6.3.1. Deriving the cubic spline function¶

Assume n points and n curvatures at each point.

To ease notation we introduce:

The point are known, the curvatures are assumed known

Because we want to evaluate f,f’,f’’ at it makes things easier when they are formulated as functions of and . Eg it is quite simple to see that:

So the second derivative is continous and piecewise linear.

Also the first derivative f’ can be formulated as (since :

The function which lies and for must look like (integrating twice):

where and or

This is possible if the determinant . This is the case if there isn’t 2 points with the same x-value. An asumption which already must be in place.

And since f must be continous and go through the points we have:

and

This means that the constant :

6.3.2. Ekstrapolation¶

The question is what to do when there is a need for extrapolation. This is a situation quite common when talking about yieldcurve. Eg there are observed zero coupons up until year 20 but a zero coupon for year 25 is needed.

Talking about the natural cubic spline and financial cubic spline they are just linear extrapolations of the endpoints with the same slope as in the endpoints.

The only difference between the 2 cubic splines is that financial cubic spline is set to have a slope equal to zero at the endpoint to the right.

6.3.3. Estimation of the curvatures¶

To estimate the curvatures one uses the first order derivatives:

And the assumption that the slopes are continous at the points , ie

This gives n - 2 linear equations to estimate n curvatures. This means that 2 asumptions are necessary in order to calculate the n curvtures.

6.3.3.1. The Natural Cubic Spline¶

Here the asumptions are . The other equations becomes :

In all n variables and n linear equations

6.3.3.2. The Financial Cubic Spline¶

Here the asumptions are and or:

The other equations becomes :

Also here there are in all n variables and n linear equations

6.4. Using Barycentric Lagrange Interpolation in Finance¶

Abstract:

Lagrange Interpolation has had a revelation since the Barycentric formulas were developed.

Here Barycentric Lagrange Interpolation will be examined as a tool for finding weights to approximate the first and second order derivative for a function only known by it’s functional values.

The first order is used for the duration in finance while the second order derivative is used for the convexity.

This section is highly inspired on [BerrutTrefethen], [SadiqViswanath] and [Fornberg].

Also as references [Kiusalaas] and [Ralston] has been invaluable.

6.4.1. Lagrange Interpolation¶

First construct a polynomial with n + 1 different grid values :

This polynomial has degree n + 1. And it is obvious then that .

Consider then the tangent for p in grid value , ie.:

Now define n + 1 polynomials as:

These polynomials all satisfy:

In other words is Kroenecker’s delta. Note that the ‘s all have degree n.

Since we have:

And then when letting the grid values be the argument of l(x) (last part above is zero since is):

we have that the polynomials are well defined. This is because can be shortened out and then the polynomials are just scaled with a nonzero scaling constants.

Definition, Lagrange Interpolation and Remainder:

Consider n + 1 points . Then the unique polynomial of the minimal degree n going through all n + 1 points is:

Define as the Lagrange remainder.

Since is Kroenecker’s delta it is obvious that .

Theorem, Lagrange Remainder:

Let being an open interval containing all the n+1 grid values and f is a n+1 times continous differentiable function on :

Proof:

For :

Observe that i.e. has n+1 zeros.

Also observe that since p(x) has degree n and .

Now define:

Observe that has both and x as roots, ie n+2 zeros.

According to Rolle’s theorem has n+1 zeros and successively has n zeros.

So has 1 zero, ie or

Q.E.D.

The Lagrange Remainder can be used to limit the overall accuracy for the Lagrange Interpolation if the function f(x) is n+1 times continous differentiable on the closed interval .

This is because , ie:

So if converges to zero then the accuracy becomes of order n+1 ie .

Observation:

The Lagrange Interpolation can for a set of n+1 grid values can be seen as a way of interpolating a function value f(x) based on a weighted average of the functional values where the weights are and the accuracy will be of order .

This can be taken even further. Since only the weights depends on x the first order differentiation of can be approximated by:

So the first order derivative at x can be approximated by the functional values and the weights are and the accuracy will be of order , ie the accuracy loses 1 degree.

And successively for the second order derivative:

This makes Lagrange Interpolation well suited for finding weighted averages of functional values for derivatives of any order with high accuracy, .

6.4.2. Barycentric Lagrange Interpolation¶

Computationally Lagrange Interpolation isn’t that effective. But further analysis shows that the ‘s all have the same common factor so we have the First Barycentric formula:

(2)

where

Looking at the constant function 1 gives:

or

This way we have the Second Barycentric formula:

(3)

Both formulas shows a remarkable low dependence on the ‘s since a change in one ‘s will have a multiplicative effect only one place.

6.4.2.1. Algorithms¶

Interpolation should be done based on the Second Barycentric formula (3).

For updating the weights with an extra point consider a set of basis points to be used as a basis for Barycentric Lagrange interpolation.

When a new grid value has to be added the algorithm is:

1. Calculate
2. Calculate

It is obvious that there needs to be minimum 2 values. When there are only 2 grid values and then .

6.4.2.2. Numerical differentiation at grid points¶

One of the things to use Barycentric Lagrange interpolation is to evaluate is to differentiate a curve at a point when only a set points are known with a high precision.

Usual formulas are eg, which is considered as a 3 point estimation based on the points with first coordinate (x - h, x, x + h).

The precision or accuracy is here of order 2.

Now consider:

Now according to Second Barycentric formula (3):

Letting where is one of the roots (Important assumption) we have:

The last equation will be used several times so it gets a number:

(4)

Since by design and we have for in (4):

Now for i = j we have:

This way we have:

Theorem, Barycentric first order derivative:

Consider a set of grid values and the function values . The formula for the first order derivative at one of the grid values , is:

where

Example 1:

The formula for interpolating the first order derivative can used to optimize the approximation of the derivative. Consider eg . Then and:

where i is the row index while j is the column index.

And then with:

The result is:

And this is the formula for Newton’s difference quotient.

Example 2:

Consider eg . Then and:

These coefficients are to formula 25.3.4 on page 883 in [AbramowitzStegun].

They are all of accuracy .

Looking at the middle row the formula for Newton’s difference quotient is found once again:

(5)

Again by design and we have for in (5):

From Barycentric first order derivative Barycentric first order derivative we have so:

Finally since :

Now for i = j we have something similar as for the first order case:

This way we have:

Theorem, Barycentric second order derivative:

Consider a set of grid values and the function values . The formula for the second order derivative at one of the grid values , is:

where

Note

There is a sign error in [BerrutTrefethen] page 513

Example 2, second order:

Consider eg . Then and so:

This is usual Newton second order central finite difference approximation of accuracy , ie:

Also so:

This is usual Newton second order forward finite difference approximation of accuracy , ie:

Both results are in accordance with table 25.2 page 914 in [AbramowitzStegun].

[SadiqViswanath] study the fact that the accuracy sometimes is 1 higher than specified by Lagrange, ie the accuracy is boosted. One of their results are:

Corollary 7 page 14:

Look at the grid values . If the grid relative grid values are symmetric about 0 (in other words z is a relative grid value if and only if −z is a relative grid value) and n − m (m is the order of the derivative) is odd, the order of accuracy is boosted by 1.

So according to the corollary the Newton second order central finite difference approximation is actually of accuracy since has symmetric relative grid values and is odd. On the other hand the accuracy of the Newton second order forward finite difference approximation is still . This is confirmed by table 25.2 page 914 in [AbramowitzStegun].

Decision, the finance package:

In the finance package the derivatives numericcally will approximated such that the accuracy leading to an accuracy down to the sixteenth decimal.