8. On Python

Notes on Python and how to use it in finance.

8.1. Arrays for Financial Calculations

Abstract:

There are basically 5 ways of defining arrays in Python:

  • Python’s own array module
  • Python’s own ctypes module
  • Numpy’s array class
  • Python’s own list type
  • The mpmath matrix class

The reason for using arrays is to implement vector-like data types that ease calculations.

But the results of the calculations have to be acceptable to an ordinary financial user. Therefore a simple acid test is performed.

The conclusion is that for financial calculations and Python integration purposes it is best to construct a decimal array based on Python’s own list type.

A numpy-like, Decimal-based package, decimalpy, implements what is necessary for financial calculations.

8.1.1. Numpy’s and Python’s own array module

Here numpy has a large set of advantages. Let’s demonstrate.

First import numpy:

>>> import numpy as np

Now consider a simple bullet cashflow over 5 periods with a fixed rate of 0.1. It’s quite easy to generate through numpy:

>>> cf = np.ones(5) * 0.1
>>> cf[4] += 1
>>> cf
array([ 0.1,  0.1,  0.1,  0.1,  1.1])

Further it is quite easy to calculate the discount factors through numpy:

>>> n = np.array(range(1,6))
>>> discount = 1.1 ** -n
>>> discount
array([ 0.90909091,  0.82644628,  0.7513148 ,  0.68301346,  0.62092132])

So the discounted cashflows are:

>>> from pprint import pprint
>>> pprint(cf * discount)
array([ 0.09090909,  0.08264463,  0.07513148,  0.06830135,  0.68301346])

All in all numpy looks like a very serious base for financial calculations. The only problem arises when the present value has to be calculated, ie the sum of the discounted cashflows.

Since both the cashflow and the discount factors are based on the rate 10%, the present value should be exactly 1. But what happens is:

>>> sum(cf * discount)
0.99999999999999978

This is of course due to floating point arithmetic, but since users of financial calculations have to trust the software, and because users are likely to do similar calculations to test the software, we have to do better.

So as I see it financial calculations have to be based on the Python type Decimal. But there is no array of Decimals as such. It has to be created.

But by construction neither numpy’s nor Python’s own array can have Decimal as its data type.

8.1.2. The ctypes array

But what can be done with ctypes?

First some imports:

>>> import ctypes
>>> from decimal import Decimal
>>> from pprint import pprint

Then define a vector of Python objects of length 5:

>>> vec = ctypes.py_object * 5

Now for the cashflows:

>>> cf = vec()
>>> for i in range(5):
...     cf[i] = Decimal('.1')
...
>>> cf[4] += 1
>>> pprint([cf[i] for i in range(5)])
[Decimal('0.1'),
 Decimal('0.1'),
 Decimal('0.1'),
 Decimal('0.1'),
 Decimal('1.1')]

And the discounts:

>>> discount = vec()
>>> for i in range(1,6):
...     discount[i-1] = Decimal('1.1') ** -i
...
>>> pprint([discount[i] for i in range(5)])
[Decimal('0.9090909090909090909090909091'),
 Decimal('0.8264462809917355371900826446'),
 Decimal('0.7513148009015777610818933133'),
 Decimal('0.6830134553650706918926302848'),
 Decimal('0.6209213230591551744478457135')]

And finally to the present value:

>>> sum(discount[i] * cf[i] for i in range(5))
Decimal('1.000000000000000000000000000')

So even though this isn’t as elegant as numpy, it could be the basis for an abstract data type like a decimal vector type.

A Decimal-based, numpy-array-like data type is implemented in the decimalVector module.

8.1.3. The list based array

If a decimalvector is a subclass of a list then it fits naturally into other packages like matplotlib.

So for integration purposes the list based decimal vector is used in the finance package.
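
The acid test from above can be repeated with a plain list of Decimals, which is essentially what a list based decimal vector builds on. A minimal sketch (the result matches the ctypes version above):

>>> from decimal import Decimal
>>> cf = [Decimal('0.1')] * 4 + [Decimal('1.1')]
>>> discount = [Decimal('1.1') ** -t for t in range(1, 6)]
>>> sum(c * d for c, d in zip(cf, discount))
Decimal('1.000000000000000000000000000')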

8.1.4. The mpmath matrix

Another alternative is to use the matrix class from mpmath. The acid test is given below.

First import mpmath:

>>> from mpmath import *

Create the cashflow of a bullet loan over 5 periods with a rate of 10% per period as a row vector:

>>> cf = ones(1,5) * 0.1
>>> cf[0, 4] += 1
>>> cf
matrix(
[['0.1', '0.1', '0.1', '0.1', '1.1']])

Then build the discount vector as a column vector:

>>> discount = matrix([1.1 ** - time for time in range(1,6)])
>>> discount
matrix(
[['0.909090909090909'],
 ['0.826446280991735'],
 ['0.751314800901578'],
 ['0.683013455365071'],
 ['0.620921323059155']])

Then the acid test gives:

>>> cf * discount
matrix(
[['1.0']])

So mpmath appears to pass the acid test. Also it is said to be 10 to 100 times faster than Python’s own decimal type. Further it has a matrix structure like the one in numpy.

But actually it doesn’t pass the acid test, since

>>> (cf * discount)[0,0]
mpf('0.99999999999999978')

which isn’t an exact 1 as it should be.

8.2. On Subclassing the list Data Type

Abstract:

When you want to subclass the list data type there are some peculiarities in Python that need to be taken into account.

8.2.1. The Purpose

I want to create a data type as a subclass of a list. The goal is to get a (mathematical) vector as a list of Decimals, similar to the numpy array. Therefore I need, eg, slices to be returned as the new data type.

8.2.2. An Example

The following is made in Python 2.6.5.

Let’s look at the following code example:

>>> class test(list):
...     def __str__(self):
...         return 'test(%s)' % str([x for x in self])
...     def __getslice__(self, i, j):
...         print 'hi2'
...         return test(list.__getslice__(self, i, j))
...     def __getitem__(self, key):
...         print 'hi'
...         if isinstance(key, int):
...             return list.__getitem__(self, key)
...         else:
...             return test(list.__getitem__(self, key))
...

And let’s create some sample data:

>>> x = test([1,2,3,4,5])
>>> print x
test([1, 2, 3, 4, 5])

Now x is of type test. Let’s try to extract the element at the end of x.

>>> print 1, x[-1], type(x[-1])
1 hi
5 hi
<type 'int'>

The first thing worth noticing is that hi is written twice, ie __getitem__ is called twice (once for each time x[-1] is evaluated in the print statement). Also note that a naive implementation calling self[key] inside __getitem__ would recurse forever, so we are on the brink of recursive calls.

The rescue is to use list.__getitem__(self, key) as the way of doing the lookup from within __getitem__ itself.

Next let’s look at some slicing:

>>> print 2, x[2:-1], type(x[2:-1])
2 hi2
test([3, 4]) hi2
<class '__main__.test'>
>>> print 3, x[: -1], type(x[:-1])
3 hi2
test([1, 2, 3, 4]) hi2
<class '__main__.test'>

Here it is very interesting that only hi2 is written. In other words __getitem__ is never called, only __getslice__. It was actually the other way around when the index was an integer.

So both methods are necessary when you want to override the get/lookup functionality in a list subclass. And this is a bit strange since __getslice__ has been deprecated since Python 2.0.
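
A related detail (a small experiment under the same Python 2.6 assumptions as above): extended slices with a step bypass __getslice__ and call __getitem__ with a slice object instead:

>>> print 4, x[::2], type(x[::2])
4 hi
test([1, 3, 5]) hi
<class '__main__.test'>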

8.3. A Note on Operator Overload

Abstract:

Some details on operator overload are described in this note. The text is about operator overload in general in Python, but it is exemplified through addition (__add__ and __radd__).

I was testing operator overload when I found this. Let’s start by looking at 2 classes with left operator (addition) overload:

>>> class A:
...     def __add__(self, value):
...         return 1
...
>>> class B:
...     def __add__(self, value):
...         return 2
...
>>> a=A()
>>> b=B()

So let’s test the operator overload. A’s method __add__ defines that a plus something is 1:

>>> a+b
1

And likewise with B. B.__add__ defines that b plus something is 2:

>>> b+a
2

Now if I define right addition for A, __radd__:

>>> A.__radd__ = lambda self, value: 11
>>> a=A()

To me it would make sense that if I define a new class and overload an operator, then that operator should be fully defined by the new class itself. Otherwise I would have to change the __add__ method of all the classes I would like to add to from the right.

But if I do:

>>> a=A()
>>> a+b
1
>>> b+a
2

then it is still B.__add__ that is used. Only if I delete the B.__add__ method is it done as wanted:

>>> del B.__add__
>>> b=B()
>>> a+b
1
>>> b+a
11

Looking into the documentation, all this is also what is supposed to happen. The method A.__radd__ only defines right addition for A when B has no (left) addition method B.__add__.

At first I thought that this was a design error. After all, it is when I define A that I know (and have to decide) which other classes I would like to add to from both sides.

So I thought of changing the rules to something like:

Suggestion:

If A has both __add__ and __radd__ then these methods define the meaning of the operator on A. If however A only has an __add__ method, then the method __add__ from B should be used.

A problem however might appear in the case where both A and B have both methods. Which method is then to be used?

Let’s see:

  • a+b will use B.__radd__ and A.__add__. But if the suggestion above is followed then B.__radd__ supersedes A.__add__
  • On the other hand b+a would lead to the use of A.__radd__

So the suggestion above would move the code from __add__ to __radd__, ie nothing is gained.

One way to let class A define both sides of addition is by:

>>> B.__radd__ = lambda self, value: value.__radd__(self) if isinstance(value, A) else 2
>>> b=B()
>>> b+a
11

But this way a hierarchy would have to be assumed, eg numbers are lower than vectors, so that vectors define both the right and left side of an operator (eg addition) with numbers, and so on.
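
As a sketch of that idea, a hypothetical list based Vector class (not the one from the finance package) could define both sides of addition with numbers and rely on the number type giving up (returning NotImplemented), so that __radd__ is used:

>>> class Vector(list):
...     def __add__(self, value):
...         return Vector([x + value for x in self])
...     __radd__ = __add__
...
>>> Vector([1, 2]) + 3
[4, 5]
>>> 3 + Vector([1, 2])
[4, 5]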

Finally note that built-in types like int return NotImplemented when they do not recognize the other operand, so the __radd__ fallback is used. Otherwise we would have had problems with:

>>> a+1
1
>>> 1+a
11

8.4. A note on decorators

Abstract:

Decorators have puzzled me ever since I first heard of them. Usually in Python there is a very clear logic (to me anyway), but not here.

Then I read Python Conquers The Universe. It made me decide to present my own way of understanding decorators. I will do it by studying the functionality of decorators in Python. In this text I will only look at decorators defined as functions.

Let’s start with the definition from Python Conquers The Universe. It says:

Definition, Python Conquers The Universe:

A decorator is a function that takes a function object as an argument, and returns a function object as a return value.

This definition is only half the truth, because decorators can also transform arguments and return values if needed.

Further they can keep track of usage etc. But then it appears that one has to use the class based definition of decorators.

A general presentation of a decorator based on a function could be like:

 1  pp_dict = lambda dct: '{\n' \
 2              + '\n'.join(['* %s = %s' % item for item in dct.items()]) \
 3              + '\n}'
 4
 5
 6  def decorator(test='HI!'):
 7
 8      print 'decorator in', pp_dict(locals())
 9
10      def function_on_function(function):
11          print 'function_on_function in', pp_dict(locals())
12
13          def call_function(*args, **kwargs):
14              print 'call_function in', pp_dict(locals())
15              tmp = '%s, %s' % (test, function(*args, **kwargs))
16              print 'call_function out', pp_dict(locals())
17              return tmp
18
19          print 'function_on_function out', pp_dict(locals())
20          return call_function
21
22      print 'decorator out', pp_dict(locals())
23      return function_on_function
24
25
26  @decorator()
27  def dummy(x, y=5):
28      return x + y
29
30
31  print "### decorator()(dummy)(2) ###"
32  print dummy(2)

In lines 6 to 23 above a decorator is defined. It is the most advanced version, with 3 nested layers of function definitions.

Later, in lines 26 to 28, the decorator is used on the function dummy.

And finally in line 32 a function call is tried.

When this code is run one gets:

 1  decorator in {
 2  * test = HI!
 3  }
 4  decorator out {
 5  * test = HI!
 6  * function_on_function = <function function_on_function at 0x919fb8c>
 7  }
 8  function_on_function in {
 9  * function = <function dummy at 0x919fbc4>
10  * test = HI!
11  }
12  function_on_function out {
13  * function = <function dummy at 0x919fbc4>
14  * test = HI!
15  * call_function = <function call_function at 0x919fbfc>
16  }
17  ### decorator()(dummy)(2) ###
18  call_function in {
19  * kwargs = {}
20  * test = HI!
21  * function = <function dummy at 0x919fbc4>
22  * args = (2,)
23  }
24  call_function out {
25  * kwargs = {}
26  * test = HI!
27  * function = <function dummy at 0x919fbc4>
28  * args = (2,)
29  * tmp = HI!, 7
30  }
31  HI!, 7

So what happens here? First of all, everything before output line 17, “### decorator()(dummy)(2) ###”, happens at the time of definition of the decorated function dummy.

At the time the decorated function dummy is defined the decorator is called. It has a single argument test which is set to ‘HI!’. This value acts as a “global” for the subfunctions within the decorator function.

The decorator also defines the function function_on_function. These 2 variables are the only local variables of the decorator function.

The decorator function then returns the function function_on_function, which requires a function as its single argument. The @ at code line 26 inserts the function dummy as that argument. This can be seen from output lines 8 to 11.

So code lines 26 and 27 correspond to the code: dummy = decorator()(dummy)

where the call decorator() returns function_on_function.

The function function_on_function defines and returns the function call_function. That is the function that replaces the original function dummy when dummy is called.

Code lines 14 and 16 in the function call_function show that you can do pre- and post-processing around the call of the function dummy.

Pre- and post-processing could eg be validating or transforming input and output.

Output lines 18 to 30 show that test is accessible as an argument. But you can’t change its value. That might be ok, since the functions decorator and function_on_function are run at definition time of the function dummy.

Another interesting thing shown in output lines 18 to 30 is that y=5 at code line 27 isn’t passed to the kwargs variable in the function call_function. It is applied only when the function dummy is called at code line 15.

This is actually quite logical. At code line 32 the function dummy is called with args = (2,) and no kwargs. But it is actually call_function that is called at this point.

It is only inside call_function, at code line 15, that the original function dummy is called. And it is only there that the second parameter (y) is set to five. It is also here that the code will reject a wrong number of arguments.

8.4.1. Decorators and doctest

Packages like doctest and docutils depend on the content of __doc__. To see that:

>>> def decorator(function):
...     def wrap(*args):
...         return 'decorator(%s)' % function(*args)
...     return wrap
...
>>> @decorator
... def test(x):
...     '''test text'''
...     return 'test(%s)' % x
...
>>> test(2)
'decorator(test(2))'
>>> test.__name__
'wrap'
>>> test.__doc__

It is obvious that using the decorator replaced the original __name__ and cleared the content of the __doc__ variable.

To prevent this there is a simple solution. Just pass the content of __doc__ and __name__ from the function argument to the wrap function:

>>> def decorator(function):
...     def wrap(*args):
...         return 'decorator(%s)' % function(*args)
...     wrap.__name__ = function.__name__
...     wrap.__doc__ = function.__doc__
...     return wrap
...

Then the decorator keeps the content of the original function:

>>> @decorator
... def test(x):
...     '''test text'''
...     return 'test(%s)' % x
...
>>> test(2)
'decorator(test(2))'
>>> test.__name__
'test'
>>> test.__doc__
'test text'

If the decorator is imported from another module then the content of the variable __module__ must also be passed from the function argument to the wrap function.
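
The standard library has a helper for exactly this bookkeeping: functools.wraps (available since Python 2.5) copies __module__, __name__ and __doc__ from the wrapped function. A sketch of the same decorator using it:

>>> import functools
>>> def decorator(function):
...     @functools.wraps(function)
...     def wrap(*args):
...         return 'decorator(%s)' % function(*args)
...     return wrap
...
>>> @decorator
... def test(x):
...     '''test text'''
...     return 'test(%s)' % x
...
>>> test.__name__
'test'
>>> test.__doc__
'test text'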

8.5. On tkinter

Abstract:

I’m all new to making a graphical user interface in Python. Previously I’ve been working with VBA and Excel, and there I’ve been surprised at what you can accomplish using functions like MsgBox and InputBox.

And later, trying to run Excel from Ubuntu and Wine, I found out that these functions aren’t even a part of VBA, but are a part of Windows itself.

I’ve searched on the internet and found quite a lot saying that tkinter is too primitive. After a first try I must say I do not agree.

This chapter summarizes the set of similar standard functions in Python that make it possible to build simple communication with the user. To me it seems a more than sufficient set of functions to handle eg scripting from within spreadsheets or other applications.

Here an unnoticed but powerful set of tools is revealed.

And it surprises me that this hasn’t become the base of smartphone development, since it is simple yet highly flexible.

8.5.1. The simple dialog functions in python

Note

For all the functions below the Tkinter root window (Tk) has to be running. So the following code (or similar) has to precede any use of the functions below:

>>> import Tkinter
>>> root = Tkinter.Tk()
>>> root.withdraw()

Another way would be simply to install and use the module SimpleTk.

Below, a set of simple standard dialog functions for user interaction is described.

In the module tkSimpleDialog you get 3 functions:

  • askfloat
  • askinteger
  • askstring

They are all called with the parameters title, prompt and kw, where the first 2 are self-explanatory string parameters, whereas the last one is actually a set of keywords describing the layout of the dialogs. It can be ignored for now.

What is returned should be evident.
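
As a small illustration (assuming the Tk root from the note above is running), asking the user for a rate could look like:

>>> import tkSimpleDialog
>>> rate = tkSimpleDialog.askfloat('Input', 'Fixed rate per period:')

Here askfloat returns the entered value as a float, or None if the user presses Cancel.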

In module tkMessageBox you get a set of messageboxes:

  • askokcancel - Ask if operation should proceed; return true if the answer is ok
  • askquestion - Ask a question
  • askretrycancel - Ask if operation should be retried; return true if the answer is yes
  • askyesno - Ask a question; return true if the answer is yes
  • askyesnocancel - Ask a question; return true if the answer is yes, None if cancelled.
  • showerror - Show an error message
  • showinfo - Show an info message
  • showwarning - Show a warning message

Similar to the functions in tkSimpleDialog, the functions in tkMessageBox are called with the parameters title, message and options, where the last one can be ignored for now and the first 2 typically are self-explanatory string values.
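
As an example, a yes/no question could be sketched as:

>>> import tkMessageBox
>>> answer = tkMessageBox.askyesno('Cashflows', 'Recalculate the cashflows?')

Here answer is True if the user presses Yes and False otherwise; the show* functions are called the same way, eg tkMessageBox.showinfo('Done', 'The cashflows were recalculated').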

If the user has to choose a file name or a path there is the module tkFileDialog.

In tkFileDialog there are the following methods:

  • askdirectory - Ask for a directory, and return the directory name
  • askopenfile - Ask for a filename to open, and returns the opened file in the set mode (first parameter). Default mode is ‘r’
  • askopenfilename - Ask for a filename to open
  • askopenfilenames - Ask for multiple filenames to open. Returns a list of filenames, or an empty list if the cancel button is selected
  • askopenfiles - Ask for multiple filenames and return the open file objects in the set mode (first parameter). Default mode is ‘r’. Returns a list of open file objects, or an empty list if cancel is selected
  • asksaveasfile - Ask for a filename to save as, and returns the opened file in the set mode (first parameter). Default mode is ‘w’
  • asksaveasfilename - Ask for a filename to save as

All the functions in tkFileDialog also have a second parameter, options, which typically can be ignored.
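
A typical use, eg letting the user pick a file to read, could be sketched as:

>>> import tkFileDialog
>>> filename = tkFileDialog.askopenfilename(title='Select a data file')

Here filename is the chosen path as a string, or an empty string if the dialog is cancelled.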

Finally there is the module tkColorChooser with the single function askcolor, which asks for a color. A default color can be set as the first parameter.
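
A small sketch, with 'red' as the preset color:

>>> import tkColorChooser
>>> rgb, name = tkColorChooser.askcolor('red')

Here rgb is a (red, green, blue) tuple and name a Tk color string, or (None, None) if the dialog is cancelled.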

See also the SimpleTk module