May 22nd 2014

Abstract

For scientific computing, fast iterations and rapid prototyping is essential for designing experiments and working with data. When we can see the result of what we are doing right away, it is a lot simpler to adjust algorithm parameters and fine tune plots to suit our purpose. IPython is an interactive python console with a browser based notebook that comes with support for code, text, mathematical expressions and inline plots. It is made with scientific computing in mind and allows for a matlab-like iterative approach for working with data. In this presentation I will give an introduction to IPython and showcase how it can be used for working with data, plotting and fast prototyping.

Introduction

This tutorial will cover basic principles in using IPython for development and scientific research. I will assume a basic familiarity with Python coding in general, although I think the code examples should be easy enough to understand for those that aren't. If you have questions, you are of course welcome to interrupt me and ask me to clarify something.

I will start with a few notes on installing Ipython before I introduce the basic features and showcase inline plotting. Then I will go through a small example computing the Mandelbrot set to showcase my normal workflow in IPython.

Installation

On Debian/Ubuntu you can install ipython by installing pip: sudo apt-get install python-pip

Then using pip you can easily install ipython and dependencies: sudo pip install ipython[notebook]

For pylab (plotting) you will need numpy and matplotlib: sudo pip install numpy matplotlib

On other platforms you can find instructions on the IPython installation page.

Using IPython

IPython consists of a few different shells that make it easy to enter python and evaluate python expressions. In this tutorial we will focus on the IPython Notebook which is a browser based shell with inline graphics. To open IPython Notebook, open a terminal and navigate to your project directory, then type:

ipython notebook --pylab inline

The 'pylab inline' means that any command creating a pylab plot or showing an image will open up in the interactive python shell.

Evaluating Expressions

After entering the command, IPython will open in a browser window in any open browser showing the directory page from where you can load saved IPython notebooks or create a new one. When you create a new notebook you are met with a page featuring text-input (a cell) in which you can enter any python code and have it evaluated by pressing the run button or by pressing Ctrl + Enter:

n = 4+8"I would like %i shrubberies" % n
> 'I would like 12 shrubberies'

To insert a new cell below, go to insert and click 'Insert Cell Below' or press Ctrl + m and then b (as in below). You can change any cell and reevaluate it. Any variable assignment that you evaluate will be available in all other cells:

"'n' declared above has a value of %i" % n
> "'n' declared above has a value of 12"

Markdown Cells

Outside of python code you can also specify that a cell contains markdown formatted text. Markdown is a simple syntax for formatting text documents which makes it easy to include links, lists, code examples etc. You can find a brief overview on the markdown syntax here.

IPython commands

While ipython evaluates python code, the shell comes with a set of commands built-in. A nice overview of the commands are featured here, but in day to day coding the one I find the most useful is %timeit:

import numpy
%timeit numpy.sqrt(numpy.ones((400, 400)))
> 100 loops, best of 3: 2.67 ms per loop

Plotting

Since we've loaded IPython Notebook with pylab inline, plotting a figure is as simple as typing 'plot' which uses MatPlotLib:

values = numpy.arange(30)**2
plot(values)
plot

plot

An Example: The Mandelbrot Set

As an example I will iteratively go through the example of coding up the algorithm that computes the Mandelbrot fractal.

For the definitions of the Mandelbrot set I take the liberty of quoting wikipedia:

The Mandelbrot set is a mathematical set of points whose boundary is a distinctive and easily recognizable two-dimensional fractal shape. The set is closely related to Julia sets (which include similarly complex shapes) and is named after the mathematician Benoit Mandelbrot, who studied and popularized it.

Mandelbrot set images are made by sampling complex numbers and determining for each whether the result tends towards infinity when a particular mathematical operation is iterated on it. Treating the real and imaginary parts of each number as image coordinates, pixels are colored according to how rapidly the sequence diverges, if at all.

More precisely, the Mandelbrot set is the set of values of c in the complex plane for which the orbit of 0 under iteration of the complex quadratic polynomial zn+1=z2n+c remains bounded.

This means that for a m×n grid of imaginary numbers going from, say, z=(−x−iy) to z=(x+iy) we apply function z_new=z*z+c continously, c being the initial value of z and observe when each value diverges, and then we create an m×n image where each pixel corresponds to how many iterations it took before the corresponding imaginary value in the grid converged.

Getting a grid

To do this, we need a grid of evenly spaced imaginary numbers to start with. The most straightforward way (I know of) in numpy is to create to arrays and add them as follows:

import numpy
t1 = numpy.linspace(0,1,5).reshape((1, 5))
t2 = numpy.linspace(1,3,3).reshape((3, 1))
print(t1)
print(t2)
t1 + t2*1j
> [[ 0.    0.25  0.5   0.75  1.  ]]
> [[ 1.]
   [ 2.]
   [ 3.]]
> array([[ 0.00+1.j,  0.25+1.j,  0.50+1.j,  0.75+1.j,  1.00+1.j],
         [ 0.00+2.j,  0.25+2.j,  0.50+2.j,  0.75+2.j,  1.00+2.j],
         [ 0.00+3.j,  0.25+3.j,  0.50+3.j,  0.75+3.j,  1.00+3.j]])

We can create a grid of size m×n=600×600 with values ranging from −2 to 1 for the real part and −1 to 1 for the imaginary part:

m = 600 # Height of plot
n = 600 # Width of plot
values_real = numpy.linspace(-2.3, 1, n).reshape((1,n))
values_imag = numpy.linspace(-1.4, 1.4, m).reshape((m,1))
initial_values = values_real + values_imag*1j
initial_values
> array([[-2.30000000-1.4j       , -2.29449082-1.4j       ,
          -2.28898164-1.4j       , ...,  0.98898164-1.4j       ,
           0.99449082-1.4j       ,  1.00000000-1.4j       ],
         [-2.30000000-1.39532554j, -2.29449082-1.39532554j,
          -2.28898164-1.39532554j, ...,  0.98898164-1.39532554j,
             0.99449082-1.39532554j,  1.00000000-1.39532554j],
         [-2.30000000-1.39065109j, -2.29449082-1.39065109j,
          -2.28898164-1.39065109j, ...,  0.98898164-1.39065109j,
           0.99449082-1.39065109j,  1.00000000-1.39065109j],
         ..., 
         [-2.30000000+1.39065109j, -2.29449082+1.39065109j,
          -2.28898164+1.39065109j, ...,  0.98898164+1.39065109j,
           0.99449082+1.39065109j,  1.00000000+1.39065109j],
         [-2.30000000+1.39532554j, -2.29449082+1.39532554j,
          -2.28898164+1.39532554j, ...,  0.98898164+1.39532554j,
           0.99449082+1.39532554j,  1.00000000+1.39532554j],
         [-2.30000000+1.4j       , -2.29449082+1.4j       ,
          -2.28898164+1.4j       , ...,  0.98898164+1.4j       ,
           0.99449082+1.4j       ,  1.00000000+1.4j       ]]) 

Let's now apply the function z_new=z*z+c to this grid. Then for each round we want to know which values are about to diverge. Because I don't have the patience to iterate an infinite amount of round, we just test all values if they are bigger than a certain threshold. Conveniently enough wikipedia argues that we only need to test if the norm of the value is above 2. For an imaginary number z=(x+iy) the norm is sqrt(x*x+y*y) = sqrt(z*conj(z)). For each iteration we log which numbers are divergent in the iterations matrix. We can then print this as a heatmap using matplotlib's imshow function:

values = initial_values
max_iterations = 30
iterations = numpy.ones(initial_values.shape) * max_iterations
for i in range(max_iterations) :
    values = values**2 + initial_values
    divergent = values * conj(values) > 4
    divergent = divergent & (iterations == max_iterations) # Test that we haven't already found this number
    iterations[divergent] = i
imshow(iterations)
The Mandelbrot fractal

The Mandelbrot fractal

Let's set matplotlib defaults to a bigger plot size:

rcParams['figure.figsize'] = 10, 10
imshow(iterations)
The Mandelbrot fractal (bigger)

The Mandelbrot fractal (bigger)

This is a bit cumbersome for experimentation though, so let's define a function so we can more easily play around with the values:

def mandelbrot(width, height, x_lim = (-2.3, 1), y_lim = (-1.4, 1.4), max_iterations = 30) :
    m = height # Height of plot
    n = width # Width of plot
    values_real = numpy.linspace(x_lim[0], x_lim[1], n).reshape((1,n))
    values_imag = numpy.linspace(y_lim[0], y_lim[1], m).reshape((m,1))
    initial_values = values_real + values_imag*1j
    initial_values
    values = initial_values
    iterations = numpy.ones(initial_values.shape) * max_iterations
    for i in range(max_iterations) :
        values = values**2 + initial_values
        divergent = values * conj(values) > 4
        divergent = divergent & (iterations == max_iterations) # Test that we haven't already found this number
        iterations[divergent] = i
    return iterations

Now let's try to zoom in a bit:

mandelbrot_data = mandelbrot(600, 600, (-0.56, -0.55), (-0.56,-0.55), 90)
imshow(mandelbrot_data)
Detail from the Mandelbrot fractal

Detail from the Mandelbrot fractal

The idea here is to showcase how it's simple to iteratively produce working code using ipython. The notes are also available as an ipython document here.

The cover is taken from www.misterx.ca which has a lot of nice renderings of mandelbrot sets.