Writing Scientific Application in Python

55
Writing scientific application in Python K.I.A Derouiche June 03 th 2013

description

Writing Scientific Application in Python

Transcript of Writing Scientific Application in Python

  • Writing scientific application in Python

    K.I.A Derouiche

    June 03 th 2013

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Outline

    1 Introduction

    2 Basics

    3 Python Modules for Science

    4 Faster Python and Glueing

    5 Summary

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Me !

    Membre in AADEMTIDeveloppeur-Maintainer in NetBSD ProjectFree Software Research EngPromise Theory

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Outline

    1 Introduction

    2 Basics

    3 Python Modules for Science

    4 Faster Python and Glueing

    5 Summary

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Who uses...

    We all use computers to generate or process data

    Question to the audience: who uses...

    C/C++?Fortran?Ada?Java?Matlab/Octave?

    IDL?Perl?Ruby?Python?

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    What is Python?

    Python is/has...

    a scripting languagegeneral purposeinterpretedeasy to learnclean syntax

    multi-paradigmopen-sourceavailable for all majorplatformsgreat community

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    The best of all:

    Python comes...... with batteries included!

    Libraries available for...

    daily IT needs...

    networksOS interactiontemporary fileszip filesweb API

    science!

    efficient array operations (NumPy)general numerical algorithms (SciPy)2D visualization (matplotlib)3D visualization (Mayavi, Paraview)special problems (e.g. finite elements withFEniCS, SfePy,quntum optics with QuTiP,EMpy)symbolic math (SageMath, SymPy)

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Outline

    1 Introduction

    2 Basics

    3 Python Modules for Science

    4 Faster Python and Glueing

    5 Summary

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Scientific Hello, World!

    import sysfrom math import sin, pi

    def sincSquare(x):"""Return sinc(x)^2."""if(x 0.0):

    return (sin(pi*x)/(pi*x))**2else:

    return 1.0

    x = sys.argv[1]y = sincSquare(float(x))print("sinc(%s)^2 = %s"%(x, y))

    cmp. H.-P. Langtangen,"Python Scripting forComputational Science"

    run with:

    python HelloWorld.py 0.0

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Control structures

    # if statements:if(divisor == 0):

    ...elif(divisor > 1E20):

    ...else:

    ...

    # loops:for i in range(10): # i = 0, 1, ..., 9

    print("i = %s"%i)

    # while loops:while(True):

    ...

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Functions

    # functions:def f(x, a=1.0, b=2.0):

    """Return a/x and a/x^b"""

    return a/x, a/x**b

    # somewhere else:y1, y2 = f(x, 5.0)y3, y4 = f(2, b=3.0)

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Functions (cont)#def sentence(**words):

    print(Receve %d word(s) %len(words))print(List of words: %s % .join(words.values()))print(Parameter name: %s % .join(words.keys()))

    sentence(word1=word1, word2=word2, word3=word3)

    #Combine parameter:def sentence(name, firstword=none_word, **words):

    pass

    #Arbitray parametresdef sum (sentence, *args):

    print(sentence % args)format(%d more the %s at possible, 2, combinaison)

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Example: calculs!

    c = lambda x, y: x**2 + y**2print c(3,4)

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Enumerate, Itertools

    from itertools import chain

    import numpy as np

    lst = np.arange(7)for idx, ele in enumerate(lst):

    print idx, ele

    list(chain(xrange(11), xrange(15, 21), xrange(30, 46)))

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Data types: List, Tuple

    a = 2 # integerb = 2.0 # floatc = "3.0" # stringd = [1, 2, "three"] # liste = "1"print(a*b) # valid, upcastingprint(a*c) # valid, but probably not desired: 3.03.0print(b*c) # invalidprint(d[1]) # prints 2for item in d: # lists are "iterable"

    print(item)for character in c: # strings are iterable

    print(character) # prints 3\n.\n0f = e + c # + joins strings: f = 13.0g = d + [someObj, "foobar"] # + joins lists

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Data types: Stacks, Queues, Dictionariesstack = [3, 4, 5]stack.append(6)stack.append(7)stack.pop()

    from collections import dequequeue = deque(["1", "2", "pi"])queue.append("Terry")queue.popleft()

    pole = {j: 40.98, s: 41.39}pole[g] = 41.27; pole[j]40.98del pole[s]pole.keys()g in pole

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Files

    readFile = open("infile", mode="r")writeFile = open("outfile", mode="w")

    for line in readFile: # iterate over files linesxString, yString = line.split() # split the linex = float(xString); y = float(yString)print("x = %s, y = %s"%(x, y))writeFile.write("%s * %s = %s\n"%(x, y, x*y))

    readFile.close(); writeFile.close()

    infile:

    1.0 2.03.0 4.0

    outfile:

    1.0 * 2.0 = 2.03.0 * 4.0 = 12.0

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Class, Method, Composition, Polymorphisme,Inheritancee

    import random

    class sales_normal(object):def __init__(self):

    pass

    def A(self):"""This commente """return random.randint (0, 1)

    def B(self):s = 0for i in range(0, 100):

    s += self.A ()return s

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Class, Method, Composition, Inheritancee

    class false_device(sales_normal):def A(self):

    t = random.randint(0, 10)if t >=3 :

    return 1else:

    return 0

    if __name__ == "__main__":p = sales_normal()print p.B()p2 = false_device()print p2.B()

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Reusing code: modules

    Place code to be reused in Module.py:

    """A Python module for illustration."""

    def printData():print(data)

    data = 2

    In somewhereElse.py, do something like:

    import Module

    Module.data = 3Module.printData()

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Some Python magic

    x, y = y, x # swapping

    print(1 > 2 > 3) # prints False

    # filtering (there is also reduce(), map())numbers = range(50)evenNumbers = filter(lambda x: x % 2 == 0, numbers)print("All even numbers in [0; 50): %s"%evenNumbers)

    # list comprehensions:squares = [x**2 for x in numbers]

    a += 2 # a = a + 2

    print("string" in "Long string") # prints True

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Pitfalls

    Common pitfalls:slicing: last index is exclusive, not inclusive as in e.g. Fortranx = [1, 2, 3, 4]print(x[0:2]) # prints [1, 2], not [1, 2, 3]

    What looks like performing an assignment is actually setting areference:a = []b = aa.append(2)print(a) # prints [2]print(b) # prints [2], not []!

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Outline

    1 Introduction

    2 Basics

    3 Python Modules for Science

    4 Faster Python and Glueing

    5 Summary

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Scientific Python Application Stack

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    The IPython shellIPythonAn interactive shell - may replace MatLab [tm] for interactive work

    SyntaxhighlightingTabcompletionInline docu-mentationEasyprofiling,timing...IPython 0.13.x:inlineplots...

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    NumPy: Python meets an array data type

    NumPyFast and convenient array operations

    Lists: + does join, not add!NumPy array: basic vector/matrix data typeConvenience functions (e.g. linspace(), zeros(),loadtxt()...)Array slicingelement-wise operationsCode using NumPy reads and writes very similar to modernFortran (slicing, vector valued indices...)

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    NumPy by examples(1)

    import numpy as np

    a = np.array([1.0, 2.0, 3.0, 4.0])b = np.array([4.0, 3.0, 2.0, 1.0])for item in a: # arrays are iterable

    print(item)c = a + b # c = [5, 5, 5, 5]print(a[0:3:2]) # 1.0, 3.0; last element not included!a[0:3] = b[0:-1]

    print(a*b) # prints [4, 6, 6, 4], not the scalar product!

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    NumPy by examples(2)

    import numpy as np

    #Creation:a = np.arange(10)a = np.zeros((2, 4), dtype=f)a = np.ones((3, 5))*np.nana = np.identity(3)a = np.mat([1, 0], [0, 1])a = np.indices((2, 5))c = a.flat # view

    # Dimensiona = np.array([0.1, 0.0, 0.2]) #a.ndim; a.dtype; a.sizeb = np.array([1, 2, 3], [4, 5, 6]) #b.ndim; b.dtype; b.size

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    NumPy by examples (3)#shapeimport numpy as np

    a = np.ones((3, 5, 7)).shapenp.shape(a)# reshape change shapea = np.arange(105).reshape((3, 5, 7))# Fortran implementationa = np.ones((2, 3, 4), order=Fortran); np.isfortran(a)#Transpos:a.T#index:a = np.arange(24).reshape(2, 3, 4)a[0][2][1]a[0][0:-1]a = np.arange(24).reshape(2, 3, 2, 2)a.tolist()##

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    NumPy by examples (4)

    import numpy as np

    a= np.arange(6).reshape(3, 2)1+np.sin(a)*1/2a>3a.max() #max valuea.sum() #sommea.mean # Moyenne#Matrixb = np.matrix(a); b[0]

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    SciPy

    SciPyNumerical algorithms using NumPy arrays

    Wrappers around well-established libraries

    Submodules:

    linalg: linear algebra(lapack)io: MATLAB FILE, NetCDF,HDF5 futur managersparse: sparse matricesfft: FFT (fftpack)optimize: optimization,zeros (minpack)

    integration: integration(quadpack, odepack)special: special functions(amos...)signal: signal processing

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    p(x) = x45 + 3x2 7x + 5#derivationfrom scipy import poly1dq = poly1d([0.5, 0.3, -7, 5])q.deriv()poly1d([ 1.5, 0.6, -7. ])print q

    #integrationfrom scipy import *q.integ()poly1d([0.1, 0., 1., -3.5, 5., 0.])

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Van der Pol oscillator:

    import scipy as spfrom scipy.integrate import odeintimport pylab as plt

    def vdp1000(y, t):dy=sp.zeros(2)dy[0] = y[1]dy[1] = 1000.*(1.-y[0]**2)*y[1]-y[0]return dy

    t0 = 0 ;tf = 3000; N = 300000dt = (tf-t0)/Ntgrid=sp.linspace(t0, tf, num=N)y= odeint(vdp1000, [2., 0.], tgrid)plt.plot(tgrid, y[:, 0])plt.show()

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    d2x(t)dt2 0(1 x2(t))dx(t)dt + 20x(t) = 0

    Already used here: Matplotlib

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    SymPy: CAS pure Python

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Matplotlib(mostly) 2D plots

    Pylab: MatLab alternative for interactive work

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Lorenz system

    dx(t)dt = (y(t) x(t))

    dy(t)dt = x(t) y(t) x(t)z(t)

    dz(t)dt = x(t)y(t) z(t)

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Outline

    1 Introduction

    2 Basics

    3 Python Modules for Science

    4 Faster Python and Glueing

    5 Summary

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Using Python as glue

    Python can wrap different different other programming languages

    Cython

    compiled, typed Python - interface C/C++ code

    f2py

    Fortran wrapper, included in NumPy

    Why do that?

    Python can be slowPython loops are slowcalling Python functions isslow

    Wrap external C/Fortran...librariesHappily/unfortunately (?)there is legacy code

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Problem: sinc(x)2import numpy as npfrom math import sin, pi

    def sincSquare(x):"""Return the sinc(x) = (sin(x)/x)**2 of the arrayargument x."""retVal = np.zeros_like(x)for i in range(len(x)):

    retVal[i] = (sin(pi*x[i]) / (pi*x[i]))**2

    return retVal

    #b = np.random.uniform(-10, 10, size=1000)#sincSquare(b)106 array elements: 1 loops, best of 3: 4.91 s per loop

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Problem: sinc(x)2import numpy as npfrom math import sin, pi

    def sincSquare(x):"""Return the sinc(x) = (sin(x)/x)**2 of the arrayargument x."""retVal = np.zeros_like(x)for i in range(len(x)):

    retVal[i] = (sin(pi*x[i]) / (pi*x[i]))**2

    return retVal

    #b = np.random.uniform(-10, 10, size=1000)#sincSquare(b)106 array elements: 1 loops, best of 3: 4.91 s per loop

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    First attempt: use NumPy array operations

    import numpy as np

    def sincSquareNumPy1(x):

    return (np.sin(np.pi*x[:])/(np.pi*x[:]))**2

    def sincSquareNumPy2(x):

    return np.sinc(x[:])**2

    106 array elements: first function: 10 loops, best of 3: 73 msper loop, second function: 10 loops, best of 3: 92.9 msper loop

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    First attempt: use NumPy array operations

    import numpy as np

    def sincSquareNumPy1(x):

    return (np.sin(np.pi*x[:])/(np.pi*x[:]))**2

    def sincSquareNumPy2(x):

    return np.sinc(x[:])**2

    106 array elements: first function: 10 loops, best of 3: 73 msper loop, second function: 10 loops, best of 3: 92.9 msper loop

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    How Cython works

    Cythoncompiled, possibly typed Python:

    .pyx fileCython .c file C compiler .so/.dll file

    various levels of typing possibleC output and Cythons opinion on code speed can easily beinspected (optional .html output)interfacing C libraries is easy

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    sinc(x)2 - Cython, Version 1cdef extern from "math.h":

    double sin(double)double pow(double, int)

    def sincSquareCython1(x):

    pi = 3.1415926535897932384626433retVal = np.zeros_like(x)

    for i in range(len(x)):retVal[i] = (sin(pi*x[i]) / (pi*x[i]))**2

    return retVal

    106 array elements: 1 loops, best of 3: 4.39 s per loop

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    sinc(x)2 - Cython, Version 1cdef extern from "math.h":

    double sin(double)double pow(double, int)

    def sincSquareCython1(x):

    pi = 3.1415926535897932384626433retVal = np.zeros_like(x)

    for i in range(len(x)):retVal[i] = (sin(pi*x[i]) / (pi*x[i]))**2

    return retVal

    106 array elements: 1 loops, best of 3: 4.39 s per loop

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    sinc(x)2 - Cython, Version 2cimport numpy as np # also C-import types

    cpdef np.ndarray[double] sincSquareCython2\(np.ndarray[double] x):

    cdef int icdef double pi = 3.1415926535897932384626433cdef np.ndarray[double] retVal = np.zeros_like(x)

    for i in range(len(x)):retVal[i] = pow(sin(pi*x[i]) / (pi*x[i]), 2)

    106 array elements: 10 loops, best of 3: 49.1 ms per loopThats a speedup by a factor 100!

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    sinc(x)2 - Cython, Version 2cimport numpy as np # also C-import types

    cpdef np.ndarray[double] sincSquareCython2\(np.ndarray[double] x):

    cdef int icdef double pi = 3.1415926535897932384626433cdef np.ndarray[double] retVal = np.zeros_like(x)

    for i in range(len(x)):retVal[i] = pow(sin(pi*x[i]) / (pi*x[i]), 2)

    106 array elements: 10 loops, best of 3: 49.1 ms per loopThats a speedup by a factor 100!

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    How f2py works

    f2pywrap Fortran code in Python:

    .f/.f90 filef2py .so/.dll file

    f2py is included in NumPyexposes NumPy arrays to Fortran codeonce Fortran space is entered, you run at full Fortran speed

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    sinc(x)2 - f2py, Version 1subroutine sincsquaref2py1(x, n, outVal)

    implicit none

    double precision, dimension(n), intent(in) :: xinteger, intent(in) :: ndouble precision, dimension(n), intent(out) :: outValdouble precision, parameter :: pi = 4.0d0 * atan(1.0d0)

    outVal(:) = (sin(pi*x(:)) / (pi*x(:)))**2

    end subroutine sincsquaref2py1

    106 array elements: 10 loops, best of 3: 47.4 ms per loopAgain, a speedup by a factor of 100!

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    sinc(x)2 - f2py, Version 1subroutine sincsquaref2py1(x, n, outVal)

    implicit none

    double precision, dimension(n), intent(in) :: xinteger, intent(in) :: ndouble precision, dimension(n), intent(out) :: outValdouble precision, parameter :: pi = 4.0d0 * atan(1.0d0)

    outVal(:) = (sin(pi*x(:)) / (pi*x(:)))**2

    end subroutine sincsquaref2py1

    106 array elements: 10 loops, best of 3: 47.4 ms per loopAgain, a speedup by a factor of 100!

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Techniques for faster Scripts

    After you have written a prototype in Python with NumPy andSciPy, check if your code is already fast enough. If not,

    profile your script (IPythons run -p or cProfile module...)to find bottlenecksif a large numbers of function calls is the bottleneck, typing andusing Cythons cdef/cpdef for C calling conventions speedsyour code up at the cost of flexibilityloops greatly benefit from typing, tooconsider moving heavy computations to Fortran/C completely -f2py and Cython will help you wrapping

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Outline

    1 Introduction

    2 Basics

    3 Python Modules for Science

    4 Faster Python and Glueing

    5 Summary

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Python in teaching

    Python/Pylab should be used in teaching because

    it is easy...and yet powerful;it may be used specialized to numerical computing...and also serve students as a general purpose language;it is safe;and best of all, it is free!

    Take home message 1Python is ideal for teaching

  • Introduction Basics Python Modules for Science Faster Python and Glueing Summary

    Summary

    We have...

    introduced basic Python scriptingshown some basic modules for scientific computingdemonstrated how to wrap other languageslearned how to speed Python up

    Take home message 2Python is a very valuable tool for Physicists

    Slides, LATEX and Python Sources available athttp://github.com/aeberspaecher