Practical Numpy and Matplotlib Intro - Meetupfiles.meetup.com › 1555984 ›...

Page 1: Practical Numpy and Matplotlib Intro - Meetupfiles.meetup.com › 1555984 › numpy_phillypug.pdf · 2012-10-01 · Practical Numpy and Matplotlib Intro What is Numpy? Advantages

Practical Numpy and Matplotlib Intro

presented by Tom Adelman, Sept 28, 2012

What is Numpy?

Advantages of using Numpy

In [ ]:
# an example
n = 10000000

# Python
a0 = [i for i in range(n)]          # time: 1.447 memory: 157540352
a1 = [a0[i]+1 for i in range(n)]    # time: 2.365 memory: 133169152

# Numpy
b0 = np.arange(n)                   # n=10000000 time: 0.036 memory: 40001536
b1 = b0 + 1                         # time: 0.053 memory: 40001536

- less memory for the same amount of data (about 3-4x less for ints -- 4 for type pointer, 4 for reference count, 4 for value)
- faster (5-100x)
- convenient and expressive (e.g., x+1, and much more)
- functionality (random numbers, math, fancy slicing and indexing, convolutions, statistics, linear algebra, FFTs, etc, etc)

Structure of a numpy array in memory

In [23]:
a0 = array([1, 2, 3, 4, 5], dtype=uint16)
print "a0:"
print a0, type(a0)
print repr(str(a0.data))

a0:
[1 2 3 4 5] <type 'numpy.ndarray'>
'\x01\x00\x02\x00\x03\x00\x04\x00\x05\x00'

In [153]:
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='ndarray2.png')

Out[153]: [image: ndarray2.png]

Implication:

vectorize
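To make "vectorize" concrete, here is a minimal sketch in Python 3 with `import numpy as np` (the slides use Python 2 and a star-import). The array length is an assumption chosen so the example runs quickly, and the explicit `int32` and `"<u2"` dtypes are also assumptions, picked so the byte counts do not depend on the machine.

```python
import numpy as np

n = 100_000   # much smaller than the slide's 10,000,000, only so this runs quickly

# Pure Python: one interpreter step (and one int object) per element.
a0 = list(range(n))
a1 = [v + 1 for v in a0]

# NumPy: a single expression; the loop runs in compiled code over one contiguous buffer.
b0 = np.arange(n, dtype=np.int32)
b1 = b0 + 1

print(b1.tolist() == a1)   # True
print(b0.nbytes)           # 400000, i.e. 4 bytes per element

# Raw buffer of a small uint16 array: 2 bytes per value, low byte first.
# "<u2" forces little-endian so the bytes are the same on any machine.
print(np.array([1, 2, 3], dtype="<u2").tobytes())   # b'\x01\x00\x02\x00\x03\x00'
```

Timing is left out on purpose, since the numbers depend on the machine; `timeit` on the two versions is an easy way to reproduce the comparison from the slide.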

Summary

That is: Numpy = **Multidimensional Homogeneous Arrays** and tools to work with these arrays.

Characteristics:
1) all the same type
2) fixed size
3) multidimensional: 1, 2, 3, 4,...
4) x + 1 means add 1 to each element of the array

Gain a lot for these restrictions: speed, memory, functionality.

Uses

examples of uses:
- Astronomy
- Artificial intelligence & machine learning
- Bayesian Statistics
- Biology (including Neuroscience)
- Dynamical systems
- Economics and Econometrics
- Electromagnetics and Electrical Engineering
- Geosciences
- Molecular modeling
- Signal processing
- Symbolic math, number theory, etc.

Why discuss Numpy (number crunching) and Matplotlib (data plotting) together?

In [31]:
data = [-.56, .39, .82, 1.47, 2.1]

In [32]:
plot(data, 'o')
xlim(-.5, 4.5)

. . .

Always plot your numbers!!!

matplotlib gallery page

Numpy Basics

running numpy

ipython

In [22]:
# to import numpy, do:
from numpy import *              # y = exp(sin(x))
# or do:
import numpy as np               # y = np.exp(np.sin(x))
# import matplotlib like this...
import matplotlib.pyplot as plt

creating ndarrays

In [31]:
print array([1, 5, 10]) # not array(1, 5, 10)
print [1, 5, 10]

[ 1 5 10]
[1, 5, 10]

In [15]:
print zeros((2, 2)) # also ones

[[ 0. 0.]
 [ 0. 0.]]

In [25]:
print arange(10)

[0 1 2 3 4 5 6 7 8 9]

In [44]:
print linspace(-pi, pi, 10)

[-3.14159265 -2.44346095 -1.74532925 -1.04719755 -0.34906585 0.34906585 1.04719755 1.74532925 2.44346095 3.14159265]

loadtxt, fromfile, fromstring, .npz files, ...
can easily read data from: csv, hdf5, R, Matlab, PIL, Excel files, ...

more basics...

In [24]:
x = ones((2,2,2), dtype=uint8) # dtype = data type
print x

[[[1 1]
  [1 1]]

 [[1 1]
  [1 1]]]

In [25]:
x[0, 0, 0] = 300 # two points: index using [], and setting values doesn't convert the type (300 wraps to 44 in uint8)
print x

[[[44 1]
  [ 1 1]]

 [[ 1 1]
  [ 1 1]]]

ndarray properties
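A short sketch of the main ndarray properties, and of why the 300 above shows up as 44 (Python 3, `import numpy as np`). The explicit `astype` cast is my own substitution for the slide's direct assignment: as far as I know, newer NumPy releases may refuse to store an out-of-range Python integer directly, while an explicit cast still wraps around.

```python
import numpy as np

x = np.ones((2, 2, 2), dtype=np.uint8)

print(x.shape, x.ndim, x.size)         # (2, 2, 2) 3 8
print(x.dtype, x.itemsize, x.nbytes)   # uint8 1 8

# The dtype is fixed when the array is created. Casting 300 to uint8 keeps
# only the low 8 bits: 300 = 256 + 44, so 44 is what gets stored.
x[0, 0, 0] = np.array(300).astype(np.uint8)
print(x[0, 0, 0], x.dtype)             # 44 uint8
```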

In [7]:
x = zeros((2,2), dtype=int32)

In [8]:
print x.shape

(2, 2)

In [9]:
print x.dtype

int32

basic math: +, -, *, /, ** operations are element-wise

In [77]:
# degrees Celsius
c = linspace(0, 100., 10)
print c

[ 0. 11.11111111 22.22222222 33.33333333 44.44444444 55.55555556 66.66666667 77.77777778 88.88888889 100. ]

In [80]:
f = (9./5)*c + 32
print f

[ 32. 52. 72. 92. 112. 132. 152. 172. 192. 212.]

In [30]:
x = reshape(arange(4), (2, 2))
print x
print
print x**2

[[0 1]
 [2 3]]

[[0 1]
 [4 9]]

In [84]:
x = arange(10)
y = arange(10)
print x*y
print 10*x-y

[ 0 1 4 9 16 25 36 49 64 81]
[ 0 9 18 27 36 45 54 63 72 81]

slicing and indexing

[start:stop:step] method

In [147]:
x = arange(10)
print x
print x[2::2] # indexing starts at zero
print x[::-1] # -1 step reverses direction

[0 1 2 3 4 5 6 7 8 9]
[2 4 6 8]
[9 8 7 6 5 4 3 2 1 0]
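A few more [start:stop:step] cases, as a sketch in Python 3 with `import numpy as np`. The 2-D array `m` is a made-up example, not something from the slides.

```python
import numpy as np

x = np.arange(10)

print(x[2:8:3])    # [2 5]      start=2, stop=8 (exclusive), step=3
print(x[:4])       # [0 1 2 3]  an omitted start means "from the beginning"
print(x[-3:])      # [7 8 9]    negative indices count from the end
print(x[8:2:-2])   # [8 6 4]    a negative step walks backwards

# The same notation is applied per axis on multidimensional arrays.
m = np.arange(12).reshape(3, 4)
sub = m[::2, 1:3]  # rows 0 and 2, columns 1 and 2
print(sub.tolist())   # [[1, 2], [9, 10]]
```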

In [3]:
x = arange(10, dtype=uint8)
y = x[0:6:2] # slices are views, NOT copies
y[2] = 100
print y
print x
print x.__array_interface__['data'][0], repr(str(x.data))
print y.__array_interface__['data'][0]#, repr(str(y.data))

[ 0 2 100]
[ 0 1 2 3 100 5 6 7 8 9]
30507280 '\x00\x01\x02\x03d\x05\x06\x07\x08\t'
30507280

In [4]:
z = reshape(x, (2,5)) # slicing and reshaping (if possible), just changes the header, not the data
print z
print z.__array_interface__['data'][0], repr(str(z.data))

[[ 0 1 2 3 100]
 [ 5 6 7 8 9]]
30507280 '\x00\x01\x02\x03d\x05\x06\x07\x08\t'

To sort out questions during the presentation, the rule of thumb is this:

If the function changes the data in the array, then it will return a new array. Reshaping and slicing usually don't require changing the data, so they return a view when possible.

To check whether y is a view of x rather than a copy, use the base attribute (i.e., y.base is x is True when y is a view of x).

By the way, many functions have the "out" keyword, and this can be used to do the operation in place, if desired, so sin(x, out=x) will put the output of the calculation back into x. Mostly though, I use this not for in-place operations, but when trying to speed up code that's run multiple times. That is, outside the loop I create an array, temp = zeros((10000000,)), and then in the loop run sin(x, out=temp), so I don't need to constantly recreate space for the output of sin(x).

integer indexing e.g.: x[[1, 3, 8]]

In [1]:
x = arange(10)**2
print x
print x[array([1, 3, 8])]

[ 0 1 4 9 16 25 36 49 64 81]
[ 1 9 64]

boolean indexing e.g.: x[array([False, True, False, True, True])]

In [3]:
x = arange(5)
b = array([0, 1, 0, 1, 1], dtype=bool)
y = x[b]
print x
print b
print y

[0 1 2 3 4]
[False True False True True]
[1 3 4]

In [2]:
x = arange(10)
print x
print x[x%3==0]

[0 1 2 3 4 5 6 7 8 9]
[0 3 6 9]
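The view-versus-copy rule of thumb also covers the two indexing styles just shown: a slice gives a view, while indexing with an integer or boolean array gives a copy. Below is a small sketch in Python 3 with `import numpy as np`. `np.shares_memory` is not used in the slides; it is here only as an independent check alongside the `base` attribute. The last part sketches the `out` keyword trick with a deliberately tiny array.

```python
import numpy as np

x = np.arange(10, dtype=np.uint8)

y = x[0:6:2]        # basic slice: a view onto x's buffer
c = x[[0, 2, 4]]    # integer-array indexing: a copy

print(y.base is x)                                      # True
print(np.shares_memory(x, y), np.shares_memory(x, c))   # True False

y[2] = 100          # writes through to x[4]; the copy is unaffected
print(x[4], c[2])   # 100 4

# Reusing one output buffer with "out", so the loop does not allocate
# a new result array on every pass.
t = np.linspace(0.0, 1.0, 5)
temp = np.zeros_like(t)
for _ in range(3):
    np.sin(t, out=temp)   # result is written into temp
print(np.allclose(temp, np.sin(t)))   # True
```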

In [5]:
print x[x>3]

[4 5 6 7 8 9]

In [3]:
x[~(x%3==0)]

Out[3]: array([1, 2, 4, 5, 7, 8])

assignment with slices

In [11]:
x = arange(10)
x[:2] = 100
print x

[100 100 2 3 4 5 6 7 8 9]

In [14]:
x = arange(10)
x[x%3==0] = 100
print x

[100 1 2 100 4 5 100 7 8 100]

In [66]:
# math on slices
x = arange(10)
x[x%3==0] += 100
print x

[100 1 2 103 4 5 106 7 8 109]

functions (ufuncs and others)

There are many, many of these. Ufuncs are the basics, and have a few extra abilities.

ufuncs examples: add, multiply, sin, exp, log, sqrt, bitwise_and, greater, logical_and, floor
also: maximum and minimum (which should generally be used instead of Python's max and min)

In [107]:
x = linspace(0, 10*pi, 10000)
y = sin(x)
plot(x, y)

Out[107]: [<matplotlib.lines.Line2D at 0x1886f550>]
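The slides don't spell out the "extra abilities" of ufuncs. Assuming they mean the ufunc methods (`reduce` is used later in the talk), here is a minimal sketch in Python 3 with `import numpy as np`. The input arrays are made up for illustration.

```python
import numpy as np

x = np.arange(1, 6)                    # [1 2 3 4 5]

total = np.add.reduce(x)               # 15, same result as x.sum()
running = np.add.accumulate(x)         # running total: 1, 3, 6, 10, 15
table = np.multiply.outer(x[:3], x[:3])   # 3x3 multiplication table

a = np.array([1, 7, 3])
b = np.array([4, 2, 9])
pairwise = np.maximum(a, b)            # element-wise maximum of two arrays
largest = np.maximum.reduce(a)         # largest element of one array

print(total, running.tolist())         # 15 [1, 3, 6, 10, 15]
print(table.tolist())                  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
print(pairwise.tolist(), largest)      # [4, 7, 9] 7
```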

In [108]:
z = clip(y, -.8, .8)
plot(x, z); ylim(-1, 1)

Out[108]: (-1, 1)

In [109]:
y[y>=.8] = .5
y[y<=-.8] = -.5
plot(x, y)

Out[109]: [<matplotlib.lines.Line2D at 0x273d9650>]

In [110]:
mask = (y<.4)&(y>-.4)
x[mask] += 1
plot(x, y)

Out[110]: [<matplotlib.lines.Line2D at 0x1ef4c0d0>]

At this point, you know enough to make a lot of progress.
Think of the operations you want, find them in Numpy and apply them.
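As one example of "find the operation and apply it", the sketch below (Python 3, `import numpy as np`) shows that `clip` gives the same result as doing it by hand with `maximum`/`minimum` or with boolean-mask assignment. The nine-point sine is an assumption chosen to keep the output small.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 9)
y = np.sin(x)

z = np.clip(y, -0.8, 0.8)

# Two hand-written equivalents of clip.
z2 = np.minimum(np.maximum(y, -0.8), 0.8)
z3 = y.copy()
z3[z3 > 0.8] = 0.8
z3[z3 < -0.8] = -0.8

print(np.array_equal(z, z2), np.array_equal(z, z3))   # True True

# Combine conditions with & and parentheses; Python's "and" is not element-wise.
mask = (y < 0.4) & (y > -0.4)
print(mask.sum())   # 3 (the points near the zero crossings at 0, pi and 2*pi)
```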

functions and techniques for shape manipulation (more advanced)

In [28]:
x = linspace(0, 10*pi, 1000000)
y = sin(x) + .1*sin(1000*x)
plot(x, y)

Out[28]: [<matplotlib.lines.Line2D at 0x43be410>]

In [26]:
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='min01.png')

Out[26]: [image: min01.png]

In [29]:
m = 1000
y0 = reshape(y, (-1, m)) # y.shape = (-1, 1000)
ymin = minimum.reduce(y0, axis=1)
ymax = maximum.reduce(y0, axis=1)
ymid = sum(y0, axis=1)/m
plot(x[::m], ymin, x[::m], ymax, color='k')
fill_between(x[::m], ymin, ymax, color='g')
plot(x[::m], ymid, color='r')
print y.shape, y0.shape, ymin.shape, ymax.shape, ymid.shape

(1000000,) (1000, 1000) (1000,) (1000,) (1000,)

In [12]:
# interlude with numpy functions "sum" and "minimum"
seed(8)
temp0 = randint(0, 9, (2,3))
temp1 = randint(0, 9, (2,3))
print temp0
print
print temp1
print
print "sum temp0: "
print "axis=0"
print sum(temp0, axis=0) # numpy.sum(a, axis=None, dtype=None, out=None, keepdims=False)
print "axis=1"
print sum(temp0, axis=1)
print
print "minimum(temp0, temp1):"
print minimum(temp0, temp1) # numpy.minimum(x1, x2[, out])
print
print "reduce temp0:"
print minimum.reduce(temp0, axis=0)
print minimum.reduce(temp0, axis=1)

[[3 4 1]
 [5 8 3]]

[[8 0 5]
 [1 3 2]]

sum temp0: 
axis=0
[ 8 12 4]
axis=1
[ 8 16]

minimum(temp0, temp1):
[[3 0 1]
 [1 3 2]]

reduce temp0:
[3 4 1]
[1 3]
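The reshape-then-reduce trick used for the envelope plot is easier to follow on a tiny array. This is a sketch in Python 3 with `import numpy as np`; the data values and the block length of 4 are made up for illustration.

```python
import numpy as np

y = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8])
m = 4   # block length; len(y) must be a multiple of m

blocks = y.reshape(-1, m)   # shape (3, 4): one row per block
ymin = np.minimum.reduce(blocks, axis=1)   # same as blocks.min(axis=1)
ymax = np.maximum.reduce(blocks, axis=1)   # same as blocks.max(axis=1)
ymid = blocks.sum(axis=1) / m              # block means

print(blocks.shape)    # (3, 4)
print(ymin.tolist())   # [1, 2, 3]
print(ymax.tolist())   # [4, 9, 8]
print(ymid.tolist())   # [2.25, 5.5, 5.25]
```

Each row of `blocks` is one stretch of `m` consecutive samples, so reducing along `axis=1` gives one number per stretch. That is what turns a million points into a thousand-point envelope in the slide.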

In [14]:
# "concatenate"
# triangle wave (-2 to 5, 200 pts per period)
a0 = linspace(-2, 5, 100, endpoint=False) # endpoint=False
a1 = linspace(5, -2, 100, endpoint=False)
b0 = concatenate((a0, a1))
plot(b0)
xlim(-10, 210)
ylim(-2.5, 5.5)

Out[14]: (-2.5, 5.5)

In [80]:
b1 = concatenate((a0, a1)*6)
plot(b1)

Out[80]: [<matplotlib.lines.Line2D at 0x59a4210>]

In [81]:
# "resize"
new_length = int(len(b0)*6.2)
print len(b0), new_length
b1 = resize(b0, (new_length,))
plot(b1)

200 1240

Out[81]: [<matplotlib.lines.Line2D at 0x59ac9d0>]

In [4]:
# how about "repeat"?
x = arange(5)
print repeat(x, 4)
print x

[0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4]
[0 1 2 3 4]

In [33]:
# newaxis
print "x.shape = ", x.shape
print "x[newaxis,:].shape = ", x[newaxis,:].shape
y = repeat(x[newaxis, :], 4, axis=0)
print y

x.shape = (5,)
x[newaxis,:].shape = (1, 5)
[[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]

In [69]:
# y.flat
print y.flat
print array(y.flat) # y.flat is an iterator, array(y.flat) is an ndarray

<numpy.flatiter object at 0x2862000>
[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
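The different ways of repeating data are easy to mix up, so here they are side by side on a three-element array (Python 3, `import numpy as np`). `np.tile` and `ravel` do not appear in the slides; they are added here only for comparison.

```python
import numpy as np

x = np.arange(3)   # [0 1 2]

print(np.repeat(x, 2).tolist())                # [0, 0, 1, 1, 2, 2]     each element repeated
print(np.tile(x, 2).tolist())                  # [0, 1, 2, 0, 1, 2]     whole array repeated
print(np.resize(x, (7,)).tolist())             # [0, 1, 2, 0, 1, 2, 0]  cycled to fill the new length
print(np.concatenate((x, x) * 2).tolist())     # tuple repeated first, then joined: 12 elements

# newaxis + repeat stacks copies of x as rows; ravel() flattens them again.
rows = np.repeat(x[np.newaxis, :], 2, axis=0)
print(rows.shape)             # (2, 3)
print(rows.ravel().tolist())  # [0, 1, 2, 0, 1, 2]
```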

In [35]:
b2 = array( repeat(b0[newaxis, :], 6, axis=0).flat )
plot (b2, 'r')

Out[35]: [<matplotlib.lines.Line2D at 0x5099b90>]

broadcasting

In [38]:
seed(4)
x0 = randint(0, 9, (3,2))
x1 = array([5, 10])
print x0
print
print x1
print
y = repeat(x1[newaxis,:], 3, axis=0)
print y
print
print x0*y

[[7 5]
 [1 8]
 [7 8]]

[ 5 10]

[[ 5 10]
 [ 5 10]
 [ 5 10]]

[[35 50]
 [ 5 80]
 [35 80]]

In [85]:
# broadcasting.... like an automatic newaxis and repeat (without actually creating the array)
y = x0*x1
print y

[[35 50]
 [ 5 80]
 [35 80]]
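A quick check that broadcasting really is "an automatic newaxis and repeat", as a sketch in Python 3 with `import numpy as np`. The `x0` and `x1` values are the ones printed in the slide. `np.broadcast_to` is not in the slides; it is used here only to show the "virtual" repeated array.

```python
import numpy as np

x0 = np.array([[7, 5],
               [1, 8],
               [7, 8]])      # shape (3, 2)
x1 = np.array([5, 10])       # shape (2,)

explicit = x0 * np.repeat(x1[np.newaxis, :], 3, axis=0)   # builds a real (3, 2) copy of x1
implicit = x0 * x1                                         # broadcasting, no intermediate copy

print(np.array_equal(explicit, implicit))   # True
print(implicit.tolist())                    # [[35, 50], [5, 80], [35, 80]]

# The broadcast array is a read-only view whose stride along the
# repeated axis is 0, so no data is duplicated.
virtual = np.broadcast_to(x1, (3, 2))
print(virtual.shape, virtual.strides[0])    # (3, 2) 0
```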

In [36]:
x = linspace(-1, 1, 512)
f0 = exp(-(x)**2/.05)
plot(x, f0)

Out[36]: [<matplotlib.lines.Line2D at 0xb2a52b0>]

In [37]:
# newaxis with broadcasting!
f02d = f0[newaxis,:]*f0[:,newaxis]
imshow(f02d, cmap=cm.gray)

Out[37]: <matplotlib.image.AxesImage at 0x4d0f470>

Numpy plays well with others...

PIL

In [20]:
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='lena.png')

Out[20]: [image: lena.png]

In [1]:
import Image as pilImage # PIL's Image class
im = pilImage.open("lena.png")
print im.size

(512, 512)

In [2]:
# convert PIL image to Numpy ndarray
a = array(im)
print a.shape
print type(a), a.dtype

(512, 512, 3)
<type 'numpy.ndarray'> uint8

In [3]:
# convert to black and white using a simple sum of all colors...
a = array(a, dtype=uint32)
bw = sum(a, axis=2)
print bw.dtype, minimum.reduce(bw.flat), maximum.reduce(bw.flat)
imshow(bw, cmap=cm.gray)

uint32 106 720

Out[3]: <matplotlib.image.AxesImage at 0x48ad9d0>

In [47]:
bw1 = bw.copy()
bw1[bw1<150] = 1000
imshow(bw1, cmap=cm.gray)

Out[47]: <matplotlib.image.AxesImage at 0x570b4f0>

In [48]:
bw1 = roll(bw, 200, axis=1)
imshow(bw1, cmap=cm.gray)

Out[48]: <matplotlib.image.AxesImage at 0x59315f0>

In [4]:
x = linspace(0, 10*pi, bw.shape[0])
bw2 = bw*(1+sin(x))
imshow(bw2, cmap=cm.gray)

Out[4]: <matplotlib.image.AxesImage at 0x4a62b90>

In [50]:
x, y = mgrid[-1:1:512j, -1:1:512j]
mask = (x**2 + y**2)<0.6**2
bw3 = bw.copy()
bw3[~mask]=0
imshow(bw3, cmap=cm.gray)

Out[50]: <matplotlib.image.AxesImage at 0x6857d70>

In [52]:
x, y = mgrid[-2:3,-2:3]
print x
print
print y

[[-2 -2 -2 -2 -2]
 [-1 -1 -1 -1 -1]
 [ 0 0 0 0 0]
 [ 1 1 1 1 1]
 [ 2 2 2 2 2]]

[[-2 -1 0 1 2]
 [-2 -1 0 1 2]
 [-2 -1 0 1 2]
 [-2 -1 0 1 2]
 [-2 -1 0 1 2]]

ctypes
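On the ctypes side, the key fact is that `ndarray.ctypes.data` is the memory address of the array's first element, which is what a C function expects for an array argument. Here is a minimal sketch in Python 3 that needs no external C library. `ctypes.memmove` stands in for "some C function taking raw pointers"; that substitution is mine, not something from the slides.

```python
import ctypes
import numpy as np

src = np.arange(5, dtype=np.float64)
dst = np.zeros(5, dtype=np.float64)

# Pass the raw addresses of both buffers, plus the byte count, to a C routine.
ctypes.memmove(dst.ctypes.data, src.ctypes.data, src.nbytes)
print(dst.tolist())   # [0.0, 1.0, 2.0, 3.0, 4.0]

# A typed pointer to the same buffer; writes through it show up in the array.
ptr = dst.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
ptr[0] = 42.0
print(dst[0])         # 42.0
```

The array has to stay alive, and be contiguous with the dtype the C side expects, for as long as the C code uses the pointer.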

In [ ]:
# ndarray.ctypes.data

# C header
# int32 DAQmxWriteAnalogF64 (TaskHandle taskHandle, int32 numSampsPerChan, bool32 autoStart,
#                            float64 timeout, bool32 dataLayout, float64 writeArray[],
#                            int32 *sampsPerChanWritten, bool32 *reserved);

# Python (fragment from inside a wrapper function)
nidaq.DAQmxWriteAnalogF64(taskHandle, int32(numSampsPerChan), int32(autoStart),
                          float64(timeout), dataLayout, writeArray.ctypes.data,
                          ctypes.byref(sampsPerChanWritten), None)
return sampsPerChanWritten.value

# that is, numpy_array.ctypes.data returns the address of the array's data buffer,
# which can be passed as an array argument when calling a C function with ctypes

Demos:

spectrum from computer's microphone

In [ ]:
np_data = fromstring(microphone_data, dtype=np.int16)
spect = log10(abs(fft.rfft(np_data)))

handwritten number classification using feedforward neural network

In [38]:
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='nn.png')

Out[38]: [image: nn.png]

In [ ]:
X1 = np.concatenate((np.ones((m,1)), X), axis=1)
z2 = np.dot(X1,Theta1.T)
a2 = sigmoid(z2)
a2 = np.concatenate((np.ones((a2.shape[0],1)), a2), axis=1)
z3 = np.dot(a2, Theta2.T)
a3 = sigmoid(z3)
ix = np.argmax(a3, axis=1)
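The forward pass above depends on names defined elsewhere in the demo (`sigmoid`, `X`, `Theta1`, `Theta2`, `m`). To run it on its own, here is a sketch in Python 3 with stand-ins for all of them: the layer sizes, the random weights, and the logistic `sigmoid` are assumptions for illustration, not the trained network from the talk.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise (assumed definition)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n_in, n_hidden, n_out = 4, 6, 5, 3                 # hypothetical sizes
X = rng.random((m, n_in))                             # one sample per row
Theta1 = rng.standard_normal((n_hidden, n_in + 1))    # +1 column for the bias unit
Theta2 = rng.standard_normal((n_out, n_hidden + 1))

X1 = np.concatenate((np.ones((m, 1)), X), axis=1)     # prepend a bias column of ones
a2 = sigmoid(np.dot(X1, Theta1.T))                    # hidden activations, shape (m, n_hidden)
a2 = np.concatenate((np.ones((m, 1)), a2), axis=1)    # bias column for the next layer
a3 = sigmoid(np.dot(a2, Theta2.T))                    # output activations, shape (m, n_out)
ix = np.argmax(a3, axis=1)                            # predicted class index per sample

print(a3.shape, ix.shape)   # (4, 3) (4,)
```

The whole batch goes through each layer as one matrix product, with no loop over samples.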
