Practical Numpy and Matplotlib Intro
What is Numpy?
Advantages of using Numpy
Structure of a numpy array in memory
Implication:
presented by Tom Adelman, Sept 28, 2012
# an example
n = 10000000

# Python
a0 = [i for i in range(n)]          # time: 1.447  memory: 157540352
a1 = [a0[i]+1 for i in range(n)]    # time: 2.365  memory: 133169152

# Numpy
b0 = np.arange(n)                   # n=10000000  time: 0.036  memory: 40001536
b1 = b0 + 1                         # time: 0.053  memory: 40001536
less memory for the same amount of data (about 3-4x less for ints -- 4 for type pointer, 4 for reference count, 4 for value)
faster (5-100x)
convenient and expressive (e.g., x+1, and much more)
functionality (random numbers, math, fancy slicing and indexing, convolutions, statistics, linear algebra, FFTs, etc, etc)
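The memory claim can be checked directly. This is a rough sketch, written for Python 3 with the np. prefix; the smaller n and the use of sys.getsizeof to estimate the list's footprint are my own choices, not part of the talk, and the exact byte counts depend on the Python build.

```python
import sys
import numpy as np

n = 100000  # smaller than the slide's 10,000,000 so it runs quickly

# Plain Python: the list stores pointers to separate int objects.
py_list = list(range(n))
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# NumPy: one contiguous buffer of fixed-size values.
np_arr = np.arange(n, dtype=np.int32)
array_bytes = np_arr.nbytes  # n elements * 4 bytes each

print("list  :", list_bytes, "bytes")
print("array :", array_bytes, "bytes")

# The element-wise "+ 1" from the slide, in both styles.
py_plus = [v + 1 for v in py_list]
np_plus = np_arr + 1
```

Timing is left out on purpose, since it varies a lot between machines.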
a0 = array([1, 2, 3, 4, 5], dtype=uint16)
print "a0:"
print a0, type(a0)
print repr(str(a0.data))
a0:
[1 2 3 4 5] <type 'numpy.ndarray'>
'\x01\x00\x02\x00\x03\x00\x04\x00\x05\x00'
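The same byte layout can be inspected in current Python 3 with tobytes(). This is a small sketch; the explicit '<u2' dtype (little-endian uint16) is my addition so the result does not depend on the machine's byte order.

```python
import numpy as np

# '<u2' = little-endian unsigned 16-bit integers.
a0 = np.array([1, 2, 3, 4, 5], dtype='<u2')

raw = a0.tobytes()   # a copy of the underlying data buffer as bytes
print(raw)
print(a0.itemsize, a0.nbytes, a0.strides)   # bytes per item, total bytes, step per index
```

Each value takes two bytes, low byte first, and the stride says how many bytes to step to reach the next element.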
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='ndarray2.png')
Out[153]: [image: ndarray2.png]
vectorize
Summary
Uses
Why discuss Numpy (number crunching) and Matplotlib (data plotting) together?
Always plot your numbers!!!
Numpy Basics
running numpy
That is: Numpy = **Multidimensional Homogeneous Arrays** and tools to work with these arrays.
Characteristics:
1) all the same type
2) fixed size
3) multidimensional: 1, 2, 3, 4,...
4) x + 1 means add 1 to each element of the array
Gain a lot for these restrictions: speed, memory, functionality.
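The four characteristics can be seen on a tiny array. This is only an illustrative sketch in Python 3; the array values and variable names are made up for the example.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

same_type = x.dtype          # 1) one dtype shared by every element
fixed_shape = x.shape        # 2) the shape is set when the array is created
dims = x.ndim                # 3) number of dimensions, here 2
y = x + 1                    # 4) adds 1 to each element, returns a new array

# Mixed inputs are converted to one common type when the array is built.
mixed = np.array([1, 2.5])

print(same_type, fixed_shape, dims)
print(y)
print(mixed.dtype)
```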
examples of uses:
Astronomy
Artificial intelligence & machine learning
Bayesian Statistics
Biology (including Neuroscience)
Dynamical systems
Economics and Econometrics
Electromagnetics and Electrical Engineering
Geosciences
Molecular modeling
Signal processing
Symbolic math, number theory, etc.
data = [-.56, .39, .82, 1.47, 2.1]
plot(data, 'o')
xlim(-.5, 4.5)
. . .
matplotlib gallery page
ipython
creating ndarrays
more basics...
ndarray properties
# to import numpy, do:
from numpy import * # y = exp(sin(x))
# or do:
import numpy as np # y = np.exp(np.sin(x))
# import matplotlib like this...
import matplotlib.pyplot as plt
print array([1, 5, 10]) # not array(1, 5, 10)
print [1, 5, 10]
[ 1 5 10]
[1, 5, 10]
print zeros((2, 2)) # also ones
[[ 0. 0.]
 [ 0. 0.]]
print arange(10)
[0 1 2 3 4 5 6 7 8 9]
print linspace(-pi, pi, 10)
[-3.14159265 -2.44346095 -1.74532925 -1.04719755 -0.34906585 0.34906585 1.04719755 1.74532925 2.44346095 3.14159265]
loadtxt, fromfile, fromstring, .npz files, ...
can easily read data from: csv, hdf5, R, Matlab, PIL, Excel files, ...
x = ones((2,2,2), dtype=uint8) # dtype = data type
print x
[[[1 1]
  [1 1]]

 [[1 1]
  [1 1]]]
x[0, 0, 0] = 300 # two points: index using []; setting a value doesn't convert the array's type (300 wraps to 44 in uint8)
print x
[[[44 1]
  [ 1 1]]

 [[ 1 1]
  [ 1 1]]]
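The point that assignment never changes an array's dtype can be tried in Python 3 as follows. This is a sketch with made-up values. Recent NumPy releases may refuse to store an out-of-range Python integer such as 300 directly into a uint8 array, so the wrap-around is shown here with an explicit astype() conversion instead.

```python
import numpy as np

x = np.zeros(3, dtype=np.int32)
x[0] = 2.9                   # the float is cast to int32, so 2 is stored
print(x, x.dtype)            # dtype is still int32

# Explicit conversion to uint8 wraps modulo 256: 300 - 256 = 44
wrapped = np.array([300]).astype(np.uint8)
print(wrapped)

# An expression that mixes types returns a new, wider array instead.
y = x + 0.5
print(y.dtype)
```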
basic math: +, -, *, /, ** operations are element-wise
slicing and indexing
[start:stop:step] method
x = zeros((2,2), dtype=int32)
print x.shape
(2, 2)
print x.dtype
int32
# degrees Celsius
c = linspace(0, 100., 10)
print c
[ 0. 11.11111111 22.22222222 33.33333333 44.44444444 55.55555556 66.66666667 77.77777778 88.88888889 100. ]
f = (9./5)*c + 32
print f
[ 32. 52. 72. 92. 112. 132. 152. 172. 192. 212.]
x = reshape(arange(4), (2, 2))
print x
print
print x**2
[[0 1]
 [2 3]]

[[0 1]
 [4 9]]
x = arange(10)
y = arange(10)
print x*y
print 10*x-y
[ 0 1 4 9 16 25 36 49 64 81]
[ 0 9 18 27 36 45 54 63 72 81]
x = arange(10)
print x
print x[2::2] # indexing starts at zero
print x[::-1] # -1 step reverses direction
[0 1 2 3 4 5 6 7 8 9]
[2 4 6 8]
[9 8 7 6 5 4 3 2 1 0]
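A few more [start:stop:step] combinations, as a Python 3 sketch. The variable names are mine; stop is always exclusive, and a negative step walks backwards.

```python
import numpy as np

x = np.arange(10)

evens_from_2 = x[2::2]          # start at index 2, every second element
reversed_x = x[::-1]            # whole array, backwards
middle = x[3:7]                 # indices 3, 4, 5, 6 (stop is excluded)
backwards_by_3 = x[8:1:-3]      # indices 8, 5, 2

print(evens_from_2, reversed_x, middle, backwards_by_3)
```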
To sort out questions during the presentation, the rule of thumb is this:
If the function changes the data in the array, then it will return a new array. Reshaping and slicing usually don't require changing the data, so they return a view when possible.
To check whether you have a view or a copy, use the base attribute (i.e., y.base is x is True when y is a view of x).
By the way, many functions have the "out" keyword, and this can be used to do the operation in place, if desired, so sin(x, out=x) will put the output of the calculation back into x. Mostly though, I use this not for in-place operations, but when trying to speed up code that's run multiple times. That is, outside the loop I create an array, temp = zeros((10000000,)), and then in the loop run sin(x, out=temp), so I don't need to constantly recreate space for the output of sin(x).
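The rule of thumb and the out keyword can be tried like this. It is a minimal Python 3 sketch; the array size, the loop count and the variable names are arbitrary choices for illustration.

```python
import numpy as np

x = np.arange(10, dtype=np.float64)

view = x[0:6:2]              # slicing returns a view onto x's data
copy = x[[0, 2, 4]]          # integer (fancy) indexing returns a copy
print(view.base is x)        # the view's base is x
print(copy.base is x)        # the copy's base is not x

view[0] = 100.0              # writing to the view also changes x
print(x[0])

# Reuse one preallocated output buffer instead of allocating on every pass.
temp = np.zeros_like(x)
for _ in range(3):
    np.sin(x, out=temp)      # the result is written into temp
```

np.shares_memory(a, b) is another way to ask whether two arrays overlap in memory.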
integer indexing e.g.: x[[1, 3, 8]]
boolean indexing e.g.: x[array([False, True, False, True, True])]
x = arange(10, dtype=uint8)
y = x[0:6:2] # slices are views, NOT copies
y[2] = 100
print y
print x
print x.__array_interface__['data'][0], repr(str(x.data))
print y.__array_interface__['data'][0]#, repr(str(y.data))
[ 0 2 100]
[ 0 1 2 3 100 5 6 7 8 9]
30507280 '\x00\x01\x02\x03d\x05\x06\x07\x08\t'
30507280
z = reshape(x, (2,5)) # slicing and reshaping (if possible) just change the header, not the data
print z
print z.__array_interface__['data'][0], repr(str(z.data))
[[ 0 1 2 3 100]
 [ 5 6 7 8 9]]
30507280 '\x00\x01\x02\x03d\x05\x06\x07\x08\t'
x = arange(10)**2
print x
print x[array([1, 3, 8])]
[ 0 1 4 9 16 25 36 49 64 81]
[ 1 9 64]
x = arange(5)
b = array([0, 1, 0, 1, 1], dtype=bool)
y = x[b]
print x
print b
print y
[0 1 2 3 4]
[False True False True True]
[1 3 4]
x = arange(10)
print x
print x[x%3==0]
[0 1 2 3 4 5 6 7 8 9]
[0 3 6 9]
assignment with slices
functions (ufuncs and others)
There are many, many of these. Ufuncs are the basics, and have a few extra abilities
print x[x>3]
[4 5 6 7 8 9]
x[~(x%3==0)]
Out[3]: array([1, 2, 4, 5, 7, 8])
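Boolean masks can also be stored, inverted and combined. A short Python 3 sketch follows; the second condition and the variable names are mine. Note the parentheses around each comparison when using & or |.

```python
import numpy as np

x = np.arange(10)

mask = (x % 3 == 0)                   # boolean array, same shape as x
multiples = x[mask]                   # elements where mask is True
others = x[~mask]                     # ~ inverts the mask
both = x[(x > 3) & (x % 2 == 0)]      # element-wise "and" of two conditions
count = np.count_nonzero(mask)        # how many elements matched

print(multiples, others, both, count)
```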
x = arange(10)
x[:2] = 100
print x
[100 100 2 3 4 5 6 7 8 9]
x = arange(10)
x[x%3==0] = 100
print x
[100 1 2 100 4 5 100 7 8 100]
# math on slices
x = arange(10)
x[x%3==0] += 100
print x
[100 1 2 103 4 5 106 7 8 109]
ufuncs examples: add, multiply, sin, exp, log, sqrt, bitwise_and, greater, logical_and, floor
also: maximum and minimum (which should generally be used on arrays instead of Python's max and min)
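To make the maximum/minimum remark concrete: these ufuncs compare two arrays element by element, and their reduce method collapses one array. A Python 3 sketch with made-up values:

```python
import numpy as np

a = np.array([1, 7, 3])
b = np.array([4, 2, 9])

elementwise = np.maximum(a, b)        # compares a[i] with b[i]
largest = np.maximum.reduce(a)        # ufunc method: reduce one array to a scalar
same_largest = np.max(a)              # the usual shorthand for the same reduction
clamped = np.maximum(np.array([-2, 5, -1]), 0)   # the scalar 0 is compared with every element

print(elementwise, largest, same_largest, clamped)
```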
x = linspace(0, 10*pi, 10000)
y = sin(x)
plot(x, y)
Out[107]: [<matplotlib.lines.Line2D at 0x1886f550>]
z = clip(y, -.8, .8)
plot(x, z); ylim(-1, 1)
Out[108]: (-1, 1)
y[y>=.8] = .5
y[y<=-.8] = -.5
plot(x, y)
Out[109]: [<matplotlib.lines.Line2D at 0x273d9650>]
mask = (y<.4)&(y>-.4)
x[mask] += 1
plot(x, y)
Out[110]: [<matplotlib.lines.Line2D at 0x1ef4c0d0>]
At this point, you know enough to make a lot of progress.
Think of the operations you want, find them in Numpy and apply them.
functions and techniques for shape manipulation (more advanced)
x = linspace(0, 10*pi, 1000000)
y = sin(x) + .1*sin(1000*x)
plot(x, y)
Out[28]: [<matplotlib.lines.Line2D at 0x43be410>]
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='min01.png')
Out[26]: [image: min01.png]
m = 1000
y0 = reshape(y, (-1, m)) # y.shape = (-1, 1000)
ymin = minimum.reduce(y0, axis=1)
ymax = maximum.reduce(y0, axis=1)
ymid = sum(y0, axis=1)/m
plot(x[::m], ymin, x[::m], ymax, color='k')
fill_between(x[::m], ymin, ymax, color='g')
plot(x[::m], ymid, color='r')
print y.shape, y0.shape, ymin.shape, ymax.shape, ymid.shape
(1000000,) (1000, 1000) (1000,) (1000,) (1000,)
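The reshape-then-reduce trick for summarising a long signal can be run without plotting. This Python 3 sketch uses a shorter signal and a smaller block size than the slide (both my choices) and the min/max/mean array methods in place of the ufunc reduce calls.

```python
import numpy as np

x = np.linspace(0, 10 * np.pi, 100000)
y = np.sin(x) + 0.1 * np.sin(1000 * x)

m = 100                          # samples per block
blocks = y.reshape(-1, m)        # shape (1000, 100): one row per block
ymin = blocks.min(axis=1)        # lower envelope, one value per block
ymax = blocks.max(axis=1)        # upper envelope
ymid = blocks.mean(axis=1)       # block average
x_block = x[::m]                 # x value at the start of each block

print(y.shape, blocks.shape, ymin.shape, ymax.shape, ymid.shape)
```

The reshape only works when the signal length is a multiple of m; otherwise the signal has to be trimmed first.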
# interlude with numpy functions "sum" and "minimum"
seed(8)
temp0 = randint(0, 9, (2,3))
temp1 = randint(0, 9, (2,3))
print temp0
print
print temp1
print
print "sum temp0: "
print "axis=0"
print sum(temp0, axis=0) # numpy.sum(a, axis=None, dtype=None, out=None, keepdims=False)
print "axis=1"
print sum(temp0, axis=1)
print
print "minimum(temp0, temp1):"
print minimum(temp0, temp1) # numpy.minimum(x1, x2[, out])
print
print "reduce temp0:"
print minimum.reduce(temp0, axis=0)
print minimum.reduce(temp0, axis=1)
[[3 4 1]
 [5 8 3]]

[[8 0 5]
 [1 3 2]]

sum temp0: 
axis=0
[ 8 12 4]
axis=1
[ 8 16]

minimum(temp0, temp1):
[[3 0 1]
 [1 3 2]]

reduce temp0:
[3 4 1]
[1 3]
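The same axis arithmetic can be reproduced without relying on the random seed by typing the two arrays in directly. This Python 3 sketch does that; the keepdims line is an extra I added to show the option listed in the sum signature.

```python
import numpy as np

temp0 = np.array([[3, 4, 1], [5, 8, 3]])
temp1 = np.array([[8, 0, 5], [1, 3, 2]])

col_sums = temp0.sum(axis=0)                  # collapse rows: one sum per column
row_sums = temp0.sum(axis=1)                  # collapse columns: one sum per row
pairwise_min = np.minimum(temp0, temp1)       # element-by-element minimum of two arrays
col_min = np.minimum.reduce(temp0, axis=0)    # minimum within one array, per column
row_min = np.minimum.reduce(temp0, axis=1)    # minimum within one array, per row
kept = temp0.sum(axis=1, keepdims=True)       # same sums, but shape (2, 1) instead of (2,)

print(col_sums, row_sums)
print(pairwise_min)
print(col_min, row_min, kept.shape)
```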
# "concatenate"
# triangle wave (-2 to 5, 200 pts per period)
a0 = linspace(-2, 5, 100, endpoint=False) # endpoint=False leaves out the stop value
a1 = linspace(5, -2, 100, endpoint=False)
b0 = concatenate((a0, a1))
plot(b0)
xlim(-10, 210)
ylim(-2.5, 5.5)
Out[14]: (-2.5, 5.5)
b1 = concatenate((a0, a1)*6)
plot(b1)
Out[80]: [<matplotlib.lines.Line2D at 0x59a4210>]
# "resize"
new_length = int(len(b0)*6.2)
print len(b0), new_length
b1 = resize(b0, (new_length,))
plot(b1)
200 1240
Out[81]: [<matplotlib.lines.Line2D at 0x59ac9d0>]
# how about "repeat"?
x = arange(5)
print repeat(x, 4)
print x
[0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4]
[0 1 2 3 4]
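The difference between resize and repeat is easiest to see on a three-element array. A Python 3 sketch; np.tile is not mentioned in the talk and is included only for comparison.

```python
import numpy as np

x = np.arange(3)

resized = np.resize(x, 7)     # cycles through x until the new length is filled
repeated = np.repeat(x, 2)    # repeats each element in place
tiled = np.tile(x, 2)         # repeats the whole array (not from the talk)

print(resized)
print(repeated)
print(tiled)
```

resize is the one that can stop partway through a cycle, which is what the 6.2-period example relies on.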
# newaxis
print "x.shape = ", x.shape
print "x[newaxis,:].shape = ", x[newaxis,:].shape
y = repeat(x[newaxis, :], 4, axis=0)
print y
x.shape = (5,)
x[newaxis,:].shape = (1, 5)
[[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]
# y.flat
print y.flat
print array(y.flat) # y.flat is an iterator, array(y.flat) is an ndarray
<numpy.flatiter object at 0x2862000>
[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
broadcasting
b2 = array(repeat(b0[newaxis, :], 6, axis=0).flat)
plot(b2, 'r')
Out[35]: [<matplotlib.lines.Line2D at 0x5099b90>]
seed(4)
x0 = randint(0, 9, (3,2))
x1 = array([5, 10])
print x0
print
print x1
print
y = repeat(x1[newaxis,:], 3, axis=0)
print y
print
print x0*y
[[7 5]
 [1 8]
 [7 8]]

[ 5 10]

[[ 5 10]
 [ 5 10]
 [ 5 10]]

[[35 50]
 [ 5 80]
 [35 80]]
# broadcasting.... like an automatic newaxis and repeat (without actually creating the array)
y = x0*x1
print y
[[35 50]
 [ 5 80]
 [35 80]]
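Broadcasting and the explicit newaxis-plus-repeat version should give the same numbers. The Python 3 sketch below checks that using the x0 and x1 values printed earlier, typed in directly rather than drawn from the random seed. The column-vector case at the end is my own addition.

```python
import numpy as np

x0 = np.array([[7, 5], [1, 8], [7, 8]])
x1 = np.array([5, 10])

# Explicit version: build the full (3, 2) array of repeated rows first.
explicit = x0 * np.repeat(x1[np.newaxis, :], 3, axis=0)

# Broadcast version: shapes (3, 2) and (2,) are matched automatically.
broadcast = x0 * x1

# A (3, 1) column broadcasts across columns instead, scaling each row.
col = np.array([1, 2, 3])[:, np.newaxis]
scaled_rows = x0 * col

print(broadcast)
print(scaled_rows)
```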
Numpy plays well with others...
PIL
x = linspace(-1, 1, 512)
f0 = exp(-(x)**2/.05)
plot(x, f0)
Out[36]: [<matplotlib.lines.Line2D at 0xb2a52b0>]
# newaxis with broadcasting!
f02d = f0[newaxis,:]*f0[:,newaxis]
imshow(f02d, cmap=cm.gray)
Out[37]: <matplotlib.image.AxesImage at 0x4d0f470>
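Multiplying a row view by a column view of the same 1-D array builds the full 2-D product. A small Python 3 sketch on a 5-point grid (the grid size is my choice so the result is easy to inspect); np.outer is shown only as a cross-check.

```python
import numpy as np

x = np.linspace(-1, 1, 5)
f0 = np.exp(-x**2 / 0.05)                       # 1-D Gaussian bump, peak at x = 0

# (1, 5) times (5, 1) broadcasts to (5, 5): f2d[i, j] = f0[i] * f0[j]
f2d = f0[np.newaxis, :] * f0[:, np.newaxis]

same = np.outer(f0, f0)                         # the same product via outer

print(f2d.shape)
print(f2d[2, 2])                                # centre of the grid
```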
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='lena.png')
Out[20]: [image: lena.png]
import Image as pilImage # PIL's Image class
im = pilImage.open("lena.png")
print im.size
(512, 512)
# convert PIL image to Numpy ndarray
a = array(im)
print a.shape
print type(a), a.dtype
(512, 512, 3)
<type 'numpy.ndarray'> uint8
# convert to black and white using a simple sum of all colors...
a = array(a, dtype=uint32)
bw = sum(a, axis=2)
print bw.dtype, minimum.reduce(bw.flat), maximum.reduce(bw.flat)
imshow(bw, cmap=cm.gray)
uint32 106 720
Out[3]: <matplotlib.image.AxesImage at 0x48ad9d0>
bw1 = bw.copy()
bw1[bw1<150] = 1000
imshow(bw1, cmap=cm.gray)
Out[47]: <matplotlib.image.AxesImage at 0x570b4f0>
bw1 = roll(bw, 200, axis=1)
imshow(bw1, cmap=cm.gray)
Out[48]: <matplotlib.image.AxesImage at 0x59315f0>
ctypes
x = linspace(0, 10*pi, bw.shape[0])
bw2 = bw*(1+sin(x))
imshow(bw2, cmap=cm.gray)
Out[4]: <matplotlib.image.AxesImage at 0x4a62b90>
x, y = mgrid[-1:1:512j, -1:1:512j]
mask = (x**2 + y**2)<0.6**2
bw3 = bw.copy()
bw3[~mask]=0
imshow(bw3, cmap=cm.gray)
Out[50]: <matplotlib.image.AxesImage at 0x6857d70>
x, y = mgrid[-2:3,-2:3]
print x
print
print y
[[-2 -2 -2 -2 -2]
 [-1 -1 -1 -1 -1]
 [ 0 0 0 0 0]
 [ 1 1 1 1 1]
 [ 2 2 2 2 2]]

[[-2 -1 0 1 2]
 [-2 -1 0 1 2]
 [-2 -1 0 1 2]
 [-2 -1 0 1 2]
 [-2 -1 0 1 2]]
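The circular mask can be tried on a tiny grid where the result is easy to read. This is a Python 3 sketch: the 5x5 size and the constant test image are made up, and the 5j step means "5 points from -1 to 1 inclusive".

```python
import numpy as np

# Coordinates -1, -0.5, 0, 0.5, 1 along each axis.
x, y = np.mgrid[-1:1:5j, -1:1:5j]

mask = (x**2 + y**2) < 0.6**2     # True inside a circle of radius 0.6

img = np.full((5, 5), 9)          # stand-in for the image, every pixel = 9
img[~mask] = 0                    # black out everything outside the circle

print(mask.astype(int))
print(img)
```

Only the centre point and its four direct neighbours fall inside the circle on this grid.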
Demos:
spectrum from computer's microphone
handwritten number classification using feedforward neural network
# ndarray.ctypes.data

# C header
int32 DAQmxWriteAnalogF64 (TaskHandle taskHandle, int32 numSampsPerChan, bool32 autoStart, float64 timeout, bool32 dataLayout, float64 writeArray[], int32 *sampsPerChanWritten, bool32 *reserved);

# Python
nidaq.DAQmxWriteAnalogF64(taskHandle, int32(numSampsPerChan), int32(autoStart), float64(timeout), dataLayout, writeArray.ctypes.data, ctypes.byref(sampsPerChanWritten), None)
return sampsPerChanWritten.value

# that is, numpy_array.ctypes.data returns data that can be used when calling a C function with ctypes
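The DAQ library above needs hardware, but the pointer hand-off can be seen with ctypes alone. This Python 3 sketch is not the author's code: it uses arr.ctypes.data_as to get a typed C pointer to the array's buffer and writes through it, which is what a C function receiving that pointer would do.

```python
import ctypes
import numpy as np

arr = np.array([1.5, 2.5, 3.5], dtype=np.float64)

address = arr.ctypes.data     # integer address of the first element
ptr = arr.ctypes.data_as(ctypes.POINTER(ctypes.c_double))   # typed C pointer

first = ptr[0]                # read through the pointer
ptr[1] = 20.0                 # write through the pointer...
print(arr)                    # ...and the NumPy array sees the change
```

The array has to stay alive, and keep the expected dtype and contiguous layout, for as long as the C side uses the pointer.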
np_data = fromstring(microphone_data, dtype=np.int16)
spect = log10(abs(fft.rfft(np_data)))
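Without a microphone, the two lines can be tried on a synthetic tone. In this Python 3 sketch the sine wave, its amplitude and the sample count are stand-ins I made up. np.frombuffer is used in place of fromstring, whose binary mode is deprecated in newer NumPy releases, and a tiny offset is added before log10 to avoid taking the log of zero.

```python
import numpy as np

n = 1024                                    # number of samples
k = 50                                      # whole cycles inside the window
t = np.arange(n)
tone = np.sin(2 * np.pi * k * t / n)

# Pretend these bytes came from the sound card as 16-bit samples.
raw = (tone * 10000).astype(np.int16).tobytes()

np_data = np.frombuffer(raw, dtype=np.int16)
magnitude = np.abs(np.fft.rfft(np_data))    # rfft: spectrum of a real signal
spect = np.log10(magnitude + 1e-12)         # log scale, offset avoids log10(0)

peak_bin = int(np.argmax(magnitude))
print(len(np_data), len(spect), peak_bin)
```

With k whole cycles in the window, the peak lands in frequency bin k.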
from IPython.core.display import Image as ipImage # just for displaying image in the notebook
ipImage(filename='nn.png')
Out[38]: [image: nn.png]
X1 = np.concatenate((np.ones((m,1)), X), axis=1)
z2 = np.dot(X1,Theta1.T)
a2 = sigmoid(z2)
a2 = np.concatenate((np.ones((a2.shape[0],1)), a2), axis=1)
z3 = np.dot(a2, Theta2.T)
a3 = sigmoid(z3)
ix = np.argmax(a3, axis=1)
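To run the forward pass on its own, the missing pieces have to be supplied. In this Python 3 sketch the sigmoid definition, the layer sizes and the random weights are all assumptions for illustration, with no trained network or real digit data involved. The point is only to show how the shapes line up.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n_in, n_hidden, n_out = 4, 6, 5, 3            # hypothetical sizes

X = rng.normal(size=(m, n_in))                   # m samples, n_in features each
Theta1 = rng.normal(size=(n_hidden, n_in + 1))   # +1 column for the bias unit
Theta2 = rng.normal(size=(n_out, n_hidden + 1))

X1 = np.concatenate((np.ones((m, 1)), X), axis=1)     # prepend bias column of ones
a2 = sigmoid(np.dot(X1, Theta1.T))                    # hidden layer, shape (m, n_hidden)
a2 = np.concatenate((np.ones((m, 1)), a2), axis=1)    # prepend bias column again
a3 = sigmoid(np.dot(a2, Theta2.T))                    # output layer, shape (m, n_out)
ix = np.argmax(a3, axis=1)                            # predicted class per sample

print(a3.shape, ix)
```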