Introduction to Matplotlib for Data Analysis
Uploaded by ewblen. Category: Technology.
What is matplotlib?
Why do I use it?
Installation
What you need:
Python version 2.5, 2.6 or 2.7
NumPy version 1.3+
Matplotlib version 1.0.1
Linux
Matplotlib 0.99 is the latest version in the Debian repositories. The latest release, 1.0.1, needs to be installed from source.
Instructions: http://matplotlib.sourceforge.net/users/installing.html
Windows
Download and install.
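After installing, it can help to confirm which versions Python actually picks up. The snippet below is a minimal sketch of such a check; it only reads the standard `__version__` attributes and assumes both packages import cleanly.

```python
import sys

import numpy as np
import matplotlib

# Print the versions found on the current Python path so they can be
# compared against the requirements listed above.
print("Python    :", sys.version.split()[0])
print("NumPy     :", np.__version__)
print("Matplotlib:", matplotlib.__version__)
```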
Website for documentation
http://matplotlib.sourceforge.net/
The gallery has a large number of examples.
Ways to run matplotlib
Interactively using pylab and ipython
Interactively in shell
File
As part of a larger program
catherine@catherine-HP-Mini-110-3100:~$ ipython -pylab
Interactively using pylab in ipython
Imports modules required to plot in one namespace
Chart is updated as you enter commands
In [1]: plot([1,2,3,4],[56,45,58,32])
Out[1]: []
Show Window
Save in various open formats
Change plot size in window
Zoom to inspect
Pan to move along
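The same plot can also be produced from a file rather than an interactive session. The following is a sketch under a few assumptions: it uses the `matplotlib.pyplot` interface instead of the pylab namespace, renders off-screen with the Agg backend, and writes to a hypothetical file name in the system temp directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to get a window
import matplotlib.pyplot as plt

# Same data as the interactive pylab example above.
fig, ax = plt.subplots()
lines = ax.plot([1, 2, 3, 4], [56, 45, 58, 32])

# Hypothetical output path, chosen only for illustration.
out_path = os.path.join(tempfile.gettempdir(), "simple_plot.png")
fig.savefig(out_path)
```

With an interactive backend, `plt.show()` in place of `savefig` would open the plot window described above.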
Simple Bargraph
Using bar
import numpy as np
import matplotlib.pyplot as plt
data1 = [12,23,38,42,41]
fig = plt.figure(1,(6,6))
fig.clf()
ax = fig.add_subplot(111)
ind = np.arange(len(data1))
rects = ax.bar(ind+0.125, data1, width=0.75, color='thistle')
plt.show()
Add title
ax.set_title('Simple bar graph', size=20)
Change the plot range
ax.set_ylim(0,180)
Axis labels
ax.set_xlabel('Data', size=14)
ax.set_ylabel('Places', size=14)
Axis ticks and labels
ax.set_xticks(ind+0.5)
labels = ['west','east','centre','north','south']
ax.set_xticklabels(labels, size=14)
Add bar labels
def bar_label(rects):
    above = 1.05 * min([r.get_height() for r in rects])
    for rect in rects:
        height = rect.get_height()
        # place the value just above the top of each bar, centred horizontally
        ax.text(rect.get_x()+rect.get_width()/2., 1.05*height, '%d'%int(height), ha='center', va='bottom')
bar_label(rects)
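Putting the steps above together, a complete script might look like the sketch below. Two things here are assumptions rather than part of the slides: `align='edge'` is passed explicitly because newer Matplotlib releases centre bars on their x position by default, and the y-range is derived from the data instead of using a fixed value.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to get a window
import matplotlib.pyplot as plt

data1 = [12, 23, 38, 42, 41]
labels = ['west', 'east', 'centre', 'north', 'south']
ind = np.arange(len(data1))

fig = plt.figure(1, figsize=(6, 6))
fig.clf()
ax = fig.add_subplot(111)

# align='edge' (assumption for newer Matplotlib): x is the left edge of each bar,
# so a bar of width 0.75 starting at ind+0.125 is centred on ind+0.5.
rects = ax.bar(ind + 0.125, data1, width=0.75, color='thistle', align='edge')

ax.set_title('Simple bar graph', size=20)
ax.set_ylim(0, 1.2 * max(data1))  # headroom for the labels (illustrative choice)
ax.set_xlabel('Data', size=14)
ax.set_ylabel('Places', size=14)
ax.set_xticks(ind + 0.5)
ax.set_xticklabels(labels, size=14)

# Write each bar's value just above its top edge.
for rect in rects:
    height = rect.get_height()
    ax.text(rect.get_x() + rect.get_width() / 2.0, 1.05 * height,
            '%d' % int(height), ha='center', va='bottom')
```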
Titles and labels
Side by Side
ax.bar(ind+0.125, data1, width=0.25, color='pink', label='A1')
ax.bar(ind+0.375, data2, width=0.25, color='thistle', label='A2')
ax.bar(ind+0.625, data3, width=0.25, color='salmon', label='A3')
ax.legend(loc='upper left')
Cumulative
rects1 = ax.bar(ind+0.125, data1, width=0.75, color='lightblue', label='A1')
rects2 = ax.bar(ind+0.125, data2, width=0.75, bottom=data1, color='thistle', label='A2')
ax.legend(loc='upper left')
Two datasets on same axes
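The side-by-side and cumulative fragments above can be tried out with the sketch below. The slides do not show the contents of `data2` and `data3`, so the values here are hypothetical; `align='edge'` is again an assumption made for newer Matplotlib releases, and the two-panel layout is only for comparison.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to get a window
import matplotlib.pyplot as plt

data1 = [12, 23, 38, 42, 41]
data2 = [10, 18, 30, 25, 35]  # hypothetical values, not from the slides
data3 = [8, 15, 22, 30, 28]   # hypothetical values, not from the slides
ind = np.arange(len(data1))

fig, (ax_side, ax_stack) = plt.subplots(1, 2, figsize=(12, 6))

# Side by side: three narrow bars share each unit-wide slot.
ax_side.bar(ind + 0.125, data1, width=0.25, color='pink', label='A1', align='edge')
ax_side.bar(ind + 0.375, data2, width=0.25, color='thistle', label='A2', align='edge')
ax_side.bar(ind + 0.625, data3, width=0.25, color='salmon', label='A3', align='edge')
ax_side.legend(loc='upper left')

# Cumulative: the second series starts where the first one ends (bottom=data1).
stack1 = ax_stack.bar(ind + 0.125, data1, width=0.75, color='lightblue',
                      label='A1', align='edge')
stack2 = ax_stack.bar(ind + 0.125, data2, width=0.75, bottom=data1,
                      color='thistle', label='A2', align='edge')
ax_stack.legend(loc='upper left')
```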
Importing Data
Using numpy genfromtxt
import numpy as np
infile = open("data.csv", "r")
data = np.genfromtxt(infile, delimiter=",", dtype=("S20,S20,f8"), names=True)
infile.close()
- Split into Colours
yellow = data[data['Colour']=='Yellow']
blue = data[data['Colour']!='Yellow']
- plot histogram of data
fig = plt.figure(1, figsize=(12,8))
ax = fig.add_subplot(111)
ax.hist(yellow['Length'], color='gold')
ax.tick_params('both', labelsize=16)
plt.show()
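The import and split steps can be tested without a `data.csv` file by reading from an in-memory string. This is a sketch under several assumptions: the rows and the middle column name `Shape` are hypothetical, the string dtype is `U20` rather than `S20` so that comparisons against `'Yellow'` work on Python 3, and `encoding` is passed explicitly.

```python
import io

import numpy as np

# Hypothetical CSV content; only the 'Colour' and 'Length' column names
# come from the slides.
csv_text = """Colour,Shape,Length
Yellow,Round,4.5
Blue,Square,3.2
Yellow,Square,5.1
Green,Round,2.8
"""

# names=True takes the field names from the header row.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                     dtype="U20,U20,f8", names=True, encoding="utf-8")

# Boolean masks split the structured array by colour.
yellow = data[data['Colour'] == 'Yellow']
other = data[data['Colour'] != 'Yellow']

print(yellow['Length'])
print(other['Colour'])
```

The resulting `yellow['Length']` array can be passed straight to `ax.hist` as in the slide above.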
Multiple plots on the same figure
Using add_subplot
The three digits passed to add_subplot are numrows, numcols and fignum: add_subplot(231) is the first plot in a grid of 2 rows and 3 columns.
ax1 = fig.add_subplot(231)
ax2 = fig.add_subplot(232)
ax3 = fig.add_subplot(233)
ax4 = fig.add_subplot(234)
ax5 = fig.add_subplot(235)
ax6 = fig.add_subplot(236)
ax1.plot([12,13,25.5,15.2,19], 'bo-')
ax2.plot([13,18.5,1.5,2,21], 'ro-')
ax3.plot([10,12,11.5,16,23], 'go-')
ax4.plot([6,11,5,12,21,32], 'ko-')
ax5.plot([1.9,13,19.5,16.2,5], 'mo-')
ax6.plot([13,13.2,26,18,14], 'yo-')
Use to compare measurements across different categories
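The numrows, numcols and fignum arguments can also be passed to `add_subplot` as three separate integers, which makes it easy to build the grid in a loop. The sketch below assumes that form; the figure size and panel titles are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to get a window
import matplotlib.pyplot as plt

numrows, numcols = 2, 3
fig = plt.figure(figsize=(9, 6))

axes = []
# fignum counts from 1, left to right and then top to bottom.
for fignum in range(1, numrows * numcols + 1):
    ax = fig.add_subplot(numrows, numcols, fignum)
    ax.set_title('Panel %d' % fignum)
    axes.append(ax)
```

Here `axes[0]` corresponds to `add_subplot(231)` and `axes[3]`, the first panel of the second row, to `add_subplot(234)`.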
Multiple plots on the same figure
Using Gridspec
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure(1,(6,6))
gs = gridspec.GridSpec(3, 2, width_ratios=[1,1], height_ratios=[1,1,2], hspace=0.2, bottom=0.1)
ax1 = fig.add_subplot(gs[0,0])
ax2 = fig.add_subplot(gs[0,1])
ax3 = fig.add_subplot(gs[1,0])
ax4 = fig.add_subplot(gs[1,1])
ax5 = fig.add_subplot(gs[2,:])
3 by 2 grid
Double height for bottom row
Easier to use for complicated plot layouts
Span bottom row
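To see how the height ratios and the spanning slice play out, the layout can be built and the resulting axes positions compared. This is a sketch using the same `GridSpec` arguments as the slide; the position checks at the end are added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to get a window
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(6, 6))

# 3 rows by 2 columns; the last row is twice as tall as the others.
gs = gridspec.GridSpec(3, 2, width_ratios=[1, 1], height_ratios=[1, 1, 2],
                       hspace=0.2, bottom=0.1)

ax1 = fig.add_subplot(gs[0, 0])
ax2 = fig.add_subplot(gs[0, 1])
ax3 = fig.add_subplot(gs[1, 0])
ax4 = fig.add_subplot(gs[1, 1])
ax5 = fig.add_subplot(gs[2, :])  # the slice spans both columns of the bottom row

# Positions are in figure coordinates (fractions of the figure size).
top_left = ax1.get_position()
bottom = ax5.get_position()
print("height ratio:", bottom.height / top_left.height)
print("width ratio :", bottom.width / top_left.width)
```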
Multiple datasets on the same axes
Using Twin Axes
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
twin_ax = ax.twinx()
sales = [45,69,60,67]
returns = [82,91,89,78.5]
ind = np.arange(len(sales))
rects1 = ax.bar(ind+0.125, sales, width=0.75, color='thistle')
# plot returns a list of lines; unpack the single line so it can be used as a legend handle
p1, = twin_ax.plot(ind+0.5, returns, 'gs-')
ax.set_ylim(0, 75)
twin_ax.set_ylim(0,100)
ax.set_xticks(ind+0.5)
ax.set_xticklabels(['North','South','East','West'])
ax.set_ylabel('Sales')
twin_ax.set_ylabel('% Returned')
ax.set_title('Sales v Returns')
plt.figlegend((rects1[0], p1), ('Sales', '% Returned'), loc='upper left')
plt.show()
Questions?