Introduction to Matplotlib for Data Analysis


What is matplotlib?

Why do I use it?

Installation

What you need is:
- Python version 2.5, 2.6 or 2.7
- Numpy version 1.3+
- Matplotlib version 1.0.1

Linux
Matplotlib 0.99 is the latest version in the Debian repositories. The latest release, 1.0.1, needs to be installed from source.

Instructions: http://matplotlib.sourceforge.net/users/installing.html

Windows
Download and install.

Website for documentation

http://matplotlib.sourceforge.net/

The gallery has a large number of examples.

Ways to run matplotlib
- Interactively using pylab and ipython
- Interactively in shell
- File
- As part of a larger program
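For the last two options, one common pattern is to put the plotting calls in functions that a script or a larger program can call. The sketch below is only an illustration: the helper names draw_series and build_report_figure are hypothetical, and the non-interactive Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so use a non-interactive backend
import matplotlib.pyplot as plt


def draw_series(ax, x, y, label):
    """Hypothetical helper: draw one labelled line on an existing Axes."""
    ax.plot(x, y, 'o-', label=label)
    return ax


def build_report_figure():
    """Hypothetical entry point that a larger program could call."""
    fig = plt.figure(figsize=(6, 4))
    ax = fig.add_subplot(111)
    draw_series(ax, [1, 2, 3, 4], [56, 45, 58, 32], 'series 1')
    ax.legend(loc='upper right')
    return fig


fig = build_report_figure()
```

Passing the Axes into the helper keeps the drawing code independent of how the figure is created, so the same helper can be reused in an interactive session or in a batch job.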

Interactively using pylab in ipython

    $ ipython -pylab

- Imports modules required to plot in one namespace
- Chart is updated as you enter commands

    In [1]: plot([1,2,3,4],[56,45,58,32])
    Out[1]: [<matplotlib.lines.Line2D object at 0x...>]

Show Window

- Save in various open formats
- Change plot size in window
- Zoom to inspect
- Pan to move along
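The same window actions can also be done from code, which helps when no window is available. This is a minimal sketch, assuming the Agg backend and in-memory buffers instead of files on disk; the variable names are illustrative only.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no plot window
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([1, 2, 3, 4], [56, 45, 58, 32])

# Change the plot size (width and height in inches).
fig.set_size_inches(8, 4)

# "Zoom" by narrowing the axis limits to the region of interest.
ax.set_xlim(2, 4)
ax.set_ylim(30, 60)

# Save in several open formats. A file name such as "plot.png" can be
# passed to savefig instead of a buffer.
saved = {}
for fmt in ("png", "pdf", "svg"):
    buf = io.BytesIO()
    fig.savefig(buf, format=fmt)
    saved[fmt] = buf.getvalue()
```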

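Simple bar graph
Using bar

    import numpy as np
    import matplotlib.pyplot as plt

    data1 = [12, 23, 38, 42, 41]
    fig = plt.figure(1, figsize=(6, 6))
    fig.clf()
    ax = fig.add_subplot(111)

    ind = np.arange(len(data1))
    # align='edge' puts each bar's left edge at x. This was the default in
    # Matplotlib 1.x; newer releases centre bars on x by default.
    rects = ax.bar(ind+0.125, data1, width=0.75, color='thistle', align='edge')

    plt.show()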

Add title
    ax.set_title('Simple bar graph', size=20)

Change the plot range
    ax.set_ylim(0,180)

Axis labels
    ax.set_xlabel('Data', size=14)
    ax.set_ylabel('Places', size=14)

Axis ticks and labels
    ax.set_xticks(ind+0.5)
    labels = ['west','east','centre','north','south']
    ax.set_xticklabels(labels, size=14)

Add bar labels
    def bar_label(rects):
        # Write each bar's value just above the top of the bar.
        for rect in rects:
            height = rect.get_height()
            ax.text(rect.get_x()+rect.get_width()/2., 1.05*height,
                    '%d'%int(height), ha='center', va='bottom')

    bar_label(rects)

Titles and labels
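Put together, the fragments above might look like the following script. It is a sketch rather than the original slide code: the Agg backend and align='edge' are assumptions (the latter keeps the bar edges where the slides' offsets expect them on newer Matplotlib releases), and plt.show() is left out so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to use a window
import numpy as np
import matplotlib.pyplot as plt

data1 = [12, 23, 38, 42, 41]
labels = ['west', 'east', 'centre', 'north', 'south']

fig = plt.figure(1, figsize=(6, 6))
fig.clf()
ax = fig.add_subplot(111)

# Bars are 0.75 wide and start 0.125 after each integer position,
# so each bar is centred on ind + 0.5.
ind = np.arange(len(data1))
rects = ax.bar(ind + 0.125, data1, width=0.75, color='thistle', align='edge')

ax.set_title('Simple bar graph', size=20)
ax.set_ylim(0, 180)
ax.set_xlabel('Data', size=14)
ax.set_ylabel('Places', size=14)
ax.set_xticks(ind + 0.5)
ax.set_xticklabels(labels, size=14)

# Label each bar with its value, slightly above the top of the bar.
for rect in rects:
    height = rect.get_height()
    ax.text(rect.get_x() + rect.get_width() / 2., 1.05 * height,
            '%d' % int(height), ha='center', va='bottom')
```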

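Side by Side
    # data2 and data3 are further lists with the same length as data1.
    # align='edge' puts each bar's left edge at x (the Matplotlib 1.x default).
    ax.bar(ind+0.125, data1, width=0.25, color='pink', label='A1', align='edge')
    ax.bar(ind+0.375, data2, width=0.25, color='thistle', label='A2', align='edge')
    ax.bar(ind+0.625, data3, width=0.25, color='salmon', label='A3', align='edge')
    ax.legend(loc='upper left')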

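Cumulative
    # bottom=data1 starts each A2 bar at the top of the matching A1 bar.
    # align='edge' puts each bar's left edge at x (the Matplotlib 1.x default).
    rects1 = ax.bar(ind+0.125, data1, width=0.75, color='lightblue', label='A1', align='edge')
    rects2 = ax.bar(ind+0.125, data2, width=0.75, bottom=data1, color='thistle', label='A2', align='edge')
    ax.legend(loc='upper left')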

Two datasets on same axes
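The side-by-side offsets (0.125, 0.375, 0.625) follow from dividing the space between the margins equally among the datasets. The sketch below computes them in a loop. The values for A2 and A3 are made up for illustration, since the slides do not list data2 and data3, and the Agg backend and align='edge' are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import numpy as np
import matplotlib.pyplot as plt

# (label, values, colour); only the A1 values come from the slides,
# the A2 and A3 values are hypothetical.
datasets = [
    ('A1', [12, 23, 38, 42, 41], 'pink'),
    ('A2', [10, 20, 30, 40, 50], 'thistle'),
    ('A3', [5, 15, 25, 35, 45], 'salmon'),
]

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

ind = np.arange(5)
margin = 0.125                               # gap on each side of a group
width = (1.0 - 2 * margin) / len(datasets)   # equal share for each dataset

left_edges = []
for i, (label, values, colour) in enumerate(datasets):
    left = ind + margin + i * width          # left edge of this dataset's bars
    ax.bar(left, values, width=width, color=colour, label=label, align='edge')
    left_edges.append(left[0])

ax.legend(loc='upper left')
```

With three datasets this gives a bar width of 0.25; adding a fourth dataset to the list would narrow the bars without any other change.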

Importing Data
Using numpy genfromtxt

    import numpy as np
    import matplotlib.pyplot as plt

    infile = open("data.csv", "r")
    # "S20" reads the text columns as byte strings (plain str on Python 2).
    # On Python 3, "U20" is needed for the comparisons with 'Yellow' to match.
    data = np.genfromtxt(infile, delimiter=",", dtype=("S20,S20,f8"), names=True)
    infile.close()

- Split into Colours

    yellow = data[data['Colour']=='Yellow']
    blue = data[data['Colour']!='Yellow']

- Plot histogram of data

    fig = plt.figure(1, figsize=(12,8))
    ax = fig.add_subplot(111)
    ax.hist(yellow['Length'], color='gold')
    ax.tick_params('both', labelsize=16)
    plt.show()
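To try the import and split steps without a data.csv file, the text can be read from an in-memory buffer. This is a sketch under stated assumptions: the rows are made up, the first column name (Name) is hypothetical since the slides only show Colour and Length, and the "U20" dtype is used so that the string comparison works on Python 3.

```python
import io

import numpy as np

# Made-up sample rows for illustration only.
csv_text = (
    "Name,Colour,Length\n"
    "a,Yellow,1.5\n"
    "b,Blue,2.0\n"
    "c,Yellow,3.25\n"
)

# names=True takes the field names from the header row.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                     dtype="U20,U20,f8", names=True)

# A boolean mask on one field selects whole rows of the structured array.
yellow = data[data['Colour'] == 'Yellow']
other = data[data['Colour'] != 'Yellow']
```

The selected rows keep all their fields, so yellow['Length'] can be passed straight to ax.hist.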

Multiple plots on the same figure
Using add_subplot

    fig = plt.figure()
    ax1 = fig.add_subplot(231)
    ax2 = fig.add_subplot(232)
    ax3 = fig.add_subplot(233)
    ax4 = fig.add_subplot(234)
    ax5 = fig.add_subplot(235)
    ax6 = fig.add_subplot(236)

    ax1.plot([12,13,25.5,15.2,19], 'bo-')
    ax2.plot([13,18.5,1.5,2,21], 'ro-')
    ax3.plot([10,12,11.5,16,23], 'go-')
    ax4.plot([6,11,5,12,21,32], 'ko-')
    ax5.plot([1.9,13,19.5,16.2,5], 'mo-')
    ax6.plot([13,13.2,26,18,14], 'yo-')

In add_subplot(231) the three digits are numrows (2), numcols (3) and fignum (1, the position of this plot in the grid).

Use to compare measurements across different categories
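The three-digit form can also be written as three separate integers, which makes it easy to create the grid in a loop. A minimal sketch, assuming the Agg backend and placeholder titles that are not from the slides:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

numrows, numcols = 2, 3
fig = plt.figure(figsize=(9, 6))

axes = []
for fignum in range(1, numrows * numcols + 1):
    # add_subplot(2, 3, 1) gives the same position as add_subplot(231).
    ax = fig.add_subplot(numrows, numcols, fignum)
    ax.set_title('plot %d' % fignum)  # placeholder title
    axes.append(ax)

# Plot numbers run left to right along the top row, then along the next row.
```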

Multiple plots on the same figure
Using Gridspec

    import matplotlib.pyplot as plt
    import matplotlib.gridspec as gridspec

    fig = plt.figure(1, figsize=(6,6))
    # 3 by 2 grid, with double height for the bottom row
    gs = gridspec.GridSpec(3, 2, width_ratios=[1,1], height_ratios=[1,1,2],
                           hspace=0.2, bottom=0.1)

    ax1 = fig.add_subplot(gs[0,0])
    ax2 = fig.add_subplot(gs[0,1])
    ax3 = fig.add_subplot(gs[1,0])
    ax4 = fig.add_subplot(gs[1,1])
    ax5 = fig.add_subplot(gs[2,:])   # span bottom row

Easier to use for complicated plot layouts
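One way to check what height_ratios and the gs[2,:] slice do is to inspect the resulting axes positions. The sketch below assumes the Agg backend; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(6, 6))
gs = gridspec.GridSpec(3, 2, height_ratios=[1, 1, 2], hspace=0.2, bottom=0.1)

top_left = fig.add_subplot(gs[0, 0])
top_right = fig.add_subplot(gs[0, 1])
bottom = fig.add_subplot(gs[2, :])

# Bounding boxes in figure coordinates (fractions of the figure size).
box_tl = top_left.get_position()
box_tr = top_right.get_position()
box_b = bottom.get_position()

# The bottom row is twice as tall as a top-row cell ...
height_ratio = box_b.height / box_tl.height
# ... and runs from the left edge of column 0 to the right edge of column 1.
spans_both = (abs(box_b.x0 - box_tl.x0) < 1e-9 and
              abs(box_b.x1 - box_tr.x1) < 1e-9)
```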

Multiple datasets on the same axes
Using Twin Axes

    import numpy as np
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111)
    twin_ax = ax.twinx()   # second y axis sharing the same x axis

    sales = [45,69,60,67]
    returns = [82,91,89,78.5]
    ind = np.arange(len(sales))

    # align='edge' puts each bar's left edge at x (the Matplotlib 1.x default).
    rects1 = ax.bar(ind+0.125, sales, width=0.75, color='thistle', align='edge')
    p1 = twin_ax.plot(ind+0.5, returns, 'gs-')

    ax.set_ylim(0, 75)
    twin_ax.set_ylim(0, 100)
    ax.set_xticks(ind+0.5)
    ax.set_xticklabels(['North','South','East','West'])
    ax.set_ylabel('Sales')
    twin_ax.set_ylabel('% Returned')
    ax.set_title('Sales v Returns')

    # plot() returns a list of lines, so pass the line itself as the legend handle.
    plt.figlegend((rects1[0], p1[0]), ('Sales', '% Returned'), loc='upper left')
    plt.show()

Questions?