A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2...

43
9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 1/43 In [2]: %matplotlib inline import pandas as pd import numpy as np import matplotlib.pylab as plt from scipy import stats from matplotlib.backends.backend_pdf import PdfPages Pandas Pictures are for fun. No need to add them to your notebook! In [2]: from IPython.display import Image Image(filename='PeterPanda2-phineas-pherb.jpg',width=400) Automated Pandas? Out[2]:

Transcript of A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2...

Page 1: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 1/43

In [2]: %matplotlib inline import pandas as pdimport numpy as npimport matplotlib.pylab as pltfrom scipy import statsfrom matplotlib.backends.backend_pdf import PdfPages

PandasPictures are for fun. No need to add them to your notebook!

In [2]: from IPython.display import ImageImage(filename='PeterPanda2-phineas-pherb.jpg',width=400)

Automated Pandas?

Out[2]:

Page 2: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 2/43

In [3]: from IPython.display import ImageImage(filename='Pandaborg.jpg',width=400)

Today we want to automate a few things in pandas. this means we will need to clean up our coding and callthings in a better method. I will walk you through it. Lets read in our file and get started. We will usewell_data.csv again. Look on courseworks and you will see our goal is making a file with the graphs we want. Ikept the df in the name as it is a dataframe

In [4]: pwd

In [3]: df_well_data=pd.read_csv('well_data.csv')

Now instead of using 'As' lets set i to 'As' and try and call it. Remember I am using head() to save space printing.You do not need it for class.

In [4]: i='As'df_well_data[i].head()

Out[3]:

Out[4]: 'C:\\work-teaching\\python\\fall18\\BigDataPython'

Out[4]: 0 NaN 1 NaN 2 NaN 3 78.97747 4 NaN Name: As, dtype: float64

Page 3: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 3/43

We can also do fucntions on it. Lets call dtype which tells us the data type.

In [7]: i='As'df_well_data[i].dtype

now try setting i to Drink

In [39]:

In [8]: from IPython.display import ImageImage(filename='standard_unit_vectors_3d.png',width=400) #copied from http://mathinsight.org/cross_product_formula

I am not going to go into it. But i,j,k are used for vectors in math and are thus the common names people iterateover. So i,j,k is the vector in x,y,z space. But you could call it anything. But i,j,k are simple and easy

Out[7]: dtype('float64')

Out[39]: dtype('O')

Out[8]:

Page 4: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 4/43

In [9]: I_need_to_think_of_a_better_name='As'df_well_data[I_need_to_think_of_a_better_name].head()

now what if we wanted to print the datatypes one by one. we could use a for loop. The nice thing about Pandasis you can iterate over the columns automatically.

In [5]: for i in df_well_data: print (i)

Now that you know how to iterate can you go through and print each data type?

Out[9]: 0 NaN 1 NaN 2 NaN 3 78.97747 4 NaN Name: As, dtype: float64

Well_ID Lat Lon Depth Drink Si P S Ca Fe Ba Na Mg K Mn As Sr F Cl SO4 Br

Page 5: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 5/43

In [41]:

We could use our format to make it cleaner and look prettier.

In [43]:

Well_ID is of type int64 Lat is of type float64 Lon is of type float64 Depth is of type int64 Drink is of type object Si is of type float64 P is of type float64 S is of type float64 Ca is of type float64 Fe is of type float64 Ba is of type float64 Na is of type float64 Mg is of type float64 K is of type float64 Mn is of type float64 As is of type float64 Sr is of type float64 F is of type float64 Cl is of type float64 SO4 is of type float64 Br is of type float64

Well_ID is of type int64 Lat is of type float64 Lon is of type float64 Depth is of type int64 Drink is of type object Si is of type float64 P is of type float64 S is of type float64 Ca is of type float64 Fe is of type float64 Ba is of type float64 Na is of type float64 Mg is of type float64 K is of type float64 Mn is of type float64 As is of type float64 Sr is of type float64 F is of type float64 Cl is of type float64 SO4 is of type float64 Br is of type float64

Page 6: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 6/43

Now lets get our code for printing a graph and instead of saying As. Lets put in i. this is my original HW answer.

THERE IS A CONFUSING PART HERE!

1. We are using arsenic twice in the same graph2. We are plotting it3. We are also coloring the dots with it.4. I would keep the coloring of the dots hardwired with arsenic.5. I would change the initial call to arsenic to an i.

In [6]: fig,ax=plt.subplots()

#plot <10ax.scatter(df_well_data.As[df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')#plot 10-50ax.scatter(df_well_data.As[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50')#plot<50ax.scatter(df_well_data.As[df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')

ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel('Arsenic ppb')ax.set_ylabel('Depth ft')ax.set_xlim([0,800])ax.set_ylim([250,0])plt.legend(loc=4, fancybox=True, framealpha=0.5)

Out[6]: <matplotlib.legend.Legend at 0x209106db320>

Page 7: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 7/43

Now lets try it with i='As'. But we want to keep the As cutoffs. So don't change the cutoffs. We want to be able toplot a different parameter other than As but still color the points by arsenic. So keep the As boolean calls butchange the other .As to [i] BE CAREFUL WHAT YOU CHANGE!

In [49]: i='As'

Now lets change i from As to Fe

In [50]:

Do you like the axis lengths and titles? What can we do? Can you set xlim and xlabel to automatically choose theright thing?

Out[49]: <matplotlib.legend.Legend at 0xce35048>

Out[50]: <matplotlib.legend.Legend at 0xa3774a8>

Page 8: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 8/43

In [52]:

Can you now plot a different column? Choose one and see if it works.

If you forget the column names...

In [8]: df_well_data.columns

In [54]:

Out[52]: <matplotlib.legend.Legend at 0xd3dbcc0>

Out[8]: Index(['Well_ID', 'Lat', 'Lon', 'Depth', 'Drink', 'Si', 'P', 'S', 'Ca', 'Fe', 'Ba', 'Na', 'Mg', 'K', 'Mn', 'As', 'Sr', 'F', 'Cl', 'SO4', 'Br'], dtype='object')

Out[54]: <matplotlib.legend.Legend at 0xe459978>

Page 9: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 9/43

Read these notes carefully and take your time. The devil is in thedetails. READ THE NOTES!But I want a file with all the parameters.So we still need to learn two things.

How to write to files and then how to loop.Lets do them one at a time.

For a file we want pdf becuase we can do multiple pages. so if we google matplotlib write to file we get this faqpage http://matplotlib.org/faq/howto_faq.html (http://matplotlib.org/faq/howto_faq.html) and then we can click tohttp://matplotlib.org/faq/howto_faq.html#save-multiple-plots-to-one-pdf-file(http://matplotlib.org/faq/howto_faq.html#save-multiple-plots-to-one-pdf-file) so we need to import frommatplotlib.backends.backend_pdf import PdfPages which we did above. then it is three steps.

1. Open the file. pp = PdfPages('multipage.pdf')2. save the graph. pp.savefig()3. Close the file so we can open it somewhere else. pp.close()

so basically pp references our file and we are doing things to it. We keep adding pages to the file pp. When aredone we close it.

so lets try it.

Page 10: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 10/43

In [13]: pp = PdfPages('filetest.pdf')

i='Ca'fig,ax=plt.subplots(1,1)#I added line breaks to read betterax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

ax.scatter( df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,color='g',label='10-50')

ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')

ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel(i)ax.set_ylabel('Depth ft')ax.set_xlim([0,df_well_data[i].max()]) #you can set the limits if you like it starting at 0ax.set_ylim([250,0])plt.legend(loc=4, fancybox=True, framealpha=0.5)

pp.savefig()pp.close()

Important noteHopefully you now have a file. files can get buggy. I usually have to delete the file before running the programand sometimes have to change file names. But it is nice.

Page 11: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 11/43

Now lets try and add a for loop around the graphs and see if we can plot them all. remember to put the open andclose outside of the for loop. Also, we need to make our subplots outside the for loop. This will make more senselater. But for my first try I got an error! Lets see what we get!

In [67]:

---------------------------------------------------------------------------ValueError Traceback (most recent call last) <ipython-input-67-233e402daeec> in <module>() 3 for i in well_data: 4 ----> 5 ax.scatter(well_data[i][well_data.As<10],well_data.Depth[well_data.As<10],color='b',label='<10') 6 ax.scatter(well_data[i][logical_and(well_data.As>=10,well_data.As<=50)],well_data.Depth[logical_and(well_data.As>=10,well_data.As<=50)],color='g',label='10-50') 7 ax.scatter(well_data[i][well_data.As>50],well_data.Depth[well_data.As>50],color='r',label='>50')

C:\Users\bmailloux\AppData\Local\Enthought\Canopy\User\lib\site-packages\matplotlib\axes\_axes.pyc in scatter(self, x, y, s, c, marker, cmap, norm, vmin,vmax, alpha, linewidths, verts, **kwargs) 3585 c = np.ma.ravel(c) 3586 -> 3587 x, y, s, c = cbook.delete_masked_points(x, y, s, c) 3588 3589 scales = s # Renamed for readability below.

C:\Users\bmailloux\AppData\Local\Enthought\Canopy\User\lib\site-packages\matplotlib\cbook.pyc in delete_masked_points(*args) 1805 return () 1806 if (is_string_like(args[0]) or not iterable(args[0])):-> 1807 raise ValueError("First argument must be a sequence") 1808 nrecs = len(args[0]) 1809 margs = []

ValueError: First argument must be a sequence

Page 12: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 12/43

You should get an error and not be able to open the file.

If the graphs are not showing we need to debug.

I would add a line where it prints i out every time through the for loop so you can see the progress.

Also, as you can see it plotted a lot of stuff on top of each other. That is because we only have one graph windowopen. We either need to close it or clear it everytime. to close use as.cla to clear the axes. So once you are donewith a grpah and you have saved it. do ax.cla() and this clears out the axes and gets ready for you to plot again.

So we are fixing two things to try and find out error. First let's add ax.cla() which clears out the axes.

then lets print at each time through the for loop so we can see where an error occurs.

Page 13: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 13/43

In [68]:

Well_ID Lat Lon Depth Drink

---------------------------------------------------------------------------ValueError Traceback (most recent call last) <ipython-input-68-687418f2c4c1> in <module>() 3 for i in well_data: 4 print i ----> 5 ax.scatter(well_data[i][well_data.As<10],well_data.Depth[well_data.As<10],color='b',label='<10') 6 ax.scatter(well_data[i][logical_and(well_data.As>=10,well_data.As<=50)],well_data.Depth[logical_and(well_data.As>=10,well_data.As<=50)],color='g',label='10-50') 7 ax.scatter(well_data[i][well_data.As>50],well_data.Depth[well_data.As>50],color='r',label='>50')

C:\Users\bmailloux\AppData\Local\Enthought\Canopy\User\lib\site-packages\matplotlib\axes\_axes.pyc in scatter(self, x, y, s, c, marker, cmap, norm, vmin,vmax, alpha, linewidths, verts, **kwargs) 3585 c = np.ma.ravel(c) 3586 -> 3587 x, y, s, c = cbook.delete_masked_points(x, y, s, c) 3588 3589 scales = s # Renamed for readability below.

C:\Users\bmailloux\AppData\Local\Enthought\Canopy\User\lib\site-packages\matplotlib\cbook.pyc in delete_masked_points(*args) 1805 return () 1806 if (is_string_like(args[0]) or not iterable(args[0])):-> 1807 raise ValueError("First argument must be a sequence") 1808 nrecs = len(args[0]) 1809 margs = []

ValueError: First argument must be a sequence

Page 14: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 14/43

We still can't open the file b/c it is not closing correctly.

But now because we are printing i during each for loop you can see that we crash at drink.

Drink is a string. so we could check the dtype to make sure it is a float and then make a graph for floats. I alsochanged the file name b/c I have trouble writing over these ones that end in an error.

In [69]:

Almost there. We just don't need Lat and lon so we could start the loop later.

Well_ID Lat Lon Depth Drink Si P S Ca Fe Ba Na Mg K Mn As Sr F Cl SO4 Br

Page 15: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 15/43

In [19]: pp = PdfPages('filetest4.pdf')fig,ax=plt.subplots(1,1)for i in df_well_data.iloc[:,5:]: print (i) if df_well_data[i].dtype==float: ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50') ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50') #ax.invert_yaxis() #ax.xaxis.set_tick_params(labeltop='on') ax.xaxis.set_label_position('top') ax.xaxis.tick_top() ax.set_xlabel(i) ax.set_ylabel('Depth ft') ax.set_xlim([0,df_well_data[i].max()]) ax.set_ylim([250,0]) plt.legend(loc=4, fancybox=True, framealpha=0.5) pp.savefig() ax.cla()

pp.close()

Page 16: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 16/43

you should now have a file with all your data. I love that.

Now after looking at all the data I realize I want to plot As, F, S on 3 plots next to each other. lets see if we can dothis. here is an example. http://matplotlib.org/examples/pylab_examples/subplots_demo.html(http://matplotlib.org/examples/pylab_examples/subplots_demo.html) So if you look I want to share a y-axis. Letstry..... Lets see if we can learn. I am nervous as these plot always give me trouble so I going to start simple andjust use our code for plotting less than 10. In subplots we will get three sets of axes. One for each plot. Also, I amtelling it to give me 1 row by 3 columns. then we are going to share the y-axes. To make thigs easier I am goingto just show the less than 10 data first. Also since we have 3 plots on one page we have 1 fig and 3 ax values.

For this work I was using i to call the element name. You can use any variable name to name the integer. Goback into your code and change i to something more intuitive. It could be j? it could be k? It could beelement_name. It could be I_CANT_THINK_OF_ANYTHING It is your choice. But change the name and make itwork again!

In [ ]:

Si P S Ca Fe Ba Na Mg K Mn As Sr F Cl SO4 Br

Page 17: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 17/43

In [20]: fig, (ax1, ax2, ax3) = plt.subplots(1, 3, sharey=True)

i='As'ax1.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

i='Fe'ax2.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

i='S'ax3.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

ax1.set_ylim([250,0])

First step is I am going to make my ax into a list.

Out[20]: (250, 0)

Page 18: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 18/43

In [21]: fig, ax = plt.subplots(1, 3, sharey=True)

i='As'ax[0].scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

i='Fe'ax[1].scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

i='S'ax[2].scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')

ax[0].set_ylim([250,0])

That is really encouraging. But we have to get rid of the hardwires and do it in a for loop eventually. This way wecould just choose our parameters and it would do it. So lets go back and do for loops.

In [23]: elems=['As','Fe','S']for elem in elems: print elem

Now we could look at each of those data.

Out[21]: (250, 0)

As Fe S

Page 19: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 19/43

In [23]: elems=['As','Fe','S']for elem in elems: print (df_well_data[elem].head())

Flashback to for loops. This is what I wrote in the notes!

Python has a simple trick to count through a list and give you the list. It may not make sense now but it will laterin the semester!

This is the example I gave.

In [24]: mystrlist=['env','chem','bio','psych']for i,mystr in enumerate(mystrlist): print (i,mystr)

Now lets do it for our elements!

In [25]: elems=['As','Fe','S']for count,elem in enumerate(elems): print (count,elem)

0 NaN 1 NaN 2 NaN 3 78.97747 4 NaN Name: As, dtype: float64 0 NaN 1 NaN 2 NaN 3 1.260031 4 NaN Name: Fe, dtype: float64 0 NaN 1 NaN 2 NaN 3 2085.570979 4 NaN Name: S, dtype: float64

0 env 1 chem 2 bio 3 psych

0 As 1 Fe 2 S

Page 20: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 20/43

Lets keep getting rid of hard wires. Now instead of doing subplots(1,3) lets use the length of elems!

In [26]: elems=['As','Fe','S']fig, ax = plt.subplots(1, len(elems), sharey=True)

Now lets add our for loop to after we make the plots.

In [27]: elems=['As','Fe','S']fig, ax = plt.subplots(1, len(elems), sharey=True)for count,elem in enumerate(elems): print (count,elem)

Now instead of printing lets call the correct ax and plot the elements.

0 As 1 Fe 2 S

Page 21: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 21/43

In [28]: elems=['As','Fe','S']fig, ax = plt.subplots(1, len(elems), sharey=True)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')ax[0].set_ylim([250,0])

Now lets add the full data set!

Out[28]: (250, 0)

Page 22: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 22/43

In [30]: elems=['As','Fe','S']fig, ax = plt.subplots(1, len(elems), sharey=True)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax[count].scatter(df_well_data[elem][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='<10') ax[count].scatter(df_well_data[elem][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='<10')ax[0].set_ylim([250,0])

This is looking great. Lets see if we can clean it up. Lets fix the axis labels automaticaly and set the x limits tostart at 0 and end at max. Also, lets label the y axis. and make the whole thing bigger!I use fig.set_size_inches(width,height) then I use ax.locator_params(nbins=4,axis='x') to set the number of tickson the x axis.

Out[30]: (250, 0)

Page 23: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 23/43

In [32]: elems=['As','Fe','S']fig, ax = plt.subplots(1, len(elems), sharey=True)fig.set_size_inches(10,6)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax[count].scatter(df_well_data[elem][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='<10') ax[count].scatter(df_well_data[elem][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='<10') ax[count].set_xlabel(elem) #labels with the element ax[count].set_xlim(0,np.max(df_well_data[elem])) #sets the limit to the maximum ax[count].locator_params(nbins=4,axis='x') #gives us 4 ticks marks on the x axies ax[0].set_ylim([250,0])

To prove this works add two more parameters to elem.

Out[32]: (250, 0)

Page 24: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 24/43

In [34]: elems=['As','Fe','S','P','Na']fig, ax = plt.subplots(1, len(elems), sharey=True)fig.set_size_inches(12,6)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax[count].scatter(df_well_data[elem][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='<10') ax[count].scatter(df_well_data[elem][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='<10') ax[count].set_xlabel(elem) #labels with the element ax[count].set_xlim(0,np.max(df_well_data[elem])) #sets the limit to the maximum ax[count].locator_params(nbins=4,axis='x') #gives us 4 ticks marks on the x axies ax[0].set_ylim([250,0])

In [ ]:

Now as I am prepping for class I realize that it is hard to see all the points on the graph. So i think we shouldmake them transparent or open. Transparencey is controlled by alpha. alpha= and then a number.

Out[34]: (250, 0)

Page 25: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 25/43

In [60]: elems=['As','Fe','S','P','Na']fig, ax = plt.subplots(1, len(elems), sharey=True)fig.set_size_inches(12,6)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10] ,color='b',label='<10',alpha=0.5) ax[count].scatter(df_well_data[elem][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,color='g',label='<10',alpha=0.5) ax[count].scatter(df_well_data[elem][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50] ,color='r',label='<10',alpha=0.5) ax[count].set_xlabel(elem) #labels with the element ax[count].set_xlim(0,np.max(df_well_data[elem])) #sets the limit to the maximum ax[count].locator_params(nbins=4,axis='x') #gives us 4 ticks marks on the x axies ax[0].set_ylim([250,0])

Out[60]: (250, 0)

Page 26: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 26/43

After looking at the graph I don't like the transparency. I think it is sort of ugly. so instead lets try open circlesinstead. To do this you use the key words

facecolor='none' this is saying no colors filling in. You could set this color.

edgecolors= This sets the edgecolor instead of the whole color

We almost forgot a y label!

This is the link for tick_params. https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.axes.Axes.tick_params.html(https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.axes.Axes.tick_params.html) I used it to make the fontslarger. I think that looks nicer

Page 27: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 27/43

In [55]: elems=['As','Fe','S','P','Na']fig, ax = plt.subplots(1, len(elems), sharey=True)fig.set_size_inches(12,6)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10] ,color='b',label='<10',facecolor='none') ax[count].scatter(df_well_data[elem][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,color='g',label='<10',facecolor='none') ax[count].scatter(df_well_data[elem][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50] ,color='r',label='<10',facecolor='none') ax[count].set_xlabel(elem,fontsize=14) #labels with the element ax[count].set_xlim(0,np.max(df_well_data[elem])) #sets the limit to the maximum ax[count].locator_params(nbins=4,axis='x') #gives us 4 ticks marks on the x axies ax[count].tick_params(axis = 'both', which = 'major', labelsize = 14) ax[0].set_ylim([250,0])ax[0].set_ylabel('Depth (Feet)',fontsize=14)

Out[55]: Text(0,0.5,'Depth (Feet)')

Page 28: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 28/43

That appears nicer!

I still forgot one thing. If you want to publish it you need to labelyour graphs a,b,c, etc.We will add these the same way we added a legend. instead we will do a letter.

1. define your props-properties for your bbox-bounding box.2. define your letter. this is weird. You are going to take the ascii number that is A. ord('A') then add our counter

to it and turn it back to a letter. Adding a 0 is A, adding a 1 is a B.3. Then we will place our letters ising ax.text. Remember we have to transfrom from absolute axis to percent

axis. That is what transform does.

Page 29: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 29/43

In [59]: props=dict(boxstyle='round',facecolor='wheat',alpha=0.5)elems=['As','Fe','S','P','Na']fig, ax = plt.subplots(1, len(elems), sharey=True)fig.set_size_inches(12,6)for count,elem in enumerate(elems): ax[count].scatter(df_well_data[elem][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10] ,color='b',label='<10',facecolor='none') ax[count].scatter(df_well_data[elem][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,color='g',label='<10',facecolor='none') ax[count].scatter(df_well_data[elem][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50] ,color='r',label='<10',facecolor='none') ax[count].set_xlabel(elem,fontsize=14) #labels with the element ax[count].set_xlim(0,np.max(df_well_data[elem])) #sets the limit to the maximum ax[count].locator_params(nbins=4,axis='x') #gives us 4 ticks marks on the x axies ax[count].tick_params(axis = 'both', which = 'major', labelsize = 14) ax[count].text(0.05,0.98,chr(count + ord('A')),transform=ax[count].transAxes,fontsize=14,verticalalignment='top',bbox=props) ax[0].set_ylim([250,0])ax[0].set_ylabel('Depth (Feet)',fontsize=14)

Answers

Out[59]: Text(0,0.5,'Depth (Feet)')

Page 30: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 30/43

In [61]: i='Drink'df_well_data[i].dtype

In [72]: for i in df_well_data: print (i,' is of type ',df_well_data[i].dtype)

In [73]: for i in df_well_data: print ('{:7s} is of type {:7s}'.format(i,str(df_well_data[i].dtype)))

Out[61]: dtype('O')

Well_ID is of type int64 Lat is of type float64 Lon is of type float64 Depth is of type int64 Drink is of type object Si is of type float64 P is of type float64 S is of type float64 Ca is of type float64 Fe is of type float64 Ba is of type float64 Na is of type float64 Mg is of type float64 K is of type float64 Mn is of type float64 As is of type float64 Sr is of type float64 F is of type float64 Cl is of type float64 SO4 is of type float64 Br is of type float64

Well_ID is of type int64 Lat is of type float64 Lon is of type float64 Depth is of type int64 Drink is of type object Si is of type float64 P is of type float64 S is of type float64 Ca is of type float64 Fe is of type float64 Ba is of type float64 Na is of type float64 Mg is of type float64 K is of type float64 Mn is of type float64 As is of type float64 Sr is of type float64 F is of type float64 Cl is of type float64 SO4 is of type float64 Br is of type float64

Page 31: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 31/43

In [74]: i='As'

fig,ax=plt.subplots()

#plot <10ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')#plot 10-50ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],\ df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50')#plot<50ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')

ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel('Arsenic ppb')ax.set_ylabel('Depth ft')ax.set_xlim([0,800])ax.set_ylim([250,0])ax.legend(loc=4, fancybox=True, framealpha=0.5)

Out[74]: <matplotlib.legend.Legend at 0x2091564d908>

Page 32: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 32/43

In [75]: i='As'fig,ax=plt.subplots(1,1)ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50')ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel('Arsenic ppb')ax.set_ylabel('Depth ft')ax.set_xlim([0,800])ax.set_ylim([250,0])ax.legend(loc=4, fancybox=True, framealpha=0.5)

Out[75]: <matplotlib.legend.Legend at 0x209156b2be0>

Page 33: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 33/43

In [80]: i='Fe'fig,ax=plt.subplots(1,1)ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50')ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel('Arsenic ppb')ax.set_ylabel('Depth ft')ax.set_xlim([0,800])ax.set_ylim([250,0])plt.legend(loc=4, fancybox=True, framealpha=0.5)

Out[80]: <matplotlib.legend.Legend at 0x20916893cf8>

Page 34: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 34/43

In [83]: i='Fe'fig,ax=plt.subplots(1,1)ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50')ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel(i)ax.set_ylabel('Depth ft')ax.set_xlim([0,df_well_data[i].max()])ax.set_ylim([250,0])plt.legend(loc=4, fancybox=True, framealpha=0.5)

Out[83]: <matplotlib.legend.Legend at 0x209169d9c50>

Page 35: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 35/43

In [86]: i='Ca'fig,ax=plt.subplots(1,1)ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10')ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50')ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')ax.xaxis.set_label_position('top')ax.xaxis.tick_top()ax.set_xlabel(i)ax.set_ylabel('Depth ft')ax.set_xlim([0,df_well_data[i].max()])ax.set_ylim([250,0])plt.legend(loc=4, fancybox=True, framealpha=0.5)

Out[86]: <matplotlib.legend.Legend at 0x20916b11e80>

Page 36: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 36/43

In [89]: pp = PdfPages('filetest3.pdf')fig,ax=plt.subplots(1,1)for i in df_well_data:

ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50') ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50') ax.xaxis.set_label_position('top') ax.xaxis.tick_top() ax.set_xlabel(i) ax.set_ylabel('Depth ft') ax.set_xlim([0,df_well_data[i].max()]) ax.set_ylim([250,0]) plt.legend(loc=4, fancybox=True, framealpha=0.5) pp.savefig()

pp.close()

Page 37: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 37/43

---------------------------------------------------------------------------TypeError Traceback (most recent call last) ~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds) 127 else:--> 128 result = alt(values, axis=axis, skipna=skipna, **kwds) 129 except Exception:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in reduction(values, axis, skipna) 506 else:--> 507 result = getattr(values, meth)(axis) 508

~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\_methods.pyin _amax(a, axis, out, keepdims, initial) 27 initial=_NoValue): ---> 28 return umr_maximum(a, axis, None, out, keepdims, initial) 29

TypeError: '>=' not supported between instances of 'str' and 'float'

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last) <ipython-input-89-1ab337cf5ccd> in <module>() 11 ax.set_xlabel(i) 12 ax.set_ylabel('Depth ft')---> 13 ax.set_xlim([0,df_well_data[i].max()]) 14 ax.set_ylim([250,0]) 15 plt.legend(loc=4, fancybox=True, framealpha=0.5)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.pyin stat_func(self, axis, skipna, level, numeric_only, **kwargs) 9611 skipna=skipna) 9612 return self._reduce(f, name, axis=axis, skipna=skipna, -> 9613 numeric_only=numeric_only) 9614 9615 return set_function_name(stat_func, name, cls)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds) 3219 'numeric_only.'.format(name)) 3220 with np.errstate(all='ignore'):-> 3221 return op(delegate, skipna=skipna, **kwds) 3222 3223 return delegate._reduce(op=op, name=name, axis=axis, skipna=skipna,

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds) 129 except Exception: 130 try:--> 131 result = alt(values, axis=axis, skipna=skipna, **kwds)

Page 38: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 38/43

132 except ValueError as e: 133 # we want to transform an object array

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\nanops.py in reduction(values, axis, skipna) 505 result = np.nan 506 else:--> 507 result = getattr(values, meth)(axis) 508 509 result = _wrap_results(result, dtype)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\_methods.pyin _amax(a, axis, out, keepdims, initial) 26 def _amax(a, axis=None, out=None, keepdims=False, 27 initial=_NoValue): ---> 28 return umr_maximum(a, axis, None, out, keepdims, initial) 29 30 def _amin(a, axis=None, out=None, keepdims=False,

TypeError: '>=' not supported between instances of 'str' and 'float'

Page 39: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 39/43

In [93]: pp = PdfPages('filetest4.pdf')fig,ax=plt.subplots(1,1)for i in df_well_data: print( i) if df_well_data[i].dtype==float: ax.scatter(df_well_data[i][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax.scatter(df_well_data[i][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50') ax.scatter(df_well_data[i][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50') ax.xaxis.set_label_position('top') ax.xaxis.tick_top() ax.set_xlabel(i) ax.set_ylabel('Depth ft') ax.set_xlim([0,df_well_data[i].max()]) ax.set_ylim([250,0]) plt.legend(loc=4, fancybox=True, framealpha=0.5) pp.savefig() ax.cla()

pp.close()

Page 40: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 40/43

Well_ID Lat Lon Depth Drink Si P S Ca Fe Ba Na Mg K Mn As Sr F Cl SO4 Br

Page 41: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 41/43

In [95]: j=['As','Fe','S']f, ax = plt.subplots(1, len(j), sharey=True)

for i in range(len(j)):

ax[i].scatter(df_well_data[j[i]][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax[i].scatter(df_well_data[j[i]][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50') ax[i].scatter(df_well_data[j[i]][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50')

ax[0].set_ylim([250,0])

Out[95]: (250, 0)

Page 42: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 42/43

In [97]: j=['As','Fe','S']f, ax = plt.subplots(1, len(j), sharey=True)

for i in range(len(j)):

ax[i].scatter(df_well_data[j[i]][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax[i].scatter(df_well_data[j[i]][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50') ax[i].scatter(df_well_data[j[i]][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50') ax[i].set_xlabel(j[i]) ax[i].set_xlim([0,df_well_data[j[i]].max()])

ax[0].set_ylim([250,0])ax[0].set_ylabel('Depth ft')

Out[97]: Text(0,0.5,'Depth ft')

Page 43: A u t o m a t e d P a n d a s€¦ · 9/7/2020 pandas_well_data_part2 localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false …

9/7/2020 pandas_well_data_part2

localhost:8888/nbconvert/html/python/fall20/BigDataPython/pandas_well_data_part2.ipynb?download=false 43/43

In [98]: j=['As','Fe','S','P','Na']f, ax = plt.subplots(1, len(j), sharey=True)

for i in range(len(j)):

ax[i].scatter(df_well_data[j[i]][df_well_data.As<10],df_well_data.Depth[df_well_data.As<10],color='b',label='<10') ax[i].scatter(df_well_data[j[i]][np.logical_and(df_well_data.As>=10,df_well_data.As<=50)] ,df_well_data.Depth[np.logical_and(df_well_data.As>=10,df_well_data.As<=50)],color='g',label='10-50') ax[i].scatter(df_well_data[j[i]][df_well_data.As>50],df_well_data.Depth[df_well_data.As>50],color='r',label='>50') ax[i].set_xlabel(j[i]) ax[i].set_xlim([0,df_well_data[j[i]].max()])

ax[0].set_ylim([250,0])ax[0].set_ylabel('Depth ft')

In [ ]:

Out[98]: Text(0,0.5,'Depth ft')