Interactive Data Visualizationfenyolab.org/presentations/Methods_2019/slides... · Python: Bokeh,...
Interactive Data Visualization
11/19/19 Mark Grivainis
Overview
What is Interactive Data Visualization
Common Interactive Visualization Techniques
What Tools Exist for Interactive Visualization
Working with Bokeh
What is Interactive Data Visualization
Interactive data visualization lets you query a plot in real time
The underlying visualizations tend to be standard figures: bar plots, scatter plots, heatmaps, etc.
Adding interactions allows the data to be explored more thoroughly
Start with a solid static figure before adding interactions
Different Types of Interaction
Identification (Hovering)
Scaling (Zooming)
Selection (Brushing)
Linking
Available Tools for Interactive Visualization
Python: Bokeh, Plotly, Matplotlib
R: Shiny
Javascript: D3
Most of these tools rely on HTML and Javascript for rendering of plots
If you want to create a non-standard plot:
Learn Javascript
Use D3
What is Bokeh
Interactive visualization library for Python
Works with large datasets
Simplifies the process of creating:
Interactive plots
Dashboards
Data applications
Installing Bokeh
Bokeh is not part of the Python Standard Library
It can be installed using pip or conda (conda is preferred)
conda install bokeh
You can either install into your base environment or create a new environment
conda create -n vis python=3.6 bokeh jupyter pandas numpy
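Once the environment is active, a quick check that the install worked might look like the following. This is a minimal sketch and not part of the original slides; it only assumes that Bokeh was installed into the currently active environment.

```python
# Confirm that Bokeh is importable in the active environment
# and report which version was installed.
import bokeh

installed_version = bokeh.__version__
print("Bokeh version:", installed_version)
```

The slides reference the Bokeh 1.3/1.4 documentation, so a newer installed version may differ in some argument names.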
Using Bokeh Output
Bokeh has three output modes:
Server Mode
Static HTML: output_file()
Notebook: output_notebook()
https://docs.bokeh.org/en/1.4.0/docs/reference/server.html?highlight=server#module-bokeh.server
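To make the static HTML mode concrete, here is a small sketch that renders a figure to a standalone HTML string with `bokeh.embed.file_html`. This is an illustrative alternative to `output_file()` (it is not taken from the slides); the figure contents and title are made up for the example.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# A throwaway figure with illustrative data
p = figure(title="Static HTML example")
p.scatter(x=[1, 2, 3], y=[4, 6, 5])

# file_html returns a complete standalone HTML page as a string;
# the BokehJS library is loaded from the CDN when the page is opened.
html = file_html(p, resources=CDN, title="Static HTML example")
print(len(html), "characters of HTML")
```

Writing that string to a file gives the same kind of self-contained page that the static HTML mode produces, with no server or notebook required.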
Defining a figure
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()
p = figure()
show(p)
Bokeh Input Data
Providing Data Directly
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

output_notebook()

x_values = [1, 2, 3, 4, 5]
y_values = [6, 7, 2, 3, 6]

p = figure()

p.scatter(x=x_values, y=y_values)
show(p)
Using ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource

output_notebook()

data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}

source = ColumnDataSource(data=data)

p = figure()
p.scatter(x='x_values', y='y_values', source=source)
show(p)
Using ColumnDataSource and Pandas
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource
import pandas as pd

output_notebook()

data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}

df = pd.DataFrame.from_dict(data)

source = ColumnDataSource(df)

p = figure()
p.scatter(x='x_values', y='y_values', source=source)
show(p)
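It can help to look at what a ColumnDataSource built from a DataFrame actually contains. The sketch below uses a shortened, made-up table and simply lists the resulting column names; to my understanding an unnamed DataFrame index is stored as an extra column called `index`.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Shortened illustrative table (not the slide's full data)
df = pd.DataFrame({'x_values': [1, 2, 3], 'y_values': [6, 7, 2]})
source = ColumnDataSource(df)

# Each DataFrame column becomes a named column in the source;
# the unnamed index is kept as well, under the name 'index'.
print(sorted(source.column_names))
print(list(source.data['y_values']))
```

The column names are what glyph methods refer to, which is why `x='x_values'` is passed as a string once `source=source` is given.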
Built in Plot Types
line multi_line vbar scatter
hbar image hex_tile
A full list is available in the Bokeh documentation
There are no prebuilt statistical plots
E.g. boxplots, heatmaps
Many of these plots are not complicated to generate
Build your own package defining them so they can be reused across projects
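As one illustration of how little code such a plot can need, here is a sketch of a histogram helper built from the `quad` glyph and `numpy.histogram`. The function name `histogram_figure` and its parameters are hypothetical, chosen for this example; it is the kind of helper that could live in a small personal package.

```python
import numpy as np
from bokeh.plotting import figure


def histogram_figure(values, bins=10, title="Histogram"):
    """Hypothetical helper: draw a histogram using Bokeh's quad glyph."""
    counts, edges = np.histogram(values, bins=bins)
    p = figure(title=title)
    # One rectangle per bin: left/right come from the bin edges,
    # top from the count in that bin, bottom is fixed at zero.
    p.quad(top=counts, bottom=0, left=edges[:-1], right=edges[1:])
    return p


# Illustrative data only
p = histogram_figure([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5)
```

The returned figure can be passed to `show()` or placed in a layout like any other Bokeh figure.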
Adding Hover Functionality
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource

output_notebook()

source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y=[2, 5, 8, 2, 7],
    desc=['A', 'b', 'C', 'd', 'E'],
))

TOOLS = 'hover,pan'

# No tooltips are defined yet, so the hover tool shows Bokeh's default tooltips
p = figure(tools=TOOLS)
p.scatter('x', 'y', source=source)
show(p)
Adding Hover Functionality
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource

output_notebook()

source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y=[2, 5, 8, 2, 7],
    desc=['A', 'b', 'C', 'd', 'E'],
))

TOOLS = 'hover,pan'
# '$' fields are special variables, '@' fields are columns of the source
TOOLTIPS = [
    ("index", "$index"),
    ("(x,y)", "($x, $y)"),
    ("desc", "@desc"),
]

p = figure(tools=TOOLS, tooltips=TOOLTIPS)
p.scatter('x', 'y', source=source)
show(p)
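The same tooltips can also be attached through an explicit `HoverTool` object, which is convenient when a figure needs more than one hover configuration. The sketch below is a variation on the slide's example rather than the author's code; the `{0.0}` number format on `@y` is added purely for illustration.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y=[2, 5, 8, 2, 7],
    desc=['A', 'b', 'C', 'd', 'E'],
))

# "@name" looks up a column of the data source; "$name" is a special
# variable such as the point index or the cursor position.
hover = HoverTool(tooltips=[
    ("desc", "@desc"),
    ("y", "@y{0.0}"),   # numeric format applied to the column value
    ("index", "$index"),
])

p = figure(tools="pan")
p.add_tools(hover)
p.scatter('x', 'y', source=source)
```

Calling `show(p)` on this figure should display the `desc` label and the formatted `y` value when the cursor is over a point.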
The autompg Dataframe
 mpg  cyl  displ   hp  weight  accel  yr  origin  name
18.0    8  307.0  130    3504   12.0  70       1  chevrolet chevelle malibu
15.0    8  350.0  165    3693   11.5  70       1  buick skylark 320
18.0    8  318.0  150    3436   11.0  70       1  plymouth satellite
16.0    8  304.0  150    3433   12.0  70       1  amc rebel sst
17.0    8  302.0  140    3449   10.5  70       1  ford torino
 ...  ...    ...  ...     ...    ...  ..     ...  ...
27.0    4  140.0   86    2790   15.6  82       1  ford mustang gl
44.0    4   97.0   52    2130   24.6  82       2  vw pickup
32.0    4  135.0   84    2295   11.6  82       1  dodge rampage
28.0    4  120.0   79    2625   18.6  82       1  ford ranger
31.0    4  119.0   82    2720   19.4  82       1  chevy s-10
Summarizing a Dataframe
from bokeh.sampledata.autompg import autompg as df

# Summary statistics per cylinder count
mpg = df.groupby('cyl').describe()['mpg']
acc = df.groupby('cyl').describe()['accel']

print(mpg.to_string(max_rows=10))
print(acc.to_string(max_rows=10))
mpg
     count       mean       std   min    25%    50%    75%   max
cyl
3      4.0  20.550000  2.564501  18.0  18.75  20.25  22.05  23.7
4    199.0  29.283920  5.670546  18.0  25.00  28.40  32.95  46.6
5      3.0  27.366667  8.228204  20.3  22.85  25.40  30.90  36.4
6     83.0  19.973494  3.828809  15.0  18.00  19.00  21.00  38.0
8    103.0  14.963107  2.836284   9.0  13.00  14.00  16.00  26.6

accel
     count       mean       std   min    25%   50%   75%   max
cyl
3      4.0  13.250000  0.500000  12.5  13.25  13.5  13.5  13.5
4    199.0  16.581910  2.383185  11.6  14.80  16.2  18.0  24.8
5      3.0  18.633333  2.369247  15.9  17.90  19.9  20.0  20.1
6     83.0  16.254217  2.031778  11.3  15.05  16.0  17.6  21.0
8    103.0  12.955340  2.224759   8.0  11.50  13.0  14.0  22.2
ColumnDataSource on a Group
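from bokeh.sampledata.autompg import autompg as df
from bokeh.models import ColumnDataSource

df['yr'] = df['yr'].astype(str)
group = df.groupby('yr')
# Passing a groupby stores the describe() statistics of each group
source = ColumnDataSource(group)
print(source.to_df().to_string(max_cols=10, index=False, max_rows=6))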
yr  mpg_count   mpg_mean   mpg_std  mpg_min  accel_min  ...  accel_25%  accel_50%  accel_75%  accel_max
70       29.0  17.689655  5.339231      9.0        8.0  ...     10.000       12.5     15.000       20.5
71       27.0  21.111111  6.675635     12.0       11.5  ...     13.250       14.5     15.500       20.5
72       28.0  18.714286  5.435529     11.0       11.0  ...     13.375       14.5     16.625       23.5
..        ...        ...       ...      ...        ...  ...        ...        ...        ...        ...
80       27.0  33.803704  6.885854     19.1       11.4  ...     15.150       16.5     18.750       23.7
81       28.0  30.185714  5.635319     17.6       12.6  ...     14.700       16.3     17.425       20.7
82       30.0  32.000000  5.232524     22.0       11.6  ...     14.775       16.3     17.900       24.6
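The column naming is easier to see on a tiny table. The sketch below uses a made-up four-row DataFrame standing in for autompg (the values are illustrative only) and shows that the grouped source has one row per group and columns named `<column>_<statistic>`.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Small made-up table standing in for autompg (illustrative values only)
df = pd.DataFrame({
    'yr': ['70', '70', '71', '71'],
    'mpg': [18.0, 15.0, 27.0, 25.0],
})

source = ColumnDataSource(df.groupby('yr'))

# One row per group; one column per summary statistic,
# named "<column>_<statistic>" (for example 'mpg_mean').
print(sorted(source.column_names))
print(list(source.data['mpg_mean']))
```

These generated names are what later glyph calls refer to, for example `top='mpg_mean'` in a bar chart.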
Categorical Data
from bokeh.sampledata.autompg import autompg as df
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook

output_notebook()

df['yr'] = df['yr'].astype(str)
group = df.groupby('yr')
source = ColumnDataSource(group)
p = figure(x_range=group)
p.vbar(x='yr',
       top='mpg_mean',
       width=0.8,
       source=source)

show(p)
Coloring Plots
from bokeh.sampledata.autompg import autompg as df
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook
from bokeh.palettes import d3
from bokeh.transform import factor_cmap

output_notebook()

df['yr'] = df['yr'].astype(str)
group = df.groupby('yr')
source = ColumnDataSource(group)

# One color per year: the data covers 13 model years (70 to 82)
fm = factor_cmap('yr',
                 palette=d3['Category20'][13],
                 factors=sorted(df['yr'].unique()))

p = figure(x_range=group)
p.vbar(x='yr', top='mpg_mean', width=0.8, color=fm, source=source)
show(p)
Grids
from bokeh.sampledata.autompg import autompg as df
from bokeh.plotting import figure, show
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook
from itertools import product

def build_figure(title, x_lab, y_lab, source):
    # plot_width/plot_height are the Bokeh 1.x names (width/height in Bokeh 3)
    p = figure(title=title, plot_width=300, plot_height=300)
    p.scatter(x=x_lab, y=y_lab, source=source)
    p.xaxis.axis_label = x_lab
    p.yaxis.axis_label = y_lab
    return p

output_notebook()

COMPARE = ['mpg', 'hp', 'weight']
source = ColumnDataSource(df[COMPARE])
GRID_W = len(COMPARE)

# One scatter plot for every (y, x) pair of the compared columns
plots = [build_figure('', x, y, source)
         for y, x in product(COMPARE, repeat=2)]
grid = gridplot(plots, ncols=GRID_W)

show(grid)
Grids
from bokeh.sampledata.autompg import autompg as df
from bokeh.plotting import figure, show
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook
from itertools import product

TOOLS = "box_select,lasso_select,help"

def build_figure(title, x_lab, y_lab, source):
    # plot_width/plot_height are the Bokeh 1.x names (width/height in Bokeh 3)
    p = figure(title=title, plot_width=300, plot_height=300, tools=TOOLS)
    p.scatter(x=x_lab, y=y_lab, source=source)
    p.xaxis.axis_label = x_lab
    p.yaxis.axis_label = y_lab
    return p

output_notebook()

COMPARE = ['mpg', 'hp', 'weight']
# All plots share one source, so a selection in one plot shows in the others
source = ColumnDataSource(df[COMPARE])
GRID_W = len(COMPARE)

plots = [build_figure('', x, y, source)
         for y, x in product(COMPARE, repeat=2)]
grid = gridplot(plots, ncols=GRID_W)

show(grid)
Linked Plots
The code for this example was too long to put on a slide
https://demo.bokeh.org/selection_histogram
Source Code
https://demo.bokeh.org/selection_histogram
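The demo above relies on a Bokeh server application. As a much smaller illustration of the linking idea (this is not the demo's source code), two figures can be linked by sharing a range object and a data source. The data values and figure titles below are made up.

```python
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Illustrative data: two y columns measured at the same x positions
source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y0=[2, 5, 8, 2, 7],
    y1=[9, 4, 6, 1, 3],
))

TOOLS = "pan,wheel_zoom,box_select,reset"

left = figure(tools=TOOLS, title="y0 vs x")
left.scatter('x', 'y0', source=source)

# Reusing left.x_range links panning and zooming along x;
# reusing the same source links selections between the two plots.
right = figure(tools=TOOLS, title="y1 vs x", x_range=left.x_range)
right.scatter('x', 'y1', source=source)

layout = gridplot([[left, right]])
```

Passing `layout` to `show()` should give two plots where a box selection or a pan in one is reflected in the other.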
Getting your Figure Online
It is easy to host static HTML content on GitHub
Use output_file('index.html') to save your Bokeh plot as an HTML file
- 'index.html' is always loaded by default, so it must be the entry point
- In this case that is easy, as it is the only HTML file
Upload this file to the master branch of a GitHub repository
Navigate: Settings -> GitHub Pages -> set Source to 'master branch'
Note: This does not work well with large datasets, as the data must be downloaded before it can be plotted
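The file-writing step can be sketched as follows. The figure contents and title are placeholders; `save()` is used here instead of `show()` so that the file is written without opening a browser, which is an assumption about the workflow rather than something stated on the slide.

```python
from pathlib import Path

from bokeh.plotting import figure, output_file, save

# GitHub Pages serves index.html as the entry point
out_path = Path("index.html")

p = figure(title="Hosted figure")
p.scatter(x=[1, 2, 3, 4, 5], y=[6, 7, 2, 3, 6])

# output_file sets the destination; save writes the standalone HTML page
output_file(str(out_path), title="Hosted figure")
save(p)
```

The resulting `index.html` is the single file to commit and push to the repository.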
Example: Bokeh Conda Environment
Open a Terminal window (Mac) or Anaconda Prompt (Windows)
conda create -n ivis python=3.6 bokeh jupyter numpy pandas
conda activate ivis
jupyter notebook
Example: Building a Boxplot
https://en.wikipedia.org/wiki/Box_plot#/media/File:Boxplot_vs_PDF.svg
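One possible way to build a boxplot from basic glyphs is sketched below: quartiles come from pandas, the boxes are drawn with `vbar` and the whisker stems with `segment`. This is a simplified illustration on made-up data, not the exercise solution; the whiskers are placed at 1.5 times the interquartile range, clipped to the observed minimum and maximum, and outliers are not drawn.

```python
import numpy as np
import pandas as pd
from bokeh.plotting import figure

# Made-up measurements for two groups (illustrative values only)
df = pd.DataFrame({
    'group': ['a'] * 5 + ['b'] * 5,
    'value': [1.0, 2.0, 3.0, 4.0, 5.0, 2.0, 4.0, 6.0, 8.0, 10.0],
})

grouped = df.groupby('group')['value']
q1 = grouped.quantile(0.25)
q2 = grouped.quantile(0.50)
q3 = grouped.quantile(0.75)
iqr = q3 - q1

# Whisker ends: 1.5 * IQR beyond the box, clipped to the observed range
upper = np.minimum(q3 + 1.5 * iqr, grouped.max())
lower = np.maximum(q1 - 1.5 * iqr, grouped.min())

cats = list(q2.index)
p = figure(x_range=cats, title="Boxplot sketch")

# Whisker stems
p.segment(cats, upper, cats, q3, line_color="black")
p.segment(cats, lower, cats, q1, line_color="black")

# Boxes: median to Q3, and Q1 to median
p.vbar(x=cats, width=0.6, bottom=q2, top=q3,
       fill_color="lightsteelblue", line_color="black")
p.vbar(x=cats, width=0.6, bottom=q1, top=q2,
       fill_color="lightsteelblue", line_color="black")
```

Drawing each box as two stacked bars makes the median visible as the shared edge between them.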
References
https://www.knowablemagazine.org/article/mind/2019/science-data-visualization
http://docs.bokeh.org/en/1.3.2/index.html
http://docs.bokeh.org/en/1.3.2/docs/user_guide/data.html