Introduction to Data Visualization
1
Introduction to Data Visualization
CS 4390/5390 Fall 2014
Shirley Moore, Instructor
svmoore.pbworks.com
August 25, 2014
2
The Industrial Revolution of Big Data (Joe Hellerstein, 2008)
Image credit: http://www.uberb2b.com/b4b-presents-the-first-industry-4-0-mini-conference/
3
How much data is out there?
4
Image credit: http://www.opentracker.net/solutions/big-data-analytics
5
The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades…
Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.
Hal Varian, Google’s Chief Economist
The McKinsey Quarterly, Jan 2009
6
Data-Intensive Science
Image credit: Synergistic Challenges in Data-Intensive Science and Exascale Computing, DOE ASCAC Data Subcommittee Report, March 2013, available on science.energy.gov
7
Computing with Big Data
Slide courtesy of Kathy Yelick
8
Sources of Big Scientific Data
9
How much scientific data?
• The ATLAS and CMS experiments at the Large Hadron Collider generate data at rates of petabytes per second, running round the clock for a large fraction of the year.
• Climate simulations generate several petabytes of data per year.
• Computational biology and genomics produce extreme volumes of data for
– biophysical simulations of cellular environments
– cracking the code of the genome across species
– correlating observational ecology and models of population dynamics
– reverse engineering the brain
10
CERN Large Hadron Collider
Image credit: wiki.creativecommons.org/Case_Studies/CERN
11
Higgs Boson Particle
Simulated data modeled for the CMS particle detector on the Large Hadron Collider (LHC) at CERN.
Here, following a collision of two protons, a Higgs boson is produced that decays into two jets of hadrons and two electrons.
Image credit: http://www.tacc.utexas.edu/news/feature-stories/2011/testing-technicolor-physics
12
Data Collection Growth
13
Why Use Visualization?
1. Cognition is limited.
2. Memory is limited.
14
Selective Attention
• The Door Study: https://www.youtube.com/watch?v=FWSxSQsspiQ
• The Invisible Gorilla: www.invisiblegorilla.com
• Mueller and Krummenacher, “Visual search and selective attention”, Visual Cognition 14: 389-410, 2006.
15
Working Memory Capacity
• Test your working memory capacity
– http://www.gocognitive.net/demo/working-memory-capacity
• The Magical Number Seven Plus or Minus Two
– George Miller, “The magical number seven, plus or minus two: Some limits on our capacity for processing information”, The Psychological Review 63: 81-97, 1956.
16
Visualization Works By
• Using perception to point out interesting things
• Using pictures to enhance working memory
17
How Many Letter V’s
MTHIVLWYADCEQGHKILKMTWYNARDCAIREQGHLVKMFPSTWYARNGFPSVCEILQGKMFPSNDRCEQDIFPSGHLMFHKMVPSTWYACEQTWRN
18
How Many Letter V’s
MTHIVLWYADCEQGHKILKMTWYNARDCAIREQGHLVKMFPSTWYARNGFPSVCEILQGKMFPSNDRCEQDIFPSGHLMFHKMVPSTWYACEQTWRN
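One way to see how perception can do the counting work is to make the target letter stand out from its surroundings. The following is a minimal sketch in Python, not part of the original slides: the function name `highlight` and the choice of a dot as the masking character are assumptions made for illustration.

```python
# Letter string transcribed from the slide above.
LETTERS = (
    "MTHIVLWYADCEQGHKILKMTWYNARDCAIREQGHLVKMFPSTWYARNGFPSVCEILQGKMFPSNDRCEQ"
    "DIFPSGHLMFHKMVPSTWYACEQTWRN"
)


def highlight(text: str, target: str = "V") -> str:
    """Keep the target letter and replace every other character with a dot."""
    return "".join(ch if ch == target else "." for ch in text)


# Masking the distractors lets the eye pick out the targets without
# reading the string letter by letter.
masked = highlight(LETTERS)
v_count = LETTERS.count("V")

print(masked)
print(f"V appears {v_count} times")
```

On a slide the same effect is usually achieved with color or weight rather than masking; the dot replacement is only a plain-text stand-in for that idea.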
19
Which Number Appears Most Often?
15 19 60 33 11 75 57 34 79 58 51 92
73 22 13 71 60 22 72 10 68 73 18 55
65 46 29
20
Which Number Appears Most Often? (cont.)
60 73 22 46 92 97 10 58 46
57 17 83 26 99 33 88 92 60 91 29 57
96 12 47
21
Class Exercise 1
• Can you devise a visualization that makes this task easier?
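As one possible starting point for this exercise (a sketch, not the instructor's solution), the numbers can be tallied and drawn as a sorted text bar chart. The list below is transcribed from the two preceding slides on the assumption that every entry is a two-digit value; the helper name `bar_chart` and the `min_count` cutoff are illustrative choices.

```python
from collections import Counter

# Numbers from the two "Which Number Appears Most Often?" slides,
# read as two-digit values (an assumption about the slide layout).
NUMBERS = [
    15, 19, 60, 33, 11, 75, 57, 34, 79, 58, 51, 92,
    73, 22, 13, 71, 60, 22, 72, 10, 68, 73, 18, 55,
    65, 46, 29, 60, 73, 22, 46, 92, 97, 10, 58, 46,
    57, 17, 83, 26, 99, 33, 88, 92, 60, 91, 29, 57,
    96, 12, 47,
]

counts = Counter(NUMBERS)

# Order by descending frequency, breaking ties by the number itself.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def bar_chart(pairs, min_count=2):
    """Return one text bar per number that occurs at least min_count times."""
    lines = []
    for value, count in pairs:
        if count >= min_count:
            lines.append(f"{value:>2} {'#' * count} ({count})")
    return "\n".join(lines)


print(bar_chart(ranked))
```

Sorting by frequency turns a memory task into a length comparison: the longest bar sits on the first line, so the answer no longer depends on holding a running tally in working memory.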
22
Definition of Visualization
• http://www.merriam-webster.com/dictionary/visualization
1. formation of mental visual images
2. the act or process of interpreting in visual terms or of putting into visible form
“Computer-based visualization systems provide visual representations of datasets intended to help people carry out some task more effectively.” --Tamara Munzner
23
Class Exercise 2
• How does triglyceride level vary by age and income level?
• Can you devise a visualization that enables seeing the answer at a glance?

TRIGLYCERIDE LEVEL

| Income | Males, under 65 | Males, 65 or over | Females, under 65 | Females, 65 or over |
| --- | --- | --- | --- | --- |
| $0-$24,999 | 250 | 200 | 275 | 450 |
| $25,000+ | 320 | 150 | 400 | 200 |
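One possible approach to this exercise, offered as a sketch rather than the intended answer, is to pair the two age groups within each sex and income bracket so that the direction of change is visible directly. The values are taken from the table on this slide; the names `LEVELS`, `age_change`, and `render`, and the scale of one `#` per 10 units, are assumptions made for illustration.

```python
# Triglyceride levels from the slide: (under 65, 65 or over)
# for each (sex, income bracket) group.
LEVELS = {
    ("Males", "$0-$24,999"): (250, 200),
    ("Males", "$25,000+"): (320, 150),
    ("Females", "$0-$24,999"): (275, 450),
    ("Females", "$25,000+"): (400, 200),
}


def age_change(levels):
    """Return the change from the under-65 value to the 65-or-over value."""
    return {group: over - under for group, (under, over) in levels.items()}


def render(levels, scale=10):
    """Draw paired text bars, one '#' per `scale` units, for each group."""
    lines = []
    for (sex, income), (under, over) in levels.items():
        label = f"{sex}, {income}"
        lines.append(f"{label:<22} under 65   {'#' * (under // scale)} {under}")
        lines.append(f"{'':<22} 65 or over {'#' * (over // scale)} {over}")
    return "\n".join(lines)


changes = age_change(LEVELS)

print(render(LEVELS))
for (sex, income), delta in changes.items():
    print(f"{sex}, {income}: {delta:+d}")
```

Placing the two age bars for a group next to each other makes the comparison local: in three groups the level falls with age, while for females in the lower income bracket it rises, a pattern that is hard to spot when scanning the table row by row.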
24
Purposes of Visualization
• Answer questions
• Generate hypotheses
• Make decisions
• See data in context
• Expand memory
• Support computational analysis
• Find patterns
• Tell a story
• Inspire
25
Visualization Critique
• What is the purpose of the visualization?
• What techniques are used?
• Are the techniques used effectively?
• Does the visualization focus our attention on important aspects of the data?
• Does the visualization accomplish its purpose?
• How could the visualization be improved?
26
Exemplar
• Hans Rosling TED Talk
– https://www.youtube.com/watch?v=usdJgEwMinM
• Homework assignment #1
– Write a critique of a visualization from this video
– Bring to class to share on Wednesday, Sept 3
27
Course Logistics
• Course website: http://svmoore.pbworks.com/ (click on Data Visualization)
• Instructor: Shirley Moore
– Office: CCSB 3.0422
– Office hours: MW 3:00-4:00pm, others by appointment
• Teaching assistant: Henry Moncada
– Office: CCSB 3.1202H
– Office hours:
28
Grading
• Grades will be posted on Blackboard
• Approximate breakdown
– 35% homework and lab assignments
– 15% class preparation and participation
– 25% course exam
– 25% project
29
Textbooks
• Visualization Design and Analysis: Abstractions, Principles, and Methods by Tamara Munzner, AK Peters 2014 (to appear). Draft available at http://www.cs.ubc.ca/~tmm/courses/533/book/vispmp-draft.pdf
• Visual Thinking for Design by Colin Ware, Morgan Kaufmann, 2008.
• Visualizing Data: Exploring and Explaining Data with the Processing Environment, by Ben Fry, O’Reilly, 2007.
• The ParaView Tutorial, Version 4.1, by Kenneth Moreland, Sandia National Lab, 2013. http://www.paraview.org/Wiki/The_ParaView_Tutorial
30
Software
• Processing
– processing.org
– Programming language and development environment for information visualization applications
– Download and install on your laptop
• ParaView
– www.paraview.org
– Open-source data analysis and visualization application
– Developed to analyze extremely large datasets using distributed computing resources
• Others to be determined
31
Course Project
• Implementation of a visualization for a significant dataset
– You may choose your own dataset or use one provided by the instructor
• Report describing background and design decisions
• Presentation during last week of class or final exam period (or special poster/demo session?)
• Work individually or in teams of up to three
32
Connections to Other Courses
• CPS 5310 Simulation and Modeling
• CS 5334 Parallel & Concurrent Programming
• CS 4342 Database Management
• CS 4390/5339 Web-based Systems
• CS 4317/5317 Human-Computer Interaction
• CS 3370 Computer Graphics
• Graphic Design
• Psychology
33
Why Take This Course
• Build more complex and insightful visualizations than your current skills and tools allow
• Learn how to communicate information about complex data effectively to others
• Ask questions about, explore, and understand data for your job or research
• The set of people who need to be able to visualize data is growing beyond experts in visualization.
34
Opportunity: Vizzies Visualization Challenge
• http://www.nsf.gov/news/special_reports/scivis/
• Sponsored by National Science Foundation and Popular Science
• Deadline September 30, 2014
• Cash prizes
• Categories
– Photography
– Illustration
– Posters and graphics
– Games and apps
– Videos
• Extra credit and/or use as basis for project for this class
35
Preparation for Next Class
• Read Munzner Chapter 1
• Readings on Visualization Techniques
• Download Processing software and install on your computer (see me if you don’t have a computer you can use for this)
– Will start using Processing second week of class
– Will also get Processing installed on lab computers
• Start on Homework Assignment 1