Endless Forms Most Beautiful: Creating Customized Data … · 2019. 9. 11. · Office of Strategic...
Endless Forms Most Beautiful: Creating Customized Data Visualizations with ggplot2
Lisa Federer, PhD, MLIS
Data Science and Open Science Librarian
Office of Strategic Initiatives
National Library of Medicine
National Institutes of Health
Workshop overview
The Grammar of Graphics: components of visualizations
Practical considerations and design choices
Creating plots in RStudio with ggplot2
Your questions
The Grammar of Graphics
“A language consisting of words and no grammar expresses only as many ideas as there are words. By specifying how words are combined in statements, a grammar expands a language’s scope…The grammar of graphics takes us beyond a limited set of charts (words) to an almost unlimited world of graphical forms (statements).”
Grammar of Graphics “parts of speech”
• Data: what is being visualized.
• Mappings: correspondences between variables in the data and components of the chart.
• Geometric Objects: the marks used to display the data, such as points, lines, or shapes.
• Aesthetic Properties: qualities of geometric objects, such as color and size, that convey details about the data.
• Scales: control how variables are mapped to aesthetics.
• Coordinates: describe how data positions are mapped onto the plane of the plot.
• Statistical Transformations: applied to the data to summarize it.
• Facets: describe how the data is partitioned into subsets and how these different subsets are plotted.
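To make the parts of speech concrete, here is a minimal sketch that uses each one once. It assumes ggplot2 is installed and uses its bundled mpg data set; the data set, the "Dark2" palette, and the axis limits are illustrative choices, not taken from the workshop.

```r
library(ggplot2)

p <- ggplot(mpg,                                   # data
            aes(x = displ, y = hwy,                # mappings: variables to x and y
                colour = class)) +                 # aesthetic property: colour shows vehicle class
  geom_point() +                                   # geometric object: points
  stat_smooth(aes(group = 1), method = "lm",       # statistical transformation: fitted line
              se = FALSE, colour = "grey40") +
  scale_colour_brewer(palette = "Dark2") +         # scale: how class values become colours
  coord_cartesian(ylim = c(10, 45)) +              # coordinates: visible region of the plane
  facet_wrap(~drv)                                 # facets: one panel per drive type

print(p)
```

Removing or swapping any single line changes one component of the chart and leaves the others alone, which is the practical payoff of treating a plot as a grammar.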
[Figure: example chart annotated with its data, mappings, geometric objects, aesthetics, scales, coordinates, and facets]
From code to chart
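library(ggplot2)   # ggplot(), geoms, and the diamonds data set
library(magrittr)  # the %>% pipe
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +
  geom_point(stat = "unique") +          # plot each distinct row once
  coord_cartesian(xlim = c(0, 20000)) +  # zoom the x axis without dropping data
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +           # one panel per cut, in a single row
  scale_colour_brewer(palette = "YlOrRd")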
From code to chart: data
diamonds %>%  # data: the diamonds data frame
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +
  geom_point(stat = "unique") +
  coord_cartesian(xlim = c(0, 20000)) +
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +
  scale_colour_brewer(palette = "YlOrRd")
From code to chart: mappings
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color)) +  # mappings: price to x, carat to y, color to colour
  geom_point(stat = "unique") +
  coord_cartesian(xlim = c(0, 20000)) +
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +
  scale_colour_brewer(palette = "YlOrRd")
From code to chart: geometric objects
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +
  geom_point(stat = "unique") +  # geometric object: points
  coord_cartesian(xlim = c(0, 20000)) +
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +
  scale_colour_brewer(palette = "YlOrRd")
From code to chart: aesthetic properties
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +  # aesthetic properties: colour and size
  geom_point(stat = "unique") +
  coord_cartesian(xlim = c(0, 20000)) +
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +
  scale_colour_brewer(palette = "YlOrRd")
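An aesthetic property can either be mapped to a variable inside aes() or set to a constant outside it. The following is a minimal sketch of that difference, assuming ggplot2 is installed; the constants "steelblue" and 0.3 are arbitrary values chosen for illustration.

```r
library(ggplot2)

# Mapped: colour varies with the `color` variable, so ggplot2 draws a legend
mapped_aes <- aes(x = price, y = carat, colour = color)
mapped <- ggplot(diamonds, mapped_aes) +
  geom_point(stat = "unique")

# Set: every point gets the same constant colour and transparency, so no legend
fixed_aes <- aes(x = price, y = carat)
fixed <- ggplot(diamonds, fixed_aes) +
  geom_point(stat = "unique", colour = "steelblue", alpha = 0.3)

# Only the first set of mappings includes a colour entry
has_colour <- c(mapped = "colour" %in% names(mapped_aes),
                fixed  = "colour" %in% names(fixed_aes))
```

Mapping is the right choice when the property should carry information about the data; setting is for purely stylistic decisions.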
From code to chart: scales
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +
  geom_point(stat = "unique") +
  coord_cartesian(xlim = c(0, 20000)) +
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +
  scale_colour_brewer(palette = "YlOrRd")  # scale: how color values become colours
From code to chart: coordinates
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +
  geom_point(stat = "unique") +
  coord_cartesian(xlim = c(0, 20000)) +  # coordinates: Cartesian plane, x axis shown from 0 to 20000
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +
  scale_colour_brewer(palette = "YlOrRd")
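One design choice in the coordinates step is worth spelling out: coord_cartesian(xlim = ...) zooms the view without discarding data, whereas limits set on the scale (for example with xlim()) treat out-of-range rows as missing before any statistics are computed. The sketch below compares the two, assuming ggplot2 is installed; the limit of 10000 and the fitted line are illustrative additions, not part of the workshop code.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(x = price, y = carat)) +
  geom_smooth(method = "lm")

# Zoom: all rows still feed the fitted line; only the visible window changes
zoomed <- base + coord_cartesian(xlim = c(0, 10000))

# Scale limits: rows with price outside 0-10000 are treated as missing, so the
# line is fitted to the remaining rows only (ggplot2 warns when it drops them)
clipped <- base + xlim(0, 10000)

# Number of rows that fall outside the illustrative limit
n_outside <- sum(diamonds$price > 10000)
```

Printing both plots shows two different fitted lines, because `clipped` fits the model without the `n_outside` most expensive diamonds.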
From code to chart: facets
diamonds %>%
  ggplot(aes(x = price, y = carat, col = color, size = clarity)) +
  geom_point(stat = "unique") +
  coord_cartesian(xlim = c(0, 20000)) +
  xlab("Price, US $") +
  ylab("Carat") +
  ggtitle("Prices and Characteristics of Round Cut Diamonds") +
  facet_wrap(~cut, nrow = 1) +  # facets: one panel per level of cut, in one row
  scale_colour_brewer(palette = "YlOrRd")
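For comparison, here is a minimal sketch of faceting by one variable versus two, assuming ggplot2 is installed. The facet_grid() variant is an illustrative addition and does not appear in the workshop code.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(x = price, y = carat)) +
  geom_point(stat = "unique", alpha = 0.3)

# One variable: one panel per level of cut, laid out in a single row
by_cut <- base + facet_wrap(~cut, nrow = 1)

# Two variables: rows for each color grade, columns for each cut
by_color_and_cut <- base + facet_grid(color ~ cut)

# Panel counts follow directly from the factor levels
n_wrap_panels <- nlevels(diamonds$cut)
n_grid_panels <- nlevels(diamonds$color) * nlevels(diamonds$cut)
```

With five cuts and seven color grades, the grid version produces 35 small panels, so it is best reserved for output large enough to keep each panel legible.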
Practical considerations and design choices
Working effectively with color and chart choices
Pre‐attentive processing
Differences in hue
Differences in shape
Perceptual tasks
From Alberto Cairo, The Functional Art
Adaptation of Cleveland and McGill’s scale from “Graphical Perception: Theory, Experimentation and Application to the Development of Graphical Methods,” available at https://web.cs.dal.ca/~sbrooks/csci4166-6406/seminars/readings/Cleveland_GraphicalPerception_Science85.pdf
Design for ease of perceptual processing
Colorspaces (ggplot default = RGB)
Greyscale (“photocopy safe”)
Greyscale – nope!
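One rough way to check whether a palette is likely to survive greyscale printing is to compare the approximate brightness of its colors. The sketch below does this for the hue palette that ggplot2 uses by default for unordered discrete variables and for the ColorBrewer "YlOrRd" palette. It assumes the scales and RColorBrewer packages are installed; the `luma()` helper is hypothetical, and the Rec. 601 weights are a common approximation rather than a full perceptual model.

```r
library(scales)        # hue_pal(): evenly spaced hues
library(RColorBrewer)  # brewer.pal(): ColorBrewer palettes

# Hypothetical helper: approximate brightness of each colour on a 0-255 scale,
# using the Rec. 601 luma weights for red, green, and blue
luma <- function(cols) {
  rgb_values <- col2rgb(cols)  # 3 x n matrix with rows red, green, blue
  as.vector(c(0.299, 0.587, 0.114) %*% rgb_values)
}

default_cols <- hue_pal()(5)
brewer_cols  <- brewer.pal(5, "YlOrRd")

luma_default <- luma(default_cols)
luma_brewer  <- luma(brewer_cols)

# A wide brightness spread means the colours stay distinguishable in greyscale
spread <- c(default = diff(range(luma_default)),
            brewer  = diff(range(luma_brewer)))
```

The hue palette varies mostly in hue and little in brightness, so its colors tend to collapse into similar greys, while the sequential palette runs steadily from light to dark.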
Color blindness
http://www.vischeck.com/
vischeck
Named colors in R
http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf
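The same list of names is available from within R. This is a short sketch using base R's colors() and col2rgb(), plus ggplot2 for the plot; "steelblue" is an arbitrary example name.

```r
library(ggplot2)

# All colour names that R recognises
named_colours <- colors()

# Search the names by pattern, for example every name containing "steel"
steel_names <- grep("steel", named_colours, value = TRUE)

# Look up the red, green, and blue components of a named colour
steelblue_rgb <- col2rgb("steelblue")

# A named colour can be used wherever a constant colour is set
p <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 30, fill = "steelblue", colour = "white")
```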
Color Brewer palettes
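The ColorBrewer palettes can be explored from R with the RColorBrewer package, which also records whether each palette is considered colorblind friendly. This is a minimal sketch assuming RColorBrewer and ggplot2 are installed; the bar chart is an illustrative example rather than one from the workshop.

```r
library(RColorBrewer)
library(ggplot2)

# brewer.pal.info lists each palette with its maximum size, its category
# (sequential, diverging, or qualitative), and a colorblind-friendly flag
cb_friendly <- rownames(brewer.pal.info)[brewer.pal.info$colorblind]

# Hex codes for a seven-colour version of the palette used earlier
ylorrd <- brewer.pal(7, "YlOrRd")

# Show only the colorblind-friendly palettes in the plotting window:
# display.brewer.all(colorblindFriendly = TRUE)

# Apply a ColorBrewer palette to a fill aesthetic
p <- ggplot(diamonds, aes(x = cut, fill = color)) +
  geom_bar() +
  scale_fill_brewer(palette = "YlOrRd")
```

Sequential palettes such as "YlOrRd" suit ordered variables like diamond color grade; qualitative palettes are the better fit for unordered categories.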