Chapter 8 - Graphics


Chapter 8 - Graphics Exercises - Alternative with SIMD

Brian Fogarty
24 August 2018

Contents

EXERCISE I - BAR PLOTS
ANSWERS FOR EXERCISE I
  Question 1.1
  Question 1.2
  Question 1.3
  Question 1.4
  Question 1.5

EXERCISE II - HISTOGRAMS
ANSWERS FOR EXERCISE II
  Question 2.1
  Question 2.2
  Question 2.3

EXERCISE III - SCATTERPLOTS
ANSWERS FOR EXERCISE III
  Question 3.1
  Question 3.2
  Question 3.3
  Question 3.4
  Question 3.5
  Question 3.6
  Question 3.7
  Question 3.8
  Question 3.9

EXERCISE I - BAR PLOTS

Using the 2012 survey about smoking and drug use amongst English pupils (2012smokedrugs.dta), create the following bar plots using ggplot2 code.

1. Create a bar plot with grey bars using the recoded version of lifewant from Chapter 5, which we called lifewant2. Discuss what it shows you.

2. Create the same bar plot but now with red bars.

3. Create a bar plot with lifewant2 and free. You should label the values of free as 0='0. Not Free Lunch';1='1. Free Lunch'. Discuss what it shows you.

4. Create the same bar plot but now with grey shade bars.

5. Create the same bar plot but now using dodging.

ANSWERS FOR EXERCISE I

setwd("C:/QSSD/Exercises/Chapter 8 - Exercises")
getwd()

[1] "C:/QSSD/Exercises/Chapter 8 - Exercises"

library(foreign)
drugs <- read.dta("2012smokedrugs.dta", convert.factors=FALSE)

Question 1.1

library(car)

Loading required package: carData

drugs$lifewant2 <- recode(drugs$lifewant, "1='1. Strongly Agree';2='2. Agree';
  3='3. Neither Agree/Disagree';4='4. Disagree';5='5. Strongly Disagree'")

drugs1 <- subset(drugs, !is.na(lifewant2))

library(ggplot2)
x11()
ggplot(data = drugs1) +
  geom_bar(mapping = aes(lifewant2))

[Figure: bar plot of lifewant2 (x-axis, "1. Strongly Agree" to "5. Strongly Disagree") against count (y-axis)]

The bar plot shows that the modal response is that pupils "strongly agree" that they have the life they want. Meanwhile, the smallest category is pupils responding "strongly disagree" that they have the life they want.
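The reading of the plot above can be checked numerically. The following is a minimal sketch using invented counts, not the survey data: the vector lifewant2 here is a hypothetical stand-in for drugs1$lifewant2, and table() produces the same tallies that geom_bar() draws.

```r
# Hypothetical stand-in for drugs1$lifewant2; the counts are invented for illustration.
lifewant2 <- rep(
  c("1. Strongly Agree", "2. Agree", "3. Neither Agree/Disagree",
    "4. Disagree", "5. Strongly Disagree"),
  times = c(50, 40, 20, 10, 5)
)

# table() tallies each category, which is what geom_bar() plots by default
counts <- table(lifewant2)

# Names of the largest and smallest categories
modal_category <- names(counts)[which.max(counts)]
smallest_category <- names(counts)[which.min(counts)]

print(counts)
print(modal_category)
print(smallest_category)
```

With the real data, replacing the toy vector with drugs1$lifewant2 should give the category names that match the tallest and shortest bars.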

Question 1.2

x11()
ggplot(data = drugs1) +
  geom_bar(mapping = aes(lifewant2), fill="red")

[Figure: bar plot of lifewant2 (x-axis) against count (y-axis), with red bars]

Question 1.3

drugs$free1 <- recode(drugs$free, "0='0. Not Free Lunch';1='1. Free Lunch'")
table(drugs$free1)

0. Not Free Lunch     1. Free Lunch
             6162              1194

drugs1 <- subset(drugs, !is.na(lifewant2) & !is.na(free1))

x11()
ggplot(data = drugs1) +
  geom_bar(mapping = aes(lifewant2, fill=free1))

[Figure: stacked bar plot of lifewant2 (x-axis) against count (y-axis), filled by free1 ("0. Not Free Lunch", "1. Free Lunch")]

The bar plot shows that pupils who receive a free lunch make up roughly the same proportion of responses in each category.
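Proportions are hard to judge from stacked counts because the bars have different heights. The sketch below is one possible way to check the claim, using invented toy data in place of drugs1: prop.table() gives the share of each free1 group within each lifewant2 category, and position = "fill" rescales every bar to the same height so the shares can be compared by eye.

```r
library(ggplot2)

# Hypothetical toy data standing in for drugs1; the values are invented.
toy <- data.frame(
  lifewant2 = rep(c("1. Strongly Agree", "2. Agree", "3. Neither Agree/Disagree"),
                  times = c(100, 60, 40)),
  free1 = c(rep(c("0. Not Free Lunch", "1. Free Lunch"), times = c(80, 20)),
            rep(c("0. Not Free Lunch", "1. Free Lunch"), times = c(48, 12)),
            rep(c("0. Not Free Lunch", "1. Free Lunch"), times = c(32, 8)))
)

# Row proportions: share of each free1 group within each lifewant2 category
shares <- prop.table(table(toy$lifewant2, toy$free1), margin = 1)
print(round(shares, 2))

# position = "fill" rescales every bar to height 1, so shares are directly comparable
p_fill <- ggplot(data = toy) +
  geom_bar(mapping = aes(lifewant2, fill = free1), position = "fill") +
  scale_fill_grey()
```

Printing p_fill draws the plot. In this toy example the free lunch share is constructed to be identical in every category, so the dividing line between the two fills sits at the same height in each bar.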

Question 1.4

x11()
ggplot(data = drugs1) +
  geom_bar(mapping = aes(lifewant2, fill=free1)) +
  scale_fill_grey()

[Figure: stacked bar plot of lifewant2 (x-axis) against count (y-axis), filled by free1 in shades of grey]

Question 1.5

x11()
ggplot(data = drugs1) +
  geom_bar(mapping = aes(lifewant2, fill=free1), position="dodge")

[Figure: dodged bar plot of lifewant2 (x-axis) against count (y-axis), with side-by-side bars for free1 ("0. Not Free Lunch", "1. Free Lunch")]

EXERCISE II - HISTOGRAMS

Using the 2016 SIMD (simd.csv), create the following histograms using ggplot2 code.

1. Create a histogram using the variable pct_employment_deprived, the percentage of people in a datazone receiving financial support from the government due to being unable to work, and set the binwidth to 1. Discuss what it shows you.

2. Create the same histogram but now with red bars.

3. Create a histogram with pct_employment_deprived and urban as the second variable. Discuss what it shows you.

ANSWERS FOR EXERCISE II

Question 2.1

simd <- read.csv("simd.csv")

simd <- subset(simd, !is.na(pct_depress))

x11()
ggplot(data=simd) +
  geom_histogram(mapping = aes(pct_employment_deprived), binwidth=1)

[Figure: histogram of pct_employment_deprived (x-axis) against count (y-axis), binwidth 1]

The histogram shows that most datazones are clustered around 5% employment deprivation.
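A statement about where the values cluster can be backed up by locating the fullest bin. The sketch below uses simulated values, not the SIMD data: pct is a hypothetical stand-in for simd$pct_employment_deprived, and hist() with plot = FALSE returns the bin counts without drawing anything. Note that hist() places bin edges on whole numbers here, whereas geom_histogram() with binwidth = 1 centres its bins on whole numbers by default, so the two binnings are close but not identical.

```r
# Hypothetical simulated values standing in for simd$pct_employment_deprived.
set.seed(1)
pct <- rgamma(1000, shape = 2.5, rate = 0.3)

# Bins of width 1 covering the full range of the data
breaks <- seq(floor(min(pct)), ceiling(max(pct)), by = 1)
h <- hist(pct, breaks = breaks, plot = FALSE)

# Midpoint of the fullest bin, alongside the median for comparison
modal_bin_mid <- h$mids[which.max(h$counts)]
print(modal_bin_mid)
print(median(pct))
```

For a right-skewed variable such as this one, the median usually sits to the right of the fullest bin, which is worth mentioning when describing the plot.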

Question 2.2

x11()
ggplot(data=simd) +
  geom_histogram(mapping = aes(pct_employment_deprived), binwidth=1, fill="red")

[Figure: histogram of pct_employment_deprived (x-axis) against count (y-axis), binwidth 1, with red bars]

Question 2.3

library(car)
simd$urban <- recode(simd$urban, "0='Rural';1='Urban'", as.factor=TRUE)
class(simd$urban)

[1] "factor"

x11()
ggplot(data=simd) +
  geom_histogram(mapping = aes(pct_employment_deprived, fill=urban), binwidth=1)

[Figure: stacked histogram of pct_employment_deprived (x-axis) against count (y-axis), binwidth 1, filled by urban ("Rural", "Urban")]

The histogram shows that urban datazones tend to have higher percentages of employment deprivation than rural datazones. The datazones with the highest percentages of employment deprivation mostly appear to be urban datazones.
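Because there are far more datazones of one type than the other, a stacked histogram can make group differences hard to see. A per-group summary is one way to check the comparison. The sketch below uses a small invented data frame in place of simd, and tapply() to compute the median and maximum within each level of urban.

```r
# Hypothetical toy data standing in for simd; the values are invented.
toy <- data.frame(
  urban = factor(c("Rural", "Rural", "Rural", "Urban", "Urban", "Urban", "Urban")),
  pct_employment_deprived = c(4, 6, 8, 5, 9, 14, 30)
)

# Median and maximum employment deprivation within each group
group_median <- tapply(toy$pct_employment_deprived, toy$urban, median)
group_max <- tapply(toy$pct_employment_deprived, toy$urban, max)

print(group_median)
print(group_max)
```

In this toy example the urban group has both the higher median and the higher maximum, which is the pattern the histogram is described as showing.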

EXERCISE III - SCATTERPLOTS

Using the 2016 SIMD data (simd.csv), create the following scatterplots using ggplot2 code.

1. Create a scatterplot with black points with pct_school_attend on the x-axis, pct_employment_deprived on the y-axis, and include position="jitter". Discuss what it shows you.

2. Create the same scatterplot, but with red instead of black data points.

3. Create the same scatterplot using black points and alpha=1/10 to illustrate points that overlay one another. Discuss what you find.

4. Create a scatterplot with pct_school_attend on the x-axis, pct_employment_deprived on the y-axis, and urban as the third variable. Use scale_colour_grey() to differentiate the urban/rural data points. Discuss what it shows you.

5. Create the same scatterplot, but with the default colours to differentiate the urban/rural data points.

6. Create the same scatterplot, but specify "darkblue" and "orange" as the colours to differentiate the urban/rural data points.

7. Create the same scatterplot, but with default symbols to differentiate the urban/rural data points.

8. Create a similar scatterplot using urban as the faceting variable. This will create two scatterplots - one for rural datazones and one for urban datazones.

9. Place the scatterplots from questions 4 through 7 onto one page using the grid.arrange() function.

ANSWERS FOR EXERCISE III

Question 3.1

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived),
             position="jitter")

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), black points]

The scatterplot shows that as the percentage of school attendance increases, the percentage of employment deprivation decreases in Scottish datazones.
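The direction of the relationship described above can be summarised with a correlation coefficient and the slope of a fitted line. This is a minimal sketch on invented values, not the SIMD data: toy is a hypothetical stand-in for simd, and cor() and lm() are standard base R functions.

```r
# Hypothetical toy data standing in for simd; the values are invented.
toy <- data.frame(
  pct_school_attend = c(60, 70, 80, 90, 95),
  pct_employment_deprived = c(30, 22, 15, 8, 4)
)

# Pearson correlation: a negative value means one variable falls as the other rises
r <- cor(toy$pct_school_attend, toy$pct_employment_deprived)

# Slope of a simple linear fit: change in deprivation per one-point rise in attendance
fit <- lm(pct_employment_deprived ~ pct_school_attend, data = toy)
slope <- coef(fit)[["pct_school_attend"]]

print(r)
print(slope)
```

A negative correlation and a negative slope are consistent with the downward pattern in the scatterplot, although neither says anything about causation.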

Question 3.2

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived),
             position="jitter", colour="red")

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), red points]

Question 3.3

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived),
             position="jitter", alpha=1/10)

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), black points with alpha=1/10]

This scatterplot shows a cluster of datazones between 85% and 95% school attendance and between 0% and 8% employment deprivation.
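The size of a cluster picked out by eye can be quantified by counting the share of points that fall inside its bounds. The sketch below uses invented values in place of simd, with the 85 to 95 and 0 to 8 limits taken from the description above.

```r
# Hypothetical toy data standing in for simd; the values are invented.
toy <- data.frame(
  pct_school_attend = c(88, 90, 92, 94, 86, 70, 60, 97),
  pct_employment_deprived = c(3, 5, 2, 7, 6, 25, 40, 1)
)

# TRUE for rows inside the window described in the text
in_cluster <- with(toy,
  pct_school_attend >= 85 & pct_school_attend <= 95 &
    pct_employment_deprived >= 0 & pct_employment_deprived <= 8
)

# The mean of a logical vector is the proportion of TRUE values
share_in_cluster <- mean(in_cluster)
print(share_in_cluster)
```

Applied to the real data, a large share would confirm that the dark region produced by alpha=1/10 holds most of the datazones.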

Question 3.4

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         colour=urban), position="jitter") +
  scale_colour_grey()

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), points coloured by urban ("Rural", "Urban") in shades of grey]

This scatterplot shows a mix of urban and rural datazones with high percentages of school attendance and low percentages of employment deprivation. However, the datazones with low percentages of school attendance and high percentages of employment deprivation tend to be overwhelmingly urban datazones.

Question 3.5

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         colour=urban), position="jitter")

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), points coloured by urban ("Rural", "Urban") in the default colours]

Question 3.6

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         colour=urban), position="jitter") +
  scale_colour_manual(values=c("darkblue","orange"))

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), points coloured by urban ("Rural", "Urban") in dark blue and orange]

Question 3.7

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         shape=urban), position="jitter")

[Figure: jittered scatterplot of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), point shape set by urban ("Rural", "Urban")]

Question 3.8

x11()
ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived),
             position="jitter") +
  facet_wrap(~urban)

[Figure: jittered scatterplots of pct_school_attend (x-axis) against pct_employment_deprived (y-axis), faceted into "Rural" and "Urban" panels]

Question 3.9

p1 <- ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         colour=urban)) +
  scale_colour_grey()

p2 <- ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         colour=urban))

p3 <- ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         colour=urban)) +
  scale_colour_manual(values=c("darkblue","orange"))

p4 <- ggplot(data=simd) +
  geom_point(mapping=aes(x=pct_school_attend, y=pct_employment_deprived,
                         shape=urban))

library(gridExtra)
x11()
grid.arrange(p1, p2, p3, p4, ncol=2, nrow=2)

[Figure: the four scatterplots p1 to p4 arranged in a 2 x 2 grid, each showing pct_school_attend (x-axis) against pct_employment_deprived (y-axis) with urban ("Rural", "Urban") distinguished by grey colours, default colours, dark blue and orange, and point shape respectively]
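The four plots in Question 3.9 repeat the same data and the same x and y mappings. One possible way to reduce the repetition, shown here as a sketch rather than as the intended answer, is to build a shared base plot and add a different layer to it each time. The data frame toy is an invented stand-in for simd, the names base and plots are hypothetical, and the grid is drawn only if the gridExtra package is installed.

```r
library(ggplot2)

# Hypothetical toy data standing in for simd; the values are invented.
toy <- data.frame(
  pct_school_attend = c(60, 70, 80, 90, 95, 85),
  pct_employment_deprived = c(30, 22, 15, 8, 4, 10),
  urban = factor(c("Urban", "Urban", "Rural", "Rural", "Urban", "Rural"))
)

# Shared data and x/y mappings, declared once
base <- ggplot(data = toy,
               mapping = aes(x = pct_school_attend, y = pct_employment_deprived))

# Each plot adds its own layer and scale to the shared base
plots <- list(
  grey    = base + geom_point(mapping = aes(colour = urban)) + scale_colour_grey(),
  default = base + geom_point(mapping = aes(colour = urban)),
  manual  = base + geom_point(mapping = aes(colour = urban)) +
    scale_colour_manual(values = c("darkblue", "orange")),
  shapes  = base + geom_point(mapping = aes(shape = urban))
)

# Draw the 2 x 2 grid only if gridExtra is available
if (requireNamespace("gridExtra", quietly = TRUE)) {
  gridExtra::grid.arrange(grobs = unname(plots), ncol = 2, nrow = 2)
}
```

Mappings placed in ggplot() are inherited by every layer added afterwards, which is why the x and y aesthetics only need to be written once here. Passing the plots through the grobs argument gives the same layout as listing p1 to p4 individually.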