Assignment 3 - linguisticsociety.org · Assignment 3 Author: Josef Fruehwald Created Date:...

2
Assignment 3 Josef Fruehwald 7/20/2017 library(tidyverse) ## Loading tidyverse: ggplot2 ## Loading tidyverse: tibble ## Loading tidyverse: tidyr ## Loading tidyverse: readr ## Loading tidyverse: purrr ## Loading tidyverse: dplyr ## Conflicts with tidy packages ---------------------------------------------- ## filter(): dplyr, stats ## lag(): dplyr, stats library(lsa2017) The duration data in iy_ah is heavilly skewed: iy_ah %>% ggplot(aes(dur, F1_n))+ geom_point(aes(color = plt_vclass), alpha = 0.3) -2 0 2 4 0.25 0.50 0.75 dur F1_n plt_vclass ah iy For some data types, an appropriate thing to do is to “log transorm” it: iy_ah %>% ggplot(aes(log2(dur), F1_n, color = plt_vclass))+ 1

Transcript of Assignment 3 - linguisticsociety.org · Assignment 3 Author: Josef Fruehwald Created Date:...

Page 1: Assignment 3 - linguisticsociety.org · Assignment 3 Author: Josef Fruehwald Created Date: 7/21/2017 10:46:31 AM ...

Assignment 3Josef Fruehwald

7/20/2017

library(tidyverse)

## Loading tidyverse: ggplot2## Loading tidyverse: tibble## Loading tidyverse: tidyr## Loading tidyverse: readr## Loading tidyverse: purrr## Loading tidyverse: dplyr

## Conflicts with tidy packages ----------------------------------------------

## filter(): dplyr, stats## lag(): dplyr, stats

library(lsa2017)

The duration data in iy_ah is heavilly skewed:iy_ah %>%

ggplot(aes(dur, F1_n))+geom_point(aes(color = plt_vclass), alpha = 0.3)

−2

0

2

4

0.25 0.50 0.75

dur

F1_

n

plt_vclass

ah

iy

For some data types, an appropriate thing to do is to “log transorm” it:iy_ah %>%

ggplot(aes(log2(dur), F1_n, color = plt_vclass))+

1

Page 2: Assignment 3 - linguisticsociety.org · Assignment 3 Author: Josef Fruehwald Created Date: 7/21/2017 10:46:31 AM ...

geom_point(alpha = 0.3)

−2

0

2

4

−4 −3 −2 −1 0

log2(dur)

F1_

n

plt_vclass

ah

iy

The logarithmic transform stretches out the shorter durations and squishes in the longer durations. Andwhen you use the log2 transformation, a change in 1 unit corresponds to doubling on the original scale. Solog2(0.125) = -3, and log2(0.25) = -2.

For this assignment, I’m going to recommend that you log2 transform duration, and center it, but not rescaleit.

Voicing and Duration

1. Another factor which has been said to influence F1 is the voicing of the following segment. Explorehow this has an effect on the iy_ah data, and whether it interacts with duration.

This is a normal kind of research question to occur to someone while doing their research, and there is usuallyno top-down authority to structure the analysis process, but I’ll try to be more helpful here:

1. Use only iy and ah from word internal contexts (the context column).2. Use only obstruent following segments. (B, CH, D, DH, F, G, JH, K, P, S, SH, T, V, Z, ZH).3. Create a new column for voicing recoding the following segment column. You can review code to do

this here http://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_2.nb.html#preserve_hid_info. Pay close attention to how I asked it to recode the T and F values.

4. log2 Transform and center duration.5. Make at least one plot.6. Fit one model for the iy data and one model for the ah data.7. Scrutinize the parameter estimates of these two models.8. Discuss what you see.

2