Statistics Trivial Pursuit (Sort of) for Review (Math 17)
STATISTICS TRIVIAL PURSUIT (SORT OF) FOR REVIEW (MATH 17)
COLORS AND CATEGORIES
Blue – Basic Graphs and Descriptive Statistics
Pink – Assumptions (cumulative)
Yellow – Statistical Theory and History
Brown – Interpretations
Green – Last 1/3 Inference
Orange – Other Hypothesis Testing Related
BLUE 1 What are the descriptive statistics that are
sensitive to outliers?
BLUE 2 Provide the name and primary purpose of this
graph.
BLUE 3 Provide a basic description of the distribution of this variable
from its graph (remember there are 3 things to describe).
BLUE 4 What are the descriptive statistics used in
the creation of a boxplot?
BLUE 5 Name the rule used to compute outliers, and
describe how to apply it.
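As a study aid for BLUE 5, here is a minimal sketch of one common outlier rule, the 1.5 × IQR rule. It assumes this is the rule the question has in mind. The data values and function names are hypothetical, and quartile conventions differ between textbooks and software, so the fences can shift slightly.

```python
import statistics

def iqr_fences(values):
    """Return the (lower, upper) fences of the 1.5 x IQR rule."""
    # method="inclusive" is one of several quartile conventions (an assumption here).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def flag_outliers(values):
    """Return the values that fall outside the fences."""
    low, high = iqr_fences(values)
    return [v for v in values if v < low or v > high]

# Hypothetical sample with one unusually high value.
sample = [2, 4, 5, 5, 6, 7, 8, 9, 30]
fences = iqr_fences(sample)
outliers = flag_outliers(sample)
print(fences, outliers)
```

Any observation below the lower fence or above the upper fence is flagged as a potential outlier.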
BLUE 6 Name graphs that are appropriate to display
categorical variables, and state whether or not you should discuss the shape of distributions based on those graphs.
BLUE 7 Compare/contrast these 2 distributions based
on the plot.
BLUE 8 A standard deviation of a measurement in
feet is 3.4 feet, from a sample with a mean of 29.2. Interpret the standard deviation.
BLUE 9 This plot is part of the preliminary analysis
for ….
BLUE 10 If there was a high outlier in the distribution
of a particular variable, and it was removed, what descriptive statistics are likely (or certain) to change to a significant extent?
PINK 1 What is the assumption that all chi-square
tests have in common?
PINK 2 What is the assumption related to sample
sizes for a 2 sample z-test?
PINK 3 What is the assumption related to sample
size when constructing a confidence interval for p?
PINK 4 What are the specifics of the nearly normal
condition for a paired t-test?
PINK 5 What are the specifics of the nearly normal
condition for ANOVA?
PINK 6 What are the specifics of the 2 assumptions
in regression related to error terms?
PINK 7 You are told that the randomization and
independence condition is met for a sample of high school students who were asked how much money they received for their most recent birthday. Describe what the randomization and independence assumption means in this context.
PINK 8 What are some example tests where
assumptions related to normality are NOT required?
PINK 9 What are the specifics of the nearly normal
condition for a 2-sample t-test?
PINK 10 What is the assumption that all tests/CIs
have in common but which (since it is common to all) Prof. Wagaman doesn’t require that you write down when you list assumptions?
YELLOW 1 What is a sampling distribution for a
statistic? (conceptually)
YELLOW 2 (Fill in at least 3 of the blanks for credit) The t distribution was discovered by ___________ who published under the pseudonym ____________. He discovered the t distribution while working for _____________ in Ireland. Specifically he was working in the field of ______________ (2 words, but one blank) and was primarily responsible for checking out _________, one of their many products.
YELLOW 3 What does the Central Limit Theorem say?
YELLOW 4 How are z-scores computed, and what are
they useful for? (variety of answers)
YELLOW 5 When sampling distributions have standard
deviations that involve unknown parameters, and we plug in estimates for those parameters, we obtain what value(s)?
YELLOW 6 Suppose 2 random variables X and Y are
independent. X has mean 6 and standard deviation 3. Y has mean 14 and standard deviation 4.
What are the values of the mean and standard deviation of X+Y?
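To check an answer to YELLOW 6, the following sketch applies the standard rules for independent random variables: means add, and variances (not standard deviations) add. The function name is hypothetical.

```python
import math

def sum_of_independent(mean_x, sd_x, mean_y, sd_y):
    """Mean and SD of X + Y, assuming X and Y are independent."""
    mean_sum = mean_x + mean_y
    # Variances add under independence; take the square root at the end.
    sd_sum = math.sqrt(sd_x ** 2 + sd_y ** 2)
    return mean_sum, sd_sum

mean_sum, sd_sum = sum_of_independent(6, 3, 14, 4)
print(mean_sum, sd_sum)
```

Note that simply adding the standard deviations (3 + 4) would overstate the spread of X+Y.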
YELLOW 7 What are the differences between a chi-square test of homogeneity and a chi-square test of independence?
YELLOW 8 What are the three types of bias in sampling?
YELLOW 9 If you are designing an experiment and you
have 3 different drugs you want to try, and you want to try them at 2 different doses each (1 pill or 2 pills daily), and you want to include (a) placebo group(s), how many treatments are there in your experiment?
YELLOW 10 Name and describe two different sampling
techniques.
BROWN 1 Running a hypothesis test for slope equal to
0 or not, you obtain a t-test statistic value of -2.14. Interpret this test statistic.
BROWN 2 A linear regression results in an R-squared value of .81. Assuming linear regression was appropriate, interpret this R-squared in terms of general X and Y variables.
BROWN 3 A random sample of n=16 observations
yields an s=24 (sample standard deviation). What is the numerical value of the standard error of the sample mean? Also, interpret this value.
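For the numerical part of BROWN 3, a short sketch using the usual formula for the standard error of the sample mean, s divided by the square root of n. The variable names are illustrative.

```python
import math

n = 16   # sample size from the question
s = 24   # sample standard deviation from the question

# Standard error of the sample mean: estimated SD of the sampling
# distribution of the mean, with s plugged in for the unknown sigma.
standard_error = s / math.sqrt(n)
print(standard_error)
```

The interpretation still has to be put in words: the value estimates how much sample means typically vary from sample to sample for samples of this size.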
BROWN 4 Describe what is wrong with the statement:
“A p-value is the probability that the null hypothesis is true.”
BROWN 5 A 95% confidence interval for a mean weight
of a new dog breed goes from (25.2, 34.6) pounds. Interpret the confidence interval given here.
BROWN 6 A regression results in an s_e value of 3.46.
The y-axis goes from 36 to 109. What does the s_e value represent, and what does it tell you about how well the regression does?
BROWN 7 A p-value for an ANOVA testing for equality of
5 means with an F of 24.56 is .0359. Interpret this p-value.
BROWN 8 A 95% confidence interval for a mean weight
of a new dog breed goes from (25.2, 34.6) pounds. Interpret the confidence level used here.
BROWN 9 A conclusion in a t-test of mu=150 vs. mu>150 is
given as:
Our evidence is not inconsistent with our null hypothesis.
How should this conclusion be changed to be correct?
BROWN 10 A p-value for a two-sided two sample z-test is
.1470 based on a Z of 1.45. Interpret this p-value.
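The p-value quoted in BROWN 10 can be reproduced from the test statistic. This sketch assumes the standard normal reference distribution of a two-sample z-test and doubles the upper-tail area for the two-sided alternative.

```python
from statistics import NormalDist

z = 1.45  # test statistic from the question

# Two-sided p-value: area in both tails beyond |z| under the standard normal.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(p_value, 4))
```

The result agrees with the stated .1470 up to rounding.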
GREEN 1 Which set(s) of graphs indicate it would NOT be appropriate to perform an ANOVA? Explain.
GREEN 2 You want to know if the distribution of class
year among Reunion workers is equally split among first-years, sophomores, and juniors. What test is appropriate?
(Note: I am assuming that seniors can’t get hired to work Reunion; if they can, change this to equally split among all four class years.)
GREEN 3 An ANOVA where the null hypothesis is
rejected results in multiple comparisons of:
      Estimate         lwr        upr
2-1   4.146737   -2.737867  11.031342
3-1  -3.742933  -10.627537   3.141671
3-2  -7.889670  -14.774274  -1.005066
Summarize what this multiple comparisons shows you.
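One way to read the GREEN 3 output is to check which intervals exclude zero. The sketch below reads the output as three rows (2-1, 3-1, 3-2) with columns Estimate, lwr, and upr. The dictionary and function names are hypothetical.

```python
# Rows as read from the multiple-comparisons output: (estimate, lwr, upr).
comparisons = {
    "2-1": (4.146737, -2.737867, 11.031342),
    "3-1": (-3.742933, -10.627537, 3.141671),
    "3-2": (-7.889670, -14.774274, -1.005066),
}

def excludes_zero(lwr, upr):
    """True when the interval lies entirely above or entirely below 0."""
    return lwr > 0 or upr < 0

# Pairs whose interval for the difference in means does not contain 0.
significant = [
    pair for pair, (_, lwr, upr) in comparisons.items()
    if excludes_zero(lwr, upr)
]
print(significant)
```

An interval that excludes zero indicates a significant difference between that pair of group means. The sign of the estimate shows which group has the larger mean.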
GREEN 4 If you wanted to know whether or not there is
a significant association between heart rate and weight in rats, what statistical procedure would you perform?
GREEN 5 You want to compare the means of 4 groups.
Describe why you would want to do an ANOVA rather than 6 t-tests to compare all pairs of means.
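A rough illustration of the issue behind GREEN 5: the sketch counts the pairwise comparisons among 4 groups and approximates the chance of at least one Type I error across them. It assumes each test uses a .05 significance level and treats the tests as independent, which is only an approximation because pairwise t-tests share data.

```python
import math

groups = 4
alpha = 0.05  # assumed per-test significance level

# Number of distinct pairs of means to compare.
n_pairs = math.comb(groups, 2)

# Approximate probability of at least one false rejection across all tests,
# treating the tests as independent (a simplifying assumption).
familywise_error = 1 - (1 - alpha) ** n_pairs
print(n_pairs, round(familywise_error, 4))
```

Under these assumptions the overall error rate is far above .05, which is the usual motivation for running a single ANOVA first.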
GREEN 6 You want to know if there is an association
between t-shirt size (S,M,L,etc.) and class year at Amherst. What is the appropriate statistical procedure to perform?
GREEN 7 You want to know if a higher proportion of
underclassmen have corrective lenses compared to upperclassmen. Explain why there is no appropriate chi-square test for this situation. What analysis could you run?
GREEN 8 A balanced ANOVA is an ANOVA where….
GREEN 9 Describe the similarities and differences in
finding p-values for ANOVA and chi-square.
GREEN 10 A scatterplot for
regression is given as:
R also reports an R-squared value of .81
What is the correlation between X and Y?
ORANGE 1 What is power and how would you increase it
for a hypothesis test?
ORANGE 2 What is a Type I error?
ORANGE 3 If given a significance level of .035, for what
p-values would you reject the null hypothesis?
ORANGE 4 Explain the difference between practically
significant results and statistically significant results.
ORANGE 5 Most of the tests we learned in class were
______________ tests. If certain assumptions related to them are not met, you can run ________________ tests, one example of which is ________________________.
(Fill in at least 1 blank).
ORANGE 6 Hypothesis tests and confidence intervals are
based on an understanding of the __________________ _______________________ (two words) of statistics.
ORANGE 7 You are performing a t-test for mu=60 versus
a 2-sided alternative and all conditions are satisfied. What is the expected value of your test statistic under the null hypothesis?
ORANGE 8 You are testing for p=.4 vs. p>.4 and all
conditions are satisfied. Your sample results in 30 yes replies out of 100 responses. What can you say about your p-value for this test?
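For ORANGE 8, the reasoning can be checked numerically. This sketch assumes the usual one-proportion z-test with the standard error computed under the null value p = .4. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p0 = 0.4          # null value of p
n = 100           # number of responses
p_hat = 30 / n    # sample proportion of yes replies

# Standard error uses the null value p0, as in a one-proportion z-test.
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# Upper-tail p-value, because the alternative is p > .4.
p_value = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_value, 3))
```

Because the sample proportion falls below .4 while the alternative points above it, the test statistic is negative and the p-value must exceed .5, even before any calculation.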
ORANGE 9 In order to use a confidence interval to do a
one-sided t-test with a significance level of .05, what confidence level would need to be used?
ORANGE 10 You are testing for mu=50 vs. mu>50, and
the appropriate confidence interval is (52,64). Can you reject your null hypothesis? Explain.
Final Exam is Monday, May 9th, 9 am - 12 noon in SM 207
You can bring a two-sided page of notes and calculator, plus pen/pencils.
Office Hours: Thursday – 2-4, Friday – 1-4, Sunday – 2-4 pm, SM 206 or 207
Good luck studying!
REMINDER:
Math dept. end of semester picnic is Saturday from 12-2 at the Alumni House
THANKS FOR A GREAT SEMESTER!