AP Statistics Midterm Review - The Math Class of Mrs....

13
Name __________________________ AP Statistics Midterm Review Exploring Data 1) Mini-tab data for monthly rate of return on Wal-Mart stock from 1973 to 1991 is shown on the right. (a) Give the five number summary for the data presented. (b) Describe in words the distribution of the data. (SOCS!) (c) Find the IQR. Are there any outliers based on the outlier formula? Based on the output, does it appear the Minitab program uses this formula? (d) What does the term “resistant” mean? Which statistics are considered resistant? 2) The distribution of the annual number of hurricanes in the US over a 70 year period is given on the right. (a) Give a description of the data. (b) What would be the appropriate measures to use for center and spread based on your answer to (a)? (c) About where does the center of the data lie?

Transcript of AP Statistics Midterm Review - The Math Class of Mrs....

Page 1: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Name __________________________

AP Statistics Midterm Review

Exploring Data

1) Mini-tab data for monthly rate of

return on Wal-Mart stock from 1973

to 1991 is shown on the right.

(a) Give the five number summary

for the data presented.

(b) Describe in words the

distribution of the data. (SOCS!)

(c) Find the IQR. Are there any outliers based on the outlier formula? Based on the output, does it appear

the Minitab program uses this formula?

(d) What does the term “resistant” mean? Which statistics are considered resistant?

2) The distribution of the annual number of hurricanes in

the US over a 70 year period is given on the right.

(a) Give a description of the data.

(b) What would be the appropriate measures to use for

center and spread based on your answer to (a)?

(c) About where does the center of the data lie?

Page 2: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

* NOTE: Rounding of thousands may cause slight error in totals!

3) What percent of all undergraduate students were 18 to 24 years old in the fall of the academic year?

4) What proportion of 2-year full-time students were 40 and up?

5) Find the marginal distribution of age among all undergraduate students, first in counts and then percents.

Make a bar graph of the distribution in percents. Comment on what you see.

6) The box-plots on the right show the career homeruns distributions for

Barry Bonds and Hank Aaron. Describe and compare the

distributions.

Page 3: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Modeling Distributions of Data

The Weschler Adult Intelligence Scale scores for teenagers is described by (110, 25)N .

7) What percent of teenagers have scores between 85 and 135?

8) What proportion of teenagers have scores lower than 80?

9) What score is considered the cutoff for the 90th

percentile?

10) What score would a teenager need to be in the top 1% of all teenagers?

11) What is the z-score for a student who scores 105? Describe what this z-score means.

12) What area of a normal curve would correspond to 1.3 2.4Z ?

Page 4: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Describing Relationships (Least Squares Regression)

13) Construct a scatter-plot to the right and label axes. Describe the relationship between the variables below.

Assume alcohol consumption is the explanatory variable.

14) Determine the LSRL equation for predicting heart disease death rate from wine consumption below. Show

the line on your scatter-plot.

15) Interpret the correlation coefficient and the coefficient of determination based on the data. Be very careful

how you state the interpretations.

16) Based on your regression model in #15.

(a) Predict the heart disease death rate from a country with a annual alcohol from wine consumption of 10.1

liters/year of alcohol from wine. Are you confident in prediction?

(b) Show a residual plot (including labels!) for this problem below. Comment on the plot – is a linear

model appropriate? Could any data point be considered influential?

Page 5: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

17) The figure above shows computer output for a linear regression between measured brain activity in a human

and a measure of social distress a person feels from exclusion. Both variables are quantitative to allow for a

regression.

(a) What is the linear regression equation found by the computer? Be sure to label the explanatory and

response variables.

(b) What is the correlation coefficient between the variables?

18) In an economics class, the correlation between students’ total scores prior to the final exam and their score

on the final examination score is 0.7r . The mean and standard deviation of prior total scores are 280 and

30, respectively. The mean and standard deviation for final exam scores are 75 and 8, respectively. What is

the equation for the LSRL?

Page 6: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Designing Studies

19) Two treatments are considered for breast cancer detected in early stages, total removal of the breast or

removal of just the tumor and nearby lymph nodes. A medical team examines the records of 25 hospitals

and examines the length of time of survival after surgery of women who have had either treatment.

(a) What are the explanatory and response variables?

(b) Is this an observational study or an experiment?

(c) Is there a confounding variable present? If so, explain.

20) An experiment compares two new varieties of corn with higher lysine content, opaque-2 and floury-2, to see

if roosters/chickens grow larger with the new types of corn. Researchers mix each type of corn with

soybeans at three protein levels, 12%, 16% and 20% to create diet mixes for the chickens. They feed each

diet to 10 one-day old birds and record weight gains after 21 days.

(a) What are the experimental units and response variable?

(b) How many factors are there? How many treatments? Use a diagram to describe the experiment. How

many experimental units are required?

(c) How would you create a randomized design for the experiment?

(d) Male chicks (baby roosters?) grow faster than females. How could you account for this in the

experimental design? What is this process called?

21) We want to know if people like Coke or Pepsi better. So we set up a double-blind taste test where each

subject tastes both. Describe what this means. Is this an experiment or observational study? What type of

observational study or experiment is this?

Page 7: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Probability: What Are The Chances?

22) What is P(Married)?

23) What is P(Age 30-64)?

24) What is P(Married\Age 30-64) ?

25) Are the events Married and Age 30-64 independent? Justify your answer.

26) Are the events Married and Age 30-64 disjoint? Justify your answer.

27) If a fair coin is tossed many times, as n increases,

the proportion of heads should approach 0.5. What

important statistical idea does this represent?

Page 8: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

28) Sadly, the probability that a dropped piece of buttered bread will land on the buttered side is 0.72. Yuck!

(a) If three pieces of buttered bread are dropped, what is the probability at least one will land on the buttered

side?

(b) Describe how you would simulate this using a random digit table.

29) John is considering either surgery for heart bypass or medical management of his heart condition. The event

“A” is that John survives at least 5 years and has a good quality of life during that time. The doctor gives

him the following information:

* Under medical management M, P(A\M) = 0.7

* Probability is 0.05 John will not survive bypass surgery, P(B)

* Probability is 0.10 John survives surgery but with serious complications, P(C)

* Probability is 0.85 John survives surgery with no complications, P(D)

* If John survives surgery with complications, P(A\C) = 0.73.

* If John survives with no complications, P(A\D) = 0.76.

Calculate overall P(A) assuming John chooses surgery. Should he choose surgery or medical management

based on the doctor’s information?

Page 9: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Random Variables

P(X)

X

A probability distribution of a continuous random variable.

30) ( 0.5)P X

31) ( 0.5)P X

32) (0.5 1.5)P X

33) (1.0 2.5)P X

Assume an NFL football score simulator is designed to give the following discrete distribution of points by an

average football team in a single quarter of the game:

X 0 3 7 10 14

P(X) 0.1 0.1 0.4 0.2 ?

34) What is the missing probability?

35) What is the expected mean ( X ) and standard deviation ( X ) of a score by a team in a quarter?

36) What is the expected mean ( X ) and standard deviation ( X ) of a score by a team in a game? (Assume

each quarter score is independent).

37) Unfortunately, the Lions X = 22.1 and X = 3.1 for an entire game. By how much should the Lions expect

to lose to an average NFL team? What is the standard deviation of the distribution of the difference

between an average NFL team and the Lions score?

38) If the Lions draft the #1 running back, Las Vegas assumes their revised mean score (Y ) would be modeled

by the equation 1.5 2.1Y X . What are the new values of Y and Y for the Lions?

Page 10: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

A family has seven children. Assume the genetics of the parents is such that the chance of having a girl is 0.6,

and the probability of having a boy is 0.4. Assume they are independent events.

39) What is the probability that all seven children are boys?

40) What is the probability that exactly 3 children are girls?

41) What is the probability that at most 2 children are girls?

42) What is the mean number of girls expected of this family? What is the standard deviation of the count of

girls?

43) Would it be accurate to assume a normal distribution with the mean and standard deviation calculated

above? Why or why not?

Suppose Miguel Cabrera expects to hit for his lifetime average next year for the Tigers, which happens to be

0.321. (This is his total numbers of hits / total number of at-bats).

44) What is the probability he will get a hit on his first at bat in a game?

45) What is the probability he will get his first hit on his fifth at-bat in the game?

46) What is the probability it will take at most 3 at-bats to get a hit in a game?

47) If Miguel gets 620 at bats next year, how many hits should he expect for the season?

48) How many at bats should we expect before he gets a hit next season?

Page 11: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

Sampling Distributions

49) Voter registrations show 68% of registered Troy voters are Republicans. To test this you randomly phone

150 registered Troy voters, and 73% are found to be Republican.

(a) Use proper notation to indicate if 68% a parameter or statistic? What about the 73%?

(b) What are the mean and standard deviation of the sampling distribution?

(c) What is the probability of obtaining a SRS of 150 Troy voters in which 73% or more are registered

Republicans.

50) A sulfur compound (DMS) is sometimes found in milk from cows eating near onion fields. Researchers

want to find the lowest concentration of DMS detectable to humans. Previous studies show that this

threshold follows a normal distribution with 26 and 7 . Here are results from 10 randomly chosen

subjects (in micrograms/liter):

28 40 28 33 20 31 29 27 17 21

(a) What is the mean threshold based on this sample? (Use proper notation)

(b) What is the probability of getting a sample mean even farther away from than this?

Page 12: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

51) Suppose a density curve for a distribution of a population is very non-normal, with 1 , as shown in the

figure above. Many sample means of different sizes ( 2, 10, 25n n n ) from this population are

observed. As n increases the shape becomes more normal. What important statistical idea does this

represent?

Estimating with Confidence

52) The City of Troy is wondering whether residents would favor a millage to support road repairs in the city.

A preliminary SRS of 300 potential Troy voting residents finds that 162 would vote in favor of the millage.

(a) Check required assumptions to create a confidence interval of passing the millage.

(b) Construct a 95% confidence interval. Should the City be 95% confident the millage will pass?

Justify your answer.

(c) Assuming the sample proportion remains the same as found in part (a). How large would a SRS is

required so the City could be 95% confident of the millage passing?

Page 13: AP Statistics Midterm Review - The Math Class of Mrs. Loucksmrsloucks.weebly.com/.../9/13498778/ap_statistics_midterm_review_… · AP Statistics Midterm Review Exploring Data 1)

53) A member of the Troy Colt bowling team usually bowls 400 games in the course of a year. A random

sample of a member of the Troy Colt Bowling team shows the following scores for 12 games:

145 141 123 162 170 155 167 177 148 163 175 169

(a) Construct a 99% confidence interval for the mean score for this bowler.

(b) What condition was violated when creating this interval? Do you feel confident in the interval that was

created?