AP Statistics Summer Assignment - Boone County Schools


AP Statistics Summer Assignment

The summer assignment for this course will be due on the first school day of the 2013-2014 school year. Late assignments will not be taken and no exceptions will be made. It is imperative to your grade that you put forth every effort when completing this assignment, as it will be the first homework score you will receive for this class. Good luck and I look forward to meeting you in the fall.

Introduction

Statistics is not just a regular math class. Statistics requires knowledge of mathematics as well as the ability to read for comprehension and write well. You will be required as part of the summer assignment to do the following:

1. Be familiar with the graphing calculator topics and skills in the additional uploaded attachment. This is considered to be assignment 1. This attachment guides you on both the calculator skills and mathematical skills needed for the first week of class. You will need to master these calculator skills prior to the first day of school.

2. Find five articles for statistical analysis for assignment 2. See directions below.

3. Accurately complete assignment 3.

We will have a test over the material within the first few days after school starts. If you still need help after exhausting internet resources and other statistical resources, e-mail me and I will be happy to help. I will try to begin checking e-mail once per week in July, so it may be several days before you get a response.

Math Topics needed for summer assignment

Least Squares Regression Lines
Basic Probability
Mean, Median, Mode, Range (5 number summary)
Bar Graph
Histograms
Stem Leaf Plots
Frequency Table
Dot Plots

Summer Assignment 2: Find five articles that are statistically based and answer the following questions about each:

1. Who are the individuals described by the data in the article?

2. What are the variables? In what units is each variable recorded?

3. Why was/were the data gathered?

4. When, where, how, and by whom were the data produced?

5. Describe any graphs included in the article. Were they relevant to the article? How do they help explain or support the article?

6. Interpretation: Summarize the article in your own words.

You must have them ready to turn in on the first day.


Summer Assignment 3: Buy a single-serving bag of M&Ms and perform the following activities as indicated. Use the calculator handout to aid you as needed.

Activity 1: How Do The Colors Vary?

1. Find the weight (in grams) of your bag of M&Ms. ______________ (We will use it in a later activity!)

2. Open your bag of M&Ms and count the number and the percentage of each color and the total number of M&Ms in the bag.

Color      | Brown | Yellow | Red | Blue | Orange | Green | Total
Number     |       |        |     |      |        |       |
Percentage |       |        |     |      |        |       |


3. Using the data from your bag of M&Ms, construct a pie chart of colors.

Pie Chart


4. Construct, using corresponding colors, a bar chart using the data from your bag of candy.

Bar Chart

Brown Yellow Red Blue Orange Green


Activity 2: How much does the candy weigh?

Here is a previously-collected set of weights.

Bag Weights

51.1 48.9 49.3 51.2 50.8 51.4 52.1 52.4 49.9 49

53.5 48.5 52 50.3 50.3 51.3 51.2 50.2 51.7 49.4

52.4 47 50.5 48.4 50.8 49.9 52.4 50.7 51.2 49.5

51 48.2 49.7 53.5 50.4 49.8 50 50.9 51.4 50.9

49 49.4 49.9 50.7 51.4 51.7 50 49.5 50.9 48.2

46.7 49.1 52.8 50.8 54.2 50.2 52.2 53.4 50.5 49.5

50.1 49.7 50.3 51.4 51.3 50 50.2 49.2 51.9 49.9

49 50.2 49.1 50.7 48.7 50.5 49.7 49.5 50.5 48.5

50.7 48.2 49.9 50.9 49.2 51.5 50.5 52.4 51.9 50.3

50 50.6 48 50.5 49.7 49.4 51.9 48.6 51.3 48.9

1. What do you think the data will look like?

2. Enter the data into a list in your calculator.


3. Using your calculator, draw a histogram/boxplot/stemplot of the data. Draw and label a basic sketch of the data.

[Blank axes for the sketch: Frequency, Weight]

4. What is the overall shape of the data? What unique features do you notice? Were you surprised by the results? How does the weight of your own bag (from Activity 1) compare?


5. Individual candies from a previously collected sample of two varieties (Milk Chocolate and Peanut) were weighed.

Candy Weights

Milk Chocolate Peanut

0.92 0.86 0.86 2.45 2.32 2.47

0.87 0.86 0.84 2.08 2.35 2.19

0.88 0.87 0.84 2.17 2.19 2.11

0.93 0.90 0.81 2.40 1.85 2.02

0.88 0.90 0.82 2.45 2.43 2.50

0.88 0.88 0.88 2.32 2.08 2.05

0.86 0.82 0.89 2.92 2.08 1.99

0.75 0.85 0.80 2.45 1.95 2.02

0.80 0.90 0.82 2.28 2.19 2.14

0.92 0.90 0.83 1.90 2.29 2.15

0.89 0.89 0.82 2.16 2.30 2.18

0.92 0.93 0.88 2.44 1.86 2.18

0.90 0.89 0.89 2.13 1.87 2.04

0.91 0.86 0.84 2.04 2.25 2.34

0.88 0.99 0.90 2.07 2.26 2.40

0.86 0.91 0.88 2.07 2.01 1.90

0.80 0.80 0.89 1.96 2.18 2.58

0.87 0.85 0.89 2.24 2.17 2.33

0.93 0.89 0.85 2.05 2.35 2.40

0.81 0.87 0.80 2.45 2.25 1.90


0.84 0.92 0.83 2.71 2.72 2.58

0.90 0.83 0.86 2.53 2.44 2.33

0.83 0.78 0.86 2.16 1.99 2.31

0.85 0.86 0.79 2.63 2.34 2.38

0.87 0.80 0.83 2.02 2.50 1.88

0.93 0.87 0.96 2.14 2.02 2.69

0.97 0.85 0.89 2.43 1.98 2.16

0.82 0.87 0.80 2.63 2.21 2.58

0.86 0.88 0.87 2.01 2.35 1.59

0.76 0.84 0.85 2.13 2.05 2.18

0.82 0.86 0.92 2.15 2.11 2.25

0.86 0.79 0.90 2.34 2.54 2.10

0.82 0.87 0.94 2.10 2.11 2.33

0.90 0.87 0.91 2.00 2.43 2.02

6. How will the distributions of their weights be similar? How will they be different?


Activity 3: Are Numbers and Weights of Candy Associated?

1. The weight and the number of pieces of candy were recorded for different-size bags of a multi-colored candy.

Type         | Weight of Candy (g) | Number of Pieces of Candy
Fun Size     | 20.4                | 23
Regular      | 49.6                | 59
Regular + 10 | 53.3                | 65
King Size    | 90.3                | 106
Big Party    | 644.3               | 765
XXL          | 1245.2              | 1427

2. Draw a scatterplot of Weight vs. Number.

3. Does there seem to be an association? Is it strong? In what direction does it go?

4. Find the least-squares regression line of (Number, Weight). [ y = a + bx ]

5. What does a mean? What does b mean?

[Blank axes for the scatterplot: Number, Weight]
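For checking a calculator result, the least-squares line in step 4 can be sketched in Python. This is only a sketch using the usual formulas b = Sxy/Sxx and a = (mean of y) - b(mean of x); the function name `least_squares` and the variable names are illustrative assumptions and are not part of the assignment.

```python
# Data from the Activity 3 table: x = number of pieces, y = weight in grams.
number = [23, 59, 65, 106, 765, 1427]
weight = [20.4, 49.6, 53.3, 90.3, 644.3, 1245.2]


def least_squares(x, y):
    """Return (a, b, r) for the line y = a + b*x and the correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                   # slope: grams per additional piece
    a = mean_y - b * mean_x         # intercept: predicted weight at 0 pieces
    r = sxy / (sxx * syy) ** 0.5    # correlation coefficient
    return a, b, r


a, b, r = least_squares(number, weight)
print(f"weight = {a:.3f} + {b:.4f} * number   (r = {r:.4f})")
```

The printed values can be compared with the calculator's regression output for the same two lists.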


Activity 4: What happens when we eat the candy?

1. Count the number of pieces of candy in your tube. Place that number in Trial Number 0.

2. (a) Place the candy in the tube. (b) Shake. (c) Pour on the table. (d) Remove the pieces of candy with no mark displayed. (e) Count the remaining pieces of candy. (f) Place that number in Trial Number 1.

3. Repeat step 2 (increasing the trial number by 1) until there is only 1 piece of candy left.

Trial Number           | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Number of Candies Left |   |   |   |   |   |   |   |   |   |

4. Draw a scatterplot of (Trial Number, Number of Candies).

5. Find the least-squares regression line of the transformed data. [ y = a + bx ]

LSRL _______________________________ r² ____________

[Blank axes for the scatterplot: Trial Number, Number of Candies]


Activity 5: The Candy Family

1. If a family decides to have 5 children, how many boys and how many girls will they have?

2. We will use a simulation to discover the distribution of boys and girls.

a. Select 5 pieces of candy. Place them in a cup. Shake the cup and roll the candy on the table. A piece with its mark facing up is a girl; a piece with its mark facing down is a boy.

b. Write the number of girls on a sticky note.

c. Repeat steps a. and b. several times.

d. Form a histogram using the sticky notes.

3. Based on your simulation, what is the most likely distribution of boys and girls? Least likely distribution?

4. Can we calculate the probability of each family distribution mathematically?
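The hands-on simulation in step 2 can also be imitated in software. The sketch below assumes each piece lands mark-up with probability 0.5, which real candy may not satisfy exactly; the function name `simulate_families`, the number of trials, and the seed are all illustrative choices.

```python
import random
from collections import Counter


def simulate_families(trials, children=5, seed=None):
    """Tally the number of girls in each simulated family of `children`."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        # Each child is a "girl" (mark facing up) with assumed probability 0.5.
        girls = sum(rng.random() < 0.5 for _ in range(children))
        counts[girls] += 1
    return counts


counts = simulate_families(trials=1000, seed=1)

# Text histogram, one row per possible number of girls (like the sticky notes).
for girls in range(6):
    print(girls, "#" * (counts[girls] // 10))
```

Running more trials makes the shape of the histogram more stable from run to run.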


CHAPTER 1

Looking at Data—Distributions

1.1 Displaying Distributions with Graphs
1.2 Describing Distributions with Numbers
1.3 The Normal Distributions

Introduction

We begin by using the TI-83 to store and view data sets. In this chapter, we will first see how to make histograms and time plots. Then we will learn how to compute basic statistics such as the mean, variance, standard deviation, median, and quartiles, and how to view data further with boxplots. Lastly, we study the normal distributions and provide a program to compute normal probabilities and graph a normal curve.


1.1 Displaying Distributions with Graphs

We start by using the TI-83 to help us visualize data sets. In this section, we will use the STAT menu to store data sets into lists and the STAT PLOT menu to create histograms and time plots.

Exercise 1.26 Make a histogram of the degree of reading power (DRP) from the following data set acquired in a study of third-grade students.

40 26 39 14 42 18 25 43 46 27 19
47 19 26 35 34 15 44 40 38 31 46
52 25 35 35 33 29 34 41 49 28 52
47 35 48 22 33 41 51 27 14 54 45

Solution: We must first enter the data into the TI-83. (We note that at most 999 measurements can be entered or generated into a list.)

Entering Data into a List

Step 1: Press STAT, then press 1 to call up the STAT Edit screen.

Step 2: In order to clear any data that might be in the lists, press the Up Arrow to highlight L1, press CLEAR, then press ENTER. If desired, highlight L2, press CLEAR, and press ENTER. Then move the cursor back under list L1.

Step 3: To enter the data into list L1, type 40, press ENTER; type 47, press ENTER; continue until the entire data set is entered into L1.

Sorting a List

We can sort the data into increasing order to observe the range of the data. To do so, press STAT, press 2 to obtain the command SortA( on the Home screen. Type L1 (i.e., 2nd 1). Press ENTER to execute the command SortA(L1. Now press STAT, press 1, and observe that the data are now in increasing order. Scroll to the bottom of L1 to see the largest value.
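As a cross-check away from the calculator, the same sort can be sketched in Python. The list name `drp` is just an illustrative stand-in for L1; `sorted` plays the role of SortA(.

```python
# DRP scores from Exercise 1.26 (the role of list L1 on the calculator).
drp = [
    40, 26, 39, 14, 42, 18, 25, 43, 46, 27, 19,
    47, 19, 26, 35, 34, 15, 44, 40, 38, 31, 46,
    52, 25, 35, 35, 33, 29, 34, 41, 49, 28, 52,
    47, 35, 48, 22, 33, 41, 51, 27, 14, 54, 45,
]

# sorted() returns a new list in increasing order, leaving drp unchanged.
drp_sorted = sorted(drp)

# The first and last entries give the range of the data.
print(drp_sorted[0], drp_sorted[-1])
```

Unlike SortA(, which reorders the list in place, this sketch keeps the original order available in `drp`.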


Making a Histogram

Now let us make a histogram of our data set where the x axis ranges from 10 to 60 on a scale of 5. After entering the data into a list, we must adjust the WINDOW and STAT PLOT settings.

Step 1: Press WINDOW and enter the settings for this range (Xmin = 10, Xmax = 60, Xscl = 5). Note that Ymin should always be 0 for a histogram, but that Ymax may need to be adjusted in order to see the top of each bar.

Step 2: Press STAT PLOT (2nd Y=) and press 1 to get to the settings screen for Plot1. Highlight On and press ENTER. Scroll down to Type, then scroll right to highlight the third type (histogram), and press ENTER. Set Xlist to L1 and Freq to 1 (since we count each data point one time).

Step 3: Before graphing, return to the STAT PLOT screen and, if necessary, turn off PLOT2 and PLOT3. Also press Y= and either clear or de-select any functions to prevent them from being graphed.

Step 4: Press GRAPH. You can then press TRACE and move the right arrow cursor along the histogram to see the range of each bar and how many measurements occur in that range. (Note: By adjusting Xscl in the WINDOW, we can alter the shape of the histogram. The graph on the right below uses an Xscl of 3.)
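The bar heights that TRACE reports can be approximated with a short Python sketch. It assumes each bar includes its left edge but not its right edge; the function name `bin_counts` and its parameters (named after the WINDOW settings) are illustrative, not calculator commands.

```python
# DRP scores from Exercise 1.26.
drp = [
    40, 26, 39, 14, 42, 18, 25, 43, 46, 27, 19,
    47, 19, 26, 35, 34, 15, 44, 40, 38, 31, 46,
    52, 25, 35, 35, 33, 29, 34, 41, 49, 28, 52,
    47, 35, 48, 22, 33, 41, 51, 27, 14, 54, 45,
]


def bin_counts(data, xmin, xmax, xscl):
    """Count values per bar [xmin + i*xscl, xmin + (i+1)*xscl)."""
    n_bins = round((xmax - xmin) / xscl)
    counts = [0] * n_bins
    for value in data:
        if xmin <= value < xmax:
            counts[int((value - xmin) // xscl)] += 1
    return counts


counts = bin_counts(drp, xmin=10, xmax=60, xscl=5)
for i, count in enumerate(counts):
    left = 10 + 5 * i
    print(f"{left}-{left + 5}: {'#' * count}")
```

Changing `xscl` (for example to 3, with a matching range) shows how the bar width alters the apparent shape, as the note in Step 4 describes.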

Exercise 1.27 Make a histogram of Cavendish’s measurements of the density of the earth.

5.50 5.61 4.88 5.07 5.26 5.55 5.36 5.29 5.58 5.65
5.57 5.53 5.62 5.29 5.44 5.34 5.79 5.10 5.27 5.39
5.42 5.47 5.63 5.34 5.46 5.30 5.75 5.68 5.85

Solution: So as not to lose our previous data in L1, we will enter the data for Exercise 1.27 into list L2. Clear list L2, enter the data into this list, and then sort the data by entering the command SortA(L2.

We will graph using a range of 4.8 to 6 on a scale of 0.1. After adjusting the WINDOW, set the Xlist to L2 in the STAT PLOT screen. Then press GRAPH.


Frequency Charts

Often data is given in a frequency chart which lists the number of times each measurement occurs, rather than listing each measurement numerous times. A supplemental example is provided here to explain how to work with such data on the TI-83.

Example Make a histogram for the following counts of the number of children living in a household.

Number of children   | 0  | 1  | 2  | 3  | 4  | 5 | 6
Number of households | 60 | 42 | 86 | 59 | 22 | 4 | 2

Solution: To enter the data and not lose our previous data, we will use lists L3 and L4. Clear any data from lists L3 and L4, then enter the counts (children) into L3 and the frequencies (households) into L4.

In the WINDOW, we must set the ranges so that we see the entire histogram. So we set Xmin to 0 and Xmax to 7, on a scale of 1. We set Ymax to 90 with a Yscl of 10. In the STAT PLOT screen, we set Xlist to L3, but then set Freq to L4. After adjusting the settings, press GRAPH and then TRACE.

Storing a List into Another List

We may wish to use lists L1 and L2 for other data sets, but not wish to lose the current data in those lists. If so, we can store the data from these lists into other lists. To move the data from lists L1 and L2 into lists L5 and L6, enter the commands L1→L5 and L2→L6. (The → command is obtained on the Home screen with the STO button.) Observe that the original L1 and L2 data sets are now in lists L5 and L6.


Time plots

We next discuss how to view data observations made over a period of time. Rather than using a histogram, we shall now use a time plot.

Exercise 1.38 Make a time plot of the motor vehicle death rates (number of deaths per 100 million miles driven) in the United States.

Year Rate | Year Rate | Year Rate | Year Rate
1960 5.1  | 1970 4.7  | 1980 3.3  | 1990 2.1
1962 5.1  | 1972 4.3  | 1982 2.8  | 1992 1.7
1964 5.4  | 1974 3.5  | 1984 2.6  | 1994 1.7
1966 5.5  | 1976 3.2  | 1986 2.5  | 1996 1.7
1968 5.2  | 1978 3.3  | 1988 2.3  | 1998 1.6

Solution: We will enter the years into list L1 and the rates into L2. To do so, first follow Steps 1–3 in the solution to Exercise 1.26 above. Then adjust the WINDOW settings so that X ranges between the years listed in L1, and Y ranges appropriately to see the rates in L2.

Next, adjust the STAT PLOT settings. After turning on Plot1, press the down arrow to move to Type, then scroll right to highlight the second type (time plot), and press ENTER. Enter L1 for Xlist and L2 for Ylist. Before graphing, remember to turn off the other plots and either to clear or deselect other functions in the Y= screen. Then press GRAPH and TRACE.

We observe an overall tendency of a decreasing rate of death as the years pass.

Exercise 1.40 The winning times (to the nearest minute) in the Boston Marathon are given below. Make time plots of the men’s and women’s winning times on the same graph.


Men       | Men       | Men       | Women     | Women
Year Time | Year Time | Year Time | Year Time | Year Time
1959 143  | 1974 134  | 1989 129  | 1972 190  | 1987 146
1960 141  | 1975 130  | 1990 128  | 1973 186  | 1988 145
1961 144  | 1976 140  | 1991 131  | 1974 167  | 1989 144
1962 144  | 1977 135  | 1992 128  | 1975 162  | 1990 145
1963 139  | 1978 130  | 1993 130  | 1976 167  | 1991 144
1964 140  | 1979 129  | 1994 127  | 1977 168  | 1992 144
1965 137  | 1980 132  | 1995 129  | 1978 165  | 1993 145
1966 137  | 1981 129  | 1996 129  | 1979 155  | 1994 142
1967 136  | 1982 129  | 1997 131  | 1980 154  | 1995 145
1968 142  | 1983 129  | 1998 128  | 1981 147  | 1996 147
1969 134  | 1984 131  | 1999 130  | 1982 150  | 1997 146
1970 131  | 1985 134  | 2000 130  | 1983 143  | 1998 143
1971 139  | 1986 128  | 2001 130  | 1984 149  | 1999 143
1972 136  | 1987 132  |           | 1985 154  | 2000 146
1973 136  | 1988 129  |           | 1986 145  | 2001 144

Solution: We shall enter the men’s years and times into lists L1 and L2 and the women’s years and times into lists L3 and L4. We can use the seq( command to quickly enter the years into lists L1 and L3. Press LIST (2nd STAT), scroll right to OPS, and press 5 for the sequence command. Then enter the command seq(K,K,1959,2001)→L1. Press 2nd ENTER to retrieve the command, then edit it, and enter the command seq(K,K,1972,2001)→L3. Then manually enter the times into lists L2 and L4.
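For comparison, the effect of the seq( command can be sketched in Python with `range`. The names `men_years` and `women_years` are illustrative stand-ins for L1 and L3; note that `range` excludes its upper bound, so 2002 is used to include 2001.

```python
# Consecutive years, analogous to seq(K,K,1959,2001) stored in L1.
men_years = list(range(1959, 2002))

# Analogous to seq(K,K,1972,2001) stored in L3.
women_years = list(range(1972, 2002))

# The lengths should match the number of winning times to be typed in.
print(len(men_years), len(women_years))
```

The two lengths give a quick check that every year in the table has a matching time in L2 and L4.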

Adjust the window settings, then set the PLOT1 settings for a time plot of L1 and L2, and set the PLOT2 settings for a time plot of L3 and L4.

1.2 Describing Distributions with Numbers

In this section, we will compute the various statistics of a data set including the mean, variance, standard deviation, median, and quartiles. We will also use boxplots and modified boxplots to view these statistics.


Basic Statistics

Exercise 1.74 Using Cavendish’s data from Exercise 1.27, find the mean x̄ and sample deviation s. Also give the five-number summary and create a boxplot to view the spread.

5.50 5.61 4.88 5.07 5.26 5.55 5.36 5.29 5.58 5.65
5.57 5.53 5.62 5.29 5.44 5.34 5.79 5.10 5.27 5.39
5.42 5.47 5.63 5.34 5.46 5.30 5.75 5.68 5.85

Solution: We first enter the data into a list in the STAT EDIT screen. To do so, follow the steps outlined in the previous section in Exercise 1.26. However, if the data is already stored in a list (as we have previously stored it in L6), then we do not need to re-enter it. After data has been entered into a list, we can compute the desired statistics. We will assume the data are in list L6. Press STAT, press the right arrow to display the CALC screen and press 1 to bring the command 1–Var Stats to the Home screen. Press L6 (2nd 6) to obtain the command 1–Var Stats L6. Press ENTER.

After completing the above steps, we receive a display of the desired statistics.

Note: If the data had been entered into a different list, say list L4, then we would use the command 1–Var Stats L4.

Since the data is a sample of measurements, the value of x̄ is the sample mean, though it can be considered to be the true mean of just this data set. Two standard deviation values are given. The first, Sx, is the sample deviation, denoted by s in the text, which is to be used if considering this data to be a sample from a larger population. The second, σx, is the true standard deviation of just this set of measurements. Thus here we can say x̄ ≈ 5.448 and s ≈ 0.221.

The five-number summary is also computed with the 1–Var Stats command. Press the down arrow to scroll down and view these statistics. The five-number summary is as follows: The minimum value is 4.88, the first quartile is 5.295, the median is 5.46, the third quartile is 5.615, and the maximum value is 5.85.

(Since there are 29 measurements, the first quartile is actually the average of the 7th and 8th measurements when listed in increasing order. Here the third quartile is the average of the 22nd and 23rd measurements when listed in increasing order.)
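The statistics above can be reproduced with a short Python sketch. It assumes the quartile rule described in the parenthetical note (each quartile is the median of one half of the sorted data, leaving out the middle value when the count is odd); the function name `five_number_summary` is illustrative.

```python
import statistics

# Cavendish's measurements from Exercise 1.27.
cavendish = [
    5.50, 5.61, 4.88, 5.07, 5.26, 5.55, 5.36, 5.29, 5.58, 5.65,
    5.57, 5.53, 5.62, 5.29, 5.44, 5.34, 5.79, 5.10, 5.27, 5.39,
    5.42, 5.47, 5.63, 5.34, 5.46, 5.30, 5.75, 5.68, 5.85,
]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For an odd count, skip the middle value when forming the upper half.
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], statistics.median(lower), statistics.median(s),
            statistics.median(upper), s[-1])


mean = statistics.mean(cavendish)        # x-bar
sx = statistics.stdev(cavendish)         # sample deviation (divides by n - 1)
sigma_x = statistics.pstdev(cavendish)   # population deviation (divides by n)
summary = five_number_summary(cavendish)

print(round(mean, 3), round(sx, 3), round(sigma_x, 3))
print(summary)
```

Other software may use a different quartile convention, so small differences in Q1 and Q3 from another tool are not necessarily errors.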

Accessing the Statistics

After computing the statistics, the calculator stores their values in the VARS Statistics screen. Whenever needed, we can retrieve the values from this screen.


Example For the data of the previous Exercise 1.74, recall the value of x̄, compute the interval (x̄ − s, x̄ + s), and compute the sample variance s².

Solution: Press CLEAR (or 2nd Quit) to return to the Home screen. To recall x̄, press VARS, press 5 for Statistics, press 2 for x̄, and then press ENTER.

To compute (x̄ − s, x̄ + s), press VARS, press 5, press 2, press –, press VARS, press 5, press 3, and press ENTER. We have just entered the command x̄ – Sx. Now press 2nd ENTER to recall the previous command, edit it to x̄ + Sx, and press ENTER. We see that measurements from about 5.227 to about 5.669 are within one sample deviation of the sample mean.

To compute the sample variance s², first access Sx as above, then press x² to obtain the command Sx². Then press ENTER.
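The same interval and variance can be checked with a brief Python sketch; the variable names `x_bar`, `s`, `interval`, and `variance` are illustrative and simply mirror the stored statistics x̄ and Sx.

```python
import statistics

# Cavendish's measurements from Exercise 1.27.
cavendish = [
    5.50, 5.61, 4.88, 5.07, 5.26, 5.55, 5.36, 5.29, 5.58, 5.65,
    5.57, 5.53, 5.62, 5.29, 5.44, 5.34, 5.79, 5.10, 5.27, 5.39,
    5.42, 5.47, 5.63, 5.34, 5.46, 5.30, 5.75, 5.68, 5.85,
]

x_bar = statistics.mean(cavendish)   # recalled mean
s = statistics.stdev(cavendish)      # recalled sample deviation Sx

# One sample deviation on either side of the mean.
interval = (x_bar - s, x_bar + s)

# Sample variance is the square of the sample deviation.
variance = s ** 2

print(tuple(round(v, 3) for v in interval), round(variance, 4))
```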

Boxplots

We now continue with Exercise 1.74 to make the boxplot. We first set the WINDOW, as with a histogram, to the appropriate range of the measurements (although the boxplot ignores the Y range). In the STAT PLOT screen, first turn on PLOT1 (and turn off all other plots and functions). Press the down arrow to get to Type, then scroll the right arrow to highlight the boxplot (fifth type), and press ENTER. Set Xlist to L6, or whatever list holds the data, and the frequency to 1. Then press GRAPH. Press TRACE and then press the right arrow to see the values of the five-number summary.

Statistics for Data in a Frequency Chart

Example Compute the statistics and make a boxplot for the number of children living in a household.

Number of children   | 0  | 1  | 2  | 3  | 4  | 5 | 6
Number of households | 60 | 42 | 86 | 59 | 22 | 4 | 2


Solution: First clear lists L1 and L2 in the STAT Edit screen, and then enter the children (measurements) into list L1 and the households (frequencies) into list L2.

To compute the statistics, press STAT, scroll to CALC, and press 1. Enter the command 1–Var Stats L1,L2, which means that the measurements in list L1 occur with frequency L2. If the data were stored in other lists, such as L3 and L4, then we would have entered the command 1–Var Stats L3,L4.

We see that there were 275 measurements, with the average number of children per household being 1.858.
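The frequency-weighted calculation can be sketched in Python as a check. The list names `children` and `households` are illustrative stand-ins for the measurement and frequency lists.

```python
import statistics

# Frequency chart: each measurement and how many households reported it.
children = [0, 1, 2, 3, 4, 5, 6]
households = [60, 42, 86, 59, 22, 4, 2]

# Total number of measurements is the sum of the frequencies.
n = sum(households)

# Weighted mean: each measurement counts once per household.
mean = sum(c * f for c, f in zip(children, households)) / n

# Expanding the chart into raw data lets us reuse ordinary statistics.
expanded = [c for c, f in zip(children, households) for _ in range(f)]
median = statistics.median(expanded)

print(n, round(mean, 3), median)
```

Expanding the chart is convenient for small data sets; for large frequencies the weighted formula avoids building a long list.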

For a boxplot, adjust the WINDOW to an appropriate range for X (the Y range can be ignored). Set Xlist and Freq in the STAT PLOT settings, and graph.

Modified Boxplots

Exercise 1.45 Make a modified boxplot following the 1.5 × IQR criterion for the monthly fees (in dollars) paid by a random sample of users of commercial Internet service providers.

20 40 22 22 21 21 20 10 20 20
20 13 18 50 20 18 15 8 22 25
22 10 20 22 22 21 15 23 30 12
9 20 40 22 29 19 15 20 20 20
20 15 19 21 14 22 21 35 20 22

Solution: We will enter the data into list L3. Then we set the WINDOW with X ranging from 0 to 55 (again the boxplot ignores the Y range). In the STAT PLOT menu, press the down arrow to get to Type, then scroll right to highlight the modified boxplot, or fourth type, and press ENTER. Set Xlist to L3 and the frequency to 1. Press GRAPH and then press TRACE to view the bounds of the IQR, as well as the outliers.

We observe that the whiskers extend from 12 to 25, so values beyond them, such as 35, are plotted as outliers.
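The 1.5 × IQR criterion can be sketched in Python to see which fees are flagged. The sketch assumes quartiles are taken as medians of the lower and upper halves of the sorted data; the helper name `quartiles` and the other variable names are illustrative.

```python
import statistics

# Monthly fees (in dollars) from Exercise 1.45.
fees = [
    20, 40, 22, 22, 21, 21, 20, 10, 20, 20,
    20, 13, 18, 50, 20, 18, 15, 8, 22, 25,
    22, 10, 20, 22, 22, 21, 15, 23, 30, 12,
    9, 20, 40, 22, 29, 19, 15, 20, 20, 20,
    20, 15, 19, 21, 14, 22, 21, 35, 20, 22,
]


def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 else s[half:]
    return statistics.median(lower), statistics.median(upper)


q1, q3 = quartiles(fees)
iqr = q3 - q1

# Fences: anything outside them is flagged as an outlier.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = sorted(v for v in fees if v < low_fence or v > high_fence)

# Whiskers reach to the most extreme values that are not outliers.
inside = [v for v in fees if low_fence <= v <= high_fence]
whiskers = (min(inside), max(inside))

print(q1, q3, low_fence, high_fence)
print(whiskers, outliers)
```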


Exercise 1.49 Make side-by-side modified boxplots of the women’s scores and the men’s scores on the Survey of Study Habits and Attitudes.

Women’s scores
154 109 137 115 152 140 154 178 101
103 126 126 137 165 165 129 200 148

Men’s scores
108 140 114 91 180 115 126 92 169 146
109 132 75 88 113 151 70 115 187 104

Solution: In the STAT Edit screen, enter the women’s scores into list L1 and the men’s scores into list L2. (Remember to clear existing data out of these lists before entering new data.) Choose an appropriate WINDOW with an X range that allows you to see the minimum and maximum of both data sets. In the STAT PLOT screen, adjust the settings for both PLOT1 and PLOT2. The GRAPH shows the women’s scores from L1 at the top of the screen.

Computing Statistics for Two Data Sets of Common Size

If we have two data sets with an equal number of measurements (no more than 999), then we can enter the data into the STAT Edit screen in order to compute the statistics simultaneously. If the data sets are of different sizes, then they must be computed separately. In the previous exercise, we would enter the command 1–Var Stats L1 (from the STAT CALC screen) to find the women’s statistics, and enter 1–Var Stats L2 to find the men’s statistics.

But what if we were to ignore the last two men’s scores? Press STAT, press 1. Scroll down list L2, highlight each of the last two men’s scores, and press DEL (delete). Now each data set has 18 measurements.

To compute the statistics simultaneously, press STAT, press the right arrow to display the CALC screen, and press 2 for the command 2–Var Stats. Then type L1,L2 and enter the resulting command 2–Var Stats L1,L2. The screen initially shows the statistics from L1. Scroll down to see the statistics from L2.


1.3 The Normal Distributions

One of the best statistical features on the TI-83 is the built-in normal distribution commands in the DISTR menu. In this section, we shall compute various normal probabilities as well as inverse normal distribution values.

The Normal Distribution and Inverse Normal Commands

For normally distributed measurements X with specified mean μ and standard deviation σ, we can find the proportion of measurements between the values a and b with the command normalcdf(a, b, μ, σ). To find the value x for which P(X ≤ x) equals a desired proportion p, use the command invNorm(p, μ, σ).

Exercise 1.99 The lengths of human pregnancies are normally distributed with a mean of 266 days and a standard deviation of 16 days. (a) What percent of pregnancies last less than 240 days? (b) What percent of pregnancies last between 240 and 270 days? (c) How long do the longest 20% of pregnancies last?

Solution: We can work part (b) directly with the built-in normalcdf( command. Press 2nd VARS (DISTR), then press 2. Type the command normalcdf(240, 270, 266, 16) and press ENTER. We see that around 54.66% of pregnancies last between 240 and 270 days.

For part (a) we first note that 50% of pregnancies will fall below the mean of 266. Thus to find the percentage less than 240, we instead find the percentage between 240 and 266 and then subtract from 50%. Doing so, we see that around 5.2% of pregnancies last less than 240 days.

(c) To find how long the longest 20% of pregnancies last, we must find the value x for which P(X ≤ x) = 0.80. Thus we enter the command invNorm(.80, 266, 16). We see that the longest 20% last about 279.46 days or more.
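For readers without the calculator at hand, the three parts can be checked with Python's standard-library `statistics.NormalDist`. This is a sketch, not the TI-83 method: `cdf` gives the cumulative area, so a difference of two `cdf` values stands in for normalcdf(, and `inv_cdf` stands in for invNorm(. The variable names are illustrative.

```python
from statistics import NormalDist

# Pregnancy lengths: mean 266 days, standard deviation 16 days.
pregnancy = NormalDist(mu=266, sigma=16)

# (a) Proportion lasting less than 240 days (left-tail area).
p_less_240 = pregnancy.cdf(240)

# (b) Proportion lasting between 240 and 270 days.
p_between = pregnancy.cdf(270) - pregnancy.cdf(240)

# (c) Length exceeded by only the longest 20%, i.e. the 80th percentile.
cutoff = pregnancy.inv_cdf(0.80)

print(round(p_less_240, 4), round(p_between, 4), round(cutoff, 2))
```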

To compute normal probabilities, one could also use a program which prompts for the variables. Such a program, NORMDIST, is given below and first must be keyed or downloaded into the TI-83. To execute the NORMDIST program for part (b) of Exercise 1.99, call up the program from the PRGM menu, then enter 266 for MEAN, enter 16 for STANDARD DEV., enter 240 for LOWER BOUND, and enter 270 for UPPER BOUND.

Then if you wish to see a shaded graph of the normal curve, enter 1 for GRAPH?. Otherwise, enter 0. If you enter 1, then the graph of the partially shaded bell curve initially appears, but then it is replaced by the display of the computed probability. To see the graph again, press GRAPH. Then to see the probability value again, press CLEAR.


Again we see that about 54.66% of pregnancies last between 240 and 270 days.

The NORMDIST Program

PROGRAM:NORMDIST
:Disp "MEAN"
:Input M
:Disp "STANDARD DEV."
:Input S
:Disp "LOWER BOUND"
:Input J
:Disp "UPPER BOUND"
:Input K
:If K>J
:Then
:normalcdf(J,K,M,S)→B
:Else
:normalcdf(M,K,M,S)→B
:End
:round(B,4)→B
:Disp "GRAPH?"
:Input Z
:If Z=1
:Then
:PlotsOff
:FnOff
:"normalpdf(X,M,S)"→Y1
:M-3S→Xmin
:M+3S→Xmax
:S→Xscl
:0→Ymin
:Y1(M)→Ymax
:.1→Yscl
:If K>J
:Then
:Shade(0,Y1,J,K)
:Else
:Shade(0,Y1,-1E99,K)
:End
:End
:If J=K
:Then
:Disp "CUMULATIVE=",B+.5
:Disp "RIGHT TAIL=",.5-B
:Disp "BODY="
:If K≥M
:Then
:Disp B
:Else
:Disp -B
:End
:Else
:Disp "PROB=",B
:End

Computing Tail Values

A probability value such as P(X ≤ k) or P(X < k) is called the cumulative area (or left-tail value). A probability value such as P(X ≥ k) or P(X > k) is called the right-tail value. A probability value such as P(μ ≤ X ≤ k) or P(μ ≤ X < k) is called the body region. All three of these values can be computed simultaneously with the NORMDIST program, for any single value k, by entering the value of k for both the LOWER BOUND and the UPPER BOUND. In this case, the cumulative area will be shaded if 1 is entered for GRAPH?.
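The three tail values can also be sketched in Python. This is an illustration only, not a translation of NORMDIST: the function name tail_values is hypothetical, and it relies on the standard-library statistics.NormalDist class.

```python
from statistics import NormalDist


def tail_values(k, mu, sigma):
    """Return (cumulative, right tail, body) for a single value k."""
    cumulative = NormalDist(mu, sigma).cdf(k)   # P(X <= k)
    right_tail = 1.0 - cumulative               # P(X >= k)
    body = abs(cumulative - 0.5)                # area between the mean and k
    return cumulative, right_tail, body


# Pregnancy lengths from Exercise 1.99 with k = 240 days
cumulative, right_tail, body = tail_values(240, 266, 16)
print(round(cumulative, 4), round(right_tail, 4), round(body, 4))
```

The body value is taken as an absolute value so that it is positive whether k lies above or below the mean.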


Alternate Tail Value Commands

An alternate way to compute a left-tail value P(X ≤ k) is with the command

normalcdf(-1E99, k, μ, σ)

A right-tail value P(X ≥ k) can be found with the command

normalcdf(k, 1E99, μ, σ)

The bounds -1E99 and 1E99 are used as "estimates" of -∞ and +∞.

Example Rework part (a) of Exercise 1.99 using the NORMDIST program and the alternate built-in command.

Solution: Call up the NORMDIST program from the PRGM menu, then enter 266 for MEAN, enter 16 for STANDARD DEV., enter 240 for LOWER BOUND, and enter 240 again for UPPER BOUND. Enter 0 for GRAPH?. We obtain a cumulative value of 0.0521; so about 5.21% of pregnancies last less than 240 days.

Using -1E99 as the lower bound, we can also find this left-tail value P(X < 240) with the built-in command normalcdf(-1E99, 240, 266, 16).

Exercise 1.89 Let Z ~ N(0, 1) be the standard normal curve. Shade the areas and find the proportions for the regions (a) Z ≤ -2.25, (b) Z ≥ -2.25, (c) Z > 1.77, (d) -2.25 < Z < 1.77.

Solution: For each part, we use the NORMDIST program, with a MEAN of 0 and a STANDARD DEV. of 1. We can do parts (a) and (b) together by entering -2.25 for both bounds. We see that P(Z ≤ -2.25) = 0.0122, with a very small shaded area at the left tail of the graph. Also P(Z ≥ -2.25) = 0.9878, and its region is the large unshaded portion under the curve.

(c) We rerun the program and enter 1.77 for both bounds. We see that P(Z > 1.77) = 0.0384. In this case, the desired right tail is the unshaded portion of the graph.


(d) We rerun the program and enter bounds of -2.25 and 1.77. We see that P(-2.25 < Z < 1.77) = 0.9494.

Alternatively, we can use the Shade( command from the DRAW menu. First press Y= and enter the normalpdf( command (item 1 from the DISTR menu) into Y1. Complete the function as normalpdf(X). Next set the WINDOW with X ranging from -3 to 3 and Y ranging from 0 to 1/√(2π), the height of the standard normal curve at its peak. For (a), enter the command Shade(0, Y1, -1E99, -2.25). For (b), enter Shade(0, Y1, -2.25, 1E99). For (c), enter Shade(0, Y1, 1.77, 1E99). For part (d), enter Shade(0, Y1, -2.25, 1.77).

We can also use the normalcdf( command if we want just a probability. For part (d), the command normalcdf(-2.25, 1.77, 0, 1) gives a value of 0.949412044.

Exercise 1.91 Let Z ~ N(0, 1) be the standard normal distribution. (a) Find the point z such that Z < z has relative frequency 0.8. (b) Find the point z such that Z > z has relative frequency 0.35.

Solution: We use the invNorm( command from the DISTR menu. For (a), enter the command invNorm(.8,0,1). For part (b), the cumulative probability up to point z has relative frequency 1 - 0.35 = 0.65; thus we enter the command invNorm(.65,0,1).
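The same two quantiles can be sketched in Python, assuming the standard-library statistics.NormalDist class (whose default is the standard normal distribution). The variable names are ours.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# (a) point with 80% of the area to its left
z_a = standard_normal.inv_cdf(0.80)

# (b) point with 35% of the area to its right, so 1 - 0.35 = 0.65 to its left
z_b = standard_normal.inv_cdf(1 - 0.35)

print(round(z_a, 4), round(z_b, 4))
```

Converting a right-tail proportion to a cumulative one before calling the inverse is the same step the solution performs by hand.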

Exercise 1.104 WISC scores are normally distributed with a mean of 100 and a standard deviation of 15. What percentage of scores lie (a) below 100, (b) below 80, (c) above 140, (d) between 100 and 120?

Solution: (a) By the symmetric property of a normal distribution, 50% of the scores must lie below 100. For parts (b) and (c) we use the alternate normalcdf( commands normalcdf(-1E99, 80, 100, 15) and normalcdf(140, 1E99, 100, 15). Part (d) can be computed directly with normalcdf(100, 120, 100, 15).

We see that 9.12% are below 80, 0.383% are above 140, and 40.88% are between 100 and 120.

These values can be verified with the NORMDIST program.


Exercise 1.105 (a) For the WISC scores of the previous exercise, what score will place a child in the top 5%? (b) In the top 1%? (c) What scores, symmetric about the mean, are such that 95% of children fall in between?

Solution: (a) Since 95% of the children will fall below x, we enter the command invNorm(.95, 100, 15). For (b), enter invNorm(.99, 100, 15).

Thus, 5% of children score above 124.67, and 1% of children score above 134.895.

For part (c), there are two values x and y such that P(x ≤ X ≤ y) = 0.95. By the symmetry of the normal curve, the left and right tails each have 2.5%. That is, P(X < x) = 0.025 and P(X < y) = 0.975. Entering the commands invNorm(.025, 100, 15) followed by invNorm(.975, 100, 15), we find that x = 70.6 and y = 129.4.
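As a cross-check on these cutoffs, here is a short Python sketch under the same assumption as before (the standard-library statistics.NormalDist class). The helper name central_interval is hypothetical.

```python
from statistics import NormalDist

wisc = NormalDist(100, 15)


def central_interval(dist, coverage):
    """Bounds, symmetric about the mean, that contain the given coverage."""
    tail = (1 - coverage) / 2          # area left in each tail
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


top_5_cutoff = wisc.inv_cdf(0.95)      # score that places a child in the top 5%
top_1_cutoff = wisc.inv_cdf(0.99)      # score that places a child in the top 1%
low, high = central_interval(wisc, 0.95)

print(round(top_5_cutoff, 2), round(top_1_cutoff, 2), round(low, 1), round(high, 1))
```

Because the curve is symmetric, the two bounds sit the same distance on either side of the mean of 100.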

Assessing Normality and Normal Quantile Plot

Exercise 1.111 Assess the normality of Cavendish's data from Exercise 1.27 by finding the percentages that lie within one, two, and three standard deviations of average. Then make a normal quantile plot.

Solution: Earlier we had saved Cavendish's data by storing it in list L6. Enter the command L6→L1 to recover the data (or re-enter the data into list L1 if you had not previously saved it). From the STAT menu, enter the command SortA(L1) to sort the data into increasing order. Next, compute the statistics by entering the command 1-Var Stats L1 from the STAT CALC screen. As done previously, we must now access the statistics from the VARS Statistics menu to compute the ranges for one, two, and three standard deviations from average. Compute these ranges as x̄ ± s, x̄ ± 2s, and x̄ ± 3s.

Now press STAT and press 1 to return to the list. Scrolling down the list and counting, we see that 22 out of 29, or 75.86%, of the measurements lie within one standard deviation of average, from 5.227 to 5.669. We see that 28 out of 29, or 96.55%, of the measurements lie between 5.006 and 5.89. Finally we see that 100% of the measurements are within three standard deviations of average. These results vary somewhat from the 68-95-99.7 rule.

To make a normal quantile plot, first adjust the WINDOW so that X spans the data and Y ranges from -3 to 3, which covers most of the standard normal curve. In the STAT PLOT screen, press the down arrow to get to TYPE, then scroll right until the sixth type highlights and press ENTER. Set Data List to L1 and Data Axis to X. Then press GRAPH.
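The counting step can be sketched in Python as follows. Cavendish's measurements are not reproduced in this section, so the list below is hypothetical data used only to exercise the function; the name fraction_within is also ours. We assume the sample standard deviation (the Sx value from 1-Var Stats) is the one intended.

```python
from statistics import mean, stdev


def fraction_within(data, k):
    """Fraction of data within k sample standard deviations of the mean."""
    center, spread = mean(data), stdev(data)
    low, high = center - k * spread, center + k * spread
    return sum(low <= x <= high for x in data) / len(data)


# Hypothetical measurements, NOT Cavendish's data
sample = [2, 4, 4, 4, 5, 5, 7, 9]
within = {k: fraction_within(sample, k) for k in (1, 2, 3)}
print(within)
```

Comparing the three fractions with 0.68, 0.95, and 0.997 gives the same rough normality check described above.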

We observe that the data are nearly in a straight line, with only a couple of outliers.


CHAPTER 2

Looking at Data—Relationships

2.1 Scatterplots
2.2 Correlation
2.3 Least-Squares Regression
2.4 Cautions about Regression and Correlation
2.6 Transforming Relationships

Introduction

In this chapter we will graph the relationship between two quantitative variables using scatterplots. We study the correlation and find the least-squares regression line and other regression fits such as an exponential fit. We also look at residual plots for these fits. Lastly, we use the TI-83 to obtain some non-linear regression fits.


2.1 Scatterplots

We first plot two quantitative variables along the x and y axes to see if we can observe a relationship. In particular, we look for a linear relationship.

Exercise 2.7 Make a scatterplot that shows how national wine consumption helps explain heart disease death rates.

Country          Alcohol from wine   Heart disease deaths
Australia        2.5                 211
Austria          3.9                 167
Belgium          2.9                 131
Canada           2.4                 191
Denmark          2.9                 220
Finland          0.8                 297
France           9.1                 71
Iceland          0.8                 211
Ireland          0.7                 300
Italy            7.9                 107
Netherlands      1.8                 167
New Zealand      1.9                 266
Norway           0.8                 227
Spain            6.5                 86
Sweden           1.6                 207
Switzerland      5.8                 115
United Kingdom   1.3                 285
United States    1.2                 199
West Germany     2.7                 172

Solution: We first enter the data into the STAT Edit screen. Clear (or store into other lists) any data from L1 and L2. Enter the amounts of alcohol from wine into L1 and the heart disease deaths into L2. The alcohol amounts will be plotted on the x axis and the heart disease deaths will be plotted on the y axis. Adjust the WINDOW so that the ranges include all scores. Adjust the STAT PLOT settings by highlighting and entering the first Type and setting the appropriate lists. Press GRAPH to see the scatterplot, and then press TRACE if so desired.

We can observe that the number of heart disease deaths generally tends to decrease as the amount of alcohol from wine increases.

Exercise 2.11 Make a scatterplot of mass versus metabolic rate for the females. Make another scatterplot with a different symbol for the males, and then combine the two plots.


Sex   Mass   Rate      Sex   Mass   Rate
M     62.0   1792      F     40.3   1189
M     62.9   1666      F     33.1   913
F     36.1   995       M     51.9   1460
F     54.6   1425      F     42.4   1124
F     48.5   1396      F     34.5   1052
F     42.0   1418      F     51.1   1347
M     47.4   1362      F     41.2   1204
F     50.6   1502      M     51.9   1867
F     42.0   1256      M     46.9   1439
M     48.7   1614

Solution: We first enter the mass and rate of just the females into lists L1 and L2 respectively. Then enter the mass and rate of the males into lists L3 and L4. We adjust the WINDOW so that the X range includes all the masses and the Y range includes all the rates, for both sexes. We adjust the STAT PLOT settings in Plot1 as in the previous exercise to obtain the scatterplot of L1 versus L2.

Next turn off Plot1, turn on Plot2 and adjust its settings with a different Mark for the males, and plot L3 versus L4.

To see the females and males together, turn on both Plot1 and Plot2 and regraph.

Exercise 2.18 Make a plot of the counts of insects trapped against board color. Compute the means for each color, add the means to the plot, and connect the means with a line segment.

Board color     Insects trapped
Lemon yellow    45  59  48  46  38  47
White           21  12  14  17  13  17
Green           37  32  15  25  39  41
Blue            16  11  20  21  14   7


Solution: We will plot the colors on the x axis consecutively as the values 1, 2, 3, and 4. Since there are six measurements for each color, we enter each of the values 1 through 4 a total of six times into list L1. We list the corresponding measurements in list L2.

Next we adjust the WINDOW so that the X range includes the values 1 through 4 and the Y range includes all the measurements. Adjust the STAT PLOT settings and graph.

Next, we enter the values 1 through 4 into list L3 and then compute the mean number of insects trapped for each color; we put the means in list L4. (Since there are only six measurements, it is easy to compute the means "by hand" by summing and dividing by 6.) Turn on Plot2, set it to the second Type, and set the lists to L3 and L4. Graph Plot1 and Plot2 together to see the measurements and the means.
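The "by hand" means can be checked with a few lines of Python. This is only a sketch; the dictionary layout and variable names are our own choice, and the counts are copied from the table in Exercise 2.18.

```python
# Insects trapped on each board color (Exercise 2.18)
insects = {
    "Lemon yellow": [45, 59, 48, 46, 38, 47],
    "White": [21, 12, 14, 17, 13, 17],
    "Green": [37, 32, 15, 25, 39, 41],
    "Blue": [16, 11, 20, 21, 14, 7],
}

# Mean count per color: sum the six measurements and divide by 6
means = {color: sum(counts) / len(counts) for color, counts in insects.items()}

for color, value in means.items():
    print(color, round(value, 2))
```

These four means are the values that would go into list L4, paired with 1 through 4 in list L3.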

2.2 Correlation

In this section, we use the TI-83 to compute the correlation coefficient r and the squared correlation coefficient r² between paired data of quantitative variables.

Diagnostic On

In order to compute the correlation, we first must make sure that the calculator's diagnostics are turned on. To turn the setting on, press 2nd 0 (CATALOG) and scroll down to the DiagnosticOn command. Press ENTER to bring the command to the Home screen, then press ENTER again.


Exercise 2.21 (a) Make a scatterplot. (b) Compute the correlation coefficient r between the heights (in inches) of these dating couples of women and men. (c) How would r change if all the men were 6 inches shorter than the heights given in the table?

Women (x)   66  64  66  65  70  65
Men (y)     72  68  70  68  71  65

Solution: We first enter the data into lists L1 and L2. Clear any data from these lists and enter the heights of the women into list L1 and the heights of the men into L2. Adjust the WINDOW and STAT PLOT, then graph.

To find the correlation, press STAT, scroll right to CALC, press 8 for the command LinReg(a+bx), and press ENTER. We see that r = 0.5653337711.

Note: Regression defaults are for lists L1 and L2. If the data were in other lists, say lists L3 and L4, then we would first press 8 to obtain the command LinReg(a+bx) on the Home screen, and then we would enter the command LinReg(a+bx) L3,L4. The LinReg(ax+b) command (item 4) will also compute the correlation.

For part (c), we enter heights for the males that are all 6 inches shorter in list L3. Then we enter the command LinReg(a+bx) L1,L3. We see that r has not changed.
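To see why r does not change, it may help to compute it directly. The sketch below is plain Python with a hand-written correlation function (a hypothetical helper, not a calculator command), applied to the heights from Exercise 2.21.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


women = [66, 64, 66, 65, 70, 65]
men = [72, 68, 70, 68, 71, 65]

r = correlation(women, men)
r_shifted = correlation(women, [height - 6 for height in men])  # part (c)

print(round(r, 6), round(r_shifted, 6))
```

Subtracting 6 from every height also lowers the mean by 6, so each deviation from the mean, and therefore r, is unchanged.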

Exercise 2.24 Below are the measurements in centimeters of preserved bones of five specimens of archaeopteryx.

Femur     38    56    59    64    74
Humerus   41    63    70    72    84

Here are the same specimens with the femur measured in meters and the humerus in millimeters:

Femur     0.38  0.56  0.59  0.64  0.74
Humerus   410   630   700   720   840


(a) Make a scatterplot of each on the same axes with the x axis from 0 to 75 and the y axis from 0 to 850. (b) Show that the correlation coefficient is the same for the two sets of measurements.

Solution: We first enter the data respectively into lists L1, L2, L3, and L4. Then we adjust the WINDOW and STAT PLOT settings for Plot1 and Plot2. In order to see an unobstructed graph we then enter the command AxesOff from the Catalog. We then graph and observe the noticeable difference in scatterplots.

Finally, we compute the correlation coefficients of both sets of measurements with the commands LinReg(a+bx) L1,L2 and LinReg(a+bx) L3,L4. We see that regardless of the units used, the correlation is the same.
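The unit independence can also be confirmed numerically. The following Python sketch uses a hand-written correlation helper (our own, hypothetical name) and rescales the bone lengths by constant factors.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


femur_cm = [38, 56, 59, 64, 74]
humerus_cm = [41, 63, 70, 72, 84]

# Same specimens in different units: divide by 100, multiply by 10
femur_rescaled = [length / 100 for length in femur_cm]
humerus_rescaled = [length * 10 for length in humerus_cm]

r_original = correlation(femur_cm, humerus_cm)
r_rescaled = correlation(femur_rescaled, humerus_rescaled)

print(round(r_original, 4), round(r_rescaled, 4))
```

Multiplying a variable by a positive constant scales the numerator and the denominator of r by the same factor, so the ratio is unchanged.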

2.3 Least-Squares Regression

In this section we will compute the least-squares line of two quantitative variables and graph it through the scatterplot of the variables. We will also use the line to predict the y-value that should occur for a given x-measurement.

Exercise 2.40 (a) Make a scatterplot of the data and find the correlation. (b) Calculate and graph the least-squares regression line for predicting absorbance from nitrate concentration. Predict the absorbance for 500 mg of nitrates.

Nitrates     50   50   100   200   400   800   1200   1600   2000   2000
Absorbance   7.0  7.5  12.8  24.0  47.0  93.0  138.0  183.0  230.0  226.0

Solution: We make a scatterplot in the usual way. Here we will use lists L5 and L6. Adjust the WINDOW and STAT PLOT settings, and then graph.


Correlation and the Regression Line

We obtain the linear regression line using the same LinReg(a+bx) command that computes the correlation. Press STAT, scroll right to CALC, press 8 to obtain the command LinReg(a+bx) on the Home screen, and then enter the command LinReg(a+bx) L5,L6.

We see that the equation of the least-squares regression line is ŷ = 1.657 + 0.1133x. The values of r and r² are also given. Because of the strong positive linear relationship, we have a correlation very close to 1.
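The coefficients can be reproduced from the usual least-squares formulas. Below is a minimal Python sketch; the function name least_squares and its return order are our own choices, and the data are the nitrate and absorbance values from Exercise 2.40.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return intercept a, slope b, and correlation r for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept
    return a, b, sxy / sqrt(sxx * syy)


nitrates = [50, 50, 100, 200, 400, 800, 1200, 1600, 2000, 2000]
absorbance = [7.0, 7.5, 12.8, 24.0, 47.0, 93.0, 138.0, 183.0, 230.0, 226.0]

a, b, r = least_squares(nitrates, absorbance)
print(round(a, 3), round(b, 4), round(r, 4))
```

The intercept and slope play the roles of a and b in the calculator's LinReg(a+bx) output.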

Graphing the Regression Line

To graph the regression line, we must enter it into the Y= screen. We can type it directly, or we can access this regression function from the VARS Statistics menu. Press Y= and clear any function that might be in Y1. (Make sure the calculator is in Func mode.) Now press VARS, then press 5 for Statistics. Scroll right to EQ, and press 1. The regression function is entered into Y1. Now press GRAPH.

Prediction

To evaluate the line at a specified x, we can now access Y1 from the Y-VARS screen. Press VARS, scroll right to Y-VARS, press 1, then press 1 again. The function Y1 is entered into the Home screen. To evaluate the line at x = 500 (mg of nitrates), enter the command Y1(500).

We obtain a value of 58.3 for the predicted absorbance value.

We can also verify that the point (x̄, ȳ) is on the line. First we compute the statistics. We can do so simultaneously with the 2-Var Stats command from the STAT CALC menu since the two data sets have the same number of points. Enter the command 2-Var Stats L5,L6. Then enter Y1(x̄), by recalling x̄ from the VARS Statistics menu.

We see that Y1(x̄) = ȳ.
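Both the prediction and the check that the line passes through the point of means can be sketched in Python. The function y1 below is a stand-in for the calculator's Y1 and, like the other names, is our own.

```python
nitrates = [50, 50, 100, 200, 400, 800, 1200, 1600, 2000, 2000]
absorbance = [7.0, 7.5, 12.8, 24.0, 47.0, 93.0, 138.0, 183.0, 230.0, 226.0]

n = len(nitrates)
x_bar = sum(nitrates) / n
y_bar = sum(absorbance) / n

# Least-squares slope and intercept
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(nitrates, absorbance)) / sum(
    (x - x_bar) ** 2 for x in nitrates
)
intercept = y_bar - slope * x_bar


def y1(x):
    """Fitted regression line, playing the role of Y1 on the calculator."""
    return intercept + slope * x


predicted = y1(500)
print(round(predicted, 1), round(y1(x_bar), 2), round(y_bar, 2))
```

Since the intercept is defined as the mean of y minus the slope times the mean of x, evaluating the line at the mean of x must return the mean of y.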


Exercise 2.54 Here are data for the number of people (in millions) living on American farms.

Year         1935  1940  1945  1950  1955  1960  1965  1970  1975  1980
Population   32.1  30.5  24.4  23.0  19.1  15.6  12.4  9.7   8.9   7.2

(a) Make a scatterplot and find the least-squares regression line of farm population on year. (b) According to the regression line, on average how much did farm population decline per year during this period? What percent of the observed variation in farm population is accounted for by linear change over time? (c) Use the regression line to predict the number of people living on farms in 1990.

Solution: (a) We first enter the data in L1 (year) and L2 (population), and adjust the WINDOW and STAT PLOT settings. Next we compute the least-squares regression line for farm population on year by entering the command LinReg(a+bx) L1,L2. Then we enter the regression equation into Y1 as above in Exercise 2.40 and press GRAPH to observe the plots.

The least-squares regression line is computed as ŷ ≈ 1166.927 − 0.586788x.

(b) The average farm population decline during this time is given by (7.2 − 32.1) / (1980 − 1935) ≈ −0.553 (millions). Thus the average decline was about 553,000 per year. However, according to the regression line slope, the estimated average decline becomes about 586,788 per year. Since r² = 0.977, we can say that about 97.7% of the observed variation in farm population is accounted for by linear change over time.

(c) The regression line predicts about -780,000 people (or -0.78 million) to be living on farms in 1990. Obviously the result is not possible; thus this regression line should not be used beyond 1980.
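The three parts of this exercise can be traced in a short Python sketch. The helper name least_squares is hypothetical, and the code simply applies the standard least-squares formulas to the table above.

```python
def least_squares(xs, ys):
    """Return intercept a, slope b, and r squared for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx
    return mean_y - b * mean_x, b, sxy ** 2 / (sxx * syy)


years = list(range(1935, 1985, 5))   # 1935, 1940, ..., 1980
population = [32.1, 30.5, 24.4, 23.0, 19.1, 15.6, 12.4, 9.7, 8.9, 7.2]  # millions

a, b, r_squared = least_squares(years, population)
endpoint_rate = (population[-1] - population[0]) / (years[-1] - years[0])
prediction_1990 = a + b * 1990       # extrapolation beyond the data

print(round(a, 3), round(b, 6), round(r_squared, 3))
print(round(endpoint_rate, 3), round(prediction_1990, 2))
```

The negative prediction for 1990 illustrates the danger of extrapolating a linear fit outside the range of the observed years.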

2.4 Cautions about Regression and Correlation

We now provide an exercise to demonstrate how to plot the residuals of a least-squares line.

Exercise 2.62 For the data below, make a scatterplot, compute and graph the least-squares regression line of y on x, and plot the residuals against x.

Amount   Response
0.25     6.55  7.98  6.54  6.37  7.96
1.00     29.7  30.0  30.1  29.5  29.1
5.00     211   204   212   213   205
20.00    929   905   922   928   919


Solution: Since the amounts have five responses, we enter each amount five times into list L1, and enter the corresponding measurements into L2. We then adjust the WINDOW and STAT PLOT settings, and graph.

We next compute the regression line, enter it into Y1, and graph again.

Residual Plot

After computing a regression, the residuals are stored in the LIST screen. Press 2nd STAT (LIST), scroll down to item RESID, and press ENTER to obtain the command LRESID on the Home screen. Enter the command LRESID→L3 to store the residuals into list L3.

Enter the STAT Edit screen and scroll down list L3 to decide upon an appropriate range for the residuals. Set the Y range in the WINDOW to include all the residuals. Adjust the STAT PLOT to a Ylist of L3. Deselect the regression line Y1 in the Y= screen, then press GRAPH.
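A residual is simply the observed response minus the value predicted by the line. The Python sketch below computes the residual list for Exercise 2.62; the helper name least_squares is ours, and the list residuals corresponds to what the calculator stores in RESID.

```python
def least_squares(xs, ys):
    """Return intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - b * mean_x, b


# Each amount is repeated once per response, as in lists L1 and L2
amounts = [0.25] * 5 + [1.00] * 5 + [5.00] * 5 + [20.00] * 5
responses = [
    6.55, 7.98, 6.54, 6.37, 7.96,
    29.7, 30.0, 30.1, 29.5, 29.1,
    211, 204, 212, 213, 205,
    929, 905, 922, 928, 919,
]

a, b = least_squares(amounts, responses)
residuals = [y - (a + b * x) for x, y in zip(amounts, responses)]

print(round(min(residuals), 2), round(max(residuals), 2))
```

The smallest and largest residuals printed here give a suitable Y range for the residual plot. Least-squares residuals always sum to zero, up to rounding.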

2.6 Transforming Relationships

Paired data are often related in a manner that is not linear. For example, many growth patterns experience an exponential or power law relationship. In this section we shall use the TI-83 to calculate and graph these non-linear fits.

Exercise 2.93 The following data are commonly used values for the decay of the isotope iodine-125.

Days   Activity    Days   Activity    Days   Activity    Days   Activity
0      1.000       60     0.500       120    0.250       180    0.125
20     0.794       80     0.397       140    0.198       200    0.099
40     0.630       100    0.315       160    0.157       220    0.079


Make a scatterplot to determine whether a linear, exponential, or power model best fits these data. Find the appropriate regression function to predict activity after x days.

Solution: We enter the data into lists, adjust our window settings, and graph. The shape of the resulting plot appears to be exponential decay.

To compute the regression function, enter the STAT CALC screen and enter item 0 for ExpReg. Enter the command ExpReg L1,L2 to obtain ŷ = 0.999933 × (0.988512)^x.
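An exponential fit of the form y = a·b^x can be obtained by fitting a straight line to (x, ln y). The Python sketch below takes that approach; it assumes this is how the calculator's ExpReg arrives at its coefficients, and the function name exp_regression is our own.

```python
from math import exp, log


def exp_regression(xs, ys):
    """Fit y = a * b**x by least squares on the points (x, ln y)."""
    n = len(xs)
    logs = [log(y) for y in ys]
    mean_x, mean_log = sum(xs) / n, sum(logs) / n
    slope = sum((x - mean_x) * (v - mean_log) for x, v in zip(xs, logs)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_log - slope * mean_x
    return exp(intercept), exp(slope)   # a and b


days = list(range(0, 240, 20))          # 0, 20, ..., 220
activity = [1.000, 0.794, 0.630, 0.500, 0.397, 0.315,
            0.250, 0.198, 0.157, 0.125, 0.099, 0.079]

a, b = exp_regression(days, activity)
print(round(a, 4), round(b, 4))
```

A base b slightly below 1 reflects decay: each additional day multiplies the activity by about b.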

Exercise 2.99 Maria’s savings bond is initially worth $500 and earns interest at 7.5% each year.

(a) Find the value of the bond at the end of 1 year, 2 years, and so on up to 10 years.
(b) Plot the value y against years x and connect the points with a smooth curve.
(c) Take the logarithm of each of the values y and plot the logarithm against years x.

Solution: We can use the seq( command to quickly enter the years and values into lists L1 and L2. Press LIST (2nd STAT), scroll right to OPS, and press 5 for the sequence command. Then enter the commands seq(K,K,0,10)→L1 and seq(500*(1.075)^K,K,0,10)→L2.

From list L2 we can observe the value after each year.

(b) Now compute the exponential regression function with the ExpReg command from the STAT CALC screen. Enter the regression function into Y1, set the window range, turn on the STAT PLOT settings for a scatterplot and graph.


(c) To compute the logarithms of the values, enter the command log(L2)→L3. Now adjust the Y settings in the WINDOW for these new values. Change the STAT PLOT settings for a scatterplot of L1 vs. L3 and graph.

The plot of log y against years is now a straight line.
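The straight-line behavior follows from log(500 · 1.075^k) = log 500 + k · log 1.075. A short Python sketch of the lists for this exercise makes this visible; the variable names are ours and stand in for L1, L2, and L3.

```python
from math import log10

years = list(range(0, 11))                       # 0 through 10, like L1
values = [500 * 1.075 ** k for k in years]       # bond value each year, like L2
log_values = [log10(v) for v in values]          # common logarithms, like L3

# Year-to-year change in the logarithm; constant steps mean a straight line
steps = [later - earlier for earlier, later in zip(log_values, log_values[1:])]

print(round(values[1], 2), round(values[10], 2))
print(round(min(steps), 6), round(max(steps), 6))
```

Every step equals log 1.075, which is the slope of the line in the plot of log y against years.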

Exercise 2.101 (a) Make a scatterplot of United States population (in millions) against time and find an exponential fit. (b) Then plot the logarithms of population against time.

Date   Pop.     Date   Pop.     Date   Pop.      Date   Pop.
1790   3.9      1850   23.2     1910   92.0      1970   203.3
1800   5.3      1860   31.4     1920   105.7     1980   226.5
1810   7.2      1870   39.8     1930   122.8     1990   248.7
1820   9.6      1880   50.2     1940   131.7     2000   281.4
1830   12.9     1890   62.9     1950   151.3
1840   17.1     1900   76.0     1960   179.3

Solution: (a) Enter the dates into L1 and the populations into L2, adjust the WINDOW settings, and adjust the Plot1 settings for a scatterplot of L1 versus L2. Then enter the command ExpReg L1,L2. We obtain an exponential fit of ŷ ≈ (1.211 × 10^−15) × 1.0204^x.

Now enter the regression equation into Y1. (Press Y=, clear Y1, press VARS, press 5 for Statistics, scroll right to EQ, and press 1.) Then press GRAPH.

(b) To store the logarithms of the populations into L3, enter the command log(L2)→L3. Then view L3 to see the new range before readjusting the WINDOW. Deselect the function Y1 in the Y= screen by highlighting = and pressing ENTER, then adjust Plot1 to graph L1 versus L3.


We see that the plot of the logarithms of population against time is not exactly linear, due to the inexact exponential fit in part (a).

Exercise 2.105 The following table gives the average weight and average lifespan for several species of mammals. Fit a power law model to these data. Use the fitted model to predict the average lifespan for humans (average weight 65 kg).

Species        Weight (kg)   Lifespan (years)
Baboon         32            20
Beaver         25            5
Cat            2.5           12
Chimpanzee     45            20
Dog            8.5           12
Elephant       2800          35
Goat           30            8
Gorilla        140           20
Grizzly bear   250           25
Guinea pig     1             4
Hippopotamus   1400          41
Horse          480           20
Lion           180           15
Mouse          0.024         3
Pig            190           10
Red fox        6             7
Sheep          30            12

Solution: First enter the weights and the lifespans into lists, then adjust the window and observe the scatterplot. Then retrieve the PwrReg command (item A in the STAT CALC screen) and enter the command PwrReg L1,L2 (or use whichever lists hold the data). We obtain a power law fit of ŷ ≈ (5.77717) x^0.2182.

Due to the relatively low value of r², we do not have a very good fit. To see the graph of the power law model, enter this regression equation into Y1 (press Y=, clear Y1, press VARS, press 5 for Statistics, scroll right to EQ, and press 1) and re-graph. Finally evaluate the predicted lifetime Y1 for a weight of 65 kg. (To bring Y1 to the Home screen, press VARS, scroll right to Y-VARS, press 1, then press 1 again.)
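A power law y = a·x^b becomes a straight line after taking logarithms of both variables, so it can be fitted by least squares on (ln x, ln y). The Python sketch below does this for the mammal data; it assumes the calculator's PwrReg works the same way, and the name power_regression is hypothetical.

```python
from math import exp, log


def power_regression(xs, ys):
    """Fit y = a * x**b by least squares on the points (ln x, ln y)."""
    n = len(xs)
    log_x = [log(x) for x in xs]
    log_y = [log(y) for y in ys]
    mean_x, mean_y = sum(log_x) / n, sum(log_y) / n
    b = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y)) / sum(
        (u - mean_x) ** 2 for u in log_x
    )
    return exp(mean_y - b * mean_x), b   # a and b


weights = [32, 25, 2.5, 45, 8.5, 2800, 30, 140, 250,
           1, 1400, 480, 180, 0.024, 190, 6, 30]      # kg
lifespans = [20, 5, 12, 20, 12, 35, 8, 20, 25,
             4, 41, 20, 15, 3, 10, 7, 12]              # years

a, b = power_regression(weights, lifespans)
human_prediction = a * 65 ** b           # predicted lifespan at 65 kg

print(round(a, 3), round(b, 4), round(human_prediction, 1))
```

The predicted lifespan for a 65 kg mammal comes out far below actual human lifespans, which is the point of the closing remark.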

We thankfully see that humans are an exception to the rule!