65590668 Understandable Statistics Solutions Manual Sosany2
Part IV: Complete Solutions, Chapter 1
Copyright © Houghton Mifflin Company. All rights reserved.
Chapter 1: Getting Started
Section 1.1
1. Individuals are people or objects included in the study, whereas a variable is a characteristic of the individual that is measured or observed.
2. Nominal data are always qualitative.
3. A parameter is a numerical measure that describes a population. A statistic is a numerical value that describes a sample.
4. If the population does not change, a parameter will not change. Thus, for a fixed population, parameter values are constant. However, if we take three samples of the same size from a population, the values of the sample statistics will almost surely differ.
5. (a) The variable is the response regarding frequency of eating at fast-food restaurants.
(b) The variable is qualitative. The categories are the number of times one eats in fast-food restaurants.
(c) The implied population is responses for all adults in the United States.
6. (a) The variable is miles per gallon.
(b) The variable is quantitative because arithmetic operations can be applied to the mpg values.
(c) The implied population is gasoline mileage for all new cars.
7. (a) The variable is the nitrogen concentration (milligrams of nitrogen/liter of water).
(b) The variable is quantitative because arithmetic operations can be applied to nitrogen concentration.
(c) The implied population is all the lakes in the wetlands.
8. (a) The variable is the number of ferromagnetic artifacts per 100 square meters.
(b) The variable is quantitative because arithmetic operations can be applied to the number of artifacts.
(c) The implied population is the number of ferromagnetic artifacts per each distinct 100-square-meter plot in the Tara region.
9. (a) Length of time to complete an exam is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and a time of 0 is the starting point for all measurements.
(b) Time of first class is an interval level of measurement. The data may be arranged in order, and differences are meaningful.
(c) Major field of study is a nominal level of measurement. The data consist of names only.
(d) Course evaluation scale is an ordinal level of measurement. The data may be arranged in order.
(e) Score on last exam is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and a score of 0 is the starting point for all measurements.
(f) Age of student is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and an age of 0 is the starting point for all measurements.
10. (a) Salesperson’s performance is an ordinal level of measurement. The data may be arranged in order.
(b) Price of company’s stock is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and a price of 0 is the starting point for all measurements.
(c) Names of new products is a nominal level of measurement. The data consist of names only.
(d) Room temperature is an interval level of measurement. The data may be arranged in order, and differences are meaningful.
(e) Gross income is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and an income of 0 is the starting point for all measurements.
(f) Color of packaging is a nominal level of measurement. The data consist of names only.
11. (a) Species of fish is a nominal level of measurement. Data consist of names only.
(b) Cost of rod and reel is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and a cost of 0 is the starting point for all measurements.
(c) Time of return home is an interval level of measurement. The data may be arranged in order, and differences are meaningful.
(d) Guidebook rating is an ordinal level of measurement. Data may be arranged in order.
(e) Number of fish caught is a ratio level of measurement. The data may be arranged in order, differences and ratios are meaningful, and 0 fish caught is the starting point for all measurements.
(f) Temperature of the water is an interval level of measurement. The data may be arranged in order, and differences are meaningful.
12. Form B would be better. Statistical methods can be applied to the ordinal data obtained from Form B but not to the answers obtained from Form A.
13. (a) Answers vary. Ideally, weigh the packs in pounds using a digital scale that has tenths of pounds for accuracy.
(b) Some students may refuse to allow the weighing.
(c) Informing students before class may cause students to remove items before class.
Section 1.2
1. In stratified samples, we select a random sample from each stratum. In cluster sampling, we randomly select clusters to be included, and then each member of the selected cluster is sampled.
2. In simple random samples, every sample of size n has an equal chance of being selected. In a systematic sample, the only possible samples are those including every kth member of the population with respect to the random starting position.
3. Sampling error is the difference between the value of the population parameter and the value of the sample statistic that stems from the random selection process. Certainly, larger boxes of cereal will cost more than small boxes of cereal.
4. (a) Yes, your seating location and the randomized coin flip ensure equal chances of being selected.
(b) No, not with the described method of selection. This is not a simple random sample; it is a cluster sample.
(c) Simply assign each student a number 1, 2, . . . , 40 and use a computer or a random-number table to select 20 students.
5. Simply use a computer or random-number table to randomly select n students from the class after numbers are assigned.
(a) Answers vary. Perhaps they are excellent students who make a special effort to get to class early.
(b) Answers vary. Perhaps they are busy students who are never on time to class.
(c) Answers vary. Perhaps students in the back row are introverted.
(d) Answers vary. Perhaps tall students generally are healthier.
6. (a) Sick students and those who are skipping class cannot be sampled.
(b) Home-schooled students, homeless students, and dropouts cannot be sampled.
7. Answers vary.
8. Answers vary.
9. Answers vary.
10. Answers vary. Perhaps use 0, 1, 2, 3, 4 to indicate H and 5, 6, 7, 8, 9 to indicate T.
11. (a) It is appropriate. Certainly we can roll a 1 more than once in 20 rolls. The fourth roll was 2.
(b) No, simulated rolls of the die are random events, and we certainly would expect a different sequence.
12. Answers vary. We do expect at least one match on birthdays in more than 50% of the times we run this experiment.
13. Answers vary.
14. Answers vary.
15. (a) This technique is simple random sampling. Every sample of size n from the population has an equal chance of being selected, and every member of the population has an equal chance of being included in the sample.
(b) This technique is cluster sampling. The state, Hawaii, is divided into ZIP Codes. Then, within each of the 10 selected ZIP Codes, all businesses are surveyed.
(c) This technique is convenience sampling. This technique uses results or data that are conveniently and readily obtained.
(d) This technique is systematic sampling. Every fiftieth business is included in the sample.
(e) This technique is stratified sampling. The population was divided into strata based on business type. Then a simple random sample was drawn from each stratum.
16. (a) This technique is stratified sampling. The population was divided into strata (four categories of length of hospital stay), and then a simple random sample was drawn from each stratum.
(b) This technique is simple random sampling.
(c) This technique is cluster sampling. There are five geographic regions, and some facilities from each region are selected randomly. Then, for each selected facility, all patients on the discharge list are surveyed to create the patient satisfaction profiles.
(d) This technique is systematic sampling. Every 500th patient is included in the sample.
(e) This technique is convenience sampling. This technique uses results or data that are conveniently and readily obtained.
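The four random sampling techniques named in this section can be compared side by side in a short sketch. Everything here is illustrative rather than taken from the text: the class of 40 numbered students, the grouping into five rows of eight, and all function names are assumptions.

```python
import random

def simple_random_sample(population, n, rng):
    """Every sample of size n has the same chance of being chosen."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then take every kth."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(0)                    # fixed seed so the run is repeatable
population = list(range(1, 41))           # hypothetical class of 40 students
rows = {f"row{r}": population[8 * r: 8 * (r + 1)] for r in range(5)}

srs = simple_random_sample(population, 20, rng)
systematic = systematic_sample(population, 4, rng)
stratified = stratified_sample(rows, 2, rng)
clustered = cluster_sample(rows, 2, rng)

print("simple random:", sorted(srs))
print("systematic:   ", systematic)
print("stratified:   ", stratified)
print("cluster:      ", clustered)
```

The same grouping (rows) serves as strata in one call and as clusters in another, which highlights the difference: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.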
Section 1.3
1. Answers vary. People with higher incomes likely will have high-speed Internet access, which will lead to spending more time on-line. Spending more time on-line might lead to spending less time watching TV. Thus, spending less time watching TV cannot be attributed solely to high income or high-speed Internet access.
2. A double-blind procedure would entail neither the patients nor those administering the treatments knowing which patients received which treatments. This process should eliminate potential bias from the treatment administrators and from patient psychology regarding benefits of the drug.
3. (a) This is an observational study because observations and measurements of individuals are conducted in a way that doesn’t influence the response variable being measured.
(b) This is an experiment because a treatment is deliberately imposed on the bighorn sheep in order to observe a possible change in heartworm prevention.
(c) This is an experiment because a treatment is deliberately imposed on the fishermen in order to observe a possible change in the length of fish in the river.
(d) This is an observational study because observations of the turtles are conducted in a way that doesn’t change the response being measured.
4. (a) Sampling was used in the hospitals.
(b) A computer simulation was used to mimic flight.
(c) A census was used because all data were used by the NFL.
(d) This was an experiment; patients were assigned a treatment, and the change in precancerous lesions was measured.
5. (a) Use random selection to pick 10 calves to inoculate. After inoculation, test all calves to see if there is a difference in resistance to infection between the two groups. No placebo is being used.
(b) Use random selection to pick nine schools to visit. After the police visits, survey all the schools to see if there is a difference in views between the two groups. No placebo is being used.
(c) Use random selection to pick 40 volunteers for the skin patch with the drug. Then record the smoking habits of all volunteers to see if a difference exists between the two groups. A placebo patch is used for the remaining 35 volunteers in the second group.
6. (a) “Over the last few years” could mean 2 years, 3 years, 7 years, etc. A more precise phrase is “Over the past 5 years.”
(b) If a respondent is first asked, “Have you ever run a stop sign,” chances are that their response to the question, “Should fines be doubled,” will change. Those who run stop signs probably don’t want the fine to double.
(c) When only yes or no are possible, most people likely will choose no. When rarely, sometimes, and frequently are possible, most people likely will choose rarely or sometimes.
7. Based on the information, Scheme A will be better because the blocks are similar. The plots bordering the river should be similar, and the plots away from the river should be similar.
Chapter 1 Review
1. (a) Stratified
(b) All undergraduates at the specific campus studied
(c) The variable is number of hours worked. It is quantitative. It is a ratio.
(d) The variable is career applicability. It is qualitative. It is ordinal.
(e) It is a statistic.
(f) The nonresponse rate is 60%, and it most likely will introduce bias into the study because those who do not answer may have different experiences than those who do answer.
(g) Probably not. These results are most applicable only to the campus in the study.
2. The implied population is all the listeners (or even all the voters). The variable is the voting preference of a caller. There is probably bias in the selection of the sample because those with the strongest opinions are most likely to call in.
3. Using the random-number table, pick seven digits at random. Digits 0, 1, and 2 can correspond to “Yes,” and digits 3, 4, 5, 6, 7, 8, and 9 can correspond to “No.” This will effectively simulate a random draw from a population with 30% TIVO owners.
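The digit-coding scheme in answer 3 can be tried with a pseudorandom generator standing in for the random-number table. This is only a sketch: the function name, the seed, and the large second run are assumptions added for illustration.

```python
import random

def simulate_owners(n_draws, rng):
    """Draw random digits 0-9; digits 0, 1, 2 mean "Yes", digits 3-9 mean "No"."""
    digits = [rng.randrange(10) for _ in range(n_draws)]
    answers = ["Yes" if d <= 2 else "No" for d in digits]
    return digits, answers

rng = random.Random(2024)   # arbitrary seed, used only for repeatability

# Seven draws, as in the answer above.
digits, answers = simulate_owners(7, rng)
print(digits)
print(answers)

# With many draws the share of "Yes" should settle near 3/10.
_, many = simulate_owners(100_000, rng)
share_yes = many.count("Yes") / len(many)
print(f"share of Yes in 100,000 draws: {share_yes:.3f}")
```

Three of the ten equally likely digits map to "Yes", which is why the long-run share approaches 30%.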
4. (a) Cluster
(b) Convenience
(c) Systematic
(d) Simple random
(e) Stratified
5. (a) This is an observational study because no treatments were applied.
(b) This is an experiment because a treatment was applied (test type) and the results then were compared.
6. (a) Randomly select 500 donors to receive the literature and 500 donors to receive the phone call. After the donation collection period, compare the average amount or total amount collected from each of the two treatment groups.
(b) Randomly select the 43 adults to be given the treatment gel and the 42 adults to receive the placebo gel. After the treatment period, compare the whiteness of the two groups. To make this double blind, neither the treatment administrators nor the patients would know which gel the patients are receiving.
7. Questions should be worded in a clear, concise, and unbiased manner. No questions should be misleading. Commonsense rules should be stated for any numerical answers.
8. No response required.
9. (a) This is an experiment; the treatment was the amount of light given to the colonies.
(b) We can assume that the normal-light group is the control group because this simulates normal light patterns for the fireflies. Therefore, the constant-light group is the treatment group.
(c) Number of fireflies alive at the end of the study
(d) Ratio
Chapter 2: Organizing Data
Section 2.1
1. Class limits are possible data values, and they specify the span of data values that fall within a class. Class boundaries are not possible data values; they are values halfway between the upper class limit of one class and the lower class limit of the next class.
2. Each data value must fall into one class. Data values above 50 do not have a class.
3. The classes overlap. A data value such as 20 falls into two classes.
4. These class widths are 11.
5. (a) Yes.
(b) [Figure: Histogram of Highway mpg. x axis: Highway mpg (class boundaries 16.5, 20.5, 24.5, 28.5, 32.5, 36.5, 40.5); y axis: Frequency]
6. (a) [Figure: Histogram of Salary. x axis: Salaries (class boundaries 23.5, 69.5, 115.5, 161.5, 207.5, 253.5); y axis: Frequency]
(b) Yes. Yes.
(c) [Figure: Histogram of Salary. x axis: Salaries (class boundaries 23.5, 32.5, 41.5, 50.5, 59.5, 68.5); y axis: Frequency]
7. (a) Class width = 25
(b)
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
236–260       235.5–260.5       248       4          0.07                4
261–285       260.5–285.5       273       9          0.16                13
286–310       285.5–310.5       298       25         0.44                38
311–335       310.5–335.5       323       16         0.28                54
336–360       335.5–360.5       348       3          0.05                57
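The columns of a frequency table like the one in 7(b) follow mechanically from the class limits and the class frequencies. A minimal sketch, assuming whole-number data (so each boundary lies 0.5 beyond a limit); the variable names are illustrative:

```python
from itertools import accumulate

# Class limits and frequencies from the table for problem 7.
limits = [(236, 260), (261, 285), (286, 310), (311, 335), (336, 360)]
freqs = [4, 9, 25, 16, 3]

total = sum(freqs)
cumulative = list(accumulate(freqs))
class_width = limits[1][0] - limits[0][0]   # difference of consecutive lower limits

rows = []
for (lo, hi), f, cf in zip(limits, freqs, cumulative):
    rows.append({
        "limits": (lo, hi),
        "boundaries": (lo - 0.5, hi + 0.5),  # assumes data recorded as whole numbers
        "midpoint": (lo + hi) / 2,
        "frequency": f,
        "relative": round(f / total, 2),
        "cumulative": cf,
    })

print("class width:", class_width)
for row in rows:
    print(row)
```

Note that the class width is the distance between consecutive lower limits (261 − 236 = 25), not the upper limit minus the lower limit of one class.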
(c) [Figure: Histogram of Finish Times. x axis: Finish Times (236.0, 260.5, 285.5, 310.5, 335.5, 360.0); y axis: Frequency]
(d) [Figure: Histogram of Finish Times. x axis: Finish Times (236.0, 260.5, 285.5, 310.5, 335.5, 360.0); y axis: Relative Frequency (percent)]
(e) This distribution is slightly skewed to the left but fairly mound-shaped, symmetric. (f)
8. (a) Class width = 11
(b)
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
45–55         44.5–55.5         50        3          0.0429              3
56–66         55.5–66.5         61        7          0.1000              10
67–77         66.5–77.5         72        22         0.3143              32
78–88         77.5–88.5         83        26         0.3714              58
89–99         88.5–99.5         94        9          0.1286              67
100–110       99.5–110.5        105       3          0.0429              70
(c) [Figure: Histogram of GLUCOSE. x axis: GLUCOSE (class boundaries 44.5, 55.5, 66.5, 77.5, 88.5, 99.5, 110.5); y axis: Frequency]
(d) [Figure: Histogram of GLUCOSE. x axis: GLUCOSE (same class boundaries); y axis: Relative Frequency (percent)]
(e) Approximately mound-shaped, symmetric.
(f) To create the ogive, place a dot on the x axis at the lower class boundary of the first class, and then, for each class, place a dot above the upper class boundary value at the height of the cumulative frequency for the class. Connect the dots with line segments.
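The ogive construction described in (f) amounts to pairing class boundaries with running totals. A small sketch using the glucose frequencies from the table in 8(b); the function name is an assumption, and plotting is left out so the example needs no graphics library:

```python
from itertools import accumulate

# Class boundaries and frequencies for the glucose data (problem 8).
boundaries = [44.5, 55.5, 66.5, 77.5, 88.5, 99.5, 110.5]
freqs = [3, 7, 22, 26, 9, 3]

def ogive_points(boundaries, freqs):
    """Return (x, y) dots: height 0 at the first lower boundary, then the
    cumulative frequency above each upper class boundary."""
    cumulative = [0] + list(accumulate(freqs))
    return list(zip(boundaries, cumulative))

points = ogive_points(boundaries, freqs)
for x, y in points:
    print(f"({x}, {y})")
```

Connecting consecutive points with line segments gives the ogive; the last point's height equals the total number of data values.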
9. (a) Class width = 12
(b)
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
1–12          0.5–12.5          6.5       6          0.14                6
13–24         12.5–24.5         18.5      10         0.24                16
25–36         24.5–36.5         30.5      5          0.12                21
37–48         36.5–48.5         42.5      13         0.31                34
49–60         48.5–60.5         54.5      8          0.19                42
(c) [Figure: Histogram of Time Until Recurrence. x axis: Time Until Recurrence (class boundaries 0.5, 12.5, 24.5, 36.5, 48.5, 60.5); y axis: Frequency]
(d) [Figure: Histogram of Time Until Recurrence. x axis: Time Until Recurrence (same class boundaries); y axis: Relative Frequency (percent)]
(e) The distribution is bimodal. (f) To create the ogive, place a dot on the x axis at the lower class boundary of the first class, and then, for
each class, place a dot above the upper class boundary value at the height of the cumulative frequency for the class. Connect the dots with line segments.
10. (a) Class width = 28.
(b)
Class Limits  Class Boundaries  Midpoint  Frequency  Cumulative Frequency
10–37         9.5–37.5          23.5      7          7
38–65         37.5–65.5         51.5      25         32
66–93         65.5–93.5         79.5      26         58
94–121        93.5–121.5        107.5     9          67
122–149       121.5–149.5       135.5     5          72
150–177       149.5–177.5       163.5     0          72
178–205       177.5–205.5       191.5     1          73
(c) [Figure: Histogram of Depth. x axis: Depth (class boundaries 9.5, 37.5, 65.5, 93.5, 121.5, 149.5, 177.5, 205.5); y axis: Frequency]
(d) [Figure: Histogram of Depth. x axis: Depth (same class boundaries); y axis: Relative Frequency (percent)]
(e) This distribution is skewed right with a possible outlier. (f) To create the ogive, place a dot on the x axis at the lower class boundary of the first class, and then, for
each class, place a dot above the upper class boundary value at the height of the cumulative frequency for the class. Connect the dots with line segments.
11. (a) Class width = 9
(b)
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
10–18         9.5–18.5          14        6          0.11                6
19–27         18.5–27.5         23        26         0.47                32
28–36         27.5–36.5         32        20         0.36                52
37–45         36.5–45.5         41        1          0.02                53
46–54         45.5–54.5         50        2          0.04                55
(c) [Figure: Histogram of MPGAL. x axis: MPGAL (class boundaries 9.5, 18.5, 27.5, 36.5, 45.5, 54.5); y axis: Frequency]
(d) [Figure: Histogram of MPGAL. x axis: MPGAL (same class boundaries); y axis: Relative Frequency (percent)]
(e) This distribution is skewed right. (f) To create the ogive, place a dot on the x axis at the lower class boundary of the first class, and then,
for each class, place a dot above the upper class boundary value at the height of the cumulative frequency for the class. Connect the dots with line segments.
12. (a) Class width = 6
(b)
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
0–5           -0.5–5.5          2.5       13         0.24                13
6–11          5.5–11.5          8.5       15         0.27                28
12–17         11.5–17.5         14.5      11         0.20                39
18–23         17.5–23.5         20.5      3          0.05                42
24–29         23.5–29.5         26.5      6          0.11                48
30–35         29.5–35.5         32.5      4          0.07                52
36–41         35.5–41.5         38.5      2          0.04                54
42–47         41.5–47.5         44.5      1          0.02                55
(c) [Figure: Histogram of Three-Syllable Words. x axis: Three-Syllable Words (class boundaries -0.5, 5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5, 47.5); y axis: Frequency]
(d) [Figure: Histogram of Three-Syllable Words. x axis: Three-Syllable Words (same class boundaries); y axis: Relative Frequency (percent)]
(e) The distribution is skewed right.
(f) To create the ogive, place a dot on the x axis at the lower class boundary of the first class, and then, for each class, place a dot above the upper class boundary value at the height of the cumulative frequency for the class. Connect the dots with line segments.
13. (b)
Class Limits  Class Boundaries  Midpoint  Frequency
46–85         45.5–85.5         65.5      4
86–125        85.5–125.5        105.5     5
126–165       125.5–165.5       145.5     10
166–205       165.5–205.5       185.5     5
206–245       205.5–245.5       225.5     5
246–285       245.5–285.5       265.5     3
[Figure: Histogram of Tonnes. x axis: class boundaries 0.455, 0.855, 1.255, 1.655, 2.055, 2.455, 2.855; y axis: Frequency]
(c)
Class Limits  Class Boundaries  Midpoint  Frequency
0.46–0.85     0.455–0.855       0.655     4
0.86–1.25     0.855–1.255       1.055     5
1.26–1.65     1.255–1.655       1.455     10
1.66–2.05     1.655–2.055       1.855     5
2.06–2.45     2.055–2.455       2.255     5
2.46–2.85     2.455–2.855       2.655     3
14. (b)
Class Limits  Class Boundaries  Midpoint  Frequency
107–149       106.5–149.5       128       3
150–192       149.5–192.5       171       4
193–235       192.5–235.5       214       3
236–278       235.5–278.5       257       10
279–321       278.5–321.5       300       6
[Figure: Histogram of Average. x axis: class boundaries 0.1065, 0.1495, 0.1925, 0.2355, 0.2785, 0.3215; y axis: Frequency]
(c)
Class Limits  Class Boundaries  Midpoint  Frequency
0.107–0.149   0.1065–0.1495     0.128     3
0.150–0.192   0.1495–0.1925     0.171     4
0.193–0.235   0.1925–0.2355     0.214     3
0.236–0.278   0.2355–0.2785     0.257     10
0.279–0.321   0.2785–0.3215     0.300     6
15. (a) 1 (b) About 5/51 = 0.098 = 9.8%
(c) 650 to 750
16. [Figure: Dotplot of Finish Times. Axis: Finish Times, tick marks from 234 to 360 in steps of 18]
The dotplot shows some of the characteristics of the histogram, such as more dot density from 280 to 340, for instance, that corresponds roughly to the histogram bars of heights 25 and 16. However, the dotplot and histogram are somewhat difficult to compare because the dotplot can be thought of as a histogram with one value, the class mark (i.e., the data value), per class. Because the definitions of the classes (and therefore the class widths) differ, it is difficult to compare the two figures.
17. [Figure: Dotplot of Months. Axis: Months, tick marks from 0 to 56 in steps of 8]
The dotplot shows some of the characteristics of the histogram, such as the concentration of most of the data in two peaks, one from 13 to 24 and another from 37 to 48. However, the dotplot and histogram are somewhat difficult to compare because the dotplot can be thought of as a histogram with one value, the class mark (i.e., the data value), per class. Because the definitions of the classes (and therefore the class widths) differ, it is difficult to compare the two figures.
Section 2.2
1. A Pareto chart because it shows the five conditions in their order of importance to employees.
2. A time-series graph because the pattern of stock prices over time is more relevant than just the frequency of a specific range of closing prices.
3. [Figure: Bar Graph for Income vs Education. Categories: 9th Grade, High School Graduate, Associate Degree, Bachelor Degree, Master Degree, Doctoral Degree; vertical axis: Income, 0 to 90,000]
4. [Figure: Pareto Chart of Deaths (1000s) vs Country. Countries, apparently in decreasing order of bar height: Korea, Mexico, Portugal, United States, New Zealand, Poland, Hungary, Canada, Switzerland, Australia, France, Japan, Germany, Iceland, Denmark, Spain, Netherlands, Italy, U.K., Sweden; vertical axis: Deaths (1000s), 0 to 25]
5. [Figure: Pareto Chart of Metric Tons vs Fish Species. Species in decreasing order of bar height: Walleye Pollock, Pacific Cod, Flatfish, Rockfish, Sablefish; vertical axis: Metric Tons, 0 to 80]
6. (a) [Figure: Pareto Chart of Number of Spearheads vs River. Rivers in decreasing order of bar height: Shannon, Bann, Erne, Barrow, Blackwater; vertical axis: Number of Spearheads, 0 to 35]
(b) [Figure: Pie Chart of Number of Spearheads. Shannon 37.1%, Bann 21.3%, Erne 16.9%, Barrow 15.7%, Blackwater 9.0%]
7. [Figure: Pie Chart of Hiding Places. Closet 68.0%, Under Bed 23.0%, Bathtub 6.0%, Freezer 3.0%]
8. [Figure: Pie Chart of Professor Activities. Teaching 51.0%, Research 16.0%, College Service 11.0%, Community Service 11.0%, Consulting 6.0%, Professional Growth 5.0%]
9. (a) [Figure: Pareto Chart of Crime Rate vs. Crime Type. Crime types in decreasing order of bar height: House Burglary, Motor Vehicle Theft, Assault, Robbery, Rape, Murder; vertical axis: Crime Rate Per 100,000, 0 to 900]
(b) Yes, but the graph would take into account only these particular crimes and would not indicate if multiple crimes occurred during the same incident.
10. (a) [Figure: Pareto Chart of Complaints. Complaints in decreasing order of bar height: Tailgating, No Turn Signal, Being Cut Off, Others Drive Too Slow, Others Inconsiderate; vertical axis: Percent, 0 to 25]
(b) Since the percentages do not add to 100%, a circle graph cannot be used. If we create an “other” category and assume that all other respondents fit this category, then a circle graph could be created.
11.
[Figure: Time Series Plot of Elevation. x axis: Year (1986 to 2000); y axis: Elevation (3795 to 3820)]
12. [Figure: Time Series Plot of Height. x axis: Age (0.5, then 1.0 to 14.0 in steps of 1); y axis: Height (25 to 65)]
Section 2.3
1. (a) The smallest value is 47 and the largest value is 97, so we need stems 4, 5, 6, 7, 8, and 9. Use the tens digit as the stem and the ones digit as the leaf.
Longevity of Cowboys
4 | 7 = 47 years
4 | 7
5 | 2 7 8 8
6 | 1 6 6 8 8
7 | 0 2 2 3 3 5 6 7
8 | 4 4 4 5 6 6 7 9
9 | 0 1 1 2 3 7
(b) Yes, these cowboys certainly lived long lives, as evidenced by the high frequency of leaves for stems 7, 8, and 9 (i.e., 70-, 80-, and 90-year-olds).
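The rule in part (a), tens digit as stem and ones digit as leaf, can be sketched in a few lines. The ages below are read back from the display above; the function name is an assumption.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Build display lines using the tens digit as stem and the ones digit as leaf."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    lines = []
    # Include every stem between the smallest and largest so gaps stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Ages read back from the cowboy longevity display.
ages = [47, 52, 57, 58, 58, 61, 66, 66, 68, 68,
        70, 72, 72, 73, 73, 75, 76, 77,
        84, 84, 84, 85, 86, 86, 87, 89,
        90, 91, 91, 92, 93, 97]

lines = stem_and_leaf(ages)
print("\n".join(lines))
```

Listing empty stems matters for displays such as the wetlands data in the next problem, where a stem with no leaves shows a gap in the distribution.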
2. The largest value is 91 (percent of wetlands lost) and the smallest value is 9 (percent), which is coded as 09. We need stems 0 to 9. Use the tens digit as the stem and the ones digit as the leaf. The percentages are concentrated from 20% to 50%. These data are fairly symmetric, perhaps slightly skewed right. There is a gap showing that none of the lower 48 states has lost from 10% to 19% of its wetlands.
Percent of Wetlands Lost
4 | 0 = 40%
0 | 9
1 |
2 | 0 3 4 7 7 8
3 | 0 1 3 5 5 5 6 7 8 8 9
4 | 2 2 6 6 6 8 9 9
5 | 0 0 0 2 2 4 6 6 9 9
6 | 0 7
7 | 2 3 4
8 | 1 5 7 7 9
9 | 0 1
3. The longest average length of stay is 11.1 days in North Dakota, and the shortest is 5.2 days in Utah. We need stems from 5 to 11. Use the digit(s) to the left of the decimal point as the stem and the digit to the right as the leaf.
Average Length of Hospital Stay
5 | 2 = 5.2 days
 5 | 2 3 5 5 6 7
 6 | 0 2 4 6 6 7 7 8 8 8 8 9 9
 7 | 0 0 0 0 0 0 1 1 1 2 2 2 3 3 3 3 4 4 5 5 6 6 8
 8 | 4 5 7
 9 | 4 6 9
10 | 0 3
11 | 1
The distribution is skewed right.
4. Number of Hospitals per State
0 | 8 = 8 hospitals
 0 | 8                15 |
 1 | 1 2 5 6 9        16 | 2
 2 | 1 7 7            17 | 5
 3 | 5 7 8            18 |
 4 | 1 2 7            19 | 3
 5 | 1 2 3 9          20 | 9
 6 | 1 6 8            21 |
 7 | 1                22 | 7
 8 | 8                23 | 1 6
 9 | 0 2 6 8
10 | 1 2 7            42 | 1
11 | 3 3 7 9          43 |
12 | 2 3 9            44 | 0
13 | 3 3 6
14 | 8
Texas and California have the highest number of hospitals, 421 and 440, respectively. Both states have large populations and large areas. The four largest states by area are Alaska, Texas, California, and Montana.
5. (a) The longest time during 1961–1980 is 23 minutes (i.e., 2:23), and the shortest time is 9 minutes (2:09). We need stems 0, 1, and 2. We’ll use the tens digit as the stem and the ones digit as the leaf, placing leaves 0, 1, 2, 3, and 4 on the first stem and leaves 5, 6, 7, 8, and 9 on the second stem.
Minutes Beyond 2 Hours (1961–1980)
0 | 9 = 9 minutes past 2 hours
0 | 9 9
1 | 0 0 2 3 3
1 | 5 5 6 6 7 8 8 9
2 | 0 2 3 3
(b) The longest time during the period 1981–2000 was 14 (2:14) and the shortest was 7 (2:07), so we’ll need stems 0 and 1 only.
Minutes Beyond 2 Hours (1981–2000)
0 | 7 = 7 minutes past 2 hours
0 | 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9
1 | 0 0 1 1 4
(c) There were seven times under 2:15 during 1961–1980, and there were 20 times under 2:15 during 1981–2000.
6. (a) The largest (worst) score in the first round was 75; the smallest (best) score was 65. We need stems 6 and 7. Leaves 0–4 go on the first stem, and leaves 5−9 belong on the second stem.
First-Round Scores
6 | 8 = score of 68
6 | 5 6 7 7
7 | 0 1 1 1 1 1 1 1 1 1 1 2 2 2 3 3 3 3 4 4 4
7 | 5 5 5 5 5 5 5
(b) The largest score in the fourth round was 74, and the smallest was 68. Here we need stems 6 and 7.
Fourth-Round Scores
6 | 8 = score of 68
6 | 8 9 9 9 9 9
7 | 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 4 4 4
(c) Scores are lower in the fourth round. In the first round, both the low and high scores were more extreme than in the fourth round.
7. The largest value in the data is 29.8 mg of tar per cigarette smoked, and the smallest value is 1.0. We will need stems from 1 to 29, and we will use the numbers to the right of the decimal point as the leaves.
Milligrams of Tar per Cigarette
1 | 0 = 1.0 mg tar
 1 | 0
 2 |
 3 |
 4 | 1 5
 5 |
 6 |
 7 | 3 8
 8 | 0 6 8
 9 | 0
10 |
11 | 4
12 | 0 4 8
13 | 7
14 | 1 5 9
15 | 0 1 2 8
16 | 0 6
17 | 0
...
29 | 8
8. The largest value in the data set is 23.5 mg of carbon monoxide per cigarette smoked, and the smallest is 1.5. We need stems from 1 to 23, and we’ll use the numbers to the right of the decimal point as leaves.
Milligrams of Carbon Monoxide
1 | 5 = 1.5 mg CO
 1 | 5
 2 |
 3 |
 4 | 9
 5 | 4
 6 |
 7 |
 8 | 5
 9 | 0 5
10 | 0 2 2 6
11 |
12 | 3 6
13 | 0 6 9
14 | 4 9
15 | 0 4 9
16 | 3 6
17 | 5
18 | 5
...
23 | 5
9. The largest value in the data set is 2.03 mg of nicotine per cigarette smoked. The smallest value is 0.13. We will need stems 0, 1, and 2. We will use the number to the left of the decimal point as the stem and the first number to the right of the decimal point as the leaf. The digit two places to the right of the decimal point (the hundredths digit) will be truncated (not rounded).
Milligrams of Nicotine per Cigarette
0 | 1 = 0.1 milligram
0 | 1 4 4
0 | 5 6 6 6 7 7 7 8 8 9 9 9
1 | 0 0 0 0 0 0 0 1 2
1 |
2 | 0
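Truncating the hundredths digit, rather than rounding it, changes which leaf some values receive. The sketch below uses Python's decimal module so that values such as 1.26 are handled exactly; the function name and the sample values other than 2.03 and 0.13 are illustrative assumptions.

```python
from decimal import Decimal, ROUND_DOWN

def stem_leaf_truncated(value):
    """Split a reading such as "1.26" into (stem, leaf), truncating the
    hundredths digit instead of rounding it."""
    d = Decimal(value).quantize(Decimal("0.1"), rounding=ROUND_DOWN)
    stem, leaf = str(d).split(".")
    return int(stem), int(leaf)

# The largest and smallest values quoted above, plus two hypothetical readings.
for reading in ["2.03", "0.13", "1.26", "0.96"]:
    stem, leaf = stem_leaf_truncated(reading)
    print(f"{reading} -> {stem} | {leaf}")
```

With truncation, 1.26 is placed at stem 1, leaf 2, and 0.96 at stem 0, leaf 9; rounding would have moved them to leaf 3 and to stem 1, leaf 0, respectively.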
10. (a) For Site I, the least depth is 25 cm, and the greatest depth is 110 cm. For Site II, the least depth is 20 cm, and the greatest depth is 125 cm.
(b) The Site I depth distribution is fairly symmetric, centered near 70 cm. Site II is fairly uniform in shape except that there is a huge gap with no artifacts from 70 to 100 cm.
(c) It would appear that Site II probably was unoccupied during the time period associated with 70 to 100 cm.
Chapter 2 Review
1. (a) Bar graphs, Pareto charts, pie charts
(b) All, but quantitative data must be categorized to use a bar graph, Pareto chart, or pie chart.
2. A time-series graph because a change over time is most relevant.
3. Any large gaps between bars or stems might indicate potential outliers.
4. Dotplots and stem-and-leaf displays both show every data value. Stem-and-leaf plots group the data with the same stem, whereas dotplots only group the data with identical values.
5. (a) Figure 2-1(a) (in the text) is essentially a bar graph with a “horizontal” axis showing years and a “vertical” axis showing miles per gallon. However, in depicting the data as a highway and showing them in perspective, the ability to correctly compare bar heights visually has been lost. For example, determining what would appear to be the bar heights by measuring from the white line on the road to the edge of the road along a line drawn from the year to its mpg value, we get the bar height for 1983 to be approximately ⅞ inch and the bar height for 1985 to be approximately 1⅜ inches (i.e., 11/8 inches). Taking the ratio of the given bar heights, we see that the bar for 1985 should be 27.5/26 ≈ 1.06 times the length of the 1983 bar. However, the measurements show a ratio of (11/8)/(7/8) = 11/7 ≈ 1.57; i.e., the 1985 bar is (visually) about 1.6 times the length of the 1983 bar. Also, the years are evenly spaced numerically, but the figure shows the more recent years to be more widely spaced owing to the use of perspective.
(b) Figure 2-1(b) is a time-series graph showing the years on the x axis and miles per gallon on the y axis. Everything is to scale and not distorted visually by the use of perspective. It is easy to see the mpg standards for each year, and you also can see how fuel economy standards for new cars have changed over the 8 years shown (i.e., a steep increase in the early years and a leveling off in the later years).
6. (a) We estimate the 1980 prison population at approximately 140 prisoners per 100,000 and the 1997 population at approximately 440 prisoners per 100,000 people.
(b) The number of inmates per 100,000 increased every year.
(c) The population 266,574,000 is 2,665.74 × 100,000, and 444 per 100,000 is \(\frac{444}{100{,}000}\).
So \(\frac{444}{100{,}000} \times (2{,}665.74 \times 100{,}000) \approx 1{,}183{,}589\) prisoners.
The projected 2020 population is 323,724,000, or 3,237.24 × 100,000.
So \(\frac{444}{100{,}000} \times (3{,}237.24 \times 100{,}000) \approx 1{,}437{,}335\) prisoners.
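The scaling in part (c) can be checked with a short Python sketch. The function name `expected_prisoners` is an illustrative choice, not something from the text; it simply applies a per-100,000 rate to a population and rounds to a whole number.

```python
def expected_prisoners(population: int, rate_per_100k: float) -> int:
    """Apply a rate given per 100,000 people to a whole population."""
    return round(population * rate_per_100k / 100_000)

# Populations and the rate of 444 per 100,000 are taken from part (c).
current = expected_prisoners(266_574_000, 444)
projected = expected_prisoners(323_724_000, 444)
print(current, projected)
```

Both calls use the same rate, so the projection assumes the incarceration rate stays at 444 per 100,000.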
7. Owing to rounding, the percentages are slightly different from those in the text.
[Figure: Pie Chart of Tax Return Difficulties. IRS Jargon 43.4%; Deductions 28.3%; Correct Form 10.1%; Unknown 10.1%; Calculation 8.1%]
8. (a) Since the ages are two-digit numbers, use the ten’s digit as the stem and the one’s digit as the leaf.
Age of DUI Arrests
1 6 = 16 years
1 6 8
2 0 1 1 2 2 2 3 4 4 5 6 6 6 7 7 7 9
3 0 0 1 1 2 3 4 4 5 5 6 7 8 9
4 0 0 1 3 5 6 7 7 9 9
5 1 3 5 6 8
6 3 4
(b) The largest age is 64 and the smallest is 16, so the class width for seven classes is \(\frac{64 - 16}{7} \approx 6.86\); use 7. The lower class limit for the first class is 16; the lower class limit for the second class is 16 + 7 = 23. The total number of data points is 50, so calculate the relative frequency by dividing the class frequency by 50.
Age Distribution of DUI Arrests
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
16–22         15.5–22.5         19        8          0.16                8
23–29         22.5–29.5         26        11         0.22                19
30–36         29.5–36.5         33        11         0.22                30
37–43         36.5–43.5         40        7          0.14                37
44–50         43.5–50.5         47        6          0.12                43
51–57         50.5–57.5         54        4          0.08                47
58–64         57.5–64.5         61        3          0.06                50
The class boundaries are the average of the upper class limit of one class and the lower class limit of the next class. The midpoint is the average of the class limits for that class.
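The construction of this table can be sketched in Python. The ages below are read off the stem-and-leaf display in part (a); the function name `frequency_table` and the dictionary keys are illustrative. The class-width rule (divide the range by the number of classes, then move up to the next whole number) is assumed to be the convention the solution follows.

```python
import math

# Ages read off the stem-and-leaf display of DUI arrests (50 values).
ages = [16, 18,
        20, 21, 21, 22, 22, 22, 23, 24, 24, 25, 26, 26, 26, 27, 27, 27, 29,
        30, 30, 31, 31, 32, 33, 34, 34, 35, 35, 36, 37, 38, 39,
        40, 40, 41, 43, 45, 46, 47, 47, 49, 49,
        51, 53, 55, 56, 58,
        63, 64]

def frequency_table(data, n_classes):
    """Build class limits, boundaries, midpoints, and frequencies for integer data."""
    lowest, highest = min(data), max(data)
    # Assumed convention: (max - min) / classes, increased to the next whole number.
    width = math.floor((highest - lowest) / n_classes) + 1
    rows, cumulative = [], 0
    for i in range(n_classes):
        lower = lowest + i * width
        upper = lower + width - 1
        freq = sum(lower <= value <= upper for value in data)
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "relative": freq / len(data),
            "cumulative": cumulative,
        })
    return rows

table = frequency_table(ages, 7)
for row in table:
    print(row)
```

The half-unit offsets for the boundaries work here because the data are whole numbers; data recorded to one decimal place would need offsets of 0.05 instead.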
(c) [Figure: Histogram of Age. Horizontal axis: Age, with class boundaries 15.5, 22.5, 29.5, 36.5, 43.5, 50.5, 57.5, 64.5; vertical axis: Frequency, 0 to 12]
(d) This distribution is skewed right.
9. (a) The largest value is 142 mm, and the smallest value is 69 mm. For seven classes, we need a class width of \(\frac{142 - 69}{7} \approx 10.4\); use 11. The lower class limit of the first class is 69, and the lower class limit of the second class is 69 + 11 = 80. The class boundaries are the average of the upper class limit of one class and the lower class limit of the next higher class. The midpoint is the average of the class limits for that class. There are 60 data values total, so the relative frequency is the class frequency divided by 60.
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Frequency
69–79         68.5–79.5         74        2          0.03                2
80–90         79.5–90.5         85        3          0.05                5
91–101        90.5–101.5        96        8          0.13                13
102–112       101.5–112.5       107       19         0.32                32
113–123       112.5–123.5       118       22         0.37                54
124–134       123.5–134.5       129       3          0.05                57
135–145       134.5–145.5       140       3          0.05                60
(b) [Figure: Histogram of Circumference. Horizontal axis: Circumference, with class boundaries 68.5, 79.5, 90.5, 101.5, 112.5, 123.5, 134.5, 145.5; vertical axis: Frequency, 0 to 25]
(c) [Figure: Histogram of Circumference. Same horizontal axis; vertical axis: Relative Frequency in percent, 0 to 40]
(d) This distribution is skewed left.
(e) The ogive begins on the x axis at the lower class boundary and connects dots placed at (x, y) coordinates (upper class boundary, cumulative frequency).
10. (a) General torts occur most frequently.
[Figure: Pareto Chart of Filings vs Type. Bars in decreasing order: Torts, Contracts, Asbestos, Other Product, All Other; vertical axis: Filings (1000s), 0 to 200]
(b) [Figure: Pie Chart of Filings. Torts 47.0%; Contracts 26.4%; Asbestos 12.1%; Other Product 9.4%; All Other 5.2%]
11. (a) To determine the decade that contained the most samples, count both rows (if shown) of leaves; recall that leaves 0–4 belong on the first line and leaves 5–9 belong on the second line when two lines per stem are used. The greatest number of leaves is found on stem 124, i.e., the 1240s (the 40s decade in the 1200s), with 40 samples.
(b) The number of samples with tree-ring dates 1200 to 1239 A.D. is 28 + 3 + 19 + 25 = 75.
(c) The dates of the longest interval with no sample values are 1204 through 1211 A.D. This might mean that for these 8 years, the pueblo was unoccupied (thus no new or repaired structures), or that the population remained stable (no new structures needed), or that, say, weather conditions were favorable those years, so existing structures didn’t need repair. If relatively few new structures were built or repaired during this period, their tree rings might have been missed during sample selection.
Chapter 3: Averages and Variation
Calculations may vary slightly owing to rounding.
Section 3.1
1. The middle value is the median. The most frequent value is the mode. The mean takes all values into account.
2. The symbol for the sample mean is \(\bar{x}\), and the symbol for the population mean is μ.
3. For a mound-shaped, symmetric distribution, the mean, median, and mode all will be equal.
4. (a) Mean, median, and mode if it exists
(b) Mode if it exists
(c) Mean, median, and mode if it exists
5. (a) Mode = 5, the most common value
Median = 4, the middle value in the ordered data set
Mean = \(\frac{2 + 3 + 4 + 5 + 5}{5} = \frac{19}{5} = 3.8\)
(b) Only the mode
(c) All three make sense.
(d) The mode and the median
6. (a) Mode = 2, the most common value
Median = 3, the middle value in the ordered data set
Mean = \(\frac{2 + 2 + 3 + 6 + 10}{5} = 4.6\)
(b) Mode = 7, median = 8, mean = 9.6, using the same techniques as part (a)
(c) Each statistic was increased by 5. In general, adding a constant c to each value in a data set results in the mode, median, and mean increasing by c.
7. (a) Mode = 2, the most common value
Median = 3, the middle value in the ordered data set
Mean = \(\frac{2 + 2 + 3 + 6 + 10}{5} = 4.6\)
(b) Mode = 10, median = 15, mean = 23, using the same techniques as part (a)
(c) Each statistic was multiplied by 5. In general, multiplying each value in a data set by a constant c results in the mode, median, and mean being multiplied by c.
(d) Mode = 177.8 cm, median = 172.72 cm, mean = 180.34 cm
8. (a) If the largest data value is replaced by a larger value, the mean will increase because the sum of the data values will increase. The median will not change because the same value will still be in the eighth position when the data are ordered.
(b) If the largest value is replaced by a smaller value (but still higher than the median), the mean will decrease because the sum of the data values will decrease. The median will not change because the same value will be in the eighth position in increasing order.
(c) If the largest value is replaced by a value that is smaller than the median, the mean will decrease because the sum of the data values will decrease. The median also will decrease because the former value in the eighth position will move to the ninth position in increasing order. The median will be the new value in the eighth position.
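The shift and scale rules from Problems 6 and 7 can be verified numerically. The sketch below uses Python's standard `statistics` module on the data set 2, 2, 3, 6, 10 from those problems; the helper name `summary` is illustrative.

```python
from statistics import mean, median, mode

data = [2, 2, 3, 6, 10]  # data set from Problems 6 and 7

def summary(values):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(values), median(values), mean(values)

shifted = [x + 5 for x in data]  # Problem 6(b): add 5 to every value
scaled = [x * 5 for x in data]   # Problem 7(b): multiply every value by 5

print(summary(data))
print(summary(shifted))
print(summary(scaled))
```

Each statistic of `shifted` is 5 more than the original, and each statistic of `scaled` is 5 times the original, matching parts (c) of both problems.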
9. Mean = \(\frac{146 + 152 + \cdots + 144}{14} \approx 167.3\)
To compute the median, first order the data set smallest to largest. Then
Median = \(\frac{168 + 174}{2} = 171\)
Mode = most common value = 178
10. Mean = \(\frac{111}{18} \approx 6.167\)
Median = \(\frac{5 + 7}{2} = 6\)
Mode = 7
11. First, organize the data from smallest to largest. Then compute the mean, median, and mode.
(a) Upper Canyon
1 1 1 2 3 3 3 3 4 6 9
Mean = \(\bar{x} = \frac{\Sigma x}{n} = \frac{36}{11} \approx 3.27\)
Median = 3 (middle value)
Mode = 3 (occurs most frequently)
(b) Lower Canyon
0 0 1 1 1 1 2 2 3 6 7 8 13 14
Mean = \(\bar{x} = \frac{\Sigma x}{n} = \frac{59}{14} \approx 4.21\)
Median = \(\frac{2 + 2}{2} = 2\)
Mode = 1 (occurs most frequently)
(c) The mean for the Lower Canyon is greater than that of the Upper Canyon. However, the median and mode for the Lower Canyon are less than those of the Upper Canyon.
(d) 5% of 14 is 0.7, which rounds to 1. So eliminate one data value from the bottom of the list and one from the top. Then compute the mean of the remaining 12 values.
5% trimmed mean = \(\frac{\Sigma x}{n} = \frac{45}{12} = 3.75\)
Now this value is closer to the Upper Canyon mean.
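The trimming procedure in part (d) can be sketched as follows. The function name `trimmed_mean` is illustrative, and rounding the number of values to cut with Python's built-in `round` is an assumption that matches the "0.7 rounds to 1" step above (note that `round` sends exact halves to the nearest even integer).

```python
def trimmed_mean(values, percent):
    """Drop `percent` of the values from each end of the sorted data, then average."""
    ordered = sorted(values)
    k = round(len(ordered) * percent / 100)  # values to remove from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

lower_canyon = [0, 0, 1, 1, 1, 1, 2, 2, 3, 6, 7, 8, 13, 14]  # data from part (b)
print(trimmed_mean(lower_canyon, 5))
```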
12. (a) Mean = \(\bar{x} = \frac{\Sigma x}{n} = \frac{1050}{40} \approx 26.3\) years
Median = \(\frac{25 + 26}{2} = 25.5\) years
Mode = 25
(b) The three averages are close, so each represents the age fairly accurately. There may be one high outlier (37), so the median may be the best measure.
13. (a) Mean = \(\bar{x} = \frac{2723}{20}\) ≈ $136.20
Median = \(\frac{65 + 68}{2}\) = $66.50
Mode = $60.00
(b) 5% of 20 data values is 1, so we remove the smallest and largest values and recompute the mean.
Mean = \(\bar{x} = \frac{2183}{18}\) ≈ $121.30. The trimmed mean is still much larger than the median.
(c) Reporting the median certainly will give the customer a much lower figure for the daily cost, but that really doesn’t tell the whole story. Reporting the mean and the median, as well as the high outliers, may be the most useful description of the situation.
14. Weighted average = \(\frac{\Sigma xw}{\Sigma w} = \frac{92(0.25) + 81(0.225) + 93(0.225) + 85(0.30)}{1} = 87.65\)
15. Weighted average = \(\frac{\Sigma xw}{\Sigma w} = \frac{9(2) + 7(3) + 6(1) + 10(4)}{2 + 3 + 1 + 4} = \frac{85}{10} = 8.5\)
16. (a) Weighted average = \(\frac{\Sigma xw}{\Sigma w} = \frac{64.1(0.38) + 75.8(0.47) + 23.9(0.07) + 68.2(0.08)}{1} \approx 67.1\) mg/L
(b) Since 67.1 mg/L is greater than 58 mg/L, this wetlands system does not meet the target standard for the chlorine compound. The average chlorine compound mg/L is too high.
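The weighted averages in Problems 14 through 16 all follow the same formula, Σxw / Σw. A minimal Python sketch, with the illustrative function name `weighted_average`, reproduces the three results from the values and weights shown above.

```python
def weighted_average(values, weights):
    """Compute sum(x * w) / sum(w)."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

grades = weighted_average([92, 81, 93, 85], [0.25, 0.225, 0.225, 0.30])             # Problem 14
ratings = weighted_average([9, 7, 6, 10], [2, 3, 1, 4])                             # Problem 15
chlorine = weighted_average([64.1, 75.8, 23.9, 68.2], [0.38, 0.47, 0.07, 0.08])     # Problem 16
print(round(grades, 2), ratings, round(chlorine, 1))
```

Dividing by the sum of the weights means the same function works whether the weights are proportions that add to 1 (Problems 14 and 16) or raw counts (Problem 15).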
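17. Harmonic mean = \(\frac{2}{\frac{1}{60} + \frac{1}{75}} \approx 66.67\) mph
18. Geometric mean = \(\sqrt[5]{1.10 \times 1.12 \times 1.148 \times 1.038 \times 1.16} \approx 1.112\). Thus the average growth factor is approximately 1.112, which corresponds to average growth of approximately 11%.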
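Both averages in Problems 17 and 18 are available in Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The sketch below is only a check of the arithmetic above; the variable names are illustrative.

```python
from statistics import geometric_mean, harmonic_mean

speeds = [60, 75]                                # mph, Problem 17
growth_factors = [1.10, 1.12, 1.148, 1.038, 1.16]  # Problem 18

average_speed = harmonic_mean(speeds)
average_growth = geometric_mean(growth_factors)
print(round(average_speed, 2), round(average_growth, 3))
```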
Section 3.2
1. The mean is associated with the standard deviation.
2. The standard deviation is the square root of the variance.
3. Yes. When computing the sample standard deviation, divide by n – 1. When computing the population standard deviation, divide by n.
4. The symbol for the sample standard deviation is s. The symbol for the population standard deviation is σ.
5. (a) i, ii, iii
(b) The data change between data sets (i) and (ii) increased the squared difference sum \(\Sigma(x - \bar{x})^2\) by 10, whereas the data change between data sets (ii) and (iii) increased the squared difference sum \(\Sigma(x - \bar{x})^2\) by only 6.
6. (a) \(s = \sqrt{\frac{\Sigma(x - \bar{x})^2}{n - 1}} \approx 3.61\)
(b) Adding a constant to each data value does not change s. Thus s ≈ 3.61.
(c) Shifting data by c units does not change the standard deviation.
7. (a) s ≈ 3.61 (same as above)
(b) s ≈ 18.0
(c) We see that the standard deviation has increased by a factor of 5. In general, multiplying each data value by a constant c will result in the standard deviation being multiplied by the absolute value of c.
8. (a) No, 80 is only 2 standard deviations away from its mean.
(b) Yes, 80 is 3.33 standard deviations away from its mean.
9. (a) Range = maximum – minimum = 30 – 15 = 15
(b) Use a calculator to verify that \(\Sigma x = 110\) and that \(\Sigma x^2 = 2{,}568\).
(c) Computation formula (sample data) for \(s^2\).
\(s = \sqrt{\frac{\Sigma x^2 - \frac{(\Sigma x)^2}{n}}{n - 1}} = \sqrt{\frac{2568 - \frac{(110)^2}{5}}{5 - 1}} \approx 6.08\)
\(s^2 = 6.08^2 \approx 37\)
(d) \(\bar{x} = \frac{\Sigma x}{n} = \frac{110}{5} = 22\)
Defining formula (sample data) for \(s^2\).
\(s = \sqrt{\frac{\Sigma(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{(23 - 22)^2 + (17 - 22)^2 + \cdots + (25 - 22)^2}{5 - 1}} \approx 6.08\)
\(s^2 = 6.08^2 \approx 37\)
(e) \(\mu = 22\)
\(\sigma = \sqrt{\frac{\Sigma(x - \mu)^2}{N}} = \sqrt{\frac{(23 - 22)^2 + (17 - 22)^2 + \cdots + (25 - 22)^2}{5}} \approx 5.44\)
\(\sigma^2 = 5.44^2 \approx 29.59\)
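10. (a)
x      x²      y      y²
11     121     10     100
0      0       −2     4
36     1296    29     841
21     441     14     196
31     961     22     484
23     529     18     324
24     576     14     196
−11    121     −2     4
−11    121     −3     9
−21    441     −10    100
Σx = 103   Σx² = 4607   Σy = 90   Σy² = 2258
(b) \(\bar{x} = \frac{\Sigma x}{n} = \frac{103}{10} = 10.3\) and \(\bar{y} = \frac{\Sigma y}{n} = \frac{90}{10} = 9\)
For x: \(s = \sqrt{\frac{\Sigma x^2 - \frac{(\Sigma x)^2}{n}}{n - 1}} = \sqrt{\frac{4607 - \frac{(103)^2}{10}}{10 - 1}} \approx 19.85\), so \(s^2 = 19.85^2 \approx 394.0\)
For y: \(s = \sqrt{\frac{\Sigma y^2 - \frac{(\Sigma y)^2}{n}}{n - 1}} = \sqrt{\frac{2258 - \frac{(90)^2}{10}}{10 - 1}} \approx 12.68\), so \(s^2 = 12.68^2 \approx 160.8\)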
(c) \(\bar{x} - 2s\) = 10.3 − 2(19.85) = −29.4
\(\bar{x} + 2s\) = 10.3 + 2(19.85) = 50
\(\bar{y} - 2s\) = 9 − 2(12.68) = −16.36
\(\bar{y} + 2s\) = 9 + 2(12.68) = 34.36
At least 75% of the returns for the Total Stock Fund fall between –29.4% and 50%, whereas at least 75% of the returns for the Balanced Index fall between –16.36% and 34.36%.
(d) Stock fund: \(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{19.85}{10.3} \cdot 100 \approx 192.7\%\)
Balanced fund: \(CV = \frac{s}{\bar{y}} \cdot 100 = \frac{12.68}{9} \cdot 100 \approx 140.9\%\)
For each unit of return, the balanced fund has lower risk. Since the CV can be thought of as a measure of risk per unit of expected return, a smaller CV is better because a lower risk is better.
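The quantities in Problem 10 (mean, sample standard deviation, the interval within two standard deviations of the mean, and CV) can be reproduced with a short Python sketch. The returns are the x and y columns of the table in part (a); the function names `chebyshev_interval` and `coefficient_of_variation` are illustrative, and `statistics.stdev` is the sample standard deviation with an n − 1 divisor.

```python
from statistics import mean, stdev

stock = [11, 0, 36, 21, 31, 23, 24, -11, -11, -21]    # x column, percent returns
balanced = [10, -2, 29, 14, 22, 18, 14, -2, -3, -10]  # y column, percent returns

def chebyshev_interval(values, k=2):
    """Interval mean ± k·s; Chebyshev's theorem puts at least 1 − 1/k² of the data inside."""
    m, s = mean(values), stdev(values)
    return m - k * s, m + k * s

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

for name, fund in (("stock", stock), ("balanced", balanced)):
    low, high = chebyshev_interval(fund)
    print(name, round(low, 2), round(high, 2), round(coefficient_of_variation(fund), 1))
```

Small differences from the hand calculation (for example −29.40 versus −29.4) come only from rounding s before doubling it.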
11. (a) Range = 7.89 – 0.02 = 7.87
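(b) Use a calculator to verify that \(\Sigma x = 62.11\) and \(\Sigma x^2 = 164.23\).
(c) \(\bar{x} = \frac{\Sigma x}{n} = \frac{62.11}{50} \approx 1.24\)
\(s = \sqrt{\frac{\Sigma x^2 - \frac{(\Sigma x)^2}{n}}{n - 1}} = \sqrt{\frac{164.23 - \frac{(62.11)^2}{50}}{50 - 1}} \approx 1.333 \approx 1.33\)
\(s^2 = 1.333^2 \approx 1.78\)
(d) \(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{1.33}{1.24} \cdot 100 \approx 107\%\)
The standard deviation of the time to failure is just slightly larger than the average time.
12. (a)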
x       x²        y       y²
13.20   174.24    11.85   140.42
5.60    31.36     15.25   232.56
19.80   392.04    21.30   453.69
15.05   226.50    17.30   299.29
21.40   457.96    27.50   756.25
17.25   297.56    10.35   107.12
27.45   753.50    14.90   222.01
16.95   287.30    48.70   2371.69
23.90   571.21    25.40   645.16
32.40   1049.76   25.95   673.40
40.75   1660.56   57.60   3317.76
5.10    26.01     34.35   1179.92
17.75   315.06    38.80   1505.44
28.35   803.72    41.00   1681.00
                  31.25   976.56
Σx = 284.95   Σx² = 7046.80   Σy = 421.5   Σy² = 14,562.27
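(b) Grid E: \(\bar{x} = \frac{\Sigma x}{n} = \frac{284.95}{14} \approx 20.35\)
\(s^2 = \frac{\Sigma x^2 - \frac{(\Sigma x)^2}{n}}{n - 1} = \frac{7046.80 - \frac{(284.95)^2}{14}}{14 - 1} \approx 96\)
\(s = \sqrt{s^2} = \sqrt{96} \approx 9.80\)
Grid H: \(\bar{y} = \frac{\Sigma y}{n} = \frac{421.5}{15} = 28.1\)
\(s^2 = \frac{\Sigma y^2 - \frac{(\Sigma y)^2}{n}}{n - 1} = \frac{14{,}562.27 - \frac{(421.5)^2}{15}}{15 - 1} \approx 194\)
\(s = \sqrt{s^2} = \sqrt{194} \approx 13.93\)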
(c) \(\bar{x} - 2s\) = 20.35 − 2(9.80) = 0.75
\(\bar{x} + 2s\) = 20.35 + 2(9.80) = 39.95
For Grid E, at least 75% of the data fall in the interval 0.75–39.95.
\(\bar{y} - 2s\) = 28.1 − 2(13.93) = 0.24
\(\bar{y} + 2s\) = 28.1 + 2(13.93) = 55.96
For Grid H, at least 75% of the data fall in the interval 0.24–55.96. Grid H shows a wider 75% range of values.
(d) Grid E: \(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{9.80}{20.35} \cdot 100 \approx 48.2\%\)
Grid H: \(CV = \frac{s}{\bar{y}} \cdot 100 = \frac{13.93}{28.1} \cdot 100 \approx 49.6\%\)
Grid H demonstrates slightly greater variability per expected signal. The CV, together with the 75% Chebyshev interval from part (c), indicates that Grid H might have more buried artifacts.
13. (a) Students verify results with a calculator.
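(b) \(\bar{x} = \frac{\Sigma x}{n} = \frac{245}{5} = 49\)
\(s = \sqrt{\frac{\Sigma x^2 - \frac{(\Sigma x)^2}{n}}{n - 1}} = \sqrt{\frac{14{,}755 - \frac{(245)^2}{5}}{5 - 1}} \approx 26.22\)
\(s^2 = 26.22^2 \approx 687.49\)
(c) \(\bar{y} = \frac{\Sigma y}{n} = \frac{224}{5} = 44.8\)
\(s = \sqrt{\frac{\Sigma y^2 - \frac{(\Sigma y)^2}{n}}{n - 1}} = \sqrt{\frac{12{,}070 - \frac{(224)^2}{5}}{5 - 1}} \approx 22.55\)
\(s^2 = 22.55^2 \approx 508.50\)
(d) Mallard nest: \(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{26.22}{49} \cdot 100 \approx 53.5\%\)
Canada Goose nest: \(CV = \frac{s}{\bar{y}} \cdot 100 = \frac{22.55}{44.8} \cdot 100 \approx 50.3\%\)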
The CV gives the ratio of the standard deviation to the mean. With respect to their means, the variation for the mallards is slightly higher than the variation for the Canada geese.
14. (a) Pax: \(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{14.05}{9.58} \cdot 100 \approx 146.7\%\)
Vanguard: \(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{12.50}{9.02} \cdot 100 \approx 138.6\%\)
Vanguard fund has slightly less risk per unit of return.
(b) Pax: \(\bar{x} - 2s\) = 9.58 − 2(14.05) = −18.52
\(\bar{x} + 2s\) = 9.58 + 2(14.05) = 37.68
At least 75% of returns for Pax fall within the interval −18.52% to 37.68%.
Vanguard: \(\bar{x} - 2s\) = 9.02 − 2(12.50) = −15.98
\(\bar{x} + 2s\) = 9.02 + 2(12.50) = 34.02
At least 75% of the returns for Vanguard fall within the interval −15.98% to 34.02%. Vanguard has a narrower range of returns, with less downside, but also less upside.
15. \(CV = \frac{s}{\bar{x}} \cdot 100\), so \(s = \frac{\bar{x} \cdot CV}{100} = \frac{2.2(1.5)}{100} = 0.033\)
16.
Class         f    x      xf      x − x̄    (x − x̄)²   (x − x̄)²f
1–10          34   5.5    187     −10.6    112.36     3820.24
11–20         18   15.5   279     −0.6     0.36       6.48
21–30         17   25.5   433.5   9.4      88.36      1502.12
31 and over   11   35.5   390.5   19.4     376.36     4139.96
n = Σf = 80   Σxf = 1290   Σ(x − x̄)²f = 9468.8
\(\bar{x} = \frac{\Sigma xf}{n} = \frac{1290}{80} \approx 16.1\)
\(s^2 = \frac{\Sigma(x - \bar{x})^2 f}{n - 1} = \frac{9468.8}{79} \approx 119.9\)
\(s = \sqrt{119.9} \approx 10.95\)
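The grouped-data calculation in Problem 16 can be sketched in Python using the class midpoints and frequencies from the table. The function name `grouped_mean_sd` is illustrative. Unlike the hand calculation, the sketch does not round the mean to 16.1 before computing deviations, so intermediate values differ slightly while the final s agrees to two decimal places.

```python
from math import sqrt

def grouped_mean_sd(midpoints, frequencies):
    """Approximate the sample mean and standard deviation from grouped data."""
    n = sum(frequencies)
    m = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    squared_dev_sum = sum(f * (x - m) ** 2 for x, f in zip(midpoints, frequencies))
    return m, sqrt(squared_dev_sum / (n - 1))

midpoints = [5.5, 15.5, 25.5, 35.5]  # class midpoints from the table
frequencies = [34, 18, 17, 11]
grouped_mean, grouped_sd = grouped_mean_sd(midpoints, frequencies)
print(round(grouped_mean, 1), round(grouped_sd, 2))
```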
17.
Class         f     x      xf         x − x̄    (x − x̄)²   (x − x̄)²f
21–30         260   25.5   6630       −10.3    106.09     27,583.4
31–40         348   35.5   12,354     −0.3     0.09       31.3
41 and over   287   45.5   13,058.5   9.7      94.09      27,003.8
n = Σf = 895   Σxf = 32,042.5   Σ(x − x̄)²f ≈ 54,619
\(\bar{x} = \frac{\Sigma xf}{n} = \frac{32{,}042.5}{895} \approx 35.80\)
\(s^2 = \frac{\Sigma(x - \bar{x})^2 f}{n - 1} = \frac{54{,}619}{894} \approx 61.1\)
\(s = \sqrt{61.1} \approx 7.82\)
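18.
x      f    xf    x²f
3.5    2    7     24.5
4.5    2    9     40.5
5.5    4    22    121.0
6.5    22   143   929.5
7.5    64   480   3,600.0
8.5    90   765   6,502.5
9.5    14   133   1,263.5
10.5   2    21    220.5
Σf = 200   Σxf = 1,580   Σx²f = 12,702
\(\bar{x} = \frac{\Sigma xf}{n} = \frac{1{,}580}{200} = 7.9\)
\(SS_x = \Sigma x^2 f - \frac{(\Sigma xf)^2}{n} = 12{,}702 - \frac{(1{,}580)^2}{200} = 220\)
\(s = \sqrt{\frac{SS_x}{n - 1}} = \sqrt{\frac{220}{199}} \approx 1.05\)
\(CV = \frac{s}{\bar{x}} \times 100 = \frac{1.05}{7.9} \times 100 \approx 13.29\%\)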
19.
Class       f    x       xf       x − x̄    (x − x̄)²   (x − x̄)²f
8.6–12.5    15   10.55   158.25   −5.05    25.502     382.537
12.6–16.5   20   14.55   291.00   −1.05    1.102      22.050
16.6–20.5   5    18.55   92.75    2.95     8.703      43.513
20.6–24.5   7    22.55   157.85   6.95     48.303     338.118
24.6–28.5   3    26.55   79.65    10.95    119.903    359.708
n = Σf = 50   Σxf = 779.5   Σ(x − x̄)²f = 1,145.9
\(\bar{x} = \frac{\Sigma xf}{n} = \frac{779.5}{50} \approx 15.6\)
\(s^2 = \frac{\Sigma(x - \bar{x})^2 f}{n - 1} = \frac{1{,}145.9}{49} \approx 23.4\)
\(s = \sqrt{23.4} \approx 4.8\)
20. (a) Students can use a TI-83 to verify the calculations.
(b) For 1992, \(\bar{x} = \frac{1.78 + 17.79 + 7.46}{3} = 9.01\)
For 2000, \(\bar{x} = \frac{17.49 + 6.80 - 2.38}{3} = 7.30\)
(c) Students can use a TI-83 to verify the calculations.
(d) The 3-year moving averages have approximately the same mean as computed in part (a), but the standard deviation is much smaller.
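21.
\[
\begin{aligned}
\Sigma(x - \bar{x})^2 &= \Sigma\left(x^2 - 2x\bar{x} + \bar{x}^2\right) = \Sigma x^2 - 2\bar{x}\,\Sigma x + \Sigma \bar{x}^2 \\
&= \Sigma x^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 = \Sigma x^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \Sigma x^2 - n\bar{x}^2 = \Sigma x^2 - n\left(\frac{\Sigma x}{n}\right)^2 \\
&= \Sigma x^2 - \frac{(\Sigma x)^2}{n}
\end{aligned}
\]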
Section 3.3 1. 82% or more of the scores were at or below her score. 100% − 82% = 18% or fewer of the scores were
above her score.
2. The upper quartile is the 75th percentile. Therefore, the minimum percentile rank must be the 75th percentile.
3. No, the score 82 might have a percentile rank less than 70. Raw scores are not necessarily equal to percentile scores.
4. Timothy performed better because a percentile rank of 72 is greater than a percentile rank of 70.
5. Order the data from smallest to largest.
Lowest value = 2
Highest value = 42
There are 20 data values.
Median = \(\frac{23 + 23}{2} = 23\)
There are 10 values less than the Q2 position and 10 values greater than the Q2 position.
\(Q_1 = \frac{8 + 11}{2} = 9.5\)
\(Q_3 = \frac{28 + 29}{2} = 28.5\)
\(IQR = Q_3 - Q_1 = 28.5 - 9.5 = 19\)
[Figure: Boxplot of Months for Nurses; vertical axis: Months, 0 to 60]
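The quartile method used in these solutions (find the median, then take the median of the lower half and of the upper half) can be sketched in Python. The raw data for this problem are not reproduced in the solution, so the sketch reuses the Lower Canyon counts from Section 3.1, Problem 11(b), purely as sample input; the function name `five_number_summary` is illustrative. Statistical packages often use other quartile conventions and may give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values below the median position
    upper = ordered[half + n % 2:]   # values above it (the middle value is skipped when n is odd)
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

sample = [0, 0, 1, 1, 1, 1, 2, 2, 3, 6, 7, 8, 13, 14]  # Lower Canyon data, used only as an example
low, q1, q2, q3, high = five_number_summary(sample)
print(low, q1, q2, q3, high, "IQR =", q3 - q1)
```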
6. (a) Order the data from smallest to largest.
Lowest value = 3
Highest value = 72
There are 20 data values.
Median = \(\frac{22 + 24}{2} = 23\)
There are 10 values less than the median and 10 values greater than the median.
\(Q_1 = \frac{15 + 17}{2} = 16\)
\(Q_3 = \frac{29 + 31}{2} = 30\)
\(IQR = Q_3 - Q_1 = 30 - 16 = 14\)
[Figure: Boxplot of Months Clerical; vertical axis: Months Clerical, 0 to 40]
(b) The median for nurses and clerical workers is 23 months. The upper half of the data for the nurses falls between values of 23 and 42 months, whereas the upper half of the data for the clerical workers falls between 23 and 72 months. The distance between Q3 and the maximum for nurses is 13.5 months; for clerical workers, this distance is 42 months. The distance between Q1 and the minimum for nurses is 7.5 months; for clerical workers, this distance is 13 months.
7. (a) Lowest value = 17
Highest value = 38
There are 50 data values.
Median = \(\frac{24 + 24}{2} = 24\)
There are 25 values above and 25 values below the Q2 position.
\(Q_1 = 22\)
\(Q_3 = 27\)
\(IQR = 27 - 22 = 5\)
[Figure: Boxplot of College Graduates; vertical axis: College Graduates, 15 to 35]
(b) 26% is in the third quartile because it is between the median and Q3.
8. (a) Lowest value = 5
Highest value = 15
There are 50 data values.
Median = \(\frac{10 + 10}{2} = 10\)
There are 25 values above and 25 values below the Q2 position.
\(Q_1 = 9\)
\(Q_3 = 12\)
\(IQR = 12 - 9 = 3\)
[Figure: Boxplot of High School Dropouts; vertical axis: High School Dropouts, 5.0 to 15.0]
(b) 7% is in the first quartile because it is below Q1.
9. (a) California has the lowest premium, and Pennsylvania has the highest.
(b) Pennsylvania has the highest median premium.
(c) California has the smallest range, and Texas has the smallest IQR.
(d) The smallest IQR will be Texas. The largest IQR will be Pennsylvania.
For figure (a), IQR = 3,652 – 2,758 = 894
For figure (b), IQR = 5,801 – 4,326 = 1,475
For figure (c), IQR = 3,966 – 2,801 = 1,165
Therefore, figure (a) is Texas and figure (b) is Pennsylvania. By elimination, figure (c) is California.
10. (a) Order the data from smallest to largest.
Lowest value = 4
Highest value = 80
There are 24 data values.
Median = \(\frac{65 + 66}{2} = 65.5\)
There are 12 values above and 12 values below the median.
\(Q_1 = \frac{61 + 62}{2} = 61.5\)
\(Q_3 = \frac{71 + 72}{2} = 71.5\)
[Figure: Boxplot of Heights; vertical axis: Heights, 0 to 80]
(b) \(IQR = Q_3 - Q_1 = 71.5 - 61.5 = 10\)
(c) 1.5(IQR) = 1.5(10) = 15
Lower limit: \(Q_1 - 1.5(IQR) = 61.5 - 15 = 46.5\)
Upper limit: \(Q_3 + 1.5(IQR) = 71.5 + 15 = 86.5\)
(d) Yes, the value 4 is below the lower limit and so is an outlier; it is probably an error.
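The fence rule from parts (c) and (d) can be written as a small Python sketch. The function names `outlier_fences` and `is_outlier` are illustrative; the quartiles are the ones computed in part (a).

```python
def outlier_fences(q1, q3, multiplier=1.5):
    """Return the lower and upper limits Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr

def is_outlier(value, q1, q3):
    """A value is flagged when it lies outside the fences."""
    lower, upper = outlier_fences(q1, q3)
    return value < lower or value > upper

lower_limit, upper_limit = outlier_fences(61.5, 71.5)
print(lower_limit, upper_limit)
print(is_outlier(4, 61.5, 71.5), is_outlier(80, 61.5, 71.5))
```

The lowest value, 4, is flagged, while the highest value, 80, falls inside the fences.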
Chapter 3 Review
1. (a) The variance and the standard deviation
(b) Box-and-whisker plot
2. (a) For (i), the mode is the tallest bar, namely, 7; the median and mean are estimated to be 7. For (ii), the mode = median = mean = 7.
(b) Distribution (i) will have a larger standard deviation because more data are in the tails. This is indicated by the tall bars at values of 4 and 10.
3. (a) For both data sets, the mean is 20. Also, for both data sets, the range = maximum – minimum = 31 – 7 = 24.
(b) Data set C1 seems more symmetric because the mean equals the median and the median is centered in the interquartile range.
(c) For C1, IQR = 25 – 15 = 10. For C2, IQR = 22 – 20 = 2. Thus, for C1, the middle 50% of the data have a range of 10, whereas for C2, the middle 50% of the data have a smaller range of 2.
4. (a) Mean = \(\bar{x} = \frac{\Sigma x}{n} = \frac{1.9 + 2.8 + \cdots + 7.2}{8} = \frac{36.2}{8} = 4.525\)
Order the data from smallest to largest.
1.9 1.9 2.8 3.9 4.2 5.7 7.2 8.6
Median = \(\frac{3.9 + 4.2}{2} = 4.05\)
The mode is 1.9 because it is the value that occurs most frequently.
(b) \(s = \sqrt{\frac{\Sigma(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{42.395}{7}} \approx 2.46\)
\(CV = \frac{s}{\bar{x}} \cdot 100 = \frac{2.46}{4.525} \cdot 100 \approx 54.4\%\)
Range = 8.6 − 1.9 = 6.7
5. (a) Lowest value = 31
Highest value = 68
There are 60 data values.
Median = \(\frac{45 + 45}{2} = 45\)
There are 30 values above and 30 values below the Q2 position.
\(Q_1 = \frac{40 + 40}{2} = 40\)
\(Q_3 = \frac{52 + 53}{2} = 52.5\)
\(IQR = 52.5 - 40 = 12.5\)
[Figure: Boxplot of Percentage of Georgia Democrats by County; vertical axis: Georgia Democrats, 30 to 70]
(b) Class width = 8
Class   Midpoint x   f    xf      x²f
31–38   34.5         11   379.5   13,092.8
39–46   42.5         24   1020    43,350.0
47–54   50.5         15   757.5   38,253.8
55–62   58.5         7    409.5   23,955.8
63–70   66.5         3    199.5   13,266.8
n = Σf = 60   Σxf = 2,766   Σx²f = 131,919
\(\bar{x} = \frac{\Sigma xf}{n} = \frac{2{,}766}{60} = 46.1\)
\(s = \sqrt{\frac{\Sigma x^2 f - \frac{(\Sigma xf)^2}{n}}{n - 1}} = \sqrt{\frac{131{,}919 - \frac{(2{,}766)^2}{60}}{59}} = \sqrt{\frac{4{,}406.4}{59}} \approx 8.64\)
\(\bar{x} - 2s\) = 46.1 − 2(8.64) = 28.82
\(\bar{x} + 2s\) = 46.1 + 2(8.64) = 63.38
We expect at least 75% of the counties in Georgia to have between 28.82% and 63.38% Democrats.
(c) \(\bar{x} = 46.15\), \(s \approx 8.63\)
6. (a) Weighted average = \(\frac{\Sigma xw}{\Sigma w} = \frac{92(0.05) + 73(0.08) + 81(0.08) + 85(0.15) + 87(0.15) + 83(0.15) + 90(0.34)}{0.05 + 0.08 + 0.08 + 0.15 + 0.15 + 0.15 + 0.34} = \frac{85.77}{1} = 85.77\)
(b) Weighted average = \(\frac{\Sigma xw}{\Sigma w} = \frac{20(0.05) + 73(0.08) + 81(0.08) + 85(0.15) + 87(0.15) + 83(0.15) + 90(0.34)}{1} = 82.17\)
7. Mean weight = \(\frac{2{,}500}{16} = 156.25\)
The mean weight is 156.25 lb.
8. (a) Lowest value = 7.8
Highest value = 29.5
There are 72 data values.
Median = \(\frac{20.2 + 20.3}{2} = 20.25\)
There are 36 values above and 36 values below the Q2 position.
\(Q_1 = \frac{14.0 + 14.4}{2} = 14.2\)
\(Q_3 = \frac{23.8 + 23.8}{2} = 23.8\)
(b) IQR = 23.8 − 14.2 = 9.6 kilograms
(c) [Figure: Boxplot of Kilograms; vertical axis: Kilograms, 10 to 30]
(d) The median is closer to the maximum value, indicating that the higher weights are more concentrated than the lower weights. The lower whisker is also longer than the upper, which emphasizes again skewness toward the lower values. Yes, the lower half shows slightly more spread, indicating skewness to the left (low).
9. (a) A college degree does not guarantee an increase of 83.4% in earnings compared with a high-school
diploma. This statement is based on averages. (b) We compute as follows:
\(\bar{x} - 2s\) = $51,206 − 2($8,500) = $34,206
\(\bar{x} + 2s\) = $51,206 + 2($8,500) = $68,206
(c) \(\bar{x} = \frac{(0.46)(4{,}500) + (0.21)(7{,}500) + (0.07)(12{,}000) + (0.08)(18{,}000) + (0.09)(24{,}000) + (0.09)(31{,}000)}{0.46 + 0.21 + 0.07 + 0.08 + 0.09 + 0.09}\) = $10,875
10. (a) Order the data from smallest to largest.
Lowest value = 6
Highest value = 16
There are 50 data values.
Median = \(\frac{11 + 11}{2} = 11\)
There are 25 values above and 25 values below the Q2 position.
\(Q_1 = 10\)
\(Q_3 = 13\)
\(IQR = Q_3 - Q_1 = 13 - 10 = 3\)
[Figure: Boxplot of Soil Water Content; vertical axis: Soil Water Content, 5.0 to 17.5]
(b)
Class   Midpoint x   f    xf    x²f
6–8     7            4    28    196
9–11    10           24   240   2,400
12–14   13           15   195   2,535
15–17   16           7    112   1,792
n = Σf = 50   Σxf = 575   Σx²f = 6,923
\(\bar{x} = \frac{\Sigma xf}{n} = \frac{575}{50} = 11.5\)
\(s = \sqrt{\frac{\Sigma x^2 f - \frac{(\Sigma xf)^2}{n}}{n - 1}} = \sqrt{\frac{6{,}923 - \frac{(575)^2}{50}}{49}} = \sqrt{\frac{310.5}{49}} \approx 2.52\)
\(\bar{x} - 2s\) = 11.5 − 2(2.52) = 6.46
\(\bar{x} + 2s\) = 11.5 + 2(2.52) = 16.54
We expect at least 75% of the soil water content measurements to fall in the interval 6.46–16.54.
(c) Using a TI-83, \(\bar{x} \approx 11.48\); \(s \approx 2.44\)
11. Weighted average = \(\frac{\Sigma xw}{\Sigma w} = \frac{5(2) + 8(3) + 7(3) + 9(5) + 7(3)}{2 + 3 + 3 + 5 + 3} = \frac{121}{16} \approx 7.56\)
Cumulative Review Problems Chapters 1, 2, 3
1. (a) Median, percentile
(b) Mean, variance, standard deviation
2. (a) Gap between first bar and rest of bars or between last bar and rest of bars
(b) Large gap between data on far left side or far right side and rest of data
(c) Several empty stems after stem including lowest values or before stem including highest values
(d) Data beyond fences placed at Q1 – 1.5(IQR) and at Q3 + 1.5(IQR).
3. (a) Same
(b) Set B has higher mean.
(c) Set B has higher standard deviation.
(d) Set B has much longer whisker beyond Q3.
4. (a) In Set A, 86 is the relatively higher score because a larger percentage of scores fall below it.
(b) In Set B because 86 is more standard deviations above the mean
5. One could assign a consecutive number to each well in West Texas and then use a random-number table or a computer package to draw the simple random sample.
6. The pH levels are ratios because the values can be multiplied. Also, 0 pH is meaningful and not just a place on the scale.
7. Use the one’s digit for the stem and the tenths decimal for the leaves. Split each stem into five rows. Here, 7 0 = 7.0.
7 000000001111111111
7 222222222233333333333
7 44444444455555555
7 666666666777777
7 8888899999
8 01111111
8 2222222
8 45
8 67
8 88
8.
Class Limits  Class Boundaries  Midpoint  Frequency  Relative Frequency  Cumulative Relative Frequency
7.0–7.3       6.95–7.35         7.15      39         0.382               0.382
7.4–7.7       7.35–7.75         7.55      32         0.314               0.696
7.8–8.1       7.75–8.15         7.95      18         0.176               0.872
8.2–8.5       8.15–8.55         8.35      9          0.088               0.960
8.6–8.9       8.55–8.95         8.75      4          0.039               0.999
[Figure: Histogram of pH Level. Horizontal axis: class boundaries 6.95, 7.35, 7.75, 8.15, 8.55, 8.95; vertical axis: Frequency, 0 to 40]
[Figure: Histogram of pH Level. Same horizontal axis; vertical axis: Relative Frequency in percent, 0 to 40]
To construct the frequency polygon, draw a dot at the minimum class boundary, at each midpoint, and at
the maximum class boundary. Then connect the dots. 9. To draw the ogive, the vertical axis is labeled with relative frequency, and the horizontal axis is labeled
with the upper class boundaries. Draw a dot at the minimum class boundary and zero, and then draw a dot at each upper class boundary and the corresponding cumulative frequency. Connect the dots.
10. Range = 8.8 – 7.0 = 1.8
\(\bar{x} = \frac{\Sigma x}{n} = \frac{7.0 + 7.0 + \cdots + 8.8}{102} = 7.58\)
Median = \(\frac{7.5 + 7.5}{2} = 7.5\)
Mode = 7.3
11. (a) The students can verify the figures using a calculator or a statistics package.
(b) \(s^2 = \frac{\Sigma(x - \bar{x})^2}{n - 1} = 0.1984\)
\(s = \sqrt{s^2} = \sqrt{0.1984} = 0.4454\)
\(CV = \frac{s}{\bar{x}} = \frac{0.4454}{7.58} = 0.059 = 5.9\%\)
The sample standard deviation is only 5.9% of the mean. This appears to be small.
12. \(\bar{x} - 2s\) = 7.58 − 2(0.4454) = 6.69
\(\bar{x} + 2s\) = 7.58 + 2(0.4454) = 8.47
Thus at least 75% of all pH levels are found between 6.69 and 8.47.
13. We know the minimum value is 7.0, the maximum value is 8.8, and the median is 7.5. Using Minitab, we find that Q1 = 7.2 and Q3 = 7.9. Thus IQR = 7.9 – 7.2 = 0.7.
[Figure: Boxplot of pH Levels for West Texas; vertical axis: pH, 7.0 to 9.0]
14. The histogram shows that the distribution is skewed right. Lower values are more common because the height of the bars is higher.
15. 87.2% of the wells have a pH of less than 8.15. 57.8% of the wells could be used for the irrigation. Here, 57.8% = 31.4% + 17.6% + 8.8%.
16. There do not appear to be any outliers because there are no large gaps in the data set. Eight are neutral.
17. Half the wells are found to have a pH between 7.2 and 7.9. There is skewness toward the high values, with half the wells having a pH between 7.5 and 8.8. The boxplot and the histogram are consistent because both show the distribution to be right skewed.
18. Answers will vary. Good reports will include the preceding graphs, measures of center, measures of variation, and a comment about any unusual features.
Chapter 4: Elementary Probability Theory
Section 4.1
1. Equally likely outcomes, relative frequency, intuition
2. The complement is “not rain today.” This probability is 100% – 30% = 70%.
3. (a) The probability of a certain event is 1.
(b) The probability of an impossible event is 0.
4. The law of large numbers states that in the long run, as the sample size or number of trials increases, the relative frequency of outcomes approaches the theoretical probability of the outcome. Five hundred trials are better because the law of large numbers works better for larger samples.
5. No. The probability of throwing tails on the second toss is 0.50 regardless of the outcome on the first toss.
6. (a) Probabilities must be between 0 and 1 inclusive. –0.41 < 0
(b) Probabilities must be between 0 and 1 inclusive. 1.21 > 1
(c) 120% = 1.20, and probabilities must be between 0 and 1 inclusive. 1.20 > 1
(d) Yes, 0 ≤ 0.56 ≤ 1.
7. The resulting relative frequency can be used as an estimate of the true probability of all Americans who can wiggle their ears.
8. The resulting relative frequency can be used as an estimate of the true probability of all Americans who can raise one eyebrow.
9. (a) P(no similar preferences) = P(0) = 15/375, P(1) = 71/375, P(2) = 124/375, P(3) = 131/375, P(4) = 34/375
(b) (15 + 71 + 124 + 131 + 34)/375 = 375/375 = 1; yes
Personality types were classified into four main preferences; all possible numbers of shared preferences were considered. The sample space is 0, 1, 2, 3, and 4 shared preferences.
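The check in Problem 9(b) can be sketched in a few lines. The counts are the ones quoted in the solution; the variable names are illustrative only.

```python
from fractions import Fraction

# Counts by number of shared preferences (0 through 4), as quoted in Problem 9.
counts = {0: 15, 1: 71, 2: 124, 3: 131, 4: 34}
total = sum(counts.values())

# Relative frequency of each outcome, kept exact with Fraction.
probabilities = {k: Fraction(c, total) for k, c in counts.items()}

print(total)
print(sum(probabilities.values()))
```

Using exact fractions avoids any rounding question when confirming that the probabilities over the whole sample space add to 1.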
10. (a) The sample space would be 1, 2, 3, 4, 5, and 6 dots. If the die is fair, all outcomes will be equally likely.
(b) P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6 because the die faces are equally likely, and there are six outcomes. The probabilities should and do add to 1: 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 6/6 = 1, because all possible outcomes have been considered.
(c) P(number of dots < 5) = P(1 or 2 or 3 or 4 dots) = P(1) + P(2) + P(3) + P(4) = 1/6 + 1/6 + 1/6 + 1/6 = 4/6 = 2/3
or P(dots < 5) = 1 – P(5 or 6 dots) = 1 – 1/3 = 2/3 (The applicable probability rule used here will be discussed in the next section of the text; rely on your common sense for now.)
(d) Complementary event rule: P(A) = 1 – P(not A)
P(5 or 6 dots) = 1 – P(1 or 2 or 3 or 4 dots) = 1 – 2/3 = 1/3, or P(5 or 6) = P(5) + P(6) = 1/6 + 1/6 = 2/6 = 1/3
11. (a) Note: “Includes the left limit but not the right limit” means 6 A.M. ≤ time t < noon, noon ≤ t < 6 P.M., 6 P.M. ≤ t < midnight, midnight ≤ t < 6 A.M.
P(best idea 6 A.M.–12 noon) = 290/966 ≈ 0.30
P(best idea 12 noon–6 P.M.) = 135/966 ≈ 0.14
P(best idea 6 P.M.–12 midnight) = 319/966 ≈ 0.33
P(best idea from 12 midnight to 6 A.M.) = 222/966 ≈ 0.23
(b) The probabilities add up to 1. They should add up to 1 provided that the intervals do not overlap and each inventor chose only one interval. The sample space is the set of four time intervals.
12. (a) P(germinate) = (number germinated)/(number planted) = 2,430/3,000 = 0.81
(b) P(not germinate) = (3,000 – 2,430)/3,000 = 570/3,000 = 0.19
(c) The sample space is two outcomes, germinate and not germinate. P(germinate) + P(not germinate) = 0.81 + 0.19 = 1. The probabilities of all the outcomes in the sample space should and do sum to 1.
(d) No, because P(germinate) = 0.81 ≠ P(not germinate) = 0.19.
If they were equally likely, each would have probability 1/2 = 0.5.
13. (a) Given: Odds in favor of A are n:m, i.e., n/m.
Show: P(A) = n/(m + n)
Proof: Odds in favor of A are P(A)/P(not A) by definition.
P(not A) = 1 – P(A) (complementary events)
n/m = P(A)/P(not A) = P(A)/[1 – P(A)] (substitution)
n[1 – P(A)] = m[P(A)] (cross multiply)
n – n[P(A)] = m[P(A)]
n = n[P(A)] + m[P(A)]
n = (n + m)[P(A)]
So n/(n + m) = P(A), as was to be shown.
(b) Odds of a successful call are 2 to 15. Now 2 to 15 can be written as 2:15 or 2/15.
From part (a): if the odds are 2:15 (let n = 2, m = 15), then P(sale) = n/(n + m) = 2/(2 + 15) = 2/17 ≈ 0.118.
(c) Odds of free throw are 3 to 5, i.e., 3:5. Let n = 3 and m = 5 here; then, from part (a):
P(free throw) = n/(n + m) = 3/(3 + 5) = 3/8 = 0.375
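The result of Problem 13(a) translates directly into a small helper. The function name `probability_from_odds_in_favor` is a hypothetical choice for this sketch; the inputs are the odds from parts (b) and (c).

```python
from fractions import Fraction

def probability_from_odds_in_favor(n, m):
    """Odds in favor of A are n:m, so P(A) = n / (n + m)."""
    return Fraction(n, n + m)

sale = probability_from_odds_in_favor(2, 15)       # Problem 13(b)
free_throw = probability_from_odds_in_favor(3, 5)  # Problem 13(c)
print(sale, round(float(sale), 3))
print(free_throw, float(free_throw))
```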
14. (a) Given: Odds against W are a:b or a/b.
Show: P(not W) = a/(a + b).
Proof: Odds against W are P(not W)/P(W) by definition.
P(W) = 1 – P(not W) (complementary events)
P(not W)/P(W) = a/b (substitution)
P(not W)/[1 – P(not W)] = a/b (substitution)
b[P(not W)] = a[1 – P(not W)] (cross-multiply)
b[P(not W)] = a – a[P(not W)]
b[P(not W)] + a[P(not W)] = a
(a + b)[P(not W)] = a
P(not W) = a/(a + b), as was to be shown.
(b) Point Given’s betting odds are 9:5. Betting odds are based on the probability that the horse does not win, so odds against Point Given (PG) winning are P(not PG wins)/P(PG wins).
Let a = 9 and b = 5 in the part (a) formula. From part (a), P(not PG wins) = a/(a + b) = 9/(9 + 5) = 9/14, but the event “not PG wins” is the same as “PG loses,” so P(PG loses) = 9/14 ≈ 0.64, and P(PG wins) = 1 – 9/14 = 5/14 ≈ 0.36.
(c) Betting odds for Monarchos are 6:1. Betting odds are based on the probability that the horse does not win; i.e., the horse loses.
Let W be the event that Monarchos wins. From part (a), if the odds against W are given as a:b, then P(not W) = a/(a + b).
Let a = 6 and b = 1 in the part (a) formula, so
P(not W) = 6/(6 + 1) = 6/7
P(not W) = P(Monarchos loses) = 6/7 ≈ 0.86
P(Monarchos wins) = P(W) = 1 – P(not W) = 1 – 6/7 = 1/7 ≈ 0.14
(d) Invisible Ink was given betting odds of 30 to 1; i.e., odds against Invisible Ink winning were 30/1.
Let W denote the event that Invisible Ink wins. Let a = 30, b = 1 in the formula from part (a).
Then, from part (a), P(not W) = a/(a + b) = 30/(30 + 1), so P(not Invisible Ink wins) = 30/31; i.e.,
P(Invisible Ink loses) = 30/31 ≈ 0.97
P(Invisible Ink wins) = 1 – P(Invisible Ink loses) = 1 – 30/31 = 1/31 ≈ 0.03
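The three betting-odds calculations in Problem 14 follow one pattern, sketched below. The helper name `win_probability_from_odds_against` is hypothetical; the odds are those quoted in parts (b) through (d).

```python
from fractions import Fraction

def win_probability_from_odds_against(a, b):
    """Odds against W are a:b, so P(not W) = a/(a + b) and P(W) = 1 - P(not W)."""
    p_lose = Fraction(a, a + b)
    return 1 - p_lose

betting_odds = {"Point Given": (9, 5), "Monarchos": (6, 1), "Invisible Ink": (30, 1)}
for horse, (a, b) in betting_odds.items():
    p_win = win_probability_from_odds_against(a, b)
    print(horse, p_win, round(float(p_win), 2))
```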
15. One approach is to make a table showing the information about the 127 people who walked by the store.

                        Buy    Did Not Buy     Row Total
Came into the store     25     58 – 25 = 33    58
Did not come in         0      69              127 – 58 = 69
Column total            25     102             127

If 58 came in, 69 didn’t; 25 of the 58 bought something, so 33 came in but didn’t buy anything. Those who did not come in couldn’t buy anything. The row entries must sum to the row totals, the column entries must sum to the column totals, and the row totals, as well as the column totals, must sum to the overall total, i.e., the 127 people who walked by the store. Also, the four inner cells must sum to the overall total: 25 + 33 + 0 + 69 = 127.
This kind of problem relies on formula (2), P(event A) = (number of outcomes favorable to A)/(total number of outcomes).
(a) P(A) = 58/127 ≈ 0.46. Here, we divide by 127 people.
(b) P(A) = 25/58 ≈ 0.43. Here, we divide by 58 people (only those who entered).
(c) P(A) = P(Enter and buy) = (58/127) × (25/58) = 25/127 ≈ 0.20
Or similarly, read from the table that 25 people both entered and bought something. Divide this by the total number of people, namely, 127.
(d) P(A) = P(Buy nothing) = 33/58 ≈ 0.57. Here, we divide by 58 people.
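The table bookkeeping in Problem 15 can be sketched as follows. The three input counts come from the solution; the variable names are illustrative assumptions.

```python
from fractions import Fraction

walked_by, entered, bought = 127, 58, 25  # counts given in Problem 15

entered_no_buy = entered - bought   # came in but bought nothing
stayed_out = walked_by - entered    # never came in, so bought nothing

p_enter = Fraction(entered, walked_by)                    # part (a)
p_buy_given_enter = Fraction(bought, entered)             # part (b)
p_enter_and_buy = p_enter * p_buy_given_enter             # part (c)
p_no_buy_given_enter = Fraction(entered_no_buy, entered)  # part (d)

print(p_enter, p_buy_given_enter, p_enter_and_buy, p_no_buy_given_enter)
```

Note how the denominator changes between parts: 127 for unconditional probabilities, 58 once the condition “entered the store” is imposed.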
Section 4.2
1. No. Mutually exclusive events cannot occur at the same time.
2. If A and B are independent, then P(A) = P(A | B). Therefore, P(A | B) = 0.3.
3. (a) Event A cannot occur if event B has occurred. Therefore, P(A | B) = 0.
(b) Since we are told that P(A) ≠ 0, and we have determined that P(A | B) = 0, we can deduce that P(A) ≠ P(A | B). Therefore, events A and B are not independent.
4. (a) P(A and B) = P(A) × P(B) if events A and B are independent. This product can equal zero only if either P(A) = 0 or P(B) = 0 (or both). We are told that P(A) ≠ 0 and that P(B) ≠ 0. Therefore, P(A and B) ≠ 0.
(b) Mutually exclusive events require P(A and B) = 0. By part (a), P(A and B) ≠ 0, so A and B are not mutually exclusive.
5. (a) P(A and B) (b) P(B | A) (c) P(A^c | B) (d) P(A or B) (e) P(A or B^c)
6. (a) P(A^c or B) (b) P(B | A) (c) P(A | B) (d) P(A and B^c) (e) P(A and B)
7. (a) Green and blue are mutually exclusive because each M&M candy is only one color. P(green or blue) = P(green) + P(blue) = 10% + 10% = 20% = 0.20.
(b) Yellow and red are mutually exclusive once again because each candy is only one color. P(yellow or red) = P(yellow) + P(red) = 20% + 20% = 40% = 0.40.
(c) Use the complementary event. P(not purple) = 1 – P(purple) = 1 – 0.20 = 0.80 = 80%
8. The total number of arches tabulated is 288. Arch heights are mutually exclusive.
(a) P(3 to 9 feet) = 111/288
(b) P(30 feet or taller) = P(30 to 49) + P(50 to 74) + P(75 and higher) = 30/288 + 33/288 + 18/288 = 81/288
(c) P(3 to 49 feet) = P(3 to 9) + P(10 to 29) + P(30 to 49) = 111/288 + 96/288 + 30/288 = 237/288
(d) P(10 to 74 feet) = P(10 to 29) + P(30 to 49) + P(50 to 74) = 96/288 + 30/288 + 33/288 = 159/288
(e) P(75 feet or taller) = 18/288
Hint: For Problems 9–12, refer to Figure 4-2 if necessary. Think of the outcomes as an (x, y) ordered pair.
9. (a) Yes, the outcome of the red die does not influence the outcome of the green die.
(b) P(5 on green and 3 on red) = P(5 on green) · P(3 on red) = (1/6)(1/6) = 1/36 ≈ 0.028
(c) P(3 on green and 5 on red) = P(3 on green) · P(5 on red) = (1/6)(1/6) = 1/36 ≈ 0.028
(d) P[(5 on green and 3 on red) or (3 on green and 5 on red)] = P(5 on green and 3 on red) + P(3 on green and 5 on red)
= 1/36 + 1/36 = 2/36 = 1/18 ≈ 0.056 (because they are mutually exclusive outcomes).
10. (a) Yes.
(b) P(1 on green and 2 on red) = P(1 on green) · P(2 on red) = (1/6)(1/6) = 1/36
(c) P(2 on green and 1 on red) = P(2 on green) · P(1 on red) = (1/6)(1/6) = 1/36
(d) P[(1 on green and 2 on red) or (2 on green and 1 on red)] = P(1 on green and 2 on red) + P(2 on green and 1 on red)
= 1/36 + 1/36 = 2/36 = 1/18 (because they are mutually exclusive outcomes).
11. (a) We can obtain a sum of 6 as follows: 1 + 5 = 6, 2 + 4 = 6, 3 + 3 = 6, 4 + 2 = 6, 5 + 1 = 6
P(sum is 6) = P[(1, 5) or (2, 4) or (3 on red, 3 on green) or (4, 2) or (5, 1)]
= P(1, 5) + P(2, 4) + P(3, 3) + P(4, 2) + P(5, 1) because the (red, green) outcomes are mutually exclusive
= (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) because the red die outcome is independent of the green die outcome
= 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 5/36
(b) We can obtain a sum of 4 as follows: 1 + 3 = 4, 2 + 2 = 4, 3 + 1 = 4
P(sum is 4) = P[(1, 3) or (2, 2) or (3, 1)]
= P(1, 3) + P(2, 2) + P(3, 1) because the (red, green) outcomes are mutually exclusive
= (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) because the red die outcome is independent of the green die outcome
= 1/36 + 1/36 + 1/36 = 3/36 = 1/12
(c) You cannot roll a sum of 6 and a sum of 4 at the same time. These are mutually exclusive events.
P(sum of 6 or 4) = P(sum of 6) + P(sum of 4) = 5/36 + 3/36 = 8/36 = 2/9
12. (a) We can obtain a sum of 7 as follows: 1 + 6 = 7, 2 + 5 = 7, 3 + 4 = 7, 4 + 3 = 7, 5 + 2 = 7, 6 + 1 = 7
P(sum is 7) = P[(1, 6) or (2, 5) or (3, 4) or (4, 3) or (5, 2) or (6, 1)]
= P(1, 6) + P(2, 5) + P(3, 4) + P(4, 3) + P(5, 2) + P(6, 1) because the (red, green) outcomes are mutually exclusive
= (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) because the red die outcome is independent of the green die outcome
= 1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 6/36 = 1/6
(b) We can obtain a sum of 11 as follows: 5 + 6 = 11 or 6 + 5 = 11
P(sum is 11) = P[(5, 6) or (6, 5)]
= P(5, 6) + P(6, 5) because the (red, green) outcomes are mutually exclusive
= (1/6)(1/6) + (1/6)(1/6) because the red die outcome is independent of the green die outcome
= 1/36 + 1/36 = 2/36 = 1/18
(c) You cannot roll a sum of 7 and a sum of 11 at the same time. These are mutually exclusive events.
P(sum is 7 or 11) = P(sum is 7) + P(sum is 11) = 6/36 + 2/36 = 8/36 = 2/9
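The sums in Problems 11 and 12 can also be checked by listing all 36 ordered (red, green) outcomes. This is a sketch assuming two fair six-sided dice; the helper name `p_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (red, green) outcomes for two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def p_sum(target):
    """Probability that the two dice total `target`."""
    favorable = [pair for pair in outcomes if sum(pair) == target]
    return Fraction(len(favorable), len(outcomes))

# Different sums are mutually exclusive, so their probabilities add.
print(p_sum(6), p_sum(4), p_sum(6) + p_sum(4))
print(p_sum(7), p_sum(11), p_sum(7) + p_sum(11))
```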
13. (a) No, the draws are not independent. The key idea is “without replacement” because the probability of the second card drawn depends on the first card drawn. Let the card draws be represented by an (x, y) ordered pair. For example, (K, 6) means the first card drawn was a king and the second card drawn was a 6. Here the order of the cards is important.
(b) P(ace on first draw and king on second draw) = P(ace, king) = (4/52)(4/51) = 16/2,652 = 4/663
There are four aces and four kings in the deck. Once the first card is drawn and not replaced, there are only 51 cards left to draw from, but all the kings are available.
(c) P(king, ace) = (4/52)(4/51) = 16/2,652 = 4/663
(d) P(ace and king in either order)
= P[(ace, king) or (king, ace)] = P(ace, king) + P(king, ace)
because these two outcomes are mutually exclusive
= 16/2,652 + 16/2,652 = 32/2,652 = 8/663
14. (a) No, the draws are not independent. The key idea is “without replacement” because the probability of the second card drawn depends on the first card drawn. Let the card draws be represented by an (x, y) ordered pair. For example, (K, 6) means the first card drawn was a king and the second card drawn was a 6. Here the order of the cards is important.
(b) P(3, 10) = P[(3 on 1st) and (10 on 2nd, given 3 on 1st)]
= P(3 on 1st) · P(10 on 2nd, given 3 on 1st) = (4/52)(4/51) = 16/2,652 = 4/663 ≈ 0.006
(c) P(10, 3) = P[(10 on 1st) and (3 on 2nd, given 10 on 1st)]
= P(10 on 1st) · P(3 on 2nd, given 10 on 1st) = (4/52)(4/51) = 16/2,652 = 4/663 ≈ 0.006
(d) P[(3, 10) or (10, 3)] = P(3, 10) + P(10, 3) because these two outcomes are mutually exclusive.
= 4/663 + 4/663 = 8/663 ≈ 0.012
15. (a) Yes, the draws are independent. The key idea is “with replacement.” When the first card drawn is replaced, the sample space is the same for the second card as it was for the first card. In fact, it is possible to draw the same card twice. Let the card draws be represented by an (x, y) ordered pair; for example, (K, 6) means a king was drawn, replaced, and then the second card, a 6, was drawn.
(b) P(A, K) = P(A) · P(K) because they are independent.
= (4/52)(4/52) = 16/2,704 = 1/169
(c) P(K, A) = P(K) · P(A) because they are independent.
= (4/52)(4/52) = 16/2,704 = 1/169
(d) P[(A, K) or (K, A)] = P(A, K) + P(K, A) because the two outcomes are mutually exclusive.
= 1/169 + 1/169 = 2/169
16. (a) Yes, the draws are independent. The key idea is “with replacement.” When the first card drawn is replaced, the sample space is the same for the second card as it was for the first card. In fact, it is possible to draw the same card twice. Let the card draws be represented by an (x, y) ordered pair; for example, (K, 6) means a king was drawn, replaced, and then the second card, a 6, was drawn.
(b) P(3, 10) = P(3) · P(10) because draws are independent.
= (4/52)(4/52) = 16/2,704 = 1/169 ≈ 0.0059
(c) P(10, 3) = P(10) · P(3) because of independence.
= (4/52)(4/52) = 16/2,704 = 1/169 ≈ 0.0059
(d) P[(3, 10) or (10, 3)] = P(3, 10) + P(10, 3) because the two outcomes are mutually exclusive.
= 1/169 + 1/169 = 2/169 ≈ 0.0118
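Problems 13 through 16 differ only in whether the first card is replaced. The sketch below puts both cases side by side, assuming a standard 52-card deck with four cards of each rank; the function name `p_ordered_pair` is hypothetical.

```python
from fractions import Fraction

def p_ordered_pair(with_replacement, deck_size=52, cards_per_rank=4):
    """P(first card has one given rank, second card has a different given rank)."""
    p_first = Fraction(cards_per_rank, deck_size)
    # Without replacement only 51 cards remain, but all four cards of the
    # second rank are still in the deck.
    remaining = deck_size if with_replacement else deck_size - 1
    p_second = Fraction(cards_per_rank, remaining)
    return p_first * p_second

without = p_ordered_pair(with_replacement=False)
with_repl = p_ordered_pair(with_replacement=True)

# The two orders are mutually exclusive, so "either order" doubles the value.
print(without, 2 * without)
print(with_repl, 2 * with_repl)
```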
17. (a) P(6 years old or older) = 27% + 14% + 22% = 63%
(b) P(12 years old or younger) = 1 – P(13 years old or older) = 100% – 22% = 78%
(c) P(Between 6 and 12 years old) = 27% + 14% = 41%
(d) P(Between 2 and 9 years old) = 22% + 27% = 49%
The 13-and-older category may include children up to 17 or 18 years old. This is a larger category.
18. Let S denote “senior.” Let F denote “got the flu.” We are given the following probabilities:
P(F | S) = 0.14
P(F | S^c) = 0.24
P(S) = 0.125
P(S^c) = 0.875
(a) P(S and F) = P(S) × P(F | S) = (0.125) × (0.14) = 0.0175
(b) P(S^c and F) = P(S^c) × P(F | S^c) = (0.875) × (0.24) = 0.21
(c) Here, P(S) = 0.95, so P(S^c) = 1 – 0.95 = 0.05
(a) P(S and F) = P(S) × P(F | S) = (0.95) × (0.14) = 0.133
(b) P(S^c and F) = P(S^c) × P(F | S^c) = (0.05) × (0.24) = 0.012
(d) Here, P(S) = P(S^c) = 0.50.
(a) P(S and F) = P(S) × P(F | S) = (0.50) × (0.14) = 0.07
(b) P(S^c and F) = P(S^c) × P(F | S^c) = (0.50) × (0.24) = 0.12
19. Let T denote “telling the truth.” Let L denote “machine catches a person lying.” We are given the following probabilities:
P(L | T^c) = 0.72
P(L | T) = 0.07
(a) Given P(T) = 0.90. Then P(T and L) = P(T) × P(L | T) = (0.90) × (0.07) = 0.063
(b) Given P(T^c) = 0.10. Then P(T^c and L) = P(T^c) × P(L | T^c) = (0.10) × (0.72) = 0.072
(c) Given P(T) = P(T^c) = 0.50. Then P(T and L) = (0.50) × (0.07) = 0.035
P(T^c and L) = (0.50) × (0.72) = 0.36
(d) Given P(T) = 0.15 and P(T^c) = 0.85. Then P(T and L) = (0.15) × (0.07) = 0.0105
P(T^c and L) = (0.85) × (0.72) = 0.612
20. (a) We want to solve for P(T^c). There are two possibilities when the polygraph says that the person is lying: Either the polygraph is right, or the polygraph is wrong. If the polygraph is right, the polygraph results show “lying,” and the person is not telling the truth; i.e., P(L and T^c). If the polygraph is wrong, then the polygraph results show “lying,” but in fact, the person is telling the truth; i.e., P(L and T).
P(L) = P(L and T^c) + P(L and T)
= [P(T^c) × P(L | T^c)] + [P(T) × P(L | T)]
= [P(T^c) × P(L | T^c)] + {[1 – P(T^c)] × P(L | T)}
We are told that P(L) = 0.30, so
0.30 = [P(T^c) × P(L | T^c)] + {[1 – P(T^c)] × P(L | T)} (**)
= [P(T^c) × 0.72] + {[1 – P(T^c)] × 0.07}
= (0.72) × P(T^c) + {0.07 – [0.07 × P(T^c)]}
0.23 = P(T^c) × (0.72 – 0.07) = P(T^c) × (0.65)
0.23/0.65 = P(T^c) ≈ 0.354 = 35.4%
(b) Here, P(L) = 70% = 0.70. Replace the 0.30 with 0.70 in (**) and solve.
P(T^c) = 0.63/0.65 ≈ 0.969
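Solving equation (**) for P(T^c) gives a one-line formula, sketched here. The function name `p_not_truthful` and its parameter names are assumptions for illustration; the rates 0.72 and 0.07 are the ones given in Problem 19.

```python
def p_not_truthful(p_lie_reading, p_l_given_lying=0.72, p_l_given_truth=0.07):
    """Solve P(L) = P(Tc)*P(L|Tc) + (1 - P(Tc))*P(L|T) for P(Tc)."""
    return (p_lie_reading - p_l_given_truth) / (p_l_given_lying - p_l_given_truth)

print(round(p_not_truthful(0.30), 3))  # part (a)
print(round(p_not_truthful(0.70), 3))  # part (b)
```

The formula only yields a valid probability when P(L) lies between 0.07 and 0.72, since P(L) is a weighted average of those two rates.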
21. (a) P(S) = 686/1,160
P(S | A) = 270/580
P(S | Pa) = 416/580
(b) No, they are not independent. P(S | Pa) ≠ P(S) based on the previous part.
(c) P(A and S) = 270/1,160 using the table.
P(Pa and S) = 416/1,160 using the table.
(d) P(N) = 474/1,160
P(N | A) = 310/580
(e) No, they are not independent. P(N | A) ≠ P(N) based on the preceding part.
(f) P(A or S) = P(A) + P(S) – P(A and S)
= 580/1,160 + 686/1,160 – 270/1,160 = 996/1,160
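The independence check and the general addition rule in Problem 21 can be sketched from the four counts quoted in the solution. The variable names are illustrative; here the check compares P(S) with P(S | A), the same comparison the solution makes for Pa.

```python
from fractions import Fraction

total = 1160         # everyone in the table
count_A = 580        # group A
count_S = 686        # response S
count_A_and_S = 270  # group A with response S

p_S = Fraction(count_S, total)
p_S_given_A = Fraction(count_A_and_S, count_A)
independent = (p_S == p_S_given_A)  # independence would require equality

# General addition rule: P(A or S) = P(A) + P(S) - P(A and S)
p_A_or_S = Fraction(count_A, total) + p_S - Fraction(count_A_and_S, total)
print(independent, p_A_or_S)
```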
22. (a) P(+ | condition present) = 110/130
(b) P(– | condition present) = 20/130
(c) P(– | condition absent) = 50/70
(d) P(+ | condition absent) = 20/70
(e) P(condition present and +) = P(condition present) × P(+ | condition present)
= (130/200)(110/130) = 110/200
(f) P(condition present and –) = P(condition present) × P(– | condition present)
= (130/200)(20/130) = 20/200
23. Let C denote the presence of the condition and not C denote absence of the condition.
(a) P(+ | C) = 72/154
(b) P(– | C) = 82/154
(c) P(– | not C) = 79/116
(d) P(+ | not C) = 37/116
(e) P(C and +) = P(C) × P(+ | C) = (154/270)(72/154) = 72/270
(f) P(C and –) = P(C) × P(– | C) = (154/270)(82/154) = 82/270
24. (a) P(10 to 14 years) = 291/2008
(b) P(10 to 14 years | East) = 77/452
(c) P(at least 10 years) = (291 + 535)/2008 = 826/2008
(d) P(at least 10 years | West) = (45 + 86)/373 = 131/373
(e) P(West | less than 1 year) = 41/157
(f) P(South | less than 1 year) = 53/157
(g) P(1 or more years | East) = 1 – P(less than 1 year | East) = 1 – 32/452 = 420/452
(h) P(1 or more years | West) = 1 – P(less than 1 year | West) = 1 – 41/373 = 332/373
(i) We can check whether P(15 or more years) = P(15 or more years | East). If these probabilities are equal, then the events are independent.
P(15 or more years) = 535/2008 ≈ 0.266; P(15 or more years | East) = 118/452 ≈ 0.261
Since the probabilities are not equal, the events are not independent.
25. Given: Let A be the event that a new store grosses > $940,000 in year 1; then A^c is the event the new store grosses ≤ $940,000 the first year.
Let B be the event that the store grosses > $940,000 in the second year; then B^c is the event the store grosses ≤ $940,000 in the second year of operation.

2-Year Results    Translations
A and B           Profitable both years
A and B^c         Profitable first but not second year
A^c and B         Profitable second but not first year
A^c and B^c       Not profitable either year

P(A) = 65% (show profit in first year)
P(A^c) = 35%
P(B) = 71% (show profit in second year)
P(B^c) = 29%
P(close) = P(A^c and B^c)
P(B, given A) = 87%
(a) P(A) = 65% = 0.65
(b) P(B) = 71% = 0.71
(c) P(B | A) = 87% = 0.87
(d) P(A and B) = P(A) × P(B | A) = (0.65)(0.87) = 0.5655 ≈ 0.57
(e) P(A or B) = P(A) + P(B) – P(A and B) = 0.65 + 0.71 – 0.57 = 0.79
(f) P(not closed) = P(show a profit in year 1 or year 2 or both) = 0.79
P(closed) = 1 – P(not closed) = 1 – 0.79 = 0.21
26. P(female) = 85%, so P(male) = 15%
P(BSN | female) = 70%
P(BSN | male) = 90%
(a) P(BSN | female) = 70% = 0.70
(b) P(BSN and female) = P(female) × P(BSN | female) = (0.85) × (0.70) = 0.595
(c) P(BSN | male) = 90% = 0.90
(d) P(BSN and male) = P(male) × P(BSN | male) = (0.15)(0.90) = 0.135
(e) Of the graduates, some are female and some are male. We can add the mutually exclusive probabilities.
P(BSN) = [P(BSN | female) × P(female)] + [P(BSN | male) × P(male)] = [(0.70) × (0.85)] + [(0.90) × (0.15)] = 0.73
(f) The phrase “will graduate and is female” describes the proportion of all students who are female and will graduate. The phrase “will graduate, given female” describes the proportion of the females who will graduate. Observe from parts (a) and (b) that the probabilities are indeed different.
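Problem 26 combines the multiplication rule with adding over mutually exclusive groups. A minimal sketch, using the percentages from the solution and illustrative variable names:

```python
p_female = 0.85
p_male = 1 - p_female
p_bsn_given_female = 0.70
p_bsn_given_male = 0.90

# Multiplication rule for each group, parts (b) and (d).
p_bsn_and_female = p_female * p_bsn_given_female
p_bsn_and_male = p_male * p_bsn_given_male

# The two groups are mutually exclusive, so the joint probabilities add, part (e).
p_bsn = p_bsn_and_female + p_bsn_and_male
print(round(p_bsn_and_female, 3), round(p_bsn_and_male, 3), round(p_bsn, 2))
```

The contrast in part (f) shows up directly: `p_bsn_given_female` is conditional on being female, while `p_bsn_and_female` is a share of all students.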
27. Let TB denote that the person has tuberculosis.
Let + denote the test for tuberculosis indicates the presence of the disease.
Let – denote the test for tuberculosis indicates the absence of the disease.
We are given the following probabilities:
P(+ | TB) = 0.82 (sensitivity of the test)
P(+ | TB^c) = 0.09 (false-positive rate)
P(TB) = 0.04
(a) P(TB and +) = P(+ | TB) × P(TB) = (0.82) × (0.04) = 0.0328
(b) P(TB^c) = 1 – P(TB) = 1 – 0.04 = 0.96
(c) P(TB^c and +) = P(+ | TB^c) × P(TB^c) = (0.09) × (0.96) = 0.0864
28. Known: Let A be the event the client relapses in phase I.
Let B be the event the client relapses in phase II.
Let C be the event that the client has no relapse in phase I; i.e., C = not A.
Let D be the event that the client has no relapse in phase II; i.e., D = not B.
P(A) = 0.27, so P(A^c) = P(C) = 1 – 0.27 = 0.73
P(B) = 0.23, so P(B^c) = P(D) = 1 – 0.23 = 0.77
P(B^c | A^c) = P(D | C) = 0.95
P(B | A) = 0.70

Possible Outcomes     Translation
A, B                  Relapse in I, relapse in II
A^c, B (= C, B)       No relapse in I, relapse in II
A, B^c (= A, D)       Relapse in I, no relapse in II
A^c, B^c (= C, D)     No relapse in I, no relapse in II

(a) P(A) = 0.27, P(B) = 0.23, P(C) = 0.73, P(D) = 0.77
(b) P(B | A) = 0.70, P(D | C) = 0.95
(c) P(A and B) = P(A) × P(B | A) = (0.27) × (0.70) = 0.189
P(C and D) = P(C) × P(D | C) = (0.73) × (0.95) = 0.6935
(d) P(A or B) = P(A) + P(B) – P(A and B) = 0.27 + 0.23 – 0.189 = 0.311
(e) P(C and D) = 0.69
(f) P(A and B) = 0.189
(g) Translate as the inclusive or. P(A or B) = 0.31.
Section 4.3
1. The permutations rule counts the number of different arrangements of r items out of n distinct items. Here, the ordering matters. The combinations rule counts the number of groups of r items out of n distinct items. Here, the ordering does not matter. For a permutation, ABC is different from ACB. For a combination, ABC and ACB are the same group. The number of permutations is at least as large as the number of combinations, and strictly larger whenever r ≥ 2.
2. A tree diagram lists all possible events. The user of the diagram can trace the sequential event from the start
to the end by following a distinct path along the branches. Counting the number of final branches gives the total number of outcomes.
3. (a) Use the combinations rule because we are concerned only with the groups of size five.
(b) Use the permutations rule because we are concerned with the number of different arrangements of size five.
4. Both methods are correct because you are counting the number of possible arrangements of five items taken five at a time.
5. (a)
(b) HHT, HTH, THH. There are three outcomes.
(c) There are eight possible outcomes, and three outcomes have exactly two heads. The probability is 3/8.
6. (a)
(b) H5, H6. There are two outcomes.
(c) There are 12 possible outcomes, and 2 outcomes meet the requirements. The probability is 2/12 = 1/6.
7. (a)
(b) Let P(x, y) be the probability of choosing an x-colored ball on the first draw and a y-colored ball on the second draw. Notice that the probabilities add to 1.
P(R, R) = (2/6)(1/5) = 2/30 = 1/15
P(R, B) = (2/6)(3/5) = 6/30 = 1/5
P(R, Y) = (2/6)(1/5) = 2/30 = 1/15
P(B, R) = (3/6)(2/5) = 6/30 = 1/5
P(B, B) = (3/6)(2/5) = 6/30 = 1/5
P(B, Y) = (3/6)(1/5) = 3/30 = 1/10
P(Y, R) = (1/6)(2/5) = 2/30 = 1/15
P(Y, B) = (1/6)(3/5) = 3/30 = 1/10
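The branch probabilities in Problem 7(b) can be cross-checked by enumerating every ordered pair of balls. The urn contents below (two red, three blue, one yellow) are an assumption inferred from the first-draw fractions 2/6, 3/6, and 1/6 in the solution.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Assumed urn, inferred from the first-draw fractions in the solution.
balls = ["R", "R", "B", "B", "B", "Y"]

# Drawing two balls without replacement: each ordered pair of distinct
# balls (30 in all) is equally likely.
pairs = list(permutations(balls, 2))
counts = Counter(pairs)
probabilities = {colors: Fraction(n, len(pairs)) for colors, n in counts.items()}

for colors, p in sorted(probabilities.items()):
    print(colors, p)
```

Because there is only one yellow ball, the pair (Y, Y) never appears, which is why the tree has eight final branches rather than nine.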
8. (a) For clarity, only a partial tree diagram is provided. Each of the branches for B, C, and D would continue in the same manner as the fully expanded A branch.
(b) If the outcomes are equally likely, then P(all 3 correct) = (1/4)(1/4)(1/4) = 1/64.
9. Using the provided hint, we multiply. There are 4 × 3 × 2 × 1 = 4! = 24 possible wiring configurations.
10. Using the multiplication rule, we multiply. There are 4! = 24 possible ways to visit the four cities. This problem is exactly like Problem 9.
11. There are four fertilizers, three temperature zones for each fertilizer, and three water treatments for every fertilizer–temperature zone combination. She needs to test 4 × 3 × 3 = 36 plots.
12. (a) The die rolls are independent, so multiply the six outcomes for the first die and the six outcomes for the second die. There are 6 × 6 = 36 possible outcomes.
(b) There are three possible even outcomes per die. There are 3 × 3 = 9 outcomes.
(c) P(even, even) = 9/36 = 1/4 = 0.25
Using P(event) = (number of favorable outcomes)/(total number of outcomes)
Problems 13, 14, 15, and 16 deal with permutations.
Use P_{n,r} = n!/(n – r)! to count the number of ways r objects can be selected from n objects when ordering matters.
13. P_{5,2}: n = 5, r = 2
P_{5,2} = 5!/(5 – 2)! = (5 · 4 · 3 · 2 · 1)/3! = 20
14. P_{8,3}: n = 8, r = 3
P_{8,3} = 8!/(8 – 3)! = (8 · 7 · 6 · 5 · 4 · 3 · 2 · 1)/5! = (8 · 7 · 6 · 5!)/5! = 336
15. P_{7,7}: n = r = 7
P_{7,7} = 7!/(7 – 7)! = 7!/0! = 7! = 5,040 (recall 0! = 1)
In general, P_{n,n} = n!/(n – n)! = n!/0! = n!/1 = n!.
16. P_{9,9}: n = r = 9
P_{9,9} = 9!/(9 – 9)! = 9!/0! = 9!/1 = 362,880
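The permutation counts in Problems 13 through 16 can be reproduced from the factorial formula. This sketch assumes Python 3.8 or later, where `math.perm` is available as a cross-check; the helper name `permutations_count` is hypothetical.

```python
import math

def permutations_count(n, r):
    """P_{n,r} = n! / (n - r)!, the number of ordered selections of r from n."""
    return math.factorial(n) // math.factorial(n - r)

for n, r in [(5, 2), (8, 3), (7, 7), (9, 9)]:
    # math.perm computes the same quantity directly.
    assert permutations_count(n, r) == math.perm(n, r)
    print(n, r, permutations_count(n, r))
```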
Problems 17, 18, 19, and 20 deal with combinations.
Use C_{n,r} = n!/[r!(n – r)!] to count the number of ways r objects can be selected from n objects when ordering is irrelevant.
17. C_{5,2}: n = 5, r = 2
C_{5,2} = 5!/[2!(5 – 2)!] = 5!/(2!3!) = (5 · 4 · 3 · 2 · 1)/[(2 · 1)(3 · 2 · 1)] = 20/2 = 10
18. C_{8,3}: n = 8, r = 3
C_{8,3} = 8!/[3!(8 – 3)!] = 8!/(3!5!) = (8 · 7 · 6 · 5!)/[(3 · 2 · 1)5!] = 56
19. C_{7,7}: n = r = 7
C_{7,7} = 7!/[7!(7 – 7)!] = 7!/(7!0!) = 7!/[7!(1)] = 1 (recall 0! = 1)
In general, C_{n,n} = n!/[n!(n – n)!] = n!/(n!0!) = n!/[n!(1)] = 1. There is only one way to choose n objects without regard to order.
20. C_{8,8}: n = r = 8
C_{8,8} = 8!/[8!(8 – 8)!] = 8!/(8!0!) = 8!/[8!(1)] = 1 (recall 0! = 1)
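The combination counts in Problems 17 through 20 follow the same pattern. Again this is a sketch assuming Python 3.8 or later for `math.comb`; the helper name `combinations_count` is hypothetical.

```python
import math

def combinations_count(n, r):
    """C_{n,r} = n! / (r! (n - r)!), the number of unordered selections of r from n."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

for n, r in [(5, 2), (8, 3), (7, 7), (8, 8)]:
    # math.comb should agree with the factorial formula.
    assert combinations_count(n, r) == math.comb(n, r)
    print(n, r, combinations_count(n, r))
```

Dividing the permutation count by r! removes the orderings of each group, which is why C_{5,2} = 10 is half of P_{5,2} = 20.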
21. Since the order matters (first is day supervisor, second is night supervisor, and third is coordinator), this is a permutation of 15 nurse candidates to fill three positions.
P_{15,3} = 15!/(15 – 3)! = 15!/12! = (15 · 14 · 13 · 12!)/12! = 2,730
22. Order matters here because the order of the finalists selected determines the prize awarded.
P_{10,3} = 10!/(10 – 3)! = 10!/7! = (10 · 9 · 8 · 7!)/7! = 720
23. Order matters because the resulting sequence determines who wins first, second, and third place.
P_{5,3} = 5!/(5 – 3)! = 5!/2! = 120/2 = 60
24. The order of the software packages selected is irrelevant, so use the combinations method.
C_{10,3} = 10!/[3!(10 – 3)!] = 10!/(3!7!) = (10 · 9 · 8 · 7!)/(3!7!) = 720/6 = 120
25. The order of trainee selection is irrelevant, so use the combinations method.
C_{15,5} = 15!/[5!(15 – 5)!] = 15!/(5!10!) = (15 · 14 · 13 · 12 · 11 · 10!)/(5!10!) = (15 · 14 · 13 · 12 · 11)/(5 · 4 · 3 · 2 · 1) = 3,003
26. The order of the problems selected is irrelevant, so use the combinations method.
(a) C_{12,5} = 12!/[5!(12 – 5)!] = 12!/(5!7!) = (12 · 11 · 10 · 9 · 8 · 7!)/(5!7!) = (12 · 11 · 10 · 9 · 8)/(5 · 4 · 3 · 2 · 1) = 792
(b) Jerry must have completed the same five problems as the professor selected to grade.
P(Jerry chose the right problems) = 1/792 ≈ 0.001
(c) Silvia did seven problems, so she completed C_{7,5} = 7!/[5!(7 – 5)!] = 7!/(5!2!) = (7 · 6 · 5!)/(5!2!) = (7 · 6)/(2 · 1) = 42/2 = 21 possible subsets.
P(Silvia picked the correct set of graded problems) = 21/792 ≈ 0.027
Silvia increased her chances by a factor of 21 compared with Jerry.
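The comparison between Jerry and Silvia in Problem 26 can be sketched with `math.comb` (Python 3.8 or later). The variable names are illustrative.

```python
import math
from fractions import Fraction

graded_sets = math.comb(12, 5)  # ways the professor can pick 5 of the 12 problems

jerry = Fraction(1, graded_sets)                 # Jerry's 5 problems form exactly one set
silvia = Fraction(math.comb(7, 5), graded_sets)  # Silvia's 7 problems contain C(7,5) sets

print(graded_sets, jerry, silvia, silvia / jerry)
```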
27. (a) Six applicants are selected from among 12 without regard to order.
C_{12,6} = 12!/(6!6!) = 479,001,600/(720)^2 = 924
(b) This problem is asking, “In how many ways can six women be selected from seven applicants?”
C_{7,6} = 7!/(6! × 1!) = 7
(c) P(event A) = (number of favorable outcomes)/(total number of outcomes)
P(all hired are women) = 7/924 = 1/132 ≈ 0.008
Chapter 4 Review
1. (a) The individual does not own a cell phone.
(b) The individual owns both a cell phone and a laptop computer.
(c) The individual owns either a cell phone or a laptop computer or both.
(d) A laptop owner who owns a cell phone.
(e) A cell phone owner who owns a laptop.
2. (a) Only if events A and B are mutually exclusive. Then P(A and B) = 0 and P(A or B) = P(A) + P(B).
(b) Yes, see above.
3. (a) No, unless events A and B are independent. If they are not, we need either P(A | B) or P(B | A) to compute P(A and B).
(b) Yes, now we can compute P(A and B) = P(A) × P(B).
4. The information yields P(B | A) = 2. Probabilities must be between 0 and 1 inclusive. Also, P(A and B) cannot be greater than P(A) or P(B) individually.
5. P(asked) = 24% = 0.24
P(received | asked) = 45% = 0.45
P(asked and received) = P(asked) × P(received | asked) = (0.24) × (0.45) = 0.108 = 10.8%
6. P(asked) = 20% = 0.20
P(received | asked) = 59% = 0.59
P(asked and received) = P(asked) × P(received | asked) = (0.20) × (0.59) = 0.118 = 11.8%
7. (a) Throw a large number of similar thumbtacks or one thumbtack a large number of times, and record the relative frequency of the outcomes. Assume that the thumbtack falls either flat side down or tilted. To estimate the probability the tack lands on its flat side, find the relative frequency of this occurrence, dividing the number of times this occurred by the total number of thumbtack tosses.
(b) The sample space consists of two outcomes: flat side down and tilted.
(c) P(flat side down) = 340/500 = 0.68
P(tilted) = 1 – 0.68 = 0.32
8. (a) P(N) = 470/1000 = 0.470
P(M) = 390/1000 = 0.390
P(S) = 140/1000 = 0.140
(b) P(N | W) = 420/500 = 0.840
P(S | W) = 20/500 = 0.040
(c) P(N | A) = 50/500 = 0.100
P(S | A) = 120/500 = 0.240
(d) P(N and W) = P(W) × P(N | W) = (0.50) × (0.84) = 0.42
P(M and W) = P(W) × P(M | W) = (0.50) × (0.12) = 0.06
(e) P(N or M) = P(N) + P(M) if mutually exclusive
= 470/1,000 + 390/1,000 = 860/1,000 = 0.860
No reaction and a mild reaction are mutually exclusive; they cannot occur at the same time.
(f) If N and W were independent, P(N and W) = P(N) · P(W) = (0.470) × (0.500) = 0.235. However, from (d), we have P(N and W) = 0.420. They are not independent.
9. (a) Possible values for x are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12.
(b) Below, the values for x are listed, along with the combinations required.
2     1 and 1                                                 1 way
3     1 and 2, or 2 and 1                                     2 ways
4     1 and 3, 2 and 2, 3 and 1                               3 ways
5     1 and 4, 2 and 3, 3 and 2, 4 and 1                      4 ways
6     1 and 5, 2 and 4, 3 and 3, 4 and 2, 5 and 1             5 ways
7     1 and 6, 2 and 5, 3 and 4, 4 and 3, 5 and 2, 6 and 1    6 ways
8     2 and 6, 3 and 5, 4 and 4, 5 and 3, 6 and 2             5 ways
9     3 and 6, 4 and 5, 5 and 4, 6 and 3                      4 ways
10    4 and 6, 5 and 5, 6 and 4                               3 ways
11    5 and 6, 6 and 5                                        2 ways
12    6 and 6                                                 1 way
Part IV: Complete Solutions, Chapter 4 317
Copyright © Houghton Mifflin Company. All rights reserved.
There are (6)(6) = 36 possible, equally likely outcomes. (The sums, however, are not equally likely.)
x    P(x)
2    1/36 ≈ 0.028
3    2/36 ≈ 0.056
4    3/36 ≈ 0.083
5    4/36 ≈ 0.111
6    5/36 ≈ 0.139
7    6/36 ≈ 0.167
8    5/36 ≈ 0.139
9    4/36 ≈ 0.111
10   3/36 ≈ 0.083
11   2/36 ≈ 0.056
12   1/36 ≈ 0.028
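The table for the sum of two dice can be checked by brute-force enumeration. The following is a minimal sketch, assuming two fair six-sided dice; the function name `two_dice_ways` is illustrative and not part of the original solution.

```python
from collections import Counter
from itertools import product

def two_dice_ways():
    """Count how many of the 36 equally likely (die1, die2) pairs give each sum."""
    return Counter(a + b for a, b in product(range(1, 7), repeat=2))

ways = two_dice_ways()
for total in sorted(ways):
    # Each sum's probability is (number of ways) / 36.
    print(f"{total:2d}  {ways[total]}/36  ~ {ways[total] / 36:.3f}")
```

Enumerating ordered pairs, rather than unordered ones, is what keeps the 36 outcomes equally likely.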
10. P(pass 101) = 0.77
P(pass 102 | pass 101) = 0.90
P(pass 101 and pass 102) = P(pass 101) × P(pass 102 | pass 101) = (0.77) × (0.90) = 0.693
11. \(C_{8,2} = \frac{8!}{2!\,6!} = \frac{8 \cdot 7 \cdot 6!}{(2 \cdot 1)\,6!} = \frac{56}{2} = 28\)
12. (a) \(P_{7,2} = \frac{7!}{(7-2)!} = \frac{7!}{5!} = 7(6) = 42\)
(b) \(C_{7,2} = \frac{7!}{2!\,5!} = \frac{7 \cdot 6}{2} = 21\)
(c) \(P_{3,3} = \frac{3!}{(3-3)!} = \frac{3!}{0!} = 6\)
(d) \(C_{4,4} = \frac{4!}{4!\,(4-4)!} = \frac{4!}{4!\,0!} = 1\)
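These permutation and combination counts can be reproduced with Python's standard library. This is a small sketch, assuming Python 3.8 or later, where `math.perm` and `math.comb` are available; the dictionary name `counts` is illustrative.

```python
import math

# P(n, r) counts ordered selections; C(n, r) counts unordered selections.
counts = {
    "C(8,2)": math.comb(8, 2),
    "P(7,2)": math.perm(7, 2),
    "C(7,2)": math.comb(7, 2),
    "P(3,3)": math.perm(3, 3),
    "C(4,4)": math.comb(4, 4),
}

for name, value in counts.items():
    print(name, "=", value)
```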
13. Five multiple-choice questions, each with four possible answers (A, B, C, or D).
There are 4 × 4 × 4 × 4 × 4 = 1,024 possible sequences, such as A, D, B, B.
\(P(\text{getting the correct sequence}) = \frac{1}{1024} \approx 0.00098\)
14.
15. There are 10 possible numbers per turn of the dial, and we turn the dial three times.
There are 10 × 10 × 10 = 1,000 possible combinations.
16. The combination uses the three numbers 2, 9, and 5, in an ordered sequence.
The number of sequences is \(P_{3,3} = \frac{3!}{(3-3)!} = \frac{3 \cdot 2 \cdot 1}{0!} = 6\).
The possible combinations are 259, 295, 529, 592, 925, and 952.
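The six orderings can also be listed mechanically. The sketch below uses `itertools.permutations`; the helper name `orderings` is an illustrative assumption.

```python
from itertools import permutations

def orderings(digits):
    """Return every ordering of the given digits as a string."""
    return ["".join(p) for p in permutations(digits)]

codes = sorted(orderings("295"))
print(len(codes), codes)
```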
Chapter 5: The Binomial Probability Distribution and Related Topics
Section 5.1
1. (a) The number of traffic fatalities can be only a whole number. This is a discrete random variable.
(b) Distance can assume any value, so this is a continuous random variable.
(c) Time can take on any value, so this is a continuous random variable.
(d) The number of ships can be only a whole number. This is a discrete random variable.
(e) Weight can assume any value, so this is a continuous random variable.
2. (a) Speed can assume any value, so this is a continuous random variable.
(b) Age can take on any value, so this is a continuous random variable.
(c) Number of books can be only a whole number, so this is a discrete random variable.
(d) Weight can assume any value, so this is a continuous random variable.
(e) Number of lightning strikes can be only a whole number, so this is a discrete random variable.
3. (a) \(\sum P(x) = 0.25 + 0.60 + 0.15 = 1.00\)
Yes, this is a valid probability distribution because the sum of the probabilities is 1, each probability is between 0 and 1 inclusive, and each event is assigned a probability.
(b) \(\sum P(x) = 0.25 + 0.60 + 0.20 = 1.05\)
No, this is not a probability distribution because the probabilities sum to more than 1.
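The two validity conditions used in problem 3 (each probability between 0 and 1, and a total of 1) are easy to check programmatically. The following is a minimal sketch; the function name `is_valid_distribution` and the tolerance value are assumptions made for illustration.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """True if every probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

print(is_valid_distribution([0.25, 0.60, 0.15]))  # distribution from part (a)
print(is_valid_distribution([0.25, 0.60, 0.20]))  # distribution from part (b)
```

A tolerance is used because sums of decimal fractions are not always exact in floating-point arithmetic.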
4. No, the expected value of a random variable x can be a value different from the exact values of x. For example, if we have the following random variable, the expected value is μ = (0 × 0.5) + (1 × 0.5) = 0.50.
x      0     1
P(x)   0.5   0.5
5. (a) Yes, seven of the ten digits are assigned to “make a basket.”
(b) Let S represent “make a basket” and F represent “miss.” We have F F S S S F F F S S
(c) Yes, again, seven of the ten digits represent “make a basket.” We have S S S S S S S S S S
6. (a) \(\sum P(x) = 0.07 + 0.44 + 0.24 + 0.14 + 0.11 = 1.00\)
Yes, this is a valid probability distribution because the events are distinct and the probabilities sum to 1.
(b) [Figure: histogram, Age of Promotion Sensitive Shoppers. Horizontal axis: age (23, 34, 45, 56, 67); vertical axis: percent.]
(c) \(\mu = \sum x P(x) = 23(0.07) + 34(0.44) + 45(0.24) + 56(0.14) + 67(0.11) = 42.58\)
(d) \(\sigma = \sqrt{\sum (x - \mu)^2 P(x)}\)
\(= \sqrt{(-19.58)^2(0.07) + (-8.58)^2(0.44) + (2.42)^2(0.24) + (13.42)^2(0.14) + (24.42)^2(0.11)}\)
\(= \sqrt{151.44} \approx 12.31\)
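The mean and standard deviation formulas used throughout this section translate directly into a few lines of code. The following is a minimal sketch using the values from problem 6; the helper name `mean_and_sd` is an illustrative assumption.

```python
import math

def mean_and_sd(values, probs):
    """Mean and standard deviation of a discrete probability distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, math.sqrt(variance)

ages = [23, 34, 45, 56, 67]
probs = [0.07, 0.44, 0.24, 0.14, 0.11]
mu, sigma = mean_and_sd(ages, probs)
print(round(mu, 2), round(sigma, 2))
```

The same function can be reused for problems 7 through 10 by swapping in their values and probabilities.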
7. (a) \(\sum P(x) = 0.21 + 0.14 + 0.22 + 0.15 + 0.20 + 0.08 = 1.00\)
Yes, this is a valid probability distribution because the events are distinct and the probabilities sum to 1.
(b) [Figure: Histogram of Income Distribution. Horizontal axis: income (10, 20, 30, 40, 50, 60); vertical axis: percent.]
(c) \(\mu = \sum x P(x) = 10(0.21) + 20(0.14) + 30(0.22) + 40(0.15) + 50(0.20) + 60(0.08) = 32.3\)
(d) \(\sigma = \sqrt{\sum (x - \mu)^2 P(x)}\)
\(= \sqrt{(-22.3)^2(0.21) + (-12.3)^2(0.14) + (-2.3)^2(0.22) + (7.7)^2(0.15) + (17.7)^2(0.20) + (27.7)^2(0.08)}\)
\(= \sqrt{259.71} \approx 16.12\)
8. (a) \(\sum P(x) = 0.057 + 0.097 + 0.195 + 0.292 + 0.250 + 0.091 + 0.018 = 1.000\)
Yes, this is a valid probability distribution because the outcomes are distinct and the probabilities sum to 1.
(b) [Figure: Histogram of British Nurse Ages. Horizontal axis: age (24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5); vertical axis: percent.]
(c) \(P(\text{60 years of age or older}) = P(64.5) + P(74.5) + P(84.5) = 0.250 + 0.091 + 0.018 = 0.359\)
The probability is 35.9%.
(d) \(\mu = \sum x P(x)\)
\(= 24.5(0.057) + 34.5(0.097) + 44.5(0.195) + 54.5(0.292) + 64.5(0.250) + 74.5(0.091) + 84.5(0.018)\)
\(= 53.76\)
(e) \(\sigma = \sqrt{\sum (x - \mu)^2 P(x)}\)
\(= \sqrt{(-29.26)^2(0.057) + (-19.26)^2(0.097) + (-9.26)^2(0.195) + (0.74)^2(0.292) + (10.74)^2(0.250) + (20.74)^2(0.091) + (30.74)^2(0.018)}\)
\(= \sqrt{186.65} \approx 13.66\)
9. (a) [Figure: Histogram of Number of Trout Caught. Horizontal axis: number of trout (0, 1, 2, 3, 4); vertical axis: percent.]
(b) \(P(\text{1 or more}) = 1 - P(0) = 1 - 0.44 = 0.56\)
(c) \(P(\text{2 or more}) = P(2) + P(3) + P(\text{4 or more}) = 0.15 + 0.04 + 0.01 = 0.20\)
(d) \(\mu = \sum x P(x) = 0(0.44) + 1(0.36) + 2(0.15) + 3(0.04) + 4(0.01) = 0.82\)
(e) \(\sigma = \sqrt{\sum (x - \mu)^2 P(x)}\)
\(= \sqrt{(-0.82)^2(0.44) + (0.18)^2(0.36) + (1.18)^2(0.15) + (2.18)^2(0.04) + (3.18)^2(0.01)}\)
\(= \sqrt{0.8076} \approx 0.899\)
10. \(\sum P(x) \ne 1.000\) owing to rounding.
(a) \(P(\text{1 or more}) = 1 - P(0) = 1 - 0.237 = 0.763\)
(b) \(P(\text{2 or more}) = P(2) + P(3) + P(4) + P(5) = 0.264 + 0.088 + 0.015 + 0.001 = 0.368\)
(c) \(P(\text{4 or more}) = P(4) + P(5) = 0.015 + 0.001 = 0.016\)
(d) \(\mu = \sum x P(x) = 0(0.237) + 1(0.396) + 2(0.264) + 3(0.088) + 4(0.015) + 5(0.001) = 1.253\)
(e) \(\sigma = \sqrt{\sum (x - \mu)^2 P(x)}\)
\(= \sqrt{(-1.253)^2(0.237) + (-0.253)^2(0.396) + (0.747)^2(0.264) + (1.747)^2(0.088) + (2.747)^2(0.015) + (3.747)^2(0.001)}\)
\(= \sqrt{0.941} \approx 0.97\)
11. (a) \(P(\text{win}) = \frac{15}{719} \approx 0.021\)
\(P(\text{not win}) = \frac{719 - 15}{719} = \frac{704}{719} \approx 0.979\)
(b) \(\text{Expected earnings} = (\text{value of dinner})(\text{probability of winning}) = \$35\left(\frac{15}{719}\right) \approx \$0.73\)
Lisa’s expected earnings are $0.73.
\(\text{Contribution} = \$15 - \$0.73 = \$14.27\)
Lisa effectively contributed $14.27 to the hiking club.
12. (a) \(P(\text{win}) = \frac{6}{2{,}852} \approx 0.0021\)
\(P(\text{not win}) = \frac{2{,}852 - 6}{2{,}852} = \frac{2{,}846}{2{,}852} \approx 0.9979\)
(b) \(\text{Expected earnings} = (\text{value of cruise})(\text{probability of winning}) \approx \$2{,}000(0.0021) \approx \$4.20\)
Kevin spent 6($5) = $30 for the tickets. His expected earnings are less than the amount he paid.
\(\text{Contribution} = \$30 - \$4.20 = \$25.80\)
Kevin effectively contributed $25.80 to the homeless center.
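The raffle calculations in problems 11 and 12 follow one pattern: expected earnings equal the prize value times the probability of winning, and the effective contribution is the amount paid minus the expected earnings. A minimal sketch follows; the function name `raffle_summary` is an illustrative assumption. Because the code does not round the probability to 0.0021 first, Kevin's expected earnings come out a fraction of a cent above the $4.20 shown above.

```python
def raffle_summary(prize_value, tickets_held, tickets_sold, amount_paid):
    """Return (expected earnings, effective contribution) for one raffle entrant."""
    p_win = tickets_held / tickets_sold
    expected = prize_value * p_win
    return expected, amount_paid - expected

lisa = raffle_summary(35, 15, 719, 15)       # problem 11
kevin = raffle_summary(2000, 6, 2852, 30)    # problem 12
print([round(v, 2) for v in lisa])
print([round(v, 2) for v in kevin])
```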
13. (a) \(P(\text{60 years}) = 0.01191\)
\(\text{Expected loss} = \$50{,}000(0.01191) = \$595.50\)
The expected loss for Big Rock Insurance is $595.50.
(b)
Probability         Expected Loss
P(61) = 0.01292     $50,000(0.01292) = $646
P(62) = 0.01396     $50,000(0.01396) = $698
P(63) = 0.01503     $50,000(0.01503) = $751.50
P(64) = 0.01613     $50,000(0.01613) = $806.50
\(\text{Expected loss} = \$595.50 + \$646 + \$698 + \$751.50 + \$806.50 = \$3{,}497.50\)
The total expected loss is $3,497.50.
(c) $3,497.50 + $700 = $4,197.50
They should charge $4,197.50.
(d) $5,000 − $3,497.50 = $1,502.50
They can expect to make $1,502.50.
14. (a) \(P(\text{60 years}) = 0.00756\)
\(\text{Expected loss} = \$50{,}000(0.00756) = \$378\)
The expected loss for Big Rock Insurance is $378.
(b)
Probability         Expected Loss
P(61) = 0.00825     $50,000(0.00825) = $412.50
P(62) = 0.00896     $50,000(0.00896) = $448
P(63) = 0.00965     $50,000(0.00965) = $482.50
P(64) = 0.01035     $50,000(0.01035) = $517.50
\(\text{Expected loss} = \$378 + \$412.50 + \$448 + \$482.50 + \$517.50 = \$2{,}238.50\)
The total expected loss is $2,238.50.
(c) $2,238.50 + $700 = $2,938.50
They should charge $2,938.50.
(d) $5,000 − $2,238.50 = $2,761.50
They can expect to make $2,761.50.
15. (a) W = x1 − x2; a = 1, b = −1
\(\mu_W = \mu_1 - \mu_2 = 115 - 100 = 15\)
\(\sigma_W^2 = 1^2\sigma_1^2 + (-1)^2\sigma_2^2 = 12^2 + 8^2 = 208\)
\(\sigma_W = \sqrt{\sigma_W^2} = \sqrt{208} \approx 14.4\)
(b) W = 0.5x1 + 0.5x2; a = 0.5, b = 0.5
\(\mu_W = 0.5\mu_1 + 0.5\mu_2 = 0.5(115) + 0.5(100) = 107.5\)
\(\sigma_W^2 = (0.5)^2\sigma_1^2 + (0.5)^2\sigma_2^2 = 0.25(12)^2 + 0.25(8)^2 = 52\)
\(\sigma_W = \sqrt{\sigma_W^2} = \sqrt{52} \approx 7.2\)
(c) L = 0.8x1 − 2; a = −2, b = 0.8
\(\mu_L = -2 + 0.8\mu_1 = -2 + 0.8(115) = 90\)
\(\sigma_L^2 = (0.8)^2\sigma_1^2 = 0.64(12)^2 = 92.16\)
\(\sigma_L = \sqrt{\sigma_L^2} = \sqrt{92.16} = 9.6\)
(d) L = 0.95x2 − 5; a = −5, b = 0.95
\(\mu_L = -5 + 0.95\mu_2 = -5 + 0.95(100) = 90\)
\(\sigma_L^2 = (0.95)^2\sigma_2^2 = 0.9025(8)^2 = 57.76\)
\(\sigma_L = \sqrt{\sigma_L^2} = \sqrt{57.76} = 7.6\)
16. (a) W = x1 + x2; a = 1, b = 1
\(\mu_W = \mu_1 + \mu_2 = 28.1 + 90.5 = 118.6\) minutes
\(\sigma_W^2 = \sigma_1^2 + \sigma_2^2 = (8.2)^2 + (15.2)^2 = 298.28\)
\(\sigma_W = \sqrt{\sigma_W^2} = \sqrt{298.28} \approx 17.27\) minutes
(b) W = 1.50x1 + 2.75x2; a = 1.50, b = 2.75
\(\mu_W = 1.50\mu_1 + 2.75\mu_2 = 1.50(28.1) + 2.75(90.5) \approx \$291.03\)
\(\sigma_W^2 = (1.50)^2\sigma_1^2 + (2.75)^2\sigma_2^2 = 2.25(8.2)^2 + 7.5625(15.2)^2 = 1{,}898.53\)
\(\sigma_W = \sqrt{\sigma_W^2} = \sqrt{1{,}898.53} \approx \$43.57\)
(c) L = 1.5x1 + 50; a = 50, b = 1.5
\(\mu_L = 50 + 1.5\mu_1 = 50 + 1.5(28.1) = \$92.15\)
\(\sigma_L^2 = (1.5)^2\sigma_1^2 = 2.25(8.2)^2 = 151.29\)
\(\sigma_L = \sqrt{\sigma_L^2} = \sqrt{151.29} = \$12.30\)
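The rules applied in problems 15 and 16 can be collected into one helper. The sketch below assumes, as the variance formula in these solutions does, that x1 and x2 are independent; the function name `linear_combination` is an illustrative assumption.

```python
import math

def linear_combination(a, mu1, sigma1, b, mu2, sigma2):
    """Mean and standard deviation of W = a*x1 + b*x2 for independent x1 and x2."""
    mean = a * mu1 + b * mu2
    variance = a ** 2 * sigma1 ** 2 + b ** 2 * sigma2 ** 2
    return mean, math.sqrt(variance)

# Problem 16(b): W = 1.50*x1 + 2.75*x2
mean_w, sd_w = linear_combination(1.50, 28.1, 8.2, 2.75, 90.5, 15.2)
print(round(mean_w, 2), round(sd_w, 2))
```

Note that the coefficients are squared in the variance, so a coefficient of −1 (as in problem 15(a)) still adds variance rather than subtracting it.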
17. (a) W = 0.5x1 + 0.5x2; a = 0.5, b = 0.5
\(\mu_W = 0.5\mu_1 + 0.5\mu_2 = 0.5(50.2) + 0.5(50.2) = 50.2\)
\(\sigma_W^2 = (0.5)^2\sigma_1^2 + (0.5)^2\sigma_2^2 = (0.5)^2(11.5)^2 + (0.5)^2(11.5)^2 = 66.125\)
\(\sigma_W = \sqrt{\sigma_W^2} = \sqrt{66.125} \approx 8.13\)
(b) Single policy \((x_1)\): \(\mu_1 = 50.2\)
Two policies \((W)\): \(\mu_W = 50.2\)
The means are the same.
(c) Single policy \((x_1)\): \(\sigma_1 = 11.5\)
Two policies \((W)\): \(\sigma_W \approx 8.13\)
The standard deviation for the average of two policies is smaller.
(d) Yes, the risk decreases by a factor of \(\frac{1}{\sqrt{n}}\) because \(\sigma_W = \frac{1}{\sqrt{n}}\sigma\).
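The effect described in part (d) can be tabulated for several values of n. This is a minimal sketch, assuming n independent policies that each have standard deviation 11.5, as in problem 17; the function name `sd_of_average` is an illustrative assumption.

```python
import math

def sd_of_average(sigma, n):
    """Standard deviation of the average of n independent values, each with SD sigma."""
    return sigma / math.sqrt(n)

for n in (1, 2, 4, 9):
    print(n, round(sd_of_average(11.5, n), 2))
```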
Section 5.2
1. The random variable counts the number of successes that occur in the n trials.
2. The outcome of one trial will not affect the probability of success of any other trial.
3. Binomial experiments have two possible outcomes, denoted success and failure.
4. In a binomial experiment, the probability of success does not change for each trial.
5. (a) No, there must be only two outcomes for each trial. Here, there are three outcomes.
(b) Yes. If we combined outcomes B and C into a single outcome, then we have a binomial experiment. The probability of success for each trial is P(A) = p = 0.40.
6. Yes, the five trials are independent, repeated under the same conditions, and have the same two outcomes (win or not win) and the same probability of success on each trial. Here, n = 5, p = 1/6, and r = 2.
7. (a) A trial is the random selection of one student and noting whether the student is a freshman or is not a freshman. Here, the probability of success is p = 0.40 and the probability of a failure is 1 – 0.40 = 0.60.
(b) For a small population of size 30, sampling without replacement will alter the probability of drawing a freshman. In this situation, the hypergeometric distribution is appropriate.
8. (a) Yes, 90% of the digits are assigned to “successful surgery”.
(b) S F S S S S S S S S S S S S F
(c) Yes, the assignment is fine. This simulation produces all successes.
9. A trial is one flip of a fair quarter. Success = head. Failure = tail.
\(n = 3,\ p = 0.5,\ q = 1 - 0.5 = 0.5\)
(a) \(P(3) = C_{3,3}(0.5)^3(0.5)^{3-3} = 1(0.5)^3(0.5)^0 = 0.125\)
To find this value in Table 3 of Appendix II, use the group in which n = 3, the column headed by p = 0.5 and the row headed by r = 3.
(b) \(P(2) = C_{3,2}(0.5)^2(0.5)^{3-2} = 3(0.5)^2(0.5)^1 = 0.375\)
To find this value in Table 3 of Appendix II, use the group in which n = 3, the column headed by p = 0.5 and the row headed by r = 2.
(c) \(P(r \ge 2) = P(2) + P(3) = 0.375 + 0.125 = 0.5\)
(d) The probability of getting exactly three tails is the same as getting exactly zero heads.
\(P(0) = C_{3,0}(0.5)^0(0.5)^{3-0} = 1(0.5)^0(0.5)^3 = 0.125\)
To find this value in Table 3 of Appendix II, use the group in which n = 3, the column headed by p = 0.5 and the row headed by r = 0.
10. A trial is answering a question on the quiz. Success = correct answer. Failure = incorrect answer.
\(n = 10,\ p = \frac{1}{5} = 0.2,\ q = 1 - 0.2 = 0.8\)
(a) \(P(10) = C_{10,10}(0.2)^{10}(0.8)^{10-10} = 1(0.2)^{10}(0.8)^0 = 0.000\) (to three digits)
(b) Answering 10 incorrectly is the same as answering 0 correctly.
\(P(0) = C_{10,0}(0.2)^0(0.8)^{10-0} = 1(0.2)^0(0.8)^{10} = 0.107\)
(c) First method:
\(P(r \ge 1) = P(1) + P(2) + P(3) + P(4) + P(5) + P(6) + P(7) + P(8) + P(9) + P(10)\)
\(= 0.268 + 0.302 + 0.201 + 0.088 + 0.026 + 0.006 + 0.001 + 0.000 + 0.000 + 0.000 = 0.892\)
Second method:
\(P(r \ge 1) = 1 - P(0) = 1 - 0.107 = 0.893\)
The two results should be equal, but because of rounding error, they differ slightly.
(d) \(P(r \ge 5) = P(5) + P(6) + P(7) + P(8) + P(9) + P(10)\)
\(= 0.026 + 0.006 + 0.001 + 0.000 + 0.000 + 0.000 = 0.033\)
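The table lookups in this section can be cross-checked by evaluating the binomial formula \(P(r) = C_{n,r}\,p^r q^{n-r}\) directly. The following is a minimal sketch; the function names `binom_pmf` and `binom_at_least` are illustrative assumptions. Computing from the formula avoids the rounding discrepancy between the two methods in problem 10(c).

```python
from math import comb

def binom_pmf(r, n, p):
    """P(r successes in n independent trials, each with success probability p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def binom_at_least(k, n, p):
    """P(r >= k), found by summing the individual probabilities."""
    return sum(binom_pmf(r, n, p) for r in range(k, n + 1))

# Problem 10: n = 10 quiz questions, p = 0.2
print(round(binom_pmf(0, 10, 0.2), 3))
print(round(binom_at_least(1, 10, 0.2), 3))
print(round(binom_at_least(5, 10, 0.2), 3))
```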
11. A trial consists of determining the sex of a wolf. Success = male. Failure = female.
(a) \(n = 12,\ p = 0.55,\ q = 0.45\)
\(P(r \ge 6) = P(6) + P(7) + P(8) + P(9) + P(10) + P(11) + P(12)\)
\(= 0.212 + 0.223 + 0.170 + 0.092 + 0.034 + 0.008 + 0.001 = 0.740\)
Six or more females is the same as six or fewer males.
\(P(r \le 6) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5) + P(6)\)
\(= 0.000 + 0.001 + 0.007 + 0.028 + 0.076 + 0.149 + 0.212 = 0.473\)
Fewer than four females is the same as more than eight males.
\(P(r > 8) = P(9) + P(10) + P(11) + P(12) = 0.092 + 0.034 + 0.008 + 0.001 = 0.135\)
(b) \(n = 12,\ p = 0.70,\ q = 0.30\)
\(P(r \ge 6) = P(6) + P(7) + P(8) + P(9) + P(10) + P(11) + P(12)\)
\(= 0.079 + 0.158 + 0.231 + 0.240 + 0.168 + 0.071 + 0.014 = 0.961\)
\(P(r \le 6) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5) + P(6)\)
\(= 0.000 + 0.000 + 0.000 + 0.001 + 0.008 + 0.029 + 0.079 = 0.117\)
\(P(r > 8) = P(9) + P(10) + P(11) + P(12) = 0.240 + 0.168 + 0.071 + 0.014 = 0.493\)
12. A trial is a one-time fling. Success = has done a one-time fling. Failure = has not done a one-time fling.
\(n = 7,\ p = 0.10,\ q = 1 - 0.10 = 0.90\)
(a) \(P(0) = C_{7,0}(0.10)^0(0.90)^{7-0} = 1(0.10)^0(0.90)^7 = 0.478\)
(b) \(P(r \ge 1) = 1 - P(0) = 1 - 0.478 = 0.522\)
(c) \(P(r \le 2) = P(0) + P(1) + P(2) = 0.478 + 0.372 + 0.124 = 0.974\)
13. A trial consists of a woman’s response regarding her mother-in-law. Success = dislike. Failure = like.
\(n = 6,\ p = 0.90,\ q = 1 - 0.90 = 0.10\)
(a) \(P(6) = C_{6,6}(0.90)^6(0.10)^{6-6} = 1(0.90)^6(0.10)^0 = 0.531\)
(b) \(P(0) = C_{6,0}(0.90)^0(0.10)^{6-0} = 1(0.90)^0(0.10)^6 \approx 0.000\) (to 3 digits)
(c) \(P(r \ge 4) = P(4) + P(5) + P(6) = 0.098 + 0.354 + 0.531 = 0.983\)
(d) \(P(r \le 3) = 1 - P(r \ge 4) \approx 1 - 0.983 = 0.017\)
From the table:
\(P(r \le 3) = P(0) + P(1) + P(2) + P(3) = 0.000 + 0.000 + 0.001 + 0.015 = 0.016\)
14. A trial is how a businessman wears a tie. Success = too tight. Failure = not too tight.
\(n = 20,\ p = 0.10,\ q = 1 - 0.10 = 0.90\)
(a) \(P(r \ge 1) = 1 - P(r = 0) = 1 - 0.122 = 0.878\)
(b) \(P(r > 2) = 1 - P(r \le 2)\)
\(= 1 - [P(r = 0) + P(r = 1) + P(r = 2)]\)
\(= 1 - P(r = 0) - P(r = 1) - P(r = 2)\)
\(= P(r \ge 1) - P(r = 1) - P(r = 2)\)
\(= 0.878 - 0.270 - 0.285\) using (a)
\(= 0.323\)
(c) \(P(r = 0) = 0.122\)
(d) “At least 18 are not too tight” is the same as “at most 2 are too tight.” (To see this, note that “at least 18 failures” is the same as “18 or 19 or 20 failures,” which is 2, 1, or 0 successes, i.e., at most 2 successes.)
\(P(r \le 2) = 1 - P(r > 2) = 1 - 0.323 = 0.677\) using (b)
15. A trial consists of taking a polygraph examination. Success = pass. Failure = fail.
\(n = 9,\ p = 0.85,\ q = 1 - 0.85 = 0.15\)
(a) \(P(9) = 0.232\)
(b) \(P(r \ge 5) = P(5) + P(6) + P(7) + P(8) + P(9) = 0.028 + 0.107 + 0.260 + 0.368 + 0.232 = 0.995\)
(c) \(P(r \le 4) = 1 - P(r \ge 5) = 1 - 0.995 = 0.005\)
From the table:
\(P(r \le 4) = P(0) + P(1) + P(2) + P(3) + P(4) = 0.000 + 0.000 + 0.000 + 0.001 + 0.005 = 0.006\)
The two results should be equal, but because of rounding error, they differ slightly.
(d) All students failing is the same as no students passing.
\(P(0) = 0.000\) (to 3 digits)
16. A trial consists of checking the gross receipts of the store for one business day. Success = gross over $850. Failure = gross is at or below $850. p = 0.6, q = 1 − p = 0.4.
(a) \(n = 5\)
\(P(r \ge 3) = P(3) + P(4) + P(5) = 0.346 + 0.259 + 0.078 = 0.683\)
(b) \(n = 10\)
\(P(r \ge 6) = P(6) + P(7) + P(8) + P(9) + P(10) = 0.251 + 0.215 + 0.121 + 0.040 + 0.006 = 0.633\)
(c) \(n = 10\)
\(P(r < 5) = P(0) + P(1) + P(2) + P(3) + P(4) = 0.000 + 0.002 + 0.011 + 0.042 + 0.111 = 0.166\)
(d) \(n = 20\)
\(P(r < 6) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5) = 0.000 + 0.000 + 0.000 + 0.000 + 0.000 + 0.001 = 0.001\)
Yes. If p were really 0.60, then a 20-day period with gross income exceeding $850 on fewer than 6 days would be very rare. If it happened again, we would suspect that p = 0.60 is too high.
(e) \(n = 20\)
\(P(r > 17) = P(18) + P(19) + P(20) = 0.003 + 0.000 + 0.000 = 0.003\)
Yes. If p were really 0.60, then a 20-day period with gross income exceeding $850 on more than 17 days would be very rare. If it happened again, we would suspect that p = 0.60 is too low.
17. (a) A trial consists of using the Myers-Briggs instrument to determine if a person in marketing is an extrovert. Success = extrovert. Failure = not extrovert.
\(n = 15,\ p = 0.75,\ q = 1 - 0.75 = 0.25\)
\(P(r \ge 10) = P(10) + P(11) + P(12) + P(13) + P(14) + P(15)\)
\(= 0.165 + 0.225 + 0.225 + 0.156 + 0.067 + 0.013 = 0.851\)
\(P(r \ge 5) = P(5) + P(6) + P(7) + P(8) + P(9) + P(r \ge 10)\)
\(= 0.001 + 0.003 + 0.013 + 0.039 + 0.092 + 0.851 = 0.999\)
\(P(15) = 0.013\)
(b) A trial consists of using the Myers-Briggs instrument to determine if a computer programmer is an introvert. Success = introvert. Failure = not introvert.
\(n = 5,\ p = 0.60,\ q = 1 - 0.60 = 0.40\)
\(P(0) = 0.010\)
\(P(r \ge 3) = P(3) + P(4) + P(5) = 0.346 + 0.259 + 0.078 = 0.683\)
18. A trial consists of the response from adults regarding their concern that employers are monitoring phone calls. Success = yes. Failure = no.
\(p = 0.37,\ q = 1 - 0.37 = 0.63\)
(a) \(n = 5\)
\(P(0) = C_{5,0}(0.37)^0(0.63)^{5-0} = 1(0.37)^0(0.63)^5 \approx 0.099\)
(b) \(n = 5\)
\(P(5) = C_{5,5}(0.37)^5(0.63)^{5-5} = 1(0.37)^5(0.63)^0 \approx 0.007\)
(c) \(n = 5\)
\(P(3) = C_{5,3}(0.37)^3(0.63)^{5-3} = 10(0.37)^3(0.63)^2 \approx 0.201\)
19. A trial consists of the response from adults regarding their concern that Social Security numbers are used for general identification. Success = concerned that SS numbers are being used for identification. Failure = not concerned that SS numbers are being used for identification.
\(n = 8,\ p = 0.53,\ q = 1 - 0.53 = 0.47\)
(a) \(P(r \le 5) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5)\)
\(= 0.002381 + 0.021481 + 0.084781 + 0.191208 + 0.269521 + 0.243143 = 0.812515\)
\(P(r \le 5) = 0.81251\) from the cumulative probability is the same, truncated to 5 digits.
(b) \(P(r > 5) = P(6) + P(7) + P(8) = 0.137091 + 0.044169 + 0.006226 = 0.187486\)
\(P(r > 5) = 1 - P(r \le 5) = 1 - 0.81251 = 0.18749\)
Yes, this is the same result rounded to 5 digits.
20. A trial consists of an office visit.
(a) Success = visitor age is under 15 years old. Failure = visitor age is 15 years old or older.
\(n = 8,\ p = 0.20,\ q = 1 - 0.20 = 0.80\)
\(P(r \ge 4) = P(4) + P(5) + P(6) + P(7) + P(8) = 0.046 + 0.009 + 0.001 + 0.000 + 0.000 = 0.056\)
(b) Success = visitor age is 65 years old or older. Failure = visitor age is under 65 years old.
\(n = 8,\ p = 0.25,\ q = 1 - 0.25 = 0.75\)
\(P(2 \le r \le 5) = P(2) + P(3) + P(4) + P(5) = 0.311 + 0.208 + 0.087 + 0.023 = 0.629\)
(c) Success = visitor age is 45 years old or older. Failure = visitor age is less than 45 years old.
\(n = 8,\ p = 0.20 + 0.25 = 0.45,\ q = 1 - 0.45 = 0.55\)
\(P(2 \le r \le 5) = P(2) + P(3) + P(4) + P(5) = 0.157 + 0.257 + 0.263 + 0.172 = 0.849\)
(d) Success = visitor age is under 25 years old. Failure = visitor age is 25 years old or older.
\(n = 8,\ p = 0.20 + 0.10 = 0.30,\ q = 1 - 0.30 = 0.70\)
\(P(8) = 0.000\) (to 3 digits)
(e) Success = visitor age is 15 years old or older. Failure = visitor age is under 15 years old.
\(n = 8,\ p = 0.10 + 0.25 + 0.20 + 0.25 = 0.80,\ q = 0.20\)
\(P(8) = 0.168\)
21. (a) \(p = 0.30,\ P(3) = 0.132\)
\(p = 0.70,\ P(2) = 0.132\)
They are the same.
(b) \(p = 0.30,\ P(r \ge 3) = 0.132 + 0.028 + 0.002 = 0.162\)
\(p = 0.70,\ P(r \le 2) = 0.002 + 0.028 + 0.132 = 0.162\)
They are the same.
(c) \(p = 0.30,\ P(4) = 0.028\)
\(p = 0.70,\ P(1) = 0.028\)
\(r = 1\)
(d) The column headed by p = 0.80 is symmetric with the one headed by p = 0.20.
22. \(n = 3,\ p = 0.0228,\ q = 1 - p = 0.9772\)
(a) \(P(2) = C_{3,2}\,p^2 q^{3-2} = 3(0.0228)^2(0.9772)^1 = 0.00152\)
(b) \(P(3) = C_{3,3}\,p^3 q^{3-3} = 1(0.0228)^3(0.9772)^0 = 0.00001\)
(c) \(P(2 \text{ or } 3) = P(2) + P(3) = 0.00153\)
23. (a) n = 8; p = 0.65
\(P(6 \le r \text{, given } 4 \le r) = \frac{P(6 \le r \text{ and } 4 \le r)}{P(4 \le r)} = \frac{P(6 \le r)}{P(4 \le r)}\)
\(= \frac{P(6) + P(7) + P(8)}{P(4) + P(5) + P(6) + P(7) + P(8)}\)
\(= \frac{0.259 + 0.137 + 0.032}{0.188 + 0.279 + 0.259 + 0.137 + 0.032} = \frac{0.428}{0.895} = 0.478\)
(b) n = 10; p = 0.65
\(P(8 \le r \text{, given } 6 \le r) = \frac{P(8 \le r \text{ and } 6 \le r)}{P(6 \le r)} = \frac{P(8 \le r)}{P(6 \le r)}\)
\(= \frac{P(8) + P(9) + P(10)}{P(6) + P(7) + P(8) + P(9) + P(10)}\)
\(= \frac{0.176 + 0.072 + 0.014}{0.238 + 0.252 + 0.176 + 0.072 + 0.014} = \frac{0.262}{0.752} = 0.348\)
(c) Answers vary. Possibilities include the stock market, getting raises at work, or passing exams.
(d) Let event A = 6 ≤ r and event B = 4 ≤ r in the formula.
24. (a) n = 10; p = 0.70
\(P(8 \le r \text{, given } 6 \le r) = \frac{P(8 \le r \text{ and } 6 \le r)}{P(6 \le r)} = \frac{P(8 \le r)}{P(6 \le r)}\)
\(= \frac{P(8) + P(9) + P(10)}{P(6) + P(7) + P(8) + P(9) + P(10)}\)
\(= \frac{0.233 + 0.121 + 0.028}{0.200 + 0.267 + 0.233 + 0.121 + 0.028} = \frac{0.382}{0.849} = 0.450\)
(b) n = 10; p = 0.70
\(P(r = 10 \text{, given } 6 \le r) = \frac{P(r = 10 \text{ and } 6 \le r)}{P(6 \le r)} = \frac{P(r = 10)}{P(6 \le r)} = \frac{0.028}{0.849} = 0.033\)
Section 5.3
1. The average number of successes in n trials.
2. The expected value of the first distribution will be higher.
3. (a) Yes, 120 is more than 2.5 standard deviations above the mean.
(b) Yes, 40 is more than 2.5 standard deviations below the mean.
(c) No, the entire interval is within 2.5 standard deviations above and below the mean.
4. (a) At p = 0.50, the distribution is symmetric. The expected value is r = 5. Yes, the distribution is centered at r = 5.
(b) The distribution is skewed right.
(c) The distribution is skewed left.
5. (a) The distribution is symmetric.
[Figure: histogram of the binomial distribution, n = 5, p = 0.50. Horizontal axis: r = 0 to 5; vertical axis: probability. Bar heights for r = 0, 1, ..., 5: 0.03125, 0.15625, 0.31250, 0.31250, 0.15625, 0.03125.]
(b) The distribution is skewed right.
[Figure: histogram of the binomial distribution, n = 5, p = 0.25. Horizontal axis: r = 0 to 5; vertical axis: probability.]
(c) The distribution is skewed left.
[Figure: histogram of the binomial distribution, n = 5, p = 0.75. Horizontal axis: r = 0 to 5; vertical axis: probability.]
(d) The distributions are mirror images of one another.
(e) The distribution would be skewed left for p = 0.73 because p > 0.50.
6. (a) p = 0.30 goes with graph II because it is slightly skewed right.
(b) p = 0.50 goes with graph I because it is symmetric.
(c) p = 0.65 goes with graph III because it is slightly skewed left.
(d) p = 0.90 goes with graph IV because it is skewed left.
(e) The graph is approximately symmetric when p is close to 0.5. The graph is skewed left when p is close to 1 and skewed right when p is close to 0.
7. Minitab was used to generate the distribution.
(a) \(n = 10,\ p = 0.80\)
[Figure: histogram of the binomial distribution, n = 10, p = 0.80. Horizontal axis: r = 0 to 10; vertical axis: probability.]
\(\mu = np = 10(0.8) = 8\)
\(\sigma = \sqrt{npq} = \sqrt{10(0.8)(0.2)} \approx 1.26\)
(b) \(n = 10,\ p = 0.5\)
[Figure: histogram of the binomial distribution, n = 10, p = 0.50. Horizontal axis: r = 0 to 10; vertical axis: probability.]
\(\mu = np = 10(0.5) = 5\)
\(\sigma = \sqrt{npq} = \sqrt{10(0.5)(0.5)} \approx 1.58\)
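The shortcut formulas \(\mu = np\) and \(\sigma = \sqrt{npq}\) used in this section can be wrapped in a small helper. This is a minimal sketch; the function name `binomial_mean_sd` is an illustrative assumption.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial distribution with n trials."""
    q = 1 - p
    return n * p, math.sqrt(n * p * q)

for n, p in [(10, 0.80), (10, 0.50)]:
    mu, sigma = binomial_mean_sd(n, p)
    print(n, p, round(mu, 2), round(sigma, 2))
```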
(c) Yes; since the graph in part (a) is skewed left, it supports the claim that more households that have children under 2 years of age buy film than households that have no children under 21 years of age.
8. (a) \(n = 8,\ p = 0.01\)
[Figure: histogram of the binomial distribution, n = 8, p = 0.01. Horizontal axis: r = 0 to 8; vertical axis: probability.]
(b) \(\mu = np = 8(0.01) = 0.08\)
The expected number of defective syringes the inspector will find is 0.08.
(c) The batch will be accepted if fewer than two defectives are found.
\(P(r < 2) = P(0) + P(1) = 0.923 + 0.075 = 0.998\)
(d) \(\sigma = \sqrt{npq} = \sqrt{8(0.01)(0.99)} \approx 0.281\)
9. (a) \(n = 6,\ p = 0.70\)
[Figure: histogram of the binomial distribution, n = 6, p = 0.70. Horizontal axis: r = 0 to 6; vertical axis: probability.]
(b) \(\mu = np = 6(0.70) = 4.2\)
\(\sigma = \sqrt{npq} = \sqrt{6(0.70)(0.30)} \approx 1.122\)
We expect 4.2 friends to be found.
(c) Find n such that \(P(r \ge 2) = 0.97\).
Try n = 5.
\(P(r \ge 2) = P(2) + P(3) + P(4) + P(5) = 0.132 + 0.309 + 0.360 + 0.168 = 0.969 \approx 0.97\)
You would have to submit five names to be 97% sure that at least two addresses will be found. If you solve this problem as
\(P(r \ge 2) = 1 - P(r < 2) = 1 - [P(r = 0) + P(r = 1)] = 1 - 0.002 - 0.028 = 0.97\)
the answers differ owing to rounding error in the table.
10. (a) \(n = 5,\ p = 0.85\)
[Figure: histogram of the binomial distribution, n = 5, p = 0.85. Horizontal axis: r = 0 to 5; vertical axis: probability.]
(b) \(\mu = np = 5(0.85) = 4.25\)
\(\sigma = \sqrt{npq} = \sqrt{5(0.85)(0.15)} \approx 0.798\)
For samples of size 5, the expected number of claims made by people under 25 years of age is about 4.
11. (a) \(n = 7,\ p = 0.20\)
[Figure: histogram of the binomial distribution, n = 7, p = 0.20. Horizontal axis: r = 0 to 7; vertical axis: probability.]
(b) \(\mu = np = 7(0.20) = 1.4\)
\(\sigma = \sqrt{npq} = \sqrt{7(0.20)(0.80)} \approx 1.058\)
We expect 1.4 people to be illiterate.
(c) Let success = literate and p = 0.80. Find n such that \(P(r \ge 7) = 0.98\).
Try n = 12.
\(P(r \ge 7) = P(7) + P(8) + P(9) + P(10) + P(11) + P(12)\)
\(= 0.053 + 0.133 + 0.236 + 0.283 + 0.206 + 0.069 = 0.98\)
You would need to interview 12 people to be 98% sure that at least 7 of these people are not illiterate.
12. (a) \(n = 12,\ p = 0.35\)
[Figure: histogram of the binomial distribution, n = 12, p = 0.35. Horizontal axis: r = 0 to 12; vertical axis: probability.]
(b) \(\mu = np = 12(0.35) = 4.2\)
The expected number of vehicles out of 12 that will tailgate is 4.2.
(c) \(\sigma = \sqrt{npq} = \sqrt{12(0.35)(0.65)} \approx 1.65\)
13. (a) \(n = 8,\ p = 0.25\)
[Figure: histogram of the binomial distribution, n = 8, p = 0.25. Horizontal axis: r = 0 to 8; vertical axis: probability.]
(b) \(\mu = np = 8(0.25) = 2\)
\(\sigma = \sqrt{npq} = \sqrt{8(0.25)(0.75)} \approx 1.225\)
We expect two people to believe that the product is actually improved.
(c) Find n such that \(P(r \ge 1) = 0.99\).
Try n = 16.
\(P(r \ge 1) = 1 - P(0) = 1 - 0.01 = 0.99\)
Sixteen people are needed in the marketing study to be 99% sure that at least one person believes the product to be improved.
14. p = 0.10
Find n such that \(P(r \ge 1) = 0.90\). From a calculator or a computer, we determine that n = 22 gives P(r ≥ 1) = 0.9015.
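The "try n" searches in the quota problems of this section can be automated by increasing n until the required probability is reached. The following is a minimal sketch that evaluates the binomial formula exactly rather than reading three-digit table values; the function names `at_least` and `smallest_n` and the `max_n` cap are illustrative assumptions. Because tables are rounded, a table-based answer can occasionally differ from the exact search when the probability sits almost exactly on the target.

```python
from math import comb

def at_least(k, n, p):
    """P(r >= k) for a binomial distribution with n trials and success probability p."""
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(k, n + 1))

def smallest_n(k, p, target, max_n=1000):
    """Smallest n with P(r >= k) >= target, or None if no n up to max_n works."""
    for n in range(k, max_n + 1):
        if at_least(k, n, p) >= target:
            return n
    return None

n = smallest_n(1, 0.10, 0.90)      # problem 14
print(n, round(at_least(1, n, 0.10), 4))
```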
15. (a) Since success = not a repeat offender, p = 0.75.
r      0       1       2       3       4
P(r)   0.004   0.047   0.211   0.422   0.316
(b) [Figure: histogram of the binomial distribution, n = 4, p = 0.75. Horizontal axis: r = 0 to 4; vertical axis: probability.]
(c) \(\mu = np = 4(0.75) = 3\)
We expect three parolees to not repeat offend.
\(\sigma = \sqrt{npq} = \sqrt{4(0.75)(0.25)} \approx 0.866\)
(d) Find n such that \(P(r \ge 3) = 0.98\).
Try n = 7.
\(P(r \ge 3) = P(3) + P(4) + P(5) + P(6) + P(7) = 0.058 + 0.173 + 0.311 + 0.311 + 0.133 = 0.986\)
This is slightly higher than needed, but n = 6 yields P(r ≥ 3) = 0.963. Alice should have a group of seven to be about 98% sure that three or more will not become repeat offenders.
16. (a) p = 0.65
Find n such that \(P(r \ge 1) = 0.98\).
Try n = 4.
\(P(r \ge 1) = 1 - P(0) = 1 - 0.015 = 0.985\)
Four stations are required to be 98% certain that an enemy plane flying over will be detected by at least one station.
(b) n = 4, p = 0.65
\(\mu = np = 4(0.65) = 2.6\)
If four stations are in use, we expect 2.6 stations to detect an enemy plane.
17. (a) Let success = available, then p = 0.75, n = 12.
\(P(12) = 0.032\)
(b) Let success = not available, then p = 0.25, n = 12.
\(P(r \ge 6) = P(6) + P(7) + P(8) + P(9) + P(10) + P(11) + P(12)\)
\(= 0.040 + 0.011 + 0.002 + 0.000 + 0.000 + 0.000 + 0.000 = 0.053\)
(c) n = 12, p = 0.75
\(\mu = np = 12(0.75) = 9\)
The expected number of those available to serve on the jury is nine.
\(\sigma = \sqrt{npq} = \sqrt{12(0.75)(0.25)} = 1.5\)
(d) p = 0.75
Find n such that \(P(r \ge 12) = 0.959\).
Try n = 20.
\(P(r \ge 12) = P(12) + P(13) + P(14) + P(15) + P(16) + P(17) + P(18) + P(19) + P(20)\)
\(= 0.061 + 0.112 + 0.169 + 0.202 + 0.190 + 0.134 + 0.067 + 0.021 + 0.003 = 0.959\)
The jury commissioner must contact 20 people to be 95.9% sure of finding at least 12 people who are available to serve.
18. (a) Let success = emergency, then p = 0.15, n = 4.
\(P(4) = 0.001\)
(b) Let success = not emergency, then p = 0.85, n = 4.
\(P(r \ge 3) = P(3) + P(4) = 0.368 + 0.522 = 0.890\)
(c) p = 0.15
Find n such that \(P(r \ge 1) = 0.96\).
Try n = 20.
\(P(r \ge 1) = 1 - P(0) = 1 - 0.039 = 0.961\)
The operators need to answer 20 calls to be 96% (or more) sure that at least one call was in fact an emergency.
19. Let success = case solved, then p = 0.2, n = 6.
(a) \(P(0) = 0.262\)
(b) \(P(r \ge 1) = 1 - P(0) = 1 - 0.262 = 0.738\)
(c) \(\mu = np = 6(0.20) = 1.2\)
The expected number of crimes that will be solved is 1.2.
\(\sigma = \sqrt{npq} = \sqrt{6(0.20)(0.80)} \approx 0.98\)
(d) Find n such that ( )1 0.90.P r ≥ = Try n = 11.
( ) ( )1 1 01 0.0860.914
P r P≥ = −
= −=
[ : For 10, ( 1) 0.893.]Note n P r= ≥ = The police must investigate 11 property crimes before they can be at least 90% sure of solving one or more cases.
20. (a) p = 0.55
Find n such that ( )1 0.99.P r ≥ = Try n = 6.
( ) ( )1 1 01 0.0080.992
P r P≥ = −
= −=
Six alarms should be used to be 99% certain that a burglar trying to enter is detected by at least one alarm.
(b) 9, 0.55n p= =
( )9 0.55 4.95npµ = = = The expected number of alarms that would detect a burglar is about five.
21. (a) Japan: n = 7, p = 0.95.
P(7) = 0.698
United States: n = 7, p = 0.60.
P(7) = 0.028
(b) Japan: n = 7, p = 0.95.
µ = np = 7(0.95) = 6.65
σ = √(npq) = √(7(0.95)(0.05)) ≈ 0.58
United States: n = 7, p = 0.60.
µ = np = 7(0.60) = 4.2
σ = √(npq) = √(7(0.60)(0.40)) ≈ 1.30
The expected number of guilty verdicts in Japan is 6.65, and in the United States it is 4.2.
(c) United States: p = 0.60.
Find n such that P(r ≥ 2) = 0.99. Try n = 8.
P(r ≥ 2) = 1 − [P(0) + P(1)] = 1 − (0.001 + 0.008) = 0.991
Japan: p = 0.95.
Find n such that P(r ≥ 2) = 0.99. Try n = 3.
P(r ≥ 2) = P(2) + P(3) = 0.135 + 0.857 = 0.992
Cover eight trials in the United States and three trials in Japan.
22. n = 6, p = 0.45
(a) P(6) = 0.008
(b) P(0) = 0.028
(c) P(r ≥ 2) = P(2) + P(3) + P(4) + P(5) + P(6)
= 0.278 + 0.303 + 0.186 + 0.061 + 0.008
= 0.836
(d) µ = np = 6(0.45) = 2.7
The expected number is 2.7.
σ = √(npq) = √(6(0.45)(0.55)) ≈ 1.219
(e) Find n such that P(r ≥ 3) = 0.90. Try n = 10.
P(r ≥ 3) = 1 − [P(0) + P(1) + P(2)]
= 1 − (0.003 + 0.021 + 0.076)
= 1 − 0.100
= 0.900
You need to interview 10 professors to be at least 90% sure of filling the quota.
23. (a) p = 0.40
Find n such that P(r ≥ 1) = 0.99. Try n = 9.
P(r ≥ 1) = 1 − P(0) = 1 − 0.010 = 0.990
The owner must answer nine inquiries to be 99% sure of renting at least one room.
(b) n = 25, p = 0.40
µ = np = 25(0.40) = 10
The expected number is 10 room rentals.
24. (a) Out of n trials, there can be 0 through n successes. The sum of the probabilities for all members of the sample space must be 1.
(b) r ≥ 1 consists of all members of the sample space except r = 0.
(c) r ≥ 2 consists of all members of the sample space except r = 0 and r = 1.
(d) r ≥ m consists of all members of the sample space except for r values from 0 through m − 1.
Section 5.4
1. The geometric distribution
2. The mean or expected value, denoted by λ.
3. No, since the approximation requires n ≥ 100.
4. Yes. We have np = 150 × 0.02 = 3 < 10 and n = 150 ≥ 100.
5. (a) Geometric probability distribution, p = 0.77.
P(n) = p(1 − p)^(n − 1)
P(n) = (0.77)(0.23)^(n − 1)
(b) P(1) = (0.77)(0.23)^(1 − 1) = (0.77)(0.23)^0 = 0.77
(c) P(2) = (0.77)(0.23)^(2 − 1) = (0.77)(0.23)^1 = 0.1771
(d) P(3 or more tries) = 1 − P(1) − P(2) = 1 − 0.77 − 0.1771 = 0.0529
(e) µ = 1/p = 1/0.77 ≈ 1.30
The expected number is 1.30, or 1, attempt to pass.
6. (a) Geometric probability distribution, p = 0.57.
P(n) = p(1 − p)^(n − 1)
P(n) = (0.57)(0.43)^(n − 1)
(b) P(2) = (0.57)(0.43)^(2 − 1) = (0.57)(0.43)^1 = 0.2451
(c) P(3) = (0.57)(0.43)^(3 − 1) = (0.57)(0.43)^2 ≈ 0.1054
(d) P(more than 3 attempts) = 1 − P(1) − P(2) − P(3) = 1 − 0.57 − 0.2451 − 0.1054 = 0.0795
(e) µ = 1/p = 1/0.57 ≈ 1.75
The expected number is 1.75, or 2, attempts to pass.
7. (a) Geometric probability distribution, p = 0.80.
P(n) = p(1 − p)^(n − 1)
P(n) = (0.80)(0.20)^(n − 1)
(b) P(1) = (0.80)(0.20)^(1 − 1) = 0.80
P(2) = (0.80)(0.20)^(2 − 1) = 0.16
P(3) = (0.80)(0.20)^(3 − 1) = 0.032
(c) P(n ≥ 4) = 1 − P(1) − P(2) − P(3) = 1 − 0.80 − 0.16 − 0.032 = 0.008
(d) P(n) = (0.04)(0.96)^(n − 1)
P(1) = (0.04)(0.96)^(1 − 1) = 0.04
P(2) = (0.04)(0.96)^(2 − 1) = 0.0384
P(3) = (0.04)(0.96)^(3 − 1) ≈ 0.0369
P(n ≥ 4) = 1 − P(1) − P(2) − P(3) = 1 − 0.04 − 0.0384 − 0.0369 = 0.8847
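The geometric calculations in problems 5 through 7 all follow the same pattern, so they can be checked with two small helper functions. This is a sketch with illustrative names, not part of the original solutions. It uses the shortcut P(n ≥ m) = (1 − p)^(m − 1), which is equivalent to subtracting P(1) through P(m − 1) from 1.

```python
def geom_pmf(n, p):
    """P(first success occurs on trial n), n = 1, 2, 3, ..."""
    return p * (1 - p) ** (n - 1)

def geom_tail(n_min, p):
    """P(first success occurs on trial n_min or later).

    The first n_min - 1 trials must all be failures."""
    return (1 - p) ** (n_min - 1)

def geom_mean(p):
    """Expected number of trials up to and including the first success."""
    return 1 / p

# Problem 7: p = 0.80, then p = 0.04 in part (d).
print(round(geom_pmf(2, 0.80), 4))
print(round(geom_tail(4, 0.80), 4))
print(round(geom_tail(4, 0.04), 4))
```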
8. (a) Geometric probability distribution, p = 0.036.
P(n) = p(1 − p)^(n − 1)
P(n) = (0.036)(0.964)^(n − 1)
(b) P(3) = (0.036)(0.964)^(3 − 1) ≈ 0.03345
P(5) = (0.036)(0.964)^(5 − 1) ≈ 0.0311
P(12) = (0.036)(0.964)^(12 − 1) ≈ 0.0241
(c) P(n ≥ 5) = 1 − P(1) − P(2) − P(3) − P(4)
= 1 − 0.036 − (0.036)(0.964) − (0.036)(0.964)^2 − (0.036)(0.964)^3
= 1 − 0.036 − 0.0347 − 0.03345 − 0.03225
= 0.8636
(d) µ = 1/p = 1/0.036 ≈ 27.8
The expected number is 27.8, or 28, apples.
9. (a) Geometric probability distribution, p = 0.30.
P(n) = p(1 − p)^(n − 1)
P(n) = (0.30)(0.70)^(n − 1)
(b) P(3) = (0.30)(0.70)^(3 − 1) = 0.147
(c) P(n > 3) = 1 − P(1) − P(2) − P(3)
= 1 − 0.30 − (0.30)(0.70) − 0.147
= 1 − 0.30 − 0.21 − 0.147
= 0.343
(d) µ = 1/p = 1/0.30 ≈ 3.33
The expected number is 3.33, or 3, trips.
10. (a) The Poisson distribution would be a good choice because finding prehistoric artifacts is a relatively rare occurrence. It is reasonable to assume that the events are independent and that the variable is the number of artifacts found in a fixed amount of sediment.
λ = (1.5/10 L) · (5/5) = 7.5/50 L; λ = 7.5 per 50 liters
P(r) = e^(−λ) λ^r / r!
P(r) = e^(−7.5) (7.5)^r / r!
(b) P(2) = e^(−7.5) (7.5)^2 / 2! ≈ 0.0156
P(3) = e^(−7.5) (7.5)^3 / 3! ≈ 0.0389
P(4) = e^(−7.5) (7.5)^4 / 4! ≈ 0.0729
(c) P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0006 − 0.0041 − 0.0156 = 0.9797
(d) P(r < 3) = P(0) + P(1) + P(2) = 0.0006 + 0.0041 + 0.0156 = 0.0203
or
P(r < 3) = 1 − P(r ≥ 3) = 1 − 0.9797 = 0.0203
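The Poisson probabilities in problem 10 can be reproduced directly from the formula rather than read from a table. The sketch below is illustrative only (the function names are not from the text); it rescales the rate from 1.5 per 10 liters to the 50-liter interval and then evaluates the same quantities as parts (b) and (c).

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(exactly r occurrences) when the mean number of occurrences is lam."""
    return exp(-lam) * lam**r / factorial(r)

def poisson_cdf(r_max, lam):
    """P(r <= r_max)."""
    return sum(poisson_pmf(r, lam) for r in range(r_max + 1))

# Rescale the rate: 1.5 artifacts per 10 L becomes 7.5 per 50 L.
lam = 1.5 * (50 / 10)

print(round(poisson_pmf(2, lam), 4))       # part (b), P(2)
print(round(1 - poisson_cdf(2, lam), 4))   # part (c), P(r >= 3)
print(round(poisson_cdf(2, lam), 4))       # part (d), P(r < 3)
```

The same two functions apply to the later Poisson problems once λ has been rescaled to the interval of interest.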
11. (a) The Poisson distribution would be a good choice because frequency of initiating social grooming is a relatively rare occurrence. It is reasonable to assume that the events are independent and that the variable is the number of times that one otter initiates social grooming in a fixed time interval.
λ = (1.7/10 min) · (3/3) = 5.1/30 min; λ = 5.1 per 30-minute interval
P(r) = e^(−λ) λ^r / r!
P(r) = e^(−5.1) (5.1)^r / r!
(b) P(4) = e^(−5.1) (5.1)^4 / 4! ≈ 0.1719
P(5) = e^(−5.1) (5.1)^5 / 5! ≈ 0.1753
P(6) = e^(−5.1) (5.1)^6 / 6! ≈ 0.1490
(c) P(r ≥ 4) = 1 − P(0) − P(1) − P(2) − P(3)
= 1 − 0.0061 − 0.0311 − 0.0793 − 0.1348
= 0.7487
(d) P(r < 4) = P(0) + P(1) + P(2) + P(3)
= 0.0061 + 0.0311 + 0.0793 + 0.1348
= 0.2513
or
P(r < 4) = 1 − P(r ≥ 4) = 1 − 0.7487 = 0.2513
12. (a) The Poisson distribution would be a good choice because frequency of shoplifting is a relatively rare
occurrence. It is reasonable to assume that the events are independent and that the variable is the number of incidents in a fixed time interval.
λ = (1/3 hours) · ((11/3)/(11/3)) = (11/3)/11 hours; λ = 11/3 ≈ 3.7 per 11 hours (rounded to nearest tenth)
(b) P(r ≥ 1) = 1 − P(0) = 1 − 0.0247 = 0.9753
(c) P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0247 − 0.0915 − 0.1692 = 0.7146
(d) P(0) = 0.0247
13. (a) The Poisson distribution would be a good choice because frequency of births is a relatively rare occurrence. It is reasonable to assume that the events are independent and that the variable is the number of births (or deaths) for a community of a given population size.
(b) For 1,000 people, λ = 16 births; λ = 8 deaths.
By Table 4 in Appendix II:
P(10 births) = 0.0341
P(10 deaths) = 0.0993
P(16 births) = 0.0992
P(16 deaths) = 0.0045
(c) For 1,500 people,
λ = (16/1,000) · (1.5/1.5) = 24/1,500; λ = 24 births per 1,500 people
λ = (8/1,000) · (1.5/1.5) = 12/1,500; λ = 12 deaths per 1,500 people
By Table 4 in Appendix II or a calculator:
P(10 births) = 0.00066
P(10 deaths) = 0.1048
P(16 births) = 0.02186
P(16 deaths) = 0.0543
(d) For 750 people,
λ = (16/1,000) · (0.75/0.75) = 12/750; λ = 12 births per 750 people
λ = (8/1,000) · (0.75/0.75) = 6/750; λ = 6 deaths per 750 people
P(10 births) = 0.1048
P(10 deaths) = 0.0413
P(16 births) = 0.0543
P(16 deaths) = 0.0003
14. (a) The Poisson distribution would be a good choice because frequency of hairline cracks is a relatively
rare occurrence. It is reasonable to assume that the events are independent and that the variable is the number of hairline cracks for a given length of retaining wall.
(b) λ = (4.2/30 ft) · ((5/3)/(5/3)) = 7/50 ft; λ = 7 per 50 ft
From Table 4 in Appendix II:
P(3) = 0.0521
P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0009 − 0.0064 − 0.0223 = 0.9704
(c) λ = (4.2/30 ft) · ((2/3)/(2/3)) = 2.8/20 ft; λ = 2.8 per 20 ft
P(3) = 0.2225
P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0608 − 0.1703 − 0.2384 = 0.5305
(d) λ = (4.2/30 ft) · ((1/15)/(1/15)) = 0.28/2 ft; λ ≈ 0.3 per 2 ft
P(3) = 0.0033
P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.7408 − 0.2222 − 0.0333 = 0.0037
(e) Three hairline cracks spread out evenly over a 50-foot section is no cause for concern. For part (c), we
expect 2.8 cracks per 20 feet, so actually having 3 cracks is nothing unusual. For part (d), actually having 3 cracks is unusual and a cause for concern.
15. (a) The Poisson distribution would be a good choice because frequency of gale-force winds is a relatively
rare occurrence. It is reasonable to assume that the events are independent and that the variable is the number of gale-force winds in a given time interval.
(b) λ = (1/60 hours) · (1.8/1.8) = 1.8/108 hours; λ = 1.8 per 108 hours
From Table 4 in Appendix II:
P(2) = 0.2678
P(3) = 0.1607
P(4) = 0.0723
P(r < 2) = P(0) + P(1) = 0.1653 + 0.2975 = 0.4628
(c) λ = (1/60 hours) · (3/3) = 3/180 hours; λ = 3 per 180 hours
P(3) = 0.2240
P(4) = 0.1680
P(5) = 0.1008
P(r < 3) = P(0) + P(1) + P(2) = 0.0498 + 0.1494 + 0.2240 = 0.4232
16. (a) The Poisson distribution would be a good choice because frequency of earthquakes is a relatively rare occurrence. It is reasonable to assume that the events are independent and that the variable is the number of earthquakes in a given time interval.
(b) λ = 1.00 per 22 years
P(r ≥ 1) = 0.6321
(c) λ = 1.00 per 22 years
P(0) = 0.3679
(d) λ = (1/22 years) · ((25/11)/(25/11)) ≈ 2.27/50 years; λ = 2.27 per 50 years
P(r ≥ 1) = 0.8967
(e) λ = 2.27 per 50 years
P(0) = 0.1033
17. (a) The Poisson distribution would be a good choice because frequency of commercial building sales is a
relatively rare occurrence. It is reasonable to assume that the events are independent and the variable is the number of buildings sold in a given time interval.
(b) λ = (8/275 days) · ((12/55)/(12/55)) = (96/55)/60 days; λ = 96/55 ≈ 1.7 per 60 days
From Table 4 in Appendix II:
P(0) = 0.1827
P(1) = 0.3106
P(r ≥ 2) = 1 − P(0) − P(1) = 1 − 0.1827 − 0.3106 = 0.5067
(c) λ = (8/275 days) · ((18/55)/(18/55)) ≈ 2.6/90 days; λ ≈ 2.6 per 90 days
P(0) = 0.0743
P(2) = 0.2510
P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0743 − 0.1931 − 0.2510 = 0.4816
18. (a) The problem satisfies the conditions for a binomial experiment with
n large, n = 316, and p small, p = 661/100,000 = 0.00661.
np = 316(0.00661) ≈ 2.1 < 10.
The Poisson distribution would be a good approximation to the binomial.
n = 316, p = 0.00661, λ = np ≈ 2.1
(b) From Table 4 in Appendix II, P(0) = 0.1225.
(c) P(r ≤ 1) = P(0) + P(1) = 0.1225 + 0.2572 = 0.3797
(d) P(r ≥ 2) = 1 − P(r ≤ 1) = 1 − 0.3797 = 0.6203
19. (a) The problem satisfies the conditions for a binomial experiment with
n large, n = 1000, and p small, p = 1/569 ≈ 0.0018.
np ≈ 1000(0.0018) = 1.8 < 10.
The Poisson distribution would be a good approximation to the binomial. λ = np ≈ 1.8
(b) From Table 4 in Appendix II, P(0) = 0.1653.
(c) P(r > 1) = 1 − P(0) − P(1) = 1 − 0.1653 − 0.2975 = 0.5372
(d) P(r > 2) = P(r > 1) − P(2) = 0.5372 − 0.2678 = 0.2694
(e) P(r > 3) = P(r > 2) − P(3) = 0.2694 − 0.1607 = 0.1087
20. (a) The Poisson distribution would be a good choice because frequency of lost bags is a relatively rare
occurrence. It is reasonable to assume that the events are independent and the variable is the number of bags lost per 1,000 passengers.
λ = 6.02 or 6.0 per 1,000 passengers
(b) From Table 4 in Appendix II,
P(0) = 0.0025
P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0025 − 0.0149 − 0.0446 = 0.9380
P(r ≥ 6) = P(r ≥ 3) − P(3) − P(4) − P(5) = 0.9380 − 0.0892 − 0.1339 − 0.1606 = 0.5543
(c) λ = 13.0 per 1,000 passengers
P(0) = 0.000 (to 3 digits)
P(r ≥ 6) = 1 − P(r ≤ 5) = 1 − 0.0107 = 0.9893
P(r ≥ 12) = 1 − P(r ≤ 11) = 1 − 0.3532 = 0.6468
21. (a) The problem satisfies the conditions for a binomial experiment with n large, n = 175, and p small, p = 0.005. np = (175)(0.005) = 0.875 < 10. The Poisson distribution would be a good approximation to the binomial. n = 175, p = 0.005, λ = np = 0.9.
(b) From Table 4 in Appendix II, P(0) = 0.4066.
(c) P(r ≥ 1) = 1 − P(0) = 1 − 0.4066 = 0.5934
(d) P(r ≥ 2) = P(r ≥ 1) − P(1) = 0.5934 − 0.3659 = 0.2275
22. (a) The problem satisfies the conditions for a binomial experiment with n large, n = 137, and p small,
p = 0.02. np = (137)(0.02) = 2.74 < 10. The Poisson distribution would be a good approximation to the binomial. n = 137, p = 0.02, λ = np = 2.74 ≈ 2.7.
(b) From Table 4 in Appendix II, P(0) = 0.0672.
(c) P(r ≥ 2) = 1 − P(0) − P(1) = 1 − 0.0672 − 0.1815 = 0.7513
(d) P(r ≥ 4) = P(r ≥ 2) − P(2) − P(3) = 0.7513 − 0.2450 − 0.2205 = 0.2858
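Problems 18 through 22 replace a binomial distribution with a Poisson distribution having λ = np. How close the two are can be seen by computing both side by side. The sketch below uses the n and p of problem 22; the function names are illustrative, and the exact binomial values it prints are not taken from the text.

```python
from math import comb, exp, factorial

def binom_pmf(r, n, p):
    """Exact binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    """Poisson probability of r occurrences with mean lam."""
    return exp(-lam) * lam**r / factorial(r)

n, p = 137, 0.02      # problem 22
lam = n * p           # 2.74, which the table lookup rounds to 2.7

print("r  binomial  Poisson")
for r in range(5):
    print(r, round(binom_pmf(r, n, p), 4), round(poisson_pmf(r, lam), 4))
```

Part of the small difference between these Poisson values and the table values in the solution comes from rounding λ to 2.7 before using the table.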
23. (a) n = 100, p = 0.02, r = 2
P(r) = C(n, r) p^r (1 − p)^(n − r)
P(2) = C(100, 2)(0.02)^2(0.98)^(100 − 2)
= 4950(0.0004)(0.1381)
= 0.2734
(b) λ = np = 100(0.02) = 2
From Table 4 in Appendix II, P(2) = 0.2707.
(c) Yes, the approximation is correct to two decimal places.
(d) n = 100; p = 0.02; r = 3
By the formula for the binomial distribution,
P(3) = C(100, 3)(0.02)^3(0.98)^(100 − 3)
= 161,700(0.000008)(0.1409)
= 0.1823
By the Poisson approximation, λ = 2, P(3) = 0.1804. This is correct to two decimal places.
24. (a) The Poisson distribution would be a good choice because frequency of fish caught is a relatively
occurrence. It is reasonable to assume that the events are independent and that the variable is the number of fish caught in the 8-hour period.
λ = (0.667/1) · (8/8) ≈ 5.3/8; λ ≈ 5.3 fish caught per 8 hours
(b) P(r ≥ 7, given r ≥ 3) = P(r ≥ 7 and r ≥ 3) / P(r ≥ 3)
= P(r ≥ 7) / P(r ≥ 3)
= P(r ≥ 7) / [1 − P(r ≤ 2)]
= (0.1163 + 0.0771 + 0.0454 + 0.0241 + 0.0116 + 0.0051 + 0.0021 + 0.0008 + 0.0003 + 0.0001) / [1 − (0.0050 + 0.0265 + 0.0701)]
= 0.2829/0.8984
≈ 0.3149
(c) P(r < 9, given r ≥ 4) = P(r < 9 and r ≥ 4) / P(r ≥ 4)
= [P(r = 4) + P(r = 5) + P(r = 6) + P(r = 7) + P(r = 8)] / [1 − P(r ≤ 3)]
= (0.1641 + 0.1740 + 0.1537 + 0.1163 + 0.0771) / [1 − (0.0050 + 0.0265 + 0.0701 + 0.1239)]
= 0.6852/0.7745
≈ 0.8847
(d) Possibilities include the fields of agriculture, the military, and business.
25. (a) The Poisson distribution would be a good choice because hail storms in western Kansas are relatively
rare occurrences. It is reasonable to assume that the events are independent and that the variable is the number of hailstorms in a fixed-square-mile area.
λ = (2.1/5) · ((8/5)/(8/5)) = 2.1(8/5)/8 ≈ 3.4/8
λ = 3.4 storms per 8 square miles
(b) P(r ≥ 4, given r ≥ 2) = P(r ≥ 4 and r ≥ 2) / P(r ≥ 2)
= P(r ≥ 4) / P(r ≥ 2)
= [1 − P(r ≤ 3)] / [1 − P(r ≤ 1)]
= [1 − (0.0334 + 0.1135 + 0.1929 + 0.2186)] / [1 − (0.0334 + 0.1135)]
= 0.4416/0.8531
= 0.5176
(c) P(r < 6, given r ≥ 3) = P(r < 6 and r ≥ 3) / P(r ≥ 3)
= [P(r = 3) + P(r = 4) + P(r = 5)] / [1 − P(r ≤ 2)]
= (0.2186 + 0.1858 + 0.1264) / [1 − (0.0334 + 0.1135 + 0.1929)]
= 0.5308/0.6602
= 0.8040
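The conditional probabilities in problems 24 and 25 both reduce to a ratio of Poisson sums, P(A and B)/P(B). A minimal sketch of that reduction follows, using the λ = 3.4 of problem 25; the function names are illustrative and do not come from the text.

```python
from math import exp, factorial

def poisson_cdf(r_max, lam):
    """P(r <= r_max) for a Poisson distribution with mean lam."""
    return sum(exp(-lam) * lam**r / factorial(r) for r in range(r_max + 1))

def cond_at_least(a, b, lam):
    """P(r >= a, given r >= b) for a >= b: the event (r >= a and r >= b) is r >= a."""
    return (1 - poisson_cdf(a - 1, lam)) / (1 - poisson_cdf(b - 1, lam))

def cond_less_than(a, b, lam):
    """P(r < a, given r >= b) for a > b: the event (r < a and r >= b) is b <= r <= a - 1."""
    return (poisson_cdf(a - 1, lam) - poisson_cdf(b - 1, lam)) / (1 - poisson_cdf(b - 1, lam))

lam = 3.4  # storms per 8 square miles, problem 25
print(round(cond_at_least(4, 2, lam), 4))    # part (b)
print(round(cond_less_than(6, 3, lam), 4))   # part (c)
```

Results may differ from the table-based answers in the last decimal place because the table entries are rounded before they are combined.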
26. (a) P(n) = C(n − 1, k − 1) p^k q^(n − k)
P(n) = C(n − 1, 3)(0.65^4)(0.35^(n − 4))
(b) P(4) = C(3, 3)(0.65^4)(0.35^0) ≈ 0.1785
P(5) = C(4, 3)(0.65^4)(0.35^1) ≈ 0.2499
P(6) = C(5, 3)(0.65^4)(0.35^2) ≈ 0.2187
P(7) = C(6, 3)(0.65^4)(0.35^3) ≈ 0.1531
(c) P(4 ≤ n ≤ 7) = P(4) + P(5) + P(6) + P(7)
= 0.1785 + 0.2499 + 0.2187 + 0.1531
= 0.8002
(d) P(n ≥ 8) = 1 − P(n ≤ 7)
= 1 − P(4 ≤ n ≤ 7)
= 1 − 0.8002
= 0.1998
(e) µ = k/p = 4/0.65 ≈ 6.15
σ = √(kq)/p = √(4(0.35))/0.65 ≈ 1.82
The expected year in which the fourth successful crop occurs is 6.15, with a standard deviation of 1.82.
27. (a) We have binomial trials for which the probability of success is p = 0.80 and failure is q = 0.20; k = 12
is a fixed whole number ≥ 1; n is a random variable representing the number of contacts needed to get the twelfth sale.
P(n) = C(n − 1, k − 1) p^k q^(n − k)
P(n) = C(n − 1, 11)(0.80^12)(0.20^(n − 12))
(b) P(12) = C(11, 11)(0.80^12)(0.20^0) ≈ 0.0687
P(13) = C(12, 11)(0.80^12)(0.20^1) ≈ 0.1649
P(14) = C(13, 11)(0.80^12)(0.20^2) ≈ 0.2144
(c) P(12 ≤ n ≤ 14) = P(12) + P(13) + P(14)
= 0.0687 + 0.1649 + 0.2144
= 0.4480
(d) P(n > 14) = 1 − P(n ≤ 14)
= 1 − P(12 ≤ n ≤ 14)
= 1 − 0.4480
= 0.5520
(e) µ = k/p = 12/0.80 = 15
σ = √(kq)/p = √(12(0.20))/0.80 ≈ 1.94
The expected contact in which the twelfth sale will occur is the fifteenth contact, with a standard deviation of 1.94.
28. (a) We have binomial trials for which the probability of success is p = 0.41 and failure is q = 0.59; k = 3 is
a fixed whole number ≥ 1; and n is a random variable representing the number of donors needed to provide 3 pints of type A blood.
P(n) = C(n − 1, k − 1) p^k q^(n − k)
P(n) = C(n − 1, 2)(0.41^3)(0.59^(n − 3))
(b) P(3) = C(2, 2)(0.41^3)(0.59^0) ≈ 0.0689
P(4) = C(3, 2)(0.41^3)(0.59^1) ≈ 0.1220
P(5) = C(4, 2)(0.41^3)(0.59^2) ≈ 0.1439
P(6) = C(5, 2)(0.41^3)(0.59^3) ≈ 0.1415
(c) P(3 ≤ n ≤ 6) = P(3) + P(4) + P(5) + P(6)
= 0.0689 + 0.1220 + 0.1439 + 0.1415
= 0.4763
(d) P(n > 6) = 1 − P(n ≤ 6)
= 1 − P(3 ≤ n ≤ 6)
= 1 − 0.4763
= 0.5237
(e) µ = k/p = 3/0.41 ≈ 7.32; σ = √(kq)/p = √(3(0.59))/0.41 ≈ 3.24
The expected number of donors in which the third pint of blood type A is acquired is 7.32, with a standard deviation of 3.24.
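Problems 26 through 28 use the same negative binomial formula with different k and p, so one helper covers all three. This is a sketch with illustrative names, shown here with the values of problem 28 (k = 3, p = 0.41).

```python
from math import comb, sqrt

def negbin_pmf(n, k, p):
    """P(the k-th success occurs on trial n), for n >= k."""
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

def negbin_mean_sd(k, p):
    """Mean k/p and standard deviation sqrt(k*q)/p of the trial number n."""
    q = 1 - p
    return k / p, sqrt(k * q) / p

k, p = 3, 0.41   # problem 28: third pint of type A blood
probs = {n: negbin_pmf(n, k, p) for n in range(3, 7)}
for n, pr in probs.items():
    print(n, round(pr, 4))
print(round(sum(probs.values()), 4))              # P(3 <= n <= 6)
print([round(v, 2) for v in negbin_mean_sd(k, p)])
```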
29. (a) This is binomial with n − 1 trials and probability of success p. Thus we use the binomial probability distribution:
P(A) = C(n − 1, k − 1) p^(k − 1) q^((n − 1) − (k − 1))
(b) P(B) = P(success on one trial, namely, the nth trial) = p (by definition).
(c) By the definition of independent trials, P(A and B) = P(A) × P(B).
(d) P(A and B) = C(n − 1, k − 1) p^(k − 1) q^((n − 1) − (k − 1)) × p
= C(n − 1, k − 1) p^k q^(n − k)
(e) The results are the same.
Chapter 5 Review
1. A description of all the values of a random variable x together with the associated probability of each value of x. The probabilities sum to 1, and each probability lies between 0 and 1 inclusive.
2. The criteria: a fixed number of trials that are repeated under identical conditions; the trials are independent and have exactly two possible outcomes; and the probability of success on each trial is constant. The random variable counts the number of successes r in n trials.
3. (a) Yes, we expect np = 10 × 0.2 = 2 successes. The standard deviation is σ = 1.26, so the boundary is 2 + (2.5)(1.26) = 5.15. Thus six successes is unusual.
(b) No, P(x > 5) = 0.0064 (using a TI-83 calculator).
4. As the number of trials n increases, µ = np will increase, and σ = √(npq) also will increase.
5. (a) µ = ΣxP(x)
= 18.5(0.127) + 30.5(0.371) + 42.5(0.285) + 54.5(0.215) + 66.5(0.002)
= 37.628
≈ 37.63
The expected lease term is about 38 months.
σ = √(Σ(x − µ)²P(x))
= √((−19.13)²(0.127) + (−7.13)²(0.371) + (4.87)²(0.285) + (16.87)²(0.215) + (28.87)²(0.002))
≈ √134.95
≈ 11.6 (using µ = 37.63 in the calculations)
(b) [Figure: Histogram of Length of Lease. Horizontal axis: lease length in months (class midpoints 18.5, 30.5, 42.5, 54.5, 66.5); vertical axis: percent (0 to 40).]
6. (a)
Number Killed by Wolves    Relative Frequency    P(x)
112                        112/296               0.378
53                         53/296                0.179
73                         73/296                0.247
56                         56/296                0.189
2                          2/296                 0.007
(b) µ = ΣxP(x)
= 0.5(0.378) + 3(0.179) + 8(0.247) + 13(0.189) + 18(0.007)
≈ 5.28 years
σ = √(Σ(x − µ)²P(x))
= √((−4.78)²(0.378) + (−2.28)²(0.179) + (2.72)²(0.247) + (7.72)²(0.189) + (12.72)²(0.007))
= √23.8
≈ 4.88 years
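The mean and standard deviation of a discrete probability distribution, as computed by hand in review problems 5 and 6, follow directly from µ = ΣxP(x) and σ = √(Σ(x − µ)²P(x)). A short sketch with an illustrative function name, using the x values and probabilities of problem 6(b):

```python
from math import sqrt

def mean_sd(values, probs):
    """Mean and standard deviation of a discrete probability distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, sqrt(variance)

# Problem 6(b): x values and P(x) as used in the solution above.
x = [0.5, 3, 8, 13, 18]
p = [0.378, 0.179, 0.247, 0.189, 0.007]

mu, sigma = mean_sd(x, p)
print(round(mu, 2), round(sigma, 2))
```

The hand calculation rounds µ before squaring the deviations, so its result can differ slightly from the unrounded computation.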
7. This is a binomial experiment with 10 trials. A trial consists of a claim.
Success = submitted by a male under 25 years of age.
Failure = not submitted by a male under 25 years of age.
(a) The probabilities can be taken directly from Table 3 in Appendix II: n = 10, p = 0.55.
[Figure: Histogram of Claims (Males under 25). Horizontal axis: r = 0 to 10; vertical axis: probability (0.00 to 0.25).]
(b) P(x ≥ 6) = P(6) + P(7) + P(8) + P(9) + P(10) = 0.504
(c) µ = np = 10(0.55) = 5.5
The expected number of claims made by males under age 25 is 5.5.
σ = √(npq) = √(10(0.55)(0.45)) ≈ 1.57
8. (a) n = 20, p = 0.05
P(r ≤ 2) = P(0) + P(1) + P(2) = 0.358 + 0.377 + 0.189 = 0.924
(b) n = 20, p = 0.15
Probability accepted:
P(r ≤ 2) = P(0) + P(1) + P(2) = 0.039 + 0.137 + 0.229 = 0.405
Probability not accepted:
1 − 0.405 = 0.595
9. n = 16, p = 0.50
(a) P(r ≥ 12) = P(12) + P(13) + P(14) + P(15) + P(16)
= 0.028 + 0.009 + 0.002 + 0.000 + 0.000
= 0.039
(b) P(r ≤ 7) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5) + P(6) + P(7)
= 0.000 + 0.000 + 0.002 + 0.009 + 0.028 + 0.067 + 0.122 + 0.175
= 0.403
(c) µ = np = 16(0.50) = 8
The expected number of inmates serving time for dealing drugs is eight.
10. n = 200, p = 0.80
µ = np = 200(0.80) = 160
We expect 160 flights to arrive on time.
σ = √(npq) = √(200(0.80)(0.20)) ≈ 5.66
The standard deviation is 5.66 flights.
11. n = 10, p = 0.75
(a) The probabilities can be obtained directly from Table 3 in Appendix II.
[Figure: Histogram of Good Grapefruit. Horizontal axis: r = 0 to 10; vertical axis: probability (0.00 to 0.30).]
(b) No more than one is bad is the same event as at least nine are good.
P(r ≥ 9) = P(9) + P(10) = 0.188 + 0.056 = 0.244
P(r ≥ 1) = P(1) + P(2) + P(3) + P(4) + P(5) + P(6) + P(7) + P(8) + P(9) + P(10)
= 0.000 + 0.000 + 0.003 + 0.016 + 0.058 + 0.146 + 0.250 + 0.282 + 0.188 + 0.056
= 0.999
(c) µ = np = 10(0.75) = 7.5
We expect 7.5 good grapefruits.
(d) σ = √(npq) = √(10(0.75)(0.25)) ≈ 1.37
12. Let success = show up, then p = 0.95, n = 82.
µ = np = 82(0.95) = 77.9
If 82 party reservations have been made, 77.9, or about 78, can be expected to show up.
σ = √(npq) = √(82(0.95)(0.05)) ≈ 1.97
13. p = 0.85, n = 12
P(r ≤ 2) = P(0) + P(1) + P(2) = 0.000 + 0.000 + 0.000 = 0.000 (to 3 digits)
The data seem to indicate that the percent favoring the increase in fees is less than 85%.
14. Let success = do not default, then p = 0.50.
Find n such that P(r ≥ 5) = 0.941. Try n = 15.
P(r ≥ 5) = 1 − [P(0) + P(1) + P(2) + P(3) + P(4)]
= 1 − (0.000 + 0.000 + 0.003 + 0.014 + 0.042)
= 1 − 0.059
= 0.941
You should buy 15 bonds if you want to be 94.1% sure that 5 or more will not default.
15. (a) The Poisson distribution would be a good choice because coughs are a relatively rare occurrence. It is reasonable to assume that they are independent events, and the variable is the number of coughs in a fixed time interval.
(b) λ = 11 per 1 minute
From Table 4 in Appendix II,
P(r ≤ 3) = P(0) + P(1) + P(2) + P(3) = 0.0000 + 0.0002 + 0.0010 + 0.0037 = 0.0049
(c) λ = (11/60 seconds) · (0.5/0.5) = 5.5/30 seconds; λ = 5.5 per 30 seconds
P(r ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.0041 − 0.0225 − 0.0618 = 0.9116
16. (a) The Poisson distribution would be a good choice because number of accidents is a relatively rare
occurrence. It is reasonable to assume that they are independent events, and the variable is the number of accidents for a given number of operations.
(b) λ = 2.4 per 100,000 flight operations
From Table 4 in Appendix II, P(0) = 0.0907.
(c) λ = (2.4/100,000) · (2/2) = 4.8/200,000; λ = 4.8 per 200,000 flight operations.
P(r ≥ 4) = 1 − P(0) − P(1) − P(2) − P(3) = 1 − 0.0082 − 0.0395 − 0.0948 − 0.1517 = 0.7058
17. The loan-default problem satisfies the conditions for a binomial experiment. Moreover, p is small, n is large, and np < 10. Using the Poisson approximation to the binomial distribution is appropriate.
n = 300, p = 1/350 ≈ 0.0029, λ = np = 300(1/350) ≈ 0.86 ≈ 0.9
From Table 4 in Appendix II,
P(r ≥ 2) = 1 − P(0) − P(1) = 1 − 0.4066 − 0.3659 = 0.2275
18. This problem satisfies the conditions for a binomial experiment. Moreover, p is small, n is large, and np < 10. Using the Poisson approximation to the binomial distribution is appropriate.
n = 482, p = 551/100,000 = 0.00551
λ = np = 482(0.00551) ≈ 2.7
(a) P(0) = 0.0672
(b) P(r ≥ 1) = 1 − P(0) = 1 − 0.0672 = 0.9328
(c) P(r ≥ 2) = 1 − P(r ≤ 1) = 1 − (0.0672 + 0.1815) = 0.7513
19. (a) Use the geometric distribution with p = 0.5.
P(n = 2) = (0.5)(0.5) = (0.5)^2 = 0.25
P(n = 3) = (0.5)(0.5)(0.5) = 0.125
P(n = 4) = (0.5)(0.5)(0.5)(0.5) = 0.0625
This is the geometric probability distribution with p = 0.5.
(b) P(4) = (0.5)(0.5)^3 = (0.5)^4 = 0.0625
P(n > 4) = 1 − P(1) − P(2) − P(3) − P(4)
= 1 − 0.5 − (0.5)^2 − (0.5)^3 − (0.5)^4
= 0.0625
20. (a) Use the geometric distribution with p = 0.83.
P(1) = (0.83)(0.17)^(1 − 1) = 0.83
(b) P(2) = (0.83)(0.17)^(2 − 1) = 0.1411
P(3) = (0.83)(0.17)^(3 − 1) ≈ 0.0240
P(2 or 3) = 0.1411 + 0.0240 ≈ 0.165
Normal Distributions (Page 1 of 23)
6.1 Graphs of Normal Probability Distributions
[Figure: Normal Probability Distribution. The normal curve (also known as the probability density function) is drawn over the x-axis, with µ − σ, µ, and µ + σ marked and the transition points (TP) at µ − σ and µ + σ.]
Important Properties of a Normal Curve
1. The curve is bell-shaped with the highest point over the mean µ.
2. It is symmetric about the vertical line through µ.
3. The curve approaches the horizontal axis but never crosses or touches it.
4. The transition points (TP) are where the graph changes from cupping upward to cupping downward (or vice versa). The transition points occur at x = µ + σ and x = µ − σ.
5. The total area under the curve is 1.
Guided Exercise 2
a. Which point (A, B or C) corresponds to µ + σ?
b. Which point (A, B or C) corresponds to µ − 2σ?
c. What are the mean and standard deviation of the distribution?
[Figure: normal curve over an x-axis scaled 6, 8, 10, 12, 14, with points A, B, and C marked.]
Example A The mean affects the location of the curve and the standard deviation affects the shape (spread) of the curve.
[Figure: curves A and B, Same Standard Deviation & Different Means.] Which curve has the larger mean?
[Figure: curves C and D, Same Mean & Different Standard Deviations.] Which curve has the larger standard deviation?
Guided Exercise 1 Determine whether each curve is normal or not. If it is not, then state why.
Empirical Rule for a Normal Distribution
a. Approximately 68.2% of the data will lie within 1 standard deviation of the mean.
b. Approximately 95.4% of the data will lie within 2 standard deviations of the mean.
c. Approximately 99.7% of the data will lie within 3 standard deviations of the mean.
d. The area under the curve represents the probability and the total area under the curve is 1.
[Figure: Empirical Rule. Normal curve over an x-axis marked at µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ, µ + 3σ. Areas from left to right: 0.15%, 2.15%, 13.6%, 68.2% (between µ − σ and µ + σ), 13.6%, 2.15%, 0.15%.]
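The empirical-rule percentages can be checked against the exact areas under a normal curve. The sketch below is not part of the notes: it assumes the standard identity that the normal cumulative area can be written with the error function, available as math.erf in the Python standard library, and the function names are illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def prob_between(a, b, mu, sigma):
    """P(a < x < b) for a normal distribution with mean mu and standard deviation sigma."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Area within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(100 * prob_between(-k, k, 0, 1), 1), "%")

# A normal distribution with mean 600 and standard deviation 100:
# area between the mean and one standard deviation above it.
print(round(prob_between(600, 700, 600, 100), 3))
```

The computed areas agree with the rounded percentages quoted in the rule to within about a tenth of a percent.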
Example 1 The playing life of a Sunshine radio is normally distributed with a mean of 600 hours and a standard deviation of 100 hours.
a. Sketch a normal curve showing the distribution of the playing life of the Sunshine radio. Scale and label the axis; include the transition points.
Use the empirical rule to find the probability that a randomly selected radio will last
b. between 600 and 700 hours
c. between 400 and 500 hours
d. more than 700 hours
Guided Exercise 4 The annual wheat yield per acre on a farm is normally distributed with a mean of 35 bushels and a standard deviation of 8 bushels.
a. Sketch a normal curve and shade in the area that represents the probability that an acre will yield between 19 and 35 bushels.
b. Is the shaded area the same as the area between µ − 2σ and µ? Use the empirical rule to find the probability that the yield will be between 19 and 35 bushels per acre.
Control Charts A control chart for a random variable x that is approximately normally distributed is a plot of observed x values in time sequence order. The construction is as follows:
1. Find the mean µ and standard deviation σ of the x distribution in one of two ways:
(i) Use past data from a period during which the process was “in control” or
(ii) Use a specified “target” value of µ and σ.
2. Create a graph where the vertical axis represents the x values and the horizontal axis represents time.
3. Draw a solid horizontal line at height µ and horizontal dashed control-limit lines at µ ± 2σ and µ ± 3σ.
4. Plot the variable x on the graph in time sequence order.
Example 2 Susan is director of personnel at Antlers Lodge and hires many college students every summer. One of the biggest activities for the lodge staff is to make the rooms ready for the next guest. Although the rooms are supposed to be ready by 3:30 pm, there are always some rooms not finished because of high turnover. Every 15 days Susan has a control chart made of the number of rooms not made up by 3:30 pm each day. From past experience, Susan knows that the distribution of rooms not made up by 3:30 pm is approximately normal, with µ = 19.3 rooms and σ = 4.7 rooms. For the past 15 days the staff has reported the number of rooms not made up by 3:30 pm as:
Day   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
x    11  20  25  23  16  19   8  25  17  20  23  29  18  14  10
a. Make a control chart for these data. Completely annotate.
b. Is the housekeeping process out of control? Explain.
[Figure: blank Control Chart for the Number of Rooms Not Made Up by 3:30 PM, with horizontal lines at µ − 3σ, µ − 2σ, µ, µ + 2σ, and µ + 3σ.]
Out-of-Control Warning Signals
1. Out-of-Control Signal I: One point falls beyond the 3σ level. The probability that this is a false alarm is 0.003.
2. Out-of-Control Signal II: A run of 9 consecutive points on one side of the center line. The probability that this is a false alarm is 2 ⋅ (0.5)^9 = 0.004.
3. Out-of-Control Signal III: At least 2 of 3 consecutive points lie beyond the 2σ level on the same side of the center line. The probability that this is a false alarm is 0.002.
[Figures: three control charts, one illustrating each signal, each with a center line at µ and control limits at µ ± 2σ and µ ± 3σ.]
Example 3 Yellowstone Park Medical Services (YPMS) provides emergency medical care for park visitors. History has shown that during the summer the mean number of visitors treated each day is 21.7 with a standard deviation of 4.2. For a 10-day summer period, the following numbers of people were treated:
Day              1   2   3   4   5   6   7   8   9  10
Number Treated  20  15  12  21  24  28  32  36  35  37
a. Make a control chart and plot the data on the chart.
b. Do the data indicate the number of visitors treated by YPMS is in control or out-of-control? Explain your answer in terms of the three out-of-control signal types.
c. If you were the park superintendent, do you think YPMS might need some extra help? Explain.
[Figure: blank control chart with horizontal lines at µ − 3σ, µ − 2σ, µ, µ + 2σ, and µ + 3σ.]
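The three out-of-control signals can be checked mechanically once µ and σ are fixed. The following sketch is one possible way to code the rules as stated in these notes; the function names are illustrative, and the data are the daily counts from Example 2 (Antlers Lodge) and Example 3 (YPMS).

```python
def signal_1(xs, mu, sigma):
    """Signal I: days (1-based) on which a point falls beyond the 3-sigma level."""
    return [day for day, x in enumerate(xs, start=1) if abs(x - mu) > 3 * sigma]

def signal_2(xs, mu, run=9):
    """Signal II: True if `run` consecutive points lie on one side of the center line."""
    side, count = 0, 0
    for x in xs:
        s = (x > mu) - (x < mu)          # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            count += 1
        else:
            side, count = s, (1 if s != 0 else 0)
        if count >= run:
            return True
    return False

def signal_3(xs, mu, sigma):
    """Signal III: True if at least 2 of 3 consecutive points lie beyond
    the 2-sigma level on the same side of the center line."""
    for i in range(len(xs) - 2):
        window = xs[i:i + 3]
        above = sum(x > mu + 2 * sigma for x in window)
        below = sum(x < mu - 2 * sigma for x in window)
        if above >= 2 or below >= 2:
            return True
    return False

lodge = [11, 20, 25, 23, 16, 19, 8, 25, 17, 20, 23, 29, 18, 14, 10]
ypms = [20, 15, 12, 21, 24, 28, 32, 36, 35, 37]

for name, xs, mu, sigma in (("Example 2", lodge, 19.3, 4.7), ("Example 3", ypms, 21.7, 4.2)):
    print(name, signal_1(xs, mu, sigma), signal_2(xs, mu), signal_3(xs, mu, sigma))
```

A chart is still worth drawing: the code only reports whether a rule fired, not the trend that a plot makes visible.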
6.2 Standard Units and The Standard Normal Distribution
Suppose Tina and Jack are in two different sections of the same course and they recently took midterms. Tina’s class average was 64 and she got a 74. Jack’s class average was 72 and he got an 82. Who did better?

z-Score or Standardized Score
To standardize test scores we use the z-value (or z-score). The z-value, z-score, or standardized score is the number of standard deviations a data value lies away from the mean. The z-score can be positive, negative, or zero depending on whether the data value is above the mean, below the mean, or at the mean, respectively. For any x-value in a normal distribution the standardized score, z-value, or z-score is given by:

\[ z = \frac{x - \mu}{\sigma} \]

Note(s):
1. If x = µ, then z = 0
2. If x > µ, then z > 0
3. If x < µ, then z < 0

Fact: Unless otherwise stated, from now on, the average will mean the arithmetic mean.

[Figure: normal curve centered at µ; the z-score is negative for x < µ and positive for x > µ.]
Example 3 Suppose Tina and Jack are in two different sections of the same course and they recently took midterms. The distribution of scores in both classes is normal. Tina’s class average was 64 with a standard deviation of 3; she earned a 74. Jack’s class average was 72 with a standard deviation of 5; he earned an 82.
a. What was Tina’s z-score? Draw a distribution for Tina’s class showing Tina’s score within the distribution of scores for her class.
b. What was Jack’s z-score? Draw a distribution for Jack’s class showing Jack’s score within the distribution of scores for his class.
c. Who did better? Why?
Example 4 A pizza parlor chain claims a large pizza has 8 oz of cheese with a standard deviation of 0.5 oz. An inspector ordered a pizza and found it had only 6.9 oz of cheese. Franchisees can lose their store if they make pizzas whose cheese weight is 3 or more standard deviations below the mean. Assume the weights are normally distributed.
a. Graph the x-distribution. Label and scale the axis.
b. Find the z-score for x = 6.9 oz of cheese.
c. Is the franchise in danger of losing its store? Why?
d. Find the minimum amount of cheese a franchise can put on a large pizza so it is not in danger of losing its store.
Guided Exercise 6 The time it takes a student to get to class from home is normally distributed with a mean of 17 minutes and a standard deviation of 3 minutes.
a. One day it took 21 minutes to get to class. How many standard deviations from the mean is that? Explain the sign.
b. Another day it took 12 minutes to get to class. How many standard deviations from the mean is that? Explain the sign.
c. On a third day it took 17 minutes to get to class. How many standard deviations from the mean is that? Explain the sign.

z-Score Formulae
\[ z = \frac{x - \mu}{\sigma} \quad \text{or} \quad x = z\sigma + \mu \]
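The two z-score formulae are inverses of each other, which a few lines of Python make concrete. This is an illustrative sketch only; the function names `z_score` and `raw_score` are assumptions, and the numbers are the travel times from Guided Exercise 6.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Convert a z-score back to the original x scale."""
    return z * sigma + mu


mu, sigma = 17, 3  # minutes, from Guided Exercise 6
for minutes in (21, 12, 17):
    z = z_score(minutes, mu, sigma)
    # Converting the z-score back should recover the original time.
    print(minutes, round(z, 2), round(raw_score(z, mu, sigma), 2))
```

The sign of each z-score shows at a glance whether the trip was longer or shorter than average.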
Guided Exercise 7 Sam’s z-score on an exam is 1.3. If the scores are normally distributed with a mean of 480 and a standard deviation of 70, what is Sam’s raw score? Draw the distribution.

Standard Normal Distribution / Curve
The standard normal distribution is a normal distribution with mean µ = 0 and standard deviation σ = 1. The standard normal curve is the graph of the standard normal distribution.

[Figure: the standard normal curve, µ = 0, σ = 1, plotted against z with the transition points (TP) marked.]
The Normal Cumulative Distribution Function
The area under the normal curve with mean µ and standard deviation σ on the interval from a to b represents the probability that a randomly chosen value for x lies between a and b. It is given by the normal cumulative distribution function (normalcdf):

Area = P(a < x < b) = normalcdf(a, b, [µ, σ])

Access the DISTR menu (2nd > VARS) and select 2:normalcdf(lower bound, upper bound, [µ, σ]).

Example 6 Find the probability that z is between -1 and 1.

To show all work and receive full credit
(a) (20-50% of the credit) Sketch the normal curve and shade in the area in question. Label and scale the axis. Put dots at the transition points and tick marks on the axis to 3 standard deviations on each side of the mean.
(b) (50-70% of the credit) Compute the area under the curve, hence the probability, showing the probability notation, the TI-83 function accessed along with its inputs and output. Round probabilities to 4 decimal places.

Area = P(−1 < z < 1) = normalcdf(___, ___, ___, ___) = __________
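For readers without a TI calculator, the same area can be computed with Python's standard library. The sketch below is an assumption-laden stand-in, not the calculator's implementation: the wrapper name `normalcdf` simply mirrors the calculator function, and it relies on `statistics.NormalDist` (Python 3.8 or later).

```python
from statistics import NormalDist


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


# Area within 1 and within 3 standard deviations of the mean.
print(round(normalcdf(-1, 1), 4))
print(round(normalcdf(-3, 3), 4))

# A right tail: use infinity where the calculator would use 1E99.
print(round(normalcdf(1, float("inf")), 4))
```

Leaving `mu` and `sigma` at their defaults gives the standard normal curve, matching the optional [µ, σ] arguments on the calculator.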
Example 7 Find the probability that z is
1. between -3 and 3
2. greater than 1
3. between 0 and 2.53
4. greater than 2.53
5. less than -2.34
6. between -2 and 2
6.3 Area Under Any Normal Curve
TI-84: The area under any normal curve between the values a and b is given by

Area = P(a < x < b) = normalcdf(a, b, µ, σ)

To show all work and receive full credit
(a) (20-50% of the credit) Sketch the normal curve and shade in the area in question. Label and scale the axis. Put dots at the transition points and tick marks on the axis to 3 standard deviations on each side of the mean.
(b) (50-70% of the credit) Compute the area under the curve, hence the probability, showing the probability notation, the TI-83 function accessed along with its inputs and output. State probabilities to the nearest ten-thousandth (4 decimal places).

Example 7 Given that x has a normal distribution with a mean of 3 and a standard deviation of 0.5, find the probability that an x selected at random will be between 2.1 and 3.7. Show your work and include a sketch of the normal curve relevant to this application.
Example 8 Let x have a normal distribution with a mean of 10 and a standard deviation of 2. Find the probability that an x selected at random from the distribution is between 11 and 14. Show your work and include a sketch of the normal curve representing the probability.

Example A A factory has a machine that puts corn flakes in boxes that are advertised as 20 ounces each. If the distribution of weights is normal with µ = 20 and σ = 1.5, what is the probability that the weight of a randomly selected box of corn flakes will be between 19 and 21 oz? Show your work and include a sketch of the normal curve representing the probability.

Guided Exercise 10 If the life of a Sunshine Stereo is normally distributed with a mean of 2.3 years and a standard deviation of 0.4 years, what is the probability that a randomly selected stereo will break down during the warranty period of 2 years? Show your work and include a sketch of the normal curve representing the probability.
The Inverse-Normal Function: DISTR / 3:invNorm(

a = invNorm(area to the left of a, µ, σ)

Function: Access the DISTR menu (2nd > VARS) and select 3:invNorm(area to the left of a, [µ, σ]).
Input:
  “Area to the left of a” = P(x < a)
  Mean µ (default value is 0)
  Standard deviation σ (default value is 1)
Output: The value of a on the x-axis of the normal curve.

[Figure: normal curve with the area P(x < a) shaded to the left of a.]

Guided Exercise 11 Find the value of a on the z-axis so that 3% of the area under the standard normal curve lies to the left of a. Round to two decimal places.
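The inverse-normal lookup has a direct counterpart in Python's standard library. The sketch below is illustrative: the wrapper name `inv_norm` is an assumption chosen to echo the calculator function, and it uses `statistics.NormalDist.inv_cdf` (Python 3.8 or later).

```python
from statistics import NormalDist


def inv_norm(area_left, mu=0.0, sigma=1.0):
    """Value a on the x-axis with the given area to its left."""
    return NormalDist(mu, sigma).inv_cdf(area_left)


# Guided Exercise 11: 3% of the standard normal curve lies to the left of a.
a = inv_norm(0.03)
print(round(a, 2))

# Sanity check: feeding a back into the cdf should return the area.
print(round(NormalDist().cdf(a), 4))
```

The round trip through `cdf` is a quick way to confirm that the area was entered as a left-tail area.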
Example B
a. Draw a standard normal curve. Then find the value of a > 0 so that 32% of the curve lies between 0 and a.
b. Draw a standard normal curve. Then find the value(s) of a on the z-axis so that 94% of the curve lies between -a and a.
c. Draw a normal curve with a mean of 90 and a standard deviation of 7. Then find the values of b so that 41% of the curve lies between the mean and b.
d. Draw a normal curve with a mean of 45 and a standard deviation of 5. Then find the value of b so that 88% of the curve lies between 45 - b and 45 + b.
Example C Suppose a distribution is normal with a mean of 44 and a standard deviation of 6. Find the value of a (to the nearest hundredth) on the x-axis so that
a. 66% of the data values lie below a.
b. 15% of the data values lie above a.

Example 9 Magic Video Games Inc. sells expensive computer games and wants to advertise an impressive, full-refund warranty period. It has found that the mean life for its computer games is 30 months with a standard deviation of 4 months. If the life spans of the computer games are normally distributed, how long a warranty period (to the nearest month) can be offered so that the company will not have to refund the price of more than 7% of the computer games?
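Both examples hinge on one point: the inverse-normal function expects the area to the left, so an "above a" statement has to be converted first. The Python sketch below illustrates this under stated assumptions: the helper name `inv_norm_right` is hypothetical, and rounding the warranty down to a whole month is one reasonable reading of "to the nearest month" that keeps refunds at or below 7%.

```python
from statistics import NormalDist


def inv_norm_right(area_right, mu=0.0, sigma=1.0):
    """Value a with the given area to its RIGHT (hypothetical helper)."""
    return NormalDist(mu, sigma).inv_cdf(1 - area_right)


# Example C: mean 44, standard deviation 6.
a_below = NormalDist(44, 6).inv_cdf(0.66)   # 66% of values lie below a
a_above = inv_norm_right(0.15, 44, 6)       # 15% of values lie above a
print(round(a_below, 2), round(a_above, 2))

# Example 9: the longest warranty with at most 7% of games failing inside it.
cutoff = NormalDist(30, 4).inv_cdf(0.07)
warranty_months = int(cutoff)  # round down so the refund rate stays under 7%
print(round(cutoff, 2), warranty_months)
```

Rounding up instead would push the expected refund rate past the 7% target, which is why the sketch truncates.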
Exercise 39 Suppose you eat lunch at a restaurant that does not take reservations. Let x represent the time spent waiting to be seated. It is known that the mean waiting time is 18 minutes with a standard deviation of 4 minutes, and the x distribution is normal. What is the probability that the waiting time will exceed 20 minutes given that it has exceeded 15 minutes? Let
event A = “x > 20 minutes”
event B = “x > 15 minutes”
Answer by completing the following:
a. In terms of events A and B, what is it we want to compute?
b. Is the event “B and A” = event “x > 20” (i.e. event A)?
c. Show that P(B and A) = P(B) ⋅ P(A, given B) is the same as P(x > 20) = P(x > 15) ⋅ P(A, given B)
d. Compute P(A), P(B) and P(A, given B)
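Part (d) can be checked numerically once parts (a) to (c) are understood. The sketch below is illustrative only and uses Python's `statistics.NormalDist`; the variable names are assumptions. It relies on the relationship from part (c), rearranged as P(A, given B) = P(A) / P(B), which holds here because every wait longer than 20 minutes is also longer than 15 minutes.

```python
from statistics import NormalDist

wait = NormalDist(mu=18, sigma=4)  # waiting time in minutes

p_a = 1 - wait.cdf(20)  # P(A) = P(x > 20)
p_b = 1 - wait.cdf(15)  # P(B) = P(x > 15)

# Since A is contained in B, P(B and A) = P(A).
p_a_given_b = p_a / p_b

print(round(p_a, 4), round(p_b, 4), round(p_a_given_b, 4))
```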
6.4 Approximate a Binomial Distribution with a Normal Distribution
Fact If np > 5 and nq > 5 in a binomial distribution, then the sample size n is large enough so that the binomial random variable r has a distribution that is approximately normal. The mean and standard deviation of the normal distribution are estimated by µ = np and σ = √(npq). Furthermore, as the sample size gets larger the approximation gets better.
Approximating a Binomial Distribution Using a Normal Distribution
If np > 5 and nq > 5, then the binomial random variable r has a distribution that is approximately normal. The mean and standard deviation of the normal distribution are estimated by µ = np and σ = √(npq).

Example 12 The owner of a new apartment building needs to have 25 new water heaters installed. Assume the probability that a water heater will last 10 years is 0.25.
(a) What is the probability that 8 or more will last at least 10 years?
(b) Can the binomial probability distribution be approximated by a normal distribution? Why or why not?
(c) Estimate part (a) with a normal distribution. Take note of the continuity correction. That is, remember to subtract 0.5 from the left endpoint of the interval and add 0.5 to the right endpoint.
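To see how close the approximation is, the exact binomial sum and the normal estimate can be computed side by side. This is a minimal sketch using only Python's standard library (`math.comb` requires Python 3.8 or later); the variable names are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.25
q = 1 - p

# (a) Exact binomial probability of 8 or more successes.
exact = sum(comb(n, r) * p**r * q**(n - r) for r in range(8, n + 1))

# (b) Rule of thumb for using the normal approximation.
approximation_ok = n * p > 5 and n * q > 5

# (c) Normal estimate with the continuity correction: r >= 8 becomes
# x >= 7.5, i.e. subtract 0.5 from the left endpoint.
mu, sigma = n * p, sqrt(n * p * q)
approx = 1 - NormalDist(mu, sigma).cdf(8 - 0.5)

print(approximation_ok, round(exact, 4), round(approx, 4))
```

With np only slightly above 5 the two values agree to roughly two decimal places, which is consistent with the note that the approximation improves as n grows.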
420 Part IV: Complete Solutions, Chapter 7
Copyright © Houghton Mifflin Company. All rights reserved.
Chapter 7: Introduction to Sampling Distributions
Section 7.1
1. Answers vary. Students should identify the individuals (subjects) and variable involved. For example, the population of all ages of all people in Colorado, the population of weights of all students in your school, or the population count of all antelope in Wyoming.
2. Answers vary. A simple random sample of n measurements from a population is a subset of the population selected in a manner such that
(a) Every sample of size n from the population has an equal chance of being selected.
(b) Every member of the population has an equal chance of being included in the sample.
3. A population parameter is a numerical descriptive measure of a population, such as µ, the population mean; σ, the population standard deviation; σ², the population variance; p, the population proportion; the population maximum and minimum; etc.
4. A sample statistic is a numerical descriptive measure of a sample, such as x̄, the sample mean; s, the sample standard deviation; s², the sample variance; p̂, the sample proportion; the sample maximum and minimum; etc.
5. A statistical inference refers to conclusions about the value of a population parameter based on information from the corresponding sample statistic and the associated probability distributions. We will do both estimation and testing.
6. A sampling distribution is a probability distribution for a sample statistic.
7. They help us visualize the sampling distribution by using tables and graphs that approximately represent the sampling distribution.
8. Relative frequencies can be thought of as a measure or estimate of the likelihood of a certain statistic falling within the class bounds.
9. We studied the sampling distribution of mean trout lengths based on samples of size 5. Other such sampling distributions abound. Notice that the sample size remains the same for each sample in a sampling distribution.

Section 7.2
Note: Answers may vary slightly depending on the number of digits carried in the standard deviation.
1. The standard error is simply the standard deviation for a sampling distribution.
2. The standard error
3. x̄ is an unbiased estimator for μ, and p̂ is an unbiased estimator for p.
4. As the sample size increases, the variability decreases.
5. (a) The required sample size is n ≥ 30.
(b) No. If the original distribution is normal, then x̄ is distributed normally for any sample size.
6. (a) No, because we require a sample size of n ≥ 30 if the original distribution is not normal.
(b) Yes, the x̄ distribution is normal with mean \( \mu_{\bar{x}} = 72 \) and \( \sigma_{\bar{x}} = 8/\sqrt{16} = 2 \).
\[ z = \frac{68 - 72}{2} = -2, \qquad z = \frac{73 - 72}{2} = 0.50 \]
\[ P(68 \le \bar{x} \le 73) = P(-2 \le z \le 0.50) = 0.6687 \]
7. The distribution with n = 225 will have a smaller standard error. Since \( \sigma_{\bar{x}} = \sigma/\sqrt{n} \), dividing by the square root of 225 will result in a smaller standard error regardless of the value of σ.
8. (a) We require n = 36 because \( 12/\sqrt{36} = 12/6 = 2 \).
(b) We require n = 144 because \( 12/\sqrt{144} = 12/12 = 1 \).
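9. (a)
\[ \mu_{\bar{x}} = \mu = 15, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{14}{\sqrt{49}} = 2.0 \]
Because n = 49 ≥ 30, by the central limit theorem, we can assume that the distribution of x̄ is approximately normal.
\[ z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - 15}{2.0} \]
x̄ = 15 converts to \( z = \frac{15 - 15}{2.0} = 0 \)
x̄ = 17 converts to \( z = \frac{17 - 15}{2.0} = 1 \)
\[
\begin{aligned}
P(15 \le \bar{x} \le 17) &= P(0 \le z \le 1) \\
&= P(z \le 1) - P(z \le 0) \\
&= 0.8413 - 0.5000 \\
&= 0.3413
\end{aligned}
\]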
(b)
\[ \mu_{\bar{x}} = \mu = 15, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{14}{\sqrt{64}} = 1.75 \]
Because n = 64 ≥ 30, by the central limit theorem, we can assume that the distribution of x̄ is approximately normal.
\[ z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - 15}{1.75} \]
x̄ = 15 converts to \( z = \frac{15 - 15}{1.75} = 0 \)
x̄ = 17 converts to \( z = \frac{17 - 15}{1.75} = 1.14 \)
\[
\begin{aligned}
P(15 \le \bar{x} \le 17) &= P(0 \le z \le 1.14) \\
&= P(z \le 1.14) - P(z \le 0) \\
&= 0.8729 - 0.5000 \\
&= 0.3729
\end{aligned}
\]
(c) The standard deviation of part (b) is smaller because of the larger sample size. Therefore, the distribution about \( \mu_{\bar{x}} \) is narrower.
10. (a) For both distributions, the mean will be \( \mu_{\bar{x}} = 5 \).
(b) The distribution with n = 81 because the standard deviation will be smaller. This distribution will be less spread out around its mean.
(c) The distribution with n = 81 because the standard deviation will be smaller. This distribution will be less spread out around its mean.
11. (a) µ = 75, σ = 0.8
\[ P(x < 74.5) = P\left(z < \frac{74.5 - 75}{0.8}\right) = P(z < -0.63) = 0.2643 \]
(b) \( \mu_{\bar{x}} = 75, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.8/\sqrt{20} = 0.179 \)
\[ P(\bar{x} < 74.5) = P\left(z < \frac{74.5 - 75}{0.179}\right) = P(z < -2.79) = 0.0026 \]
(c) No. If the weight of only one car were less than 74.5 tons, we could not conclude that the loader is out of adjustment. If the mean weight for a sample of 20 cars were less than 74.5 tons, we would suspect that the loader is malfunctioning because the probability of this event occurring is 0.26% if indeed the distribution is correct.
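The contrast between one car and the mean of 20 cars can be reproduced without a z-table. The sketch below is illustrative and uses Python's `statistics.NormalDist`; the variable names are assumptions. Because it does not round z to two decimals, its results can differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 75, 0.8, 20  # tons per car, sample of 20 cars

# One car: use the x distribution directly.
p_single = NormalDist(mu, sigma).cdf(74.5)

# Mean of n cars: same mean, standard error sigma / sqrt(n).
p_mean = NormalDist(mu, sigma / sqrt(n)).cdf(74.5)

print(round(p_single, 4), round(p_mean, 4))
```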
12. (a) µ = 68, σ = 3
\[
\begin{aligned}
P(67 \le x \le 69) &= P\left(\frac{67 - 68}{3} \le z \le \frac{69 - 68}{3}\right) \\
&= P(-0.33 \le z \le 0.33) \\
&= P(z \le 0.33) - P(z \le -0.33) \\
&= 0.6293 - 0.3707 = 0.2586
\end{aligned}
\]
(b) \( \mu_{\bar{x}} = 68, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 3/\sqrt{9} = 1 \)
\[
\begin{aligned}
P(67 \le \bar{x} \le 69) &= P\left(\frac{67 - 68}{1} \le z \le \frac{69 - 68}{1}\right) \\
&= P(-1 \le z \le 1) \\
&= P(z \le 1) - P(z \le -1) \\
&= 0.8413 - 0.1587 = 0.6826
\end{aligned}
\]
(c) The probability in part (b) is much higher because the standard deviation is smaller for the x̄ distribution.
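13. (a) µ = 85, σ = 25
\[ P(x < 40) = P\left(z < \frac{40 - 85}{25}\right) = P(z < -1.8) = 0.0359 \]
(b) The probability distribution of x̄ is approximately normal with \( \mu_{\bar{x}} = 85 \) and \( \sigma_{\bar{x}} = \sigma/\sqrt{n} = 25/\sqrt{2} = 17.68 \).
\[ P(\bar{x} < 40) = P\left(z < \frac{40 - 85}{17.68}\right) = P(z < -2.55) = 0.0054 \]
(c) \( \mu_{\bar{x}} = 85, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 25/\sqrt{3} = 14.43 \)
\[ P(\bar{x} < 40) = P\left(z < \frac{40 - 85}{14.43}\right) = P(z < -3.12) = 0.0009 \]
(d) \( \mu_{\bar{x}} = 85, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 25/\sqrt{5} = 11.2 \)
\[ P(\bar{x} < 40) = P\left(z < \frac{40 - 85}{11.2}\right) = P(z < -4.02) < 0.0002 \]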
(e) Yes. The more tests a patient completes, the stronger is the evidence for excess insulin. If the average value based on five tests were less than 40, the patient is almost certain to have excess insulin.
14. µ = 7500, σ = 1750
(a)
\[ P(x < 3500) = P\left(z < \frac{3500 - 7500}{1750}\right) = P(z < -2.29) = 0.0110 \]
(b) The probability distribution of x̄ is approximately normal with \( \mu_{\bar{x}} = 7500 \) and \( \sigma_{\bar{x}} = \sigma/\sqrt{n} = 1750/\sqrt{2} = 1237.44 \).
\[ P(\bar{x} < 3500) = P\left(z < \frac{3500 - 7500}{1237.44}\right) = P(z < -3.23) = 0.0006 \]
(c) \( \mu_{\bar{x}} = 7500, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 1750/\sqrt{3} = 1010.36 \)
\[ P(\bar{x} < 3500) = P\left(z < \frac{3500 - 7500}{1010.36}\right) = P(z < -3.96) < 0.0002 \]
(d) The probabilities decreased as n increased. It would be an extremely rare event for a person to have two or three tests below 3,500 purely by chance. The person probably has leukopenia.
15. (a) µ = 63.0, σ = 7.1
\[ P(x < 54) = P\left(z < \frac{54 - 63.0}{7.1}\right) = P(z < -1.27) = 0.1020 \]
(b) The expected number undernourished is 2,200 × 0.1020 = 224.4, or about 224.
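(c) \( \mu_{\bar{x}} = 63.0, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 7.1/\sqrt{50} = 1.004 \)
\[ P(\bar{x} < 60) = P\left(z < \frac{60 - 63.0}{1.004}\right) = P(z < -2.99) = 0.0014 \]
(d) \( \mu_{\bar{x}} = 63.0, \ \sigma_{\bar{x}} = 1.004 \)
\[ P(\bar{x} < 64.2) = P\left(z < \frac{64.2 - 63.0}{1.004}\right) = P(z < 1.20) = 0.8849 \]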
Since the sample average is above the mean, it is quite unlikely that the doe population is undernourished.
16. (a) By the central limit theorem, the sampling distribution of x̄ is approximately normal with mean and standard error
\[ \mu_{\bar{x}} = \mu = \$20, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{\$7}{\sqrt{100}} = \$0.70 \]
It is not necessary to make any assumption about the x distribution because n is large.
(b) \( \mu_{\bar{x}} = \$20, \ \sigma_{\bar{x}} = \$0.70 \)
\[
\begin{aligned}
P(\$18 \le \bar{x} \le \$22) &= P\left(\frac{\$18 - \$20}{\$0.70} \le z \le \frac{\$22 - \$20}{\$0.70}\right) \\
&= P(-2.86 \le z \le 2.86) \\
&= P(z \le 2.86) - P(z \le -2.86) \\
&= 0.9979 - 0.0021 = 0.9958
\end{aligned}
\]
(c) \( \mu = \$20, \ \sigma = \$7 \)
\[
\begin{aligned}
P(\$18 \le x \le \$22) &= P\left(\frac{\$18 - \$20}{\$7} \le z \le \frac{\$22 - \$20}{\$7}\right) \\
&= P(-0.29 \le z \le 0.29) \\
&= 0.6141 - 0.3859 = 0.2282
\end{aligned}
\]
(d) We expect the probability in part (b) to be much higher than the probability in part (c) because the standard deviation is smaller for the x̄ distribution than it is for the x distribution. By the central limit theorem, the sampling distribution of x̄ will be approximately normal as n increases, and its standard deviation σ/√n will decrease as n increases. For a fixed interval, such as $18 to $22, centered at the mean, $20 in this case, the proportion of the possible x̄ values within the interval will be greater than the proportion of the possible x values within the same interval. A sample of 100 customers contains much more information about purchasing tendencies than a single customer, so averages are much more predictable than a single observation.
17. (a) The random variable x is itself an average based on the number of stocks or bonds in the fund. Since x
itself represents a sample mean return based on a large (random) sample of size n = 250 of stocks or bonds, x has a distribution that is approximately normal (central limit theorem).
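(b) \( \mu_{\bar{x}} = 1.6\%, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.9\%/\sqrt{6} = 0.367\% \)
\[
\begin{aligned}
P(1\% \le \bar{x} \le 2\%) &= P\left(\frac{1\% - 1.6\%}{0.367\%} \le z \le \frac{2\% - 1.6\%}{0.367\%}\right) \\
&= P(-1.63 \le z \le 1.09) \\
&= P(z \le 1.09) - P(z \le -1.63) \\
&= 0.8621 - 0.0516 = 0.8105
\end{aligned}
\]
(c) Note: 2 years = 24 months; x is monthly percentage return.
\( \mu_{\bar{x}} = 1.6\%, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.9\%/\sqrt{24} = 0.1837\% \)
\[
\begin{aligned}
P(1\% \le \bar{x} \le 2\%) &= P\left(\frac{1\% - 1.6\%}{0.1837\%} \le z \le \frac{2\% - 1.6\%}{0.1837\%}\right) \\
&= P(-3.27 \le z \le 2.18) \\
&= P(z \le 2.18) - P(z \le -3.27) \\
&= 0.9854 - 0.0005 = 0.9849
\end{aligned}
\]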
(d) Yes. The probability increases as the standard deviation decreases. The standard deviation decreases as the sample size increases.
(e) \( \mu_{\bar{x}} = 1.6\%, \ \sigma_{\bar{x}} = 0.1837\% \)
\[ P(\bar{x} < 1\%) = P\left(z < \frac{1\% - 1.6\%}{0.1837\%}\right) = P(z < -3.27) = 0.0005 \]
This is very unlikely if µ = 1.6%. One would suspect that µ has slipped below 1.6%.
18. (a) The random variable x is itself an average based on the number of stocks in the fund. Since x itself represents a sample mean return based on a large (random) sample of size n = 100 of stocks, x has a distribution that is approximately normal (central limit theorem).
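(b) \( \mu_{\bar{x}} = 1.4\%, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.8\%/\sqrt{9} = 0.2667\% \)
\[
\begin{aligned}
P(1\% \le \bar{x} \le 2\%) &= P\left(\frac{1\% - 1.4\%}{0.2667\%} \le z \le \frac{2\% - 1.4\%}{0.2667\%}\right) \\
&= P(-1.50 \le z \le 2.25) \\
&= P(z \le 2.25) - P(z \le -1.50) \\
&= 0.9878 - 0.0668 = 0.9210
\end{aligned}
\]
(c) \( \mu_{\bar{x}} = 1.4\%, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.8\%/\sqrt{18} = 0.1886\% \)
\[
\begin{aligned}
P(1\% \le \bar{x} \le 2\%) &= P\left(\frac{1\% - 1.4\%}{0.1886\%} \le z \le \frac{2\% - 1.4\%}{0.1886\%}\right) \\
&= P(-2.12 \le z \le 3.18) \\
&= P(z \le 3.18) - P(z \le -2.12) \\
&= 0.9993 - 0.0170 = 0.9823
\end{aligned}
\]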
(d) Yes. The probability increases as the standard deviation decreases. The standard deviation decreases as the sample size increases.
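(e) \( \mu_{\bar{x}} = 1.4\%, \ \sigma_{\bar{x}} = 0.1886\% \)
\[
\begin{aligned}
P(\bar{x} > 2\%) &= P\left(z > \frac{2\% - 1.4\%}{0.1886\%}\right) \\
&= P(z > 3.18) \\
&= 1 - P(z \le 3.18) \\
&= 1 - 0.9993 = 0.0007
\end{aligned}
\]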
This is very unlikely if µ = 1.4%. One would suspect that the European stock market may be heating up; i.e., µ is greater than 1.4%.
19. (a) The total checkout time for 30 customers is the sum of the checkout times for each individual
customer. Thus w = x1 + x2 + … + x30, and the probability that the total checkout time for the next 30 customers is less than 90 is P(w < 90).
(b) If we divide both sides of w < 90 by 30, we obtain w/30 < 3. However, w is the sum of 30 waiting times, so w/30 is x̄. Therefore, P(w < 90) = P(x̄ < 3).
(c) The probability distribution of x̄ is approximately normal with mean \( \mu_{\bar{x}} = \mu = 2.7 \) and standard deviation \( \sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.6/\sqrt{30} = 0.1095 \).
(d)
\[ P(\bar{x} < 3) = P\left(z < \frac{3 - 2.7}{0.1095}\right) = P(z < 2.74) = 0.9969 \]
The probability that the total checkout time for the next 30 customers is less than 90 minutes is 0.9969.
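The step from a total to a sample mean can be double-checked numerically. The Python sketch below is illustrative only. Its second calculation works with the total directly and assumes the standard result that a sum of n independent times has mean nµ and standard deviation σ√n, which the solution above does not state; it is included purely as a cross-check.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.7, 0.6, 30  # minutes per customer, 30 customers

# Route used in the solution: P(w < 90) = P(x-bar < 90/30).
p_mean = NormalDist(mu, sigma / sqrt(n)).cdf(90 / n)

# Cross-check on the total itself (assumed: mean n*mu, sd sigma*sqrt(n)).
p_total = NormalDist(n * mu, sigma * sqrt(n)).cdf(90)

print(round(p_mean, 4), round(p_total, 4))
```

Both routes standardize to the same z-value, so they give the same probability.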
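20. Let w = x1 + x2 + … + x36.
(a) w < 320 is equivalent to w/36 < 320/36, or x̄ < 8.889.
\( \mu_{\bar{x}} = \mu = 8.5, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 2.5/\sqrt{36} = 0.4167 \)
\[
\begin{aligned}
P(w < 320) &= P(\bar{x} < 8.889) \\
&= P\left(z < \frac{8.889 - 8.5}{0.4167}\right) \\
&= P(z < 0.93) = 0.8238
\end{aligned}
\]
(b) w > 275 is equivalent to w/36 > 275/36, or x̄ > 7.639. \( \mu_{\bar{x}} = 8.5, \ \sigma_{\bar{x}} = 0.4167 \)
\[
\begin{aligned}
P(w > 275) &= P(\bar{x} > 7.639) \\
&= P\left(z > \frac{7.639 - 8.5}{0.4167}\right) \\
&= P(z > -2.07) \\
&= 1 - P(z \le -2.07) \\
&= 1 - 0.0192 \approx 0.9808
\end{aligned}
\]
(c)
\[
\begin{aligned}
P(275 < w < 320) &= P(7.639 < \bar{x} < 8.889) \\
&= P(-2.07 < z < 0.93) \\
&= P(z < 0.93) - P(z < -2.07) \\
&= 0.8238 - 0.0192 = 0.8046
\end{aligned}
\]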
21. (a) Let w = x1 + x2 + … + x5.
\( \mu_{\bar{x}} = \mu = 17, \ \sigma_{\bar{x}} = \sigma/\sqrt{n} = 3.3/\sqrt{5} = 1.476 \)
\[
\begin{aligned}
P(w > 90) &= P\left(\frac{w}{5} > \frac{90}{5}\right) \\
&= P(\bar{x} > 18) \\
&= P\left(z > \frac{18 - 17}{1.476}\right) \\
&= P(z > 0.68) \\
&= 1 - 0.7517 = 0.2483
\end{aligned}
\]
(b)
\[
\begin{aligned}
P(w < 80) &= P\left(\frac{w}{5} < \frac{80}{5}\right) \\
&= P(\bar{x} < 16) \\
&= P\left(z < \frac{16 - 17}{1.476}\right) \\
&= P(z < -0.68) = 0.2483
\end{aligned}
\]
(c)
\[
\begin{aligned}
P(80 < w < 90) &= P(16 < \bar{x} < 18) \\
&= P(-0.68 < z < 0.68) \\
&= P(z < 0.68) - P(z < -0.68) \\
&= 0.7517 - 0.2483 = 0.5034
\end{aligned}
\]
Section 7.3
1. We must check that np > 5 and nq > 5.
2. \( \sigma_{\hat{p}} = \sqrt{pq/n} \) and \( \mu_{\hat{p}} = p \)
3. Yes, it is unbiased. The mean of the distribution for p̂ is p.
4. (a) 0.5/25 = 0.02
(b) 0.5/100 = 0.005
(c) As n increases, the continuity correction decreases.
5. (a) np = 33(0.21) = 6.93, nq = 33(0.79) = 26.07
Yes, p̂ can be approximated by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.21, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.21)(0.79)}{33}} \approx 0.071 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{33} \approx 0.015 \]
\[
\begin{aligned}
P(0.15 \le \hat{p} \le 0.25) &= P(0.15 - 0.015 \le x \le 0.25 + 0.015) \\
&= P(0.135 \le x \le 0.265) \\
&= P\left(\frac{0.135 - 0.21}{0.071} \le z \le \frac{0.265 - 0.21}{0.071}\right) \\
&= P(-1.06 \le z \le 0.77) \\
&= P(z \le 0.77) - P(z \le -1.06) \\
&= 0.7794 - 0.1446 = 0.6348
\end{aligned}
\]
(b) No, because np = 25 × 0.15 = 3.75 does not exceed 5.
(c) np = 48(0.15) = 7.2, nq = 48(0.85) = 40.8
Yes, p̂ can be approximated by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.15, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.15)(0.85)}{48}} \approx 0.052 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{48} = 0.010 \]
\[
\begin{aligned}
P(\hat{p} \ge 0.22) &= P(x \ge 0.22 - 0.010) \\
&= P(x \ge 0.21) \\
&= P\left(z \ge \frac{0.21 - 0.15}{0.052}\right) \\
&= P(z \ge 1.15) \\
&= 1 - P(z < 1.15) \\
&= 1 - 0.8749 = 0.1251
\end{aligned}
\]
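The same recipe (check np and nq, compute the standard error, widen the interval by the continuity correction 0.5/n) repeats throughout this section, so it can be wrapped in one function. The sketch below is illustrative; the name `p_hat_between` is an assumption, and because it does not round intermediate values its output can differ from the table-based answers in the second or third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def p_hat_between(n, p, low, high):
    """Normal approximation to P(low <= p-hat <= high), continuity-corrected."""
    q = 1 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError("np and nq must both exceed 5")
    sigma = sqrt(p * q / n)   # standard error of p-hat
    cc = 0.5 / n              # continuity correction
    dist = NormalDist(p, sigma)
    return dist.cdf(high + cc) - dist.cdf(low - cc)


# Problem 5(a): n = 33, p = 0.21, interval 0.15 to 0.25.
prob = p_hat_between(33, 0.21, 0.15, 0.25)
print(round(prob, 4))

# Problem 5(b): np = 3.75, so the approximation is refused.
try:
    p_hat_between(25, 0.15, 0.10, 0.20)
except ValueError as err:
    print("not applicable:", err)
```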
6. (a) n = 50, p = 0.36
np = 50(0.36) = 18, nq = 50(0.64) = 32
Approximate p̂ by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.36, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.36)(0.64)}{50}} \approx 0.068 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{50} = 0.01 \]
\[
\begin{aligned}
P(0.30 \le \hat{p} \le 0.45) &\approx P(0.30 - 0.01 \le x \le 0.45 + 0.01) \\
&= P(0.29 \le x \le 0.46) \\
&= P\left(\frac{0.29 - 0.36}{0.068} \le z \le \frac{0.46 - 0.36}{0.068}\right) \\
&= P(-1.03 \le z \le 1.47) \\
&= P(z \le 1.47) - P(z \le -1.03) \\
&= 0.9292 - 0.1515 = 0.7777
\end{aligned}
\]
(b) n = 38, p = 0.25
np = 38(0.25) = 9.5, nq = 38(0.75) = 28.5
Approximate p̂ by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.25, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.25)(0.75)}{38}} \approx 0.070 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{38} = 0.013 \]
\[
\begin{aligned}
P(\hat{p} > 0.35) &= P(x > 0.35 - 0.013) \\
&= P(x > 0.337) \\
&= P\left(z > \frac{0.337 - 0.25}{0.070}\right) \\
&= P(z > 1.24) \\
&= 1 - P(z \le 1.24) \\
&= 1 - 0.8925 = 0.1075
\end{aligned}
\]
(c) n = 41, p = 0.09
np = 41(0.09) = 3.69
We cannot approximate p̂ by a normal random variable because np < 5.
7.
n = 30, p = 0.60
np = 30(0.60) = 18, nq = 30(0.40) = 12
Approximate p̂ by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.6, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.6)(0.4)}{30}} \approx 0.089 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{30} = 0.017 \]
(a)
\[
\begin{aligned}
P(\hat{p} \ge 0.5) &\approx P(x \ge 0.5 - 0.017) \\
&= P(x \ge 0.483) \\
&= P\left(z \ge \frac{0.483 - 0.6}{0.089}\right) \\
&= P(z \ge -1.31) = 0.9049
\end{aligned}
\]
(b)
\[
\begin{aligned}
P(\hat{p} \ge 0.667) &\approx P(x \ge 0.667 - 0.017) \\
&= P(x \ge 0.65) \\
&= P\left(z \ge \frac{0.65 - 0.6}{0.089}\right) \\
&= P(z \ge 0.56) = 0.2877
\end{aligned}
\]
(c)
\[
\begin{aligned}
P(\hat{p} \le 0.333) &\approx P(x \le 0.333 + 0.017) \\
&= P(x \le 0.35) \\
&= P\left(z \le \frac{0.35 - 0.6}{0.089}\right) \\
&= P(z \le -2.81) = 0.0025
\end{aligned}
\]
(d) Yes, both np and nq exceed 5.
8. (a)
n = 38, p = 0.73
np = 38(0.73) = 27.74, nq = 38(0.27) = 10.26
Approximate p̂ by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.73, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.73)(0.27)}{38}} \approx 0.072 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{38} = 0.013 \]
\[
\begin{aligned}
P(\hat{p} \ge 0.667) &\approx P(x \ge 0.667 - 0.013) \\
&= P(x \ge 0.654) \\
&= P\left(z \ge \frac{0.654 - 0.73}{0.072}\right) \\
&= P(z \ge -1.06) = 0.8554
\end{aligned}
\]
(b) n = 45, p = 0.86
np = 45(0.86) = 38.7, nq = 45(0.14) = 6.3
Approximate p̂ by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.86, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.86)(0.14)}{45}} \approx 0.052 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{45} = 0.011 \]
\[
\begin{aligned}
P(\hat{p} \ge 0.667) &\approx P(x \ge 0.667 - 0.011) \\
&= P(x \ge 0.656) \\
&= P\left(z \ge \frac{0.656 - 0.86}{0.052}\right) \\
&= P(z \ge -3.92) \approx 1
\end{aligned}
\]
(c) Yes, both np and nq exceed 5 for men and for women.
9. (a)
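n = 100, p = 0.06
np = 100(0.06) = 6, nq = 100(0.94) = 94
p̂ can be approximated by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.06, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.06)(0.94)}{100}} \approx 0.024 \]
\[ \text{Continuity correction} = \frac{0.5}{100} = 0.005 \]
(b)
\[
\begin{aligned}
P(\hat{p} \ge 0.07) &\approx P(x \ge 0.07 - 0.005) \\
&= P(x \ge 0.065) \\
&= P\left(z \ge \frac{0.065 - 0.06}{0.024}\right) \\
&= P(z \ge 0.21) = 0.4168
\end{aligned}
\]
(c)
\[
\begin{aligned}
P(\hat{p} \ge 0.11) &\approx P(x \ge 0.11 - 0.005) \\
&= P(x \ge 0.105) \\
&= P\left(z \ge \frac{0.105 - 0.06}{0.024}\right) \\
&= P(z \ge 1.88) = 0.0301
\end{aligned}
\]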
Yes; because this probability is so small, it should rarely occur. The machine might need an adjustment.
10. (a) n = 50, p = 0.565
np = 50(0.565) = 28.25, nq = 50(0.435) = 21.75
p̂ can be approximated by a normal random variable because both np and nq exceed 5.
\[ \mu_{\hat{p}} = p = 0.565, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.565)(0.435)}{50}} \approx 0.070 \]
\[ \text{Continuity correction} = \frac{0.5}{n} = \frac{0.5}{50} = 0.01 \]
(b)
\[
\begin{aligned}
P(\hat{p} \le 0.53) &\approx P(x \le 0.53 + 0.01) \\
&= P(x \le 0.54) \\
&= P\left(z \le \frac{0.54 - 0.565}{0.070}\right) \\
&= P(z \le -0.36) = 0.3594
\end{aligned}
\]
(c)
\[
\begin{aligned}
P(\hat{p} \le 0.41) &\approx P(x \le 0.41 + 0.01) \\
&= P(x \le 0.42) \\
&= P\left(z \le \frac{0.42 - 0.565}{0.070}\right) \\
&= P(z \le -2.07) = 0.0192
\end{aligned}
\]
(d) Meredith has the more serious case because the probability of having such a low reading in a healthy person is less than 2%.
11.
\[
\begin{aligned}
\bar{p} &= \frac{\text{total number of successes from all 12 quarters}}{\text{total number of families from all 12 quarters}} \\
&= \frac{11 + 14 + \cdots + 19}{12(92)} = \frac{206}{1{,}104} = 0.1866
\end{aligned}
\]
\[ \bar{q} = 1 - \bar{p} = 1 - 0.1866 = 0.8134 \]
\[ \mu_{\hat{p}} = p \approx \bar{p} = 0.1866, \qquad \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\bar{p}\,\bar{q}}{n}} = \sqrt{\frac{(0.1866)(0.8134)}{92}} \approx 0.0406 \]
Check: np = 92(0.1866) = 17.2, nq = 92(0.8134) = 74.8. Since both np and nq exceed 5, the normal approximation should be reasonably good.
Center line = \( \bar{p} \) = 0.1866
Control limits at \( \bar{p} \pm 2\sqrt{\bar{p}\,\bar{q}/n} \) = 0.1866 ± 2(0.0406) = 0.1866 ± 0.0812, or 0.1054 and 0.2678
Control limits at \( \bar{p} \pm 3\sqrt{\bar{p}\,\bar{q}/n} \) = 0.1866 ± 3(0.0406) = 0.1866 ± 0.1218, or 0.0648 and 0.3084

[Figure: P Chart of Victims. Proportion versus Quarter (1 to 12); center line P̄ = 0.1866, +2SL = 0.2678, −2SL = 0.1054, +3SL = 0.3084, −3SL = 0.0647.]

There are no out-of-control signals.
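The center line and control limits of a P chart follow directly from the pooled proportion, so they are easy to recompute. The sketch below is illustrative: the function name `p_chart_limits` and its dictionary layout are assumptions, and it is fed the totals from this problem (206 successes among 1,104 families, with 92 families per quarter).

```python
from math import sqrt


def p_chart_limits(total_successes, total_trials, n):
    """Center line and 2- and 3-sigma control limits for a P chart."""
    p_bar = total_successes / total_trials
    q_bar = 1 - p_bar
    sd = sqrt(p_bar * q_bar / n)  # estimated standard error of p-hat
    return {
        "center": p_bar,
        "2s": (p_bar - 2 * sd, p_bar + 2 * sd),
        "3s": (p_bar - 3 * sd, p_bar + 3 * sd),
    }


limits = p_chart_limits(206, 1104, 92)
print(round(limits["center"], 4))
print([round(v, 4) for v in limits["2s"]])
print([round(v, 4) for v in limits["3s"]])
```

Small differences in the last digit relative to the hand calculation come from rounding the standard error to 0.0406 before multiplying.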
12.
\[
\begin{aligned}
\bar{p} &= \frac{\text{total number of defective cans}}{\text{total number of cans}} \\
&= \frac{8 + 11 + \cdots + 10}{110(15)} = \frac{133}{1650} = 0.08061
\end{aligned}
\]
\[ \bar{q} = 1 - \bar{p} = 1 - 0.08061 = 0.91939 \]
\[ \mu_{\hat{p}} = p \approx \bar{p} = 0.08061, \qquad \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\bar{p}\,\bar{q}}{n}} = \sqrt{\frac{(0.08061)(0.91939)}{110}} \approx 0.02596 \]
Check: np = 110(0.08061) = 8.9, nq = 110(0.91939) = 101.1. Since both np and nq exceed 5, the normal approximation should be reasonably good.
Center line = \( \bar{p} \) = 0.08061
Control limits at \( \bar{p} \pm 2\sqrt{\bar{p}\,\bar{q}/n} \) = 0.08061 ± 2(0.02596) = 0.08061 ± 0.05192, or 0.02869 and 0.1325.
Control limits at \( \bar{p} \pm 3\sqrt{\bar{p}\,\bar{q}/n} \) = 0.08061 ± 3(0.02596) = 0.08061 ± 0.07788, or 0.00273 and 0.1585.

[Figure: P Chart of Defective Cans. Proportion versus Test Sheet (1 to 15); center line P̄ = 0.0806, +2SL = 0.1325, −2SL = 0.0287, +3SL = 0.1585, −3SL = 0.0027.]
There are no out-of-control signals. It appears that the production process is in reasonable control.
13.
\[
\begin{aligned}
\bar{p} &= \frac{\text{total number who got jobs}}{\text{total number of people}} \\
&= \frac{60 + 53 + \cdots + 58}{75(15)} = \frac{872}{1125} = 0.7751
\end{aligned}
\]
\[ \bar{q} = 1 - \bar{p} = 1 - 0.7751 = 0.2249 \]
\[ \mu_{\hat{p}} = p \approx \bar{p} = 0.7751, \qquad \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\bar{p}\,\bar{q}}{n}} = \sqrt{\frac{(0.7751)(0.2249)}{75}} \approx 0.0482 \]
Check: np = 75(0.7751) = 58.1, nq = 75(0.2249) = 16.9. Since both np and nq exceed 5, the normal approximation should be reasonably good.
Center line = \( \bar{p} \) = 0.7751
Control limits at \( \bar{p} \pm 2\sqrt{\bar{p}\,\bar{q}/n} \) = 0.7751 ± 2(0.0482) = 0.7751 ± 0.0964, or 0.6787 to 0.8715.
Control limits at \( \bar{p} \pm 3\sqrt{\bar{p}\,\bar{q}/n} \) = 0.7751 ± 3(0.0482) = 0.7751 ± 0.1446, or 0.6305 to 0.9197.

[Figure: P Chart of Jobs. Proportion versus Day (1 to 15); center line P̄ = 0.7751, +2SL = 0.8715, −2SL = 0.6787, +3SL = 0.9197, −3SL = 0.6305; two points are flagged with a "1".]
Out-of-control signal III occurs on days 4 and 5; out-of-control signal I occurs on day 11 on the low side and day 14 on the high side. Out-of-control signals on the low side are of most concern for the homeless seeking work. The foundation should look to see what happened on that day. The foundation might take a look at the out-of-control periods on the high side to see if there is a possibility of cultivating more jobs.
Chapter 7 Review
1. (a) The $\bar{x}$ distribution approaches a normal distribution.
(b) The mean $\mu_{\bar{x}}$ of the $\bar{x}$ distribution equals the mean $\mu$ of the x distribution regardless of the sample size.
(c) The standard deviation $\sigma_{\bar{x}}$ of the sampling distribution equals $\sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of the x distribution and n is the sample size.
(d) They will both be approximately normal with the same mean, but the standard deviations will be $\sigma/\sqrt{50}$ and $\sigma/\sqrt{100}$, respectively.
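2. All the $\bar{x}$ distributions will be normal with mean $\mu_{\bar{x}} = \mu = 15$. The standard deviations will be
$n = 4$: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 3/\sqrt{4} = 3/2$
$n = 16$: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 3/\sqrt{16} = 3/4$
$n = 100$: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 3/\sqrt{100} = 3/10$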
3. (a) $\mu = 35$, $\sigma = 7$
$P(x \ge 40) = P\left(z \ge \dfrac{40 - 35}{7}\right) = P(z \ge 0.71) = 0.2389$
(b) $\mu_{\bar{x}} = \mu = 35$, $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 7/\sqrt{9} = 7/3$
$P(\bar{x} \ge 40) = P\left(z \ge \dfrac{40 - 35}{7/3}\right) = P(z \ge 2.14) = 0.0162$
4. (a) $\mu = 38$, $\sigma = 5$
$P(x \le 35) = P\left(z \le \dfrac{35 - 38}{5}\right) = P(z \le -0.6) = 0.2743$
(b) $\mu_{\bar{x}} = \mu = 38$, $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 5/\sqrt{10} = 1.58$
$P(\bar{x} \le 35) = P\left(z \le \dfrac{35 - 38}{1.58}\right) = P(z \le -1.90) = 0.0287$
(c) The probability in part (b) is much smaller because the standard deviation is smaller for the $\bar{x}$ distribution.
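5. $\mu_{\bar{x}} = \mu = 100$, $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 15/\sqrt{100} = 1.5$
$P(100 - 2 \le \bar{x} \le 100 + 2) = P(98 \le \bar{x} \le 102)$
$= P\left(\dfrac{98 - 100}{1.5} \le z \le \dfrac{102 - 100}{1.5}\right)$
$= P(-1.33 \le z \le 1.33)$
$= P(z \le 1.33) - P(z \le -1.33)$
$= 0.9082 - 0.0918 = 0.8164$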
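6. $\mu_{\bar{x}} = \mu = 15$, $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 2/\sqrt{36} = 0.333$
$P(15 - 0.5 \le \bar{x} \le 15 + 0.5) = P(14.5 \le \bar{x} \le 15.5)$
$= P\left(\dfrac{14.5 - 15}{0.333} \le z \le \dfrac{15.5 - 15}{0.333}\right)$
$= P(-1.5 \le z \le 1.5)$
$= P(z \le 1.5) - P(z \le -1.5)$
$= 0.9332 - 0.0668 = 0.8664$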
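Problems 5 and 6 both ask for the probability that the sample mean falls within a band around µ. As a rough check, the sketch below evaluates that probability with Python's standard-library `statistics.NormalDist`; the helper name `prob_xbar_between` is an assumption made for illustration. Because it does not round z to two decimals, its results can differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def prob_xbar_between(mu, sigma, n, low, high):
    """P(low <= x-bar <= high) when x-bar is normal with mean mu and
    standard deviation sigma / sqrt(n). Hypothetical helper."""
    xbar_dist = NormalDist(mu, sigma / sqrt(n))
    return xbar_dist.cdf(high) - xbar_dist.cdf(low)

# Problem 5: mu = 100, sigma = 15, n = 100, band of +/- 2
p5 = prob_xbar_between(100, 15, 100, 98, 102)
# Problem 6: mu = 15, sigma = 2, n = 36, band of +/- 0.5
p6 = prob_xbar_between(15, 2, 36, 14.5, 15.5)
print(round(p5, 4), round(p6, 4))
```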
7. (a) $n = 50$, $p = 0.22$
$np = 50(0.22) = 11$, $nq = 50(0.78) = 39$
Approximate p̂ by a normal random variable because both np and nq exceed 5.
$\mu_{\hat{p}} = p = 0.22$, $\sigma_{\hat{p}} = \sqrt{pq/n} = \sqrt{(0.22)(0.78)/50} \approx 0.0586$
Continuity correction $= 0.5/n = 0.5/50 = 0.01$
$P(0.20 \le \hat{p} \le 0.25) \approx P(0.20 - 0.01 \le x \le 0.25 + 0.01)$
$= P(0.19 \le x \le 0.26)$
$= P\left(\dfrac{0.19 - 0.22}{0.0586} \le z \le \dfrac{0.26 - 0.22}{0.0586}\right)$
$= P(-0.51 \le z \le 0.68)$
$= P(z \le 0.68) - P(z \le -0.51)$
$= 0.7517 - 0.3050 = 0.4467$
(b) $n = 38$, $p = 0.27$
$np = 38(0.27) = 10.26$, $nq = 38(0.73) = 27.74$
Approximate p̂ by a normal random variable because both np and nq exceed 5.
$\mu_{\hat{p}} = p = 0.27$, $\sigma_{\hat{p}} = \sqrt{pq/n} = \sqrt{(0.27)(0.73)/38} \approx 0.0720$
Continuity correction $= 0.5/n = 0.5/38 = 0.013$
$P(\hat{p} \ge 0.35) \approx P(x \ge 0.35 - 0.013)$
$= P(x \ge 0.337)$
$= P\left(z \ge \dfrac{0.337 - 0.27}{0.0720}\right)$
$= P(z \ge 0.93)$
$= 0.1762$
(c) $n = 51$, $p = 0.05$
$np = 51(0.05) = 2.55$
No, we cannot approximate p̂ by a normal random variable because np < 5.
8. $n = 28$, $p = 0.31$
$np = 28(0.31) = 8.68$, $nq = 28(0.69) = 19.32$
Approximate p̂ by a normal random variable because both np and nq exceed 5.
$\mu_{\hat{p}} = p = 0.31$, $\sigma_{\hat{p}} = \sqrt{pq/n} = \sqrt{(0.31)(0.69)/28} \approx 0.087$
Continuity correction $= 0.5/n = 0.5/28 = 0.018$
(a) $P(\hat{p} \ge 0.25) \approx P(x \ge 0.25 - 0.018)$
$= P(x \ge 0.232)$
$= P\left(z \ge \dfrac{0.232 - 0.31}{0.087}\right)$
$= P(z \ge -0.90)$
$= 0.8159$
(b) $P(0.25 \le \hat{p} \le 0.50) \approx P(0.25 - 0.018 \le x \le 0.50 + 0.018)$
$= P(0.232 \le x \le 0.518)$
$= P\left(\dfrac{0.232 - 0.31}{0.087} \le z \le \dfrac{0.518 - 0.31}{0.087}\right)$
$= P(-0.90 \le z \le 2.39)$
$= P(z \le 2.39) - P(z \le -0.90)$
$= 0.9916 - 0.1841 = 0.8075$
(c) Yes, both np and nq exceed 5.
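The procedure used in Problems 7 and 8 (check np and nq, apply the continuity correction 0.5/n, then convert to z) can be sketched in code. This is an illustrative sketch only; the function name `prob_phat` and its keyword arguments are assumptions, and because z is not rounded to two decimals the results differ slightly from the table-based answers above.

```python
from math import sqrt
from statistics import NormalDist

def prob_phat(n, p, low=None, high=None):
    """Normal approximation to P(low <= p-hat <= high) with a 0.5/n
    continuity correction. Hypothetical helper; omit low or high for a
    one-sided probability."""
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq must both exceed 5")
    dist = NormalDist(p, sqrt(p * q / n))
    cc = 0.5 / n  # continuity correction
    lower = 0.0 if low is None else dist.cdf(low - cc)
    upper = 1.0 if high is None else dist.cdf(high + cc)
    return upper - lower

# Problem 8 (n = 28, p = 0.31)
p8a = prob_phat(28, 0.31, low=0.25)             # P(p-hat >= 0.25)
p8b = prob_phat(28, 0.31, low=0.25, high=0.50)  # P(0.25 <= p-hat <= 0.50)
print(round(p8a, 4), round(p8b, 4))
```

Calling `prob_phat(51, 0.05, low=0.1)` raises `ValueError`, mirroring the answer to Problem 7(c).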
8.1 Estimating µ When σ is Known (Page 1 of 25)
8.1 Estimating µ When σ is Known
Assumptions about the random variable x
1. We have a simple random sample of size n drawn from the population of x values.
2. The value of σ, the population standard deviation, is known.
3. If the x distribution is normal, then our methods work for any sample size n.
4. If x has an unknown distribution, then we require the sample size n ≥ 30. However, if the x distribution is not mound-shaped, then a sample size of 50 or 100 may be needed.

Point Estimate
An estimate of a population parameter given by a single number is called a point estimate of that population parameter. For example:
x̄ is a point estimate for µ.
s is a point estimate for σ.

Margin of Error
The margin of error in using x̄ as a point estimate for µ is given by E = |x̄ − µ|. A point estimate is not very useful unless we have some kind of measure of how “good” it is. This “measure of goodness” is expressed as a confidence interval.
Confidence Interval and Level of Confidence
Suppose 100 students at Palomar were randomly chosen and their heights were measured, yielding a [sample] mean of 5.72 ft with a margin of error of 0.08 ft. Consider the following statements:
1. The population mean is approximately 5.72 feet.
2. There is a 95% probability that the population mean is between 5.64 ft and 5.80 ft. P(5.64 ft ≤ µ ≤ 5.80 ft) = 0.95
3. At a 95% level of confidence the population mean is between 5.64 ft and 5.80 ft.
4. The population mean µ = 5.72 ± 0.08 feet at a 95% level of confidence.
Confidence levels and confidence intervals provide a measure of how well a point estimate estimates a population parameter.
Confidence Interval for µ
A c-percent confidence interval for the population mean µ is an interval computed from sample data in such a way that c is the probability of generating an interval containing the actual value of µ. That is,
P(x̄ − E ≤ µ ≤ x̄ + E) = c,
where E is the maximum margin of error when estimating µ with x̄. In words, P(x̄ − E ≤ µ ≤ x̄ + E) = c means . . .
1. The probability that the population mean µ is between x̄ − E and x̄ + E is c.
2. The population mean µ is between x̄ − E and x̄ + E at a confidence level of c.
3. The population mean is µ = x̄ ± E at a c-percent level of confidence.
4. If we repeat the experiment many times with the same sample size, then c proportion of the intervals calculated will contain the population mean µ. Thus, 1 − c proportion of the intervals will not contain µ.
[Figure: normal curve over the x̄-axis with the region from x̄ − E to x̄ + E shaded; shaded area = c. The probability that µ is on this interval is c.]
Example 1
Jackie has been jogging 2 miles a day for years and she records her times. A sample of 90 of these times has a mean of 15.60 minutes and a known standard deviation of 1.80 minutes.
a. Find a 95% confidence interval for µ. Draw and label the normal distribution illustrating the confidence interval. Solve without using the ZInterval function (see below).
b. Find E, the maximum error in estimating µ with x̄ at the confidence level c.
c. Write the conclusion in probability notation, i.e. P(x̄ − E ≤ µ ≤ x̄ + E) = c.
d. Summarize your conclusion in one sentence [relevant to the application].
Using the TI-83/84 ZInterval Function
The ZInterval function (STAT / TESTS / 7: ZInterval) computes a confidence interval for an unknown population mean µ when the population standard deviation σ is known.
Input: STATS: σ, x̄, n, and c-level
Output: The interval from x̄ − E to x̄ + E, where E = (1/2)(interval length)

Example 2
a. Compute the 95% confidence interval in Example 1. Use the ZInterval function.
b. Summarize your results in a complete sentence relevant to this application.

At a 95% level of confidence, the population mean µ of all 2-mi jogging times for Jackie is between 15.23 and 15.97 minutes.
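For readers without a TI calculator, the same interval can be reproduced in a few lines. This is a minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`; the function name `z_interval` is illustrative and is not the calculator's function.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, c):
    """Confidence interval for mu when sigma is known.
    Returns (lower, upper, E). Hypothetical helper."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)  # critical value for level c
    E = z_c * sigma / sqrt(n)                # maximum margin of error
    return xbar - E, xbar + E, E

# Jackie's jogging times: n = 90, x-bar = 15.60 min, sigma = 1.80 min
low, high, E = z_interval(15.60, 1.80, 90, 0.95)
print(round(low, 2), round(high, 2), round(E, 2))
```

Rounded to two decimals, the endpoints agree with the interval stated above (15.23 to 15.97 minutes).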
Section 8.1 Homework Instructions
Steps to find a c% confidence interval for µ
1. Sketch a normal curve illustrating the c% confidence interval for µ. Label x̄ − E, x̄ + E, and x̄, where E is the margin of error when estimating µ with x̄ at a confidence level of c.
2. Without using the ZInterval function, compute the c% confidence interval for the population mean. That is,
x̄ − E = invNorm(area to the left of x̄ − E, µx̄, σx̄)
x̄ + E = invNorm(area to the left of x̄ + E, µx̄, σx̄)
Use the estimate µx̄ ≈ x̄ and σx̄ = σ/√n.
3. Find E, the maximum error in estimating µ with x̄ at a confidence level of c. It is computed as “half the interval length” from step 2:
E = (1/2)[(x̄ + E) − (x̄ − E)]
4. Write the confidence interval in probability notation, i.e. P(x̄ − E ≤ µ ≤ x̄ + E) = c.
5. Summarize your results in a concise, complete sentence relevant to the problem. That is,
At the c% confidence level the population mean µ of all ____________________ is between _____ and _____ [units].
[Figure: normal curve over the x̄-axis with x̄ − E and x̄ + E marked.]
Guided Exercise 2
Jason jogs 3 miles per day and records his times. A sample of 90 of these times has a mean of 21.50 minutes and a known standard deviation of 2.11 minutes. Find the 99% confidence interval for the population mean by completing steps 1-5 above.
a. Sketch a normal curve to illustrate the 99% confidence interval for the mean in this application. Label the axis.
b. Without using the ZInterval function, find the 99% confidence interval for the population mean.
c. Find E = “half the interval length”.
d. Write the confidence interval in probability notation, i.e. P(x̄ − E ≤ µ ≤ x̄ + E) = c.
e. Summarize your results in a concise, complete sentence relevant to the problem.
At a ____% confidence level, the mean of _______________ _________________________________________________ is between ____________ and _____________.
Guided Exercise 3
An automobile loan company wants to estimate the amount of the average car loan during the past year. A random sample of 200 loans had a mean of $8225 and a known standard deviation of $762. Find the 95% confidence interval for the population mean by completing steps 1-5 above.
a. Sketch a normal curve to illustrate the 95% confidence interval for the mean in this application. Label the axis.
b. Without using the ZInterval function, find the 95% confidence interval for the population mean.
c. Find E = “half the interval length”.
d. Write the confidence interval in probability notation, i.e. P(x̄ − E ≤ µ ≤ x̄ + E) = c.
e. Summarize your results in a concise, complete sentence relevant to the problem.
At a ____% confidence level, the mean of _______________ _________________________________________________ is between ____________ and _____________.
8.1 (was 8.4) Estimating the Sample Size n
Critical Value
zc is called the critical value for a confidence level c if
P(−zc < z < zc) = c
That is, zc is the z-score such that the area under the standard normal curve between −zc and zc is c. In words we say . . .
a. “the probability that a randomly selected z-value is between −zc and zc is c.” Or
b. “at a c-percent level of confidence we can say that a randomly chosen z will be between −zc and zc.”
For Example
1. If c = 0.90, then P(−z0.90 < z < z0.90) = 0.90. Compute z0.90.
2. If c = 0.95, then P(−z0.95 < z < z0.95) = 0.95. Compute z0.95.
3. If c = 0.99, then P(−z0.99 < z < z0.99) = 0.99. Compute z0.99.
[Figure: standard normal curve over the z-axis with the region from −zc to zc shaded; shaded area = c.]
Estimating Sample Size n for Estimating µ
If, with a confidence level of c, we want our point estimate x̄ to be within E units of µ, then we choose the sample size n to be
n = (zc ⋅ σ / E)²,
where zc is the critical value for a confidence level of c. If this value is not a whole number, round n up to the next whole number.
Example 6
A sample of 50 salmon is caught and weighed. The sample standard deviation of the 50 weights is 2.15 lb. How large a sample should be taken to be 97% confident that the sample mean is within 0.20 lb of the mean weight of the population? Find zc (to the nearest thousandth) and n. Then summarize your results in a complete sentence relevant to this application.
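The critical value and the sample-size formula can be evaluated together. The sketch below is illustrative only: the function names are assumptions, the sample standard deviation from Example 6 is used in place of σ (as the example implies), and the result is rounded up to the next whole number.

```python
from math import ceil
from statistics import NormalDist

def critical_value(c):
    """z_c such that P(-z_c < z < z_c) = c for a standard normal z."""
    return NormalDist().inv_cdf((1 + c) / 2)

def sample_size_for_mean(c, sigma, E):
    """Smallest whole n with z_c * sigma / sqrt(n) <= E.
    Hypothetical helper implementing n = (z_c * sigma / E)^2, rounded up."""
    return ceil((critical_value(c) * sigma / E) ** 2)

# The three critical values asked for in the notes
for c in (0.90, 0.95, 0.99):
    print(c, round(critical_value(c), 3))

# Example 6 inputs: c = 0.97, s = 2.15 lb used for sigma, E = 0.20 lb
n_salmon = sample_size_for_mean(0.97, 2.15, 0.20)
print(round(critical_value(0.97), 3), n_salmon)
```

Note that the function returns the total sample size; questions phrased as "how many more observations" would subtract the preliminary sample already collected.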
Example 7
An efficiency expert wants to determine the mean time it takes an employee to assemble a switch on an assembly line. A preliminary study of 45 observations found a sample standard deviation of 78 seconds. How many more observations are needed to be 92% certain that the mean of the sample will vary from the true mean by no more than 15 seconds? Find zc (to the nearest thousandth) and n. Then summarize your results in a complete sentence relevant to this application.

Guided Exercise 6
The dean wants to estimate the average teaching experience (in years) of the faculty members. A preliminary random sample of 60 faculty yields a sample standard deviation of 3.4 years. How many more faculty should be sampled to be 99% confident that the sample mean does not differ from the true mean by more than 0.5 years? Find zc (to the nearest thousandth) and n. Then summarize your results in a complete sentence relevant to this application.
8.2 Estimating µ When σ is Unknown (Page 12 of 25)
8.2 Estimating µ When σ is Unknown
When the population standard deviation σ is unknown, it is approximated by the sample standard deviation s. The TInterval function works with what is called the Student’s t-distribution, where all statistical “fudge factors” necessary to accommodate approximating σ with s are built into the function.
The TInterval function (TI-83: STAT / TESTS / 8: TInterval)
Input: STATS: x̄, s, n, and c-level, or DATA: data list and c-level
Output: The interval from x̄ − E to x̄ + E, where E = (1/2)(interval length)

Homework Instructions for Section 8.2
1. Omit exercises #1-4.
2. When asked to find a confidence interval, do the following:
a. Find the c% confidence interval for the mean µ. Write it in probability notation.
b. Summarize your results in a complete sentence relevant to the application.
Example 4
An archeologist discovered a new, but extinct, species of miniature horse. The only seven known samples show shoulder heights (in cm) of 45.3, 47.1, 44.2, 46.8, 46.5, 45.5, and 47.6. Find the 99% confidence interval for µ (the mean height of the entire population of ancient horses) and the error E. Then summarize your results in a complete sentence relevant to this application.
a. Find the 99% confidence interval for the mean µ. Write it in probability notation.
b. Summarize your results in a complete sentence relevant to the application.

Guided Exercise 3
A company produced a trial production run of 37 artificial sapphires. The mean weight is 6.75 carats and the standard deviation is 0.33 carats. Find the 95% confidence interval for the mean weight µ of all artificial sapphires and the error E. Then summarize your results in a complete sentence relevant to this application.
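The t-based interval in Example 4 has the same shape as the z-based one, with s replacing σ and a t critical value replacing zc. The sketch below is illustrative and uses only the Python standard library, which has no t distribution; the critical value 3.707 (99% confidence, 6 degrees of freedom) is therefore hard-coded from a standard t table as a stated assumption rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Shoulder heights (cm) from Example 4
heights = [45.3, 47.1, 44.2, 46.8, 46.5, 45.5, 47.6]

# Assumption: two-sided 99% t critical value for df = n - 1 = 6,
# read from a standard t table (not computed here).
T_CRIT_99_DF6 = 3.707

n = len(heights)
xbar = mean(heights)   # point estimate for mu
s = stdev(heights)     # sample standard deviation (n - 1 divisor)
E = T_CRIT_99_DF6 * s / sqrt(n)

print(f"x-bar = {xbar:.2f}, s = {s:.2f}, E = {E:.2f}")
print(f"interval: {xbar - E:.2f} to {xbar + E:.2f} cm")
```

For other sample sizes or confidence levels the critical value would have to be looked up again, which is exactly the work the TInterval function does internally.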
8.3 Estimating p in a Binomial Experiment (Page 14 of 25)
8.3 Estimating p in a Binomial Experiment
Large Sample Size Assumption
If np > 5 and nq > 5, then the sample size n is large enough so that the binomial distribution can be approximated by a normal distribution, and a c% confidence interval for p is expressed as
P(p̂ − E ≤ p ≤ p̂ + E) = c
where p̂ is the point estimate for p.
TI-83 1-PropZInt function: STAT / TESTS / A: 1-PropZInt
Input: x = r = number of successes, n = number of trials, c-level = confidence level
Output: (p̂ − E, p̂ + E), p̂, n
where E (the maximum error in using p̂ as a point estimate for p for the given confidence level) is one-half the interval length.
Example 5
Suppose 800 students were given flu shots and 600 did not get the flu. Assuming all 800 were exposed to the flu:
a. What are S, n, and r (note: r is input as variable x on the TI-83)?
S =        r =        n =
b. What are the point estimates for p and q (i.e. p̂ and q̂)?
p̂ =
q̂ =
c. Is n large enough to approximate the binomial distribution with a normal distribution? Why?
np̂ =
nq̂ =
d. Find the 99% confidence interval for p.
P(p̂ − E ≤ p ≤ p̂ + E) = 0.99
e. Summarize your results in a complete sentence relevant to this application.
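A large-sample interval for p follows the same pattern as the interval for µ. The following is a minimal sketch under the notes' large-sample assumption; the function name `one_prop_z_interval` is illustrative and is not the calculator's 1-PropZInt routine, although it uses the same normal-approximation formula E = zc·√(p̂q̂/n).

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(r, n, c):
    """Large-sample confidence interval for a binomial proportion p.
    Returns (p_hat, lower, upper, E). Hypothetical helper."""
    p_hat = r / n
    q_hat = 1 - p_hat
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("n*p_hat and n*q_hat must both exceed 5")
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    E = z_c * sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - E, p_hat + E, E

# Example 5 inputs: r = 600 students without flu out of n = 800, c = 0.99
p_hat, low, high, E = one_prop_z_interval(600, 800, 0.99)
print(round(p_hat, 3), round(low, 3), round(high, 3), round(E, 3))
```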
Guided Exercise 4
A random sample of 195 books at a bookstore showed that 68 of the books were nonfiction.
a. Find S and p̂.
b. Is the sample size large enough to approximate the binomial distribution with a normal distribution? Why?
c. Find the 90% confidence interval for p to the nearest thousandth (3 decimal places).
d. Summarize your results in a complete sentence relevant to this application.

Homework Instructions for Section 8.3 Problems
When asked to find the c% confidence interval for p, do the following four steps.
1. Find S and p̂.
2. Determine whether the sample size is large enough to approximate the binomial distribution with a normal distribution.
3. Find the c% confidence interval for p to the nearest thousandth (3 decimal places).
4. Summarize your results in a complete sentence relevant to the application.
A margin of error, E, is the maximum error when using a point estimate for a population parameter at a given confidence level.
General Interpretation of Poll Results
1. When a poll states the results of a survey, the proportion reported is p̂ (the sample estimate of the population proportion).
2. The margin of error is the maximal error E of a [95%, usually] confidence interval for p.
3. If p̂ is obtained from a poll, then a 95% confidence interval for the population proportion p is p̂ − E < p < p̂ + E.

Guided Exercise 5
A random sample of 315 households was surveyed. Chances are 19 of 20 that if all adults had been surveyed, the findings would differ from the poll results by no more than 2.6% in either direction. One question was asked: “Which party would do a better job handling education?” The possible responses were Democrats, Republicans, neither, or both. The poll reported that 32% responded Democrat.
a. What confidence level corresponds to the phrase “chances are 19 of 20 that if . . . .”?
b. What are S, n, and the sample statistic p̂ for the proportion responding Democrat?
c. Find E. Find the 95% confidence interval for p, the proportion of those who would respond Democrat.
d. Summarize your results in a complete sentence relevant to this application.
8.3 Estimating Sample Size n for Estimating p
(a) If, with a confidence level of c, we want our point estimate p̂ to be within E units of p, then we choose the sample size n to be
n = p̂ ⋅ q̂ ⋅ (zc / E)²
where zc is the z-score corresponding to a confidence level of c.
(b) If no estimate for p is available, we can say with a confidence level of at least c that the point estimate p̂ will be within E units of p by choosing
n = 0.25 (zc / E)²

Example 8
A buyer for a popcorn company wants to estimate the probability p that a kernel purchased from a particular farm will pop. Suppose a random sample of n kernels is taken and r of these kernels pop. The buyer wants to be 95% certain that the point estimate p̂ will be within 0.01 units of p.
a. Find zc and E.
b. If no estimate for p is available, how large a sample should the buyer use (i.e. how large should n be)?
c. A preliminary study showed that p was approximately 0.86. Now, how large a sample should be used?
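Both sample-size formulas for p can be wrapped in one function, since formula (b) is formula (a) with p̂ = q̂ = 0.5. This is an illustrative sketch; the function name and the `p_estimate` parameter are assumptions, and the result is rounded up to the next whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(c, E, p_estimate=None):
    """Sample size so that p-hat is within E of p at confidence level c.

    Hypothetical helper. With no preliminary estimate, p*q is replaced
    by its largest possible value, 0.25.
    """
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    pq = 0.25 if p_estimate is None else p_estimate * (1 - p_estimate)
    return ceil(pq * (z_c / E) ** 2)

# Example 8 inputs: c = 0.95, E = 0.01
n_no_estimate = sample_size_for_proportion(0.95, 0.01)
n_with_estimate = sample_size_for_proportion(0.95, 0.01, p_estimate=0.86)
print(n_no_estimate, n_with_estimate)
```

A preliminary estimate far from 0.5 reduces the required sample size noticeably, which is why part (c) of the example asks for the calculation to be repeated.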
Guided Exercise 7
The health department wants to estimate the proportion of children who require corrective lenses for their vision. They want to be 99% sure that the point estimate for p will have a maximum error of 0.03.
a. If no other information is known, find E and zc. Estimate the sample size required.
b. Suppose a preliminary random sample of 100 children indicates that 23 require corrective lenses. How large should n be?
8.4 Estimating µ1 − µ2 and p1 − p2 (Page 20 of 25)
8.4 Estimating µ1 − µ2 and p1 − p2
Independent and Dependent Samples
In order to make a statistical estimate about the difference between two populations, we need to have a sample from each population. Two samples are independent if the sample from one population is unrelated to the sample from the other. However, if each measurement in one sample can be naturally paired with a measurement in the other sample, the two samples are said to be dependent (such as before and after samples).

Guided Exercise 8
Classify the pairs of samples as dependent or independent.
a. In a medical experiment, one group is given a treatment and another group is given a placebo. After a period of time both groups are measured for the same condition.
b. A group of Math students is given a test at the beginning of a course and the same group is given the same test at the end of the course.
Theorem 8.1
Let x1 and x2 have normal distributions. If we take independent random samples of size n1 from x1 and n2 from x2, then the variable x̄1 − x̄2 has
1. a normal distribution
2. a mean of µ1 − µ2
3. a standard deviation of √(σ1²/n1 + σ2²/n2)

Estimating µ1 − µ2 When σ1 and σ2 are Known
A c% confidence interval for µ1 − µ2 is expressed as
(x̄1 − x̄2) − E < µ1 − µ2 < (x̄1 − x̄2) + E
This interval is the output of the TI-83 function 2-SampZInt.
TI-83 function 2-SampZInt (STAT / TESTS / 9: 2-SampZInt)
Input: σ1, σ2, x̄1, n1, x̄2, n2, c-level
Output: Interval from (x̄1 − x̄2) − E to (x̄1 − x̄2) + E
where E is one half the interval length output by the 2-SampZInt function.
Example 9
Suppose a biologist is studying data from Yellowstone streams before and after a 1988 fire. A random sample of 167 fishing reports in the years before the fire showed an average catch per day of 5.2 trout with σ1 = 1.9 trout. After the fire, a sample of 125 fishing reports showed an average catch per day of 6.8 trout with σ2 = 2.3 trout.
a. Are the samples independent?
b. Compute a 95% C.I. for µ1 − µ2. At a 95% level of confidence ________ < µ1 − µ2 < _______.
c. Explain the meaning of part b.

Estimating µ1 − µ2 When σ1 and σ2 are Unknown
A c% confidence interval for µ1 − µ2 is expressed as
(x̄1 − x̄2) − E < µ1 − µ2 < (x̄1 − x̄2) + E
This interval is the output of the TI-83 function 2-SampTInt.
TI-83 function 2-SampTInt (STAT / TESTS / 0: 2-SampTInt)
Input: x̄1, s1, n1, x̄2, s2, n2, c-level, pooled: yes
Output: Interval from (x̄1 − x̄2) − E to (x̄1 − x̄2) + E
where E is one half the interval length output by the 2-SampTInt function.
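For the known-σ case, Theorem 8.1 gives everything needed to build the interval by hand. The sketch below applies it to the Example 9 data; the function name `two_sample_z_interval` is an illustrative assumption and is not the calculator's routine.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(x1, sigma1, n1, x2, sigma2, n2, c):
    """Confidence interval for mu1 - mu2 with known sigmas and
    independent samples. Returns (lower, upper, E). Hypothetical helper."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    # Standard deviation of x1-bar minus x2-bar from Theorem 8.1
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    E = z_c * se
    diff = x1 - x2
    return diff - E, diff + E, E

# Example 9 inputs: before the fire (population 1) vs. after (population 2)
low, high, E = two_sample_z_interval(5.2, 1.9, 167, 6.8, 2.3, 125, 0.95)
print(round(low, 2), round(high, 2), round(E, 2))
```

Both endpoints come out negative, which is the kind of outcome part c of the example asks the reader to interpret.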
Example 10
Suppose that a random sample of 29 college students was divided into two groups. The first group had 15 people and was given 1/2 liter of red wine before going to sleep. The second group of 14 people was not given alcohol before going to sleep. Both groups went to sleep at 11 p.m. The average brain wave activity (in hertz) between 4 and 6 a.m. was measured for each participant. The results follow:
Group 1: 16.0 19.6 19.9 20.9 20.3 20.1 16.4 20.6 20.1 22.3 18.8 19.1 17.4 21.1 22.1
x̄1 = 19.65 Hz, s1 = 1.86 Hz
Group 2: 8.2 5.4 6.8 6.5 4.7 5.9 2.9 7.6 10.2 6.4 8.8 5.4 8.3 5.1
x̄2 = 6.59 Hz, s2 = 1.91 Hz
a. Are the groups independent?
b. Compute the 90% C.I. for µ1 − µ2 and write it in probability notation.
c. Summarize the results of part b in a single sentence relevant to this application.
Guided Exercise 9
a. A study reported a 90% confidence interval for the difference of the means to be 10 < µ1 − µ2 < 20. What can you conclude about the values of µ1 and µ2?
b. A study reported a 95% confidence interval for the difference of proportions to be −0.32 < p1 − p2 < 0.16. What can you conclude about the values of p1 and p2?
Confidence Interval for p1 − p2 (Large Samples)
If n1p̂1 > 5, n1q̂1 > 5, n2p̂2 > 5, and n2q̂2 > 5, then the c% confidence interval for p1 − p2 is expressed as
(p̂1 − p̂2) − E < p1 − p2 < (p̂1 − p̂2) + E
where E is the maximum error in using p̂1 − p̂2 as an estimate for p1 − p2 at a c% confidence level.
TI-83 function 2-PropZInt (STAT / TESTS / B: 2-PropZInt)
Input: r1 = x1, n1, r2 = x2, n2, c-level
Output: Interval from (p̂1 − p̂2) − E to (p̂1 − p̂2) + E
where E is one half the interval length.

Exercise 14
The burn center at Community hospital is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. Of this group, 94 had no visible scars. Let p1 be the proportion of patients who received the plasma compress treatment and had no visible scars after treatment. Let p2 be the proportion of patients who did not receive the plasma compress treatment but still had no visible scars.
a. Find the 95% confidence interval for p1 − p2.
b. Summarize the results in a single sentence relevant to this application.
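The large-sample interval for p1 − p2 can be sketched the same way as the single-proportion case. The code below is illustrative only; the function name is an assumption, and it uses the unpooled standard error √(p̂1q̂1/n1 + p̂2q̂2/n2), which is the usual choice for a confidence interval.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(r1, n1, r2, n2, c):
    """Large-sample confidence interval for p1 - p2.
    Returns (lower, upper, E). Hypothetical helper."""
    p1, p2 = r1 / n1, r2 / n2
    q1, q2 = 1 - p1, 1 - p2
    # Large-sample conditions from the notes
    if min(n1 * p1, n1 * q1, n2 * p2, n2 * q2) <= 5:
        raise ValueError("all of n*p_hat and n*q_hat must exceed 5")
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    E = z_c * sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    diff = p1 - p2
    return diff - E, diff + E, E

# Exercise 14 inputs: 259 of 316 treated, 94 of 419 untreated, c = 0.95
low, high, E = two_prop_z_interval(259, 316, 94, 419, 0.95)
print(round(low, 3), round(high, 3), round(E, 3))
```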
Page 1 of 34
9.1 Hypothesis Testing
a. A statistical hypothesis, or simply a hypothesis, is an assumption about a population parameter.
b. Hypothesis testing is the procedure whereby we decide to “reject” or “fail to reject” a hypothesis.
c. Null hypothesis H0: This is the hypothesis (assumption) under investigation or the statement being tested. The null hypothesis is a statement that “there is no effect,” “there is no difference,” or “there is no change.” The possible outcomes in testing a null hypothesis are “reject” or “fail to reject.”
d. Alternate hypothesis H1: This is a statement you will adopt if there is strong evidence (sample data) against the null hypothesis. A statistical test is designed to assess the strength of the evidence (data) against the null hypothesis.
e. Fail to Reject H0: We never say we “accept H0”; we can only say we “fail to reject” it. Failing to reject H0 means there is NOT enough evidence in the data and in the test to justify rejecting H0. So, we retain H0 knowing we have not proven it true beyond all doubt.
f. Rejecting H0: This means there IS significant evidence in the data and in the test to justify rejecting H0. When H0 is rejected the data is said to be statistically significant. We adopt H1 knowing we will occasionally be wrong.
Example 1
A car manufacturer advertises a car that gets 47 mpg. Let µ be the mean mileage for this model. You assume that the dealer will not underrate the mileage, but suspect he may overrate the mileage.
a. What can be used for H0?
b. What can be used for H1?

Guided Exercise 1A
A company that manufactures ball bearings claims the average diameter is 6 mm. To check that the average diameter is correct, the company decides to formulate a statistical test.
a. What can be used for H0?
b. What can be used for H1?

Guided Exercise 1B
A consumer group wants to test the truth in a package delivery company’s claim that it takes an average of 24 hours to deliver a package. Complaints have led the consumer group to suspect the delivery time is longer than 24 hours.
a. What can be used for H0?
b. What can be used for H1?
Types of Tests: Left-Tailed, Right-Tailed, Two-Tailed
The null hypothesis generally states that the parameter of interest equals a specific value, typically a historical value or a value of no change. For example, H0: µ = k. There are three types of statistical tests, which are determined by the alternate hypothesis as follows:
Left-Tail Test: H0: µ = k, H1: µ < k
Right-Tail Test: H0: µ = k, H1: µ > k
Two-Tail Test: H0: µ = k, H1: µ ≠ k
[Figure: three x̄-distributions centered at µ = k, one for each type of test, showing the tail region(s) and the “fail to reject H0” (FTR H0) region.]

Level of Significance
The level of significance α is the probability we are willing to risk rejecting H0 when it is true; it is typically between 1% and 5%. In the above pictures, think of α as the predetermined maximum area in the tail(s). Since H0: µ = k is a statement of “no change,” and is assumed true, we reject H0 only if we take a random sample and the sample mean x̄ is so far away from the assumed mean (H0: µ = k) that it is statistically unlikely that the assumption µ = k can be true. That is, the area in the tail(s) must be less than or equal to the level of significance α to reject H0.
Example 2
Let x be a random variable that represents the heart rate in beats per minute of Rosie, an old sheep dog. From past experience the vet knows that x is normally distributed with a mean of 115 bpm and a standard deviation of σ = 12 bpm. Over the past several weeks Rosie’s heart rate (beats / min) was measured at
93 109 110 89 112 117
The sample mean is x̄ = 105.0. The vet is concerned that Rosie’s heart rate may be slowing. At a 5% level of significance, does the data indicate this is the case?
a. Establish the null hypothesis (i.e. nothing has changed) and the alternate hypothesis.
b. Draw the x̄-distribution. Compute the probability of obtaining a sample mean of 105 bpm or less when the population mean is 115 bpm (by assumption). This area in the tail is called the P-value.
c. What can you conclude about Rosie’s heartbeat?
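The tail area asked for in part b can be computed directly from the x̄-distribution under the null hypothesis. The sketch below is illustrative, with variable names chosen for this example; it assumes the null value µ = 115 bpm and the known σ = 12 bpm stated in the example.

```python
from math import sqrt
from statistics import NormalDist, mean

rates = [93, 109, 110, 89, 112, 117]  # Rosie's measured heart rates (bpm)
mu_0 = 115    # population mean assumed under H0
sigma = 12    # known population standard deviation
alpha = 0.05  # level of significance

xbar = mean(rates)
# Under H0, x-bar is normal with mean mu_0 and sd sigma / sqrt(n)
xbar_dist = NormalDist(mu_0, sigma / sqrt(len(rates)))
p_value = xbar_dist.cdf(xbar)  # left tail: P(x-bar <= observed mean)

print(f"x-bar = {xbar:.1f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```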
P-value
Assuming H0 is true, the probability that the test statistic will take on values as extreme or more extreme than the observed test statistic is called the P-value of the test. The smaller the P-value computed from the sample data, the stronger the evidence against H0. In the x̄-distributions below, the P-value is the total area in the tail(s).
Left-Tail Test: H0: µ = k, H1: µ < k
Right-Tail Test: H0: µ = k, H1: µ > k
Two-Tail Test: H0: µ = k, H1: µ ≠ k
[Figure: three x̄-distributions centered at µ = k. For the left-tail and right-tail tests the shaded tail area = P-value; for the two-tail test the shaded area in each tail = P-value / 2.]

Type I and Type II Errors
A Type I error occurs when we reject a true null hypothesis H0. A Type II error occurs when we “fail to reject” a false null hypothesis H0. For a given sample size, reducing the probability of a type I error increases the probability of a type II error, and vice versa. The probability of a type I error we are willing to accept in an application is called the level of significance, denoted α (alpha). Alpha is specified in advance.
α = P(making a type I error) = P(rejecting a true H0)
e.g. If α = 0.05, then we say we are using a 5% level of significance. This means that in 100 similar situations in which H0 is true, H0 will be rejected 5 times (on average) when it should not have been.
Example 3
Reconsider Example 1 where
H0: µ = 47 mpg, H1: µ < 47 mpg
a. Suppose α = 0.05. Describe a type I error and its probability.
A type I error is rejecting a true null hypothesis; in this case rejecting the dealer’s claim that µ = 47 mpg and concluding that µ < 47 mpg when in fact the average number of miles per gallon is 47 or higher. P(type I error) = 0.05.
b. Describe a type II error.
A type II error is failing to reject a false null hypothesis. In this case we “fail to reject” the manufacturer’s claim that µ = 47 mpg when in fact µ < 47 mpg.

Guided Exercise 2
Recall the ball-bearing example where H0: µ = 6 mm and H1: µ ≠ 6 mm. Suppose α = 0.01.
a. Describe a type I error and its consequences and probability.
The probability of a type I error is 1%, the level of significance. A type I error would mean that we rejected the manufacturer’s claim that µ = 6 mm when in fact the average diameter was 6 mm. The consequence of a type I error in this application would be needless adjustment and delay in the manufacturing process.
b. Describe a type II error and its consequences.
A type II error would mean that we “failed to reject” the manufacturer’s claim that µ = 6 mm when in fact µ ≠ 6 mm. The consequence of a type II error in this application would be the production of many bearings that do not meet specifications.
Statistical Test Conclusions and Meanings For a given, preset level of significance α , and a P-value computed from the sample data: 1. If P-value ≤α , then Ho is rejected. That is, there is enough
evidence in the [sample] data to reject H0. This means we chose the alternate hypothesis H1 knowing we have not proven H1 beyond all doubt.
2. If P-value > α , then we fail to reject H0. That is, there is not enough evidence in the [sample] data to reject H0. This means we retain H0 knowing we have not proven it beyond all doubt.
Example 4 A car manufacturer advertises a car that gets 47 mpg. Suppose that we sampled 40 cars and found a mean gas mileage of 46.26 mpg. The standard deviation is σ = 2.22 mpg. Test the manufacturer’s claim at a 5% level of significance (α = 0.05).
a. Establish the null and alternate hypotheses.
H0: µ = 47 mpg H1: µ < 47 mpg
b. Draw the normal x̄-distribution and show the null hypothesis and sample statistic on the axis. Label the axis; include the units.
c. Compute the p-value. p-value = normalcdf(0, 46.26, 47, 2.22/√40) = 0.0175
d. Conclude the test. Interpret its meaning in this application. The p-value is 0.018. Since the p-value = 0.018 ≤ α = 0.05, we reject H0, which means at a 5% level of significance the sample data is significant and supports that the mean car mileage is less than 47 mpg.
e. Repeat part d, but test the manufacturer’s claim at a 1% level of significance (α = 0.01). The p-value is 0.018. Since the p-value = 0.018 > α = 0.01, we fail to reject H0, which means at a 1% level of significance the sample data is not strong enough to say the mean car mileage is less than 47 mpg.
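The calculation in Example 4 can be checked away from the calculator. The following is a minimal Python sketch, assuming only the standard library; the function name left_tail_p_value is an illustrative choice, not part of the notes. It uses a lower bound of negative infinity rather than the 0 entered in normalcdf, which makes no practical difference here.

```python
from math import sqrt
from statistics import NormalDist


def left_tail_p_value(x_bar, mu0, sigma, n):
    """P(sample mean <= x_bar) when H0: mu = mu0 is true and sigma is known."""
    standard_error = sigma / sqrt(n)  # standard deviation of the x-bar distribution
    return NormalDist(mu=mu0, sigma=standard_error).cdf(x_bar)


# Values from Example 4: x-bar = 46.26 mpg, H0: mu = 47 mpg, sigma = 2.22 mpg, n = 40
p_value = left_tail_p_value(x_bar=46.26, mu0=47, sigma=2.22, n=40)

# Same decision rule as in the notes: reject H0 when P-value <= alpha
for alpha in (0.05, 0.01):
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    print(f"alpha = {alpha}: p-value = {p_value:.4f}, {decision}")
```

The same P-value leads to rejecting H0 at α = 0.05 and failing to reject it at α = 0.01, which is the contrast parts d and e draw.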
9.1 Homework
1. Do problems 1-8 all.
2. On problems 9-14 follow these steps:
(a) Write the null and alternate hypotheses. Include units.
(b) Compute the standard error σx̄ = σ/√n. Then sketch the normal curve and the area under the curve that represents the p-value. Label the axis to include the assumption in the null hypothesis and 3 standard deviations on both sides. Include units.
(c) Compute the p-value (without using the ZTest function).
(d) Conclude the test. That is, if the P-value ≤ α, then reject H0, otherwise do not reject H0.
(e) Summarize the results.
Example
(a) H0: µ = 47 mpg H1: µ < 47 mpg
(b) [Sketch: normal x̄-distribution; axis labeled x̄, mpg, marked at 47 (from H0), 46.65, 46.30 and 45.95, with the sample mean x̄ = 46.26 shown]
(c) p-value = area in the tail(s) = normalcdf(0, 46.26, 47, 2.22/√40) = 0.0175
(d) p-value = 0.0175 < α = 0.05. Reject H0.
(e) At a 5% l.o.s. the sample data is significant and supports that the mean car mileage is less than 47 mpg.
9.2 Testing the Mean µ
Example 3 Testing the Mean µ when σ is Known Some scientists believe sunspot activity is related to drought duration. Let x be a random variable representing the number of sunspots observed in a four-week period. A random sample of 40 such periods in Spanish colonial times gave the following data:
12.5 14.1 37.6 48.3 67.3 70.0 43.8 56.5 59.7 24.0 12.0 27.4 53.5 73.9 104.0 54.6 4.4 177.3 70.1 54.0 28.0 13.0 6.5 134.7 114.0 72.7 81.2 24.1 20.4 13.3 9.4 25.7 47.8 50.0 45.3 61.0 39.0 12.0 7.2 11.3
The sample mean is x̄ ≈ 47.0. Previous studies indicate that σ = 35. It is thought that for thousands of years, the mean number of sunspots per four-week period was about µ = 41. Do the data indicate, at a 5% level of significance, that the sunspot activity during the Spanish colonial period was higher than 41?
a. Establish the hypotheses.
b. What does a 5% level of significance mean in this application? We are willing to tolerate at most a 5% probability of rejecting a true null hypothesis. That is, assuming H0: µ = 41 is true, we reject H0 only if the probability that a sample mean x̄ is as extreme as or more extreme than our observed sample statistic (x̄ ≈ 47.0) is no more than α = 0.05.
c. Explain the meaning of the P-value in this application. Assuming H0: µ = 41 is true, the P-value is the probability that a sample mean x̄ is as extreme as or more extreme than our observed sample statistic (x̄ ≈ 47.0).
d. Draw the x̄-distribution. Place the null hypothesis and the observed x̄ on the axis. Then compute the P-value.
e. Conclude the test. That is, if the P-value ≤ α, then reject H0, otherwise do not reject H0.
f. Interpret your results.
9.2 Exercises #1-16: Steps to Test the Mean µ
1. Establish H0 and H1:
Left-Tailed Test: H0: µ = k H1: µ < k
Right-Tailed Test: H0: µ = k H1: µ > k
Two-Tailed Test: H0: µ = k H1: µ ≠ k
2. Indicate which test you are using. The output for either test is the P-value.
a. If σ is known, then the convention is to compute the P-value with a normal distribution. The Z-Test uses a normal distribution (STAT / TESTS / 1: Z-Test).
b. If σ is NOT known, then the convention is to compute the P-value with the more conservative Student’s t-Distribution (STAT / TESTS / 2: T-Test).
3. Conclude the Test: If P-value ≤ α, then the sample data is significant and we reject H0, otherwise we do not reject H0.
4. State your conclusions in the context of the application.
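For the σ-known case in step 2a, the three tail choices can be sketched in one Python function. This is an illustrative sketch using only the standard library; the name z_test_p_value and the tail labels "left", "right" and "two" are assumptions made for this example and do not mirror the calculator's interface. It is applied to the sunspot example (x̄ ≈ 47.0, H0: µ = 41, σ = 35, n = 40, right-tailed).

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(x_bar, mu0, sigma, n, tail):
    """P-value for a test of H0: mu = mu0 when sigma is known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))  # standardized sample mean
    std_normal = NormalDist()              # mean 0, standard deviation 1
    if tail == "left":    # H1: mu < mu0
        return std_normal.cdf(z)
    if tail == "right":   # H1: mu > mu0
        return 1 - std_normal.cdf(z)
    if tail == "two":     # H1: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Sunspot example: right-tailed test at alpha = 0.05
alpha = 0.05
p_value = z_test_p_value(x_bar=47.0, mu0=41, sigma=35, n=40, tail="right")
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```

The t-distribution case in step 2b is not covered here because the Python standard library has no Student's t distribution.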
Example 3 A zoo wishes to obtain eggs from a rare river turtle so they can be hatched and raised to preserve the species. Carol, a staff biologist, finds a nest of 36 eggs she suspects to be from the rare turtle species. Research has shown that the sizes of rare turtle eggs are normally distributed with a population mean of µ = 7.50 cm. Furthermore, the mean length of the eggs of the other (common) turtle species is known to be longer than 7.50 cm. For the sample, the mean length of the 36 eggs is x̄ = 7.74 cm. The standard deviation of all turtle eggs is σ = 1.5 cm. So, Carol is concerned that the eggs may have come from a common turtle species. Do the data indicate that the eggs are from the rare river turtle at a 5% level of significance? 1. Establish H0 and H1. H0: µ = 7.50 cm H1: µ > 7.50 cm
2. State the possible conclusions and their interpretations in this application.
Test Conclusion: Fail to reject H0
Interpretation of the Result: At a 5% level of significance the sample data is not strong enough to reject H0. That is, the sample evidence is not strong enough to say the eggs are from the common turtle.
Test Conclusion: Reject H0
Interpretation of the Result: At a 5% level of significance the sample data is statistically significant and is sufficient to reject H0, which suggests the eggs are from the common turtle. We will be wrong at most α = 5% of the time.
3. Explain a 5% level of significance in this application. Explain how serious a type I error is in this application.
A 5% level of significance means we are taking a 5% risk of a type 1 error – a 5% risk of rejecting a true H0. In this application we are only willing to take a 5% chance of rejecting that the eggs are from the rare river turtle.
5. Assuming the null hypothesis (H0: µ = 7.50 cm) is true, find the probability that a sample mean is as extreme as or more extreme than the observed sample statistic (x̄). That is, find the P-value.
6. Conclude the test.
7. Interpret the results.
Example 5 The drug 6-mP (6-mercaptopurine) is used to treat leukemia. The following data represent the remission times (in weeks) for a random sample of 21 patients using 6-mP.
10 7 32 23 22 6 16 34 32 25 11 20 19 6 17 35 6 13 9 6 10
The sample mean is 17.1 weeks with a sample standard deviation of 10.0 weeks. Let x be a random variable representing the remission times (in weeks) for all patients. Assume the x-distribution is mound-shaped and symmetric. A previous drug treatment had a remission time of 12.5 weeks. At a 1% level of significance do the data indicate the mean remission time for 6-mP is different (either way)?
1. Establish the hypotheses.
2. Find the P-value of the test statistic and conclude the test. Show your work and/or indicate the test used on your calculator to compute the P-value.
3. Interpret the results.
Example 6 Archeologists become excited when they find an anomaly in a newly discovered artifact. The anomaly may or may not indicate a new trading region or a new method of craftsmanship. Suppose the lengths of arrowheads at a certain site have a mean length of µ = 2.6 cm. A random sample of 61 recently discovered arrowheads in an adjacent cliff dwelling had a sample mean length of 2.92 cm. The standard deviation is σ = 0.85 cm. Do these data indicate that the mean length of arrowheads in the adjacent cliff dwelling is longer than 2.6 cm? Use a 1% level of significance. 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show your work and/or indicate the test used on your calculator to compute the P-value.
3. Interpret the results.
Example 7 By taking thousands of practice shots at driving ranges, Pam knows her mean distance using a #1 wood is 225 yards with a standard deviation σ = 25 yards. Taking 100 shots with a new ball, Pam found her sample mean distance was 230 yards. At a 5% level of significance, determine whether Pam improved her driving distance using the new ball. 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show your work and/or indicate the test used on your calculator to compute the P-value.
3. Interpret the results.
Example 8 A large company with offices around the world occasionally must move their employees from one city to another. From long experience, the company knows its employees move on average once every 8.50 years with a standard deviation of 3.62 years. Recent trends have led some to believe a change might have occurred. A sample of 48 employees were asked the number of years since the company last moved them. The mean time was 7.91 years. Has the mean time between moves significantly changed? Use α = 0.05. 1. Establish the hypotheses. 2. Without using the ZTest, find the P-value of the test statistic
and conclude the test. Show your work and/or indicate the test used on your calculator to compute the P-value.
3. Interpret the results.
Guided Exercise 5 Production records show that a machine that makes bottle caps makes caps with a mean diameter of 1.85 cm and a standard deviation of 0.05 cm. An inspector measured a random sample of 64 caps and found a mean diameter of 1.87 cm. At a 1% level of significance, determine if the machine slipped out of adjustment? 1. Establish the hypotheses. 2. Without using the ZTest, find the P-value of the test statistic
and conclude the test. Show your work and/or indicate the test used on your calculator to compute the P-value.
3. Interpret the results.
Confidence Interval versus Two-tailed Hypothesis Test Suppose a two-tailed hypothesis test has a level of significance α and null hypothesis H0: µ = µ0. Let c be the confidence level for the mean µ based on the sample data. Then c = 1 − α and
1. H0 is not rejected whenever µ0 falls inside the c confidence interval for the mean µ.
2. H0 is rejected whenever µ0 falls outside the c confidence interval for the mean µ.
Exercise 19, Section 9.2 Consider a two-tailed hypothesis test with α = 0.01 and H0: µ = 20 H1: µ ≠ 20. A random sample of size 36 has a sample mean of 22. It is known the standard deviation σ = 4. Use α = 0.03.
a. Use hypothesis testing to see if there is sufficient evidence to reject H0.
b. Solve using a confidence interval.
i. What is the confidence level corresponding to a level of significance of 0.03? Find the ____% confidence interval for the mean µ. We are ____% confident that the population mean µ is between ________ and ________.
ii. Do we reject or fail to reject H0 based on the 97% confidence interval?
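The agreement between the two-tailed test and the confidence interval can be checked numerically. Below is a small Python sketch using the numbers of Exercise 19 with α = 0.03 (so c = 0.97); the variable names are illustrative, and inv_cdf plays the role the invNorm function plays on the calculator.

```python
from math import sqrt
from statistics import NormalDist

# Exercise 19 values: H0: mu = 20, x-bar = 22, sigma = 4, n = 36, alpha = 0.03
mu0, x_bar, sigma, n, alpha = 20, 22, 4, 36, 0.03
standard_error = sigma / sqrt(n)

# Route (a): two-tailed hypothesis test
z = (x_bar - mu0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_by_test = p_value <= alpha

# Route (b): confidence interval with c = 1 - alpha
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
lower = x_bar - z_critical * standard_error
upper = x_bar + z_critical * standard_error
reject_by_interval = not (lower <= mu0 <= upper)

print(f"p-value = {p_value:.4f}, reject H0: {reject_by_test}")
print(f"{1 - alpha:.0%} interval: ({lower:.2f}, {upper:.2f}), reject H0: {reject_by_interval}")
```

Both routes give the same decision, as statements 1 and 2 above say they must.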
9.3 Testing a Proportion p Setup and Assumptions
1. Let r be the binomial random variable representing the number of successes out of n trials.
2. The sample size n is large enough that the binomial distribution can be approximated by a normal distribution. That is, np > 5 and nq > 5.
3. For the probability of success use p̂ = r / n for the point estimate of the population parameter p.
4. The possible sets of hypotheses are:
Left-Tailed Test: H0: p = k H1: p < k
Right-Tailed Test: H0: p = k H1: p > k
Two-Tailed Test: H0: p = k H1: p ≠ k
5. TI-84: STAT / TESTS / 1-PropZTest
Input: p0: from the H0
x: the number of successes (the r-value)
n: number of trials
< p0, > p0, ≠ p0 depending on H1
Output: the P-value
6. Conclude the Test: If P-value ≤ α, then the sample data is significant and we reject H0, otherwise we conclude the sample data is not strong enough to reject H0.
7. Summarize your conclusion in the specific situation.
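The normal approximation behind the proportion test can be sketched in Python as follows. This is a hedged illustration with the standard library only, not the calculator's own routine; the function name one_prop_z_p_value and the tail labels are assumptions of this sketch. It uses the standard error √(p0·q0/n) computed under H0 and is applied to the numbers of Example 9 (p0 = 0.30, r = 88, n = 225, right-tailed).

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_p_value(r, n, p0, tail):
    """P-value for H0: p = p0 using the normal approximation to the binomial."""
    q0 = 1 - p0
    if n * p0 <= 5 or n * q0 <= 5:
        raise ValueError("normal approximation needs np > 5 and nq > 5")
    p_hat = r / n                              # point estimate of p
    z = (p_hat - p0) / sqrt(p0 * q0 / n)       # standard error computed under H0
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Example 9: 88 recoveries in 225 operations, old recovery rate 30%, alpha = 0.01
p_value = one_prop_z_p_value(r=88, n=225, p0=0.30, tail="right")
print(f"p-value = {p_value:.4f}, reject H0: {p_value <= 0.01}")
```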
Example 9 A team of eye surgeons has developed a new technique for a risky eye operation to restore the sight of people blinded from a certain disease. Under the old method, only 30% of the patients recovered their eyesight. Surgeons have performed the new technique 225 times and 88 of those patients have recovered their sight. Can we justify the claim that the new technique is better than the old one at a 1% level of significance? 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results.
Example 10 A botanist has produced a new variety of hybrid wheat that is better able to withstand drought than other varieties. He knows that 80% of the seeds from the parent plants germinate. He claims the hybrid has the same germination rate. To test this claim, 400 seeds from the hybrid plant are tested and 312 germinated. Test the botanist’s claim at a 5% level of significance. 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results.
9.4 Tests Involving Paired Differences (Dependent Samples)
Dependent Samples Dependent samples have data that are naturally paired. Dependent samples occur naturally in many applications, such as “before and after” situations, where the same object is measured before and after a treatment. In such cases the difference in the two measures is tested.
Examples of Dependent Samples
a. A shoe manufacturer claims that among adults in the United States, the left foot is longer than the right foot.
b. A weekend refresher math course is administered to new students. An exam is administered to each student before and after the course.
Testing the Difference, d, of Paired Data
1. It is assumed the paired data are such that the differences d between the first and second members of each pair are approximately normally distributed with a population mean µd.
2. For a random sample of n data pairs with sample mean d̄ and sample standard deviation sd, the test statistic follows a Student’s t distribution and can be tested with STAT / TESTS / 2: T-Test.
3. The possible sets of hypotheses to be tested are:
Left-Tailed Test: H0: µd = 0 H1: µd < 0
Right-Tailed Test: H0: µd = 0 H1: µd > 0
Two-Tailed Test: H0: µd = 0 H1: µd ≠ 0
4. TI-83: STAT / TESTS / 2: T-Test
Input: µ0: from the H0
x̄: the mean of the differences, d̄
sx: standard deviation of the differences, sd
n: number of pairs in the sample
µ: < µ0, > µ0, ≠ µ0 depending on H1
Output: the P-value
5. Conclude the Test: If P-value ≤ α, then the sample data is significant and we reject H0, otherwise we conclude the sample data is not strong enough to reject H0.
6. Interpret the results (specific to application).
Example 10 Heart surgeons know that many patients who undergo heart surgery have a dangerous buildup of anxiety before the operation. Psychiatric counseling may relieve some of that anxiety. The data shown are the anxiety scores of patients before and after counseling. Higher scores mean higher levels of anxiety. Can we conclude that counseling reduces anxiety? Use α = 0.01. 1. Establish the
hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results [specific to the context of the application].
Patient   B: Score before counseling   A: Score after counseling   Difference d = A – B
A         121                          76                          -45
B         93                           93                          0
C         105                          64                          -41
D         115                          117                         2
E         130                          82                          -48
F         98                           80                          -18
G         142                          79                          -63
H         118                          67                          -51
I         125                          89                          -36
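The inputs the T-Test asks for (d̄, sd and n) can be computed from the paired scores. The Python sketch below uses only the standard library and stops at the t statistic, because the standard library has no Student's t distribution; the P-value itself would still come from the calculator's T-Test or from a statistics package. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Anxiety scores for patients A through I (Example 10)
before = [121, 93, 105, 115, 130, 98, 142, 118, 125]
after = [76, 93, 64, 117, 82, 80, 79, 67, 89]

# d = A - B, that is, score after counseling minus score before counseling
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = mean(differences)   # sample mean of the differences
s_d = stdev(differences)    # sample standard deviation (n - 1 in the denominator)

# Test statistic for H0: mu_d = 0, with n - 1 degrees of freedom
t_statistic = (d_bar - 0) / (s_d / sqrt(n))

print(f"n = {n}, d-bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t_statistic:.3f}")
```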
Example 11 To test the quality of two brands of tires, one tire of each brand was placed on six test cars. After 6 months the amount of wear on each tire was measured in thousandths of inches. Can we conclude the two tire brands show unequal wear at a 2% level of significance? 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results [specific to the context of the application].
Car   Soapstone   Bigyear   Difference d = S - B
1     132         140       -8
2     71          74        -3
3     90          110       -20
4     37          36        1
5     93          105       -12
6     107         119       -12
9.5 Testing µ1 − µ2 and p1 − p2 (Independent Samples) Samples are independent if there is no relationship whatsoever between specific values of the two distributions.
Example 12 A teacher wishes to compare the effectiveness of two teaching methods. Students are randomly divided into two groups: The first group is taught by method 1 and the second group by method 2. At the end of the course, a comprehensive exam is given to all students. The mean scores, x̄1 and x̄2, of the two groups are compared. Are the samples independent or dependent?
Example 13 A shoe manufacturer claims that for U.S. adults the average length of the left foot is longer than the average length of the right foot. A random sample of 60 adults is drawn and the lengths of both their left and right feet are measured and averaged as x̄1 and x̄2, respectively. Are the samples independent or dependent?
Theorem 9.2 Let x1 have a normal distribution with mean µ1 and standard deviation σ1. Let x2 have a normal distribution with mean µ2 and standard deviation σ2. Suppose random samples of sizes n1 and n2 are taken from the respective distributions. Then the variable x̄1 − x̄2 has
1. A normal distribution.
2. Mean µ1 − µ2
3. Standard deviation √(σ1²/n1 + σ2²/n2)
Steps for Section 9.5 Problems
1. Establish H0 and H1.
Left-Tailed Test: H0: µ1 = µ2 H1: µ1 < µ2
Right-Tailed Test: H0: µ1 = µ2 H1: µ1 > µ2
Two-Tailed Test: H0: µ1 = µ2 H1: µ1 ≠ µ2
2. Indicate which test you are using.
a. If σ1 and σ2 are known, then the convention is to compute the P-value with a normal distribution. The 2-SampZTest uses a normal distribution (STAT / TESTS / 3: 2-SampZTest).
b. If σ1 and σ2 are not known, then the convention is to compute the P-value with the more conservative Student’s t-Distribution (STAT / TESTS / 4: 2-SampTTest). Input the sample standard deviation s.
3. Conclude the Test: If P-value ≤ α, then the sample data is significant and we reject H0, otherwise we conclude the sample data is not strong enough to reject H0.
4. Interpret the results [specific to the context of the application].
Example 14 A consumer group measures the heating capacity of camp stoves by measuring the time it takes the stove to boil 2 quarts of water from 50°F. Two competing models were tested:
Model 1: x̄1 = 11.4 min σ1 = 2.5 min n1 = 10
Model 2: x̄2 = 9.9 min σ2 = 2.5 min n2 = 12
Is there a difference in the performance of the two models at a 5% level of significance? 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results [specific to the context of the application].
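When σ1 and σ2 are known, the two-sample test standardizes x̄1 − x̄2 with the standard deviation √(σ1²/n1 + σ2²/n2). A minimal Python sketch of that two-tailed calculation, using the camp stove numbers of Example 14, is given below; the function name is hypothetical and only the standard library is assumed.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_two_tailed(x_bar1, sigma1, n1, x_bar2, sigma2, n2):
    """Two-tailed P-value for H0: mu1 = mu2 when sigma1 and sigma2 are known."""
    # Standard deviation of the x-bar1 minus x-bar2 distribution
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (x_bar1 - x_bar2) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Example 14: Model 1 versus Model 2, alpha = 0.05
p_value = two_sample_z_two_tailed(11.4, 2.5, 10, 9.9, 2.5, 12)
print(f"p-value = {p_value:.4f}, reject H0: {p_value <= 0.05}")
```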
Example 15 Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B:
Brand A: x̄1 = 21.8 min s1 = 8.7 min n1 = 12
Brand B: x̄2 = 18.9 min s2 = 7.5 min n2 = 12
Assuming both distributions are approximately normal, test the claim that there is no difference in the mean time required for bodily absorption. 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results [specific to the context of the application].
Testing Two Proportions p1 & p2
STAT / TESTS / 6: 2-PropZTest
Example 16 The Macek County Clerk wishes to improve voter registration. One method under consideration is to send reminders in the mail to all citizens in the county who are eligible to register. A random sample of 1250 potential voters was taken.
Group 1: There were 625 people in this group. No reminders to register were sent to them. The number of potential voters from this group who registered was 295.
Group 2: There were 625 people in this group. Reminders to register were sent to them. The number of potential voters from this group who registered was 350.
At a 5% level of significance, did reminders improve voter registration? 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results [specific to the context of the application].
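The two-proportion test can be sketched the same way. The Python below pools the two samples to estimate the common proportion under H0: p1 = p2, which is the usual approach for this test; the function name and the left-tailed setup (H1: p1 < p2, reminders improve registration) are choices made for this sketch. It uses the counts of Example 16.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_left_tailed(r1, n1, r2, n2):
    """Left-tailed P-value for H0: p1 = p2 versus H1: p1 < p2."""
    p_hat1 = r1 / n1
    p_hat2 = r2 / n2
    # Pooled estimate of the common proportion assumed under H0
    p_pooled = (r1 + r2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    standard_error = sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
    z = (p_hat1 - p_hat2) / standard_error
    return NormalDist().cdf(z)


# Example 16: Group 1 (no reminders) 295 of 625, Group 2 (reminders) 350 of 625
p_value = two_prop_z_left_tailed(r1=295, n1=625, r2=350, n2=625)
print(f"p-value = {p_value:.5f}, reject H0 at alpha = 0.05: {p_value <= 0.05}")
```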
Guided Exercise 11 The Macek County Clerk wishes to improve voter registration. One method under consideration is to send reminders in the mail to all citizens in the county who are eligible to register. A random sample of 1100 potential voters was taken.
Group 1: There were 500 people in this group. No reminders to register were sent to them. The number of potential voters from this group who registered was 248.
Group 2: There were 600 people in this group. Reminders to register were sent to them. The number of potential voters from this group who registered was 332.
At a 1% level of significance, did reminders improve voter registration? 1. Establish the hypotheses. 2. Find the P-value of the test statistic and conclude the test.
Show the test used on your calculator to compute the P-value. 3. Interpret the results [specific to the context of the application].
TI-83/84 STAT / TESTS menu (menu item, section, description)
1: Z-Test (9.2) Testing the mean µ when σ is known. Be able to do these problems without using the Z-Test function. That is, sketch the distribution and compute the p-value using the normalcdf function.
2: T-Test (9.2, 9.4) Testing the mean µ when σ is not known, or testing dependent paired data µd = 0.
3: 2-SampZTest (9.5) Testing two means µ1 − µ2 when σ1 and σ2 are known.
4: 2-SampTTest (9.5) Testing two means µ1 − µ2 when σ1 and σ2 are not known.
5: 1-PropZTest (9.3) Testing a proportion p.
6: 2-PropZTest (9.5) Testing two proportions.
7: ZInterval (8.1) Estimating µ when σ is known. Be able to do these problems without using the ZInterval function. That is, sketch the distribution and compute the interval using the invNorm function.
8: TInterval (8.2) Estimating µ when σ is not known.
9: 2-SampZInt (8.5) Estimating µ1 − µ2 when σ1 and σ2 are known.
0: 2-SampTInt (8.5) Estimating µ1 − µ2 when σ1 and σ2 are not known.
A: 1-PropZInt (8.3) Estimating p in the binomial distribution.
B: 2-PropZInt (8.5) Estimating p1 − p2.
10.1 Scatter Diagrams (Page 1 of 13)
10.1 Paired Data and Scatter Diagrams Linear Equations Linear equations (or linear functions) graph as straight lines and can be written in the form:
y = bx + a
where (0, a) is the y-intercept, and
b = slope of the line = rise/run = (y2 − y1)/(x2 − x1)
Example A
a. Identify the slope and y-intercept in each of the equations.
b. Graph each equation using the y-intercept and the slope.
c. Graph each equation using the TI-83.
Equation y = bx + a     Slope b     y-intercept (0, a)
y = 2x − 5
y = −(2/3)x + 3
y = 5
y = −x
[Two blank coordinate grids with x- and y-axes running from −4 to 4]
Scatter Diagram; Explanatory & Response Variables A scatter diagram is a plot of ordered-pair (x, y) data. We call x the explanatory variable and y the response variable.
Example 1 Phosphorus, a chemical in many household and industrial cleaning compounds, often finds its way into surface water. A random sample of eight sites in California wetlands gave the following information about phosphorus reduction in drainage water. In this study, x represents phosphorus concentration (in 100 mg/l) at the inlet of a bio-treatment facility and y represents the phosphorus concentration at the outlet of the facility.
a. Make a scatter diagram of the data. Label and scale the axes. Then draw a “best fit” linear model through the data.
b. Do x and y appear to be linearly related?
c. Use the linear model (line) to predict the outlet concentration of phosphorus if the inlet concentration is 700 mg/l.
d. Use the linear model to predict the inlet concentration of phosphorus if the outlet concentration is 200 mg/l.
x 5.2 7.3 6.7 5.9 6.1 8.3 5.5 7.0
y 3.3 5.9 4.8 4.5 4.0 7.1 3.6 6.1
Linear Correlation If the scatter plot of ordered pair data, represented by variables x and y, trends roughly into a straight line, then we say that x and y are linearly correlated. Linear correlation is classified in two general ways: 1. Degree: none, low-moderate, high, perfect Perfect linear correlation means that all (x, y) ordered pairs of
data lie on the same straight line. 2. Sign or slope: positive or negative
i. Positive linear correlation means that high values of x correlate with high values of y, and low values of x correlate with low values of y. The graph has a positive slope (and trends upward from left to right). ii. Negative linear correlation means that high values of x correlate with low values of y, and low values of x correlate with high values of y. The graph has a negative slope (and trends downward from left to right).
See figures 10-1, 10-2, 10-4 Guided Exercise 2, Table 2, and Exercises 1 & 2
Guided Exercise 1 An industrial plant has 7 divisions that do the same type of work. A safety inspector tracks x = “the number of work-hours devoted to safety training” and y = “the number of work-hours lost due to accidents.” The results are shown.
(a) Make a scatter diagram for the data.
(b) Make a scatter diagram on your calculator. Enter the x-values into L1 and the y-values into L2. Turn the STAT PLOT on and adjust the window settings.
(c) Draw a “best fit” line through the data and classify the linear correlation as (i) none, low-moderate, high, or perfect, and (ii) positive or negative.
(d) Use your linear model (i.e. read from the line) to predict the number of safety training hours needed so that 20 work-hours are lost due to accidents.
(e) Use your linear model (i.e. read from the line) to predict the number of work-hours lost due to accidents when 30 hours are spent on safety training.
Division   x      y
1          10.0   80
2          19.5   65
3          30.0   68
4          45.0   55
5          50.0   35
6          65.0   10
7          80.0   12
Sample Correlation Coefficient r The correlation coefficient r is a unit-less numerical measure that assesses the strength of linear relationship between two variables x and y. See table 10-2. 1. −1≤ r ≤1 2. If r = 1, there is perfect positive linear correlation. 3. If r = −1, there is perfect negative linear correlation. 4. The closer r is to 1 or –1, the better a line describes the
relationship between x and y. 5. If r is positive, then as x increases, y increases. 6. If r is negative, then as x increases, y decreases. 7. The value of r is the same regardless of which variable is the
explanatory and which is the response variable. Data plotted as (x, y) and (y, x) will have the same value for r.
Computation of r 1. Turn DiagnosticOn in the CATALOG menu. 2. Enter the x-values into L1 and the y-values into L2. 3. STAT / CALC/4: LinReg(ax+b) Lx,Ly,Y1 4. Highlight Calculate and press ENTER. Then scroll down
to find the value of r. Guided Exercise 1 (f) Find the sample correlation coefficient for the data in guided
exercise 1. Sample versus Population Correlation Coefficient r r = sample correlation coefficient computed from a random sample of (x, y) data pairs. ρ = population correlation coefficient computed from all population data pairs (x, y).
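As a cross-check on the calculator's value of r, the sample correlation coefficient can be computed from its definition. The Python sketch below uses the Guided Exercise 1 data and only the standard library; the helper name sample_correlation is hypothetical, and the formula used is r = Sxy / √(Sxx · Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means.

```python
from math import sqrt
from statistics import mean


def sample_correlation(xs, ys):
    """Sample correlation coefficient r for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Guided Exercise 1: safety-training hours (x) and work-hours lost to accidents (y)
training_hours = [10.0, 19.5, 30.0, 45.0, 50.0, 65.0, 80.0]
hours_lost = [80, 65, 68, 55, 35, 10, 12]

r = sample_correlation(training_hours, hours_lost)
print(f"r = {r:.3f}")

# Swapping the roles of x and y leaves r unchanged (property 7 above)
assert abs(r - sample_correlation(hours_lost, training_hours)) < 1e-12
```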
Lurking Variables In ordered pairs (x, y), x is called the explanatory variable and y is called the response variable. When r indicates a linear correlation between x and y, a change in the values of y tends to respond to changes in values of x according to a linear model. A lurking variable is a variable that is neither an explanatory nor a response variable. Yet, a lurking variable may be responsible for changes in both x and y. Correlation does not necessarily mean causation. Example 3 It has been observed in a certain community that over the years the correlation between x, the number of people going to church, and y, the number of people in jail, was r = 0.90. Does going to church cause people to go to jail, or vice versa? Explain.
10.2 Linear Regression and the Coefficient of Determination (Page 7 of 13)
10.2 Linear Regression and the Coefficient of Determination
Least-Squares Linear Regression Line The least-squares linear regression line is the line that fits the (x, y) data points in such a manner that the sum of the squares of all the vertical distances from the data points to the line is as small as possible. The point (x̄, ȳ) is always on the least-squares regression line. Computing the Linear Regression Line on the TI-83/84
STAT / CALC / 4: LinReg(ax+b) Lx, Ly, Y1 Output: a and b. The calculator reports this line as y = ax + b, so its a is the slope (b in y = bx + a) and its b is the y-intercept (a in y = bx + a).
Example 4 In Denali National Park, Alaska, the wolf population is dependent on the caribou population. Let x represent the caribou population (in hundreds) and y represent the wolf population. A random sample in recent years gave the following information.
(a) Identify the explanatory and response variables.
(b) Make a scatter diagram of the data.
(c) Find the linear regression line for the data. Graph the LSRL and write at least 2 points on the line.
(d) Interpret the slope of the line in this application.
(e) Predict the size of the wolf population when the caribou population is 2100. Is this interpolation or extrapolation?
(f) Predict the size of the wolf population when the caribou population is 4000. Is this interpolation or extrapolation?
x 30 34 27 25 17 23 20 y 66 79 70 60 48 55 60
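The least-squares line for the Example 4 data can also be worked out directly. The following Python sketch uses the common closed-form expressions b = Sxy / Sxx and a = ȳ − b·x̄, written in the notes' notation y = bx + a; the helper name least_squares_line is hypothetical and only the standard library is assumed.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = bx + a."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept; forces the line through (x-bar, y-bar)
    return a, b


# Example 4: caribou population in hundreds (x) and wolf population (y)
caribou = [30, 34, 27, 25, 17, 23, 20]
wolves = [66, 79, 70, 60, 48, 55, 60]

a, b = least_squares_line(caribou, wolves)
print(f"y = {b:.3f}x + {a:.2f}")

# A caribou population of 2100 corresponds to x = 21, inside the observed x-range
predicted_wolves = b * 21 + a
print(f"predicted wolf population at x = 21: {predicted_wolves:.1f}")
```

Because a is defined as ȳ − b·x̄, the point (x̄, ȳ) lies on the fitted line by construction.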
Coefficient of Determination r² If r is the correlation coefficient, then r² is called the coefficient of determination and
r² = Explained Variation / Total Variation.
1. The value of r² is the ratio of explained variation over total variation. That is, r² is the fractional amount of the total variation in y that can be explained by using the linear model y = bx + a with x as the explanatory variable.
2. 1 − r² is the fractional amount of the total variation in y that is due to random chance or to the possibility of lurking variables that influence y.
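The ratio of explained variation to total variation can be computed directly from a fitted line. In the Python sketch below, explained variation is taken as the sum of (ŷ − ȳ)² and total variation as the sum of (y − ȳ)², which are the usual definitions; the data are the caribou and wolf counts of Example 4, and the variable names are illustrative.

```python
from statistics import mean

# Example 4: caribou population in hundreds (x) and wolf population (y)
xs = [30, 34, 27, 25, 17, 23, 20]
ys = [66, 79, 70, 60, 48, 55, 60]

x_bar, y_bar = mean(xs), mean(ys)

# Least-squares line y = bx + a
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar
predicted = [b * x + a for x in xs]

explained_variation = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
total_variation = sum((y - y_bar) ** 2 for y in ys)

r_squared = explained_variation / total_variation
print(f"r^2 = {r_squared:.3f}, 1 - r^2 = {1 - r_squared:.3f}")
```

For these data roughly 84% of the variation in the wolf population is explained by the linear model with caribou population as the explanatory variable, leaving about 16% to chance or lurking variables.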
Example 4A
(a) Find r and r² for example 4.
(b) Explain the value of r² in this application.
(c) Explain the value of 1 − r² in this application.
Change in Directions for 10.2 Exercises Do the following in problems 7-18:
(a) View the scatter diagram of the data on your calculator to verify a linear model is appropriate.
(b) Find a and b for the least-squares regression line y = bx + a. Then find r, r², x̄ and ȳ.
(c) Graph the regression line from part (b). Be sure the point (x̄, ȳ) is on the graph.
(d) Interpret the value of r² in one sentence relevant to the application. Interpret the value of 1 − r² in one sentence relevant to the application.
Guided Exercise 3 Quick Sell car dealership runs 1-minute TV advertisements and tracks x = “the number of ads that week” and y = “the number of cars sold that week.”
x 6 20 0 14 25 16 28 18 10 8
y 15 31 10 16 28 20 40 25 12 15
Complete steps (a) – (d). Then find the predicted number of cars sold per week if the budget only allows 12 ads to be run per week.
(a) View the scatter diagram of the data on your calculator to verify a linear model is appropriate.
(b) Find a and b for the least-squares regression line y = bx + a. Then find r, r², x̄ and ȳ.
(c) Graph the regression line from part (b). Be sure the point (x̄, ȳ) is on the graph.
(d) Interpret the value of r² in one sentence relevant to the application. Interpret the value of 1 − r² in one sentence relevant to the application.
10.3 Testing the Correlation Coefficient (Page 11 of 13)
10.3 Testing the Correlation Coefficient
The population correlation coefficient ρ (rho, read “row”) is estimated by the statistic r. If we assume the variables x and y are normally distributed and want to test whether they are correlated in the population, then we set the null hypothesis to say they are not correlated:
H0: ρ = 0 (x and y are not correlated in the population)
Theorem Let random variables x and y be normally distributed. If ρ = 0 (as assumed in the null hypothesis), then the distribution of sample correlation coefficients (the r values) is symmetric about r = 0 and approximately normal for large samples.
Figure: Distribution of r-values when ρ = 0 (horizontal axis: r).
Example 6
Do college graduates have an improved chance of a better income? Let x = percentage of the population 25 or older with at least 4 years of college and y = percentage growth in per capita income over the past seven years. A random sample of six communities in Ohio gave the information in the table.
x   9.9  11.4   8.1  14.7   8.5  12.6
y  37.1  43.0  33.4  47.1  26.5  40.2
(a) Find the correlation coefficient r.
(b) Test to see if the correlation coefficient is positive at a 1% level of significance.
(c) Summarize your conclusions in one sentence relevant to this application.
10.3 Homework
Do Exercises 7-12 parts (b) and (d) only.
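For Example 6, the right-tailed test of H0: ρ = 0 can be sketched as follows. These notes do not spell out a test statistic, so this sketch assumes the standard one, t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom. The critical value 3.747 is the usual one-tailed t-table entry for α = 0.01 and d.f. = 4, and is also an assumption of the sketch. The helper name `correlation` is illustrative.

```python
from math import sqrt

# Example 6 data: x = % of adults 25+ with at least 4 years of college,
# y = % growth in per capita income over the past seven years.
x = [9.9, 11.4, 8.1, 14.7, 8.5, 12.6]
y = [37.1, 43, 33.4, 47.1, 26.5, 40.2]


def correlation(xs, ys):
    """Sample correlation coefficient r from sums of squares about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    ss_xx = sum((xi - x_bar) ** 2 for xi in xs)
    ss_yy = sum((yi - y_bar) ** 2 for yi in ys)
    return ss_xy / sqrt(ss_xx * ss_yy)


n = len(x)
r = correlation(x, y)

# Assumed test statistic (not stated in the notes): Student's t with n - 2 d.f.
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Assumed table value: one-tailed critical t for alpha = 0.01, d.f. = 4.
T_CRITICAL = 3.747
reject_h0 = t > T_CRITICAL       # right-tailed test of H1: rho > 0

print(f"r = {r:.3f}, t = {t:.2f}, reject H0: {reject_h0}")
```

The same computation covers a left-tailed claim such as ρ < 0: compare t with the negative of the table value instead.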
Exercise 10.3 #3
What is the optimal time for a scuba diver to be at the bottom of the ocean? The navy defines optimal time to be the time at each depth for the best balance between length of work period and decompression time after surfacing. Let x = depth of a dive in meters, and y = optimal time in hours. A random sample of divers gave the following data.
x  14.1  24.3  30.2  38.3  51.3  20.5  22.7
y  2.58  2.08  1.58  1.03  0.75  2.38  2.20
(b) Use a 1% level of significance to test the claim that ρ < 0.
(d) Find the predicted optimal time for a dive depth of 18 meters.
11.1 Chi Square, χ²: Tests for Independence
Example 1
A keyboard manufacturer wants to know: “Is the time a new student takes to learn to type independent of the arrangement of the letters on the keyboard?” Three hundred beginning typing students were randomly assigned to learn to type 20 wpm on three different keyboards. The observed data are in the table below. Use a 1% level of significance.
Keyboard   21-40 h   41-60 h   61-80 h   Total
A             25        30        25       80
B             30        71        19      120
Standard      35        49        16      100
Total         90       150        60      300
1. The company would test the following hypotheses:
H0: Keyboard letter arrangement and learning times are independent.
H1: Keyboard letter arrangement and learning times are not independent.
2. Enter the contingency table into a matrix.
i. MATRIX / EDIT / [A]
ii. Enter the number of rows (3) and the number of columns (3).
iii. Enter each element in the contingency table row-by-row.
A = [ 25  30  25 ]
    [ 30  71  19 ]
    [ 35  49  16 ]
3. Compute the p-value using STAT / TESTS / C: χ²-Test
4. State your conclusion in a sentence relevant to the problem.
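The computation behind step 3 of the keyboard example can be sketched by hand. The sketch below assumes the standard formulas: expected count = (row total)(column total)/(grand total), χ² = Σ (O − E)²/E, and d.f. = (rows − 1)(columns − 1). The p-value helper uses a closed form for the right-tail area that holds only when d.f. is even, which is the case here. The function names are illustrative, not calculator commands.

```python
from math import exp, factorial

# Observed counts: rows = keyboards A, B, Standard;
# columns = learning time 21-40 h, 41-60 h, 61-80 h.
observed = [
    [25, 30, 25],
    [30, 71, 19],
    [35, 49, 16],
]


def chi_square_independence(table):
    """Return (chi2, df) for a test of independence on a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi_square_p_value_even_df(chi2, df):
    """Right-tail area of the chi-square distribution; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = chi2 / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


chi2, df = chi_square_independence(observed)
p_value = chi_square_p_value_even_df(chi2, df)
reject_h0 = p_value < 0.01       # 1% level of significance

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The χ² value and p-value should be comparable to what the calculator's χ²-Test reports. For an odd number of degrees of freedom the p-value helper does not apply, and a table or the calculator is needed instead.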
11.1 Exercise 8
After a large fund drive for the City Library, the following information was obtained from a random sample of contributors to the library. Use a 1% level of significance to test the claim that “the amount contributed to the library is independent of ethnic group.”
Ethnic Group   1-50 ($)   51-100 ($)   101-150 ($)   151-200 ($)   $201+   Total
A                 310        715           201           105          42     1373
B                 619        511           312            97          22     1561
C                 402        624           217            88          35     1366
D                 544        571           309            79          29     1532
Total            1875       2421          1039           369         128     5832
11.2 Chi Square, χ²: Goodness of Fit
Example 1
Last year management listed five items and asked each employee to mark the one item most important to him/her. The percentage results are in the third column of the table. This year the managers asked 500 employees the same question; the observed results are in the second column. Test to see if this year’s distribution “fits” last year’s at a 1% level of significance.
Item         Observed, O   Expected %   Expected, E   (O − E)²/E
Vacation          30            4%
Salary           290           65%
Safety            70           13%
Retirement        70           12%
Overtime          40            6%
Total            500          100%          500
1. Set up the χ² “goodness-of-fit” hypotheses:
H0: The population fits the given distribution (i.e., last year’s).
H1: The population this year has a different distribution than last year’s.
2. Compute the χ² value: χ² = Σ (O − E)²/E
3. Sketch the χ²-distribution and find the critical value χ²_α (the minimum χ² required to reject H0) from Table 8. The degrees of freedom are df = (number of data rows) − 1.
4. State your conclusions.
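Steps 2 and 3 of the goodness-of-fit example amount to filling in the two empty table columns and comparing the total with a table value. A minimal sketch follows, assuming each expected count is the sample size times last year's percentage. The critical value 13.28 is the usual chi-square table entry for α = 0.01 and d.f. = 4, and is also an assumption of the sketch.

```python
# Example 1 data: this year's observed counts and last year's percentages.
items = ["Vacation", "Salary", "Safety", "Retirement", "Overtime"]
observed = [30, 290, 70, 70, 40]
expected_pct = [4, 65, 13, 12, 6]

n = sum(observed)                                     # 500 employees surveyed
expected = [n * pct / 100 for pct in expected_pct]    # the "Expected, E" column
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(terms)                                     # sum of the last column
df = len(observed) - 1                                # (number of data rows) - 1

# Assumed table value: critical chi-square for alpha = 0.01, d.f. = 4.
CHI2_CRITICAL = 13.28
reject_h0 = chi2 > CHI2_CRITICAL

for item, o, e, term in zip(items, observed, expected, terms):
    print(f"{item:<11}{o:>5}{e:>8.1f}{term:>8.3f}")
print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```

The same pattern applies to any goodness-of-fit table: replace the observed counts and percentages, and look up the critical value for the new significance level and degrees of freedom.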
11.2 Exercise 2
The type of household for the entire United States and for a random sample of 411 households in Dove Creek are given in the table. Test to see if Dove Creek’s distribution of households is the same as the U.S. distribution at a 5% level of significance.
Item                   U.S. %   Number in Dove Creek   Expected, E   (O − E)²/E
Married w/children       26%           102
Married w/o children     29%           112
Single Parent             9%            33
One Person               25%            96
Other                    11%            68
1. Set up the χ² “goodness-of-fit” hypotheses:
H0: The distribution of household type in Dove Creek fits the distribution in the U.S.
H1: The distribution of household type in Dove Creek is different from the distribution in the U.S.
2. Compute the χ² value: χ² = Σ (O − E)²/E
3. Sketch the χ²-distribution and find the critical value χ²_α (the minimum χ² required to reject H0) from Table 8. The degrees of freedom are df = (number of data rows) − 1.
4. State your conclusions.