Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal...
-
Upload
antony-glenn -
Category
Documents
-
view
262 -
download
0
Transcript of Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal...
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 6-1
Chapter 6
The Normal Distribution
正态分布
Business Statistics:A First Course
5th Edition
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-2
Learning Objectives
In this chapter, you learn: To compute probabilities from the normal
distribution 计算正态分布概率 To use the normal probability plot to determine
whether a set of data is approximately normally distributed 利用正态概率图来判断某一数据集是否近似服从正态分布
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-3
Continuous Probability Distributions A continuous random variable is a variable that can
assume any value on a continuum (can assume an uncountable number of values) 连续性随机变量指一个变量可以在连续取值空间上任意取值
thickness of an item 物品的厚度 time required to complete a task 完成某一任务所需的时间 temperature of a solution 溶解温度 height, in inches 高度 , 以英寸计
These can potentially take on any value depending only on the ability to precisely and accurately measure 理论上可以在测量工具能达到的精确度范围内的任何值
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-4
The Normal Distribution ‘Bell Shaped’ 钟型 Symmetrical 对称的 Mean, Median and Modeare Equal 三种集中趋势度量一直
Location is determined by the mean, μ 位置参数为 μ
Spread is determined by the standard deviation, σ 离散程度 ( 尺度参数 ) 由 σ给出
The random variable has an infinite theoretical range: + to 随机变量的取值范围为实线
Mean = Median = Mode
X
f(X)
μ
σ
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-5
The Normal DistributionDensity Function
2μ)(X
2
1
e2π
1f(X)
The formula for the normal probability density function is
正态分布的概率密度函数为
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean 总体均值σ = the population standard deviation 总体标准差X = any value of the continuous variable 随机变量的取值
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-6
By varying the parameters μ and σ, we obtain different normal distributions
改变参数 μ 和 σ 可以得到不同样子的正态分布
Many Normal Distributions
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-7
The Normal Distribution Shape
X
f(X)
μ
σ
Changing μ shifts the distribution left or right.
改变均值可以左右移动分布Changing σ increases or decreases the spread.
改变标准差可以增减离散度
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-8
The Standardized Normal
Any normal distribution (with any mean and standard deviation combination) can be transformed into the standardized normal distribution (Z) 任何正态分布都可以转化为标准正态分布 Z
Need to transform X units into Z units 实现计量单位 X 到 Z 的变换
The standardized normal distribution (Z) has a mean of 0 and a standard deviation of 1 标准正态分布的均值为 0 ,标准差为 1
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-9
Translation to the Standardized Normal Distribution
Translate from X to the standardized normal (the “Z” distribution) by subtracting the mean of X and dividing by its standard deviation: 通过减去均值后除以标准差来实现 Z 变换(标准化变换)
σ
μXZ
The Z distribution always has mean = 0 and standard deviation = 1 Z 分布的均值和标准差总是为 0 和 1
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-10
The Standardized Normal Probability Density Function
The formula for the standardized normal probability density function is 标准正态分布的概率密度函数
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
Z = any value of the standardized normal distribution 标准正态分布 随机变量的取值
2(1/2)Ze2π
1f(Z)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-11
The Standardized Normal Distribution
Also known as the “Z” distribution Z 分布 Mean is 0 Standard Deviation is 1
Z
f(Z)
0
1
Values above the mean have positive Z-values, 如果变化后 Z值是正的,则变量 X 的值大于其均值
values below the mean have negative Z-values 如果变化后 Z值是负的,则变量 X 的值小于其均值
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-12
Example
If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is如果随机变量 X 服从均值为 100 。标准差为 50 的正态分布,则 X=200 时的 Z 值为
This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100. 表明X=200 时比均值 100 大了 2 倍的标准差
2.050
100200
σ
μXZ
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-13
Comparing X and Z units
Z100
2.00200 X
Note that the shape of the distribution is the same, only the scale has changed. We can express the problem in original units (X) or in standardized units (Z) 实际上从 X 到 Z 只是改变了变量的度量尺度,但是变量分布的形状没有改变。
(μ = 100, σ = 50)
(μ = 0, σ = 1)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-14
Finding Normal Probabilities 计算正态概率
a b X
f(X) P a X b( )≤
Probability is measured by the area under the curve 概率密度函数曲线下的面积为变量取值概率
≤
P a X b( )<<=(Note that the probability of any individual value is zero) 连续随机变量的单点取值概率为 0
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-15
f(X)
Xμ
Probability as Area Under the Curve
0.50.5
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below概率曲线下方总面积为 1 ,因为曲线对称,所以大于、小于均值部分的面积各为一半, 0.5
1.0)XP(
0.5)XP(μ 0.5μ)XP(
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-16
The Standardized Normal Table标准正态分布表
The Cumulative Standardized Normal table in the textbook (Appendix table E.2) gives the probability less than a desired value of Z (i.e., from negative infinity to Z) 书中附表 E.2 给出了标准正态分布的累计概率表,即给定 Z 值下,小于它的概率
Z0 2.00
0.9772Example:
P(Z < 2.00) = 0.9772
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-17
The Standardized Normal Table
The value within the table gives the probability from Z = up to the desired Z value
.9772
2.0P(Z < 2.00) = 0.9772
The row shows the value of Z to the first decimal point
The column gives the value of Z to the second decimal point
2.0
.
.
.
(continued)
Z 0.00 0.01 0.02 …
0.0
0.1
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-18
General Procedure for Finding Normal Probabilities
计算正态概率的一般步骤
Draw the normal curve for the problem in terms of X 画出刻画随机变量 X 的正态密度曲线
Translate X-values to Z-values 将 X 值转换为标准正态的 Z 值
Use the Standardized Normal Table查标准正态分布表
To find P(a < X < b) when X is distributed normally: 如果 X 服从正态分布,则计算 X 大于值 a ,且小于值 b 的概率如下:
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-19
Finding Normal Probabilities
Let X represent the time it takes to download an image file from the internet. 变量 X 表示从英特网下载图片所以的时间
Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6) 假定 X 服从均值为 8 ,标准差为 5 的正态分布,计算 P(X < 8.6)
X
8.6
8.0
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-20
Z0.12 0X8.6 8
μ = 8 σ = 10
μ = 0σ = 1
(continued)
Finding Normal Probabilities
0.125.0
8.08.6
σ
μXZ
P(X < 8.6) P(Z < 0.12)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-21
Z
0.12
Z .00 .01
0.0 .5000 .5040 .5080
.5398 .5438
0.2 .5793 .5832 .5871
0.3 .6179 .6217 .6255
Solution: Finding P(Z < 0.12)
.5478.02
0.1 .5478
Standardized Normal Probability Table (Portion) 标准正态概率表(部分)
0.00
= P(Z < 0.12)P(X < 8.6)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-22
Finding NormalUpper Tail Probabilities
Suppose X is normal with mean 8.0 and standard deviation 5.0. 假定 X 服从均值为8 ,标准差为 5 的正态分布
Now Find P(X > 8.6) 计算 P(X > 8.6)
X
8.6
8.0
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-23
Now Find P(X > 8.6)…(continued)
Z
0.12
0Z
0.12
0.5478
0
1.000 1.0 - 0.5478 = 0.4522
P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - 0.5478 = 0.4522
Finding NormalUpper Tail Probabilities
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-24
Finding a Normal Probability Between Two Values
Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(8 < X < 8.6)
P(8 < X < 8.6)
= P(0 < Z < 0.12)
Z0.12 0
X8.6 8
05
88
σ
μXZ
0.125
88.6
σ
μXZ
Calculate Z-values:
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-25
Z
0.12
Solution: Finding P(0 < Z < 0.12)
0.0478
0.00
= P(0 < Z < 0.12)P(8 < X < 8.6)
= P(Z < 0.12) – P(Z ≤ 0)= 0.5478 - .5000 = 0.0478
0.5000
Z .00 .01
0.0 .5000 .5040 .5080
.5398 .5438
0.2 .5793 .5832 .5871
0.3 .6179 .6217 .6255
.02
0.1 .5478
Standardized Normal Probability Table (Portion)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-26
Suppose X is normal with mean 8.0 and standard deviation 5.0.
Now Find P(7.4 < X < 8)
X
7.48.0
Probabilities in the Lower Tail
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-27
Probabilities in the Lower Tail
Now Find P(7.4 < X < 8)…
X7.4 8.0
P(7.4 < X < 8)
= P(-0.12 < Z < 0)
= P(Z < 0) – P(Z ≤ -0.12)
= 0.5000 - 0.4522 = 0.0478
(continued)
0.0478
0.4522
Z-0.12 0
The Normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-28
Empirical Rules 经验法则
μ ± 1σ encloses about 68.26% of X’s
均均均均均均均均均均均均均均 X
f(X)
Xμ μ+1σμ-1σ
What can we say about the distribution of values around the mean? For any normal distribution:
对于给定的正态分布 , 如何经验的确定随机变量如何围绕均值分布 ?
σσ
68.26%
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-29
The Empirical Rule
μ ± 2σ covers about 95% of X’s
μ ± 3σ covers about 99.7% of X’s
xμ
2σ 2σ
xμ
3σ 3σ
95.44% 99.73%
(continued)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-30
Steps to find the X value for a known probability: 步骤1. Find the Z value for the known probability
查标准正态累积概率表找出对应概率的 Z值2. Convert to X units using the formula:
用如下转换公式计算 X 值
Given a Normal ProbabilityFind the X Value
计算对应正态累积概率的 X 值
ZσμX
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-31
Finding the X value for a Known Probability
Example: Let X represent the time it takes (in seconds) to download an
image file from the internet. X 表示从英特网下载图片文件的时间( 秒 )
Suppose X is normal with mean 8.0 and standard deviation 5.0
假定 X 服从均值为 8, 标准差为 5 的正态分布 Find X such that 20% of download times are less than X. 计算 X 值
使得下载所需时间少于 X 的概率为 20%
X? 8.0
0.2000
Z? 0
(continued)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-32
Find the Z value for 20% in the Lower Tail
20% area in the lower tail is consistent with a Z value of -0.84Z .03
-0.9 .1762 .1736
.2033
-0.7 .2327 .2296
.04
-0.8 .2005
Standardized Normal Probability Table (Portion)
.05
.1711
.1977
.2266
…
…
…
…X? 8.0
0.2000
Z-0.84 0
1. Find the Z value for the known probability 先查表得 20% 概率对应的 Z 值
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-33
2. Convert to X units using the formula: 利用转换公式计算 X 值
Finding the X value
80.3
0.5)84.0(0.8
ZσμX
So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80 在均值为 8 。标准差为 5 的正态分布中, 20% 的值小于3.8
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-34
Evaluating Normality 正态性检验
Not all continuous distributions are normal 不是所有连续分布都是正态分布
It is important to evaluate how well the data set is approximated by a normal distribution. 判断一个数据集是否近似服从正态分布很重要
Normally distributed data should approximate the theoretical normal distribution: 将数据分布的特征与理论正态分布的性质进行比较
The normal distribution is bell shaped (symmetrical) where the mean is equal to the median. 正态分布是对称钟型且均值与中位数相同
The empirical rule applies to the normal distribution.
应该满足正态分布的经验法则 The interquartile range of a normal distribution is 1.33 standard
deviations. 四分位距应该在标准差的 1.33 倍左右
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-35
Evaluating NormalityComparing data characteristics to theoretical properties将数据分布的特征与理论正态分布的性质进行比较Construct charts or graphs 图形判断
For small- or moderate-sized data sets, construct a stem-and-leaf display or a boxplot to check for symmetry 茎叶图、盒子图判断对称性
For large data sets, does the histogram or polygon appear bell-shaped? 柱状图与折线图是否是钟型的
Compute descriptive summary measures 描述性统计量判断 Do the mean, median and mode have similar values? 三种集中趋势度
量是否近似相等 Is the interquartile range approximately 1.33 σ? 四分位距是否近似 1.33
σ Is the range approximately 6 σ? 全距是否近似 6 倍 σ
(continued)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-36
Evaluating Normality
Comparing data characteristics to theoretical properties将数据分布的特征与理论正态分布的性质进行比较 Observe the distribution of the data set 观察数据的分布
Do approximately 2/3 of the observations lie within mean ±1 standard deviation? 是否 2/3 的观测值落在均值加减一倍标准误之间
Do approximately 80% of the observations lie within mean ±1.28 standard deviations? 是否 80% 的观测值落在均值加减 1.28 倍标准误之间
Do approximately 95% of the observations lie within mean ±2 standard deviations? 是否 95% 的观测值落在均值加减 2 倍标准误之间
Evaluate normal probability plot 计算正态概率图 Is the normal probability plot approximately linear (i.e. a straight line) with
positive slope? 看正态概率图是否近似斜率为正的直线
(continued)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-37
ConstructingA Normal Probability Plot
Normal probability plot 正态概率图 Arrange data into ordered array 先将数据排序 Find corresponding standardized normal quantile values (Z) 计
算排序后累积百分比对应的标准正态分布 Z 值 Plot the pairs of points with observed data values (X) on the
vertical axis and the standardized normal quantile values (Z) on
the horizontal axis 以 X 为纵轴、 Z 为横轴画散点图 Evaluate the plot for evidence of linearity 评价散点图的线性程
度
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-38
A normal probability plot for data from a normal distribution will be approximately linear:
正态数据的正态概率图应该近似一条直线
30
60
90
-2 -1 0 1 2 Z
X
The Normal Probability PlotInterpretation
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-39
Normal Probability PlotInterpretation
Left-Skewed Right-Skewed
Rectangular
30
60
90
-2 -1 0 1 2 Z
X
(continued)
30
60
90
-2 -1 0 1 2 Z
X
30
60
90
-2 -1 0 1 2 Z
X Nonlinear plots indicate a deviation from normality
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-40
Evaluating NormalityAn Example: Mutual Funds Returns
403020100-10Return 2006
Boxplot of 2006 Returns
The boxplot appears reasonably symmetric,
with four lower outliers at -9.0, -8.0, -8.0, -6.5 and one upper outlier at 35.0. (The normal distribution is
symmetric.)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-41
Evaluating NormalityAn Example: Mutual Funds Returns
Descriptive Statistics
(continued)
• The mean (12.5142) is slightly less than the median (13.1). (In a normal distribution the mean and median are equal.)
• The interquartile range of 9.2 is approximately 1.46 standard deviations. (In a normal distribution the interquartile range is 1.33 standard deviations.)
• The range of 44 is equal to 6.99 standard deviations. (In a normal distribution the range is 6 standard deviations.)
• 72.2% of the observations are within 1 standard deviation of the mean. (In a normal distribution this percentage is 68.26%.
• 87% of the observations are within 1.28 standard deviations of the mean. (In a normal distribution percentage is 80%.)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-42
Evaluating NormalityAn Example: Mutual Funds Returns
403020100-10
99.99
99
95
80
50
20
5
1
0.01
Return 2006
Perc
ent
Probability Plot of Return 2006Normal
(continued)
Plot is approximately a straight line except for a few outliers at the low end and the high end.
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-43
Evaluating NormalityAn Example: Mutual Funds Returns
Conclusions The returns are slightly left-skewed The returns have more values concentrated around
the mean than expected The range is larger than expected (caused by one
outlier at 35.0) Normal probability plot is reasonably straight line Overall, this data set does not greatly differ from the
theoretical properties of the normal distribution
(continued)
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Chap 6-44
Chapter Summary
Presented normal distribution
Found probabilities for the normal distribution
Applied normal distribution to problems