Lecture 19 - PKUmwfy.gsm.pku.edu.cn/miao_files/ProbStat/lecture19.pdf19 The Critical Region!Let...


Lecture 19

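Hypothesis Testing

• Obtaining an estimator, its standard error, and a confidence interval is sometimes not enough.

• In practice we often have to choose between two opposing decisions. For example: should a new product be launched, or not?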

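Case Study: Teaching Evaluations

• Data: after a school carried out a teaching reform, it collected the teaching evaluations of every course in the following semester.

• Question: why do we care about these data? What can we analyze with them?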

No.   Title                 Gender   Student            Score
1     Associate Professor   Female   MBA                3.175
6     Associate Professor   Female   Graduate Student   4.103
8     Associate Professor   Male     Undergraduate      4.481
9     Associate Professor   Male     Undergraduate      4.195
⋮     ⋮                     ⋮        ⋮                  ⋮
118   Associate Professor   Male     MBA                4.667

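The Specific Case

• Before the teaching reform, the long-run average MBA teaching evaluation score was 4.2.

• The evaluations just collected cover 118 courses, of which 52 are MBA courses. Their evaluation scores have mean 4.34 and standard deviation 0.54.

• Question: has MBA teaching quality improved? Why do the two means differ?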

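Why Do the Two Means Differ?

• The effect of the teaching reform (a real improvement).

• Sampling error, for example the sampled courses all happened to be taught by good instructors (a spurious improvement).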

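Two Hypotheses

• Null hypothesis $H_0$: the mean MBA teaching score equals 4.2.

• Alternative hypothesis $H_1$: the mean MBA teaching score is greater than 4.2 (MBA teaching quality has improved).

• Based on the sample obtained, make a decision about the hypotheses: reject or accept.

• When do we reject the null hypothesis (and accept the alternative)?
  • Is 4.34 already good enough?
  • How good is good enough?
  • We need to find the rejection region, that is, the set of sample-mean values for which the null hypothesis is rejected.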

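A First Approach to Hypothesis Testing

We want to partition the space of sample (mean) values into two disjoint parts, one called the acceptance region and the other the rejection region. When the sample falls in the rejection region we reject the null hypothesis; when it falls in the acceptance region we accept the null hypothesis.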

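Rejection Region

• Rejection region: $z = \bar{x} - 4.2 \ge \omega$
  • How much larger than 4.2 must the sample mean MBA score be before we reject the null hypothesis?

Question: how should the threshold $\omega$ be chosen?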

What Errors Can You Make?

Your decision   H0 is actually true   H1 is actually true
Accept H0       You are correct       Type II error
Reject H0       Type I error          You are correct

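Two Types of Error

• Probability of a Type I error:

\[ \alpha = P(\text{reject } H_0 \text{ when } H_0 \text{ is true}) = P(\bar{X} - 4.2 \ge \omega \mid H_0) \]

• Probability of a Type II error:

\[ \beta = P(\text{accept } H_0 \text{ when } H_1 \text{ is true}) = P(\bar{X} - 4.2 < \omega \mid H_1) \]

• Can we find a test that makes neither type of error?

[Question] To make the probability of a Type I error smaller, should $\omega$ be made larger or smaller? To make the probability of a Type II error smaller, should $\omega$ be made larger or smaller?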
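As a rough numerical sketch of the question above, the two error probabilities can be tabulated as functions of $\omega$. The code assumes (this is not stated in the slides) that the mean of the 52 MBA scores is approximately normal with standard error 0.54/√52, and it uses a hypothetical alternative mean `MU1 = 4.4` purely so that $\beta$ can be computed.

```python
from math import sqrt
from statistics import NormalDist

MU0 = 4.2          # long-run MBA mean under H0 (from the slides)
S, N = 0.54, 52    # sample standard deviation and number of MBA courses (from the slides)
MU1 = 4.4          # hypothetical alternative mean, chosen only for illustration
SE = S / sqrt(N)   # approximate standard error of the sample mean (assumption)

def type1_error(omega: float) -> float:
    """alpha = P(Xbar - 4.2 >= omega | H0), using a normal approximation."""
    return 1.0 - NormalDist(0.0, SE).cdf(omega)

def type2_error(omega: float, mu1: float = MU1) -> float:
    """beta = P(Xbar - 4.2 < omega | true mean = mu1), using a normal approximation."""
    return NormalDist(mu1 - MU0, SE).cdf(omega)

for omega in (0.05, 0.10, 0.15, 0.20):
    print(f"omega={omega:.2f}  alpha={type1_error(omega):.3f}  beta={type2_error(omega):.3f}")
```

Running it shows how the two probabilities move in opposite directions as $\omega$ changes, so no choice of $\omega$ alone removes both kinds of error.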

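Neyman-Pearson Principle

• N-P principle: control the probability of committing a Type I error, then make the probability of committing a Type II error as small as possible.

• Significance level: require the probability of a Type I error to be no greater than $\alpha$.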

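Controlling the Significance Level

• The significance level $\alpha$ is usually taken to be 0.1, 0.05 or 0.01.

• If a Type I error is very costly, should we choose a larger or a smaller value of $\alpha$?

• If a Type I error is not costly, should we choose a larger or a smaller value of $\alpha$?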
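The sketch below shows one way the significance level could be turned into a rejection threshold for the MBA example. It assumes, beyond what the slides state, that the sample mean is approximately normal with standard error 0.54/√52; the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

MU0, XBAR = 4.2, 4.34   # H0 mean and observed MBA sample mean (from the slides)
S, N = 0.54, 52         # sample standard deviation and sample size (from the slides)
SE = S / sqrt(N)        # approximate standard error of the sample mean (assumption)

def threshold(alpha: float) -> float:
    """Smallest omega with P(Xbar - 4.2 >= omega | H0) <= alpha (normal approximation)."""
    return NormalDist().inv_cdf(1.0 - alpha) * SE

def reject_h0(alpha: float) -> bool:
    """Reject H0 when the observed excess over 4.2 reaches the threshold."""
    return (XBAR - MU0) >= threshold(alpha)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  omega={threshold(alpha):.4f}  reject H0: {reject_h0(alpha)}")
```

Under these assumptions the observed excess 4.34 − 4.2 = 0.14 reaches the threshold at the 0.10 and 0.05 levels but not at the 0.01 level, so the decision depends on the level chosen before looking at the data.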

Basic concepts in hypothesis testing

• $H_0$: null hypothesis, a tentative assumption about a population parameter: $\theta \in \Omega_0$

• $H_1$: alternative hypothesis: $\theta \in \Omega_1$

• $\Omega_0 \cup \Omega_1 = \Omega$, $\Omega_0 \cap \Omega_1 = \emptyset$

• Hypothesis testing: using data from a sample to decide whether to accept $H_0$ or to accept $H_1$.

• Begin with the assumption that the null hypothesis is TRUE (similar to the notion of innocent until proven guilty).

Formulating Hypotheses: Testing Research Hypotheses

• A particular automobile model currently attains an average fuel efficiency of 24 miles per gallon. A product-research group has developed a new carburetor specifically designed to increase the miles-per-gallon rating. To evaluate this new carburetor, several will be installed in automobiles and subjected to research-controlled driving tests.

• Research hypothesis: $\mu > 24$. It is generally formulated as the alternative hypothesis.

\[ H_0: \mu \le 24, \qquad H_1: \mu > 24 \]

Formulating Hypotheses: Testing in Decision-Making Situations

• On the basis of a sample of parts from a shipment that has just arrived, a quality-control inspector must decide whether to accept the entire shipment or to return the shipment to the supplier because it does not meet specifications. The specifications for a particular part require a mean length of 2 inches per part.

\[ H_0: \mu = 2, \qquad H_1: \mu \ne 2 \]

Simple and Composite Hypothesis

• If $\Omega_i$ contains just a single value of $\theta$, then the hypothesis $H_i$ is said to be a simple hypothesis: the distribution of the observations is completely specified.

• E.g. $H_0: \theta = \theta_0$

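• If the set $\Omega_i$ contains more than one value of $\theta$, then the hypothesis $H_i$ is said to be a composite hypothesis: it is only specified that the distribution of the observations belongs to a certain class.

• E.g.

\[ H_1: \theta \ne \theta_0 \quad \text{or} \quad H_1: \theta > \theta_0 \quad \text{or} \quad H_1: \theta < \theta_0 \]

\[ H_0: \theta \le \theta_0 \quad \text{or} \quad H_0: \theta \ge \theta_0 \]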

The Critical Region

• Let $S$ denote the sample space of the random vector $X = (X_1, \dots, X_n)$. A test procedure is specified by partitioning the sample space $S$ into two subsets.
  • One subset contains the values of $X$ for which $H_0$ will be accepted.
  • The other subset contains the values of $X$ for which $H_0$ will be rejected and $H_1$ will be accepted (the critical region).

• Determining a test procedure = specifying the critical region of the test.

Test Statistics

• In most hypothesis testing problems, the critical region is defined in terms of a test statistic, $T = r(X)$.

Example 8.1.2. Suppose that $X = (X_1, \dots, X_n)$ is a random sample from a normal distribution with mean $\mu$ and known variance $\sigma^2$. We wish to test the hypotheses

\[ H_0: \mu = \mu_0, \qquad H_1: \mu \ne \mu_0 \]

It might seem reasonable to reject $H_0$ if $\bar{X}_n$ is far from $\mu_0$. We define $T = |\bar{X}_n - \mu_0|$, and for each $c > 0$ we create a test procedure $\delta_c$ that rejects $H_0$ if $T \ge c$.
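A minimal sketch of the procedure $\delta_c$ from Example 8.1.2 follows. The helper `critical_value` is an addition not found in the slides: it uses the standard fact that, under $H_0$ with known $\sigma$, $\bar{X}_n$ is normal with standard deviation $\sigma/\sqrt{n}$, so a two-sided normal quantile gives a $c$ with a chosen Type I error probability. The data values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def critical_value(alpha: float, sigma: float, n: int) -> float:
    """c such that P(|Xbar_n - mu0| >= c | mu = mu0) = alpha, for known sigma."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) * sigma / sqrt(n)

def delta_c(sample, mu0: float, c: float) -> bool:
    """Test procedure delta_c: True (reject H0) when T = |Xbar_n - mu0| >= c."""
    t = abs(fmean(sample) - mu0)
    return t >= c

# Hypothetical data, used only to exercise the functions.
sample = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.2, 5.0]
c = critical_value(alpha=0.05, sigma=0.5, n=len(sample))
print(f"c = {c:.4f}")
print("reject H0: mu = 5.0?", delta_c(sample, 5.0, c))
print("reject H0: mu = 4.5?", delta_c(sample, 4.5, c))
```

Each value of `c` defines a different test; a larger `c` shrinks the critical region and makes rejection harder.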

The Power Function

• Let $C$ denote the critical region of the test $\delta$. Then the power function is defined by

\[ \pi(\theta \mid \delta) = \Pr(X \in C \mid \theta), \qquad \theta \in \Omega \]

• $\pi(\theta \mid \delta)$ is the probability that the test procedure $\delta$ will lead to the rejection of $H_0$.

• $1 - \pi(\theta \mid \delta)$ is the probability that the test procedure $\delta$ will lead to the acceptance of $H_0$.
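As a worked instance of this definition, the power function of the test $\delta_c$ from Example 8.1.2 can be written out explicitly. The derivation assumes $\bar{X}_n \sim N(\mu, \sigma^2/n)$ and writes $\Phi$ for the standard normal distribution function, a symbol the slides do not use.

```latex
\begin{aligned}
\pi(\mu \mid \delta_c)
  &= \Pr\bigl(|\bar{X}_n - \mu_0| \ge c \mid \mu\bigr) \\
  &= \Pr\bigl(\bar{X}_n \ge \mu_0 + c \mid \mu\bigr)
     + \Pr\bigl(\bar{X}_n \le \mu_0 - c \mid \mu\bigr) \\
  &= 1 - \Phi\!\left(\frac{\sqrt{n}\,(\mu_0 + c - \mu)}{\sigma}\right)
     + \Phi\!\left(\frac{\sqrt{n}\,(\mu_0 - c - \mu)}{\sigma}\right).
\end{aligned}
```

At $\mu = \mu_0$ this reduces to $2\bigl[1 - \Phi(\sqrt{n}\,c/\sigma)\bigr]$, the probability of a Type I error for $\delta_c$.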

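Example

• Suppose that a random sample $X_1, \dots, X_n$ is taken from a uniform distribution on the interval $[0, \theta]$, and that the following hypotheses are to be tested:

\[ H_0: 3 \le \theta \le 4, \qquad H_1: \theta < 3 \ \text{or} \ \theta > 4 \]

• The M.L.E. of $\theta$ is $Y_n = \max(X_1, \dots, X_n)$. Suppose that the critical region of the test $\delta$ contains all the values of $X_1, \dots, X_n$ for which either $Y_n \le 2.9$ or $Y_n \ge 4$.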

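• The power function of the test is

\[
\pi(\theta \mid \delta) = \Pr(Y_n \le 2.9 \mid \theta) + \Pr(Y_n \ge 4 \mid \theta) =
\begin{cases}
1 & \theta \le 2.9 \\
(2.9/\theta)^n & 2.9 < \theta \le 4 \\
(2.9/\theta)^n + 1 - (4/\theta)^n & \theta > 4
\end{cases}
\]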
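The piecewise power function can be checked numerically. The sketch below codes the closed form and, as a sanity check that is not part of the slides, compares it with a seeded Monte Carlo estimate; the function names and the choice n = 10 are illustrative.

```python
import random

def power(theta: float, n: int) -> float:
    """pi(theta | delta) = Pr(Y_n <= 2.9 | theta) + Pr(Y_n >= 4 | theta)."""
    if theta <= 2.9:
        return 1.0
    low = (2.9 / theta) ** n               # Pr(Y_n <= 2.9 | theta)
    if theta <= 4.0:
        return low                         # Y_n >= 4 is impossible here
    return low + 1.0 - (4.0 / theta) ** n  # add Pr(Y_n >= 4 | theta)

def simulated_power(theta: float, n: int, reps: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the rejection probability, for comparison."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y_n = max(rng.uniform(0.0, theta) for _ in range(n))
        rejections += (y_n <= 2.9) or (y_n >= 4.0)
    return rejections / reps

for theta in (2.5, 3.0, 3.5, 4.5):
    print(f"theta={theta}: exact={power(theta, 10):.4f}  "
          f"simulated={simulated_power(theta, 10):.4f}")
```

The closed form uses $\Pr(Y_n \le y \mid \theta) = (y/\theta)^n$ for $0 \le y \le \theta$, which holds because all $n$ independent observations must fall below $y$.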

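Size of the Test

• The size $\alpha(\delta)$ of a given test $\delta$ is defined by

\[ \alpha(\delta) = \sup_{\theta \in \Omega_0} \pi(\theta \mid \delta) \]

  $\alpha(\delta)$ is the maximum probability of a type I error.

• $\delta$ is a level $\alpha_0$ test if and only if $\alpha(\delta) \le \alpha_0$.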

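Example Revisit

• Suppose that a random sample $X_1, \dots, X_n$ is taken from a uniform distribution on the interval $[0, \theta]$, and that the following hypotheses are to be tested:

\[ H_0: 3 \le \theta \le 4, \qquad H_1: \theta < 3 \ \text{or} \ \theta > 4 \]

• The M.L.E. of $\theta$ is $Y_n = \max(X_1, \dots, X_n)$. Suppose that the critical region of the test $\delta$ contains all the values of $X_1, \dots, X_n$ for which either $Y_n \le 2.9$ or $Y_n \ge 4$.

• The power function of the test is

\[
\pi(\theta \mid \delta) = \Pr(Y_n \le 2.9 \mid \theta) + \Pr(Y_n \ge 4 \mid \theta) =
\begin{cases}
1 & \theta \le 2.9 \\
(2.9/\theta)^n & 2.9 < \theta \le 4 \\
(2.9/\theta)^n + 1 - (4/\theta)^n & \theta > 4
\end{cases}
\]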

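• The size of the test is

\[ \alpha(\delta) = \sup_{3 \le \theta \le 4} \pi(\theta \mid \delta) = \pi(3 \mid \delta) = \left(\tfrac{29}{30}\right)^n \]

  The supremum is attained at $\theta = 3$ because, for $3 \le \theta \le 4$, $\pi(\theta \mid \delta) = (2.9/\theta)^n$ is decreasing in $\theta$.

• When $n = 68$, the size of $\delta$ is $(29/30)^{68} = 0.0997$.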
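The size calculation can be reproduced in a few lines. The helper `smallest_n` is an illustrative addition, not part of the slides: it solves $(29/30)^n \le \alpha_0$ for the smallest integer $n$.

```python
from math import ceil, log

def size(n: int) -> float:
    """alpha(delta) = sup over 3 <= theta <= 4 of (2.9/theta)^n, attained at theta = 3."""
    return (29 / 30) ** n

def smallest_n(alpha0: float) -> int:
    """Smallest n for which delta is a level-alpha0 test, i.e. (29/30)^n <= alpha0."""
    return ceil(log(alpha0) / log(29 / 30))

print(f"size at n=68: {size(68):.4f}")
print(f"smallest n with size <= 0.10: {smallest_n(0.10)}")
```

For this test, 68 is the smallest sample size at which the size drops to 0.10 or below, which may be why the slides use $n = 68$.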
