The proportion of smokers in H.K. is 40%. How can we test this claim?


Page 1

The proportion of smokers in H.K. is 40%.

How can we test this claim?

Page 2

Hypothesis Testing

The proportion of smokers in H.K. is 40%.

What kind of hypothesis is this?

A statistical hypothesis is an assumption about a parameter of a population, e.g. the claim above.

It is impractical to investigate the whole population, so we select a sample and base the test on the information it yields.

This kind of testing is called hypothesis testing.

Page 3

Two kinds of Hypothesis

John: The coin I am holding is fair.
May: I don’t believe it. I want to test it.

Let p = probability of obtaining a head.

We have two kinds of hypothesis.

(1) The coin is fair, p = 0.5.
(2) The coin is not fair, p ≠ 0.5.

(1) is called the null hypothesis, denoted as H0.
(2) is called the alternative hypothesis, denoted as H1.

“null” means nothing; in this example, the coin is fair, nothing special.

We write

H0 : p = 0.5
H1 : p ≠ 0.5

Page 4

Two kinds of Testing

Two-tailed test

H0 : μ = μ0
H1 : μ ≠ μ0

One-tailed test

H0 : μ = μ0
H1 : μ > μ0

OR

One-tailed test

H0 : μ = μ0
H1 : μ < μ0

Page 5

Decision making

Before testing, choose a level of significance, α.

For this level of significance, we can find a corresponding critical value zc.

In A.L., we only need tests involving the normal distribution.

Test statistic $z = \frac{\text{estimate} - \text{parameter value under } H_0}{\sqrt{\text{variance of the estimate}}}$

Estimate: a sample statistic used to estimate the corresponding population parameter,

e.g. the sample mean is an estimate of the population mean.

In A.L. the estimate is in general normally distributed, hence

z ~ N(0,1)

What’s the relation between α and zc?

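Page 6

For One-tailed Test

[Figure: N(0,1) curve with one shaded tail of area α beyond the critical value zc. The right tail is shaded for H0: μ = μ0, H1: μ > μ0; OR the left tail is shaded for H0: μ = μ0, H1: μ < μ0.]

For Two-tailed Test

[Figure: N(0,1) curve with both tails shaded beyond the critical values, each of area α/2, for H0: μ = μ0, H1: μ ≠ μ0.]

If the test statistic lies in the shaded region, we reject H0; otherwise, we accept it.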
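The relation between α and zc shown in the figures can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `critical_value` is an assumption made for illustration and does not come from the slides.

```python
from statistics import NormalDist

def critical_value(alpha: float, two_tailed: bool = False) -> float:
    """Return zc such that the shaded tail area(s) under N(0,1) total alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(round(critical_value(0.05), 3))                   # one-tailed, alpha = 0.05
print(round(critical_value(0.05, two_tailed=True), 3))  # two-tailed, alpha = 0.05
print(round(critical_value(0.01), 3))                   # one-tailed, alpha = 0.01
```

For a left-tailed test the critical value is the negative of the one-tailed value returned here.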

Page 7

Test statistic $z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

The test is OK if

the population data has a normal distribution, for any sample size,

OR

the population data has any distribution and the sample size is large (n > 30).

Page 8

E.g. 39

Given: the I.Q. of children in H.K. ~ N(100, 20²). A group of 62 children has mean I.Q. = 102.6. Is the sample group more intelligent than the population? Given α (significance level) = 0.05.

We want to test whether the group is more intelligent or not.

The null hypothesis is: the group is nothing special! In symbols,
H0 : μ = 100

The alternative hypothesis is: the group is more intelligent! In symbols,
H1 : μ > 100

This is a one-tailed test.

Let x̄ be the sample mean, then $\bar{x} \sim N(100, \frac{20^2}{62})$

Estimate = 102.6

Test statistic $z = \frac{102.6 - 100}{20/\sqrt{62}} = 1.02$

α = 0.05

By table, zc = 1.645

[Figure: N(0,1) curve with the right tail beyond zc shaded, area α = 0.05; z lies outside the shaded region.]

Since 1.02 < 1.645, the test statistic z doesn’t lie in the shaded region!

Conclusion: we accept H0 at a significance level of 0.05

Caution: we are NOT saying that H0 is true!!
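The arithmetic of E.g. 39 can be reproduced with a short script. This is only a sketch using the numbers given in the example; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values from E.g. 39: population N(100, 20^2), sample of 62 with mean 102.6.
mu0, sigma, n, xbar, alpha = 100, 20, 62, 102.6, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
zc = NormalDist().inv_cdf(1 - alpha)   # right-tail critical value (H1: mu > 100)
reject_h0 = z > zc

print(f"z = {z:.2f}, zc = {zc:.3f}, reject H0: {reject_h0}")
```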

Page 9

E.g. 41

A lathe is adjusted so that the mean dimension = 20 cm. A sample of 40 is selected, with sample mean = 20.1 cm and s.d. = 0.2 cm. Do the results, tested at the 0.05 significance level, indicate that the machine is out of adjustment?

H0 : the machine is nothing special, not out of adjustment.
H0 : μ = 20

H1 : the machine is special, it is out of adjustment.
H1 : μ ≠ 20

This is a two-tailed test.

Let x̄ = sample mean, then $\bar{x} \sim N(20, \frac{0.2^2}{40})$

Estimate = 20.1

Test statistic $z = \frac{20.1 - 20}{0.2/\sqrt{40}} = 3.16$

α = 0.05

By table, zc = 1.96

[Figure: N(0,1) curve with both tails beyond -zc and zc shaded, each of area α/2; z lies in the right shaded region.]

Since 3.16 > 1.96, the test statistic z does lie in the shaded region!

Conclusion: we reject H0 at a significance level of 0.05

Caution: we are NOT saying that H0 is untrue!!
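A two-tailed test such as E.g. 41 compares |z| with zc. The sketch below wraps that comparison in a small function; the name `two_tailed_z_test` and its parameters are assumptions made for illustration, and the sample s.d. is used in place of σ as the example itself does.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(xbar, mu0, sd, n, alpha):
    """Return (z, zc, reject) for H0: mu = mu0 against H1: mu != mu0."""
    z = (xbar - mu0) / (sd / sqrt(n))
    zc = NormalDist().inv_cdf(1 - alpha / 2)  # each tail has area alpha/2
    return z, zc, abs(z) > zc

# Values from E.g. 41.
z, zc, reject = two_tailed_z_test(xbar=20.1, mu0=20, sd=0.2, n=40, alpha=0.05)
print(f"z = {z:.2f}, zc = {zc:.2f}, reject H0: {reject}")
```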

Page 10

Rejecting H0 does not mean that H0 is false.

x̄ is a random variable, and so is the test statistic z. The test statistic z is normally distributed.

Hence z can lie in any region of (-∞, ∞). When z lies in the “rejecting region”, either

A: You’re lucky! z has a small chance to lie in the region and you got it. However, the mean = μ0 may be true, i.e. H0 may be true!

[Figure: N(0,1) curve with both tails beyond -zc and zc shaded, each of area α/2; z lies in a shaded tail.]

OR

B: The sample selected is ordinary. z doesn’t lie in an extreme region of its true distribution, indicating that the true mean is some value greater than μ0, i.e. the mean = μ0 is untrue, i.e. H0 is false!

[Figure: normal curve with unit variance shifted to the right of N(0,1); z lies near its centre.]

In hypothesis testing, we adopt concept B.
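Case A can be illustrated by simulation. The sketch below draws z directly from N(0,1), so H0 is true by construction, and counts how often z still falls in the two-tailed rejecting region. The seed and the number of trials are arbitrary choices made for this illustration.

```python
import random
from statistics import NormalDist

random.seed(0)            # arbitrary seed, only for reproducibility
alpha = 0.05
zc = NormalDist().inv_cdf(1 - alpha / 2)
trials = 20000

# H0 is true in every trial, yet some z values land in the rejecting region.
rejections = sum(abs(random.gauss(0, 1)) > zc for _ in range(trials))
rate = rejections / trials

print(f"H0 true, yet rejected in {rate:.1%} of samples")
```

The observed rate should be close to α, which is why the significance level is read as the chance of rejecting a true H0.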

Page 11

E.g. 46

In a chemical plant the acid content of the effluent from the factory is measured frequently. From 400 measurements the acid content in grams per 100 litres of effluent is recorded in the following frequency distribution.

Acid content   12   13   14   15   16
Frequency       5   52  235   74   34

(a) Find the mean acid content and the standard error of the mean.

$\bar{x} = \frac{12(5) + \dots + 16(34)}{5 + \dots + 34} = 14.2$

$s = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2} = \sqrt{\frac{12^2(5) + \dots + 16^2(34)}{5 + \dots + 34} - 14.2^2} = 0.815$

s.e. $= \frac{s}{\sqrt{n}} = \frac{0.815}{\sqrt{400}} = 0.0408$

(b) Assuming a normal distribution of acid content, give 95% confidence limits for the mean acid content of the effluent.

95% confidence limits for the mean are $\bar{x} \pm 1.96\frac{s}{\sqrt{n}}$

= 14.12 and 14.28
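Parts (a) and (b) can be verified directly from the frequency table. The following sketch uses the same formulas as the slide (s computed with divisor n); the variable names are chosen here for illustration.

```python
from math import sqrt

acid = [12, 13, 14, 15, 16]     # acid content, grams per 100 litres
freq = [5, 52, 235, 74, 34]     # number of measurements at each value

n = sum(freq)
mean = sum(x * f for x, f in zip(acid, freq)) / n
# Standard deviation with divisor n, as on the slide.
s = sqrt(sum(x * x * f for x, f in zip(acid, freq)) / n - mean ** 2)
se = s / sqrt(n)                # standard error of the mean
lower, upper = mean - 1.96 * se, mean + 1.96 * se

print(f"mean = {mean:.1f}, s = {s:.3f}, s.e. = {se:.4f}")
print(f"95% confidence limits: {lower:.2f} and {upper:.2f}")
```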

Page 12

(c) Is the result consistent with a mean acid content of 14.13 g per 100 litres obtained from tests over several years?

H0 : μ = 14.13
H1 : μ ≠ 14.13

This is a two-tailed test.

Estimate = 14.2

Test statistic $z = \frac{14.2 - 14.13}{0.0408} = 1.72$

α = 0.05

By table, zc = 1.96

[Figure: N(0,1) curve with both tails beyond -zc and zc shaded, each of area α/2; z lies outside the shaded regions.]

Since 1.72 < 1.96,

Conclusion: we accept H0 at a significance level of 0.05

Hence at the 5% significance level, the result is consistent.

Caution: The level of significance must be stated! It’s meaningless to say accept or reject H0 alone.

Page 13

Test statistic $z = \frac{P_s - p}{\sqrt{pq/n}}$

The test is OK if

the sample size is large (n ≥ 30), and np ≥ 10 and nq ≥ 10;

for small sample size, use binomial.

In the calculation of Ps, a continuity correction must be made in passing from a discrete to a continuous variable.

Page 14

Continuity Correction

Sample size = n
Success no. = m

Ps = m/n

H1 : p > p0, adjusted Ps = (m - 0.5)/n

H1 : p < p0, adjusted Ps = (m + 0.5)/n

Believe me, I’ll tell you why later.

Page 15

E.g. 48

Last month the unemployment rate was 7.1%. This month, someone finds that 350 are unemployed in a random sample of 5000. Has unemployment decreased this month? α = 0.05

H0 : p = 7.1%
H1 : p < 7.1%

$P_s = \frac{350 + 0.5}{5000} = 0.0701$ (Continuity correction)

Test statistic $z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{0.0701 - 0.071}{\sqrt{0.071(1 - 0.071)/5000}} = -0.248$

α = 0.05. By table, zc = -1.645.

[Figure: N(0,1) curve with the left tail beyond zc shaded; z lies outside the shaded region.]

Since -0.248 > -1.645, we accept H0 at 5% level of significance,

i.e. there is no evidence that the unemployment rate has decreased this month.

Page 16

E.g. 49

A standard medication reduces pain in 80% of patients treated. A new medication for the same purpose produces 90 patients relieved among the first 100 tested. Does this new medication relieve more patients than before? (α = 1%)

H0 : p = 80%
H1 : p > 80%

$P_s = \frac{90 - 0.5}{100} = 0.895$

Test statistic $z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{0.895 - 0.8}{\sqrt{0.8 \times 0.2/100}} = 2.375$

α = 0.01, zc = 2.33

[Figure: N(0,1) curve with the right tail beyond zc shaded; z lies in the shaded region.]

z > zc

Thus, reject H0 at 1% level of significance,

i.e. the new medication relieves more patients than before.
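The proportion tests of E.g. 48 and E.g. 49 follow the same recipe, differing only in the direction of the continuity correction. The sketch below captures both; the function name `proportion_z` and the labels "greater" and "less" are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(m, n, p0, alternative):
    """z for a one-tailed proportion test with continuity correction.

    alternative: "greater" for H1: p > p0, "less" for H1: p < p0.
    """
    if alternative == "greater":
        ps = (m - 0.5) / n      # adjusted Ps when testing p > p0
    elif alternative == "less":
        ps = (m + 0.5) / n      # adjusted Ps when testing p < p0
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return (ps - p0) / sqrt(p0 * (1 - p0) / n)

# E.g. 48: left-tailed test at alpha = 0.05.
z48 = proportion_z(350, 5000, 0.071, "less")
zc48 = NormalDist().inv_cdf(0.05)          # negative critical value
reject48 = z48 < zc48

# E.g. 49: right-tailed test at alpha = 0.01.
z49 = proportion_z(90, 100, 0.80, "greater")
zc49 = NormalDist().inv_cdf(1 - 0.01)
reject49 = z49 > zc49

print(f"E.g. 48: z = {z48:.3f}, zc = {zc48:.3f}, reject H0: {reject48}")
print(f"E.g. 49: z = {z49:.3f}, zc = {zc49:.3f}, reject H0: {reject49}")
```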

Page 17

E.g. 53

A manufacturer claims that less than 2% of the women who use his birth control pill suffer from side effects. We have a feeling that this estimate is too low. We decide to test his claim at the 0.01 significance level using a sample of 900 randomly selected women. Find the decision rule.

Let p = probability that a randomly selected user has side effects.

H0 : p = 2%
H1 : p > 2%

α = 0.01, zc = 2.33

[Figure: N(0,1) curve with the right tail beyond zc shaded; z lies in the shaded region.]

$z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{P_s - 0.02}{\sqrt{0.02 \times 0.98/900}}$

For rejecting H0, set z > zc:

$\frac{P_s - 0.02}{\sqrt{0.02 \times 0.98/900}} > 2.33$

yielding Ps > 0.0309

Let m = no. of “side effect” users in the sample, then

$\frac{m - 0.5}{900} > 0.0309$

m > 28.3

Hence the rule is: if there are more than 28 “side effect” users, we say that the claimed rate is too low, otherwise, fail to reject the claim.
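The decision rule of E.g. 53 can be recomputed as follows. This sketch uses the exact normal quantile (about 2.326) rather than the table value 2.33, so the intermediate figures differ slightly from the slide while the final rule is the same; the variable names are chosen here for illustration.

```python
from math import floor, sqrt
from statistics import NormalDist

n, p0, alpha = 900, 0.02, 0.01

zc = NormalDist().inv_cdf(1 - alpha)           # about 2.326 (slide uses 2.33)
ps_cut = p0 + zc * sqrt(p0 * (1 - p0) / n)     # reject H0 when adjusted Ps exceeds this
m_cut = n * ps_cut + 0.5                       # undo the continuity correction (m - 0.5)/n
min_m = floor(m_cut) + 1                       # smallest whole count that rejects H0

print(f"Ps cut-off = {ps_cut:.4f}, m > {m_cut:.1f}, reject H0 when m >= {min_m}")
```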

Page 18

E.g. 55

John tells you that he can control the tosses of a fair coin. To see if he is right, you take two ordinary coins and give him one. You toss one and ask him to toss the same result. You repeat this experiment 18 times and John succeeds in tossing the same as you 15 times. Is John unusual? α = 0.05.

Note: the no. of trials = 18 is too small to use the normal distribution, so we use the binomial distribution instead.

Let p = probability of tossing the same as you.

H0 : p = 0.5
H1 : p > 0.5

P(15 or more successes) = (0.5)^18 [18C15 + 18C16 + 18C17 + 18C18]

= 0.0038 < 0.05

Conclusion: we reject H0; John is unusual at the 0.05 significance level.

(The chance of having 15 or more success is very small!)
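The binomial tail probability in E.g. 55 can be computed exactly. A minimal sketch, with variable names chosen here for illustration:

```python
from math import comb

n, observed, p, alpha = 18, 15, 0.5, 0.05

# P(X >= 15) for X ~ B(18, 0.5): sum the binomial probabilities of 15, 16, 17, 18.
p_value = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(observed, n + 1))

print(f"P(15 or more successes) = {p_value:.4f}")
print("reject H0" if p_value < alpha else "accept H0")
```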

Page 19

E.g. 57

A coin which is tossed n times comes up heads n times. What’s the min. n for which we can conclude at the 0.01 significance level that the coin is not fair?

Consider two cases: one-tailed and two-tailed.

Let p = probability of getting a head in each toss.

One-tailed test

H0 : p = 0.5
H1 : p > 0.5

X = no. of heads in n tosses, then X ~ B(n, 0.5)

n heads turn up, and we want to reject H0 at α = 0.01.

X = n should lie in the rejecting region.

[Figure: bar chart of P(X) against X = ..., n-2, n-1, n, with the rightmost region of area α = 0.01 marked as the rejecting region.]

Thus P(X = n) < 0.01

$\left(\frac{1}{2}\right)^n < 0.01$

min. n = 7.

Similarly, for the two-tailed test, we set

$\left(\frac{1}{2}\right)^n < \frac{0.01}{2}$

min. n = 8.
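Both minimum values of n can be found by a direct search. The sketch below assumes the same setup as the example (all n tosses are heads under H0: p = 0.5); the function name `min_tosses` is an assumption made for illustration.

```python
def min_tosses(alpha: float, two_tailed: bool = False) -> int:
    """Smallest n with P(X = n) = (1/2)^n below the tail area allowed by alpha."""
    threshold = alpha / 2 if two_tailed else alpha
    n = 1
    while 0.5 ** n >= threshold:
        n += 1
    return n

print(min_tosses(0.01))                   # one-tailed test
print(min_tosses(0.01, two_tailed=True))  # two-tailed test
```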

Page 20

Remarks On Continuity Correction

For one-tailed testing about proportion.

H0 : p = p0
H1 : p > p0

We consider sample size n with no. of successes X.

Suppose that in a sample, we find m successes out of n.

To decide whether we reject H0 or not, we look into the chance that X is m or above, and see whether this chance is too small (i.e. less than α or not).

Hence we are concerned with P(X ≥ m). X is discrete, and if we use a continuous variable to approximate it, a continuity correction must be carried out.

$P(X \ge m) \approx P\left(z \ge \frac{m - 0.5 - np_0}{\sqrt{np_0q_0}}\right) = P\left(z \ge \frac{\frac{m - 0.5}{n} - p_0}{\sqrt{\frac{p_0q_0}{n}}}\right)$

That’s why we use the adjusted sample proportion (m - 0.5)/n in the test statistic instead of m/n.

Try to “prove” the case for H1 : p < p0 on your own.
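The effect of the correction can be seen by comparing the exact binomial tail with the normal approximation, with and without the 0.5 adjustment. The numbers n = 100, p0 = 0.5 and m = 60 below are illustrative values chosen for this sketch and are not taken from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p0, m = 100, 0.5, 60          # illustrative values, not from the slides
q0 = 1 - p0

# Exact P(X >= m) for X ~ B(n, p0).
exact = sum(comb(n, k) * p0**k * q0**(n - k) for k in range(m, n + 1))

sd = sqrt(n * p0 * q0)
std_normal = NormalDist()
corrected = 1 - std_normal.cdf((m - 0.5 - n * p0) / sd)   # with continuity correction
uncorrected = 1 - std_normal.cdf((m - n * p0) / sd)       # without correction

print(f"exact = {exact:.4f}, corrected = {corrected:.4f}, uncorrected = {uncorrected:.4f}")
```

For these values the corrected approximation lies closer to the exact probability; for strongly skewed cases (p0 far from 0.5) the gain may be smaller.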

Page 21

Sorry for the hurried lessons; I haven’t had much time to prepare these slides. Please read the following on your own:

E.g. 58, 61, 63, 65

Please do the following as your class work / homework:

3(c) 4, 9, 11, 15, 22, 24, 26, 28, 31, 33, 36, 39
3(d) 1, 2, 3, 4, 17, 19, 25, 26, 27, 28, 34, 39