Statistics III, July 15, 2020 (source: tcs.inf.kyushu-u.ac.jp/~kijima/GPS20/GPS20-10.pdf)


Statistics III

July 15, 2020

来嶋 秀治 (Shuji Kijima)

Dept. Informatics,

Graduate School of ISEE

Today's topics

• interval estimation (区間推定)

• hypothesis testing (仮説検定)

• t-test

• χ²-test

確率統計特論 (Probability & Statistics)

Lesson 10


1. Interval estimation

Statistical Inference (統計的推定)

point estimation (点推定)

consistent estimation (一致推定)

unbiased estimation (不偏推定)

maximum likelihood (最尤推定)

interval estimation (区間推定)

Statistical inference

Example 1

A clerk says, "Our eggs are big: 70 g on average."

You bought 6 eggs at the shop.

How large are the eggs sold in this shop?

egg          1     2     3     4     5     6
weight [g]   64.3  70.4  63.2  67.8  71.3  60.8

$\bar X = 66.3$ g, $s^2 = 17.584$ g²

Is the clerk honest?
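The two summary values for Example 1 can be reproduced with a short sketch. It assumes that $s^2$ is the unbiased variance (division by $n - 1$), which is what Python's `statistics.variance` computes; the variable names are illustrative only.

```python
from statistics import mean, variance

# Egg weights [g] from Example 1.
weights = [64.3, 70.4, 63.2, 67.8, 71.3, 60.8]

x_bar = mean(weights)    # sample mean
s2 = variance(weights)   # unbiased sample variance (divides by n - 1)

print(f"mean = {x_bar:.1f} g, s^2 = {s2:.3f} g^2")
```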

Central Limit Theorem (中心極限定理)

Def.

A sequence $Y_n$ with distribution functions $F_n$ converges to $Y$ in distribution (𝑌に分布収束する) if

\[ \lim_{n\to\infty} F_n(y) = F(y) \]

at every continuity point $y$ of $F$, where $F$ is the distribution function of $Y$.

Thm. (Central limit theorem)

Suppose $X_1, \dots, X_n$ are i.i.d. with expectation $\mu$ and variance $\sigma^2$. Then

\[ Z_n := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma} \]

converges to N(0,1) in distribution, i.e.,

\[ \lim_{n\to\infty} \Pr[Z_n < z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\, \mathrm{d}x. \]

Note that

\[ Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma} = \frac{\sqrt{n}}{\sigma} \cdot \frac{\sum_{i=1}^{n} (X_i - \mu)}{n} = \frac{\bar X - \mu}{\sigma / \sqrt{n}}. \]
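A small numerical illustration of the theorem, as a sketch only: it assumes $X_i \sim$ Bernoulli($p$) with $p = 0.3$ (a choice made here for illustration, not taken from the slides), so that $\Pr[Z_n < z]$ can be computed exactly from the binomial distribution and compared with the N(0,1) distribution function.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Distribution function of N(0,1)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_zn_below(n, p, z):
    """Exact Pr[Z_n < z] for i.i.d. X_i ~ Bernoulli(p), via the binomial pmf."""
    mu, sigma = p, sqrt(p * (1.0 - p))
    total = 0.0
    for k in range(n + 1):
        # When X_1 + ... + X_n = k, Z_n = (k - n*mu) / (sigma * sqrt(n)).
        if (k - n * mu) / (sigma * sqrt(n)) < z:
            total += comb(n, k) * p**k * (1.0 - p) ** (n - k)
    return total

# Gap between the exact probability and the normal limit at z = 1.
gaps = {n: abs(prob_zn_below(n, 0.3, 1.0) - normal_cdf(1.0)) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, f"{gap:.4f}")
```

The gap is already small for moderate $n$; the theorem itself only asserts the limit.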

Statistical inference

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Suppose $\sigma^2 = 18.0$ for simplicity.

Let $z^*$ (> 0) satisfy

\[ \Pr\left[ -z^* \le \frac{\bar X - \mu}{\sigma/\sqrt{n}} \le z^* \right] \ge 0.95 \]

("two-sided 95% confidence interval", 両側95%信頼区間).

By the central limit theorem,

\[ \Pr\left[ -z^* \le \frac{\bar X - \mu}{\sigma/\sqrt{n}} \le z^* \right] \approx \int_{-z^*}^{z^*} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} x^2 \right) \mathrm{d}x, \]

and we see that $z^* = 1.960$ (see the standard normal table).
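Instead of reading the table, $z^*$ can be obtained from the inverse of the standard normal distribution function. A minimal sketch using the standard library's `statistics.NormalDist` (the variable names are illustrative):

```python
from statistics import NormalDist

confidence = 0.95
std_normal = NormalDist()  # N(0,1)

# Two-sided interval: leave (1 - confidence) / 2 in each tail.
z_star = std_normal.inv_cdf(1.0 - (1.0 - confidence) / 2.0)

# Probability mass of N(0,1) on [-z*, z*]; should be close to 0.95.
central_mass = std_normal.cdf(z_star) - std_normal.cdf(-z_star)

print(f"z* = {z_star:.3f}, central mass = {central_mass:.3f}")
```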

Normal distribution

Wikipedia: Standard normal table

http://en.wikipedia.org/wiki/Normal_distribution

Standard normal table (標準正規分布表)

Wikipedia: Standard normal table

http://en.wikipedia.org/wiki/Standard_normal_table

Statistical inference

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Suppose $\sigma^2 = 18.0$ for simplicity.

With $\bar X = 66.3$ g, $z^* = 1.960$, $\sigma^2 = 18.0$, and $n = 6$:

\[
\begin{aligned}
\Pr\left[ -z^* \le \frac{\bar X - \mu}{\sigma/\sqrt{n}} \le z^* \right]
&= \Pr\left[ -z^* \frac{\sigma}{\sqrt{n}} \le \bar X - \mu \le z^* \frac{\sigma}{\sqrt{n}} \right] \\
&= \Pr\left[ -\bar X - z^* \frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar X + z^* \frac{\sigma}{\sqrt{n}} \right] \\
&= \Pr\left[ \bar X + z^* \frac{\sigma}{\sqrt{n}} \ge \mu \ge \bar X - z^* \frac{\sigma}{\sqrt{n}} \right] \\
&= \Pr\left[ 66.3 + 1.960 \sqrt{\frac{18}{6}} \ge \mu \ge 66.3 - 1.960 \sqrt{\frac{18}{6}} \right] \\
&= \Pr\left[ 69.69 \ge \mu \ge 62.91 \right].
\end{aligned}
\]
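The interval arithmetic can be checked with a short sketch. The helper name `z_confidence_interval` is hypothetical, and the sketch assumes a known variance $\sigma^2$ and the normal approximation used on the slide.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma2, n, confidence=0.95):
    """Two-sided interval x_bar -/+ z* * sigma / sqrt(n), assuming sigma2 is known."""
    z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z_star * sqrt(sigma2 / n)
    return x_bar - half_width, x_bar + half_width

# Example 1: mean 66.3 g, assumed variance 18.0, six eggs.
low, high = z_confidence_interval(66.3, 18.0, 6)
print(f"{low:.2f} <= mu <= {high:.2f}")
```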

2. Hypothesis testing (仮説検定)

Hypothesis testing (仮説検定)

Terminology

• null hypothesis (帰無仮説)

• alternative hypothesis (対立仮説)

Idea

• If Pr[null hypothesis is true] < α: reject the null hypothesis at significance level α (有意水準αで帰無仮説を棄却する).

• If Pr[null hypothesis is true] ≥ α: fail to reject the null hypothesis at significance level α (有意水準αで帰無仮説を棄却しない).

Here "Pr[null hypothesis is true]" is shorthand for the probability, computed under the null hypothesis, of observing data at least as extreme as the data actually observed.


Statistical inference

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Assume $\mu = 70.0$ (the null hypothesis), and suppose $\sigma^2 = 18.0$ for simplicity.

With $\mu = 70$, $z^* = 1.960$, $\sigma^2 = 18.0$, and $n = 6$:

\[
\begin{aligned}
\Pr\left[ -z^* \le \frac{\bar X - \mu}{\sigma/\sqrt{n}} \le z^* \right]
&= \Pr\left[ -z^* \frac{\sigma}{\sqrt{n}} \le \bar X - \mu \le z^* \frac{\sigma}{\sqrt{n}} \right] \\
&= \Pr\left[ \mu - z^* \frac{\sigma}{\sqrt{n}} \le \bar X \le \mu + z^* \frac{\sigma}{\sqrt{n}} \right] \\
&= \Pr\left[ 70 - 1.960 \sqrt{\frac{18}{6}} \le \bar X \le 70 + 1.960 \sqrt{\frac{18}{6}} \right] \\
&= \Pr\left[ 66.6 \le \bar X \le 73.4 \right].
\end{aligned}
\]

The observed $\bar X = 66.3$ g lies outside this interval, so the test rejects the null hypothesis $\mu = 70.0$ at significance level 5%.

(帰無仮説 μ = 70.0 は有意水準5%で棄却される.)
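The same decision can be sketched in code. The function name `z_test_two_sided` is hypothetical; the sketch assumes a known variance and a two-sided test, as in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(x_bar, mu0, sigma2, n, alpha=0.05):
    """Return the 1 - alpha range for the sample mean under H0: mu = mu0, and the decision."""
    z_star = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_star * sqrt(sigma2 / n)
    low, high = mu0 - half_width, mu0 + half_width
    reject = not (low <= x_bar <= high)
    return (low, high), reject

# Example 1: observed mean 66.3 g, H0: mu = 70.0, assumed variance 18.0, n = 6.
interval, reject = z_test_two_sided(66.3, 70.0, 18.0, 6)
print(f"[{interval[0]:.1f}, {interval[1]:.1f}]", "reject" if reject else "fail to reject")
```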

Exercise

Example 2

The scores of an examination.

How well do the students understand the material?

student   1   2   3   4   5   6   7   8   9   10
score     72  89  64  52  96  64  70  83  56  70

$\bar X = 71.6$, $\sigma^2 \simeq 200$ (unbiased variance)

Q1. Compute the two-sided 95% confidence interval.

Q2. Discuss the null hypothesis "the expectation is 80" at significance level 5%.

3. t-distribution and χ²-distribution


Student's t-statistic (スチューデントのt統計量)

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Suppose $\sigma^2 = 18.0$ for simplicity.

Assume $\mu = 70.0$. Let

\[ t := \frac{\bar X - \mu}{s/\sqrt{n}}, \qquad \text{where } s^2 := \frac{\sum_{i=1}^{n} (X_i - \bar X)^2}{n-1} \text{ (unbiased estimator of } \sigma^2 \text{)}. \]

Compare this with $Z_n := \dfrac{\bar X - \mu}{\sigma/\sqrt{n}}$ in the central limit theorem.

Question

Does $t$ follow N(0,1), in the same way as $Z_n$?

Student's t-statistic (スチューデントのt統計量)

Question

Does $t$ follow N(0,1), in the same way as $Z$?

Let $t = \dfrac{\bar X - \mu}{s/\sqrt{n}}$ and $Z = \dfrac{\bar X - \mu}{\sigma/\sqrt{n}}$, where $s^2 := \dfrac{\sum_{i=1}^{n} (X_i - \bar X)^2}{n-1}$ (unbiased estimator of $\sigma^2$). Then

\[
t = \frac{\sigma}{s} Z
= \frac{1}{\sqrt{\dfrac{s^2}{\sigma^2}}}\, Z
= \frac{1}{\sqrt{\dfrac{1}{\sigma^2} \cdot \dfrac{\sum_{i=1}^{n} (X_i - \bar X)^2}{n-1}}}\, Z
= \frac{1}{\sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} \left( \dfrac{X_i - \bar X}{\sigma} \right)^2}}\, Z.
\]

t-distribution and χ²-distribution

Prop. 1.

If $X_1, \dots, X_n \sim \mathrm{N}(\mu, \sigma^2)$ independently, then

\[ \sum_{i=1}^{n} \left( \frac{X_i - \bar X}{\sigma} \right)^2 \]

follows the $\chi^2$-distribution with $n - 1$ degrees of freedom.

Prop. 2.

$X_1, \dots, X_n \sim \mathrm{N}(0,1)$, independently.

Let $Y := X_1^2 + \cdots + X_n^2$; then $Y$ follows $\mathrm{Ga}\left( \frac{1}{2}, \frac{n}{2} \right)$, that is, the $\chi^2$-distribution with $n$ degrees of freedom.

Recall that

\[ t = \frac{\sigma}{s} Z = \frac{1}{\sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} \left( \dfrac{X_i - \bar X}{\sigma} \right)^2}}\, Z. \]
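Prop. 2 can be illustrated by simulation. The sketch below assumes that $\mathrm{Ga}(\frac{1}{2}, \frac{n}{2})$ denotes the gamma distribution with rate $\frac{1}{2}$ and shape $\frac{n}{2}$ (the parametrization is not spelled out on the slide). It compares an empirical probability $\Pr[Y \le y_0]$ with the integral of that gamma density; the values $n = 3$ and $y_0 = 2$ are arbitrary illustrative choices.

```python
import random
from math import exp, gamma

def gamma_pdf(x, rate, shape):
    """Density of the gamma distribution: rate^shape x^(shape-1) e^(-rate x) / Gamma(shape)."""
    return rate**shape * x ** (shape - 1.0) * exp(-rate * x) / gamma(shape)

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

n, y0 = 3, 2.0
rng = random.Random(0)  # fixed seed so that the run is reproducible
trials = 20000

# Count how often X_1^2 + ... + X_n^2 <= y0 for standard normal X_i.
hits = sum(
    sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) <= y0
    for _ in range(trials)
)
empirical = hits / trials
theoretical = simpson(lambda x: gamma_pdf(x, 0.5, n / 2.0), 0.0, y0)

print(f"empirical = {empirical:.3f}, Ga(1/2, n/2) integral = {theoretical:.3f}")
```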

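Idea of Prop. 1 (not a sketch of a proof)

\[
\begin{aligned}
\sum_{i=1}^{n} \left( \frac{X_i - \bar X}{\sigma} \right)^2
&= \sum_{i=1}^{n} \left( \frac{(X_i - \mu) - (\bar X - \mu)}{\sigma} \right)^2 \\
&= \frac{1}{\sigma^2} \sum_{i=1}^{n} \left\{ (X_i - \mu)^2 - 2 (X_i - \mu)(\bar X - \mu) + (\bar X - \mu)^2 \right\} \\
&= \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 - 2 (\bar X - \mu) \frac{\sum_{i=1}^{n} (X_i - \mu)}{\sigma^2} + n \left( \frac{\bar X - \mu}{\sigma} \right)^2 \\
&= \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 - 2n \left( \frac{\bar X - \mu}{\sigma} \right)^2 + n \left( \frac{\bar X - \mu}{\sigma} \right)^2 \\
&= \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 - n \left( \frac{\bar X - \mu}{\sigma} \right)^2 \\
&= \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 - \left( \frac{\bar X - \mu}{\sigma/\sqrt{n}} \right)^2.
\end{aligned}
\]

Rem. If $X \sim \mathrm{N}(\mu, \sigma^2)$, then $\dfrac{X - \mu}{\sigma} \sim \mathrm{N}(0,1)$.

Rem. If $X_1, \dots, X_n \sim \mathrm{N}(\mu, \sigma^2)$ independently, then $\bar X \sim \mathrm{N}\left( \mu, \dfrac{\sigma^2}{n} \right)$.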
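The algebraic identity above holds for any data and any $\mu$, $\sigma$, so it can be checked numerically. The sketch below uses the egg weights from Example 1 with $\mu = 70$ and $\sigma^2 = 18$ purely as illustrative inputs.

```python
from math import isclose, sqrt

xs = [64.3, 70.4, 63.2, 67.8, 71.3, 60.8]  # egg weights [g] from Example 1
mu, sigma2 = 70.0, 18.0                     # illustrative values only
n = len(xs)
sigma = sqrt(sigma2)
x_bar = sum(xs) / n

# Left-hand side: sum of squared deviations from the sample mean, scaled by sigma.
lhs = sum(((x - x_bar) / sigma) ** 2 for x in xs)

# Right-hand side: n squared standardized terms minus one squared standardized mean.
rhs = sum(((x - mu) / sigma) ** 2 for x in xs) - ((x_bar - mu) / (sigma / sqrt(n))) ** 2

print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}")
```

The right-hand side has the form "sum of $n$ squared standardized terms minus one squared standardized term", which is the intuition behind the $n - 1$ degrees of freedom.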

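t-distribution and χ²-distribution [William Gosset]

Prop. 3.

Let $X \sim \mathrm{N}(0,1)$ and $Y \sim \mathrm{Ga}\left( \frac{1}{2}, \frac{n}{2} \right)$, independently.

Then $\dfrac{X}{\sqrt{Y/n}}$ follows the $t$-distribution with $n$ degrees of freedom, whose density is

\[ f(x) = \frac{\Gamma\left( \frac{n+1}{2} \right)}{\sqrt{n\pi}\, \Gamma\left( \frac{n}{2} \right)} \left( 1 + \frac{x^2}{n} \right)^{-\frac{n+1}{2}} \qquad (-\infty < x < \infty). \]

Recall that

\[
t = Z \frac{\sigma}{s}
= \frac{Z}{\sqrt{\dfrac{s^2}{\sigma^2}}}
= \frac{Z}{\sqrt{\dfrac{1}{\sigma^2} \cdot \dfrac{\sum_{i=1}^{n} (X_i - \bar X)^2}{n-1}}}
= \frac{Z}{\sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} \left( \dfrac{X_i - \bar X}{\sigma} \right)^2}}.
\]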

t-distribution and χ²-distribution [William Gosset]

\[
t = Z \frac{\sigma}{s}
= \frac{Z}{\sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} \left( \dfrac{X_i - \bar X}{\sigma} \right)^2}}
\]

Thm.

$t$ follows the $t$-distribution with $n - 1$ degrees of freedom, i.e.,

\[ f_t(x) = \frac{\Gamma\left( \frac{n}{2} \right)}{\sqrt{(n-1)\pi}\, \Gamma\left( \frac{n-1}{2} \right)} \left( 1 + \frac{x^2}{n-1} \right)^{-\frac{n}{2}} \qquad (-\infty < x < \infty). \]
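To see why $t$ should not be treated as N(0,1) for small $n$, the following sketch evaluates the $t$ density and integrates it numerically with Simpson's rule (a choice made here for illustration). With $n = 6$, the mass on $[-1.960, 1.960]$ falls noticeably short of the 95% that N(0,1) places there. The helper names `t_pdf` and `simpson` are hypothetical.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2.0) / (sqrt(df * pi) * gamma(df / 2.0))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2.0)

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

n = 6  # six eggs, so the statistic t has n - 1 = 5 degrees of freedom
inside_t = simpson(lambda x: t_pdf(x, n - 1), -1.960, 1.960)
inside_z = NormalDist().cdf(1.960) - NormalDist().cdf(-1.960)

print(f"t (5 d.f.): {inside_t:.3f}   N(0,1): {inside_z:.3f}")
```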

Student's t distribution

Wikipedia: Student’s t distribution

http://en.wikipedia.org/wiki/Student%27s_t-distribution

χ² distribution (χ²分布)

Wikipedia: Chi-squared distribution

http://en.wikipedia.org/wiki/Chi-squared_distribution

t-test (t検定): estimation of μ (expectation)


$t$-test ($t$検定)

$t$-test

Given samples $X_1 = a_1, \dots, X_n = a_n$.

Q: Does a value $b$ estimate $\mathrm{E}[X]$?

Since $t := \dfrac{\bar X - \mu}{s/\sqrt{n}}$ follows the $t$-distribution with $n - 1$ degrees of freedom,

\[
\begin{aligned}
\Pr[\text{null hypothesis: } \mathrm{E}[X] = b]
&= \Pr\left[\, |\bar X - b| \ge |\bar a - b| \;\middle|\; \mathrm{E}[X] = b \,\right] \\
&= \int_{-\infty}^{-\frac{|\bar a - b|}{\sqrt{s^2/n}}} f_t(x)\, \mathrm{d}x + \int_{\frac{|\bar a - b|}{\sqrt{s^2/n}}}^{\infty} f_t(x)\, \mathrm{d}x, \qquad (1)
\end{aligned}
\]

where $\bar a := \frac{1}{n} \sum_{i=1}^{n} a_i$ is the observed sample mean.

Claim

If (1) < $\alpha$, it rejects $\mathrm{E}[X] = b$.

If (1) ≥ $\alpha$, it fails to reject $\mathrm{E}[X] = b$.
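Probability (1) can be evaluated numerically. The sketch below is one possible implementation, not the lecture's own code: it integrates the $t$ density with Simpson's rule and uses the fact that the two tails together equal one minus the central mass. The function names are hypothetical, and the egg data from Example 1 with $b = 70$ serve as input.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2.0) / (sqrt(df * pi) * gamma(df / 2.0))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2.0)

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def t_test_p_value(samples, b):
    """Two-sided tail probability (1) for the null hypothesis E[X] = b."""
    n = len(samples)
    s2 = variance(samples)  # unbiased variance (divides by n - 1)
    t_obs = abs(mean(samples) - b) / sqrt(s2 / n)
    # Both tails together: 1 minus the central mass on [-t_obs, t_obs].
    return 1.0 - simpson(lambda x: t_pdf(x, n - 1), -t_obs, t_obs)

eggs = [64.3, 70.4, 63.2, 67.8, 71.3, 60.8]
alpha = 0.05
p_value = t_test_p_value(eggs, 70.0)
reject = p_value < alpha

print(f"p = {p_value:.3f}:", "reject" if reject else "fail to reject")
```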

Student's t-statistic (スチューデントのt統計量)

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Assume $\mu = 70.0$.

Let

\[ t := \frac{\bar X - \mu}{s/\sqrt{n}}, \qquad \text{where } s^2 := \frac{\sum_{i=1}^{n} (X_i - \bar X)^2}{n-1} \text{ (unbiased estimator of } \sigma^2 \text{)}, \]

in place of $Z_n := \dfrac{\bar X - \mu}{\sigma/\sqrt{n}}$ in the central limit theorem.

Then $t$ follows the $t$-distribution with $n - 1$ degrees of freedom:

\[ f_t(x) = \frac{\Gamma\left( \frac{n}{2} \right)}{\sqrt{(n-1)\pi}\, \Gamma\left( \frac{n-1}{2} \right)} \left( 1 + \frac{x^2}{n-1} \right)^{-\frac{n}{2}} \qquad (-\infty < x < \infty). \]

Statistical inference

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Assume $\mu = 70.0$.

Let $t^*$ (> 0) satisfy

\[ \Pr\left[ -t^* \le \frac{\bar X - \mu}{s/\sqrt{n}} \le t^* \right] = \int_{-t^*}^{t^*} f_t(x)\, \mathrm{d}x \ge 0.95, \]

and we see that $t^* = 2.571$ (see the $t$-distribution table, $n - 1 = 5$ degrees of freedom).

Statistical inference

Example 1 (continued)

A clerk says, "Our eggs are big: 70 g on average."

$\bar X = 66.3$ g, $s^2 = 17.584$ g² for 6 eggs.

Assume $\mu = 70.0$.

With $\bar X = 66.3$ g, $s^2 = 17.584$, $n = 6$, and $t^* = 2.571$:

\[ \left| \frac{\bar X - \mu}{\sqrt{s^2/n}} \right| = \left| \frac{66.3 - 70}{\sqrt{17.584/6}} \right| = 2.161 < t^* = 2.571. \]

It fails to reject the null hypothesis $\mu = 70.0$ at significance level 5%.

(帰無仮説 μ = 70.0 は有意水準5%で棄却されない.)
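The comparison with the tabulated critical value can be reproduced as follows. This is a minimal sketch: the value 2.571 is taken from the slide rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

weights = [64.3, 70.4, 63.2, 67.8, 71.3, 60.8]  # egg weights [g]
mu0 = 70.0      # null hypothesis: mu = 70.0
t_star = 2.571  # tabulated two-sided 5% point for 5 degrees of freedom (from the slide)

n = len(weights)
# t statistic with the unbiased variance s^2 in place of sigma^2.
t_obs = (mean(weights) - mu0) / sqrt(variance(weights) / n)
reject = abs(t_obs) > t_star

print(f"|t| = {abs(t_obs):.3f},", "reject" if reject else "fail to reject")
```

The statistic itself is negative because the sample mean is below 70; the two-sided test compares its absolute value with $t^*$.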

χ²-test (χ²検定): estimation of σ² (variance)

χ²-test (χ²検定)

χ²-test

Given samples $X_1 = a_1, \dots, X_n = a_n$.

Q: Does a value $c^2$ estimate $\mathrm{Var}[X]$?

Since $S := \dfrac{\sum_{i=1}^{n} (X_i - \bar X)^2}{\sigma^2}$ follows the $\chi^2$-distribution with $n - 1$ degrees of freedom,

\[
\Pr[\text{null hypothesis: } \mathrm{Var}[X] = c^2]
= \Pr\left[\, S \ge S_{\mathrm{obs}} \;\middle|\; \mathrm{Var}[X] = c^2 \,\right]
= \int_{S_{\mathrm{obs}}}^{\infty} f_{\chi^2}(x)\, \mathrm{d}x, \qquad (2)
\]

where $S_{\mathrm{obs}} := \dfrac{\sum_{i=1}^{n} (a_i - \bar a)^2}{c^2}$ is the observed value of $S$ under the null hypothesis.

Claim

If (2) < $\alpha$, it rejects $\mathrm{Var}[X] = c^2$.

If (2) ≥ $\alpha$, it fails to reject $\mathrm{Var}[X] = c^2$.
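One reading of (2) is the upper-tail probability of the $\chi^2$-distribution beyond the observed statistic. The sketch below computes it by numerical integration (Simpson's rule, chosen here for illustration); the helper names are hypothetical, and the inputs $S = 15.75$ with 9 degrees of freedom are illustrative values.

```python
from math import exp, gamma

def chi2_pdf(x, df):
    """Density of the chi-squared distribution with df degrees of freedom."""
    return x ** (df / 2.0 - 1.0) * exp(-x / 2.0) / (2.0 ** (df / 2.0) * gamma(df / 2.0))

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def chi2_upper_tail(s_obs, df):
    """Pr[S >= s_obs] for S following the chi-squared distribution with df degrees of freedom."""
    return 1.0 - simpson(lambda x: chi2_pdf(x, df), 0.0, s_obs)

alpha = 0.05
p_value = chi2_upper_tail(15.75, 9)
reject = p_value < alpha

print(f"p = {p_value:.3f}:", "reject" if reject else "fail to reject")
```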

χ² distribution (χ²分布)

Wikipedia: Chi-squared distribution

http://en.wikipedia.org/wiki/Chi-squared_distribution

[Figure label: reject]

χ²-test (χ²検定): Example

χ²-test

Suppose the sample variance of the weights of 10 balls is 0.35.

Is the variance no larger than the prescribed value 0.2?

Discuss at significance level 5%.

Null hypothesis (帰無仮説): $\mathrm{Var}[X] \le 0.2$

\[ S := \sum_{i=1}^{n} \frac{(X_i - \bar X)^2}{\sigma^2} = \frac{(n-1)\, s^2}{\sigma^2} = \frac{(10 - 1) \times 0.35}{0.2} = 15.75 < 16.919, \]

where 16.919 is the right 5% point of the $\chi^2$-distribution with 9 degrees of freedom.

Claim

It fails to reject the null hypothesis at significance level 5%.

(有意水準5%で帰無仮説は棄却されない)

χ²-test (χ²検定): Example

χ²-test

Suppose the sample variance of the weights of 100 balls is 0.26.

Is the variance no larger than the prescribed value 0.2?

Discuss at significance level 5%.

Null hypothesis (帰無仮説): $\mathrm{Var}[X] \le 0.2$

\[ S := \sum_{i=1}^{n} \frac{(X_i - \bar X)^2}{\sigma^2} = \frac{(n-1)\, s^2}{\sigma^2} = \frac{(100 - 1) \times 0.26}{0.2} = 128.7 > 124.34, \]

where 124.34 is the right 5% point of the $\chi^2$-distribution with 100 degrees of freedom; with $n - 1 = 99$ degrees of freedom the point is about 123.2, and the conclusion is the same.

Claim

It rejects the null hypothesis at significance level 5%.

(有意水準5%で帰無仮説は棄却される)
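The arithmetic of the two ball examples can be checked with a short sketch. The critical values are copied from the slides rather than computed, and the function name `chi2_statistic` is hypothetical.

```python
def chi2_statistic(n, s2, sigma2_0):
    """S = (n - 1) * s^2 / sigma_0^2 for sample size n and unbiased sample variance s2."""
    return (n - 1) * s2 / sigma2_0

# (n, sample variance, right 5% point quoted on the slide)
cases = [
    (10, 0.35, 16.919),
    (100, 0.26, 124.34),
]

results = []
for n, s2, critical in cases:
    s_obs = chi2_statistic(n, s2, 0.2)
    results.append((s_obs, s_obs > critical))
    print(f"n = {n}: S = {s_obs:.2f},", "reject" if s_obs > critical else "fail to reject")
```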

Statistical Hypothesis Testing

z-test: normal distribution

t-test: $t$ distribution, e.g., for an expectation

χ²-test: χ² distribution, e.g., for a variance

F-test: $F$ distribution, e.g., for a ratio of variances