Section 10.2

Page 1: Section 10.2

SECTION 10.2

Independence

Larson/Farber 4th ed 1

Page 2: Section 10.2

Section 10.2 Objectives

Use a contingency table to find expected frequencies

Use a chi-square distribution to test whether two variables are independent

Page 3: Section 10.2

Contingency Tables

r × c contingency table

Shows the observed frequencies for two variables.

The observed frequencies are arranged in r rows and c columns.

The intersection of a row and a column is called a cell.

Page 4: Section 10.2

Contingency Tables

Example: The contingency table shows the results of a random sample of 550 company CEOs classified by age and size of company. (Adapted from Grant Thornton LLP, The Segal Company)

                  Age
Company size    | 39 and under | 40 - 49 | 50 - 59 | 60 - 69 | 70 and over
Small / Midsize |           42 |      69 |     108 |      60 |          21
Large           |            5 |      18 |      85 |     120 |          22

Page 5: Section 10.2

Finding the Expected Frequency

Assuming the two variables are independent, you can use the contingency table to find the expected frequency for each cell.

The expected frequency for a cell E_{r,c} in a contingency table is

$$\text{Expected frequency } E_{r,c} = \frac{(\text{Sum of row } r) \times (\text{Sum of column } c)}{\text{Sample size}}$$
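A brief justification of this formula, sketched under the independence assumption. The symbols n (sample size), R_r (sum of row r) and C_c (sum of column c) are shorthand introduced here for illustration; they do not appear in the slides.

```latex
P(\text{row } r \text{ and column } c)
  = P(\text{row } r)\, P(\text{column } c)
  \approx \frac{R_r}{n} \cdot \frac{C_c}{n}
\qquad\Longrightarrow\qquad
E_{r,c} = n \cdot \frac{R_r}{n} \cdot \frac{C_c}{n} = \frac{R_r \, C_c}{n}
```

Multiplying the estimated cell probability by the sample size gives the expected count, which is the row-sum times column-sum over sample-size rule stated above.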

Page 6: Section 10.2

Example: Finding Expected Frequencies

Find the expected frequency for each cell in the contingency table. Assume that the variables, age and company size, are independent.

                  Age
Company size    | 39 and under | 40 - 49 | 50 - 59 | 60 - 69 | 70 and over | Total
Small / Midsize |           42 |      69 |     108 |      60 |          21 |   300
Large           |            5 |      18 |      85 |     120 |          22 |   250
Total           |           47 |      87 |     193 |     180 |          43 |   550

The row and column totals are the marginal totals.

Page 7: Section 10.2

Solution: Finding Expected Frequencies

$$E_{r,c} = \frac{(\text{Sum of row } r) \times (\text{Sum of column } c)}{\text{Sample size}}$$

$$E_{1,1} = \frac{300 \cdot 47}{550} \approx 25.64$$

Page 8: Section 10.2

Solution: Finding Expected Frequencies

$$E_{1,2} = \frac{300 \cdot 87}{550} \approx 47.45$$

$$E_{1,3} = \frac{300 \cdot 193}{550} \approx 105.27$$

$$E_{1,4} = \frac{300 \cdot 180}{550} \approx 98.18$$

$$E_{1,5} = \frac{300 \cdot 43}{550} \approx 23.45$$

Page 9: Section 10.2

Solution: Finding Expected Frequencies

$$E_{2,1} = \frac{250 \cdot 47}{550} \approx 21.36$$

$$E_{2,2} = \frac{250 \cdot 87}{550} \approx 39.55$$

$$E_{2,3} = \frac{250 \cdot 193}{550} \approx 87.73$$

$$E_{2,4} = \frac{250 \cdot 180}{550} \approx 81.82$$

$$E_{2,5} = \frac{250 \cdot 43}{550} \approx 19.55$$
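The ten calculations in this solution can be reproduced with a short Python sketch. The function name `expected_frequencies` and the variable names are illustrative choices, not part of the slides; the counts are the observed frequencies from the CEO table.

```python
# Observed counts from the slides: rows are company size, columns are age group.
observed = [
    [42, 69, 108, 60, 21],   # Small / Midsize
    [5, 18, 85, 120, 22],    # Large
]


def expected_frequencies(table):
    """Return the expected count for every cell, assuming independence."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    sample_size = sum(row_sums)
    # E[r][c] = (sum of row r) * (sum of column c) / sample size
    return [[r * c / sample_size for c in col_sums] for r in row_sums]


expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
```

Rounded to two decimal places, the printed rows should agree with the values worked out by hand in this solution.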

Page 10: Section 10.2

Chi-Square Independence Test

Chi-square independence test

Used to test the independence of two variables.

Can determine whether the occurrence of one variable affects the probability of the occurrence of the other variable.

Page 11: Section 10.2

Chi-Square Independence Test

For the chi-square independence test to be used, the following must be true.

1. The observed frequencies must be obtained by using a random sample.

2. Each expected frequency must be greater than or equal to 5.
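The second condition can be checked mechanically. The sketch below does so for the CEO example, using the expected frequencies as rounded on the earlier slides; the helper name `meets_expected_count_condition` is hypothetical. The first condition (random sampling) depends on how the data were collected and cannot be verified from the table itself.

```python
# Expected frequencies as rounded on the earlier slides.
expected = [
    [25.64, 47.45, 105.27, 98.18, 23.45],  # Small / Midsize
    [21.36, 39.55, 87.73, 81.82, 19.55],   # Large
]


def meets_expected_count_condition(expected_table, minimum=5):
    """Condition 2: every expected frequency is at least `minimum`."""
    return all(value >= minimum for row in expected_table for value in row)


smallest = min(value for row in expected for value in row)
print(smallest, meets_expected_count_condition(expected))
```

Here the smallest expected frequency is 19.55, so the condition holds for this table.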

Page 12: Section 10.2

Chi-Square Independence Test

If these conditions are satisfied, then the sampling distribution for the chi-square independence test is approximated by a chi-square distribution with (r – 1)(c – 1) degrees of freedom, where r and c are the number of rows and columns, respectively, of a contingency table.

The test statistic for the chi-square independence test is

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O represents the observed frequencies and E represents the expected frequencies.

The test is always a right-tailed test.

Page 13: Section 10.2

Chi-Square Independence Test

In Words | In Symbols

1. Identify the claim. State the null and alternative hypotheses. | State H0 and Ha.

2. Specify the level of significance. | Identify α.

3. Identify the degrees of freedom. | d.f. = (r – 1)(c – 1)

4. Determine the critical value. | Use Table 6 in Appendix B.

Page 14: Section 10.2

Chi-Square Independence Test

In Words | In Symbols

5. Determine the rejection region.

6. Calculate the test statistic. | $\chi^2 = \sum \frac{(O - E)^2}{E}$

7. Make a decision to reject or fail to reject the null hypothesis. | If χ2 is in the rejection region, reject H0. Otherwise, fail to reject H0.

8. Interpret the decision in the context of the original claim.

Page 15: Section 10.2

Example: Performing a χ2 Independence Test

Using the age/company size contingency table, can you conclude that the CEOs’ ages are related to company size? Use α = 0.01. Expected frequencies are shown in parentheses.

                  Age
Company size    | 39 and under | 40 - 49    | 50 - 59      | 60 - 69     | 70 and over | Total
Small / Midsize | 42 (25.64)   | 69 (47.45) | 108 (105.27) | 60 (98.18)  | 21 (23.45)  | 300
Large           | 5 (21.36)    | 18 (39.55) | 85 (87.73)   | 120 (81.82) | 22 (19.55)  | 250
Total           | 47           | 87         | 193          | 180         | 43          | 550

Page 16: Section 10.2

Solution: Performing a χ2 Independence Test

• H0: CEOs’ ages are independent of company size

• Ha: CEOs’ ages are dependent on company size

• α = 0.01

• d.f. = (2 – 1)(5 – 1) = 4

• Rejection Region: χ2 > 13.277 (right-tail area of 0.01)

• Test Statistic:

• Decision:

Page 17: Section 10.2

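Solution: Performing a χ2 Independence Test

$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(42 - 25.64)^2}{25.64} + \frac{(69 - 47.45)^2}{47.45} + \frac{(108 - 105.27)^2}{105.27} + \frac{(60 - 98.18)^2}{98.18} + \frac{(21 - 23.45)^2}{23.45} \\
&\quad + \frac{(5 - 21.36)^2}{21.36} + \frac{(18 - 39.55)^2}{39.55} + \frac{(85 - 87.73)^2}{87.73} + \frac{(120 - 81.82)^2}{81.82} + \frac{(22 - 19.55)^2}{19.55} \\
&\approx 77.9
\end{aligned}
$$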
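As a cross-check on this sum, the statistic can be computed directly from the observed counts. This is a minimal sketch; the function name `chi_square_statistic` is an illustrative choice. It uses unrounded expected frequencies, so individual terms differ slightly from the hand calculation, which uses values rounded to two decimals.

```python
# Observed counts from the slides: rows are company size, columns are age group.
observed = [
    [42, 69, 108, 60, 21],   # Small / Midsize
    [5, 18, 85, 120, 22],    # Large
]


def chi_square_statistic(table):
    """Sum (O - E)^2 / E over all cells, with E computed under independence."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    sample_size = sum(row_sums)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected_count = row_sums[i] * col_sums[j] / sample_size
            total += (observed_count - expected_count) ** 2 / expected_count
    return total


statistic = chi_square_statistic(observed)
print(round(statistic, 1))
```

Rounded to one decimal place, the result agrees with the value of 77.9 obtained by hand.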

Page 18: Section 10.2

Solution: Performing a χ2 Independence Test

• H0: CEOs’ ages are independent of company size

• Ha: CEOs’ ages are dependent on company size

• α = 0.01

• d.f. = (2 – 1)(5 – 1) = 4

• Rejection Region: χ2 > 13.277 (right-tail area of 0.01)

• Test Statistic: χ2 = 77.9

• Decision: Reject H0, because 77.9 > 13.277 places the test statistic in the rejection region.

There is enough evidence at the 1% level of significance to conclude CEOs’ ages are dependent on company size.
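The decision step can also be sketched in Python. The slides take the critical value from a printed table; the helper below instead uses a standard closed form for the right-tail area of a chi-square distribution with an even number of degrees of freedom, which is an addition for illustration and not the slides' method. The name `chi_square_right_tail` is hypothetical.

```python
import math


def chi_square_right_tail(x, df):
    """P(chi-square with df degrees of freedom > x); valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2
    # For df = 2m the tail area is exp(-x/2) * sum_{k<m} (x/2)^k / k!
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


df = (2 - 1) * (5 - 1)     # 4, as on the slide
critical_value = 13.277    # from the slide (Table 6 in Appendix B)
statistic = 77.9           # test statistic from the slide

# The area to the right of the critical value should be close to alpha = 0.01.
print(round(chi_square_right_tail(critical_value, df), 4))

# Right-tailed test: reject H0 when the statistic exceeds the critical value.
reject_h0 = statistic > critical_value
print(reject_h0)
```

The first printed value confirms that 13.277 cuts off a right-tail area of about 0.01 for 4 degrees of freedom, and the second reproduces the decision to reject H0.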

Page 19: Section 10.2

Section 10.2 Summary

Used a contingency table to find expected frequencies

Used a chi-square distribution to test whether two variables are independent