Post on 17-Dec-2015
Mar. 3 Statistic for the day: Average number of pieces of
mail that end up in the dead letter box each year: 57,100,000
Assignment:
Read Chapter 17
Exercises p. 309-311: 1, 4, 7, 10, 15
These slides were created by Tom Hettmansperger and in some cases modified by David Hunter
Shuffle two decks of cards.
Stack the two decks side-by-side, face down next to each other.
One by one, flip over one card from each deck.
I bet I see at least one match. Do you want to bet against me?
Probability of no match:
Probability of no match on 1st flip: 51/52
Probability of no match on 2nd flip: 51/52
…
Probability of no match on 52nd flip: 51/52
These events are NOT independent; however, they are APPROXIMATELY independent
because, say, whether a match occurs on the 36th flip doesn’t influence whether a match occurs on the 47th flip very strongly.
Thus, Pr(no match) ≈ (51/52)^52 ≈ .364
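The approximation can be checked numerically. The sketch below is not from the slides: it compares the independence approximation (51/52)^52 with the exact no-match probability from the standard inclusion-exclusion (derangement) formula, and with a small simulation. The function name `simulate_no_match`, the trial count, and the seed are illustrative choices.

```python
from fractions import Fraction
from math import factorial
import random

N_CARDS = 52

# Slide approximation: treat the 52 flips as if they were independent.
approx_no_match = (51 / 52) ** N_CARDS

# Exact value by inclusion-exclusion: sum over k of (-1)^k / k!, k = 0..52.
exact_no_match = float(
    sum(Fraction((-1) ** k, factorial(k)) for k in range(N_CARDS + 1))
)

def simulate_no_match(trials=20_000, seed=0):
    """Estimate Pr(no match) by repeatedly shuffling one deck against a fixed one."""
    rng = random.Random(seed)
    fixed = list(range(N_CARDS))
    no_match = 0
    for _ in range(trials):
        shuffled = fixed[:]
        rng.shuffle(shuffled)
        # A match means the same card sits at the same position in both decks.
        if all(a != b for a, b in zip(fixed, shuffled)):
            no_match += 1
    return no_match / trials

print(round(approx_no_match, 3))  # 0.364
print(round(exact_no_match, 3))   # 0.368
print(simulate_no_match())
```

Shuffling only one deck is enough here, because only the relative order of the two decks matters. The approximate and exact values agree to about two decimal places, which is why treating the flips as independent works well in this case.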
Efron Dice
Die   Side values
A     4 4 4 4 0 0
B     5 5 5 1 1 1
C     6 6 2 2 2 2
D     3 3 3 3 3 3
[Figure: 6 x 6 grid of outcomes, Die A faces (4, 4, 4, 4, 0, 0) against Die B faces (5, 5, 5, 1, 1, 1)]
Pr(B beats A) = (12 + 12)/36 = 24/36 = 2/3
[Figure: 6 x 6 grid of outcomes, Die B faces (5, 5, 5, 1, 1, 1) against Die C faces (6, 6, 2, 2, 2, 2)]
Pr(C beats B) = (18 + 6)/36 = 24/36 = 2/3
[Figure: 6 x 6 grid of outcomes, Die C faces (6, 6, 2, 2, 2, 2) against Die D faces (3, 3, 3, 3, 3, 3)]
Pr(D beats C) = 24/36 = 2/3
[Figure: 6 x 6 grid of outcomes, Die D faces (3, 3, 3, 3, 3, 3) against Die A faces (4, 4, 4, 4, 0, 0)]
Pr(A beats D) = 24/36 = 2/3
Die   Side values
A     4 4 4 4 0 0
B     5 5 5 1 1 1
C     6 6 2 2 2 2
D     3 3 3 3 3 3
Pr(B beats A) = 2/3
Pr(C beats B) = 2/3
Pr(D beats C) = 2/3
Pr(A beats D) = 2/3
Hence, there is NO best die! You can always pick a winner if you pick second.
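The four probabilities can be verified by listing all 36 equally likely pairs of faces for each matchup. This is a minimal sketch using the side values from the table of dice; the names `DICE` and `prob_beats` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Side values of the four dice, as listed in the table.
DICE = {
    "A": (4, 4, 4, 4, 0, 0),
    "B": (5, 5, 5, 1, 1, 1),
    "C": (6, 6, 2, 2, 2, 2),
    "D": (3, 3, 3, 3, 3, 3),
}

def prob_beats(first, second):
    """Probability that die `first` shows a higher value than die `second`."""
    pairs = list(product(DICE[first], DICE[second]))
    wins = sum(1 for x, y in pairs if x > y)
    return Fraction(wins, len(pairs))

for winner, loser in [("B", "A"), ("C", "B"), ("D", "C"), ("A", "D")]:
    print(f"Pr({winner} beats {loser}) = {prob_beats(winner, loser)}")
```

Each line prints 2/3, so the dice form a cycle: whichever die the first player picks, the second player can pick the die that beats it.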
Percent tables and count tables
A stratified population is one that is divided into mutually exclusive subgroups and the subgroups exhaust all members of the population.
Cancer testing: confusion of the inverse
Suppose we have a cancer test for a certain type of cancer.
Sensitivity of the test: If you have cancer then the probability of a positive test is .98. Pr(+ given you have C) = .98
Specificity of the test: If you do not have cancer then the probability of a negative test is .95. Pr(- given you do not have C) = .95
Base rate: The percent of the population who has the cancer. This is the probability that someone has C. Suppose for our example it is 1%. Hence, Pr(C) = .01.
Percent table
                    + (Positive)   - (Negative)   Base rate
C (Cancer)              .98            .02           .01
no C (no Cancer)        .05            .95           .99
Sensitivity = .98 (row C, column +); Specificity = .95 (row no C, column -); Base rate = .01
Suppose you go in for a test and it comes back positive. What is the probability that you have cancer?
False positive: Pr(+ given no C) = .05. False negative: Pr(- given C) = .02.
Count table from a percent table
Percent table:
           +      -     Base rate
C         .98    .02      .01
no C      .05    .95      .99
Count table (out of 10,000 people):
           +      -      Total
C          98      2       100
no C      495   9405     9,900
Total     593   9407    10,000
Pr(C given a + test) = 98/593 = .165
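The conversion from percent table to count table can be sketched in a few lines. This is an illustration, not the slides' own procedure: the function names `count_table` and `prob_c_given_positive` are made up, and the population of 10,000 is the same arbitrary size used in the count table.

```python
def count_table(sensitivity, specificity, base_rate, population=10_000):
    """Turn the percent table into counts for a hypothetical population."""
    with_c = round(population * base_rate)      # people who have cancer
    without_c = population - with_c             # people who do not
    true_pos = round(with_c * sensitivity)      # C and +
    false_neg = with_c - true_pos               # C and -
    true_neg = round(without_c * specificity)   # no C and -
    false_pos = without_c - true_neg            # no C and +
    return true_pos, false_neg, false_pos, true_neg

def prob_c_given_positive(sensitivity, specificity, base_rate):
    """Pr(C given +): cancer cases among positives, divided by all positives."""
    true_pos, _, false_pos, _ = count_table(sensitivity, specificity, base_rate)
    return true_pos / (true_pos + false_pos)

print(count_table(0.98, 0.95, 0.01))                      # (98, 2, 495, 9405)
print(round(prob_c_given_positive(0.98, 0.95, 0.01), 3))  # 0.165
```

Even though Pr(+ given C) is .98, Pr(C given +) is only about .165, because the 495 false positives among the 9,900 people without cancer far outnumber the 98 true positives.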
Do you have a tattoo?
What is the probability that a randomly chosen person from the class will say yes?
Need a count table to estimate the probabilities:
Rows: Sex   Columns: Tattoo
           No    Yes    All
Female    105     31    136
Male       85     15    100
All       190     46    236
Percent table:
Rows: Sex   Columns: Tattoo
             No      Yes      All
Female    77.21    22.79   100.00
Male      85.00    15.00   100.00
All       80.51    19.49   100.00
Pr(yes) = 46/236 = .1949
Pr(yes given the person is a female) = .2279
Pr(yes given the person is a male) = .1500
Are the events ‘yes’ and ‘female’ independent?
Pr(no given the person is female) = .7721
Pr(no given the person is a male) = .8500
Suppose I tell you that a stat100 student came into office hours and they said that they did not have a tattoo.
Which is more likely:
• The student was female.
• The student was male.
Rows: Sex   Columns: Tattoo
           No    Yes    All
Female    105     31    136
Male       85     15    100
All       190     46    236
Pr(female given the student said no) = 105/190 = .553
Pr(male given the student said no) = 85/190 = .447
More likely that the student is a female!
Rows: Sex   Columns: Tattoo
           No    Yes    All
Female    105     31    136
Male       85     15    100
All       190     46    236
Pr(yes) = 46/236 = .195
Pr(no) = 190/236 = .805
Pr(female) = 136/236 = .576
Pr(male) = 100/236 = .424
Pr(yes given the student is a female) = 31/136 = .228
Pr(yes given the student is a male) = 15/100 = .150
Pr(no given the student is a female) = 105/136 = .772
Pr(no given the student is a male) = 85/100 = .850
Pr(female given the student said yes) = 31/46 = .674
Pr(male given the student said yes) = 15/46 = .326
Pr(female given the student said no) = 105/190 = .553
Pr(male given the student said no) = 85/190 = .447
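All of these probabilities come from the same four counts, divided by different totals. The sketch below makes that explicit; the dictionary `COUNTS` holds the class data from the count table, and the three helper functions are illustrative names, not part of the course materials.

```python
# Tattoo survey counts from the count table (rows: sex, columns: answer).
COUNTS = {
    "Female": {"No": 105, "Yes": 31},
    "Male": {"No": 85, "Yes": 15},
}

TOTAL = sum(sum(row.values()) for row in COUNTS.values())  # 236 students

def pr_answer(answer):
    """Pr(answer): column total divided by the grand total."""
    return sum(row[answer] for row in COUNTS.values()) / TOTAL

def pr_answer_given_sex(answer, sex):
    """Pr(answer given sex): cell count divided by the row total."""
    return COUNTS[sex][answer] / sum(COUNTS[sex].values())

def pr_sex_given_answer(sex, answer):
    """Pr(sex given answer): cell count divided by the column total."""
    return COUNTS[sex][answer] / sum(row[answer] for row in COUNTS.values())

print(round(pr_answer("Yes"), 3))                      # 0.195
print(round(pr_answer_given_sex("Yes", "Female"), 3))  # 0.228
print(round(pr_sex_given_answer("Female", "No"), 3))   # 0.553
```

Comparing `pr_answer("Yes")` with `pr_answer_given_sex("Yes", "Female")` is one way to examine independence: if 'yes' and 'female' were independent, the two values would be equal, and in this sample they are .195 and .228.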
The count table gives the ability to calculate everything. If you have a percent table, you should create a count table.
Rows: Sex   Columns: Tattoo
             No      Yes      All
Female    77.21    22.79   100.00
Male      85.00    15.00   100.00
All       80.51    19.49   100.00
Note: It’s not always possible to reconstruct a representative count table. In the above, you can’t do it unless you also know the percentage of females.
Also, 57.63% are females. Leads to:
            N       Y     Total
F        4450    1313      5763
M        3601     636      4237
Total                     10000 (arbitrary)
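The reconstruction can be written out step by step: choose an arbitrary total, split it by the percentage of females, then apply each row's percentages. The following is a sketch under those assumptions; `reconstruct_counts` is an illustrative name, and rounding to whole people is a choice made here.

```python
# Row percentages from the percent table, plus the extra fact that 57.63% are female.
ROW_PERCENTS = {
    "Female": {"No": 77.21, "Yes": 22.79},
    "Male": {"No": 85.00, "Yes": 15.00},
}
PERCENT_FEMALE = 57.63

def reconstruct_counts(row_percents, percent_female, total=10_000):
    """Rebuild a representative count table from row percentages and one marginal."""
    n_female = round(total * percent_female / 100)
    row_totals = {"Female": n_female, "Male": total - n_female}
    counts = {}
    for sex, n in row_totals.items():
        no = round(n * row_percents[sex]["No"] / 100)
        # Take Yes as the remainder so each row adds up to its total exactly.
        counts[sex] = {"No": no, "Yes": n - no, "All": n}
    return counts

for sex, row in reconstruct_counts(ROW_PERCENTS, PERCENT_FEMALE).items():
    print(sex, row)
```

With a total of 10,000 this gives 4450 No and 1313 Yes among 5763 females, and 3601 No and 636 Yes among 4237 males. As a consistency check, the column totals 8051 and 1949 match the All row of the percent table (80.51% and 19.49%).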