Lecture 4: The binomial distribution - Oxford Statistics
Lecture 4: The binomial distribution
4th of November 2015
Lecture 4: The binomial distribution 4th of November 2015 1 / 26
Combination and permutation (Recap)
Consider 7 students applying to a college for 3 places:
Abi Ben Claire Dave Emma Frank Gail
How many ways are there of choosing 3 students from 7 when
1. order is important (permutation), i.e. (Abi, Dave) ≠ (Dave, Abi)
2. order is not important (combination), i.e. (Abi, Dave) = (Dave, Abi)
Permutations
There are 3 places to fill at the college and 7 students applied.
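There will be 7 × 6 × 5 permutations:
$$7 \times 6 \times 5 = \frac{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{4 \times 3 \times 2 \times 1} = \frac{7!}{4!} = {}^{7}P_{3}.$$
In general, the number of permutations of r objects from n is
$${}^{n}P_{r} = \frac{n!}{(n-r)!}.$$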
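The count of ordered selections can be checked by brute force. The following is a minimal sketch in Python, not part of the original slides; the variable names are illustrative, and `math.perm` assumes Python 3.8 or later.

```python
import math
from itertools import permutations

students = ["Abi", "Ben", "Claire", "Dave", "Emma", "Frank", "Gail"]

# Enumerate every ordered selection of 3 students out of 7.
ordered_selections = list(permutations(students, 3))

# Closed form: n! / (n - r)!
n, r = len(students), 3
by_formula = math.factorial(n) // math.factorial(n - r)

print(len(ordered_selections))  # 210
print(by_formula)               # 210
print(math.perm(n, r))          # 210, built-in equivalent (Python 3.8+)
```

Enumeration and the formula agree because 7 × 6 × 5 = 210.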
Combinations
For every combination of 3 students, there will be 3! = 6 permutations. For example, the combination Dave, Claire and Abi gives rise to the permutations
ACD ADC CAD CDA DAC DCA
If we are not interested in the order, the number of combinations of 3 students among 7 is then
$${}^{7}C_{3} = \frac{{}^{7}P_{3}}{3!}.$$
In general, the number of combinations of r objects from n is
$${}^{n}C_{r} = \frac{{}^{n}P_{r}}{r!} = \frac{n!}{(n-r)!\,r!}.$$
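The relationship between ordered and unordered selections can also be checked by enumeration. This is a small illustrative sketch in Python (the names are hypothetical; `math.comb` assumes Python 3.8 or later).

```python
import math
from itertools import combinations, permutations

students = ["Abi", "Ben", "Claire", "Dave", "Emma", "Frank", "Gail"]

# Unordered selections of 3 students out of 7.
groups = list(combinations(students, 3))

# Each unordered group corresponds to 3! = 6 ordered selections.
orderings_of_one_group = list(permutations(("Abi", "Claire", "Dave")))

n, r = len(students), 3
by_formula = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

print(len(groups))                  # 35
print(len(orderings_of_one_group))  # 6
print(by_formula, math.comb(n, r))  # 35 35
```

Dividing the 210 ordered selections by the 6 orderings of each group gives the 35 combinations.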
An example of the Binomial distribution
An unfair coin: P (Head) = 2/3 and P (Tail) = 1/3
Let X = No. of heads observed in 5 coin tosses
X can take on any of the values 0, 1, 2, 3, 4, 5
X is a discrete random variable
Some values of X will be more likely to occur than others. Each value of X will have a probability of occurring. What are these probabilities?
What is P (X = 1)?
One possible way of observing exactly one Head is the pattern
HTTTT.
The probability of obtaining this pattern is
$$P(\text{HTTTT}) = \frac{2}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3}$$
There are 32 possible patterns of Heads and Tails we might observe.
HHHHH
THHHH HTHHH HHTHH HHHTH HHHHT
TTHHH THTHH THHTH THHHT HTTHH HTHTH HTHHT HHTTH HHTHT HHHTT
TTTHH TTHTH TTHHT THTTH THTHT THHTT HTTTH HTTHT HTHTT HHTTT
HTTTT THTTT TTHTT TTTHT TTTTH
TTTTT
Five of the patterns contain just one Head.
Each of these 5 patterns has the same probability as HTTTT, so the probability of obtaining one Head in 5 coin tosses is
$$P(X = 1) = 5 \times \left(\frac{2}{3} \times \left(\frac{1}{3}\right)^{4}\right) \approx 0.0412$$
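The counting argument can be reproduced by listing all the patterns and adding up the probabilities of those with exactly one Head. The sketch below is illustrative Python, not from the slides; it uses exact fractions, and the helper name `pattern_probability` is an assumption made for this example.

```python
from fractions import Fraction
from itertools import product

p_head = Fraction(2, 3)
p_tail = 1 - p_head

def pattern_probability(pattern):
    """Probability of one specific sequence of independent tosses."""
    prob = Fraction(1)
    for toss in pattern:
        prob *= p_head if toss == "H" else p_tail
    return prob

# All 2**5 = 32 sequences of H and T.
patterns = ["".join(seq) for seq in product("HT", repeat=5)]

one_head = [pat for pat in patterns if pat.count("H") == 1]
p_one_head = sum(pattern_probability(pat) for pat in one_head)

print(len(patterns))      # 32
print(one_head)           # ['HTTTT', 'THTTT', 'TTHTT', 'TTTHT', 'TTTTH']
print(p_one_head)         # 10/243
print(float(p_one_head))  # roughly 0.0412
```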
What about P (X = 2)?
This probability can be written as
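$$P(X = 2) = \text{No. of patterns} \times \text{Probability of pattern} = {}^{5}C_{2} \times \left(\frac{2}{3}\right)^{2} \times \left(\frac{1}{3}\right)^{3} = 10 \times \frac{4}{243} \approx 0.165$$
In general, the probability of observing x Heads (and 5 − x Tails) is
$$P(X = x) = {}^{5}C_{x} \times \left(\frac{2}{3}\right)^{x} \times \left(\frac{1}{3}\right)^{5-x}$$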
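We can use this formula to tabulate the probabilities of each possible value of X.
$$\begin{aligned}
P(X = 0) &= {}^{5}C_{0} \times \left(\tfrac{2}{3}\right)^{0} \times \left(\tfrac{1}{3}\right)^{5} \approx 0.0041 \\
P(X = 1) &= {}^{5}C_{1} \times \left(\tfrac{2}{3}\right)^{1} \times \left(\tfrac{1}{3}\right)^{4} \approx 0.0412 \\
P(X = 2) &= {}^{5}C_{2} \times \left(\tfrac{2}{3}\right)^{2} \times \left(\tfrac{1}{3}\right)^{3} \approx 0.1646 \\
P(X = 3) &= {}^{5}C_{3} \times \left(\tfrac{2}{3}\right)^{3} \times \left(\tfrac{1}{3}\right)^{2} \approx 0.3292 \\
P(X = 4) &= {}^{5}C_{4} \times \left(\tfrac{2}{3}\right)^{4} \times \left(\tfrac{1}{3}\right)^{1} \approx 0.3292 \\
P(X = 5) &= {}^{5}C_{5} \times \left(\tfrac{2}{3}\right)^{5} \times \left(\tfrac{1}{3}\right)^{0} \approx 0.1317
\end{aligned}$$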
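The same table can be produced programmatically. This is a minimal sketch in Python, assuming exact fractions are wanted alongside the rounded decimals; the dictionary name `pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

n = 5
p = Fraction(2, 3)

# P(X = x) = nCx * p^x * (1 - p)^(n - x) for x = 0, ..., n
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob} ~ {float(prob):.4f}")

print(sum(pmf.values()))  # 1, the probabilities cover every outcome
```

Every probability is a multiple of 1/243, which is why X = 3 and X = 4 share the value 80/243.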
Distribution of probabilities across the possible values of X.
[Figure: bar chart of P(X) against X for X = 0 to 5; vertical axis from 0.0 to 0.5]
This situation is a specific example of a Binomial distribution.
Key components of the binomial distribution
In general a Binomial distribution arises when we have the following 4 conditions:
- n identical trials, e.g. 5 coin tosses
- 2 possible outcomes for each trial, “success” and “failure”, e.g. Heads or Tails
- Trials are independent, e.g. each coin toss doesn’t affect the others
- P(“success”) = p is the same for each trial, e.g. P(Head) = 2/3 is the same for each trial
The binomial distribution
If we have the above 4 conditions then if we let
X = No. of “successes”
then the probability of observing x successes out of n trials is given by
$$P(X = x) = {}^{n}C_{x}\, p^{x} (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n$$
If the probabilities of X are distributed in this way, we write
$$X \sim \text{Bin}(n, p)$$
n and p are called the parameters of the distribution. We say X follows a binomial distribution with parameters n and p.
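One way to see the formula at work is to simulate the four conditions directly and compare the observed frequencies with the formula. The sketch below is an illustration in Python rather than anything from the lecture; the seed, the number of repeats, and the function names are arbitrary choices.

```python
import random
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def simulate_successes(n, p, rng):
    """Count successes in n independent trials, each succeeding with probability p."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(2015)   # fixed seed so the run is repeatable
n, p, repeats = 5, 2 / 3, 100_000

counts = [0] * (n + 1)
for _ in range(repeats):
    counts[simulate_successes(n, p, rng)] += 1

for x in range(n + 1):
    print(f"x={x}  simulated={counts[x] / repeats:.4f}  formula={binomial_pmf(x, n, p):.4f}")
```

With this many repeats the simulated proportions should sit close to the formula values, though they will not match exactly.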
Example 1
Suppose X ∼ Bin(10, 0.4), what is P(X = 7)?
Here we have: n = 10, p = 0.4, x = 7,
$$P(X = 7) = {}^{10}C_{7}\,(0.4)^{7}(1 - 0.4)^{10-7} = (120)(0.4)^{7}(0.6)^{3} \approx 0.0425$$
Example 2
Suppose Y ∼ Bin(8, 0.15), what is P(Y < 3)?
Here we have: n = 8, p = 0.15,
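$$\begin{aligned}
P(Y < 3) &= P(Y = 0) + P(Y = 1) + P(Y = 2) \\
&= {}^{8}C_{0}(0.15)^{0}(0.85)^{8} + {}^{8}C_{1}(0.15)^{1}(0.85)^{7} + {}^{8}C_{2}(0.15)^{2}(0.85)^{6} \\
&\approx 0.2725 + 0.3847 + 0.2376 \\
&\approx 0.8948
\end{aligned}$$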
Note that 1− p = 0.85.
Example 3
Suppose W ∼ Bin(50, 0.12), what is P(W > 2)?
Here we have: n = 50, p = 0.12,
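$$\begin{aligned}
P(W > 2) &= P(W = 3) + P(W = 4) + \ldots + P(W = 50) \\
&= 1 - P(W \le 2) \\
&= 1 - \big(P(W = 0) + P(W = 1) + P(W = 2)\big) \\
&= 1 - \big({}^{50}C_{0}(0.12)^{0}(0.88)^{50} + {}^{50}C_{1}(0.12)^{1}(0.88)^{49} + {}^{50}C_{2}(0.12)^{2}(0.88)^{48}\big) \\
&\approx 1 - (0.00168 + 0.01142 + 0.03817) \\
&\approx 0.94874
\end{aligned}$$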
Note that 1− p = 0.88.
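The three examples can be checked numerically. The following Python sketch is illustrative only; `binomial_pmf` and `binomial_cdf` are helper names introduced here, not functions from the lecture.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x), summing the probability mass from 0 up to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

# Example 1: X ~ Bin(10, 0.4), P(X = 7)
example_1 = binomial_pmf(7, 10, 0.4)

# Example 2: Y ~ Bin(8, 0.15), P(Y < 3) = P(Y <= 2)
example_2 = binomial_cdf(2, 8, 0.15)

# Example 3: W ~ Bin(50, 0.12), P(W > 2) = 1 - P(W <= 2)
example_3 = 1 - binomial_cdf(2, 50, 0.12)

# The long way round for Example 3: add up P(W = 3), ..., P(W = 50).
example_3_direct = sum(binomial_pmf(k, 50, 0.12) for k in range(3, 51))

print(round(example_1, 4))  # 0.0425
print(round(example_2, 4))  # 0.8948
print(round(example_3, 4))  # 0.9487
```

The complement in Example 3 needs 3 terms instead of 48, and both routes give the same answer up to floating-point error.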
Different values of n and p lead to different distributions with different shapes:
[Figure: three bar charts of P(X) against X, for X from 0 to 10 with vertical axis from 0.0 to 0.5. Panels: n=10 p=0.5; n=10 p=0.1; n=10 p=0.7]
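The shapes for these three parameter settings can be redrawn as rough text bar charts. This is a sketch in Python with hypothetical helper names; the bar width of 50 characters is an arbitrary choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def text_bar_chart(n, p, width=50):
    """Return one line per value of x, with a bar proportional to P(X = x)."""
    lines = [f"n={n} p={p}"]
    for x in range(n + 1):
        prob = binomial_pmf(x, n, p)
        lines.append(f"{x:2d} {prob:.3f} " + "#" * round(prob * width))
    return "\n".join(lines)

def most_likely_value(n, p):
    """The value of x with the largest probability."""
    return max(range(n + 1), key=lambda x: binomial_pmf(x, n, p))

for p in (0.5, 0.1, 0.7):
    print(text_bar_chart(10, p))
    print()
```

With p = 0.5 the bars are symmetric around 5. With p = 0.1 the mass piles up near 0, and with p = 0.7 it shifts towards the higher values.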
Expected mean and expected standard deviation
We have seen in the first lecture that the sample mean and standard deviation can be used to summarize the shape of a dataset.
In the case of a probability distribution we have no data as such so we must use the probabilities to calculate the expected mean and standard deviation.
Example: X ∼ Bin(5, 2/3)
Consider the example of the Binomial distribution we saw above
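x          0      1      2      3      4      5
P(X = x)   0.004  0.041  0.165  0.329  0.329  0.132
The expected mean value of the distribution, denoted µ, can be calculated as
$$\mu = 0 \times (0.004) + 1 \times (0.041) + 2 \times (0.165) + 3 \times (0.329) + 4 \times (0.329) + 5 \times (0.132) \approx 3.333$$
(The rounded probabilities in the table give 3.334; the unrounded probabilities give 10/3 ≈ 3.333.)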
Expected mean and expected standard deviation
In general, there is a formula for the mean of a Binomial distribution. There is also a formula for the standard deviation, σ.
If X ∼ Bin(n, p) then
$$\mu = np$$
$$\sigma = \sqrt{npq} \quad \text{where } q = 1 - p$$
In the example above, X ∼ Bin(5, 2/3) and so the mean and standard deviation are given by
$$\mu = np = 5 \times (2/3) \approx 3.333$$
and
$$\sigma = \sqrt{npq} = \sqrt{5 \times (2/3) \times (1/3)} = \sqrt{1.111} \approx 1.054$$
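The formulas for µ and σ can be compared with a direct calculation from the probabilities. The sketch below is illustrative Python; it assumes the usual definition of the standard deviation of a discrete distribution (the square root of the probability-weighted average of squared deviations from the mean), which the slides do not spell out.

```python
from math import comb, isclose, sqrt

n, p = 5, 2 / 3
q = 1 - p

pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

# Mean: probability-weighted average of the possible values.
mean = sum(x * prob for x, prob in enumerate(pmf))

# Standard deviation: square root of the probability-weighted
# average squared distance from the mean.
variance = sum((x - mean) ** 2 * prob for x, prob in enumerate(pmf))
std_dev = sqrt(variance)

print(round(mean, 3), round(n * p, 3))               # 3.333 3.333
print(round(std_dev, 3), round(sqrt(n * p * q), 3))  # 1.054 1.054
```

Here npq ≈ 1.111 is the variance, and its square root, about 1.054, is the standard deviation.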
Testing a hypothesis using the Binomial distribution: an example
Consider the following simple situation:
You have a six-sided die, and you have the impression that it’s somehow been weighted so that the number 1 comes up more frequently than it should.
How would you decide whether this impression is correct?
You could do a careful experiment, where you roll the die 60 times, and count how often the 1 comes up.
Suppose you do the experiment, and the 1 comes up 30 times (and the other numbers come up 30 times altogether).
If the die is unbiased, you expect the 1 to come up one time in six, i.e. 10 times. Therefore 30 times seems high. But is it too high?
There are two possible hypotheses:
1. The die is biased.
2. Just by chance we got more 1’s than expected.
How do we decide between these possibilities?
Perform a hypothesis test.
Hypothesis: The die is fair. All 6 outcomes have the same probability.
Experiment: We roll the die 60 times.
Sample: We obtain 60 outcomes and the 1 comes out 30 times.
Assuming our hypothesis is true, the experiment we carried out satisfies the conditions of the Binomial distribution:
- n identical trials, i.e. 60 die rolls.
- 2 possible outcomes for each trial: “1” and “not 1”.
- Trials are independent.
- P(“success”) = 1/6 is the same for each trial.
We define X = No. of 1’s that come up.
We observed X = 30.
We can calculate the probability of observing X = 30 if our hypothesis is true, i.e. if X ∼ Bin(60, 1/6):
$$P(X = 30) = {}^{60}C_{30} \left(\frac{1}{6}\right)^{30} \left(\frac{5}{6}\right)^{60-30} \approx 2.25 \times 10^{-9}.$$
Conclusion: Under the hypothesis that the die is fair, the probability that the 1 comes up exactly 30 times in this experiment is very low. Therefore we may conclude that the die is biased.
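The size of this probability can be checked with exact arithmetic. The Python sketch below is illustrative; in addition to P(X = 30) it also sums the outcomes at least as extreme as the one observed (30 or more 1's), which is an extra step assumed here and not part of the worked example.

```python
from fractions import Fraction
from math import comb

n, observed = 60, 30
p = Fraction(1, 6)   # probability of a 1 if the die is fair

def binomial_pmf(x):
    """P(X = x) for X ~ Bin(60, 1/6), kept as an exact fraction."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_exactly_30 = binomial_pmf(observed)

# Assumption beyond the worked example: also add up the outcomes that are
# at least as extreme as the one observed, i.e. 30 or more 1's.
p_30_or_more = sum(binomial_pmf(x) for x in range(observed, n + 1))

print(f"P(X = 30)  = {float(p_exactly_30):.3g}")   # about 2.25e-09
print(f"P(X >= 30) = {float(p_30_or_more):.3g}")
```

Both quantities are of the order of 10^-9, so the conclusion is the same whichever one is used.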
Hypothesis testing
Now we summarise the general approach:
- posit a hypothesis
- design and carry out an experiment to collect a sample of data
- test to see if the sample is consistent with the hypothesis
Testing the hypothesis: Assuming our hypothesis is true, what is the probability that we would have observed such a sample or a sample more extreme, i.e. is our sample quite unlikely to have occurred under the assumptions of our hypothesis?
Example: Drug efficiency
Until recently an average of 60 out of 100 patients have survived a particular severe infection.
When a new drug was administered to 15 patients with the infection, 12 of them survived.
Does this provide evidence that the new drug is effective?
Hypothesis: The drug is not effective, i.e. the probability of surviving is still p = 0.6.
Experiment: We test the drug on 15 patients with the infection.
Sample: 12 patients survived.
Let X denote the number of patients who survived.
Under our hypothesis, X ∼ Bin(15, 0.6).
We compute the probability that we would have observed such a sample assuming our hypothesis is true:
$$P(X = 12) = {}^{15}C_{12}\,(0.6)^{12}(0.4)^{15-12} \approx 0.063.$$
There is more than a 6% chance of observing such a number of surviving patients if the drug is not effective. Therefore it may be just by chance that we observe such a number of patients who survived.
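The calculation for the drug example can be reproduced in a few lines. This Python sketch is illustrative; the `upper_tail` helper, which adds up the observed count and anything more extreme (12 or more survivors), is an assumption introduced here in the spirit of the general approach, not a step taken on the slide.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def upper_tail(x, n, p):
    """P(X >= x): the observed count or anything more extreme."""
    return sum(binomial_pmf(k, n, p) for k in range(x, n + 1))

n, p_null, survivors = 15, 0.6, 12

p_exactly_12 = binomial_pmf(survivors, n, p_null)
p_12_or_more = upper_tail(survivors, n, p_null)

print(round(p_exactly_12, 3))  # 0.063
print(round(p_12_or_more, 4))  # 0.0905
```

Either way the probability is well above what would usually be called very unlikely, which agrees with the cautious conclusion above.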