Chapter 7: Theoretical Probability Distributions
BSTT523: Pagano & Gauvreau, Chapter 7
Variable: a measured or categorized characteristic.
Random variable (r.v.) X: a variable that assumes its values (x) by chance.
Discrete r.v.: can assume only a finite or countably infinite number of values.
Continuous r.v.: can assume any value within an interval.

Outline:
1. Discrete Random Variables
2. Some Discrete Distributions
   Bernoulli distribution
   Binomial distribution
   Poisson distribution
3. Continuous Random Variables
4. The Normal/Gaussian Distribution
1. Discrete Random Variables

Definition: The probability distribution of a discrete r.v. X is a table, graph, formula, or other device that specifies all possible values of X and their respective probabilities.

Example 7.1 (table form): Birth order of children in the U.S.

x: Birth Order    $P(X = x)$
1                 0.416
2                 0.330
3                 0.158
4                 0.058
5                 0.021
6                 0.009
7                 0.004
8+                0.004
Total             1.000
Discrete Probability Density Function (Discrete PDF, often called the probability mass function): $P(x) = P(X = x)$

Properties of the discrete PDF:
i. $0 \le P(x_i) \le 1$ for all $x_i$ (non-negative)
ii. $\sum_{\text{all } x_i} P(x_i) = 1$ (exhaustive)
iii. $P(X = x_i \cup X = x_j) = P(x_i) + P(x_j)$ for $i \ne j$ (additive)

Cumulative Distribution Function (CDF):
$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i) = \sum_{x_i \le x} P(x_i)$$

Note, for ordered values $x_1 < x_2 < \dots$:
i. PDF: $P(x_i) = P(X = x_i) = F(x_i) - F(x_{i-1})$
ii. $P(X < x_i) = F(x_{i-1})$
iii. $P(a < X \le b) = F(b) - F(a)$
Example 7.1: Birth Order

x: Birth Order    PDF $P(x) = P(X = x)$    CDF $F(x) = P(X \le x)$
1                 0.416                    0.416
2                 0.330                    0.746
3                 0.158                    0.904
4                 0.058                    0.962
5                 0.021                    0.983
6                 0.009                    0.992
7                 0.004                    0.996
8+                0.004                    1.000

Q1. Prob. that a child picked at random was the mother's 1st or 2nd child?
Q2. Prob. that a child picked at random was of birth order lower than 4?
Q3. Prob. that a child picked at random was of order 5 or more?
Q4. Prob. that a child picked at random was of order between 3 and 5?
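The notes leave Q1 to Q4 unanswered. A minimal Python sketch of how the PDF and CDF columns answer them is given here; it assumes that "between 3 and 5" in Q4 includes both endpoints, and it uses the key 8 as a stand-in for the "8+" category.

```python
from itertools import accumulate

# Birth-order PDF from Example 7.1; the key 8 stands for the "8+" category.
pdf = {1: 0.416, 2: 0.330, 3: 0.158, 4: 0.058,
       5: 0.021, 6: 0.009, 7: 0.004, 8: 0.004}

# CDF: running sum of the PDF over the ordered values of x.
cdf = dict(zip(pdf, accumulate(pdf.values())))

q1 = pdf[1] + pdf[2]   # P(X = 1 or X = 2), by the additive property
q2 = cdf[3]            # P(X < 4) = F(3)
q3 = 1 - cdf[4]        # P(X >= 5) = 1 - F(4)
q4 = cdf[5] - cdf[2]   # P(3 <= X <= 5) = F(5) - F(2), assuming inclusive endpoints

print(round(q1, 3), round(q2, 3), round(q3, 3), round(q4, 3))
# 0.746 0.904 0.038 0.237
```

If Q4 is instead read as strictly between 3 and 5, the answer is simply `pdf[4]`.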
2. Some Discrete Distributions

Bernoulli Distribution

Bernoulli variable: a binary variable
$$X = \begin{cases} 1, & \text{success} \\ 0, & \text{failure} \end{cases}$$

Bernoulli trial: one performance of an experiment with a 0/1 outcome.

Denote $p = P(X = 1)$ and $q = P(X = 0) = 1 - p$.

The PDF of the Bernoulli distribution is
$$P(x) = \begin{cases} p & \text{if } x = 1 \\ q & \text{if } x = 0 \end{cases}$$
$$= p^x q^{1-x}, \quad x = 0, 1$$
$$= p^x (1 - p)^{1-x}, \quad x = 0, 1$$

The Bernoulli distribution has one parameter, $p$.
If X follows a Bernoulli distribution, then
Mean: $\mu = E(X) = p$
Variance: $\sigma^2 = Var(X) = pq = p(1 - p)$

Examples of Bernoulli variables:
Ex. 1: Flip a coin. $X = 1$ if heads, $X = 0$ if tails.
Ex. 2: Roll a die, interested in 3's. $X = 1$ if the die lands on 3, $X = 0$ otherwise.
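The Bernoulli mean and variance are stated without derivation. A short worked derivation, using only the $p$ and $q$ already defined and the usual identity $Var(X) = E(X^2) - [E(X)]^2$, might look like this:

```latex
\begin{align*}
E(X)   &= 0 \cdot q + 1 \cdot p = p \\
E(X^2) &= 0^2 \cdot q + 1^2 \cdot p = p \\
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2 = p - p^2 = p(1 - p) = pq
\end{align*}
```

For the die example, $p = 1/6$, so the mean is $1/6$ and the variance is $(1/6)(5/6) = 5/36$.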
Binomial Distribution

Perform $n$ independent Bernoulli trials.
$X$ = number of successes (1's)
$p$ = probability of success in each trial
$q = 1 - p$
$X \sim BIN(n, p)$

Q: What is the PDF $P(x)$, $x = 0, 1, \dots, n$, of $X \sim BIN(n, p)$? That is, what is the probability of $x$ successes in $n$ Bernoulli trials?

Q1. 5 Bernoulli trials, $X \sim BIN(5, p)$. P(result is 10010) = ?
Solution: Because the trials are independent, the probabilities multiply: $p \cdot q \cdot q \cdot p \cdot q = p^2 q^3$

Q2. Other results with 2 successes out of 5?

Number    Sequence
1         11000
2         10100
3         10010
4         10001
5         01100
6         01010
7         01001
8         00110
9         00101
10        00011

There are 10 ways to get 2 successes out of 5. The probability of each sequence is $p^2 q^3$, and the sequences are mutually exclusive, so
P(Sequence 1 or 2 or ... or 10) = $10 p^2 q^3$
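The count of 10 can be checked by brute force. The sketch below enumerates every 0/1 sequence of 5 trials and keeps those with exactly 2 successes; the value `p = 0.3` is an arbitrary illustration, not a number from the notes.

```python
from itertools import product

n, x = 5, 2

# Every 0/1 outcome sequence of n trials that contains exactly x successes.
sequences = ["".join(map(str, s))
             for s in product((1, 0), repeat=n) if sum(s) == x]

print(len(sequences))  # 10
print(sequences)

# Each sequence has probability p^x * q^(n-x); p = 0.3 is a hypothetical value.
p = 0.3
q = 1 - p
prob_two_successes = len(sequences) * p**x * q**(n - x)
print(prob_two_successes)
```

Enumeration grows as 2^n, so it is only practical for small n; the combination formula gives the count directly.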
Definition: A combination of $n$ subjects taken $x$ at a time is an unordered subset of $x$ of the $n$ subjects. The number of such subsets ("n choose x") is
$$_nC_x = \frac{n!}{x!\,(n-x)!}$$
where $x! = x(x-1)(x-2) \cdots (2)(1)$ and we define $0! = 1$.

Example: "5 choose 2": how many subsets of 2 out of 5?
$$_5C_2 = \frac{5!}{2!\,(5-2)!} = \frac{5 \cdot 4}{2 \cdot 1} = 10$$

Back to the binomial distribution question: $X \sim BIN(5, p)$; $P(2) = ?$ $P(5) = ?$
Ans:
$P(2) = {}_5C_2\, p^2 q^3 = 10 p^2 q^3$
$P(5) = {}_5C_5\, p^5 q^0 = \frac{5!}{5!\,0!}\, p^5 q^0 = 1 \cdot p^5 \cdot 1 = p^5$
Binomial PDF $P(x)$ for $X \sim BIN(n, p)$: P(x successes in n Bernoulli trials)
$$P(x) = {}_nC_x\, p^x q^{n-x}, \quad x = 0, 1, \dots, n$$
$$P(x) = 0 \quad \text{otherwise}$$

Number of Successes $x$    Probability $P(x)$
0                          $_nC_0\, q^n$
1                          $_nC_1\, p\, q^{n-1}$
...                        ...
$x$                        $_nC_x\, p^x q^{n-x}$
...                        ...
$n-1$                      $_nC_{n-1}\, p^{n-1} q$
$n$                        $_nC_n\, p^n$
Total                      1

Important Binomial distribution features:
Mean: $\mu = E(X) = np$
Variance: $\sigma^2 = Var(X) = npq$
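The binomial PDF and its mean and variance can be checked numerically. In the sketch below, `binom_pmf` is a hypothetical helper name and `n = 5`, `p = 0.3` are arbitrary illustrative values, not taken from the notes.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ BIN(n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.3  # arbitrary illustrative values
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                   # should be 1
mean = sum(x * px for x, px in enumerate(pmf))     # should equal n*p
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))  # should equal n*p*q

print(total, mean, variance)
```

Up to floating-point rounding, the three printed values agree with 1, np = 1.5, and npq = 1.05.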
Example 7.2 Smoking in the U.S.: 29% are smokers, so $p = .29$. Select a random sample of size 10.

Q1. What is P(4 smokers in the sample)?
$X$ = number of smokers out of $n = 10$; $X \sim BIN(10, .29)$

Solution 1.
$$P(4) = {}_{10}C_4\, (0.29)^4 (0.71)^6 = \frac{10!}{4!\,6!}\,(.00707)(.1281) = .1903$$

Solution 2. Table A.1 (p. A-1): Binomial PDF for $p$ = 0.05 to 0.5, $n$ = 2 to 20.
Using the nearest tabulated value, $n = 10$, $p \approx .30$, gives $P(4) \approx .2001$, which is only an approximation to the exact .1903.

Solution 3. SAS:
PROBBNML(p, n, m) → CDF
PDF('BINOMIAL', x, p, n) → PDF
CDF('BINOMIAL', x, p, n) → CDF

Q2. P(6 or more smokers in the sample) = ?
$P(X \ge 6) = 1 - F(5) = 1 - (.9596) = .0404$
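As an alternative to the table and SAS routes, Q1 and Q2 can be reproduced with a few lines of Python. This is a sketch using only the standard library; `binom_pmf` and `binom_cdf` are hypothetical helper names.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """F(x) = P(X <= x): sum of the PDF from 0 through x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.29
p4 = binom_pmf(4, n, p)               # Q1, exact
p4_table = binom_pmf(4, n, 0.30)      # Q1 with the tabulated p = .30
p_6_or_more = 1 - binom_cdf(5, n, p)  # Q2: P(X >= 6) = 1 - F(5)

print(f"{p4:.4f} {p4_table:.4f} {p_6_or_more:.4f}")
# 0.1903 0.2001 0.0404
```

The gap between 0.1903 and 0.2001 shows how much rounding p to the nearest table column can move the answer.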
Q3. Among the 10 individuals chosen, what is the expected number of smokers?
$E(X) = np = 10 \times .29 = 2.9$
Variance and SD:
$Var(X) = npq = 10 \times (.29) \times (.71) = 2.059$
$SD = \sqrt{Var(X)} = \sqrt{2.059} = 1.43$

Note: Using Table A.1, what if $p > 0.5$?
$P(x, n, p) = {}_nC_x\, p^x (1-p)^{n-x}$
$P(n-x, n, 1-p) = {}_nC_{n-x}\, (1-p)^{n-x} p^x$
$$_nC_x = \frac{n!}{x!\,(n-x)!} = \frac{n!}{(n-x)!\,x!} = {}_nC_{n-x}$$
$\Rightarrow P(x, n, p) = P(n-x, n, 1-p)$
That is, if $p > 0.5$, treat the complementary outcome as "success" and count $X^C = n - X$, the number of failures. Then
$P(X \le x)$ for $X \sim BIN(n, p)$ equals $P(X^C \ge n - x)$ for $X^C \sim BIN(n, 1-p)$.
Example 7.3 "What do you think about the problem of childhood obesity?" Poll in 2003: 55% of residents think it is "serious". Randomly select $n = 12$ residents.

Q1. P(8 people think it is "serious")?
$X \sim BIN(12, .55) \Rightarrow P(8) = .1700$
Same as P(4 out of 12 do not think it is "serious"): $X^C \sim BIN(12, .45) \Rightarrow P(4) = .1700$

Q2. P(5 or fewer think "serious") = ?
$P(X \le 5 \mid n = 12, p = .55) = P(X^C \ge 7 \mid n = 12, p = .45) = 1 - P(X^C \le 6 \mid n = 12, p = .45) = 1 - .7393 = .2607$

Q3. Among the sample of 12, what is the expected number of people who think childhood obesity is "serious"?
$E(X) = np = 12 \times .55 = 6.6$

Q4. What is the variance of the number who think childhood obesity is "serious"?
$Var(X) = npq = 12 \times (.55) \times (.45) = 2.97$
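The complement trick used in Example 7.3 can be checked numerically: the direct calculation with p = .55 and the table-style calculation with 1 - p = .45 should agree. A sketch using only the standard library, with `binom_pmf` as a hypothetical helper name:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 12, 0.55

# Q1: 8 "serious" answers is the same event as 4 "not serious" answers.
p8 = binom_pmf(8, n, p)
p4_complement = binom_pmf(4, n, 1 - p)

# Q2: P(X <= 5) computed directly, and via the complement count with 1 - p.
direct = sum(binom_pmf(k, n, p) for k in range(6))
via_complement = 1 - sum(binom_pmf(k, n, 1 - p) for k in range(7))

# Q3 and Q4: mean and variance.
mean = n * p
variance = n * p * (1 - p)

print(f"{p8:.4f} {p4_complement:.4f} {direct:.4f} {via_complement:.4f}")
# 0.1700 0.1700 0.2607 0.2607
```

Software has no restriction to p ≤ 0.5, so the complement step is needed only when working from a printed table.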
Poisson Distribution

$X$ = number of event occurrences in a given interval of time, space, volume, etc.; i.e., count data.

Probability that $x$ events will occur:
$$P(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \dots$$
$X \sim POI(\lambda)$

Important Poisson features:
Mean: $E(X) = \lambda$
Variance: $Var(X) = \lambda$
When $\lambda$ is small, the distribution is right-skewed; as $\lambda$ increases (roughly $\lambda \ge 10$), the distribution becomes approximately symmetric.
Example 7.4 Allergic reaction to anesthesia (Laake and Rottingen)
Occurrences of reaction ∼ Poisson, with about 12 incidents expected per year.

Q1. In the next year, what is the probability of seeing 3 incidents?
Solution: $X \sim POI(12)$
$$P(3) = \frac{e^{-12}\, 12^3}{3!} = .00177$$

Q2. What is the probability that at least 3 will have a reaction in the next year?
Solution 1:
$$P(X \ge 3) = 1 - P(X \le 2) = 1 - F(2) = 1 - \{P(0) + P(1) + P(2)\}$$
$$= 1 - \left\{ \frac{e^{-12}\, 12^0}{0!} + \frac{e^{-12}\, 12^1}{1!} + \frac{e^{-12}\, 12^2}{2!} \right\}$$
$$= 1 - .00052226 = .99947774$$
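The two Poisson calculations in Example 7.4 can be reproduced directly from the PDF formula. A minimal sketch using only the standard library, with `poisson_pmf` as a hypothetical helper name:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ POI(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 12  # expected incidents per year, from Example 7.4

p3 = poisson_pmf(3, lam)                                        # Q1
p_at_least_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))   # Q2: 1 - F(2)

print(f"{p3:.5f} {p_at_least_3:.4f}")
# 0.00177 0.9995
```

Working through the complement keeps the sum finite: only P(0), P(1), and P(2) are needed, rather than the infinite tail from 3 upward.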
Solution 2: Table A.2 (p. A-6): Poisson PDF
$P(X \ge 3) = 1 - F(2) = 1 - (.0000 + .0001 + .0004) = .9995$

Solution 3: SAS:
POISSON(λ, x) → CDF
PDF('POISSON', x, λ) → PDF
CDF('POISSON', x, λ) → CDF
3. Continuous Random Variables

A continuous $X$ can assume any value within its range. Within any interval, there are theoretically an infinite number of values.

Subareas of a histogram represent the frequency of occurrence of values within class intervals. Total frequency of values between $a$ and $b$: add all subareas for the intervals from $a$ through $b$.

If the width of the class intervals is very small, then connecting the midpoints (creating a frequency polygon) creates a smooth curve. If probability is shown on the y-axis and we have a smooth curve, the curve is the probability density function (PDF) $f(x)$.

$P(a < X \le b)$ = total area under $f(x)$ between $a$ and $b$, or $\int_a^b f(t)\,dt$.
Cumulative distribution function (CDF) of X:
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
Note: The total area under $f(x)$ is 1, i.e., $\int_{-\infty}^{+\infty} f(t)\,dt = 1$, and
$$f(x) = \frac{d}{dx} F(x) = F'(x)$$
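The two relationships above (probability as area under the PDF, and the PDF as the derivative of the CDF) can be illustrated numerically. The sketch below assumes a standard normal density purely as a convenient example, taken from Python's `statistics.NormalDist`; the interval endpoints, step count, and evaluation point are arbitrary choices.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # illustrative choice of density

def area_under_pdf(a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f(t) dt from a to b."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

a, b = -1.0, 2.0
area = area_under_pdf(a, b)                 # area under f between a and b
cdf_difference = dist.cdf(b) - dist.cdf(a)  # F(b) - F(a)

# A central finite difference of F approximates the density f at x.
x, h = 0.5, 1e-5
slope = (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

print(area, cdf_difference)
print(slope, dist.pdf(x))
```

Each printed pair should agree to several decimal places, which is the numerical counterpart of $P(a < X \le b) = F(b) - F(a)$ and $f(x) = F'(x)$.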
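4. A special continuous distribution: the Normal or Gaussian

Normal PDF:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < +\infty$$
$X \sim N(\mu, \sigma^2)$

Characteristics:
Distribution is symmetric around $\mu$
Mean = Median = Mode = $\mu$
Total area under the curve = 1, i.e., $\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 1$
Area under the curve between $\mu - \sigma$ and $\mu + \sigma$ ≈ .68
Area under the curve between $\mu - 2\sigma$ and $\mu + 2\sigma$ ≈ .95
Area under the curve between $\mu - 3\sigma$ and $\mu + 3\sigma$ ≈ .997
$E(X) = \mu$ (location parameter)
$Var(X) = \sigma^2$ (scale parameter)

Standard Normal Distribution:
$Z \sim N(0, 1)$ has PDF
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < +\infty$$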
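The normal PDF formula and the .68 / .95 / .997 areas can be checked with a short sketch. `normal_pdf` is a hypothetical helper that transcribes the formula, and `mu = 10`, `sigma = 2` are arbitrary illustrative values; `statistics.NormalDist` from the Python standard library serves as the reference.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(x) for N(mu, sigma^2), written out from the formula."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

mu, sigma = 10.0, 2.0  # arbitrary illustrative values
dist = NormalDist(mu, sigma)

# The hand-written formula should match the library density.
print(normal_pdf(11.0, mu, sigma), dist.pdf(11.0))

# Area within k standard deviations of the mean, for k = 1, 2, 3.
areas = {k: dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma) for k in (1, 2, 3)}
for k, area in areas.items():
    print(k, f"{area:.4f}")
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The areas do not depend on the particular mu and sigma chosen, which is why the rule can be stated once for every normal distribution.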
Table A.3: Standard Normal Upper Tail Cumulative Probabilities
$P(Z \ge z_0) = 1 - \Phi(z_0)$, $z_0 \ge 0$
where $\Phi(z) = \int_{-\infty}^{z} f(t)\,dt$ is the CDF for $Z$.
For $z_0 < 0$, by symmetry, $\Phi(z_0) = P(Z \le z_0) = P(Z \ge -z_0)$.

Example 7.5 Given a variable that follows the standard normal distribution, i.e. $Z \sim N(0, 1)$, what are $P(Z \ge 1)$ and $P(Z \le -1)$?
Solution: By Table A.3, $P(Z \ge 1) = 0.159$ and $P(Z \le -1) = P(Z \ge 1) = 0.159$.

Example 7.6 Randomly pick a value $z$ from the standard normal distribution. P($z$ has a value between -2 and +2) = ?
Solution: Note that for a continuous distribution $P(X = x) = 0$, so including or excluding the endpoints does not change the probability.
$P(-2 \le Z \le +2) = P(-2 < Z < +2) = 1 - P(Z \ge 2) - P(Z \le -2) = 1 - 2 \times P(Z \ge 2) = 1 - 2 \times (.023) = 0.954$
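Examples 7.5 and 7.6 can be reproduced without the table. A minimal sketch using `statistics.NormalDist` from the Python standard library, whose default is the standard normal:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# Example 7.5: upper tail at 1 and lower tail at -1 are equal by symmetry.
upper_tail_1 = 1 - Z.cdf(1)
lower_tail_minus_1 = Z.cdf(-1)

# Example 7.6: P(-2 <= Z <= 2) = 1 - 2 * P(Z >= 2)
between_2 = 1 - 2 * (1 - Z.cdf(2))

print(f"{upper_tail_1:.3f} {lower_tail_minus_1:.3f} {between_2:.3f}")
# 0.159 0.159 0.954
```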
How is the $N(0, 1)$ distribution related to $N(\mu, \sigma^2)$?

If $X \sim N(\mu, \sigma^2)$ and $Z = \frac{X - \mu}{\sigma}$, then $Z \sim N(0, 1)$.

Example 7.7 Systolic Blood Pressure (SBP) (p. 181 P&G)
$X$ = SBP for 18-74 year old males; $X \sim N(\mu, \sigma^2)$ with $\mu = 129$ mm Hg and $\sigma = 19.8$ mm Hg.

Find $x$ which is the cutoff for the upper 2.5% of the SBP distribution; i.e. find $x$ such that $P(X > x) = .025$.
Solution: By Table A.3 we know that $P(Z \ge 1.96) = .025$.
$$\frac{x - \mu}{\sigma} = 1.96 \;\Rightarrow\; \frac{x - 129}{19.8} = 1.96 \;\Rightarrow\; x = (1.96)(19.8) + 129 = 167.8$$

What proportion of men in this population have SBP > 150 mm Hg?
Solution:
$$P(X > 150) = P\left( \frac{X - \mu}{\sigma} > \frac{150 - 129}{19.8} \right) = P(Z > 1.06) = 0.145, \text{ i.e. about } 14.5\%$$
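Both parts of Example 7.7 can be checked with `statistics.NormalDist` from the Python standard library. This is a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

sbp = NormalDist(mu=129, sigma=19.8)  # SBP model from Example 7.7

# Cutoff for the upper 2.5%: the x with P(X > x) = .025, i.e. F(x) = .975.
cutoff = sbp.inv_cdf(1 - 0.025)

# Proportion with SBP above 150, directly and via the standardized value z.
z = (150 - 129) / 19.8
prop_above_150 = 1 - sbp.cdf(150)
prop_via_z = 1 - NormalDist().cdf(z)

print(f"{cutoff:.1f} {z:.2f} {prop_above_150:.3f}")
# 167.8 1.06 0.144
```

The software value 0.144 differs slightly from the table answer 0.145 because the table lookup rounds z to 1.06 before reading the tail area.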
Example 7.8 Breath study (Diskin et al.)
$X$ = ammonia concentration in parts per billion (ppb); $\mu = 491$ ppb, $\sigma = 119$ ppb; i.e. $X \sim N(491, 119^2)$
$P(292 \le X \le 649) = ?$

Solution 1:
$$P(292 \le X \le 649) = P\left( \frac{292 - 491}{119} \le \frac{X - \mu}{\sigma} \le \frac{649 - 491}{119} \right)$$
$$= P(-1.67 \le Z \le 1.33) = 1 - P(Z \le -1.67) - P(Z \ge 1.33) = 1 - .047 - .092 = .861$$

Solution 2: SAS:
ProbNorm(x) → N(0,1) CDF
PDF('NORMAL', x) → N(0,1) PDF
PDF('NORMAL', x, μ, σ) → PDF of the normal with mean μ and SD σ
CDF('NORMAL', x) → N(0,1) CDF
CDF('NORMAL', x, μ, σ) → CDF of the normal with mean μ and SD σ
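Example 7.8 can also be checked with `statistics.NormalDist` from the Python standard library, working either on the original ppb scale or on the standardized scale. A sketch with illustrative variable names:

```python
from statistics import NormalDist

ammonia = NormalDist(mu=491, sigma=119)  # model from Example 7.8

# Directly on the original scale: F(649) - F(292).
prob = ammonia.cdf(649) - ammonia.cdf(292)

# The same probability after standardizing both endpoints.
z_low = (292 - 491) / 119
z_high = (649 - 491) / 119
Z = NormalDist()
prob_standardized = Z.cdf(z_high) - Z.cdf(z_low)

print(f"{z_low:.2f} {z_high:.2f} {prob:.3f}")
# -1.67 1.33 0.861
```

Both routes give the same probability, which is the point of the standardization $Z = (X - \mu)/\sigma$.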