Probability Distributions for Discrete RV
Definition
The probability distribution or probability mass function (pmf) of a discrete rv is defined for every number x by

p(x) = P(X = x) = P(all s ∈ S : X(s) = x).

In words, for every possible value x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed. (The conditions p(x) ≥ 0 and ∑_{all possible x} p(x) = 1 are required for any pmf.)
Probability Distributions for Discrete RV
Definition
The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by

F(x) = P(X ≤ x) = ∑_{y: y ≤ x} p(y)

For any number x, F(x) is the probability that the observed value of X will be at most x.

F(x) = P(X ≤ x) = P(X is less than or equal to x)
p(x) = P(X = x) = P(X is exactly equal to x)
Probability Distributions for Discrete RV
pmf =⇒ cdf:

F(x) = P(X ≤ x) = ∑_{y: y ≤ x} p(y)

It is also possible to go the other way, cdf =⇒ pmf:

p(x) = F(x) − F(x−)

where “x−” represents the largest possible X value that is strictly less than x.
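To make these conversions concrete, here is a minimal Python sketch (the support and probabilities are illustrative, borrowed from Problem 30 below): it accumulates a pmf into a cdf, then recovers the pmf by differencing consecutive cdf values.

```python
# Minimal sketch: pmf => cdf by accumulation, cdf => pmf by differencing.
# The support and probabilities below are illustrative assumptions.
pmf = {0: 0.60, 1: 0.25, 2: 0.10, 3: 0.05}

support = sorted(pmf)
cdf, running = {}, 0.0
for x in support:                      # F(x) = sum of p(y) over y <= x
    running += pmf[x]
    cdf[x] = running

recovered = {support[0]: cdf[support[0]]}
for prev, x in zip(support, support[1:]):
    recovered[x] = cdf[x] - cdf[prev]  # p(x) = F(x) - F(x-)

assert all(abs(recovered[x] - pmf[x]) < 1e-12 for x in support)
```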
Probability Distributions for Discrete RV
Proposition
For any two numbers a and b with a ≤ b,

P(a ≤ X ≤ b) = F(b) − F(a−)

where “a−” represents the largest possible X value that is strictly less than a. In particular, if the only possible values are integers and if a and b are integers, then

P(a ≤ X ≤ b) = P(X = a or a + 1 or . . . or b) = F(b) − F(a − 1)

Taking a = b yields P(X = a) = F(a) − F(a − 1) in this case.
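A quick numerical check of the integer-valued case, reusing the illustrative pmf from the previous sketch:

```python
# P(a <= X <= b) = F(b) - F(a-1) for an integer-valued rv (illustrative pmf).
pmf = {0: 0.60, 1: 0.25, 2: 0.10, 3: 0.05}

def F(x):  # cdf: total probability at or below x
    return sum(p for y, p in pmf.items() if y <= x)

a, b = 1, 2
assert abs((F(b) - F(a - 1)) - (pmf[1] + pmf[2])) < 1e-12  # both equal 0.35
```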
Expectations
Definition
Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X) or µ_X, is

E(X) = µ_X = ∑_{x ∈ D} x · p(x)

e.g. (Problem 30)
A group of individuals who have automobile insurance from a certain company is randomly selected. Let Y be the number of moving violations for which the individual was cited during the last 3 years. The pmf of Y is

y      0     1     2     3
p(y)   0.60  0.25  0.10  0.05

Then the expected number of moving violations for that group is

µ_Y = E(Y) = 0 · 0.60 + 1 · 0.25 + 2 · 0.10 + 3 · 0.05 = 0.60
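As a sketch, the same computation in Python (the pmf is taken from Problem 30 above):

```python
# E(Y) = sum of y * p(y) over the support (Problem 30).
pmf_Y = {0: 0.60, 1: 0.25, 2: 0.10, 3: 0.05}
EY = sum(y * p for y, p in pmf_Y.items())
print(round(EY, 2))  # 0.6
```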
Expectations
y      0     1     2     3
p(y)   0.60  0.25  0.10  0.05

Assume the total number of individuals in that group is 100. Then there are 60 individuals with no moving violations, 25 with 1 moving violation, 10 with 2 moving violations, and 5 with 3 moving violations.
The population mean is calculated as

µ = (0 · 60 + 1 · 25 + 2 · 10 + 3 · 5) / 100 = 0.60

Equivalently,

µ = 0 · (60/100) + 1 · (25/100) + 2 · (10/100) + 3 · (5/100)
  = 0 · 0.60 + 1 · 0.25 + 2 · 0.10 + 3 · 0.05
  = 0.60

The population size is irrelevant if we know the pmf!
Expectations
Examples:
Let X be a Bernoulli rv with pmf

p(x) = 1 − p   if x = 0
       p       if x = 1
       0       if x ≠ 0 or 1

Then the expected value for X is

E(X) = 0 · p(0) + 1 · p(1) = p

We see that the expected value of a Bernoulli rv X is just the probability that X takes on the value 1.
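A short empirical check of E(X) = p (the value of p and the number of draws are illustrative assumptions):

```python
import random

# The sample mean of Bernoulli(p) draws should be close to p.
p, n = 0.25, 100_000
sample_mean = sum(random.random() < p for _ in range(n)) / n
print(sample_mean)  # close to 0.25
```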
Expectations
Examples:
Consider the card-drawing example again and assume we have infinitely many cards this time. Let X = the number of drawings until we get a ♠. If the probability of getting a ♠ is α, then the pmf for X is

p(x) = α(1 − α)^{x−1}   x = 1, 2, 3, . . .
       0                otherwise

The expected value of X is

E(X) = ∑_D x · p(x) = ∑_{x=1}^{∞} x α(1 − α)^{x−1} = α ∑_{x=1}^{∞} [− d/dα (1 − α)^x]

E(X) = α {− d/dα [∑_{x=1}^{∞} (1 − α)^x]} = α {− d/dα ((1 − α)/α)} = 1/α
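A simulation sketch checking E(X) = 1/α for the drawing example (α = 0.25 is the probability of a ♠; the number of repetitions is an assumption):

```python
import random

# X = number of draws until the first spade; with infinitely many cards
# each draw is an independent trial with P(spade) = alpha.
alpha, reps = 0.25, 200_000

def draws_until_spade():
    count = 1
    while random.random() >= alpha:  # no spade yet: draw again
        count += 1
    return count

print(sum(draws_until_spade() for _ in range(reps)) / reps)  # close to 1/alpha = 4
```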
Expectations
Example 3.20
Let X be the number of interviews a student has prior to getting a job. The pmf for X is

p(x) = k/x²   x = 1, 2, 3, . . .
       0      otherwise

where k is chosen so that ∑_{x=1}^{∞} (k/x²) = 1. (It can be shown that ∑_{x=1}^{∞} (1/x²) < ∞, which implies that such a k exists.)
The expected value of X is

µ = E(X) = ∑_{x=1}^{∞} x · (k/x²) = k ∑_{x=1}^{∞} 1/x = ∞!

The expected value is NOT finite!
Heavy Tail: a distribution with a large amount of probability far from µ
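A sketch of the divergence: the truncated sums of x · p(x) keep growing with the truncation point. (Here k = 6/π², since ∑ 1/x² = π²/6; the truncation points are illustrative.)

```python
import math

# p(x) = k / x**2 with k = 6 / pi**2, so the probabilities sum to 1.
k = 6 / math.pi**2

for n in (10**2, 10**4, 10**6):
    partial = sum(x * k / x**2 for x in range(1, n + 1))  # grows like k * ln(n)
    print(n, round(partial, 3))  # no finite limit: E(X) does not exist
```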
Expectations
Example (Problem 38)
Let X = the outcome when a fair die is rolled once. If before the die is rolled you are offered either 1/3.5 dollars or 1/X dollars, would you accept the guaranteed amount or would you gamble?

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6
1/x    1    1/2  1/3  1/4  1/5  1/6

Then the expected dollars from gambling is

E(1/X) = ∑_{x=1}^{6} (1/x) · p(x)
       = 1 · 1/6 + 1/2 · 1/6 + · · · + 1/6 · 1/6
       = 49/120 > 1/3.5
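A sketch computing E(1/X) exactly with Python's Fraction type:

```python
from fractions import Fraction

# E(1/X) for one roll of a fair die: each value 1..6 has probability 1/6.
expected = sum(Fraction(1, x) * Fraction(1, 6) for x in range(1, 7))
print(expected)                   # 49/120
print(expected > Fraction(2, 7))  # True: 1/3.5 = 2/7, so gambling pays more on average
```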
Expectations
Proposition
If the rv X has a set of possible values D and pmf p(x), then the expected value of any function h(X), denoted by E[h(X)] or µ_{h(X)}, is computed by

E[h(X)] = ∑_D h(x) · p(x)
Expectations
Example 3.23
A computer store has purchased three computers of a certain type at $500 apiece. It will sell them for $1000 apiece. The manufacturer has agreed to repurchase any computers still unsold after a specified period at $200 apiece.
Let X denote the number of computers sold, and suppose that p(0) = 0.1, p(1) = 0.2, p(2) = 0.3, p(3) = 0.4.
Let h(X) denote the profit associated with selling X units; then

h(X) = revenue − cost = 1000X + 200(3 − X) − 1500 = 800X − 900.

The expected profit is

E[h(X)] = h(0) · p(0) + h(1) · p(1) + h(2) · p(2) + h(3) · p(3)
        = (−900)(0.1) + (−100)(0.2) + (700)(0.3) + (1500)(0.4)
        = 700
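The same computation as a Python sketch:

```python
# Expected profit E[h(X)] = sum of h(x) * p(x) (Example 3.23).
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def h(x):  # profit when x of the three computers are sold
    return 800 * x - 900

print(sum(h(x) * p for x, p in pmf.items()))  # 700.0
```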
Expectations
Proposition
E(aX + b) = a · E(X) + b

(Or, using alternative notation, µ_{aX+b} = a · µ_X + b.)

e.g. for the previous example,

E[h(X)] = E(800X − 900) = 800 · E(X) − 900 = 700

Corollary
1. For any constant a, E(aX) = a · E(X).
2. For any constant b, E(X + b) = E(X) + b.
Expectations
Definition
Let X have pmf p(x) and expected value µ. Then the variance of X, denoted by V(X) or σ²_X, or just σ², is

V(X) = ∑_D (x − µ)² · p(x) = E[(X − µ)²]

The standard deviation (SD) of X is

σ_X = √(σ²_X)
Expectations
Example:
For the previous example, the pmf is given as

x      0    1    2    3
p(x)   0.1  0.2  0.3  0.4

Then the variance of X is

V(X) = σ² = ∑_{x=0}^{3} (x − 2)² · p(x)
     = (0 − 2)²(0.1) + (1 − 2)²(0.2) + (2 − 2)²(0.3) + (3 − 2)²(0.4)
     = 1
Expectations
Recall that for the sample variance s², we have

s² = S_xx / (n − 1) = [∑ x_i² − (∑ x_i)²/n] / (n − 1)

Proposition
V(X) = σ² = [∑_D x² · p(x)] − µ² = E(X²) − [E(X)]²

e.g. for the previous example, the pmf is given as

x      0    1    2    3
p(x)   0.1  0.2  0.3  0.4

Then
V(X) = E(X²) − [E(X)]² = 1² · 0.2 + 2² · 0.3 + 3² · 0.4 − (2)² = 1
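A sketch verifying that the definition and the shortcut formula agree on this pmf:

```python
# Variance two ways for the pmf of Example 3.23.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

mu = sum(x * p for x, p in pmf.items())                  # E(X) = 2
v_def = sum((x - mu) ** 2 * p for x, p in pmf.items())   # definition
v_short = sum(x**2 * p for x, p in pmf.items()) - mu**2  # E(X^2) - mu^2
print(mu, v_def, v_short)  # 2.0 1.0 1.0 (up to floating point)
```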
Expectations
Proposition
If h(X) is a function of a rv X, then

V[h(X)] = σ²_{h(X)} = ∑_D {h(x) − E[h(X)]}² · p(x) = E[h(X)²] − {E[h(X)]}²

If h(X) is linear, i.e. h(X) = aX + b for some nonrandom constants a and b, then

V(aX + b) = σ²_{aX+b} = a² · σ²_X   and   σ_{aX+b} = |a| · σ_X

In particular,
σ_{aX} = |a| · σ_X,   σ_{X+b} = σ_X
Expectations
Example 3.23 continued
A computer store has purchased three computers of a certain type at $500 apiece. It will sell them for $1000 apiece. The manufacturer has agreed to repurchase any computers still unsold after a specified period at $200 apiece. Let X denote the number of computers sold, and suppose that p(0) = 0.1, p(1) = 0.2, p(2) = 0.3, p(3) = 0.4. Let h(X) denote the profit associated with selling X units, so that h(X) = revenue − cost = 1000X + 200(3 − X) − 1500 = 800X − 900.
The variance of h(X) is

V[h(X)] = V[800X − 900] = 800² · V[X] = 640,000

And the SD is σ_{h(X)} = √V[h(X)] = 800.
Binomial Distribution
1. The experiment consists of a sequence of n smaller experiments called trials, where n is fixed in advance of the experiment;
2. Each trial can result in one of the same two possible outcomes (dichotomous trials), which we denote by success (S) and failure (F);
3. The trials are independent, so that the outcome on any particular trial does not influence the outcome on any other trial;
4. The probability of success is constant from trial to trial; we denote this probability by p.

Definition
An experiment for which Conditions 1 – 4 are satisfied is called a binomial experiment.
Binomial Distribution
Examples:
1. If we toss a coin 10 times, then this is a binomial experiment with n = 10, S = Head, and F = Tail.
2. If we draw a card from a deck of well-shuffled cards with replacement, do this 5 times, and record whether the outcome is ♠ or not, then this is also a binomial experiment. In this case, n = 5, S = ♠, and F = not ♠.
3. Again we draw a card from a deck of well-shuffled cards, but this time without replacement, do this 5 times, and record whether the outcome is ♠ or not. This time it is NO LONGER a binomial experiment:

P(♠ on second | ♠ on first) = 12/51 = 0.235 ≠ 0.25 = P(♠ on second)

We do not have independence here!
Binomial Distribution
Examples:
4. This time we draw a card from 100 decks of well-shuffled cards without replacement, do this 5 times, and record whether the outcome is ♠ or not. Is it a binomial experiment?

P(♠ on second draw | ♠ on first draw) = 1299/5199 = 0.2499 ≈ 0.25
P(♠ on sixth draw | ♠ on first five draws) = 1295/5195 = 0.2492 ≈ 0.25
P(♠ on tenth draw | not ♠ on first nine draws) = 1300/5191 = 0.2504 ≈ 0.25
. . .

Although we still do not have independence, the conditional probabilities differ so slightly that we can regard these trials as independent with P(♠) = 0.25.
Binomial Distribution
Rule
Consider sampling without replacement from a dichotomous population of size N. If the sample size (number of trials) n is at most 5% of the population size, the experiment can be analyzed as though it were exactly a binomial experiment.

e.g. for the previous example, the population size is N = 5200 and the sample size is n = 5, so n/N ≈ 0.1% and we can apply the above rule.
Binomial Distribution
Definition
The binomial random variable X associated with a binomial experiment consisting of n trials is defined as

X = the number of S’s among the n trials

Possible values for X in an n-trial experiment are x = 0, 1, 2, . . . , n.

Notation
We use X ∼ Bin(n, p) to indicate that X is a binomial rv based on n trials with success probability p.
We use b(x; n, p) to denote the pmf of X, and B(x; n, p) to denote the cdf of X, where

B(x; n, p) = P(X ≤ x) = ∑_{y=0}^{x} b(y; n, p)
Binomial Distribution
Example:
Assume we toss a coin 3 times and the probability of getting a head on each toss is p. Let X be the binomial random variable associated with this experiment. We tabulate all the possible outcomes, the corresponding X values, and their probabilities in the following table:

Outcome   X   Probability       Outcome   X   Probability
HHH       3   p³                TTT       0   (1 − p)³
HHT       2   p² · (1 − p)      TTH       1   (1 − p)² · p
HTH       2   p² · (1 − p)      THT       1   (1 − p)² · p
HTT       1   p · (1 − p)²      THH       2   (1 − p) · p²

e.g. b(2; 3, p) = P(HHT) + P(HTH) + P(THH) = 3p²(1 − p).
Binomial Distribution
More generally, for the binomial pmf b(x; n, p), we have

b(x; n, p) = {number of sequences of length n consisting of x S’s} · {probability of any particular such sequence}

where

{number of sequences of length n consisting of x S’s} = (n choose x)

and

{probability of any particular such sequence} = p^x (1 − p)^{n−x}

Theorem

b(x; n, p) = (n choose x) p^x (1 − p)^{n−x}   for x = 0, 1, 2, . . . , n
           = 0                                otherwise
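A minimal sketch of the binomial pmf, using math.comb for the counting factor:

```python
import math

def b(x, n, p):
    """Binomial pmf: b(x; n, p) = C(n, x) * p**x * (1 - p)**(n - x)."""
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Sanity check against the 3-toss table: b(2; 3, p) = 3 p^2 (1 - p).
p = 0.5
print(b(2, 3, p), 3 * p**2 * (1 - p))  # both 0.375
```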
Binomial Distribution
Example: (Problem 55)
Twenty percent of all telephones of a certain type are submitted for service while under warranty. Of these, 75% can be repaired, whereas the other 25% must be replaced with new units. If a company purchases ten of these telephones, what is the probability that exactly two will end up being replaced under warranty?
Let X = the number of telephones that need to be replaced. Then

p = P(service and replace) = P(replace | service) · P(service) = 0.25 · 0.2 = 0.05

Now,

P(X = 2) = b(2; 10, 0.05) = (10 choose 2) 0.05² (1 − 0.05)^{10−2} = 0.0746
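The same probability as a one-liner in Python:

```python
import math

# Problem 55: X ~ Bin(10, 0.05), P(X = 2) = C(10, 2) * 0.05**2 * 0.95**8.
print(round(math.comb(10, 2) * 0.05**2 * 0.95**8, 4))  # 0.0746
```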
Binomial Distribution
Binomial Tables
Table A.1 Cumulative Binomial Probabilities (Page 664) tabulates

B(x; n, p) = ∑_{y=0}^{x} b(y; n, p)

b. n = 10

            p
x      0.01   0.05   0.10   . . .
0      .904   .599   .349   . . .
1      .996   .914   .736   . . .
2     1.000   .988   .930   . . .
3     1.000   .999   .987   . . .
. . .  . . .  . . .  . . .  . . .

Then for b(2; 10, 0.05), we have

b(2; 10, 0.05) = B(2; 10, 0.05) − B(1; 10, 0.05) = .988 − .914 = .074
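A sketch reproducing the table entries by direct summation (note the table is rounded to three places, which is why differencing its entries gives .074 while the exact pmf value is about .0746):

```python
import math

def B(x, n, p):  # cumulative binomial probability B(x; n, p)
    return sum(math.comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(x + 1))

n, p = 10, 0.05
print(round(B(2, n, p), 3), round(B(1, n, p), 3))  # 0.988 0.914
print(B(2, n, p) - B(1, n, p))                     # 0.0746..., i.e. b(2; 10, 0.05)
```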
Binomial Distribution
Mean and Variance
Theorem
If X ∼ Bin(n, p), then E(X) = np, V(X) = np(1 − p) = npq, and σ_X = √(npq) (where q = 1 − p).

The idea is that X = Y_1 + Y_2 + · · · + Y_n, where the Y_i’s are independent Bernoulli random variables with success probability p, i.e.

Y = 1, with probability p
    0, with probability 1 − p

E(Y) = p and V(Y) = (1 − p)²p + (−p)²(1 − p) = p(1 − p).
Therefore E(X) = np and V(X) = np(1 − p) = npq.
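An empirical check of the theorem by simulation (n = 25 and p = 0.6 anticipate Problem 60 below; the number of repetitions is an assumption):

```python
import random

# X ~ Bin(n, p) simulated as a sum of n independent Bernoulli(p) trials.
n, p, reps = 25, 0.6, 50_000
samples = [sum(random.random() < p for _ in range(n)) for _ in range(reps)]

mean = sum(samples) / reps
var = sum((x - mean) ** 2 for x in samples) / reps
print(mean, var)  # close to np = 15 and np(1 - p) = 6
```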
Binomial Distribution
Example: (Problem 60)
A toll bridge charges $1.00 for passenger cars and $2.50 for other vehicles. Suppose that during daytime hours, 60% of all vehicles are passenger cars. If 25 vehicles cross the bridge during a particular daytime period, what is the resulting expected toll revenue? What is the variance?
Let X = the number of passenger cars and Y = revenue. Then Y = 1.00X + 2.50(25 − X) = 62.5 − 1.50X.

E(Y) = E(62.5 − 1.5X) = 62.5 − 1.5 · E(X) = 62.5 − 1.5 · (25 · 0.6) = 40

V(Y) = V(62.5 − 1.5X) = (−1.5)² · V(X) = 2.25 · (25 · 0.6 · 0.4) = 13.5
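The same answer as a short sketch, combining the binomial mean and variance with the linear-function rules:

```python
# Problem 60: revenue Y = 62.5 - 1.5 X, where X ~ Bin(25, 0.6).
n, p = 25, 0.6
EX, VX = n * p, n * p * (1 - p)  # E(X) = 15, V(X) = 6
EY = 62.5 - 1.5 * EX             # linearity: E(aX + b) = a E(X) + b
VY = (-1.5) ** 2 * VX            # V(aX + b) = a^2 V(X)
print(EY, VY)                    # 40.0 13.5
```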