SanJoséStateUniversity Math161A:AppliedProbability&Statistics...Specialdiscretedistributions(part1)...

26
San José State University Math 161A: Applied Probability & Statistics Special discrete distributions (part 1) Prof. Guangliang Chen

Transcript of SanJoséStateUniversity Math161A:AppliedProbability&Statistics...Specialdiscretedistributions(part1)...

  • San José State University

    Math 161A: Applied Probability & Statistics

    Special discrete distributions (part 1)

    Prof. Guangliang Chen

  • This lecture: Sections 3.4, 3.5

    • Bernoulli

    • Binomial

    • HyperGeometric

    Next lecture: Sections 3.5, 3.6

    • Geometric

    • Negative Binomial

    • Poisson

  • Special discrete distributions (part 1)

    Our treatment plan for each distribution

    • Examples

    • Definition (including pmf)

    • Expected value

    • Variance

    • Other useful properties (if any)

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 3/26

  • Special discrete distributions (part 1)

    Bernoulli distribution

    Consider the following experiments:

    Example 0.1 (Toss a fair coin). X = 1 (heads) and 0 (tails).

    Example 0.2 (Randomly select a ball from an urn that has 10 red and 20green balls). Let Y = 1 (if the selected ball is red) and 0 (otherwise).

    Example 0.3 (Randomly select an individual from a population 40% ofwhich have certain characteristic). Let Z = 1 (if the selected individualhas the characteristic) and 0 (otherwise).

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 4/26

  • Special discrete distributions (part 1)

    These experiments all share the following traits:

    • There is only one trial;

    • It has only two outcomes, “success” or “failure”;

    • The probability of getting a success is some number p;

    • X is a indicator variable: X = 1 (success) or 0 (failure)

    We say that such a random variable has a Bernoulli distribution withparameter p, and denote it as X ∼ Bernoulli(p).

    Such an experiment is called a Bernoulli trial.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 5/26

  • Special discrete distributions (part 1)

    Example 0.4 (Toss a fair coin). X = 1 (heads) and 0 (tails).

    Answer: X ∼ Bernoulli(12)

    Example 0.5 (Randomly select a ball from an urn that has 10 red and 20green balls). Let Y = 1 (if the selected ball is red) and 0 (otherwise).

    Answer: Y ∼ Bernoulli(13)

    Example 0.6 (Randomly select an individual from a population 40% ofwhich have certain characteristic). Let Z = 1 (if the selected individualhas the characteristic) and 0 (otherwise).

    Answer: Z ∼ Bernoulli(0.4)

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 6/26

  • Special discrete distributions (part 1)

    Clearly, if a discrete random variable X follows a Bernoulli distributionwith parameter p, then its pmf has the following form (and vice versa):

    fX(x) = px(1− p)1−x, x = 0, 1

    (and fX(x) = 0 for all other x)

    x 0 1P (X = x) 1− p p 0 1

    b

    b

    1− p

    p |

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 7/26

  • Special discrete distributions (part 1)

    Theorem 0.1. Let X ∼ Bernoulli(p). Then

    E(X) = pVar(X) = p(1− p)

    Proof. We have already obtained these results previously.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 8/26

  • Special discrete distributions (part 1)

    Binomial

    The following experiments are identical in nature:

    • (Toss a fair coin 10 times): X = #heads

    • (Answer 10 multiple-choiced questions by random guessing): X=#correctly answered questions

    • (Select with replacement 10 balls at random from an urn containing30 red and 20 blue balls): X= # red balls obtained

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 9/26

  • Special discrete distributions (part 1)

    We make the following abstraction:

    • There are n repeated trials in the experiment

    • Each trial has only two outcomes, “success” and “failure”

    • The probability p of getting successes is fixed

    • The n Bernoulli trials are independent

    • X denotes the total number of successes

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 10/26

  • Special discrete distributions (part 1)

    In short, X counts the total number of successes in n independent Bernoullitrials with fixed probability of success p.

    1 2 n

    In the above scenario, we say that X follows a binomial distributionwith parameters n, p, and write X ∼ B(n, p)

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 11/26

  • Special discrete distributions (part 1)

    Example.

    • (Toss a fair coin 10 times) X = #heads B(10, 12)

    • (Answer 10 multiple-choiced questions by random guessing) X=#correctly answered questions B(10, 14)

    • (Draw with replacement 10 balls from an urn containing 30 red and20 blue balls at random) X= # red balls obtained B(10, 0.6)

    Question. In the last example, if balls are drawn without replacement, isX still a binomial random variable? Why?

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 12/26

  • Special discrete distributions (part 1)

    Theorem 0.2. The pmf of X ∼ B(n, p) is

    fX(x) =(

    n

    x

    )px(1− p)n−x, x = 0, 1, . . . , n.

    1 2 n

    H HH

    How to understand this result:

    • (nx): # ways of having x successes in n trials• px: probability of having exactly x successes

    • (1− p)n−x: probability of having exactly n− x failures

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 13/26

  • Special discrete distributions (part 1)

    Example 0.7. What is the probability of getting 0 heads in 10 independentflips of a fair coin? Exactly 1 head? At least two heads?

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 14/26

  • Special discrete distributions (part 1)

    Example 0.8 (Answer 10 multiple-choiced questions by random guessing).Let X= #correctly answered questions. Find P (X = x) for x = 0, 2, 9.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 15/26

  • Special discrete distributions (part 1)

    0 2 4 6 8 10

    x

    0

    0.05

    0.1

    0.15

    0.2

    0.25

    0.3

    f(x)

    B(n,p) pmfs with fixed n=10 and different values of p

    p=0.5p=0.25p=0.6

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 16/26

  • Special discrete distributions (part 1)

    Theorem 0.3. Let X ∼ B(n, p). Then

    E(X) = np,Var(X) = np(1− p)

    Proof. We have already proved the two results previously.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 17/26

  • Special discrete distributions (part 1)

    Two “no meal” jokes

    Q: What is the binomial distribution?

    A: A free lunch program (buy no meal).

    Q: What do you call a mathematician’s parrot that has not been fed?

    A: A polynomial (poly no meal)

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 18/26

  • Special discrete distributions (part 1)

    Hypergeometric

    Recall the example in which we draw 10 balls at random from an urncontaining 30 red and 20 blue balls, and let X= # red balls.

    We obtained that X ∼ B(10, 0.6) if the experiment is performedwith replacement:

    fX(x) =(

    10x

    )0.6x(1− 0.6)10−x, x = 0, 1, . . . , 10

    We also mentioned that X is not binomial if it is without replacement.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 19/26

  • Special discrete distributions (part 1)

    In fact, in the latter case (without replacement), X has the following pmf:

    fX(x) =(30

    x

    )( 2010−x

    )(5010) , x = 0, 1, . . . , 10

    where

    • (30x ): #ways of choosing x red balls out of 30• ( 20n−x): #ways of choosing 10− x blue balls out of 20• (5010): #ways of choosing 10 balls out of 50 in total (ignoring color)

    This is an example of a hypergeometric distribution with parametersN = 50, r = 30, n = 10.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 20/26

  • Special discrete distributions (part 1)

    Def 0.1 (X ∼ HyperGeom(N, r, n)).We say that any random variable Xwith a pmf of the following form

    fX(x) =(r

    x

    )(N−rn−x

    )(Nn

    ) , for x in rangehas a Hypergeometric distributionwith parameters N, r, n).

    (r red)

    N balls in total

    (N − r blue)

    select n at random

    No replacement

    X= #red balls

    obtained

    Remark. Typically, Range(X) = {0, 1, . . . , n}. In the cases when there arefew red/blue balls, Range(X) ( {0, 1, . . . , n}. For example, if r = 5, N =25 and n = 10, then Range(X) = {0, 1, . . . , 5} ⊂ {0, 1, . . . , 10}.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 21/26

  • Special discrete distributions (part 1)

    Example 0.9. Draw, without replacement, n voters at random fromthe whole pool of N that are registered, r of which support certaincandidate. Let X = #selected supporters of the candidate. Then X ∼HyperGeom(N, r, n).

    r individuals

    Population of size NSample n

    X = # individuals

    from subpopulation

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 22/26

  • Special discrete distributions (part 1)

    The hypergeometric pmf has a complicated formula, but it can be wellapproximated by the binomial pmf in the setting of large N and large r.

    Theorem 0.4. When N, r are both large (relative to n), then

    HyperGeom(N, r, n) ≈ B(n, p = rN

    ).

    Remark. In the preceding polling example, both r and N are typicallylarge (e.g., in the order of millions) and n is at most a thousand. Theabove theorem implies that, the number of selected voters that support aparticular candicate is approximately binomial: X approx∼ B(n, p = rN ).

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 23/26

  • Special discrete distributions (part 1)

    500 red

    499 blue

    498 red

    500 blue

    499 red

    500 blue

    500 red

    500 blue

    499 red

    499 blue

    499 red

    499 blue

    500 red

    498 blue

    5001000

    5001000

    499999

    500999

    500999

    499999

    497 red

    500 blue

    498998

    500 red

    497 blue498998

    b

    b

    b

    b

    b

    b

    b

    b

    b

    b

    b

    b

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 24/26

  • Special discrete distributions (part 1)

    Example 0.10. Select 20 balls at random from an urn containing 500red and 500 blue balls, and let X= number of red balls in the selection.Find both the exact value of P (X = 10) and its binomial approximation(Answer: .1780, .1762).

    0 2 4 6 8 10 12 14 16 18 200

    0.05

    0.1

    0.15

    0.2

    HG(1000,500,20)B(20,0.5)

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 25/26

  • Special discrete distributions (part 1)

    Theorem 0.5. Let X ∼ HyperGeom(N, r, n) and p = rN . Then

    E(X) = nrN

    = np

    Var(X) = np(1− p)(

    N − nN − 1

    )

    Remark. Compare with B(n, p = rN ) corresponding to with replacement:

    • The expected values are identical, but

    • the variance of the hypergeometric distribution is smaller.

    Prof. Guangliang Chen | Mathematics & Statistics, San José State University 26/26