ACMS 20340: Statistics for Life Sciences

Chapter 9: Introducing Probability

Why Consider Probability?

We’re doing statistics here. Why should we bother with probability?

As we will see, probability plays an important role in statistics.

An Example I

In a (very) recent Gallup study on the role of religion in one’s views on violence, we find the following statement:

Results are based on face-to-face interviews with approximately 1,000 adults in each country, aged 15 and older, from 2008 through 2010. For results based on the total sample of adults, one can say with 95% confidence that the maximum margin of sampling error ranges from ±1.66 to ±5.8 percentage points.

Source: gallup.com

An Example II

What is meant by the claim that “one can say with 95% confidence that the maximum margin of sampling error ranges from ±1.66 to ±5.8 percentage points”?

This means that the probability that the estimate from a sample comes within the given margin of error of the true population value is 0.95.

Another Example

Recall: A simple random sample of size 10 taken from this class means that every possible group of size 10 has an equal chance of being selected.

What do we mean when we say a group “has an equal chance of being selected”?

A class of size 50 has 10,272,278,170 possible samples of size 10.
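The count above can be checked directly. The following is a minimal sketch using Python's standard `math.comb`, which counts the ways to choose k items from n when order does not matter. The variable names are illustrative only.

```python
import math

class_size = 50
sample_size = 10

# Number of distinct groups of 10 that can be drawn from a class of 50.
n_samples = math.comb(class_size, sample_size)

# Under simple random sampling, every group has the same chance of selection.
chance_per_group = 1 / n_samples

print(f"{n_samples:,}")
print(chance_per_group)
```

Each particular group of 10 is therefore selected with probability 1 divided by this count, which is what "an equal chance" amounts to here.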

What is Probability?

This is a difficult philosophical question.

Following the textbook, we will define probability in terms of the long run behavior of random phenomena.

Why “the long run behavior of random phenomena”?

“Chance behavior is unpredictable in the short run but has a regular and predictable pattern in the long run.”

Short Run vs. Long Run

Suppose I toss a fair (unbiased) coin.

What will the outcome be?

You can’t know for certain: the outcome is unpredictable in the short run.

Short Run vs. Long Run

However, if I toss the coin a sufficiently large number of times, the outcomes start to settle down.

[Figure: two trials of 5000 tosses each.]

Further Confirmation

               Buffon   Kerrich   Pearson
Total Tosses   4,040    10,000    24,000
Heads          2,048    5,067     12,012
Proportion     0.5069   0.5067    0.5005

(These guys had too much time on their hands.)

Randomness and Probability

A phenomenon is random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.

Always keep the example of the tosses of a coin in mind!

The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.

As we toss the fair coin more and more, the proportion of the occurrence of heads gets closer and closer to 1/2.
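The long-run behavior described here can be imitated with a short simulation. This is only a sketch: it uses Python's pseudo-random number generator as a stand-in for a fair coin, and the function name and seeds are hypothetical choices made for illustration.

```python
import random

def running_proportion_of_heads(n_tosses, seed):
    """Simulate n_tosses fair coin tosses; return the proportion of heads after each toss."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1, False as 0
        proportions.append(heads / toss)
    return proportions

# Two independent trials of 5000 tosses each.
for seed in (1, 2):
    props = running_proportion_of_heads(5000, seed)
    print(seed, [round(props[n - 1], 3) for n in (10, 100, 1000, 5000)])
```

The early proportions typically wander quite a bit, while the later ones stay close to 1/2.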

Examples of randomness?

• The outcome of a coin toss.

• The time between emissions of particles by a radioactive source.

• The sexes of the next litter of lab rats.

• The outcome of a random sample or randomized experiment.

Probability Models 1

Suppose we are studying a certain random phenomenon.

Consider, for example, the birth of a child.



That is, will the child be male or female?

We can’t know (too far) in advance.

Here’s what we do know:

1. The outcome will be either male or female.

2. The probability of each outcome is (roughly) 1/2.


Thus we’ve described:

1. A list of possible outcomes

2. A probability for each outcome.

These correspond to the two components of a probability model.

Before defining a probability model, we need a bit more terminology.


The sample space S of a random phenomenon is the set of all possible outcomes.

An event is an outcome or set of outcomes of a random phenomenon.

Thus, an event is a subset of the sample space.

For example, if S = {1, 2, 3, 4, 5, 6, 7, 8, 9} is a set of outcomes, then E = {2, 4, 6, 8} is an event.

Careful: An event need not be an individual outcome!

Finally. . .

A probability model is the description of a random phenomenon consisting of

1. a sample space S, and

2. a way of assigning probabilities to events in S.

Examples of Sample Spaces

• S = {M, F}

• S = {Republican, Democrat, Independent}

• S = {weights of 1,000 individuals in a sample}

A Baby-friendly Example

Suppose a couple plans to have three children.

Let S be the set of possible numbers of girls they can have.

That is, S = {0, 1, 2, 3}.

What is the probability of each outcome in S (assuming that the probability of a girl is 1/2)?

Incorrect answer: Each outcome is equally likely, so each has probability 1/4.

The Possible Outcomes

Possible outcomes with one child:

{B, G}

Possible outcomes with two children:

{BB, BG, GB, GG}

Possible outcomes with three children:

{BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}

Now, these outcomes are equally likely.

Calculating the Probabilities

Probability of no girls? 1/8

Favorable outcomes: BBB

Probability of exactly one girl? 3/8

Favorable outcomes: BBG, BGB, GBB

Probability of exactly two girls? 3/8

Favorable outcomes: BGG, GBG, GGB

Probability of three girls? 1/8

Favorable outcomes: GGG
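The counting argument above can be reproduced by listing all eight birth orders and tallying the number of girls in each. The following is a minimal sketch, assuming (as the slides do) that each birth order is equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely birth orders for three children: BBB, BBG, ..., GGG.
outcomes = ["".join(kids) for kids in product("BG", repeat=3)]

# Count how many birth orders give each number of girls.
counts = Counter(outcome.count("G") for outcome in outcomes)

# Each birth order has probability 1/8, so
# P(k girls) = (number of birth orders with k girls) / 8.
prob_girls = {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

for k, p in prob_girls.items():
    print(k, p)
```

The four probabilities sum to 1, but they are not all equal, which is why the "1/4 each" answer fails.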

Another Example: Blood Types

Let S = {O+, O−, A+, A−, B+, B−, AB+, AB−}.

If we choose an American at random, what is the probability that this person has, say, blood type O+?

Where do we get these probabilities?

These are the frequencies of occurrence of each blood type; we get them by taking lots and lots of samples.

A Donation Problem

Suppose that we need blood for someone with blood type AB−.

What is the probability that a randomly selected American has the right blood to donate?

Individuals with blood type O−, A−, B− and AB− can donate to this person.

So what we’re looking for is the probability of the event

E = {O−, A−, B−, AB−}.

How do we find this probability?

General Rules of Probability: Rules 1 and 2

Rule 1: Every probability is a number between 0 and 1.

That is, if A is any event in S, then

0 ≤ P(A) ≤ 1.

What if P(A) = 0?

• When S is finite, then this means that A is impossible.

What if P(A) = 1?

• When S is finite, then this means that A must occur.

Rule 2: The event consisting of all outcomes in the sample space has probability 1.

General Rules of Probability: Rule 3

Rule 3: If two events have no outcomes in common, then the probability that one or the other occurs is the sum of their individual probabilities.

When two events have no outcomes in common, we say that they are disjoint.

General Rules of Probability: Rule 3 (continued)

Rule 3: If A and B are disjoint, then

P(A or B) = P(A) + P(B).

(This is sometimes called the “addition rule”.)

In general, if A and B are any two events in S, then

P(A or B) = P(A) + P(B) − P(A and B).
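The general formula can be checked on a small example. The sketch below reuses the sample space S = {1, ..., 9} and the event {2, 4, 6, 8} from earlier, and adds two assumptions that are not in the slides: every outcome in S is equally likely, and B = {1, 2, 3} is a hypothetical second event chosen so that it overlaps A.

```python
from fractions import Fraction

S = set(range(1, 10))   # sample space from the earlier example
A = {2, 4, 6, 8}        # the event E from the earlier example
B = {1, 2, 3}           # hypothetical second event, chosen to overlap A

def prob(event):
    """P(event), assuming every outcome in S is equally likely."""
    return Fraction(len(event & S), len(S))

lhs = prob(A | B)                       # P(A or B)
rhs = prob(A) + prob(B) - prob(A & B)   # P(A) + P(B) - P(A and B)

print(lhs, rhs)
```

Subtracting P(A and B) corrects for the outcome 2, which would otherwise be counted once in P(A) and again in P(B). When A and B are disjoint that term is 0 and the formula reduces to Rule 3.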

General Rules of Probability: Rule 4

Rule 4: The probability that an event does not occur is 1 minus the probability that the event does occur.

P(A does not occur) = 1 − P(A)

Back to the Donation Problem

The addition rule holds for more than just two disjoint events:

P(O− or A− or B− or AB−)
  = P(O−) + P(A−) + P(B−) + P(AB−)
  = 0.07 + 0.06 + 0.02 + 0.01
  = 0.16

We also used the addition rule in the example with the couple having three children. Can you see where?
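The donation calculation, together with Rule 4, can be sketched in a few lines. Only the four probabilities quoted above are used; the dictionary and variable names are illustrative.

```python
# Probabilities of the four Rh-negative blood types, as used in the calculation above.
p_negative = {"O-": 0.07, "A-": 0.06, "B-": 0.02, "AB-": 0.01}

# The four events are disjoint, so the addition rule applies term by term.
p_can_donate = sum(p_negative.values())

# Rule 4: the complement is the event that the person cannot donate to an AB- patient.
p_cannot_donate = 1 - p_can_donate

print(round(p_can_donate, 2), round(p_cannot_donate, 2))
```

The rounding is there only because decimal fractions such as 0.07 are not stored exactly in floating point.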

Discrete Probability Models

So far, the probability models we’ve considered are discrete probability models.

A probability model is discrete if the sample space is made up of a list of individual outcomes (the first outcome, the second outcome, the third outcome, . . . ).

To assign probabilities in a discrete model, we merely list the probabilities of all the individual outcomes.

Continuous Probability Models

What kind of probability model should we use for continuous quantitative variables? These can take any number in a range of possible values.

First try: Histograms!

[Figure: histogram of heights (inches) of women age 40–49 in the U.S. Ignore the curve; just look at the histogram.]

Calculating Probabilities 1

On this graph the bins are intervals of 1 inch.

What if we want to know the probability someone is within half an inch of 60 inches?

P(59.5 ≤ X ≤ 60.5) = ?

Calculating Probabilities 2

We could keep asking for probabilities of smaller and smaller intervals.

But then there are infinitely many possible events!

This is a problem.

Is there an easier way?

Continuous Distributions

Solution: Use a curve to indicate the different outcomes, and let the probability of any given interval of values be the area under the curve.

These curves are called density curves.

Density Curves

All density curves have the following properties.

• The curve is always on or above the x-axis.

• The total area under the curve is equal to 1.

Continuous Probability Models: The Official Definition

A continuous probability model gives a density curve and assigns the probability of every interval as the area under the curve for that interval.

A Warning about Density Curves

No set of real data is exactly described by a density curve.

The curve is a model.

That is, the curve is an idealized description that is easy to use and accurate enough for practical use.

Think of the density curve like you would the regression line:

Least-squares regression models a linear trend, and is used to make predictions about similar individuals in the population.

Example: The Uniform Distribution on the Interval [a, b]

Example: Exponential Distributions

Finding Probabilities with a Density Curve

Let’s consider the uniform distribution on [0,1]:

What is

1. P(X ≤ 0.5)?

2. P(X = 0.5)?

3. P(X < 0.5)?

4. P(X ≤ 0.5 or X > 0.8)?

1. The area under the curve for the region x ≤ 0.5 is a 0.5 × 1 rectangle, and so P(X ≤ 0.5) = 0.5.

2. The area under the curve for this region is a 0 × 1 rectangle, and so P(X = 0.5) = 0.

3. Observe: P(X < 0.5) + P(X = 0.5) = P(X ≤ 0.5). Thus, P(X < 0.5) = P(X ≤ 0.5) = 0.5.

4. P(X ≤ 0.5 or X > 0.8) = P(X ≤ 0.5) + P(X > 0.8) = 0.5 + 0.2 = 0.7.
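These four answers are all areas of rectangles under a flat density curve, which can be sketched as below. The helper `uniform_prob` is a hypothetical name introduced for illustration. It clips the requested interval to [a, b] and multiplies its width by the height 1/(b − a) of the density.

```python
def uniform_prob(lo, hi, a=0.0, b=1.0):
    """P(lo <= X <= hi) for X uniform on [a, b]: width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to [a, b]
    width = max(hi - lo, 0.0)
    return width * (1.0 / (b - a))

p1 = uniform_prob(0.0, 0.5)                            # P(X <= 0.5)
p2 = uniform_prob(0.5, 0.5)                            # P(X = 0.5): zero-width rectangle
p4 = uniform_prob(0.0, 0.5) + uniform_prob(0.8, 1.0)   # disjoint intervals, so add

print(p1, p2, round(p4, 10))
```

Because a single point has zero width, the same function gives the same value for P(X < 0.5) and P(X ≤ 0.5), matching answer 3.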

Another Warning!

All continuous probability models assign probability 0 to any individual outcome.

Only intervals of values have positive probability.

Measures of Center: Median and Mean

Q1: What is the median of a density curve?

A1: It is the point a such that half the area is to the left and half the area is to the right:

P(X < a) = P(X > a) = 1/2

Q2: What is the mean of a density curve?

A2: It is the point a such that the curve would balance if the area under the curve were cut from a block of wood and held at point a.
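Both definitions can be approximated numerically for a specific curve. The sketch below assumes an exponential density with rate 1 (one member of the exponential family mentioned earlier, chosen here only as an example) and approximates areas with the midpoint rule. The grid size and the cutoff at 40 are arbitrary choices.

```python
import math

def density(x):
    """Exponential density with rate 1 (an assumed example of a right-skewed curve)."""
    return math.exp(-x)

# Midpoint-rule grid on [0, 40]; the area beyond 40 is negligible for this curve.
n = 40_000
dx = 40 / n
xs = [(i + 0.5) * dx for i in range(n)]
areas = [density(x) * dx for x in xs]

total_area = sum(areas)                        # should be close to 1
mean = sum(x * a for x, a in zip(xs, areas))   # balance point of the area

# Median: the first grid point at which the accumulated area reaches 1/2.
running = 0.0
for x, a in zip(xs, areas):
    running += a
    if running >= 0.5:
        median = x
        break

print(round(total_area, 4), round(median, 3), round(mean, 3))
```

For this right-skewed curve the balance point lies to the right of the equal-areas point, so the mean is larger than the median.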

For density curves, we use

• µ for the mean, and

• σ for the standard deviation.

(Why? We will see in Chapter 13).

Comparing Median and Mean

Random Variables

When we write P(X > 7), what exactly is X ?

X is a variable which represents the outcome of a random phenomenon. We call it a random variable.

X may take any possible outcome as its value. The assignment of probabilities to the values, or intervals of values, that X can take is called its probability distribution.

X may be either discrete or continuous.

• The X for number of daughters is discrete and finite.

• The X for heights of women is continuous.
