4 Continuous Random Variables and Probability...

101
4 Continuous Random Variables and Probability Distributions Stat 4570/5570 Material from Devore’s book (Ed 8) – Chapter 4 - and Cengage

Transcript of 4 Continuous Random Variables and Probability...

Page 1: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

4 Continuous RandomVariables and

Probability Distributions

Stat 4570/5570

Material from Devore’s book (Ed 8) – Chapter 4 - and Cengage

Page 2: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

2

Continuous r.v.A random variable X is continuous if possible valuescomprise either a single interval on the number line or aunion of disjoint intervals.

Example: If in the study of the ecology of a lake, X, the r.v.may be depth measurements at randomly chosen locations.Then X is a continuous r.v. The range for X is the minimumdepth possible to the maximum depth possible.

Page 3: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

3

Continuous r.v.In principle variables such as height, weight, andtemperature are continuous, in practice the limitations ofour measuring instruments restrict us to a discrete (thoughsometimes very finely subdivided) world.

However, continuous models often approximate real-worldsituations very well, and continuous mathematics (calculus)is frequently easier to work with than mathematics ofdiscrete variables and distributions.

Page 4: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

4

Probability Distributions for Continuous Variables

Suppose the variable X of interest is the depth of a lake ata randomly chosen point on the surface.

Let M = the maximum depth (in meters), so that anynumber in the interval [0, M ] is a possible value of X.

If we “discretize” X by measuring depth to the nearestmeter, then possible values are nonnegative integers lessthan or equal to M.

The resulting discrete distribution of depth can be picturedusing a probability histogram.

Page 5: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

5

Probability Distributions for Continuous Variables

If we draw the histogram so that the area of the rectangleabove any possible integer k is the proportion of the lakewhose depth is (to the nearest meter) k, then the total areaof all rectangles is 1:

Probability histogram of depth measured to the nearest meter

Page 6: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

6

Probability Distributions for Continuous Variables

If depth is measured much more accurately, each rectanglein the resulting probability histogram is much narrower,though the total area of all rectangles is still 1.

Probability histogram of depth measured to the nearest centimeter

Page 7: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

7

Probability Distributions for Continuous Variables

If we continue in this way to measure depth more and morefinely, the resulting sequence of histograms approaches asmooth curve.

Because for each histogram the total area of all rectanglesequals 1, the total area under the smooth curve is also 1.

A limit of a sequence of discrete histograms

Page 8: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

8

Probability Distributions for Continuous Variables

DefinitionLet X be a continuous rv. Then a probability distributionor probability density function (pdf) of X is a function f (x)such that for any two numbers a and b with a ≤ b,

P (a ≤ X ≤ b) =

Page 9: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

9

Probability Distributions for Continuous Variables

The probability that X takes on a value in the interval [a, b]is the area above this interval and under the graph of thedensity function:

P (a ≤ X ≤ b) = the area under the density curve between a and b

Page 10: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

10

Probability Distributions for Continuous Variables

For f (x) to be a legitimate pdf, it must satisfy the followingtwo conditions:

1. f (x) ≥ 0 for all x

2. = area under the entire graph of f (x) = 1

Page 11: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

11

ExampleConsider the reference line connecting the valve stem on atire to the center point.

Let X be the angle measured clockwise to the location ofan imperfection. One possible pdf for X is

Page 12: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

12

Example, contThe pdf is graphed in Figure 4.3.

The pdf and probability from Example 4

Figure 4.3

cont’d

Page 13: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

13

Example , contClearly f(x) ≥ 0. How can we show that the area of this pdfis equal to 1?

How do we calculate P(90 <= X <= 180)?

The probability that the angle of occurrence is within 90° ofthe reference line (reference line is at 0 degrees)?

cont’d

Page 14: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

14

Probability Distributions for Continuous Variables

Because whenever 0 ≤ a ≤ b ≤ 360 in Example 4.4 andP (a ≤ X ≤ b) depends only on the width b – a of the interval,X is said to have a uniform distribution.

DefinitionA continuous rv X is said to have a uniform distributionon the interval [A, B] if the pdf of X is

Page 15: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

15

Example“Time headway” in traffic flow is the elapsed time betweenthe time that one car finishes passing a fixed point and theinstant that the next car begins to pass that point.

Let X = the time headway for two randomly chosenconsecutive cars on a freeway during a period of heavyflow.

This pdf of X is essentially the one suggested in “The Statistical Properties ofFreeway Traffic” (Transp. Res., vol. 11: 221–228):

Page 16: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

16

Example , contThe graph of f (x) is given in Figure 4.4; there is no densityassociated with headway times less than .5, and headwaydensity decreases rapidly (exponentially fast) as xincreases from .5.

The density curve for time headway in Example 5

Figure 4.4

cont’d

Page 17: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

17

Example , contClearly, f (x) ≥ 0; to show that f (x)dx = 1, we use thecalculus result

e–kx dx = (1/k)e–k a.

What is the probability that headway time is at most 5seconds?

cont’d

Page 18: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

18

The Cumulative DistributionFunction

Page 19: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

19

The Cumulative Distribution Function

The cumulative distribution function F(x) for acontinuous rv X is defined for every number x by

F(x) = P(X ≤ x) =

For each x, F(x) is the area under the density curve to theleft of x. This is illustrated in Figure 4.5, where F(x)increases smoothly as x increases.

Figure 4.5A pdf and associated cdf

Page 20: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

20

ExampleLet X, the thickness of a certain metal sheet, have auniform distribution on [A, B].

The density function is shown in Figure 4.6.

Figure 4.6

The pdf for a uniform distribution

Page 21: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

21

Example , contFor x < A, F(x) = 0, since there is no area under the graphof the density function to the left of such an x.

For x ≥ B, F(x) = 1, since all the area is accumulated to theleft of such an x. Finally for A ≤ x ≤ B,

cont’d

Page 22: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

22

Example , contThe entire cdf is

The graph of this cdf appears in Figure 4.7.

Figure 4.7

The cdf for a uniform distribution

cont’d

Page 23: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

23

Using F (x) to Compute Probabilities

Page 24: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

24

Percentiles of a Continuous Distribution

When we say that an individual’s test score was at the 85thpercentile of the population, we mean that 85% of allpopulation scores were below that score and 15% wereabove.

Similarly, the 40th percentile is the score that exceeds 40%of all scores and is exceeded by 60% of all scores.

Page 25: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

25

Percentiles of a Continuous Distribution

PropositionLet p be a number between 0 and 1. The (100p)thpercentile of the distribution of a continuous rv X, denotedby η(p), is defined by

p = F(η(p)) = f(y) dy

η(p) is the specific value such that 100p% of the areaunder the graph of f(x) lies to the left of η(p) and 100(1 – p)%lies to the right.

Page 26: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

26

Percentiles of a Continuous Distribution

Thus η(.75), the 75th percentile, is such that the area underthe graph of f(x) to the left of η(.75) is .75.

Figure 4.10 illustrates the definition.

Figure 4.10

The (100p)th percentile of a continuous distribution

Page 27: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

27

Example 9The distribution of the amount of gravel (in tons) sold by aparticular construction supply company in a given week is acontinuous rv X with pdf

What is the cdf of sales for any x? How do you use this tofind the probability that X is less than .25? What about theprobability that X is greater than .75? What aboutP(.25 < X < .75)?

Page 28: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

28

Example 9The graphs of both f (x) and F(x) appear in Figure 4.11.

Figure 4.11

The pdf and cdf for Example 4.9

cont’d

Page 29: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

29

Example 9

How do we find the 40th percentile of this distribution?

How would you find the median of this distribution?

cont’d

Page 30: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

30

Percentiles of a Continuous Distribution

DefinitionThe median of a continuous distribution, denoted by , isthe 50th percentile, so satisfies .5 = F( ) That is, half thearea under the density curve is to the left of and half is tothe right of .

A continuous distribution whose pdf is symmetric—thegraph of the pdf to the left of some point is a mirror imageof the graph to the right of that point—has median equalto the point of symmetry, since half the area under thecurve lies to either side of this point.

Page 31: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

31

Expected Values

DefinitionThe expected or mean value of a continuous rv X withthe pdf f (x) is:

µ x = E(X) = x f(x) dx

Page 32: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

32

Example, contBack to the gravel example, the pdf of the amount ofweekly gravel sales X is:

Page 33: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

33

Expected Values of functions of r.v.

If h(X) is a function of X, then

E[h(X)] = µh(X) = h(x) f (x) dx

For h(X), a linear function,

E[h(X)] = E(aX + b) = a E(X) + b

Page 34: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

34

VarianceThe variance of a continuous random variable X with pdff(x) and mean value µ is

= V(X) = (x – µ)2 f (x) dx = E[(X – µ)2] = E(X2) – [E(X)]2

The standard deviation (SD) of X is σX =

When h(X) = aX + b, the expected value and variance ofh(X) satisfy the same properties as in the discrete case:

E[h(X)] = a µ + b and V[h(X)] = a2 σ2.

Page 35: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

35

Example, cont.

How do we compute the variance for the weeklygravel sales example?

Page 36: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

36Copyright © Cengage Learning. All rights reserved.

4.3 The Normaldistribution

Page 37: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

37

The Normal DistributionThe normal distribution is probably the most importantdistribution in all of probability and statistics.

Many populations have distributions that can be fit veryclosely by an appropriate normal (Gaussian, bell) curve.

Examples: height, weight, and other physicalcharacteristics, scores on various tests

Page 38: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

38

The Normal DistributionDefinitionA continuous rv X is said to have a normal distributionwith parameters µ and σ (or µ and σ2), where < µ <and 0 < σ, if the pdf of X is

f(x; µ, σ) = < x < (4.3)

e denotes the base of the natural logarithm systemand equals approximately 2.71828π is another mathematical constant with approximate value3.14159.

Page 39: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

39

The Normal DistributionThe statement that X is normally distributed withparameters µ and σ2 is often abbreviated X ~ N(µ, σ2).

Clearly f(x; µ, σ) ≥ 0, but a somewhat complicated calculusargument must be used to verify that f(x; µ, σ) dx = 1.

Similarly, it can be shown that E(X) = µ and V(X) = σ2, sothe parameters are the mean and the standard deviation ofX.

Page 40: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

40

The Normal DistributionFigure below presents graphs of f(x; µ, σ) for severaldifferent (µ, σ) pairs.

Two different normal density curves Visualizing µ and σ for a normal distribution

Page 41: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

41

The Standard NormalDistribution

Page 42: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

42

The Standard Normal DistributionThe normal distribution with parameter values µ = 0 andσ = 1 is called the standard normal distribution.

A random variable with this distribution is called a standardnormal random variable and is denoted by Z. Its pdf is:

< z <

The graph of f(z; 0, 1) is called the standard normal curve.Its inflection points are at 1 and –1.

The cdf of Z is which we denote by

Page 43: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

43

The Standard Normal DistributionThe standard normal distribution rarely occurs naturally.

Instead, it is a reference distribution from whichinformation about other normal distributions can beobtained via a simple formula.

These probabilities can then be found “normal tables.”

This can also be computed with a single command inR, Matlab, Mathematica, Stata, etc.

Page 44: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

44

The Standard Normal DistributionFigure below illustrates the probabilities tabulated in TableA.3:

Page 45: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

45

ExampleP(Z ≤ 1.25) = (1.25), a probability that is tabulated in anormal table.

What is this probability?

Figure below illustrates this probability:

cont’d

Page 46: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

46

Example , cont.b) P(Z ≥ 1.25) = ?c) Why does P(Z ≤ –1.25) = P(Z >= 1.25)? What is

(–1.25)?d) How do we calculate P(–.38 ≤ Z ≤ 1.25)?

cont’d

Page 47: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

47

Percentiles of the Standard NormalDistribution

Page 48: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

48

ExampleThe 99th percentile of the standard normal distribution isthat value of z such that the area under the z curve to theleft of the value is .99

Tables give for fixed z the area under the standard normalcurve to the left of z, whereas now –we have the area and want the value of z.

This is the “inverse” problem to P(Z ≤ z) = ?

How can the table be used for this?

Page 49: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

49

ExampleBy symmetry, the first percentile is as far below 0 as the99th is above 0, so equals –2.33 (1% lies below the firstand also above the 99th).

cont’d

Page 50: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

50

Percentiles of the Standard Normal Distribution

If p does not appear in a table, what can we do?

What is the 95th percentile of the normal distribution?

Page 51: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

51

zα NotationIn statistical inference, we need the z values that givecertain tail areas under the standard normal curve.

There, this notation will be standard:zα will denote the z value for which α of the area underthe z curve lies to the right of zα.

Page 52: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

52

zα Notation for z Critical ValuesFor example, z.10 captures upper-tail area .10, and z.01captures upper-tail area .01.

Since α of the area under the z curve lies to the right of zα,1 – α of the area lies to its left.

Thus zα is the 100(1 – α)th percentile of the standardnormal distribution.

Similarly, what does –zα mean?

Page 53: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

53

zα Notation for z Critical ValuesTable below lists the most useful z percentiles and zαvalues.

Page 54: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

54

Example – critical valuesz.05 is the 100(1 – .05)th = 95th percentile of the standardnormal distribution, so z.05 = 1.645.

The area under the standard normal curve to the left of–z.05 is also .05

Page 55: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

55

Nonstandard Normal DistributionsWhen X ~ N(µ, σ 2), probabilities involving X are computedby “standardizing.” The standardized variable is (X – µ)/σ.

Subtracting µ shifts the mean from µ to zero, and thendividing by σ scales the variable so that the standarddeviation is 1 rather than σ.

PropositionIf X has a normal distribution with mean µ and standarddeviation σ, then

is distributed standard normal.

Page 56: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

56

Nonstandard Normal DistributionsWhy do we standardize normal random variables?

Equality of nonstandard and standard normal curve areas

Page 57: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

57

Nonstandard Normal Distributionshas a standard normal distribution. Thus

Page 58: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

58

Using Normal to approximate theBinomial Distribution

Page 59: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

59

Approximating the Binomial Distribution

Figure below displays a binomial probability histogram forthe binomial distribution with n = 20, p = .6,for which µ = 20(.6) = 12 and σ =

Binomial probability histogram for n = 20, p = .6 with normal approximation curve superimposed

Page 60: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

60

Approximating the Binomial Distribution

Let X be a binomial rv based on n trials with successprobability p. Then if np is large (the binomial probabilityhistogram is not too skewed), X has approximately anormal distribution with µ = np and σ =

In particular, for x = a possible value of X?

Page 61: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

61

Exponential Distribution

Page 62: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

62

The Exponential DistributionsThe family of exponential distributions provides probabilitymodels that are very widely used in engineering andscience disciplines. Examples?

DefinitionX is said to have an exponential distribution with the rateparameter λ (λ > 0) if the pdf of X is

Page 63: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

63

The Exponential DistributionsIntegration by parts give the following results:

How would we set up the calculation for EX?Both the mean and standard deviation of the exponentialdistribution equal 1/λ.

CDF:

Page 64: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

64

Another important application of the exponential distributionis to model the distribution of lifetimes.

A partial reason for the popularity of such applications isthe “memoryless” property of the Exponential distribution.

The Exponential Distributions

Page 65: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

65

The Exponential DistributionsSuppose a light bulb’s lifetime is exponentially distributedwith parameter λ.

Say you turn the light on, and then we leave and comeback after t0 hours to find it still on. What is the probabilitythat the light bulb will last for at least additional t hours?

In symbols, we are looking for P(X ≥ t + t0 | X ≥ t0).

How would we calculate this?

Page 66: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

66

The Gamma Distribution

Page 67: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

67

The Gamma FunctionTo define the family of gamma distributions, we first needto introduce a function that plays an important role in manybranches of mathematics.

DefinitionFor α > 0, the gamma function is defined by

Page 68: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

68

The Gamma FunctionThe most important properties of the gamma function arethe following:

1. For any α > 1, Γ(α) = (α – 1) Γ(α – 1) [via integration by parts]

2. For any positive integer, n, Γ(n) = (n – 1)!

3.

Page 69: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

69

The Gamma FunctionSo if we let

then f(x; α) ≥ 0 and ,

so f(x; a) satisfies the two basic properties of a pdf.

Page 70: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

70

The Gamma DistributionDefinitionA continuous random variable X is said to have a gammadistribution if the pdf of X is

where the parameters α and β satisfy α > 0, β > 0. Thestandard gamma distribution has β = 1.

Page 71: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

71

The Gamma DistributionThe Exp dist results from taking α = 1 and β = 1/λ.

Figure on left illustrates the gamma pdf f(x; α, β) for several(α, β) pairs, and right the standard gamma pdf.

Gamma density curves standard gamma density curves

Page 72: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

72

The Gamma DistributionThe mean and variance of a random variable X having the gammadistribution f (x; α, β) are

E(X) = µ = αβ V(X) = σ2 = αβ2

When X is a standard gamma rv, the cdf of X,

is often called the incomplete gamma function.

Tables with probabilities are available. Even better, R (pgamma),Matlab, Mathematica, etc can all calculate gamma probabilities.

Page 73: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

73

ExampleSuppose the survival time X (weeks) of a random mousehas a gamma distribution with α = 8 and β = 15.

What are EX, Var(X), and sd(X)?

What is the probability probability that a mouse survivesbetween 60 and 120 weeks? What is the probability that amouse survives at least 30 weeks? (Use tables and R)

Page 74: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

74

The Chi-Squared Distribution

Page 75: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

75

The Chi-Squared DistributionDefinitionLet v be a positive integer. Then a random variable X issaid to have a chi-squared distribution with parameter v ifthe pdf of X is the gamma density with α = v/2 and β = 2.The pdf of a chi-squared rv is thus

The parameter is called the number of degrees offreedom (df) of X. The symbol χ2 is often used in place of“chi-squared.”

(4.10)

Page 76: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

76

The Weibull Distribution

Page 77: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

77

The Weibull DistributionThe family of Weibull distributions was introduced by theSwedish physicist Waloddi Weibull in 1939; his 1951 article“A Statistical Distribution Function of Wide Applicability”(J. of Applied Mechanics, vol. 18: 293–297) discusses anumber of applications.

DefinitionA random variable X is said to have a Weibull distributionwith parameters α and β (α > 0, β > 0) if the pdf of X is

(4.11)

Page 78: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

78

The Weibull DistributionIn some situations, there are theoretical justifications for theappropriateness of the Weibull distribution, but in manyapplications f (x; α, β) simply provides a good fit toobserved data for particular values of α and β.

When α = 1, the pdf reduces to the exponential distribution(with λ = 1/β), so the exponential distribution is a specialcase of both the gamma and Weibull distributions.

Page 79: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

79

The Weibull DistributionBoth α and β can be varied to obtain a number of different-looking density curves, as illustrated in

Weibull density curves

Page 80: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

80

The Weibull Distributionβ is called a scale parameter, since different values stretchor compress the graph in the x direction, and α is referredto as a shape parameter.

Integrating to obtain E(X) and E(X2) yields

The computation of µ and σ

2 requires use of the gammafunction.

Page 81: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

81

The Weibull DistributionThe integration is easily carried out to obtainthe cdf of X.

The cdf of a Weibull rv having parameters α and β is

(4.12)

Page 82: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

82

The Lognormal Distribution

Page 83: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

83

The Lognormal DistributionDefinition

A nonnegative rv X is said to have a lognormaldistribution if the rv Y = ln(X) has a normal distribution.

The resulting pdf of a lognormal rv when ln(X) is normallydistributed with parameters µ and σ is

Page 84: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

84

The Lognormal DistributionThe parameters µ and σ are not the mean and standarddeviation of X but of ln(X).

It is common to refer to µ and σ as the location and thescale parameters, respectively. The mean and variance ofX can be shown to be

Page 85: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

85

The Lognormal DistributionFigure below illustrates graphs of the lognormal pdf;although a normal curve is symmetric, a lognormal curve isskewed. Is the skew positive or negative?

Page 86: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

86

The Lognormal DistributionBecause ln(X) has a normal distribution, the cdf of X can beexpressed in terms of the cdf φ(z) of a standard normalrv Z.

F(x; µ, σ) = P(X ≤ x) = P [ln(X) ≤ ln(x)]

(4.13)

Page 87: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

87

The Beta Distribution

Page 88: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

88

The Beta DistributionSo far, all families of continuous distributions (except forthe uniform distribution) had positive density over an infiniteinterval.

The beta distribution provides positive density only for X inan interval of finite length [A,B].

The standard beta distribution is commonly used to modelvariation in the proportion or percentage of a quantityoccurring in different samples. Examples?

Page 89: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

89

The Beta DistributionDefinitionA random variable X is said to have a beta distributionwith parameters α, β (both positive), A, and B if the pdf ofX is

The case A = 0, B = 1 gives the standard betadistribution.

Page 90: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

90

The Beta DistributionFigure below illustrates several standard beta pdf’s.

Standard beta density curves

Page 91: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

91

The Beta DistributionGraphs of the general pdf are similar, except they areshifted and then stretched or compressed to fit over [A, B].

Unless α and β are integers, integration of the pdf tocalculate probabilities is difficult. Either a table of theincomplete beta function or appropriate software should beused.

The mean and variance of X are

Page 92: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

92

Examples…

Page 93: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

93

Example 1Suppose the pdf of the magnitude X of a dynamic load on abridge (in newtons) is

What is F(x)?

Page 94: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

94

Example 1The graphs of f(x) and F(x) are shown in Figure 4.9.

The pdf and cdf for Example 4.7

cont’d

Page 95: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

95

Example 1What is the probability that the load is between 1 and 1.5?The probability that the load exceeds 1?

The average load? The median load?

cont’d

Page 96: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

96

Example 2Two species are competing in a region for control of alimited amount of a certain resource.

Let X = the proportion of the resource controlled by species1 and suppose X has pdf 0 ≤ x ≤ 1 otherwise

which is a uniform distribution on [0, 1]. (In her bookEcological Diversity, E. C. Pielou calls this the “broken- tick”model for resource allocation, since it is analogous tobreaking a stick at a randomly chosen point.)

f(x) =

Page 97: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

97

Example 2Then the species that controls the majority of this resourcecontrols the amount

h(X) = max (X, 1 – X) =

What is the expected amount controlled by the specieshaving majority control?

cont’d

Page 98: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

98

Example 3The time that it takes a driver to react to the brake lights ona decelerating vehicle is critical in helping to avoid rear-endcollisions.

The article “Fast-Rise Brake Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391–395) suggeststhat reaction time for an in-traffic response to a brake signalfrom standard brake lights can be modeled with a normaldistribution having mean value 1.25 sec and standarddeviation of .46 sec.

What is the probability that reaction time is between 1.00sec and 1.75 sec?

Page 99: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

99

Example 3Similarly, if we view 2 seconds as a critically long reactiontime, the probability that actual reaction time will exceedthis value is

cont’d

Page 100: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

100

Example 4According to the article “Predictive Model for PittingCorrosion in Buried Oil and Gas Pipelines” (Corrosion,2009: 332–342), the lognormal distribution has beenreported as the best option for describing the distribution ofmaximum pit depth data from cast iron pipes in soil.

The authors suggest that a lognormal distribution withµ = .353 and σ = .754 is appropriate for maximum pit depth(mm) of buried pipelines.

Page 101: 4 Continuous Random Variables and Probability Distributionsamath.colorado.edu/sites/default/files/2014/02/1358505463/Week4.pdf · 4 Continuous Random Variables and Probability Distributions

101

Example 4What are the mean and variance of pit depth?

What is the probability that maximum pit depth is between1 and 2 mm?

What value c is such that only 1% of all specimens have amaximum pit depth exceeding c?

cont’d