Go to Index Analysis of Means Farrokh Alemi, Ph.D. Kashif Haqqi M.D.

Table of Contents

• Review

• Objectives

• Definitions

• Expected Value

• Normal Distribution

• Distribution of Mean

• Central Limit Theorem

• Standard Normal Distribution

• Use of Z Values

• Confidence Interval

• Hypothesis

• Two Types of Error

• One-tailed Tests

• Steps in Testing a Hypothesis

• When to Assume Normal Distribution for Means

• Use t-distribution

Review

• Frequency distribution

• Mean, median, and mode

• Standard deviation and range

Statistics is the art of making sense of distributions.

Objectives

• Describe different distributions, including the Normal and t-distributions.

• Calculate and interpret confidence intervals using Normal distributions.

• Understand the types of error that occur in hypothesis testing.

• Test hypotheses using the t-distribution.

Example You Should Be Able to Answer at the End

Is it important to ask these types of questions?

• The cost of rehabilitation in the industry is $25,000, with a standard deviation of $3,000.

• Assume that the average cost in our hospital is $30,000.

• With 95% confidence, would you say that our cost is different from the industry's?

Definitions

• A random variable is a variable whose values are determined by chance.

• A probability distribution gives the probability with which each value of a random variable is observed.

• Probability of a value is the frequency of occurrence of that value divided by the frequency of occurrences of all values.

Example of Probability Estimates

• We examined the waiting time of 50 people at our emergency room and found that 10 people waited up to 5 minutes, 20 people waited 5.001 to 10 minutes, 13 people waited 10.001 to 15 minutes and 7 people waited 15.001 to 20 minutes.

• What is the probability of waiting up to 5 minutes?

• What is the probability of waiting up to 10 minutes?

Distributions help us make probability estimates about observed values.

Example of Probability Estimates (Continued)

• The probability of waiting up to 5 minutes is the number of people who waited up to 5 minutes divided by the total number of people: 10/50 = 0.20.

• The probability of waiting up to 10 minutes is the number of people who waited up to 10 minutes divided by the total number of people: (10+20)/50 = 0.60.
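
The same estimates can be reproduced in a few lines of Python. This is only a sketch: the dictionary layout and the helper name `prob_up_to` are assumptions made for illustration, while the counts come from the emergency room example above.

```python
# Waiting-time counts from the emergency room example:
# upper bound of each interval (minutes) -> number of people.
counts = {5: 10, 10: 20, 15: 13, 20: 7}
total = sum(counts.values())  # 50 people in all


def prob_up_to(minutes):
    """Estimated probability of waiting at most `minutes` (hypothetical helper)."""
    people = sum(n for upper, n in counts.items() if upper <= minutes)
    return people / total


p5 = prob_up_to(5)    # 10 / 50
p10 = prob_up_to(10)  # (10 + 20) / 50
print(p5, p10)
```

Each probability is a relative frequency, so the probabilities for all four intervals sum to 1.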

Expected Value

• Expected value of a distribution is the mean of the distribution.

• It represents our long run expectations about the distribution.

• The expected value of X is given by summing the product of each value of X, referred to as “i”, times its probability of occurring, referred to as p(X=i).

• Expected value = mean = Σ p(X=i) * i, summed over all values i.

Example Calculation of Expected Value or Mean

• We examined the waiting time of 50 people at our emergency room and found that 10 people waited up to 5 minutes, 20 people waited 6 to 10 minutes, 13 people waited 11 to 15 minutes and 7 people waited 16-20 minutes.

• What is the mean waiting time at our emergency room?

Example Calculation of Expected Value or Mean (Continued)

Do this in Excel

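Observed waiting time | Frequency | Probability | Probability times waiting time

2.5 | 10 | 0.20 | 0.50

7.5 | 20 | 0.40 | 3.00

12.5 | 13 | 0.26 | 3.25

17.5 | 7 | 0.14 | 2.45

Total | 50 | 1.00 | 9.20

The expected value or mean is 9.2 minutes, where each interval is represented by its midpoint (for example, 2.5 minutes for waits of up to 5 minutes).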
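
For readers working outside Excel, the table can be recomputed with a short Python sketch. The variable names are illustrative assumptions; the midpoints and frequencies are those in the table.

```python
# Interval midpoints (minutes) and observed frequencies from the table.
midpoints = [2.5, 7.5, 12.5, 17.5]
frequencies = [10, 20, 13, 7]

total = sum(frequencies)  # 50 observations
probabilities = [f / total for f in frequencies]

# Expected value: sum of probability times value over all values.
expected_value = sum(p * x for p, x in zip(probabilities, midpoints))
print(round(expected_value, 2))  # 9.2
```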

Normal Distribution

• A symmetric distribution, meaning that data are evenly distributed about the mean.

• Mean, median and mode are the same value.

• It has one mode and looks like a bell-shaped curve.

Normal Distribution Continued

• The curve is continuous; there are no gaps or holes.

• The curve never touches the X-axis: any value is possible, although extreme values have vanishingly small probabilities.

• 99.7% of values are within 3 standard deviations of the mean.

Distribution of Mean

• If you repeatedly take samples of observations and average each sample, the averages form a distribution of the mean.

• The distribution of the mean has the same mean as the distribution of the observations.

• Standard deviation of the mean = Standard error = Standard deviation of the observations / Square root of the sample size.

Example

• What are the mean, standard deviation and standard error for the following data: 4, 5, 6?

• Mean = 5

• Standard deviation = 1

• Standard error = 1 / √3 = 1 / 1.73 = 0.58
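
A quick check of this example in Python, using only the standard library. Note that `statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is what gives 1 here.

```python
import math
import statistics

data = [4, 5, 6]

mean = statistics.mean(data)     # 5
sd = statistics.stdev(data)      # sample standard deviation: 1.0
se = sd / math.sqrt(len(data))   # standard error: 1 / sqrt(3)

print(mean, sd, round(se, 2))
```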

Central Limit Theorem

• Consider any distribution of observations with mean μ and standard deviation σ.

• As the sample size n increases, the sample means will have a Normal distribution with mean μ and standard deviation σ / square root (n).

The theorem is important because it helps us ignore questions about the shape of the distribution and focus on its mean and standard deviation.

Do this in Excel
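
The theorem can also be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not part of the original exercise: it draws samples from an exponential distribution (chosen only because it is clearly not Normal, with mean 1 and standard deviation 1) and compares the spread of the sample means with σ / square root (n).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 50              # observations per sample (illustrative choice)
num_samples = 2000  # number of repeated samples (illustrative choice)

# Exponential distribution with rate 1: mean 1, standard deviation 1.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_se = 1.0 / math.sqrt(n)  # sigma / sqrt(n)

print(mean_of_means, sd_of_means, predicted_se)
```

The mean of the sample means should land close to 1, and their standard deviation close to the predicted standard error, even though the underlying observations are strongly skewed.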

Standard Normal Distribution

• A Normal distribution.

• Mean of zero.

• Standard deviation of 1.

• Z = (Observed value – mean) / standard deviation of the mean.

• Where standard deviation of the mean = standard error = standard deviation of observations divided by square root of sample size.

Example Calculation of Z

• What is the Z value for an observed mean of 16, if the mean of the distribution is 10 and the standard error is 2?

• Z = (16-10) / 2 = 3.

Another Example

• What is the Z value for the mean, 16, of 4 observations, if the average of repeated sample means is 10 and the standard deviation of the observations is 2?

• Standard deviation of mean = 2 / 4^0.5 = 2/2 =1

• Z value for 16 = (16-10)/1 = 6
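
Both Z calculations can be written as two small functions. The function names are hypothetical, chosen for this sketch; the formulas are the ones given on the preceding slides.

```python
import math


def standard_error(sd_observations, n):
    """Standard deviation of the mean: sd of observations / sqrt(sample size)."""
    return sd_observations / math.sqrt(n)


def z_value(observed_mean, mean, se):
    """Number of standard errors the observed mean lies from the mean."""
    return (observed_mean - mean) / se


# First example: the standard error is given directly as 2.
z1 = z_value(16, 10, 2)

# Second example: 4 observations with a standard deviation of 2.
z2 = z_value(16, 10, standard_error(2, 4))

print(z1, z2)  # 3.0 6.0
```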

Use of Z Values

• 99.7% of data are between z=3 and z=-3.

• Z is the number of standard deviations that X is away from the mean.

• About 0.15% of data are below z=-3.

• About 0.15% of data are above z=3.

Use of Z Value (Continued)

• 95% of data are within z=1.96 and z=-1.96

• 5% are outside z=1.96 and z=-1.96

• 2.5% of data are below z=-1.96

• 2.5% of data are above z=1.96
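
These percentages can be checked against the standard Normal distribution with Python's standard library (`statistics.NormalDist`, available from Python 3.8). The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of data within 3 and within 1.96 standard deviations of the mean.
within_3 = std_normal.cdf(3) - std_normal.cdf(-3)
within_196 = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# Share of data in the lower tail below z = -1.96.
below_196 = std_normal.cdf(-1.96)

# Going the other way: the z value that leaves 2.5% in the upper tail.
critical_z = std_normal.inv_cdf(0.975)

print(round(within_3, 4), round(within_196, 3), round(below_196, 3))
print(round(critical_z, 2))
```

The results are about 0.9973, 0.95 and 0.025, and the critical value rounds to 1.96.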

Confidence Interval

• For Normal distributions, the 95% two-tailed confidence interval corresponds to observations between z=-1.96 and z=1.96.

Example

• What is the 95% confidence interval for a mean of 10 and a standard deviation of 2?

• Lower limit = 10 - 1.96*2 = 6.08.

• Upper limit = 10 + 1.96*2 = 13.92.

• At 13.92, Z value is (13.92-10)/2 = 1.96.

• At 6.08, Z value is (6.08-10)/2 = -1.96.

• 95% of data fall within these limits.
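
The interval calculation generalizes to any mean and standard deviation. A minimal sketch follows; the function name and the default critical value of 1.96 are choices made for this illustration.

```python
def confidence_interval(mean, sd, z=1.96):
    """Two-tailed interval: mean plus or minus z standard deviations."""
    margin = z * sd
    return mean - margin, mean + margin


lower, upper = confidence_interval(10, 2)
print(round(lower, 2), round(upper, 2))  # 6.08 13.92
```

When the interval is for a sample mean rather than for single observations, the standard error would be passed in place of the standard deviation.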

Two Tailed Confidence Interval

• What percentage of data are between z=1.96 and z=-1.96? Answer: 95%. Often referred to as a two-tailed confidence interval.

• What percentage of data are below z=1.96? Answer: 97.5%. Often referred to as a one-tailed confidence interval.

• What percentage of data are above z=-1.96? Answer: 97.5%. Often referred to as a one-tailed confidence interval.

Hypothesis

• A statistical hypothesis is a conjecture about a population parameter.

• The null hypothesis is that there is no difference between the parameter and a value.

• The alternative hypothesis states there is a specific difference.

Experimental data can only reject a hypothesis, not accept it.

Possible Outcomes of Hypothesis Test

There are four possible outcomes:

1. We reject a hypothesis that is true.

2. We reject a hypothesis that is false.

3. We do not reject a hypothesis that is true.

4. We do not reject a hypothesis that is false.

Two Types of Error

Decision | Hypothesis is true | Hypothesis is false

We reject hypothesis | Type one error | Correct

We do not reject hypothesis | Correct | Type two error

Type 1 Error

• The level of significance is the maximum probability of type 1 error, symbolized by alpha, α.

• When we base our decision on 95% confidence intervals, 5% of the data are ignored at the two tails of the distribution. Therefore, there is a 5% chance that we will reject a hypothesis that is true.

• Type one error = 5%, α = 0.05.

One-tailed Tests

• In a two-tailed test, the hypothesis is rejected when the value is above the upper limit or below the lower limit.

• In a one-tailed test that a parameter is larger than a particular value, the hypothesis is rejected when the value is above the upper limit.

One-tailed Tests (Continued)

• When we base a one-tailed decision on the limit of a 95% confidence interval, 2.5% of the data are ignored at one tail of the distribution. Therefore, there is a 2.5% chance that we will reject a hypothesis that is true.

α = 0.025.

Steps in Testing a Hypothesis

1. State the null hypothesis.

2. Identify the alternative hypothesis.

3. Is this a one tailed or two tailed test?

4. Decide the critical Z value above or below which the hypothesis is rejected, usually 1.96.

5. Calculate the Z value corresponding to the observation.

6. Reject or do not reject the hypothesis by comparing the calculated Z to the critical values.

Example

• The cost of rehabilitation in the industry is $25,000, with a standard deviation of $3,000.

• In our hospital, the average cost is $30,000.

• With 95% confidence, would you say that our cost is different from the industry's?

Do this in Excel

Steps in Testing Example Hypothesis

1. The null hypothesis: Our cost is the same as the industry average.

2. Alternative hypothesis: Our cost is higher or lower than the industry average.

3. This is a two-tailed test.

4. The critical Z is +1.96 or –1.96.

5. Observed Z = (30000-25000)/3000 = 1.67.

6. Because 1.67 falls between –1.96 and +1.96, do not reject the null hypothesis.
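
The six steps can be condensed into a short Python sketch. The function name is hypothetical, and, as in the worked example, the $3,000 standard deviation is used directly as the standard error because no sample size is given.

```python
def z_test_two_tailed(observed, hypothesized_mean, standard_error, critical_z=1.96):
    """Return the observed Z and whether the null hypothesis is rejected."""
    z = (observed - hypothesized_mean) / standard_error
    reject = abs(z) > critical_z  # two-tailed: reject in either tail
    return z, reject


# Assumption: 3000 is treated as the standard error, as in step 5.
z, reject = z_test_two_tailed(30000, 25000, 3000)
print(round(z, 2), reject)  # 1.67 False
```

Since the observed Z is inside the critical values, the data do not show that our cost differs from the industry's at the 95% confidence level.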

When to Assume Normal Distribution for Means

• When the population variance is known and observations have a Normal distribution.

• When the population variance is unknown and there are more than 30 observations.

• Otherwise, use the t-distribution as an approximation for the Normal distribution.

Use t-distribution

• If the values in the population are Normal.

• If we have fewer than 30 observations.

• If we have to estimate the standard deviation from the sample because the variance of the population is not known.

• The t-distribution is used as an approximation for near Normal data.

Calculating t Statistic

• t = (observed average – mean) / standard deviation of the average.

• Critical value of t depends on sample size, through the degrees of freedom (sample size minus 1).

• For a one-tailed test with alpha = 0.025 or a two-tailed test with alpha = 0.05, the critical t value for a sample size of 10 (9 degrees of freedom) is 2.26 and for a sample size of 20 (19 degrees of freedom) is 2.09.

Calculating t Statistic (Continued)

• If we are examining a sample size of 10 (9 degrees of freedom), 95% of data are within t=2.26 and t=-2.26.

• If we are examining a sample size of 10 (9 degrees of freedom), 97.5% of data are below t=2.26.

Testing With t-distribution

1. State the null hypothesis.

2. Identify the alternative hypothesis.

3. Is this a one tailed or two tailed test?

4. Decide the critical t value above or below which the hypothesis is rejected; the value depends on sample size.

5. Calculate the t value corresponding to the observation.

6. Reject or do not reject the hypothesis by comparing the calculated t to the critical values.
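
The steps above can be sketched in Python using only the standard library. Several things here are assumptions for illustration: the function name, the reuse of the data 4, 5, 6 from the earlier standard error example, the hypothesized mean of 3, and the small lookup of two-tailed critical values (alpha = 0.05), which holds a few rows copied from a standard t table.

```python
import math
import statistics

# Two-tailed critical t values for alpha = 0.05, keyed by degrees of freedom.
# Only a few rows from a standard t table are included (illustrative subset).
CRITICAL_T_05 = {2: 4.303, 4: 2.776, 29: 2.045}


def one_sample_t_test(data, hypothesized_mean):
    """Return (t, critical t, reject?) for a two-tailed one-sample t test."""
    n = len(data)
    se = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
    t = (statistics.mean(data) - hypothesized_mean) / se
    critical = CRITICAL_T_05[n - 1]  # degrees of freedom = n - 1
    return t, critical, abs(t) > critical


# Hypothetical question: is the mean of 4, 5, 6 different from 3?
t, critical, reject = one_sample_t_test([4, 5, 6], 3)
print(round(t, 2), critical, reject)  # 3.46 4.303 False
```

With only three observations the critical value is large, so even a mean two units away from the hypothesized value does not lead to rejection.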

Example Data

Selecting Data Analysis

Select Descriptive Statistics

Enter Data Range

Result

The confidence interval is the mean plus or minus the value reported as the confidence level. If the interval does not include $30,000, then our hospital has a different cost structure from other hospitals in our database.