Health Economics- Lecture Ch03

  • 8/3/2019 Health Economics- Lecture Ch03

    1/28

    Statistical Tools

    Dr. Katherine Sauer

    Metropolitan State College of Denver

    Health Economics

    Outline:

    I. Hypothesis Testing

    II. Difference of Means

    III. Regression Analysis

    I. Hypothesis Testing

    A. Simple Hypothesis

    Men and women smoke different numbers of cigarettes

    State the hypothesis:

    Null hypothesis

    (hypothesis we wish to disprove):

    H0: cm = cw

    ex: men and women

    smoke the same number

    of cigarettes

    Alternative hypothesis

    (hypothesis that theory

    suggests to be the case)

    H1: cm ≠ cw

    ex: men and women do not smoke the same

    number of cigarettes

    B. Composite Hypothesis

    Rich people spend more on health care than do poor

    people

    State the hypothesis

    Null hypothesis

    (hypothesis we wish to disprove):

    H0: Er = Ep

    ex: the rich and poor

    spend the same amount

    Alternative hypothesis

    (hypothesis that theory

    suggests to be the case)

    H1: Er> Ep

    ex: the rich spend more than the poor

    II. Difference in Means

    Consider the example of men's and women's smoking.

    To compare men's and women's smoking rates we could ask people from the population at large how many

    cigarettes they smoke per day.

    Since we can't ask everyone, how do we decide upon the sample to use?

    Since many things other than gender may affect the

    number of cigarettes a person smokes, we can account

    for this by selecting a sample of people randomly

    from the universe of all people.

    We could also select a sample of people from a

    relatively homogeneous group, like, college

    sophomores from a given college.
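
    As a minimal sketch of the first approach, the snippet below draws a simple random sample in Python. The population of respondent IDs, the sample size, and the helper name draw_sample are illustrative assumptions, not part of the lecture.

```python
import random


def draw_sample(population, n, seed=None):
    """Return n distinct members of population, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: ID numbers for 1,000 people.
population = list(range(1000))
sample = draw_sample(population, 50, seed=0)
print(len(sample))
```

    Because every person has the same chance of being chosen, factors other than gender should tend to balance out across the men and women in the sample.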

    Types of Data

    Continuous - natural measures that in principle could take

    on different values for each observation

    ex: height, weight, income, price

    Categorical - refer to arbitrary categories

    ex: gender (male or female)

    race (black, white, or other)

    location (urban or rural)

    Is the number of cigarettes smoked continuous or

    categorical?

    Using NIH data for smokers from 2001 and 2002 it was found that:

    For 4,714 men, cm = 15.60 cigarettes per day

    For 4,841 women, cw = 13.47 cigarettes per day

    the difference is cm - cw = 2.13 cigarettes per day

    The data shows a difference in the average number of

    cigarettes smoked per day by men and women.

    Does the difference represent a true difference

    between men and women smoking?

    or

    Did the sample randomly draw a higher average level

    for men (15.60) than for women (13.47)?

    Let's look at the sample distribution.

    Based on the distribution,

    some men and some women smoked far fewer

    and some smoked far

    more than the average.

    Variance is a measure of

    the dispersion of

    cigarettes smoked around

    the average.

    mean: men (15.60) , women (13.47)

    The larger the variance, the greater the dispersion around the mean.

    - another observation may be far from the

    sample mean

    The smaller the variance, the smaller the dispersion around the mean.

    - another observation is likely close to the

    sample mean

    In testing a hypothesis, would you rather see a large or

    small variance in your sample data?

    The square root of the variance is called the standard deviation, s.

    A larger standard deviation indicates more dispersion

    around the mean.

    A smaller standard deviation indicates less dispersion

    around the mean.
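
    A small sketch of how the variance and the standard deviation are related, using Python's statistics module. The five cigarettes-per-day answers are made up for illustration and are not the NIH data.

```python
import math
import statistics

# Hypothetical cigarettes-per-day answers from five respondents.
cigarettes = [10, 20, 15, 5, 25]

mean = statistics.mean(cigarettes)          # sample average
variance = statistics.variance(cigarettes)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)               # standard deviation s

print(mean, variance, round(std_dev, 2))
```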

    The standard error of the mean is the standard deviation

    divided by the square root of the number of observations.
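
    The definition above can be sketched as a short function. The data are hypothetical, and the helper name standard_error is an illustrative choice.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical cigarettes-per-day answers; not the NIH data.
small_sample = [10, 20, 15, 5, 25]
large_sample = small_sample * 20  # the same five values repeated 20 times

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)
print(round(se_small, 3), round(se_large, 3))
```

    With a similar spread but many more observations, the standard error shrinks, which is why large samples give more precise estimates of the mean.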

    To test our smoking hypothesis formally, we can

    construct a difference of means test.

    - good for continuous data that can be broken

    up by categories

    We wish to compare the value,

    difference = cm - cw, to zero, the value stated by the null hypothesis.

    Recall: difference = 2.13

    The standard error of the difference is calculated to be

    equal to 0.216.
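
    The lecture reports the standard error of the difference without showing the calculation. One common formula for two independent samples is the square root of (s_m^2 / n_m + s_w^2 / n_w). The sketch below applies it with the sample sizes from the lecture and ASSUMED standard deviations, so its result is only illustrative and is not expected to reproduce 0.216 exactly.

```python
import math


def se_of_difference(sd_a, n_a, sd_b, n_b):
    """Standard error of (mean_a - mean_b) for two independent samples."""
    return math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)


n_men, n_women = 4714, 4841     # sample sizes from the lecture
sd_men, sd_women = 10.5, 10.5   # ASSUMED standard deviations (not in the lecture)

se_diff = se_of_difference(sd_men, n_men, sd_women, n_women)
print(round(se_diff, 3))
```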

    About 68 percent of a normal distribution lies within 1 standard error of the mean:

    2.13 - 0.216 = 1.91

    2.13 + 0.216 = 2.35

    About 95 percent of a normal distribution lies within 2 standard errors of the mean:

    2.13 - (2)(0.216) = 1.70

    2.13 + (2)(0.216) = 2.56

    How does this compare to our null hypothesis that the

    value difference is zero?
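
    The bands can be recomputed in a few lines, which also makes the comparison with the null value explicit. The difference and standard error come from the lecture; the helper name band is an illustrative choice.

```python
difference = 2.13   # cm - cw, from the lecture
std_error = 0.216   # standard error of the difference, from the lecture


def band(estimate, se, k):
    """Interval of k standard errors on either side of the estimate."""
    return estimate - k * se, estimate + k * se


low_1, high_1 = band(difference, std_error, 1)
low_2, high_2 = band(difference, std_error, 2)

# The null hypothesis says the true difference is 0.
zero_inside_2se = low_2 <= 0 <= high_2
print(round(low_2, 2), round(high_2, 2), zero_inside_2se)
```

    Zero lies far below the lower end of even the 2 standard error band, which is evidence against the null hypothesis.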

    The t test:

    The t statistic is calculated as the estimated value (here, the difference) divided by its

    standard error.

    In our example: 2.13 / 0.216 = 9.86

    As a rule of thumb, if the t-statistic is greater than 2 in absolute value,

    you have statistical significance at roughly the 5 percent level.
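
    A minimal sketch of the t test as a function. The inputs are the lecture's numbers; the null_value parameter and the use of the absolute value (so that negative estimates are treated the same way) are additions made here for illustration.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t = (estimate - value under the null hypothesis) / standard error."""
    return (estimate - null_value) / std_error


t = t_statistic(2.13, 0.216)   # difference and standard error from the lecture
significant = abs(t) > 2       # rule of thumb from the lecture
print(round(t, 2), significant)
```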

    This experiment would find very good evidence that

    among smokers, women smoke fewer cigarettes than

    men.

    The males have higher levels than the females, and the

    difference is statistically significant at well beyond the

    95 percent confidence level.

    III. Regression Analysis

    - good for data that is continuous

    Suppose we wish to explore the relationship between the cigarette tax and the number of cigarettes smoked per

    day.

    null hypothesis: no effect (b = 0)

    alternative hypothesis: tax is inversely related to the quantity smoked (b < 0)

    We want to know if the coefficient of -3.24 is

    significantly different from zero.

    A coefficient of -3.24 means:

    A $1 increase in the tax is associated with a decrease in

    quantity demanded of 3.24 cigarettes per day.
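
    To show where such a coefficient and its t statistic come from, here is a minimal least-squares sketch in plain Python. The tax and cigarette figures are MADE-UP illustration data, not the data behind the lecture's -3.24, so the numbers it prints will differ.

```python
import math

# MADE-UP data: excise tax in dollars and cigarettes smoked per day.
tax = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
cigs = [15.5, 14.6, 14.2, 13.3, 12.9, 12.0]

n = len(tax)
mean_x = sum(tax) / n
mean_y = sum(cigs) / n

# Least-squares slope b and intercept a for: cigs = a + b * tax
sxx = sum((x - mean_x) ** 2 for x in tax)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tax, cigs))
b = sxy / sxx
a = mean_y - b * mean_x

# Standard error of b, then its t statistic against the null b = 0
residuals = [y - (a + b * x) for x, y in zip(tax, cigs)]
s2 = sum(r ** 2 for r in residuals) / (n - 2)
se_b = math.sqrt(s2 / sxx)
t_b = b / se_b
print(round(b, 2), round(se_b, 2), round(t_b, 1))
```

    The slope is read the same way as in the lecture: the change in cigarettes per day associated with a $1 increase in the tax.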

    The elasticity is -0.09. This means a 1% increase in the

    tax will lead to a 0.09% reduction in quantity

    demanded.
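
    The lecture does not show how the elasticity was obtained. One common way to convert a linear regression coefficient into an elasticity is to evaluate it at the sample means: elasticity = b x (mean tax / mean quantity). The sketch below uses the coefficient from the lecture with ASSUMED means chosen only for illustration.

```python
def elasticity_at_means(coefficient, mean_x, mean_y):
    """Point elasticity of y with respect to x, evaluated at the means."""
    return coefficient * mean_x / mean_y


b = -3.24            # coefficient on the tax, from the lecture
mean_tax = 0.40      # ASSUMED average tax in dollars (not in the lecture)
mean_cigs = 14.5     # ASSUMED average cigarettes per day (not in the lecture)

elasticity = elasticity_at_means(b, mean_tax, mean_cigs)
pct_change_in_quantity = elasticity * 1.0   # response to a 1% tax increase
print(round(elasticity, 2))
```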

    A multiple regression includes more than one

    explanatory variable.

    ex: gender, race, age, education, income

    Some of the variables may be continuous, some may be

    categories.

    - interpretation is different

    Continuous variables

    Notice how adding more variables changes the

    coefficient on excise tax.

    Is it still significant?

    Income:

    Age:

    Education:

    When using categorical variables in a regression, we need

    to assign them a numerical value.

    - dummy variables

    Dummy variables are used in regression analysis to

    determine whether groups of people differ from others.

    For example, maybe we would want to know if African

    Americans smoke more than other groups.

    We can create a dummy variable that assigns the value 1

    if the person is African American or 0 otherwise.
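
    Coding a dummy variable can be sketched as follows. The survey records and field names are hypothetical.

```python
# Hypothetical survey records; the field names are illustrative.
people = [
    {"race": "African American", "gender": "female"},
    {"race": "white", "gender": "male"},
    {"race": "other", "gender": "female"},
]


def add_dummies(person):
    """Return a copy of the record with 0/1 dummy variables added."""
    coded = dict(person)
    coded["african_american"] = 1 if person["race"] == "African American" else 0
    coded["male"] = 1 if person["gender"] == "male" else 0
    return coded


coded_people = [add_dummies(p) for p in people]
print([(p["african_american"], p["male"]) for p in coded_people])
```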

    Because male appears as a variable, we know that male was

    assigned a value of 1 (female = 0).

    Is the male coefficient significant?

    The interpretation of a dummy variable is different from

    that of a continuous variable.

    Difference in cigarettes smoked, relative to a white female:

                          African American
                          No = 0      Yes = 1

    Male    No = 0        0           -5.05

            Yes = 1       2.23        -2.82

    An African American female smokes 5.05 fewer cigarettes than a white female.

    A white male smokes 2.23 more cigarettes than a white female.

    An African American male smokes 2.82 fewer cigarettes than a white female: 2.23 - 5.05 = -2.82
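
    The same arithmetic can be sketched in code: each dummy coefficient is added only when its dummy equals 1. The two coefficients are the lecture's; the function name is illustrative, and the sketch assumes the other variables are held fixed and that the regression has no interaction term.

```python
# Dummy-variable coefficients from the lecture's regression.
COEF_MALE = 2.23
COEF_AFRICAN_AMERICAN = -5.05


def difference_from_baseline(male, african_american):
    """Predicted cigarettes relative to a white female (both dummies 0)."""
    return COEF_MALE * male + COEF_AFRICAN_AMERICAN * african_american


for male in (0, 1):
    for african_american in (0, 1):
        diff = difference_from_baseline(male, african_american)
        print(male, african_american, round(diff, 2))
```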

    Summary of Statistical Competencies:

    Formulate questions in terms of hypotheses.

    Read statistical test results to determine if the result is

    significant.

    Understand statistical significance.

    Interpret reported regression results.