The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.

Interpreting multivariate OLS and logit coefficients

Jane E. Miller, PhD

Overview

• What elements to report for coefficients
• Coefficients on
  – Continuous independent variables (IVs; predictors)
  – Categorical independent variables

• Ordinary least squares (OLS) and logit coefficients

• Topic sentences for paragraphs reporting multivariate coefficients

Report and interpret results

• Report detailed multivariate results in tables.
  – Coefficients.
  – Inferential statistical results:
    • standard error or test statistic,
    • p-value or symbol.
  – Model goodness-of-fit statistics.
• Interpret coefficients in the text.
  – Refer to the associated table.

What to report for coefficients

• Topic
  – Independent variable (IV)
  – Dependent variable (DV)
• Direction (AKA “sign”)
• Magnitude (AKA “size”)
• Units or categories
• Statistical significance
• Most authors remember to report statistical significance, so I have listed that element last!

Interpreting coefficients

• Poor: “The effect of public insurance was –7.2 (p < 0.05).”
  – Reports the coefficient without interpreting it. Without units or reference group, the meaning of “–7.2” cannot be interpreted.
• Better: “Children with private insurance stayed on average 7.2 days longer than those with public insurance (p < 0.05).”
  – Interprets the β in intuitive terms, mentioning the topic, units, categories, direction, magnitude, and statistical significance.

More examples of interpreting βs

• Poor: “Insurance and length of stay were associated (p < 0.05).”
  – Topic and statistical significance, but not direction or size.
• Better: “Privately insured children stayed longer than publicly insured children (p < 0.05).”
  – Statistical significance and direction, but not size.
• Best: “Children with private insurance stayed on average 7.2 days longer than those with public insurance (p < 0.05).”
  – Topic, direction, magnitude, units, categories, and statistical significance.

Interpretation of βs depends on types of variables in your models

• The type of dependent variable:
  – Continuous dependent variable
    • Ordinary least squares (OLS)
  – Categorical dependent variable
    • Logistic (logit) regression model
• The type of independent variable:
  – Continuous
  – Categorical

Interpreting coefficients from ordinary least squares (OLS) models

Coefficients for OLS models

• For ordinary least squares (OLS) models, the coefficient (β) is a measure of difference in the DV for a 1-unit increase in the IV.
  – For unstandardized coefficients, the difference is in the same units as the dependent variable.
• Can be explained using wording for results of subtraction.
• For standardized coefficients, β measures difference in standardized units (multiples of standard deviations).
  – See podcast about resolving the Goldilocks problem using model specification.
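
The link between the two kinds of coefficient can be written compactly. This is the standard textbook relationship, added here as a sketch rather than taken from the slides. The symbols β* (standardized coefficient), s_x, and s_y (sample standard deviations of the IV and the DV) are notation introduced for illustration.

```latex
\beta^{*} = \beta \times \frac{s_x}{s_y}
```

Read this way, β* is the difference in the DV, in multiples of its standard deviation, for a one-standard-deviation increase in the IV.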

Interpreting βs: continuous predictors

• The unstandardized coefficient on a continuous predictor in an OLS model measures
  – The difference in the dependent variable for a one-unit increase in the independent variable.
  – Effect size is in original units of the DV.
• Example topic: Mother’s age as a predictor of birth weight:
  – Dependent variable = birth weight in grams.
  – Independent variable = mother’s age in years.
  – Both are continuous variables.

Example: Mother’s age as a predictor of birth weight

• Poor: “Mother’s age and child’s birth weight are correlated (p < 0.01).”
  – Names the dependent and independent variables and conveys statistical significance, but not direction or magnitude of the association.
• Better: “As mother’s age increases, her child’s birth weight also increases (p < 0.01).”
  – Concepts, direction, and statistical significance, but not size.
• Best: “For each additional year of mother’s age at the time of her child’s birth, the child’s birth weight increases by 10.7 grams (p < 0.01).”
  – Concepts, units, direction, size, and statistical significance.
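
The arithmetic behind the “best” sentence can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original slides: it takes the coefficient of 10.7 grams per year from the example and scales it to a larger contrast. The function names and the 5-year contrast are assumptions chosen for illustration, and the sketch assumes mother's age enters the model as a linear term, so each additional year contributes the same difference.

```python
def difference_in_dv(beta, contrast=1.0):
    """Difference in the DV implied by an unstandardized OLS coefficient.

    beta: coefficient, in DV units per one unit of the IV.
    contrast: size of the IV difference, in IV units (hypothetical choice).
    """
    return beta * contrast


def describe(beta, contrast, iv_unit, dv_name, dv_unit):
    """Build a sentence that states direction, magnitude, and units."""
    diff = difference_in_dv(beta, contrast)
    direction = "higher" if diff > 0 else "lower"
    return (f"A {contrast:g}-{iv_unit} difference is associated with a "
            f"{dv_name} that is {abs(diff):.1f} {dv_unit} {direction}.")


# Coefficient from the slide: 10.7 grams per year of mother's age.
BETA_AGE = 10.7

one_year = describe(BETA_AGE, 1, "year", "birth weight", "grams")
five_year = describe(BETA_AGE, 5, "year", "birth weight", "grams")
print(one_year)
print(five_year)
```

Scaling the contrast changes only the magnitude reported. The direction and the units come from the coefficient and the variable definitions, as the slide's checklist requires.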

Interpreting βs: categorical predictors

• The β on a categorical IV in an OLS model measures the difference in the DV for the category of interest compared to the reference category.
  – A “1-unit increase” does NOT make sense.
• Example: gender
  – Dummy variable (AKA “binary variable”) coded
    • 1 = boy
    • 0 = girl = omitted (reference) category
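
Why the coefficient on a dummy variable reads as a difference between categories can be checked with a small sketch. The birth weights below are hypothetical values invented for illustration, not data from the slides, and `ols_slope` is a helper written for this example. With a single 0/1 predictor, the OLS slope equals the difference in group means. In a multivariate model the coefficient is instead the difference net of the other predictors.

```python
from statistics import mean

# Hypothetical birth weights in grams; NOT data from the slides.
infants = [
    ("boy", 3500), ("boy", 3420), ("boy", 3610),
    ("girl", 3380), ("girl", 3300), ("girl", 3460),
]

# Dummy coding as on the slide: 1 = boy, 0 = girl (reference category).
x = [1 if sex == "boy" else 0 for sex, _ in infants]
y = [weight for _, weight in infants]


def ols_slope(x, y):
    """Slope of a one-predictor OLS fit: sum of cross-products / sum of squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


beta_boy = ols_slope(x, y)
mean_boys = mean(w for s, w in infants if s == "boy")
mean_girls = mean(w for s, w in infants if s == "girl")

print(f"beta on the boy dummy: {beta_boy:.1f} grams")
print(f"mean(boys) - mean(girls): {mean_boys - mean_girls:.1f} grams")
```

Reversing the coding (1 = girl, 0 = boy) would flip the sign of the coefficient, which is why the reference category must be named in the text.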

Example: Gender as a predictor of birth weight

• Poor: “The β for ‘BBBOY’ is 116.1 with an s.e. of 12.3 (table 15.3).”
  – Uses a cryptic acronym rather than naming the independent variable or conveying that it is categorical.
  – Doesn’t convey the dependent variable.
  – Reports the same information as the table (size of coefficient and standard error), but does not interpret them.
  – The direction of the effect cannot be determined because categories and units are not specified.
  – To assess statistical significance, readers must calculate the test statistic and compare it against the critical value.

Gender as a predictor of birth weight, cont.

• Slightly better: “Gender is associated with a difference of 116.1 grams in birth weight (p < 0.01).”
  – Concepts, magnitude, units, and statistical significance, but not direction: Was birth weight higher for boys or for girls?
• Best: “At birth, boys weigh on average 116 grams more than girls (p < 0.01).”
  – Concepts, reference category and units, direction, magnitude, and statistical significance.

Identifying the reference category

• For categorical variables, mention the identity of the reference category.
  – E.g., effect size is relative to whom?
• Example for 2-category comparison:
  – “Boys weighed 116 grams more than girls.”
• Example for multicategory comparison:
  – “Compared to white infants, black and Hispanic infants weighed 62 and 16 grams less on average, respectively.”
  – OR “Mean birth weight was 62 and 16 grams less for black and Hispanic infants, respectively, when each is compared to white infants.”

Interpreting coefficients from logistic regression models

Logit models for categorical dependent variables

• Logit = log[p/(1 – p)] = log(odds of the category you are modeling)
  – p is the proportion of the sample in the modeled category
• β measures the log relative-odds of the outcome for different values of the independent variable
• Exponentiate the logit coefficient: e^β = relative odds, or “odds ratio”
• Compares the odds of the outcome for different values of the independent variable
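
Why exponentiating β yields an odds ratio can be seen from the model equation. The derivation below is a standard sketch, not taken from the slides. The symbols x (a single independent variable), β_0 (the intercept), and odds(x) are notation introduced here, and log denotes the natural logarithm.

```latex
\begin{aligned}
\log\!\left[\frac{p}{1-p}\right] &= \beta_0 + \beta x \\
\text{odds}(x) &= \frac{p}{1-p} = e^{\beta_0 + \beta x} \\
\frac{\text{odds}(x+1)}{\text{odds}(x)} &= \frac{e^{\beta_0 + \beta (x+1)}}{e^{\beta_0 + \beta x}} = e^{\beta}
\end{aligned}
```

With additional predictors held constant, their terms cancel in the same way, so e^β still compares the odds for a one-unit difference in x.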

Example: Logit model of LBW

• Low birth weight (LBW) = birth weight <2,500 grams
• Log-odds = log[p_LBW/(1 – p_LBW)]
  – Where p_LBW is the proportion of the sample that is LBW.
• Log relative odds of LBW = comparison of log-odds of LBW for different values of the independent variable.
• e^β = relative odds of LBW for different values of the independent variable.

Wording for odds ratios

• Exponentiated βs (odds ratios) from logit models are in the form of ratios.
• For suggestions on how to phrase descriptions of ratios with minimal jargon, see
  – Table 5.3 in The Chicago Guide to Writing about Numbers, OR
  – Table 8.3 in The Chicago Guide to Writing about Multivariate Analysis

Phrases for ratios

| Type of ratio | Ratio example | Rule of thumb | Writing suggestion |
|---|---|---|---|
| < 1.0 (e.g., 0.x); % difference = ratio × 100 | 0.80 | [Group] is only x% as ___ as the reference value. | “Males were only 80% as likely as females to graduate from the program.” |
| Close to 1.0 | 1.02 | Use phrasing to express similarity between the two groups. | “Average test scores were similar for males and females (ratio = 1.02 for males vs. females).” |
| > 1.0 (e.g., 1.y); % difference = (ratio – 1) × 100 | 1.20 | [Group] is 1.y times as ___ as the reference value, OR [Group] is y% ___er than the reference value. | “On average, males were 1.20 times as tall as females.” OR “Males were on average 20% taller than females.” |
| | 2.34 | [Group] is (2.34 – 1) × 100, or 134% more ___ than the reference value. | “Males’ incomes were 134% higher than those of females.” |
| Close to a multiple of 1.0 (e.g., z.00) | 2.96 | [Group] is (about) z times as ___. | “Males were nearly three times as likely to commit a crime as their female peers.” |

See table 5.3 in The Chicago Guide to Writing about Numbers or table 8.3 in The Chicago Guide to Writing about Multivariate Analysis.
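
The rules of thumb in the table can be sketched as a small helper. This is an illustration, not code from the book: the function name `describe_ratio` and the `similar_tol` cutoff for “close to 1.0” are assumptions, since the table gives no numeric threshold. The “close to a multiple of 1.0” row is left out for brevity.

```python
def describe_ratio(ratio, similar_tol=0.05):
    """Return (percentage, phrase template) following the table's rules of thumb.

    similar_tol is a hypothetical cutoff for "close to 1.0"; the table
    does not define one.
    """
    if abs(ratio - 1.0) <= similar_tol:
        return None, "similar to the reference value"
    if ratio < 1.0:
        # Ratio below 1.0: percentage = ratio x 100.
        pct = round(ratio * 100)
        return pct, f"only {pct}% as ___ as the reference value"
    # Ratio above 1.0: percentage = (ratio - 1) x 100.
    pct = round((ratio - 1.0) * 100)
    return pct, f"{pct}% more ___ than the reference value"


# The ratio examples from the table.
for r in (0.80, 1.02, 1.20, 2.34):
    print(r, describe_ratio(r))
```

The blank in each template still has to be filled with a word suited to the topic (“likely,” “tall,” “higher”), which is where the writing judgment comes in.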

Odds ratios for categorical independent variables

• Odds ratio of the outcome for the category of interest compared to the reference category.

• “Infants born to smokers had 1.4 times the odds of low birth weight (LBW) as those born to nonsmokers (p < 0.01).”
  – Concepts, reference category, direction, magnitude, and statistical significance.

Odds ratios for continuous independent variables

• Odds ratio of the outcome for a one-unit increase in the independent variable.

• “Odds of LBW decreased by about 0.8% for each 1-year increase in mother’s age (NS).”
  – Concepts, units, direction, magnitude, and statistical significance.
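
The step from a logit coefficient to a sentence like the one above can be sketched as follows. The slide reports only the percentage change, so the coefficient here is back-calculated from it and should be read as hypothetical. The function names and the 10-year contrast are likewise assumptions for illustration.

```python
import math


def odds_ratio(beta, contrast=1.0):
    """Odds ratio for a `contrast`-unit increase in the IV: exp(beta * contrast)."""
    return math.exp(beta * contrast)


def pct_change_in_odds(beta, contrast=1.0):
    """Percentage change in odds: (odds ratio - 1) * 100."""
    return (odds_ratio(beta, contrast) - 1.0) * 100.0


# Hypothetical coefficient, back-calculated so that a 1-year increase in
# mother's age lowers the odds of LBW by 0.8%, as in the slide's sentence.
beta_age = math.log(0.992)

print(f"odds ratio per year: {odds_ratio(beta_age):.3f}")
print(f"% change in odds per year: {pct_change_in_odds(beta_age):.1f}")
# Odds ratios multiply across units, so a 10-year contrast uses exp(10 * beta).
print(f"odds ratio per 10 years: {odds_ratio(beta_age, 10):.3f}")
```

Because odds ratios compound multiplicatively, the 10-year figure is not simply ten times the 1-year percentage change.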

Topic sentences for paragraphs reporting multivariate results

• Start each paragraph of the results section with a restatement of the topic addressed by the analysis to be reported in that paragraph.
  – Can paraphrase the title of the table or chart that reports the detailed statistical results.
• Topic sentence should mention:
  – Dependent variable.
  – Independent variable(s).
    • Use summary phrase rather than long list of variables.
  – Type of analysis.

Example topic sentences

• “Multivariate logistic regression results show that insurance is a powerful predictor of length of stay (table X).” [Next sentence goes into detail about direction, size, and statistical significance.]
  – Mentions type of analysis, dependent variable, and independent variable.

• “As shown in figure Y, race and income level interact in their effect on risk of asthma.”

Summary

• Report detailed multivariate results in tables.
• Interpret coefficients in prose.
• Specify direction, magnitude, and statistical significance of associations.
  – Units for continuous variables
  – Categories for nominal or ordinal variables
• Write about concepts, not acronyms.
  – Introduce concepts under study in topic sentences.

Suggested resources

• Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Chapter 5, on creating effective multivariate tables
  – Chapter 8, on wording for results of
    • subtraction (OLS βs)
    • ratios (logit βs)
  – Chapter 9, on writing about βs from OLS and logit models
  – Chapter 10, on the Goldilocks problem for choosing a fitting contrast size for interpreting coefficients

Suggested online resources

• Podcasts on
  – Comparing two numbers or series
  – Choosing a reference category
  – Defining the Goldilocks problem
  – Resolving the Goldilocks problem: Presenting results
  – Differentiating between statistical significance and substantive importance

Suggested practice exercises

• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Problem sets for chapters 9 and 15
  – Suggested course extensions for chapters 9 and 15
    • “Reviewing,” “writing” and “revising” exercises.

Contact information

Jane E. Miller, [email protected]

Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html