Can I Believe It? Understanding Statistics in Published Literature
Keira Robinson – MOH Biostatistics Trainee
David Schmidt – HETI Rural and Remote Portfolio
HEALTH EDUCATION & TRAINING INSTITUTE
Agenda
Welcome
Understanding the context
Data types
Presenting data
Common tests
Tricks and hints
Practice
Wrap up
Understanding statistics
Never consider statistics in isolation
Consider the rest of the article
  Who was studied
  What was measured
  Why was that measure used
  Where was the study completed
  When was it done
It is the author’s role to convince you that their results can be believed!
Types of Data
Examples of data – Table 1 (Diamond et al. 2006)
Types of data
Numeric
  Continuous (height, cholesterol)
  Discrete (number of floors in a building)
Categorical
  Binary (yes/no, eg. born in Australia?)
  Nominal categorical (cancer type)
  Ordinal categorical (cancer stage)
Histograms
Represents continuous variables
Areas of the bars represent the frequency (count) or percent
Indicates the distribution of the data
Measures of association
[Figure: histogram; x-axis: Height in cm (160 to 210); y-axis: Frequency of people]
Stem and leaf plot - heights
6* 11
6* 2
6* 3333333
6* 44444444444
6* 555555555555
6* 66666666666666666666666
6* 777777777777777777777777777777
6* 8888888888888888
6* 99999999999999999999999999999999
7* 0000000000000000000000000
7* 1111111111111111111
7* 222222222222
7* 333333
7* 44
7* 55
Skewed Data
[Figure: histogram of skewed data; x-axis: diastolic (40 to 140); y-axis: Frequency]
Salient features - the mean
The average value:
Birthweight
  mean - 1978: 3000 grams
  mean - 2010: 3500 grams
Salient features - the median
The observation in the middle
Example - newborn birth weights: 3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650 g
  Median = (3300 + 3400)/2 = 3350 g
Not affected by extreme values
Wastes information
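As a quick check of the numbers on these slides, the sketch below computes the mean and median of the eight example birth weights with Python's standard `statistics` module. The variable names are illustrative only.

```python
import statistics

# Newborn birth weights (grams) from the slide's example
birth_weights = [3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650]

mean_weight = statistics.mean(birth_weights)      # arithmetic average
median_weight = statistics.median(birth_weights)  # middle of the sorted values

print(f"mean = {mean_weight:.0f} g, median = {median_weight:.0f} g")
```

With an even number of observations, the median is the average of the two middle values, which is the (3300 + 3400)/2 step shown on the slide.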
Salient features - the mean and median
Birthweight
  mean - 2010: 3356 grams
  median - 2010: 3350 grams
Mean and Median
Mean is preferable
Symmetric distributions: mean ~ median
  Present the mean
Skewed distributions: mean is pulled toward the ‘tail’
  Present the median
Mean and Median
Body fat: Mean 13.6, Median 11.7
[Figure: histogram; x-axis: Body fat (%) (0 to 40); y-axis: Number of people]
Variability – Standard deviation and variance
Standard deviation: roughly the average distance between the observations and the mean
  In the original units, eg. 0.3 %
Variance: the standard deviation squared
  In the original units squared
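The relationship between the two measures can be sketched with the standard `statistics` module. This example reuses the eight birth weights from the median slide and assumes the sample (n - 1) form of the formulas, which the slides do not specify.

```python
import statistics

# Birth weights (grams) reused from the median example
birth_weights = [3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650]

# Sample variance (n - 1 denominator): units are grams squared
sample_variance = statistics.variance(birth_weights)

# Sample standard deviation: square root of the variance, units are grams
sample_sd = statistics.stdev(birth_weights)

print(f"variance = {sample_variance:.1f} g^2, SD = {sample_sd:.1f} g")
```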
Range
Example, infant birth weight: 3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650, 3800
  Range = (3100 to 3800) grams or 700 grams
Interquartile range: the range between the first and third quartiles (Q1 and Q3)
  3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650, 3800
  IQR = (3200 to 3600) grams or 400 grams
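The range and interquartile range above can be reproduced with a short sketch. Quartile conventions differ between textbooks and software; the `method="inclusive"` option of `statistics.quantiles` is used here because it matches the slide's Q1 and Q3 for these nine values, which is an assumption about how the slide's figures were obtained.

```python
import statistics

# Infant birth weights (grams) from the slide
birth_weights = [3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650, 3800]

# Range: distance between the smallest and largest observation
data_range = max(birth_weights) - min(birth_weights)

# Quartiles: cut points that split the sorted data into four parts
q1, q2, q3 = statistics.quantiles(birth_weights, n=4, method="inclusive")
iqr = q3 - q1

print(f"range = {data_range} g, Q1 = {q1:.0f} g, Q3 = {q3:.0f} g, IQR = {iqr:.0f} g")
```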
Presenting variability
Present standard deviation if the mean is used
Present interquartile range if the median is used
Graphics for Continuous Variables
Boxplot:
[Figure: boxplot of Wt (axis 40 to 120), labelled with: Median, 25th percentile (Q1), 75th percentile (Q3), IQR, outlier, Minimum in Q1, Maximum in Q3]
Categorical Variables - table summaries
Sport        Frequency   Percent   Cumulative Percent
Soccer           25        12.6         12.6
Football         37        18.7         31.3
Basketball       23        11.6         42.9
Swimming         22        11.1         54.0
Golf             19         9.6         63.6
Rugby            44        22.2         85.9
Cycling          11         5.6         91.4
Tennis           17         8.6        100.0
TOTAL           198       100.0
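The percent and cumulative percent columns follow directly from the frequencies. A minimal sketch using the counts from the table (the variable names are illustrative):

```python
# Frequencies from the table, in the order shown
counts = {
    "Soccer": 25, "Football": 37, "Basketball": 23, "Swimming": 22,
    "Golf": 19, "Rugby": 44, "Cycling": 11, "Tennis": 17,
}

total = sum(counts.values())
running = 0
rows = []
for sport, n in counts.items():
    running += n  # running count used for the cumulative percent
    rows.append((sport, n, round(100 * n / total, 1), round(100 * running / total, 1)))

for sport, n, pct, cum_pct in rows:
    print(f"{sport:<12}{n:>5}{pct:>8}{cum_pct:>8}")
```

The cumulative percent is computed from the running count rather than by adding rounded percentages, which avoids accumulating rounding error.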
Bar charts
Relative frequency for a categorical or discrete variable
[Figure: bar chart; y-axis: mean of meanbmi (0 to 30)]
Bar chart vs Histogram
Histogram
  For continuous variables
  The area represents the frequency
  Bars join together
Bar chart
  For categorical variables
  The height represents the frequency
  The bars don’t join together
Pie chart
Areas of “slices” represent the frequency
[Figure: pie chart with slices for soccer, football, basketball, swimming, golf, rugby, cycling, tennis]
Precision
Presenting statistics
Tables should need no further explanation
Means
  No more than one decimal place more than the original data
  Standard deviations may need an extra decimal place
Percentages
  Not more than one decimal place (sometimes no decimal place)
  Sample size <100, decimal places are not necessary
  If sample size <20, may need to report actual numbers
Statistical Inference
Sampling
[Diagram: Population and Sample, linked by arrows labelled Sampling and Inference]
Sampling, cont’d
• A statistic that is used as an estimate of the population parameter.
• Example: average parity
Population Mean
Sample Mean
Confidence intervals
We are confident the true mean lies within a range of values
95% Confidence Interval: We are 95% confident that the true mean lies within the range of values
If a study is repeated numerous times, the 95% confidence interval would contain the true mean 95% of the time
How does confidence interval change as the sample size increases?
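One way to explore the question above is to compute the half-width of a 95% confidence interval for a mean at several sample sizes. This is a sketch under stated assumptions: the standard deviation of 500 g is hypothetical, and the normal approximation (multiplier 1.96) is used rather than a t distribution.

```python
import math

ASSUMED_SD = 500.0  # hypothetical standard deviation in grams, for illustration only
Z_95 = 1.96         # normal multiplier for a 95% confidence interval


def ci_half_width(sd: float, n: int) -> float:
    """Half-width of an approximate 95% CI for a mean: z * sd / sqrt(n)."""
    return Z_95 * sd / math.sqrt(n)


for n in (25, 100, 400):
    print(f"n = {n:>3}: mean +/- {ci_half_width(ASSUMED_SD, n):.0f} g")
```

Because the half-width depends on the square root of n, quadrupling the sample size halves the width of the interval.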
Confidence intervals cont’d
Hypothesis testing
Is our sample of babies consistent with the Australian population with a known mean birth weight of 3500 grams?
Sample mean = 3800 grams, 95% CI of 3650 to 3950 grams
3500 lies outside this confidence interval, indicating our sample mean is higher than the true Australian population mean
Hypothesis testing
State a null hypothesis: There is no difference between the sample mean and the true mean: H0: mean = 3500
Calculate a test statistic from the data: t = 2.65
Report the p-value: p = 0.012
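The test statistic in the second step can be sketched as follows. The sample below is hypothetical (the slides do not give the raw data), so the result will not match the t = 2.65 quoted above; the function name `one_sample_t` is also illustrative. Turning t into a p-value needs a t distribution table or a statistics package, which is not shown here.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


# Hypothetical birth weights in grams, for illustration only
hypothetical_sample = [3900, 3650, 4100, 3550, 3800, 3750, 3950, 3700]

t_stat = one_sample_t(hypothetical_sample, mu0=3500)
print(f"t = {t_stat:.2f} on {len(hypothetical_sample) - 1} degrees of freedom")
```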
What is a p-value?
The probability of obtaining data at least as extreme as that observed, ie a mean weight of 3800 grams or greater, if the null hypothesis is true
The smaller the p-value, the more evidence against the null hypothesis
< 0.0001 to 0.05 – evidence to reject the null hypothesis (statistically significant difference)
> 0.05 – insufficient evidence to reject the null hypothesis (not statistically significant)
Summary – Confidence intervals and p values
P-value: Indicates statistical significance
Confidence interval: range of values within which we are 95% confident the true value lies
Recommended to present confidence intervals where possible
Analysing Continuous Outcomes
T tests
What are they used for?
  Analyse means
  Provide an estimate of the difference in means between the two groups and the 95% confidence interval of this difference
  P-value – a measure of the evidence against the null hypothesis of no difference between the two groups
T tests- paired vs independent
Paired: Outcome is measured on the same individual
  Eg: before and after, cross-over trial
  Pairs may be two different individuals who are matched on factors like age, sex etc.
Patient   Baseline weight (kg)   3 months weight (kg)
1                85                      82
2                76                      73
3               102                      98
4               110                     108
Paired T-tests
Calculate the difference for each of the pairs
The mean weight at baseline was 93 kg and the mean weight at 3 months was 88 kg. The weight at 3 months was 5 kg less than the baseline weight, 95% CI (-3, 12)
Patient   Baseline weight (kg)   3 months weight (kg)   Difference (kg)
1                85                      82                   -3
2                76                      73                   -3
3               102                      98                   -4
4               110                     100                  -10
Mean             93                      88                   -5
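The first step of a paired t-test, taking the difference within each pair, can be sketched with the four patients in the table above. The code stops at the t statistic; it does not attempt to reproduce the confidence interval or p-value quoted on the slides, and the variable names are illustrative.

```python
import math
import statistics

# Weights (kg) from the table above, one entry per patient
baseline = [85, 76, 102, 110]
three_months = [82, 73, 98, 100]

# A paired test works on the within-patient differences, not the two columns
differences = [after - before for before, after in zip(baseline, three_months)]

mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(len(differences))
t_stat = mean_diff / se_diff  # compared against a t distribution with n - 1 df

print(f"differences = {differences}, mean difference = {mean_diff} kg, t = {t_stat:.2f}")
```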
Paired T-tests
There was no evidence of a significant change in weight after 3 months (p value = 0.19)
Assumptions
  Bell shaped curve with no outliers
  Assess shape by graphing the difference
  Use a histogram or stem and leaf plot
Independent T tests
Two groups that are unrelated
  Eg: weights of different groups of people
Weight (kg)
NW Public School   SW Public School
52                 45
51                 54
71                 82
14                 15
Independent samples t-tests
Same assumptions as for paired t tests, plus the assumptions of independence and equal variance
NW Public School (weight, kg)   SW Public School (weight, kg)   Difference (weight, kg)
52                              45                              7
51                              54                              3
71                              82                              11
72                              61                              11
Mean 62                         61                              1 (-22, 24)
Interpretation – independent t tests
The mean weight in NW Public was 62 kg and the mean weight in SW Public was 61 kg
The mean difference in weight between the two schools was 1 kg (-22, 24)
There was no evidence of a significant difference in weight between the two schools (p=0.92)
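The figures in this interpretation can be approximated from the eight weights in the preceding table. This is a sketch of the equal-variance (pooled) independent samples t-test; the critical value 2.447 is the standard two-sided 5% point of the t distribution with 6 degrees of freedom, hard-coded here as a constant because the standard library has no t distribution. Small differences from the slide's numbers are expected from rounding.

```python
import math
import statistics

# Weights (kg) from the two-school table
nw_public = [52, 51, 71, 72]
sw_public = [45, 54, 82, 61]

n1, n2 = len(nw_public), len(sw_public)
diff = statistics.mean(nw_public) - statistics.mean(sw_public)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * statistics.variance(nw_public)
              + (n2 - 1) * statistics.variance(sw_public)) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = diff / se_diff

T_CRIT_DF6 = 2.447  # two-sided 5% critical value, 6 degrees of freedom
ci_low = diff - T_CRIT_DF6 * se_diff
ci_high = diff + T_CRIT_DF6 * se_diff

print(f"difference = {diff:.1f} kg, t = {t_stat:.2f}, 95% CI ({ci_low:.1f}, {ci_high:.1f})")
```

A t statistic this close to zero, with an interval that comfortably includes zero, is what "no evidence of a significant difference" looks like numerically.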
One-way Analysis of Variance (ANOVA)
What happens when there are more than two groups to compare?
Null hypothesis: the means for all groups are equal
A single difference in means cannot describe more than two groups, so the variance between the groups is analysed
Can measure variance within a group as well as variance between groups
One-way ANOVA
Comparing multiple groups
NW Public School   NE Public School   SW Public School   SE Public School
42                 39                 46                 56
53                 52                 51                 45
46                 58                 56                 41
75                 41                 44                 32
56                 65                 63                 56
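The idea of comparing variance between groups with variance within groups can be sketched using the four columns above. The code computes the one-way ANOVA F statistic from first principles; the dictionary keys are shorthand for the school names, and converting F to a p-value (which needs an F distribution) is left out.

```python
import statistics

# Student weights from the table, one list per school
groups = {
    "NW": [42, 53, 46, 75, 56],
    "NE": [39, 52, 58, 41, 65],
    "SW": [46, 51, 56, 44, 63],
    "SE": [56, 45, 41, 32, 56],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = statistics.mean(all_values)

# Between-group sum of squares: how far each group mean sits from the grand mean
ss_between = sum(len(values) * (statistics.mean(values) - grand_mean) ** 2
                 for values in groups.values())

# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum((v - statistics.mean(values)) ** 2
                for values in groups.values() for v in values)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

An F statistic well below 1 means the group means differ by less than the within-group scatter would lead one to expect by chance.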
Interpretations – One-way ANOVA
If p<0.05: there was evidence of a difference in average student weight between the four schools
If p>0.05: there was no evidence of a difference in average student weight between the four schools
Comparing all means against each other is not advised: the more tests that are done, the greater the chance of finding at least one significant result by chance
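The warning about multiple comparisons can be made concrete. The sketch below assumes the tests are independent and each uses a 5% significance level, which is a simplification (pairwise comparisons of the same groups are not truly independent).

```python
from math import comb


def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Four schools give comb(4, 2) possible pairwise comparisons
n_pairs = comb(4, 2)
print(f"{n_pairs} comparisons: {familywise_error(n_pairs):.1%} chance of a spurious result")
```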
Assumptions ANOVA
Normality – observations for all groups are normally distributed
Equal variance – the variance in all groups is equal
Independence – all groups are independent of each other
Extensions of one-way ANOVA
Two-way ANOVA: Multiple factors to be considered
  Eg. school and type of school (public/private)
ANCOVA – Analysis of Covariance
  Tests group differences while adjusting for continuous variables (eg. age) and categorical variables
Linear Regression
Measures the association between two continuous variables (weight and height)
Or one continuous variable and several continuous variables (multiple linear regression)
What is the relationship between height and weight?
Scatter plot of weight and height
Correlation between height and weight = 0.75
[Figure: scatter plot; x-axis: Ht (160 to 210); y-axis: Wt (40 to 120)]
Scatter plot of body fat and height
Correlation between body fat and height = -0.23
[Figure: scatter plot; x-axis: Ht (160 to 210); y-axis: Bfat (0 to 40)]
Linear regression
Fits a straight line to describe the relationship
Assumes
1. Independence for each measure (each person)
2. Linearity (check with scatter plots)
3. Normality (check residuals with a graph)
   Residuals are the difference between the data point and the regression line
4. Homoscedasticity
   Variability in weight does not change as height changes
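Fitting the line and inspecting the residuals can be sketched in a few lines. The heights and weights below are hypothetical (the slides show only the scatter plot, not the data), and `fit_line` is an illustrative helper using the ordinary least squares formulas.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept, plus the Pearson correlation r."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data for illustration only
heights_cm = [165, 170, 175, 180, 185, 190]
weights_kg = [58, 66, 70, 77, 80, 90]

slope, intercept, r = fit_line(heights_cm, weights_kg)

# Residual: observed weight minus the weight predicted by the line
residuals = [w - (intercept + slope * h) for h, w in zip(heights_cm, weights_kg)]

print(f"weight = {intercept:.1f} + {slope:.2f} * height, r = {r:.2f}")
print("residuals:", [round(e, 1) for e in residuals])
```

Plotting the residuals against height is the usual way to check the normality and homoscedasticity assumptions listed above: they should scatter evenly around zero with no funnel shape.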
Multiple Linear Regression
Extends the simple linear regression
Adjusts for confounding variables
Example: Does smoking while pregnant affect infant birth weight?
  Outcome variable: infant birth weight
  Exposure variable: maternal smoking
  Covariates (other variables of interest): sex of the baby, gestational age
Confounding variables
A variable (factor) associated with both the outcome and exposure variables
Gestational age is associated with both smoking (exposure) and the outcome (birth weight)
Confounders can be assessed by checking the correlation between the variable of interest and the outcome variable
Correlation coefficient: -1.0 ≤ r ≤ 1.0
Rule of thumb: a variable with r > 0.5 or r < -0.5 should be considered a confounder
Example of weight vs height adjusting for sex
Summary for continuous outcomes
Comparing means from two groups
  Use t-tests (paired for same person comparison, independent for independent groups comparison)
Comparing means for more than two groups
  One-way ANOVA
Comparing means for two or more groups and adjusting for other variables
  ANCOVA
Summary for continuous outcomes
Assessing the relationship between two continuous variables
  Simple linear regression
Assessing the relationship between two or more variables
  Multiple linear regression
Analysing Categorical Outcomes
Chi-square tests
What can a chi-square test answer?
Chi-Square tests
2x2 tables:
            Low birth weight (<2500 grams)
Smoking     <2500 grams   >2500 grams   Total
No               5            100        105
Yes             25             75        100
Total           30            175        205
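For the 2x2 table above, the chi-square statistic compares each observed count with the count expected if smoking and low birth weight were unrelated. A minimal sketch without a continuity correction; 3.841 is the standard 5% critical value for 1 degree of freedom, hard-coded here as a constant.

```python
# Observed counts: rows are non-smokers and smokers,
# columns are <2500 grams and >2500 grams
observed = [[5, 100],
            [25, 75]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under no association: row total * column total / n
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom
print(f"chi-square = {chi_square:.2f}, exceeds 5% critical value: {chi_square > CRITICAL_5PCT_DF1}")
```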
Chi-square tests
Can be used for paired (same person under two different conditions) or independent samples (unrelated people in different groups)
Used often in case-control studies where the outcome is categorical (or dichotomous)
Tests the null hypothesis of no association between row and column factors
  Eg. association between smoking and low birth weight
The study design defines the appropriate measure of effect
Cohort studies
Exposure is determined by randomisation to different groups followed over time
Outcome is determined at the end of follow up
Rate of outcome can be estimated
Cohort studies continued
Eg. Rate of low birth weight in:
  Smokers: rate = 25/100 = 0.25 = 25%
  Non-smokers: rate = 5/105 = 0.048, approximately 5%
Relative risk (RR) = 25/5 = 5 times higher risk of low birth weight in smokers relative to non-smokers
Risk Difference (RD) = 25 - 5 = 20 percentage points
No relative difference between the rate of low birth weight in smokers and non-smokers: RR = 1.0
No absolute difference in the rate of low birth weight in smokers and non-smokers: RD = 0
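The same two measures can be computed without rounding the non-smoker rate first. A small sketch using the counts from the smoking table (variable names are illustrative):

```python
# Counts from the 2x2 smoking table
lbw_smokers, total_smokers = 25, 100
lbw_nonsmokers, total_nonsmokers = 5, 105

risk_smokers = lbw_smokers / total_smokers
risk_nonsmokers = lbw_nonsmokers / total_nonsmokers

relative_risk = risk_smokers / risk_nonsmokers    # ratio of the two risks
risk_difference = risk_smokers - risk_nonsmokers  # absolute gap between them

print(f"RR = {relative_risk:.2f}, RD = {100 * risk_difference:.1f} percentage points")
```

The unrounded values (RR of 5.25, RD of about 20.2 percentage points) differ slightly from the slide's 5 and 20 only because the slide rounds the non-smoker rate to 5%.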
Cross-Sectional Studies
People observed at one point in time (questionnaire)
Exposure and outcome are measured at the same time
Causal associations cannot be deduced
Rate ratio (RR) = 25/5 = 5 times higher rate of low birth weight in smokers relative to non-smokers
Rate Difference (RD) = 25 - 5 = 20 percentage points
No relative difference between the rate of low birth weight in smokers and non-smokers: RR = 1.0
No absolute difference in the rate of low birth weight in smokers and non-smokers: RD = 0
Case-control studies
Use for rare outcomes (example: child prodigies)
Children are selected based on being a prodigy
  Eg. 100 child prodigies and 100 children with normal intelligence
Determine exposure retrospectively
Cannot obtain a rate
Must obtain the odds of the outcome and compare using an odds ratio
Case-Control studies
                                        Child prodigy
Fish oil supplements during pregnancy   Yes    No    Total
No                                       30    50      80
Yes                                      70    50     120
Total                                   100   100     200
Case-control studies
Odds of being a prodigy:
  In exposed: 70/50 = 1.4
  In unexposed: 30/50 = 0.6
Odds ratio: 1.4/0.6 = 2.3
  The odds of being a child prodigy are 2.3 times higher if maternal fish oil supplements were taken during pregnancy
Null hypothesis
  No association between the exposure and the outcome
  Odds Ratio = 1
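The odds ratio calculation can be written out from the four cell counts of the fish oil table. The variable names are illustrative; cases are the child prodigies and controls are the comparison children.

```python
# Cell counts from the fish oil table
exposed_cases, exposed_controls = 70, 50      # supplements taken
unexposed_cases, unexposed_controls = 30, 50  # supplements not taken

odds_exposed = exposed_cases / exposed_controls
odds_unexposed = unexposed_cases / unexposed_controls
odds_ratio = odds_exposed / odds_unexposed

print(f"odds exposed = {odds_exposed:.1f}, odds unexposed = {odds_unexposed:.1f}, OR = {odds_ratio:.1f}")
```

An odds ratio of 1 would mean the odds are the same in both groups, which is the null hypothesis stated above.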
Summary of RR and OR
Both compare the relative likelihood of an outcome between 2 groups
RR = 1 or OR = 1
  Outcome is as likely in the exposed and unexposed groups
RR > 1 or OR > 1
  The outcome is more likely in the exposed group compared to the unexposed group
  The exposure is a risk factor
Summary of RR and OR
RR < 1 or OR < 1
  The outcome is less likely in the exposed group compared to the unexposed group
  The exposure is protective
RR cannot be calculated for a case-control study
OR ~ RR when the outcome is rare
Extensions of Chi-square
Small sample sizes
  Fisher’s exact test
  Recommended when n<20, or 20<n<40 and the smallest expected cell count is <5
Paired data
  Exact binomial test for small sample sizes
  McNemar’s test
Multiple regression
  Logistic regression
Non-parametric tests
Parametric test               Non-parametric test
Independent samples t-test    Wilcoxon-Mann-Whitney test
Paired t-test                 Wilcoxon signed rank sum test
One-way ANOVA                 Kruskal Wallis
Chi-square test               ?
Spurious statistics
Fact or Fiction
Vaccines and autism?
Cell phones and brain tumours?
Common errors
60.182 kg or 61 kg?
  Reporting measurements with unnecessary precision
Age divided into 20-44 years, 45-59 years, 60-74 years, 75+ years
  Dividing continuous data without explaining why or how
  Certain boundaries may be chosen to favour certain results
Presenting means and SD for non-normal data
  What should be presented instead?
Common Errors
“The effect of more exercise was significant”
“The effect of 40 minutes of exercise per day was statistically significant for decreasing weight (p<0.05)”
“40 minutes of exercise per day lowered the mean weight of the group from 95 kg to 89 kg (95% CI = 75-105 kg, p = 0.03)”
Not checking the distribution of the data to determine the appropriate statistical test
  Using parametric tests when data is not normal
  Using tests for independent data when the data is paired
Common Errors
Using linear regression without confirming linearity
Not reporting what happened to all patients
  Leads to bias of the results
Data dredging
  Multiple statistical comparisons until a significant result is found
Not accounting for the denominator or adjusting for baseline
Example
Common Errors
Selection Bias
  Sampling from a bag of candy where the larger candies are more likely to be chosen
  On November 13, 2000, Newsweek published the following poll results:
Selection Bias
Common Errors
Other biases (measurement bias, intervention bias)
Using cross sectional studies to infer causality
  More likely to have a c-section if attending a private hospital instead of a public hospital
Practical example
Working in groups, quickly read the article provided
Summarise
  What data they used
  What test
  Do you believe their findings?
  Can you explain why?
Summary
Statistics must be understood in the context of the whole article
Statistical tests must fit the data type
Findings should be presented appropriately
Beware flashy stats!
It’s the author’s job to justify their choices
If you don’t believe it - can you base your practice on it?
Questions?