Can I Believe It? Understanding Statistics in Published Literature


Keira Robinson – MOH Biostatistics Trainee

David Schmidt – HETI Rural and Remote Portfolio

HEALTH EDUCATION & TRAINING INSTITUTE

Agenda

Welcome
Understanding the context
Data types
Presenting data
Common tests
Tricks and hints
Practice
Wrap up


Understanding statistics

Never consider statistics in isolation
Consider the rest of the article
  Who was studied
  What was measured
  Why was that measure used
  Where was the study completed
  When was it done

It is the author’s role to convince you that their results can be believed!

Types of Data


Examples of data – Table 1, Diamond et al. 2006


Types of data

Numeric
  Continuous (height, cholesterol)
  Discrete (number of floors in a building)
Categorical
  Binary (yes/no, e.g. born in Australia?)
  Categorical (cancer type)
  Ordinal categorical (cancer stage)


Histograms

Represents continuous variables
Areas of the bars represent the frequency (count) or percent

Indicates the distribution of the data


Measures of association
[Figure: histogram of height in cm (x-axis, 160 to 210) against frequency of people (y-axis, 0 to 40)]


Stem and leaf plot- heights

6* 11
6* 2
6* 3333333
6* 44444444444
6* 555555555555
6* 66666666666666666666666
6* 777777777777777777777777777777
6* 8888888888888888
6* 99999999999999999999999999999999
7* 0000000000000000000000000
7* 1111111111111111111
7* 222222222222
7* 333333
7* 44
7* 55


Skewed Data
[Figure: histogram of diastolic (x-axis, 40 to 140) against frequency (y-axis, 0 to 80)]


Salient features- the mean

The average value:
Birthweight, mean in 1978: 3000 grams
Birthweight, mean in 2010: 3500 grams


Salient features- the median

The observation in the middle
Example: newborn birth weights 3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650 g
  Median = (3300 + 3400)/2 = 3350 g
Not affected by extreme values
Wastes information
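A minimal sketch of the mean and median for the birth weights listed above, using Python's standard library. The 6000 g value in the second list is a hypothetical extreme observation, added only to illustrate that the median does not move while the mean does.

```python
from statistics import mean, median

# Newborn birth weights in grams, as listed on the slide.
weights = [3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650]

sample_mean = mean(weights)      # 3356.25
sample_median = median(weights)  # 3350.0, the average of the two middle values

# Hypothetical: replace the largest weight with an extreme value.
with_outlier = weights[:-1] + [6000]
outlier_mean = mean(with_outlier)      # pulled up to 3650
outlier_median = median(with_outlier)  # still 3350.0

print(sample_mean, sample_median)
print(outlier_mean, outlier_median)
```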


Salient features- the mean and median

Birthweight, mean in 2010: 3356 grams
Birthweight, median in 2010: 3350 grams


Mean and Median

Mean is preferable
Symmetric distributions: mean ~ median
  Present the mean
Skewed distributions: mean is pulled toward the ‘tail’
  Present the median


Mean and Median
Body fat (%): Mean 13.6, Median 11.7
[Figure: histogram of body fat (%) (x-axis, 0 to 40) against number of people (y-axis, 0 to 40)]


Variability – Standard deviation and variance
The average distance between the observations and the mean
Standard deviation: expressed in the original units, e.g. 0.3%
Variance = standard deviation squared, expressed in the original units squared
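As an illustration, the sample standard deviation and variance can be computed with Python's standard library. The data are the eight birth weights from the median example; using the sample (n - 1) versions is an assumption, since the slide's formulas did not survive extraction.

```python
from statistics import stdev, variance

# Birth weights in grams, reused from the median example.
weights = [3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650]

sd = stdev(weights)      # sample standard deviation, in grams
var = variance(weights)  # sample variance, in grams squared

print(f"SD = {sd:.1f} g, variance = {var:.1f} g^2")
```

The variance is the square of the standard deviation, which is why its units are the original units squared.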


Range

Example: infant birth weights 3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650, 3800 g
  Range = 3100 to 3800 grams, or 700 grams
Interquartile range (IQR): the range between the first and third quartiles (Q1 and Q3)
  3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650, 3800 g
  IQR = 3200 to 3600 grams, or 400 grams
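The range and interquartile range above can be checked with a short sketch. Quartile conventions differ between packages; the "inclusive" method used here is an assumption, chosen because it reproduces the slide's Q1 and Q3.

```python
from statistics import quantiles

# Infant birth weights in grams, as listed on the slide.
weights = [3100, 3100, 3200, 3300, 3400, 3500, 3600, 3650, 3800]

value_range = max(weights) - min(weights)  # 700 grams

# Quartiles with the "inclusive" convention (other conventions can differ).
q1, q2, q3 = quantiles(weights, n=4, method="inclusive")
iqr = q3 - q1  # 3600 - 3200 = 400 grams

print(value_range, q1, q3, iqr)
```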


Presenting variability
Present the standard deviation if the mean is used
Present the interquartile range if the median is used


Graphics for Continuous Variables

Boxplot:
[Figure: boxplot of Wt (axis, 40 to 120), labelled with the median, 25th percentile (Q1), 75th percentile (Q3), IQR, an outlier, "Minimum in Q1" and "Maximum in Q3"]


Categorical Variables - table summaries

Sport        Frequency   Percent   Cumulative Percent
Soccer           25        12.6         12.6
Football         37        18.7         31.3
Basketball       23        11.6         42.9
Swimming         22        11.1         54.0
Golf             19         9.6         63.6
Rugby            44        22.2         85.9
Cycling          11         5.6         91.4
Tennis           17         8.6        100.0
TOTAL           198       100.0
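A short sketch of how the percent and cumulative percent columns follow from the frequencies. The counts are taken from the table; the variable names are illustrative only.

```python
# Frequencies from the table, in the order shown.
counts = {
    "Soccer": 25, "Football": 37, "Basketball": 23, "Swimming": 22,
    "Golf": 19, "Rugby": 44, "Cycling": 11, "Tennis": 17,
}

total = sum(counts.values())  # 198
rows = []
running = 0
for sport, n in counts.items():
    running += n
    percent = round(100 * n / total, 1)
    cumulative = round(100 * running / total, 1)
    rows.append((sport, n, percent, cumulative))

for row in rows:
    print(row)
```

Note that the cumulative percent is calculated from the running count, not by adding the rounded percentages.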


Bar charts

Relative frequency for a categorical or discrete variable

[Figure: bar chart, y-axis "mean of meanbmi" (0 to 30)]


Bar chart vs Histogram

Histogram
  For continuous variables
  The area represents the frequency
  Bars join together
Bar chart
  For categorical variables
  The height represents the frequency
  The bars don’t join together


Pie chart

Areas of “slices” represent the frequency

[Figure: pie chart of sports: soccer, football, basketball, swimming, golf, rugby, cycling, tennis]

Precision



Presenting statistics

Tables should need no further explanation
Means
  No more than one decimal place more than the original data
Standard deviations may need an extra decimal place
Percentages
  Not more than one decimal place (sometimes no decimal place)
  Sample size <100: decimal places are not necessary
  If sample size <20, may need to report actual numbers

Statistical Inference


Sampling

[Figure: diagram of a Population and a Sample, with the labels Sampling and Inference]


Sampling, cont’d

• A sample statistic is used as an estimate of the population parameter.

• Example: average parity, where the sample mean is used to estimate the population mean.


Confidence intervals

We are confident the true mean lies within a range of values

95% Confidence Interval: We are 95% confident that the true mean lies within the range of values

If a study were repeated numerous times, about 95% of the confidence intervals calculated would contain the true mean

How does confidence interval change as the sample size increases?
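One way to explore the question above is a small sketch of an approximate 95% confidence interval for a mean. The normal approximation (multiplier 1.96), the mean of 3500 g and the standard deviation of 500 g are all assumptions chosen for illustration, not values from the slides.

```python
from math import sqrt

def ci_95(sample_mean, sd, n):
    """Approximate 95% CI for a mean, using the normal multiplier 1.96."""
    half_width = 1.96 * sd / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical birth weight data: mean 3500 g, SD 500 g.
widths = []
for n in (25, 100, 400):
    low, high = ci_95(3500, 500, n)
    widths.append(high - low)
    print(f"n = {n}: {low:.0f} to {high:.0f} grams")
```

Each fourfold increase in sample size halves the width of the interval, because the standard error shrinks with the square root of n.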


Confidence intervals cont’d


Hypothesis testing

Is our sample of babies consistent with the Australian population with a known mean birth weight of 3500 grams?

Sample mean = 3800 grams, 95% CI of 3650 to 3950 grams

3500 lies outside of this confidence interval range, indicating our sample mean is higher than the true Australian population mean


Hypothesis testing

State a null hypothesis: there is no difference between the sample mean and the true mean (H0: mean = 3500)
Calculate a test statistic from the data: t = 2.65
Report the p-value: p = 0.012


What is a p-value?

The probability of obtaining data at least as extreme as those observed, i.e. a mean weight of 3800 grams or greater, if the null hypothesis is true

The smaller the p-value, the more evidence against the null hypothesis

< 0.0001 to 0.05 – evidence to reject the null hypothesis (statistically significant difference)

> 0.05 – insufficient evidence to reject the null hypothesis (not statistically significant); this is not evidence that the null hypothesis is true


Summary – Confidence intervals and p values

P-value: indicates statistical significance

Confidence interval: range of values within which we are 95% confident the true value lies

Recommended to present confidence intervals where possible

Analysing Continuous Outcomes



T tests

What are they used for?
  Analyse means
  Provide an estimate of the difference in means between the two groups and the 95% confidence interval of this difference
  P-value – a measure of the evidence against the null hypothesis of no difference between the two groups


T tests- paired vs independent

Paired: Outcome is measured on the same individual
  E.g. before and after, cross-over trial
  Pairs may be two different individuals who are matched on factors like age, sex etc.

Patient   Baseline weight (kg)   3 months weight (kg)
1                  85                     82
2                  76                     73
3                 102                     98
4                 110                    100


Paired T-tests

Calculate the difference for each of the pairs

The mean weight at baseline was 93 kg and the mean weight at 3 months was 88 kg. The weight at 3 months was 5 kg less than the baseline weight, 95% CI (-3, 12)

Patient   Baseline weight (kg)   3 months weight (kg)   Difference (kg)
1                  85                     82                  -3
2                  76                     73                  -3
3                 102                     98                  -4
4                 110                    100                 -10
Mean               93                     88                  -5
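The paired approach works on the within-patient differences, which can be sketched as follows. The data are the four patients in the table; the standard error and t statistic are the usual paired t-test quantities, and the sketch does not attempt to reproduce the confidence interval or p-value quoted on the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Weights in kg for the four patients in the table.
baseline = [85, 76, 102, 110]
three_months = [82, 73, 98, 100]

# Difference for each pair (3 months minus baseline).
differences = [after - before for before, after in zip(baseline, three_months)]
mean_diff = mean(differences)  # -5 kg

# The paired t statistic is the mean difference divided by its standard error.
se = stdev(differences) / sqrt(len(differences))
t_stat = mean_diff / se

print(differences, mean_diff, round(t_stat, 2))
```

The t statistic would be compared with a t distribution on n - 1 = 3 degrees of freedom.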


Paired T-tests

There was no evidence that there was a significant change in weight after 3 months (p value = 0.19)

Assumptions
  Bell shaped curve with no outliers
  Assess shape by graphing the difference
  Use a histogram or stem and leaf plot


Independent T tests

Two groups that are unrelated
  E.g. weights of different groups of people

Weight (kg)
NW Public School   SW Public School
       52                 45
       51                 54
       71                 82
       14                 15


Independent samples t-tests

Same assumptions as for paired t tests, plus the assumptions of independence and equal variance

        NW Public School   SW Public School   Difference
        (weight, kg)       (weight, kg)       (weight, kg)
              52                 45                 7
              51                 54                 3
              71                 82                11
              72                 61                11
Mean          62                 61                 1 (-22, 24)


Interpretation –independent t tests

The mean weight in NW Public was 62 kg and the mean weight in SW Public was 61 kg

The mean difference in weight between the two schools was 1 kg (-22, 24)

There was no evidence of a significant difference in weight between the two schools (p=0.92)
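The interpretation above can be checked with a sketch of the equal-variance (pooled) independent samples t-test, using the eight weights in the table. The critical value 2.447 (t distribution, 6 degrees of freedom, two-sided 5% level) is hard-coded from standard tables rather than computed.

```python
from math import sqrt
from statistics import mean, variance

# Weights in kg from the table.
nw = [52, 51, 71, 72]
sw = [45, 54, 82, 61]
n1, n2 = len(nw), len(sw)

diff = mean(nw) - mean(sw)  # 61.5 - 60.5 = 1.0 kg

# Pooled variance assumes both groups share the same variance.
pooled_var = ((n1 - 1) * variance(nw) + (n2 - 1) * variance(sw)) / (n1 + n2 - 2)
se = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = diff / se

T_CRIT = 2.447  # from t tables: 6 degrees of freedom, two-sided 5% level
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print(round(t_stat, 2), [round(x, 1) for x in ci])
```

The interval comes out at roughly -23 to 25 kg, close to the interval shown on the slide, and it includes zero, which agrees with the non-significant p-value.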


One-way Analysis of Variance (ANOVA)

What happens when there are more than two groups to compare?

Null hypothesis: the means for all groups are equal

No way to measure the difference in means between more than two groups, so the variance between the groups is analysed

Can measure variance within a group as well as variance between groups


One-way ANOVA

Comparing multiple groups

NW Public School   NE Public School   SW Public School   SE Public School
       42                 39                 46                 56
       53                 52                 51                 45
       46                 58                 56                 41
       75                 41                 44                 32
       56                 65                 63                 56
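A minimal sketch of the one-way ANOVA F statistic for the four schools in the table, computed from the between-group and within-group variation. It stops at the F statistic; obtaining a p-value would need an F distribution from a statistics package.

```python
from statistics import mean

# Student weights by school, read down the columns of the table.
groups = {
    "NW": [42, 53, 46, 75, 56],
    "NE": [39, 52, 58, 41, 65],
    "SW": [46, 51, 56, 44, 63],
    "SE": [56, 45, 41, 32, 56],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)
k = len(groups)      # number of groups
n = len(all_values)  # total number of observations

# Variation of the group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Variation of the observations around their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))  # about 0.55
```

An F statistic well below 1 means the group means vary less than would be expected from the spread within the groups.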


Interpretations – One-way ANOVA

If p<0.05: there was evidence of a difference in average student weight between the four schools

If p>0.05: there was no evidence of a difference in average student weight between the four schools

It is not advised to compare all means against each other, because the more tests that are done, the greater the chance of finding at least one significant result by chance alone
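The growth in false positives can be illustrated with a short calculation. It assumes the pairwise tests are independent and each uses a 5% significance level, which is a simplification.

```python
from math import comb

alpha = 0.05                    # significance level for each test
groups = 4                      # the four schools
comparisons = comb(groups, 2)   # 6 pairwise comparisons

# Chance of at least one false positive if all null hypotheses are true.
familywise_error = 1 - (1 - alpha) ** comparisons

print(comparisons, round(familywise_error, 2))  # 6 comparisons, about 0.26
```

Under these assumptions, comparing every pair of schools gives roughly a one in four chance of at least one spurious significant result.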


Assumptions ANOVA

Normality – observations for all groups are normally distributed
Equal variance – the variance in all groups is equal
Independence – all groups are independent of each other


Extensions of one-way ANOVA

Two-way ANOVA:
  Multiple factors to be considered, e.g. school and type of school (public/private)
ANCOVA – Analysis of Covariance
  Tests group differences while adjusting for continuous variables (e.g. age) and categorical variables


Linear Regression

Measures the association between two continuous variables (weight and height)

Or between one continuous variable and several continuous variables (multiple linear regression)

What is the relationship between height and weight?


Scatter plot of weight and height

Correlation between height and weight = 0.75
[Figure: scatter plot of Wt (y-axis, 40 to 120) against Ht (x-axis, 160 to 210)]


Scatter plot of body fat and height

Correlation between body fat and height = -0.23
[Figure: scatter plot of Bfat (y-axis, 0 to 40) against Ht (x-axis, 160 to 210)]


Linear regression

Fits a straight line to describe the relationship
Assumes
1. Independence for each measure (each person)
2. Linearity (check with scatter plots)
3. Normality (check residuals with a graph)
   Residuals are the difference between the data point and the regression line
4. Homoscedasticity
   Variability in weight does not change as height changes
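A minimal sketch of fitting the straight line by least squares and obtaining the residuals that the assumption checks rely on. The heights and weights below are hypothetical values made up for illustration; they are not the data behind the slides' scatter plots.

```python
from statistics import mean

# Hypothetical data: height in cm and weight in kg for six people.
heights = [160, 170, 175, 180, 190, 200]
weights = [55, 68, 70, 80, 85, 98]

mean_h, mean_w = mean(heights), mean(weights)
sxx = sum((h - mean_h) ** 2 for h in heights)
sxy = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))

slope = sxy / sxx                     # change in weight per cm of height
intercept = mean_w - slope * mean_h

# Residual = observed weight minus the weight predicted by the line.
residuals = [w - (intercept + slope * h) for h, w in zip(heights, weights)]

print(round(slope, 2), round(intercept, 1))
print([round(r, 1) for r in residuals])
```

Plotting these residuals (for example as a histogram, or against height) is how normality and homoscedasticity would be assessed.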


Multiple Linear Regression

Extends simple linear regression
Adjusts for confounding variables
Example: Does smoking while pregnant affect infant birth weight?
  Outcome variable: infant birth weight
  Exposure variable: maternal smoking
  Covariates (other variables of interest): sex of the baby, gestational age


Confounding variables

A variable (factor) associated with both the outcome and exposure variables

Gestational age is associated with both smoking (exposure) and the outcome (birth weight)

Confounders can be assessed by checking the correlation between the variable of interest and the outcome variable
  Correlation coefficient: -1.0 ≤ r ≤ 1.0
  Rule of thumb: a variable with r > 0.5 or r < -0.5 should be considered a confounder


Example of weight vs height adjusting for sex


Summary for continuous outcomes

Comparing means from two groups
  Use t-tests (paired for same person comparison, independent for independent groups comparison)
Comparing means for more than two groups
  One-way ANOVA
Comparing means for two or more groups and adjusting for other variables
  ANCOVA


Summary for continuous outcomes

Assessing the relationship between two continuous variables
  Simple linear regression
Assessing the relationship between two or more variables
  Multiple linear regression

Analysing Categorical Outcomes


Chi-square tests

What can a chi-square test answer?


Chi-Square tests

2x2 tables:

                 Low birth weight (<2500 grams)
                 <2500 grams   >2500 grams   Total
Smoking   No           5           100        105
          Yes         25            75        100
Total                 30           175        205
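As a sketch of what the chi-square test does with this table, the statistic compares each observed count with the count expected if smoking and low birth weight were unrelated. No continuity correction is applied, and the critical value 3.841 (1 degree of freedom, 5% level) is hard-coded from standard tables.

```python
# Observed counts: rows are smoking (No, Yes); columns are
# birth weight (<2500 grams, >2500 grams).
observed = [[5, 100], [25, 75]]

row_totals = [sum(row) for row in observed]        # [105, 100]
col_totals = [sum(col) for col in zip(*observed)]  # [30, 175]
n = sum(row_totals)                                # 205

chi2 = 0.0
for i in range(2):
    for j in range(2):
        # Expected count if rows and columns were not associated.
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected

CRITICAL_5PCT_1DF = 3.841  # from chi-square tables
print(round(chi2, 1), chi2 > CRITICAL_5PCT_1DF)  # about 16.8, True
```

A statistic this far above the critical value indicates evidence of an association between smoking and low birth weight in this table.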


Chi-square tests

Can be used for paired (same person under two different conditions) or independent samples (unrelated people in different groups)

Used often in case-control studies where the outcome is categorical (or dichotomous)

Tests the null hypothesis of no association between the row and column factors, e.g. the association between smoking and low birth weight

The study design defines the appropriate measure of effect


Cohort studies

Exposure is determined by randomisation to different groups followed over time
Outcome is determined at the end of follow up
Rate of outcome can be estimated


Cohort studies continued

E.g. rate of low birth weight in:
  Smokers: rate = 25/100 = 0.25 = 25%
  Non-smokers: rate = 5/105 = 0.048, or about 5%
Relative risk (RR) = 25/5 = 5 times higher risk of low birth weight in smokers relative to non-smokers
Risk Difference (RD) = 25 - 5 = 20 percentage points
No relative difference between the low birth weight rate in smokers and non-smokers: RR = 1.0
No absolute difference in the low birth weight rate in smokers and non-smokers: RD = 0
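A small sketch of the relative risk and risk difference using the unrounded rates from the 2x2 table. The slide rounds the non-smoker rate to 5%, so the values here differ slightly from the rounded figures of 5 and 20.

```python
# Counts from the 2x2 table.
low_bw_smokers, total_smokers = 25, 100
low_bw_nonsmokers, total_nonsmokers = 5, 105

risk_smokers = low_bw_smokers / total_smokers            # 0.25
risk_nonsmokers = low_bw_nonsmokers / total_nonsmokers   # about 0.048

relative_risk = risk_smokers / risk_nonsmokers    # about 5.25
risk_difference = risk_smokers - risk_nonsmokers  # about 0.20

print(round(relative_risk, 2), round(risk_difference, 3))
```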


Cross-Sectional Studies

People observed at one point in time (questionnaire)

Exposure and outcome are measured at the same time

Causal associations cannot be deduced
Rate ratio (RR) = 25/5 = 5 times higher rate of low birth weight in smokers relative to non-smokers
Rate Difference (RD) = 25 - 5 = 20 percentage points
No relative difference between the low birth weight rate in smokers and non-smokers: RR = 1.0
No absolute difference in the low birth weight rate in smokers and non-smokers: RD = 0


Case-control studies

Use for rare outcomes (example: child prodigies)
Children are selected based on being a prodigy
  E.g. 100 child prodigies and 100 children with normal intelligence
Determine exposure retrospectively
Cannot obtain a rate
Must obtain the odds of the outcome and compare using an odds ratio


Case-Control studies

                                   Child prodigy
                                   Yes     No     Total
Fish oil supplements      No        30     50       80
during pregnancy          Yes       70     50      120
Total                              100    100      200


Case-control studies

Odds of being a prodigy:
  In exposed: 70/50 = 1.4
  In unexposed: 30/50 = 0.6
Odds ratio: 1.4/0.6 = 2.3
  The odds of having a child prodigy were 2.3 times higher if maternal fish oil supplements were taken during pregnancy
Null hypothesis
  No association between the exposure and the outcome
  Odds Ratio = 1
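The odds ratio calculation can be sketched directly from the four cell counts of the case-control table. The variable names are illustrative only.

```python
# Cell counts from the case-control table.
exposed_prodigy, exposed_not = 70, 50      # fish oil supplements: Yes
unexposed_prodigy, unexposed_not = 30, 50  # fish oil supplements: No

odds_exposed = exposed_prodigy / exposed_not        # 1.4
odds_unexposed = unexposed_prodigy / unexposed_not  # 0.6
odds_ratio = odds_exposed / odds_unexposed          # about 2.33

# Equivalent cross-product form: (a * d) / (b * c).
cross_product = (exposed_prodigy * unexposed_not) / (exposed_not * unexposed_prodigy)

print(round(odds_ratio, 2), round(cross_product, 2))
```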


Summary of RR and OR

Both compare the relative likelihood of an outcome between 2 groups
RR = 1 or OR = 1
  The outcome is as likely in the exposed group as in the unexposed group
RR > 1 or OR > 1
  The outcome is more likely in the exposed group compared to the unexposed group
  The exposure is a risk factor


Summary of RR and OR

RR < 1 or OR < 1
  The outcome is less likely in the exposed group compared to the unexposed group
  The exposure is protective
RR cannot be calculated for a case-control study
OR ~ RR when the outcome is rare


Extensions of Chi-square

Small sample sizes
  Fisher’s exact test
  Recommended when n<20, or when 20<n<40 and the smallest expected cell count is <5
Paired data
  Exact binomial test for small sample sizes
  McNemar’s test
Multiple regression:
  Logistic regression


Non-parametric tests

Parametric test               Non-parametric test
Independent samples t-test    Wilcoxon-Mann-Whitney test
Paired t-test                 Wilcoxon signed rank sum test
One-way ANOVA                 Kruskal Wallis
Chi-square test               ?

Spurious statistics


Fact or Fiction

Vaccines and autism?
Cell phones and brain tumours?


Common errors

Reporting measurements with unnecessary precision
  60.182 kg or 61 kg?
Dividing continuous data without explaining why or how
  Age divided into 20-44 years, 45-59 years, 60-74 years, 75+ years
  Certain boundaries may be chosen to favour certain results
Presenting means and SD for non-normal data
  What should be presented instead?


Common Errors

“The effect of more exercise was significant”
“The effect of 40 minutes of exercise per day was statistically significant for decreasing weight (p<0.05)”
“40 minutes of exercise per day lowered the mean weight of the group from 95 kg to 89 kg (95% CI = 75-105 kg, p = 0.03)”

Checking the distribution of the data to determine the appropriate statistical test
  Using parametric tests when data is not normal
  Using tests for independent data when the data is paired


Common Errors

Using linear regression without confirming linearity
Not reporting what happened to all patients
  Leads to bias of the results
Data dredging
  Multiple statistical comparisons until a significant result is found
Not accounting for the denominator or adjusting for baseline


Example


Common Errors

Selection Bias
  Sampling from a bag of candy where the larger candies are more likely to be chosen
  On November 13, 2000, Newsweek published the following poll results:


Selection Bias


Common Errors

Other biases (measurement bias, intervention bias)
Using cross sectional studies to infer causality
  More likely to have a c-section if attending a private hospital instead of a public hospital


Practical example

Working in groups quickly read the article provided

Summarise
  What data they used
  What test
  Do you believe their findings?
  Can you explain why?


Summary

Statistics must be understood in the context of the whole article

Statistical tests must fit the data type
Findings should be presented appropriately
Beware flashy stats!
It’s the author’s job to justify their choices
If you don’t believe it, can you base your practice on it?


Questions?
