Post on 24-Jun-2020
NTDT5602 Methods in Nutrition Research
Module 1 Nutritional Epidemiology
Nutritional Epidemiology Lecture 1 – Introduction & Study Design

Introduction

What is epidemiology?
• “… the study of the distribution and determinants of health-related states or events in specified populations and the application of this study in the prevention and control of health problems”

What does this mean?
• The magnitude of a health problem
  o Frequency – how often does a health problem occur?
• The cause of a health problem
  o Determinants – the associations between lifestyle factors and disease, e.g., what is the cause of bowel cancer?
• The evaluation of a treatment or prevention campaign
  o Clinical or community application – e.g., for a drug used in medicine, we test whether it works by running an RCT; for a prevention campaign, we run a cluster RCT in which some centres (e.g., schools) get the intervention and others do not, and then measure the difference between the centres

Overall aim of these lectures
• To be able to critically appraise the nutrition and dietetic scientific literature

Course overview
• Study types and levels of evidence – looking at study designs
• Measures of frequency – using the right measures of frequency
• Measures of association
• Confounding
• Selection bias – what can happen if you do not select the study population using the best methods
• Measurement error – i.e., bias in measuring the outcome factors
• Causality in nutrition-related disease – how we reach a final conclusion after looking at all the papers, and how we conclude that there is causality between diet and disease, e.g., a diet low in fibre → bowel cancer in 30 years

Most research questions fall into 3 types:
1) How common is the nutritional problem? I.e., what is the magnitude/frequency of the nutritional problem?
  o = Observational descriptive study, usually cross-sectional
    ▪ How common is the nutritional problem? FREQUENCY
    ▪ Select study population → Representative sample of the population → Measure factor of interest
  o E.g., National Health Surveys
  o E.g., How many people in Australia are eating less than 2 serves of fruit? You can’t go out and ask 20 million people how much fruit they’re eating; therefore you have to select a representative sample.
  o How many adults in Australia are obese? Australians 18 years and over → Representative sample (taking into account the demographic profile of Australia) → Measure weight and height and calculate BMI; how many have a BMI ≥ 30?

2) What nutritional factor caused or prevented the disease?
  o = Observational cohort study, case-control study, cross-sectional analytical study
  o Observational studies for causal relationships (Is the study factor causing the outcome factor or not?)
    ▪ Looking at causal relationships from observational analytical studies
    ▪ Study factor (exposure) ––???––> Outcome factor
    ▪ High saturated fat diet ––???––> Ischaemic Heart Disease

  o Cohort studies (analytical)
    ▪ Exposure to the study factor is determined by the subjects – i.e., cohort studies are observational, unlike RCTs, in which the investigator assigns people to an exposure. In cohort studies the subjects determine their own exposure; the investigator has nothing to do with their choices and only monitors them.
    ▪ Researchers measure the extent of the exposure (usually with FFQs)
      • Researchers rank people by exposure, e.g., tertiles/quintiles from lowest intake → highest intake
    ▪ The outcome factor is measured LATER – e.g., see how many new cases of disease occur 5-20 years later
    ▪ Determine the study factor → (1) Group exposed to study factor (2) Group not exposed → Measure outcome factor in exposed versus unexposed group (participants are monitored every 5 years until the point of analysis, and then classified)
      • E.g., (1) Group exposed (smokers) (2) Group not exposed (non-smokers) → measure outcome factor (number of people who got lung cancer) in each group
      • E.g., the EPIC study (n = 435,000): Measure dietary fibre intake, with 6-7 y follow-up → (1) Very low intake (2) Low intake (3) Medium intake (4) High intake → Measure the occurrence of gastric cancer at the different intake levels
Nutritional Epidemiology Lecture 2.2 – Measures of Association

Key measures of association – (measures to decide whether a particular dietary pattern is associated with decreased/increased incidence of disease)
• Relative Risk (RR)
• Odds Ratio (OR)
• Attributable risk percent (AR%)
• Population attributable risk (PAR)
• These are measures of association in cohort studies and case-control studies
• → They allow us to determine diet-disease relationships

What Associations?
• How big, or how strong, is the association between the study factor (exposure) and the outcome factor?
  – So we can apply a numerical value to the risk that a particular diet will lead to a disease

Incidence
• Cumulative Incidence = number of people experiencing a NEW event during a time period / number of susceptible people at the beginning of the time period
• Incidence is a measure of events (event rate)
• Incidence is a measure of risk

Relative Risk (Risk Ratio) – for cohort studies and RCTs
• RR = incidence of the outcome in the exposed group / incidence of the outcome in the control group
  o The RR describes how much more (or less) likely people exposed to the nutritional factor are to get the disease than people who are not exposed
• Researchers measure participants’ lifestyle behaviours, rank them into groups according to their level of exposure, and then measure how many in each group get a particular disease.
• The lowest quintile = the control group
• E.g., if the incidence in the exposed group is 20% and the incidence in the control group is 10%, the RR = 2.0
  o This means that the exposed are twice as likely to have the event

Cohort study of effect of food x consumption on metabolic syndrome
          | Present (have the disease) | Absent (don’t have the disease) | Total
Exposed   | 2,000 | 28,000 | 30,000
Unexposed | 500 | 14,500 | 15,000
Total     | 2,500 | 42,500 | 45,000
• Incidence in exposed group: 2000/30000 = 0.067
• Incidence in unexposed group: 500/15000 = 0.033
• Relative risk = (2000/30000) / (500/15000) = 2.0 (dividing the rounded values, 0.067/0.033, gives 2.03)

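The incidence and relative risk calculations for this table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function and variable names are illustrative and do not come from the lecture.

```python
# Counts taken from the food x / metabolic syndrome cohort table
exposed_cases, exposed_total = 2000, 30_000
unexposed_cases, unexposed_total = 500, 15_000


def cumulative_incidence(new_cases: int, at_risk: int) -> float:
    """New cases during the period divided by people at risk at the start."""
    return new_cases / at_risk


def relative_risk(inc_exposed: float, inc_unexposed: float) -> float:
    """Incidence in the exposed group divided by incidence in the control group."""
    return inc_exposed / inc_unexposed


inc_exposed = cumulative_incidence(exposed_cases, exposed_total)
inc_unexposed = cumulative_incidence(unexposed_cases, unexposed_total)
rr = relative_risk(inc_exposed, inc_unexposed)

print(f"Incidence (exposed):   {inc_exposed:.3f}")
print(f"Incidence (unexposed): {inc_unexposed:.3f}")
print(f"Relative risk:         {rr:.2f}")
```

Working from the unrounded incidences avoids the small rounding drift that appears when 0.067 is divided by 0.033.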
Risk difference
• How many outcomes are due to the exposure?
• Risk in exposed group minus risk in the unexposed group
• Risk Difference = I(exposed) – I(unexposed)
• Risk Difference = 0.0667 – 0.0333 = 0.033 = 33 per 1000
• This gives you the DIFFERENCE in risk between groups
• Normally we see “Relative Risk” in papers

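As a quick check of the risk difference arithmetic, the same cohort counts (2000 of 30,000 exposed and 500 of 15,000 unexposed with the outcome) can be plugged into a short Python sketch. The variable names are illustrative only.

```python
# Incidence in each group, from the food x cohort counts
inc_exposed = 2000 / 30_000
inc_unexposed = 500 / 15_000

# Risk difference: extra risk in the exposed group over the unexposed group
risk_difference = inc_exposed - inc_unexposed
per_1000 = risk_difference * 1000

print(f"Risk difference: {risk_difference:.3f}")
print(f"Extra cases per 1000 exposed people: {per_1000:.0f}")
```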
Nutritional Epidemiology Lecture 3 – Confounding

Key Points
• Understand the definition of a confounding variable
• Understand the difference between a crude and an adjusted estimate of association
• Understand in general terms the main ways of controlling confounding (restriction, randomization, matching, stratification and multivariate analysis)
• Understand the difference between confounding and effect modification

Definition of confounding
• Confounding occurs when a measure of association is biased because of the association of the study factor with other factors that influence the outcome factor.
  o E.g., you may get a positive association between the amount of vegetable consumption and diabetes. This association may be real, or another factor that influences the development of diabetes, e.g., BMI, may be confounding the observed RR.
  o I.e., confounding occurs when the measured association is biased because another variable is changing the observed risk attributed to the study/exposure factor.

An Example – “Is high alcohol intake associated with lung cancer?”
                             | Lung cancer: YES | Lung cancer: NO | TOTAL
Exposed to high alcohol: YES | 615 | 24,385 | 25,000
Exposed to high alcohol: NO  | 210 | 24,790 | 25,000
TOTAL                        | 825 | 49,175 | 50,000

• Relative Risk = (615/25000) / (210/25000) = 0.0246 / 0.0084 = 2.9
  o Incidence of lung cancer in those exposed to high alcohol = 615/25000
  o Incidence of lung cancer in those who aren’t exposed to high alcohol (control) = 210/25000
  o I.e., study factor: high alcohol ––––> outcome factor: lung cancer
    ▪ However, this is the crude RR, and it may be biased due to the influence of confounders

• Confounding
  o Study factor: high alcohol → Confounding variable: smoking → Outcome factor: lung cancer
    ▪ Somewhere along the way, there may be a factor that’s confounding the relationship between high alcohol and lung cancer, i.e., smoking.
    ▪ The observed strong association between alcohol and lung cancer could be due to smoking being a confounder.

Nutritional Epidemiology Revision

Study Design
• What is the research question? – There are 3 types of questions
• What is/are the study factors and outcome factors?
• What is the study type?
  o RCT
  o Cohort
  o Case-control
  o Cross-sectional descriptive
  o Cross-sectional analytical
• What level of evidence is offered by the study type?
  o Is the level of evidence offered high or low when studying causation?
• What study design is feasible to answer the research question?
  o What study type would give best evidence vs what is most feasible?
    ▪ Sometimes there's a trade-off between feasibility and the highest level of evidence (because the best design is impractical)
    ▪ E.g., the best way to study the relationship between dietary fibre and bowel cancer is an RCT, to avoid subject bias. But it would take 20 years for the relationship between dietary fibre and bowel cancer to become apparent, and you wouldn't be able to ethically run an RCT for 20 years

Study Type – the study type you choose depends on your question (of which there are 3 types)
• How common is the nutritional problem?
  o Use a cross-sectional study, e.g., a survey
  o → Prevalence (how common)
  o New cases = “incidence”
• What nutritional factor caused or prevented the disease?
  o Cohort study
  o Case-control study
    ▪ Case-control = the reverse of a cohort study: you start from the outcome and look back at the exposure
    ▪ You already have your cases, and then you recruit a representative group of controls, then use recall to examine previous diet and calculate the OR for the association between a particular dietary exposure and the disease
  o Cross-sectional analytical (but you cannot infer causation from a cross-sectional analytical study)
• Does dietary intervention prevent or cure the disease?
  o RCT
  o E.g., does a particular drug lower blood pressure? Do n-3 fatty acids cure rheumatoid arthritis?
  o Systematic literature review with meta-analysis of RCTs = highest level of evidence

Epidemiologic Study Designs
[Figure: Epidemiologic Study Designs]
Module 4 Statistics for Nutrition Practice

Outline
• Descriptive Statistics (Lecture 1)
  o Graphical Displays of Data
  o Measuring Centre and Spread
  o Tabular Data
• Inferential Statistics (Lectures 2 and 3)
  o How to set up a Hypothesis Test
  o Types of Hypothesis Tests

Descriptive and Inferential Statistics
• Descriptive Statistics – describe the sample
• Inferential Statistics – use the sample to test theories about the population

Statistics Lecture 1 – Descriptive Statistics

Learning Objectives
• Summarise statistical data using:
  o Graphs and tables
  o Appropriate measures of location including means and medians
  o Appropriate measures of variability including standard deviations and interquartile ranges
• Interpret significance tests and confidence intervals
• Use IBM SPSS statistical software to present data

Descriptive Statistics
• The purpose of descriptive statistics is to become familiar with the data that you have collected
  o What information can you get from the data?
• What are we looking for?
  o What is a “typical” response?
  o How different are responses from different individuals?
  o Do responses differ between groups of individuals?
  o Is a measurement of one aspect of an individual dependent on another aspect?

Types of Data
• The way that we look at data will depend on the type of data that we have
Type of data          | Description               | Example                          | Summary measure
Categorical – Nominal | no order                  | Nationality, Gender              | Mode
Categorical – Ordinal | order                     | Likert Scale, level of education | Mode or Median
Numeric – Discrete    | takes whole number values | number of birds in a tree        | Mean, Median or Mode
Numeric – Continuous  |                           | height                           | Mean or Median

If your data is symmetrical, you use the mean. If it is skewed, you use the median. Mode is used as a last resort.

Measures of Central Tendency
• Where is the ‘middle’ of the data? – when you need to summarise the data
  o We could also think of it as what we would expect to measure for a typical respondent.
• Three different measures
  o Mean – add the observations and then divide by the number of observations
    ▪ The mean can be quite sensitive to large values and skewness – avoid using the mean to describe skewed data (asymmetry in data)
    ▪ Skewness
      • When the mean is dragged out to the right or left (may be due to extreme values)
      • With symmetric data, the mean and the median are very similar, and often the same. However, the mean has many statistical properties that we can use, which the median doesn’t have.
    ▪ Skewness from boxplots
      • If most of the observations are concentrated on the low end of the scale, the distribution is skewed right (positively skewed); and vice versa
      • If a distribution is symmetric, the observations will be evenly split at the median, as shown in the middle figure.
      [Figure: boxplots of a positively skewed, a symmetric and a negatively skewed distribution]
  o Median – is the middle observation (n = odd) or the average of the two middle observations (n = even)
    ▪ The median is useful when describing skewed data
    ▪ Boxplot revisited [Figure: boxplot]
  o Mode – is the most frequently occurring observation (merely a descriptive tool)
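
The effect of skewness on the three measures can be seen with a small made-up data set. The values below are hypothetical (they are not from the lecture data); the sketch uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical daily serves of fruit for nine people; the last value is extreme
serves = [1, 1, 2, 2, 2, 3, 3, 4, 18]

mean = statistics.mean(serves)      # pulled towards the extreme value
median = statistics.median(serves)  # middle observation of the sorted data
mode = statistics.mode(serves)      # most frequently occurring observation

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the single extreme value drags the mean to twice the median, which is why the median is the safer summary for skewed data.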
Statistics Lectures 2 & 3 – Inferential Statistics

Where does inference fit in?
• So far we have looked at how to describe data, but we haven’t been able to test our observations
• Inference provides the tools to test whether our observations can be applied to the population, as opposed to having arisen just by chance
• From the sample, we can make an inference about the population, including the mean and variation.

The Structure of a Hypothesis Test
• Step 1: Set up Hypotheses
  o Null Hypothesis: H0
  o Alternative Hypothesis: H1
• Step 2: Choose an appropriate test
  o Depending on the characteristics of your dataset, there is an appropriate test to apply to test your hypothesis
• Step 3: Execute the test in SPSS and obtain a p-value
  o P-value = the probability of observing a difference between the group means at least as large as the one in the sample when the null hypothesis is actually true
  o I.e., the chance that results like ours would arise by chance alone if there were no real effect
• Step 4: Make a conclusion
  o Either reject H0 if the p-value is small enough, or do not reject H0
  o You cannot accept a hypothesis – failing to reject H0 does not prove that H0 is true, and rejecting H0 does not automatically mean that the alternative hypothesis is true.
• Step 5: State the conclusion in plain language
  o So people who don’t understand statistics can still understand your conclusions.

Some Concepts
• The null hypothesis can usually be expressed as an equality between means from each group
  o If there is only one group, then the null hypothesis will be that the mean equals a particular value (a hypothesized value – just a nominated value)
• The alternative hypothesis can either be one sided or two sided
  o One sided – we reject the null hypothesis if the sample mean for one group is sufficiently larger than that of the other group (or sufficiently larger than the hypothesized value)
  o Two sided – we reject the null hypothesis if the sample mean for one group is sufficiently different to the other group (larger or smaller – no nominated direction)
  o Which to choose depends on what you want to know
• We make a decision based on a p-value
  o The p-value is the probability that we would observe the unequal sample means or something more extreme by chance when the null hypothesis is true.
• We compare the p-value to a level of significance
  o The level of significance is the level of risk that we are willing to accept that we will incorrectly reject a true null hypothesis
  o By default, we will use 5% = 0.05 significance – that is, we are willing to accept a 5% chance that we will incorrectly reject a true null hypothesis
    ▪ E.g., we are willing to accept a 5% chance of concluding that the means are unequal when in fact they are equal
  o In some situations we may change the level of significance
  o If p-value < 0.05 we reject H0
  o If p-value ≥ 0.05 we do not reject H0

Types of Hypothesis Tests
• The choice of test depends on what you’re trying to do:
What You’re Trying to Do | Hypothesis Test
Comparing the mean of a sample to a value | 1-sample test
Comparing the means of two samples to each other | 2-sample test
Comparing the means of more than two samples to each other | ANOVA
Comparing proportions | 1- and 2-sample proportion tests
Finding relationships between categorical variables | Chi-Square test
• It also depends on the type of data that you have
  o If you can assume that your data are normally distributed, use a parametric test (t, ANOVA)
  o If you cannot assume that your data are normally distributed, use a non-parametric test (Wilcoxon, Mann-Whitney, Kruskal-Wallis)

Which Test?
• Paired data = when the result on the 2nd measurement is dependent on the 1st measurement
  o E.g., individuals measured before and after an intervention
Types of Hypothesis Tests
• Recall that the appropriate graphical display of data depends on the type of data that you have:
Type of data | Categorical | Numeric
Categorical | Multiple Bar Charts | Multiple Boxplot
Numeric | Multiple Boxplot | Scatterplot

• It is the same for hypothesis testing
Type of data | Categorical | Numeric
Categorical | Chi-Squared Test | 1-sample t, paired t, 2-sample t, ANOVA
Numeric | Regression | Regression

1-Sample Tests
• I have one column of data, and I want to compare the population mean to a particular measurement, e.g., a known value from a well-defined population
• 1-sample t-tests assume that the data are normally distributed
  o I.e., if you have clearly skewed data, it is not appropriate to use a 1-sample t-test

H0: µ = value
H1: µ ≠ value (two sided alternative)
    µ < value (one sided alternative)
    µ > value (one sided alternative)

• 1-sample t-test example – Pulse Data
  o In the lab sessions, we considered a data set based on the pulses of people who either ran for one minute, or rested for one minute.
  o We can use inferential techniques to test some of the theories that we may have made from our exploration.
  o Suppose that we would like to test whether our group has a starting pulse rate that is different from the typical resting pulse rate of 75 bpm
  o Step 1: Set up the hypotheses
    ▪ H0: µ = 75
    ▪ H1: µ ≠ 75 (i.e., the mean is different to 75)
  o Step 2: Choose an appropriate test
    ▪ 1-sample t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value = 0.067, which is greater than the 0.05 level of significance
    ▪ Therefore we will not reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the initial pulse rate measurements do not differ significantly from the typical resting pulse rate of 75 bpm (i.e., the sample mean of 73 is not statistically different to 75)
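
The calculation that SPSS performs for a 1-sample t-test can be sketched by hand. The pulse values below are hypothetical (the lecture data set is not reproduced in these notes), so the result will not match the p-value of 0.067 quoted above; the critical value of 2.262 is the standard two-sided 5% value for 9 degrees of freedom from t tables, and the function name is illustrative.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se, n - 1


# Hypothetical starting pulse rates (bpm) for ten people; sample mean is 73
pulses = [62, 70, 75, 68, 78, 80, 71, 73, 79, 74]

t_stat, df = one_sample_t(pulses, mu0=75)

T_CRIT_DF9 = 2.262  # two-sided 5% critical value for 9 df, from t tables
reject = abs(t_stat) > T_CRIT_DF9

print(f"t = {t_stat:.3f} on {df} df; reject H0: {reject}")
```

Comparing |t| with the critical value is equivalent to comparing the two-sided p-value with 0.05; SPSS reports the p-value directly.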
2-sample tests
• I have two samples and I want to compare them to each other
  o If each individual has one measurement in each sample, then my data is paired by the individual: use a paired t-test – also used for individuals who were matched according to certain characteristics thought to influence outcomes
  o If my data is not paired, use a 2-sample t-test
• Both tests assume that the data are normally distributed – you will get misleading results if you have skewed data

• Paired t-test example – Pulse Data
  o Suppose that I want to see whether the first and second pulse rates for those participants who ran were different or not
  o Step 1: Set up the hypotheses
    ▪ H0: µ1 = µ2 – the mean obtained from the first pulse rate measurements is the same as the mean obtained from the second pulse rate measurements
    ▪ H1: µ1 ≠ µ2 – the means are not the same
  o Step 2: Choose an appropriate test
    ▪ Paired t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance) – the probability of observing a difference of this magnitude when in truth there isn’t a difference is very small
    ▪ Therefore we will reject H0 – we reject the notion of there being no difference
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference between the initial pulse rate and the final pulse rate for those participants who ran
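
A paired t-test is a 1-sample t-test applied to the within-person differences. The sketch below shows that idea with hypothetical before and after pulse rates for six runners (not the lecture data); the critical value 2.571 is the standard two-sided 5% value for 5 degrees of freedom, and the function name is illustrative.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and df for H0: mean of (after - before) equals 0."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of mean difference
    return statistics.mean(diffs) / se, n - 1


# Hypothetical pulse rates (bpm) for six people before and after running
pulse1 = [70, 72, 68, 80, 76, 74]
pulse2 = [95, 100, 88, 110, 99, 96]

t_stat, df = paired_t(pulse1, pulse2)

T_CRIT_DF5 = 2.571  # two-sided 5% critical value for 5 df, from t tables
reject = abs(t_stat) > T_CRIT_DF5

print(f"t = {t_stat:.2f} on {df} df; reject H0: {reject}")
```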
• 2-Sample t-test example – Pulse Data
  o The two sample t-test determines whether the means of two unrelated populations are the same or not
  o In general, the hypotheses are:
    ▪ H0: µ1 = µ2 – i.e., the mean of group one is equal to the mean of group two
    ▪ H1: µ1 ≠ µ2 – i.e., the difference between the means of the groups is not 0
  o We can also decide whether or not we should use a “pooled variance”, i.e., assume that the variances of the two groups are equal and obtain a more powerful test
    ▪ To decide this, we need to look at a 2-sample variances test
    ▪ SPSS automatically conducts a test for equal variances when it does a 2-sample t-test
      • This test is called Levene’s test
    ▪ Levene’s test has hypotheses:
      • H0: σ1² = σ2² – i.e., there is no difference in the variances
      • H1: σ1² ≠ σ2² – i.e., there is a difference in the variances
    ▪ If we reject the null hypothesis, we cannot assume equal variances and need to use the p-value associated with “equal variances not assumed”
    ▪ If we do not reject the null hypothesis, we can assume equal variances and use the p-value associated with “equal variances assumed”

  o Now suppose that I wish to test whether there is a significant difference between the final pulse rates of the males and females who ran
    ▪ In this case, the participants in one group will be males and the other females
    ▪ Therefore the data won’t be paired, and we should use a 2-sample t-test

  o Step 1: Set up the hypotheses
    ▪ H0: µM = µF – i.e., the mean final pulse rates in males and females are equal
    ▪ H1: µM ≠ µF – i.e., the mean final pulse rates in males and females are different
  o Step 2: Choose an appropriate test
    ▪ 2-sample t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference between the final pulse rates of the male participants and the final pulse rates of the female participants who ran
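
The two rows of SPSS output (“equal variances assumed” and “equal variances not assumed”) correspond to two versions of the t statistic. The sketch below computes both for hypothetical final pulse rates; it does not implement Levene's test itself, the data are invented for illustration, and the function names are not from SPSS.

```python
import math
import statistics


def pooled_t(x, y):
    """2-sample t statistic and df assuming equal variances (pooled variance)."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, n1 + n2 - 2


def welch_t(x, y):
    """2-sample t statistic and approximate df without assuming equal variances."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b), df


# Hypothetical final pulse rates (bpm) for participants who ran
males = [88, 92, 95, 90, 85]
females = [104, 110, 98, 107, 101]

t_pooled, df_pooled = pooled_t(males, females)
t_welch, df_welch = welch_t(males, females)

print(f"Equal variances assumed:     t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Equal variances not assumed: t = {t_welch:.3f}, df = {df_welch:.2f}")
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ; with unequal group sizes and unequal variances the two rows can lead to different conclusions, which is why Levene's test is checked first.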
Testing Proportions
• We can also perform tests on proportions, e.g., the percentage of smokers
• We set up hypotheses in the same way as we did for means
• A 1-proportions test will compare the sample proportion to a hypothesized proportion, similar to a 1-sample t-test
  o H0: p = p0 – i.e., there is no difference between the proportions
  o H1: p ≠ p0 (or > or <) – i.e., there is a difference between the proportions

• 1-Proportions test example – Pulse Data
  o It is claimed that more than 15% of the students in the cohort that participated in the pulse experiment smoke.
  o Using the pulse sample, we can test this claim.
  o Step 1: Set up the hypotheses
    ▪ H0: p = 0.15 – i.e., the proportion of people who smoke is equal to 15%
    ▪ H1: p > 0.15
  o Step 2: Choose an appropriate test
    ▪ 1-proportions test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the proportion of students who smoke is significantly greater than 15%
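
One common way to carry out a 1-proportions test is the normal approximation (z-test) sketched below. This is an assumption about the method: SPSS may use an exact binomial calculation instead, and the counts used here (28 smokers out of 92 students) are illustrative because the notes do not give the sample counts.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """One-sided z-test of H0: p = p0 against H1: p > p0 (normal approximation)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)   # upper-tail probability
    return p_hat, z, p_value


# Illustrative counts only: 28 smokers among 92 students
p_hat, z, p_value = one_proportion_z(successes=28, n=92, p0=0.15)

print(f"sample proportion = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.6f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```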
• Multiple Comparisons Tests
  o Analysis of Variance only considers whether or not there are differences between the means across the groups. It does not find WHERE those differences are (or how much difference there is)
  o Multiple comparisons tests test the difference between pairs of groups
  o Tukey’s multiple comparisons tests set a family error rate of 0.05.
    ▪ In this case, the probability that we observe any significant differences when none exist is 0.05
  o Some tests set the individual error rate at 0.05.
    ▪ In this case, the probability of incorrectly finding differences somewhere will be a lot more than 0.05
  o Therefore, care needs to be taken when choosing a post-hoc procedure.

  o Multiple Comparisons Tests Example – Pulse Data
    ▪ We used ANOVA to determine whether the changes in pulse rates differed between participants with different levels of activity; now we would like to see exactly which groups differed.
    ▪ We re-do a One-Way ANOVA and select “Post-Hoc”, then select Tukey
    ▪ We will obtain the ANOVA table, as well as a multiple comparisons table.
    ▪ We can notice that there is a significant difference between Moderate and High levels of activity, but no significant differences between the other levels of activity.
      • The change in pulse rate for those with a Slight level of activity is not significantly different to the other groups.
    ▪ This is better understood with a multiple boxplot.
    ▪ Notice that SPSS gives 95% Confidence Intervals for the difference between pairs of groups.
      • If 0 lies in the confidence interval, then we would conclude that the means of the two groups are not significantly different
      • If both bounds are positive, or if both bounds are negative, then we would say that the groups have different means
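
The gap between an individual and a family error rate can be illustrated with a quick calculation. The sketch below assumes the pairwise comparisons are independent, which is a simplification (comparisons that share a group are correlated), so the figure is only a rough guide; the function name is illustrative.

```python
def family_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons


groups = 3                               # Slight, Moderate and High activity
pairs = groups * (groups - 1) // 2       # number of pairwise comparisons
fwer = family_error_rate(0.05, pairs)

print(f"{pairs} comparisons at 0.05 each -> family error rate of about {fwer:.3f}")
```

With three groups the family error rate is already close to three times the individual rate, and it grows quickly as more groups are compared.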
• Kruskal-Wallis Test Example – Pulse Data
  o We can also repeat the activity level analysis using a non-parametric test
  o First we need to calculate the change in pulse rate: Change in pulse rate = Pulse2 – Pulse1
  o Step 1: Set up the hypotheses
    ▪ H0: median1 = median2 = median3
    ▪ H1: not all of the medians are equal
  o Step 2: Choose an appropriate test
    ▪ Kruskal-Wallis
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value = 0.002 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference in the changes in pulse rates between participants with different levels of activity.
  o In this case, we made the same conclusion as with the ANOVA, but the observed p-value was larger.
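
The Kruskal-Wallis test works on the ranks of the pooled data rather than on the raw values. The sketch below computes the H statistic for hypothetical changes in pulse rate in three activity groups (not the lecture data). It omits the correction for ties, so the example data contain no tied values, and it uses the chi-square approximation with 2 degrees of freedom, which is rough for samples this small.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (no tie correction) from the rank sums of each group."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical changes in pulse rate (Pulse2 - Pulse1) by activity level
slight = [10, 14, 18, 21]
moderate = [12, 16, 22, 25]
high = [28, 30, 35, 40]

h = kruskal_wallis_h([slight, moderate, high])
p_value = math.exp(-h / 2)  # upper tail of chi-square with 2 df (3 groups)

print(f"H = {h:.3f}, approximate p-value = {p_value:.4f}")
```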
Chi-Square Tests
• We have looked at a test for a single categorical variable with two levels (proportions of smokers)
  o We performed hypothesis tests based on the proportions in each group
• If we wish to compare two or more groups, or have more than two levels in the categorical variable, then we need a more sophisticated test
• A chi-squared test compares two or more categorical variables, each with two or more categories.
  o This test compares the observed frequencies in each cell of the crosstab to what we would expect to see there if the variables were independent.
• We assume that we have an adequate sample for each cell of the crosstabulation
  o An observed frequency of at least 5, and an expected frequency of at least 5
  o If there is not an adequate sample, then we need to combine groups
• The hypotheses for this test are:
  o H0: The variables are independent of each other
  o H1: The variables are not independent of each other
• If we reject the null hypothesis then we can say that the variables are related, or the proportions differ between groups (depending on what you set out to find)

• Chi-square test example – Smoking
  o Suppose that we would like to test whether there are gender differences in whether a person smokes or not
  o Step 1: Set up the hypotheses
    ▪ H0: Gender and smoking status are independent
    ▪ H1: Gender and smoking status are not independent
  o Step 2: Choose an appropriate test
    ▪ Chi-square test
  o Step 3: Execute the test in SPSS and obtain a p-value
    ▪ Expected Count = the count if there was no association between the variables
    ▪ E.g., 62% of the participants are male, so if H0 was true, we should find 62% of the participants in each category to be male, i.e., we’d find 62% of smokers were male. For a category of 64 participants, the expected count of males is 62% of 64 = 39.7 participants
  o Step 4: Make a conclusion
    ▪ P-value = 0.216 (0.05 level of significance) – We are using the Pearson chi-square p-value
    ▪ Therefore we will not reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the proportions of males and females who smoke are not significantly different (the proportions of males and females across the groups, smokers and non-smokers, are not significantly different)
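
The expected counts and the Pearson chi-square statistic can be computed directly from a crosstab. The notes do not reproduce the SPSS crosstab, so the observed counts below are an assumption: illustrative values chosen to be consistent with the 62% male share and the p-value of 0.216 quoted above. The function name is also illustrative.

```python
import math


def chi_square_2x2(observed):
    """Expected counts, Pearson chi-square statistic and p-value for a 2x2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total x column total / grand total
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return expected, stat, p_value


# Assumed counts. Rows: male, female. Columns: smoker, non-smoker.
observed = [[20, 37],
            [8, 27]]

expected, stat, p_value = chi_square_2x2(observed)

print(f"Expected male non-smokers: {expected[0][1]:.1f}")
print(f"Chi-square = {stat:.3f}, p-value = {p_value:.3f}")
```

The `erfc` shortcut for the p-value only applies to a 2x2 table (1 degree of freedom); larger tables need the general chi-square distribution.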
Regression
• Linear regression (Ordinary least squares/OLS)
  o Use linear regression when the outcome of interest is a continuous variable
• Logistic regression (Binary logistic)
  o Use logistic regression if the outcome variable is categorical with 2 possible outcomes (e.g., did the patients survive or not survive at the end of the period of time)

• Simple linear regression
  o The association between two continuous variables can be depicted graphically using a scatter diagram
  o The method of simple linear regression allows us to use an equation to represent the relationship between x and y
  o The equation of a straight line can be written as: y = α + βx
  o α is the expected value of y when x is zero
  o β (the regression coefficient) is the expected change in y as x increases by 1 unit (the slope)
  o A positive β indicates y increases as x increases; a negative β indicates y decreases as x increases
  o When β = 0, then there is no association between y and x, because the expected value of y does not change as x changes – i.e., when the gradient is 0, there is no meaningful relationship between x and y
  o Three assumptions are made:
    1. There is a linear relationship between x and y
    2. The variability about the regression line (the line of best fit) is the same for all values of x, with constant standard deviation – i.e., the y-values are spread evenly around the line of best fit
    3. The distribution of y for any given x is normal
  o The null hypothesis is that there is no association between y and x, which is equivalent to assuming that the slope is 0.

  o Simple linear regression example – height and weight
    ▪ Step 1: Set up the hypotheses
      • H0: β = 0
      • H1: β ≠ 0 – i.e., the gradient is not 0
    ▪ Step 2: Choose an appropriate test
      • Linear regression
    ▪ Step 3: Execute the test in SPSS and obtain a p-value
    ▪ Step 4: Make a conclusion
      • P-value < 0.001 (0.05 level of significance)
      • Therefore we will reject H0 (that there is no association between height and weight)
    ▪ Step 5: State the conclusion in plain language
      • Therefore there is a relationship between weight and height:
        o weight = −91.147 + 90.008 × height
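
The least-squares estimates of α and β can be computed from their textbook formulas, and the fitted equation can then be used for prediction. In the sketch below the small height/weight data set is hypothetical, and the prediction step assumes height is measured in metres and weight in kilograms, which the notes do not state.

```python
def fit_line(x, y):
    """Ordinary least squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sxy / sxx                 # slope: expected change in y per unit of x
    alpha = mean_y - beta * mean_x   # intercept: expected y when x is zero
    return alpha, beta


# Hypothetical data: heights (m) and weights (kg) for four people
heights = [1.60, 1.70, 1.80, 1.90]
weights = [55.0, 62.0, 70.0, 81.0]

alpha, beta = fit_line(heights, weights)
print(f"Fitted line for the hypothetical data: weight = {alpha:.1f} + {beta:.1f} x height")

# Prediction from the equation reported in the notes (units assumed: m and kg)
predicted = -91.147 + 90.008 * 1.75
print(f"Predicted weight at 1.75 m: {predicted:.1f} kg")
```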