Sample size in health sciences - Basics and selected examples

Post on 12-Apr-2017


Sample size estimation: Basics & selected examples

Dr. S. A. Rizwan, M.D.

Public Health Specialist

SBCM, Joint Program – Riyadh

Ministry of Health, Kingdom of Saudi Arabia

Learning objectives

Demystifying statistics! SBCM, Joint Program – Riyadh

• Importance of sample size estimation
• Basic concepts in sample size calculation
• How does sample size relate to study results?
• Sample size calculation in specific situations

Books and software

• Books
  • Sample size determination in health studies: a practical manual (Lwanga & Lemeshow)
  • Sample Size Calculations in Clinical Research (Shein-Chung Chow, Hansheng Wang, Jun Shao)
• Software
  • Epitools, online calculators, StatCalc in Epi Info, G*Power
  • PASS, nMaster, StatsDirect, Stata
  • Many others

Obligatory opening joke!

Rethinking Aesop's fables…

Let's play a game!

Sample size: Basic concepts

Prerequisites for this class

• Understanding of the following basic concepts
  • Types of study designs
  • Measures of association
  • Mean/SD
  • Proportion
  • Standard error
  • Hypothesis testing and types
  • Confidence intervals

Some related terms

• Significance level
• Power
• Effect size
• Variability
• Precision

Confidence level    Zα
95% (2-sided)       1.960
95% (1-sided)       1.645
99% (2-sided)       2.576
99% (1-sided)       2.326

Power    Zβ
90%      1.282
85%      1.036
80%      0.842
75%      0.675
70%      0.524
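The Zα and Zβ values in the table are quantiles of the standard normal distribution, so they can be looked up rather than memorised. A minimal sketch using Python's standard library; the helper names `z_alpha` and `z_beta` are illustrative, not from the slides:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def z_alpha(confidence, two_sided=True):
    """Z value for a confidence level, e.g. 0.95."""
    alpha = 1 - confidence
    tail = alpha / 2 if two_sided else alpha
    return STD_NORMAL.inv_cdf(1 - tail)

def z_beta(power):
    """Z value for a desired power, e.g. 0.80."""
    return STD_NORMAL.inv_cdf(power)

for conf in (0.95, 0.99):
    print(conf, round(z_alpha(conf), 3), round(z_alpha(conf, two_sided=False), 3))
for power in (0.90, 0.85, 0.80, 0.75, 0.70):
    print(power, round(z_beta(power), 3))
```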


Sample size & statistical inference

• Two methods of statistical inference
  • Hypothesis testing
  • Confidence interval estimation

Two aspects of a good sample

• The sample size
  • If adequate, then good internal validity
• The sampling method
  • If representative, then good external validity

Why calculate sample size?

• Stating the assumptions and parameters before the start of the study increases the validity of the statistical conclusions made after the study
• Post-hoc analyses and their results are considered merely exploratory

Thought exercise

• I am applying for a job and in my resume I have stated that my typing speed is very fast.
• My friend is applying for the same job and in his resume he has stated that his typing speed is 60 words/min.

Which candidate are you more likely to assess in a valid manner?

Why calculate sample size? (contd.)

• Funds and time constraints
• Really not necessary to study the entire population (ethical problem!)
• Small samples are unable to detect clinically relevant differences
• If a study with a small sample finds non-significant results, what does it mean?

Thought exercise

• Study 1: A study of an anti-hypertensive drug was conducted on 10,000 people and showed a statistically significant fall in BP of 1 mm Hg over 3 months
• Study 2: A 30% reduction in mortality due to propranolol was found among MI patients, but it was not statistically significant. 66 cases and 64 controls were studied

State your comment on each of the above scenarios.

Then, how large should the SS be?

• Neither too small nor too large

Sample size estimated for primary objective

• Sample size is calculated for the 'primary outcome variable'
• If there is more than one primary outcome, the sample size is calculated for each outcome and the largest is chosen

Common scenarios for sample size

• 100s of scenarios for calculating sample size
• Descriptive:
  • Proportion, mean/SD
• Analytical:
  • Two proportions, two means/SD
  • Also, risk difference, OR & RR, incidence density

Uncommon scenarios for sample size

• Survival
• Regression
• Correlation
• Quality assurance
• Diagnostic test studies
• And many more

Further considerations for sample size

• Study design
  • Cluster design
  • Crossover
  • Matched/paired
• Type of hypothesis (inequality, equivalence, non-inferiority & superiority)
• Fixed follow-up duration
• Ratio of controls to cases
• Hypothesis testing or CI estimation?

How to approach a SS problem?

1. Convert the research question into a statistical problem statement
2. Determine the formula or software command and the inputs needed
3. Select the sources for the inputs
4. Substitute the values in the formula or enter them in the software
5. Factor in the non-response/drop-out rate

How to approach a SS problem?

• First: Convert the research question into a statistical problem statement
• For e.g.,
  • To estimate the mean birth weight of neonates born to mothers with anaemia in the eastern sector of Riyadh
  • Estimation of a single mean with stated precision

How to approach a SS problem?

• Second: Find out the formula or the software command appropriate for this problem
• For e.g.,
  • Estimation of a single mean with stated precision

$$N = \frac{Z_{\alpha}^{2}\, S^{2}}{L^{2}}$$

where Zα is the standard normal value for the chosen confidence level, S is the expected SD and L is the desired precision.

How to approach a SS problem?

• Second (contd.): Determine the inputs you require for the formula
  • Expected proportion, incidence
  • Expected SD
  • Expected RR or OR
  • Power, precision
  • Confidence level
  • Others (design effect [DE], intracluster correlation coefficient [ICC], coefficient of variation [COV], cluster size)
• For e.g.,
  • Estimate of SD, alpha & precision

How to approach a SS problem?

• Third: Selecting the sources for the inputs
  • Match the location as closely as possible
  • Match the study population as closely as possible
  • Match the study setting as closely as possible
  • Match the statistic as closely as possible
  • Or conduct a pilot study
• For e.g.,
  • Other sector in Riyadh -> some other city in KSA -> Middle East -> any developing country -> anywhere

How to approach a SS problem?

• Third (contd.): What sources to use?
  • Published literature
  • Pilot study
  • Experts in the field
  • Educated guess (gut feeling)

This raises the question: if we already know these inputs, why conduct the study in the first place?

How to approach a SS problem?

• Third (contd.): e.g., an appropriate source

How to approach a SS problem?

• Fourth: Substitute the values in the formula or enter them in the software

$$N = \frac{Z_{\alpha}^{2}\, S^{2}}{L^{2}}$$

N = (1.96 × 1.96 × 600 × 600) / (100 × 100)
N = 138.3
N = rounded up to 140
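The same substitution can be checked in a few lines of code. This is a minimal sketch of the single-mean formula with the values from this slide (SD 600, precision 100, 95% confidence); the function name `n_single_mean` is illustrative:

```python
import math
from statistics import NormalDist

def n_single_mean(sd, precision, confidence=0.95):
    """Sample size to estimate one mean: n = z^2 * sd^2 / precision^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided Z value
    return (z ** 2) * (sd ** 2) / (precision ** 2)

n = n_single_mean(sd=600, precision=100)
print(round(n, 1))   # 138.3
print(math.ceil(n))  # 139, the smallest whole number of subjects
```

The slide rounds further up to 140, which is a convenience choice rather than a requirement of the formula.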


How to approach a SS problem?

• Fifth: Factor in the non-response/drop-out rate
• Final sample size = calculated sample size / (1 − non-response rate)
• For e.g.,
  • For a non-response rate of 20%
  • Final sample size = 140 / 0.80 = 175
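The adjustment for non-response is a single division, sketched below. The helper name `final_sample_size` is illustrative; the result is rounded up to a whole subject, and the intermediate rounding only guards against floating-point noise.

```python
import math

def final_sample_size(n_calculated, non_response_rate):
    """Inflate a calculated sample size for expected non-response or drop-out."""
    if not 0 <= non_response_rate < 1:
        raise ValueError("non_response_rate must be in [0, 1)")
    inflated = n_calculated / (1 - non_response_rate)
    return math.ceil(round(inflated, 6))  # round first to absorb float noise

print(final_sample_size(140, 0.20))  # 175
```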


Sample size: Some selected scenarios

Sample size in specific situations

Authors | Original research question | Simplified problem statement

1. Dr. Nariman | What is the proportion of patients who quit smoking in a tobacco cessation program? | Estimation of a single proportion for a special group

2. Dr. Ghadeer | What is the incidence of DM in obese hypertensives and what is the incidence of DM in non-obese hypertensives during a five-year follow-up period? | Comparison of incidence rates in two groups in a cohort study

3. Dr. Rahma | What is the proportion of LBW neonates born to sickle cell mothers and what is the proportion of LBW neonates born to normal mothers in a cohort of mothers? | Comparison of two proportions in a cohort study

4. Dr. Abrar | What is the proportion of ILI-absent students in the handwashing schools and what is the proportion of ILI-absent students in the control schools? Here schools are the units of randomisation | Comparison of two proportions in a 2-group cluster RCT

Scenario 1: Estimating a single proportion

Scenario 1 – Step 1

• What is the proportion of patients who quit smoking in a tobacco cessation program?
• Specifically, what is the proportion of patients with DM and HTN who quit smoking in a tobacco cessation program?
• It is a cross-sectional study based on secondary data analysis
• Estimating a single proportion

Scenario 1 – Step 2

• SS formula for estimating a single proportion with stated precision
• Inputs required are the expected proportion of quitting, precision & confidence level

Scenario 1 – Step 3

• A thorough literature review and preliminary data analysis showed a wide variation in the expected proportion, from 10% to 50%

Scenario 1 – Step 4

• Substituting the values for a number of scenarios in the software
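Running several scenarios is easy to script. The sketch below uses the usual normal-approximation formula for a single proportion, n = Z² p(1 − p) / d². The expected proportions span the 10% to 50% range found in the literature review, but the absolute precision values `d` are assumptions chosen for illustration, since the software inputs were not preserved in this transcript.

```python
import math
from statistics import NormalDist

def n_single_proportion(p, d, confidence=0.95):
    """Sample size to estimate proportion p within +/- d (absolute precision)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

# Hypothetical grid: expected proportions from the review, assumed precisions.
for p in (0.10, 0.30, 0.50):
    for d in (0.05, 0.02):
        print(f"p={p:.2f} d={d:.2f} n={n_single_proportion(p, d)}")
```

For a fixed precision the required size is largest at p = 0.50, which is why the widest plausible proportion is often used when the literature disagrees.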


Scenario 1 – Step 5

• The concept of dropouts or loss to follow-up is not applicable in this case because it is a secondary data analysis
• So the sample size should be > 400 but need not be > 3500
• Final decision will depend on feasibility

Scenario 2: Comparison of incidence rates in a two-group cohort study

Scenario 2 – Step 1

• Hypothesis: the risk of developing DM (incidence) will be higher in obese hypertensive patients than in non-obese hypertensive patients during a 5-year follow-up period
• It is a cohort study with two groups
  • Exposed is obese hypertensive
  • Non-exposed is non-obese hypertensive
  • Outcome is incidence of DM
• Estimating a difference between two incidence rates in a cohort study

Scenario 2 – Step 1

• This problem can be visualised in a number of ways:

1. Comparing two incidence rates in a cohort study (Relative Risk – hypothesis test)
2. Comparing two incidence rates in a cohort study (Relative Risk – stated precision)
3. Comparing two incidence rates in a cohort study with small proportion and fixed study duration (Risk difference – hypothesis test)
4. Comparing two proportions (Risk difference – hypothesis test)

Scenario 2 – Step 2

• Method 1: SS formula for estimating RR with stated precision
• Inputs required are the expected proportion of disease among exposed & unexposed, RR, precision, confidence level

Scenario 2 – Step 2

• Method 2: SS formula for hypothesis testing of RR
• Inputs required are the expected proportion of disease among exposed & unexposed, RR, power, confidence level

Scenario 2 – Step 2

• The SS formula for the difference in two proportions (aka risk difference) can also be used for this scenario
  • Risk difference between 2 proportions
  • Risk difference between 2 incidence rates with fixed study duration

Scenario 2 – Step 3

• A casual literature review showed that the risk of DM was 5 times higher among obese HTN as compared to non-obese HTN; the incidence among non-obese was 5.4 and among obese was 24.2 per 1000 person-years
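Formulas based on proportions need the rates above expressed as risks over the 5-year follow-up. A minimal sketch of that conversion, assuming a constant incidence rate over time (an assumption made here, not stated in the slides); the variable names are illustrative:

```python
import math

RATE_NON_OBESE = 5.4 / 1000   # cases per person-year, from the literature review
RATE_OBESE = 24.2 / 1000
FOLLOW_UP_YEARS = 5

def cumulative_risk(rate, years):
    """Risk over a fixed period, assuming a constant incidence rate."""
    return 1 - math.exp(-rate * years)

rate_ratio = RATE_OBESE / RATE_NON_OBESE
p_unexposed = cumulative_risk(RATE_NON_OBESE, FOLLOW_UP_YEARS)
p_exposed = cumulative_risk(RATE_OBESE, FOLLOW_UP_YEARS)

print(round(rate_ratio, 2))   # crude ratio of the two quoted rates
print(round(p_unexposed, 3), round(p_exposed, 3))
```

The crude ratio of the two quoted rates comes out near 4.5, a little below the "5 times" figure, so it is worth checking which value is entered as the expected RR.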


Scenario 2 – Step 4

• Method 1 & 2: Substituting the values for a number of scenarios in the software

Scenario 2 – Step 5

• Considering a loss to follow-up of 10%
• Final sample size = 716 / 0.90 ≈ 796 per group (rounded up)

Scenario 3: Comparison of proportions in a two-group cohort study

Scenario 3 – Step 1

• Hypothesis: the proportion of LBW neonates will be higher in the sickle cell mothers than in the non-sickle cell mothers
• It is a cohort with two groups
  • Exposed is mothers with sickle cell disease
  • Non-exposed is normal mothers
  • Outcome is proportion of LBW
• Estimating a difference between two proportions in a cohort study

Scenario 3 – Step 1

• This problem can be visualised in a number of ways:

1. Comparing two incidence rates in a cohort study (Relative Risk – hypothesis test)
2. Comparing two incidence rates in a cohort study (Relative Risk – stated precision)
3. Comparing two incidence rates in a cohort study with small proportion and fixed study duration (Risk difference – hypothesis test)
4. Comparing two proportions (Risk difference – hypothesis test)

Scenario 3 – Step 2

• Method 1: SS formula for estimating risk difference (hypothesis test)
• Inputs required are the expected proportion of disease among exposed & unexposed, power, confidence level
  • Difference between 2 proportions

Scenario 3 – Step 3

• A literature review showed that the proportion of LBW among SCD mothers was 16.5% and in the normal mothers it was 8.3%, with an RR of ~2

Scenario 3 – Step 4

• Substituting the values for a number of scenarios in the software
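As a rough cross-check on the software, the sketch below applies a common normal-approximation formula for comparing two proportions with equal group sizes. The two-sided alpha of 0.05 and the power values are assumptions, since the inputs used in the software are not preserved here, so the result need not match the software output exactly.

```python
import math
from statistics import NormalDist

def n_two_proportions(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size for a two-sided test of p1 versus p2."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_beta = nd.inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under the null hypothesis
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# LBW proportions from the literature review: 16.5% vs 8.3%
print(n_two_proportions(0.165, 0.083, power=0.90))  # 338
print(n_two_proportions(0.165, 0.083, power=0.80))  # 253
```

Different packages apply different corrections (continuity correction, pooled versus unpooled variance), which explains small differences between tools.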


Scenario 3 – Step 5

• Considering a loss to follow-up of 10%
• Final sample size = 331 / 0.90 ≈ 368 per group

Scenario 4: Comparison of proportions in a two-group cluster RCT

Scenario 4 – Step 1

• Hypothesis: the proportion of students absent due to ILI will be higher in the control schools than in the schools implementing the handwashing program during a follow-up period of 6 weeks
• It is a cluster RCT with two groups
  • Exposed is handwashing program
  • Non-exposed is no handwashing program
  • Outcome is proportion of ILI absenteeism
  • School is the unit of randomisation
• Estimating a difference between two proportions in a cluster RCT

Scenario 4 – Step 1

• This problem can be visualised in a number of ways:

1. Comparing two proportions (Risk difference – hypothesis test using ICC)
2. Comparing two proportions (Risk difference – hypothesis test using Design Effect)
3. Comparing two proportions (Risk difference – hypothesis test using Coefficient of variation)

Scenario 4 – Step 2

• Method 1: SS formula for comparison of proportions using design effect
• Inputs required are the proportion of outcome in the exp. group & control group, size of cluster, DE, power, confidence level

Scenario 4 – Step 2

• Method 2: SS formula for comparison of proportions using intracluster correlation coefficient
• Inputs required are the proportion of outcome in the exp. group & control group, size of cluster, ICC, power, confidence level

Scenario 4 – Step 3

• A literature review showed that the incidence of ILI absenteeism was 0.043 in the exp. group and 0.070 in the control group

Scenario 4 – Step 4

• Substituting the values for a number of scenarios in the software

Scenario 4

• Method 3: SS formula for comparison of incidence rates (person-time)
• Inputs required are the incidence rates (PT) in the exp. group & control group, coeff. of variation, power, confidence level

Scenario 4 – Step 5

• Considering a loss to follow-up of 10%
• Final sample size = 1625 / 0.90 ≈ 1806 per group (rounded up)
• No. of clusters required = 1806 / 40 ≈ 45.2, i.e. 46 per group when rounded up
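The link between ICC, design effect and number of clusters can be sketched as follows, using the standard relation DE = 1 + (m − 1) × ICC for clusters of equal size m. The cluster size of 40 comes from the slide; the ICC of 0.01 and the unadjusted size of 1000 per group are hypothetical values chosen only to show the mechanics.

```python
import math

def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def cluster_sample(n_individual, cluster_size, icc):
    """Return (subjects per group, clusters per group) after inflating by DE."""
    inflated = n_individual * design_effect(cluster_size, icc)
    n_adjusted = math.ceil(round(inflated, 6))  # round first to absorb float noise
    n_clusters = math.ceil(n_adjusted / cluster_size)
    return n_adjusted, n_clusters

# Hypothetical inputs: 1000 per group before adjustment, ICC = 0.01
print(cluster_sample(n_individual=1000, cluster_size=40, icc=0.01))  # (1390, 35)
```

Even a small ICC inflates the sample noticeably when clusters are large, which is why the cluster size is itself an input to the calculation.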


Review

• Why is sample size calculation important?
• What are the five steps to calculate the SS?
• What are some of the common inputs required for sample size formulae?
• How will you select an appropriate source for the inputs of a SS formula?
• How will you relate the SS of your study to its results afterwards?

Take home messages

• A priori sample size calculation is crucial for making valid conclusions
• Follow the stepwise approach
• Sample size estimation does not need to be very accurate, only adequate
• In case of non-significant findings in a study, calculate power for deeper understanding
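For the last point, power can be worked out from the group sizes actually achieved. The sketch below uses a normal approximation for two proportions and ignores the negligible chance of rejecting in the wrong direction. The proportions are the 16.5% and 8.3% quoted in Scenario 3, while the group size of 65 is a hypothetical small study and alpha = 0.05 is an assumption.

```python
import math
from statistics import NormalDist

def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing p1 and p2."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))          # under no difference
    sd_alt = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))     # under the alternative
    z = (abs(p1 - p2) * math.sqrt(n_per_group) - z_alpha * sd_null) / sd_alt
    return nd.cdf(z)

print(round(power_two_proportions(0.165, 0.083, n_per_group=65), 2))
print(round(power_two_proportions(0.165, 0.083, n_per_group=338), 2))
```

With 65 per group the power is well under 50%, so a non-significant result would say little about whether a real difference exists.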


Thank you! Email your queries to sarizwan1986@outlook.com