Session 4: Classroom MBI Team Training Presented by the MBI Consultants.
Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor...
Transcript of Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor...
![Page 1: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/1.jpg)
Basic Statistics: MBI-540
Sally W. Thurston, PhD
Associate ProfessorDepartment of Biostatistics and Computational Biology
University of Rochester
January 2011
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 2: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/2.jpg)
Why statistics?
Allows us to say something (“make inference”) about apopulation based on data from a sample.Example: measure effect of drug on weight loss in 20BALB/c mice
Sample: the 20 micePopulation: all BALB/c mice (that could be subject to thesame condition)Interested in weight loss in BALB/c mice, not just in the 20mice.Statistics allows us to estimate weight loss in the populationusing data from a sample.
Many journal articles ask for statistics
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 3: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/3.jpg)
Some comments
Design of experiment is keyNo statistical method can fix bad data!Essential to think about design before data is collected
Be clear about research question(s) before collecting data.Includes: what data to collect to answer question(s).
Know your dataHow the measurements relate to research questionPlots to visualize your data: helpful!
Most statistical models have assumptionsIf violated, statistical “answer” not correct.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 4: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/4.jpg)
How can a statistician help?
Design of experimentTo help, statistician must understand aims, background
AnalysisStatistician can figure out appropriate model, checkassumptions, fit model to data
Research?Talk with statistician early: help with design to prevent biasIf small project, consultation possibleIf detailed project, statistician should be collaborator, notconsultant
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 5: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/5.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 6: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/6.jpg)
Dr. Tuba’s question
Dr. Tuba is interested in the effect of drugs on weight loss inmice after infection.
Drugs are given right after infection6 mice were given the standard drug6 mice were given the new drugDr. Tuba reports p < 0.05 for a t-test of the difference inmean weight loss across the two groups after day 5.
Should Dr. Tuba publish?
Your thoughts?What additional information might you need beforedeciding?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 7: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/7.jpg)
Dr. Tuba’s question
Dr. Tuba is interested in the effect of drugs on weight loss inmice after infection.
Drugs are given right after infection6 mice were given the standard drug6 mice were given the new drugDr. Tuba reports p < 0.05 for a t-test of the difference inmean weight loss across the two groups after day 5.
Should Dr. Tuba publish?
Your thoughts?What additional information might you need beforedeciding?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 8: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/8.jpg)
Dr. Tuba’s question
Dr. Tuba is interested in the effect of drugs on weight loss inmice after infection.
Drugs are given right after infection6 mice were given the standard drug6 mice were given the new drugDr. Tuba reports p < 0.05 for a t-test of the difference inmean weight loss across the two groups after day 5.
Should Dr. Tuba publish?
Your thoughts?What additional information might you need beforedeciding?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 9: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/9.jpg)
Some things to consider
How did Dr. Tuba choose which mice received each drug?Are the mice independent?Plot the data - why?Did he choose to present results only from the time periodthat showed the most significant effect?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 10: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/10.jpg)
How did Dr. Tuba choose the mice?
Scenario AThe 12 mice were running around in a pen.Dr. Tuba captured the mice one at a time.The first 6 mice captured were given the standard drug.
Problems?Ease of capture might be associated with propensity tolose weight.
Fastest mice might be the most healthyFastest mice might be the lightest
So - mouse selection could be the cause of an apparentdrug effect.
Important to randomize treatment assignment.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 11: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/11.jpg)
How did Dr. Tuba choose the mice?
Scenario AThe 12 mice were running around in a pen.Dr. Tuba captured the mice one at a time.The first 6 mice captured were given the standard drug.
Problems?
Ease of capture might be associated with propensity tolose weight.
Fastest mice might be the most healthyFastest mice might be the lightest
So - mouse selection could be the cause of an apparentdrug effect.
Important to randomize treatment assignment.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 12: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/12.jpg)
How did Dr. Tuba choose the mice?
Scenario AThe 12 mice were running around in a pen.Dr. Tuba captured the mice one at a time.The first 6 mice captured were given the standard drug.
Problems?Ease of capture might be associated with propensity tolose weight.
Fastest mice might be the most healthyFastest mice might be the lightest
So - mouse selection could be the cause of an apparentdrug effect.
Important to randomize treatment assignment.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 13: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/13.jpg)
How did Dr. Tuba choose the mice?
Scenario AThe 12 mice were running around in a pen.Dr. Tuba captured the mice one at a time.The first 6 mice captured were given the standard drug.
Problems?Ease of capture might be associated with propensity tolose weight.
Fastest mice might be the most healthyFastest mice might be the lightest
So - mouse selection could be the cause of an apparentdrug effect.
Important to randomize treatment assignment.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 14: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/14.jpg)
How did Dr. Tuba choose the mice?
Scenario BThe mice came from 2 dams, treatment was randomized.Pups from the first dam were given the standard drug.Pups from the second dam were given the new drug.
Problems?The effect of drug is totally confounded with litterTranslation: If we see a difference in outcome between thetwo groups, we can’t tell if it was
caused by the drug - ORdue to a litter effect
Mice in one litter might have a different propensity to loseweight than mice in another litter.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 15: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/15.jpg)
How did Dr. Tuba choose the mice?
Scenario BThe mice came from 2 dams, treatment was randomized.Pups from the first dam were given the standard drug.Pups from the second dam were given the new drug.
Problems?
The effect of drug is totally confounded with litterTranslation: If we see a difference in outcome between thetwo groups, we can’t tell if it was
caused by the drug - ORdue to a litter effect
Mice in one litter might have a different propensity to loseweight than mice in another litter.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 16: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/16.jpg)
How did Dr. Tuba choose the mice?
Scenario BThe mice came from 2 dams, treatment was randomized.Pups from the first dam were given the standard drug.Pups from the second dam were given the new drug.
Problems?The effect of drug is totally confounded with litterTranslation: If we see a difference in outcome between thetwo groups, we can’t tell if it was
caused by the drug - ORdue to a litter effect
Mice in one litter might have a different propensity to loseweight than mice in another litter.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 17: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/17.jpg)
How did Dr. Tuba choose the mice?
Scenario CThe mice came from 6 dams.The first pup in each dam was given the standard drug.The second pup in each dam was given the new drug.
Problems?Birth order may be associated with the propensity to loseweight.Birth order is confounded with drug.
Better idea: within a litter, randomize first/second born pups todrug.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 18: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/18.jpg)
How did Dr. Tuba choose the mice?
Scenario CThe mice came from 6 dams.The first pup in each dam was given the standard drug.The second pup in each dam was given the new drug.
Problems?
Birth order may be associated with the propensity to loseweight.Birth order is confounded with drug.
Better idea: within a litter, randomize first/second born pups todrug.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 19: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/19.jpg)
How did Dr. Tuba choose the mice?
Scenario CThe mice came from 6 dams.The first pup in each dam was given the standard drug.The second pup in each dam was given the new drug.
Problems?Birth order may be associated with the propensity to loseweight.Birth order is confounded with drug.
Better idea: within a litter, randomize first/second born pups todrug.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 20: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/20.jpg)
How did Dr. Tuba choose the mice?
Scenario CThe mice came from 6 dams.The first pup in each dam was given the standard drug.The second pup in each dam was given the new drug.
Problems?Birth order may be associated with the propensity to loseweight.Birth order is confounded with drug.
Better idea: within a litter, randomize first/second born pups todrug.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 21: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/21.jpg)
How did Dr. Tuba choose the mice?
Scenario DMice were all first-born pups from different dams.Mice were randomized to treatmentAll mice which received the standard drug were housed inone cage.All mice which received the new drug were housed in one(different) cage.
Problems?There might be a cage effect - viral infection, hotterlocation, less water, · · · .
Better idea: divide animals from a drug group into > 1 cage - orif necessary, house all in same cage.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 22: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/22.jpg)
How did Dr. Tuba choose the mice?
Scenario DMice were all first-born pups from different dams.Mice were randomized to treatmentAll mice which received the standard drug were housed inone cage.All mice which received the new drug were housed in one(different) cage.
Problems?
There might be a cage effect - viral infection, hotterlocation, less water, · · · .
Better idea: divide animals from a drug group into > 1 cage - orif necessary, house all in same cage.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 23: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/23.jpg)
How did Dr. Tuba choose the mice?
Scenario DMice were all first-born pups from different dams.Mice were randomized to treatmentAll mice which received the standard drug were housed inone cage.All mice which received the new drug were housed in one(different) cage.
Problems?There might be a cage effect - viral infection, hotterlocation, less water, · · · .
Better idea: divide animals from a drug group into > 1 cage - orif necessary, house all in same cage.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 24: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/24.jpg)
Was day 5 one of many days he considered?
1 2 3 4 5
−4
−3
−2
−1
0
Days
Wei
ght l
oss
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
● group 1group 2
Comments?
Simulated data: the mean isthe same for both groups, atall timesLooking for the biggestdifference = multiplecomparisons
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 25: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/25.jpg)
Was day 5 one of many days he considered?
1 2 3 4 5
−4
−3
−2
−1
0
Days
Wei
ght l
oss
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
● group 1group 2
Comments?Simulated data: the mean isthe same for both groups, atall timesLooking for the biggestdifference = multiplecomparisons
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 26: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/26.jpg)
Was day 5 one of many days he considered?
1 2 3 4 5
−8
−6
−4
−2
0
Days
Wei
ght l
oss
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
● group 1group 2
Comments?
Simulated data: the meandeclines with time for group2, not group 1If expect this, could chooseday 5 a prioriCould use other statisticalmethods to compare slopeover time in the 2 groups.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 27: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/27.jpg)
Was day 5 one of many days he considered?
1 2 3 4 5
−8
−6
−4
−2
0
Days
Wei
ght l
oss
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●
●
● group 1group 2
Comments?Simulated data: the meandeclines with time for group2, not group 1If expect this, could chooseday 5 a prioriCould use other statisticalmethods to compare slopeover time in the 2 groups.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 28: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/28.jpg)
Are the mice independent?
Mice from the same litter are more like each other.Mice housed in the same cage might respond moresimilarlyWe are not interested in the effect of litter (or cage)
If appropriate experimental design, can control for correlatedobservations statistically.Need to record and save data on the litter (or cage)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 29: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/29.jpg)
Are the mice independent?
Mice from the same litter are more like each other.Mice housed in the same cage might respond moresimilarlyWe are not interested in the effect of litter (or cage)
If appropriate experimental design, can control for correlatedobservations statistically.Need to record and save data on the litter (or cage)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 30: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/30.jpg)
Other scenarios (1)
An investigator is comparing response to different doses of 3drugs on 96-well plates.
Each drug is tested at 5 doses.Plate 1 has multiple wells of drug A at each dose level.Plate 2 has multiple wells of drug B at each dose level.Plate 3 has multiple wells of drug C at each dose level.
Is this a good experimental design? Why / why not?
If one drug appears to be better, it might be due to a differencein the plates.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 31: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/31.jpg)
Other scenarios (1)
An investigator is comparing response to different doses of 3drugs on 96-well plates.
Each drug is tested at 5 doses.Plate 1 has multiple wells of drug A at each dose level.Plate 2 has multiple wells of drug B at each dose level.Plate 3 has multiple wells of drug C at each dose level.
Is this a good experimental design? Why / why not?
If one drug appears to be better, it might be due to a differencein the plates.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 32: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/32.jpg)
Other scenarios (1)
An investigator is comparing response to different doses of 3drugs on 96-well plates.
Each drug is tested at 5 doses.Plate 1 has multiple wells of drug A at each dose level.Plate 2 has multiple wells of drug B at each dose level.Plate 3 has multiple wells of drug C at each dose level.
Is this a good experimental design? Why / why not?
If one drug appears to be better, it might be due to a differencein the plates.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 33: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/33.jpg)
Other scenarios (2)
Dr. Busy scores mouse activity in a 1-minute period on a scaleof 1-10.
All mice are measured 6 hours after treatment10 mice receive a standard treatment, 10 a new treatment.Treatment assignment is randomizedThe new treatment is expected to be better.The investigator knows the treatment assignment of eachmouse.
Is this a good experimental design? Why / why not?
Expectation that new treatment is better could unconsciouslybias the investigator’s measurements
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 34: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/34.jpg)
Other scenarios (2)
Dr. Busy scores mouse activity in a 1-minute period on a scaleof 1-10.
All mice are measured 6 hours after treatment10 mice receive a standard treatment, 10 a new treatment.Treatment assignment is randomizedThe new treatment is expected to be better.The investigator knows the treatment assignment of eachmouse.
Is this a good experimental design? Why / why not?
Expectation that new treatment is better could unconsciouslybias the investigator’s measurements
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 35: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/35.jpg)
Other scenarios (2)
Dr. Busy scores mouse activity in a 1-minute period on a scaleof 1-10.
All mice are measured 6 hours after treatment10 mice receive a standard treatment, 10 a new treatment.Treatment assignment is randomizedThe new treatment is expected to be better.The investigator knows the treatment assignment of eachmouse.
Is this a good experimental design? Why / why not?
Expectation that new treatment is better could unconsciouslybias the investigator’s measurements
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 36: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/36.jpg)
Comments on experimental designs
In the scenarios just described, there is a possibility thatthe intended effect cannot be distinguished fromsomething else that is not of interest.Estimate of intended effect may be biased.
Unless the unintended effect can be estimated(impossible?)
No statistical analysis will fix this problem.Often, the problem can be avoided by good experimentaldesign.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 37: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/37.jpg)
Comments on experimental designs
In the scenarios just described, there is a possibility thatthe intended effect cannot be distinguished fromsomething else that is not of interest.Estimate of intended effect may be biased.Unless the unintended effect can be estimated(impossible?)
No statistical analysis will fix this problem.
Often, the problem can be avoided by good experimentaldesign.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 38: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/38.jpg)
Comments on experimental designs
In the scenarios just described, there is a possibility thatthe intended effect cannot be distinguished fromsomething else that is not of interest.Estimate of intended effect may be biased.Unless the unintended effect can be estimated(impossible?)
No statistical analysis will fix this problem.Often, the problem can be avoided by good experimentaldesign.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 39: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/39.jpg)
Summary: experimental design
Data should be a random sample from populationIf treatment is assigned, assignment should be randomlydetermined.
Can use table of random numbersBlinding
Person measuring endpoints should not know treatmentassignmentPerson receiving treatment should not know his/hertreatmentParticularly an issue if measurements can be subjective
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 40: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/40.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 41: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/41.jpg)
Dr. Forte’s question
A well-known journal article reports that after infection,BALB/c mice given ’superdrug’ lost 4 ounces on averageafter one week.Dr. Forte wants to compare this to ’superdrug’ effects oninfected C57BL/6 mice.Dr. Forte infects 4 C57BL/6 mice, treats with ’superdrug’,and measures weight loss after one week.
His measurements of weight loss: 5, 8, 6, 3 ounces.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 42: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/42.jpg)
The mean
Data (weight loss) represented by x1, x2, · · · , xnn is sample size (number observations)Dr. Forte’s data: x1 = 5, x2 = 8, x3 = 6, x4 = 3.
Population mean (µ): the average xi over all observationsin population
Cannot measure data from all observations in population.µ is unknown (a population parameter)
Estimate µ by µ̂. Here µ̂ = x̄ = sample mean.
For Dr. Forte’s experimentWhat is n?What is the sample?What is the population?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 43: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/43.jpg)
The mean
Data (weight loss) represented by x1, x2, · · · , xnn is sample size (number observations)Dr. Forte’s data: x1 = 5, x2 = 8, x3 = 6, x4 = 3.
Population mean (µ): the average xi over all observationsin population
Cannot measure data from all observations in population.µ is unknown (a population parameter)
Estimate µ by µ̂. Here µ̂ = x̄ = sample mean.
For Dr. Forte’s experimentWhat is n?What is the sample?What is the population?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 44: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/44.jpg)
The mean (cont.)
Dr. Forte’s data: 3, 5, 6, 8.Sample mean, x̄ = 1
n∑n
i=1 xi
What is Dr. Forte’s sample mean?
x̄ = 14(3 + 5 + 6 + 8) = 5.5
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 45: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/45.jpg)
The mean (cont.)
Dr. Forte’s data: 3, 5, 6, 8.Sample mean, x̄ = 1
n∑n
i=1 xi
What is Dr. Forte’s sample mean?
x̄ = 14(3 + 5 + 6 + 8) = 5.5
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 46: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/46.jpg)
Variability
Consider measurements from 2 researchers:Dr. Forte’s measurements: 3, 5, 6, 8.Dr. Pianissimo’s measurements: 5.1, 5.3, 5.7, 5.9Both have x̄ = 5.5.
Whose measurements give you more confidence about yourknowledge of the population mean?
Dr. Pianissimo: data closer together, plausible values for µ insmaller interval.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 47: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/47.jpg)
Variability
Consider measurements from 2 researchers:Dr. Forte’s measurements: 3, 5, 6, 8.Dr. Pianissimo’s measurements: 5.1, 5.3, 5.7, 5.9Both have x̄ = 5.5.
Whose measurements give you more confidence about yourknowledge of the population mean?
Dr. Pianissimo: data closer together, plausible values for µ insmaller interval.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 48: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/48.jpg)
Variance (σ2 = population variance)
Population variance: σ2 = E[(xi − µ)2]
E means “expectation”: average over all observations inpopulation.Translation: σ2 is expected squared difference betweenindividual values and population meanσ = population SD (of xi )
Sample variance, σ̂2
Estimate σ2 by σ̂2.If µ known: σ̂2 = 1
n
∑ni=1(xi − µ)2
If µ unknown: σ̂2 = s2 = 1n−1
∑ni=1(xi − x̄)2
Dr. Forte’s data: 3, 5, 6, 8, x̄ = 5.5. What is s2?s2 = (3−5.5)2+(5−5.5)2+(6−5.5)2+(8−5.5)2
3 = 4.333
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9, s2 = 0.133.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 49: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/49.jpg)
Variance (σ2 = population variance)
Population variance: σ2 = E[(xi − µ)2]
E means “expectation”: average over all observations inpopulation.Translation: σ2 is expected squared difference betweenindividual values and population meanσ = population SD (of xi )
Sample variance, σ̂2
Estimate σ2 by σ̂2.If µ known: σ̂2 = 1
n
∑ni=1(xi − µ)2
If µ unknown: σ̂2 = s2 = 1n−1
∑ni=1(xi − x̄)2
Dr. Forte’s data: 3, 5, 6, 8, x̄ = 5.5. What is s2?
s2 = (3−5.5)2+(5−5.5)2+(6−5.5)2+(8−5.5)2
3 = 4.333
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9, s2 = 0.133.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 50: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/50.jpg)
Variance (σ2 = population variance)
Population variance: σ2 = E[(xi − µ)2]
E means “expectation”: average over all observations inpopulation.Translation: σ2 is expected squared difference betweenindividual values and population meanσ = population SD (of xi )
Sample variance, σ̂2
Estimate σ2 by σ̂2.If µ known: σ̂2 = 1
n
∑ni=1(xi − µ)2
If µ unknown: σ̂2 = s2 = 1n−1
∑ni=1(xi − x̄)2
Dr. Forte’s data: 3, 5, 6, 8, x̄ = 5.5. What is s2?s2 = (3−5.5)2+(5−5.5)2+(6−5.5)2+(8−5.5)2
3 = 4.333
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9, s2 = 0.133.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 51: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/51.jpg)
Variance (σ2 = population variance)
Population variance: σ2 = E[(xi − µ)2]
E means “expectation”: average over all observations inpopulation.Translation: σ2 is expected squared difference betweenindividual values and population meanσ = population SD (of xi )
Sample variance, σ̂2
Estimate σ2 by σ̂2.If µ known: σ̂2 = 1
n
∑ni=1(xi − µ)2
If µ unknown: σ̂2 = s2 = 1n−1
∑ni=1(xi − x̄)2
Dr. Forte’s data: 3, 5, 6, 8, x̄ = 5.5. What is s2?s2 = (3−5.5)2+(5−5.5)2+(6−5.5)2+(8−5.5)2
3 = 4.333
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9, s2 = 0.133.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 52: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/52.jpg)
SD, SE
Standard deviation (of x), SD(x) is s =√
s2
Can also write s as sx
Standard error (of x̄), SE(x̄) is√
s2
n = s√n
In other contexts, can speak of standard error of otherquantities, e.g. SE(β̂)
Dr. Forte’s data: 3, 5, 6, 8,x̄ = 5.5, s2 = 4.333.
What are SD(x) and SE(x̄)?SD(x) =
√4.333 = 2.08
SE(x̄) = 2.08√4
= 1.04
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9x̄ = 5.5, s2 = 0.133.
What are SD(x) and SE(x̄)?SD(x) =
√0.133 = 0.365
SE(x̄) = 0.365√4
= 0.183
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 53: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/53.jpg)
SD, SE
Standard deviation (of x), SD(x) is s =√
s2
Can also write s as sx
Standard error (of x̄), SE(x̄) is√
s2
n = s√n
In other contexts, can speak of standard error of otherquantities, e.g. SE(β̂)
Dr. Forte’s data: 3, 5, 6, 8,x̄ = 5.5, s2 = 4.333.
What are SD(x) and SE(x̄)?
SD(x) =√
4.333 = 2.08SE(x̄) = 2.08√
4= 1.04
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9x̄ = 5.5, s2 = 0.133.
What are SD(x) and SE(x̄)?SD(x) =
√0.133 = 0.365
SE(x̄) = 0.365√4
= 0.183
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 54: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/54.jpg)
SD, SE
Standard deviation (of x), SD(x) is s =√
s2
Can also write s as sx
Standard error (of x̄), SE(x̄) is√
s2
n = s√n
In other contexts, can speak of standard error of otherquantities, e.g. SE(β̂)
Dr. Forte’s data: 3, 5, 6, 8,x̄ = 5.5, s2 = 4.333.
What are SD(x) and SE(x̄)?SD(x) =
√4.333 = 2.08
SE(x̄) = 2.08√4
= 1.04
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9x̄ = 5.5, s2 = 0.133.
What are SD(x) and SE(x̄)?
SD(x) =√
0.133 = 0.365SE(x̄) = 0.365√
4= 0.183
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 55: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/55.jpg)
SD, SE
Standard deviation (of x), SD(x) is s =√
s2
Can also write s as sx
Standard error (of x̄), SE(x̄) is√
s2
n = s√n
In other contexts, can speak of standard error of otherquantities, e.g. SE(β̂)
Dr. Forte’s data: 3, 5, 6, 8,x̄ = 5.5, s2 = 4.333.
What are SD(x) and SE(x̄)?SD(x) =
√4.333 = 2.08
SE(x̄) = 2.08√4
= 1.04
Dr. Pianissimo’s data: 5.1, 5.3, 5.7, 5.9x̄ = 5.5, s2 = 0.133.
What are SD(x) and SE(x̄)?SD(x) =
√0.133 = 0.365
SE(x̄) = 0.365√4
= 0.183
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 56: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/56.jpg)
SD, SE (cont.)
Recall SD(x) is sx =√
1n−1
∑ni=1(xi − x̄)2, SE(x̄) = sx√
n
If Dr. Forte had collected data from 20 mice instead of 4, howwould SD(x) and SE(x̄) be affected?
SD(x) would be about the same (estimated more precisely)SE(x̄) would decrease (we have more confidence about x̄).
What do SD(x) and SE(x̄) mean?SD(x) is our estimate of σ, the SD of x in the population.SE(x̄) is a measure of our uncertainty about µ.
Can form confidence interval (CI) for µ: x̄ ± tcritSE(x̄)tcrit discussed later (usually ≈ 2)Approximately 95% of such CIs include µ.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 57: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/57.jpg)
SD, SE (cont.)
Recall SD(x) is sx =√
1n−1
∑ni=1(xi − x̄)2, SE(x̄) = sx√
n
If Dr. Forte had collected data from 20 mice instead of 4, howwould SD(x) and SE(x̄) be affected?
SD(x) would be about the same (estimated more precisely)SE(x̄) would decrease (we have more confidence about x̄).
What do SD(x) and SE(x̄) mean?SD(x) is our estimate of σ, the SD of x in the population.SE(x̄) is a measure of our uncertainty about µ.
Can form confidence interval (CI) for µ: x̄ ± tcritSE(x̄)tcrit discussed later (usually ≈ 2)Approximately 95% of such CIs include µ.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 58: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/58.jpg)
SD, SE (cont.)
Recall SD(x) is sx =√
1n−1
∑ni=1(xi − x̄)2, SE(x̄) = sx√
n
If Dr. Forte had collected data from 20 mice instead of 4, howwould SD(x) and SE(x̄) be affected?
SD(x) would be about the same (estimated more precisely)SE(x̄) would decrease (we have more confidence about x̄).
What do SD(x) and SE(x̄) mean?SD(x) is our estimate of σ, the SD of x in the population.SE(x̄) is a measure of our uncertainty about µ.
Can form confidence interval (CI) for µ: x̄ ± tcritSE(x̄)tcrit discussed later (usually ≈ 2)Approximately 95% of such CIs include µ.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 59: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/59.jpg)
Why divide by n − 1 in s2?
If we knew µ, σ̂2 =Pn
i=1(xi−µ)2
n
We don’t know µ, so σ̂2 = s2 =Pn
i=1(xi−x̄)2
n−1
For a particular sample, x̄ minimizes∑
(xi − µ̂)2 for any µ̂
Using x̄ to estimate µ:Numerate of s2 is (on average) smaller than
∑(xi − µ)2
Need to divide by smaller number to compensate; n − 1 isthe right number
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 60: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/60.jpg)
Why divide by n − 1 in s2?
Examine by simulationTake n = 5 observations fromnormal, µ = 0, σ2 = 1.
Calculate s2 =P
(xi−x̄)2
4 andP(xi−x̄)2
5
Repeat multiple timesNote that true value of σ is 1.
Dividing by n − 1 does wellDividing by n underestimates σ
SD(x): divide by (n−1)
Fre
quen
cy
0.5 1.0 1.5 2.0
020
4060
8010
0
Not SD(x): divide by n
Fre
quen
cy
0.0 0.5 1.0 1.5
020
4060
8012
0
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 61: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/61.jpg)
Why divide by n − 1 in s2?
Examine by simulationTake n = 5 observations fromnormal, µ = 0, σ2 = 1.
Calculate s2 =P
(xi−x̄)2
4 andP(xi−x̄)2
5
Repeat multiple timesNote that true value of σ is 1.Dividing by n − 1 does wellDividing by n underestimates σ
SD(x): divide by (n−1)
Fre
quen
cy
0.5 1.0 1.5 2.0
020
4060
8010
0
Not SD(x): divide by n
Fre
quen
cy
0.0 0.5 1.0 1.5
020
4060
8012
0
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 62: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/62.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 63: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/63.jpg)
M&M data
We collected data on the number of M&M of each color inpackages of mini M&M’s
# brown
Fre
quen
cy
5 10 15 20
02
46
8
# orange
Fre
quen
cy
4 6 8 10 12 14 16
02
46
8
# yellow
Fre
quen
cy
2 4 6 8 10 12 14
02
46
810
12
# green
Fre
quen
cy
6 8 10 12
01
23
45
67
# blue
Fre
quen
cy
4 6 8 10 12 14 16 18
05
1015
# red
Fre
quen
cy
4 6 8 10 12 14 16
05
1015
32 people totalHistograms show number ofeach color
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 64: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/64.jpg)
M&M data
Aside: did everyone get same # M&M’s? NO
Number MM: 50 51 55 56 57 58 59 60 61 62 66 73People: 1 1 3 2 9 2 4 5 1 2 1 1
%(blue + red): All
Fre
quen
cy
0.30 0.35 0.40 0.45
01
23
45
67 Focus on % (red + blue)
x1 is % (red + blue) for firstperson, x2 same for secondperson, etc.What is the sample?What is the population?
We could approximate xi with anormal distribution, mean µ,variance σ2
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 65: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/65.jpg)
M&M data
Aside: did everyone get same # M&M’s? NO
Number MM: 50 51 55 56 57 58 59 60 61 62 66 73People: 1 1 3 2 9 2 4 5 1 2 1 1
%(blue + red): All
Fre
quen
cy
0.30 0.35 0.40 0.45
01
23
45
67 Focus on % (red + blue)
x1 is % (red + blue) for firstperson, x2 same for secondperson, etc.What is the sample?What is the population?
We could approximate xi with anormal distribution, mean µ,variance σ2
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 66: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/66.jpg)
M&M data
Aside: did everyone get same # M&M’s? NO
Number MM: 50 51 55 56 57 58 59 60 61 62 66 73People: 1 1 3 2 9 2 4 5 1 2 1 1
%(blue + red): All
Fre
quen
cy
0.30 0.35 0.40 0.45
01
23
45
67 Focus on % (red + blue)
x1 is % (red + blue) for firstperson, x2 same for secondperson, etc.What is the sample?What is the population?
We could approximate xi with anormal distribution, mean µ,variance σ2
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 67: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/67.jpg)
M&M data
sample 1
Fre
quen
cy
0.30 0.35 0.40 0.45
0.0
0.5
1.0
1.5
2.0
2.5
3.0
sample 2
Fre
quen
cy
0.30 0.35 0.40 0.45
0.0
0.5
1.0
1.5
2.0
sample 3
Fre
quen
cy
0.30 0.32 0.34 0.36 0.38 0.40
0.0
0.5
1.0
1.5
2.0
sample 4
Fre
quen
cy
0.30 0.32 0.34 0.36 0.38 0.40
0.0
0.5
1.0
1.5
2.0
%(blue + red): samples
Interest: % (red + blue)Divide data into 4 groups of 8observationsCalculate mean, SD, SE(x̄) ineach group
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 68: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/68.jpg)
M&M data
Each group has n = 8 observations (32 observations total)Group 1: % (red + blue): Data (x1, · · · , x8) is
0.45, 0.38, 0.39, 0.38, 0.35, 0.33, 0.41, 0.29
Calculate x̄
x̄ = (0.45+0.38+0.39+0.38+0.35+0.33+0.41+0.29)8 = 0.374
Calculate sx (standard deviation)
sx =
√(0.45−0.374)2+(0.38−0.374)2+···+(0.29−0.374)2
7 = 0.048
Calculate SE(x̄) = sx√n = 0.048√
8= 0.017
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 69: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/69.jpg)
M&M data
Each group has n = 8 observations (32 observations total)Group 1: % (red + blue): Data (x1, · · · , x8) is
0.45, 0.38, 0.39, 0.38, 0.35, 0.33, 0.41, 0.29
Calculate x̄x̄ = (0.45+0.38+0.39+0.38+0.35+0.33+0.41+0.29)
8 = 0.374Calculate sx (standard deviation)
sx =
√(0.45−0.374)2+(0.38−0.374)2+···+(0.29−0.374)2
7 = 0.048
Calculate SE(x̄) = sx√n = 0.048√
8= 0.017
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 70: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/70.jpg)
M&M data
Each group has n = 8 observations (32 observations total)Group 1: % (red + blue): Data (x1, · · · , x8) is
0.45, 0.38, 0.39, 0.38, 0.35, 0.33, 0.41, 0.29
Calculate x̄x̄ = (0.45+0.38+0.39+0.38+0.35+0.33+0.41+0.29)
8 = 0.374Calculate sx (standard deviation)
sx =
√(0.45−0.374)2+(0.38−0.374)2+···+(0.29−0.374)2
7 = 0.048
Calculate SE(x̄) = sx√n = 0.048√
8= 0.017
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 71: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/71.jpg)
M&M data
Each group has n = 8 observations (32 observations total)Group 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169Group 2: x̄= 0.3564, sx= 0.0574 SE(x̄)= 0.0203Group 3: x̄= 0.3549, sx= 0.0348 SE(x̄)= 0.0123Group 4: x̄= 0.3465, sx= 0.0342 SE(x̄)= 0.0121
Everyone: x̄= 0.358 sx= 0.0436 SE(x̄)= 0.0077
We might want to know if % red + blue in the population = 13
(later: t-test)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 72: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/72.jpg)
M&M data
Each group has n = 8 observations (32 observations total)Group 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169Group 2: x̄= 0.3564, sx= 0.0574 SE(x̄)= 0.0203Group 3: x̄= 0.3549, sx= 0.0348 SE(x̄)= 0.0123Group 4: x̄= 0.3465, sx= 0.0342 SE(x̄)= 0.0121
Everyone: x̄= 0.358 sx= 0.0436 SE(x̄)= 0.0077
We might want to know if % red + blue in the population = 13
(later: t-test)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 73: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/73.jpg)
M&M data
Each group has n = 8 observations (32 observations total)Group 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169Group 2: x̄= 0.3564, sx= 0.0574 SE(x̄)= 0.0203Group 3: x̄= 0.3549, sx= 0.0348 SE(x̄)= 0.0123Group 4: x̄= 0.3465, sx= 0.0342 SE(x̄)= 0.0121
Everyone: x̄= 0.358 sx= 0.0436 SE(x̄)= 0.0077
We might want to know if % red + blue in the population = 13
(later: t-test)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 74: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/74.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 75: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/75.jpg)
t-test
Recall Dr. Forte’s question: is weight loss in C57BL/6 mice(his data) different than in BALB/c mice (published data,weight loss = 4 ounces)?
H0 is the null hypothesis (“no effect”, “same as before”)HA is the alternative hypothesis (“some effect”, “not H0”)
Example: H0 : µ = µ0 versus HA : µ 6= µ0.µ is mean weight loss for C57Bl/6 mice in populationµ is unknown, but we have an estimate of it (x̄)µ0 is some specific number (here: 4).
How to evaluate H0, HA?Compare x̄ to µ0If x̄ is far from µ0, suggests H0 not true.Need SE(x̄) to evaluate how different x̄ and µ0 are.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 76: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/76.jpg)
t-test
Recall Dr. Forte’s question: is weight loss in C57BL/6 mice(his data) different than in BALB/c mice (published data,weight loss = 4 ounces)?
H0 is the null hypothesis (“no effect”, “same as before”)HA is the alternative hypothesis (“some effect”, “not H0”)Example: H0 : µ = µ0 versus HA : µ 6= µ0.
µ is mean weight loss for C57Bl/6 mice in populationµ is unknown, but we have an estimate of it (x̄)µ0 is some specific number (here: 4).
How to evaluate H0, HA?Compare x̄ to µ0If x̄ is far from µ0, suggests H0 not true.Need SE(x̄) to evaluate how different x̄ and µ0 are.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 77: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/77.jpg)
t-test
Recall Dr. Forte’s question: is weight loss in C57BL/6 mice(his data) different than in BALB/c mice (published data,weight loss = 4 ounces)?
H0 is the null hypothesis (“no effect”, “same as before”)HA is the alternative hypothesis (“some effect”, “not H0”)Example: H0 : µ = µ0 versus HA : µ 6= µ0.
µ is mean weight loss for C57Bl/6 mice in populationµ is unknown, but we have an estimate of it (x̄)µ0 is some specific number (here: 4).
How to evaluate H0, HA?Compare x̄ to µ0If x̄ is far from µ0, suggests H0 not true.Need SE(x̄) to evaluate how different x̄ and µ0 are.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 78: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/78.jpg)
t-test assumptions
T-test is based on tcalc = x̄−µ0SE(x̄)
Assumptions: xi ∼ independent N(µ, σ2)
Interpretation: ∼ means “distributed as”Says data are independent and have a normal distributionµ = population mean (of the xi ’s)σ2 = population variance (of the xi ’s) (σ = SD)
“How can I tell if my data have a normal distribution?”Assumption is about distribution of data from populationt-test robust to moderate departure from normalityImportant to plot data to check for gross violation fromnormality
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 79: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/79.jpg)
t-test assumptions
T-test is based on tcalc = x̄−µ0SE(x̄)
Assumptions: xi ∼ independent N(µ, σ2)
Interpretation: ∼ means “distributed as”Says data are independent and have a normal distributionµ = population mean (of the xi ’s)σ2 = population variance (of the xi ’s) (σ = SD)
“How can I tell if my data have a normal distribution?”Assumption is about distribution of data from populationt-test robust to moderate departure from normalityImportant to plot data to check for gross violation fromnormality
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 80: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/80.jpg)
Samples of n=20 from a normal distribution
normsamp[ctr]
−3 −1 0 1 2 3
01
23
4
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
0.0
1.0
2.0
3.0
normsamp[ctr]
−3 −1 0 1 2 3
01
23
45
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
67
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
01
23
4
−3 −1 0 1 2 3
01
23
45
67
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
67
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
6
Normal
normsamp[ctr]
Fre
quen
cy
−3 −2 −1 0 1 2 3
01
23
4
One mistake
c(normsamp[1:(nper.plot − 1)], 10.25)
Fre
quen
cy
−4 −2 0 2 4 6 8 10
01
23
4
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 81: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/81.jpg)
Samples of n=20 from a normal distribution
normsamp[ctr]
−3 −1 0 1 2 3
01
23
4
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
0.0
1.0
2.0
3.0
normsamp[ctr]
−3 −1 0 1 2 3
01
23
45
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
67
normsamp[ctr]
Fre
quen
cy
−3 −1 0 1 2 3
01
23
4
−3 −1 0 1 2 3
01
23
45
67
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
67
Fre
quen
cy
−3 −1 0 1 2 3
01
23
45
6
Normal
normsamp[ctr]
Fre
quen
cy
−3 −2 −1 0 1 2 3
01
23
4
One mistake
c(normsamp[1:(nper.plot − 1)], 10.25)
Fre
quen
cy
−4 −2 0 2 4 6 8 10
01
23
4
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 82: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/82.jpg)
Outliers
“My data has one very unusual value. Should I delete it?”
Step 1: check to see if the value is a mistake! If so, fix.Step 2: is there something unusual about thatobservation?
animal was sicka different person took the measurementexperimental conditions different from others
If clear reason why unusual value differs from otherobservations
Value represents different condition: OK to delete (explainin paper)
If cannot determine any reason why value is unusualDeleting observation underestimates population variability:don’t deleteAlternative: report results both with and without the value(explain in paper)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 83: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/83.jpg)
Outliers
“My data has one very unusual value. Should I delete it?”Step 1: check to see if the value is a mistake! If so, fix.Step 2: is there something unusual about thatobservation?
animal was sicka different person took the measurementexperimental conditions different from others
If clear reason why unusual value differs from otherobservations
Value represents different condition: OK to delete (explainin paper)
If cannot determine any reason why value is unusualDeleting observation underestimates population variability:don’t deleteAlternative: report results both with and without the value(explain in paper)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 84: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/84.jpg)
Outliers
“My data has one very unusual value. Should I delete it?”Step 1: check to see if the value is a mistake! If so, fix.Step 2: is there something unusual about thatobservation?
animal was sicka different person took the measurementexperimental conditions different from others
If clear reason why unusual value differs from otherobservations
Value represents different condition: OK to delete (explainin paper)
If cannot determine any reason why value is unusualDeleting observation underestimates population variability:don’t deleteAlternative: report results both with and without the value(explain in paper)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 85: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/85.jpg)
Parametric or non-parametric?
“When should I use a non-parametric version of a t-test?”Some statisticians (nearly) always use non-parametrictests (such as Mann-Whitney)Some statisticians (nearly) always use parametric tests(such as t-test)Knowledge of type of data can help determine if normalityis reasonable
Sometimes data approximately normal after transformation
If no severe departure from normality, either parametric ornonparametric tests probably fine.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 86: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/86.jpg)
t-test example
T-test is based on tcalc = x̄−µ0SE(x̄)
Dr. Forte: H0 : µ = 4, HA : µ 6= 4.We calculated x̄ = 5.5, SE(x̄) = 1.04.What is tcalc?
Here tcalc = (5.5− 4)/1.04 = 1.44
Under H0, tcalc ∼ tn−1
Here: tcalc ∼ t3If H0 true, tcalc centered around 0
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 87: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/87.jpg)
t-test example
T-test is based on tcalc = x̄−µ0SE(x̄)
Dr. Forte: H0 : µ = 4, HA : µ 6= 4.We calculated x̄ = 5.5, SE(x̄) = 1.04.What is tcalc?
Here tcalc = (5.5− 4)/1.04 = 1.44
Under H0, tcalc ∼ tn−1
Here: tcalc ∼ t3If H0 true, tcalc centered around 0
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 88: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/88.jpg)
t-distribution
Recall tcalc = x̄−µ0SE(x̄)
“How can tcalc have a distribution? It’s just a number! ”Your thoughts?
Any single experiment gives a single tcalc .Could take another sample of 4 C57BL/6 mice, measureweight loss, get tcalc for that sample.This could theoretically be repeated multiple times withdifferent miceEach experiment would give a tcalc
A histogram of the tcalc values would look like the t3distribution (assuming enough experiments)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 89: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/89.jpg)
t-distribution
Recall tcalc = x̄−µ0SE(x̄)
“How can tcalc have a distribution? It’s just a number! ”Your thoughts?
Any single experiment gives a single tcalc .Could take another sample of 4 C57BL/6 mice, measureweight loss, get tcalc for that sample.This could theoretically be repeated multiple times withdifferent miceEach experiment would give a tcalc
A histogram of the tcalc values would look like the t3distribution (assuming enough experiments)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 90: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/90.jpg)
Samples from t-distributions with 3 df
n= 10
−1.0 −0.5 0.0 0.5 1.0 1.5 2.0
0.0
0.5
1.0
1.5
2.0
2.5
3.0
n= 100
−6 −4 −2 0 2 4
05
1015
2025
30
n= 1000
−4 −2 0 2 4 6
050
100
150
n= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
n=10: First fewobservations: -0.69,1.73, 0.55, 0.96,-0.79, · · ·
n=100: Starting tolook roughly “normal”n=1000: (12 obsoutside plot. Rangewas: -12.8 to 8.3)n=10,000: (85 obsoutside plot. Rangewas: -24.3 to 35.7)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 91: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/91.jpg)
Samples from t-distributions with 3 df
n= 10
−1.0 −0.5 0.0 0.5 1.0 1.5 2.0
0.0
0.5
1.0
1.5
2.0
2.5
3.0
n= 100
−6 −4 −2 0 2 4
05
1015
2025
30
n= 1000
−4 −2 0 2 4 6
050
100
150
n= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
n=10: First fewobservations: -0.69,1.73, 0.55, 0.96,-0.79, · · ·n=100: Starting tolook roughly “normal”
n=1000: (12 obsoutside plot. Rangewas: -12.8 to 8.3)n=10,000: (85 obsoutside plot. Rangewas: -24.3 to 35.7)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 92: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/92.jpg)
Samples from t-distributions with 3 df
n= 10
−1.0 −0.5 0.0 0.5 1.0 1.5 2.0
0.0
0.5
1.0
1.5
2.0
2.5
3.0
n= 100
−6 −4 −2 0 2 4
05
1015
2025
30
n= 1000
−4 −2 0 2 4 6
050
100
150
n= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
n=10: First fewobservations: -0.69,1.73, 0.55, 0.96,-0.79, · · ·n=100: Starting tolook roughly “normal”n=1000: (12 obsoutside plot. Rangewas: -12.8 to 8.3)
n=10,000: (85 obsoutside plot. Rangewas: -24.3 to 35.7)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 93: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/93.jpg)
Samples from t-distributions with 3 df
n= 10
−1.0 −0.5 0.0 0.5 1.0 1.5 2.0
0.0
0.5
1.0
1.5
2.0
2.5
3.0
n= 100
−6 −4 −2 0 2 4
05
1015
2025
30
n= 1000
−4 −2 0 2 4 6
050
100
150
n= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
n=10: First fewobservations: -0.69,1.73, 0.55, 0.96,-0.79, · · ·n=100: Starting tolook roughly “normal”n=1000: (12 obsoutside plot. Rangewas: -12.8 to 8.3)n=10,000: (85 obsoutside plot. Rangewas: -24.3 to 35.7)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 94: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/94.jpg)
Normal, t distributions: differences?
Normal distributionn= 10000
−6 −4 −2 0 2 4 6
020
040
060
080
0
Normal: range was: -3.6 to 3.9
t3 distributionn= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
t3: range was: -24.3 to 35.7
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 95: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/95.jpg)
Different t-distributions
−6 −4 −2 0 2 4 6
0.0
0.1
0.2
0.3
0.4
df=2df=8normal
With more data, larger ndegrees of freedom (df) =n − 1 increasessmaller area in tails of thedistributionapproaches normaldistribution (which is t with∞ df)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 96: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/96.jpg)
Normal, t distributions
“Why doesn’t tcalc = x̄−µ0SE(x̄) have a normal distribution?”
Recall we assumed xi ∼ N(µ, σ2)
x̄ has a normal distributionIf we knew σ2
SE(x̄) would be σ2/n, a fixed number“tcalc” would have a normal distribution
When σ2 not known, use s2 to estimate itOur estimate of σ2 isn’t perfectResult: tcalc more variable
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 97: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/97.jpg)
Normal, t distributions
“Why doesn’t tcalc = x̄−µ0SE(x̄) have a normal distribution?”
Recall we assumed xi ∼ N(µ, σ2)
x̄ has a normal distributionIf we knew σ2
SE(x̄) would be σ2/n, a fixed number“tcalc” would have a normal distribution
When σ2 not known, use s2 to estimate itOur estimate of σ2 isn’t perfectResult: tcalc more variable
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 98: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/98.jpg)
t-distribution (3 df)
−6 −4 −2 0 2 4 6
0.0
0.1
0.2
0.3
t dist, 3 df2.5% cutpt = 3.18
If H0 trueExpect values nearcenterRed: unusual
If H0 true, control Pr(reject H0) to α(α often chosen as 0.05)Reject H0 if tcalc in shaded area
Each shaded area has 2.5% of thedistributionShaded area defined by tcrit
tcrit is number for which 2.5% ofdistribution has larger values(−tcrit : 2.5% are smaller)tcrit depends on df and α
With 3 df, α = 0.05, tcrit = 3.18Reject H0 if |tcalc | > tcrit
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 99: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/99.jpg)
t-distribution (3 df)
−6 −4 −2 0 2 4 6
0.0
0.1
0.2
0.3
t dist, 3 df2.5% cutpt = 3.18
If H0 trueExpect values nearcenterRed: unusual
If H0 true, control Pr(reject H0) to α(α often chosen as 0.05)Reject H0 if tcalc in shaded area
Each shaded area has 2.5% of thedistributionShaded area defined by tcrit
tcrit is number for which 2.5% ofdistribution has larger values(−tcrit : 2.5% are smaller)tcrit depends on df and α
With 3 df, α = 0.05, tcrit = 3.18Reject H0 if |tcalc | > tcrit
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 100: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/100.jpg)
t-distribution (3 df)
−6 −4 −2 0 2 4 6
0.0
0.1
0.2
0.3
t dist, 3 df2.5% cutpt = 3.18
If H0 trueExpect values nearcenterRed: unusual
If H0 true, control Pr(reject H0) to α(α often chosen as 0.05)Reject H0 if tcalc in shaded area
Each shaded area has 2.5% of thedistributionShaded area defined by tcrit
tcrit is number for which 2.5% ofdistribution has larger values(−tcrit : 2.5% are smaller)tcrit depends on df and α
With 3 df, α = 0.05, tcrit = 3.18Reject H0 if |tcalc | > tcrit
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 101: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/101.jpg)
Sample from t-distribution: how good is tcrit?
n= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
How good is tcrit?Do we really get 2.5% in each “tail”?For t3, α = 0.05, tcrit = 3.18.
From sample of 10,000 observation from t3247 observations were < −3.18 (2.47%)258 observations were > 3.18 (2.58%)
If H0 true and α = 0.05, approximately 5% of the time wewould reject H0 (“type I error”)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 102: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/102.jpg)
Sample from t-distribution: how good is tcrit?
n= 10000
−6 −4 −2 0 2 4 6
020
040
060
0
How good is tcrit?Do we really get 2.5% in each “tail”?For t3, α = 0.05, tcrit = 3.18.
From sample of 10,000 observation from t3247 observations were < −3.18 (2.47%)258 observations were > 3.18 (2.58%)
If H0 true and α = 0.05, approximately 5% of the time wewould reject H0 (“type I error”)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 103: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/103.jpg)
t-test for Dr. Forte
Recall Dr. Forte’s question: is weight loss in C57BL/6mice (his data) different than in BALB/c mice(published data, weight loss = 4 ounces)?His hypotheses: H0 : µ = 4, HA : µ 6= 4.µ: weight loss in population of C57BL/6 miceDr. Forte: tcalc = x̄−µ0
SE(x̄) = (5.5− 4)/1.04 = 1.44
t-test rule: Reject H0 if |tcalc | > tcrit
With 3 df, α = 0.05, tcrit = 3.18What should Dr. Forte conclude?
Fail to reject H0Dr. Forte’s data are consistent with H0 (p = 0.25)No evidence that weight loss in C57BL/6 mice isdifferent from 4 ounces.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 104: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/104.jpg)
t-test for Dr. Forte
Recall Dr. Forte’s question: is weight loss in C57BL/6mice (his data) different than in BALB/c mice(published data, weight loss = 4 ounces)?His hypotheses: H0 : µ = 4, HA : µ 6= 4.µ: weight loss in population of C57BL/6 miceDr. Forte: tcalc = x̄−µ0
SE(x̄) = (5.5− 4)/1.04 = 1.44
t-test rule: Reject H0 if |tcalc | > tcrit
With 3 df, α = 0.05, tcrit = 3.18What should Dr. Forte conclude?Fail to reject H0
Dr. Forte’s data are consistent with H0 (p = 0.25)No evidence that weight loss in C57BL/6 mice isdifferent from 4 ounces.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 105: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/105.jpg)
t-distribution (3df)
−6 −4 −2 0 2 4 6
0.0
0.1
0.2
0.3
t dist, 3 df2.5% cutpt = 3.18
Dr. Forte’s value: blue line(1.44)Compare Dr. Forte’s value toa t distribution with 3 dfReject H0 if tcalc in shadedareaFail to reject H0
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 106: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/106.jpg)
Confidence interval
95% confidence interval (CI): x̄ ± tcrit × SE(x̄)
(Assumes tcrit calculated using α = 0.05)
This example: tcrit = 3.18We calculated x̄ = 5.5, SE(x̄) = 1.04What is Dr. Forte’s 95% CI?
95% CI is 5.5± 3.18× 1.04 = (2.19, 8.81)
This is a 95% CI for µFail to reject H0 because interval includes µ0 = 4.Confidence interval more informative than “fail to reject H0”
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 107: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/107.jpg)
Confidence interval
95% confidence interval (CI): x̄ ± tcrit × SE(x̄)
(Assumes tcrit calculated using α = 0.05)
This example: tcrit = 3.18We calculated x̄ = 5.5, SE(x̄) = 1.04What is Dr. Forte’s 95% CI?95% CI is 5.5± 3.18× 1.04 = (2.19, 8.81)
This is a 95% CI for µFail to reject H0 because interval includes µ0 = 4.Confidence interval more informative than “fail to reject H0”
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 108: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/108.jpg)
Classical statistics
µ considered fixed.Observed data (xi ) are considered random
Could collect new data: different xiImplies calculated confidence interval is random.
If H0 is true, on average 95% of the CIs will include µ.Can never answer “Is H0 true?”
Can only say something about how unusual our teststatistic is if H0 is true
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 109: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/109.jpg)
p-value
−6 −4 −2 0 2 4 6
0.0
0.1
0.2
0.3
t dist, 3 df12.5% cutpt = 1.44 p-value = probability of observing
a statistic as extreme or moreextreme as we observed, if H0 istrue
Curve: total area=1Dr. Forte’s tcalc = 1.44.p-value = shaded area = 0.25
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 110: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/110.jpg)
p-value
We calculated p = 0.25 for Dr. Forte’s data.
Is this the same as the probability that H0 is true?
Nop-value is calculated assuming H0 is true.
Recall: p-value is probability of observing a statistic as or moreextreme as we observed, if H0 is true
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 111: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/111.jpg)
p-value
We calculated p = 0.25 for Dr. Forte’s data.
Is this the same as the probability that H0 is true?
Nop-value is calculated assuming H0 is true.
Recall: p-value is probability of observing a statistic as or moreextreme as we observed, if H0 is true
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 112: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/112.jpg)
p-value questions
What do you think of these statements?“The p-value tells you whether or not the observed effect isreal.”
Not true“If the result is not statistically significant, that proves thereis no difference.” Not true
Better:Use knowledge of the magnitude of the estimated effect(and the CI) to help interpret the results.Suppose comparing treatment to control and p = 0.07
If estimated effect large, “Our study suggests an importantbenefit of treatment, but this did not reach statisticalsignificance.”
A p-value is not an end in itself.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 113: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/113.jpg)
p-value questions
What do you think of these statements?“The p-value tells you whether or not the observed effect isreal.” Not true
“If the result is not statistically significant, that proves thereis no difference.” Not true
Better:Use knowledge of the magnitude of the estimated effect(and the CI) to help interpret the results.Suppose comparing treatment to control and p = 0.07
If estimated effect large, “Our study suggests an importantbenefit of treatment, but this did not reach statisticalsignificance.”
A p-value is not an end in itself.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 114: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/114.jpg)
p-value questions
What do you think of these statements?“The p-value tells you whether or not the observed effect isreal.” Not true“If the result is not statistically significant, that proves thereis no difference.”
Not trueBetter:
Use knowledge of the magnitude of the estimated effect(and the CI) to help interpret the results.Suppose comparing treatment to control and p = 0.07
If estimated effect large, “Our study suggests an importantbenefit of treatment, but this did not reach statisticalsignificance.”
A p-value is not an end in itself.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 115: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/115.jpg)
p-value questions
What do you think of these statements?“The p-value tells you whether or not the observed effect isreal.” Not true“If the result is not statistically significant, that proves thereis no difference.” Not true
Better:Use knowledge of the magnitude of the estimated effect(and the CI) to help interpret the results.Suppose comparing treatment to control and p = 0.07
If estimated effect large, “Our study suggests an importantbenefit of treatment, but this did not reach statisticalsignificance.”
A p-value is not an end in itself.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 116: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/116.jpg)
p-value questions
What do you think of these statements?“The p-value tells you whether or not the observed effect isreal.” Not true“If the result is not statistically significant, that proves thereis no difference.” Not true
Better:Use knowledge of the magnitude of the estimated effect(and the CI) to help interpret the results.Suppose comparing treatment to control and p = 0.07
If estimated effect large, “Our study suggests an importantbenefit of treatment, but this did not reach statisticalsignificance.”
A p-value is not an end in itself.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 117: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/117.jpg)
t-test for M&M’s
Consider M&M data from 1st sample of 8 peopleGroup 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169
Recall xi is % (red + blue) M&M’sTest H0 : µ = 1/3, versus HA : µ 6= 1/3.What is tcalc?
0.374− 0.3330.0169
= 2.41
Compare to tcrit = 2.36 (based on (n-1) df, α = 0.05)Conclusion? Reject H0 (also p = 0.047).
For first 8 people, mean % (red + blue) M&M’s significantlydifferent than 1/3.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 118: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/118.jpg)
t-test for M&M’s
Consider M&M data from 1st sample of 8 peopleGroup 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169
Recall xi is % (red + blue) M&M’sTest H0 : µ = 1/3, versus HA : µ 6= 1/3.What is tcalc?
0.374− 0.3330.0169
= 2.41
Compare to tcrit = 2.36 (based on (n-1) df, α = 0.05)Conclusion?
Reject H0 (also p = 0.047).For first 8 people, mean % (red + blue) M&M’s significantlydifferent than 1/3.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 119: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/119.jpg)
t-test for M&M’s
Consider M&M data from 1st sample of 8 peopleGroup 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169
Recall xi is % (red + blue) M&M’sTest H0 : µ = 1/3, versus HA : µ 6= 1/3.What is tcalc?
0.374− 0.3330.0169
= 2.41
Compare to tcrit = 2.36 (based on (n-1) df, α = 0.05)Conclusion? Reject H0 (also p = 0.047).
For first 8 people, mean % (red + blue) M&M’s significantlydifferent than 1/3.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 120: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/120.jpg)
t-test for M&M’s
For groups of 8 people, tcrit = 2.36Group 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169
tcalc = 2.41, p = 0.047Group 2: x̄= 0.3564, sx= 0.0574 SE(x̄)= 0.0203
tcalc = 1.138, p = 0.293Group 3: x̄= 0.3549, sx= 0.0348 SE(x̄)= 0.0123
tcalc = 1.175, p = 0.123Group 4: x̄= 0.3465, sx= 0.0342 SE(x̄)= 0.0121
tcalc = 1.084, p = 0.314
For entire sample (n=32), tcrit = 2.04Everyone: x̄= 0.358 sx= 0.0436 SE(x̄)= 0.0077
tcalc = 3.19, p = 0.003
Larger n: smaller SE(x̄), smaller tcrit , easier to reject H0 if H0not true.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 121: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/121.jpg)
t-test for M&M’s
For groups of 8 people, tcrit = 2.36Group 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169
tcalc = 2.41, p = 0.047Group 2: x̄= 0.3564, sx= 0.0574 SE(x̄)= 0.0203
tcalc = 1.138, p = 0.293Group 3: x̄= 0.3549, sx= 0.0348 SE(x̄)= 0.0123
tcalc = 1.175, p = 0.123Group 4: x̄= 0.3465, sx= 0.0342 SE(x̄)= 0.0121
tcalc = 1.084, p = 0.314
For entire sample (n=32), tcrit = 2.04Everyone: x̄= 0.358 sx= 0.0436 SE(x̄)= 0.0077
tcalc = 3.19, p = 0.003
Larger n: smaller SE(x̄), smaller tcrit , easier to reject H0 if H0not true.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 122: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/122.jpg)
t-test for M&M’s
For groups of 8 people, tcrit = 2.36Group 1: x̄= 0.374, sx= 0.0479 SE(x̄)= 0.0169
tcalc = 2.41, p = 0.047Group 2: x̄= 0.3564, sx= 0.0574 SE(x̄)= 0.0203
tcalc = 1.138, p = 0.293Group 3: x̄= 0.3549, sx= 0.0348 SE(x̄)= 0.0123
tcalc = 1.175, p = 0.123Group 4: x̄= 0.3465, sx= 0.0342 SE(x̄)= 0.0121
tcalc = 1.084, p = 0.314
For entire sample (n=32), tcrit = 2.04Everyone: x̄= 0.358 sx= 0.0436 SE(x̄)= 0.0077
tcalc = 3.19, p = 0.003
Larger n: smaller SE(x̄), smaller tcrit , easier to reject H0 if H0not true.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 123: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/123.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 124: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/124.jpg)
2-sample t-test
Dr. Forte now wants to compare the effect of ’superdrug’on weight loss in knockout mice (new) versus C57BL/6mice (just measured)
NOTE: Better to collect data on both species in the sametime period.
Let x̄1 and s1 be mean and SD of weight loss in 1st sample(C57BL/6), n1 = 4 observations.Let x̄2 and s2 be mean and SD of weight loss in 2ndsample (knockout), n2 observationsµ1 is population mean weight loss in C57BL/6 mice, µ2 inknockout mice.Null hypothesis of “no effect” or “no difference” isH0 : µ1 = µ2, so HA : µ1 6= µ2.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 125: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/125.jpg)
2-sample t-test (cont.)
H0 : µ1 = µ2, HA : µ1 6= µ2
Same as H0 : µ1 − µ2 = 0 and HA : µ1 − µ2 6= 0.Estimate µ1 − µ2 by x̄1 − x̄2.tcalc = x̄1−x̄2
SE(x̄1−x̄2)
Assumptions of the 2-sample test: data from 2 populationsAre independentAre normally distributed (within population)Have the same variance (or: use modified estimate ofstandard error)
Reject H0 if |tcalc | > tcrit (tcrit depends on n1 + n2 and α)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 126: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/126.jpg)
Scenario 1
Pairs of mice are housed in the same cage with same foodsourceInterest is in effect of restricted food intakeMice fed 2x/day, only 1 mouse can eat at once
Is this a reasonable design?
Mice are not independent: if one mouse in a pair eatsaggressively, the other mouse cannot eat as much
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 127: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/127.jpg)
Scenario 1
Pairs of mice are housed in the same cage with same foodsourceInterest is in effect of restricted food intakeMice fed 2x/day, only 1 mouse can eat at once
Is this a reasonable design?
Mice are not independent: if one mouse in a pair eatsaggressively, the other mouse cannot eat as much
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 128: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/128.jpg)
Scenario 1
Pairs of mice are housed in the same cage with same foodsourceInterest is in effect of restricted food intakeMice fed 2x/day, only 1 mouse can eat at once
Is this a reasonable design?
Mice are not independent: if one mouse in a pair eatsaggressively, the other mouse cannot eat as much
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 129: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/129.jpg)
Scenario 2
12 mice receive standard food, 12 a new food.Interest: gain in head circumference, measured at end of 2weeksBob measures mice receiving standard food, Harrymeasures the othersBob and Harry do not know which mice received whichfood source
Is this a reasonable design? No
Food source is totally confounded with person who takesmeasurementsBob might tend to get smaller measurements on sameanimal (bias)Harry might tend to measure less accurately (largervariance)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 130: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/130.jpg)
Scenario 2
12 mice receive standard food, 12 a new food.Interest: gain in head circumference, measured at end of 2weeksBob measures mice receiving standard food, Harrymeasures the othersBob and Harry do not know which mice received whichfood source
Is this a reasonable design?
No
Food source is totally confounded with person who takesmeasurementsBob might tend to get smaller measurements on sameanimal (bias)Harry might tend to measure less accurately (largervariance)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 131: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/131.jpg)
Scenario 2
12 mice receive standard food, 12 a new food.Interest: gain in head circumference, measured at end of 2weeksBob measures mice receiving standard food, Harrymeasures the othersBob and Harry do not know which mice received whichfood source
Is this a reasonable design? No
Food source is totally confounded with person who takesmeasurementsBob might tend to get smaller measurements on sameanimal (bias)Harry might tend to measure less accurately (largervariance)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 132: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/132.jpg)
Dr. Forte’s 2-sample data
1 2
12
34
56
78
group
y
●
●
●
●
Do you think assumptions arereasonable?
(OK)H0 : µ1 − µ2 = 0, HA : µ1 − µ2 6= 0.tcalc = x̄1−x̄2
SE(x̄1−x̄2)
x̄1 = 5.5, x̄2 = 2.1SE(x̄1 − x̄2) = 1.19 (assumes samevariance in 2 groups)What is tcalc? 5.5−2.1
1.19 = 2.86tcrit = 2.45, from 6 df, α = 0.05What is your conclusion? (Reject H0)
95% CI for µ1 − µ2 is x̄1 − x̄2 ± tcrit × SE(x̄1 − x̄2) = (0.50, 6.30)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 133: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/133.jpg)
Dr. Forte’s 2-sample data
1 2
12
34
56
78
group
y
●
●
●
●
Do you think assumptions arereasonable? (OK)H0 : µ1 − µ2 = 0, HA : µ1 − µ2 6= 0.tcalc = x̄1−x̄2
SE(x̄1−x̄2)
x̄1 = 5.5, x̄2 = 2.1SE(x̄1 − x̄2) = 1.19 (assumes samevariance in 2 groups)What is tcalc?
5.5−2.11.19 = 2.86
tcrit = 2.45, from 6 df, α = 0.05What is your conclusion? (Reject H0)
95% CI for µ1 − µ2 is x̄1 − x̄2 ± tcrit × SE(x̄1 − x̄2) = (0.50, 6.30)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 134: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/134.jpg)
Dr. Forte’s 2-sample data
1 2
12
34
56
78
group
y
●
●
●
●
Do you think assumptions arereasonable? (OK)H0 : µ1 − µ2 = 0, HA : µ1 − µ2 6= 0.tcalc = x̄1−x̄2
SE(x̄1−x̄2)
x̄1 = 5.5, x̄2 = 2.1SE(x̄1 − x̄2) = 1.19 (assumes samevariance in 2 groups)What is tcalc? 5.5−2.1
1.19 = 2.86tcrit = 2.45, from 6 df, α = 0.05What is your conclusion?
(Reject H0)
95% CI for µ1 − µ2 is x̄1 − x̄2 ± tcrit × SE(x̄1 − x̄2) = (0.50, 6.30)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 135: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/135.jpg)
Dr. Forte’s 2-sample data
1 2
12
34
56
78
group
y
●
●
●
●
Do you think assumptions arereasonable? (OK)H0 : µ1 − µ2 = 0, HA : µ1 − µ2 6= 0.tcalc = x̄1−x̄2
SE(x̄1−x̄2)
x̄1 = 5.5, x̄2 = 2.1SE(x̄1 − x̄2) = 1.19 (assumes samevariance in 2 groups)What is tcalc? 5.5−2.1
1.19 = 2.86tcrit = 2.45, from 6 df, α = 0.05What is your conclusion? (Reject H0)
95% CI for µ1 − µ2 is x̄1 − x̄2 ± tcrit × SE(x̄1 − x̄2) = (0.50, 6.30)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 136: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/136.jpg)
Dr. Forte’s conclusions
Recall: Dr. Forte wants to compare the effect of’superdrug’ on weight loss in knockout vs C57BL/6mice.C57BL/6 mice: (x̄1 = 5.5), knockout mice (x̄2 = 2.1)We rejected H0, p = 0.03
What does “reject H0” mean here?
Data suggests that theaverage weight loss in C57BL/6 mice is significantlygreater than in knockout mice.
What does the p-value mean?If the 2 species had the same mean weight loss, we wouldhave observed a difference as great or greater 3% of thetime.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 137: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/137.jpg)
Dr. Forte’s conclusions
Recall: Dr. Forte wants to compare the effect of’superdrug’ on weight loss in knockout vs C57BL/6mice.C57BL/6 mice: (x̄1 = 5.5), knockout mice (x̄2 = 2.1)We rejected H0, p = 0.03
What does “reject H0” mean here? Data suggests that theaverage weight loss in C57BL/6 mice is significantlygreater than in knockout mice.
What does the p-value mean?
If the 2 species had the same mean weight loss, we wouldhave observed a difference as great or greater 3% of thetime.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 138: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/138.jpg)
Dr. Forte’s conclusions
Recall: Dr. Forte wants to compare the effect of’superdrug’ on weight loss in knockout vs C57BL/6mice.C57BL/6 mice: (x̄1 = 5.5), knockout mice (x̄2 = 2.1)We rejected H0, p = 0.03
What does “reject H0” mean here? Data suggests that theaverage weight loss in C57BL/6 mice is significantlygreater than in knockout mice.
What does the p-value mean?If the 2 species had the same mean weight loss, we wouldhave observed a difference as great or greater 3% of thetime.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 139: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/139.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 140: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/140.jpg)
Plot the data: why?
1 2
5.6
5.8
6.0
6.2
6.4
6.6
6.8
group
y
●
●
●
●
●
●
Comments?
The data distribution looksreasonable
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 141: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/141.jpg)
Plot the data: why?
1 2
5.6
5.8
6.0
6.2
6.4
6.6
6.8
group
y
●
●
●
●
●
●
Comments?The data distribution looksreasonable
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 142: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/142.jpg)
Plot the data: why?
1 2
−4
−2
02
46
group
y
●●●●●●
Comments?
One large outlying observationA mistake?Non-normal distribution?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 143: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/143.jpg)
Plot the data: why?
1 2
−4
−2
02
46
group
y
●●●●●●
Comments?One large outlying observation
A mistake?Non-normal distribution?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 144: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/144.jpg)
Plot the data: why?
1 2
300
400
500
600
700
800
900
group
y
●
●
●
●
●
●
Comments?
Variability within groups notsimilarSame data as first plot -EXCEPT data isexponentiatedSuggest: use logtransformationKnowledge of what theoutcomes are can guidedecision about whether totransform
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 145: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/145.jpg)
Plot the data: why?
1 2
300
400
500
600
700
800
900
group
y
●
●
●
●
●
●
Comments?Variability within groups notsimilarSame data as first plot -EXCEPT data isexponentiatedSuggest: use logtransformationKnowledge of what theoutcomes are can guidedecision about whether totransform
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 146: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/146.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 147: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/147.jpg)
Dr. Crescendo
Dr. Crescendo wants to test the effects of drugs A, B, andC on cell counts in 96-well platesSome things to consider
Is each drug tested on multiple plates? (plate effect)Are the wells at the edge of the plate different from others?(edge effect)Are the wells for drug A always read first? (order effect)
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 148: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/148.jpg)
Set up for 1-way ANOVA
Dr. Crescendo took 5 measurements from each of the 3treatments (A,B,C)µ1, µ2, µ3 are population means of treatment effects A, B,and C.Hypotheses:
H0 : µ1 = µ2 = µ3HA : at least 2 means are not equal.
How to test H0, HA?
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 149: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/149.jpg)
Pictures for 1-way ANOVA
1 2 3
45
67
8
group
y
●
●
●
●
●
1 2 3
45
67
8group
y
●
●
●
●
●
How do the 2 pictures differ?
Right: 3 means are more different from each otherRight: Variability between means is also more differentANOVA compares variability between treatments tovariability within treatments
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 150: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/150.jpg)
Pictures for 1-way ANOVA
1 2 3
45
67
8
group
y
●
●
●
●
●
1 2 3
45
67
8group
y
●
●
●
●
●
How do the 2 pictures differ?Right: 3 means are more different from each other
Right: Variability between means is also more differentANOVA compares variability between treatments tovariability within treatments
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 151: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/151.jpg)
Pictures for 1-way ANOVA
1 2 3
45
67
8
group
y
●
●
●
●
●
1 2 3
45
67
8group
y
●
●
●
●
●
How do the 2 pictures differ?Right: 3 means are more different from each otherRight: Variability between means is also more different
ANOVA compares variability between treatments tovariability within treatments
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 152: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/152.jpg)
Pictures for 1-way ANOVA
1 2 3
45
67
8
group
y
●
●
●
●
●
1 2 3
45
67
8group
y
●
●
●
●
●
How do the 2 pictures differ?Right: 3 means are more different from each otherRight: Variability between means is also more differentANOVA compares variability between treatments tovariability within treatments
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 153: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/153.jpg)
Dr Crescendo’s data
1 2 3
5.5
6.0
6.5
7.0
group
y
●
●
●
●●
Dr. Crescendo's data
Sample sizes:n1 = n2 = n3 = 5.Sample means: x̄1 =5.49, x̄2 = 6.07, x̄3 = 6.38.Sample variances: s2
1 =0.29, s2
2 = 0.48, s23 = 0.66
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 154: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/154.jpg)
1-way ANOVA: basics
ANOVA assumes Var(x1) = Var(x2) = Var(x3)
Call this σ2
Variability within treatments is weighted average ofs2
1, s22, s2
3.degrees of freedom (df) = n1 + n2 + n3 − 3 (here: df=12)Dr. Crescendo: within treatment variance is3.016/12 = 0.25
Variability between treatments is weighted average of howfar apart group means are from overall mean
df = number groups - 1 (here: df=2)Dr. Crescendo: between treatment variance is2.063/2 = 1.03
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 155: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/155.jpg)
1-way ANOVA: basics
ANOVA assumes Var(x1) = Var(x2) = Var(x3)
Call this σ2
Variability within treatments is weighted average ofs2
1, s22, s2
3.degrees of freedom (df) = n1 + n2 + n3 − 3 (here: df=12)Dr. Crescendo: within treatment variance is3.016/12 = 0.25
Variability between treatments is weighted average of howfar apart group means are from overall mean
df = number groups - 1 (here: df=2)Dr. Crescendo: between treatment variance is2.063/2 = 1.03
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 156: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/156.jpg)
1-way ANOVA: basics
ANOVA assumes Var(x1) = Var(x2) = Var(x3)
Call this σ2
Variability within treatments is weighted average ofs2
1, s22, s2
3.degrees of freedom (df) = n1 + n2 + n3 − 3 (here: df=12)Dr. Crescendo: within treatment variance is3.016/12 = 0.25
Variability between treatments is weighted average of howfar apart group means are from overall mean
df = number groups - 1 (here: df=2)Dr. Crescendo: between treatment variance is2.063/2 = 1.03
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 157: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/157.jpg)
Testing in 1-way ANOVA (3 groups)
Recall: 1-way ANOVA comparesVariability within treatment: MS(Within) = MS(Error)Variability between treatment: MS(Between) = MS(Group)(Note: MS = mean square)
Dr. Crescendo:MS(Error) = 3.016
12 = 0.25MS(Between) = 2.063
2 = 1.03
ANOVA tableDf Sum Sq Mean Sq
Between 2 2.06337 1.03169Error 12 3.01612 0.25134
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 158: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/158.jpg)
Testing in 1-way ANOVA (3 groups)
Recall: 1-way ANOVA comparesVariability within treatment: MS(Within) = MS(Error)Variability between treatment: MS(Between) = MS(Group)(Note: MS = mean square)
Dr. Crescendo:MS(Error) = 3.016
12 = 0.25MS(Between) = 2.063
2 = 1.03
ANOVA tableDf Sum Sq Mean Sq
Between 2 2.06337 1.03169Error 12 3.01612 0.25134
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 159: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/159.jpg)
1-way ANOVA: basics
If H0 trueHow do MS(Within) and MS(Between) compare?How could we estimate σ2?
MS(Between) ≈ MS(Within), and both estimate σ2
MS(Between)MS(Within) close to 1.
If H0 not true, same questionsOnly MS(Within) estimates σ2.
MS(Between) > MS(Within) so MS(Between)MS(Within) > 1
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 160: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/160.jpg)
1-way ANOVA: basics
If H0 trueHow do MS(Within) and MS(Between) compare?How could we estimate σ2?MS(Between) ≈ MS(Within), and both estimate σ2
MS(Between)MS(Within) close to 1.
If H0 not true, same questionsOnly MS(Within) estimates σ2.
MS(Between) > MS(Within) so MS(Between)MS(Within) > 1
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 161: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/161.jpg)
1-way ANOVA: basics
If H0 trueHow do MS(Within) and MS(Between) compare?How could we estimate σ2?MS(Between) ≈ MS(Within), and both estimate σ2
MS(Between)MS(Within) close to 1.
If H0 not true, same questions
Only MS(Within) estimates σ2.
MS(Between) > MS(Within) so MS(Between)MS(Within) > 1
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 162: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/162.jpg)
1-way ANOVA: basics
If H0 trueHow do MS(Within) and MS(Between) compare?How could we estimate σ2?MS(Between) ≈ MS(Within), and both estimate σ2
MS(Between)MS(Within) close to 1.
If H0 not true, same questionsOnly MS(Within) estimates σ2.
MS(Between) > MS(Within) so MS(Between)MS(Within) > 1
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 163: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/163.jpg)
Testing in 1-way ANOVA (3 groups)
H0 : µ1 = µ2 = µ3, HA : at least 2 means are not equal.
ANOVA tableDf Sum Sq Mean Sq F value Pr(>F)
Between 2 2.06337 1.03169 4.1047 0.04383Error 12 3.01612 0.25134
Test of H0 is Fcalc =MS(Between)
MS(Error) = 4.10.
If H0 true (and assumptions met!), Fcalc ∼ F2,12
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 164: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/164.jpg)
Testing in 1-way ANOVA (3 groups)
H0 : µ1 = µ2 = µ3, HA : at least 2 means are not equal.
ANOVA tableDf Sum Sq Mean Sq F value Pr(>F)
Between 2 2.06337 1.03169 4.1047 0.04383Error 12 3.01612 0.25134
Test of H0 is Fcalc =MS(Between)
MS(Error) = 4.10.
If H0 true (and assumptions met!), Fcalc ∼ F2,12
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 165: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/165.jpg)
Testing in 1-way ANOVA (3 groups)
H0 : µ1 = µ2 = µ3, HA : at least 2 means are not equal.
ANOVA tableDf Sum Sq Mean Sq F value Pr(>F)
Between 2 2.06337 1.03169 4.1047 0.04383Error 12 3.01612 0.25134
Test of H0 is Fcalc =MS(Between)
MS(Error) = 4.10.
If H0 true (and assumptions met!), Fcalc ∼ F2,12
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 166: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/166.jpg)
Some F distributions
0 1 2 3 4 5
0.0
0.2
0.4
0.6
0.8
1.0
df=2,12df=3,12df=8,12df=8,100
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 167: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/167.jpg)
Dr. Crescendo’s Fcalc
0 1 2 3 4 5 6
0.0
0.2
0.4
0.6
0.8
1.0
F dist, 2 and 12 df5% cutpt = 3.89
Generally: only large valuesof Fcalc considered unusual(1-tail)Reject H0 ifFcalc > F2,12,.95 = 3.89.Dr. Crescendo:Fcalc = 1.03/0.25 = 4.10,p = 0.044. Reject H0.
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 168: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/168.jpg)
Outline
1 Design of experiments: examples
2 Mean, standard deviation, standard error
3 M&M data
4 t-test
5 2-sample t-test
6 Importance of plotting the data
7 1-way ANOVA
8 Summary remarks
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 169: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/169.jpg)
Interacting with statistician
“What questions should I ask the statistician?”
Who asks the first questions?I ask: “Tell me about your experiment and your question.”We need some background
If designing an experiment:I ask: “What is your research question (explain in terms ofdata you plan to collect)?”We can help you
Decide what data to collect to answer questionHow to design experimentFigure out reasonable sample size (but need furtherinformation from you: estimated effect)
For data already collectedI ask: “What is your research question (explained in termsof data)?”Plots of data helpful
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 170: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/170.jpg)
Interacting with statistician
“What questions should I ask the statistician?”Who asks the first questions?
I ask: “Tell me about your experiment and your question.”We need some background
If designing an experiment:I ask: “What is your research question (explain in terms ofdata you plan to collect)?”We can help you
Decide what data to collect to answer questionHow to design experimentFigure out reasonable sample size (but need furtherinformation from you: estimated effect)
For data already collectedI ask: “What is your research question (explained in termsof data)?”Plots of data helpful
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 171: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/171.jpg)
How a statistician can help
Design of experimentStatistician must understand study aims and backgroundCan help investigator decide
what data to collecthow to design data collection processhow to use data to answer research questions
AnalysisDetermine appropriate modelUnderstand and check assumptions (use different model ifviolated)Carry out analysis
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 172: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/172.jpg)
What does Biostatistics Dept offer?
CoursesBST463 = Introduction to Biostatistics (fall semester)
Consulting serviceRotation system: faculty and PhD student or postdocFree assistance with grant preparation if appropriate %effort for BiostatisticsFree 1-hour consultation on anythingIf related to CTSI, voucher possible for up to 10 free hoursStudent projects: faculty mentor must attend initial meetingand most other meetings with BiostatisticiansFaculty advisor can contact Susan Messing([email protected]) for appointment,see http://www.urmc.rochester.edu/biostat/consulting/
Sally W. Thurston, PhD Basic Statistics: MBI-540
![Page 173: Basic Statistics: MBI-540 · Basic Statistics: MBI-540 Sally W. Thurston, PhD Associate Professor Department of Biostatistics and Computational Biology University of Rochester January](https://reader033.fdocuments.net/reader033/viewer/2022041903/5e619384fdf9ae3d6719ce4e/html5/thumbnails/173.jpg)
Parting remarks
Statistics is not just “crunching numbers”Requires understanding of model and assumptions.
Good experimental design and data collection is keyPoor design: may be impossible to make valid inferenceabout population.Biased data collection: may never be able to adjust for bias.Good design: enables statistical inference about populationof inference based on data collected.
Sally W. Thurston, PhD Basic Statistics: MBI-540