Basics in Epidemiology & Biostatistics 2 RSS6 2014
![Page 1: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/1.jpg)
Basics in Epidemiology & Biostatistics
Hashem Alhashemi MD, MPH, FRCPC Assistant Professor, KSAU-HS
![Page 2: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/2.jpg)
Parametric data
• Large samples (n > 30).
• Normally distributed.
• Descriptive statistics: Range, Mean, SD.
Non-parametric data
• For small samples & variables that are not normally distributed.
• No basic assumptions (distribution free).
• Descriptive statistics: Range, Rank, Median, & the interquartile range (the middle 50% = Q3-Q1).
• Median is the middle number in a ranked list of numbers (regardless of its frequency).
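A minimal sketch of the two sets of descriptive statistics, using Python's standard `statistics` module. The skewed data set is made up for illustration, and the quartiles use the module's default ("exclusive") method, which is only one of several conventions.

```python
from statistics import mean, median, quantiles

# Hypothetical skewed sample: one extreme value (100) among small values.
data = [1, 2, 3, 4, 5, 6, 100]

# Parametric-style summary: the mean is pulled toward the extreme value.
mean_value = mean(data)

# Non-parametric summary: based on ranks, so the extreme value has little effect.
median_value = median(data)
q1, q2, q3 = quantiles(data, n=4)   # quartiles, default "exclusive" method
iqr = q3 - q1                       # the middle 50% of the data
data_range = max(data) - min(data)

print("mean:", round(mean_value, 2))
print("median:", median_value)
print("Q1, Q3, IQR:", q1, q3, iqr)
print("range:", data_range)
```

Here the median and interquartile range describe the bulk of the data, while the mean lands above Q3 because of the single extreme value.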
![Page 3: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/3.jpg)
Non-parametric data
![Page 4: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/4.jpg)
The Mean
• It is computed from the sum of all the values (a great numerical summary).
• But it is affected by extreme values, so it is not a good summary if your data are not normal (symmetrical bell shape).
• The differences of the data above and below the mean sum to 0.
"Love of extremes is excess; the best of things is the middle."
![Page 5: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/5.jpg)
Standard Deviation
Average of the differences from the mean (squared: sum of squares, SS)
Sample set: 1, 2, 3, 4, 5, 6, 7
X̄ = 28/7 = 4
Number of differences = 6 (n-1)
Standard deviation: the unit of deviation of the data from the mean
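The arithmetic for the sample set above can be sketched in Python as follows. The variable names are illustrative; the cross-check uses `statistics.stdev`, which also divides by n-1.

```python
from math import sqrt
from statistics import stdev

data = [1, 2, 3, 4, 5, 6, 7]
n = len(data)

x_bar = sum(data) / n                      # 28 / 7 = 4
deviations = [x - x_bar for x in data]     # differences from the mean
# The raw differences cancel out (sum to 0), which is why they are squared.
sum_of_deviations = sum(deviations)

ss = sum(d ** 2 for d in deviations)       # sum of squares (SS) = 28
variance = ss / (n - 1)                    # divide by n-1 = 6
s = sqrt(variance)                         # standard deviation

print("mean:", x_bar)
print("SS:", ss)
print("S:", round(s, 3))
```

For this sample the sum of squares is 28, so S is the square root of 28/6, about 2.16.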
![Page 6: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/6.jpg)
Differences?
![Page 7: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/7.jpg)
• Similar: < +/-1𝛔
• Slightly Different: < +/-2𝛔
• Very Different: > +/-2𝛔 (0.02)
• Extremely Different: > +/-3𝛔 (0.001)
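A short sketch of where the figures 0.02 and 0.001 can come from, assuming they are the areas of one tail of the normal curve beyond 2𝛔 and 3𝛔. The helper `describe_difference` is hypothetical, and how it treats values exactly on a boundary is an assumption.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)

def upper_tail(k):
    """Area under the standard normal curve above +k standard deviations."""
    return 1 - STANDARD_NORMAL.cdf(k)

def describe_difference(value, mean, sd):
    """Label a value by its distance from the mean, measured in SD units."""
    distance = abs(value - mean) / sd
    if distance < 1:
        return "similar"
    if distance < 2:
        return "slightly different"
    if distance < 3:
        return "very different"
    return "extremely different"

print("beyond 2 SD (one tail):", round(upper_tail(2), 3))
print("beyond 3 SD (one tail):", round(upper_tail(3), 4))
print(describe_difference(11, mean=4, sd=2.16))
```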
![Page 8: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/8.jpg)
Z distribution (Parametric Data, Population)
• The Z distribution is a hypothetical population (model) with a 𝛍 of 0 and a 𝛔 of 1.
• Six 𝛔 (from -3𝛔 to +3𝛔) make up 0.997 of the area under the curve.
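As a rough check of the 0.997 figure, the area under the Z curve within k standard deviations of the mean can be computed with `statistics.NormalDist`. The `z_score` helper and its example numbers (a value of 130 with mean 100 and SD 15) are illustrative and not from the slides.

```python
from statistics import NormalDist

z_model = NormalDist(mu=0, sigma=1)   # the Z distribution: mean 0, SD 1

def area_within(k):
    """Area under the Z curve between -k and +k standard deviations."""
    return z_model.cdf(k) - z_model.cdf(-k)

def z_score(x, mu, sigma):
    """Express a measurement as a number of SDs away from the mean."""
    return (x - mu) / sigma

for k in (1, 2, 3):
    print(f"within +/-{k} SD:", round(area_within(k), 4))

print("z for 130 (mean 100, SD 15):", z_score(130, 100, 15))
```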
![Page 9: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/9.jpg)
• God knows everything.
• Does not need to take samples.
• Commits no mistakes.
![Page 10: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/10.jpg)
Central Limit Theorem
• The mean of all possible sample means will be approximately equal to the mean of the population.
• The distribution of all possible sample means will be approximately normal when the samples are large enough.
• If you limit your prediction to the center, you will be OK (averages are normally distributed).
"Love of extremes is excess; the best of things is the middle."
Carl Friedrich Gauss (1777-1855)
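A small simulation can illustrate the theorem. This is only a sketch: the skewed (exponential) population with mean 1 and SD 1, the sample size of 30, and the 2000 repetitions are all assumptions chosen for the demonstration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)            # fixed seed so the run is repeatable

SAMPLE_SIZE = 30
N_SAMPLES = 2000
POPULATION_MEAN = 1.0     # mean of an exponential population with rate 1
POPULATION_SD = 1.0       # SD of the same population

# Draw many samples from a clearly non-normal population and keep each mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
expected_se = POPULATION_SD / sqrt(SAMPLE_SIZE)

print("mean of sample means:", round(mean_of_means, 3))
print("SD of sample means:  ", round(sd_of_means, 3))
print("sigma / sqrt(n):     ", round(expected_se, 3))
```

The mean of the sample means sits close to the population mean, and their spread is close to the population SD divided by the square root of the sample size, even though the population itself is skewed.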
![Page 11: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/11.jpg)
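t distribution (Parametric Data, Sample, Sampling distribution)
• The t distribution is a hypothetical sampling distribution (model) with a 𝛍 of 0 and a shape that depends on the degrees of freedom (n-1): its tails are wider than those of the Z distribution, and it approaches the Z distribution as n grows.
• Six SE (from -3 SE to +3 SE) make up about 0.997 of the area under the curve only when the sample is large; with small samples more of the area lies in the tails.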
![Page 12: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/12.jpg)
• Similar: < +/-1 SE
• Slightly Different: < +/-2 SE
• Very Different: > +/-2 SE (0.02)
• Extremely Different: > +/-3 SE (0.001)
![Page 13: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/13.jpg)
Standard Error
SE is the unit for error in estimating the population mean.
SE is the unit for deviation of all possible sample means from the population mean.
SE is the unit for average difference of all possible sample means from the population mean.
SE = S / √n (√n because S is the square root of the variance).
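A minimal sketch of the standard error calculation, reusing the sample set 1 to 7 from the standard deviation slide. The function name `standard_error` is illustrative.

```python
from math import sqrt
from statistics import stdev

def standard_error(data):
    """SE of the mean: sample standard deviation divided by sqrt(n)."""
    return stdev(data) / sqrt(len(data))

data = [1, 2, 3, 4, 5, 6, 7]
s = stdev(data)              # about 2.16
se = standard_error(data)    # about 2.16 / sqrt(7)

print("S: ", round(s, 3))
print("SE:", round(se, 4))
```

Because SE divides S by the square root of n, quadrupling the sample size halves the standard error when S stays the same.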
![Page 14: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/14.jpg)
The Average Idea
X̄ (mean):
• The average.
S (Standard Deviation):
• A unit of deviation of the data from the sample mean.
• A unit for the average of differences of the data from the sample mean.
SE (Standard Error):
• A unit for error in estimation of the population mean.
• A unit for deviation of all possible sample means from the population mean.
• A unit for the average of differences of all possible sample means from the population mean.
![Page 15: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/15.jpg)
Biostatistics
A Fancy World made of %s & Averages
![Page 16: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/16.jpg)
Sample size
[Diagram labels: Estimate, Calculate, Calculate (SE), Estimate]
![Page 17: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/17.jpg)
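95% Confidence Interval (C.I)
General formula: Estimate +/- Margin of Error
Margin of Error = 2 SE (SE: Standard Error)
Estimates and the parameters they estimate:
• X̄ estimates μ
• P estimates π
• OR estimates Ω
• Rate estimates λ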
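The general formula can be sketched for a mean as follows, using the sample set 1 to 7 from the standard deviation slide. The multiplier of 2 follows the slide; for a sample this small, a t-based multiplier would be somewhat larger, so treat the numbers as an illustration of the formula rather than a recommended analysis.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(data, multiplier=2.0):
    """Estimate +/- margin of error, with margin = multiplier * SE."""
    estimate = mean(data)
    se = stdev(data) / sqrt(len(data))
    margin = multiplier * se
    return estimate - margin, estimate + margin

low, high = confidence_interval([1, 2, 3, 4, 5, 6, 7])
print(f"95% C.I is roughly ({low:.3f}, {high:.3f})")
```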
![Page 18: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/18.jpg)
SD vs SE
• Standard Deviation calculates the variability of the data within a sample in relation to the sample mean. So, it helps identify the % of data above and below a certain measurement.
• Standard Error estimates the variability of all possible sample means in relation to the population mean. So, it helps identify the degree of error in your estimation.
![Page 19: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/19.jpg)
Biostatistics
A Fancy World made of Averages & %s
![Page 20: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/20.jpg)
Difference between studying populations & samples:
Population (descriptive):
• Calculate Mean μ (measures)
• Calculate proportion 𝛑 (counts)
• Calculate Standard deviation σ
• Calculate Parameters: μ & 𝛑
Sample (inferential):
• Estimate Sample size
• Calculate Mean X̄
• Calculate Standard deviation S
• Calculate Standard error SE & 95% C.I (Confidence Interval)
• Calculate Statistics
• Estimate Parameters: μ & 𝛑
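One concrete place where the population and sample calculations differ is the standard deviation. As a sketch, Python's `statistics` module offers `pstdev` (divides by n, for a whole population) and `stdev` (divides by n-1, for a sample); the seven values below are reused from the earlier example and stand in for either case.

```python
from statistics import mean, pstdev, stdev

values = [1, 2, 3, 4, 5, 6, 7]

# Treated as a complete population: parameters are calculated directly.
mu = mean(values)
sigma = pstdev(values)     # sqrt(SS / n)

# Treated as a sample: statistics are calculated, parameters are estimated.
x_bar = mean(values)
s = stdev(values)          # sqrt(SS / (n - 1))

print("population: mu =", mu, " sigma =", round(sigma, 3))
print("sample:     X  =", x_bar, " S     =", round(s, 3))
```

The sample value S comes out slightly larger than σ for the same numbers, because dividing by n-1 allows for the uncertainty of estimating from a sample.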
![Page 21: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/21.jpg)
END
![Page 24: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/24.jpg)
Quantitative Data
Count (Discrete):
• Binomial (Binary): Sex
• Multinomial:
  1- Categorical: Race
  2- Ordinal: Education
  3- Numerical: number of pregnancies/residents
Measure (Continuous):
• Ratio (real zero) / Interval (no zero): Temperature/BP
![Page 27: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/27.jpg)
Objectives
• Definitions.
• Types of Data.
• Data summaries.
• Mean X̄, Standard deviation S.
• Standard Error SE, Confidence interval C.I of μ.
![Page 28: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/28.jpg)
Quantitative Data
Discrete:
• Dichotomous (Binary): Sex
• Multichotomous:
  1- No order: Race
  2- Ordinal: Education
• Numerical: number of pregnancies/residents
Continuous:
• Ratio (real zero) / Interval (no zero): Temperature/BP
(Non-Parametric Data)
![Page 29: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/29.jpg)
Types of Data
Quantitative Data
Count (Discrete):
• Categorical (Non-Parametric Data):
  1- Dichotomous: Sex
  2- Multichotomous: Race, Education
• Numerical (Parametric Data): number of pregnancies/residents
Continuous:
• Ratio (real zero) / Interval (no zero) (Parametric Data): Temperature/BP
![Page 30: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/30.jpg)
Summaries
• Measures (any value): Numerical: X̄, 𝛍, s, 𝛔. Visual: Histogram.
• Counts (categories): Numerical: P, 𝛑, s, 𝛔. Visual: Bar & Pie Chart.
![Page 32: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/32.jpg)
Normality & Approximation to Normality
Why?
![Page 33: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/33.jpg)
Approximation to Normality
• If the choices are equally likely to happen,
• and the trial is repeated a large number of times,
• the distribution of the results will look normal.
• This holds whether it is a coin or a die (dichotomous or multichotomous).
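The idea can be sketched by counting, exactly, how many ways each total can occur when several fair dice are thrown. The function name `sum_distribution` and the choice of 10 dice are assumptions made for the illustration; a coin can be modelled the same way with `faces=2`.

```python
def sum_distribution(n_dice, faces=6):
    """Exact probability of each possible total when throwing n fair dice."""
    ways = {0: 1}                          # one way to have a total of 0 with no dice
    for _ in range(n_dice):
        updated = {}
        for total, count in ways.items():
            for face in range(1, faces + 1):
                updated[total + face] = updated.get(total + face, 0) + count
        ways = updated
    all_outcomes = faces ** n_dice
    return {total: count / all_outcomes for total, count in ways.items()}

dist = sum_distribution(10)
peak = max(dist, key=dist.get)

# Crude text histogram: one '#' per 0.5% of probability.
for total in sorted(dist):
    print(f"{total:3d} {'#' * round(dist[total] * 200)}")
print("most likely total:", peak)
```

With one die every total is equally likely (a flat shape), but with 10 dice the totals pile up symmetrically around 35 in a bell-like shape.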
![Page 34: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/34.jpg)
Normality & Approximation to Normality
Clinical Relevance?
![Page 35: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/35.jpg)
• Choices equally likely to happen, i.e. the probability of the outcome of interest is unknown (research ethics).
• Repeated a large number of times, i.e. a large sample size.
• The normality assumption helps us predict the probability of our outcome.
![Page 36: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/36.jpg)
The Bell / Normal curve
Standard deviation (SD) / sample curve. True error (SE) / population curve.
• It was first discovered by Abraham de Moivre in 1733.
• The one who was able to reproduce it and identify it as the normal distribution (error curve) was Gauss in 1809.
![Page 37: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/37.jpg)
De Moivre had hoped for a chair of mathematics, but foreigners were at a disadvantage, so although he was free from religious discrimination, he still suffered discrimination as a Frenchman in England.
Born 1667 in Champagne, France
Died 1754 in London, England
![Page 38: Basics in Epidemiology & Biostatistics 2 RSS6 2014](https://reader034.fdocuments.net/reader034/viewer/2022052311/55895931d8b42acb638b4759/html5/thumbnails/38.jpg)
Largest Value - Smallest Value SD estimate