biostatistics basic

description

For people interested in biostatistics.

Transcript of biostatistics basic

  • 1. AN INTRODUCTION ON BIOSTATISTICS PART 1. Dr. Vaneet Aggarwal, Department of Pharmacology

  • 2. TOPICS: Definitions. Data and types of data. Methods of data presentation. Types of statistics. Measures of central tendency. Measures of dispersion. The normal distribution.
  • 3. WHAT DOES STATISTICS MEAN?
  • 4. STATISTICS: Principles and methods for the collection, presentation, analysis and interpretation of data. BIOSTATISTICS: Tools of statistics applied to data derived from the biological sciences.
  • 5. WHY DO WE NEED BIOSTATISTICS? To define normalcy. To study the correlation or association between two or more attributes. To locate, define and measure the extent of disease. To evaluate the efficacy of drugs. To determine the success or failure of health care programs.
  • 6. QUALITY RESEARCH: How does biostatistics help a postgraduate student?
  • 7. DESCRIPTIVE AND INFERENTIAL STATISTICS: Descriptive statistics are concerned with the presentation, organisation and summarisation of data. Inferential statistics allow us to generalise from our sample group of data to a larger group of subjects.
  • 8. VARIABLES: A variable is simply what is being observed or measured.
  • 9. FLAVOURS OF VARIABLES: The DEPENDENT variable is the outcome of interest, which should change in response to some intervention. The INDEPENDENT variable is the intervention, or what is being manipulated.
  • 10. More generally, if one variable changes in response to another, we say that the dependent variable is the one that changes in response to the independent variable.
  • 11. DATA: Collective recording of observations.
  • 12. Depending upon the source of data collection. PRIMARY DATA: interviews, examinations, questionnaires. SECONDARY DATA: records, census data.
  • 13. DATA. QUALITATIVE DATA (discrete / frequency data): subjects with the same characteristic are counted to form specific groups. QUANTITATIVE DATA (continuous data): they have magnitude.
  • 14. SCALES OF DATA MEASUREMENT
  • 15. SCALES OF DATA MEASUREMENT. Categorical: NOMINAL SCALE, ORDINAL SCALE. Non-categorical: INTERVAL SCALE, RATIO SCALE.
  • 16. NOMINAL SCALE: A nominal variable consists of named categories, with no implied order among the categories. "Existential" variables.
  • 17. ORDINAL SCALE: An ordinal scale consists of ordered categories, where the differences between categories cannot be considered equal.
  • 18. INTERVAL SCALE: An interval scale has equal distances between values, but the zero is arbitrary.
  • 19. RATIO SCALE: A ratio scale has equal intervals between values and a meaningful zero point.
  • 20. SCALES AT A GLANCE
      Scale type   Assumptions
      Nominal      Named categories
      Ordinal      Ordered categories
      Interval     Equal intervals
      Ratio        Meaningful zero
  • 21. EXAMPLES OF SCALES: Indicate whether the following variables are nominal, ordinal, interval or ratio.
      a) Your income (assuming it's more than $0).
      b) A list of the different specialities in your profession.
      c) The ranking of specialities with regard to income.
      d) Salman Khan was described as a "10". What type of variable was the scale?
      e) A range of motion in degrees.
  • 22. EXAMPLES OF SCALES
      f) A score of 13 out of 17 on the Anxiety Scale.
      g) Staging of breast cancer as type I, II, III or IV.
      h) ST depression on the ECG, measured in millimeters.
      i) ST depression, measured as "1" +/- 1 mm, "2" = 1 to 5 mm, and "3" = 5 mm.
      j) ICD-9 classifications: 0295 = organic psychosis, 0296 = depression and so on.
      k) Diastolic blood pressure, in mm Hg.
      l) Pain measurement on a seven-point scale.
  • 23. PROPORTION AND RATE: A proportion is a type of fraction in which the numerator is a subset of the denominator. A rate is a fraction that also has a time component.
  • 24. DATA REPRESENTATION: Graphing nominal data. Graphing ordinal data. Graphing interval and ratio data.
  • 25. NOMINAL DATA: Bar graphs. Histograms. Dot plots.
  • 26. EXAMPLE
      COURSE         NUMBER OF STUDENTS
      FORENSIC       25
      PSM            42
      PHARMACOLOGY   8
      MICROBIOLOGY   12
      PATHOLOGY      13
  • 27. [Bar graph: number of students (axis 0 to 50) by course, in table order: forensic, psm, pharma, micro, patho]
  • 28. [Bar graph: number of students (axis 0 to 50) by course, in descending order: psm, forensic, patho, micro, pharma]
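The proportion defined in slide 23 and the rank ordering used for the bar graphs can be sketched in a few lines of Python. This is only an illustration: the counts are copied from the example table in slide 26, while the function names `proportions` and `rank_order` are made up here and do not come from the slides.

```python
# Course counts copied from the example table (slide 26).
counts = {
    "FORENSIC": 25,
    "PSM": 42,
    "PHARMACOLOGY": 8,
    "MICROBIOLOGY": 12,
    "PATHOLOGY": 13,
}


def proportions(table):
    """Return each category's share of the total.

    Each numerator (one course) is a subset of the denominator (all students),
    which is what makes these fractions proportions.
    """
    total = sum(table.values())
    return {name: n / total for name, n in table.items()}


def rank_order(table):
    """Return (category, count) pairs sorted from largest to smallest count."""
    return sorted(table.items(), key=lambda item: item[1], reverse=True)


shares = proportions(counts)
for name, n in rank_order(counts):
    # One row per course: label, count, and share of all students.
    print(f"{name:<13} {n:>3} {shares[name]:.0%}")
```

Because the five counts happen to add up to 100, each share equals its count read as a percentage.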
  • 29. [Bar graph: number of students (axis 0 to 50) by course: psm, forensic, patho, micro, pharma]
  • 30. ORDINAL DATA: Bar graphs. Histograms. Dot plots.
  • 31. INTERVAL AND RATIO DATA: Histograms. Stem and leaf plot. Frequency polygon. Cumulative frequency polygon.
  • 32 to 35. [Figures; no text in the transcript]
  • 36. FOR MAKING HISTOGRAMS: Rank order the data. Find the range. Choose the width. Make a new table giving the interval, midpoint, count, etc. Turn it into a histogram. You lose some information on the way.
  • 37 to 40. [Figures; no text in the transcript]
  • 41. PIC OF FREQUENCY POLYGON
  • 42. CUMULATIVE FREQUENCY POLYGON
  • 43. HISTOGRAM FOR 3 GROUPS
  • 44. FREQUENCY POLYGON FOR 3 GROUPS
  • 45. SO WHEN TO USE WHAT? Bar graphs and histograms can be used for all types of data, but with more than 2 groups use a frequency polygon. Use graphs to show relationships, not to report numbers.
  • 46. [Bar graph: number of students (axis 0 to 50) by course: psm, forensic, patho, micro, pharma]
  • 47. [Chart of the same data as percentages: 42%, 25%, 13%, 12%, 8%]
  • 48. [Graph by month: April, May, June, July; axis 0 to 400]
  • 49. TABLE 1
  • 50. TABLE VERSION 2.0
  • 51. TABLE VERSION 3.0
  • 52. NUMBERS: A specific data point, the value of a variable for one subject, is represented by the capital letter X. We denote the mean of a variable by putting a bar over the capital letter X: X̄. The number of subjects in the sample is represented by N; n indicates the sample size of a group. Use subscript notation to differentiate between various sample sizes, data points, etc.
  • 53. NUMBERS: Nominal data. Ordinal data. Ratio and interval data.
  • 54. [Figure; no text in the transcript]
  • 55. INTERVAL AND RATIO DATA: The mean is the measure of central tendency. A measure of central tendency is the typical value for the data.
  • 56. ORDINAL DATA: The median is the measure of central tendency. The median is the value such that half of the data points fall above it and half below it.
  • 57. NOMINAL DATA: The mode is the measure of central tendency. The mode is the most frequently occurring category.
  • 58. MEASURE OF DISPERSION: Refers to how closely the data cluster around the measure of central tendency.
  • 59. NOMINAL / ORDINAL DATA: Index of dispersion.
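The pairing of scale type with measure of central tendency in slides 55 to 57 can be tried out with Python's standard `statistics` module. This is a minimal sketch: the blood pressure and pain values are invented for illustration and are not from the slides, and only the course counts are taken from the example table in slide 26.

```python
import statistics

# Hypothetical interval/ratio data: diastolic blood pressure in mm Hg.
dbp = [70, 72, 75, 78, 80, 80, 85, 90]

# Hypothetical ordinal data: pain scores on a seven-point scale.
pain = [1, 2, 2, 3, 4, 6, 7]

# Nominal data: one course label per student, using the counts from slide 26.
courses = (
    ["FORENSIC"] * 25
    + ["PSM"] * 42
    + ["PHARMACOLOGY"] * 8
    + ["MICROBIOLOGY"] * 12
    + ["PATHOLOGY"] * 13
)

mean_dbp = statistics.mean(dbp)          # interval/ratio data: mean
median_pain = statistics.median(pain)    # ordinal data: median
mode_course = statistics.mode(courses)   # nominal data: mode

print("mean DBP:", mean_dbp)
print("median pain score:", median_pain)
print("most common course:", mode_course)
```

The median and mode are also defined for interval and ratio data, but the mean has no meaning for the nominal course labels, which is why the slides pair each scale with the measure shown.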
  • 60. ORDINAL DATA: Range (difference between the highest and lowest values). Interquartile range / midspread.
  • 61. [Figure; no text in the transcript]
  • 62. INTERVAL OR RATIO DATA: Interquartile range. The mean deviation. The variance and the standard deviation. The coefficient of variation.
  • 63. MEAN DEVIATION
  • 64. VARIANCE AND STANDARD DEVIATION
  • 65. COEFFICIENT OF VARIATION
  • 66. [Figure; no text in the transcript]
  • 67. BOX PLOTS
  • 68. N/C: SETTING THE SCENE: A survey of schools found that the most widely used method to get out of going to school is "not today mom, I have a headache". Based on the survey of 2000 students, it was found to be used an average of 100 times a year, with an SD of 15. Can we determine what proportion of students use this reason at least 115 times a year, fewer than 70 times a year, or between 106 and 112 times annually?
  • 69. STANDARD SCORE
  • 70. NORMAL DISTRIBUTION: Assumption for statistical tests. Mean and variance independent. All natural phenomena. Distribution of means.
  • 71 to 72. [Figures; no text in the transcript]
  • 73. NORMAL CURVE: Mean, median, mode. Symmetrical. Approaches the x axis asymptotically.
  • 74. HOW MANY STUDENTS 115 TIMES?
  • 75. FEWER THAN 70 TIMES IN A YEAR?
  • 76. BETWEEN 106 AND 112 TIMES?
  • 77. [Figure; no text in the transcript]
  • 78. Winifred Castle: "Most researchers use statistics the way a drunkard uses a lamp-post, more for support than illumination."
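The three questions posed in slide 68 (and repeated in slides 74 to 76) can be worked through with standard scores. The sketch below assumes that the number of uses per year is normally distributed with the stated mean of 100 and SD of 15, and it treats the counts as continuous; the slides' own worked answers are not in the transcript, so this may not match the author's method. `NormalDist` is part of Python's standard `statistics` module (Python 3.8 or later), and the helper `z_score` is a name chosen here for illustration.

```python
from statistics import NormalDist

MEAN = 100  # average uses per year, from slide 68
SD = 15     # standard deviation, from slide 68


def z_score(x, mean=MEAN, sd=SD):
    """Standard score: how many SDs the value x lies from the mean."""
    return (x - mean) / sd


# Assumption: uses per year follow a normal distribution.
dist = NormalDist(mu=MEAN, sigma=SD)

# At least 115 times a year: area to the right of z = +1.
p_at_least_115 = 1 - dist.cdf(115)

# Fewer than 70 times a year: area to the left of z = -2.
p_fewer_than_70 = dist.cdf(70)

# Between 106 and 112 times: area between z = 0.4 and z = 0.8.
p_106_to_112 = dist.cdf(112) - dist.cdf(106)

print(f"z(115) = {z_score(115):+.1f}, P(X >= 115) = {p_at_least_115:.4f}")
print(f"z(70)  = {z_score(70):+.1f}, P(X < 70)   = {p_fewer_than_70:.4f}")
print(f"P(106 <= X <= 112) = {p_106_to_112:.4f}")
```

Under these assumptions the proportions come out at roughly 16%, 2% and 13% respectively. Multiplying each by the 2000 students surveyed gives an approximate head count.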