Biostatistics
Uploaded by preston-house
2006
Biostatistics
Primary MMed (Anaesthesia)
Writing a study protocol: introduction
research question
current knowledge
research hypothesis
research objective
Writing a study protocol: methodology
type of study
experiments: prospective (randomized, control)
observational: prospective, retrospective, audit
Writing a study protocol: methodology
sample size calculation is guided by the expected difference in either proportion or numerical data, the standard deviation of known data, the α and β values, and the intended power of the study
plan to recruit more subjects in case of drop-outs
the number of drop-outs affects the assumptions behind the initial sample size calculation; if too few subjects remain, the study may show no significant difference (P>0.05) even where the research hypothesis is true
Writing a study protocol: methodology
patient consent
inclusion criteria
exclusion criteria, restriction
process of randomization
procedure (how to go about carrying out the data collection)
control group
test group(s)
Writing a study protocol: methodology
monitoring of the patients during the period of the procedure
routine: heart rate, blood pressure, oxygen saturation
data / observations
side effects of test protocol
rescue therapy
safety of patient
treatment plan
criteria for withdrawal from study
Writing a study protocol: statistical analysis
the data will be subjected to a test for normality
state how normally and non-normally distributed data will be treated
normal distribution: expressed as mean and standard deviation, with 95% confidence interval
t-test for comparison of means obtained from 2 groups of data
analysis of variance test for comparison of means obtained from more than 2 groups
χ² test for discrete data
Writing a study protocol: statistical analysis
non-normal distribution: expressed as median and range, with limits of 25th and 75th percentiles
Mann-Whitney U test for analysis of data from 2 groups
Kruskal-Wallis test for analysis of data from more than 2 groups
state the P value and decide on the significance level; conventionally it is P<0.05
submit protocol for ethics committee approval
The null hypothesis, H0
Null hypothesis
study hypothesis
an investigator conducting a study usually has a theory in mind
however, it is very difficult to prove the hypothesis
it is simpler to disprove a hypothesis than to prove it
null hypothesis
the differences observed are not due to exposure to the factor, but arise by chance
always phrased in the negative, which is why it is termed null
Types of research
Longitudinal studies
Cross-sectional studies
Types of research - longitudinal study
investigates a process over time
the effect of external factors on human subjects
3 types: clinical trial, cohort study, case-control study
prospective or retrospective studies
in prospective studies, subjects are grouped according to ‘exposure’ to some factor
in retrospective studies, subjects are grouped according to outcome; the ‘exposure’ is then determined retrospectively
Types of research - cross-sectional study
describes a phenomenon fixed in time
description of staging system for cancer
laboratory studies of biological processes
Comparison
longitudinal studies
prospective studies: grouped according to “exposure”; randomised, non-randomised, observational studies
retrospective studies: grouped according to outcome, then determine “exposure”
cross sectional studies
disease description, diagnosis and staging, abnormal ranges, disease severity, disease processes
Randomised clinical trial
randomisation
is a procedure in which the play of chance enters into the assignment of a subject to the alternatives (control and test groups) under investigation, so that the assignment cannot be predicted in advance
tends to produce study groups comparable in unknown as well as known factors likely to influence outcome apart from the actual treatment being given itself
guarantees that the probabilities obtained from statistical tests will be valid
Randomised clinical trial - designs
parallel designs
one group receives the test treatment, and one group the control
cross-over designs
the subjects receive both the test and the control treatments in a randomised order
each subject acts as own control, allowing paired or matched analysis, and provides an estimate of the difference between test and control
useful in chronic diseases that remain stable over time, such as diabetes and asthma, where the purpose of treatment is palliation, not cure
[Figure: designs of a randomised clinical trial. A parallel design clinical trial: randomisation, then test or control, then assessment. A two period cross-over design clinical trial: randomisation, then test followed by control or control followed by test, with assessments.]
Problems of cross-over design
cross-over effect
possibility that the effect of the particular treatment used in the first period will carry over to the second period, and may interfere with how the treatment scheduled for the second period will act, thus affecting the final comparison between the two treatments
to allow for this possibility, a washout period, in which no treatment is given, should be included between successive treatment periods
Problems of cross-over design
disease may not remain stable over the trial period
subject drop-outs: more will occur in cross-over trials than in parallel design trials, due to the extended treatment period
Non randomised studies
historical controls
when randomisation is not possible
e.g. survival of patients who received heart transplants and patients who did not
problem of selection bias already occurring: those who did not receive transplants may be more ill or may not have satisfied the criteria
Non randomised studies
pre-test–post-test studies
a group of individuals are measured, then subjected to treatment or intervention, and then measured again
purpose of the study is to study the size of the effect of treatment or intervention (e.g. campaign)
major problem is ascribing the change in measurement to the treatment since other factors may also have changed in that interval
Cohort
a cohort is a component of a population identified so that its characteristics can be ascertained as it ages through time
designated groups of persons either born in a certain year or traced over a period of time (e.g. all who ever worked in a factory)
Cohort studies
cohort study
one in which subsets of a defined population can be identified who have been exposed (or will be exposed) to a factor which may influence the probability of occurrence of an outcome (given disease)
usually confined to studies determining and investigating aetiological factors and do not allocate the equivalent of treatments
also for post-marketing surveillance, comparing adverse effects of new drug with alternative treatment
may be referred to as follow-up, longitudinal or prospective study
often termed observational studies, since they observe the progress of individuals over time
[Figure: progress of a cohort study. Labels: population (with disease, without disease); exposed and not exposed groups; each group followed over time to an outcome of with disease or without disease.]
Problems with cohort study
exposure to factor may be by natural selection or up to the individuals’ decision
bias may influence the measure of interest
other associated factors may also influence the measure of interest
e.g. cohort study of cardiovascular risk in men sterilised by vasectomy
e.g. incidence of breast cancer with consumption of alcohol
Problems with cohort study
size of the study
the required size of a cohort study depends not only on the size of the risk being investigated but also the incidence of the particular condition under investigation
cohort studies not suitable for investigating aetiological factors in rare diseases
Problems with cohort study
problems with interpretation
bias in the pool of study subjects: when the cohort is made up of employed individuals, the risk of dying in the first few years of follow up is less than in the general population; this is known as the healthy worker effect, since people who are sick are less likely to be employed
incomplete representation, e.g. people who respond (to questionnaires) and people who are lost to follow up
Case-control studies
case-control study
also known as a case-reference study or retrospective study
starts with identification of persons with the disease (or other outcome variable) of interest, and a suitable control group of persons without the disease
the relationship of a risk factor to the disease is examined by comparing the two groups with regard to how frequently the risk factor is present
[Figure: case-control studies. Subjects with disease (cases) and subjects without disease (controls) are each classified, looking back in time (retrospective), as exposed or not exposed.]
Case-control studies
designs
matched design: control subjects can be chosen to match individual cases for certain important variables such as age, gender and weight
unmatched design: controls can be a sample from a suitable non-diseased population
Case-control studies
selection of controls
not required that the control group are alike in every aspect to the cases, usually 2 or 3 variables which presumably will influence outcome are matched, such as age, gender, social class
main purpose is to control for confounding variables that might influence the case-control comparison
confounding arises when the effects of two processes are not separated, e.g. disease related to 2 exposure factors
Case-control studies selection of controls
matching can be wasteful if matching criteria leads to many available controls being discarded because they fail the matching criteria
if controls are too closely matched to their respective cases, the relative risk may be underestimated
Limitations of case-control studies
ascertainment of exposure relies on previously recorded data or on memory, and it is difficult to ensure lack of bias between the cases and controls
the cases may be more motivated to recall possible risk factors
difficulty with selection of a suitable control group is a major source of criticism
Cross-sectional studies
subjects are included without reference to either their exposure or their disease
usually deals with exposures that do not change, such as blood type, or chronic smoking habit
resembles a case-control study except that the number of cases are not known in advance, but are simply the prevalent cases at the time of survey
Cross-sectional studies
sampling methods
quota sample
to ensure that the sample is representative of the general population in, say, age, gender and social class structure
not recommended in medical research
grab or convenience sample
only subjects who are available to the interviewer can be questioned
Cross-sectional studies
problems
bias in the type of responders and non-responders
exposures have to be determined by a retrospective history
Calculation of sample size
Calculation of sample size
consider: control group response, the anticipated benefit (of the treatment), significance level, power
Control group response
it is first necessary to postulate the response of the control group patients
denoted by π1, to distinguish it from the value that will be obtained from the trial, denoted p1
experience of other studies may provide π1
The anticipated benefit
it is also necessary to postulate the size of the anticipated response in treatment group patients, denoted by π2, to distinguish it from the value that will be obtained from the trial, denoted p2
anticipated benefit δ = π2 - π1
Type I error
the error of incorrectly rejecting the null hypothesis when it (the hypothesis) is true
the error of concluding that the differences seen in the results are significant when in fact they are not
wrongly accepting differences in the results as significant when there is no difference
designated α
equivalent to the false positive rate (1 - specificity)
One or two-sided
null hypothesis (H0) is that there is no significant difference, and that any difference seen has occurred by chance
no assumption about the direction of change or variation
alternate hypothesis states that the difference is real, further that it is due to some specific factor
where no direction of change is specified, both ends of the distribution curve are important, and the test of significance is two-sided, or two-tailed
where the direction is specified, then only one tail of the curve is relevant, and the test of significance is one-sided, or one-tailed
One or two-sided
the critical value is the value of a test statistic at which we decide to accept or reject H0
the critical value for a one-sided test at significance p will be equivalent to that for a two-sided test at 2p
one-sided p = 0.025 / two-sided p = 0.05
thus, it is tempting to use one-sided tests as the significance is greater, but the decision should be made before the data are collected, not after the direction of change is observed, and should be clearly stated when presenting results
[Figure: one or two-sided. Frequency curve with μ = 0; the area beyond z = 1.96σ in one tail is P<2.5% (one-sided), and both tails together give P<5% (two-sided).]
Significance level
the significance level, α, is the probability of making a Type I error and is set before the test is carried out
in most cases, it will be two-sided
Significance level
denoted by the letter P, represents the probability of the observed value being due solely to chance variation
can be interpreted as the probability of obtaining the observed difference, or one more extreme, if the null hypothesis is true
the smaller the value of P, the less likely the variation is to be due to chance and the stronger the evidence for rejecting the null hypothesis
Significance level
most scientific work, by accepted convention, rejects the null hypothesis at P < 0.05
probability of the observed value being due solely to chance is < 0.05 (or < 1 in 20)
this means that we shall reject the null hypothesis on 5% of occasions when it is in fact true, i.e. there was simply a chance variation and the 2 treatments are equally effective
Significance levels
α: the probability that a random variable (Normally distributed with mean = 0 and standard deviation = 1) will be greater than z or less than -z
z: the value on the horizontal axis of a Normal distribution corresponding to the probability α
[Figure: standard Normal curve with an area of α/2 in each tail, beyond –zα and +zα]
Significance level
if α is 0.05, then the corresponding z is 1.96
to link z with the corresponding α, we write z0.05 = 1.96
zα is the value along the axis of a Normal distribution
thus 0.05/2 = 0.025 is to the left of z = –1.96 and 0.025 is to the right of z = +1.96
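The link between α and z can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the function names `z_two_sided` and `z_one_sided` are illustrative and not from the slides.

```python
from statistics import NormalDist

def z_two_sided(alpha: float) -> float:
    """z such that P(Z > z or Z < -z) = alpha for a standard Normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_one_sided(alpha: float) -> float:
    """z such that P(Z > z) = alpha for a standard Normal Z."""
    return NormalDist().inv_cdf(1 - alpha)

# Two-sided alpha = 0.05 leaves 0.025 in each tail.
print(round(z_two_sided(0.05), 2))   # 1.96
# A one-sided test at p has the same critical value as a two-sided test at 2p.
print(round(z_one_sided(0.025), 2))  # 1.96
```
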
Type II error
error in incorrectly accepting the null hypothesis of no difference between treatments, when it (the hypothesis) is in fact false (and should be rejected)
accepting the null hypothesis when there should be a significant difference in the results
accepting that the differences seen in the results are not statistically significant and making the conclusion P>0.05
the probability of making a type II error is designated β
equivalent to the false negative rate (1 - sensitivity)
Type II error
2 main factors produce β error
chance alone: an unusual data sample which does not support a difference; statistical methods can produce incorrect conclusions
too small a sample size: the smaller n, the greater must be the real difference before statistical difference may be shown
β is usually set at a value of 0.2 or 20%
Power
the probability that a study can detect a difference, when a real difference actually exists, is termed the statistical power of the study
it is the probability of rejecting the null hypothesis when it (the hypothesis) is false
first decide how much false negative or type II error (β) rate is reasonable
power equals 1-β
the higher the power of the study, the smaller the difference which may be detected
Statistical Testing Errors
Significance test result | Real difference: No | Real difference: Yes
No effect seen | 1-α | Type II error (β-error)
Effect seen | Type I error (α-error) | Power (1-β)
Calculation of sample size
calculations depend on a function (zα + z2β)², where zα and z2β are the ordinates for the normal distribution
the values of z for the corresponding α and 2β are read off from the Table of Normal distribution
Table to assist in sample size calculations, α = 0.05
β | Power (1-β) | z2β | zα | (zα + z2β)²
0.3 | 0.7 | 0.524 | 1.960 | 6.172
0.2 | 0.8 | 0.842 | 1.960 | 7.849
0.1 | 0.9 | 1.282 | 1.960 | 10.507
Calculation of sample size
comparison of proportions
wish to detect a difference in proportions δ = π2 - π1
e.g. response rate to placebo and treatment drug
for χ² test, the number in each group should be at least
$m = \frac{(z_{\alpha} + z_{2\beta})^2 \{\pi_1(1-\pi_1) + \pi_2(1-\pi_2)\}}{\delta^2}$
Calculation of sample size
comparison of means (unpaired data)
wish to detect a difference in means δ = μ2 - μ1
the number in each group should be at least
$m = \frac{2 (z_{\alpha} + z_{2\beta})^2 \sigma^2}{\delta^2}$
where σ is presumed to be the same with both drugs
Calculation of sample size
comparison of means (paired data)
with cross over trial the number in each group should be at least
$m = \frac{(z_{\alpha} + z_{2\beta})^2 \sigma_w^2}{\delta^2}$
where σw is the standard deviation of the paired difference between treatments
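The three sample size formulae can be sketched in code as follows. This is an illustration only: it assumes a two-sided α and that z2β is the Normal value leaving β in one tail (which reproduces the tabulated 0.842 for β = 0.2). The function names and the example inputs (response rates of 25% and 50%, a difference of 5 with SD 10) are hypothetical, not taken from the slides.

```python
import math
from statistics import NormalDist

def z_factor(alpha=0.05, beta=0.2):
    """(z_alpha + z_2beta)^2 with two-sided alpha and one-tailed beta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_2beta = NormalDist().inv_cdf(1 - beta)
    return (z_alpha + z_2beta) ** 2

def n_for_proportions(pi1, pi2, alpha=0.05, beta=0.2):
    """Number per group to detect delta = pi2 - pi1 between two proportions."""
    delta = pi2 - pi1
    m = z_factor(alpha, beta) * (pi1 * (1 - pi1) + pi2 * (1 - pi2)) / delta ** 2
    return math.ceil(m)

def n_for_means(delta, sigma, alpha=0.05, beta=0.2):
    """Number per group to detect a difference delta in unpaired means."""
    return math.ceil(2 * z_factor(alpha, beta) * sigma ** 2 / delta ** 2)

def n_for_paired_means(delta, sigma_w, alpha=0.05, beta=0.2):
    """Number needed for paired data, sigma_w = SD of the paired differences."""
    return math.ceil(z_factor(alpha, beta) * sigma_w ** 2 / delta ** 2)

print(round(z_factor(0.05, 0.2), 3))   # 7.849, as in the table for 80% power
print(n_for_proportions(0.25, 0.50))   # 55
print(n_for_means(5, 10))              # 63
print(n_for_paired_means(5, 10))       # 32
```

Results are rounded up because a fraction of a subject cannot be recruited; extra subjects for anticipated drop-outs would be added on top.
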
Randomisation
Methods of randomisation
simple randomisation
either manually by tossing a coin or throwing a six-sided die, or from a table of random numbers
a good method in large trials
but does not guarantee equal numbers of patients in each of the two groups
in smaller trials there is a high chance of getting notable imbalance between the groups
groups of different sizes
presence of allocation bias
Methods of randomisation
random permuted block (RPB) randomisation
a method for ensuring that group sizes never get too far out of balance, so that markedly different numbers are not assigned to each study group
combinations (e.g. AABB, ABBA) obtained from blocked randomisation can be assigned numbers (say 1-6); the sequence of the combinations can then be dictated by numbers from a table of random numbers
a potential problem with the method is that if the block length becomes known, the allocation is predictable and selection bias can arise
randomly varying the block length can help
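A minimal sketch of random permuted block randomisation with a fixed block of two A and two B allocations. It uses a pseudo-random generator in place of a table of random numbers; the function name `permuted_block_sequence` and the seed are illustrative assumptions.

```python
import random
from itertools import permutations

def permuted_block_sequence(n_blocks, block=("A", "A", "B", "B"), seed=None):
    """Build an allocation list from randomly chosen arrangements of one block."""
    rng = random.Random(seed)
    # The distinct orderings of AABB: AABB, ABAB, ABBA, BAAB, BABA, BBAA.
    arrangements = sorted(set(permutations(block)))
    sequence = []
    for _ in range(n_blocks):
        sequence.extend(rng.choice(arrangements))
    return sequence

allocation = permuted_block_sequence(n_blocks=5, seed=1)
print("".join(allocation))
# After every complete block of 4 the two groups are exactly balanced.
print(allocation.count("A"), allocation.count("B"))  # 10 10
```

Because every block is balanced, the group sizes can never differ by more than two at any point in this sketch, which is also why a known block length makes the last allocations of a block predictable.
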
Methods of randomisation
stratified randomisation
if there are important prognostic factors which, if they were distributed unequally between the treatment groups, would give rise to a serious bias, then it may be prudent to intervene in the randomization process to ensure balance between these factors, i.e. balanced treatment allocation for patients within each group or centre
in stratification, RPBs are used within each stratum defined by the prognostic factors
stratification can be cumbersome if there are too many prognostic factors; minimization is a method which can provide balance in a less cumbersome way
Methods of randomisation
unequal randomization (1:2, 2:1 for sample sizes of test : control groups)
although maximum power can be obtained when the allocations to the groups are in the ratio 1:1, the loss in power is slight if the ratio departs only slightly from 1
there can be practical advantages to unequal allocation, which might be worth considering in some applications
Methods of randomisation
carrying out randomisation
the randomisation list should be prepared and held by a person not involved in the investigation and not the investigator determining patient eligibility
this person serves as a check for the trial
when a patient is confirmed as eligible for the trial, the randomisation is then revealed over the telephone or by opening sequentially numbered envelopes
Data description
Types of data
qualitative data (varying by description)
nominal data
ordered categorical or ranked data or ordinal data
numerical or quantitative data (varying by number)
numerical discrete data (differ only by fixed amount)
numerical continuous data (differ by any amount)
Qualitative data
nominal data
data that one can name or describe
not measured but simply counted
expressed in number (or frequency) or percentage (or relative frequency)
2 or more groups of observation
2 groups: gender of patients, did or did not get exposure to factor
> 2 groups: blood groups, racial groups, anaesthetists, surgeons
Qualitative data
ordered categorical or ranked data
if there are more than two categories of classification, it may be possible to order them in some way or assign ranks to categories to facilitate statistical analysis
e.g. ordinal data: mild, moderate, severe
[Figure: dot plot of postoperative bleeding (ml) for vascular surgeons A (1), B (2) and C (3)]
Quantitative data - numerical discrete
data consists of counts
number of general anaesthetics and regional anaesthetics performed in this hospital in a year
number of anaesthetic trainees passing the Final MMed in the past 5 years
Quantitative data - numerical continuous
such data are measurements that can take any value in a given range
age, estimated blood volume, hourly blood loss
interval scale
position of zero is arbitrary; a difference between two measurements has meaning, but not their ratio
meaning may change if a ratio or percentage is applied to a different scale, e.g. 10% increase in body temperature in Celsius and Fahrenheit scales
Quantitative data - numerical continuous
ratio scale
value of zero has real meaning, negatives are invalid
meaning does not change when ratio or percentage is applied to different units of measure, e.g. 10% increase in body weight in kilograms or pounds
continuous data is often dichotomised to make nominal data and then ordered or ranked for statistical analysis
e.g. diastolic blood pressure, which is continuous, can be converted to hypertension (>90 mmHg) or normotension (≤90 mmHg)
Summary of measurements
nominal: name, birthplace
ordinal: mild | moderate | severe
interval: meaningful distance between values
ratio: allows study of absolute magnitude
Summarising data
Summarising data for the study
measurement of location or central tendency
mean or arithmetic average: interval or ratio data of a quantitative variable
median and quartiles: interval or ratio, (± ordinal)
mode: nominal, ordinal, interval or ratio
measurement of dispersion or variability
range or interquartile range
standard deviation
measures of symmetry
Measurement of location or central tendency
Mean or average
the mean ($\bar{x}$, pronounced x-bar) or average of n observations is the sum of the observations, Σx, divided by their number, n (arithmetic average)
$\bar{x} = \frac{\text{sum of all sample values}}{\text{size of sample}} = \frac{\sum x}{n}$
advantage: the mean uses all the data values, and is statistically efficient
disadvantage: vulnerable to outliers, i.e. single observations (not erroneous measurements) which, if excluded from the calculations, have a noticeable influence on the results
Median and quartiles
lower, median and upper quartiles divide the data into 4 equal parts
approximately equal numbers of observations in the 4 sections (equal only when n is divisible by 4)
estimation of quartiles: the data are first ordered from smallest to largest, and the quartiles are then found by counting upwards through the observations
Median
median or middle quartile
value above and below which half the measurements fall
for odd number of observations, is the observation at the centre of the ordering
for even number of observations, is the average of the ‘middle’ two observations
advantage: not affected by outliers
disadvantage: not statistically efficient as it does not make use of all the individual data values
Lower and upper quartiles
a practical method of calculating the lower or upper quartiles is by the stem-and-leaf plot
observations: 10,13,20,20,22,22,23,24,25,25,27,28,30,30,30,31,31,32,32,33,34,35,35,36,37,38,38,39,39,41,41,41,42,42,43,43,44,44,46,47,48,50,50,51,52,54
count | stem | leaves
2 | 1 | 03
10 | 2 | 0022345578
17 | 3 | 00011223455678899
12 | 4 | 111223344678
5 | 5 | 00124
total count: 46
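As a cross-check on the stem-and-leaf plot, the median and quartiles of the 46 observations can be computed directly. This sketch uses Python's `statistics` module; note that `statistics.quantiles` applies one particular interpolation convention (an assumption of this example), so its quartiles may differ slightly from values obtained by counting along the plot.

```python
from statistics import median, quantiles

observations = [
    10, 13, 20, 20, 22, 22, 23, 24, 25, 25, 27, 28,
    30, 30, 30, 31, 31, 32, 32, 33, 34, 35, 35, 36, 37, 38, 38, 39, 39,
    41, 41, 41, 42, 42, 43, 43, 44, 44, 46, 47, 48,
    50, 50, 51, 52, 54,
]

print(len(observations))     # 46
# Even number of observations: average of the 23rd and 24th ordered values.
print(median(observations))  # 35.5

# Lower quartile, median and upper quartile (default 'exclusive' method).
lower_q, middle_q, upper_q = quantiles(observations, n=4)
print(lower_q, middle_q, upper_q)
```
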
Percentile
25th percentile: the value above which 75 percent of the observed cases fall and below which 25 percent of the observed cases fall
50th percentile: the median, the value above and below which half of the observed values of a variable fall
75th percentile: the value above which 25 percent of the observed values of a variable fall and below which 75 percent of the observed values of a variable fall
Mode
this is the value that occurs most frequently, or if the data are grouped, the grouping with the highest frequency
not much use in statistical analysis as its value depends on the accuracy with which the data are measured
bimodal distribution describes a distribution with two peaks in it
Measure of dispersion or variability
Measure of dispersion or variability
range: difference between minimum & maximum values
interquartile range: (largest value of 3rd quartile) - (largest value of 1st quartile)
standard deviation
degrees of freedom
Range or interquartile range
range: is given as the smallest and the largest observations; vulnerable to outliers
interquartile range: the distance between the 25th and 75th percentile; not vulnerable to outliers
displayed as box-whisker plots
[Figure: box-whisker plots of postoperative bleeding (ml) for surgeons A, B and C]
Standard deviation, s
a measure of dispersion, i.e. how far observations are away from their mean, often abbreviated as SD
expressed in the same units of measurement as the observations
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
the value $\sum (x - \bar{x})^2$ is interpreted as: from each x value subtract the mean $\bar{x}$, square this difference, then add each of the n squared differences
n-1 (or the degrees of freedom) compensates for small sample sizes (n < 30) and the higher probability of falling outside the SD
Standard deviation, s
s reflects the variability in the data
if x’s are widely scattered about $\bar{x}$, then s would be large
variance: a measure of the dispersion of values about the mean; the square of the standard deviation
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
coefficient of variation: expresses the SD as a percentage of the sample mean
$\text{c.v.} = \frac{s}{\bar{x}} \times 100\%$
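The calculation of s, the variance and the coefficient of variation can be followed step by step in a short sketch. The data values below are made up for illustration, and `sample_sd` is a hypothetical helper; the standard-library `statistics.stdev` is used only as a cross-check since it also divides by n - 1.

```python
import math
from statistics import stdev

def sample_sd(values):
    """Standard deviation with n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    # From each x subtract the mean, square the difference, add them up.
    sum_sq = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]          # illustrative values only
s = sample_sd(data)
variance = s ** 2                         # square of the standard deviation
cv = s / (sum(data) / len(data)) * 100    # SD as a percentage of the mean

print(round(s, 3), round(variance, 3), round(cv, 1))
print(abs(s - stdev(data)) < 1e-12)       # agrees with the library routine
```
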
Standard deviation, s
for data with a normal distribution, there is
a 68.26% chance that the actual value will be within 1 standard deviation above or 1 standard deviation below the mean value
a 95.45% chance that the actual value will be within 2 standard deviations
a 99.7% chance that the actual value will be within 3 standard deviations
[Figure: normal curve about the mean; each side of the mean holds 34.13% of values within 1 SD, 47.73% within 2 SD and 49.85% within 3 SD]
Measures of symmetry
symmetric distribution: if the distribution is symmetric then the median and mean will be close; data expressed as mean and standard deviation
skewed distribution: a distribution is skewed to the right (left) if the longer tail is to the right (left); data expressed as median and interquartile range
mean and standard deviation are sensitive to the skewness
Skewness
an index of the degree to which a distribution is not symmetric, or to which the tail of the distribution is skewed or extends to the left or right
calculation of skewness, sk
$sk = \frac{3(\text{mean} - \text{median})}{\text{SD}}$
normality is supported if mean and median are close
skewness is used, along with the kurtosis statistic, to assess if a variable is normally distributed
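The skewness index sk = 3(mean - median)/SD is simple to compute. A minimal sketch follows, with the function name `pearson_skewness` and both data sets chosen purely for illustration.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """sk = 3 * (mean - median) / SD, using the sample SD."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]   # one large value gives a long right tail

print(pearson_skewness(symmetric))          # 0.0, mean equals median
print(pearson_skewness(right_skewed) > 0)   # True, mean is pulled above the median
```
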
Skewness
the normal distribution is symmetric, and has a skewness value of zero
a distribution with a significant positive skewness has a long right tail
a distribution with a significant negative skewness has a long left tail
[Figure: distributions with skewness +1 and -1]
Kurtosis
a measure of the extent to which observations are clustered in the tails
kurtosis can be used, along with the skewness statistic, to assess whether a variable is normally distributed
for samples from a normal distribution, the values of kurtosis will fluctuate around 0
Kurtosis
for a normal distribution, the value of the kurtosis statistic is 0
if a variable has a negative kurtosis, its distribution has lighter tails than a normal distribution
if a variable has a positive kurtosis, a larger proportion of cases fall into the tails of the distribution than into those of a normal distribution
[Figure: distributions with positive (+ve) and negative (-ve) kurtosis]
Normal probability plot
normality of observations can be confirmed from the Normal probability plot
[Figure: Normal probability plot of observation against Normal ordinates, z]
Normality
bell-shaped distribution of a continuous variable
symmetrical about its mean
median and mean will be close
skewness value is zero, where $sk = \frac{3(\text{mean} - \text{median})}{\text{SD}}$
kurtosis value is zero
normal probability plot
Mean or median?
mean and median convey different impressions of the location of the data
both give useful information
if the distribution is symmetric, the mean is a better summary statistic
if the distribution is skewed, the median is less influenced by the tails
for nominal or ordered categorical data, the mean is the proportion of each group
Generating data from sample to population
Populations and samples
a population is a theoretical concept used to describe an entire group (any collection of people, objects, events, or observations)
this is usually too large and cumbersome to study, so investigation is usually restricted to one or more samples drawn from the study population
Populations and samples
samples are taken from populations to provide estimates of the population parameters
the purpose of summarising the behaviour of a particular group is usually to draw some inference about a wider population of which the group is a sample, such as determining the reference normal range
statistics describe the sample
parameters describe characteristics of the population
Populations and samples
to allow true inferences about the study population from a sample there are a number of conditions
the study population must be clearly defined
every individual in the population must have an equal chance of being included in the sample, i.e. a random sample
random does not refer to the sample, but the manner in which it was selected
the opposite of random sampling is purposive sampling, e.g. every 2nd patient
Sampling errors
sampling errors
the smaller the sample size, the greater the error
the greater the variability of the observations, the greater the error
non-sampling errors
these do not necessarily decrease as the sample size increases
result in bias or systematic distortion of the results
Central Limit Theorem
definition: even when the variable is not normally distributed the sample mean will tend to be normally distributed
if random samples of n measurements are repeatedly drawn from a population with a finite mean μ, and a standard deviation σ, then when n is large, the relative frequency histogram for the (repeated) sample means will tend to be distributed normally
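The Central Limit Theorem can be illustrated by simulation. The sketch below is an assumption-laden example rather than part of the slides: it draws repeated samples from a strongly right-skewed exponential population (mean 1, standard deviation 1) and summarises the resulting sample means; the sample size, number of repeats and seed are arbitrary.

```python
import random
from statistics import mean, median, stdev

def sample_means(n, repeats, seed=0):
    """Means of `repeats` random samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(repeats)]

means = sample_means(n=50, repeats=2000)

# The sample means centre on the population mean of 1 ...
print(round(mean(means), 2))
# ... with spread close to sigma / sqrt(n) = 1 / sqrt(50), about 0.14 ...
print(round(stdev(means), 2))
# ... and mean and median of the sample means are close, as for a symmetric distribution.
print(round(mean(means) - median(means), 3))
```
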
Normal distribution
Normal distribution
bell-shaped distribution of a continuous variable
symmetrical about its mean
described by population mean, μ, and population standard deviation, σ
bell is tall and narrow for small standard deviation, and short and wide for large standard deviation
a skewed distribution can be transformed into Normal distribution shape by taking the logarithm of the measurements or working with the square root of the observations
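[Figure: Normal distribution curve with mean μ and standard deviation σ]
[Figure: Standard Normal Distribution. Frequency curve with μ = 0 and σ = 1; 95% of the area lies within 1.96σ of the mean, with 2.5% in each tail.]
[Figure: IQ curve, scoring of fine motor skills. Frequency curve with bands labelled severely impaired, impaired, borderline, average and above average, marked at -2 SD, -1.5 SD, -1 SD, +1 SD and +2 SD; 68.26% of scores lie within 1 SD of the mean.]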
Normal distribution
mathematical property of the Normal distribution
68.26% of the distribution lies between μ ± 1σ
95.45% of the distribution lies between μ ± 2σ
exactly 95% lies between μ – 1.96σ and μ + 1.96σ
99.7% of the distribution lies between μ ± 3σ
exactly 99% lies between μ – 2.58σ and μ + 2.58σ
[Figure: Normal curve with central area 1-α between –1.96σ and +1.96σ, and α/2 in each tail]
Normal distribution
in practice, the parameters μ and σ must be estimated from the sample data
for this purpose, a random sample from the population is first taken
if the sample is taken from a Normal distribution, and provided that the sample is not too small, then similarly approximately 95% of the collected data will be within $\bar{x} - 1.96s$ to $\bar{x} + 1.96s$
1.96 is the 5% percentage point of the normal distribution
2.58 is the 1% percentage point of the normal distribution
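The percentage points quoted for the Normal distribution can be verified with the standard-library `statistics.NormalDist`. This is only a numerical check; the helper name `coverage` is illustrative. The computed values agree with the quoted figures to within rounding in the last digit.

```python
from statistics import NormalDist

def coverage(k: float) -> float:
    """Proportion of a Normal distribution lying within k standard deviations of the mean."""
    z = NormalDist()  # standard Normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 1.96, 2, 2.58, 3):
    print(k, round(coverage(k) * 100, 2))
# 1 SD covers about 68.27%, 1.96 SD about 95%, 2 SD about 95.45%,
# 2.58 SD about 99%, and 3 SD about 99.73%.
```
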
Standard error SD(x)/√n
the standard deviation of the sampling distribution for a statistic
can apply to: mean, difference between means, skewness, kurtosis, Pearson correlation, regression coefficient, proportion, difference between proportions
a measure of how much the value of a test statistic may vary from sample to sample
Standard error of the mean standard deviation of the mean, SD(x̄), or standard error of the mean, SE(x̄) or SE
defines the precision with which a mean is estimated
SE(x̄) = SD(x)/√n = s/√n
Exercise: Calculation of SE Mean (x̄) alanine aminopeptidase value for 25 subjects is 1.0 U; SD is 0.3 U
SE = SD/√n = 0.3/√25 = 0.3/5 = 0.06 U
Comparing s and SE standard deviation, s, is a measure of the variability between individuals with respect to the measurement under consideration; it describes the sample
standard error of the mean is a measure of the uncertainty in the sample statistic; it always refers to an estimate of a population parameter
the larger the sample size, the smaller the standard error of the mean
Standard error of skewness a measure of the variability of the skewness
statistic examine how far it is from zero by dividing the
measure of skewness by its standard error the larger the absolute value of this
quotient, the less reasonable it is to assume that the variable comes from a distribution with zero skewness, such as the normal distribution
Standard error of kurtosis a measure of the variability of the kurtosis
statistic examine how far it is from zero by dividing the
measure of kurtosis by its standard error (SE Kurt) the larger the absolute value of this
quotient, the less reasonable it is to assume that the variable comes from a distribution with zero kurtosis, such as the normal distribution.
Confidence interval gives an estimated range of values which is
likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data
if independent samples are taken repeatedly from the same population, and a confidence interval calculated for each sample, then a certain percentage (confidence level) of the intervals will include the unknown population parameter
confidence intervals are usually calculated so that this percentage is 95%, but 90%, 99%, 99.9% confidence intervals for the unknown parameter can be produced
Confidence interval the width of the confidence interval gives us
some idea about how uncertain we are about the unknown parameter
a very wide interval may indicate that more data should be collected before anything very definite can be said about the parameter
confidence intervals are more informative than the simple results of hypothesis tests (where H0 is rejected or not) since they provide a range of plausible values for the unknown parameter
Confidence limits (xx, yy) the lower and upper boundaries or values of a
confidence interval the values which define the range of a
confidence interval
Confidence interval for a mean define a range of values within which the
population mean μ is likely to lie that is, a range of values that is likely to
cover the true but unknown population mean value (say, 95% of time)
95% confidence interval for a large sample (>60) 95% CI = x̄ ± 1.96 × s/√n, where s/√n is SE(x̄)
[Figure: 95% confidence interval; Normal curve with limits at ±1.96σ and α/2 in each tail]
Confidence interval for a mean the upper and lower values are the 95%
confidence limits a reported CI from a particular study may or
may not include the actual population mean but if the study were to be repeated 100
times, of the 100 resulting 95% CI, we would expect 95 of these to include the population mean
Confidence interval for a mean small samples
less precise statements about population parameters can be made than with large samples
x and s will not always be necessarily close to μ and σ, respectively
sample size is already taken into account in the calculation of the standard deviation of the mean, SD(x̄), using √n, i.e. s/√n
of practical importance only when the sample size is very small (less than 15) and when the distribution in the population is extremely non-normal
Exercise: Calculation of CI Mean (x̄) alanine aminopeptidase value for 25 subjects is 1.0 U; SD is 0.3 U
Calculate 95% CI
95% CI = x̄ ± (1.96 × SE), where SE = SD/√n
= 1.0 ± (1.96 × 0.3/√25) = 1.0 ± (1.96 × 0.06) = 1.0 ± 0.1176
95% CI is from 0.8824 to 1.1176
lower and upper boundaries of 95% CI = (0.8824, 1.1176)
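The two alanine aminopeptidase exercises can be reproduced in a few lines. This is a sketch only; the function names and the `multiplier` parameter are illustrative. The multiplier defaults to the Normal value of 1.96 used in the exercise, and a t percentage point could be passed instead for a small sample.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

def ci_mean(mean, sd, n, multiplier=1.96):
    """Confidence limits for a mean: mean ± multiplier × SE."""
    se = standard_error(sd, n)
    return mean - multiplier * se, mean + multiplier * se

# Exercise values: mean 1.0 U, SD 0.3 U, n = 25
se = standard_error(0.3, 25)
lower, upper = ci_mean(1.0, 0.3, 25)
print(round(se, 4), round(lower, 4), round(upper, 4))
```

With n = 25 the t distribution has 24 degrees of freedom, for which the two-sided 5% point is about 2.064, so `ci_mean(1.0, 0.3, 25, multiplier=2.064)` gives a slightly wider interval than the exercise.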
t distribution
t distribution t distribution with (n–1) degrees of freedom
introduced by W. S. Gosset, who used the pen-name ‘Student’, and is often called Student’s t distribution
like the normal distribution, the t distribution is a symmetrical bell-shaped distribution with a mean of zero, but is more spread out, having longer tails
the exact shape of the t distribution depends on the degrees of freedom (d.f.), n–1, of the standard deviation s
degrees of freedom refers to the number of observations completely free to vary; the fewer the degrees of freedom, the more the t distribution is spread out
[Figure: Normal distribution and t-distribution, each with α/2 in either tail; critical values –zα and +zα on the Normal curve, –tα and +tα on the t curve]
Confidence interval using t distribution confidence interval is calculated using t’, the appropriate percentage point of the t distribution with (n–1) degrees of freedom
small sample CI = x̄ ± (t’ × s/√n)
for small degrees of freedom, the percentage points of the t distribution are larger in value than the corresponding percentage points of the normal distribution
because sample standard deviation s may be a poor estimate of the population σ, and when this uncertainty is taken into account, the resultant CI is wider
t-Test reflecting this increased conservatism, the critical value for the t-test is larger than that for the z-test (the t values shown correspond to 9 degrees of freedom)
Significance   t-Test   z-Test
P<0.05         2.26     1.96
P<0.01         3.25     2.58
Statistical tests
the type of test used depends upon the sample size
Non-parametric test: use (parametric equivalent)
Wilcoxon signed rank test: test of difference between paired observations (paired t test)
Wilcoxon rank sum test or Mann-Whitney U test: comparison of 2 groups (2-sample t test)
Kruskal-Wallis one-way analysis of variance: comparison of several groups (one-way analysis of variance)
Spearman rank correlation: measure of association between 2 variables (Pearson correlation)
Χ2 goodness of fit test: comparison of an observed frequency distribution with a theoretical one
Test statistics a statistic derived from sample data, used to measure the difference between the observed data and what would be expected under the null hypothesis:
z-statistics: principle of the standard normal deviate
t-statistics: small samples, with limited degrees of freedom
Χ2-statistics: categorical or qualitative variables
Normal test, z test the Normal test or z test requires that,
the sample size is large (n > 30) the population standard deviation, σ is
known the variable is assumed to be normally
distributed commonly the population σ is unknown,
however it is possible to use the sample standard deviation, s as an estimate of σ
Large sample if the sample size is large, n > 30
then the sample standard deviation, s, is considered to be an adequate estimate of the population σ
thus, the standard error of the sample mean becomes,
SE(x) = (s/√n) under these circumstances the z-test can
again be used to test the significance of the difference between the population mean μ and the sample mean x'
assuming the population fits a normal distribution
Small samples if n < 30
the sample standard deviation, s, is not an adequate estimate of the population σ
Student's t-test is employed (Gosset in 1908)
the t-distribution describes a series of curves,
dependent upon the number of degrees of freedom
as for the normal distribution, these are symmetrical with a mean μ = 0
Paired samples t-test attributes and demographic data, disease
condition are matched to make two groups of subjects as similar as possible
the two groups can be two groups of subjects in a matched
case-control study can be of the same subjects observed before
and after a treatment as in a cross-over trial this test is a statistical test of the null
hypothesis that two population means are equal
any observed differences between the groups, if statistically significant, can be attributed to the variable of interest
Paired t-test also known as the related test, or matched test
paired t = x̄ / (s/√n), d.f. = n–1, where x̄ and s are the mean and standard deviation of the n paired differences
the corresponding P value or significance level, is obtained from the Table of percentage points for Student’s t distribution
95% CI = x̄ ± (t0.05 × s/√n)
One sample t-test tests whether a sample mean is different from some specified value, μ, which need not be zero
t = (x̄ – μ) / (s/√n), d.f. = n–1
Two sample or unpaired t-test also known as independent sample, or
unrelated test, the t test is used for small samples (n<30)
for analysing data in 2 groups of subjects in a parallel group clinical trial or the unmatched case-control study
requires that the population distributions are normal
when comparing 2 means, the validity of the t test also depends on the equality of the 2 population standard deviations
Two sample or unpaired t-test the standard error for the difference between the means, SE(x̄1 – x̄2) = s√(1/n1 + 1/n2)
where s is the common standard deviation and is derived from s1 and s2 as the pooled estimate s = √{[(n1–1)s1² + (n2–1)s2²] / (n1 + n2 – 2)}
the t value is calculated as t = (x̄1 – x̄2) / [s√(1/n1 + 1/n2)], d.f. = n1 + n2 – 2
confidence interval is CI = (x̄1 – x̄2) ± (t’ × SE(x̄1 – x̄2))
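The unpaired t calculation can be sketched as below. The slide says only that the common standard deviation s is "derived from s1 and s2"; this sketch assumes the usual pooled estimate, weighting each sample variance by its degrees of freedom. The function name and the example data are hypothetical.

```python
import math
from statistics import mean, stdev

def two_sample_t(group1, group2):
    """Unpaired t statistic assuming equal population standard deviations."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    # Pooled (common) variance: assumed form, weighted by n - 1 for each group
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    s = math.sqrt(pooled_var)
    se_diff = s * math.sqrt(1 / n1 + 1 / n2)
    t = (mean(group1) - mean(group2)) / se_diff
    return t, n1 + n2 - 2, se_diff

# Hypothetical data, chosen only so the arithmetic is easy to follow
t, df, se_diff = two_sample_t([1, 2, 3], [4, 5, 6])
print(round(t, 3), df, round(se_diff, 3))
```

The returned t is then compared with the Table of the t distribution at the returned degrees of freedom, and the same `se_diff` is the SE used in the confidence interval for the difference.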
Small samples, unequal SD first approach is to seek a suitable change of
scale to remedy this, so that the t test can be used taking logarithms of the individual values
alternatives are to use a non-parametric test or to use either the Fisher-Behrens or the Welch tests
Comparison of several means
analysis of variance
Analysis of variance the t-test is generalised to more than 2 groups
by means of a technique termed analysis of variance
for this method, there are both between- and within-groups degrees of freedom the between-groups and within-groups
degrees of freedom are quoted in this order for every case
depending on the number of factor(s) included for analysis
one-way analysis of variance two-way analysis of variance
One-way analysis of variance used to compare the means of several groups one-way analysis of variance is used when the
subgroups to be compared are defined by just one factor e.g. comparison of means between different
socioeconomic classes, or different ethnic groups, or by a disease process
this method assesses how much of the overall variation in the data is attributed to differences between the group means, and comparing this with the amount attributable to differences within group
Two-way analysis of variance used when the data are classified in 2 ways e.g. by age-group and gender
balanced and unbalanced study designs
a balanced design if there are equal numbers of observations in each group
an unbalanced design if there are not equal numbers of observations in each group
multiple regression can also be applied
Non-normal distribution
non-parametric tests
Non-parametric tests non-parametric statistical tests analyse numerical data without making any assumption about the underlying normality of the distribution
particularly useful when there is obvious non-normality in a small data set which cannot be corrected with a suitable transformation
Wilcoxon signed rank test a non-parametric statistical test equivalent of the paired t test, analysing differences between paired observations
it makes no assumptions about the shapes of the distribution of the two variables
the differences between the two variables are calculated, with their signs (+/–), for each case and their absolute values ranked from smallest to largest
the test statistic is based on the sums of ranks for negative and positive differences
Wilcoxon signed (+/-) rank test procedure
calculate the difference (+/–) between the paired observations for each case
exclude any differences which are zero, then rank the remaining differences from smallest to largest, ignoring signs (i.e. + or –)
2 pairs having the same difference are given the mean of what would have been their successive ranks (i.e. 2nd & 3rd would both be ranked as 2.5)
add up the ranks of positive differences and negative differences separately
the smaller of the two totals is referred to the Table of critical values for Wilcoxon matched pairs signed rank test for P value
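The signed rank procedure can be followed step by step in code. This is a minimal sketch with hypothetical before/after scores; the helper `average_ranks` is an illustrative name, and the P value still has to be read from the table of critical values.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # 1-based position of the first tie
        count = ordered.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def wilcoxon_signed_rank(before, after):
    # Signed differences, with zero differences excluded
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Rank the absolute differences, ignoring sign
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return positive, negative, min(positive, negative), len(diffs)

# Hypothetical paired observations
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 9, 17, 13, 12]
positive, negative, smaller, n_pairs = wilcoxon_signed_rank(before, after)
print(positive, negative, smaller, n_pairs)
```

Here one pair has a zero difference and is dropped, so the smaller rank total is looked up in the table for the remaining 5 pairs.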
Wilcoxon rank sum test a non-parametric equivalent of the unpaired t
test or two-sample test procedure
rank the observations from both groups together in ascending order of magnitude
if any of the values are equal, average their ranks
add up the ranks in the group with the smaller sample size
compare this sum with the critical ranges in the Table of critical ranges for the Wilcoxon rank sum test
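The rank sum procedure can be sketched in the same way. The data are hypothetical and the helper names are illustrative; the returned sum is what would be compared with the table of critical ranges.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1
        count = ordered.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def wilcoxon_rank_sum(group_a, group_b):
    """Sum of ranks in the smaller group after ranking both groups together."""
    ranks = average_ranks(list(group_a) + list(group_b))
    ranks_a = ranks[:len(group_a)]
    ranks_b = ranks[len(group_a):]
    if len(group_a) <= len(group_b):
        return sum(ranks_a), len(group_a), len(group_b)
    return sum(ranks_b), len(group_b), len(group_a)

# Hypothetical observations from two independent groups
rank_sum, n_small, n_large = wilcoxon_rank_sum([3, 5, 8], [4, 5, 9, 10])
print(rank_sum, n_small, n_large)
```

The Mann-Whitney U statistic is a simple transformation of this rank sum (U = T – n1(n1+1)/2 for the group of size n1), which is why the two tests give comparable results.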
Mann-Whitney U a non-parametric equivalent of the unpaired t test or two-sample test
tests the null hypothesis that two independent samples come from the same population
similar approach as Wilcoxon rank sum test with entirely comparable results
Kruskal Wallis one way analysis of variance a non-parametric equivalent of the one way
analysis of variance for normal distribution
[Figure: postoperative nausea and vomiting score (1 to 5) plotted for Groups A, B and C]
Binomial distribution
Binomial distribution data (discrete variables) which can take only 0
or 1 response, such as treatment failure or treatment success, follow the binomial distribution providing the underlying population response rate does not change
proportion (p) of respondents (to treatment) in sample p = R/n , R is the number of respondents
and n is the total number in the sample potential variation about this expectation is
expressed by SD(R)
Binomial distribution p can be used to estimate the population response rate, π
the expected number of successful responses among n subjects can be estimated to be nπ
standard error of p is SE(p) = √(pq/n), where q = 1 – p is the proportion of unsuccessful responders
95% CI for π is p ± 1.96 × SE(p)
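As a worked illustration of the standard error and confidence interval for a proportion, the sketch below uses 27 responders out of 50, figures borrowed from the ibuprofen example elsewhere in this lecture. The function name is illustrative.

```python
import math

def proportion_ci(responders, n, multiplier=1.96):
    """Sample proportion, its standard error, and the CI p ± multiplier × SE(p)."""
    p = responders / n
    q = 1 - p
    se = math.sqrt(p * q / n)
    return p, se, (p - multiplier * se, p + multiplier * se)

p, se_p, (ci_low, ci_high) = proportion_ci(27, 50)
print(round(p, 2), round(se_p, 4), round(ci_low, 3), round(ci_high, 3))
```

This Normal approximation is only reasonable when n is fairly large and p is not close to 0 or 1.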
The Chi-squared test for contingency tables
Contingency table expected frequencies
(margins: condition or intervention, and yes or no outcome)
          A             B             Total
C         (xC × yA)/n   (xC × yB)/n   xC
D         (xD × yA)/n   (xD × yB)/n   xD
Total     yA            yB            n
Contingency table can be used to represent
qualitative variables discrete quantitative variables continuous quantitative variables whose
values have been grouped
The Χ2 test is used to
examine the association between discrete (categorical or qualitative) variables
test whether an observed frequency distribution differs significantly from a postulated theoretical one
test whether there is an association or dependence between the row variable and the column variable, or
test whether the distribution of individuals among the categories of one variable is independent of their distribution among the categories of the other
The Χ2 test advantage
it allows comparison of many more categories, drawn-up into a contingency table
the null hypothesis is that the row and column classifications are independent, i.e. the proportions in the categories of one factor are the same at every level of the other factor
when the table has only 2 rows or 2 columns, this is equivalent to the comparison of proportions
applicable for a parallel group clinical trial, an unmatched
case-control study, or a cross-sectional survey
The Χ2 test in 2 × 2 tables first calculate the values expected in the 4 cells of the table assuming the null hypothesis is true
expected cell value = row total × column total / total N
the Χ2 test is calculated from Χ2 = Σ (O – E)² / E, d.f. = 1 for a 2 × 2 table, where O and E are the observed and expected values in each cell
the Χ2 test is valid (Cochran 1954) when n is > 40, regardless of the expected values, and if less than 20% of the expected values are less than 5 and none are less than 1
when n is between 20 and 40, the test is only valid if all the expected values are at least 5
Yates’ correction for continuity
                     Factor A
                     Present   Absent   Total
Factor B  Present    a         c        m
          Absent     b         d        n
          Total      r         s        N
Yates’ correction for continuity allows for the fact that the data are discrete, whereas the Χ2 distribution used to calculate the P value is continuous, like the Normal distribution
Χc2 = Σ (|O – E| – ½)² / E, where | | means take the value as positive
for ease of calculation, this is equivalent to Χc2 = N(|ad – bc| – ½N)² / (mnrs)
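The cell-by-cell formula and the shortcut formula can be checked against each other. The sketch below follows the cell layout of the table above (first row a, c; second row b, d); the counts are hypothetical and the function names are illustrative.

```python
def chi_squared_2x2(a, b, c, d, yates=False):
    """Chi-squared for a 2 x 2 table laid out as rows (a, c) and (b, d)."""
    m, n = a + c, b + d            # row totals
    r, s = a + b, c + d            # column totals
    N = m + n
    observed = [a, c, b, d]
    # Expected value in each cell = row total x column total / N
    expected = [m * r / N, m * s / N, n * r / N, n * s / N]
    correction = 0.5 if yates else 0.0
    return sum((abs(o - e) - correction) ** 2 / e
               for o, e in zip(observed, expected))

def yates_shortcut(a, b, c, d):
    """Equivalent shortcut: N(|ad - bc| - N/2)^2 / (mnrs)."""
    m, n, r, s = a + c, b + d, a + b, c + d
    N = m + n
    return N * (abs(a * d - b * c) - N / 2) ** 2 / (m * n * r * s)

# Hypothetical counts: a = 12, b = 5, c = 8, d = 15
uncorrected = chi_squared_2x2(12, 5, 8, 15)
corrected = chi_squared_2x2(12, 5, 8, 15, yates=True)
print(round(uncorrected, 3), round(corrected, 3), round(yates_shortcut(12, 5, 8, 15), 3))
```

In this made-up table the uncorrected value exceeds 3.84, the 5% point of Χ2 with 1 d.f., while the corrected value does not, which shows how much the correction can matter when numbers are modest.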
Yates’ correction for continuity the use of the continuity correction is always
advisable although it has most effect when the expected numbers are small
when the numbers are very small, the Χ2 test is not a good enough approximation even with a continuity correction and the Fisher’s exact test for a 2 × 2 table should be used
Fisher’s exact test for a 2 × 2 table
                     Factor A
                     Present   Absent   Total
Factor B  Present    a         c        m
          Absent     b         d        n
          Total      r         s        N
Fisher’s exact test to determine P value
to be used if any expected count in a 2 × 2 table is less than 5, as the P value given by the Χ2 test is not strictly valid
the probability of observing the particular table is (m! n! r! s!) / (N! a! b! c! d!)
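The table probability can be computed directly from the factorials. The slide gives only the probability of the observed table; to obtain a P value, the sketch below assumes one common convention, which is to add up the probabilities of every table with the same margins that is no more likely than the one observed. Function names and counts are illustrative.

```python
from math import factorial

def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table (rows a, c and b, d) given fixed margins."""
    m, n, r, s = a + c, b + d, a + b, c + d
    N = m + n
    numerator = factorial(m) * factorial(n) * factorial(r) * factorial(s)
    denominator = (factorial(N) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return numerator / denominator

def fisher_two_sided(a, b, c, d):
    """Two-sided P: sum over tables no more probable than the observed one (assumed convention)."""
    m, n, r = a + c, b + d, a + b
    p_observed = table_probability(a, b, c, d)
    total = 0.0
    # Every value of the top-left cell compatible with the margins
    for a2 in range(max(0, r - n), min(m, r) + 1):
        b2, c2 = r - a2, m - a2
        d2 = n - b2
        p = table_probability(a2, b2, c2, d2)
        if p <= p_observed * (1 + 1e-9):
            total += p
    return total

# Hypothetical small table: a = 3, b = 1, c = 1, d = 3
print(round(table_probability(3, 1, 1, 3), 4), round(fisher_two_sided(3, 1, 1, 3), 4))
```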
Larger tables chi-squared test can also be applied to larger tables, generally called r × c tables
r denotes rows and c denotes columns
Χ2 = Σ (O – E)² / E, d.f. = (r–1)(c–1)
there is no continuity correction or exact test for contingency tables larger than 2 × 2
chi-squared test should not be applied to tables showing only proportions or percentages
Goodness of fit apart from using contingency tables, the Χ2-statistic can be used to see if any observed set of data follows a particular distribution
e.g. by calculating the expected frequencies from say a Poisson distribution, and then comparing these with the observed data
the degrees of freedom will be the number of categories compared minus 1, less one further degree of freedom for each parameter estimated from the data
Paired comparison in contingency tables McNemar’s test is a test on a 2 × 2 classification table when the difference between paired proportions is tested
in cross-over trials and matched-pair case-control studies
Correlation and linear regression
Correlation and linear regression techniques for dealing with the relationship
between 2 or more continuous variables correlation
examines linear association between 2 variables not whether one variable predicts another
variable strength of the association summarised by the
correlation coefficient regression
examines dependence of one variable, the dependent variable, on the other, the independent variable
relationship summarised by a regression equation consisting of a slope and an intercept
Correlation coefficient, r summarises the strength of the association between 2 variables
allows testing of the hypothesis that the population correlation coefficient ρ is zero, i.e. whether an apparent association between the variables could have arisen by chance
Pearson correlation coefficient when the correlation coefficient is based on the original observations
Spearman rank correlation coefficient when it is calculated from the ranks of the data
Correlation coefficient, r dimensionless quantity ranging from -1 to +1
a positive correlation is one in which both variables increase together
a negative correlation is one in which one variable increases as the other decreases
when variables are exactly linearly related, then the correlation either equals +1 or -1
unaffected by units of measurement
[Figure: Different correlations; four scatter plots of y against x: strong positive (r = 1), weak positive (r = 0.2), no correlation (r = 0), weak negative (r = -0.4)]
Correlation coefficient, r should not be used
if the relationship is non-linear
in situations where one of the variables is determined in advance
should be used with caution
in the presence of outliers
when the variables are measured over more than one distinct group e.g. disease and healthy groups
[Figure: three scatter plots of y against x illustrating r > 0, r = 0 and r < 0]
Variance explained, r2
squaring the correlation coefficient gives r2, the proportion of the variance of one variable explained by the other; multiplying by 100 expresses it as a percentage
e.g. r of 0.9, r2 = 0.81, × 100 = 81%; approximately 80% of the variance of one variable can be accounted for/explained/predicted by the other
Test of significance Pearson correlation (coefficient, r)
this is a parametric measure of the degree of association between 2 numerical variables
r = Σ(x – x̄)(y – ȳ) / √{Σ(x – x̄)² Σ(y – ȳ)²}
to test whether this is significantly different from zero, calculate SE(r) = √{(1 – r²)/(n – 2)} and t = r/SE(r), and compare this with the Table of t-distribution with n–2 degrees of freedom for P value
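The Pearson coefficient and its t-test can be sketched as follows, using a small hypothetical data set. The function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """r = sum of cross-products / sqrt(product of the two sums of squares)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_t_statistic(r, n):
    """t = r / SE(r), with SE(r) = sqrt((1 - r^2) / (n - 2)) and n - 2 d.f."""
    se_r = math.sqrt((1 - r ** 2) / (n - 2))
    return r / se_r, n - 2

# Hypothetical paired measurements
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
t_r, df_r = r_t_statistic(r, len(xs))
print(round(r, 4), round(t_r, 4), df_r)
```

With only 5 pairs (3 d.f.) the two-sided 5% point of t is 3.182, so even this fairly large r would not be judged significantly different from zero.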
Test of significance assumes both variables are random samples
and at least one has a normal distribution outlying points away from the main body of
the data suggest the variable may not have a normal distribution in this case it may be better to replace the
observation by their ranks and use the Spearman rank correlation coefficient
Test of significance Spearman rank correlation (correlation, rs)
this is a non-parametric measure of the degree of association between 2 numerical variables
the values of each variable are independently ranked and the measure is based on the differences between the pairs of ranks of the 2 variables
rs = 1 – 6Σd² / (n³ – n), where d is the difference between each pair of ranks
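A minimal sketch of the Spearman calculation, with hypothetical data and illustrative names, is given below. Note that the formula based on Σd² is exact only when there are no tied ranks; with many ties, the Pearson formula applied to the ranks is the safer route.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1
        count = ordered.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def spearman_rs(xs, ys):
    """rs = 1 - 6 * sum(d^2) / (n^3 - n), d = difference between paired ranks."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)

# Hypothetical data with no ties
rs = spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(rs)
```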
Test of significance Spearman rank correlation
this correlation will have a value between -1 and 1, and its interpretation is similar to that of the Pearson correlation coefficient
the significance of the association is tested by comparing rs with the critical values of the Spearman rank correlation coefficient table, significant if rs is > critical value
Regression looking for dependence of one variable, the
dependent variable, on the other, the independent variable
relationship summarised by a regression equation consisting of a slope and an intercept the slope represents the amount the dependent
variable increases with unit increase of the independent variable
the intercept represents the value of the dependent variable when the independent variable is zero
multiple (multivariate) regression examines simultaneous relationship between one dependent variable and a number of independent variables
Linear regression assumption
a change in one variable, x, will lead directly to a change in another variable, y
e.g. haemoglobin increase with age, not the other way around
y variable (resultant change) is termed dependent variable
x variable is termed the independent variable
Regression line regression equation describes the relationship between y and x
y = a + bx
y = α + βx for population parameters
a is the intercept and b is the regression coefficient
calculation of b: b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)²
calculation of intercept: a = ȳ – b x̄
unwise to use the equation to predict an outcome by extrapolating beyond the range of the observed data
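The slope and intercept formulas translate directly into code. The sketch below uses hypothetical data; `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))

# Prediction is only sensible inside the observed range of x (here 1 to 5)
predicted_at_3 = a + b * 3
```

The fitted line always passes through the point (x̄, ȳ), which is a quick check on the arithmetic.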
Multiple regression model of multiple regression
y = a + b1x1 + b2x2 + … + bkxk
applications
to look for relationships between continuous variables, allowing for a third (possibly confounding) variable
e.g. predicted haemoglobin = 5.24 + 0.11(age) + 0.097(PCV)
corresponding t-value is b/SE(b) with d.f. of n minus the number of estimated parameters
from these, P value is derived from Table of t distribution
Multiple regression 95% confidence intervals
b ± t0.05 × SE(b)
if the interval includes zero, the conclusion should be that there is no evidence, at the 5% level, that y changes as x changes
Degrees of freedom
Degrees of freedom the number of degrees of freedom depends on
2 factors the number of groups we wish to compare the number of parameters we need to
estimate to calculate the standard deviation of the contrast of interest
Degrees of freedom for Χ2 test for comparison of 2 proportions, 1 d.f.
for t test, 2 sets of degrees of freedom: one for between-groups, one for within-groups
for comparing 2 means, 1 d.f. for between groups; there are also d.f. for estimating σ
for paired data, d.f. = number of subjects minus 1
for unpaired data, d.f. = n1 + n2 minus 2, that is n – 1 for each group
Degrees of freedom for linear regression
given n independent pairs of observations, 2 degrees of freedom are removed for the 2 parameters that have been estimated, thus d.f. = n-2
for multiple regression d.f. are n minus the number of estimated
parameters
Diagnostic tests and implications
Predictions
Sensitivity what is the probability that the test result will be positive when a disease is present?
that is, among all those who have the disease, how often is the test result positive?
e.g. presence of acute myocardial infarction and positive T-Troponin test
e.g. restriction of atlanto-occipital joint extension and difficult laryngoscopy
Specificity what is the probability that the test result will be negative when a disease is absent?
that is, among all those who do not have the disease, how often is the test result negative?
e.g. absence of acute myocardial infarction and negative T-Troponin test
e.g. Mallampati class I and II and no difficulty in laryngoscopy
Prediction model
                        Occurrence of event/disease
Prediction by test      present (D+)   absent (D-)   Total
positive test (T+)      a              b             a+b
negative test (T-)      c              d             c+d
Total                   e              f             g
as fraction or percentage:
prevalence of disease = e/g or P(D+)
sensitivity of a test = a/e or P(T+ given D+)
specificity of a test = d/f or P(T- given D-)
false negative rate = 1 - sensitivity = c/e (Type II error)
false positive rate = 1 - specificity = b/f (Type I error)
Sensitivity and specificity both are useful statistics because they will
yield consistent results for the diagnostic test in a variety of patient groups with different disease prevalences
important point sensitivity and specificity are characteristics
of the test, not the population to which the test is applied
Predictive value of a test
                        Occurrence of event/disease
Prediction by test      present (D+)   absent (D-)   Total
positive test (T+)      a              b             a+b
negative test (T-)      c              d             c+d
Total                   e              f             g
predictive value of a positive test attempts to calculate the probability of the patient having the disease (D+) when the test is positive T+, or P(D+ given T+) = a/(a + b)
predictive value of a negative test attempts to calculate the probability of the patient not having the disease (D-) when the test is negative T-, or P(D- given T-) = d/(c + d)
discrimination - overall correct classification rate, how well the model separates those in D+ and D- = (a+d)/g
false classification rate = (c+b)/g or = 1 - discrimination
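All of the quantities defined from the 2 × 2 prediction table can be computed together. This is a sketch with hypothetical counts; the function name and dictionary keys are illustrative.

```python
def diagnostic_summary(a, b, c, d):
    """a = T+ D+, b = T+ D-, c = T- D+, d = T- D- (cells of the table above)."""
    e, f = a + c, b + d            # totals with and without disease
    g = e + f                      # grand total
    return {
        "prevalence": e / g,
        "sensitivity": a / e,
        "specificity": d / f,
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "discrimination": (a + d) / g,
        "false_classification_rate": (b + c) / g,
    }

# Hypothetical counts: 100 with disease, 900 without
summary = diagnostic_summary(a=90, b=30, c=10, d=870)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Rerunning the function with the same sensitivity and specificity but a lower prevalence (for example a=9, b=30, c=1, d=870) lowers the positive predictive value sharply, which illustrates why predictive values depend on the population while sensitivity and specificity do not.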
Predictive value or sensitivity? predictive value is what the clinician wants
positive test, is the disease present? negative test, is the disease absent? correct classification rate
sensitivity is supplied with the statistical test disease present, was the test positive? no disease, was the test negative?
Risk and Odds and ratios
Measure of effect (abbreviation): description; value with no effect; value with total success
Absolute risk reduction (ARR): absolute change in risk: the risk of an event in the control group minus the risk of an event in the treated group; usually expressed as a percentage; no effect ARR = 0%; total success ARR = initial risk
Relative risk reduction (RRR): proportion of the risk removed by treatment: the absolute risk reduction divided by the initial risk in the control group; usually expressed as a percentage; no effect RRR = 0%; total success RRR = 100%
Relative risk (RR): the risk of an event in the treated group divided by the risk of an event in the control group; usually expressed as a decimal proportion, sometimes as a percentage; no effect RR = 1 or RR = 100%; total success RR = 0
Odds ratio (OR): odds of an event in the treated group divided by the odds of an event in the control group; usually expressed as a decimal proportion; no effect OR = 1; total success OR = 0
Number needed to treat (NNT): number of patients who need to be treated to prevent one event; this is the reciprocal of the absolute risk reduction (when expressed as a decimal fraction); it is usually rounded to a whole number; no effect NNT = ∞; total success NNT = 1/initial risk
Odds given that a subject has or does not have a disease, odds always measures the ratio of event (exposure) to non-event (non-exposure)
                +ve cases   -ve cases
exposed         a           b
not exposed     c           d
total           e           f
with the disease, odds of event are a/c; without the disease, odds of event are b/d
Odds ratio odds ratio compares the ratio of event (exposure) to non-event (non-exposure) among 2 groups of subjects with 2 opposing outcomes
odds ratio is the ratio of two odds
                +ve cases   -ve cases
exposed         a           b
not exposed     c           d
total           e           f
odds ratio, OR is (a/c)/(b/d) = ad/bc
Odds ratio, 95% CI: interpretation
odds ratio = 1, 95% CI includes 1: no association
odds ratio > 1, 95% CI does not include 1: positive association between exposure and outcome at the 5% significance level (the odds of exposure is greater in cases than in controls)
odds ratio < 1, 95% CI does not include 1: negative association between exposure and outcome at the 5% significance level (the odds of exposure is smaller in cases than in controls)
odds ratio above or below 1, 95% CI includes 1: association of exposure and outcome is not proven by the study at the 5% significance level
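The odds ratio and the check of whether its 95% CI includes 1 can be sketched as below. The lecture does not say how the CI is obtained; this sketch assumes the commonly used log method, with SE(ln OR) = √(1/a + 1/b + 1/c + 1/d). The counts are hypothetical and the function name is illustrative.

```python
import math

def odds_ratio_ci(a, b, c, d, multiplier=1.96):
    """OR = ad/bc with a CI from the log method (an assumed, standard approach)."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - multiplier * se_log)
    upper = math.exp(math.log(odds_ratio) + multiplier * se_log)
    return odds_ratio, lower, upper

# Hypothetical case-control counts:
# exposed cases a = 20, exposed controls b = 10,
# unexposed cases c = 80, unexposed controls d = 90
odds_ratio, or_lower, or_upper = odds_ratio_ci(20, 10, 80, 90)
includes_one = or_lower <= 1 <= or_upper
print(round(odds_ratio, 2), round(or_lower, 2), round(or_upper, 2), includes_one)
```

In this made-up example the odds ratio is well above 1 but the interval still just includes 1, so the association would be classed as not proven at the 5% level.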
Risk
                +ve cases   –ve cases   total
exposed         a           b           a+b
not exposed     c           d           c+d
total           e           f           g
risk of an event is the probability that an event will occur within a stated period of time
the risk of developing the disease within the follow-up time is a/(a+b) for the exposed population and c/(c+d) for the unexposed population
Relative risk
                +ve cases   –ve cases   total
exposed         a           b           a+b
not exposed     c           d           c+d
total           e           f           g
the relative risk is a summary of the outcome of a cohort study
RR = (a/(a+b)) / (c/(c+d)) or a(c+d) / [c(a+b)]
Relative risk, 95% CI: interpretation
relative risk = 1, 95% CI includes 1: no association
relative risk > 1, 95% CI does not include 1: positive association between exposure and outcome at the 5% significance level (outcome is more likely in the exposed cohort)
relative risk < 1, 95% CI does not include 1: negative association between exposure and outcome at the 5% significance level (outcome is less likely in the exposed cohort)
relative risk above or below 1, 95% CI includes 1: association of exposure and outcome is not proven by the study at the 5% significance level
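The relative risk and its interpretation can be sketched in the same way. As with the odds ratio, the CI method is not given in the lecture; the sketch assumes the usual log method, with SE(ln RR) = √(1/a – 1/(a+b) + 1/c – 1/(c+d)). The counts are those of the lignocaine trial elsewhere in this lecture (15 of 100 versus 20 of 100 with pain); the function name is illustrative.

```python
import math

def relative_risk_ci(a, b, c, d, multiplier=1.96):
    """RR = [a/(a+b)] / [c/(c+d)] with a CI from the log method (assumed approach)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - multiplier * se_log)
    upper = math.exp(math.log(rr) + multiplier * se_log)
    return rr, lower, upper

# Lignocaine arm: 15 with pain, 85 without; control arm: 20 with pain, 80 without
rr, rr_lower, rr_upper = relative_risk_ci(15, 85, 20, 80)
print(round(rr, 2), round(rr_lower, 2), round(rr_upper, 2))
```

Because the interval runs from below 1 to above 1, a trial of this size does not establish an association at the 5% level even though the point estimate is 0.75.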
How large was the treatment effect?
PrOpofol - LignOcaine Trial
                              + lignocaine   control
# of patients randomized      100            100
# (%) patients with pain      15 (15%)       20 (20%)
absolute risk (pain) reduction 0.20 - 0.15 = 0.05
relative risk of having pain 0.15/0.20 = 0.75
relative risk reduction (1 - 0.75) x 100% = 25%
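The three effect measures for this trial follow from the two event rates. A minimal sketch, with illustrative function and variable names:

```python
def effect_measures(events_treated, n_treated, events_control, n_control):
    """Absolute risk reduction, relative risk and relative risk reduction."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated
    rr = risk_treated / risk_control
    rrr = 1 - rr                   # equivalently arr / risk_control
    return arr, rr, rrr

# Trial figures: pain in 15 of 100 with lignocaine, 20 of 100 with control
arr, rr_pain, rrr = effect_measures(15, 100, 20, 100)
print(f"ARR = {arr:.2f}, RR = {rr_pain:.2f}, RRR = {100 * rrr:.0f}%")
```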
Depends on the sample size
sample size of 100 each arm: 95% CI for RRR of 25% is -28.15 to 77.76; the interval includes zero, so a benefit of lignocaine + propofol is not demonstrated
sample size of 1000 each arm: 95% CI for RRR of 25% is 8.35 to 41.61; confident that true RRR is close to 25%
http://ptwww.cchs.usyd.edu.au/Pedro/CIcalculator.xls
[Figure: Confidence interval around relative risk reduction (RRR of 25%); horizontal axis: relative risk reduction (%), from -50 to 50; interval marked -38 to 59 for n=100/gp and 9 to 41 for n=1000/gp]
Are the likely treatment benefits worth the potential harm and cost?
Number needed to treat (NNT)
Number needed to harm (NNH)
Number needed to treat …the number of patients who must receive an intervention of therapy
during a specific period of time to prevent
one adverse outcome or produce one positive outcome
Number needed to treat … Risk of perioperative AMI
                    without β-blocker   with β-blocker*   ARR               NNT (1/ARR)
40 year old man     2%                  1.8%              0.2% or 0.002     1/0.002 = 500
70 year old man     40%                 36%               4% or 0.04        1/0.04 = 25
*assuming 10% relative risk reduction (RRR) with preoperative administration of β-blocker
ARR = absolute risk reduction
NNT = number needed to treat
Number needed to treat … the higher the probability that a patient will experience an adverse outcome if we do not treat,
the more likely the patient will benefit from treatment
and the fewer such patients we need to treat to prevent one adverse outcome
Number to treat
                                          ibuprofen   control
number of patients                        50          50
at least 50% pain relief over 6 hours     27          10
expressed in percentage                   54%         20%
absolute risk reduction = 54% - 20% (or 0.54 - 0.20) = 34% (or 0.34)
number needed to treat = 1/0.34 = 2.94 or 3
Number to treat another way of calculating NNT for the analgesic trial
NNT = 1/(the proportion of patients with at least 50% pain relief with analgesic minus the proportion of patients with at least 50% pain relief with placebo)
= 1/((27/50) - (10/50)) = 1/(0.54 - 0.20) = 1/0.34 = 2.9 or 3
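The NNT arithmetic for the analgesic trial, and for the β-blocker example, can be sketched as follows. The function name is illustrative; rounding up to the next whole patient is the usual convention and is assumed here.

```python
import math

def number_needed_to_treat(rate_a, rate_b):
    """NNT = 1 / absolute difference between the two event (or response) rates."""
    return 1 / abs(rate_a - rate_b)

# Ibuprofen trial: at least 50% pain relief in 27/50 versus 10/50
nnt_ibuprofen = number_needed_to_treat(27 / 50, 10 / 50)
print(round(nnt_ibuprofen, 2), math.ceil(nnt_ibuprofen))

# Perioperative AMI with and without beta-blocker
nnt_40_year_old = number_needed_to_treat(0.02, 0.018)
nnt_70_year_old = number_needed_to_treat(0.40, 0.36)
print(round(nnt_40_year_old), round(nnt_70_year_old))
```

The same function gives a number needed to harm when the two rates are adverse event rates on treatment and on control.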
Number to treat the best NNT would be 1: every patient with treatment benefited, no patient given control benefited
generally NNTs between 2 and 5 are indicative of effective treatments, but NNTs of 20, 50 or 100 may be useful for prophylactic treatments, like interventions to reduce death after heart attack
the relevance of an NNT depends on the intervention and the consequences
Number to harm for adverse effects, the number needed to
harm (NNH) can be calculated in exactly the same way as an NNT
for an NNH, large numbers are obviously better than small numbers, because that means that the adverse effect occurs with less frequency