Non Parametric Methods

Post on 06-Sep-2015

Transcript of Non Parametric Methods

  • Non-Parametric Methods
    Peter T. Donnan, Professor of Epidemiology and Biostatistics
    Statistics for Health Research

  • Objectives of Presentation
    Introduction
    Ranks & Median
    Wilcoxon Signed Rank Test
    Paired Wilcoxon Signed Rank
    Mann-Whitney test
    Spearman's Rank Correlation Coefficient
    Others

  • What are non-parametric tests?
    Parametric tests involve estimating parameters such as the mean, and assume that the sample means are Normally distributed.
    Often data do not follow a Normal distribution, e.g. number of cigarettes smoked, cost to the NHS.
    Such data often have positively skewed distributions.

  • A positively skewed distribution

  • What are non-parametric tests?
    Non-parametric (NP) tests were developed for situations where fewer assumptions have to be made.
    NP tests STILL have assumptions, but they are less stringent.
    NP tests can be applied to Normal data, but parametric tests have greater power IF their assumptions are met.

  • Ranks
    The practical difference between parametric and NP methods is that NP methods use the ranks of values rather than the actual values.
    E.g. actual: 1, 2, 3, 4, 5, 7, 13, 22, 38, 45
         rank:   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
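The ranking step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the helper name `average_ranks` is hypothetical, and it assumes the usual convention that tied values share the average of the ranks they occupy.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1       # 1-based position of the first copy of v
        last = n - ordered[::-1].index(v)  # 1-based position of the last copy of v
        ranks.append((first + last) / 2)
    return ranks


actual = [1, 2, 3, 4, 5, 7, 13, 22, 38, 45]
ranks = average_ranks(actual)
print(ranks)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
```

Note how the large values 13, 22, 38 and 45 become ranks 7 to 10, so extreme observations no longer dominate.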

  • Median
    The median is the value above and below which 50% of the data lie.
    If the data are ranked in order, it is the middle value.
    In symmetric distributions the mean and median are the same.
    In skewed distributions, the median is the more appropriate measure.

  • Median
    BPs: 135, 138, 140, 140, 141, 142, 143
    Median = 140

    No. of cigarettes smoked: 0, 1, 2, 2, 2, 3, 5, 5, 8, 10
    Median = 2.5
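Both medians can be checked with Python's standard library. This is only a quick verification sketch; the variable names are illustrative. With an even number of values, `statistics.median` averages the two middle values, which is why the cigarette example gives 2.5.

```python
from statistics import median

blood_pressures = [135, 138, 140, 140, 141, 142, 143]
cigarettes = [0, 1, 2, 2, 2, 3, 5, 5, 8, 10]

bp_median = median(blood_pressures)  # 7 values: the 4th value
cig_median = median(cigarettes)      # 10 values: mean of the 5th and 6th values

print(bp_median)   # 140
print(cig_median)  # 2.5
```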

  • T-test
    The t-test is used to test whether the mean of a sample is significantly different from a hypothesised mean.
    The t-test relies on the sample being drawn from a Normally distributed population.
    If the sample is not Normal, use the Wilcoxon Signed Rank Test as an alternative.

  • Wilcoxon Signed Rank Test
    NP test relating to the median as the measure of central tendency.
    The absolute differences between the data and the hypothesised median are calculated and ranked.
    The ranks for the negative and the positive differences are then summed separately (W- and W+ respectively).
    The smaller of these two sums is the test statistic, W.

  • Wilcoxon Signed Rank Test: Example
    The median heart rate for an 18-year-old girl is supposed to be 82 bpm. A student takes the pulse rates of 8 female students (all aged 18):
    83, 90, 96, 82, 85, 80, 81, 87
    Do these results suggest that the median might not be 82?

  • Wilcoxon Signed Rank Test: Example
    H0: median = 82
    H1: median ≠ 82

    Two-tailed test

    Because one result equals 82, it cannot be used in the analysis

  • Wilcoxon Signed Rank Test: Example

    Result   Above or below median   Absolute difference from median (82)   Rank of difference
    83       +                       1                                      1.5
    90       +                       8                                      6
    96       +                       14                                     7
    85       +                       3                                      4
    80       -                       2                                      3
    81       -                       1                                      1.5
    87       +                       5                                      5

    W+ = 1.5 + 6 + 7 + 4 + 5 = 23.5
    W- = 3 + 1.5 = 4.5
    So W = 4.5
    n = 7 and W is greater than the tabulated value of 2, so p > 0.05

  • Wilcoxon Signed Rank Test: Example
    Therefore, the student should conclude that these results could have come from a population with a median of 82, as the result is not significantly different from the null hypothesis value.
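The hand calculation for the pulse-rate example can be reproduced with a short sketch. The function names `average_ranks` and `signed_rank_sums` are hypothetical helpers written for this illustration. The code only computes W+, W- and W; the comparison with the tabulated critical value is still done by hand.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def signed_rank_sums(data, hypothesised_median):
    """Return (W+, W-, n) after dropping values equal to the hypothesised median."""
    diffs = [x - hypothesised_median for x in data if x != hypothesised_median]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus, len(diffs)


pulse = [83, 90, 96, 82, 85, 80, 81, 87]
w_plus, w_minus, n = signed_rank_sums(pulse, 82)
w = min(w_plus, w_minus)

print(w_plus, w_minus, n)  # 23.5 4.5 7
print(w)                   # 4.5
```

The value 82 is excluded, leaving n = 7, and W = 4.5 exceeds the tabulated value of 2 quoted in the example.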

  • Wilcoxon Signed Rank Test: Normal Approximation
    As the number of ranks (n) becomes larger, the distribution of W becomes approximately Normal, generally if n > 20.
    Mean W = n(n+1)/4
    Variance W = n(n+1)(2n+1)/24
    Z = (W - Mean W)/SD(W)
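The three formulas above translate directly into code. In this sketch the inputs W = 100 and n = 25 are made-up values chosen only to illustrate the arithmetic, and the function name `signed_rank_z` is hypothetical. The two-sided p-value uses the standard Normal tail via `math.erfc`; no continuity or tie correction is applied.

```python
import math


def signed_rank_z(w, n):
    """Z score for the signed rank statistic W under the Normal approximation."""
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    return (w - mean_w) / math.sqrt(var_w)


# Hypothetical inputs, not taken from the slides.
z = signed_rank_z(w=100, n=25)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # 2 * P(Z > |z|)

print(round(z, 3))
print(round(p_two_sided, 3))
```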

  • Wilcoxon Signed Rank Test: Assumptions
    The population should be approximately symmetrical but need not be Normal.
    Results must be classified as either greater than or less than the median, i.e. exclude results equal to the median.
    Can be used for small or large samples.

  • Paired samples t-test
    Disadvantage: Assumes data are a random sample from a population which is Normally distributed

    Advantage: Uses all detail of the available data, and if the data are normally distributed it is the most powerful test

  • The Wilcoxon Signed Rank Test for Paired Comparisons
    Disadvantage: Only the sign (+ or -) and the rank of each change are analysed, not its actual size

    Advantage: Easy to carry out, and the data need not come from a Normal distribution

  • Paired And Not Paired Comparisons
    If you have the same sample measured on two separate occasions, this is a paired comparison.
    Two independent samples are not a paired comparison.
    Different samples which are matched by age and gender are paired.

  • The Wilcoxon Signed Rank Test for Paired Comparisons
    Similar calculation to the Wilcoxon Signed Rank test, except that the differences between the paired results are ranked.
    Example using SPSS: A group of 10 patients with chronic anxiety receive sessions of cognitive therapy. Quality of Life scores are measured before and after therapy.

  • Wilcoxon Signed Rank Test example

    QoL Score
    Before   After
    6        9
    5        12
    3        9
    4        9
    2        3
    1        1
    3        2
    8        12
    6        9
    12       10

  • Wilcoxon Signed Rank Test example: SPSS output (p < 0.05)
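The paired calculation can be sketched by ranking the before/after differences. This is an illustration under a stated assumption: the QoL table was flattened in the transcript, and the ten (before, after) pairs below are one reading of it, so treat the data as unverified. The helper names are hypothetical, and the code reports only the rank sums, not the SPSS p-value.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


# Assumed reading of the flattened QoL table (10 patients).
before = [6, 5, 3, 4, 2, 1, 3, 8, 6, 12]
after = [9, 12, 9, 9, 3, 1, 2, 12, 9, 10]

# Differences of zero carry no information about direction and are dropped.
diffs = [a - b for a, b in zip(after, before) if a != b]
ranks = average_ranks([abs(d) for d in diffs])

w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
n_nonzero = len(diffs)

print(w_plus, w_minus, n_nonzero)  # 40.5 4.5 9
```

Under this reading most patients improve, so the positive rank sum is far larger than the negative one.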

  • Mann-Whitney test
    Used when we want to compare two unrelated or INDEPENDENT groups.
    For parametric data you would use the unpaired (independent) samples t-test.
    The assumptions of the t-test were:
    The distribution of the measure in each group is approximately Normal.
    The variances are similar.

  • Example (1)
    The following data show the number of alcohol units per week collected in a survey:

    Men (n=13): 0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0
    Women (n=14): 0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0

    Is the amount greater in men compared to women?

  • Example (2)
    How would you test whether the distributions in both groups are approximately Normally distributed?

    Plot histograms
    Stem and leaf plot
    Box-plot
    Q-Q or P-P plot

  • Boxplots of alcohol units per week by gender

  • Example (3)
    Are those distributions symmetrical?

    Definitely not!

    They are both highly skewed, so not Normal. If the data are still not Normal after transformation, use the non-parametric Mann-Whitney test.

    Suggests perhaps that males tend to have a higher intake than women.

  • Mann-Whitney on SPSS

  • SPSS output: Normal approx (NS); Mann-Whitney (NS)
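The Mann-Whitney statistic for the alcohol example can be sketched by ranking the combined sample and summing the ranks in one group. The function names are hypothetical, and the z score uses the usual large-sample Normal approximation with a tie correction, which is an assumption about method rather than a reproduction of the SPSS output. The p-value here is two-sided.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def mann_whitney(group_a, group_b):
    """Return (U for group_a, U for group_b, z, two-sided p)."""
    n_a, n_b = len(group_a), len(group_b)
    combined = list(group_a) + list(group_b)
    ranks = average_ranks(combined)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    # Normal approximation; the tie term shrinks the variance when values repeat.
    n = n_a + n_b
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (min(u_a, u_b) - n_a * n_b / 2) / math.sqrt(variance)
    return u_a, u_b, z, math.erfc(abs(z) / math.sqrt(2))


men = [0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0]
women = [0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0]
u_men, u_women, z, p = mann_whitney(men, women)

print(u_men, u_women)  # 116.5 65.5
print(p > 0.05)        # True
```

The larger U for men points in the direction of higher intake, but the difference is not significant at the 5% level, in line with the NS result reported on the slide.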

  • Spearman Rank Correlation
    Method for investigating the relationship between 2 measured variables.
    Non-parametric equivalent to Pearson correlation.
    Used when variables are either non-Normal or measured on an ordinal scale.

  • Spearman Rank Correlation Example
    A researcher wishes to assess whether the distance to general practice influences the time to diagnosis of colorectal cancer.

    The null hypothesis would be that distance is not associated with time to diagnosis. Data were collected for 7 patients.

  • Distance from GP and time to diagnosis

    Distance (km)   Time to diagnosis (weeks)
    5               6
    2               4
    4               3
    8               4
    20              5
    45              5
    10              4

  • Scatterplot

  • Distance from GP and time to diagnosis

    Distance (km)   Time (weeks)   Rank for distance   Rank for time   Difference in ranks (d)   d²
    2               4              1                   3               -2                        4
    4               3              2                   1               1                         1
    5               6              3                   7               -4                        16
    8               4              4                   3               1                         1
    10              4              5                   3               2                         4
    20              5              6                   5.5             0.5                       0.25
    45              5              7                   5.5             1.5                       2.25
    Total                                                              0                         Σd² = 28.5

  • Spearman Rank Correlation Example
    The formula for Spearman's rank correlation is:

    rs = 1 - 6Σd² / (n(n² - 1))

    where d is the difference in ranks for each pair and n is the number of pairs
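The calculation for the distance and time-to-diagnosis data can be sketched as follows. The function names are hypothetical. Two versions are shown because the data contain tied ranks: the simple Σd² formula gives about 0.49, while computing a Pearson correlation on the ranks (the usual tie-corrected definition) gives about 0.468, which is the value reported from SPSS in this example.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def spearman_simple(x, y):
    """Return (rs from the sum-of-d-squared formula, sum of d squared)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2


def spearman_on_ranks(x, y):
    """Pearson correlation of the ranks, which handles ties correctly."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


distance_km = [5, 2, 4, 8, 20, 45, 10]
weeks = [6, 4, 3, 4, 5, 5, 4]

rs_simple, sum_d2 = spearman_simple(distance_km, weeks)
rs_ranks = spearman_on_ranks(distance_km, weeks)

print(sum_d2)               # 28.5
print(round(rs_simple, 3))  # 0.491
print(round(rs_ranks, 3))   # 0.468
```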

  • Spearman's in SPSS

  • Spearman Rank Correlation Example
    In our example, rs = 0.468

    In SPSS we can see that this value is not significant, i.e. p = 0.29

    Therefore there is no significant relationship between the distance to a GP and the time to diagnosis, but note that the correlation is quite high! With only 7 pairs, even a fairly large coefficient may not reach significance.

  • Spearman Rank Correlation
    Correlations lie between -1 and +1.
    A correlation coefficient close to zero indicates weak or no correlation.
    Whether an rs value is significant depends on sample size; a significant value tells you that it is unlikely these results have arisen by chance.
    Correlation does NOT measure causality, only association.

  • Chi-squared test
    Used when comparing 2 or more groups of categorical or nominal data (as opposed to measured data).
    Already covered!
    In SPSS, the Chi-squared test is a test of observed vs. expected frequencies in a single categorical variable.

  • More than 2 groups
    So far we have been comparing 2 groups.
    If we have 3 or more independent groups and the data are not Normal, we need an NP equivalent to ANOVA.
    If independent samples, use Kruskal-Wallis.
    If related samples, use Friedman.
    Same assumptions as before.

  • Parametric / Non-parametric

    Parametric Tests               Non-parametric Tests
    Single sample t-test           Wilcoxon signed rank test
    Paired sample t-test           Paired Wilcoxon signed rank test
    2 independent samples t-test   Mann-Whitney test (Note: sometimes called Wilcoxon Rank Sums test!)
    One-way Analysis of Variance   Kruskal-Wallis
    Pearson's correlation          Spearman Rank

  • Summary: Non-parametric
    Non-parametric methods have fewer assumptions than parametric tests, so they are useful when these assumptions are not met.
    Often used when the sample size is small and it is difficult to tell whether the data are Normally distributed.
    Non-parametric methods are a ragbag of tests developed over time with no consistent framework.
    Read in datasets LDL, etc. and carry out appropriate non-parametric tests.

  • References
    Corder GW, Foreman DI. Non-parametric Statistics for Non-Statisticians. Wiley, 2009.
    Siegel S, Castellan NJ Jr. Nonparametric Statistics for the Behavioural Sciences. McGraw-Hill, 1988 (first edition 1956).