Non-Parametric Methods

Peter T. Donnan, Professor of Epidemiology and Biostatistics
Statistics for Health Research

Objectives of Presentation
Introduction
Ranks & Median
Wilcoxon Signed Rank Test
Paired Wilcoxon Signed Rank
Mann-Whitney test
Spearman's Rank Correlation Coefficient
Others

What are non-parametric tests?
Parametric tests involve estimating parameters such as the mean, and assume that the distribution of sample means is Normal.
Often data do not follow a Normal distribution, e.g. number of cigarettes smoked, cost to the NHS.
Such data typically have positively skewed distributions.

A positively skewed distribution

What are non-parametric tests?
Non-parametric (NP) tests were developed for situations where fewer assumptions have to be made.
NP tests STILL have assumptions, but they are less stringent.
NP tests can be applied to Normal data, but parametric tests have greater power IF their assumptions are met.

Ranks
The practical difference between parametric and NP methods is that NP methods use the ranks of values rather than the actual values.
E.g.
Actual: 1, 2, 3, 4, 5, 7, 13, 22, 38, 45
Rank:   1, 2, 3, 4, 5, 6,  7,  8,  9, 10
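The ranking step can be sketched in a few lines of Python. This is only an illustration: the helper name `average_ranks` is an assumption, and it gives tied values the average of the positions they occupy, which is the convention used in the worked examples later in these slides.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


actual = [1, 2, 3, 4, 5, 7, 13, 22, 38, 45]
ranks = average_ranks(actual)
print(ranks)
```

The large gaps between 13, 22, 38 and 45 disappear once the values are replaced by ranks, which is why rank-based methods are less affected by skewness.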

Median
The median is the value above and below which 50% of the data lie.
If the data are ranked in order, it is the middle value.
In symmetric distributions the mean and median are the same.
In skewed distributions, the median is the more appropriate measure.

Median
BPs: 135, 138, 140, 140, 141, 142, 143
Median = 140

No. of cigarettes smoked: 0, 1, 2, 2, 2, 3, 5, 5, 8, 10
Median = 2.5
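As a quick check of the two medians above, here is a minimal sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import median

bps = [135, 138, 140, 140, 141, 142, 143]
cigarettes = [0, 1, 2, 2, 2, 3, 5, 5, 8, 10]

bp_median = median(bps)           # odd n: the single middle value
cig_median = median(cigarettes)   # even n: mean of the two middle values
print(bp_median, cig_median)
```

With an even number of observations there is no single middle value, so the two middle values (2 and 3 here) are averaged.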

T-test
The t-test is used to test whether the mean of a sample is significantly different from a hypothesised population mean.
The t-test relies on the sample being drawn from a Normally distributed population.
If the sample is not Normal, use the Wilcoxon Signed Rank Test as an alternative.

Wilcoxon Signed Rank Test
NP test relating to the median as the measure of central tendency.
The ranks of the absolute differences between the data and the hypothesised median are calculated.
The ranks for the negative and the positive differences are then summed separately (W- and W+ respectively).
The minimum of these is the test statistic, W.

Wilcoxon Signed Rank Test: Example
The median heart rate for an 18-year-old girl is supposed to be 82 bpm. A student takes the pulse rates of 8 female students (all aged 18):
83, 90, 96, 82, 85, 80, 81, 87
Do these results suggest that the median might not be 82?

Wilcoxon Signed Rank Test: Example
H0: median = 82
H1: median ≠ 82
Two-tailed test.
Because one result equals 82, it cannot be used in the analysis, which leaves n = 7.

Wilcoxon Signed Rank Test: Example

Result   Above or below median   Absolute difference from median=82   Rank of difference
83       +                       1                                    1.5
90       +                       8                                    6
96       +                       14                                   7
85       +                       3                                    4
80       -                       2                                    3
81       -                       1                                    1.5
87       +                       5                                    5

W+ = 1.5 + 6 + 7 + 4 + 5 = 23.5
W- = 3 + 1.5 = 4.5
So W = 4.5.
n = 7, and W is greater than the tabulated critical value of 2 (5% level, two-tailed), so p > 0.05.

Wilcoxon Signed Rank Test: Example
Therefore, the student should conclude that these results could have come from a population with a median of 82, as the result is not significantly different from the null hypothesis value.
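The hand calculation for the pulse-rate example can be reproduced with a short sketch. The function names `average_ranks` and `signed_rank_sums` are assumptions made for illustration. The code only computes W+, W- and W, so the comparison with the tabulated critical value is still done by hand.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def signed_rank_sums(data, hypothesised_median):
    """Return (W+, W-, n) after dropping results equal to the hypothesised median."""
    diffs = [x - hypothesised_median for x in data if x != hypothesised_median]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus, len(diffs)


pulse = [83, 90, 96, 82, 85, 80, 81, 87]
w_plus, w_minus, n = signed_rank_sums(pulse, 82)
w = min(w_plus, w_minus)  # test statistic W
print(w_plus, w_minus, w, n)
```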

Wilcoxon Signed Rank Test: Normal Approximation
As the number of ranks (n) becomes larger, the distribution of W becomes approximately Normal.
Generally used if n > 20.
Mean W = n(n+1)/4
Variance W = n(n+1)(2n+1)/24
Z = (W - Mean W)/SD(W), where SD(W) is the square root of Variance W
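The Normal approximation is simple arithmetic, sketched below. The function names and the values W = 100 and n = 25 are hypothetical, chosen only to illustrate the calculation. The sketch applies no continuity or tie correction, so statistical packages may report slightly different values.

```python
from math import erfc, sqrt


def signed_rank_z(w, n):
    """Z score for the signed rank statistic W under the Normal approximation."""
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    return (w - mean_w) / sqrt(var_w)


def two_sided_p(z):
    """Two-sided p-value from the standard Normal distribution."""
    return erfc(abs(z) / sqrt(2))


# Hypothetical inputs, not taken from the slides.
z = signed_rank_z(w=100, n=25)
p = two_sided_p(z)
print(round(z, 3), round(p, 3))
```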

Wilcoxon Signed Rank Test: Assumptions
The population should be approximately symmetrical but need not be Normal.
Results must be classified as either greater than or less than the median, i.e. exclude results equal to the median.
Can be used for small or large samples.

Paired samples t-test
Disadvantage: Assumes the data are a random sample from a Normally distributed population

Advantage: Uses all detail of the available data, and if the data are normally distributed it is the most powerful test

The Wilcoxon Signed Rank Test for Paired Comparisons
Disadvantage: Only the sign (+ or -) and the rank of each change are analysed, not its actual size

Advantage: Easy to carry out and data can be analysed from any distribution or population

Paired And Not Paired Comparisons
If you have the same sample measured on two separate occasions, this is a paired comparison.
Two independent samples are not a paired comparison.
Different samples which are matched by age and gender are paired.

The Wilcoxon Signed Rank Test for Paired Comparisons
Similar calculation to the Wilcoxon Signed Rank test, except that the differences between the paired results are ranked.
Example using SPSS:
A group of 10 patients with chronic anxiety receive sessions of cognitive therapy. Quality of Life scores are measured before and after therapy.

Wilcoxon Signed Rank Test example

QoL ScoreBeforeAfter695123949231132812691210

SPSS Output: p < 0.05

Mann-Whitney test
Used when we want to compare two unrelated or INDEPENDENT groups.
For parametric data you would use the unpaired (independent) samples t-test.
The assumptions of the t-test were:
The distribution of the measure in each group is approximately Normal.
The variances are similar.

Example (1)
The following data show the number of alcohol units per week collected in a survey:

Men (n=13): 0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0
Women (n=14): 0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0

Is the amount greater in men compared to women?
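For readers without SPSS, the Mann-Whitney U statistic for these data can be sketched by ranking both groups together and summing the ranks for one group. The function names are assumptions, and the sketch stops at the U statistic. It does not compute a p-value, which would need tables or a Normal approximation with a tie correction.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) computed from the rank sum of group_a in the pooled data."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


men = [0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0]
women = [0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0]
u_men, u_women = mann_whitney_u(men, women)
u = min(u_men, u_women)  # the smaller U is compared with tables
print(u_men, u_women, u)
```

U for men counts the man-woman pairs in which the man reports the higher intake, with ties counted as half, so a value above half of 13 x 14 = 182 points towards higher intake in men.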

Example (2)
How would you test whether the distributions in both groups are approximately Normally distributed?

Plot histograms
Stem and leaf plot
Box-plot
Q-Q or P-P plot

Boxplots of alcohol units per week by gender

Example (3)
Are those distributions symmetrical?

Definitely not!

They are both highly skewed, so not Normal. If the data are still not Normal after transformation, use a non-parametric test: Mann-Whitney.

The boxplots suggest that men perhaps tend to have a higher intake than women.

Mann-Whitney on SPSS

Normal approx: not significant (NS)
Mann-Whitney: not significant (NS)

Spearman Rank Correlation
Method for investigating the relationship between 2 measured variables.
Non-parametric equivalent to Pearson correlation.
Used when variables are either non-Normal or measured on an ordinal scale.

Spearman Rank Correlation Example
A researcher wishes to assess whether the distance to general practice influences the time to diagnosis of colorectal cancer.

The null hypothesis would be that distance is not associated with time to diagnosis. Data were collected for 7 patients.

Distance from GP and time to diagnosis

Distance (km)   Time to diagnosis (weeks)
5               6
2               4
4               3
8               4
20              5
45              5
10              4

Scatterplot

Distance from GP and time to diagnosis

Distance (km)   Time (weeks)   Rank for distance   Rank for time   Difference in ranks   D²
2               4              1                   3               -2                    4
4               3              2                   1               1                     1
5               6              3                   7               -4                    16
8               4              4                   3               1                     1
10              4              5                   3               2                     4
20              5              6                   5.5             0.5                   0.25
45              5              7                   5.5             1.5                   2.25
                                                                   Total = 0             Σd² = 28.5

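Spearman Rank Correlation Example
The formula for Spearman's rank correlation is:

$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$

where d is the difference in ranks for each pair and n is the number of pairs (the formula is exact only when there are no tied ranks)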

Spearman's in SPSS

Spearman Rank Correlation Example
In our example, rs = 0.468.

In SPSS we can see that this value is not significant, i.e. p = 0.29.

Therefore there is no significant relationship between the distance to a GP and the time to diagnosis, but note that the correlation is quite high!
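The ranks, Σd² and rs for the seven patients can be reproduced with the sketch below. The function names are assumptions. It computes rs in two ways: with the shortcut 1 - 6Σd²/(n(n² - 1)), and as a Pearson correlation of the ranks. Because the times contain tied ranks, the two differ slightly, and it is the Pearson-on-ranks value that matches the reported 0.468.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


distance = [5, 2, 4, 8, 20, 45, 10]   # km
weeks = [6, 4, 3, 4, 5, 5, 4]         # time to diagnosis

rank_distance = average_ranks(distance)
rank_time = average_ranks(weeks)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_distance, rank_time))
n = len(distance)

rs_shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))   # ignores ties
rs_ranks = pearson(rank_distance, rank_time)        # handles ties
print(sum_d2, round(rs_shortcut, 3), round(rs_ranks, 3))
```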

Spearman Rank Correlation
Correlations lie between -1 and +1.
A correlation coefficient close to zero indicates weak or no correlation.
Whether an rs value is significant depends on sample size; a significant value tells you that it is unlikely these results have arisen by chance.
Correlation does NOT measure causality, only association.

Chi-squared test
Used when comparing 2 or more groups of categorical or nominal data (as opposed to measured data).
Already covered!
In SPSS the Chi-squared test is a test of observed vs. expected in a single categorical variable.

More than 2 groups
So far we have been comparing 2 groups.
If we have 3 or more independent groups and the data are not Normal, we need an NP equivalent to ANOVA.
If independent samples, use Kruskal-Wallis.
If related samples, use Friedman.
Same assumptions as before.
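To show how the rank idea extends to three or more independent groups, here is a minimal sketch of the standard Kruskal-Wallis H statistic, H = 12/(N(N+1)) · Σ(Rj²/nj) - 3(N+1), where Rj is the rank sum and nj the size of group j. The three groups are made-up numbers for illustration only, the function names are assumptions, and no tie correction is applied.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from pooled ranks (no tie correction)."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical data: three independent groups of three observations each.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(round(h, 3))
```

H is usually compared with a chi-squared distribution with (number of groups - 1) degrees of freedom, although for groups this small exact tables are more appropriate.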

Parametric related to Non-parametric test

Parametric Tests               Non-parametric Tests
Single sample t-test           Wilcoxon signed rank test
Paired sample t-test           Paired Wilcoxon signed rank test
2 independent samples t-test   Mann-Whitney test (Note: sometimes called Wilcoxon Rank Sum test!)
One-way Analysis of Variance   Kruskal-Wallis
Pearson's correlation          Spearman Rank

Summary: Non-parametric
Non-parametric methods have fewer assumptions than parametric tests.
So they are useful when these assumptions are not met.
Often used when the sample size is small and it is difficult to tell whether the data are Normally distributed.
Non-parametric methods are a ragbag of tests developed over time with no consistent framework.
Read in datasets LDL, etc. and carry out appropriate non-parametric tests.

References
Corder GW, Foreman DI. Non-parametric Statistics for Non-Statisticians. Wiley, 2009.
Siegel S, Castellan NJ Jr. Nonparametric Statistics for the Behavioural Sciences. McGraw-Hill, 1988 (first edition was 1956).
