t-tests, ANOVA and regression
Methods for Dummies, 1 February 2006
Jon Roiser and Predrag Petrovic
Overview
- Simple hypothesis testing
- Z-tests & t-tests
- F-tests and ANOVA
- Correlation/regression
Starting point
- Central aim of statistical tests: determining the likelihood of a value in a sample, given that the null hypothesis is true: P(value|H0)
- H0: no statistically significant difference between sample & population (or between samples)
- H1: a statistically significant difference between sample & population (or between samples)
- Significance level: P(value|H0) < 0.05
Types of error

Decision × true state of the world:
- Accept H0 when H0 is true: correct acceptance, p = 1 − α
- Accept H0 when H0 is false: β-error (Type II error), a false negative
- Reject H0 when H0 is true: α-error (Type I error), a false positive
- Reject H0 when H0 is false: correct rejection, p = 1 − β (power)

[Figure: the total population vs. the population studied]
Distribution & probability

If we know something about the distribution of events in a population, we know something about the probability of these events.

Population mean: μ = (1/n) Σ xi
Population standard deviation: σ = √( (1/n) Σ (xi − μ)² )
Standardised normal distribution
- The z-score represents a value on the x-axis for which we know the p-value
- 2-tailed: z = 1.96 is equivalent to p = 0.05 (rule of thumb: ~2 SD)
- 1-tailed: z = 1.65 is equivalent to p = 0.05 (the area between 1.65 and infinity on one tail = 5%)

The standardised normal distribution has mean 0 and SD 1:
- One point compared to the population: z = (xi − μ) / σ
- A group mean compared to the population: z = (x̄ − μ) / (σ / √n)
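As a minimal sketch, both z formulas can be computed directly. The IQ-style numbers below (population mean 100, SD 15, and the particular scores) are illustrative assumptions, not values from the slides.

```python
import math

mu, sigma = 100.0, 15.0   # assumed population mean and SD (e.g. IQ)

# One point compared to the population: z = (x - mu) / sigma
x = 130.0
z_point = (x - mu) / sigma                      # (130 - 100) / 15

# A group of n compared to the population: z = (xbar - mu) / (sigma / sqrt(n))
xbar, n = 106.0, 25
z_group = (xbar - mu) / (sigma / math.sqrt(n))  # 6 / (15 / 5)

print(z_point, z_group)   # → 2.0 2.0
```

Note that the same raw difference of 6 IQ points, unconvincing for a single person, gives z = 2.0 (p < 0.05, 2-tailed) once it is a mean of 25 people, because the standard error shrinks with √n.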
Assumptions of parametric tests

Variables are:
- Normally distributed
- N > 10 (or 12, or 15…)
- On an interval or ideally ratio scale (e.g. 2 metres = 2 × 1 metre)

…but parametric tests are (fairly) robust to violations of these assumptions.
Z- versus t-statistic?
- Z is used when we know the variance in the general population, e.g. IQ. This is not normally true!
- t is used when we do not know the variance of the underlying population for sure, and is dependent on N
- The t distribution is similar to the Z (but flatter)
- For N > 30, t ≈ Z

[Figure: t distributions for large N (close to normal) and small N (flatter, heavier tails)]
Two-sample t-test

t = (x̄1 − x̄2) / s(x̄1 − x̄2), where s(x̄1 − x̄2) = √(s1²/n1 + s2²/n2)

The difference between the means divided by the pooled standard error of the mean.
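A minimal sketch of this formula in Python; the two groups below are made-up numbers for illustration, not the slide data.

```python
import math

# Hypothetical scores for two groups (made-up numbers)
g1 = [3.1, 2.8, 4.0, 3.5, 2.9, 3.6]
g2 = [1.9, 2.2, 1.5, 2.8, 2.1, 1.7]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    # Sample variance: n - 1 in the denominator
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# t = (xbar1 - xbar2) / sqrt(s1^2/n1 + s2^2/n2)
se = math.sqrt(var(g1) / len(g1) + var(g2) / len(g2))
t = (mean(g1) - mean(g2)) / se
print(round(t, 3))   # → 4.849
```

In practice you would hand this to a library routine (e.g. SPSS as in the slides that follow, or scipy.stats.ttest_ind), but the arithmetic is exactly this.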
Different types of t-test
- One-sample: tests whether the mean of a population is different from a given value (e.g. if chance performance = 50% in 2-alternative forced choice)
- Paired t-test (within subjects): tests whether a group of individuals tested under condition A differs from the same group tested under condition B
  - Must have 2 values for each subject
  - Basically the same as a one-sample t-test on the difference scores, comparing the difference scores to 0
Another approach to group differences
- Instead of thinking about the group means, we can instead think about variances
- Recall the sample variance: s² = Σ (xi − x̄)² / (n − 1)
- F = Variance 1 / Variance 2
- ANOVA = ANalysis Of VAriance
- Total variance = model variance + error variance
Partitioning the variance

[Figure: for two groups, Total variance = Model (between groups) + Error (within groups)]
ANOVA
- At its simplest, one-way ANOVA is the same as the two-sample t-test
- Recall t = difference between means / spread around means (pooled standard error of the mean)
- Model = the difference between the means (between groups); Error = the spread around the means (within groups)

F = s²Model / s²Error
A quick proof from SPSS

Group Statistics (Depression):
- Ecstasy group: N = 15, mean = 3.1352, SD = 1.45306, SE of mean = .37518
- Control group: N = 11, mean = 1.7157, SD = 1.03059, SE of mean = .31073

Independent Samples Test (Depression), t-test for equality of means:
- Levene's test for equality of variances: F = 2.105, Sig. = .160
- Equal variances assumed: t = 2.764, df = 24, Sig. (2-tailed) = .011, mean difference = 1.41951, SE difference = .51363, 95% CI [.35943, 2.47958]
- Equal variances not assumed: t = 2.914, df = 23.991, Sig. (2-tailed) = .008, mean difference = 1.41951, SE difference = .48715, 95% CI [.41406, 2.42496]

ANOVA (Depression):
- Between groups: SS = 12.788, df = 1, MS = 12.788, F = 7.638, Sig. = .011
- Within groups: SS = 40.181, df = 24, MS = 1.674
- Total: SS = 52.968, df = 25

In fact, t = √F => 2.764 = √7.638 (for 1 model degree of freedom)
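The t² = F identity for two groups can be checked by computing both statistics by hand on any small dataset; the numbers below are made up for illustration and are not the SPSS data above.

```python
import math

g1 = [5.0, 6.0, 7.0, 9.0]
g2 = [2.0, 3.0, 3.0, 4.0]   # made-up data

n1, n2 = len(g1), len(g2)
m1, m2 = sum(g1) / n1, sum(g2) / n2
grand = (sum(g1) + sum(g2)) / (n1 + n2)

# One-way ANOVA: partition the total sum of squares
ss_model = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2   # between groups
ss_error = sum((x - m1) ** 2 for x in g1) + sum((x - m2) ** 2 for x in g2)
F = (ss_model / 1) / (ss_error / (n1 + n2 - 2))              # df_model = 1

# Pooled-variance two-sample t-test on the same data
sp2 = ss_error / (n1 + n2 - 2)
t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

print(round(F, 3), round(t ** 2, 3))   # → 15.698 15.698
```

With only two groups (one model degree of freedom) the two analyses are algebraically identical, which is exactly what the SPSS output demonstrates.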
ANOVA is useful for more complex designs
- More than 1 group (e.g. Group 1, Group 2, Group 3)
- More than 1 effect (interaction), e.g. drug vs. placebo × male vs. female
- …but we need to use post-hoc tests (t-tests corrected for multiple comparisons) to interpret the results
Differences between t-tests and F-tests (especially in SPM)
- t-tests can only be used to compare 2 groups/effects, while ANOVA can handle more sophisticated designs (several groups / several effects / interactions)
- In SPM, t-tests are one-tailed (i.e. for contrast X−Y, significant voxels are only reported where X>Y)
- In SPM, F-tests are two-tailed (i.e. for contrast X−Y, significant voxels are reported for both X>Y and Y>X)
Correlation and regression
- Is there a relationship between x and y? What is the strength of this relationship? (Pearson's r)
- Can we describe this relationship and use it to predict y from x? (Regression)
- Fitting a line using the least squares solution
- Is the relationship we have described statistically significant? (Significance tests)
- Relevance to SPM (GLM)
Correlation and regression
- Correlation: predictability about the relationship between two variables
- Covariance: a measurement of this predictability
- Regression: a description of the relationship between two variables, where one is dependent and the other is independent
- There is no causality in any of these models
Covariance

cov(x, y) = Σ (xi − x̄)(yi − ȳ) / (n − 1)

- When X and Y increase together: cov(x, y) is positive
- When X increases as Y decreases: cov(x, y) is negative
- When there is no consistent relationship: cov(x, y) = 0
- The size of the covariance depends on the size of the data's standard deviations (!), so we need to standardise it (Pearson's r)
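The covariance formula above can be sketched in a few lines; x and y below are made-up paired observations for illustration.

```python
# Made-up paired observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 8.0]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# cov(x, y) = sum((xi - xbar) * (yi - ybar)) / (n - 1)
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
print(round(cov, 6))   # → 3.0
```

Rescaling either variable (say, measuring y in millimetres instead of metres) would multiply this value by 1000, which is exactly why the raw covariance is hard to interpret on its own.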
Pearson's r

r(x, y) = cov(x, y) / (sx sy) = Σ (xi − x̄)(yi − ȳ) / ((n − 1) sx sy)

- On its own, the covariance does not really tell us anything; the solution is to standardise this measure
- Pearson's r standardises the covariance value: it divides the covariance by the product of the standard deviations of X and Y
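A short sketch of this standardisation, reusing made-up data: dividing the covariance by sx·sy rescales it into a value that always lies between −1 and 1.

```python
import math

# Made-up paired observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 8.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))

# Pearson's r: the covariance divided by the product of the SDs
r = cov / (sx * sy)
print(round(r, 4))   # → 0.866
```

Unlike the raw covariance, r is unchanged if either variable is rescaled, so it can be compared across datasets.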
Best-fit line
- The aim of linear regression is to fit a straight line, ŷ = ax + b, to data, giving the best prediction of y for any value of x
- This will be the line that minimises the distance between the data and the fitted line, i.e. the residuals

[Figure: scatter plot with fitted line ŷ = ax + b; a = slope, b = intercept; ŷ = predicted value, yi = true value, ε = residual error]
Least squares regression
- To find the best line we must minimise the sum of the squares of the residuals (the vertical distances from the data points to our line)
- Residual: ε = y − ŷ
- Sum of squares (SS) of the residuals: Σ (y − ŷ)²
- Model line: ŷ = ax + b (a = slope, b = intercept)
- We must find the values of a and b that minimise Σ (y − ŷ)²
Finding b
- First we find the value of b that gives the least sum of squares
- Trying different values of b is equivalent to shifting the line up and down the scatter plot

Finding a
- Now we find the value of a that gives the least sum of squares
- Trying different values of a is equivalent to changing the slope of the line, while b stays constant
Minimising the sum of squares
- We need to minimise Σ (y − ŷ)²
- Since ŷ = ax + b, we need to minimise Σ (y − ax − b)²
- If we plot the sum of squares for all the different values of a and b we get a parabola, because it is a squared term
- So the minimum sum of squares is at the bottom of the curve, where the gradient is zero

[Figure: sum of squares (SS) plotted against values of a and b; minimum SS where the gradient = 0]
The solution

Doing this gives the following equation for a:

a = r sy / sx

where r = the correlation coefficient of x and y, sy = the standard deviation of y, and sx = the standard deviation of x.

From this we can see that:
- A low correlation coefficient gives a flatter slope (low value of a)
- A large spread of y, i.e. a high standard deviation, results in a steeper slope (high value of a)
- A large spread of x, i.e. a high standard deviation, results in a flatter slope (low value of a)
The solution continued

Our model equation is ŷ = ax + b. This line must pass through the mean point (x̄, ȳ), so:

ȳ = ax̄ + b, hence b = ȳ − ax̄

Substituting our equation for a gives:

b = ȳ − r (sy / sx) x̄

where r = the correlation coefficient of x and y, sy = the standard deviation of y, and sx = the standard deviation of x. The smaller the correlation, the closer the intercept is to the mean of y.
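The two closed-form results, a = r·sy/sx and b = ȳ − a·x̄, can be sketched directly; the data below are made-up illustrative numbers.

```python
import math

# Made-up paired observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 8.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

sx = math.sqrt(sum((v - mx) ** 2 for v in x) / (n - 1))
sy = math.sqrt(sum((v - my) ** 2 for v in y) / (n - 1))
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
r = cov / (sx * sy)

# Least-squares slope and intercept; the line passes through (mx, my)
a = r * sy / sx
b = my - a * mx
print(round(a, 6), round(b, 6))   # → 1.2 1.0
```

Note that a = r·sy/sx simplifies to cov(x, y)/sx², the more usual textbook form of the least-squares slope; the two are algebraically identical.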
Back to the model

ŷ = ax + b = r (sy / sx) x + ȳ − r (sy / sx) x̄

which rearranges to:

ŷ = r (sy / sx)(x − x̄) + ȳ

- If the correlation is zero, we will simply predict the mean of y for every value of x, and our regression line is just a flat line at height ȳ, parallel to the x-axis
- But this isn't very useful
- We can calculate the regression line for any data, but the important question is how well this line fits the data, i.e. how good it is at predicting y from x
- We've determined the form of the relationship (ŷ = ax + b)
- Does a prediction based on this model do a better job than just predicting the mean?
- How can we determine the significance of the model? We can solve this using ANOVA
- In general: total variance = predicted (or model) variance + error variance
- Recall the sample variance: s² = Σ (xi − x̄)² / (n − 1)
- In a one-way ANOVA we have SSTotal = SSModel + SSError, with MS = SS / df
- F(dfModel, dfError) = MSModel / MSError
Partitioning the variance for linear regression (using ANOVA)

[Figure: Total = Model (between) + Error (within)]
So linear regression and ANOVA are doing the same thing statistically, and are the same as correlation…
Another quick proof from SPSS

ANOVA (dependent variable: Depression; predictors: (Constant), Ecstasy_frequency):
- Regression: SS = 10.843, df = 1, MS = 10.843, F = 7.531, Sig. = .017
- Residual: SS = 18.717, df = 13, MS = 1.440
- Total: SS = 29.560, df = 14

Correlations (Depression × Ecstasy_frequency):
- Pearson correlation = .606, Sig. (2-tailed) = .017, N = 15
- Correlation is significant at the 0.05 level (2-tailed)
Relating the F and t statistics

F(dfModel, dfError) = MSModel / MSError = r̂² (N − 2) / (1 − r̂²)

Alternatively (as F is the square of t):

t(N − 2) = r̂ √(N − 2) / √(1 − r̂²)

So all we need to know is N and r!
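These identities can be checked against the SPSS output above using only the reported r = .606 and N = 15; the match with the reported F = 7.531 is only approximate because r itself is rounded to three decimals.

```python
import math

# Values read off the SPSS correlation output above
r, N = 0.606, 15

# t(N-2) = r * sqrt(N - 2) / sqrt(1 - r^2), and F = t^2
t = r * math.sqrt(N - 2) / math.sqrt(1 - r ** 2)
F = t ** 2

print(round(t, 3), round(F, 3))   # → 2.747 7.545 (√7.531 ≈ 2.744; the gap is rounding of r)
```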
Basic assumptions
- Variables: ratio or interval, with > 10 (or 12, or 15…) different pairs of values
- Variables normally distributed in the population
- Linear relationship
- Residuals (errors) should be normally distributed
- Independent sampling
Regression health warning!
- Warning 1: outliers
- Warning 2: more than one different population or contrast (aka the "ecological fallacy")

[Example figure: Science (1997) 277:968-71 (519 citations)]
Multiple regression
- Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, y
- The different x variables are combined in a linear way, and each has its own regression coefficient:

y = a1x1 + a2x2 + … + anxn + b + ε

- The a parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y
- i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
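As a sketch of how the a parameters are estimated, the model can be written as a design matrix (one column per predictor plus a column of ones for the intercept) and solved by least squares. The data below are made up, with y constructed from known coefficients so the fit can be checked; numpy's generic least-squares solver stands in for what SPSS or SPM would do internally.

```python
import numpy as np

# Made-up data: y is an exact linear combination of two predictors
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 2.0 * x1 - 1.0 * x2 + 3.0          # true model: a1 = 2, a2 = -1, b = 3

# Design matrix: one column per x variable, plus a column of ones for b
X = np.column_stack([x1, x2, np.ones_like(x1)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

a1, a2, b = coef
print(round(a1, 6), round(a2, 6), round(b, 6))   # → 2.0 -1.0 3.0
```

Because the two predictors here are not collinear, each coefficient captures its variable's independent contribution, which is the point made in the bullet above.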
SPM
- Linear regression is a GLM that models the effect of one independent variable, x, on ONE dependent variable, y
- Multiple regression models the effect of several independent variables, x1, x2 etc., on ONE dependent variable, y
- Both are types of General Linear Model
- The GLM can also allow you to analyse the effects of several independent x variables on several dependent variables, y1, y2, y3 etc., in a linear combination
- This is what SPM does!
Acknowledgements
- Previous years' slides
- David Howell's excellent book Statistical Methods for Psychology (2002)
- And David Howell's website: http://www.uvm.edu/~dhowell/StatPages/StatHomePage.html

The lecturers declare that they do not own stocks, shares or capital investments in David Howell's book, they are not employed by the Duxbury group and do not consult for them, nor are they associated with David Howell, or his friends, or his family, or his cat.