Post on 21-Dec-2015
We can match statistical methods to the level of measurement of the two variables that we want to assess:

Level of Measurement
            Nominal          Ordinal      Interval                  Ratio
Nominal     Chi-square       Chi-square   T-test, ANOVA             T-test, ANOVA
Ordinal     Chi-square       Chi-square   ANOVA                     ANOVA
Interval    T-test, ANOVA    ANOVA        Correlation, Regression   Correlation, Regression
Ratio       T-test, ANOVA    ANOVA        Correlation, Regression   Correlation, Regression
However, we should only use these tests when:
We have a normal distribution for an interval or ratio level variable.
The dependent variable (for Correlation, T-test, ANOVA, and Regression) is interval or ratio.
Our sample has been randomly selected or is from a population.
Interpreting a Correlation from an SPSS Printout
Correlations
                                                  Educational Level (years)   Beginning Salary
Educational Level (years)   Pearson Correlation   1                           .633**
                            Sig. (2-tailed)       .                           .000
                            N                     474                         474
Beginning Salary            Pearson Correlation   .633**                      1
                            Sig. (2-tailed)       .000                        .
                            N                     474                         474
**. Correlation is significant at the 0.01 level (2-tailed).
A correlation:
Is an association between two interval or ratio variables.
Can be positive or negative.
Measures the strength of the association between the two variables and whether it is large enough to be statistically significant.
Can range from -1.00 to 1.00.
Example: Types of Relationships

Positive                       Negative                       No Relationship
Income ($)   Education (yrs)   Income ($)   Education (yrs)   Income ($)   Education (yrs)
20,000       10                20,000       18                20,000       14
30,000       12                30,000       16                30,000       18
40,000       14                40,000       14                40,000       10
50,000       16                50,000       12                50,000       12
75,000       18                75,000       10                75,000       16
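The three example data sets above can be checked by computing the Pearson correlation directly. The following is a minimal sketch in plain Python; the function name `pearson_r` and the variable names are illustrative choices, not part of SPSS or of the original slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Income column is the same in all three example tables
income = [20000, 30000, 40000, 50000, 75000]
education_positive = [10, 12, 14, 16, 18]
education_negative = [18, 16, 14, 12, 10]
education_none = [14, 18, 10, 12, 16]

r_positive = pearson_r(income, education_positive)
r_negative = pearson_r(income, education_negative)
r_none = pearson_r(income, education_none)

print(f"positive: {r_positive:.3f}")
print(f"negative: {r_negative:.3f}")
print(f"no relationship: {r_none:.3f}")
```

The first two data sets give coefficients close to +1 and -1, and the third gives a coefficient close to 0, which matches the labels on the three tables.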
The stronger the correlation, the closer it will be to 1.00 or -1.00. Weak correlations will be close to 0.00 (either positive or negative).
You can see the degree of correlation (association) by using a scatterplot.
[Scatterplot: Educational Level (years) on the vertical axis, Current Salary on the horizontal axis]
Looking at a scatterplot of current and beginning salary from the same data set, we can see a stronger correlation.
[Scatterplot: Beginning Salary on the vertical axis, Current Salary on the horizontal axis]
If we run the correlation between these two variables in SPSS, we find
Correlations
                                         Beginning Salary   Current Salary
Beginning Salary   Pearson Correlation   1                  .880**
                   Sig. (2-tailed)       .                  .000
                   N                     474                474
Current Salary     Pearson Correlation   .880**             1
                   Sig. (2-tailed)       .000               .
                   N                     474                474
**. Correlation is significant at the 0.01 level (2-tailed).
For these two variables, suppose we test a hypothesis at a significance level of .01.
Alternative Hypothesis: There is a positive association between beginning and current salary.
Null Hypothesis: There is no association between beginning and current salary.
Decision: r (correlation) = .88 at p = .000 (reported as .000, meaning p < .001). .000 is less than .01.
We reject the null hypothesis and accept the alternative hypothesis!
(Bonus Question): Why would we expect the previous correlation to be statistically significant below the p = .01 level?
Answer: This is a large data set (N = 474). This makes it likely that if there is a correlation, it will be statistically significant at a low significance (p) level.
Larger data sets are less likely to be affected by sampling or random error!
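To see why sample size matters here, a correlation can be converted into a t statistic with n - 2 degrees of freedom using the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies it to the r and N from the printout; the helper name `t_from_r` and the comparison sample of 10 cases are assumptions made purely for illustration.

```python
import math

def t_from_r(r, n):
    """t statistic (df = n - 2) for testing whether a Pearson r differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r and N taken from the SPSS printout for beginning and current salary
t_printout = t_from_r(0.880, 474)

# Same r in a hypothetical sample of only 10 cases (illustration only)
t_small_sample = t_from_r(0.880, 10)

print(f"N = 474: t = {t_printout:.2f}")
print(f"N = 10:  t = {t_small_sample:.2f}")
```

For the same r, the t statistic grows with the square root of the sample size, so with N = 474 even a much weaker correlation than .88 would clear the .01 threshold.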
Other important information on correlation
Correlation does not tell us if one variable “causes” the other, so there really isn’t an independent or dependent variable.
With correlation, you should be able to draw a straight “best fit” line through the points in the distribution. Points that are off the “best fit” line indicate that the correlation is less than perfect (-1/+1).
Regression is the statistical method that allows us to determine whether the value of one interval/ratio level variable can be used to predict or determine the value of another.
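As a rough illustration of using one interval/ratio variable to predict another, the sketch below fits a least-squares “best fit” line to the “Positive” example data from the Types of Relationships table, predicting income from years of education. The function name `fit_line` and the choice of 15 years as the value to predict for are assumptions for the example.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# "Positive" example: education (yrs) as predictor, income ($) as outcome
education = [10, 12, 14, 16, 18]
income = [20000, 30000, 40000, 50000, 75000]

slope, intercept = fit_line(education, income)

# Predicted income for a hypothetical person with 15 years of education
predicted_income = intercept + slope * 15

print(f"slope = {slope:.0f} dollars per year of education")
print(f"intercept = {intercept:.0f}")
print(f"predicted income at 15 years = {predicted_income:.0f}")
```

With only five points this is a toy fit, but it shows how regression turns the association into a prediction rule, which correlation alone does not provide.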
Another measure of association is a t-test.
T-tests
Measure the association between a nominal level variable and an interval or ratio level variable.
Look at whether the nominal level variable causes a change in the interval/ratio variable.
Therefore the nominal level variable is always the independent variable and the interval/ratio variable is always the dependent.
Example of t-test: Self-Esteem Scores

        Men      Women
        32       34
        44       18
        56       52
        18       16
        21       33
        39       26
        25       35
        28       20
Mean    32.875   29.25
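The self-esteem example can be worked through by hand. The sketch below computes an independent samples t statistic with a pooled variance (the "equal variances assumed" version); the function name `independent_t` is illustrative, and this is a plain-Python approximation of the calculation, not SPSS output.

```python
import math

def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    # Sums of squared deviations within each group
    ss1 = sum((v - mean1) ** 2 for v in group1)
    ss2 = sum((v - mean2) ** 2 for v in group2)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_difference, df

men = [32, 44, 56, 18, 21, 39, 25, 28]
women = [34, 18, 52, 16, 33, 26, 35, 20]

t_value, df = independent_t(men, women)
print(f"t = {t_value:.3f}, df = {df}")
```

Here t comes out at roughly 0.59 with 14 degrees of freedom, well below the two-tailed critical value of about 2.145 at the .05 level, so the gap between 32.875 and 29.25 is not statistically significant in a sample this small.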
Important things to know about an independent samples t-test
It can only be used when the nominal variable has only two categories.
Most often the nominal variable pertains to membership in a specific demographic group or a sample.
The association examined by the independent samples t-test is whether the mean of the interval/ratio variable differs significantly between the two groups. If it does, that means that group membership “causes” the change or difference in the mean score.
Looking at the difference in means between the two groups, can we tell if the difference is large enough to be statistically significant?
Group Statistics
                   Gender   N     Mean       Std. Deviation   Std. Error Mean
Beginning Salary   Male     258   $20301.4   *********        $567.275
                   Female   216   $13092.0   *********        $199.742
T-test results
Independent Samples Test (Beginning Salary)
                                              Equal variances assumed   Equal variances not assumed
Levene's Test for Equality of Variances
  F                                           105.969
  Sig.                                        .000
t-test for Equality of Means
  t                                           11.152                    11.987
  df                                          472                       318.818
  Sig. (2-tailed)                             .000                      .000
  Mean Difference                             $7,209.43                 $7,209.43
  Std. Error Difference                       $646.447                  $601.413
  95% Confidence Interval of the Difference
    Lower                                     $5939.16                  $6026.19
    Upper                                     $8479.70                  $8392.67
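The "equal variances not assumed" row can be reproduced from the Group Statistics printout alone, because that version of the test (Welch's t-test) builds the standard error of the difference from the two group standard errors. The sketch below uses only the N, mean, and standard error values shown in the printout; the function name `welch_from_summary` is an illustrative choice.

```python
import math

def welch_from_summary(mean1, se1, n1, mean2, se2, n2):
    """Welch t statistic, its standard error, and approximate df from group summaries."""
    se_difference = math.sqrt(se1 ** 2 + se2 ** 2)
    t = (mean1 - mean2) / se_difference
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se1 ** 2 + se2 ** 2) ** 2 / (se1 ** 4 / (n1 - 1) + se2 ** 4 / (n2 - 1))
    return t, se_difference, df

# Values copied from the Group Statistics table (male, then female)
t_value, se_difference, welch_df = welch_from_summary(
    20301.4, 567.275, 258,
    13092.0, 199.742, 216,
)

print(f"Std. Error Difference = {se_difference:.3f}")
print(f"t = {t_value:.3f}")
print(f"df = {welch_df:.1f}")
```

The results agree with the printout to within rounding (standard error near 601.4, t near 11.99, df near 318.8). Because Levene's test is significant (Sig. = .000), this row is the more appropriate one to read.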
Positive and Negative t-tests
Your t value will be positive when the lowest value category (1 in a 1,2 coding, or 0 in a 0,1 coding) is entered into the grouping menu first and the mean of that first group is higher than the mean of the second group.
Your t value will be negative when the lowest value category is entered into the grouping menu first and the mean of the second group is higher than the mean of the first group.
Paired Samples T-Test
Used when respondents have taken both a pre-test and a post-test using the same measurement tool (usually a standardized test).
Supplements the results obtained when the mean post-test score for all the respondents is subtracted from the mean pre-test score. If there is a change in the scores between the pre-test and the post-test, it usually means that the intervention is effective.
A statistically significant paired samples t-test usually means that the change between pre-test and post-test scores is large enough that it cannot simply be due to random or sampling error.
An important qualification here is that the change between pre-test and post-test scores must be in the direction (positive/negative) specified in the hypothesis.
Paired samples t-test (continued)
For example, suppose our hypothesis states that participation in the welfare reform experiment is associated with a positive change in welfare recipient wages from work, but participation in the experiment actually decreased wages. Then our hypothesis would not be confirmed. We would accept the null hypothesis and reject the alternative hypothesis.
Pre-test wages: Mean = $400 per month for each participant.
Post-test wages: Mean = $350 per month for each participant.
However, we need to know the t-test value to know if the difference in means is large enough to be statistically significant.
What are the alternative and null hypotheses for this study?
Let’s test a hypothesis for an independent t-test
We want to know if women have higher scores on a test of exam-related anxiety than men.
The researcher has set the significance level for this study at p = .05.
On the SPSS printout, t = 2.6, p = .03.
What are the alternative and null hypotheses?
Can we accept or reject the null hypothesis?
Answer
Alternative hypothesis: Women have higher levels of exam-related anxiety than men as measured by a standardized test.
Null hypothesis: There will be no difference between men and women on the standardized test of exam-related anxiety.
Reject the null hypothesis (p = .03 is less than the significance level of .05). Accept the alternative hypothesis. There is a relationship.
Computing a Correlation
Select Analyze
Select Correlate
Select two or more variables and click add
Click OK

Computing an independent t-test
Select Analyze
Select Compare Means
Select Independent T-test
Select Test Variable (Dependent Variable, must be interval or ratio)
Select Grouping Variable (must be nominal, only two categories)
Select numerical category for each group (usually group 1 = 1, group 2 = 2)
Click OK

Computing a paired sample t-test
Select Analyze
Select Compare Means
Select Paired Samples T-test
Highlight two interval/ratio variables (should be from pre and post test)
Click on arrow
Click OK
Data from Paired Sample T-test
Paired Samples Statistics
                            Mean       N     Std. Deviation   Std. Error Mean
Pair 1   Current Salary     $34419.6   474   *********        $784.311
         Beginning Salary   $17016.1   474   *********        $361.510
More data from paired samples t-test
Paired Samples Test
Pair 1: Current Salary - Beginning Salary
Paired Differences
  Mean                                        $17403.5
  Std. Deviation                              *********
  Std. Error Mean                             $496.732
  95% Confidence Interval of the Difference
    Lower                                     $16427.4
    Upper                                     $18379.6
t                                             35.036
df                                            473
Sig. (2-tailed)                               .000
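The paired samples t value is the mean of the paired differences divided by its standard error, with N - 1 degrees of freedom. The short sketch below checks this against the printout, using only numbers that appear in the tables above; the variable names are illustrative.

```python
# Values copied from the Paired Samples Test printout
mean_difference = 17403.5      # mean of (Current Salary - Beginning Salary)
se_mean_difference = 496.732   # Std. Error Mean of the paired differences
n_pairs = 474                  # N from Paired Samples Statistics
ci_lower, ci_upper = 16427.4, 18379.6

# t is the mean difference measured in standard-error units
t_value = mean_difference / se_mean_difference
df = n_pairs - 1

# The 95% interval is mean +/- (critical t) * SE, so the critical t can be recovered
implied_critical_t = (ci_upper - ci_lower) / 2 / se_mean_difference

print(f"t = {t_value:.3f}, df = {df}")
print(f"critical t implied by the 95% CI = {implied_critical_t:.3f}")
```

The recomputed t matches the printed 35.036, and the interval implies a critical t of about 1.965. Since the confidence interval does not include zero, the difference between current and beginning salary is statistically significant.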