Post on 01-Feb-2018
MPH 2009
Introduction to SPSS
Judith L. Jacobsen, PhD
Iben Gjødsbøl
SPSS
• Menu based / point & click
• Writes and runs programs
• No programming knowledge required
• No syntax errors
• Easy to import files in many formats
BUT:
• Not everything will be included
• Heavy to use if you use it a lot
• The reproducibility is somewhat lost
Variable view window
• Here you define variables: name, label, values etc.
• Example defining variable: sex
– Name: v1
– Label: sex
– Values: 1 = man, 2 = woman
– Missing: 9
– Measure: nominal
• Check out your variable definitions by making a code book:
– Utilities|Variables
Data view/output window
• Data view window:
– Here you type in your data
– In our example: 1 if man, 2 if woman
• Output window:
– The results (if any)
Example with existing data: Blood pressure & obesity
OBESE: weight/ideal weight
BP: systolic blood pressure

SEX     OBESE  BP
..      ..     ..
male    1.31   130      ..      ..     ..
male    1.31   148      female  1.25    98
male    1.19   146      female  1.24   110
male    1.11   122      female  1.27   118
male    1.34   140      female  1.57   116
male    1.17   146      female  1.30   118
male    1.56   132      female  1.32   138
male    1.18   110      female  1.41   142
male    1.04   124      female  1.21   124
male    1.03   150      female  1.20   120
..      ..     ..
..      ..     ..       female  1.73   208
Data
The data are located in a text file (*.txt): bp.txt
• With the following variables:
– SEX: character variable
– OBESE: obesity, i.e. weight/ideal weight
– BP: systolic blood pressure
I.e.:
3 variables
102 observations
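For readers who want to check the file outside SPSS, here is a minimal sketch of reading such a file in Python. It assumes a header line followed by whitespace-separated values; the function name `read_bp` and the `SAMPLE` text (four rows copied from the slide) are only illustrative, and the real bp.txt may contain extra lines before the data.

```python
import io

# Four rows copied from the slide; the real bp.txt holds 102 observations.
SAMPLE = """SEX OBESE BP
male 1.31 130
male 1.31 148
female 1.25 98
female 1.24 110
"""

def read_bp(handle):
    """Read whitespace-delimited rows with a header line into a list of dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        sex, obese, bp = line.split()
        # SEX stays a string, OBESE and BP become numbers
        rows.append({header[0]: sex, header[1]: float(obese), header[2]: int(bp)})
    return rows

rows = read_bp(io.StringIO(SAMPLE))
print(len(rows), rows[0])
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by `open("bp.txt")`, provided the layout matches the assumption above.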
Open Data
Creating the dataset – step 1
• Open SPSS and click Cancel
• Choose File|Read Text Data
– Select No and click Next
Step 2
• Click Delimited if there is e.g. a space between each variable
• Click Fixed width if the variables are aligned in fixed-width columns
• Click Yes – variable names are included
• Click Next
Step 3
• Type in:
– The first case of data begins on line number 9
– One line represents a case
– Import all of the cases
• Click Next
Step 4
• Separate data in columns by inserting variable break lines
• Click Next
Step 5
• Name variables
• Select Data format:
– V1: String
– V2 & V3: Numeric
• Click Next
Step 6
• Choose whether you want to save the file format and paste the syntax
• Click Finish
Variable View
• Define labels, measure etc.
Tabulating data
Choose the menu Analyze > Descriptive Statistics > Crosstabs
click the variable Sex → Row(s)
click the variable BP → Column(s)

TABLE OF SEX BY BP150
Frequency
Row Pct        0        1    Total
Female        54        4       58
           93.10     6.90
Male          42        2       44
           95.45     4.55
Total         96        6      102
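The row percentages in the SEX by BP150 table can be checked by hand. A small sketch in Python, using the frequencies from the table (the helper name `row_percentages` is only illustrative):

```python
# Frequencies from the table: (BP150 = 0, BP150 = 1) for each sex
counts = {"Female": (54, 4), "Male": (42, 2)}

def row_percentages(row):
    """Express each cell as a percentage of its row total."""
    total = sum(row)
    return [round(100 * cell / total, 2) for cell in row]

table = {sex: row_percentages(row) for sex, row in counts.items()}
print(table)
```

Each row sums to 100%, which is what Row Pct means: the distribution of BP150 within each sex.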
Different kinds of Data
• Discrete
– Dichotomous (two alternatives) ♀♂
– Nominal (named categories)
– Ordinal (ordered categories)
• Continuous
– Interval scale (specific distance)
– Ratio scale (same reference)
The statistical methods we choose are determined by the scale of the variables, or combinations of variables, we are interested in.
Data Examples
Ordinal data
• Small – medium – big
Nominal data
• Red – green – blue
• Growth medium A – B – C
Binomial data
• Bacteria growth or not
Interval data
• Arbitrary point of zero (e.g. °C and °F): 20° is not twice as warm as 10°
Ratio data
• Same reference: 50 years is twice as old as 25 years
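Why ratios are meaningless on an interval scale can be seen by converting the same two temperatures to other scales. A small sketch in Python (the function names are only illustrative):

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def c_to_k(celsius):
    """Convert degrees Celsius to kelvin (a ratio scale with a true zero)."""
    return celsius + 273.15

ratio_c = 20 / 10                  # 2.0 in Celsius
ratio_f = c_to_f(20) / c_to_f(10)  # 68 / 50 in Fahrenheit
ratio_k = c_to_k(20) / c_to_k(10)  # 293.15 / 283.15 in kelvin
print(ratio_c, ratio_f, ratio_k)
```

The "ratio" changes with the choice of zero point, so only differences carry meaning for °C and °F. Ages have a common zero, so 50 years really is twice 25 years.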
Overview
Interval & ratio data contain more information than ordinal data, which contain more information than nominal data
One can always go from continuous ⇒ discrete ⇒ ordinal ⇒ nominal
But never the other way!
Views
• Variable view
– Here you can see all the variables. If information
about code values is insufficient, this must be
filled in
• Data view
– Here you can see the observations
• Don’t forget to save your data set: File|Save As
Descriptive Statistics: min, max, mean
• Procedure:
– Analyze|Descriptive Statistics|Descriptives
– Options: Here you can choose min, max and mean
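To see what these three statistics are, here is a small Python sketch. It uses only the ten male BP values visible on the data slide, not the full 102 observations, so the numbers will not match the SPSS output for the whole file.

```python
import statistics

# BP for the ten male cases listed on the data slide (a subset of bp.txt)
bp_male = [130, 148, 146, 122, 140, 146, 132, 110, 124, 150]

bp_min = min(bp_male)
bp_max = max(bp_male)
bp_mean = statistics.mean(bp_male)
print(bp_min, bp_max, bp_mean)
```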
Frequency Tables
• Procedure:
– Analyze|Descriptive Statistics|Frequencies
– Statistics: here you can choose mean, mode and median
Frequency table
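A frequency table simply counts how often each value occurs. A minimal Python sketch, using only the nine female BP values listed together on the data slide (a subset, so the results are illustrative only):

```python
import statistics
from collections import Counter

# BP for the nine female cases listed together on the data slide
bp_female = [98, 110, 118, 116, 118, 138, 142, 124, 120]

freq = Counter(bp_female)  # value -> number of occurrences
for value in sorted(freq):
    percent = 100 * freq[value] / len(bp_female)
    print(value, freq[value], round(percent, 1))

bp_median = statistics.median(bp_female)
bp_mode = statistics.mode(bp_female)
```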
Graphs
• Procedure:
– Graphs|Legacy Dialogs|Bar/Histogram/Pie
Creating categorical variables by recoding
• Procedure:
– Transform|Recode into Different Variables
• Click obese → Numeric Variable -> Output Variable window
• Give the new variable a name and a label
How To
• Click Old and New Values
• Range, LOWEST through value: 1,30 → choose new value = 1. Click Add
• Range, value through HIGHEST: 1,31 → choose new value = 0. Click Add
• Standard setting: Reference group = 0
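The recoding rule above can be written as a single comparison. A minimal sketch in Python, assuming OBESE is recorded with two decimals so that "up to 1,30" and "from 1,31" cover all values; the function name `recode_obese` is only illustrative.

```python
def recode_obese(obese, cutoff=1.30):
    """Return 1 for values up to and including the cutoff, 0 for values above it."""
    return 1 if obese <= cutoff else 0

# A few OBESE values from the data slide
values = [1.31, 1.25, 1.30, 1.57, 1.04]
recoded = [recode_obese(v) for v in values]
print(recoded)
```

As on the slide, the group coded 0 (here OBESE above 1,30) becomes the reference group.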
Value Labels
• Type in the value labels of the new variable in the Variable View window
• By using the same method, create the binary variable:
– BP > 150 / BP ≤ 150
– Ref. group: BP > 150
Tabulating data
• Procedure:
– Analyze|Descriptive Statistics|Crosstabs
– Click the variable sex → Row(s)
– Click the variable BP BINARY → Column(s)
• Row percentages: Click Cells
Output
Table of SEX by BP150 (High BP)
Categorical Data Example
• In a randomised investigation (Storr et al., Lancet, 1987) the effect of a single dose of prednisolone is compared to placebo for children with acute asthma
• 73 placebo and 67 prednisolone
• The result section states:
“2 patients in the placebo group (3%, 95% confidence interval −1 to 6%) and 20 in the prednisolone group (30%, 19 to 41%) were discharged at first examination (P < 0.0001)”
• The method section explains that the above P-value is calculated using Fisher’s exact test
Let’s check that out
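The quoted percentages and confidence intervals can be checked first. The slides do not say how the paper computed the intervals; the sketch below assumes the usual normal approximation (p ± 1.96·SE), which happens to reproduce the quoted numbers, including the negative lower limit. The function name `wald_ci` is only illustrative.

```python
import math

def wald_ci(events, n, z=1.96):
    """Proportion with a normal-approximation 95% confidence interval."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

placebo = [round(100 * x) for x in wald_ci(2, 73)]
prednisolone = [round(100 * x) for x in wald_ci(20, 67)]
print(placebo, prednisolone)  # percentages: estimate, lower, upper
```

A lower limit below 0% is a known weakness of this approximation for small counts.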
Data
                 Group
Response         Placebo   Prednisolone
Discharged             2             20
Hospitalised          71             47
Total                 73             67

SPSS needs data arranged in columns:

Treatment   Status
Placebo     Discharged
Placebo     Discharged
Placebo     Hospitalised
Placebo     Hospitalised
Placebo     Hospitalised
Solution
• Chi-square test for independence (chisq.test in R)
• To calculate RR it is important to write the response variable last in the table specification:
• Procedure: Analyze|Descriptive Statistics|Crosstabs
– click the variable Treatment → Row(s)
– click the variable Status → Column(s)
– click the button Statistics > new menu > tick Chi-square
Data
• There must be an easier
way than
copy – paste
But I didn’t find it!
140 rows in all.
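Outside SPSS, the 140 rows can be generated from the four cell counts instead of being pasted by hand. A minimal Python sketch (the variable names are only illustrative):

```python
# Cell counts from the 2x2 table of the asthma example
counts = {
    ("Placebo", "Discharged"): 2,
    ("Placebo", "Hospitalised"): 71,
    ("Prednisolone", "Discharged"): 20,
    ("Prednisolone", "Hospitalised"): 47,
}

# Repeat each (Treatment, Status) pair as many times as its count
rows = [cell for cell, n in counts.items() for _ in range(n)]

print(len(rows))
print(rows[0], rows[-1])
```

The resulting list has one (Treatment, Status) pair per patient and could be written to a text file for import.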
Analyze
Procedure:
– Analyze|Descriptive Statistics|Crosstabs
– Click the variable treatment → Row(s)
– Click the variable Status → Column(s)
Output
Case Processing Summary
                                        Cases
                      Valid            Missing          Total
                      N    Percent     N    Percent     N    Percent
Treatment * Status    140  100,00%     0    0,00%       140  100,00%

Treatment * Status Crosstabulation
Count
                        Status
                        Disch   Hosp   Total
Treatment   Placebo         2     71      73
            Predni         20     47      67
Total                      22    118     140

Ready for your report
Procedure: File|Export
Result
Chi-Square Tests
                          Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                           (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square        19,387(b)   1    0,000
Continuity Correction(a)  17,394      1    0,000
Likelihood Ratio          21,753      1    0,000
Fisher's Exact Test                                      0,000        0,000
N of Valid Cases          140
a Computed only for a 2x2 table
b 0 cells (,0%) have expected count less than 5. The minimum expected count is 10,53.
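The test statistics in this output can be reproduced from the four cell counts with standard textbook formulas. The sketch below is an illustration in plain Python, not SPSS's own implementation; the two-sided Fisher p-value is computed by summing all tables that are no more probable than the observed one, which is one common convention.

```python
import math

a, b, c, d = 2, 71, 20, 47           # placebo: disch, hosp; prednisolone: disch, hosp
n = a + b + c + d
r1, r2 = a + b, c + d                # row totals (73, 67)
c1, c2 = a + c, b + d                # column totals (22, 118)

def chi2_p(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

denom = r1 * r2 * c1 * c2
pearson = n * (a * d - b * c) ** 2 / denom
yates = n * (abs(a * d - b * c) - n / 2) ** 2 / denom   # continuity correction

observed = [a, b, c, d]
expected = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))

def hyper(k):
    """Probability of k discharged placebo patients given fixed margins."""
    return math.comb(c1, k) * math.comb(c2, r1 - k) / math.comb(n, r1)

p_observed = hyper(a)
fisher_two_sided = sum(
    hyper(k) for k in range(0, min(r1, c1) + 1)
    if hyper(k) <= p_observed * (1 + 1e-7)   # small tolerance for float ties
)

print(pearson, yates, likelihood_ratio, min(expected))
print(chi2_p(pearson), fisher_two_sided)
```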
Furthermore
Chi-Square 19.387 p-value <.0001
Likelihood Ratio Chi-Square 21.7528 p-value <.0001
Continuity-corrected Chi-Square 17.3942 p-value <.0001
Fisher’s Exact Test p-value <.0001
Thus: a strongly significant difference between the two treatments
• Fisher’s exact test should be used when the expected value in any cell is < 5.
• Recall that no cells had an expected count < 5. All expected values are >> 5, so use the Chi-square test
The authors were probably confused because one of the observed numbers is < 5
Stratified analysis/Mantel Haenszel analysis
• Example:
– The association between sex and BP controlling for obesity
Stratified analysis/Mantel Haenszel analysis
• Procedure:
– Analyze|Descriptive Statistics|Crosstabs
– Click Obese BINARY → Layer 1 of 1
Statistics
• Click Statistics and choose:
– Chi-square
– Cochran’s and Mantel-Haenszel statistics
Table of sex by high BP controlling
for obesity
Test for effect modification
• Breslow-Day test:
– Tests if the 2 separate OR are different
– H0: homogeneity
– If homogeneity: calculate ORMH
– If effect modification: stop analysis
• In this case: χ2 = 0,709 → p > 0,05 → homogeneity
Mantel Haenszel estimate (ORMH)
• ORMH = 1,098
→ A weighted average of separate OR-estimates
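How the weighted average is formed can be sketched with the standard Mantel-Haenszel formula, ORMH = Σ(a·d/n) / Σ(b·c/n), summed over strata. The stratum tables below are hypothetical numbers chosen for illustration, not the course data, and the function name is only illustrative.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio from a list of 2x2 tables (a, b, c, d), one per stratum."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata (NOT the bp.txt data): both have a stratum-specific OR of 4
hypothetical = [(10, 20, 5, 40), (20, 40, 10, 80)]

or_mh = mantel_haenszel_or(hypothetical)
print(or_mh)
```

When the stratum-specific odds ratios agree, as in this made-up example, ORMH equals their common value; otherwise larger strata pull the estimate towards their own OR.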
Mantel Haenszel test
• χ2MH is a test for conditional independence: no association between exposure (sex) and outcome (high BP), adjusted for the confounder (obesity)
• → tests if ORMH = 1 (H0)
In our case: χ2MH = 0,707 → H0 (ORMH = 1) is not rejected
Conclusion
• ORMH does not differ from 1 → there is conditional independence, i.e. there is no association between sex and BP when controlling for obesity
• If ORMH differs substantially from the crude (unstratified) OR, the control variable is a confounder