Overview of Statistical Software: SPSS, Stata, SAS, R
Debby Kermer
Data Services
George Mason University
Software
SPSS   v 25    spss.com
Stata  v 15    stata.com
SAS    v 9.4   sas.com
R      3.5.1   r-project.org
Pros and Cons
            SPSS       Stata     SAS        R
Use         High       Low       High       Growing
Jobs        Some       Academic  Many       More
Cost        Expensive  Depends   Expensive  Free
Learning    Easy       Middle    Hard       Very Hard
Extensible  Scripts    Users     Built-in   Users
What can it do well?
SPSS
ANOVA, Factor Analysis, Discriminant Analysis
License modules separately: Trends, Missing Data, Tables
Stata
Regression, diagnostics, and robust regression; Analysis of Survey Data, Time Series, SEM
Freely downloadable packages
SAS
Data Management; Complex models; Mixed Model Analysis
License components separately: SAS/GIS, SAS/STAT, SAS/ACCESS
R
Anything, if you can find a [well written] package
Download additional packages from CRAN for free
Who Uses it?
SPSS
Academic: Social Scientists (the “SS”), and non-scientists
Non-Academic: Companies that just want to do neat things
Stata
Academic: Economics, Public Policy, Biomedical Researchers
Non-Academic: Groups that often work with academics
SAS
Academic: Statistics, Medicine
Non-Academic: Government, and corporations who are serious about data
R
Academic: Statistics, various
Non-Academic: Small companies with big plans, and others serious about data
Which to Pick?
SPSS
Easy to start, limited capability
Best for those with infrequent and/or minimal needs
Stata
Easy syntax, highly extensible
Best for academics doing cutting-edge research
SAS
Hard to learn, highly capable
Best for managing huge and/or complex datasets
R
Hard to learn, highly extensible
Best for those who program and know what they are doing
Job Prospects
R vs SAS vs Python: survey of selected “quantitative professionals”, 2016
http://www.burtchworks.com/2016/07/13/sas-r-python-survey-2016-tool-analytics-pros-prefer/
Use in Academia
Number of scholarly articles on Google Scholar, 2015
http://r4stats.com/articles/popularity/
Use in Industry
Number of analytics jobs on Indeed.com, February 2014
http://r4stats.com/articles/popularity/
Companies using it
http://blog.datacamp.com/statistical-language-wars-the-infograph/
Use
Interface: SPSS, Stata, SAS, R
GUI: SPSS, Stata, SAS Studio, R (Deducer & R Cmdr)
Syntax
Contingency table for variables q1 and q2, with only n, row %, and χ2 test
SPSS
CROSSTABS /TABLES=q1 BY q2 /STATISTICS=CHISQ /CELLS=COUNT ROW.
Stata
tabulate q1 q2, row chi2
SAS
PROC FREQ data=test;
table q1*q2 / NOCOL NOPERCENT CHISQ;
RUN;
R
mytable <- table(q1, q2)
mytable
prop.table(mytable, 1)
chisq.test(mytable)
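The R lines above assume q1 and q2 already exist. A minimal runnable sketch, using simulated responses (the data, sample size, and category names are assumptions for illustration only), might look like this:

```r
# Simulated survey responses; q1 and q2 are hypothetical variables.
set.seed(1)
q1 <- factor(sample(c("Yes", "No"), 200, replace = TRUE))
q2 <- factor(sample(c("Low", "Medium", "High"), 200, replace = TRUE))

mytable <- table(q1, q2)           # cell counts (n)
row_pct <- prop.table(mytable, 1)  # proportions within each row
test    <- chisq.test(mytable)     # Pearson chi-squared test

print(mytable)
print(round(100 * row_pct, 1))     # row percentages
print(test)
```

Unlike the one-line commands in SPSS, Stata, and SAS, each piece of output is requested separately and can be stored as an object for later use.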
Learning Curve
http://guides.nyu.edu/quant/statsoft#s-lib-ctab-6295863-7
Important Differences
Working with multiple files
SPSS Multiple datasets allowed, active data can be specified
Stata One dataset at a time, allows multiple instances
SAS Data always specified, no datasets in memory
R Data always specified, multiple objects in memory
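As a rough illustration of the R row ("data always specified, multiple objects in memory"), the sketch below keeps two data frames in memory and names the data in every call. The object and column names are hypothetical.

```r
# Two hypothetical data frames held in memory at once.
students <- data.frame(id = 1:4, score = c(82, 91, 77, 68))
schools  <- data.frame(id = 1:4, school = c("A", "B", "A", "C"))

# Each call names the data it works on; there is no "active" dataset.
mean_score <- mean(students$score)
n_schools  <- length(unique(schools$school))

# Objects can be combined without closing either one.
merged <- merge(students, schools, by = "id")
print(merged)
```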
Directories & Data Files
       Set directory              Data file
SPSS   cd "directory"             filename.sav
Stata  cd "directory"             filename
SAS    libname name "directory"   name.filename
R      setwd("directory")         filename.RData   (use / or \\ in paths)
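For the R row, a small sketch of setting the working directory and saving and loading an .RData file could look like the following. It uses a temporary directory and a made-up data frame, so the paths and names are assumptions, not part of the original slides.

```r
# Work in a temporary directory so the sketch does not touch project folders.
old_dir <- setwd(tempdir())

# Forward slashes work on every platform; backslashes must be doubled
# inside R strings ("C:\\data\\file.RData"). file.path() joins with "/".
data_file <- file.path(getwd(), "filename.RData")

survey <- data.frame(q1 = c(1, 2, 2), q2 = c(3, 1, 2))  # hypothetical data
save(survey, file = data_file)   # write an .RData file

rm(survey)
loaded_names <- load(data_file)  # restores the object under its saved name
print(loaded_names)

setwd(old_dir)                   # restore the original working directory
```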
Labeled/Categorical Variables
SPSS   separate  VALUE LABELS assigns labels to levels
Stata  shared    label define creates a 'label'
SAS    shared    PROC FORMAT step creates label 'formats'
R      separate  defining a 'factor' creates labels for levels
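To make the R row concrete, here is a minimal sketch of defining a factor with labels for its levels. The codes and label text are invented for illustration.

```r
# Hypothetical numeric codes, as they might arrive from a survey file.
codes <- c(1, 2, 2, 3, 1)

# factor() attaches labels to the levels of this one variable.
q1 <- factor(codes, levels = c(1, 2, 3),
             labels = c("Disagree", "Neutral", "Agree"))

print(levels(q1))       # the labels
print(as.integer(q1))   # the underlying integer codes
print(table(q1))        # counts reported by label
```

Because the labels live inside the factor itself, a second variable with the same coding needs its own factor() call, which is the sense of "separate" in the table.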
Missing Values
       Symbol  Treated as                > #    < #
SPSS   .       no value or user defined  FALSE  FALSE
Stata  .       highest possible value    TRUE   FALSE
SAS    .       lowest possible value     FALSE  TRUE
R      NA      no value, comparable      TRUE   TRUE
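One possible reading of the TRUE / TRUE entries for R is that a missing value survives both a "> #" and a "< #" selection when logical indexing is used. The sketch below, with made-up values, shows that behavior and two common ways to exclude missing cases.

```r
x <- c(2, NA, 9)   # hypothetical values with one missing entry

gt <- x > 5        # FALSE NA TRUE: the comparison itself is missing
lt <- x < 5        # TRUE NA FALSE

# Logical indexing with NA keeps a placeholder for the missing case,
# so the NA shows up in both the "> 5" and the "< 5" subsets.
above <- x[x > 5]          # NA 9
below <- x[x < 5]          # 2 NA

# which() or an explicit is.na() check excludes missing values.
above_clean <- x[which(x > 5)]        # 9
below_clean <- x[!is.na(x) & x < 5]   # 2
```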
Code Characteristics
       Code File    Code Prompt  Command End    Case Sensitive  Code Comment
SPSS   Syntax File  [nothing]    .              No              *
Stata  Do file      .            [line break]   Yes             *
SAS    Program      [line #]     ;              No              *
R      R Script     > or +       [interpreted]  Yes             #
Data Files
Files
       Data           Syntax     Output        Others
SPSS   .sav           .sps       .spo / .spv   .por
Stata  .dta           .do        .smcl / .log  .dct
SAS    .sas7bdat      .sas       .lst / .log   .sas7???
R      .RData / .rda  .R / .txt  .txt          .R??
Opening other File Types in…
SPSS   Can open Stata and SAS directly
Stata  Use usespss, R, or Stat/Transfer (commercial)
SAS    Can import SPSS and Stata directly
R      Use packages foreign or haven to convert
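As a hedged sketch of converting with the foreign package, the example below writes a made-up data frame to a Stata .dta file in a temporary directory and reads it back. In practice the file would come from another program; the data and file name here are assumptions.

```r
library(foreign)  # a recommended package bundled with most R installations

# Hypothetical data; a real .dta file would normally come from a Stata user.
survey <- data.frame(q1 = c(1, 2, 2, 1), q2 = c(3.5, 1.0, 2.5, 4.0))

dta_file <- file.path(tempdir(), "survey.dta")
write.dta(survey, dta_file)       # write a Stata-format file
from_stata <- read.dta(dta_file)  # read it back into a data frame

print(from_stata)
```

read.dta() only handles older Stata file versions; for recent files, haven's read_dta() (with read_sav() for SPSS and read_sas() for SAS) is the usual alternative, assuming haven is installed.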
Resources
Help transitioning, links to help for each software
http://dataservices.gmu.edu/resources/software
Single Statistical Software Initiative
https://wikis.uit.tufts.edu/confluence/display/SSSI/Home