Getting the Most From Your Survey Data
April 27, 2016
Walter R. Paczkowski, Ph.D., Data Analytics Corp. and Rutgers University
Session Objectives
1. Present the advantages of using JMP for survey
research.
2. Highlight key features and functionality of JMP for
survey research.
3. Demonstrate how to use JMP for analyzing survey
data based on a Case Study.
Session Structure
This session has three sections:
1. Introduction and motivation.
2. Why use JMP?
3. Analyze some Case Study survey data.
The Case Study
Analysis of National Survey of Veterans, Active Duty
Service Members, Demobilized National Guard and
Reserve Members, Family Members, and Surviving
Spouses; 2010
Source: http://www.va.gov/VETDATA/Surveys.asp
Role of Surveys
Decision Makers Need Information
Every public and private sector leader needs information –
not data – to make decisions:
Public Sector
What services to provide.
How much to increase/decrease taxes.
What policies to implement.
Private Sector
What product to sell.
How much to charge.
How to promote the product.
Surveys provide data that must be converted to information.
Information Comes From Analyzing Data
Survey data are raw, unfiltered, disorganized pieces of
material.
Data are like Lego bricks that have to be assembled.
– Like the bricks, they can be assembled in different
ways reflecting your creativity and questions.
Information is what you build and extract from the data
by manipulating the data in different ways – with
different views.
What to do with data: Data have to be analyzed.
What Is Analysis?
The word analysis means to break into parts. The Greek
root: “a breaking up, a loosening, releasing.”*
Analyzing survey data means looking for relationships,
trends, patterns, and anomalies beneath the surface of
the obvious.
Analysis and reporting of survey data are often treated
as synonymous – but they’re different.
– Simple tables and graphs are created without
looking more deeply into what the survey data have
to offer.
– Metaphor: Looking at an ocean for the first time,
seeing whales break the surface, and reporting that
only whales occupy oceans.
*Source: http://www.etymonline.com/index.php?term=analysis
Not All Analyses Are The Same
There are two types of analysis.
Static
– Most prevalent form.
– It doesn’t allow for different views of the same data.
Perfectly non-interactive; fixed.
– There’s no drill-down for additional or unexpected
insight; no exploration of relationships.
Dynamic
– Dynamic means active, real time changing of views
and drilling-down for insight and relationships.
Highly interactive: drag & drop or click an icon to
quickly create new views of the data.
– This is what survey data analysis should be.
– The gained insight is the information.
The Typical Analysis Tools Are Static
The typical tools are tables and graphs.
Simple, and extensive, tables combined with the
graphs are often the sole forms of “analyses.”
– The reports are created at the same time as the
“analysis” – they are the same.
Tables don’t provide dynamic views of data.
– Most reports mimic “the tabs” which are simple
cross-tabulations at best.
The static nature of the graphs makes it difficult to
quickly explore and test ideas.
– No links back to the original data for drill-down or
filtering.
The Result: Understanding Is Lost
The result of focusing almost exclusively on static tables
and graphs is that the wealth of understanding that could
be gained from survey data is left untapped.
Important relationships are left undiscovered, unexplored,
buried beneath the surface.
Simple or naïve conclusions and recommendations
result.
Analysis Goal: The goal of researchers should be to provide
actionable information to decision makers
based on dynamic tools and sophisticated
analyses.
Analyzing Survey Data Is Complex & Iterative
1. Modify the data to aid analysis and/or interpretation.
2. Tabulate the data dynamically with descriptive statistics and different views or cuts of the data.
3. Graph the data to dynamically look for relationships, trends, patterns, and anomalies.
4. Model the data dynamically for different perspectives.
– Regression models (a family).
– Recursive partitioning.
– Perceptual maps.
– And others.
5. Test hypotheses about the data and model results.
– There are formal statistical tests.
– Sometimes testing, in the form of what-if analysis, is better done with simulators and profilers.
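Steps 2 and 3 depend on being able to re-cut the same records quickly. As a rough illustration outside JMP, the following Python sketch shows one set of records tabulated three ways: by one variable, by two variables, and filtered for a drill-down. The records, field names, and the `tabulate` helper are all made up for illustration and do not come from the VA survey.

```python
from collections import Counter

# Hypothetical respondent records; field names and values are invented.
respondents = [
    {"branch": "Army", "age_group": "41 - 60", "uses_va_care": "Yes"},
    {"branch": "Navy", "age_group": "61 - 80", "uses_va_care": "No"},
    {"branch": "Army", "age_group": "61 - 80", "uses_va_care": "Yes"},
    {"branch": "Air Force", "age_group": "20 - 40", "uses_va_care": "No"},
    {"branch": "Army", "age_group": "41 - 60", "uses_va_care": "No"},
]


def tabulate(records, by, where=None):
    """Count records for each combination of the fields listed in `by`.

    `where` is an optional dict of field -> required value; it acts as a
    simple data filter applied before counting.
    """
    where = where or {}
    kept = [r for r in records if all(r[k] == v for k, v in where.items())]
    return Counter(tuple(r[field] for field in by) for r in kept)


# Three views of the same data: one cut, a two-way cut, a filtered drill-down.
by_branch = tabulate(respondents, by=["branch"])
by_age_and_use = tabulate(respondents, by=["age_group", "uses_va_care"])
army_only = tabulate(respondents, by=["uses_va_care"], where={"branch": "Army"})

for view in (by_branch, by_age_and_use, army_only):
    print(dict(view))
```

Changing the `by` list or the `where` filter gives a new view without touching the underlying records, which is the same idea as dragging variables or adding a filter in an interactive tool.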
The Iterative Analytical Cycle
Modify
Tabulate
Graph
Model
Test
Each phase has the same level of importance,
and the cycle is nondirectional.
Why JMP?
JMP is a powerful tool for handling the major requirements
for project management and statistical analysis.
JMP Management Features
Management features include, to mention just a few:
Variable grouping.
Value labels.
Notes.
Table variables.
Statistical Capabilities in JMP
Univariate Analysis
– Descriptive Statistics: means, median, proportions, spread, sums, frequency tables.
– Graphs/Charts: histograms, box plots.
Multivariate Analysis
– Graphs/Charts: box plots, bubble plots, heat maps, scatter plots, panel plots.
– Models: choice modeling, regression, ANOVA, time series, clustering, recursive partitioning, correspondence.
It’s Easy to Tabulate Data In JMP
Data tabulations are still useful and informative.
JMP has several platforms to tabulate data:
– Analyze/Consumer Research/Categorical
– Analyze/Tabulate
– Tables/Summary
– Analyze/Fit Y by X (both variables are categorical)
Categorical platform is the most versatile.
Tabulate platform is dynamic – quick change of views.
Summary platform creates data tables of summary
statistics.
Fit Y by X platform will be demonstrated because it has
components useful for analysis.
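For two categorical variables, the core of such an analysis is a contingency table and a test of independence. The following Python sketch, which is not JMP output and does not reproduce the platform's calculations, computes the Pearson chi-square statistic by hand. The counts are hypothetical, and the sketch ignores the sampling weight, so it is only an illustration of the idea.

```python
from collections import Counter


def pearson_chi_square(pairs):
    """Return (statistic, degrees of freedom) for a two-way table.

    `pairs` is an iterable of (row_value, column_value) observations.
    The counts are unweighted, and every row and column is assumed
    to have at least one observation.
    """
    counts = Counter(pairs)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    n = sum(counts.values())
    row_total = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_total = {c: sum(counts[(r, c)] for r in rows) for c in cols}

    statistic = 0.0
    for r in rows:
        for c in cols:
            expected = row_total[r] * col_total[c] / n
            statistic += (counts[(r, c)] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return statistic, dof


# Hypothetical data: branch of service by "uses VA health care".
observations = (
    [("Army", "Yes")] * 30 + [("Army", "No")] * 20
    + [("Navy", "Yes")] * 10 + [("Navy", "No")] * 40
)
stat, dof = pearson_chi_square(observations)
print(f"chi-square = {stat:.3f} on {dof} degree(s) of freedom")
```

A large statistic relative to its degrees of freedom suggests the two variables are related; with weighted survey data, a design-adjusted test would be the safer choice.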
Graph Builder Aids Dynamic Analysis
The powerful Graph Builder in JMP lets you dynamically
study simple and complex data. You can:
Build simple to complex scatter plots.
Change scatter plots to box plots or histograms.
Add smooths to see general patterns or trends.
Build panel graphs.
Create pie and bar charts.
– Yes, we sometimes need pie charts.
Link to the data table for further analysis.
Quickly change views by clicking icons.
Bottom Line: JMP Enables Dynamic Analyses
JMP lets you dynamically study simple and complex data.
You can:
Add a Data Filter to subset data based on other
variables.
Change tables, especially in the Tabulate and
Categorical platforms, by dragging & dropping
variables and using a local data filter.
Change graphs in the Graph Builder platform by
clicking icons, dragging & dropping variables to
different parts of a palette, and applying a local data filter.
Summary
We need an active, dynamic analysis of survey data for
penetrating and insightful information.
Static analyses are just snapshots.
– They only skim the surface.
– They don’t penetrate the veil that obscures the
information decision makers need.
Dynamic analyses penetrate the veil to reveal
relationships.
– They involve drilling down, linking to tables, and quickly
and seamlessly changing views.
JMP is perfect for the dynamic analyses of survey data.
The Case Study
Case Study: Overview
The Department of Veterans Affairs conducted a national
survey in 2010 of veterans, active duty service members,
activated national guard and reserve members, family
members and survivors.
I’ll emphasize the veterans.
This survey was designed to help the VA plan future
programs and services for veterans.
The sample size was 8,710 veterans representing the
Army, Navy, Air Force, Marine Corps, Coast Guard,
and Other (e.g. Public Health Services, Environmental
Services Administration, NOAA, U.S. Merchant
Marine).
Case Study: Overview
The questionnaire is divided into 15 sections:
A. Background
B. Familiarity with Veteran Benefits
C. Disability/Vocational Rehabilitation
D. Health Status
E. Health Care
F. Health Insurance
G. Education and Training
H. Employment
I. Life Insurance
J. Home Loans
K. Burial Benefits
L. Burial Plans
M. Internet Use
N. Income
O. Demographics
Time Spent On Data Preparation
Most of your time is spent on data preparation, not on
planning, collecting, analyzing, or reporting.
Preparation involves:
Importing.
Documenting.
Grouping.
Wrangling.
Modifying and Transforming.
The 80/20 rule (roughly 80% of the time goes to preparation) held here.
Source: Dasu T and Johnson T. Exploratory Data Mining and Data Cleaning. New York: John Wiley & Sons,
Inc, 2003.
Data Preparation
There are 8,710 respondents and 614 variables plus a
sampling weight.
Variables were in alphabetical order by “new” variable
names that didn’t agree with the questionnaire; they had
to be renamed to match the questionnaire.
Added Value Labels and Question for each question to
improve documentation.
Grouped variables using JMP’s Grouping capability to
have one group for each questionnaire section.
– Arranged questions to agree with the questionnaire.
Assigned Missing Value Code to Don’t Know responses.
Deleted “junk” variables: loop counters, miscellaneous.
– Final variable count: 513.
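The preparation steps above were done in JMP. As a rough sketch of the same logic in Python, the following renames variables to questionnaire names, sets Don’t Know responses to missing, and drops junk variables. Every variable name, the Don’t Know code `98`, and the `prepare` helper are assumptions made up for illustration; the actual VA file uses its own names and codes.

```python
# Hypothetical raw rows; variable names, codes, and junk columns are invented.
raw_rows = [
    {"var_017": 1, "var_018": 98, "loop_ctr": 3},
    {"var_017": 2, "var_018": 1, "loop_ctr": 1},
]

# Assumed mapping from the "new" names to questionnaire item names.
rename_map = {"var_017": "A1", "var_018": "A2"}
# Assumed code for a Don't Know response.
dont_know_codes = {98}
# Assumed junk variables (e.g. loop counters) to delete.
junk_variables = {"loop_ctr"}


def prepare(row):
    """Rename variables, recode Don't Know to missing, and drop junk."""
    clean = {}
    for name, value in row.items():
        if name in junk_variables:
            continue
        new_name = rename_map.get(name, name)
        clean[new_name] = None if value in dont_know_codes else value
    return clean


prepared = [prepare(row) for row in raw_rows]
print(prepared)
```

One global set of Don’t Know codes is a simplification: in a real file the same number can be a valid answer for one question and a Don’t Know code for another, so a per-variable mapping may be needed.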
Data Preparation
Created two demographic variables for later use.
Age of respondent.
– Questionnaire asked for year of birth (YOB).
– Calculated Age as: 2010 – YOB.
Age groups.
– Used JMP Add-in: Interactive Binning.
– 20 - 40
– 41 - 60
– 61 - 80
– 80+
– Missing
Also added military Branch – discussed below.
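The age calculation and grouping can be sketched in Python as follows. The grouping in the deck was done with the Interactive Binning add-in, so this is only an equivalent illustration. Two points are assumptions, not stated on the slide: an age of exactly 80 is placed in the 61 - 80 group, and ages under 20 are treated as Missing.

```python
SURVEY_YEAR = 2010


def age_from_yob(year_of_birth):
    """Age as computed on the slide: 2010 minus year of birth."""
    if year_of_birth is None:
        return None
    return SURVEY_YEAR - year_of_birth


def age_group(age):
    """Assign the age groups listed on the slide."""
    if age is None:
        return "Missing"
    if 20 <= age <= 40:
        return "20 - 40"
    if 41 <= age <= 60:
        return "41 - 60"
    if 61 <= age <= 80:  # assumption: exactly 80 belongs here
        return "61 - 80"
    if age > 80:
        return "80+"
    return "Missing"  # assumption: ages under 20 have no listed group


for yob in (1985, 1950, 1930, 1925, None):
    age = age_from_yob(yob)
    print(yob, age, age_group(age))
```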
Case Study: Overview
National Survey of Veterans
Population Control Totals
Source: “National Survey of Veterans Detailed Description of Weighting Procedures.” Appendix B-1.
http://www.va.gov/VETDATA/Surveys.asp. Last accessed April 10, 2016.
Case Study: Overview
Sum of Weights
Sum of weights must equal the population total, as it
does in this case.
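This check is easy to automate. The following Python sketch compares the sum of the weights with a control total and then uses the weights to estimate a population proportion. The weights, the control total of 10,000, and the answers are hypothetical values chosen for illustration, not figures from the VA survey.

```python
import math


def weights_match_total(weights, population_total, rel_tol=1e-6):
    """True if the weights sum to the population control total."""
    return math.isclose(math.fsum(weights), population_total, rel_tol=rel_tol)


def weighted_proportion(values, weights, target):
    """Share of the weighted population whose value equals `target`."""
    total = math.fsum(weights)
    matching = math.fsum(w for v, w in zip(values, weights) if v == target)
    return matching / total


# Hypothetical sample of four respondents.
weights = [2500.0, 1500.0, 4000.0, 2000.0]
answers = ["Yes", "No", "Yes", "No"]
population_total = 10_000  # assumed control total

ok = weights_match_total(weights, population_total)
share_yes = weighted_proportion(answers, weights, "Yes")
print(ok, share_yes)
```

In this made-up sample the unweighted share of "Yes" is 50%, while the weighted share is 65%, which shows why tabulations of survey data should carry the sampling weight.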
Analyzing the Case Study Data for Information
31
Structure For This Section
I’ll look at four sections of the VA survey:
A. Background
B. Familiarity with Veteran Benefits
C. Disability/Vocational Rehabilitation
E. Health Care
The objective is to illustrate how JMP can be used for
survey research.
Live Demonstrations
Session Summary
The Iterative Analytical Cycle
Modify
Tabulate
Graph
Model
Test
Each phase has the same level of importance,
and the cycle is nondirectional.