Getting the Most From Your Survey Data

Getting the Most From Your Survey Data April 27, 2016 Walter R. Paczkowski, Ph.D. Data Analytics Corp. and Rutgers University

Transcript of Getting the Most From Your Survey Data

Page 1: Getting the Most From Your Survey Data

Getting the Most From Your Survey Data

April 27, 2016

Walter R. Paczkowski, Ph.D.

Data Analytics Corp. and Rutgers University

Page 2: Getting the Most From Your Survey Data

Session Objectives

1. Present the advantages of using JMP for survey research.

2. Highlight key features and functionality of JMP for survey research.

3. Demonstrate how to use JMP for analyzing survey data based on a Case Study.

Page 3: Getting the Most From Your Survey Data

Session Structure

This session has three sections:

1. Introduction and motivation.

2. Why use JMP?

3. Analyze some Case Study survey data.

The Case Study

Analysis of National Survey of Veterans, Active Duty Service Members, Demobilized National Guard and Reserve Members, Family Members, and Surviving Spouses; 2010

Source: http://www.va.gov/VETDATA/Surveys.asp

Page 4: Getting the Most From Your Survey Data

Role of Surveys

Page 5: Getting the Most From Your Survey Data

Decision Makers Need Information

Every public and private sector leader needs information, not data, to make decisions:

Public Sector

What services to provide.

How much to increase/decrease taxes.

What policies to implement.

Private Sector

What product to sell.

How much to charge.

How to promote the product.

Surveys provide data that must be converted to information.

Page 6: Getting the Most From Your Survey Data

Information Comes From Analyzing Data

Survey data are raw, unfiltered, disorganized pieces of material.

Data are like Lego bricks that have to be assembled.

– Like the bricks, they can be assembled in different ways reflecting your creativity and questions.

Information is what you build and extract from the data by manipulating the data in different ways, with different views.

What to do with data: data have to be analyzed.

Page 7: Getting the Most From Your Survey Data

What Is Analysis?

The word analysis means to break into parts. The Greek root: “a breaking up, a loosening, releasing.”*

Analyzing survey data means looking for relationships, trends, patterns, and anomalies beneath the surface of the obvious.

Analysis and reporting of the data from surveys are often treated synonymously, but they’re different.

– Simple tables and graphs are created without looking more deeply into what the survey data have to offer.

– Metaphor: Looking at an ocean for the first time, seeing whales break the surface, and reporting that only whales occupy oceans.

*Source: http://www.etymonline.com/index.php?term=analysis

Page 8: Getting the Most From Your Survey Data

Not All Analyses Are The Same

There are two types of analysis.

Static

– Most prevalent form.

– It doesn’t allow for different views of the same data. Perfectly non-interactive; fixed.

– There’s no drill-down for additional or unexpected insight; no exploration of relationships.

Dynamic

– Dynamic means active, real-time changing of views and drilling down for insight and relationships. Highly interactive: drag & drop or click an icon to quickly create new views of the data.

– This is what survey data analysis should be.

– The gained insight is the information.

Page 9: Getting the Most From Your Survey Data

The Typical Analysis Tools Are Static

The typical tools are tables and graphs.

Simple, and extensive, tables combined with the graphs are often the sole forms of “analyses.”

– The reports are created at the same time as the “analysis”; they are the same.

Tables don’t provide dynamic views of data.

– Most reports mimic “the tabs,” which are simple cross-tabulations at best.

The static nature of the graphs makes it difficult to quickly explore and test ideas.

– No links back to the original data for drill-down or filtering.

Page 10: Getting the Most From Your Survey Data

The Result: Understanding Is Lost

The result of focusing almost exclusively on static tables and graphs is that the wealth of understanding that could be gained from survey data is left untapped.

Important relationships are left undiscovered, unexplored, buried beneath the surface.

Simple or naïve conclusions and recommendations result.

Analysis goal: The goal of researchers should be to provide actionable information to decision makers based on dynamic tools and sophisticated analyses.

Page 11: Getting the Most From Your Survey Data

Analyzing Survey Data Is Complex & Iterative

1. Modify the data to aid analysis and/or interpretation.

2. Tabulate the data dynamically with descriptive statistics and different views or cuts of the data.

3. Graph the data to dynamically look for relationships, trends, patterns, and anomalies.

4. Model the data dynamically for different perspectives.

– Regression models (a family).

– Recursive partitioning.

– Perceptual maps.

– And others.

5. Test hypotheses about the data and model results.

– There are formal statistical tests.

– Sometimes testing, in the form of what-if analysis, is better done with simulators and profilers.

Page 12: Getting the Most From Your Survey Data

The Iterative Analytical Cycle

[Diagram: a cycle linking Modify, Tabulate, Graph, Model, and Test]

Each phase has the same level of importance, and the cycle is nondirectional.

Page 13: Getting the Most From Your Survey Data

Why JMP?

Page 14: Getting the Most From Your Survey Data

Why JMP?

JMP is a powerful tool for handling the major requirements for project management and statistical analysis.

Page 15: Getting the Most From Your Survey Data

JMP Management Features

Management features include, to mention just a few:

Variable grouping.

Value labels.

Notes.

Table variables.

Page 16: Getting the Most From Your Survey Data

Statistical Capabilities in JMP

Univariate Analysis

– Descriptive Statistics: means, median, proportions, spread, sums, frequency tables.

– Graphs/Charts: histograms, box plots.

Multivariate Analysis

– Graphs/Charts: box plots, bubble plots, heat maps, scatter plots, panel plots.

– Models: choice modeling, regression, ANOVA, time series, clustering, recursive partitioning, correspondence.

Page 17: Getting the Most From Your Survey Data

It’s Easy to Tabulate Data In JMP

Data tabulations are still useful and informative.

JMP has several platforms to tabulate data:

– Analyze/Consumer Research/Categorical

– Analyze/Tabulate

– Tables/Summary

– Analyze/Fit Y by X (both variables are categorical)

Categorical platform is the most versatile.

Tabulate platform is dynamic – quick change of views.

Summary platform creates data tables of summary statistics.

Fit Y by X platform will be demonstrated because it has components useful for analysis.

Page 18: Getting the Most From Your Survey Data

Graph Builder Aids Dynamic Analysis

The powerful Graph Builder in JMP lets you study simple and complex data dynamically. With it you can:

Build simple to complex scatter plots.

Change scatter plots to box plots or histograms.

Add smooths to see general patterns or trends.

Build panel graphs.

Create pie and bar charts.

– Yes, we sometimes need pie charts.

Link to the data table for further analysis.

Quickly change views by clicking icons.

Page 19: Getting the Most From Your Survey Data

Bottom Line: JMP Enables Dynamic Analyses

JMP lets you study simple and complex data dynamically. You can:

Add a Data Filter to subset data based on other variables.

Change tables, especially in the Tabulate and Categorical platforms, by dragging & dropping variables and using a local data filter.

Change graphs in the Graph Builder platform by clicking icons, dragging & dropping variables to different parts of a palette, and applying a local data filter.

Page 20: Getting the Most From Your Survey Data

Summary

Page 21: Getting the Most From Your Survey Data

Summary

We need an active, dynamic analysis of survey data for penetrating and insightful information.

Static analyses are just snapshots.

– They only skim the surface.

– They don’t penetrate a veil that obscures the information decision makers need.

Dynamic analyses penetrate the veil to reveal relationships.

– They involve drilling down, linking to tables, and quickly and seamlessly changing views.

JMP is perfect for the dynamic analyses of survey data.

Page 22: Getting the Most From Your Survey Data

The Case Study

Page 23: Getting the Most From Your Survey Data

Case Study: Overview

The Department of Veterans Affairs conducted a national survey in 2010 of veterans, active duty service members, activated National Guard and Reserve members, family members, and survivors.

I’ll emphasize the veterans.

This survey was designed to help the VA plan future programs and services for veterans.

The sample size was 8,710 veterans representing the Army, Navy, Air Force, Marine Corps, Coast Guard, and Other (e.g. Public Health Services, Environmental Services Administration, NOAA, U.S. Merchant Marine).

Page 24: Getting the Most From Your Survey Data

Case Study: Overview

The questionnaire is divided into 15 sections:

A. Background

B. Familiarity with Veteran Benefits

C. Disability/Vocational Rehabilitation

D. Health Status

E. Health Care

F. Health Insurance

G. Education and Training

H. Employment

I. Life Insurance

J. Home Loans

K. Burial Benefits

L. Burial Plans

M. Internet Use

N. Income

O. Demographics

Page 25: Getting the Most From Your Survey Data

Time Spent On Data Preparation

Most of your time is spent on data preparation, not on planning, collecting, analysis, or reporting.

Preparation involves:

Importing.

Documenting.

Grouping.

Wrangling.

Modifying and Transforming.

The 80/20 rule, with most of the effort going to preparation, held here.

Source: Dasu T and Johnson T. Exploratory Data Mining and Data Cleaning. New York: John Wiley & Sons, Inc, 2003.

Page 26: Getting the Most From Your Survey Data

Data Preparation

There are 8,710 respondents and 614 variables plus a sampling weight.

Variables were in alphabetical order by “new” variable names that didn’t agree with the questionnaire; the names needed to be changed to match the questionnaire.

Added Value Labels and Question for each question to improve documentation.

Grouped variables using JMP’s Grouping capability to have one group for each questionnaire section.

– Arranged questions to agree with the questionnaire.

Assigned Missing Value Code to Don’t Know responses.

Deleted “junk” variables: loop counters, miscellaneous.

– Final variable count: 513.
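The preparation steps above were carried out in JMP. For readers who want to see the same logic outside JMP, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the variable names (VAR_A, LOOPCTR1), the Don't Know code of 98, and the sample values are hypothetical and are not taken from the VA data file.

```python
# Hypothetical raw records: variable names, codes, and values are invented
# for illustration and are not taken from the VA data file.
raw_records = [
    {"VAR_B": 1, "VAR_A": 3, "LOOPCTR1": 0, "WEIGHT": 1200.5},
    {"VAR_B": 2, "VAR_A": 98, "LOOPCTR1": 1, "WEIGHT": 980.0},
    {"VAR_B": 98, "VAR_A": 2, "LOOPCTR1": 2, "WEIGHT": 1510.25},
]

RENAME = {"VAR_A": "A1", "VAR_B": "B1"}  # assumed: delivered name -> questionnaire item
DONT_KNOW = 98                           # assumed code for "Don't Know"
JUNK_PREFIXES = ("LOOPCTR",)             # assumed naming pattern of junk variables


def prepare(record):
    """Drop junk variables, rename to questionnaire items, recode Don't Know as missing."""
    cleaned = {}
    for name, value in record.items():
        if name.startswith(JUNK_PREFIXES):
            continue  # delete loop counters and similar junk variables
        is_question = name in RENAME
        new_name = RENAME.get(name, name)
        # Only questionnaire items get the missing-value recode, never the weight.
        cleaned[new_name] = None if (is_question and value == DONT_KNOW) else value
    # Arrange the items in questionnaire order, with the sampling weight last.
    questions = sorted(k for k in cleaned if k != "WEIGHT")
    return {k: cleaned[k] for k in questions + ["WEIGHT"]}


survey = [prepare(r) for r in raw_records]
for row in survey:
    print(row)
```

Keeping the rename map and the missing-value code as named constants documents the preparation decisions in one place, which serves the same purpose as the value labels and notes described above.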

Page 27: Getting the Most From Your Survey Data

Data Preparation

Created two demographic variables for later use.

Age of respondent.

– Questionnaire asked for year of birth (YOB).

– Calculated Age as: 2010 – YOB.

Age groups.

– Used JMP Add-in: Interactive Binning.

– 20 - 40

– 41 - 60

– 61 - 80

– 80+

– Missing

Also added military Branch – discussed below.
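The two derived variables can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the Interactive Binning add-in. Two assumptions are made here: "80+" is read as ages over 80 (because 80 already falls in the 61 - 80 group), and ages below 20 are assigned to Missing because the slide lists no group for them.

```python
SURVEY_YEAR = 2010  # the survey year used on the slide


def age_from_yob(yob):
    """Age = 2010 - year of birth; a missing year of birth stays missing."""
    return None if yob is None else SURVEY_YEAR - yob


def age_group(age):
    """Assign the slide's age groups.

    Assumptions: "80+" means over 80, and ages under 20 are treated as Missing.
    """
    if age is None or age < 20:
        return "Missing"
    if age <= 40:
        return "20 - 40"
    if age <= 60:
        return "41 - 60"
    if age <= 80:
        return "61 - 80"
    return "80+"


# Hypothetical years of birth, invented for illustration.
for yob in (1945, 1985, 1925, None):
    age = age_from_yob(yob)
    print(yob, age, age_group(age))
```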

Page 28: Getting the Most From Your Survey Data

Case Study: Overview

[Table: National Survey of Veterans, Population Control Totals]

Source: “National Survey of Veterans Detailed Description of Weighting Procedures.” Appendix B-1. http://www.va.gov/VETDATA/Surveys.asp. Last accessed April 10, 2016.

Page 29: Getting the Most From Your Survey Data

Case Study: Overview

Sum of Weights

Sum of weights must equal the population total, as it does in this case.
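The check is simple arithmetic: add up the sampling weights and compare the sum with the population control total. A minimal Python sketch follows. The weights, the control total, and the yes/no flags are hypothetical numbers invented for illustration, not values from the survey.

```python
import math

# Hypothetical sampling weights and control total, invented for illustration.
weights = [1200.5, 980.0, 1510.25, 1309.25]
population_total = 5000.0


def weights_match_total(weights, total, rel_tol=1e-9):
    """True when the weights sum to the population control total (within tolerance)."""
    return math.isclose(math.fsum(weights), total, rel_tol=rel_tol)


def weighted_proportion(flags, weights):
    """Weighted share of respondents whose flag is True (a population estimate)."""
    return math.fsum(w for f, w in zip(flags, weights) if f) / math.fsum(weights)


print(weights_match_total(weights, population_total))

# Hypothetical yes/no answers for the four respondents above.
answered_yes = [True, False, True, False]
print(weighted_proportion(answered_yes, weights))
```

Once the weights pass this check, each weight can be read as the number of people in the population that a respondent represents, so weighted proportions estimate population shares rather than sample shares.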

Page 30: Getting the Most From Your Survey Data

Analyzing the Case Study Data for Information

Page 31: Getting the Most From Your Survey Data

Structure For This Section

I’ll look at four sections of the VA survey:

A. Background

B. Familiarity with Veteran Benefits

C. Disability/Vocational Rehabilitation

E. Health Care

The objective is to illustrate how JMP can be used for survey research.

Page 32: Getting the Most From Your Survey Data

Live Demonstrations

Page 33: Getting the Most From Your Survey Data

Session Summary

Page 34: Getting the Most From Your Survey Data

The Iterative Analytical Cycle

[Diagram: a cycle linking Modify, Tabulate, Graph, Model, and Test]

Each phase has the same level of importance, and the cycle is nondirectional.

Page 35: Getting the Most From Your Survey Data

Contact Information

Page 36: Getting the Most From Your Survey Data
