
The Evolution of Survey Process Quality

Concepts

• Survey
• Design
• Quality
• Quality dimensions
• Product quality
• Process quality
• Organizational quality
• Quality assurance
• Quality control
• Error sources
• Mean squared error

The Survey Process

[Diagram] Research Objectives → Sampling Design → Data Collection → Data Processing → Analysis/Interpretation, with "revise" feedback loops between stages. Design decisions cover Concepts, Population, Mode of Administration, Questions, and the Questionnaire.

The Concept of a Survey

• concerns a set of objects comprising a population
• the population under study has one or more measurable properties
• the goal is to describe the population by one or more parameters defined in terms of the measurable properties

The Concept of a Survey (cont'd)

• access to the population requires a frame
• a sample is selected in accordance with a sampling design specifying a probability mechanism and a sample size

The Concept of a Survey (cont’d)

• observations are made in accordance with a measurement process
• based on the measurements, an estimation process is applied to compute estimates
• the purpose is to make inference to the population (facts, decision-making)

Typical Shortcomings

• the target population is changed during the study
• selection probabilities are not known for all selected units
• correct estimation formulas are not used

Types of Surveys

• One-time
  – Attitudes, opinions
• Repeated or continuing
  – Official statistics (short-term indicators, agriculture, living conditions, crime)
  – Other (drug use, consumer research, behaviors)
• International and comparative
  – Official statistics (European Statistical System, poverty, water supply)
  – Student achievement, literacy, values, happiness, marketing, attitudes

Types of organizations

• Official statistics
  – Centralized (NSIs)
  – Decentralized (different agencies)
• General survey work
  – Private, academic
  – IMF, OECD, UN

Stakeholders

• Customers and users

• Researchers

• Survey organizations

• Owners

• Interest organizations

• The general public

A Brief History

• Biblical censuses
• Political arithmetic 1650-1800, Graunt and Eden
• The 1895 ISI proposal regarding representative investigations
• Bowley argues for random sampling in 1913, in an attempt to connect statistical theory and sample design
• ISI agrees to promote extended investigation of representative methods in the mid-1920s
• Tschuprow, stratified random sampling, early 1920s
• The 1934 Neyman paper on the representative method and optimum allocation
• Neyman develops theories for sampling (cluster sampling, ratio estimation, two-phase sampling) and confidence intervals
• Fisher's randomized experiments
• Nonsampling error theory in the 1940s
• Interpenetration 1946, Mahalanobis
• The US Census Bureau survey model 1959-1964
• Data quality, Kish, Zarkovich 1965-66
• Total survey design, Dalenius 1968

• Developments in other disciplines (errors and their causes)
  – Questions and interviewers (1917-)
  – The response process (1968-): Sudman, Bradburn, Cannell, Tourangeau
  – Interviewer-respondent interaction
• Statistical process control (SPC)
  – Shewhart's control chart, 1924
  – Administrative applications of SPC in survey work, Minton 1968

Quality Milestones

• Early quality management (building ships, maintaining roads, leading empires)

• Industrial revolution (Taylor, Benz, Ford 1910-)

The Quality Revolution Starts Here

• Shewhart’s control chart for process

control

• Dodge and Romig’s acceptance sampling

• A theory for statistical process control

These are methods and tools to handle process variation

• Deming's 14 points
• Juran's spiral of progress
• Ishikawa's 7 quality control tools
• The Joiner Triangle (quality, scientific approach, teamwork)
• Taguchi's experimental design
• Bottom line
  – Recognition of the client/customer/user
  – Increased competition
  – A need for continuous improvement

Just a Few More Milestones

• Business excellence models (ISO, EFQM, Malcolm Baldrige), a clear user perspective

• TQM, Six Sigma, Kaizen, Lean, PDCA, BPR and more

• Quality assurance and quality control

• Standards and quality guidelines

Quality According to ISO 9001

The totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs

Definitions of Quality

• General
  – Fitness for use
  – Design
  – Conformance
• In the survey context
  – Accurate, timely, accessible, plus other dimensions
  – Advanced visual display vs tables
  – Tolerable error

Quality Assurance and Quality Control

• QA is defined as a set of activities whose purpose is to demonstrate that an entity meets all quality requirements

• QC is defined as a set of activities whose purpose is to ensure that all quality requirements are met

Quality Product (QP)

A QP is one that meets the needs and expectations of customers/clients/users

Eurostat’s Quality Dimensions

• Relevance of statistical concepts
• Accuracy of estimates
• Timeliness and punctuality in disseminating results
• Accessibility and clarity of the information
• Comparability
• Coherence
• (Completeness)

The Process View

• Product characteristics are established together with the user

• The quality of the product is decided by the processes generating the product

• The processes are controlled via key process variables

Assuring and Controlling Quality

Product level
  Main stakeholders: user, client
  Control instruments: product specs, SLA, evaluation studies, frameworks, standards
  Measures and indicators: frameworks, compliance, MSE, user surveys

Process level
  Main stakeholders: survey designer
  Control instruments: SPC, charts, acceptance sampling, risk analysis, CBM, SOP, paradata, checklists, verification
  Measures and indicators: variation via control charts, other paradata analysis, outcomes of evaluation studies

Organization level
  Main stakeholders: agency, owner, society
  Control instruments: excellence models, ISO, CoP, reviews, audits, self-assessments
  Measures and indicators: scores, strong and weak points, user surveys, staff surveys

Measuring and Documenting Quality

• Accuracy can be measured
• Other quality dimensions are qualitative and can be seen as constraints
• Quality profiles
• Quality reports
• Performance measures
• Codes of practice

Examples of Tools - 1

• Self-assessment via excellence models or other frameworks
• Checklists
• Quality management (TQM, Six Sigma)
• External and internal auditing
• Customer satisfaction surveys

Examples of Tools - 2

• Staff surveys
• Quality control (verification, paradata)
• Documentation
• MSE component measures

Improving Quality

• Benchmarking
• Changing processes
• Small steps or business process reengineering
• Project teams
• Standardization via current best methods documents or standard operating procedures and checklists
• Development of quality guidelines
• Training

Quality management philosophies

• This is how I run my company

• Theory a la Drucker

• Improvement methodologies (TQM, Six Sigma, Lean)

• Business Excellence Models (EFQM, Malcolm Baldrige)


EFQM Model 2010

Contents of ISO 20252 (sections)

1. Scope
2. Terms and definitions (Swedish translation; some terms inconsistent with terms used at Statistics Sweden)
3. Quality management system requirements (documentation, staff competence and training)
4. Managing the executive elements of research (research proposals, project schedules, questionnaire design)
5. Data collection (field worker training, validation levels and methods, also qualitative data collection)
6. Data management and processing (coding, data editing, data storage and data security, e.g., original data shall be kept)

7. Report on research projects

What Is Six Sigma?

1. Results-oriented management
2. Infrastructure and competence
3. Problem-solving methodology

Six Sigma focuses on…

• variations

• customers

• processes

• chronic problems

• results

Why the name Six Sigma?

With a sigma level of 6σ a process has no more than 3.4 defects per million opportunities (dpmo)
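The 3.4 dpmo figure can be checked numerically: it is the one-sided normal tail at 6σ minus the conventional 1.5σ long-term shift. A minimal sketch, assuming SciPy is available (the function name dpmo is ours):

```python
# Sketch: defects per million opportunities implied by a sigma level,
# assuming the conventional 1.5-sigma long-term shift of Six Sigma.
from scipy.stats import norm

def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    """One-sided normal tail beyond (sigma_level - shift), per million."""
    return norm.sf(sigma_level - shift) * 1_000_000

print(f"{dpmo(6.0):.1f} dpmo at the 6-sigma level")  # ~3.4
```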

Control chart (example)

Common cause variation

• Common causes are the process inputs and conditions that contribute to the regular, everyday variation in a process

• Every process has common cause variation

• Example: Percentage of correctly scanned data, affected by people’s handwriting, operation of the scanner…

Understanding Variation (I)

Understanding Variation (II): Special cause variation

• Special causes are factors that are not always present in a process but appear because of particular circumstances
• The effect can be large
• Special cause variation is not present all the time
• Example: Using paper with a color unsuitable for scanning

Action

• Eliminate special cause variation

• Decrease common cause variation if necessary

• Do not treat common cause as special cause
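To make this concrete, here is a minimal p-chart sketch for the scanning example; the batch sizes and error counts are invented for illustration:

```python
# Sketch: 3-sigma limits for a p-chart of scanning error proportions.
import math

def p_chart_limits(defectives, batch_sizes):
    """Center line and per-batch 3-sigma control limits for a p-chart."""
    p_bar = sum(defectives) / sum(batch_sizes)
    limits = []
    for n in batch_sizes:
        s = math.sqrt(p_bar * (1 - p_bar) / n)
        limits.append((max(0.0, p_bar - 3 * s), min(1.0, p_bar + 3 * s)))
    return p_bar, limits

defectives  = [12, 9, 15, 40, 11]          # incorrectly scanned forms per batch
batch_sizes = [500, 480, 510, 495, 505]
p_bar, limits = p_chart_limits(defectives, batch_sizes)
for d, n, (lcl, ucl) in zip(defectives, batch_sizes, limits):
    status = "in control" if lcl <= d / n <= ucl else "special cause?"
    print(f"p={d/n:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}  {status}")
```

Points outside the limits (the fourth batch here) are investigated as special causes; points inside reflect common cause variation and should not be chased individually.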


Roots of paradata

• Traditional global ones such as error rates (since 1940)

• The Bristol monograph

• The 1998 ASA session in Dallas

• The Eurostat LEG on Quality

• Handbook on process quality

• Rapid development last 10 years

Meta and para

• Prefixes derived from Greek

• Meta (discussions about discussions, data about data)

• Para (beside, near, beyond, parallel)

Mick Couper’s trilogy

• Data

• Metadata (data about data)

• Paradata (data about processes)

Standards

• There are many standards for surveys; examples include
  – ISO 20252
  – OMB standards
  – NCES statistical standards
  – Quality guidelines developed by specific organizations (Stat Can, RTI, etc.)
  – ESS standards for survey reports

A Standard Is…

A document that
– describes methods and procedures for collecting, processing, storing, and presenting survey data
– defines the (minimal) level of quality and effort that is acceptable for all survey processes

What purposes do survey standards serve?

• Define a minimally acceptable level of quality that organizations should attain.

• Provide consistency across surveys in different organizations

• Facilitate communication of complex concepts, formulas, procedures and methodologies

• Provide transparency of the methodologies used to produce a survey data set.

• Transfer skills and knowledge of best survey practice

We Concentrate on Accuracy

• Data must be of sufficient quality for decision-making
• Other dimensions are constraints
• Accuracy is much more difficult to understand
• It is important to convey information on error sources and their contributions to total survey error
• Accuracy is measured by the mean squared error, MSE

Two Routes to Handling Survey Errors

1. Get an estimate of MSE so that we get confidence or other intervals that we can trust

2. Try to develop and use methods that are almost error-free so that the estimated variance becomes an approximation of the MSE

What is mean squared error?

MSE = Bias² + Variance = (B_spec + B_NR + B_frame + B_meas + B_DP)² + Var_samp + Var_meas + Var_DP
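A minimal sketch of this decomposition; the component values below are hypothetical and would in practice come from evaluation studies:

```python
# Sketch: MSE = (sum of bias components)^2 + sum of variance components.
def mse(biases, variances):
    return sum(biases) ** 2 + sum(variances)

biases = {"spec": 0.5, "NR": 1.2, "frame": -0.3, "meas": 0.8, "DP": 0.1}
variances = {"samp": 2.0, "meas": 0.6, "DP": 0.2}
print(mse(biases.values(), variances.values()))  # (2.3)**2 + 2.8 = 8.09
```

Note that the bias components are summed before squaring, so biases of opposite sign can partially cancel.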

The Survey Process Revisited

[Diagram] The survey process again: Research Objectives → Sampling Design → Data Collection → Data Processing → Analysis/Interpretation, with "revise" feedback loops; design decisions cover Concepts, Population, Mode of Administration, Questions, and the Questionnaire.

3M Survey Life Cycle Paradigm

Copyright CCSG - ccsg.isr.umich.edu

[Diagram] Survey quality across the life cycle: Study, Organizational, and Operational Structure; Tenders, Bids, and Contracts; Sample Design; Questionnaire Design; Adaptation of Survey Instruments; Translation; Instrument Technical Design; Interviewer Recruitment, Selection, and Training; Pretesting; Data Collection; Data Harmonization; Data Processing and Statistical Adjustment; Data Dissemination; Ethical Considerations in Surveys.

Examples of issues

• Research questions and survey questions
• General survey design
• Target population
• Main mode or mix of modes
• Developing the instrument
• Sampling design
• Data collection
• Data processing
• Estimation
• Providing survey results
• Quality assurance
• Quality control
• Evaluation

Objective of Survey Design

• Maximize survey quality for given budget

or

• Minimize cost of achieving specified level of quality

Survey Error

• Sampling error: due to selecting a sample instead of the entire population
• Nonsampling error: errors due to mistakes or system deficiencies
  – Specification error
  – Frame error
  – Nonresponse error
  – Measurement error
  – Processing error

Risk of Bias and Variance by Error Source

MSE component          Var    Bias
Sampling error         High   Low
Specification error    Low    High
Nonresponse error      Low    High
Frame error            Low    High
Measurement error      High   High
Data processing error  High   High

How do we estimate bias?

• Obtain measurements that are essentially error free ("gold standard measurements")
  – Implement preferred survey methods on a limited basis
  – Record checks
• Comparisons to external gold standard estimates
  – Census, CPS, other high-quality national surveys
• Modelling attempts

Effects of Nonsampling Errors on Estimates - 1

• Variable errors increase the variances of means, totals, and proportions
  – Confidence levels for interval estimates may be overstated
• Systematic errors bias the estimates of means, totals, and proportions

Effects of Nonsampling Errors on Estimates - 2

• Both variable and systematic errors bias estimates of correlation and regression coefficients
• The nominal level of Type I error can be either too high or too low in the presence of nonsampling errors

Total Survey Error

• Sampling is usually more efficient than a census
• Sampling error is predictable
• Nonsampling error is not predictable
• Find the balance
• Use risk management

Conclusions

• Survey design involves allocations of resources using incomplete and imperfect information

• Objective should be to minimize total error subject to cost constraints

Specific Error Sources

Specification Error

• Concepts

• Objectives

• Subject matter problem translated into a statistical problem

• Mismatch between research question and survey question

• Are all research questions covered?

Frame Errors

• Coverage errors
  – Missing units
  – Duplications
  – Extraneous units
• Classification errors
  – Industry (e.g., Standard Industrial Classification (SIC))
  – Geography
  – Size

Frame Errors (Cont’d)

• Contact errors
  – Address incomplete or incorrect
  – Contact name
  – Phone number
• Other errors
  – Unit structure error
  – Frame not current => errors

The population mean can be decomposed over covered ("on frame") and noncovered ("not on frame") units:

Ybar = t_C Ybar_C + (1 - t_C) Ybar_NC

where Ybar is the mean for the entire target population, t_C the proportion of the population covered by the frame, Ybar_C the mean for covered units, and Ybar_NC the mean for noncovered units.

Relative Bias Due to Coverage Error

RB_NC = (1 - t_C)(Ybar_C - Ybar_NC) / Ybar

[Figure] Relative coverage bias (%) plotted against the relative difference between covered and noncovered units (%), for coverage rates t_C = .90, .70, and .50: the bias grows as coverage falls and as the covered-noncovered difference widens.
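A minimal sketch reproducing the figure's arithmetic; the covered and noncovered means are illustrative:

```python
# Sketch: relative coverage bias RB_NC = (1 - t_C)(Ybar_C - Ybar_NC) / Ybar.
def relative_coverage_bias(t_c, ybar_c, ybar_nc):
    ybar = t_c * ybar_c + (1 - t_c) * ybar_nc   # mean for the entire target population
    return (1 - t_c) * (ybar_c - ybar_nc) / ybar

for t_c in (0.90, 0.70, 0.50):                  # the coverage rates in the figure
    print(t_c, round(relative_coverage_bias(t_c, ybar_c=110, ybar_nc=90), 4))
```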

Nonresponse Error

• Unit nonresponse

– Noncontacts

– Refusals

• Item nonresponse

– Individual questions skipped

Nonresponse Bias

[Diagram] The total population splits into respondents and nonrespondents.

Relative Bias Due to Nonresponse

RB_NR = (1 - t_R)(Ybar_R - Ybar_NR) / Ybar

Example

t_R = response rate for a telephone survey = 75%
Ybar_R = average income for respondents = 107 Kr
Ybar_NR = average income for nonrespondents = 89 Kr
Ybar = .75(107) + .25(89) = 102.5 Kr

RB_NR = (.25)(107 - 89)/102.50 = .044, or 4.4%
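The worked example above, as a one-line check in Python:

```python
# Sketch: relative nonresponse bias RB_NR = (1 - t_R)(Ybar_R - Ybar_NR) / Ybar.
def relative_nonresponse_bias(t_r, ybar_r, ybar_nr):
    ybar = t_r * ybar_r + (1 - t_r) * ybar_nr
    return (1 - t_r) * (ybar_r - ybar_nr) / ybar

print(relative_nonresponse_bias(0.75, 107, 89))  # 0.0439..., i.e. about 4.4%
```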

[Figure] Relative nonresponse bias (%) plotted against the relative difference between respondents and nonrespondents (%), for response rates t_R = .90, .70, and .50: the lower the response rate, the steeper the bias.

Components of Response and Nonresponse

[Diagram] Total units divide into resolved units (2) and unresolved units (3). Unresolved units split into estimated in-scope units (3A) and estimated out-of-scope units (3B); resolved units split into in-scope units (4) and out-of-scope units (5).

Estimating the unresolved units that are in-scope: (3A) = (3) × (4)/(2), i.e. unresolved units are allocated in proportion to the in-scope share among resolved units.

In-scope units (4) split into respondent units (6), comprising refusal conversions (11) and other respondents (12), and nonrespondent units (7), comprising refusals (13), noncontacts (14), and residual nonrespondents (15).

Out-of-scope units (5) comprise non-existent units (8), temporarily out-of-scope units (9), and permanently out-of-scope units (10).

Response Rate Components (Global Process Data)

• Response rate: (6) / [(3A) + (4)]
• Cooperation rate: (6) / [(6) + (13)]
• Refusal rate: (13) / [(6) + (13)]
• Refusal rate: (13) / [(3A) + (4)]
• Nonresponse rate: (7) / [(3A) + (4)]
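A minimal sketch of these rates using the box numbers from the diagram; the unit counts are hypothetical:

```python
# Sketch: response rate components from (hypothetical) disposition counts.
resolved, unresolved = 900, 100                   # (2), (3)
in_scope = 720                                    # (4)
est_in_scope = unresolved * in_scope / resolved   # (3A) = (3) x (4)/(2)
respondents, refusals, nonrespondents = 540, 120, 180   # (6), (13), (7)

eligible = est_in_scope + in_scope                # (3A) + (4)
print("response rate   ", respondents / eligible)
print("cooperation rate", respondents / (respondents + refusals))
print("refusal rate    ", refusals / eligible)
print("nonresponse rate", nonrespondents / eligible)
```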

Factors Influencing Refusals

Survey design:
• Mode
• Respondent rule
• Interview length
• Interview period length
• Survey topic
• Questionnaire design

Respondent characteristics:
• Age, gender, income, health
• Urban-rural
• Crime rate
• Literacy

Interviewer characteristics:
• Age, gender, race, perceived income, etc.
• Prior experience (skill, confidence)
• Interviewer expectations
• Attitude, recent experience, motivation

Societal factors:
• Social responsibility
• Legitimacy of survey objective

Psychological Factors (Groves, Cialdini, Couper, 1992)

Reciprocation: Compliance as repayment for a gift, payment, or concession; benefit to R

Consistency: Compliance is consistent with an announced position (belief, attitude, or value)

Social validation: More willingness to comply if one believes that similar others would also comply

Authority: Compliance is more likely if the request comes from a legitimate authority

Scarcity: More willingness to comply to secure opportunities that are scarce

Liking: More willingness to comply with requests from interviewers who are liked

Implications for Interviewing

Prolong interaction: Maintain the conversation to identify cues to use with the psychological factors

Tailoring: Adapt the interviewing approach to the sample unit

Other Methods to Handle NR

• Decrease respondent burden
• New theory for respondent-friendly questionnaires
• Incentives
• Call scheduling algorithms
• Adjusting for nonresponse
• Dillman's TDM
• Mixed mode
• Ensuring confidentiality

The Response Process and Its Implications for Questionnaire Design

[Diagram] The measurement setting: information system, respondent, interviewer, instrument, mode of data collection, setting, self-administration.

Response Processes

• Cannell, Miller and Oksenberg 1981
• Tourangeau 1984
• Cantor and Edwards 1991
• Biemer and Fecso 1995
• Sudman, Bradburn and Schwarz 1996
• Tourangeau, Rips and Rasinski 2000
• Sudman, Willimack, Nichols and Mesenbourg 2000
• Willimack and Nichols 2010

Response Processes

Individuals:
• Comprehension
• Retrieval
• Judgment and estimation
• Communicating an answer

Establishments:
• Encoding in memory/record formation
• Identification and selection of respondents
• Assessment of priorities
• Comprehension
• Retrieval
• Judgment and estimation
• Communicating an answer
• Data release

Phenomena I

• Satisficing
• Telescoping
• Recency
• Primacy
• Surprise questions
• Context effects
• Response alternatives effect
• Middle alternatives and DK
• Vague terms
• Reference period
• Double-barreled questions
• Sensitive questions

Phenomena II

• Social desirability bias
• Respondent calculations
• Vague quantifiers
• Number of scale points
• Progress indicators
• Aided recall
• Labelling scale points
• Numerical labels
• Acquiescence
• General and specific questions
• CAPITALIZED TEXT
• Images

Implications for Questionnaire Design

• Wording
• Length
• Format
  – Open
  – Closed
  – Scales
  – Filter
• Positioning of questions
• Type of question
  – Factual
  – Attitude
  – Hypothetical
• Layout
• Navigation
• Computer-aided

Encoding/Record Formation

• Description: Knowledge is obtained and processed, and is either stored in memory or recorded physically. To be retrieved, the information must exist.

Types of errors:
– Proxy respondent error; responses from respondents who really "don't know"
– Memory is incomplete, distorted, or inaccurate
– Records are missing, incomplete, or incompatible with survey requirements

Comprehension

• Description: The meaning of the question, as the researcher intended it, is understood by the respondent

Types of errors:
– Context errors
– Use of technical terms
– Translation problems
– Misleading response alternatives

Retrieval of Information

• Description: The respondent retrieves relevant information from memory or from records or other external sources

Types of errors:
– Forgetting
– Telescoping
– Estimating
– Use of outdated records

Judgment and Formatting a Response

• Description: Information is evaluated and a response is formatted corresponding to the response alternatives presented

Types of errors:
– Response alternatives are too constrained
– Response alternatives suggest a response distribution
– Respondents are pressured into giving a "top of the head" response

Response Editing and Communication

• Description: The respondent edits the response and communicates it

Types of errors:
– Social desirability effects
– Fear of disclosure
– Acquiescent behavior

Errors Due to Interviewers and Interviewing

The Role of the Interviewer

School A: "Standardized" interview perspective. Requires interviewers to:
• Read questions exactly as worded
• Refrain from unscripted interactions
• Obtain a codeable response from the respondent
• Avoid attempts to clarify concepts unless clarifications are prescripted

School B: "Collaborative" or "conversational" interview perspective. Requires interviewers to:
• Detect and repair respondent misunderstanding of the question
• Collaborate with the respondent in the interview process
• Make common-sense inferences in recording answers
• Redesign questions to adapt them to the respondent's situation

In practice

• Conversational, flexible interviewing approach
• Mixture of standardized and conversational
• Person-oriented style
  – Discussing personal opinions with the respondent
  – Inconsistent probing
  – Inconsistent feedback
  – Rewording or misinterpreting questions
  – Falsification

Systematic Interviewer Errors

• Poor questionnaire design produces systematic errors across all respondents
• Interviewer error produces systematic errors within an interviewer's assignment

Design Factors that May Explain Interviewer Effects

[Diagram] Interviewer error is shaped by:
• Interviewer characteristics (age, race, sex, education): appearance, motives, beliefs/attitudes, perceptions, expectations, behaviors, skills, knowledge
• Respondent characteristics (age, race, sex, education): knowledge, interest/motivation, confidence, strength of convictions, expectations
• Questionnaire: definition clarity, terminology/jargon, question form, instructions, question wording, question topic
• Survey conditions and setting: mode of interview, standardization, interviewer training, interviewer supervision, monitoring/observation

[Figure] Sample dispersion with no interviewer variance vs. with interviewer variance: under interviewer variance, responses cluster within the assignments of interviewers A through E.

Interviewer Error Model

Observed value = true value + systematic error + variable error

Using this model, we estimate:

rho_int = variance(systematic error) / total variance of the observed value

Consequences of Interviewer Error for Totals and Means

The variance of the sample mean is increased (i.e., multiplied) by the factor

1 + (m - 1) rho_int

where m is the average interviewer workload.

Example: Computation of the Increase in Variance

Suppose m = 100 and rho_int = .01. Then 1 + (m - 1) rho_int = 1 + 99 × .01 ≈ 2.
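As a quick sketch of this inflation factor (m and rho_int as defined above):

```python
# Sketch: variance inflation from correlated interviewer error.
def interviewer_deff(m, rho_int):
    return 1 + (m - 1) * rho_int

print(interviewer_deff(100, 0.01))  # 1.99: variance roughly doubles
print(interviewer_deff(20, 0.01))   # 1.19: smaller workloads limit the damage
```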

Values of rho_int from Interviewer Variance Studies in the Literature

Study                                                           Mode           Average rho_int
Study of Blue Collar Workers (Kish, 1962), Study 1              Face to face   0.020
Study of Blue Collar Workers (Kish, 1962), Study 2              Face to face   0.014
Canadian Census, 1961 (Fellegi, 1964)                           Face to face   0.008
Canadian Health Survey (Feather, 1973)                          Face to face   0.006
Study of Mental Retardation (Freeman and Butler, 1976)          Face to face   0.036
World Fertility Survey (O'Muircheartaigh and Marckwardt, 1980)
  Peru - main survey                                            Face to face   0.050
  Peru - reinterview                                            Face to face   0.058
  Lesotho - main survey                                         Face to face   0.102
Consumer Attitude Survey (Collins and Butcher, 1982)                           0.013
Interviewer Training Project (Fowler and Mangione, 1985)        Face to face   0.005

Average rho_int for face-to-face surveys: 0.0312

Values of rho_int from Interviewer Variance Studies in the Literature

Study                               Mode        Average rho_int
Study of Telephone Methodology      Telephone   0.0089
Health and Television Viewing       Telephone   0.0074
Health in America                   Telephone   0.0018
1980 Post Election Study            Telephone   0.0086
Monthly Consumer Attitude Survey    Telephone
  November 1981                                 0.0184
  December 1981                                 0.0057
  January 1982                                  0.0163
  February 1982                                 0.0090
  March 1982                                    0.0067

Average rho_int for telephone surveys: 0.0092

Evaluating Interviewer Performance

• Monitoring telephone interviews
• Tape and video recording and behavior coding
• Verification recontact
• Reinterviews
• On-site observations
• Questionnaire review
• Keystroke files
• Mock interviews

Data Collection Modes and Associated Errors

Data Collection Mode

• Modes of data collection
• Choosing a mode
• Data quality considerations
• New technologies and mode

Modes I

• CAPI = Computer Assisted Personal Interviewing
• ACASI = Audio CASI
• CATI = Computer Assisted Telephone Interviewing
• PAPI = Paper and Pencil Interviewing
• CADE = Computer Assisted Data Entry

Modes II

• TDE = Touchtone Data Entry

• CASI = Computer Assisted Self Interviewing

• EDI = Electronic Data Interchange

• DBM = Disk by mail

• EMS = Electronic Mail Survey

• VRE = Voice Recognition Entry

• T-ACASI = Telephone ACASI

Face-to-face

• Flexible
• Expensive
• Advantages and drawbacks of the interviewer
• Visual aids

Telephone interviewing

• Similar to face-to-face but less flexible
• Fast
• Monitoring possible
• Questionnaires have to be simpler

Mail

• Good for sensitive topics
• No control over the response process
• Can be made respondent-friendly
• All survey materials must be crystal-clear
• Respondent sets the pace
• Question order effects reduced

Web Surveys

• Internet access varies

• Differences in computer systems and browsers must be considered

• Good for visual stimuli

• Questionnaires should be short

• Fast and inexpensive

Diary

• Recall error increases over time
• Heavy response burden
• Behavior can change temporarily
• Used when the survey topic is such that the total survey period is quite long

Administrative records

• Errors similar to those of other modes
• Statisticians sometimes have no control over contents, updates, etc.
• Statistical purposes come second after administrative ones
• Conceptual differences are common

Direct observation

• No respondents
• Devices and calibration problems
• Various kinds of observations
  – Counting behaviors, eye estimates, anthropology, mystery shopping, price collection, photos
• Observer errors (rho)

Mixed modes

• Can be an “optimal” solution

• Can be a necessity due to frame problems or nonresponse problems

• Give respondents a choice

• Adjustment of questions and questionnaire seldom done

The Choice of Mode

• Each mode has advantages and disadvantages regarding
  – Costs
  – Measurement errors
  – Nonresponse and coverage
  – Flexibility
  – Timeliness

The Decision Regarding Mode

• Sometimes there is no real choice, due to costs or practical constraints

• Often more than one mode must be used

• Pure mode effects difficult to assess

• The decision often concerns a main mode

Summary

• The choice of mode can be very simple or very complex

• Error structures of new modes are not fully understood


Data Processing Errors and Their Control

Data Processing Steps for PAPI

1. Check-in: questionnaires are collected and work units are formed
2. Scan edit: entries are inspected to avoid data entry problems
3. Data entry: questionnaire data are captured via keying, scanning, or other optical sensing
4. Editing: captured data are "corrected" and "cleaned"; missing data are "imputed"

Data Processing Error

• Relatively sparse literature
• Some steps are very error-prone (e.g., coding and editing)
• Errors are both systematic and variable (rho)
• Increased automation and integration reduce variable error while increasing systematic error

Data Capture Errors

• Keying errors

– Discovered by verification keying or editing

– Error rates usually small based on records, fields or characters

– Studies often conducted in QC environments

– The vital few large errors can have large effects on MSE

Data Capture Errors (cont'd)

• Intelligent Character Recognition
  – Error types are substitution and rejection
  – Substitution errors can be systematic
  – The condition of incoming documents and of the equipment is crucial, which calls for continuing calibration
  – Might have to be complemented with manual keying

Editing definition

• Editing is the identification and, if necessary, correction of errors and outliers in individual data used for statistics production
• The definition does not state that all errors be corrected or even identified
• Editing can be very costly

Purpose of editing

• To provide information about data quality (patterns and root causes)

• To provide information about future survey improvements

• To ”clean up” the data

Different Kinds of Editing

• Micro-editing: Editing at record level

• Macro-editing: Editing at aggregate level

• Selective editing

• Output editing

The result is overediting

• Historical reasons
• Large budgets
• Really QC of the data collection operation
• Feedback loop often missing
• Risk management

Key process variables for editing (examples)

• Edit failure rate (#objects with edit failures / #objects edited) estimates the amount of verification
• Correction rate (#objects corrected / #objects edited) estimates the effect
• Edit success rate by variable (#objects with changes on variable X / #objects with edit failures on X) estimates how successfully the edits identify errors on X
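A minimal sketch computing these three indicators; the counts are hypothetical:

```python
# Sketch: editing key process variables from (hypothetical) counts.
edited = 5000                 # objects edited
failed_edits = 650            # objects with at least one edit failure
corrected = 410               # objects actually changed
failed_on_x = 200             # objects with edit failures on variable X
changed_on_x = 150            # objects with changes on variable X

print("edit failure rate      ", failed_edits / edited)
print("correction rate        ", corrected / edited)
print("edit success rate for X", changed_on_x / failed_on_x)
```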

Coding

• Classification process where open-ended responses are classified into coding categories

• Coding can be expensive, error-prone and boring

• Coding can be manual (centralized or decentralized), automated, or computer-assisted

The Generic Coding Process

[Diagram] Input (response, coding instructions, nomenclature) → Action (coder judgment) → Output (code number assignment)

Coding Errors

• Coding is subjective in nature
• Error rates and variability rates can be large
• A coding error occurs when there is a deviation between the assigned code number and the true code number

Coding Errors (cont'd)

• Coding errors are identified by verification

• Coding rules and nomenclatures may be incomplete

• Errors are controlled by automation, dependent, and independent verification

Examples of Coding Error Rates

• 1970 Swedish Census
  – Occupation 13.5%
  – Industry 9.9%
• 1970 US Census
  – Occupation 13.3%
  – Industry 9.1%
• 1991 RTI
  – Occupation 21%
  – Industry 17%

Two-way Independent Verification with Adjudication

[Flowchart]
1. Production coding by Coder A results in code number xA.
2. Verification coding by Coder B results in code number xB.
3. Compare xA and xB. If xA = xB, that is the final, outgoing code number.
4. Otherwise, verification coding by Coder C results in code number xC. If xA = xC or xB = xC, the matching code is the final, outgoing code number.
5. Otherwise, verification coding by Coder D results in code number xD, and xD is the final, outgoing code number.
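A minimal sketch of this flow; the coder functions stand in for real (human or automated) coding operations:

```python
# Sketch: two-way independent verification with adjudication.
def verify_with_adjudication(x_a, x_b, coder_c, coder_d, response):
    if x_a == x_b:              # A and B agree: done
        return x_a
    x_c = coder_c(response)     # third, independent coding
    if x_c in (x_a, x_b):       # C matches A or B: the majority code wins
        return x_c
    return coder_d(response)    # no majority: the adjudicator decides

final = verify_with_adjudication("7421", "7422",
                                 coder_c=lambda r: "7421",
                                 coder_d=lambda r: "7429",
                                 response="machine operator")
print(final)  # "7421": Coder C confirmed Coder A
```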

Automated Coding

• There should be a computer-stored dictionary
• Responses are entered online or via some other medium like scanning or keying
• Responses are matched with dictionary descriptions; based on that matching, the responses are coded by the software or transferred to manual coding
• By collecting and analyzing process data, the system is continually improved

Levels of Automation

• Computer-assisted coding
• Automated coding
• Matching can be exact or inexact
• Coding degrees obtained:
  – Purchases 73% (Sweden)
  – Industry and occupation 63% (US)

Key process variables in coding

• Coding degree in AC and MC
• Effects on coding degree of dictionary updates
• Coding degree by category, AC and MC
• Coding error rate by coder, category, coding mode, and update version
• CAC consultation degree by category and coder

File preparation

• Attaching weights to each unit
• The final weight is a product of the base weight and adjustment factors for nonresponse and noncoverage
• No theory for measurement error adjustment yet
• Computation can be difficult
• Application of disclosure avoidance techniques, for macrodata and microdata
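A minimal sketch of the weight construction; the selection probability and adjustment factors are illustrative:

```python
# Sketch: final weight = base weight x nonresponse and coverage adjustments.
base_weight = 1 / 0.002          # inverse of the selection probability
nr_adjustment = 1 / 0.75         # inverse of the (weighted) response rate
coverage_adjustment = 1.04       # e.g., a post-stratification ratio

final_weight = base_weight * nr_adjustment * coverage_adjustment
print(final_weight)              # ~693: each respondent represents ~693 units
```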

The Total Survey Error Framework


Deming (1944) “On Errors in Surveys”

• American Sociological Review!

• First listing of sources of problems, beyond sampling, facing surveys

• The 13 factors


Deming's 13 factors

– The 13 factors that affect the usefulness of a survey
– To point out the need for directing effort toward all of them in the planning process, with a view to usefulness and the funds available
– To point out the futility of concentrating on only one or two of them
– To point out the need for theories of bias and variability that correlate accumulated experience


Comments on Deming (1944)

• Does include nonresponse, sampling, interviewer effects, mode effects, various other measurement errors, and processing errors
• Omits coverage errors
• Includes nonstatistical notions (auspices)
• Includes estimation-step errors (wrong weighting)
• "Total survey error" not used as a term


Sampling Text Treatment of Total Survey Error

• Kish, Survey Sampling, 1965
  – Graphic on biases
  – 65 of 643 pages on various errors, with specified relationships among errors


[Diagram] Kish's taxonomy of biases:
• Sampling biases: frame biases, "consistent" sampling bias, constant statistical bias
• Nonsampling biases:
  – Nonobservation: noncoverage, nonresponse
  – Observation: field (data collection), office (processing)


Sampling Text Treatment of Total Survey Error

• Särndal, Swensson, Wretman, Model Assisted Survey Sampling, 1992
  – Part IV, 124 pp. of 694: coverage, nonresponse, measurement error; omits processing error
• Lohr, Sampling: Design and Analysis, 2009
  – 34 of 600 pages on nonresponse, and 40 on nonsampling errors and survey quality


Other textbooks

• Cochran (1953), Sampling Techniques: 40 pages in the concluding chapter on "sources of error in surveys"
• Deming (1950), Some Theory of Sampling: starts with the 1944 factors but then continues with pure sampling
• Hansen, Hurwitz and Madow (1953), Sample Survey Methods and Theory, Vol. 1: nine pages on survey errors
• Zarkovich (1966), Quality of Statistical Data


Total Survey Error (1979), Andersen, Kasper, Frankel, and Associates

• Empirical studies on nonresponse, measurement, and processing errors for health survey data
• Initial total survey error framework in a more elaborated, nested structure


[Diagram] Total error decomposition (1979):
• Variable error
  – Sampling
  – Nonsampling: field, processing
• Bias
  – Sampling: frame, consistent
  – Nonsampling
    – Observation: field, processing
    – Nonobservation: noncoverage, nonresponse


Survey Errors and Survey Costs (1989), Groves

• Attempts conceptual linkages between the total survey error framework and
  – psychometric true score theories
  – econometric measurement error and selection bias notions
• Ignores processing error
• Highest conceptual break on variance vs. bias
• Second conceptual break on errors of nonobservation vs. errors of observation


[Diagram] Mean square error split into bias and variance; each side decomposed into errors of nonobservation (coverage, nonresponse, sampling) and observational errors (interviewer, respondent, instrument, mode). Linked psychometric notions: construct validity (theoretical validity, empirical validity, reliability) and criterion validity (predictive validity, concurrent validity).


Nonsampling Error in Surveys (1992), Lessler and Kalsbeek

• Evokes “total survey design” more than total survey error

• Omits processing error


Components of error    Topics
Frame errors           Missing elements; nonpopulation elements; unrecognized multiplicities; improper use of clustered frames
Sampling errors
Nonresponse errors     Deterministic vs. stochastic view of nonresponse; unit nonresponse; item nonresponse
Measurement errors     Error models of numeric and categorical data; studies with and without special data collections


Introduction to Survey Quality (2003), Biemer and Lyberg

• Major division of sampling and nonsampling error

• Adds “specification error” (a la “construct validity”) or relevance error

• Formally discusses process quality

• Discusses “fitness for use” as quality definition


Sources of error     Types of error
Specification error  Concepts; objectives; data element
Frame error          Omissions; erroneous inclusions; duplications
Nonresponse error    Whole unit; within unit; item; incomplete information
Measurement error    Information system; setting; mode of data collection; respondent; interviewer; instrument
Processing error     Editing; data entry; coding; weighting; tabulation


Survey Methodology (2009), Groves, Fowler, Couper, Lepkowski, Singer, Tourangeau

• Notes twin inferential processes in surveys
  – from a datum reported to the given construct of a sampled unit
  – from an estimate based on respondents to the target population parameter
• Links inferential steps to error sources


[Diagram] Inference and error sources (measurement and representation):
• Measurement side: Construct → (validity) → Measurement → (measurement error) → Response → (processing error) → Edited data
• Representation side: Target population → (coverage error) → Sampling frame → (sampling error) → Sample → (nonresponse error) → Respondents
Both sides combine in the survey statistic.


Key Statistical Developments in Total Survey Error 1

• Errors of observers can be correlated (1902), Karl Pearson
• Interpenetrating samples (1946), Mahalanobis
• Criteria for true values (1951), Hansen, Hurwitz, Marks and Mauldin
• Essential survey conditions, correlated response variance (1959), Hansen-Hurwitz-Bershad
• The Census Bureau survey model, a "mixed-error model" (1961), Hansen-Hurwitz-Bershad


Key Statistical Developments in Total Survey Error 2

• Interviewer effects using ANOVA (1962), Kish
• Simple response variance via reinterviews (1964), Hansen-Hurwitz-Pritzker
• Relaxed assumptions of zero covariance of true values and response deviations (1964, 1974), Fellegi
• Errors of Measurement (1968), Cochran
• Estimating model components via basic study schemes using replication, interpenetration, and combinations of the two (1969), Bailar and Dalenius
• Estimating nonsampling variance using mixed linear models (1978), Hartley and Rao
• "Error Profile" of the Current Population Survey (1978), Brooks and Bailar
• Multi-method multi-trait models on survey measures (1984), Wothke and Browne


Weaknesses of the Common Usage of "Total Survey Error"

– Notably, a user perspective is missing
– Key quality dimensions are missing in the TSE paradigm
– Users often cannot, or prefer not to, question accuracy
– The complexity does not invite outside scrutiny of accuracy
– Users are not really informed about real levels of error or uncertainty
– We don't really know how users perceive information on errors


Other Weaknesses of the Total Survey Error Paradigm 1

1. Lack of routine measurements
   – No agency does this
   – Error/quality profiles are useful but rare
2. Ineffective influence on professional standards
   – Little expansion beyond sampling error in practice
   – Press releases on Federal statistics rarely contain even sampling errors
   – Survey error research is compartmentalized rather than integrated; methodologists tend to specialize
   – Root causes of error often still missing
   – How about OMB's requirement of NR bias studies if NR is expected to exceed 20%?


Other Weaknesses of the Total Survey Error Paradigm 2

3. Large burden on the design of some estimators
   – Interpenetration and reinterviews for variance estimation are complicated and costly
   – Intractable expressions for some components
4. Some assumptions unrealistic


Strengths of the Total Survey Error Framework

1. Taxonomic decomposition of errors
   – nomenclature for different components
2. Separation of phenomena affecting statistics in different ways
   – variance vs. bias; observation vs. nonobservation; respondent/interviewer/measurement task; processing
3. Conceptual foundation of the field of survey methodology
   – subfields defined by errors
4. Tool for identifying gaps in the research literature
   – e.g., where are the error evaluation papers on processing?


Needed Steps in a Research Agenda for Total Survey Error 1

1. Integrating causal models of survey errors
   – cognitive psychological mechanisms (anchoring, recall decay)
2. Research on the interplay of two or more error sources jointly
   – e.g., nonresponse and measurement error
3. Research on the interplay of biases and variances
   – e.g., does a simple response variance increase accompany some response bias reductions (self-administration effects)?


Needed Steps in a Research Agenda for Total Survey Error 2

4. Guidance on tradeoffs between quality measurement and quality maximization, and between measures and developing error-free processes
   – how much should we spend on quality enhancement vs. measurement of quality (Spencer, 1985)?
5. Integrating other notions of quality into the total survey error paradigm
   – if "fitness for use" predominates as a conceptual base, how can we launch research that incorporates error variation associated with different uses? (Australian Bureau of Statistics)


Needed Steps in a Research Agenda for Total Survey Error 3

6. Exploiting a multiple-mode, multiple-frame, multiple-phase survey world
7. Need for methodological studies to assist the user
8. Costs and risks
9. Develop theories for optimal design of specific operations; design principles
10. More standards?

Measures and Indicators of Quality


Assuring and Controlling Quality

Product level
  Main stakeholders: user, client
  Control instruments: product specs, SLA, evaluation studies, frameworks, standards
  Measures and indicators: frameworks, compliance, MSE, user surveys

Process level
  Main stakeholders: survey designer
  Control instruments: SPC, charts, acceptance sampling, risk analysis, CBM, SOP, paradata, checklists, verification
  Measures and indicators: variation via control charts, other paradata analysis, outcomes of evaluation studies

Organization level
  Main stakeholders: agency, owner, society
  Control instruments: excellence models, ISO, CoP, reviews, audits, self-assessments
  Measures and indicators: scores, strong and weak points, user surveys, staff surveys

Process data and Paradata

Definitions

• Process is a series of actions or steps towards achieving a particular end

• Process quality is an assessment of how far each step meets defined criteria

• Process variables are factors that can vary with each repetition of the process

• Key process variables are factors that have a large effect on process end result

Some paradata terminology

• Data, metadata, paradata
• Macro paradata: global process data such as response rates, coverage rates, and edit failure rates, sometimes broken down
• Micro paradata: process data that concern individual records, such as flagged imputed records and keystroke data
• Formal selection, collection, and analysis of key process variables that have an effect on a desired outcome, e.g., increased productivity

Definitions of paradata

• Groves and Couper: paradata are data about the data collection process
• They admit that the definition is not well-evolved and is subject to debate
• Groves et al.: process and administrative data produced auxiliary to the survey data collection
• The European term "process data" takes all survey processes into account
• Developing terminology standards is usually a waste of time
• Paradata is a subset of process data, but nothing to argue about
• The important thing: never collect data on processes that are not related to quality; every collection should be goal-driven
• Collecting data on processes related to quality without using SPC and other proper analysis methods is extremely wasteful
• If you don't know how to analyze, don't collect

Plan for continuous improvement (of a product), Marker and Morganstein 1997

• Identify critical product characteristics
• Develop a process flow map
• Determine key process variables
• Evaluate measurement capability
• Determine stability of critical processes
• Determine process capability
• Establish a system for continuous process monitoring

Product characteristics

• Ideally decided by the customer

• Communicating concepts and innovative ideas

Flow charts

• Flow, decision points, customers
• Define owners
• List process variables (those whose values can affect product characteristics)
• At this stage a process "variable" is much broader than what is usually meant (factors such as prices, dates, and lists of customers can be "variables")

Key process variables

• A difficult step
• Key variables are those that have the largest effect on process outputs
• Collective knowledge is used in the selection process
• Tools include the Pareto diagram and the cause-and-effect diagram (fishbone or Ishikawa)

Measurement capability

• Do not reach conclusions about process stability without knowledge about measurement errors
• Available data may be useless
• Data should allow quantification of improvement
• Be careful when it comes to customer satisfaction surveys

Determine stability of critical processes

• Control charts

• Diagnose type of process variation

– (Assignable) special cause

– Common cause

• Take action

Determine system capability

• After system changes (improvement projects) triggered by unacceptable common cause variation, process stability must be reevaluated, so that the new process is capable of meeting specs such as minimum response rates, minimum error rates, and deadlines
• Reduced variation is maintained by adhering to SOPs or CBMs

System for process monitoring

• Processes cannot be expected to remain stable over time
• Technology changes, new types of human errors appear, and customer requirements change
• Thus, monitoring is necessary

Paradata in coding, say, occupation

• Manual: error rate by coder, category, and coder experience; within- and between-coder variability
• Computer-assisted: degree of computer consulting, error rates combined with computer use
• Automated: error rates by category; coding degree in general, by dictionary update, and by dictionary type

New types of paradata

• Interviewer notes

• Attributes of call attempts

• Nature of interaction with sample member

• Behaviours during the interview

• Flagging imputed records

• Keystroke data

• Response latency

Importance of paradata (I)

• Continuous updates of progress and stability checks (monitoring)
  – Control charts, standard reports
  – Managers choose to act or not to act
  – Early warning system
• Input to long-run process improvement of product quality
  – Analysis of special and common cause variation
• Input to methodological changes
  – Finding and eliminating root causes of problems
  – Research

Importance of paradata (II)

• Responsive designs
  – Simultaneous monitoring of paradata and regular survey data to improve efficiency and accuracy
• Input to organizational change
  – E.g., centralization, decentralization, standardization
• Quality profiles, client communication, public-use paradata files, inference, picturing quality over time

Exploratory analysis of paradata

• Example of a multivariate situation
• Observing one interviewer: large % vacant housing, unusual time of interview, short interview length, response pattern does not vary much
• Possible curbstoning
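A minimal sketch of such a multivariate screen; the interviewer-level paradata are invented and the two-sigma threshold is only a convention:

```python
# Sketch: flag interviewers whose paradata are atypical on several indicators.
from statistics import mean, stdev

def flag_outliers(values, threshold=2.0):
    """Indices whose z-score exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    return {i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold}

interview_minutes = [22.1, 24.3, 21.8, 9.5, 23.0, 22.7]   # mean length per interviewer
vacant_rate = [0.04, 0.05, 0.03, 0.19, 0.04, 0.05]        # share of 'vacant' outcomes

suspects = flag_outliers(interview_minutes) & flag_outliers(vacant_rate)
print(suspects)  # {3}: short interviews AND many 'vacant' outcomes -> review
```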

Risks associated with paradata

• There could be a lot, since many are automatic byproducts
• There could be many indirect indicators of cost and quality
• Correct analysis approaches must be used
• Ethical concerns
• Overuse and underuse

Thoughts on development

• Process indicators should be key
• Paradata are multivariate in nature and might have to be combined to be relevant
• We need to learn how to use paradata to intervene in the process as needed
• Create paradata archives to allow reanalysis, so that understanding of what is key can grow or change
• Examine the potential of partnerships across organizations
• Communicating paradata with users

An Overview of Survey Error Evaluation Methods

Purpose of Survey Error Evaluation

• Compare data collection modes or methods

• Optimize allocation of resources

• Error reduction for specific survey processes

• Provide users with information on data quality

• Adjusting estimates for nonsampling error

General Methods for Evaluation

• Pretesting

• Experiments

• Statistical process control
  – Process control
    – Key process variables
    – Control charts
  – Acceptance sampling

• Postsurvey validation

Some Techniques for Survey Evaluation I

• Evaluation method: expert review of questionnaires (unstructured or structured)
• Stage: design
• Purpose: identify problems with questionnaire layout and format, question wording, order, and instructions

Some Techniques for Survey Evaluation II

• Evaluation method: cognitive methods (behavior coding, cognitive interviewing, other cognitive lab methods)
• Stage: design/pretest
• Purpose: evaluate one or more stages of the response process

Some Techniques for Survey Evaluation III

• Evaluation method: debriefings (interviewer group discussions, respondent focus groups)
• Stage: pretest/survey/post-survey
• Purpose: evaluate questionnaire and data collection procedures

Some Techniques for Survey Evaluation IV

• Evaluation method: observation (supervisor observation, telephone monitoring, tape recording/CARI)
• Stage: pretest/survey
• Purpose: evaluate interviewer performance; identify questionnaire problems

Some Techniques for Survey Evaluation V

• Evaluation method: post-survey analysis (experimentation, nonrandom observation, internal consistency, external validation)
• Stage: post-survey
• Purpose: compare alternative methods of data collection, estimate MSE components, validate estimates

Some Techniques for Survey Evaluation VI

• Evaluation method: post-survey data collection (reinterviews, nonresponse follow-up, record checks)
• Stage: post-survey
• Purpose: estimate MSE components

Basic interview-reinterview table for a dichotomous variable

                Interview
Reinterview     1       0       Total
1               a       b       a+b
0               c       d       c+d
Total           a+c     b+d     n

Some measures

• g = (b+c)/n, the gross difference rate or disagreement rate
• A = (a+d)/n, the agreement rate (1 - g)
• ndr = (b-c)/n, the net difference rate
• I = g / [p1(1-p2) + p2(1-p1)], the index of inconsistency, where p1 = (a+b)/n and p2 = (a+c)/n
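A minimal sketch computing these measures from the 2x2 counts; a, b, c, d are hypothetical:

```python
# Sketch: reinterview consistency measures from the 2x2 table counts.
def reinterview_measures(a, b, c, d):
    n = a + b + c + d
    g = (b + c) / n                           # gross difference rate
    A = (a + d) / n                           # agreement rate (1 - g)
    ndr = (b - c) / n                         # net difference rate
    p1, p2 = (a + b) / n, (a + c) / n
    I = g / (p1 * (1 - p2) + p2 * (1 - p1))   # index of inconsistency
    return g, A, ndr, I

print(reinterview_measures(a=400, b=50, c=30, d=520))
# (0.08, 0.92, 0.02, 0.162...): 8% disagreement, modest inconsistency
```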

Practical Survey Design for Minimizing MSE

What Should Be Designed?

• Requirements + specifications + operations
• Ideal goal + defined goal + actual results
• Good survey design means control of accuracy through the specs (QA) and control of operations (QC)

Some Early Thinking

• Hansen-Hurwitz-Pritzker 1967
  – Take all error sources into account
  – Minimize all biases and select a minimum-variance scheme, so that Var becomes an approximation of (a decent) MSE
  – The zero-defects movement that later became Six Sigma
• Dalenius 1969
  – Total survey design

Alternative Criteria of Effectiveness

• Minimizing MSE for a given budget while meeting other requirements

• Maximizing fitness for use for a given budget

• Maximizing comparability for a given budget

• All these reversed

• Something else?

The Elements of Design

• Assessing the survey situation (requirements)
• Choosing methods, procedures, "intensities", and controls (specifications)
• Allocating resources
• Assessing alternative designs
• Carrying out one of them, or a modification of it
• Having a Plan B

So, What’s the Problem?

• No established survey planning theory

• Multi-purpose, many users

• The information paradox

• Uninformed clients/users/designers

• Much design work is partial, not total

• Limited knowledge of effects of measures on MSE and cost

More Problems

• Decision theory and economic theory are not used to their potential
• New surveys are conducted without sufficient consideration of what is already known
• No one knows the proper allocation of resources put in before, during, and after
• The literature is small

Various Skills Needed Which Calls for a Design Team

• Survey methodology

• Subject-matter

• Statistics (decision theory, risk analysis, loss functions, optimization, process control)

• Economics (cost functions, utility)

• IT

Rules of the Road

• Use reliable methods

• Develop a survey plan showing the resource allocation to each stage

• To be able to allocate resources optimally, collect information during planning and implementation

Rules of the Road (cont’d)

• Monitor the processes that lead to the product

• Disseminate information on data quality to users and producers

The Balance Between Cost, Errors, and Other Quality Features

• Quality dimensions conflict
  – Accuracy vs timeliness
  – Accuracy vs relevance
  – Comparability vs accuracy
  – Cost vs error

Problems that Impede our Ability to Optimize Surveys

• Lack of expertise
• The relationship between resources spent on error reduction and actual error reduction is unknown
• Survey errors are highly interactive

Problems that Impede our Ability to Optimize Surveys (cont'd)

• Major surveys are multi-purpose
• All quality dimensions, and the constraints on them, limit design flexibility
• It is not known how to allocate resources between pilot studies, error reduction, and error measurement

Bad News and Good News

• Bad news
  – Cost-error optimization of a survey can be extremely complex, and much of this complexity is unknown
• Good news
  – Simple models describing the relationship between cost and error are still useful, because often the optimum is flat

The Adaptive Element

• The entire survey process should be responsive to anticipated uncertainties that exist before the process begins, and to real-time information obtained throughout the execution of the process
• Or: use process data (paradata) to check, and if necessary adjust, the process

We Should Assemble What We Know

• Assessment methods

• Design principles

• Trade-offs and their effects

• The potential offered by other disciplines

• We shouldn’t accept partial designs

Apply Design Principles

• If pop is skewed then….

• If pop is nested then….

• If questions are sensitive then….

• If a high NR rate is expected then…

Examples of Trade-offs

• Accuracy vs timeliness
• Response burden vs wealth of detail
• Conducting a survey vs other information collection
• Large n vs smaller n
• Mixed vs single mode
• NR bias vs measurement error
• NR vs interpretation by family members

Example of Outline for a Survey Plan

• Statement of work

• Technical approach

• Management plan

• Schedule of activities and deliverables

• Budget

Checking out the Resources

• Consult in-house experts
• Participate in professional activities
• Develop current best methods for major survey processes

Checking out the Resources (cont’d)

• Apply findings from the survey methods literature

• Consult general quality guidelines developed by prominent organizations

Examples of resources

• Conferences
  – ASA
  – AAPOR
  – ISI
  – Topic
• Journals
  – JOS
  – Survey Methodology
  – POQ
• Books

Using Pilot Studies to Inform Survey Design

• The paradox: in principle, the survey designer needs information that will not be available until the survey has been completed
• The answer: pilot studies on a smaller scale than the survey itself

Examples of Pilot Study Topics

• Choice of mode

• Length of recall period

• Topic sensitivity

• Response burden

• Clarity of concepts and definitions

• Effect of confidentiality pledges

• Question wording

• Alternative respondent rules

• Time estimates

• Expected rates of nonsampling error

• Cost components

Documentation

• Survey administrative processes
  – Survey plan
  – Revisions of the plan
  – Process details
  – Process variables

Documentation (cont'd)

• Quality reports
  – Use a framework based on quality dimensions
  – Report estimates of MSE components
  – In the absence of MSE component estimates, provide indicators of quality
  – Implement a rolling evaluation scheme