Online use of randomized response measurements – C.Snijders, J.Weesie. GOR, Hamburg, March 10-12,...

Online use of randomized response measurements – C.Snijders, J.Weesie. GOR, Hamburg, March 10-12, 2008

The Online Use of Randomized Response Measurements

Chris Snijders, Eindhoven University of Technology, The Netherlands

Jeroen Weesie, Utrecht University, The Netherlands


Page 2

Questions in surveys

• Surveys are one of the standard instruments of the social scientist

• You ask about behavior, attitudes, characteristics, etc.

• Big problem: non-response (especially among firms), so you get selective responses (cf. Dutch elections)

• Many surveys are now conducted online, following an email invitation, banners, etc.

Page 3

Internet surveys

• Seem to work relatively well

• Except for sensitive questions (which were problematic off-line as well)

• Social desirability bias: the tendency to report about oneself in a favourable manner or in accordance with local norms (Edwards, 1957)

Page 4

Getting rid of social desirability bias

• Indirect questions

• "Covariate technique": Marlowe-Crowne scale (MCSDS)

• Lie-detector (!; this does not seem to work that well online)

• Stress-reduction through question wording ("everybody does things they later regret ...")

• Randomized response

Page 5

Sensitive questions in surveys

For instance, questions about

– criminal behavior
– sexual preferences
– monetary issues
– ...

Two major concerns

– Survey drop-out
– Useless answers (respondents do not admit to behavior that is likely to be considered inappropriate or weird)

Page 6

Throughout

ONLY BINARY RESPONSE VARIABLES

(0/1)

YES = ADMITTING TO THE SENSITIVE ISSUE

Page 7

Possible solution: Randomized Response

See, e.g., Warner, 1965; Kuk, 1990; Chaudhuri & Mukerjee, 1988; Fox, 1986

Basic idea (here you see the forced response method):

Did you cheat on your tax-return last year?

Respondent is instructed to roll two dice:

if 2, 3 or 4 : reply YES
if 11 or 12  : reply NO
otherwise    : tell truth
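The probabilities implied by the dice rule can be checked by enumeration. The sketch below assumes two fair six-sided dice; the function name `forced_response_probabilities` is illustrative and not taken from the slides.

```python
from fractions import Fraction
from itertools import product


def forced_response_probabilities(yes_sums=(2, 3, 4), no_sums=(11, 12)):
    """Return (P(forced yes), P(forced no), P(truthful answer)) for two fair dice."""
    # All 36 equally likely rolls of two six-sided dice, reduced to their sums.
    outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]
    total = len(outcomes)
    p_yes = Fraction(sum(s in yes_sums for s in outcomes), total)
    p_no = Fraction(sum(s in no_sums for s in outcomes), total)
    return p_yes, p_no, 1 - p_yes - p_no


p_yes, p_no, p_truth = forced_response_probabilities()
print(p_yes, p_no, p_truth)  # 1/6 1/12 3/4
```

So under this rule one sixth of respondents are forced to say yes, one twelfth are forced to say no, and three quarters answer truthfully.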

Page 8

Randomized Response

• Allows estimation of group averages (if respondents follow the protocol).

• Protocols other than using dice are possible, e.g., using a question such as:

If your mother’s birthday is in Jan, Feb, Mar : YES
If your mother’s birthday is in Nov, Dec      : NO
Otherwise                                     : truth

Did you cheat on your tax-return last year?

Respondent is instructed to roll two dice:

if 2, 3 or 4 : reply YES
if 11 or 12  : reply NO
otherwise    : tell truth

The estimate might be negative!
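A minimal sketch of how a group average is recovered from forced-response answers, and why the result can drop below zero. It assumes full compliance with the protocol. The function name is illustrative, and the birthday example additionally assumes (hypothetically) that all twelve birth months are equally likely.

```python
def forced_response_estimate(observed_yes, p_forced_yes=1/6, p_truth=3/4):
    """Moment estimator based on: observed = p_forced_yes + p_truth * prevalence."""
    return (observed_yes - p_forced_yes) / p_truth


# Dice protocol from the slides: forced yes 1/6, truthful 3/4.
print(round(forced_response_estimate(0.30), 3))  # 0.178
print(round(forced_response_estimate(0.10), 3))  # -0.089

# Mother's-birthday protocol under the equal-months assumption:
# forced yes 3/12, forced no 2/12, truthful 7/12.
print(round(forced_response_estimate(0.30, p_forced_yes=3/12, p_truth=7/12), 3))  # 0.086
```

Whenever fewer respondents say yes than the share that should have been forced to say yes, the estimate is negative. That can happen by sampling error when the true prevalence is low, or because respondents do not follow the protocol.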

Page 9

Other ways to do randomized response

• Roll the dice. If you roll 2 through 7 answer question number 1, otherwise answer question 2:

1) I own an illegal copy of Microsoft Office.
correct / not correct

2) I do not own an illegal copy of Microsoft Office.
correct / not correct
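This two-statement design corresponds to Warner's (1965) original method. The sketch below assumes two fair dice, so that a sum of 2 through 7 has probability 21/36 = 7/12, and assumes respondents answer the selected statement truthfully. The function name and the input share are illustrative.

```python
from fractions import Fraction


def warner_estimate(share_correct, p_statement_1):
    """Warner (1965) estimator of the prevalence of the sensitive attribute.

    share_correct: observed share answering "correct".
    p_statement_1: probability that the randomizer selects statement 1.
    """
    if p_statement_1 == Fraction(1, 2):
        raise ValueError("prevalence is not identified when p = 1/2")
    # share_correct = p * prevalence + (1 - p) * (1 - prevalence), solved for prevalence.
    return (share_correct + p_statement_1 - 1) / (2 * p_statement_1 - 1)


p = Fraction(21, 36)  # P(sum of two dice is 2..7)
print(warner_estimate(Fraction(45, 100), p))  # 1/5
```

With p = 7/12 the denominator 2p - 1 is only 1/6, so sampling noise in the observed share is multiplied by six. This design therefore protects respondents well but is statistically expensive.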

Page 10

Or:

• How many of the following issues pertain to you:

Version 1
- you own a laptop
- you like country music
- you own a motor-cycle
- you play a musical instrument

Version 2
- you own a laptop
- you like country music
- you own a motor-cycle
- you play a musical instrument
- you own an illegal copy of Microsoft Office
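With the two list versions assigned at random, the prevalence of the sensitive item can be estimated as the difference in mean counts. The sketch below uses made-up counts purely for illustration; they are not data from the study, and the function name is hypothetical.

```python
from statistics import mean


def item_count_estimate(counts_short_list, counts_long_list):
    """Difference in mean counts between the long-list and short-list groups."""
    return mean(counts_long_list) - mean(counts_short_list)


# Made-up illustrative counts, not data from the study.
version_1 = [1, 2, 0, 3, 2, 1, 2, 1]  # four innocuous items
version_2 = [2, 2, 1, 3, 2, 1, 3, 2]  # same items plus the sensitive one
print(item_count_estimate(version_1, version_2))  # 0.5
```

The estimate relies on random assignment to the two versions, so that the innocuous items have the same expected count in both groups.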

Page 11

Randomized Response

• Main use: dichotomous variables (yes-no)

• Two kinds of studies:

– With an objective control

– Without an objective control: we assume higher observed percentages are better measurements

• RR improves results (in paper-and-pencil surveys; Edgell et al., 1982; Lensvelt-Mulders et al., 2005), but is still far from perfect

Page 12

Randomized Response

But ... not often used, because:

• Necessary sample size is larger (typically 750 or more, given a prevalence of 7%)

• Widespread myth that analyses at the individual level are impossible.

And

• Most of the evidence is based on off-line research
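To illustrate why the necessary sample size grows, the sketch below compares the standard error of a directly asked proportion with that of the forced-response estimate under the dice protocol (forced yes 1/6, truthful 3/4). It assumes full compliance and simple random sampling, and it is not an attempt to reproduce the figure of 750 quoted above. The function names are illustrative.

```python
import math


def direct_se(prevalence, n):
    """Standard error of a directly asked proportion."""
    return math.sqrt(prevalence * (1 - prevalence) / n)


def forced_response_se(prevalence, n, p_forced_yes=1/6, p_truth=3/4):
    """Standard error of the forced-response estimate of the prevalence."""
    observed = p_forced_yes + p_truth * prevalence  # expected share of "yes"
    return math.sqrt(observed * (1 - observed) / n) / p_truth


prevalence, n = 0.07, 750
ratio = forced_response_se(prevalence, n) / direct_se(prevalence, n)
print(round(direct_se(prevalence, n), 4))           # 0.0093
print(round(forced_response_se(prevalence, n), 4))  # 0.0201
print(round(ratio ** 2, 1))                         # 4.7
```

Under these assumptions, a forced-response survey needs roughly 4.7 times as many respondents as a direct question to reach the same precision at a prevalence of 7%.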

Page 13

Individual level (logistic) regressions

... are possible, but that is not common knowledge.

1. Stata

2. SPSS www.randomizedresponse.nl, search for HLanalyse.pdf (in Dutch, unfortunately)

* Stata: logit log-likelihood for forced-response answers, where
* P(observed yes) = 1/6 + (3/4) * P(true yes)
capture program drop rrlogit_lf
program rrlogit_lf
    args lnf lp
    tempvar p
    quietly {
        gen double `p' = exp(`lp')/(1 + exp(`lp'))
        replace `p' = (1/6) + (3/4)*`p'
        replace `lnf' = ($ML_y1==1)*log(`p') + ($ML_y1==0)*log(1-`p')
    }
end

* Fit the model with Stata's ml command
ml clear
ml model lf rrlogit_lf (y = x1 x2)
ml max
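For readers without Stata, the same likelihood can be written in a few lines of plain Python. This is a sketch, not a replacement for a proper optimizer: the function names are illustrative, the answers are made up, and the fitting routine handles only the intercept-only case, using a simple ternary search.

```python
import math

P_FORCED_YES, P_TRUTH = 1 / 6, 3 / 4  # dice protocol from the slides


def rr_logit_loglik(beta, rows, y):
    """Log-likelihood of a logit model for forced-response answers.

    beta: coefficients (intercept first); rows: covariate lists without the
    intercept; y: observed 0/1 answers.
    """
    total = 0.0
    for x, answer in zip(rows, y):
        lp = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
        true_p = 1 / (1 + math.exp(-lp))         # P(truly yes | x)
        obs_p = P_FORCED_YES + P_TRUTH * true_p  # P(observed yes | x)
        total += math.log(obs_p) if answer == 1 else math.log(1 - obs_p)
    return total


def fit_intercept_only(y, lo=-10.0, hi=10.0, iterations=200):
    """Maximise the intercept-only log-likelihood by ternary search."""
    rows = [[] for _ in y]
    for _ in range(iterations):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if rr_logit_loglik([m1], rows, y) < rr_logit_loglik([m2], rows, y):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


y = [1] * 30 + [0] * 70                   # made-up answers: 30% observed "yes"
b0 = fit_intercept_only(y)
print(round(1 / (1 + math.exp(-b0)), 3))  # 0.178
```

Without covariates, the maximum-likelihood prevalence coincides with the moment estimate 4/3 * (0.30 - 1/6). With covariates, the same `rr_logit_loglik` function could be handed to any general-purpose numerical optimizer.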

Page 14

Randomized Response online

• Might work: replying online, you feel more anonymous; when combined with RR, you feel even more anonymous and hence do not mind answering sensitive questions

• Might not work:
– Implementation online is non-trivial
– Since online already makes one feel anonymous, loss in precision might not be compensated for
– Respondents might “play it safe” and not follow protocol

Page 15

Design

Population: Internet panel in the Netherlands (EuroClix)

Sensitive questions in four conditions (n = 3,557)

A  direct [control condition]             n = 1,078 completed
B  dice embedded in the survey            n = 910 completed
C  “downloadable dice”                    n = 679 completed
D  optional rand. response (if yes: B)    n = 890 completed

Question ...

Page 16

Sensitive questions about three topics

1. Behavior in surveys

2. Traffic violations

3. Illegal copies of software / movies / music

Page 17

Behavior in surveys

S1: At PanelClix I am registered under more than just one name.

S2: I fill out the surveys without really reading what they ask me.

S3: In the past two weeks, I filled out 4 or more surveys from PanelClix

S4: I sometimes fill out surveys under the id of another PanelClix member

S5: I sometimes let somebody else fill out surveys under my id.

S6: I sometimes lie about personal characteristics in a PanelClix survey

S7: When I have to respond to large numbers of statements I sometimes just rush through the answers.

S8: I am what you could call "a professional respondent"

S9: Almost always I leave open questions blank.

Page 18

Traffic violations

T1: Have you had a speeding ticket in the past 3 months?

T2: Do you ever drive faster than 100 where only 50 is allowed?

T3: Do you ever drive a car or motorcycle when you know you have had too much to drink?

T4: Did you ignore a red traffic light in the past week (by car or motorcycle)?

T5: On the highway I tend to drive closely behind the car in front of me, so that they will get out of my way ("bumperkleven", i.e. tailgating).

T6: Have you ever damaged the vehicle of somebody else without reporting it?

T7: In the past two months, have you driven faster than 150 km/h with a car or motorcycle?

T8: In the past two months, did you park in a place where you had to pay, but paid less than you had to, or nothing at all?

Page 19

Illegal music, movies and software

IC1: I own copied music for which I have not paid although I should have

IC2: I own copied movies for which I have not paid although I should have

IC3: I own copied software for which I have not paid although I should have

IC4: I have passed on copied music, movies or software to others so that they do not have to pay for it, although I know they should have just bought it.

IC5: I have an illegal copy of Microsoft Windows in my possession.

IC6: Whenever possible, I try to get commercial software without having to pay for it.

IC7: The largest part of my music collection is actually illegal.

IC8: I have an illegal copy of Microsoft Office in my possession.

Page 20

Do you understand ...

...what you are supposed to do?

No clue             4%
Not really          3%
I think I do       39%
Completely clear   55%

...what the purpose of the procedure is?

[Figure: density plot of "Understand usefulness of procedure" scores on a 0-10 scale]

Page 21

Some data cleaning is necessary ...

Page 22

Completion rates per condition

(given that respondent started survey)

A: direct         85.5%
B: RR embedded    80.7%
C: RR download    62.2%
D: RR optional    78.5%

(Mean time for survey completion: 15 minutes)

So downloadable dice cost 15-20 percentage points of the completion rate

Page 23

Also: nine indirect questions

Out of 100 people, how many ...

(behavior in surveys) S6, S2, S4

(traffic violations) T2, T3, T4

(illegal copies) IC1, IC5, IC8

And several covariates:
- age
- gender
- computer literacy
- ...

[for those in the control condition]

Indirect questions correlate with the direct question scores, but are not strong enough predictors to actually predict the behavioral data.

Page 24

Results: surveys

      Direct  RRemb  RRdown  RRoptN  RRoptY
S1         6     -6      -2       4       0
S2         2     -7      -8       2      -4
S3        30     30      31      30      29
S4         2      9     -11       1      -9
S5         2      8     -11       1     -11
S6         4      4      -5       5      -6
S7        27     24      29      32      27
S8         9      4      -1       9      -1
S9        27     24      26      25      25

NB Estimate = 4/3*(Obs – 1/6)
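The conversion in the NB line can be inverted to see which observed share of "yes" lies behind a negative entry, and how far below zero the entry is relative to sampling noise. The sketch below assumes, hypothetically, that every completer in a condition answered the item (910 for embedded dice, 679 for downloadable dice) and that the true prevalence is zero. The function names are illustrative.

```python
import math


def observed_share(estimate, p_forced_yes=1/6, p_truth=3/4):
    """Invert Estimate = 4/3 * (Obs - 1/6) to recover the observed share of "yes"."""
    return p_forced_yes + p_truth * estimate


def z_against_zero(estimate, n, p_forced_yes=1/6, p_truth=3/4):
    """Estimate divided by its standard error when the true prevalence is 0."""
    se_null = math.sqrt(p_forced_yes * (1 - p_forced_yes) / n) / p_truth
    return estimate / se_null


print(round(observed_share(-0.06), 3))       # 0.122
print(round(z_against_zero(-0.06, 910), 1))  # -3.6
print(round(z_against_zero(-0.11, 679), 1))  # -5.8
```

Under these assumptions, estimates such as -6% or -11% are several standard errors below zero. That is hard to attribute to chance alone and is consistent with respondents answering no when the dice told them to answer yes.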

Page 25

Results: traffic

      Direct  RRemb  RRdown  RRoptN  RRoptY
T1        16     10      11      18       8
T2         3     -1       2       6       0
T3        12      9       2      10       1
T4        31     32      26      37      34
T5        16     14      14      19      13
T6         6     -1       1       7       0
T7        20     15       1      19      12
T8        38     32       6      34      37

Page 26

Results: illegal copies

      Direct  RRemb  RRdown  RRoptN  RRoptY
IC1       62     63      60      62      70
IC2       43     44      46      44      55
IC3       47     53      44      44      51
IC4       43     46      49      44      51
IC5       14     13      13      16       9
IC6       46     51      49      44      56
IC7       26     26      24      25      22
IC8       21     24      24      22      15

Page 27

This does not seem to work that well...

• The group of people who are convinced by the Randomized Response method is not large enough

• Or ... respondents are not following the protocol!

Page 28

Who are not following the protocol?

7.0% of respondents have 2 “yes”-answers or fewer

2.4% give no “yes”-answers at all

Logistic regression on 2 “yes”-answers or fewer

Age                                          +
Female                                       +
Education                                    -
Computer literacy1 (podcasts, RSS etc.)      0
Computer literacy2 (basic internet skills)   -
Understand how                               0
Understand why                               -

Page 29

Randomized response: non-compliance

Three types of respondents:

1) Honest yes =
2) Honest no =
3) Cheater: has yes, but says no =

And: they add up to 1, and we are interested in

Assumption: if ordered to say no, all do so

Then we have:

Direct questions :

Indirect questions :

Page 30

Note: the idea itself is old (and not mine)

The idea for this is in Clarke 1998.

Downloadable from

http://chrissnijders.com/tempback/Clarke1998.pdf

Page 31

Taking non-compliance into account

      Direct  RRemb  RRdown  RRoptN  RRoptY  RRadjNC
IC1       62     63      60      62      70       84
IC2       43     44      46      44      55       61
IC3       47     53      44      44      51       72
IC4       43     46      49      44      51       63
IC5       14     13      13      16       9       22
IC6       46     51      49      44      56       69
IC7       26     26      24      25      22       38
IC8       21     24      24      22      15       36

Page 32

Conclusions – Randomized Response online

• Not much support for Randomized Response (with the Forced Response method) for these particular topics if non-compliance is not taken into account. For illegal software we find small positive effects, and larger effects if non-compliance is taken into account.

• Some indication that RR works better as the sensitivity of the topic increases

Page 33

Conclusions (2)

• Quick and dirty result: compliance with the protocol is more common among younger, male, highly educated, computer-literate respondents (who understand what RR is for).

• Allowing for optional Randomized Response does not seem to work very well; perhaps some support with the illegal copying topic

• Downloadable dice – not a good idea

?

Page 34

So there is good and bad news ...

• The Internet already induces high openness, which makes RR less necessary.

• For really sensitive behavior, RR can be conducted online and analyzed relatively easily ...

• ... but compliance with the protocol is a major issue and has to be explicitly modeled: there are different kinds of non-compliance, which makes the analyses less straightforward

Page 35

Possible assignments

First review the literature on Randomized Response measurement (given online).

Either:

1) For the fanatics: Run a mini-survey on some topic that is sensitive but interests you (help will be provided). Try to come up with different ways to measure the topic of interest.

2) Design an experiment to further test the use of randomized response measurement. For instance, compare different methods.

3) Give a brief overview of randomized response measurement, and come up with a large set of questions that can be used as randomizers (such as "in which month was your mother born?").

Page 36

Results: traffic

      RRnCOMPL  Direct  RRemb  RRdown  RRoptN  RRoptY
T1          72      16     10      11      18       8
T2          69       3     -1       2       6       0
T3          55      12      9       2      10       1
T4          34      31     32      26      37      34
T5          42      16     14      14      19      13
T6          52       6     -1       1       7       0
T7          65      20     15       1      19      12
T8          33      38     32       6      34      37