The Online Use of Randomized Response Measurements


Page 1: The Online Use of  Randomized Response Measurements

Online use of randomized response measurements – C.Snijders, J.Weesie. GOR, Hamburg, March 10-12, 2008

The Online Use of Randomized Response Measurements

Chris Snijders, Eindhoven University of Technology, The Netherlands

Jeroen Weesie, Utrecht University, The Netherlands

Page 2: The Online Use of  Randomized Response Measurements

Questions in surveys

• Surveys are one of the standard instruments of the social scientist

• You ask for behavior, attitudes, characteristics etc

• Big problem: non-response (especially firms), you get selective responses (cf. Dutch elections)

• Many surveys are now conducted online, with respondents recruited via email invitations, banners, etc.

Page 3: The Online Use of  Randomized Response Measurements

Internet surveys

• Seem to work relatively well

• Except for sensitive questions (which were problematic off-line as well)

• Social desirability bias: the tendency to report about oneself in a favourable manner or in accordance with local norms (Edwards, 1957)

Page 4: The Online Use of  Randomized Response Measurements

Getting rid of social desirability bias

• Indirect questions

• "Covariate technique" Marlowe-Crowne scale (MCSDS)

• Lie-detector (!; this does not seem to work that well online)

• stress-reduction through question wording ("everybody does things they later regret ...")

• randomized response

Page 5: The Online Use of  Randomized Response Measurements

Sensitive questions in surveys

For instance, questions about

– criminal behavior
– sexual preferences
– monetary issues
– ...

Two major concerns

– Survey drop-out
– Useless answers (respondents do not admit to behavior that is likely to be considered inappropriate or weird)

Page 6: The Online Use of  Randomized Response Measurements

Throughout

ONLY BINARY RESPONSE VARIABLES

(0/1)

YES = ADMITTING TO THE SENSITIVE ISSUE

Page 7: The Online Use of  Randomized Response Measurements

Possible solution: Randomized Response

Basic idea (here you see the forced response method):

Did you cheat on your tax-return last year?

Respondent is instructed to roll two dice:

if 2, 3 or 4 : reply YES
if 11 or 12 : reply NO
otherwise : tell truth

See, e.g., Warner, 1965; Kuk, 1990; Chaudhuri & Mukerjee, 1988; Fox, 1986
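The probabilities behind the dice rule can be checked by enumerating all 36 outcomes of two dice. The sketch below is illustrative only; the function name `forced_response_probabilities` is made up for this example and does not come from the slides.

```python
from fractions import Fraction
from itertools import product


def forced_response_probabilities():
    """Return (P(forced YES), P(forced NO), P(truthful answer)) for two dice."""
    forced_yes = forced_no = truthful = 0
    for first, second in product(range(1, 7), repeat=2):
        total = first + second
        if total in (2, 3, 4):
            forced_yes += 1
        elif total in (11, 12):
            forced_no += 1
        else:
            truthful += 1
    return Fraction(forced_yes, 36), Fraction(forced_no, 36), Fraction(truthful, 36)


p_yes, p_no, p_truth = forced_response_probabilities()
print(p_yes, p_no, p_truth)  # 1/6 1/12 3/4
```

The values 1/6 and 3/4 are the constants that reappear in the correction used for the results tables (Estimate = 4/3*(Obs – 1/6)).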

Page 8: The Online Use of  Randomized Response Measurements

Randomized Response

• Allows estimation of group averages (if respondents follow the protocol). Note that the estimate might be negative!

• Protocols other than using dice are possible, e.g., using a question such as:

If your mother’s birthday is in Jan, Feb, Mar : YES
If your mother’s birthday is in Nov, Dec : NO
Otherwise : truth

Did you cheat on your tax-return last year?

Respondent is instructed to roll two dice:

if 2, 3 or 4 : reply YES
if 11 or 12 : reply NO
otherwise : tell truth
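Why the estimate can turn out negative follows from a short derivation for the two-dice rule. The symbols are introduced here and are not taken from the slides: $\pi$ is the true share of "yes", $\lambda$ is the probability of observing a "yes", and every respondent is assumed to follow the protocol.

```latex
\lambda = \Pr(\text{forced yes}) + \Pr(\text{truth})\,\pi
        = \tfrac{1}{6} + \tfrac{3}{4}\,\pi
\quad\Longrightarrow\quad
\hat{\pi} = \tfrac{4}{3}\left(\hat{\lambda} - \tfrac{1}{6}\right)
```

Whenever the observed share of "yes" answers $\hat{\lambda}$ falls below 1/6, the estimate $\hat{\pi}$ is negative.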

Page 9: The Online Use of  Randomized Response Measurements

Other ways to do randomized response

• Roll the dice. If you roll 2 through 7 answer question number 1, otherwise answer question 2:

1) I own an illegal copy of Microsoft Office.
correct / not correct

2) I do not own an illegal copy of Microsoft Office.
correct / not correct
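With two dice, a total of 2 through 7 comes up with probability 21/36 = 7/12, so each respondent answers question 1 with probability 7/12 and its negation otherwise. A minimal sketch of how the share of owners could be recovered from the share of "correct" answers, assuming full compliance (the function name `warner_estimate` and the example numbers are illustrative, not from the slides):

```python
from fractions import Fraction

# Probability of rolling a total of 2..7 with two dice: (1+2+3+4+5+6)/36.
P_QUESTION_1 = Fraction(21, 36)


def warner_estimate(share_correct, p=P_QUESTION_1):
    """Estimate the share for whom statement 1 is true.

    share_correct: observed share of "correct" answers, whichever question
    was answered. Under full compliance,
        share_correct = p * pi + (1 - p) * (1 - pi),
    which is solved for pi below. Requires p != 1/2.
    """
    p = float(p)
    return (share_correct - (1 - p)) / (2 * p - 1)


# Hypothetical example: if 20% own an illegal copy, the expected share of
# "correct" answers is 7/12 * 0.2 + 5/12 * 0.8 = 0.45.
print(round(warner_estimate(0.45), 3))  # 0.2
```

Because 2p - 1 is only 1/6 here, sampling noise in the observed share is multiplied by 6 in the estimate, which is the price paid for the extra privacy.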

Page 10: The Online Use of  Randomized Response Measurements

Or:

• How many of the following issues pertain to you:

Version 1

- you own a laptop
- you like country music
- you own a motor-cycle
- you play a musical instrument

Version 2

- you own a laptop
- you like country music
- you own a motor-cycle
- you play a musical instrument
- you own an illegal copy of Microsoft Office
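If respondents are randomly assigned to one of the two versions, the difference between the mean counts estimates the share owning an illegal copy. This design is commonly called the item count technique or list experiment; the slides do not name it. A minimal sketch with made-up counts:

```python
from statistics import mean


def item_count_estimate(counts_version_1, counts_version_2):
    """Estimate the prevalence of the sensitive item.

    counts_version_1: counts reported for the list without the sensitive item.
    counts_version_2: counts reported for the list with the sensitive item.
    Assumes random assignment to versions and honest counting.
    """
    return mean(counts_version_2) - mean(counts_version_1)


# Hypothetical answers from two small groups of respondents.
version_1 = [1, 2, 2, 0, 3, 1, 2, 1]
version_2 = [2, 2, 3, 1, 3, 1, 2, 2]
print(item_count_estimate(version_1, version_2))  # 0.5
```

No individual answer reveals whether the sensitive item applies, unless a respondent in Version 2 reports all five items or none.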

Page 11: The Online Use of  Randomized Response Measurements

Randomized Response

• Main use: dichotomous variables (yes-no)

• Two kinds of studies:

– With an objective control

– Without an objective control, we assume higher observed percentages are better measurements

• RR improves results (in paper-and-pencil surveys; Edgell et al., 1982; Lensvelt-Mulders et al., 2005), but is still far from perfect


Page 12: The Online Use of  Randomized Response Measurements

Randomized Response

But ... not often used, because:

• Necessary sample size is larger (typically 750 or more, given a prevalence of 7%)

• Widespread myth that analyses at the individual level are impossible.

And

• Most of the evidence is based on off-line research
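The cost in precision can be made concrete for the forced response rule with two dice (forced "yes" with probability 1/6, truthful answer with probability 3/4). The sketch below compares the standard error of the randomized response estimate with that of a direct question. It assumes full compliance and honest direct answers, and the function names are illustrative. It does not reproduce how the figure of 750 on the slide was derived.

```python
import math

P_FORCED_YES = 1 / 6   # dice total 2, 3 or 4
P_TRUTH = 3 / 4        # dice total 5 through 10


def se_direct(prevalence, n):
    """Standard error of a proportion from a direct question."""
    return math.sqrt(prevalence * (1 - prevalence) / n)


def se_forced_response(prevalence, n):
    """Standard error of the estimate 4/3 * (observed share - 1/6)."""
    p_observed_yes = P_FORCED_YES + P_TRUTH * prevalence
    return math.sqrt(p_observed_yes * (1 - p_observed_yes) / n) / P_TRUTH


n, prevalence = 750, 0.07
ratio = se_forced_response(prevalence, n) / se_direct(prevalence, n)
print(f"direct: {se_direct(prevalence, n):.4f}")
print(f"forced response: {se_forced_response(prevalence, n):.4f}")
print(f"variance inflation: {ratio ** 2:.2f}")
```

At a prevalence of 7% the variance is inflated by a factor between 4 and 5, so matching the precision of a direct question takes several times as many respondents.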


Page 13: The Online Use of  Randomized Response Measurements

Individual level (logistic) regressions

... are possible, but that is not common knowledge.

1. Stata

2. SPSS www.randomizedresponse.nl, search for HLanalyse.pdf (in Dutch, unfortunately)

    capture program drop rrlogit_lf
    program rrlogit_lf
        args lnf lp
        tempvar p
        quietly {
            // logistic probability of a true "yes"
            gen double `p' = exp(`lp')/(1 + exp(`lp'))
            // forced response: P(observed yes) = 1/6 + 3/4 * P(true yes)
            replace `p' = (1/6) + (3/4)*`p'
            replace `lnf' = ($ML_y1==1)*log(`p') + ($ML_y1==0)*log(1-`p')
        }
    end

    ml clear
    ml model lf rrlogit_lf (y = x1 x2)
    ml max
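For readers without Stata, the log-likelihood that the Stata program evaluates can be written out in plain Python. This is a sketch under the same forced response constants (1/6 and 3/4); the function name `rr_logit_loglik` and the toy data are illustrative and not part of the original material.

```python
import math


def rr_logit_loglik(coefficients, rows, answers):
    """Log-likelihood of a forced-response logit model.

    coefficients: list of regression coefficients, intercept first.
    rows: list of covariate lists (without the constant).
    answers: list of observed 0/1 answers.
    """
    total = 0.0
    for covariates, answer in zip(rows, answers):
        linear = coefficients[0] + sum(
            b * x for b, x in zip(coefficients[1:], covariates)
        )
        p_true_yes = 1 / (1 + math.exp(-linear))
        # Forced response: P(observed yes) = 1/6 + 3/4 * P(true yes).
        p_observed_yes = 1 / 6 + 0.75 * p_true_yes
        total += math.log(p_observed_yes if answer == 1 else 1 - p_observed_yes)
    return total


# Hypothetical data: two covariates (x1, x2) and three respondents.
rows = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
answers = [1, 0, 1]
print(rr_logit_loglik([0.0, 0.0, 0.0], rows, answers))
```

Maximizing this function over the coefficients with a numerical optimizer should give the same kind of estimates as the `ml` commands in the Stata program.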

Page 14: The Online Use of  Randomized Response Measurements

Randomized Response online

• Might work: replying online already makes you feel more anonymous; when combined with RR, you feel even more anonymous and hence do not mind answering sensitive questions

• Might not work:

– Implementation online is non-trivial
– Since online already makes one feel anonymous, loss in precision might not be compensated for
– Respondents might “play it safe” and not follow protocol

Page 15: The Online Use of  Randomized Response Measurements

Design

Population: Internet panel in the Netherlands - EuroClix

Sensitive questions in four conditions (n=3,557)

A  direct [control condition]            n = 1,078 completed
B  dice embedded in the survey           n = 910 completed
C  “downloadable dice”                   n = 679 completed
D  optional rand. response (if yes: B)   n = 890 completed

Question ...

Page 16: The Online Use of  Randomized Response Measurements

Sensitive questions about three topics

1. Behavior in surveys

2. Traffic violations

3. Illegal copies of software / movies / music


Page 17: The Online Use of  Randomized Response Measurements

Behavior in surveys

S1: At PanelClix I am registered under more than just one name.

S2: I fill out the surveys without really reading what they ask me.

S3: In the past two weeks, I filled out 4 or more surveys from PanelClix

S4: I sometimes fill out surveys under the id of another PanelClix member

S5: I sometimes let somebody else fill out surveys under my id.

S6: I sometimes lie about personal characteristics in a PanelClix survey

S7: When I have to respond to large numbers of statements I sometimes just rush through the answers.

S8: I am what you could call "a professional respondent"

S9: Almost always I leave open questions blank.

Page 18: The Online Use of  Randomized Response Measurements

Traffic violations

T1: Have you had a speeding ticket in the past 3 months?

T2: Do you ever drive faster than 100 where only 50 is allowed?

T3: Do you ever drive a car or motorcycle when you know you have had too much to drink?

T4: Did you run a red traffic light in the past week (by car or motorcycle)?

T5: On the highway I tend to drive closely behind the car in front of me, so that they will get out of my way ("bumperkleven", i.e. tailgating).

T6: Have you ever damaged the vehicle of somebody else without reporting it?

T7: In the past two months, have you driven faster than 150 km/h with a car or motorcycle?

T8: In the past two months, did you park in a place where you had to pay, but paid less than you had to, or nothing at all?


Page 19: The Online Use of  Randomized Response Measurements

Illegal music, movies and software

IC1: I own copied music for which I have not paid although I should have

IC2: I own copied movies for which I have not paid although I should have

IC3: I own copied software for which I have not paid although I should have

IC4: I have passed on copied music, movies or software to others so that they do not have to pay for it, although I know they should have just bought it.

IC5: I have an illegal copy of Microsoft Windows in my possession.

IC6: Whenever possible, I try to get commercial software without having to pay for it.

IC7: The largest part of my music collection is actually illegal.

IC8: I have an illegal copy of Microsoft Office in my possession.


Page 20: The Online Use of  Randomized Response Measurements

Do you understand ...

... what you are supposed to do?

No clue            4%
Not really         3%
I think I do       39%
Completely clear   55%

...what the purpose of the procedure is?

[Figure: density of "Understand usefulness of procedure", scored 0 to 10]

Page 21: The Online Use of  Randomized Response Measurements

Some data cleaning is necessary ...

Page 22: The Online Use of  Randomized Response Measurements

Completion rates per condition

(Mean time for survey completion: 15 minutes)

(given that respondent started survey)

A: direct        85.5%
B: RR embedded   80.7%
C: RR download   62.2%
D: RR optional   78.5%

So downloadable dice cost 15-20 percentage points of the completion rate

Page 23: The Online Use of  Randomized Response Measurements

Also: nine indirect questions

Out of 100 people, how many ...

(behavior in surveys) S6, S2, S4
(traffic violations) T2, T3, T4
(illegal copies) IC1, IC5, IC8

And several covariates:

- age
- gender
- computer literacy
- ...

[for those in the control condition]

Indirect questions correlate with the direct question scores, but are not strong enough predictors to actually predict the behavioral data.

Page 24: The Online Use of  Randomized Response Measurements

Results: surveys

Item  Direct   RRemb  RRdown  RRoptN  RRoptY
S1         6      -6      -2       4       0
S2         2      -7      -8       2      -4
S3        30      30      31      30      29
S4         2       9     -11       1      -9
S5         2       8     -11       1     -11
S6         4       4      -5       5      -6
S7        27      24      29      32      27
S8         9       4      -1       9      -1
S9        27      24      26      25      25

NB Estimate = 4/3*(Obs – 1/6)
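To see what a negative entry in the table means, the correction can be run in both directions. A minimal sketch (function names are illustrative) that applies the formula and recovers the observed share of "yes" behind a table entry such as S1 under RRemb, assuming the table entries are percentages:

```python
def rr_estimate(observed_share):
    """Forced-response correction: 4/3 * (observed share of yes - 1/6)."""
    return 4 / 3 * (observed_share - 1 / 6)


def observed_from_estimate(estimate):
    """Invert the correction: observed share of yes implied by an estimate."""
    return 1 / 6 + 3 / 4 * estimate


# S1 under RRemb is reported as -6 (taken here to mean -6%).
observed = observed_from_estimate(-0.06)
print(f"{observed:.3f}")               # 0.122
print(f"{rr_estimate(observed):.2f}")  # -0.06
```

An observed share of about 12% is below the 1/6 (about 16.7%) of forced "yes" answers expected from the dice alone, which is why the estimate drops below zero.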

Page 25: The Online Use of  Randomized Response Measurements

Results: traffic

Item  Direct   RRemb  RRdown  RRoptN  RRoptY
T1        16      10      11      18       8
T2         3      -1       2       6       0
T3        12       9       2      10       1
T4        31      32      26      37      34
T5        16      14      14      19      13
T6         6      -1       1       7       0
T7        20      15       1      19      12
T8        38      32       6      34      37

Page 26: The Online Use of  Randomized Response Measurements

Results: illegal copies

Item  Direct   RRemb  RRdown  RRoptN  RRoptY
IC1       62      63      60      62      70
IC2       43      44      46      44      55
IC3       47      53      44      44      51
IC4       43      46      49      44      51
IC5       14      13      13      16       9
IC6       46      51      49      44      56
IC7       26      26      24      25      22
IC8       21      24      24      22      15

Page 27: The Online Use of  Randomized Response Measurements

This does not seem to work that well...

• The group of people who are convinced by the Randomized Response method is not large enough

• Or ... respondents are not following the protocol!

Page 28: The Online Use of  Randomized Response Measurements

Who are not following the protocol?

7.0% of respondents have two “yes” answers or fewer
2.4% give no “yes” answers at all

Logistic regression on two “yes” answers or fewer

Age                                           +
Female                                        +
Education                                     -
Computer literacy 1 (podcasts, RSS etc.)      0
Computer literacy 2 (basic internet skills)   -
Understand how                                0
Understand why                                -
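A respondent who follows the dice rule should rarely end up with no "yes" answers at all, because each question carries a 1/6 chance of a forced "yes". As a rough benchmark, the sketch below computes the largest share of such respondents that full compliance would allow. It rests on assumptions not stated on the slides: the count runs over all 25 sensitive items (S1-S9, T1-T8, IC1-IC8), the dice are rolled afresh for each item, and the reported percentages refer to respondents in the randomized response conditions.

```python
from math import comb

P_FORCED_YES = 1 / 6   # dice total 2, 3 or 4
N_ITEMS = 25           # assumption: 9 survey + 8 traffic + 8 copying items


def prob_at_most_forced_yes(k, n_items=N_ITEMS, p=P_FORCED_YES):
    """P(at most k forced "yes" answers in n_items independent dice rolls)."""
    return sum(
        comb(n_items, j) * p**j * (1 - p) ** (n_items - j) for j in range(k + 1)
    )


# Upper bound on the share with zero "yes" answers under full compliance:
# reached only if every truthful answer would be "no".
print(f"{prob_at_most_forced_yes(0):.4f}")
```

Under these assumptions the bound is about 1%, below the 2.4% observed, which is consistent with some respondents not following the protocol.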

Page 29: The Online Use of  Randomized Response Measurements

Randomized response: non-compliance

Three types of respondents:

1) Honest yes
2) Honest no
3) Cheater: has yes, but says no

Their shares add up to 1, and we are interested in [expression not preserved in the transcript].

Assumption: if ordered to say no, all do so.

Then we have:

Direct questions: [formula not preserved in the transcript]

Indirect questions: [formula not preserved in the transcript]

Page 30: The Online Use of  Randomized Response Measurements

Note: the idea itself is old (and not mine)

The idea for this is in Clarke 1998.

Downloadable from

http://chrissnijders.com/tempback/Clarke1998.pdf

Page 31: The Online Use of  Randomized Response Measurements

Taking non-compliance into account

Item  Direct   RRemb  RRdown  RRoptN  RRoptY  RRadjNC
IC1       62      63      60      62      70       84
IC2       43      44      46      44      55       61
IC3       47      53      44      44      51       72
IC4       43      46      49      44      51       63
IC5       14      13      13      16       9       22
IC6       46      51      49      44      56       69
IC7       26      26      24      25      22       38
IC8       21      24      24      22      15       36

Page 32: The Online Use of  Randomized Response Measurements

Conclusions: Randomized Response online

• Not much support for Randomized Response (with the Forced Response method) for these particular topics if non-compliance is not taken into account.

For illegal software we find small positive effects. Larger effects if non-compliance is taken into account.

• Some indication that RR works better as the sensitivity of the topic increases


Page 33: The Online Use of  Randomized Response Measurements

Conclusions (2)

• Quick and dirty result: compliance with the protocol is more common among younger, male, highly educated, computer-literate respondents (who understand what RR is for).

• Allowing for optional Randomized Response does not seem to work very well; perhaps some support with the illegal copying topic

• Downloadable dice – not a good idea


Page 34: The Online Use of  Randomized Response Measurements

So there is good and bad news ...

• The Internet already brings high openness, which makes RR less necessary.

• For really sensitive behavior, RR can be conducted online and analyzed relatively easily ...

• ... but compliance with the protocol is a major issue and has to be explicitly modeled: there are different kinds of non-compliance, which makes analyses less straightforward

Page 35: The Online Use of  Randomized Response Measurements

Possible assignments

First review the literature on Randomized Response measurement (given online).

Either:

1) For the fanatics: Run a mini-survey on some topic that is sensitive but interests you (help will be provided). Try to come up with different ways to measure the topic of interest.

2) Design an experiment to further test the use of randomized response measurement. For instance, compare different methods.

3) Give a brief overview of randomized response measurement, and come up with a large set of questions that can be used as randomizers (such as "in which month was your mother born?")

Page 36: The Online Use of  Randomized Response Measurements

Results: traffic

Item  RRnCOMPL  Direct   RRemb  RRdown  RRoptN  RRoptY
T1          72      16      10      11      18       8
T2          69       3      -1       2       6       0
T3          55      12       9       2      10       1
T4          34      31      32      26      37      34
T5          42      16      14      14      19      13
T6          52       6      -1       1       7       0
T7          65      20      15       1      19      12
T8          33      38      32       6      34      37