Identifying Potential Attrition Bias using Sampling Frame
RTI International
RTI International is a trade name of Research Triangle Institute. www.rti.org
Identifying Potential Attrition Bias
using Sampling Frame Information,
Paradata, and Survey Data
Darryl V. Creel, RTI
Susan Mitchell, RTI
Kristine Fahrney, RTI
Outline
Survey objective and design
Survey performance for data collection
Concern about potential attrition bias
Data available to model attrition
Finding variables related to attrition
Adjustment approach
Impact of adjustments
Looking forward
Survey Objective and Design
Pilot study to assess relationship education, marriage
education, and other relationship skill programs
Community-based evaluation (3 treatment and 3 control
sites)
Address-based sample within site, with Census
information added
Stratified simple random sampling
In-person interviews
Two rounds of data collection: pre- and post-treatment
Supplemental sample (not discussed here)
Final Status – Round 2
Final Status               Count   Percent
Complete                   2,985      74.2
Eligible, Non-interview      960      23.9
  Non-contact                774      19.3
  Non-cooperation            186       4.6
Ineligible                    78       1.9
Total                      4,023     100.0
Response Rates by Site
Site              Respondents   Respondents   Response Rate    Response Rate
                  Round 1       Round 2       (Unweighted, %)  (Weighted, %)
Dallas, TX        708           518           73.2             68.7
Milwaukee, WI     752           602           80.1             78.9
St. Louis, MO     730           472           64.7             66.7
Cleveland, OH     568           472           83.1             80.8
Fort Worth, TX    591           435           73.6             73.6
Kansas City, MO   596           486           81.5             79.7
Total             3,945         2,985
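The unweighted rates in the table can be reproduced from the respondent counts. The sketch below assumes the unweighted rate is Round 2 respondents divided by Round 1 respondents, which matches the table values. The `weighted_rate` helper and its per-case inputs are hypothetical, since the slides do not show case-level weights, and the overall rate it computes is not reported on the slide.

```python
# Respondent counts (Round 1, Round 2) from the Response Rates by Site table.
sites = {
    "Dallas, TX": (708, 518),
    "Milwaukee, WI": (752, 602),
    "St. Louis, MO": (730, 472),
    "Cleveland, OH": (568, 472),
    "Fort Worth, TX": (591, 435),
    "Kansas City, MO": (596, 486),
}


def unweighted_rate(round1, round2):
    """Percent of Round 1 respondents who also completed Round 2."""
    return 100.0 * round2 / round1


def weighted_rate(weights, responded):
    """Weighted analogue: share of the total weight held by Round 2 respondents.

    `weights` and `responded` are hypothetical per-case inputs, not slide data.
    """
    total = sum(weights)
    return 100.0 * sum(w for w, r in zip(weights, responded) if r) / total


rates = {site: round(unweighted_rate(r1, r2), 1) for site, (r1, r2) in sites.items()}
total_r1 = sum(r1 for r1, _ in sites.values())
total_r2 = sum(r2 for _, r2 in sites.values())
overall = round(unweighted_rate(total_r1, total_r2), 1)  # not shown on the slide

for site, rate in rates.items():
    print(f"{site:16s} {rate:5.1f}")
```

The spread between the lowest and highest site rates is what the slides call differential response across sites.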
Potential Attrition Bias
Overall response
Differential response across sites
Too many variables to investigate individually or in
groups
Available Data for Modeling Attrition
Sampling frame information
Paradata
– Round 1
– Round 2
Round 1 survey responses
Sampling Frame Information
Access – defined as proximity to service provider
– 1 = Closest (0-<25 percentage points of distance function)
– 2 = Close (25-<60 percentage points of distance function)
– 3 = Not close (60-100 percentage points of distance function)
Age
– 1 = Older
– 2 = Mixed
– 3 = Younger
Race/Ethnicity (only Dallas, TX and Fort Worth, TX)
– 1 = >50% Black
– 2 = >50% Hispanic
– 3 = Other
Paradata: Round 1 and Round 2
Level of effort (number of contacts)
Ever had access denied (pending status code 319)
Ever had a broken appointment (pending status code 335)
Ever had refusal by the respondent (pending status code 360)
Ever had refusal by other (pending status code 362)
Round 1 Survey Responses
About 1,500 variables on the file
Used about 150 variables
Variables not used, e.g., timing variables, roster fields
Demographic and Socioeconomic
– Age
– Ethnicity/Race
– Gender
– Income
– Marital Status
Round 1 Survey Responses
Awareness of Media Messages and Services
Relationship Status and Attitudes
Receipt of Services
Social Ties
Relationship Quality
Child Well-Being
Respondent Characteristics and Background
Spouse/Partner Characteristics and Background
Non-Resident Parent Characteristics and Background
Household Self-Sufficiency
Household Observations
Finding Variables Related to Attrition
Looked at each site separately
Made about 150 variables available to the software
Used separate tree-based models for non-contact and
non-cooperation within each site
Terminal nodes of the tree were the weighting classes
Required a minimum of 50 observations in a node before
attempting a split and a minimum of 25 observations in
each weighting class
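The slides do not name the tree software or the splitting criterion. As a minimal sketch only, the code below grows a CART-style tree with Gini impurity (an assumption) and applies the two size restrictions described above, so that each terminal node becomes one weighting class. The variable names `n_contacts`, `ever_refused`, and `outcome`, and the synthetic cases, are hypothetical.

```python
def gini(cases):
    """Gini impurity of the binary `outcome` among the given cases."""
    if not cases:
        return 0.0
    p = sum(c["outcome"] for c in cases) / len(cases)
    return 2.0 * p * (1.0 - p)


def best_split(cases, predictors, min_leaf):
    """Return (impurity_drop, predictor, threshold) for the best allowed split, or None."""
    best = None
    parent = gini(cases)
    for var in predictors:
        # Candidate thresholds: every observed value except the largest.
        for threshold in sorted({c[var] for c in cases})[:-1]:
            left = [c for c in cases if c[var] <= threshold]
            right = [c for c in cases if c[var] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # would create a weighting class that is too small
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(cases)
            drop = parent - child
            if drop > 1e-12 and (best is None or drop > best[0]):
                best = (drop, var, threshold)
    return best


def weighting_classes(cases, predictors, min_split=50, min_leaf=25):
    """Recursively partition cases; each terminal node is one weighting class."""
    if len(cases) >= min_split:
        split = best_split(cases, predictors, min_leaf)
        if split is not None:
            _, var, threshold = split
            left = [c for c in cases if c[var] <= threshold]
            right = [c for c in cases if c[var] > threshold]
            return (weighting_classes(left, predictors, min_split, min_leaf)
                    + weighting_classes(right, predictors, min_split, min_leaf))
    return [cases]


# Synthetic cases for one site (hypothetical, deterministic for illustration).
cases = [
    {"n_contacts": 1 + i % 6, "ever_refused": int(i % 10 == 0), "outcome": int(i % 6 < 4)}
    for i in range(120)
]
classes = weighting_classes(cases, ["n_contacts", "ever_refused"])
for k, cls in enumerate(classes):
    rate = sum(c["outcome"] for c in cls) / len(cls)
    print(f"class {k}: n={len(cls)}, outcome rate={rate:.2f}")
```

In the design described on this slide, a routine like this would be run separately for non-contact and for non-cooperation within each site, with about 150 candidate predictors rather than two.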
Example Non-contact Tree [figure]
Example Non-cooperation Tree [figure]
Paradata is Important
Did you notice that the “most important” variables in the
non-contact trees and non-cooperation trees were
variables from the paradata?
Non-contact tree – number of contacts
Non-cooperation tree – ever refused by respondent
Effectiveness of Trees
Ever Respondent Refusal by Cooperate
Ever Refusal      Cooperate
by Respondent     Yes            No            Total
Yes               22% (45)       78% (157)     100% (202)
No                99% (2,940)    1% (29)       100% (2,969)
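As a quick check, the row percentages in this table follow from the cell counts, and the column totals agree with the Round 2 final status counts (2,985 completes and 186 non-cooperation cases). The sketch below only re-derives numbers already on the slides. The dictionary layout is an illustrative choice.

```python
# Cell counts from the table: rows = ever refused by respondent, columns = cooperate.
counts = {
    "Yes": {"Yes": 45, "No": 157},
    "No": {"Yes": 2940, "No": 29},
}


def row_percentages(row):
    """Return whole-number row percentages and the row total."""
    total = sum(row.values())
    return {k: round(100.0 * v / total) for k, v in row.items()}, total


table = {refused: row_percentages(row) for refused, row in counts.items()}
cooperated = sum(row["Yes"] for row in counts.values())
not_cooperated = sum(row["No"] for row in counts.values())

for refused, (pct, total) in table.items():
    print(f"ever refused={refused}: cooperate {pct['Yes']}%, not {pct['No']}%, n={total}")
```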
Adjusting the Weights
Terminal nodes from the trees were weighting classes
Ratio adjustment within weighting classes
Poststratification
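The slides list these steps without formulas. A minimal sketch, assuming the usual weighting-class ratio adjustment (respondent weights are scaled so that they sum to the class total over all eligible cases) followed by poststratification to known control totals. All field names, the example cases, and the control totals are hypothetical.

```python
from collections import defaultdict


def ratio_adjust(cases):
    """Scale respondent weights within each weighting class so they sum to the
    class total of base weights over all eligible cases."""
    total = defaultdict(float)
    resp = defaultdict(float)
    for c in cases:
        total[c["wclass"]] += c["base_weight"]
        if c["responded"]:
            resp[c["wclass"]] += c["base_weight"]
    adjusted = []
    for c in cases:
        if c["responded"]:
            factor = total[c["wclass"]] / resp[c["wclass"]]
            adjusted.append({**c, "weight": c["base_weight"] * factor})
    return adjusted


def poststratify(respondents, control_totals):
    """Scale weights within each poststratum to match the control totals."""
    sums = defaultdict(float)
    for r in respondents:
        sums[r["poststratum"]] += r["weight"]
    return [
        {**r, "weight": r["weight"] * control_totals[r["poststratum"]] / sums[r["poststratum"]]}
        for r in respondents
    ]


# Hypothetical cases: weighting classes would come from the tree's terminal nodes.
cases = [
    {"wclass": "A", "poststratum": "P1", "base_weight": 10.0, "responded": True},
    {"wclass": "A", "poststratum": "P1", "base_weight": 10.0, "responded": False},
    {"wclass": "B", "poststratum": "P1", "base_weight": 20.0, "responded": True},
    {"wclass": "B", "poststratum": "P2", "base_weight": 20.0, "responded": True},
]
adjusted = ratio_adjust(cases)
final = poststratify(adjusted, {"P1": 50.0, "P2": 30.0})
print([r["weight"] for r in adjusted])
print([r["weight"] for r in final])
```

With separate non-contact and non-cooperation trees, the ratio adjustment would presumably be applied once per stage. A class with no respondents would make the factor undefined, which is one reason to enforce a minimum class size.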
Attended Classes?
Round               Response   Point Estimate   Standard Error
1 with R1 weight    Yes        17.11            0.746
2 with R1 weight    Yes        20.80            0.925
2 with R2 weight    Yes        20.03            1.111
IF MARRIED: Since you’ve been married, have you ever
attended classes about couple relationships or marriage, or
have you ever received individual marriage or relationship
counseling?
IF NOT MARRIED: Have you ever attended classes about
couple relationships or marriage, or have you ever received
individual marriage or relationship counseling?
Attended Classes in Last 18 Months?
Round               Response   Point Estimate   Standard Error
1 with R1 weight    Yes        28.90            2.114
2 with R1 weight    Yes        43.12            2.482
2 with R2 weight    Yes        41.97            2.986
IF MARRIED: In the past eighteen months, that is, since (INSERT MONTH/YEAR) have
you attended any classes, workshops, or group sessions to help you improve your
relationship with (SPOUSE/PARTNER)? These sessions would have included other
people, not just you and (SPOUSE/PARTNER).
IF NOT MARRIED: In the past eighteen months, that is since (INSERT MONTH/YEAR)
have you attended any classes, workshops, or groups to help you improve your
relationship with a spouse or partner? These sessions would have included other
people, not just you and your partner.
RTI’s Paradata Initiative
For this survey, we used relatively few variables from the
paradata
RTI has developed a standardized system to capture
paradata
– Real time information during data collection
– Available for weighting
In the future, we will have consistent and easy access to
more paradata