![Page 1: FCSM/WSS Workshop on Quality of Blended Datawashingtonstatisticalsociety.org/presentations/20180226/... · 2018-05-13 · Source: Valliant R, Dever J, Kreuter F (2018): Practical](https://reader030.fdocuments.net/reader030/viewer/2022041016/5ec782f40d7d4674434d87cd/html5/thumbnails/1.jpg)
FCSM/WSS Workshop on Quality of Blended Data
February 26, 2018
Summary
Frauke Kreuter
Lessons learned
Combining Data Sources
When assessing quality, we need to focus on Y
We need to get comfortable with proxies in Y and X
We need to remember the initial question
?
We need to change the way we operate
Lessons not yet learned
Combined data collection
Research Question – Effects of Unemployment
Research app that
• issues questionnaires
• collects passive data
• links to panel survey and administrative data
PASS – Panel (10 years) + Administrative Data
Sample of households with at least one welfare benefit recipient (at reference date)
Refreshed annually
Surveyed annually
Random household sample of resident population
Refreshed annually
Surveyed annually
Trappmann M., Christoph B., Achatz J., Wenzig C. (2009): PASS: a new panel study for labour market research. Int. J. of Manpower, 30(7), pp. 765-770.
Coverage // Selection // External Validity
Sample
Android user
Smart phone user
Population
Pew Research estimate: 77% smartphone users in the U.S. in 2016
Source: Valliant R, Dever J, Kreuter F (2018): Practical Tools for Designing and Weighting Survey Samples. 2nd Edition. New York: Springer.
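The coverage question on this slide can be made concrete with a small calculation. The following is a minimal sketch, assuming the standard coverage-bias identity (bias of the covered mean = share not covered × difference between covered and not-covered means). Only the 77% figure comes from the slide; the function names and the two outcome means are hypothetical illustration values, not results from PASS or from this talk.

```python
def coverage_bias(coverage_rate, mean_covered, mean_not_covered):
    """Bias of the covered-group mean relative to the full-population mean.

    Identity used: bias = (1 - coverage_rate) * (mean_covered - mean_not_covered)
    """
    if not 0.0 <= coverage_rate <= 1.0:
        raise ValueError("coverage_rate must be between 0 and 1")
    return (1.0 - coverage_rate) * (mean_covered - mean_not_covered)


def population_mean(coverage_rate, mean_covered, mean_not_covered):
    """Full-population mean as a coverage-weighted mix of the two groups."""
    return coverage_rate * mean_covered + (1.0 - coverage_rate) * mean_not_covered


# 0.77 is the Pew smartphone estimate quoted on the slide.
SMARTPHONE_COVERAGE = 0.77
# HYPOTHETICAL outcome means (e.g. share with some characteristic) for illustration only.
HYPOTHETICAL_MEAN_COVERED = 0.10
HYPOTHETICAL_MEAN_NOT_COVERED = 0.20

bias = coverage_bias(
    SMARTPHONE_COVERAGE, HYPOTHETICAL_MEAN_COVERED, HYPOTHETICAL_MEAN_NOT_COVERED
)
full_mean = population_mean(
    SMARTPHONE_COVERAGE, HYPOTHETICAL_MEAN_COVERED, HYPOTHETICAL_MEAN_NOT_COVERED
)
print(f"coverage bias: {bias:+.3f}, full-population mean: {full_mean:.3f}")
```

The same function could be applied a second time for the step from smartphone users to Android users, provided an Android share is available from another source.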
Ownership by age groups (unweighted PASS estimates)
[Figure: Predicting ownership and device type. Average marginal effects with 95% CIs for the outcomes Android, iOS, and no ownership. Predictors: age in years, gender (female), immigrant, higher education, welfare benefit recipient.]
Lessons offered
Survey and Data Science
• Data Generating Process: Understand how to collect data yourself, and how data are generated through administrative and other processes.
• Data Curation/Storage: Learn how to curate and manage data.
• Data Analysis: Learn a variety of analysis methods suited for different data types.
• Data Output/Access: Learn how to communicate results and distribute and store your data.
• Research Question: Learn how to formulate your research goal and which data are best suited to achieve this goal.
Source: Usher in Japec et al 2015
survey-data-science.net
Curriculum areas: Data Generating Process, Data Curation/Storage, Data Analysis, Data Output/Access, Research Question
Minimum requirements shown: min. 6 ECTS, min. 10 ECTS, min. 6 ECTS, min. 10 ECTS, min. 6 ECTS
Courses
• Fundamentals of Survey and Data Science: 3 credits/6 ECTS
• Web Surveys: 1 credit/2 ECTS
• Record Linkage: 1 credit/2 ECTS
• Practical Tools for Sampling and Weighting: 3 credits/6 ECTS
• Applied Sampling I-II: 1 credit/2 ECTS each
• Experimental Design: 2 credits/4 ECTS
• Database Management I-III: 1 credit/2 ECTS each
• Data Munging I-III: 1 credit/2 ECTS each
• Generalized Linear Models: 2 credits/3 ECTS
• Analysis of Complex Data I-III: 1 credit/2 ECTS each
• Machine Learning I-II: 1 credit/2 ECTS each
• Ethics: 1 credit/2 ECTS
• Data Confidentiality and Statistical Disclosure Control: 2 credits/4 ECTS
• Visualization: 2 credits/4 ECTS
• User Experience: 1 credit/2 ECTS
• Questionnaire Design: 2 credits/4 ECTS
• Data Collection: 3 credits/6 ECTS
• Paper Writing / Publishing: 2 credits/4 ECTS
• Multiple Imputation: 1 credit/2 ECTS
• Python / SQL: 1 credit/2 ECTS
• Master Thesis
Program formats: Single courses, Specializations, Master degree
Faculty
U. of Maryland / Michigan: Chris Antoun, Fred Conrad, Steven Heeringa, Partha Lahiri, James Lepkowski, Richard Valliant
University of Mannheim: Thomas Gautschi, Florian Keusch, Thomas Fetzer, Heiner Stuckenschmidt
Other universities: Helmut Kuechenhoff (LMU Munich), Daniel Oberski (Utrecht University), Trent Buskirk (U. Mass, Boston), Simon Munzert (HU Berlin)
Government Agencies: Manfred Antoni (IAB), Jörg Drechsler (IAB), Joseph Sakshaug (IAB), Stefan Bender (Bundesbank), Jeffrey Gonzalez (BLS), Carolina Franco (Census)
Private partners: Mario Callegaro (Google), Jennifer Romano-Bergstrom (Facebook), Jill Dever (RTI), Emily Geisen (RTI), Raphael Nishimura (Abt), Roger Tourangeau (Westat)
Onsite (Connect@IPSDS) Online
Asynchronous
• Pre-recorded lectures (split into smaller video units)
• (Bi)weekly assignments
• Discussion forums
Synchronous
• Small virtual classrooms
• Weekly 50-minute discussions led by the instructor
• Obligatory component
Community is key
Coleridge Initiative
[Figure labels: Federal, State, County, City]
Networks: The first two classes brought together ~40 agencies at the city, county, state and federal levels
Professional Training Workshops
Three Classes
• Different cohorts (ex-offenders, welfare recipients and veterans)
• Joined with housing, transportation and jobs data
Class Format
• Module 1: Foundations – Research Questions, Python, SQL
• Module 2: Data Acquisition – Web Scraping, API, Record Linkage
• Module 3: Data Analysis – Machine Learning, Networks, Text, Spatial
• Module 4: Visualization, Inference, Ethics, Privacy
Additional Information
• Final reports are all virtual
• Teaching Assistants and facilitators will be at each site for each module
Collaborative secure environment
Data Discovery
Software Version Control
JupyterHub (Data Analysis)
Database Browser
Big Data for Federal Agencies
- Fall course: 25 students
- curriculum = book outline
Outlook
- one-stop enrollment
- engagement of PI/PR
Source: Abe Usher
www.iab.de
Frauke Kreuter ([email protected])
survey-data-science.net
coleridgeinitiative.org
Shift in mindset! Dare to experiment (now)!
Thank you!