Texas SDC/BIDC Conference for Data Users
May 22, 2013, Austin, TX
Techniques for Reallocating State Estimates of the Undocumented Immigrant Population to Small Area Geographies
• Texas is one of the fastest growing states, with migration making up 45% of this growth.
• Issue of immigration, especially unauthorized or illegal migration, critical when planning and considering:
  – Concerns about border security
  – Concerns about economic impact on receiving communities
  – Concerns about resulting shifts in the social characteristics of communities
• With the exception of California, sub-state level estimates of the undocumented population are not available.
Rationale
• Conventionally, estimates of the undocumented population are produced using the residual method (Warren 2011; Passel 2010, 2011).
  – Estimates of legal foreign born residents are subtracted from estimates of the foreign born population.
• Most commonly used national and state estimates include Pew Hispanic Center, Dept. of Homeland Security, and R. Warren estimates.
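The subtraction behind the residual method can be written out as a minimal sketch. The function name and the input figures are hypothetical and purely illustrative; they are not estimates from any of the sources cited here.

```python
def residual_unauthorized(total_foreign_born: float, legal_foreign_born: float) -> float:
    """Residual estimate: foreign-born total minus legally resident foreign born."""
    if legal_foreign_born > total_foreign_born:
        raise ValueError("legal foreign-born count exceeds the foreign-born total")
    return total_foreign_born - legal_foreign_born


# Illustrative inputs only (not actual estimates).
example = residual_unauthorized(total_foreign_born=4_000_000,
                                legal_foreign_born=2_500_000)
print(example)
```

Both inputs are themselves estimates, so error in either one carries directly into the residual.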
Background
Background
Figure: Estimates of Texas Unauthorized Immigrant Population (thousands)
Source     1990    2000    2005    2008
Passel      450   1,100   1,400   1,400
Warren      440   1,127   1,474   1,527
DHS           -   1,090   1,360   1,680
Background
Figure: Estimates of the Unauthorized Immigrant Population, 2000 to 2010 (in millions)
Year   All Other States   California   Texas
2000         5.0              2.3        1.1
2001         5.4              2.7        1.2
2002         5.3              2.8        1.3
2003         5.5              2.8        1.4
2004         6.1              2.9        1.4
2005         6.9              2.7        1.5
2006         6.9              2.9        1.5
2007         7.7              2.8        1.5
2008         7.5              2.7        1.4
2009         6.9              2.6        1.6
2010         6.9              2.6        1.7
Source: Pew Hispanic Center, 2011
Figure: Estimates of Texas Unauthorized Immigrant Population as % of U.S. Total Unauthorized Immigrant Population
Year   % of U.S. total
1990        12.6
1991        12.4
1992        12.3
1993        12.4
1994        12.6
1995        12.8
1996        13.1
1997        13.2
1998        13.3
1999        13.2
2000        13.1
2001        13.2
2002        13.4
2003        13.4
2004        13.4
2005        13.5
2006        13.6
2007        13.7
2008        13.7
Background
Source: Warren, 2010
The residual method is difficult to apply at lower geographies because the data it requires are unavailable at those levels.
Challenge
• Hill & Johnson (2011) employ a methodology that combines census population data with new administrative data that allows for estimation of the total unauthorized population and its distribution at sub-state level geographies
• 80 percent of unauthorized immigrants report filing federal income taxes and about 75 percent report having payroll taxes withheld (Porter 2005; Hill et al. 2010)
• Estimates suggest over half of unauthorized immigrants already pay income and payroll taxes through withholding, filed tax returns, or both (Orrenius and Zavodny 2012)
Literature Review
• Since immigrants without work authorization do not have valid social security numbers, many instead use Internal Revenue Service (IRS) issued Individual Taxpayer Identification Numbers (ITIN) when filing tax returns.
• Hill et al. (2011) have shown a high correlation (0.96 < r < 0.98) between the ITIN filers and unauthorized immigrant estimates in the U.S.
Literature Review
• To reallocate Texas state estimates of the unauthorized to the county level using ITIN data
• To expand upon this new estimation method by employing spatial prediction techniques to refine the distribution of unauthorized immigrants across the state
Objectives
• R. Warren’s 2008 state level estimates of the unauthorized,
• 2008 IRS Individual Taxpayer Identification Number (ITIN) administrative data,
• American Community Survey (ACS) 2008 estimates of relevant sociodemographic characteristics, and
• U.S. Bureau of Economic Analysis (BEA) local employment data for 2008
Data Sources
• Not all unauthorized immigrants file tax returns and not all ITIN filers are unauthorized (Hill et al. 2011).
• Hill & Johnson use regression analysis and incorporate economic and sociodemographic characteristics related to the unauthorized immigrant status to predict a state level ratio of ITIN filers to unauthorized immigrants.
Methodology
• Model 1 borrows the Hill & Johnson method to identify parameters useful for modeling the ratio of ITIN filers to the state unauthorized estimate.
• Model 2 is a simple OLS regression that uses the parameters identified in Model 1 to estimate the ITIN-to-unauthorized ratio at the county level.
• Model 3 is a geographically weighted regression model that incorporates a county-specific ratio estimating ITIN filers as a percentage of unauthorized immigrants by county.
• The final step applies the predicted values from each of these models and scales them to Warren’s statewide estimate.
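The final scaling step can be illustrated with a minimal sketch. It assumes simple proportional (pro rata) scaling, which the slides do not spell out; the function name, the county values, and the state total are hypothetical.

```python
import numpy as np


def scale_to_state_total(predicted, state_total):
    """Rescale county-level predicted values so that they sum to the state total.

    Assumption: proportional scaling, i.e. each county keeps its share
    of the summed predictions.
    """
    predicted = np.asarray(predicted, dtype=float)
    if np.any(predicted < 0) or predicted.sum() <= 0:
        raise ValueError("predicted values must be non-negative with a positive sum")
    return predicted * (state_total / predicted.sum())


# Three hypothetical counties and an illustrative state total.
county_predictions = [10.0, 30.0, 60.0]
scaled = scale_to_state_total(county_predictions, state_total=1_000.0)
print(scaled)
```

Under this assumption the scaling changes only the level of the county estimates, not their relative distribution, so the differences between the models come entirely from the predicted values.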
Methodology
Methodology
• Run a weighted least squares regression, weighted by foreign-born residents, using a backward elimination stepwise method
$(\mathrm{ITIN} / \mathrm{Warren\ Estimate})_s = X_s\alpha + W_s\beta + Z_s\gamma + \varepsilon_s$
• This ratio is then used as a factor to allocate the unauthorized populations at the county level.
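A weighted least squares fit with backward elimination might look like the sketch below. It is not the authors' code: the elimination criterion (AIC), the synthetic data, and all names (`wls_fit`, `backward_eliminate`, `x_signal`, `x_noise`) are assumptions made for illustration, and the slides do not say which criterion was used.

```python
import numpy as np


def wls_fit(X, y, w):
    """Weighted least squares: scale rows by sqrt(w), then solve ordinary least squares."""
    sw = np.sqrt(w)
    beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    rss = float(np.sum(w * (y - X @ beta) ** 2))  # weighted residual sum of squares
    return beta, rss


def aic(rss, n, k):
    """Gaussian AIC up to an additive constant (assumed criterion)."""
    return n * np.log(rss / n) + 2 * k


def backward_eliminate(X, y, w, names):
    """Drop one predictor at a time while doing so lowers AIC; column 0 (intercept) is kept."""
    keep = list(range(X.shape[1]))
    n = len(y)
    best = aic(wls_fit(X[:, keep], y, w)[1], n, len(keep))
    while len(keep) > 1:
        trials = []
        for j in keep[1:]:
            cols = [c for c in keep if c != j]
            trials.append((aic(wls_fit(X[:, cols], y, w)[1], n, len(cols)), j))
        cand_aic, drop = min(trials)
        if cand_aic >= best:
            break
        keep.remove(drop)
        best = cand_aic
    beta, _ = wls_fit(X[:, keep], y, w)
    return [names[c] for c in keep], beta


# Synthetic data: 51 observations, one informative and one uninformative predictor.
rng = np.random.default_rng(0)
n = 51
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = 0.2 + 0.05 * x_signal + rng.normal(scale=0.01, size=n)
weights = rng.uniform(1e4, 1e6, size=n)  # stand-in for foreign-born population counts
X = np.column_stack([np.ones(n), x_signal, x_noise])

selected, coefs = backward_eliminate(X, y, weights, ["const", "x_signal", "x_noise"])
print(selected, coefs)
```

Weighting by foreign-born residents gives observations with larger foreign-born populations more influence on the fitted coefficients.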
Parameters                     Model 1: Texas-specific statewide model
% born in Central America      -0.006
% not in labor force           -0.013
% manufacturing employment      0.025
% new return                    0.046
Constant                        0.219
R-squared                       0.52
N                               51
Results
Methodology
• OLS Model
  – Space plays no role in the modeling process, and the global coefficients are constant across the entire sample.
• GWR Model
  – Addresses spatial non-stationarity and yields a set of spatially varying parameter estimates for each geographic location.
  – Smooths out the distribution and provides estimates even in counties where ITIN=0.
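The contrast between global OLS coefficients and locally varying GWR coefficients can be sketched as follows. This is a simplified illustration, not the model used in the presentation: it assumes a fixed Gaussian kernel with a hand-picked bandwidth, and the coordinates, data, and function name are synthetic or hypothetical.

```python
import numpy as np


def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one distance-weighted least squares regression per location.

    Assumption: fixed Gaussian kernel; in practice the bandwidth is usually
    chosen by cross-validation or an information criterion.
    """
    n, k = X.shape
    betas = np.empty((n, k))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)  # nearby locations count more
        sw = np.sqrt(w)
        betas[i] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return betas


# Synthetic example: the slope on x rises from west (1.0) to east (3.0).
rng = np.random.default_rng(1)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x + rng.normal(scale=0.05, size=n)
X = np.column_stack([np.ones(n), x])

ols_beta = np.linalg.lstsq(X, y, rcond=None)[0]            # one global slope
local_betas = gwr_coefficients(coords, X, y, bandwidth=0.2)  # one slope per location

west = coords[:, 0] < 0.25
east = coords[:, 0] > 0.75
print(ols_beta[1], local_betas[west, 1].mean(), local_betas[east, 1].mean())
```

Because each local fit borrows information from neighboring locations, a prediction is available everywhere, which is consistent with the smoothing behavior noted above.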
County-Specific Models         OLS Model    GWR Model (mean)
% born in Central America       -0.122         -0.190
% not in labor force            -0.663         -0.603
% manufacturing employment       1.181          0.649
% new return                     7.282          7.440
Constant                         0.012          0.040
R-squared                        0.15           0.49
N                                254            254
AIC                              195.88         112.11
Results
Figure: Estimates of the Unauthorized Immigrant Population, 2008 (two slides with this title; images not reproduced)
• The GWR model fit better than the OLS model (R-squared 0.49 vs. 0.15; AIC 112.11 vs. 195.88).
• Higher unauthorized estimates were found in areas characterized by agriculture, urbanicity, high employment, fast Hispanic population growth, and substantial foreign born populations.
• These areas include counties in the Dallas-Fort Worth-Arlington, Houston-Baytown-Sugar Land, and Austin-Round Rock metropolitan areas, large border counties, and counties in parts of East Texas.
• When examined as a percentage of the county population, Panhandle counties and counties in the Dallas and border areas have higher percentages.
Results
• Estimate models specific to Texas
• Explore trends from available data
• Explore other spatial techniques
Future Directions
Laura Hill & Hans Johnson, Public Policy Institute of California
Robert Warren
Acknowledgements
Contact
Office: (512) 463-8390 or (210) 458-6530
E-mail: [email protected]
Website: http://osd.state.tx.us
Office of the State Demographer