Some ACS Data Issues and Statistical Significance (MOEs)
![Page 1: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/1.jpg)
Some ACS Data Issues and Statistical Significance (MOEs)
Table Release Rules
Statistical Filtering & Collapsing
Disclosure Review Board
Statistical Significance Testing & Margins of Error (MOEs)
![Page 2: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/2.jpg)
Table Release Rules
February 28, 2007
![Page 3: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/3.jpg)
“B” and “C” Tables
![Page 4: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/4.jpg)
Full Table – PASSED FILTERING
Statistically too Small
![Page 5: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/5.jpg)
Collapsed Table
![Page 6: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/6.jpg)
The Census Bureau Story
Why did we collect all this data if we were not going to
release it?
![Page 7: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/7.jpg)
ACS Data Release Rules
Doug Hillmer
Data Products Area
American Community Survey Office
U.S. Census Bureau
October 11, 2006
![Page 8: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/8.jpg)
The Census Bureau Will Not Release All Available Estimates to the Public
Limitation of Disclosure Risk
– The Census Bureau’s Disclosure Review Board (DRB) must clear all data products prior to their release to the public.
Assurance of Statistical Reliability
– Data users need to be able to use ACS estimates as official Census Bureau data. Thus, some rules must be in place to ensure minimum reliability of estimates.
– Statistical reliability is assured by:
• Population size thresholds below which estimates are not released
• Data release testing and collapsing of tables that fail
![Page 9: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/9.jpg)
The ACS “Identity Crisis” on Reliability
• Ultimately, the 5-year estimates, with no “data release rules,” act as a long-form replacement
• The single-year ACS sample is more like a current demographic survey – although much larger in size
• Question to answer for single-year estimates: Do we accept less detail in our measures of characteristics, or do we allow more detail but with data release rules in place? Less detail punishes those areas with the diversity to support the detail.
![Page 10: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/10.jpg)
Choices for displaying estimates in ACS data products
No suppression
1. Publish full detail with no suppression but a higher population threshold (e.g., 500,000)
2. Publish a limited set of estimates for all areas with 65,000+ population
3. Publish more detailed estimates for a higher population threshold and a limited set for a lower threshold
With suppression or warnings
4. Define a very detailed set of estimates for all geographic areas with 65,000+ population and suppress estimates that fail the reliability test
5. Define a very detailed set of estimates for all geographic areas with 65,000+ population and flag estimates that fail the reliability test
![Page 11: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/11.jpg)
Filtering <<Data Release Rules>>
• Goal: to identify “weak” tables
• Some tables have many zero or “near zero” cells and relatively large standard errors
• Filtering <<Data Release>> rule used during 2000-2004 ACS: drop tables if…
– Universe is less than 500 (weighted)
– Average cell size is less than 2 cases (unweighted)
• Filtering <<Data Release>> rule used now:
– Accept if median coefficient of variation is less than or equal to 61%
– Otherwise, collapse and review again
![Page 12: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/12.jpg)
![Page 13: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/13.jpg)
Why not just use cell suppression as is done for the Economic products?
Advantages
• Gets rid of the “bad” estimates
• Keeps the “good” estimates (depends on complementary suppression)
Disadvantages
• Creates “holes” in distributions
• Makes new problems for combined estimates (e.g., in derived products, such as data profiles)
• Produces a new set of problems for year-to-year comparisons
![Page 14: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/14.jpg)
Data Release Testing – Step by Step
• Compute coefficients of variation
– Coefficient of variation = standard error / estimate
– Standard error = (upper bound – estimate) / 1.65
– If the estimate = 0, set coefficient of variation = 100%
• Ignore total and sub-total lines in base table
• Sort coefficients of variation in descending order
• Find the middle value (the median)
• If the median is greater than 61%, the table FAILS (median > 61% means more than half of the cells have a lower bound of 0; i.e., these cells are not statistically different from 0)
• If the median is 61% or less, the table PASSES
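The step-by-step test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the `(estimate, upper_bound)` input layout are hypothetical, the caller is assumed to have already dropped total and sub-total lines, and `statistics.median` averages the two middle values when the number of cells is even, which the slide does not specify.

```python
from statistics import median

Z_90 = 1.65  # divisor the slide uses to turn a 90% upper bound into a standard error


def coefficient_of_variation(estimate, upper_bound):
    """Return the CV as a fraction; zero estimates get 100%, as on the slide."""
    if estimate == 0:
        return 1.0
    standard_error = (upper_bound - estimate) / Z_90
    return standard_error / estimate


def table_passes(cells, threshold=0.61):
    """cells: (estimate, upper_bound) pairs for detail lines only.

    Hypothetical helper: totals and sub-totals must be excluded by the caller.
    The table passes when the median CV is 61% or less.
    """
    cvs = [coefficient_of_variation(est, upper) for est, upper in cells]
    return median(cvs) <= threshold


# Two well-measured cells and one zero cell: the median CV is small, so the table passes.
print(table_passes([(100, 120), (200, 230), (0, 5)]))
# One well-measured cell, one noisy cell, one zero cell: the median CV is 100%, so it fails.
print(table_passes([(100, 120), (50, 150), (0, 10)]))
```

Sorting is left to `median`, so the explicit "sort in descending order" step does not appear as a separate line.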
![Page 15: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/15.jpg)
Collapsing
• Goal: release a simplified version of a base table for a geographic area that otherwise would get nothing
• Decisions on design of collapsed tables are made by subject-matter experts at the Census Bureau
• For operational reasons, only one collapsed version of each base table will be available regardless of geographic area
![Page 16: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/16.jpg)
How the Data Release Rules will Work with Collapsed Versions of Base Tables
![Page 17: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/17.jpg)
More About Collapsing
• Collapsed Tables are designed to assure that derived products (profiles, ranking tables, subject tables,…) can still be sourced from the base tables
• 2005 Tables: if a table passes filtering and a collapsed version exists, publish both the original version and the collapsed version for that geographic area
![Page 18: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/18.jpg)
Problems to fix in the current implementation of the data
release rules
• Collapsed versions missing in some cases
• Collapsed versions that aren’t working
• Poor choices in “sourcing” for derived products (e.g., profiles)
![Page 19: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/19.jpg)
Statistical Significance Testing
Why should I do it?
When should I do it?
How do I do it?
![Page 20: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/20.jpg)
Testing is Important
![Page 21: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/21.jpg)
Statements you might want to make
• Estimate X is bigger than Y
• Estimate X this year is larger than X last year
• Estimate X is smaller than Census 2000 value
• State Z has the highest value
![Page 22: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/22.jpg)
How do I do a significance test?
1. Get the Margin of Error (MOE) from ACS
2. Calculate the Standard Error (SE): SE = MOE / 1.645
3. Solve for Z, where A and B are the two estimates:
$$Z = \frac{A - B}{\sqrt{[SE(A)]^2 + [SE(B)]^2}}$$
4. If Z < -1.645 or Z > 1.645, the difference is significant at 90% confidence
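The four steps above can be sketched in Python as follows. The function names and sample numbers are illustrative assumptions, not part of the original slides; the only inputs are two estimates and their published 90% margins of error.

```python
import math

Z_CRIT_90 = 1.645  # critical value for 90% confidence, as on the slide


def z_score(a, moe_a, b, moe_b):
    """Z statistic for the difference between estimates a and b."""
    se_a = moe_a / Z_CRIT_90  # step 2: convert each MOE to a standard error
    se_b = moe_b / Z_CRIT_90
    return (a - b) / math.sqrt(se_a ** 2 + se_b ** 2)  # step 3


def is_significant(a, moe_a, b, moe_b):
    """Step 4: the difference is significant at 90% if |Z| exceeds 1.645."""
    return abs(z_score(a, moe_a, b, moe_b)) > Z_CRIT_90


# Hypothetical estimates: 1,000 +/- 100 versus 800 +/- 100, then versus 950 +/- 100.
print(is_significant(1000, 100, 800, 100))
print(is_significant(1000, 100, 950, 100))
```

Because both standard errors are divided by the same 1.645, the test is equivalent to checking whether the absolute difference exceeds the square root of the sum of the squared MOEs.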
![Page 23: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/23.jpg)
Obtaining Standard Errors is the Key
Simple Formulas
• Sum or Difference of Estimates
• Proportions and Percents
• Means and Other Ratios
$$SE(A \pm B) = \sqrt{[SE(A)]^2 + [SE(B)]^2}$$
$$SE(P) = \frac{1}{B}\sqrt{[SE(A)]^2 - P^2\,[SE(B)]^2}$$
Where $P = \frac{A}{B}$
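A minimal Python sketch of the two formulas on this slide is given below. The function names are hypothetical. The slide's operator inside the proportion formula did not survive extraction, so the sketch assumes the form in Census Bureau ACS guidance: a minus sign for a proportion (A is a subset of B), falling back to a plus sign (the ratio form) when the quantity under the square root would be negative.

```python
import math


def se_sum_or_difference(se_a, se_b):
    """SE of A + B or A - B; the standard errors combine in quadrature."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def se_proportion(a, se_a, b, se_b):
    """SE of the proportion P = A / B.

    Assumption: minus sign under the radical for proportions, with a
    fallback to the plus sign (ratio form) if the radicand is negative.
    """
    p = a / b
    radicand = se_a ** 2 - p ** 2 * se_b ** 2
    if radicand < 0:
        radicand = se_a ** 2 + p ** 2 * se_b ** 2
    return math.sqrt(radicand) / b


# Hypothetical inputs: A = 50 (SE 5) out of B = 200 (SE 10).
print(se_sum_or_difference(5, 10))
print(se_proportion(50, 5, 200, 10))
```

To express the result for a percent rather than a proportion, multiply the returned standard error by 100.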
![Page 24: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/24.jpg)
There is HELP off in the wings
![Page 25: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/25.jpg)
But what if I am using 2000 non-ACS Data?
Where are my MOEs?
![Page 26: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/26.jpg)
![Page 27: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/27.jpg)
Let's get to work on the Standard Error
$$SE(Y) = \sqrt{5Y\left(1 - \frac{Y}{N}\right)} \times \text{Survey Design Factor}$$
N = Size of publication area (population)
Y = Estimate of characteristic
![Page 28: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/28.jpg)
Survey Design Factor
www.census.gov/prod/cen2000/doc/tablec-xx.pdf
xx = fl
Mode to Work 1.4 1.2 0.9 0.7
![Page 29: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/29.jpg)
$$SE(Y) = \sqrt{5Y\left(1 - \frac{Y}{N}\right)}$$
N = Size of publication area (population = 362,563)
Y = Estimate of characteristic (126,540)
5Y = 5 × 126,540 = 632,700
Y/N = 126,540 / 362,563 = 0.3490152
1 - (Y/N) = 1 - 0.3490152 = 0.6509848
SE = 641.7772
![Page 30: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/30.jpg)
$$SE(Y) = \sqrt{5Y\left(1 - \frac{Y}{N}\right)} \times \text{Survey Design Factor}$$
SE = 641.777
126,540 / 362,563 = 35%
Survey Design Factor = 0.7
Final Adjusted SE = 641.777 × 0.7 ≈ 450
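The worked example above can be checked with a short Python sketch. The function name is hypothetical; the inputs (Y = 126,540, N = 362,563, design factor 0.7) are the values shown on the slides.

```python
import math


def census2000_se(y, n, design_factor=1.0):
    """Standard error of a Census 2000 sample estimate.

    y: estimate of the characteristic
    n: size of the publication area (population)
    design_factor: survey design factor looked up for the characteristic
    """
    unadjusted = math.sqrt(5 * y * (1 - y / n))
    return unadjusted * design_factor


unadjusted = census2000_se(126_540, 362_563)
adjusted = census2000_se(126_540, 362_563, design_factor=0.7)
print(round(unadjusted, 3))  # about 641.777, matching the slide
print(round(adjusted, 1))    # about 449.2, which the slide reports as 450
```

Once the adjusted standard error is in hand, it can be used in the same Z test as an ACS standard error derived from an MOE.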
![Page 31: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/31.jpg)
![Page 32: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/32.jpg)
![Page 33: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/33.jpg)
![Page 34: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/34.jpg)
Tempting
Green is OK
This is NOT
![Page 35: Some ACS Data Issues and Statistical Significance (MOEs)](https://reader033.fdocuments.net/reader033/viewer/2022051401/56813ac9550346895da2e23b/html5/thumbnails/35.jpg)
Want to do an exercise on your own?