UNECE/FAO/Eurostat WORKSHOP ON AGRI-ENVIRONMENTAL STATISTICS
Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality
Farm Structure Survey: Considerations on the Release of a European Microdata
L. Franconi
D. Ichim
L. Corallo
Istituto Nazionale di Statistica (ISTAT)
ITALIA
Tarragona, Spain, 26-28 October 2011
Summary
• Description of the European Farm Structure Survey (FSS)
• Disclosure Scenarios and Risk Analysis
• Disclosure Limitation Procedure
• Information Loss Assessment
• Conclusions
Objective
To explore, analyse and make recommendations on the release of a European FSS MFR
CASE STUDY
• Italian FSS 2005
• Dutch FSS 2007
Description of the European FSS
• Survey provides information on:
  a) utilised agricultural area (UAA)
  b) livestock units
  c) total standard gross margin, SGM (ESU)
  d) geographical location (NUTS)
  e) farming type
• Regional character and sparsity of the data
• The survey unit: agricultural holdings
• The target population: agricultural holdings
• Member States:
  a) census at least every 10 years
  b) BE, LU, NL, FI, SE: census in each survey round
  c) UK, NO: sample survey and census
• Response rate > 90%
SGM and farming type
• Coefficients: based on three-year average prices, available with a one-year delay
• Agricultural quantities and coefficients give the partial SGM; the partial SGMs give the total SGM; the total SGM gives the farming type
• The dominating activity: partial SGM = more than 66% of total SGM
• Classification of farms:
  - 70 farming type subdivisions
  - 50 particular types
  - 17 principal types
  - 9 general types
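The dominance rule on this slide can be illustrated with a minimal sketch. It assumes that a partial SGM is the agricultural quantity multiplied by its coefficient and that an activity dominates when its partial SGM exceeds 66% of the total. The activity names, quantities and coefficients below are invented for illustration and are not actual SGM coefficients; the function names are hypothetical too.

```python
def partial_sgm(quantities, coefficients):
    """Partial SGM per activity = agricultural quantity x SGM coefficient."""
    return {act: qty * coefficients[act] for act, qty in quantities.items()}


def dominating_activity(quantities, coefficients, threshold=0.66):
    """Return the activity whose partial SGM exceeds `threshold` of the total
    SGM, or None when no single activity dominates (a mixed holding)."""
    partial = partial_sgm(quantities, coefficients)
    total = sum(partial.values())
    if total == 0:
        return None
    best = max(partial, key=partial.get)
    return best if partial[best] / total > threshold else None


# Hypothetical holding: hectares or heads per activity, made-up coefficients.
coefficients = {"cereals": 500.0, "vineyards": 3000.0, "dairy_cows": 1500.0}
holding = {"cereals": 4.0, "vineyards": 6.0, "dairy_cows": 1.0}

# Vineyards account for 18000 of 21500 (about 84%), so they dominate.
print(dominating_activity(holding, coefficients))
```

A real classification would then map the dominating activity to one of the general, principal or particular types; that mapping is not reproduced here.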
Analysis of the temporal detail
• Relative variations of the mean number of holdings with respect to year 2000 at NUTS2 level
• Stability of the phenomenon
Analysis of the geographical detail
Percentage of large holdings at NUTS0 level in each wave of FSS.
[Figure: percentage of large holdings (vertical axis, 0 to 9) by country (BE, BG, CZ, DK, DE, EE, IE, GR, ES, FR, IT, CY, LV, LT, LU, HU, MT, NL, AT, PL, PT, RO, SI, SK, FI, SE, UK, NO) for the 2000, 2003, 2005 and 2007 waves]
i) it is difficult to analyse the phenomenon in a single MS
ii) «small» countries do not have many large holdings
Disclosure scenario and risk analysis
Scenario: Spontaneous Identification
• categorical structural variables
  - Area status (A05): 3 categories
  - SGM region code (A07), NUTS2: 21 categories for Italy
  - Holder sex (L011): 3 categories
  - Age group (L012): 7 categories

| COUNTRY | COMBINATION | NUMBER OF SAMPLE UNIQUE | NUMBER OF SAMPLE DOUBLES |
|---|---|---|---|
| Italy | 649 | 4% | 3% |
| The Netherlands | 27 | 15% | 11% |
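The counts in this risk analysis can be reproduced in spirit with a short sketch. It tabulates the observed combinations of the four key variables and reports which combinations occur once (sample uniques) or twice (sample doubles). The percentages are computed over the number of observed combinations, which is an assumption about how the slide's figures were derived; the function name and the toy records are hypothetical.

```python
from collections import Counter

KEYS = ("A05", "A07", "L011", "L012")  # key variables named in the scenario


def frequency_summary(records, keys=KEYS):
    """Count observed key combinations and the share of them that are
    sample uniques (seen once) or sample doubles (seen twice)."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    n_comb = len(counts)
    if n_comb == 0:
        return {"combinations": 0, "pct_unique": 0.0, "pct_double": 0.0}
    uniques = sum(1 for c in counts.values() if c == 1)
    doubles = sum(1 for c in counts.values() if c == 2)
    return {
        "combinations": n_comb,
        "pct_unique": 100.0 * uniques / n_comb,
        "pct_double": 100.0 * doubles / n_comb,
    }


# Toy data (invented): 7 holdings falling into 4 key combinations.
a = {"A05": 1, "A07": "R1", "L011": 1, "L012": 3}
b = {"A05": 2, "A07": "R1", "L011": 2, "L012": 5}
c = {"A05": 3, "A07": "R2", "L011": 1, "L012": 7}
d = {"A05": 1, "A07": "R2", "L011": 3, "L012": 2}
toy_records = [a, a, a, b, b, c, d]

print(frequency_summary(toy_records))
```

Dropping or coarsening a key variable (for example passing a shorter `keys` tuple) shows directly how suppression and aggregation reduce the number of rare combinations.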
VISIBLE RE-IDENTIFICATION
External information
SGM and Farming Type
• SUPPRESSION of some identification variables
• AGGREGATION of some categorical variables
• PERTURBATION of some numerical variables
TWO STRATEGIES
• release SGM as it is in the original data
• recalculate SGM based on the recoded and perturbed variables (agricultural quantities)
Suppression and aggregation
| Variable | Description | Unique cases |
|---|---|---|
| A04A (NUTS3) | District | 61% |
| A04D | Municipality | 12% |
| A07 (NUTS2) | Region | 4% |
| Suppression A05 | Area status | 0% |

The suppression of A05 might be considered a significant loss of data utility by some MS.
• variables with a high percentage of zero values (missing phenomenon): ADD UP
• regional character and sparsity of the data
Perturbation
• continuous variables with skew distributions: risk of re-identification
INDIVIDUAL RANKING
- microaggregation, k = 3
- preserves the weighted means
- SGM region as blocking variable
- retains the characteristics of the data
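A minimal sketch of individual ranking microaggregation under the settings listed on this slide (k = 3, region as blocking variable, weighted means preserved). Each variable is sorted on its own within a block, consecutive records are grouped, and values are replaced by the weighted group mean. The handling of a short tail group (merged into the previous group) and of blocks with fewer than k records (treated as one group) are assumptions, as are the field names `region`, `weight` and `uaa`.

```python
from collections import defaultdict


def microaggregate_ir(records, variables, weight="weight", block="region", k=3):
    """Individual ranking: within each block, rank each variable separately,
    form groups of at least k consecutive records (when the block allows it),
    and replace values by the weighted group mean."""
    out = [dict(r) for r in records]  # work on copies, leave the input intact
    blocks = defaultdict(list)
    for i, r in enumerate(records):
        blocks[r[block]].append(i)
    for idx in blocks.values():
        for var in variables:
            ranked = sorted(idx, key=lambda i: records[i][var])
            starts = list(range(0, len(ranked), k))
            # Merge a tail shorter than k into the previous group.
            if len(starts) > 1 and len(ranked) - starts[-1] < k:
                starts.pop()
            for g, s in enumerate(starts):
                e = starts[g + 1] if g + 1 < len(starts) else len(ranked)
                group = ranked[s:e]
                wsum = sum(records[i][weight] for i in group)
                mean = sum(records[i][weight] * records[i][var] for i in group) / wsum
                for i in group:
                    out[i][var] = mean
    return out


# Toy block (invented): seven holdings in one region, unit weights.
farms = [{"region": "R1", "weight": 1.0, "uaa": v}
         for v in (1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 100.0)]
protected = microaggregate_ir(farms, ["uaa"])
print(sorted(r["uaa"] for r in protected))
```

Because every value in a group is replaced by that group's weighted mean, the weighted total of each group, and hence the weighted mean of each block, is unchanged.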
Information loss assessment
• percentage variation of the means/variances
• the skew distributions as a consequence of the sparsity
• Member States decide: IR at NUTS3 or NUTS2 level
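The first measure, the percentage variation of means and variances, can be sketched as below. The use of weighted moments and the formula 100 × (protected − original) / original are assumptions, since the slide does not spell them out; the function names and the toy values are hypothetical.

```python
def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_variance(values, weights):
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)


def pct_variation(original, protected):
    """Percentage variation of a statistic after protection."""
    return 100.0 * (protected - original) / original


def information_loss(original, protected, weights):
    """Percentage variation of the weighted mean and weighted variance."""
    return {
        "mean": pct_variation(weighted_mean(original, weights),
                              weighted_mean(protected, weights)),
        "variance": pct_variation(weighted_variance(original, weights),
                                  weighted_variance(protected, weights)),
    }


# Toy values (invented): original data and a microaggregated version of it.
orig = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 100.0]
prot = [2.0, 2.0, 2.0, 40.0, 40.0, 40.0, 40.0]
w = [1.0] * len(orig)
print(information_loss(orig, prot, w))
```

In this toy case the mean is unchanged while the variance drops, which is the typical pattern for microaggregation; running the same comparison per NUTS3 or NUTS2 block would show how the choice of level affects the loss.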
Conclusion
• EACH FARM IS VISIBLE
• REGIONAL CHARACTER AND SPARSITY OF THE DATA
GEOGRAPHICAL DETAIL = NUTS2 LEVEL
• RELEASE SGM ORIGINAL AND FARMING TYPE
• FARMING TYPE AT PARTICULAR LEVEL
• VARIATION OF INDIVIDUAL RANKING