1
Confidentiality and statistics on grids
Vilni Verner Holst Bloch
MSc. landscape ecology and natural resources
Statistics Norway
Otervegen 23
N-2225 Kongsvinger
Tel: ++47 / 62 88 50 62
Fax: ++47 / 62 88 50 97
E-mail: [email protected]
A proposal on common rules for handling confidentiality, to the Board of Confidentiality at Statistics Norway
The European Forum for Geostatistics workshop in The Hague, Netherlands, 5th – 7th of October 2009
“bridge the gap” between theory and practice in GeoStatistics
Session 1: Small area statistics
Bjørn Thorsdalen
Population Statistics
Statistics Norway
Otervegen 23
N-2225 Kongsvinger
Tel: ++47 / 62 88 50 62
Fax: ++47 / 62 88 50 97
E-mail: [email protected]
2
Overview of the presentation
• Background
• System of grids for national statistics
• Examples of confidentiality issues
• Different confidentiality rules
• Examples of the use of today's confidentiality rules
• Guidelines for grid statistics
• Further work
3
Background
• A) Requests from users (insurance companies, scientists, companies with localisation or marketing issues, the general public, education purposes)
• B) Internal drive within Statistics Norway (coming GIS and censuses)
• C) Partnership in National INSPIRE Forum (obligations)
• D) New possibilities (better presentation of spatial statistics, spatial analysis etc.)
The more users need, and the more we produce, the more crucial common rules for confidentiality become
4
[Diagram. Labels: Norwegian Mapping and Cadastre Authority; Geodatabase; FIREWALL; Statistics Norway; Statistical base registers; Statistics; ArcGIS coverages, shape files etc.; WMS ssb.no; Statbank; External WMS/WFS providers; Local copy; "Wall of confidentiality"]
5
Statistical grids for Norway
Grid name        Cell size       Number of cells
SSB100m (1)      0.01 km2        35 000 000 cells
SSB125m (1)      0.015625 km2    20 000 000 cells
SSB250m (1)      0.0625 km2      5 600 000 cells
SSB500m (1)      0.25 km2        1 400 000 cells
SSB1km           1 km2           350 000 cells
SSB5km           25 km2          15 000 cells
SSB10km          100 km2         5 000 cells
SSB25km (2)      625 km2         500 cells
SSB50km (2)      2 500 km2       150 cells
SSB100km (2)     10 000 km2      40 cells
SSB250km (2)     62 500 km2      10 cells
SSB500km (2)     250 000 km2     4 cells
(1) Because of limitations in many software packages and for practical use, these grids are recommended as grids with a county coverage.
(2) These grids might also cover sea territories. Number of cells refers to coverage of Norwegian mainland. One has however to be aware of deviations in grid cell areas for regions remote from the Norwegian mainland and Svalbard.
http://www.ssb.no/english/subjects/01/90/doc_200909_en/doc_200909_en.pdf
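The cell sizes in the grid table follow directly from the side length of each square cell. As a minimal sketch (the function names and the example coordinates are hypothetical, and the snapping rule assumes cells aligned to multiples of the side length in a projected metric coordinate system, which the slides do not state), the area of a cell and the cell that a point falls in could be computed like this:

```python
import math

def cell_area_km2(side_m: float) -> float:
    """Area in km2 of a square grid cell with the given side length in metres."""
    return (side_m / 1000.0) ** 2

def cell_origin(x_m: float, y_m: float, side_m: int) -> tuple:
    """Lower-left corner of the cell containing the point (x_m, y_m).

    Assumption: cells are aligned to multiples of side_m from the
    coordinate origin. This is an illustration, not the SSB definition.
    """
    return (math.floor(x_m / side_m) * side_m,
            math.floor(y_m / side_m) * side_m)

if __name__ == "__main__":
    for side in (100, 125, 250, 500, 1000):
        print(side, "m ->", cell_area_km2(side), "km2")
    # Hypothetical point, snapped to its 1 km cell
    print(cell_origin(263450.0, 6651230.0, 1000))
```

A side of 125 m gives 0.015625 km2 and a side of 250 m gives 0.0625 km2.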
6
Number of farms. 1x1km. 1999
1 – 3 farms; 4 or more farms
Confidentiality examples
7
Building stock. 100x100m. 2007.
1 – 3 buildings; 4 or more buildings
Confidentiality examples
8
Night time population. 1x1km. Year 2000 over 2008.
1 – 9 persons 2000; 10 or more 2000; 1 – 9 persons 2008 (new settlements); 10 or more 2008 (new settlements)
Confidentiality examples
9
Confidentiality examples
Number of enterprises. 1x1km. 2008
1 – 3 enterprises; 4 or more enterprises
10
Leisure homes. 1x1km. 2008.
1 – 3 leisure homes; 4 or more leisure homes
Confidentiality examples
11
Night time population. 1x1km. Year 2008 over 2000.
1 – 9 persons 2000 (abandoned cell); 10 or more persons 2000 (abandoned cell)
Confidentiality examples
12
Confidentiality examples
Number of grid cells and inhabitants by grid cell sizes. Per cent.
[Figure. Axis: 0 to 100 per cent. Grid cell sizes: 1000 m, 250 m, 100 m. Panels: share of grid cells with fewer than N persons; share of persons in grid cells with fewer than N persons. Series: N = 4, 11, 31, 51 and 101.]
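The two quantities shown in this figure can be sketched in a few lines: for a threshold N, the share of populated grid cells holding fewer than N persons, and the share of all persons living in those cells. The function name and the sample counts are hypothetical and are not taken from the Statistics Norway data.

```python
def shares_below(counts, n):
    """Return (share of cells, share of persons) in per cent for
    populated cells with fewer than n persons.

    counts: iterable of persons per grid cell; empty cells are ignored.
    """
    populated = [c for c in counts if c > 0]
    if not populated:
        return (0.0, 0.0)
    small = [c for c in populated if c < n]
    share_cells = 100.0 * len(small) / len(populated)
    share_persons = 100.0 * sum(small) / sum(populated)
    return (share_cells, share_persons)

if __name__ == "__main__":
    # Hypothetical persons-per-cell counts for six populated cells
    sample = [2, 3, 8, 12, 40, 135]
    for n in (4, 11, 31, 51, 101):
        print(n, shares_below(sample, n))
```

Running the same calculation per grid size (1000 m, 250 m, 100 m) would show how much of the population a given threshold affects at each resolution.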
13
Confidentiality examples
Frequency of agricultural enterprises by 1x1 km grid cells. 2008.
[Figure. Horizontal axis: grid cells by number of agricultural enterprises (1 to 17). Vertical axis: number of agricultural enterprises (0 to 18 000).]
14
• Previous and existing rules for handling confidentiality on grids are not adequate.
• Confidentiality rules should be handled at lowest reasonable geographical level.
• Official statistics should not be given at all geographical levels/grid sizes.
• One should have a set of limits/threshold values dependent on the sensitivity of the topics for statistics or the quality of the sources for statistics.
Recommendation given to the Board of Confidentiality at Statistics Norway
15
• The following has been recommended to the Board
Recommendation given to the Board of Confidentiality at Statistics Norway
1. Total figures (persons, enterprises, buildings, dwellings) and non-sensitive variables (age, sex, building type, NACE code) do not need to be anonymised.
2. Statistics on sensitive variables can be given if total figures exceed threshold values. The threshold value is to be set by the responsible department for each statistic, depending on quality, sensitivity and level of detail. Threshold values are fixed to total figures of 10, 30 or 50. No further anonymisation is done.
3. Grid sizes of 125mx125m and 500mx500m shall not be used for official statistics.
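The three recommendations above can be read as a simple decision rule per grid cell. The following is a minimal sketch of that reading, not Statistics Norway's implementation: the function and parameter names are hypothetical, and "exceed" is interpreted as strictly greater than the threshold.

```python
ALLOWED_THRESHOLDS = (10, 30, 50)     # fixed threshold values (rule 2)
DISALLOWED_GRID_SIDES_M = (125, 500)  # not for official statistics (rule 3)

def publishable(total: int, threshold: int, sensitive: bool, side_m: int) -> bool:
    """Decide whether a statistic for one grid cell may be published.

    total:     total figure in the cell (persons, enterprises, ...)
    threshold: value chosen by the responsible department (10, 30 or 50)
    sensitive: True if the variable is sensitive
    side_m:    side length of the grid cell in metres
    """
    if side_m in DISALLOWED_GRID_SIDES_M:
        return False          # rule 3: grid size not used at all
    if not sensitive:
        return True           # rule 1: totals and non-sensitive variables
    if threshold not in ALLOWED_THRESHOLDS:
        raise ValueError("threshold must be 10, 30 or 50")
    return total > threshold  # rule 2: assumed strict 'exceeds'

if __name__ == "__main__":
    print(publishable(total=3, threshold=10, sensitive=False, side_m=1000))
    print(publishable(total=12, threshold=10, sensitive=True, side_m=250))
    print(publishable(total=10, threshold=10, sensitive=True, side_m=250))
```

Under this reading, a cell with exactly 10 units and a threshold of 10 is withheld for sensitive variables, while total figures are released regardless of size.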
16
• Work within Geostat to make guidelines for handling confidentiality issues
• Adoption of European rules for handling confidentiality in grid statistics?
Further work
Thank you for your attention