7/27/2019 17_construction of Gridded Population and Poverty Data Sets From Different Data Sources _Alex de Sherbinin
Alex de Sherbinin, Deputy Manager, NASA Socioeconomic Data and Applications Center
Center for International Earth Science Information Network, The Earth Institute, Columbia University
Palisades, New York, USA
Acknowledgements: This presentation borrows heavily from material prepared by Deborah Balk, formerly of CIESIN and currently at Baruch College, and Gregory Yetman of CIESIN.
Short history of gridding population data
Why grid?
Gridded Population of the World (GPW) Methodology
Global Rural Urban Mapping Project (GRUMP)
US Census Grids
Poverty Mapping
Gridded Infant Mortality Rate
Gridded Child Malnutrition
More attention to global scope
More attention to comparability
More attention to problem-oriented science
More attention to spatial frameworks
US Census Bureau's Global Population Database (early 1990s)
Africa Population Grid (UNEP/GRID, 1991)
GPW v1 and Global Demography Project (NCGIA & CIESIN, 1994)
1 degree global grid (Environment Canada, 1995)
Europe (RIVM, 1995)
Africa update and Asia (NCGIA, UNEP/GRID & WRI, 1996)
Latin America (CIAT)
HYDE (RIVM / Klein Goldewijk 1997, 2001 and 2006)
LandScan (ORNL, 1999 and onwards)
GPW v2 (CIESIN et al., 2000)
GRUMP alpha (CIESIN et al., 2004)
GPW v3 (CIESIN et al., 2005)
In the past decade there have been far more efforts than can be listed here
http://sedac.ciesin.columbia.edu/gpw/
Find tabular information with attributes
  E.g., population counts
Match to geographic boundaries
  Administrative
  Urban footprints
Estimate population to the target years (1990, 1995 and 2000)
Transform to grids
Statistics South Africa
Descriptive - South Africa by Province and Municipality
Table 1: Geography by Gender, for Person weighted
Geography levels: Province (PR_SA), District municipality (DC_PR_SA), Municipality (MN_PR_SA), Main place (MP_SA), Sub-place (SP_SA)

Code      Geography                            Male    Female  Total
3         Northern Cape                        401094  421636  822729
6         DC6: NAMAKWA District Municipality   53424   54687   108110
301       NC061: Richtersveld                  5170    4961    10130
30101     Alexander Bay                        723     729     1452
30101001  Alexander Bay Navel Base             30      12      42
30101000  Alexander Bay SP                     738     675     1413

NB: Spatially matched population census (and survey) data
generally has several data providers!
Population and boundary data must match
  Best available & matchable data are used
Matching the inputs to one another is not as easy as it might seem
  Boundaries change often and come in different scales
  Population data may not match boundaries
  We may have population values for different years at different levels (e.g., district-level one year, state-level another)
Clean boundaries
  E.g., remove slivers
Make them consistent across borders and coasts
  Use the international standard, the DCW (Digital Chart of the World), with exceptions
  Europe: most spatial data supplied by one agency (SABE), and all international boundaries are internally consistent
  For GPW v4 we plan to use ISciences Global Coastline v.1
Coastlines matched to DCW, except where much higher quality data are supplied
  E.g., Indonesia
Data table needs to include the same variables, with the same variable names, formats, etc.
Places highlighted in yellow are new municipios
  Need to find where they came from & their pop size
  Use on-line atlases or newer maps, when available
  Add new pop to unit of origin or allocate old population to new unit proportionally.
Annual rate of change calculated:
\[ r = \frac{1}{t} \log_e\left(\frac{P_2}{P_1}\right) \]
Population estimates adjusted to target years:
\[ P_x = P_2 e^{rt} \]
Definitions
r - Annual rate of growth
P_1, P_2 - Census estimates
t - Number of years between census enumerations
P_x - Population estimate for the target year
P_{un} - UN estimate
P_{adj} - Adjusted estimate
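
The two formulas above can be sketched in a few lines of Python. This is only an illustration, not the GPW production code: the function names and the census figures are hypothetical, and the sketch assumes that when projecting, t is counted from the second census to the target year (negative when the target year is earlier).

```python
import math

def annual_growth_rate(p1, p2, years_between):
    """r = ln(P2 / P1) / t, the average annual exponential growth rate."""
    return math.log(p2 / p1) / years_between

def project_population(p2, r, years_from_p2):
    """P_x = P2 * exp(r * t); t is assumed to run from the P2 census to the target year."""
    return p2 * math.exp(r * years_from_p2)

# Hypothetical unit: 100,000 people in a 1991 census, 120,000 in a 2001 census.
r = annual_growth_rate(100_000, 120_000, 10)
p_2000 = project_population(120_000, r, 2000 - 2001)  # one year before the 2001 census
p_1995 = project_population(120_000, r, 1995 - 2001)  # six years before the 2001 census
```

Projecting back the full ten years recovers the first census value, which is a quick consistency check on the rate.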
Definitions
a - Adjustment factor
P_x - Population estimate for the target year (1990 or 1995)
P_{un} - UN estimate
P_{adj} - Adjusted estimate
Adjustment factor for matching national estimates to UN estimates calculated:
\[ a = \frac{P_{un} - P_x}{P_{un}} \]
Adjustment factor applied at the national level:
\[ P_{adj} = P_x \cdot a \]
Differences ranged from 20% under (Somalia, 1995) to 25% over (Jordan, 1990)
Proportional allocation used to spread the population over grid cells
Virtually all data work completed on vector data
  Gridding is the last step
National grids created, global grids assembled by adding national grids together
  Country grids are created with collars so that they start and end on even degrees; therefore the assembly of the grids without interpolation is possible
  Replacement of country-specific grids feasible
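
Assembling the global grid by addition can be pictured with a toy example. The sketch below is an assumption-laden illustration, not the GPW tooling: it uses plain Python lists, a hypothetical `add_national_grid` helper, and made-up cell values, and it assumes every national grid shares the global cell size and is aligned to the same origin, which is what the collars are meant to guarantee.

```python
def add_national_grid(global_grid, national_grid, row_offset, col_offset):
    """Add a national grid into the global grid at a whole-cell offset (no resampling)."""
    for i, row in enumerate(national_grid):
        for j, value in enumerate(row):
            global_grid[row_offset + i][col_offset + j] += value

# Toy 4 x 6 global grid; the two countries share the border cells in column 2.
global_grid = [[0.0] * 6 for _ in range(4)]
country_a = [[1.0, 2.0, 0.0],
             [3.0, 4.0, 0.5]]
country_b = [[0.3, 0.5, 5.0],
             [0.7, 1.5, 6.0]]

add_national_grid(global_grid, country_a, row_offset=1, col_offset=0)
add_national_grid(global_grid, country_b, row_offset=1, col_offset=2)
```

Because cells along a shared border receive the partial populations from both countries, the global total equals the sum of the national totals, and a single country can be replaced by subtracting its old grid and adding the new one.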
Land Area: 458.4 square km
Area 2.6 sq. km: Pop = 628.5 persons per sq. km * 2.6 = 1,634.1 persons
Area 16.1 sq. km: Pop = 628.5 persons per sq. km * 16.1 = 10,118 persons
Area 0.05 sq. km: Pop = 628.5 persons per sq. km * 0.05 = 31.4 persons
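
The worked example above is proportional (areal) allocation: each piece of the administrative unit that falls in a grid cell receives the unit's density times the overlapping area. A minimal sketch, using a hypothetical `allocate_population` helper and the density and areas from the example:

```python
def allocate_population(density_per_km2, overlap_areas_km2):
    """Give each unit/grid-cell overlap its share: density times overlapping area."""
    return [density_per_km2 * area for area in overlap_areas_km2]

# Density and overlap areas taken from the example (persons per sq. km, sq. km).
cell_pops = allocate_population(628.5, [2.6, 16.1, 0.05])
# Roughly 1,634.1, 10,118.9 and 31.4 persons.
```

A cell crossed by several administrative units would sum the contributions from each unit.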
Population 2000 (persons)
High: 10,123.5
Low: 7
http://sedac.ciesin.columbia.edu/gpw/
Objective: To delineate urban and rural extents and populations
Collaboration between CIESIN, IFPRI, World Bank, & CIAT
Builds on GPW infrastructure
Adds urban areas from nighttime lights satellite data
Three databases:
  Settlement Points (>70,000 w/ pop of 1k+)
  Urban Extents (>23,500 w/ pop of 5k+)
  Pop Grid at 1 km resolution
Stand-alone model, GRUMPe, written in C
Combines the following pieces of information:
  Population and boundaries of each urban area based on nighttime lights (NTLs)
  Boundaries sometimes based on buffered points where there is no NTL signature
  Population and boundaries of each admin area
  Size of the intersect areas where urban and admin areas overlap
  UN national estimates for percentage of population in urban and rural areas
The algorithm reallocates the total pop in each admin unit into rural and urban areas based on UN estimates, with six constraints:
1. Total admin pop remains constant
2. Urban pop density in any admin unit must be > rural density
3. Rural pop density cannot be lower than the national minimum rural pop density threshold for country/region
4. Rural pop density cannot be higher than the national maximum rural pop density threshold for country/region
5. Urban pop density cannot be lower than the national minimum urban pop density threshold for country/region
6. Urban pop density cannot be higher than the national maximum urban pop density threshold for country/region
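
The six constraints can be read as a checklist applied to any candidate urban/rural split of one admin unit. The sketch below is not GRUMPe (which is written in C and iterates over overlapping units); it is a hypothetical Python checker with made-up thresholds that assumes both areas are greater than zero and simply reports which constraint numbers a split violates.

```python
def violated_constraints(admin_pop, urban_pop, rural_pop,
                         urban_area, rural_area,
                         rural_min, rural_max, urban_min, urban_max,
                         tol=1e-6):
    """Return the numbers (1-6) of the constraints a candidate split violates."""
    urban_density = urban_pop / urban_area
    rural_density = rural_pop / rural_area
    checks = [
        (1, abs(urban_pop + rural_pop - admin_pop) <= tol),  # total preserved
        (2, urban_density > rural_density),                  # urban denser than rural
        (3, rural_density >= rural_min),
        (4, rural_density <= rural_max),
        (5, urban_density >= urban_min),
        (6, urban_density <= urban_max),
    ]
    return [number for number, ok in checks if not ok]

# Hypothetical admin unit of 50,000 people: 20 sq. km urban, 480 sq. km rural.
ok_split = violated_constraints(50_000, 30_000, 20_000, 20.0, 480.0,
                                rural_min=1.0, rural_max=300.0,
                                urban_min=500.0, urban_max=20_000.0)
bad_split = violated_constraints(50_000, 10_000, 40_000, 20.0, 480.0,
                                 rural_min=1.0, rural_max=50.0,
                                 urban_min=500.0, urban_max=20_000.0)
```

In the second call the rural density (about 83 persons per sq. km) exceeds the assumed maximum of 50, so only constraint 4 is reported; an iterative reallocation would then move population until the list is empty.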
The algorithm is trivial where only one urban area is contained within an admin unit
It is more complex when:
  There are multiple urban areas overlapping an admin unit
  Urban areas overlap more than one admin area
  Large urban areas contain more than one admin area
These are common situations and require successive iterations to meet all constraints
Close-up of Brazil using the 100K person cutoff
Note the variety of shape
Much more than points convey
http://sedac.ciesin.columbia.edu/usgrid/
Uses proportional allocation algorithm
Higher resolution:
  Resolution is 1 km (30 arc-sec) for the country as a whole
  Metropolitan areas are available at 250 m (7.5 arc-sec)
More census variables:
  Individual data: age distribution, race, ethnicity, income, poverty, educational level, and immigrant status
  Household data: household size, one-person households, female-headed households with children under 18, and linguistically isolated households
  Housing unit data: occupied housing units without a vehicle, and year of construction
http://sedac.ciesin.columbia.edu/povmap/
Infant mortality rates (IMRs):
  Serve as a useful proxy for overall poverty levels because they are highly correlated with metrics such as income, education levels, and health status of the population
  This metric is particularly good for distinguishing poverty levels at the lower end of the income ladder
Sources
  Demographic and Health Surveys (39 countries)
  Multiple Indicator Cluster Surveys (5 countries)
  National Human Development Reports (14 countries)
  National Statistical Offices (18 countries)
6,494 spatial units in global database
  Brazil and Mexico: 5,372 units
  74 other countries with subnational data: 22 units per country on average
  115 countries: national-level data only (UNICEF)
  36 countries: no data
Calibration
  Subnational IMR values adjusted to be consistent with national UNICEF 2000 IMR values
Gridding using proportional allocation algorithm
We also converted rates to counts
  For each subnational unit, estimates of live births and infant deaths calculated based on gridded population, national fertility data, and subnational IMR.
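
The conversion from rates to counts can be sketched as follows. The slide does not say which fertility measure was used, so this illustration assumes a crude birth rate (live births per 1,000 population); IMR is deaths under age one per 1,000 live births. The function name and all numbers are hypothetical.

```python
def births_and_infant_deaths(population, births_per_1000, imr_per_1000):
    """Estimate live births and infant deaths for one subnational unit.

    Assumes fertility is given as a crude birth rate (births per 1,000 people)
    and IMR as infant deaths per 1,000 live births.
    """
    live_births = population * births_per_1000 / 1000.0
    infant_deaths = live_births * imr_per_1000 / 1000.0
    return live_births, infant_deaths

# Hypothetical unit: 200,000 people, 30 births per 1,000, IMR of 80.
births, deaths = births_and_infant_deaths(200_000, 30.0, 80.0)
```

With these inputs the unit has 6,000 estimated live births and 480 estimated infant deaths; the counts can then be gridded like any other population variable.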
Use anthropometric data found in household surveys
  DHS and MICS data were aggregated to the spatial units at which the surveys report, based on raw data where it was available, and published reports otherwise.
  These spatial units are typically equivalent to first-level administrative regions or aggregations thereof.
Geospatial boundary files that match those spatial units were located or created in order to match the reporting regions of the surveys as closely as possible.
  In many cases, the survey reports contained maps detailing the survey regions. Elsewhere, matches were purely name-based.
Map percent of children underweight
  Underweight defined as being two standard deviations or more below the mean weight for a given age, as compared to an international reference population.
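
The underweight definition above amounts to a z-score cutoff. The sketch below is a simplified illustration, not the procedure used by DHS or MICS: the reference mean and standard deviation are hypothetical placeholders for the international reference values at a given age, and the sample weights are made up.

```python
def weight_for_age_z(weight_kg, ref_mean_kg, ref_sd_kg):
    """Standard deviations between a child's weight and the reference mean for that age."""
    return (weight_kg - ref_mean_kg) / ref_sd_kg

def percent_underweight(z_scores, cutoff=-2.0):
    """Share of children at or below the cutoff (two SDs or more below the mean)."""
    if not z_scores:
        return 0.0
    flagged = sum(1 for z in z_scores if z <= cutoff)
    return 100.0 * flagged / len(z_scores)

# Hypothetical survey region: four children of the same age,
# compared with an assumed reference mean of 12.0 kg and SD of 1.5 kg.
weights = [12.5, 9.0, 8.5, 11.0]
z_scores = [weight_for_age_z(w, 12.0, 1.5) for w in weights]
region_percent = percent_underweight(z_scores)
```

Two of the four children fall at or below the cutoff, so this region would be mapped at 50 percent underweight. A real analysis would also apply the survey's sampling weights.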
Continued emphasis on higher resolution inputs
Effort to collect and grid more census variables
  Age and sex distribution
  Urban/rural distribution
Proposed output resolution: 1 km grids
May create time series back to 1980
Looking for data sharing partners