World Grid Square Statistics and their Application to Data Analytics
Aki-Hiro Sato (Kyoto University, JST PRESTO)
Shoki Nishimura (National Statistics Center of Japan)
Naoki Makita (Office of Director-General for Policy Planning on Statistical Standards)
Tsuyoshi Namiki (Statistics Bureau)
Hiroe Tsubaki (National Statistics Center of Japan)
Workshop on Integrating Geospatial and Statistical Standards, Stockholm, Sweden, 6-8 November 2017
Outline
• Background and Motivation
• Japanese Grid Square Codes (JIS X0410)
• World Grid Square Codes with high compatibility to JIS X0410
• Web Application of World Grid Square Statistics
• Linked Open Data
• Summary
2
Three pillars
How is the world now? (Nowcast)
How will the world change? (Forecast)
How should we construct our world? (Design)
3
Data are not answers
• Data are not answers, but they inspire questions for researchers and practitioners.
• Data tell us how a phenomenon behaves and, at the same time, ask us why it behaves as we observe.
• Data are just a collection of observations or facts.
4
What the goal is
Data collection on socioeconomic activities from a global perspective
Spread of a global evaluation platform to support people
This project develops a business intelligence platform to evaluate and visualize the sustainability of tourism from a global perspective.
5
Information silos
[Figure: Agency A, Agency B, and Agency C each maintain their own stack of layers P1 to P5]
P3: Common geographies for dissemination of statistics
P4: Interoperable data and metadata standards
P5: Accessible and usable geospatially enabled statistics
Figure labels: Portal Site (e-Stat), LOD, Satellite Accounts
Coding Systems (Prefectural Identification code, Identification code for cities, Grid Square codes)
JIS X0401: Prefecture Identification Code (Level 1 administrative areas = NUTS2)
JIS X0402: Identification code for cities, towns and villages (Level 2 administrative areas = NUTS3)
JIS X0410: Grid Square codes (defined by latitude and longitude)
Statistical geographic standards are widely used in Japan. They are adopted as Japanese Industrial Standards (JIS).
* Geographical Coding Systems (P3) can also be used as geocoded units in data management systems (P2).
6
Grid square codes (JIS X0410)
• In 1973, "Standard Grid Square and Grid Square Code Used for the Statistics" was issued as Announcement No. 143 of the Administrative Management Agency (AMA). It hierarchically defines grid squares covering the entire land of Japan.
• In 1976, the Japanese Industrial Standards Committee adopted “Grid Square Code” as JIS X0410.
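The digits are obtained from latitude and longitude (in degrees) by successive division with remainder:
\[
\begin{aligned}
\text{latitude} \times 60 \div 40 &= p \ \text{remainder}\ a && (p \text{ is two digits}) \\
a \div 5 &= q \ \text{remainder}\ b && (q \text{ is one digit}) \\
b \times 60 \div 30 &= r \ \text{remainder}\ c && (r \text{ is one digit}) \\
\text{longitude} - 100 &= u \ \text{remainder}\ f && (u \text{ is two digits}) \\
f \times 60 \div 7.5 &= v \ \text{remainder}\ g && (v \text{ is one digit}) \\
g \times 60 \div 45 &= w \ \text{remainder}\ h && (w \text{ is one digit})
\end{aligned}
\]
Grid square code = puqvrw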
80 km grid square (4-digit code): 40 arc-minutes for latitude, 1 arc-degree for longitude
10 km grid square (6-digit code): 5 arc-minutes for latitude, 7.5 arc-minutes for longitude
1 km grid square (8-digit code): 30 arc-seconds for latitude, 45 arc-seconds for longitude
7
Grid square codes (JIS X0410)
[Figure: hierarchical subdivision of grid squares, with example codes]
1st level grid code (5438): 40 arc-minutes for latitude, 1 arc-degree for longitude
2nd level grid code (5438-23): 5 arc-minutes for latitude, 7.5 arc-minutes for longitude; each 1st level square is divided 8 × 8, indexed 0 to 7
3rd level grid code (5438-2343): 30 arc-seconds for latitude, 45 arc-seconds for longitude; each 2nd level square is divided 10 × 10, indexed 0 to 9
4th level grid code (5438-2343-1): 15 arc-seconds for latitude, 22.5 arc-seconds for longitude; each 3rd level square is divided 2 × 2, numbered 1 to 4 (3 and 4 in the upper row, 1 and 2 in the lower row)
5th level grid code (5438-2343-11): 7.5 arc-seconds for latitude, 11.25 arc-seconds for longitude
6th level grid code (5438-2343-111): 3.75 arc-seconds for latitude, 5.625 arc-seconds for longitude
Example: Kyoto University lies in the grid square with code 52354632.
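The division-with-remainder scheme for the 8-digit (3rd level) code can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation of JIS X0410: the function name `jis_grid_code` is hypothetical, and the coordinates used for Kyoto University (35.026 N, 135.781 E) are an approximate assumption made only for this example.

```python
import math


def jis_grid_code(lat: float, lon: float) -> str:
    """Return the 8-digit (3rd level, about 1 km) grid square code.

    Assumes the point lies in the range covered by the Japanese codes
    (latitude 0 to below 66.67 degrees north, longitude 100 to below
    180 degrees east).
    """
    lat_min = lat * 60.0            # latitude in arc-minutes
    p = int(lat_min // 40)          # 1st level: 40 arc-minute rows
    a = lat_min - 40 * p            # remainder in arc-minutes
    q = int(a // 5)                 # 2nd level: 5 arc-minute rows
    b = a - 5 * q
    r = int(b * 60 // 30)           # 3rd level: 30 arc-second rows

    u = int(math.floor(lon - 100))  # 1st level: 1 degree columns
    f = (lon - 100 - u) * 60.0      # remainder in arc-minutes
    v = int(f // 7.5)               # 2nd level: 7.5 arc-minute columns
    g = f - 7.5 * v
    w = int(g * 60 // 45)           # 3rd level: 45 arc-second columns

    return f"{p:02d}{u:02d}{q}{v}{r}{w}"


# Approximate location of Kyoto University (assumed coordinates).
print(jis_grid_code(35.026, 135.781))
```

With the assumed coordinates, the result agrees with the code quoted for Kyoto University on the slide. Points close to a cell boundary may be sensitive to floating-point rounding; an implementation for production use would likely work in integer arc-seconds or with exact arithmetic.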
8
World Grid Square Codes
(JIS X0410 compatible definition)
• Administrative areas are used as geographical units in statistical aggregation; however, their boundaries vary over time.
• Grid square statistics are geographical statistics defined on cells of latitude and longitude.
• The grid square code is the key definition for generating grid square statistics data.
9
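3rd level world grid square codes (JIS X0410 compatible definition)
Here x = 0 for latitude ≥ 0 and x = 1 otherwise; y = 0 for longitude ≥ 0 and y = 1 otherwise; z = 0 for |longitude| < 100 and z = 1 otherwise.
\[
\begin{aligned}
o &= 2^2 x + 2 y + z + 1 \\
(1 - 2x)\,\text{latitude} \times 60 \div 40 &= p \ \text{remainder}\ a && (p \text{ is up to three digits}) \\
a \div 5 &= q \ \text{remainder}\ b && (q \text{ is one digit}) \\
b \times 60 \div 30 &= r \ \text{remainder}\ c && (r \text{ is one digit}) \\
(1 - 2y)\,\text{longitude} - 100 z &= u \ \text{remainder}\ f && (u \text{ is up to two digits}) \\
f \times 60 \div 7.5 &= v \ \text{remainder}\ g && (v \text{ is one digit}) \\
g \times 60 \div 45 &= w \ \text{remainder}\ h && (w \text{ is one digit})
\end{aligned}
\]
World grid square code = opuqvrw, where p is zero-padded to three digits (00p for p < 10, 0p for 10 ≤ p < 100) and u is zero-padded to two digits (0u for u < 10).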
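The world extension can be sketched the same way. The sketch below assumes that x and y flag the southern and western hemispheres, that z flags an absolute longitude of 100 degrees or more, that o = 4x + 2y + z + 1, and that p and u are zero-padded to three and two digits. The function name `world_grid_code` is hypothetical, and the sample points are chosen only for illustration.

```python
import math


def world_grid_code(lat: float, lon: float) -> str:
    """Return a 10-digit 3rd level world grid square code (sketch)."""
    x = 1 if lat < 0 else 0            # southern hemisphere flag (assumed)
    y = 1 if lon < 0 else 0            # western hemisphere flag (assumed)
    z = 1 if abs(lon) >= 100 else 0    # |longitude| >= 100 degrees flag
    o = 4 * x + 2 * y + z + 1          # leading digit, 1 to 8

    lat_min = abs(lat) * 60.0          # (1 - 2x) * latitude, in arc-minutes
    p = int(lat_min // 40)
    a = lat_min - 40 * p
    q = int(a // 5)
    b = a - 5 * q
    r = int(b * 60 // 30)

    lon_rel = abs(lon) - 100 * z       # (1 - 2y) * longitude - 100 z
    u = int(math.floor(lon_rel))
    f = (lon_rel - u) * 60.0
    v = int(f // 7.5)
    g = f - 7.5 * v
    w = int(g * 60 // 45)

    return f"{o}{p:03d}{u:02d}{q}{v}{r}{w}"


# A point inside the Japanese cell 52353561 shown later in the LOD example.
print(world_grid_code(34.9708, 135.64375))
```

For points in Japan the result is the JIS X0410 code prefixed with "20", which is the compatibility property the slides emphasize.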
10
Probable data sources
• We identified three types of data source:
• official statistics
• satellite imagery
• point data collected from the Internet
• We found four types of procedure to create World Grid Square statistics:
1. Convert grid square statistics provided as part of government statistics into World Grid Square statistics.
2. Aggregate gridded data and compute World Grid Square statistics.
3. Compute grid square statistics from data that include geographical information (latitude and longitude).
4. Generate World Grid Square statistics from polygon data by checking which grid squares fall inside each polygon.
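Procedure 3 can be illustrated with a short Python sketch: each observation with a latitude and longitude is assigned to its 3rd level world grid square, and summary statistics are computed per square. The function names and the sample points are hypothetical, and the code definition assumed here (o = 4x + 2y + z + 1, three-digit p, two-digit u) is an assumption stated for this example only.

```python
import math
from collections import defaultdict
from statistics import mean, median


def world_grid_code(lat, lon):
    """3rd level world grid square code (sketch of the assumed definition)."""
    x, y = int(lat < 0), int(lon < 0)
    z = int(abs(lon) >= 100)
    lat_min = abs(lat) * 60.0
    p = int(lat_min // 40)
    a = lat_min - 40 * p
    q = int(a // 5)
    r = int((a - 5 * q) * 60 // 30)
    lon_rel = abs(lon) - 100 * z
    u = int(math.floor(lon_rel))
    f = (lon_rel - u) * 60.0
    v = int(f // 7.5)
    w = int((f - 7.5 * v) * 60 // 45)
    return f"{4 * x + 2 * y + z + 1}{p:03d}{u:02d}{q}{v}{r}{w}"


def grid_statistics(points):
    """Aggregate (lat, lon, value) observations per grid square."""
    cells = defaultdict(list)
    for lat, lon, value in points:
        cells[world_grid_code(lat, lon)].append(value)
    return {
        code: {
            "n": len(values),
            "min": min(values),
            "mean": mean(values),
            "median": median(values),
            "max": max(values),
        }
        for code, values in cells.items()
    }


# Hypothetical observations (latitude, longitude, value).
sample = [
    (4.620, 119.390, 1),
    (4.621, 119.395, 9),
    (4.624, 119.399, 11),
    (35.026, 135.781, 50),
]
for code, stats in grid_statistics(sample).items():
    print(code, stats)
```

Procedure 2 follows the same pattern when the input is a finer grid: each fine cell's center is mapped to a code and its values are pooled.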
11
Grid Square demonstration of Government statistics on e-Stat (Official Statistics Portal Site of Japan)
Population (Population Census)
Number of employees (Economic Census)
https://jstatmap.e-stat.go.jp
12
Potential WGS application
• Administrative areas for 252 countries and regions
The original data are provided by GADM. ©GADM
This research used computational resources of the HPCI system provided by the Institute of Statistical Mathematics through the HPCI System Research Project (Project ID: hp16060) (1 April 2016-31 March 2017).
[Figure: example maps for Belarus, Canada, Australia, Germany, Denmark, and Sweden]
13
meshcode    alt_min  alt_mean   alt_median  alt_max  lat0      long0     lat1      long1
2006197341        1   8.17602            9       11  4.625     119.3875  4.616667  119.4
2007192382        0   6.721212           4       18  4.908333  119.4     4.9       119.4125
2007192372        0  10.962733          12       20  4.9       119.4     4.891667  119.4125
2007192362        1  12.734207          13       21  4.891667  119.4     4.883333  119.4125
2007191312      -19   8.215956           8       17  4.766667  119.4     4.758333  119.4125
2007191302      -22   9.224736           8       21  4.758333  119.4     4.75      119.4125
2007190392        1   9.973765          10       23  4.75      119.4     4.741667  119.4125
2007190382        0   4.625693           4       14  4.741667  119.4     4.733333  119.4125
[Figure panels: min altitude, mean altitude, median altitude, max altitude, difference]
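The corner columns of the table can be recovered from the code alone. The sketch below decodes a 10-digit world grid square code into its bounding latitudes and longitudes, assuming that lat0/long0 denote the north-west corner and lat1/long1 the south-east corner (an interpretation inferred from the table values, not stated on the slide). The function name `cell_bounds` is hypothetical.

```python
def cell_bounds(code: str):
    """Return (lat_north, lon_west, lat_south, lon_east) in degrees
    for a 10-digit 3rd level world grid square code (sketch)."""
    o = int(code[0]) - 1
    x, y, z = (o >> 2) & 1, (o >> 1) & 1, o & 1   # o - 1 = 4x + 2y + z
    p, u = int(code[1:4]), int(code[4:6])
    q, v, r, w = (int(ch) for ch in code[6:10])

    # Edge of the cell nearest the equator / prime meridian, in degrees.
    lat_near = (40 * p + 5 * q + 0.5 * r) / 60.0
    lon_near = 100 * z + u + (7.5 * v + 0.75 * w) / 60.0
    lat_far = lat_near + 30.0 / 3600.0            # cell height: 30 arc-seconds
    lon_far = lon_near + 45.0 / 3600.0            # cell width: 45 arc-seconds

    s_lat, s_lon = 1 - 2 * x, 1 - 2 * y           # hemisphere signs
    south, north = sorted((s_lat * lat_near, s_lat * lat_far))
    west, east = sorted((s_lon * lon_near, s_lon * lon_far))
    return north, west, south, east


# First row of the altitude table.
print(cell_bounds("2006197341"))
```

Up to floating-point rounding, the first row decodes to 4.625, 119.3875, 4.616667, 119.4, which corresponds to the lat0, long0, lat1, long1 columns.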
Potential WGS application
• Altitude
The original data are provided by the Japan Aerospace Exploration Agency. ©JAXA
The Advanced Land Observing Satellite "DAICHI" (ALOS)
An elevation data set with a resolution of 30 meters horizontally (30-m mesh version) was used.
14
http://www.fttsus.jp/worldgrids/en/top/
Research Institute for World Grid Squares
World Grid Square Statistics
Data, Documents, Libraries, Web Applications
15
Cloud-based Data Analytics Platform
MESHSTATS
https://www.meshstats.xyz/meshstats
Supports 9 languages
16
Proposed system
Application Interface Service
Open Data
Data collected from Web
Resource Information
Government Statistics
DMO Data
Stakeholders in Tourism (UNWTO Affiliate Members)
National Statistics (NSTAC)
Internet
MESHSTATS
Users
Data about accommodations and transportation
Data from Satellites
Aerospace Agencies (JAXA, ESA, NASA)
17
Potential WGS application
Population from government statistics
Forest area from satellite imagery
18
Room occupancy ratio estimated from data of an Internet booking site
[Figure: occupancy ratio (%) by date of stay, 24 June 2016 to 19 November 2016]
19
Management Unit for world grid squares
API function
HTML function 1
HTML function i
HTML function N
API output
HTML I/O
Management Unit for users
JavaScript for data analytics
Management Unit for Multi-languages
DB for world grid statistics
Language DB
User DB
……
API search query parser
API search query
WebAPI Unit
HTML Unit
20
Big Data Reference Architecture
The Big Data Reference Architecture consists of five roles:
• Data Consumer
• Data Provider
• Data Orchestrator
• Data Application Provider
• Data Platform Provider
21
Approach
• Data Application Providers can provide Data Consumers with data applications developed using open data from Data Providers.
• Data Providers can provide part of their own data as open data.
• Data Consumers can take part in participatory design by using the data application under development and creating use cases for it.
• Data Orchestrators can create insights about use cases through field investigation with Data Providers and Data Consumers.
[Figure: cycle among Data Application Providers, Data Providers, Data Consumers, and Data Orchestrators, labeled with: providing open data; developing data applications based on an agile method; field investigation; creating insights about use cases]
22
Linked Open Data (LOD)
• LOD are presented using standard technologies based on World Wide Web Consortium (W3C) recommendations
• LOD are created using the Resource Description Framework (RDF)
• The National Statistics Center plans to release grid square statistics of the population census as LOD
http://data.e-stat.go.jp/lodw/
23
Relation diagram expression
• World Grid Square statistics can readily be linked to data other than government statistics
• Data definitions can be standardized by publishing RDF-formatted LOD identified by URIs.
[Relation diagram: an observation and the grid cell it refers to]
Cell ID: gsc:g2052353561
Observation side: sdmx-dimension:refArea; cd-dimension:sex = cd-code:sex-all; cd-dimension:timePriod = 2010; node 11
gsc:wgsCode (World grid square code): 2052353561
gsc:jgsCode: 52353561
gsc:lat-NW: 34.975
gsc:long-NW: 135.6375
gsc:lat-SE: 34.966667
gsc:long-SE: 135.65
gsc:span-EWN: 1.141449
gsc:span-EWS: 1.141565
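The two span values appear to be the east-west width of the cell, in kilometres, measured along its northern and southern edges. That reading is an assumption; the slide does not define gsc:span-EWN or gsc:span-EWS. Under that assumption, the width of a 45 arc-second cell can be sketched as the arc length of a parallel on the WGS84 ellipsoid (the choice of ellipsoid is also an assumption, and the function name `ew_span_km` is hypothetical).

```python
import math

A_KM = 6378.137            # WGS84 semi-major axis in km
E2 = 0.00669437999014      # WGS84 first eccentricity squared


def ew_span_km(lat_deg: float, dlon_deg: float) -> float:
    """Arc length in km along the parallel at lat_deg that spans
    dlon_deg degrees of longitude on the WGS84 ellipsoid."""
    phi = math.radians(lat_deg)
    # Prime vertical radius of curvature at this latitude.
    n = A_KM / math.sqrt(1.0 - E2 * math.sin(phi) ** 2)
    return n * math.cos(phi) * math.radians(dlon_deg)


CELL_WIDTH_DEG = 45.0 / 3600.0   # 3rd level cell width: 45 arc-seconds

print(ew_span_km(34.975, CELL_WIDTH_DEG))     # northern edge of the cell
print(ew_span_km(34.966667, CELL_WIDTH_DEG))  # southern edge of the cell
```

Both results come out at roughly 1.14 km, close to the values shown in the diagram, and the southern edge is slightly longer because it lies nearer the equator.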
24
Case studies
• Data integration and data processing: We can link different grid square statistics and combine them in operations, synthesizing new grid square statistics from several original data types.
• Mapping: We can visualize grid square statistics on a map to analyze our areas of interest.
• Data creation on given areas: We can generate statistics on a given area by recalculating grid square statistics for that area.
• Identifying effective areas: Because distances between grid squares are easy to measure, World Grid Square statistics can be used to calculate potential demand from the population within a given distance.
• Defining observation areas: Grid square statistics can be used to define an area for collection of data or samples.
• Unit for numerical simulation: Grid squares can serve as the unit for numerical simulations such as diffusion processes, percolation models, and migration processes.
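The "identifying effective areas" case can be sketched as follows: given the center and population of each grid square, potential demand at a site is taken here as the total population of the squares whose centers lie within a chosen distance. This is only one simple way to define potential demand, and the slides do not specify a formula. The cell centers, populations, and function names below are hypothetical, and distances use the haversine formula on a sphere as a simplification.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def potential_demand(site, cells, max_km):
    """Sum the population of cells whose centers are within max_km of site.

    site  : (lat, lon) of the location being evaluated
    cells : iterable of (center_lat, center_lon, population)
    """
    return sum(
        pop
        for lat, lon, pop in cells
        if haversine_km(site[0], site[1], lat, lon) <= max_km
    )


# Hypothetical grid square centers and populations.
cells = [
    (35.0, 135.0, 100),
    (35.0, 135.1, 50),   # about 9 km east of the site
    (35.5, 135.0, 70),   # about 56 km north of the site
]
site = (35.0, 135.0)
for radius in (5, 10, 100):
    print(radius, potential_demand(site, cells, radius))
```

Because grid squares have fixed positions, the distances can be precomputed once and reused across different statistics, which is part of what makes this kind of analysis convenient on a common grid.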
25
Summary
• We proposed World Grid Square statistics and showed examples of their application to administrative areas and elevations.
• We addressed use cases for data production and data consumption based on World Grid Square statistics.
• We introduced our multi-language data visualization and analytics platform called MESHSTATS (https://www.meshstats.xyz/meshstats).
• We explained how to produce World Grid Square statistics based on LOD.
• We showed that the Big Data Reference Architecture is applicable to data acquisition, data collection, data analysis, interpretation, and deployment with Data Orchestrators, Data Consumers, Data Providers, Data Application Providers, and Data Platform Providers.
26