PostGIS Right Size your Spatial Data - Postgres Conf
![Page 2: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/2.jpg)
Owner / CEO @ RustProof Labs
Publishes on blog.rustprooflabs.com
DB Developer / Analyst @ Front Range CC
SELECT *
FROM pgconf.presenter
WHERE id = 'ryanlambert';
![Page 3: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/3.jpg)
MySQL, mid-2000s
MS SQL Server, 2009
PostgreSQL, 2011
OpenStreetMap && PostGIS
SELECT *
FROM pgconf.presenter
WHERE id = 'ryanlambert';
![Page 4: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/4.jpg)
Agenda
Spatial data overview
GIS tasks: Analysis vs. Thematic
Simplification strategies
Polygons
Lines
![Page 5: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/5.jpg)
GIS Data used…
© OpenStreetMap Contributors
Thank you!
https://www.openstreetmap.org/user/RustProof%20Labs
![Page 6: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/6.jpg)
What is spatial data?
![Page 7: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/7.jpg)
What is spatial data?
Data about the world around us.
“GIS data is still just data!”… we have tools for that
![Page 8: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/8.jpg)
SELECT ST_SetSRID(
ST_Point(-73.9815, 40.7625), 4326);
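A small sketch of what the statement above produces, assuming only a database with the PostGIS extension installed. `ST_Point` takes X (longitude) first and Y (latitude) second. The reprojection to SRID 3857 is added here for illustration and is not part of the slide.

```sql
-- Build the same point as the slide: longitude (X) first, then latitude (Y).
WITH pt AS (
    SELECT ST_SetSRID(ST_Point(-73.9815, 40.7625), 4326) AS geom
)
SELECT ST_SRID(geom)                       AS srid,
       ST_AsEWKT(geom)                     AS ewkt_lon_lat,
       -- Assumption for illustration: reproject to Web Mercator (meters).
       ST_AsText(ST_Transform(geom, 3857)) AS web_mercator
FROM pt;
```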
![Page 9: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/9.jpg)
Spatial data types
POINT (Node)
LINE
POLYGON
![Page 10: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/10.jpg)
Spatial data size
Lines and Polygons can bloat quickly
Similar to:
JSON
BYTEA
![Page 11: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/11.jpg)
Puts on GIS analyst hat…
![Page 12: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/12.jpg)
Tables in PostGIS
Layers in QGIS
Network Activity
![Page 13: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/13.jpg)
![Page 14: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/14.jpg)
![Page 15: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/15.jpg)
Extra challenge
Feels slow
DB doesn’t always register performance issues
![Page 16: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/16.jpg)
GIS Analyst Tasks
Analysis vs. Thematic
![Page 17: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/17.jpg)
Using spatial data: Analysis
I need coffee, quick, where’s the nearest location?
Distance from buildings to fire hydrants?
Is my house in a flood plain?
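The coffee question above could be sketched as a nearest-neighbor query. This is a minimal illustration, not the presenter's code: the shop names and coordinates are invented sample data, supplied inline so the query runs on any PostGIS database.

```sql
-- Hypothetical sample data: three coffee shops as lon/lat points (SRID 4326).
WITH coffee_shop (name, geom) AS (
    VALUES
        ('Shop A', ST_SetSRID(ST_Point(-105.00, 39.75), 4326)),
        ('Shop B', ST_SetSRID(ST_Point(-105.10, 39.70), 4326)),
        ('Shop C', ST_SetSRID(ST_Point(-104.90, 39.80), 4326))
),
me AS (
    -- Hypothetical current location.
    SELECT ST_SetSRID(ST_Point(-105.08, 39.71), 4326) AS geom
)
SELECT c.name,
       -- Cast to geography so the distance is in meters, not degrees.
       ST_Distance(c.geom::geography, me.geom::geography) AS distance_m
FROM coffee_shop c
CROSS JOIN me
ORDER BY distance_m
LIMIT 1;
```

Analysis queries like this depend on accurate geometry, which is why the simplification discussed later is aimed at thematic work.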
![Page 18: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/18.jpg)
Using spatial data: Thematic
Density of drivers around the northeast United States
Regional crime rates
Regional weather maps
![Page 19: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/19.jpg)
Using Spatial Data: Thematic
Visualize trends over an area
![Page 20: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/20.jpg)
Transactional vs. Reporting
OLTP OLAP
ETL
![Page 21: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/21.jpg)
All data is dirty!
![Page 22: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/22.jpg)
OpenStreetMap Data
Data entry
Some professionals
Other unpaid, untrained volunteers
Everything in between
Variable quality / formatting
![Page 23: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/23.jpg)
Thematic GIS
![Page 24: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/24.jpg)
Large-area polygons
Counties
Zip Codes
States
Countries
Lakes
![Page 25: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/25.jpg)
Counties of Colorado (64)
![Page 26: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/26.jpg)
Large Polygon
Jefferson County, CO
ST_NPoints(way) = 780
ST_MemSize(way) = 12.3 kB
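Numbers like these can be pulled with a query along the following lines. It is a sketch that assumes the county is stored in `osm.county_polygon` with `name` and `way` columns, as in the tree-count query later in the deck. The exact table used for this slide is not stated.

```sql
SELECT name,
       ST_NPoints(way)                    AS node_count,
       -- ST_MemSize returns bytes; divide by 1024.0 to report kB.
       round(ST_MemSize(way) / 1024.0, 1) AS size_kb
FROM osm.county_polygon
WHERE name = 'Jefferson County';
```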
![Page 27: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/27.jpg)
![Page 28: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/28.jpg)
Counties in the western US
SELECT COUNT(*) AS county_count,
    AVG(ST_NPoints(way)) AS points_avg,
    SUM(ST_NPoints(way)) AS points_total,
    AVG(ST_MemSize(way)) / 1024 AS kb_avg,
    SUM(ST_MemSize(way)) / 1024 AS kb_total
FROM osm.boundary_polygon
WHERE admin_level = '6';
![Page 29: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/29.jpg)
![Page 30: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/30.jpg)
![Page 31: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/31.jpg)
Counties in the western US
# of Counties: 460
Total # of Nodes: 623k
Average # of Nodes: 1,355
Max # of Nodes: 13,832
Average size per polygon: 21.2 kB
Max size of polygon: 216 kB
![Page 32: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/32.jpg)
Size Matters
Average of 16 bytes / node
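The 16-byte figure lines up with how a 2D coordinate is stored: an X and a Y value, each an 8-byte double. As a rough check, multiplying the average (1,355) and maximum (13,832) node counts from the county slide by 16 bytes gives about 21.2 kB and 216 kB, in line with the sizes reported there. The estimate ignores the small fixed header each geometry carries.

```sql
SELECT 2 * 8                         AS bytes_per_node,   -- X and Y, 8 bytes each
       round(1355 * 16 / 1024.0, 1)  AS avg_kb_estimate,  -- average county
       round(13832 * 16 / 1024.0, 1) AS max_kb_estimate;  -- largest county
```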
![Page 33: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/33.jpg)
What can we do?
![Page 34: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/34.jpg)
PostGIS: ST_Simplify()
“Returns a "simplified" version of the given geometry...”
ST_Simplify(geometry, tolerance)
Higher tolerance == More simplification
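To see the tolerance at work, the sketch below simplifies one made-up line at three tolerances and reports how many points survive. The line and the tolerance values are invented for illustration. `ST_Simplify` uses the Douglas-Peucker algorithm, so points that deviate from the overall shape by less than the tolerance are candidates for removal.

```sql
-- Hypothetical test line; coordinates are unitless for this demo.
WITH line AS (
    SELECT ST_GeomFromText(
        'LINESTRING(0 0, 1 0.1, 2 -0.1, 3 5, 4 6, 5 7, 6 8.1, 7 9, 8 9, 9 9)'
    ) AS geom
)
SELECT t.tolerance,
       ST_NPoints(line.geom) AS points_before,
       ST_NPoints(ST_Simplify(line.geom, t.tolerance::double precision)) AS points_after,
       ST_AsText(ST_Simplify(line.geom, t.tolerance::double precision))  AS simplified
FROM line
CROSS JOIN (VALUES (0.1), (0.5), (1.0)) AS t (tolerance)
ORDER BY t.tolerance;
```

The tolerance is expressed in the units of the geometry's SRID, so the same number means very different things in degrees and in meters.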
![Page 35: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/35.jpg)
In non-spatial terms…
3.14159 ≈ 3.14
![Page 36: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/36.jpg)
PostGIS: ST_Simplify()
SRID 900913
tolerance=10
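SRID 900913 is a spherical Mercator projection whose units are meters, so a tolerance of 10 means roughly 10 meters. A before/after comparison could look like the sketch below. It assumes the county polygons live in `osm.county_polygon` with geometry column `way` stored in that SRID.

```sql
SELECT name,
       ST_NPoints(way)                  AS nodes_raw,
       ST_NPoints(ST_Simplify(way, 10)) AS nodes_simplified,
       ST_MemSize(way)                  AS bytes_raw,
       ST_MemSize(ST_Simplify(way, 10)) AS bytes_simplified
FROM osm.county_polygon
ORDER BY nodes_raw DESC
LIMIT 10;
```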
![Page 37: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/37.jpg)
Polygon simplification
![Page 38: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/38.jpg)
Benefits of Simplified Polygons
Reduced # of nodes by 45-50%
Reduced size on disk by 45-50%
Improved query performance by ~ 40%
![Page 39: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/39.jpg)
Side effects
Reduced accuracy
Potential errors
Before/After
Error rate in testing: < 0.5%
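One way such an error rate might be measured is to classify the same points against the raw and the simplified boundary and count the disagreements. This is a sketch under assumptions, not the test behind the figure above: the tolerance of 10, the 100-unit search margin, and the choice of denominator (trees near the county) are all illustrative.

```sql
-- Classify each tree against the raw and the simplified county boundary,
-- then count how often the two answers disagree.
WITH county AS (
    SELECT way                  AS way_raw,
           ST_Simplify(way, 10) AS way_simple   -- tolerance is an assumption
    FROM osm.county_polygon
    WHERE name = 'Jefferson County'
),
tree AS (
    SELECT n.osm_id,
           ST_Contains(c.way_raw, n.way)    AS in_raw,
           ST_Contains(c.way_simple, n.way) AS in_simple
    FROM osm.natural_point n
    CROSS JOIN county c
    WHERE n."natural" = 'tree'
      -- Only trees near the county matter; && compares bounding boxes.
      AND n.way && ST_Expand(c.way_raw, 100)
)
SELECT COUNT(*)                                    AS trees_checked,
       COUNT(*) FILTER (WHERE in_raw <> in_simple) AS mismatches,
       round(100.0 * COUNT(*) FILTER (WHERE in_raw <> in_simple)
             / NULLIF(COUNT(*), 0), 2)             AS mismatch_pct
FROM tree;
```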
![Page 40: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/40.jpg)
Too much simplification?
![Page 41: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/41.jpg)
Questions so far?
![Page 42: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/42.jpg)
Large number of small lines
![Page 43: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/43.jpg)
Large number of small lines
OpenStreetMap roads
Interstate
Major highway
Minor highway
Residential roads
Sidewalks
Parking aisles
Hiking trails
![Page 44: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/44.jpg)
I-70 in Colorado
![Page 45: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/45.jpg)
I-70 in Colorado
450 miles
1,647 rows of data
19,281 nodes
MIN(ST_NPoints(way)): 2
MAX(ST_NPoints(way)): 217
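Row and node counts like these can be gathered with a single aggregate query. The sketch assumes the roads are in `osm.road_line` (the table named in the index example near the end of the deck) and that the interstate is tagged with `ref = 'I 70'`. The actual filter used for the slide is not shown.

```sql
SELECT COUNT(*)             AS row_count,
       SUM(ST_NPoints(way)) AS nodes_total,
       MIN(ST_NPoints(way)) AS nodes_min,
       MAX(ST_NPoints(way)) AS nodes_max
FROM osm.road_line
WHERE ref = 'I 70';   -- assumed tag value
```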
![Page 46: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/46.jpg)
Aggregate and Simplify
ST_Collect()
ST_Simplify()
![Page 47: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/47.jpg)
ST_Collect()
Aggregate function
Think SUM() for spatial
GROUP BY
ref
name
level
city
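Combining the two functions might look like the sketch below, which collapses the many short road segments into one geometry per road and then simplifies the result. The grouping columns (`ref`, `highway`) and the tolerance of 10 are assumptions chosen for illustration. The slide lists ref, name, level, and city as grouping options.

```sql
SELECT ref,
       highway,
       COUNT(*) AS source_rows,
       -- Collect every segment into one multi-geometry, then simplify it.
       ST_Simplify(ST_Collect(way), 10) AS way
FROM osm.road_line
WHERE ref IS NOT NULL
GROUP BY ref, highway;
```

Fewer, larger rows mean less per-row overhead for both Postgres and the rendering client.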
![Page 48: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/48.jpg)
Demo (video)
Render roads and waterways in QGIS
Two windows
Raw data: Upper-left
Thematic: Lower-right
![Page 49: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/49.jpg)
![Page 50: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/50.jpg)
PostGIS to QGIS Rendered
https://blog.rustprooflabs.com/2018/12/postgis-tame-your-data-2
![Page 51: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/51.jpg)
PostGIS to QGIS Rendered
40% faster query in Postgres
80-95% faster in QGIS
QGIS pulls 2k rows at a time
QGIS has to load, process, apply rules, and render
![Page 52: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/52.jpg)
PostGIS to Rendered
80 - 95% improvement!
![Page 53: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/53.jpg)
Faster in-DB spatial operations
![Page 54: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/54.jpg)
Trees (Point) per county (Polygon)
EXPLAIN (ANALYZE, BUFFERS, COSTS)
SELECT c.osm_id, c.name, c.way,
    COUNT(n.osm_id) AS trees
FROM osm.county_polygon c
INNER JOIN osm.natural_point n
    ON ST_Contains(c.way, n.way)
WHERE n."natural" = 'tree'
    AND c.name = 'Jefferson County'
GROUP BY c.osm_id, c.name, c.way;
![Page 55: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/55.jpg)
![Page 56: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/56.jpg)
Trees per county
Raw: 274 MB
Thematic: 134 MB (-51% diff in spillage)
![Page 57: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/57.jpg)
Trees per county
Raw
Thematic
![Page 58: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/58.jpg)
Latencies at Human Scale
| System Event | Actual Latency | Scaled Latency |
| --- | --- | --- |
| One CPU cycle | 0.4 ns | 1 s |
| Level 1 cache access | 0.9 ns | 2 s |
| Level 2 cache access | 2.8 ns | 7 s |
| Level 3 cache access | 28 ns | 1 min |
| Main memory access (DDR DIMM) | ~100 ns | 4 min |
| SSD I/O | 50–150 μs | 1.5–4 days |
| Rotational disk I/O | 1–10 ms | 1–9 months |
| Internet call: San Francisco to New York City | 65 ms | 5 years |
| Internet call: San Francisco to Hong Kong | 141 ms | 11 years |
https://www.prowesscorp.com/computer-latency-at-a-human-scale/
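The scaled column follows from treating one 0.4 ns CPU cycle as one second, so each latency in nanoseconds is divided by 0.4 to get scaled seconds. A quick sketch of that arithmetic for three of the rows, with the conversion to years using a 365.25-day year as an assumption:

```sql
SELECT event,
       actual_ns,
       actual_ns / 0.4                            AS scaled_seconds,
       round(actual_ns / 0.4 / (86400 * 365.25), 2) AS scaled_years
FROM (VALUES
    ('One CPU cycle', 0.4),
    ('Main memory access (DDR DIMM)', 100),
    ('Internet call: San Francisco to New York City', 65000000)
) AS latency (event, actual_ns);
```

Main memory works out to 250 scaled seconds (about 4 minutes) and the San Francisco to New York call to a little over 5 scaled years, matching the rounded values shown.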
![Page 59: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/59.jpg)
When to optimize?
ETL
Views / Materialized views
Ad-hoc queries
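For the materialized view option, a minimal sketch might store a simplified copy of the county polygons and index it. The view name, index name, and tolerance are hypothetical. This is one possible layout, not the PgOSM implementation.

```sql
-- Hypothetical thematic copy of the county layer.
CREATE MATERIALIZED VIEW osm.county_polygon_thematic AS
SELECT osm_id,
       name,
       ST_Simplify(way, 10) AS way   -- tolerance is an assumption
FROM osm.county_polygon;

-- Spatial index so the thematic layer is searchable like the raw table.
CREATE INDEX gix_county_polygon_thematic
    ON osm.county_polygon_thematic
    USING GIST (way);

-- Re-run after each data load so the simplified copy stays current.
REFRESH MATERIALIZED VIEW osm.county_polygon_thematic;
```

Doing the simplification once here, or in ETL, avoids paying for it on every ad-hoc query.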
![Page 60: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/60.jpg)
ETL: PgOSM Project
Started in 2015
Transforms osm2pgsql structure to “Layers”
MIT License
https://github.com/rustprooflabs/pgosm
![Page 61: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/61.jpg)
Final Thoughts
![Page 62: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/62.jpg)
Postgres v11…
Covering indexes!
![Page 63: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/63.jpg)
Coming in Postgres v12
Covering GIST indexes
CREATE INDEX gix_road_line
    ON osm.road_line
    USING GIST (way)
    INCLUDE (highway, ref);
https://commitfest.postgresql.org/21/1615/
https://github.com/postgres/postgres/commit/f2e403803fe6deb8cff59ea09dff42c6163b2110
![Page 64: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/64.jpg)
Resources
![Page 65: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/65.jpg)
PostGIS Docs
https://postgis.net/docs/reference.html
https://postgis.net/workshops/postgis-intro/
![Page 66: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/66.jpg)
RustProof Labs Blog
PostGIS: Tame your spatial data (Part 1)
PostGIS: Tame your spatial data (Part 2)
Load OpenStreetMap data to PostGIS
osm2pgsql on a Raspberry Pi
PgOSM: Transform OpenStreetMap data in PostGIS
PgOSM Transformations explained
![Page 67: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/67.jpg)
Versions used
SELECT version();
PostgreSQL 11.1 (Ubuntu 11.1-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit
SELECT PostGIS_Full_version();
POSTGIS="2.5.1 r17027" [EXTENSION] PGSQL="95" (procs need upgrade for use with "110") GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 1.11.3, released 2015/09/16" LIBXML="2.9.3" LIBJSON="0.11.99" LIBPROTOBUF="1.2.1" RASTER
![Page 68: PostGIS Right Size your Spatial Data - Postgres Conf](https://reader031.fdocuments.net/reader031/viewer/2022012409/616a469911a7b741a350bbe7/html5/thumbnails/68.jpg)
Thank you!
Questions?