Geospatial ETL with Stetl - GeoPython 2016
Transcript of Geospatial ETL with Stetl - GeoPython 2016
![Page 1: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/1.jpg)
Geospatial Data Processing with Stetl
Just van den Broecke
GeoPython 2016
Muttenz, Switzerland
June 24, 2016
www.justobjects.nl
www.stetl.org
![Page 2: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/2.jpg)
About Me
Independent Open Source Geospatial Professional
+ Secretary OSGeo Dutch Local Chapter
+ Member of the Dutch OpenGeoGroep
Just van den Broecke
[email protected]
www.justobjects.nl
![Page 3: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/3.jpg)
Agenda
• Spatial ETL
• Stetl
  • Concepts
  • Cases
  • Status
• Q & A
![Page 4: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/4.jpg)
Spatial ETL
![Page 5: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/5.jpg)
Data Wrangling
![Page 6: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/6.jpg)
ETL - Extract Transform Load
![Page 7: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/7.jpg)
Spatial ETL
Formats shown in the diagram: GML, PostGIS, Shapefile, WFS, CSV, SQLite, XML
![Page 8: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/8.jpg)
Spatial ETL Example: non-standard source
Formats shown in the diagram: GML, PostGIS, Shapefile, WFS, CSV, SQLite, XML
![Page 9: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/9.jpg)
From: https://live.osgeo.org/en/overview/gdal_overview.html
![Page 10: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/10.jpg)
![Page 11: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/11.jpg)
Plenty of Tools…
Each tool is powerful by itself but cannot do the entire ETL
ogr2ogr
Spatial ETL
![Page 12: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/12.jpg)
FOSS ETL - How to Combine Components?
![Page 13: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/13.jpg)
Example - 2011 INSPIRE-FOSS
http://inspire.kademo.nl/doc/design-etl.html
Nice ideas but hard to scale, deploy and reuse.
Need Framework
![Page 14: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/14.jpg)
Solution: Add Python to the Equation
![Page 15: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/15.jpg)
Stetl
Solution: Add Python to the Equation
![Page 16: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/16.jpg)
Stetl = Simple, Streaming, Spatial, Speedy ETL
![Page 17: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/17.jpg)
Stetl Concepts
![Page 18: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/18.jpg)
Process Chain
Source → Input → Filter → Filter → Output → Target
Stetl concepts
![Page 19: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/19.jpg)
Process Chain
Input → Filter → Filter → Output
gml
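The chain idea on these slides can be sketched in plain Python. This is not Stetl's implementation: `Packet`, `run_chain`, and the component functions below are hypothetical stand-ins, chosen only to show data entering through an input, passing through each filter in order, and arriving at an output.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, List


@dataclass
class Packet:
    """Hypothetical stand-in for the unit of data passed along a chain."""
    data: Any = None


def read_lines(lines: Iterable[str]) -> Iterator[Packet]:
    """Input: wrap each source record in a Packet."""
    for line in lines:
        yield Packet(data=line)


def strip_filter(packet: Packet) -> Packet:
    """Filter: remove surrounding whitespace."""
    packet.data = packet.data.strip()
    return packet


def upper_filter(packet: Packet) -> Packet:
    """Filter: upper-case the record."""
    packet.data = packet.data.upper()
    return packet


def run_chain(source: Iterator[Packet],
              filters: List[Callable[[Packet], Packet]],
              sink: List[Any]) -> List[Any]:
    """Push every packet from the input through all filters into the output."""
    for packet in source:
        for apply_filter in filters:
            packet = apply_filter(packet)
        sink.append(packet.data)
    return sink


result = run_chain(read_lines([" amsterdam ", " muttenz\n"]),
                   [strip_filter, upper_filter],
                   [])
print(result)
```

Because the input is a generator, only one packet is in flight at a time, which is the property the later "Speed: Streaming" slide relies on.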
![Page 20: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/20.jpg)
Example: GML to PostGIS
GMLReader
PG Output
gml
![Page 21: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/21.jpg)
Example: Data Model Transform
OGRReader → XSLT (or Jinja2!) → GMLWriter
Simple Features → Complex Features
gml
![Page 22: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/22.jpg)
Process Chain - How?
Input Filters Output
Stetl Config File
Instantiate
![Page 23: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/23.jpg)
Example: XML to Shape
XMLInput
XSLTFilter
OGROutput
![Page 24: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/24.jpg)
Example: XML to Shape
The Source File
![Page 25: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/25.jpg)
Example: XML to Shape
XMLInput
![Page 26: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/26.jpg)
Example: XML to Shape
XMLInput
XSLTFilter
![Page 27: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/27.jpg)
Example: XML to Shape
Prepare XSLT Script
![Page 28: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/28.jpg)
Example: XML to Shape
XSLT GML Output
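The source file and the XSLT script appear only as images in the slides, so their content is not recoverable here. As a rough illustration of what this step produces, the sketch below turns a hypothetical cities document into GML-style features. It uses the standard library's ElementTree instead of XSLT to stay dependency-free; the element names (`cities`, `city`, `name`, `lat`, `lon`, `FeatureCollection`, `City`) are assumptions, not the names used in the talk.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)

# Hypothetical source document; the real source file is not in the transcript.
SOURCE_XML = """<cities>
  <city><name>Amsterdam</name><lat>52.4</lat><lon>4.9</lon></city>
  <city><name>Bonn</name><lat>50.7</lat><lon>7.1</lon></city>
</cities>"""


def cities_to_gml(xml_text: str) -> ET.Element:
    """Map each <city> to a feature with a gml:Point geometry (x,y = lon,lat)."""
    source = ET.fromstring(xml_text)
    collection = ET.Element("FeatureCollection")
    for city in source.findall("city"):
        member = ET.SubElement(collection, f"{{{GML_NS}}}featureMember")
        feature = ET.SubElement(member, "City")
        ET.SubElement(feature, "name").text = city.findtext("name")
        geometry = ET.SubElement(feature, "geometry")
        point = ET.SubElement(geometry, f"{{{GML_NS}}}Point", srsName="EPSG:4326")
        coordinates = ET.SubElement(point, f"{{{GML_NS}}}coordinates")
        coordinates.text = f"{city.findtext('lon')},{city.findtext('lat')}"
    return collection


gml_root = cities_to_gml(SOURCE_XML)
gml_text = ET.tostring(gml_root, encoding="unicode")
print(gml_text)
```

In the talk this mapping lives in an XSLT script, so it can be changed without touching Python code.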
![Page 29: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/29.jpg)
Example: XML to Shape
XMLInput
XSLTFilter
OGROutput
OGC Simple Features
XML DOM
![Page 30: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/30.jpg)
Example: XML to Shape
The Stetl Config File
ProcessChain: XMLInput → XSLT Filter → ogr2ogr Output
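The config file on this slide is an image, but the "Your Own Components" slide shows the format: an `[etl]` section whose `chains` value names component sections joined by `|`. A minimal sketch of reading such a file with the standard library follows. The section names `transformer_xslt` and `output_ogr_shape`, the module paths `filters.xsltfilter` and `outputs.ogroutput`, the class name `Ogr2OgrOutput`, and the file `cities2gml.xsl` are assumptions for illustration; only `inputs.fileinput.XmlFileInput`, the `script` option, and the class name `XsltFilter` appear in the transcript.

```python
import configparser

# Assumed config for the XML-to-Shape chain; names are illustrative (see above).
CONFIG_TEXT = """
[etl]
chains = input_xml_file|transformer_xslt|output_ogr_shape

[input_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = input/cities.xml

[transformer_xslt]
class = filters.xsltfilter.XsltFilter
script = cities2gml.xsl

[output_ogr_shape]
class = outputs.ogroutput.Ogr2OgrOutput
"""


def load_chain(config_text: str):
    """Return [(section_name, options), ...] in chain order."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    names = parser.get("etl", "chains").split("|")
    return [(name, dict(parser.items(name))) for name in names]


chain = load_chain(CONFIG_TEXT)
for name, options in chain:
    print(name, "->", options["class"])
```

A framework would go one step further and import each `class` value to instantiate the component; that step is left out here.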
![Page 31: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/31.jpg)
Running Stetl
stetl -c etl.cfg
stetl -c etl.cfg [-a <properties>]
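The `-a` option passes properties into the config file. The transcript does not show the substitution syntax, so the sketch below rests on two assumptions that should be checked against the Stetl documentation: placeholders are written as `{name}` in the config, and properties arrive as a space-separated `name=value` string. The function names are hypothetical.

```python
def parse_args_string(args: str) -> dict:
    """Turn 'key=value key2=value2' into a dict (assumed property format)."""
    pairs = (item.split("=", 1) for item in args.split())
    return {key: value for key, value in pairs}


# Config fragment with an assumed {in_file} placeholder.
CONFIG_TEMPLATE = """[input_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = {in_file}
"""


def render_config(template: str, args: str) -> str:
    """Substitute every {name} placeholder with its property value."""
    return template.format(**parse_args_string(args))


rendered = render_config(CONFIG_TEMPLATE, "in_file=input/cities.xml")
print(rendered)
```

Keeping paths and credentials out of the config this way lets one config file serve several environments.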
![Page 32: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/32.jpg)
Installing Stetl - PyPI
Deps
• GDAL + Python bindings
• lxml (XML processing)
• psycopg2 (Postgres)
sudo pip install stetl
![Page 33: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/33.jpg)
Installing Stetl - new: Docker
https://hub.docker.com/r/justb4/stetl
![Page 34: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/34.jpg)
Speed: Streaming
Input Filter Output
gml
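Streaming here means that features are handed on one at a time instead of loading a whole document first. A minimal sketch of that idea with the standard library's incremental parser is shown below; the `city` and `name` element names and the `stream_features` helper are hypothetical, and Stetl's own inputs may work differently.

```python
import io
import xml.etree.ElementTree as ET


def stream_features(xml_stream, tag):
    """Yield one value per finished <tag> element, then release its memory."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == tag:
            yield elem.findtext("name")
            elem.clear()  # drop the element's children so memory stays flat


# Small in-memory sample; a real source would be a large file or socket.
SAMPLE = (b"<cities>"
          + b"".join(b"<city><name>City %d</name></city>" % i for i in range(5))
          + b"</cities>")

names = list(stream_features(io.BytesIO(SAMPLE), "city"))
print(names)
```

Memory use depends on the size of one feature rather than the size of the file, which matters for national datasets delivered as very large GML files.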
![Page 35: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/35.jpg)
Speed: Going Native
Input Filter Output
gml
Stetl calls native C libs/programs, e.g. ogr2ogr
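"Going native" means leaving the heavy lifting to compiled tools and using Python only for orchestration. The sketch below builds an `ogr2ogr` command line and runs it as a subprocess when the program is installed. The helper names and file paths are hypothetical; `-overwrite` and `-f` are standard `ogr2ogr` options, and the destination is given before the source.

```python
import shutil
import subprocess


def build_ogr2ogr_cmd(src: str, dst: str, driver: str = "ESRI Shapefile") -> list:
    """Build the argument list: ogr2ogr [options] <destination> <source>."""
    return ["ogr2ogr", "-overwrite", "-f", driver, dst, src]


def run_native(cmd: list) -> bool:
    """Run the command if its executable is on PATH; return False otherwise."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return True


cmd = build_ogr2ogr_cmd("temp/cities.gml", "output/cities.shp")
print(" ".join(cmd))
```

Passing the command as a list avoids shell quoting problems with driver names that contain spaces.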
![Page 36: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/36.jpg)
Example Components
| Input | Filters | Output |
|---|---|---|
| XMLFile | XSLT | GML |
| ogr2ogr | XMLAssembler | GDAL/OGR |
| LineStream | XMLValidator | WFS-T |
| Postgres/PostGIS | Jinja2 | Postgres/PostGIS |
| deegree* | FeatureExtractor | deegree* |
| YourInput | YourFilter | YourOutput |
![Page 37: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/37.jpg)
Example: XsltFilter (Python)

from util import Util, etree
from filter import Filter
from packet import FORMAT

log = Util.get_log("xsltfilter")

class XsltFilter(Filter):
    # Constructor
    def __init__(self, configdict, section):
        Filter.__init__(self, configdict, section, consumes=FORMAT.etree_doc, produces=FORMAT.etree_doc)

        self.xslt_file_path = self.cfg.get('script')
        self.xslt_file = open(self.xslt_file_path, 'r')
        # Parse XSLT file only once
        self.xslt_doc = etree.parse(self.xslt_file)
        self.xslt_obj = etree.XSLT(self.xslt_doc)
        self.xslt_file.close()

    def invoke(self, packet):
        if packet.data is None:
            return packet
        return self.transform(packet)

    def transform(self, packet):
        packet.data = self.xslt_obj(packet.data)
        log.info("XSLT Transform OK")
        return packet
![Page 38: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/38.jpg)
Your Own Components

Step 1 - Define Class

class MyFilter(Filter):
    # Constructor
    def __init__(self, configdict, section):
        Filter.__init__(self, configdict, section, consumes=FORMAT.etree_doc, produces=FORMAT.etree_doc)

    def invoke(self, packet):
        log.info("CALLING MyFilter OK!!!!")
        return packet

Step 2 - Configure Class

[etl]
chains = input_xml_file|my_filter|output_std

[input_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = input/cities.xml

# My custom component
[my_filter]
class = my.myfilter.MyFilter

[output_std]
class = outputs.standardoutput.StandardXmlOutput
![Page 39: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/39.jpg)
Data Structures
• Components exchange Packets
• Packet contains data and status
• Data formats, e.g.:
  xml_line_stream
  etree_doc
  etree_element (feature)
  etree_element_array
  string
  any
  ..
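Since each component declares what it consumes and produces (see `consumes=` and `produces=` in the filter examples), neighbouring components in a chain can be checked for compatibility before any data flows. The sketch below illustrates that check; the `Packet` and `Component` classes, the `end_of_stream` status field, and `check_chain` are hypothetical and only borrow the format names listed above.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Packet:
    """Hypothetical packet: payload, its format name, and a status flag."""
    data: Any = None
    format: str = "any"
    end_of_stream: bool = False


@dataclass
class Component:
    """Hypothetical component description with declared formats."""
    name: str
    consumes: str
    produces: str


def check_chain(components: List[Component]) -> List[str]:
    """Return one message per neighbouring pair whose formats do not match."""
    errors = []
    for upstream, downstream in zip(components, components[1:]):
        if downstream.consumes not in ("any", upstream.produces):
            errors.append(f"{upstream.name} produces {upstream.produces}, "
                          f"but {downstream.name} consumes {downstream.consumes}")
    return errors


good_chain = [Component("xml_input", "none", "etree_doc"),
              Component("xslt_filter", "etree_doc", "etree_doc"),
              Component("std_output", "any", "none")]
bad_chain = [Component("line_input", "none", "xml_line_stream"),
             Component("xslt_filter", "etree_doc", "etree_doc")]

errors_good = check_chain(good_chain)
errors_bad = check_chain(bad_chain)
print(errors_good, errors_bad)
```

Catching a mismatch at configuration time is cheaper than discovering it after part of a large dataset has already been processed.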
![Page 40: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/40.jpg)
Cases
![Page 41: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/41.jpg)
![Page 42: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/42.jpg)
Cases - GML to PostGIS
• National Dutch Open Datasets (GML) http://nlextract.nl
  ✴ Topography: Top10NL, BGT
  ✴ Cadastral Parcels (BRK)
• Ordnance Survey UK
  ✴ PoC Topography (OS MasterMap)
![Page 43: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/43.jpg)
Cases - IoT/SensorWeb
• SOSPilot
  ✴ Dutch Air Quality Data
  ✴ publish to PostGIS+SOS (SOS-T)
  ✴ EU Reporting (Jinja2 Filter)
  ✴ sospilot.geonovum.nl
• Smart Emission
  ✴ AQ Sensors hosted by citizens
  ✴ Calibration/Aggregation
  ✴ publication to SOS and OGC SensorThings
  ✴ data.smartemission.nl
![Page 44: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/44.jpg)
Project Status - June 24, 2016
• v1.0.9 installable via PyPI or Docker
• Documentation on www.stetl.org
• Real world transforms done
![Page 45: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/45.jpg)
Project - Planned & in progress
• PyWPS integration
• GUI
  - Flask+Celery+Redis?
  - Node-RED?
  - Jupyter Notebook?
• More on GitHub https://github.com/geopython/stetl
![Page 46: Geospatial ETL with Stetl - GeoPython 2016](https://reader034.fdocuments.net/reader034/viewer/2022050614/5879cf131a28ab842c8b4c13/html5/thumbnails/46.jpg)
Thank You !
www.stetl.org
github.com/geopython/stetl