Edward King SPEDDEXES 2014

IMOS AODAAC – gridded data access Australian Oceans Data Access & Archive Centre MARINE & ATMOSPHERIC RESEARCH Edward King | IMOS Satellite Remote Sensing Facility Leader with Matt Paget (TERN/AusCover) + Ken Suber + historical others 16 March 2014

description

Australian Oceans Distributed Active Archive Centre (AODAAC): Gridded data extraction

Transcript of Edward King SPEDDEXES 2014

Page 1: Edward King SPEDDEXES 2014

IMOS AODAAC – gridded data access
Australian Oceans Data Access & Archive Centre

MARINE & ATMOSPHERIC RESEARCH

Edward King | IMOS Satellite Remote Sensing Facility Leader
with Matt Paget (TERN/AusCover) + Ken Suber + historical others

16 March 2014

Outline

• Our problem
• OPeNDAP as a means to a solution
• What we did
• Implementation
• Lessons Learned
• Opportunities

National Satellite Data Reception Network

• Distributed data archives
• Variety of formats
• Variety of data managers
• Range of sampling types
• Big data sets
• Resource-poor users
• Range of user capabilities
• Need to make discovery and access easier, much easier

Rectangular Grids

• “Implicit geolocation” – can compute pixel lon/lat from grid indices via linear functions
• Straightforward

[Figure: a rectangular grid with axes Pixel (x) / Longitude and Line (y) / Latitude]
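As a minimal sketch of what “implicit geolocation” means in practice, the linear mapping between grid indices and lon/lat can be written as below. The class name, field names and the example grid origin and spacing are assumptions made for illustration; they are not taken from any AODAAC data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegularGrid:
    """Hypothetical description of a rectangular lon/lat grid."""
    lon0: float  # longitude of the centre of pixel x = 0 (deg East)
    lat0: float  # latitude of the centre of line y = 0 (deg North)
    dlon: float  # longitude step per pixel
    dlat: float  # latitude step per line (negative if line 0 is northernmost)

    def lonlat(self, x, y):
        """Linear function: grid indices (x, y) to (lon, lat)."""
        return (self.lon0 + x * self.dlon, self.lat0 + y * self.dlat)

    def index(self, lon, lat):
        """Inverse linear function: (lon, lat) to the nearest grid indices."""
        return (round((lon - self.lon0) / self.dlon),
                round((lat - self.lat0) / self.dlat))


# Assumed example grid: 0.05 degree cells, line 0 at the northern edge.
grid = RegularGrid(lon0=112.0, lat0=-10.0, dlon=0.05, dlat=-0.05)
```

Because both directions are closed-form, a user region of interest can be turned into index ranges without reading any geolocation data from the file.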

Swath Data

• So-called “satellite projection”
• Explicit geolocation – lat/lon are lookup tables
• Very important use case for remote sensing
• More difficult case – each is unique

[Figure: a swath file with three groups of arrays: Imagery (Channel 1, Channel 2, Cloud Mask, Quality Flags), Latitude and Longitude; data types shown are float, float and integer]
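With explicit geolocation there is no formula to invert, so a region of interest has to be found by searching the lookup tables. The sketch below shows one simple (brute-force) way to do that, assuming the latitude and longitude tables are available as nested lists indexed [line][pixel]; the function name and the tiny example swath are hypothetical.

```python
def roi_index_window(lat, lon, south, north, west, east):
    """Return ((line_min, line_max), (pixel_min, pixel_max)) enclosing every
    swath pixel whose lookup-table position falls inside the region of
    interest, or None when no pixel does."""
    lines, pixels = [], []
    for y, (lat_row, lon_row) in enumerate(zip(lat, lon)):
        for x, (la, lo) in enumerate(zip(lat_row, lon_row)):
            if south <= la <= north and west <= lo <= east:
                lines.append(y)
                pixels.append(x)
    if not lines:
        return None
    return (min(lines), max(lines)), (min(pixels), max(pixels))


# Assumed 3 x 3 swath, slightly skewed relative to the lon/lat axes.
swath_lat = [[-10.0, -10.1, -10.2],
             [-11.0, -11.1, -11.2],
             [-12.0, -12.1, -12.2]]
swath_lon = [[110.0, 111.0, 112.0],
             [110.2, 111.2, 112.2],
             [110.4, 111.4, 112.4]]
window = roi_index_window(swath_lat, swath_lon,
                          south=-12.15, north=-10.5, west=111.1, east=112.3)
```

The returned window is rectangular in index space, so it can contain pixels outside the region of interest; this sketch also ignores regions that cross the dateline.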

Non-rectangular projections

• “Map-based” higher level products
• Lon/lat is an analytic (non-linear) function of grid indices
• E.g. Mercator projection

[Figure: Mercator example with axes Lat, Lon, Proj_x and Proj_y, showing the forward transform (lon, lat) to (x, y) and the inverse transform (x, y) to (lon, lat)]
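For the Mercator example, the forward and inverse transforms can be sketched with the standard spherical Mercator formulas. The sphere radius, the central meridian default and the function names are assumptions for illustration; a real product would carry its own projection parameters.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed spherical Earth radius, in metres


def mercator_forward(lon_deg, lat_deg, lon0_deg=0.0, radius=EARTH_RADIUS_M):
    """Forward transform: (lon, lat) in degrees to projected (x, y) in metres."""
    x = radius * math.radians(lon_deg - lon0_deg)
    y = radius * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_inverse(x, y, lon0_deg=0.0, radius=EARTH_RADIUS_M):
    """Inverse transform: projected (x, y) in metres to (lon, lat) in degrees."""
    lon_deg = lon0_deg + math.degrees(x / radius)
    lat_deg = math.degrees(2 * math.atan(math.exp(y / radius)) - math.pi / 2)
    return lon_deg, lat_deg
```

Projected coordinates are then related to grid indices by a linear step, so the full index to lon/lat mapping is the composition of a linear function and the non-linear inverse transform.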

Data Access Protocol

• Conceived by oceanographers in 1993 (when the WWW was 4) as the Distributed Oceanographic Data System (DODS), now OPeNDAP
• Designed to be as general as possible without being constrained to a particular discipline or world view
• It is a data model: an abstraction for describing data
• It is a transport mechanism
    • Layered over HTTP
    • Anywhere the web can go, DAP is sure to (be able to) follow
    • And a browser can be a client
• Data servers
    • Respond to specially formed URLs
    • Expose data AND metadata
    • Return requested elements encapsulated within DAP
    • Hyrax & TDS (THREDDS Data Server)
• Clients
    • Create requests
    • Unpack and use data that is returned within the DAP

Workflow

[Figure: a Data File is read through filesystem access by a DAP Server, which maps it to DAP; a DAP Client sends requests and receives DAP responses, then either writes the result to netCDF or uses it in computation]

Examples listed in the figure:

Formats
• netCDF
• HDF4/5
• Grib
• “Freeform”

DAP objects
• Grids
• Sequences
• Structures

Client Libraries
• C
• Java
• Python
• Matlab

OPeNDAP Transport Layer: A Data Standardisation Bus

[Figure: three kinds of data source (Reception Station and Product Generation; International Data via Internet/Tape; Model/Data Synthesis), each with a Local Data Store behind its own OPeNDAP Server, all reachable over the Internet. Example sites shown: Curtin U, iVEC, UTAS, CMAR (Canberra); AIMS, BoM, GA, CMAR (Hobart); UTAS, Curtin U, CMAR (Hobart)]

Multi-tiered design – based on TPAC Digital Library

[Figure: multi-tier architecture with Client User Applications, URL Crawler & Metadata Harvester, Spatial Database, Web Query Service and OPeNDAP Servers, joined by an OPeNDAP Interface and a Web Service Interface]

Complete System (Version 2): Can be fully distributed

[Figure: system components and numbered data flow (steps 0 to 6). Components: OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); WQS Client (Java app); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. One component is marked “(replicated)”. Labelled flows: 3 – XML; 4 – DAP; 5 – netCDF files]

Aggregator – a system client

• Accepts a list of URLs and various metadata codified as XML returned by the web query service
• Computes necessary index ranges to create DAP constraint URLs
• Reads data from each URL and combines (aggregates) each data array into one (or more) arrays or files for output to the user
• Writes the data output file (netCDF)
• Framework supports post-processing filters on netCDF file
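The “compute index ranges, then create DAP constraint URLs” step can be sketched as follows for the simple case of a regular lon/lat grid. This is not the aggregator's actual code: the function names, the grid origin and spacing, and the example.org URL are all assumptions for illustration. Only the [start:stride:stop] subscript form is standard DAP syntax.

```python
import math


def index_range(origin, step, n, lo, hi):
    """Inclusive (first, last) indices of the cell centres lying in [lo, hi],
    clipped to the grid, or None if the interval misses the grid."""
    a = (lo - origin) / step
    b = (hi - origin) / step
    if a > b:  # a negative step reverses the order of the two bounds
        a, b = b, a
    first = max(0, math.ceil(a - 1e-9))      # small tolerance for rounding
    last = min(n - 1, math.floor(b + 1e-9))
    return (first, last) if first <= last else None


def constraint_url(dataset_url, var, ranges, response="ascii"):
    """Build a DAP request URL with one [start:stride:stop] per dimension."""
    subscripts = "".join(f"[{first}:1:{last}]" for first, last in ranges)
    return f"{dataset_url}.{response}?{var}{subscripts}"


# Assumed grid: 813 longitudes from 100.0 E, 670 latitudes from 10.0 S,
# both with 0.25 degree spacing (latitude decreasing with line number).
lon_range = index_range(100.0, 0.25, 813, 150.0, 151.25)
lat_range = index_range(-10.0, -0.25, 670, -51.25, -50.0)
url = constraint_url("http://example.org/opendap/file.nc", "Wrel2",
                     [(0, 0), lat_range, lon_range])
```

For swath data the index ranges cannot be computed this way and have to come from the geolocation lookup tables instead.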

User Experience – Low level

• Initiate data request via web call (CGI script)
• Returns a JSON fragment with “handle” (URL)
• Use to examine progress and ultimately get links to output netCDF and log files
• Can also return as JSON for easy machine interface
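A machine client for this request-then-poll pattern might look like the sketch below. The slides do not give the JSON layout, so the "status" and "files" field names, the status values and the handle URL are purely hypothetical; the fetch function is passed in so the sketch does not assume any particular HTTP library.

```python
import json
import time


def wait_for_job(handle_url, fetch, poll_seconds=30.0, max_polls=120,
                 sleep=time.sleep):
    """Poll a job handle until the job reports that it has finished.

    fetch(url) must return the JSON text found at url. The "status" key and
    its values are assumptions, not the real AODAAC response format.
    """
    for _ in range(max_polls):
        report = json.loads(fetch(handle_url))
        if report.get("status") in ("done", "failed"):
            return report
        sleep(poll_seconds)  # jobs are not expected to finish in "web-time"
    raise TimeoutError(f"job at {handle_url} did not finish")


# Stand-in for a real HTTP fetch: two canned replies, no waiting.
_replies = iter(['{"status": "running"}',
                 '{"status": "done", "files": ["out.nc", "job.log"]}'])
report = wait_for_job("http://example.org/job/1",
                      fetch=lambda url: next(_replies),
                      sleep=lambda seconds: None)
```

In real use, fetch could be a small wrapper around urllib.request.urlopen that returns the decoded response body.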

User Experience – higher level

• Machine interface supports simple web front-end
• Or full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in “web-time” (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn’t have native support for time, lon, lat etc. (no OGC fluff, so…)
    http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

Geospatial model

[Figure: Imagery, Latitude and Longitude arrays]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs ‘helper’ config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
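For files with explicit geolocation, extracting a spatial bounding box amounts to scanning the variables that have been nominated as latitude and longitude. The sketch below assumes those arrays have already been read into nested lists and that unusable positions are flagged with a fill value; the function name and the -9999 default are assumptions, not the harvester's actual behaviour.

```python
def bounding_box(lat, lon, missing=-9999.0):
    """Return the bounding box of all valid (lat, lon) positions as a dict
    with south/north/west/east keys, or None if nothing is valid."""
    points = [(la, lo)
              for lat_row, lon_row in zip(lat, lon)
              for la, lo in zip(lat_row, lon_row)
              if la != missing and lo != missing]
    if not points:
        return None
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return {"south": min(lats), "north": max(lats),
            "west": min(lons), "east": max(lons)}


# Assumed 2 x 3 geolocation arrays with one unusable position.
box = bounding_box([[-10.0, -10.1, -9999.0], [-11.0, -11.1, -11.2]],
                   [[110.0, 111.0, -9999.0], [110.2, 111.2, 112.2]])
```

A simple min/max like this is wrong for swaths that cross the dateline or a pole, which is one reason each data set can need its own helper configuration.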

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
    • We were on the verge of adding remapping too (!)
• All these features can be done as a post-filter, à la Unix philosophy of “do one thing and do it well”. V2 supports this approach
• Web service with Tomcat was legacy of TPAC original. Would be simpler just as a standalone Java app (and use CGI not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don’t use a giant software project to learn a new language

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL ‘select’s and the aggregator takes the rest of the load (very necessary for swath data).

[Figure: the Version 2 system diagram, repeated: OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet; numbered flows 0 to 6, including 3 – XML, 4 – DAP and 5 – netCDF files]

Opportunities + Shortcomings

• This may meet some of your needs
    • (including warning you what not to do)
• It could be a lot more useful with some back-end filters such as
    • Format conversions (e.g. geoTIFF, csv)
    • Shape file cookie cutting
    • Statistics extraction
    • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn’t handle 4+ dimensions yet (e.g. depth or height)
• Want to make easier to deploy as an “appliance”
• V1 live for 2+ years; V2 is being integrated in IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

MARINE & ATMOSPHERIC RESEARCH

Thank you – questions

A netCDF aside…

[Figure: structure of an example netCDF file]

• X = Longitude (deg East), 813 elts
• Y = Latitude (deg North), 670 elts
• T = Time (days since xxx), 1 elt
• WRel2, “Relative Soil Moisture (lower layer)”: 1 x 670 x 813; mean in each cell, unitless; 0 <= value <= 1; missing data = -9999

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

[Screenshots: the server's directory listing in a browser, with the File Download Link and the DAP Links highlighted]

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…
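The links on such a listing are the dataset URL with a response suffix appended, which is easy to reproduce in a client. The sketch below covers the common DAP2 suffixes as served by Hyrax-style servers; the helper name and the example.org dataset URL are assumptions for illustration.

```python
# Common DAP2 response suffixes and what each one returns.
DAP2_RESPONSES = {
    "dds": "data structures: variable names, types and array shapes",
    "das": "data attributes: the metadata attached to each variable",
    "ddx": "XML package combining structures and attributes",
    "ascii": "data values as plain text",
    "dods": "data values in DAP2 binary form",
    "html": "browser form for building a request",
}


def response_url(dataset_url, kind, constraint=""):
    """Append a DAP2 response suffix (and optional constraint) to a dataset URL."""
    if kind not in DAP2_RESPONSES:
        raise ValueError(f"unknown DAP2 response type: {kind}")
    url = f"{dataset_url}.{kind}"
    return f"{url}?{constraint}" if constraint else url


dataset = "http://example.org/opendap/file.nc"  # hypothetical dataset URL
dds_url = response_url(dataset, "dds")
data_url = response_url(dataset, "ascii", "Wrel2[0:1:0][160:1:165][200:1:205]")
```

Fetching the DDS and DAS first is how a client discovers the array shapes and attributes it needs before it can form a sensible data request.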

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

Reading the request from left to right: Format (ascii), Var (Wrel2), Time ([0:1:0]), Latitude ([160:1:165]), Longitude ([200:1:205])

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
• http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
• cf. Fortran: float Wrel2(1,670,813)
• E.g. can’t naturally specify a bounding box (cf. WMS)
• This is both
    • A weakness: geospatial handling requires extra work
    • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
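To make the point concrete, a constraint expression can be parsed with nothing more than a regular expression, and all that comes out is a name and integer subscripts. The sketch below handles the three DAP2 subscript forms ([index], [start:stop] and [start:stride:stop]); the function names are assumptions for illustration.

```python
import re

# One hyperslab: [index], [start:stop] or [start:stride:stop].
_SLAB = re.compile(r"\[(\d+)(?::(\d+))?(?::(\d+))?\]")


def parse_projection(expr):
    """Split e.g. 'Wrel2[0:1:0][160:1:165]' into a variable name and a list
    of (start, stride, stop) tuples, one per dimension."""
    name, bracket, rest = expr.partition("[")
    slabs = []
    for match in _SLAB.finditer(bracket + rest):
        numbers = [int(g) for g in match.groups() if g is not None]
        if len(numbers) == 1:
            start, stride, stop = numbers[0], 1, numbers[0]
        elif len(numbers) == 2:
            start, stride, stop = numbers[0], 1, numbers[1]
        else:
            start, stride, stop = numbers
        slabs.append((start, stride, stop))
    return name, slabs


def subset_shape(slabs):
    """Number of elements requested along each dimension."""
    return tuple((stop - start) // stride + 1 for start, stride, stop in slabs)


name, slabs = parse_projection("Wrel2[0:1:0][160:1:165][200:1:205]")
```

Nothing in the parsed result says which dimension is time, latitude or longitude; that knowledge has to come from the metadata held around the DAP layer.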

Page 2: Edward King SPEDDEXES 2014

Outline

bull Our problembull OPeNDAP as a means to a solutionbull What we didbull Implementationbull Lessons Learnedbull Opportunities

National Satellite Data Reception Network

bull Distributed data archivesbull Variety of formatsbull Variety of data managersbull Range of sampling typesbull Big data setsbull Resource-poor usersbull Range of user capabilities

bull Need to make discovery and access easier much easier

Rectangular Grids

bull ldquoimplicit geolocationrdquo ndash can compute pixel lonlat from grid indices via linear functionsbull Straightforward

Latitude

Pixel (x)

Longitude

Line(y)

Swath Databull So-called ldquosatellite projectionrdquobull Explicit geolocation ndash latlon are lookup tablesbull Very important use case for remote sensing usebull More difficult case ndash each is unique

Imagery Latitude Longitude

Channel 1

Channel 2

Cloud Mask

Quality Flags

float

float

integer

Lat

Lon

Proj

_y

Non-rectangular projectionsbull ldquoMap-basedrdquo higher level productsbull LonLat is an analytic (non-linear) functions of grid indicesbull Eg Mercator Projection

Forward transform (lonlat) to (xy)

Inverse transform (xy) to (lonlat))

Proj_x

Data Access Protocol bull conceived by oceanographers in 1993 (when the

www was 4) as the Distributed Oceanographic Data System ndash DODS now OPeNDAP

bull designed to be as general as possible without being constrained to a particular discipline or world view

bull It is a data model - An abstraction for describing databull It is a transport mechanism

bull Layered over HTTPbull Anywhere the web can go DAP is sure to (be able to) follow bull And a browser can be a client

bull Data serversbull Respond to specially formed URLsbull Expose data AND metadata bull Return requested elements encapsulated within DAPbull Hyrax amp TDS (THREDDS Data Server)

bull Clientsbull Create requestsbull Unpack and use data that is returned within the DAP

Workflow

Data File

DAPServer

DAPClient

Requests

DAP ResponsesMappingTo DAP

Write to netCDF

Use in computation

eg

FilesystemAccess

DAP object

bull Grids

bull Sequences

bull Structures

Formats

bull netCDF

bull HDF45

bull Grib

bull ldquoFreeformrdquo

Client Libraries

bull Cbull Java

bull Python

bull Matlab

or

OPeNDAP Transport Layer A Data Standardisation Bus

OPeNDAP Server

Local Data Store

OPeNDAP Server

Local Data Store

OPeNDAP Server

Local Data Store

Reception Station and

Product Generation

International Data via InternetTape ModelData

Synthesis

Internet

eg Curtin U iVEC UTAS CMAR (Canberra)

eg AIMS BoM GA CMAR (Hobart)

eg UTAS Curtin U CMAR (Hobart)

Multi-tiered design ndash based on TPAC Digital Library

Client User Applications

URL Crawler amp Metadata+Harvester

Spatial Database

Web Query Service

OPeNDAP Servers

OPeNDAP Interface Web Service Interface

(replicated)

Complete System (Version 2)Can be fully distributed

Presentation title | Presenter name11 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

1

WQS Client (Java app)

3 XML

(replicated)

Complete System (Version 2)Can be fully distributed

Presentation title | Presenter name12 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

1

Job Controller (Python)

Web Server (Apache)

Temporary Data Store

06

WQS Client (Java app)

3 XML

Aggregator ndash a system clientbull accepts a list of URLs and various metadata codified as XML

returned by the web query service bull Computes necessary index ranges to create DAP constraint URLsbull reads data from each URL and combines (aggregates) each data

array into one (or more) arrays or files for output to the user bull writes the data output file (netCDF)bull Framework supports post-processing filters on netCDF file

T

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name14 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name15 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

bull Use to examine progress and ultimately get links to output netCDF and log filesbull Can also return as JSON for easy machine

interface

User Experience ndash higher level

bull Machine interface supports simple web front-endbull Or full portal bull Or distributed clients in a

clusterbull NOTE we do NOT attempt to

deliver data in ldquoweb-timerdquo (which is not a realistic objective for GB-scale data systems)

Presentation title | Presenter name16 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name17 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name18 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name19 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205]

Presentation title | Presenter name20 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around it

Presentation title | Presenter name21 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the database

Presentation title | Presenter name22 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the databasebull It is not necessarily that DAP is the wrong solution it just means

the hard part of the problem is not data volume but metadata

Presentation title | Presenter name23 |

Metadata Harvester

Presentation title | Presenter name24 |

Imagery Latitude Longitude

bull Has to extract spatial bounding boxes from all these different types of filebull Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) ndash each data set needs lsquohelperrsquo config filesbull These can be maintained by the data provider OR the AODAAC adminbull And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model

Presentation title | Presenter name25 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

Presentation title | Presenter name26 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quickly

Presentation title | Presenter name27 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)

Presentation title | Presenter name28 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

Presentation title | Presenter name29 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

Presentation title | Presenter name30 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

Presentation title | Presenter name31 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) help

Presentation title | Presenter name32 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essential

Presentation title | Presenter name33 |

Lessons Learnedbull DAP performs well but you need to build a fair bit of infrastructure to make it

general (V1 system only handled lonlat grids)bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essentialbull Donrsquot use a giant software project to learn a new language

Presentation title | Presenter name34 |

(replicated)

Distribute the computing and modularise ndash V1 had a lot (more) of the compute in the WQS which became a bottleneck In V2 the WQS is just a series of SQL lsquoselectrsquos and the aggregator takes the rest of the load (very necessary for swath data)

Presentation title | Presenter name35 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

Job Controller (Python)

Web Server (Apache)

Temporary Data Store

1 0

3XML

6

Opportunities + Shortcomingsbull This may meet some of your needs bull (including warning you what not to do)bull It could be a lot more useful with some back end filters such as

bull Format conversions (eg geoTIFF csv)bull Shape file cookie cuttingbull Statistics extractionbull Reprojecting + resampling

bull We need tools for managing our XML config files (admin tools)bull Doesnrsquot handle 4+ dimensions yet (eg depth or height)bull Want to make easier to deploy as an ldquoappliancerdquobull V1 live for 2+ years V2 is being integrated in IMOS portal nowbull Would be fun to run against the AGDC data set when it goes netCDF

Presentation title | Presenter name36 |

Marine amp Atmospheric ResearchDr Edward KingIMOS Satellite Remote Sensing Facility Leadert +61 3 6232 5334e edwardkingcsiroauw wwwcsiroaucmar

MARINE amp ATMOSPHERIC RESEARCH

Thank you ndash questions

A netCDF aside…

Variable: WRel2, “Relative Soil Moisture (lower layer)”

• Dimensions:
  • T = Time (days since xxx), 1 element
  • Y = Latitude (deg North), 670 elements
  • X = Longitude (deg East), 813 elements
• Array shape: 1 x 670 x 813
• Mean in each cell, unitless
• 0 <= value <= 1
• Missing data = -9999
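The variable described on this slide can be pictured as a small data structure. The following sketch is only illustrative: the class and field names are invented here, and it assumes a 1 x 670 x 813 shape, as implied by the sizes of the time, latitude and longitude dimensions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GridVariable:
    """Hypothetical in-memory description of a gridded netCDF variable."""
    name: str
    long_name: str
    dims: tuple          # (dimension name, size) pairs in storage order
    valid_min: float
    valid_max: float
    missing: float

    @property
    def shape(self):
        """Array shape derived from the dimension sizes."""
        return tuple(size for _, size in self.dims)

    def mask(self, values):
        """Replace missing or out-of-range values with None."""
        return [
            v if v != self.missing and self.valid_min <= v <= self.valid_max else None
            for v in values
        ]

wrel2 = GridVariable(
    name="WRel2",
    long_name="Relative Soil Moisture (lower layer)",
    dims=(("time", 1), ("latitude", 670), ("longitude", 813)),
    valid_min=0.0,
    valid_max=1.0,
    missing=-9999.0,
)
print(wrel2.shape)
print(wrel2.mask([0.2, -9999.0, 1.0]))
```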

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

[Screenshot annotations: File Download Link; DAP Links]

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…
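These links follow the DAP2 convention of appending a suffix to the dataset URL. As a rough sketch (the dataset URL below is a placeholder, and the helper name `dap_links` is made up for illustration):

```python
# DAP2 response types selected by a suffix on the dataset URL.
DAP_SUFFIXES = {
    "dds": "data structure (names, types and shapes)",
    "das": "data attributes (units, long names, fill values)",
    "ddx": "XML document combining structure and attributes",
    "ascii": "data values as text",
    "dods": "data values in DAP binary form",
}

def dap_links(dataset_url):
    """Return a mapping from response type to request URL for one dataset."""
    return {kind: f"{dataset_url}.{kind}" for kind in DAP_SUFFIXES}

links = dap_links("http://a/opendap/file.nc")
for kind, url in links.items():
    print(f"{kind:5s} {url}  # {DAP_SUFFIXES[kind]}")
```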

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

Parts of the URL, left to right:

• Format: ascii
• Var: Wrel2
• Time: [0:1:0]
• Latitude: [160:1:165]
• Longitude: [200:1:205]

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
  • e.g. can’t naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
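The extra work amounts to translating coordinates into subscripts before a request can be made. Here is a minimal sketch, assuming a regular grid whose 1-D latitude and longitude coordinate arrays are already known to the client. The grid origin, the spacing and the function names are invented for the example; only the `[start:stride:stop]` subscript syntax comes from DAP.

```python
def index_range(coords, lo, hi):
    """Return (first, last) indices of coords lying within [lo, hi]."""
    hits = [i for i, c in enumerate(coords) if lo <= c <= hi]
    if not hits:
        raise ValueError("no grid points inside the requested range")
    return hits[0], hits[-1]

def constraint_url(base, var, time_index, lats, lons, south, north, west, east):
    """Build a DAP constraint URL for one time step and a lon/lat bounding box."""
    y0, y1 = index_range(lats, south, north)
    x0, x1 = index_range(lons, west, east)
    return (f"{base}.ascii?{var}[{time_index}:1:{time_index}]"
            f"[{y0}:1:{y1}][{x0}:1:{x1}]")

# Hypothetical 0.05 degree grid: 670 latitudes (north to south), 813 longitudes.
lats = [-10.0 - 0.05 * i for i in range(670)]
lons = [112.0 + 0.05 * j for j in range(813)]
url = constraint_url("http://a/file.nc", "Wrel2", 0, lats, lons,
                     south=-18.26, north=-17.99, west=121.99, east=122.26)
print(url)
```

The bounding box only becomes meaningful once the client has the coordinate arrays, which is exactly the semantic information the URL itself does not carry.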


National Satellite Data Reception Network

• Distributed data archives
• Variety of formats
• Variety of data managers
• Range of sampling types
• Big data sets
• Resource-poor users
• Range of user capabilities

• Need to make discovery and access easier, much easier

Rectangular Grids

• “Implicit geolocation” – can compute pixel lon/lat from grid indices via linear functions
• Straightforward

[Figure: rectangular grid with axes Pixel (x) / Longitude and Line (y) / Latitude]
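The linear relationship for a rectangular grid can be written in a few lines. This is a sketch under assumed grid parameters: the origin, the pixel spacing and the function names are made up for illustration.

```python
def pixel_to_lonlat(x, y, lon0, lat0, dlon, dlat):
    """Longitude and latitude of the centre of pixel (x, y) on a regular grid."""
    return lon0 + x * dlon, lat0 + y * dlat

def lonlat_to_pixel(lon, lat, lon0, lat0, dlon, dlat):
    """Nearest pixel (x, y) to a given longitude and latitude."""
    return round((lon - lon0) / dlon), round((lat - lat0) / dlat)

# Hypothetical grid: origin at 110E, 10S, 0.5 degree pixels, lines running southwards.
grid = dict(lon0=110.0, lat0=-10.0, dlon=0.5, dlat=-0.5)
lon, lat = pixel_to_lonlat(20, 40, **grid)
print(lon, lat)
print(lonlat_to_pixel(lon, lat, **grid))
```

Four numbers describe the whole geolocation, which is why this case needs no lookup tables.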

Swath Data

• So-called “satellite projection”
• Explicit geolocation – lat/lon are lookup tables
• Very important use case for remote sensing
• More difficult case – each is unique

[Figure: swath file layout. Imagery arrays (Channel 1, Channel 2, Cloud Mask, Quality Flags; type labels float, float, integer) accompanied by Latitude and Longitude lookup arrays (Lat, Lon)]
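With explicit geolocation, finding the part of a swath that covers a region means searching the lookup tables themselves. The sketch below is a brute-force illustration on tiny invented arrays, not a description of how AODAAC does it; the function name `swath_window` is an assumption.

```python
def swath_window(lat, lon, south, north, west, east):
    """Smallest (line, pixel) index window containing every swath pixel in the box.

    lat and lon are 2-D lookup tables (lists of rows) with the same shape as
    the imagery. Returns (line0, line1, pixel0, pixel1), or None if no pixel
    falls inside the box.
    """
    hits = [
        (i, j)
        for i, row in enumerate(lat)
        for j, la in enumerate(row)
        if south <= la <= north and west <= lon[i][j] <= east
    ]
    if not hits:
        return None
    lines = [i for i, _ in hits]
    pixels = [j for _, j in hits]
    return min(lines), max(lines), min(pixels), max(pixels)

# Tiny made-up swath, skewed so that rows do not follow lines of latitude.
lat = [[-10.0, -10.5, -11.0],
       [-11.0, -11.5, -12.0],
       [-12.0, -12.5, -13.0]]
lon = [[120.0, 121.0, 122.0],
       [119.5, 120.5, 121.5],
       [119.0, 120.0, 121.0]]
window = swath_window(lat, lon, south=-12.1, north=-10.9, west=119.4, east=121.6)
print(window)
```

Every swath has its own lookup tables, so this search has to be repeated per file, unlike the rectangular-grid case.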

Non-rectangular projections

• “Map-based” higher-level products
• Lon/lat is an analytic (non-linear) function of grid indices
• E.g. Mercator projection
  • Forward transform: (lon, lat) to (x, y)
  • Inverse transform: (x, y) to (lon, lat)

[Figure: projected grid with axes Proj_x and Proj_y]
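For the Mercator example, the forward and inverse transforms have simple closed forms on a sphere. The sketch below uses the standard spherical Mercator formulas; the choice of sphere radius is an assumption, and real products may use an ellipsoidal variant instead.

```python
import math

R = 6378137.0  # sphere radius in metres (assumed; WGS84 equatorial radius)

def mercator_forward(lon_deg, lat_deg):
    """(lon, lat) in degrees to projected (x, y) in metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def mercator_inverse(x, y):
    """Projected (x, y) in metres back to (lon, lat) in degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat

x, y = mercator_forward(147.3, -42.9)
lon, lat = mercator_inverse(x, y)
print(round(lon, 6), round(lat, 6))
```

Grid indices relate linearly to (x, y), so a lon/lat region has to pass through the forward transform before it can be turned into index ranges.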

Data Access Protocol

• Conceived by oceanographers in 1993 (when the WWW was 4) as the Distributed Oceanographic Data System (DODS), now OPeNDAP
• Designed to be as general as possible, without being constrained to a particular discipline or world view
• It is a data model: an abstraction for describing data
• It is a transport mechanism
  • Layered over HTTP
  • Anywhere the web can go, DAP is sure to (be able to) follow
  • And a browser can be a client
• Data servers
  • Respond to specially formed URLs
  • Expose data AND metadata
  • Return requested elements encapsulated within DAP
  • Hyrax & TDS (THREDDS Data Server)
• Clients
  • Create requests
  • Unpack and use data that is returned within the DAP

Workflow

[Diagram: a Data File is read via filesystem access by the DAP Server, which maps it to DAP. The DAP Client sends requests and receives DAP responses, then writes the DAP object to netCDF or uses it in computation.]

• DAP objects: Grids, Sequences, Structures
• Formats (e.g.): netCDF, HDF4/5, GRIB, “Freeform”
• Client libraries: C, Java, Python, Matlab

OPeNDAP Transport Layer: A Data Standardisation Bus

[Diagram: three OPeNDAP Servers, each in front of a Local Data Store, linked by the Internet. Data source labels: Reception Station and Product Generation; International Data via Internet/Tape; Model/Data Synthesis. Example site labels: e.g. Curtin U, iVEC, UTAS, CMAR (Canberra); e.g. AIMS, BoM, GA, CMAR (Hobart); e.g. UTAS, Curtin U, CMAR (Hobart).]

Multi-tiered design – based on TPAC Digital Library

[Diagram components: Client User Applications; Web Query Service (Web Service Interface); Spatial Database; URL Crawler & Metadata Harvester; OPeNDAP Servers (OPeNDAP Interface)]

Complete System (Version 2): Can be fully distributed

[Diagram components: OPeNDAP Data Servers (replicated); URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); WQS Client (Java app); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. Numbered flows 0 to 6, with 3 labelled XML, 4 labelled DAP and 5 labelled netCDF files.]

Aggregator – a system client

• Accepts a list of URLs and various metadata, codified as XML, returned by the web query service
• Computes the necessary index ranges to create DAP constraint URLs
• Reads data from each URL and combines (aggregates) each data array into one (or more) arrays or files for output to the user
• Writes the data output file (netCDF)
• Framework supports post-processing filters on the netCDF file
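The aggregation step can be sketched as stacking the same index window from each granule in time order. This is an illustration only: the real aggregator is a Java app that reads XML and writes netCDF, whereas here the granule list is a plain Python list, `fake_fetch` stands in for a DAP request, and all names and URLs are invented.

```python
def aggregate(granules, fetch, window):
    """Combine the same spatial window from several granules into one 3-D array.

    granules: list of (time, url) pairs, as a query service might return
    fetch:    function(url, window) -> 2-D list of values for that window
    window:   (y0, y1, x0, x1) inclusive index ranges
    """
    ordered = sorted(granules)
    times = [t for t, _ in ordered]
    data = [fetch(url, window) for _, url in ordered]
    return times, data

def fake_fetch(url, window):
    """Stand-in for a DAP request; fills the window with a value taken from the URL."""
    y0, y1, x0, x1 = window
    value = float(url.rsplit("_", 1)[1].split(".")[0])
    return [[value] * (x1 - x0 + 1) for _ in range(y1 - y0 + 1)]

granules = [(2, "http://a/sst_2.nc"), (1, "http://a/sst_1.nc")]
times, data = aggregate(granules, fake_fetch, (160, 161, 200, 202))
print(times)
print(data)
```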

User Experience – Low level

• Initiate data request via web call (CGI script)
• Returns a JSON fragment with “handle” (URL)
• Use the handle to examine progress and ultimately get links to output netCDF and log files
• Can also return as JSON for easy machine interface
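A machine client of this interface would typically poll the handle until the job finishes. The sketch below shows that pattern only: the slides do not give the JSON layout, so the `status` and `files` keys, the status values and the URLs are all assumptions, and a stub replaces the real HTTP call.

```python
import json
import time

def wait_for_job(handle_url, fetch, poll_seconds=0.0, max_polls=100):
    """Poll a job handle until it reports completion, then return its output links.

    fetch is any function(url) -> JSON text, so a real HTTP client or a test
    stub can be plugged in. The "status" and "files" keys are assumptions.
    """
    for _ in range(max_polls):
        reply = json.loads(fetch(handle_url))
        if reply["status"] == "done":
            return reply["files"]
        if reply["status"] == "failed":
            raise RuntimeError(f"job failed: {handle_url}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")

# Stub server: reports "running" twice, then "done".
replies = iter([
    '{"status": "running"}',
    '{"status": "running"}',
    '{"status": "done", "files": ["http://a/jobs/42/out.nc", "http://a/jobs/42/log.txt"]}',
])
files = wait_for_job("http://a/jobs/42", lambda url: next(replies))
print(files)
```

Polling a handle fits the stated design choice of not trying to deliver data in “web-time”.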

User Experience – higher level

• Machine interface supports a simple web front-end
• Or a full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in “web-time” (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn’t have native support for time, lon, lat etc. (no OGC fluff, so…)
  http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

[Figure: swath file layout with Imagery, Latitude and Longitude arrays]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs ‘helper’ config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
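The role of a helper config can be sketched as follows. This is a simplified illustration: the helper here is a Python dict with invented keys, the variable contents are made up, and nothing below reflects the actual layout of AODAAC config files.

```python
def bounding_box(variables, helper):
    """Spatial bounding box (west, south, east, north) of one file.

    variables: mapping of variable name -> flat list of values
    helper:    per-dataset config naming the geolocation variables
    """
    fill = helper.get("fill")
    lats = [v for v in variables[helper["lat_var"]] if v != fill]
    lons = [v for v in variables[helper["lon_var"]] if v != fill]
    return min(lons), min(lats), max(lons), max(lats)

# Hypothetical helper config and file contents for a swath product.
helper = {"lat_var": "Lat", "lon_var": "Lon", "fill": -9999.0}
variables = {
    "Lat": [-10.0, -11.5, -9999.0, -13.0],
    "Lon": [120.0, 119.5, -9999.0, 121.0],
    "Channel 1": [281.2, 280.9, 0.0, 279.4],
}
bbox = bounding_box(variables, helper)
print(bbox)
```

The harvester code stays generic; only the helper changes from one data set to the next.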

Geospatial model



OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link DAP Links

Data Structures DDS linkhellip

Data Attributes DAS linkhellip

XML Package DDX linkhellip

And finally ndash datahellip

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

Format Var Time Latitude Longitude

Note absence of semantics in the exchange

bull There is nothing in the URL to impart meaning just a variable name and some subscriptshellip

bull httpafilencasciiWrel2[010][1601165][2001205]

bull cf Fortran float Wrel2(1670813) bull eg Canrsquot naturally specify a bounding box (cf WMS)bull This is both

bull A weakness geospatial handling requires extra workbull A strength not limited to geospatial domainsbull There is always some extra work to do (or assumptions to make)

in order to be able to make a meaningful request

  • IMOS AODAAC ndash gridded data access
  • Outline
  • National Satellite Data Reception Network
  • Rectangular Grids
  • Swath Data
  • Non-rectangular projections
  • Data Access Protocol
  • Workflow
  • OPeNDAP Transport Layer A Data Standardisation Bus
  • Multi-tiered design ndash based on TPAC Digital Library
  • Complete System (Version 2) Can be fully distributed
  • Complete System (Version 2) Can be fully distributed (2)
  • Aggregator ndash a system client
  • User Experience ndash Low level
  • User Experience ndash Low level (2)
  • User Experience ndash higher level
  • DAP implications
  • DAP implications (2)
  • DAP implications (3)
  • DAP implications (4)
  • DAP implications (5)
  • DAP implications (6)
  • DAP implications (7)
  • Metadata Harvester
  • Geospatial model
  • Lessons Learned
  • Lessons Learned (2)
  • Lessons Learned (3)
  • Lessons Learned (4)
  • Lessons Learned (5)
  • Lessons Learned (6)
  • Lessons Learned (7)
  • Lessons Learned (8)
  • Lessons Learned (9)
  • Distribute the computing and modularise ndash V1 had a lot (more) o
  • Opportunities + Shortcomings
  • Thank you ndash questions
  • A netCDF asidehellip
  • OPeNDAP server ndash explore via a browser
  • OPeNDAP server ndash explore via a browser (2)
  • OPeNDAP server ndash explore via a browser (3)
  • OPeNDAP server ndash explore via a browser (4)
  • Data Structures DDS linkhellip
  • Data Attributes DAS linkhellip
  • XML Package DDX linkhellip
  • And finally ndash datahellip
  • And finally ndash datahellip (2)
  • And finally ndash datahellip (3)
  • Note absence of semantics in the exchange
Page 5: Edward King SPEDDEXES 2014

Swath Data

• So-called "satellite projection"
• Explicit geolocation – lat/lon are lookup tables
• Very important use case for remote sensing
• More difficult case – each is unique

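[Figure: swath file layout, with imagery layers (Channel 1, Channel 2, Cloud Mask, Quality Flags) alongside Latitude and Longitude lookup arrays; data types shown are float, float and integer. Other labels: Lat, Lon, Proj_y.]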
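Because swath geolocation is explicit, a region of interest cannot be turned into index ranges by arithmetic; the lat/lon lookup tables have to be searched. The following is a minimal sketch of that search, not the AODAAC implementation. The function name `swath_index_window` and the sample arrays are illustrative assumptions, with small nested lists standing in for the real latitude and longitude arrays.

```python
def swath_index_window(lat, lon, south, north, west, east):
    """Return (line_min, line_max, pixel_min, pixel_max) enclosing every
    swath pixel whose lat/lon lies inside the box, or None if none does."""
    lines, pixels = [], []
    for y, (lat_row, lon_row) in enumerate(zip(lat, lon)):
        for x, (la, lo) in enumerate(zip(lat_row, lon_row)):
            if south <= la <= north and west <= lo <= east:
                lines.append(y)
                pixels.append(x)
    if not lines:
        return None
    return min(lines), max(lines), min(pixels), max(pixels)


# Hypothetical 3-line x 4-pixel swath; each line is skewed relative to the
# previous one, so lon/lat cannot be derived linearly from the indices.
LAT = [[-40.0, -40.1, -40.2, -40.3],
       [-41.0, -41.1, -41.2, -41.3],
       [-42.0, -42.1, -42.2, -42.3]]
LON = [[145.0, 146.0, 147.0, 148.0],
       [144.8, 145.8, 146.8, 147.8],
       [144.6, 145.6, 146.6, 147.6]]

window = swath_index_window(LAT, LON, -42.5, -40.5, 145.5, 147.0)
```

The returned window is a rectangular block of indices, so it will generally include some pixels outside the requested box; a real system would still need to mask those after retrieval.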

Non-rectangular projections

• "Map-based" higher level products
• Lon/lat is an analytic (non-linear) function of grid indices
• E.g. Mercator projection
  • Forward transform: (lon, lat) to (x, y)
  • Inverse transform: (x, y) to (lon, lat)

[Figure: projected grid with axes Proj_x and Proj_y.]
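To illustrate what "analytic (non-linear) function" means here, the sketch below writes out the forward and inverse transforms for a spherical Mercator projection. It is an illustration only: the sphere radius `R` is an assumption, and the slides do not say which Mercator variant or datum the products use.

```python
import math

R = 6378137.0  # assumed sphere radius in metres (not taken from the slides)


def mercator_forward(lon_deg, lat_deg):
    """(lon, lat) in degrees -> projected (x, y) in metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_inverse(x, y):
    """Projected (x, y) in metres -> (lon, lat) in degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat


x, y = mercator_forward(147.0, -42.0)
lon, lat = mercator_inverse(x, y)
```

Grid indices then follow from (x, y) by a linear step, so a user's lon/lat region maps to index ranges through the forward transform rather than through a lookup table.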

Data Access Protocol

• Conceived by oceanographers in 1993 (when the WWW was 4) as the Distributed Oceanographic Data System (DODS), now OPeNDAP
• Designed to be as general as possible without being constrained to a particular discipline or world view
• It is a data model: an abstraction for describing data
• It is a transport mechanism
  • Layered over HTTP
  • Anywhere the web can go, DAP is sure to (be able to) follow
  • And a browser can be a client
• Data servers
  • Respond to specially formed URLs
  • Expose data AND metadata
  • Return requested elements encapsulated within DAP
  • Hyrax & TDS (THREDDS Data Server)
• Clients
  • Create requests
  • Unpack and use data that is returned within the DAP
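The "specially formed URLs" are the dataset URL with a suffix naming the response type, optionally followed by a constraint expression. A minimal sketch of how a client might assemble them follows. The dataset URL is a made-up placeholder, and the `.ddx` and `.ascii` suffixes are conveniences offered by servers such as Hyrax and TDS rather than something every DAP server is guaranteed to provide.

```python
def dap_request_urls(dataset_url, constraint=""):
    """Build the usual DAP request URLs for one dataset.

    dds   -> data structure (variables, types, dimensions)
    das   -> data attributes (units, long names, fill values)
    ddx   -> XML document combining structure and attributes
    ascii -> requested values as text (handy in a browser)
    dods  -> requested values as a binary DAP2 data response
    """
    query = "?" + constraint if constraint else ""
    suffixes = ("dds", "das", "ddx", "ascii", "dods")
    return {s: f"{dataset_url}.{s}{query}" for s in suffixes}


DATASET = "http://example.org/opendap/file.nc"  # hypothetical dataset URL
urls = dap_request_urls(DATASET, "Wrel2[0:1:0][160:1:165][200:1:205]")
```

Because the requests are plain HTTP GETs, any of these URLs can be pasted into a browser, which is what makes a browser a usable DAP client.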

Workflow

[Figure: a Data File is read via filesystem access by a DAP Server, which maps it to DAP; a DAP Client sends requests and receives DAP responses, then writes to netCDF or uses the data in computation.]

• DAP objects: Grids, Sequences, Structures
• Formats: netCDF, HDF4/5, GRIB, "Freeform"
• Client libraries: C, Java, Python, Matlab

OPeNDAP Transport Layer: A Data Standardisation Bus

[Figure: three Local Data Stores, each behind its own OPeNDAP Server, connected over the Internet. Data sources shown: Reception Station and Product Generation; International Data via Internet/Tape; Model Data Synthesis. Example sites listed: Curtin U, iVEC, UTAS, CMAR (Canberra); AIMS, BoM, GA, CMAR (Hobart); UTAS, Curtin U, CMAR (Hobart).]

Multi-tiered design – based on TPAC Digital Library

[Figure: tiered design. Components: Client User Applications; URL Crawler & Metadata Harvester; Spatial Database; Web Query Service; OPeNDAP Servers. Interfaces: OPeNDAP Interface; Web Service Interface.]

Complete System (Version 2): can be fully distributed

[Figure: system architecture. Components: OPeNDAP Data Servers (replicated); URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); WQS Client (Java app); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. Numbered flows 0 to 6, including 3 (XML), 4 (DAP) and 5 (netCDF files).]

Aggregator – a system client

• Accepts a list of URLs and various metadata, codified as XML, returned by the web query service
• Computes the index ranges necessary to create DAP constraint URLs
• Reads data from each URL and combines (aggregates) each data array into one (or more) arrays or files for output to the user
• Writes the data output file (netCDF)
• Framework supports post-processing filters on the netCDF file
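For the simplest case, a regular lon/lat grid, the "compute index ranges" step can be sketched as below. This is an illustration under stated assumptions, not the aggregator's code: the grid origin, step and size, the dataset URL, and the function names are all hypothetical, and swath or projected data would need a lookup or a projection transform instead.

```python
import math


def index_range(origin, step, count, lo, hi):
    """Inclusive index range covering [lo, hi] on a regular axis where
    coordinate(i) = origin + i * step. Returns None if there is no overlap.
    A negative step (e.g. latitude running north to south) is allowed."""
    # Round to absorb floating-point noise from the division.
    a = round((lo - origin) / step, 9)
    b = round((hi - origin) / step, 9)
    first = max(0, math.ceil(min(a, b)))
    last = min(count - 1, math.floor(max(a, b)))
    return (first, last) if first <= last else None


def constraint_url(dataset_url, var, ranges, fmt="ascii"):
    """Append a DAP hyperslab constraint, [start:stride:stop] per dimension."""
    slabs = "".join(f"[{first}:1:{last}]" for first, last in ranges)
    return f"{dataset_url}.{fmt}?{var}{slabs}"


# Hypothetical grid description: (origin, step, number of elements).
LON = (112.0, 0.05, 813)
LAT = (-10.0, -0.05, 670)

x_range = index_range(*LON, 122.0, 122.25)
y_range = index_range(*LAT, -18.25, -18.0)
url = constraint_url("http://example.org/opendap/file.nc", "Wrel2",
                     [(0, 0), y_range, x_range])
```

With these assumed grid parameters the bounding box resolves to the index ranges `[160:1:165]` and `[200:1:205]`, the same shape of request the aggregator sends to each data server.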

User Experience – Low level

• Initiate data request via web call (CGI script)
• Returns a JSON fragment with a "handle" (URL)
• Use the handle to examine progress and ultimately get links to the output netCDF and log files
• Can also return as JSON for easy machine interface

User Experience – higher level

• Machine interface supports a simple web front-end
• Or a full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in "web-time" (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn't have native support for time, lon, lat etc. (no OGC fluff, so…)
  http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

[Figure: swath file layout (Imagery, Latitude, Longitude).]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs 'helper' config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
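The role of a 'helper' config can be sketched as follows: it nominates which variables hold latitude and longitude, and the harvester reduces those arrays to a bounding box for the spatial database. Everything here is a hypothetical illustration. The real helper files are XML and their schema is not shown in the slides, so the dictionary keys, variable names and sample values are assumptions.

```python
def bounding_box(variables, helper):
    """Reduce the nominated lat/lon arrays of one file to a bounding box.

    variables: dict mapping variable name -> 2-D nested list of values
    helper:    dict naming the geolocation variables and the missing value
    """
    missing = helper.get("missing")

    def valid(name):
        return [v for row in variables[name] for v in row if v != missing]

    lats = valid(helper["lat_var"])
    lons = valid(helper["lon_var"])
    return {"south": min(lats), "north": max(lats),
            "west": min(lons), "east": max(lons)}


# Hypothetical helper config and a tiny stand-in for one swath file.
HELPER = {"lat_var": "Latitude", "lon_var": "Longitude", "missing": -9999.0}
GRANULE = {
    "Latitude": [[-40.0, -40.1], [-41.0, -9999.0]],
    "Longitude": [[145.0, 146.0], [144.8, -9999.0]],
}

bbox = bounding_box(GRANULE, HELPER)
```

This naive min/max does not handle data that crosses the antimeridian or a pole, which is one reason the harvester is harder than it first looks.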

Geospatial model


Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too
• All these features can be done as a post-filter, a la the Unix philosophy of "do one thing and do it well". V2 supports this approach
• Web service with Tomcat was a legacy of the TPAC original. Would be simpler just as a standalone Java app (and use CGI not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don't use a giant software project to learn a new language
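The post-filter idea can be sketched as small independent functions that each take the extracted grid and produce one derived product. This is an illustration only: an in-memory nested list stands in for the aggregator's netCDF output, and the filter names, coordinates and the -9999 missing-data sentinel are assumptions made for the example.

```python
import csv
import io

MISSING = -9999.0  # assumed missing-data sentinel


def stats_filter(grid, missing=MISSING):
    """Summary statistics over the valid cells of a 2-D grid."""
    values = [v for row in grid for v in row if v != missing]
    return {"count": len(values), "min": min(values),
            "max": max(values), "mean": sum(values) / len(values)}


def csv_filter(grid, lats, lons, missing=MISSING):
    """Flatten a 2-D grid to lat,lon,value CSV text, skipping missing cells."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon", "value"])
    for lat, row in zip(lats, grid):
        for lon, value in zip(lons, row):
            if value != missing:
                writer.writerow([lat, lon, value])
    return out.getvalue()


# Hypothetical 2 x 2 extract with one missing cell.
GRID = [[0.25, MISSING], [0.5, 0.75]]
LATS = [-18.0, -18.05]
LONS = [122.0, 122.05]

summary = stats_filter(GRID)
table = csv_filter(GRID, LATS, LONS)
```

Each filter knows nothing about DAP or the query service, so new output formats can be added without touching the extraction path.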

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL 'select's and the aggregator takes the rest of the load (very necessary for swath data).

[Figure: Version 2 system architecture. Components: OPeNDAP Data Servers (replicated); URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. Numbered flows 0 to 6, including 3 (XML), 4 (DAP) and 5 (netCDF files).]
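A query service that is "just a series of SQL selects" might look like the sketch below: one select returns the URLs of files whose bounding box intersects the user's region, and everything else is left to the aggregator. This is a stand-in, not the real schema. It uses Python's built-in `sqlite3` with plain bounding-box columns in place of PostGIS geometry, and the table name, column names and URLs are hypothetical.

```python
import sqlite3


def open_catalogue(granules):
    """Create an in-memory catalogue of (url, south, north, west, east)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE granule "
        "(url TEXT, south REAL, north REAL, west REAL, east REAL)"
    )
    conn.executemany("INSERT INTO granule VALUES (?, ?, ?, ?, ?)", granules)
    return conn


def urls_for_roi(conn, south, north, west, east):
    """Select URLs of granules whose bounding box intersects the ROI."""
    rows = conn.execute(
        "SELECT url FROM granule "
        "WHERE north >= ? AND south <= ? AND east >= ? AND west <= ? "
        "ORDER BY url",
        (south, north, west, east),
    ).fetchall()
    return [url for (url,) in rows]


conn = open_catalogue([
    ("http://example.org/opendap/a.nc", -45.0, -40.0, 140.0, 150.0),
    ("http://example.org/opendap/b.nc", -20.0, -10.0, 110.0, 120.0),
    ("http://example.org/opendap/c.nc", -43.0, -38.0, 148.0, 155.0),
])
hits = urls_for_roi(conn, -42.0, -41.0, 149.0, 151.0)
```

Keeping the query this cheap is what lets the expensive per-pixel work move to the aggregator, which can be replicated.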

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back end filters such as
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shape file cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn't handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an "appliance"
• V1 live for 2+ years; V2 is being integrated in the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

MARINE & ATMOSPHERIC RESEARCH

Thank you – questions

A netCDF aside…

• X = Longitude (deg East), 813 elts
• Y = Latitude (deg North), 670 elts
• T = Time (days since xxx), 1 elt
• WRel2: "Relative Soil Moisture (lower layer)", 1 x 670 x 813
  • Mean in each cell, unitless
  • 0 <= value <= 1
  • Missing data = -9999

OPeNDAP server – explore via a browser

httpaodaac2-cbractcsiroau8080
opendapauscoverawaprun26amonthlywrel2contentshtml

• File Download Link
• DAP Links

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

URL parts, in order: Format (ascii), Var (Wrel2), Time [0:1:0], Latitude [160:1:165], Longitude [200:1:205]
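To make the anatomy of such a request concrete, the sketch below splits a DAP URL into its format suffix, variable name and per-dimension hyperslabs. It assumes the example URL reads `http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]` once its punctuation is restored, and that each subscript uses the DAP `[start:stride:stop]` form (or the shorter `[start:stop]` and `[index]` forms). The function name is illustrative.

```python
import re


def parse_dap_request(url):
    """Split a DAP request URL into dataset, format, variable and slabs.

    Each slab is returned as (start, stride, stop), with stop inclusive.
    """
    base, _, constraint = url.partition("?")
    dataset, _, fmt = base.rpartition(".")
    match = re.fullmatch(r"(\w+)((?:\[\d+(?::\d+){0,2}\])*)", constraint)
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    slabs = []
    for part in re.findall(r"\[([^\]]+)\]", match.group(2)):
        nums = [int(n) for n in part.split(":")]
        if len(nums) == 1:          # [index]
            start, stride, stop = nums[0], 1, nums[0]
        elif len(nums) == 2:        # [start:stop]
            start, stride, stop = nums[0], 1, nums[1]
        else:                       # [start:stride:stop]
            start, stride, stop = nums
        slabs.append((start, stride, stop))
    return {"dataset": dataset, "format": fmt,
            "var": match.group(1), "slabs": slabs}


request = parse_dap_request(
    "http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]"
)

# Number of values the server would return for this request.
cells = 1
for start, stride, stop in request["slabs"]:
    cells *= (stop - start) // stride + 1
```

Nothing in the parsed result says which slab is time, latitude or longitude; that knowledge has to come from the dataset's DDS and attributes.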

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
• cf. Fortran: float Wrel2(1,670,813)
• E.g. can't naturally specify a bounding box (cf. WMS)
• This is both
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request


Presentation title | Presenter name36 |

Marine amp Atmospheric ResearchDr Edward KingIMOS Satellite Remote Sensing Facility Leadert +61 3 6232 5334e edwardkingcsiroauw wwwcsiroaucmar

MARINE amp ATMOSPHERIC RESEARCH

Thank you ndash questions

X=Longitude (deg East)

813 elts

Y=Latitude

(deg North)

670 elts

T=TimeDays since xxx

1 elt

WRel2

ldquoRelative Soil Moisture (lower layer)rdquo

1 x 617 x 813

Mean in each cell unitless

0 lt= value lt= 1

Missing data = -9999

A netCDF asidehellip

OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link DAP Links

Data Structures DDS linkhellip

Data Attributes DAS linkhellip

XML Package DDX linkhellip

And finally ndash datahellip

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

Format Var Time Latitude Longitude

Note absence of semantics in the exchange

bull There is nothing in the URL to impart meaning just a variable name and some subscriptshellip

bull httpafilencasciiWrel2[010][1601165][2001205]

bull cf Fortran float Wrel2(1670813) bull eg Canrsquot naturally specify a bounding box (cf WMS)bull This is both

bull A weakness geospatial handling requires extra workbull A strength not limited to geospatial domainsbull There is always some extra work to do (or assumptions to make)

in order to be able to make a meaningful request

Data Access Protocol

• Conceived by oceanographers in 1993 (when the WWW was 4) as the Distributed Oceanographic Data System (DODS), now OPeNDAP
• Designed to be as general as possible without being constrained to a particular discipline or world view
• It is a data model: an abstraction for describing data
• It is a transport mechanism
  • Layered over HTTP
  • Anywhere the web can go, DAP is sure to (be able to) follow
  • And a browser can be a client
• Data servers
  • Respond to specially formed URLs
  • Expose data AND metadata
  • Return requested elements encapsulated within DAP
  • Hyrax & TDS (THREDDS Data Server)
• Clients
  • Create requests
  • Unpack and use data that is returned within the DAP
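The "specially formed URLs" mentioned above can be sketched in a few lines of Python. This is only an illustration, not AODAAC code: the suffixes are the usual DAP2 response suffixes, while the function name `dap_url` and the `example.org` dataset URL are made up for the example.

```python
# Usual DAP2 response suffixes, appended to the dataset URL.
DAP_SUFFIXES = {
    "structure": ".dds",    # variable names, types and shapes
    "attributes": ".das",   # metadata attached to each variable
    "xml": ".ddx",          # structure and attributes combined as XML
    "ascii": ".ascii",      # data values as text, readable in a browser
    "binary": ".dods",      # data values in the DAP binary encoding
}


def dap_url(dataset_url, response, constraint=""):
    """Build a DAP request URL for one response type.

    `constraint` is an optional constraint expression such as
    'Wrel2[0:1:0][160:1:165][200:1:205]'; it follows a '?'.
    """
    url = dataset_url + DAP_SUFFIXES[response]
    if constraint:
        url += "?" + constraint
    return url


if __name__ == "__main__":
    base = "http://example.org/opendap/file.nc"  # hypothetical dataset URL
    print(dap_url(base, "structure"))
    print(dap_url(base, "ascii", "Wrel2[0:1:0][160:1:165][200:1:205]"))
```

A client would normally fetch the structure and attribute responses first, then issue a constrained data request.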

Workflow

[Diagram: Data File -> (filesystem access) -> DAP Server (mapping to DAP) <-> requests / DAP responses <-> DAP Client -> write to netCDF, or use in computation]

• DAP objects: Grids, Sequences, Structures
• Formats: netCDF, HDF4/5, GRIB, "Freeform"
• Client libraries: C, Java, Python, Matlab

OPeNDAP Transport Layer: A Data Standardisation Bus

[Diagram: three sites, each an OPeNDAP Server with a Local Data Store, connected through the Internet]

• Site roles shown: Reception Station and Product Generation; International Data via Internet/Tape, Model Data; Synthesis
• Example sites, as labelled on the diagram:
  • e.g. Curtin U, iVEC, UTAS, CMAR (Canberra)
  • e.g. AIMS, BoM, GA, CMAR (Hobart)
  • e.g. UTAS, Curtin U, CMAR (Hobart)

Multi-tiered design (based on TPAC Digital Library)

[Diagram: system tiers]

• Client / User Applications
• URL Crawler & Metadata Harvester
• Spatial Database
• Web Query Service
• OPeNDAP Servers
• Interfaces: OPeNDAP Interface, Web Service Interface


Complete System (Version 2): can be fully distributed

[Diagram: system components and numbered data flows]

• OPeNDAP Data Servers (replicated)
• URL Crawler and Metadata Harvester (Java apps)
• Spatial Database (PostGIS)
• Web Query Service (Tomcat webapp)
• WQS Client (Java app)
• Aggregator (Java app)
• Job Controller (Python)
• Web Server (Apache)
• Temporary Data Store
• Flows are numbered 0 to 6 and cross the Internet; flow 3 carries XML, flow 4 DAP, flow 5 netCDF files

Aggregator: a system client

• Accepts a list of URLs and various metadata, codified as XML, returned by the web query service
• Computes the index ranges needed to create DAP constraint URLs
• Reads data from each URL and combines (aggregates) each data array into one (or more) arrays or files for output to the user
• Writes the data output file (netCDF)
• Framework supports post-processing filters on the netCDF file
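The index-range step can be sketched for the simplest case, a regular lon/lat grid where each coordinate is a linear function of its index. This is a sketch under stated assumptions rather than the aggregator's actual code: the grid origin and spacing in the usage lines are hypothetical, and the function names are invented for the example.

```python
import math


def index_range(lo, hi, origin, step, count):
    """Inclusive (start, stop) indices of grid cells whose coordinate lies in [lo, hi].

    Assumes coordinate(i) = origin + i * step. `step` may be negative,
    e.g. latitude stored from north to south.
    """
    a = (lo - origin) / step
    b = (hi - origin) / step
    first, last = (a, b) if a <= b else (b, a)
    # A small tolerance absorbs floating-point error at exact cell positions.
    start = max(0, math.ceil(first - 1e-9))
    stop = min(count - 1, math.floor(last + 1e-9))
    if start > stop:
        raise ValueError("requested range does not overlap the grid")
    return start, stop


def constraint(var, t_index, lat_range, lon_range):
    """Format a DAP constraint expression for a (time, lat, lon) variable."""
    (y0, y1), (x0, x1) = lat_range, lon_range
    return f"{var}[{t_index}:1:{t_index}][{y0}:1:{y1}][{x0}:1:{x1}]"


if __name__ == "__main__":
    # Hypothetical grid: 813 longitudes from 112.0 E, 670 latitudes from 10.0 S, 0.05 degree cells.
    lon_idx = index_range(122.0, 122.25, origin=112.0, step=0.05, count=813)
    lat_idx = index_range(-18.25, -18.0, origin=-10.0, step=-0.05, count=670)
    print(constraint("Wrel2", 0, lat_idx, lon_idx))
```

Swath data and non-rectangular projections need a lookup or an inverse projection instead of this linear mapping.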

User Experience: low level

• Initiate a data request via a web call (CGI script)
• Returns a JSON fragment with a "handle" (URL)
• Use the handle to examine progress and ultimately get links to the output netCDF and log files
• Can also return as JSON for an easy machine interface
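A machine client for this request-then-poll pattern might look like the sketch below. The slides do not give the JSON layout, so the `state` and `files` fields, their values and the job URL are all assumptions made for illustration; only the overall flow (submit, receive a handle, poll the handle) comes from the slide.

```python
import json
import time
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode its JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(handle_url, fetch=fetch_json, poll_seconds=30.0,
                 max_polls=120, sleep=time.sleep):
    """Poll a job handle until the job finishes, then return its last status.

    The 'state' field and the values 'done' and 'failed' are hypothetical.
    """
    for _ in range(max_polls):
        status = fetch(handle_url)
        if status.get("state") in ("done", "failed"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job at {handle_url} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Offline demonstration with canned replies instead of a real server.
    replies = iter([
        {"state": "running"},
        {"state": "done", "files": ["out.nc", "job.log"]},
    ])
    result = wait_for_job("http://example.org/job/42",
                          fetch=lambda url: next(replies),
                          poll_seconds=0, sleep=lambda s: None)
    print(result["files"])
```

Passing `fetch` and `sleep` as parameters keeps the polling logic testable without a server, which suits a system that deliberately does not deliver in "web-time".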

User Experience: higher level

• Machine interface supports a simple web front-end
• Or a full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in "web-time" (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well: accesses and delivers subsets of n-dimensional data
• Semantically weak, e.g. doesn't have native support for time, lon, lat etc. (no OGC fluff, so…)
  http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

[Figure panels: Imagery, Latitude, Longitude]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant); each data set needs 'helper' config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
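The bounding-box extraction can be sketched for two of the cases the talk distinguishes: implicit geolocation (a regular grid) and explicit geolocation (swath data with latitude and longitude lookup tables). This is a simplified illustration, not the harvester itself. It ignores swaths that cross the dateline or a pole, and the fill value of -9999 for unusable geolocation cells is an assumption.

```python
def bbox_from_grid(lon0, dlon, nlon, lat0, dlat, nlat):
    """Bounding box (west, south, east, north) of a regular grid.

    Cell coordinates are lon0 + i * dlon and lat0 + j * dlat.
    """
    lons = (lon0, lon0 + dlon * (nlon - 1))
    lats = (lat0, lat0 + dlat * (nlat - 1))
    return (min(lons), min(lats), max(lons), max(lats))


def bbox_from_swath(lat, lon, fill=-9999.0):
    """Bounding box of swath data given 2-D latitude and longitude arrays.

    Cells where either coordinate equals `fill` are skipped.
    """
    lats, lons = [], []
    for lat_row, lon_row in zip(lat, lon):
        for la, lo in zip(lat_row, lon_row):
            if la != fill and lo != fill:
                lats.append(la)
                lons.append(lo)
    if not lats:
        raise ValueError("no valid geolocation in swath")
    return (min(lons), min(lats), max(lons), max(lats))


if __name__ == "__main__":
    print(bbox_from_grid(112.0, 0.5, 5, -10.0, -0.5, 3))
    print(bbox_from_swath([[-10.0, -10.5], [-11.0, -9999.0]],
                          [[112.0, 113.0], [112.5, 113.5]]))
```

Which variables hold the latitude and longitude is exactly what the per-dataset 'helper' config files would have to tell the harvester.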

Geospatial model

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (the V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat the system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too (!)
• All these features can be done as a post-filter, à la the Unix philosophy of "do one thing and do it well"; V2 supports this approach
• Web service with Tomcat was a legacy of the TPAC original. It would be simpler as a standalone Java app (using CGI, not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don't use a giant software project to learn a new language

Distribute the computing and modularise: V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL 'select's, and the aggregator takes the rest of the load (very necessary for swath data).
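What "just a series of SQL selects" could mean in practice is sketched below. The real system uses PostGIS; this example swaps in SQLite from the Python standard library and a plain rectangle-overlap test so that it runs anywhere. The table name `granule`, its columns and the URLs are all hypothetical.

```python
import sqlite3


def make_catalogue(rows):
    """Create an in-memory catalogue of (url, west, south, east, north) records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE granule (url TEXT, west REAL, south REAL, east REAL, north REAL)"
    )
    conn.executemany("INSERT INTO granule VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def find_files(conn, west, south, east, north):
    """Return URLs of granules whose bounding box overlaps the query box."""
    cursor = conn.execute(
        "SELECT url FROM granule "
        "WHERE west <= ? AND east >= ? AND south <= ? AND north >= ? "
        "ORDER BY url",
        (east, west, north, south),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    catalogue = make_catalogue([
        ("http://example.org/a.nc", 110.0, -45.0, 155.0, -10.0),
        ("http://example.org/b.nc", 160.0, -50.0, 180.0, -30.0),
    ])
    print(find_files(catalogue, 120.0, -20.0, 125.0, -15.0))
```

The query only selects candidate files; computing index ranges and fetching data is left to the aggregator, which is the split of work the slide describes.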

[Diagram: the Version 2 system components (OPeNDAP Data Servers, URL Crawler and Metadata Harvester, Spatial Database, Web Query Service, Aggregator, Job Controller, Web Server, Temporary Data Store) with flows 0 to 6]

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn't handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an "appliance"
• V1 has been live for 2+ years; V2 is being integrated into the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF
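Two of the back-end filters listed above, statistics extraction and CSV conversion, are small enough to sketch as independent post-filters. This is an illustration of the post-filter idea only: it works on a plain list-of-lists grid with `None` for missing cells rather than on a netCDF file, and every function name is made up for the example.

```python
import csv
import io
from statistics import mean


def summarise(grid):
    """Statistics extraction: count, min, max and mean over cells that hold data."""
    values = [v for row in grid for v in row if v is not None]
    if not values:
        raise ValueError("grid holds no data")
    return {"count": len(values), "min": min(values),
            "max": max(values), "mean": mean(values)}


def to_csv(grid, lats, lons):
    """Format conversion: one 'lat,lon,value' row per cell that holds data."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon", "value"])
    for lat, row in zip(lats, grid):
        for lon, value in zip(lons, row):
            if value is not None:
                writer.writerow([lat, lon, value])
    return out.getvalue()


def run_filters(grid, filters):
    """Apply each named post-filter to the same extracted grid."""
    return {name: apply(grid) for name, apply in filters.items()}


if __name__ == "__main__":
    grid = [[0.25, None], [0.75, 0.5]]
    lats, lons = [-10.0, -10.05], [112.0, 112.05]
    results = run_filters(grid, {
        "stats": summarise,
        "csv": lambda g: to_csv(g, lats, lons),
    })
    print(results["stats"])
    print(results["csv"])
```

Each filter knows nothing about the extraction that produced the grid, so new filters can be added without touching the core system.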

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

Thank you - questions

A netCDF aside…

[Diagram: a netCDF variable on a three-dimensional grid]

• X = Longitude (deg East), 813 elements
• Y = Latitude (deg North), 670 elements
• T = Time (days since xxx), 1 element
• Variable WRel2: "Relative Soil Moisture (lower layer)", dimensions 1 x 670 x 813
• Mean in each cell, unitless; 0 <= value <= 1
• Missing data = -9999
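The variable described on this slide can be mimicked in a few lines, which helps show what a client has to know before the numbers mean anything. The sketch assumes the 1 x 670 x 813 shape given by the axis labels, the stated missing value and valid range, and row-major storage with the last dimension varying fastest; the function names are invented.

```python
MISSING = -9999.0               # "Missing data = -9999" on the slide
VALID_MIN, VALID_MAX = 0.0, 1.0  # "0 <= value <= 1" on the slide
SHAPE = (1, 670, 813)            # (time, latitude, longitude)


def decode(values, missing=MISSING, lo=VALID_MIN, hi=VALID_MAX):
    """Replace the missing marker with None and reject out-of-range values."""
    decoded = []
    for v in values:
        if v == missing:
            decoded.append(None)
        elif lo <= v <= hi:
            decoded.append(v)
        else:
            raise ValueError(f"value {v} outside [{lo}, {hi}]")
    return decoded


def flat_index(t, y, x, shape=SHAPE):
    """Position of cell (t, y, x) in row-major storage."""
    nt, ny, nx = shape
    if not (0 <= t < nt and 0 <= y < ny and 0 <= x < nx):
        raise IndexError("subscript out of range")
    return (t * ny + y) * nx + x


if __name__ == "__main__":
    print(decode([0.0, -9999.0, 1.0, 0.42]))
    print(flat_index(0, 1, 0))
```

None of this information travels in a bare array of numbers; it comes from the variable's attributes and dimensions.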

OPeNDAP server: explore via a browser

httpaodaac2-cbractcsiroau8080
opendapauscoverawaprun26amonthlywrel2contentshtml

• File Download Link
• DAP Links

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally - data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

Labels on the successive parts of the URL: Format | Var | Time | Latitude | Longitude

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• E.g. can't naturally specify a bounding box (cf. WMS)
• This is both
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
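Parsing the constraint makes the point concrete: all a server or client can recover from it is a variable name and integer subscripts. The sketch below handles only the full `[start:stride:stop]` form with an inclusive stop; the shorter forms that DAP also permits are left out, and the function names are invented for the example.

```python
import re

_HYPERSLAB = re.compile(r"\[(\d+):(\d+):(\d+)\]")


def parse_constraint(expr):
    """Split 'Var[start:stride:stop]...' into the name and a list of integer triples."""
    name, bracket, rest = expr.partition("[")
    subscripts = bracket + rest
    slabs = _HYPERSLAB.findall(subscripts)
    rebuilt = "".join(f"[{a}:{b}:{c}]" for a, b, c in slabs)
    if not name or rebuilt != subscripts:
        raise ValueError(f"unsupported constraint: {expr!r}")
    return name, [(int(a), int(b), int(c)) for a, b, c in slabs]


def subset_shape(slabs):
    """Number of elements selected along each dimension (stop is inclusive)."""
    shape = []
    for start, stride, stop in slabs:
        if stride < 1 or stop < start:
            raise ValueError("stride must be >= 1 and stop >= start")
        shape.append((stop - start) // stride + 1)
    return tuple(shape)


if __name__ == "__main__":
    name, slabs = parse_constraint("Wrel2[0:1:0][160:1:165][200:1:205]")
    print(name, slabs, subset_shape(slabs))
```

Nothing in the parsed result says that the second subscript is latitude or what units apply; that knowledge has to come from metadata held elsewhere.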


OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link DAP Links

Data Structures DDS linkhellip

Data Attributes DAS linkhellip

XML Package DDX linkhellip

And finally ndash datahellip

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

Format Var Time Latitude Longitude

Note absence of semantics in the exchange

bull There is nothing in the URL to impart meaning just a variable name and some subscriptshellip

bull httpafilencasciiWrel2[010][1601165][2001205]

bull cf Fortran float Wrel2(1670813) bull eg Canrsquot naturally specify a bounding box (cf WMS)bull This is both

bull A weakness geospatial handling requires extra workbull A strength not limited to geospatial domainsbull There is always some extra work to do (or assumptions to make)

in order to be able to make a meaningful request

  • IMOS AODAAC ndash gridded data access
  • Outline
  • National Satellite Data Reception Network
  • Rectangular Grids
  • Swath Data
  • Non-rectangular projections
  • Data Access Protocol
  • Workflow
  • OPeNDAP Transport Layer A Data Standardisation Bus
  • Multi-tiered design ndash based on TPAC Digital Library
  • Complete System (Version 2) Can be fully distributed
  • Complete System (Version 2) Can be fully distributed (2)
  • Aggregator ndash a system client
  • User Experience ndash Low level
  • User Experience ndash Low level (2)
  • User Experience ndash higher level
  • DAP implications
  • DAP implications (2)
  • DAP implications (3)
  • DAP implications (4)
  • DAP implications (5)
  • DAP implications (6)
  • DAP implications (7)
  • Metadata Harvester
  • Geospatial model
  • Lessons Learned
  • Lessons Learned (2)
  • Lessons Learned (3)
  • Lessons Learned (4)
  • Lessons Learned (5)
  • Lessons Learned (6)
  • Lessons Learned (7)
  • Lessons Learned (8)
  • Lessons Learned (9)
  • Distribute the computing and modularise ndash V1 had a lot (more) o
  • Opportunities + Shortcomings
  • Thank you ndash questions
  • A netCDF asidehellip
  • OPeNDAP server ndash explore via a browser
  • OPeNDAP server ndash explore via a browser (2)
  • OPeNDAP server ndash explore via a browser (3)
  • OPeNDAP server ndash explore via a browser (4)
  • Data Structures DDS linkhellip
  • Data Attributes DAS linkhellip
  • XML Package DDX linkhellip
  • And finally ndash datahellip
  • And finally ndash datahellip (2)
  • And finally ndash datahellip (3)
  • Note absence of semantics in the exchange
Page 9: Edward King SPEDDEXES 2014

OPeNDAP Transport Layer: A Data Standardisation Bus

[Diagram: three groups of data sources, each with a Local Data Store behind its own OPeNDAP Server, all connected over the Internet.]
• Data sources: Reception Station and Product Generation; International Data via Internet/Tape; Model/Data Synthesis
• Example host sites shown: e.g. Curtin U, iVEC, UTAS, CMAR (Canberra); e.g. AIMS, BoM, GA, CMAR (Hobart); e.g. UTAS, Curtin U, CMAR (Hobart)

Multi-tiered design – based on TPAC Digital Library

[Diagram: tiers of the design]
• Client User Applications
• URL Crawler & Metadata Harvester
• Spatial Database
• Web Query Service
• OPeNDAP Servers
• Interfaces shown: OPeNDAP Interface, Web Service Interface

Complete System (Version 2): Can be fully distributed

[Diagram: components joined by a numbered data flow, steps 0 to 6]
• OPeNDAP Data Servers (replicated)
• URL Crawler and Metadata Harvester (Java apps)
• Spatial Database (PostGIS)
• Web Query Service (Tomcat webapp)
• WQS Client (Java app)
• Aggregator (Java app)
• Job Controller (Python)
• Web Server (Apache)
• Temporary Data Store
• Labelled steps: 3 – XML; 4 – DAP; 5 – netCDF files (steps 0, 1, 2 and 6 carry no label)

Aggregator – a system client

• Accepts a list of URLs and various metadata, codified as XML, returned by the web query service
• Computes the index ranges needed to create DAP constraint URLs
• Reads data from each URL and combines (aggregates) each data array into one (or more) arrays or files for output to the user
• Writes the data output file (netCDF)
• Framework supports post-processing filters on the netCDF file
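
The index-range step can be sketched for the simplest case, a regular lon/lat grid. This is only an illustration and not the AODAAC Aggregator itself (which is a Java app): the function names, the grid origin and spacing, and the URL are assumptions invented for the example, and both axes are assumed to increase with index.

```python
import math

def index_range(lo, hi, origin, step, size):
    """Return (first, last) grid indices covering [lo, hi] on a regular axis.

    origin is the coordinate of index 0 and step the (positive) spacing.
    The result is clamped to the valid index range 0..size-1.
    """
    first = max(0, math.floor((lo - origin) / step))
    last = min(size - 1, math.ceil((hi - origin) / step))
    if first > last:
        raise ValueError("requested range lies outside the grid")
    return first, last

def constraint_url(base_url, var, ranges):
    """Build a DAP2 constraint URL with one [start:stride:stop] per dimension."""
    subs = "".join(f"[{a}:1:{b}]" for a, b in ranges)
    return f"{base_url}.ascii?{var}{subs}"

# Hypothetical grid: 670 latitudes starting at -45.0 and 813 longitudes
# starting at 110.0, both spaced 0.0625 degrees apart (illustrative values,
# not the grid of any real product).
lat_idx = index_range(-35.0, -34.6875, origin=-45.0, step=0.0625, size=670)
lon_idx = index_range(122.5, 122.8125, origin=110.0, step=0.0625, size=813)
url = constraint_url("http://a/file.nc", "Wrel2", [(0, 0), lat_idx, lon_idx])
print(url)  # http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
```

Swath and projected data need more than this linear mapping, since the lon/lat of a pixel then comes from lookup tables or a non-linear transform.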

User Experience – Low level

• Initiate data request via web call (CGI script)
• Returns a JSON fragment with a “handle” (URL)
• Use the handle to examine progress and ultimately get links to the output netCDF and log files
• Can also return as JSON for easy machine interface
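
A machine client for this request-then-poll pattern might look like the sketch below. The JSON field names ("state", "output") and their values are hypothetical, as are the URLs; the slides do not give the actual payload. The fetch function is passed in so the example runs without a network.

```python
import json
import time

def wait_for_job(fetch, handle_url, poll_seconds=5.0, max_polls=100):
    """Poll a job handle until it reports completion; return the status dict.

    fetch(url) must return the JSON status text for the handle.
    """
    for _ in range(max_polls):
        status = json.loads(fetch(handle_url))
        # "state", "done" and "failed" are made-up names for illustration.
        if status.get("state") in ("done", "failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job at {handle_url} did not finish")

# Stand-in for an HTTP GET: returns canned responses in order.
canned = iter([
    '{"state": "running"}',
    '{"state": "done", "output": "http://example.invalid/out.nc"}',
])
result = wait_for_job(lambda url: next(canned),
                      "http://example.invalid/job/1", poll_seconds=0)
print(result["output"])
```

Polling fits the point that delivery is not attempted in “web-time”: the client asks, goes away, and comes back for the files.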

User Experience – higher level

• Machine interface supports a simple web front-end
• Or a full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in “web-time” (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn’t have native support for time, lon, lat, etc. (no OGC fluff, so…) http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

[Figure labels: Imagery, Latitude, Longitude]
• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs ‘helper’ config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
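
For files with explicit geolocation (latitude and longitude stored as lookup tables), bounding-box extraction could be sketched as below. This is an assumption-laden illustration rather than the harvester's code: the function name and the fill value are invented, the arrays are plain nested lists, and swaths crossing the antimeridian are not handled.

```python
def bounding_box(lats, lons, fill_value=-9999.0):
    """Return (west, south, east, north) for 2-D latitude/longitude arrays.

    Cells where either coordinate equals fill_value are ignored.
    Does not handle data that crosses the antimeridian.
    """
    points = [
        (lat, lon)
        for lat_row, lon_row in zip(lats, lons)
        for lat, lon in zip(lat_row, lon_row)
        if lat != fill_value and lon != fill_value
    ]
    if not points:
        raise ValueError("no valid geolocation in file")
    lat_vals = [p[0] for p in points]
    lon_vals = [p[1] for p in points]
    return min(lon_vals), min(lat_vals), max(lon_vals), max(lat_vals)

# Tiny made-up swath: 2 x 2 pixels, one of them without geolocation.
lats = [[-10.0, -10.5], [-11.0, -9999.0]]
lons = [[140.0, 141.0], [139.5, -9999.0]]
bbox = bounding_box(lats, lons)
print(bbox)  # (139.5, -11.0, 141.0, -10.0)
```

Which variables hold the latitude and longitude is exactly what the per-data-set ‘helper’ config files would have to tell the harvester.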

Geospatial model


Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat the system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too
• All these features can be done as a post-filter, a la the Unix philosophy of “do one thing and do it well”; V2 supports this approach
• Web service with Tomcat was a legacy of the TPAC original. Would be simpler just as a standalone Java app (and use CGI, not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don’t use a giant software project to learn a new language
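
The post-filter idea can be illustrated with a small chain in which each filter does one job and hands its result to the next. This is a sketch only: the real filters would operate on the aggregator's netCDF output, whereas here the data is an in-memory list of rows, and the names run_filters, subsample and to_csv are invented for the example.

```python
import csv
import io
from functools import reduce

def run_filters(grid, filters):
    """Pass the extracted data through each post-filter in turn."""
    return reduce(lambda data, f: f(data), filters, grid)

def subsample(step):
    """Filter factory: keep every step-th row and column."""
    def _filter(grid):
        return [row[::step] for row in grid[::step]]
    return _filter

def to_csv(grid):
    """Format-conversion filter: render the grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()

grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
text = run_filters(grid, [subsample(2), to_csv])
print(text)  # two lines: "1,3" and "9,11"
```

New output formats or statistics then become extra filters at the end of the chain rather than features inside the extraction system.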

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL ‘select’s and the aggregator takes the rest of the load (very necessary for swath data).

[Diagram: the Version 2 data flow, steps 0 to 6: OPeNDAP Data Servers (replicated), URL Crawler and Metadata Harvester (Java apps), Spatial Database (PostGIS), Web Query Service (Tomcat webapp), Aggregator (Java app), Job Controller (Python), Web Server (Apache), Temporary Data Store; step 3 – XML, step 4 – DAP, step 5 – netCDF files.]
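
To make the "series of SQL selects" concrete, here is a hedged sketch of a bounding-box query. It uses SQLite from the Python standard library purely as a stand-in for PostGIS, and the table name, column names and URLs are hypothetical; the slides do not show the real schema, and a PostGIS database would normally store geometries and use spatial operators instead of four numeric columns.

```python
import sqlite3

def find_granules(conn, west, south, east, north):
    """Return URLs of granules whose bounding box intersects the query box."""
    rows = conn.execute(
        "SELECT url FROM granule "
        "WHERE west <= ? AND east >= ? AND south <= ? AND north >= ? "
        "ORDER BY url",
        (east, west, north, south),
    )
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE granule (url TEXT, west REAL, south REAL, east REAL, north REAL)"
)
conn.executemany("INSERT INTO granule VALUES (?, ?, ?, ?, ?)", [
    ("http://example.invalid/a.nc", 110.0, -45.0, 130.0, -30.0),
    ("http://example.invalid/b.nc", 140.0, -20.0, 155.0, -10.0),
    ("http://example.invalid/c.nc", 125.0, -35.0, 145.0, -15.0),
])
hits = find_granules(conn, west=120.0, south=-40.0, east=128.0, north=-32.0)
print(hits)  # a.nc and c.nc intersect the query box; b.nc does not
```

The query only answers "which files"; working out which pixels inside each file is left to the aggregator, which is the division of labour described above.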

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn’t handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an “appliance”
• V1 live for 2+ years; V2 is being integrated in the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF

Thank you – questions

MARINE & ATMOSPHERIC RESEARCH
Dr Edward King, IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

A netCDF aside…

[Diagram: structure of an example netCDF variable and its dimensions]
• X = Longitude (deg East), 813 elts
• Y = Latitude (deg North), 670 elts
• T = Time (days since xxx), 1 elt
• WRel2: “Relative Soil Moisture (lower layer)”, 1 x 670 x 813
  • Mean in each cell, unitless
  • 0 <= value <= 1
  • Missing data = -9999
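
A client reading a variable like this has to honour the missing-data marker before doing any arithmetic. A minimal sketch, using a plain nested list in place of a real netCDF array (the helper names and the sample values are invented for illustration):

```python
MISSING = -9999  # missing-data marker from the variable's metadata

def valid_mean(grid, missing=MISSING):
    """Mean of all cells that are not the missing-data marker, or None."""
    values = [v for row in grid for v in row if v != missing]
    return sum(values) / len(values) if values else None

def out_of_range(grid, lo=0.0, hi=1.0, missing=MISSING):
    """Return the non-missing values that violate lo <= value <= hi."""
    return [v for row in grid for v in row
            if v != missing and not (lo <= v <= hi)]

sample = [[0.25, -9999],
          [0.75, 0.5]]
print(valid_mean(sample))    # 0.5
print(out_of_range(sample))  # []
```

Averaging without the check would drag the result far below zero, which is why the marker and the valid range travel with the data as attributes.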

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

[Screenshots of the server pages in a browser, with callouts for the File Download Link and the DAP Links.]

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

[Callouts on the URL, in order: Format (ascii), Var (Wrel2), Time [0:1:0], Latitude [160:1:165], Longitude [200:1:205]]

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
  • e.g. can’t naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
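
The point can be seen by trying to interpret a constraint expression: the subscripts only become time, latitude and longitude once the dimension names are supplied from somewhere else (for instance the DDS or a database). The sketch below assumes DAP2-style [start:stride:stop] subscripts; the function name is invented for the example.

```python
import re

def describe_constraint(constraint, dim_names):
    """Pair each [start:stride:stop] subscript with a dimension name.

    The names are not in the constraint itself; the caller must supply them.
    """
    var, _, rest = constraint.partition("[")
    subs = re.findall(r"\[(\d+):(\d+):(\d+)\]", "[" + rest)
    if len(subs) != len(dim_names):
        raise ValueError("subscript count does not match dimension count")
    return var, {name: tuple(int(n) for n in s)
                 for name, s in zip(dim_names, subs)}

var, dims = describe_constraint(
    "Wrel2[0:1:0][160:1:165][200:1:205]",
    ["time", "latitude", "longitude"],
)
print(var, dims)
```

Nothing stops a caller from passing the names in the wrong order, which is the weakness; equally, nothing ties the subscripts to geography, which is the strength.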

Page 10: Edward King SPEDDEXES 2014

Multi-tiered design ndash based on TPAC Digital Library

Client User Applications

URL Crawler amp Metadata+Harvester

Spatial Database

Web Query Service

OPeNDAP Servers

OPeNDAP Interface Web Service Interface

(replicated)

Complete System (Version 2)Can be fully distributed

Presentation title | Presenter name11 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

1

WQS Client (Java app)

3 XML

(replicated)

Complete System (Version 2)Can be fully distributed

Presentation title | Presenter name12 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

1

Job Controller (Python)

Web Server (Apache)

Temporary Data Store

06

WQS Client (Java app)

3 XML

Aggregator ndash a system clientbull accepts a list of URLs and various metadata codified as XML

returned by the web query service bull Computes necessary index ranges to create DAP constraint URLsbull reads data from each URL and combines (aggregates) each data

array into one (or more) arrays or files for output to the user bull writes the data output file (netCDF)bull Framework supports post-processing filters on netCDF file

T

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name14 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name15 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

bull Use to examine progress and ultimately get links to output netCDF and log filesbull Can also return as JSON for easy machine

interface

User Experience ndash higher level

bull Machine interface supports simple web front-endbull Or full portal bull Or distributed clients in a

clusterbull NOTE we do NOT attempt to

deliver data in ldquoweb-timerdquo (which is not a realistic objective for GB-scale data systems)

Presentation title | Presenter name16 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name17 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name18 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name19 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205]

Presentation title | Presenter name20 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around it

Presentation title | Presenter name21 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the database

Presentation title | Presenter name22 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the databasebull It is not necessarily that DAP is the wrong solution it just means

the hard part of the problem is not data volume but metadata

Presentation title | Presenter name23 |

Metadata Harvester

Presentation title | Presenter name24 |

Imagery Latitude Longitude

bull Has to extract spatial bounding boxes from all these different types of filebull Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) ndash each data set needs lsquohelperrsquo config filesbull These can be maintained by the data provider OR the AODAAC adminbull And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model

Presentation title | Presenter name25 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

Presentation title | Presenter name26 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quickly

Presentation title | Presenter name27 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)

Presentation title | Presenter name28 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

Presentation title | Presenter name29 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

Presentation title | Presenter name30 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

Presentation title | Presenter name31 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) help

Presentation title | Presenter name32 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essential

Presentation title | Presenter name33 |

Lessons Learnedbull DAP performs well but you need to build a fair bit of infrastructure to make it

general (V1 system only handled lonlat grids)bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essentialbull Donrsquot use a giant software project to learn a new language

Presentation title | Presenter name34 |

(replicated)

Distribute the computing and modularise ndash V1 had a lot (more) of the compute in the WQS which became a bottleneck In V2 the WQS is just a series of SQL lsquoselectrsquos and the aggregator takes the rest of the load (very necessary for swath data)

Presentation title | Presenter name35 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

Job Controller (Python)

Web Server (Apache)

Temporary Data Store

1 0

3XML

6

Opportunities + Shortcomingsbull This may meet some of your needs bull (including warning you what not to do)bull It could be a lot more useful with some back end filters such as

bull Format conversions (eg geoTIFF csv)bull Shape file cookie cuttingbull Statistics extractionbull Reprojecting + resampling

bull We need tools for managing our XML config files (admin tools)bull Doesnrsquot handle 4+ dimensions yet (eg depth or height)bull Want to make easier to deploy as an ldquoappliancerdquobull V1 live for 2+ years V2 is being integrated in IMOS portal nowbull Would be fun to run against the AGDC data set when it goes netCDF

Presentation title | Presenter name36 |

Marine amp Atmospheric ResearchDr Edward KingIMOS Satellite Remote Sensing Facility Leadert +61 3 6232 5334e edwardkingcsiroauw wwwcsiroaucmar

MARINE amp ATMOSPHERIC RESEARCH

Thank you ndash questions

X=Longitude (deg East)

813 elts

Y=Latitude

(deg North)

670 elts

T=TimeDays since xxx

1 elt

WRel2

ldquoRelative Soil Moisture (lower layer)rdquo

1 x 617 x 813

Mean in each cell unitless

0 lt= value lt= 1

Missing data = -9999

A netCDF asidehellip

OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link DAP Links

Data Structures DDS linkhellip

Data Attributes DAS linkhellip

XML Package DDX linkhellip

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

URL parts, in order: Format (.ascii), Var (Wrel2), Time [0:1:0], Latitude [160:1:165], Longitude [200:1:205]

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• e.g. Can't naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
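For a regular lon/lat grid, the "extra work" amounts to turning a bounding box into index subscripts. The following is a minimal sketch of that step, not the AODAAC code. The grid origin, the 0.05 degree cell size and the north-to-south latitude ordering are assumptions made for the example; the host `a` and the file name are the slide's placeholders.

```python
import math

EPS = 1e-9  # tolerance so floating-point noise cannot shift an index


def index_range(lo, hi, origin, step, size):
    """Map a coordinate interval [lo, hi] to an inclusive index range.

    Assumes cell i is centred at origin + i * step (step may be negative).
    """
    a = (lo - origin) / step
    b = (hi - origin) / step
    first = max(0, math.ceil(min(a, b) - EPS))
    last = min(size - 1, math.floor(max(a, b) + EPS))
    if first > last:
        raise ValueError("interval does not overlap the grid")
    return first, last


def dap_constraint(var, ranges):
    """Build a DAP hyperslab expression: var[start:stride:stop]..."""
    return var + "".join(f"[{a}:1:{b}]" for a, b in ranges)


# Hypothetical grid: longitude from 112.0 eastwards, latitude from -10.0
# southwards; the sizes (670 x 813) are the ones quoted for Wrel2.
lat = index_range(-18.25, -18.0, origin=-10.0, step=-0.05, size=670)
lon = index_range(122.0, 122.25, origin=112.0, step=0.05, size=813)
url = "http://a/file.nc.ascii?" + dap_constraint("Wrel2", [(0, 0), lat, lon])
print(url)
```

The server never sees the degrees, only the subscripts, so the client has to know the grid geometry before it can ask for anything.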

  • IMOS AODAAC – gridded data access
  • Outline
  • National Satellite Data Reception Network
  • Rectangular Grids
  • Swath Data
  • Non-rectangular projections
  • Data Access Protocol
  • Workflow
  • OPeNDAP Transport Layer: A Data Standardisation Bus
  • Multi-tiered design – based on TPAC Digital Library
  • Complete System (Version 2): Can be fully distributed
  • Complete System (Version 2): Can be fully distributed (2)
  • Aggregator – a system client
  • User Experience – Low level
  • User Experience – Low level (2)
  • User Experience – higher level
  • DAP implications
  • DAP implications (2)
  • DAP implications (3)
  • DAP implications (4)
  • DAP implications (5)
  • DAP implications (6)
  • DAP implications (7)
  • Metadata Harvester
  • Geospatial model
  • Lessons Learned
  • Lessons Learned (2)
  • Lessons Learned (3)
  • Lessons Learned (4)
  • Lessons Learned (5)
  • Lessons Learned (6)
  • Lessons Learned (7)
  • Lessons Learned (8)
  • Lessons Learned (9)
  • Distribute the computing and modularise – V1 had a lot (more) o
  • Opportunities + Shortcomings
  • Thank you – questions
  • A netCDF aside…
  • OPeNDAP server – explore via a browser
  • OPeNDAP server – explore via a browser (2)
  • OPeNDAP server – explore via a browser (3)
  • OPeNDAP server – explore via a browser (4)
  • Data Structures DDS link…
  • Data Attributes DAS link…
  • XML Package DDX link…
  • And finally – data…
  • And finally – data… (2)
  • And finally – data… (3)
  • Note absence of semantics in the exchange
Page 11: Edward King SPEDDEXES 2014

Complete System (Version 2): Can be fully distributed

[Figure: system architecture diagram, with one element marked "(replicated)". Components: OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); WQS Client (Java app); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. Numbered flows 0 to 6, with flow 3 labelled XML, flow 4 DAP, and flow 5 netCDF files.]

Aggregator – a system client

• Accepts a list of URLs and various metadata, codified as XML, returned by the web query service
• Computes the necessary index ranges to create DAP constraint URLs
• Reads data from each URL and combines (aggregates) each data array into one (or more) arrays or files for output to the user
• Writes the data output file (netCDF)
• Framework supports post-processing filters on the netCDF file
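The aggregator steps above can be sketched roughly as follows. This is an illustration only: the XML element and attribute names are hypothetical (the real WQS schema is not shown in the slides), the index ranges are assumed to arrive precomputed, and the DAP read is replaced by a stand-in function so the sketch runs without a server.

```python
import xml.etree.ElementTree as ET

# Hypothetical WQS response; the real schema is not shown in the slides.
WQS_XML = """
<files variable="Wrel2" ymin="160" ymax="165" xmin="200" xmax="205">
  <file url="http://a/2014-01.nc"/>
  <file url="http://a/2014-02.nc"/>
</files>
"""


def constraint_urls(xml_text):
    """Turn the file list plus index ranges into DAP constraint URLs."""
    root = ET.fromstring(xml_text)
    y = f"[{root.get('ymin')}:1:{root.get('ymax')}]"
    x = f"[{root.get('xmin')}:1:{root.get('xmax')}]"
    var = root.get("variable")
    return [f"{f.get('url')}.ascii?{var}[0:1:0]{y}{x}"
            for f in root.findall("file")]


def aggregate(urls, fetch):
    """Stack the 2-D array fetched from each URL along a new time axis."""
    return [fetch(url) for url in urls]


def fake_fetch(url):
    """Stand-in for a DAP read: returns a 6 x 6 grid of zeros."""
    return [[0.0] * 6 for _ in range(6)]


urls = constraint_urls(WQS_XML)
cube = aggregate(urls, fake_fetch)
print(urls[0])
print(len(cube), len(cube[0]), len(cube[0][0]))
```

Passing `fetch` in as a parameter keeps the aggregation logic independent of the transport, which is also what makes it easy to test.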

User Experience – Low level

• Initiate data request via web call (CGI script)
• Returns a JSON fragment with "handle" (URL)
• Use the handle to examine progress and ultimately get links to the output netCDF and log files
• Can also return as JSON for easy machine interface
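A machine client for this interface could look something like the sketch below. The JSON keys ("status", "netcdf", "log") and the job URL are hypothetical, since the slides do not show the actual response format, and canned replies stand in for the web service.

```python
import json
import time


def wait_for_job(handle_url, fetch, interval=0.0, max_polls=100):
    """Poll a job handle until it reports completion.

    `fetch` takes a URL and returns the JSON text of the status document.
    The "status", "netcdf" and "log" keys are hypothetical.
    """
    for _ in range(max_polls):
        status = json.loads(fetch(handle_url))
        if status["status"] == "done":
            return status["netcdf"], status["log"]
        if status["status"] == "error":
            raise RuntimeError("job failed, see " + status["log"])
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")


# Canned responses standing in for the real web service.
replies = iter([
    '{"status": "running"}',
    '{"status": "done", "netcdf": "http://a/out.nc", "log": "http://a/out.log"}',
])
result = wait_for_job("http://a/job/42", lambda url: next(replies))
print(result)
```

Polling a handle suits a service that does not try to answer in "web-time": the request returns at once and the client checks back at its own pace.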

User Experience – higher level

• Machine interface supports a simple web front-end
• Or a full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in "web-time" (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn't have native support for time, lon, lat, etc. (no OGC fluff, so…)
  http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata
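One way to picture a data model "implemented in the database" is a table with one row per file, holding its bounding box and time, queried with a plain select. The sketch below uses Python's built-in sqlite3 in place of PostGIS so that it runs standalone. The table name, column names and rows are all invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE granule (
    url TEXT, t TEXT, west REAL, east REAL, south REAL, north REAL)""")
db.executemany("INSERT INTO granule VALUES (?, ?, ?, ?, ?, ?)", [
    ("http://a/2014-01.nc", "2014-01-01", 112.0, 154.0, -44.0, -10.0),
    ("http://a/2014-02.nc", "2014-02-01", 112.0, 154.0, -44.0, -10.0),
    ("http://b/swath-07.nc", "2014-01-15", 160.0, 175.0, -50.0, -30.0),
])


def find_granules(west, east, south, north, t0, t1):
    """Return URLs whose bounding box overlaps the ROI within [t0, t1]."""
    rows = db.execute(
        """SELECT url FROM granule
           WHERE west <= ? AND east >= ? AND south <= ? AND north >= ?
             AND t BETWEEN ? AND ?
           ORDER BY t""",
        (east, west, north, south, t0, t1))
    return [r[0] for r in rows]


print(find_granules(120.0, 125.0, -20.0, -15.0, "2014-01-01", "2014-01-31"))
```

The query answers "which files matter?" from metadata alone; only the files it returns are then read over DAP.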

Metadata Harvester

[Figure: imagery with latitude and longitude arrays]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs 'helper' config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
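For swath data, where latitude and longitude are lookup tables, both harvester tasks can be sketched with a brute-force scan. This is an illustrative sketch rather than the harvester's actual method; the function names and the tiny swath are invented, and longitude wrap-around at the dateline is ignored.

```python
def bounding_box(lat, lon):
    """Bounding box (south, north, west, east) of 2-D lat/lon lookup tables."""
    lats = [v for row in lat for v in row]
    lons = [v for row in lon for v in row]
    return min(lats), max(lats), min(lons), max(lons)


def roi_to_indices(lat, lon, south, north, west, east):
    """Smallest (line, pixel) index window covering every pixel in the ROI.

    Returns ((line0, line1), (pix0, pix1)), or None if no pixel falls inside.
    """
    hits = [(j, i)
            for j, row in enumerate(lat)
            for i, la in enumerate(row)
            if south <= la <= north and west <= lon[j][i] <= east]
    if not hits:
        return None
    lines = [j for j, _ in hits]
    pixels = [i for _, i in hits]
    return (min(lines), max(lines)), (min(pixels), max(pixels))


# Tiny invented swath: 3 lines x 4 pixels, skewed like a satellite pass.
lat = [[-10.0, -10.1, -10.2, -10.3],
       [-11.0, -11.1, -11.2, -11.3],
       [-12.0, -12.1, -12.2, -12.3]]
lon = [[120.0, 121.0, 122.0, 123.0],
       [119.8, 120.8, 121.8, 122.8],
       [119.6, 120.6, 121.6, 122.6]]
print(bounding_box(lat, lon))
print(roi_to_indices(lat, lon, -11.5, -10.05, 120.5, 122.5))
```

The index window is what ends up in the DAP subscripts. Because the swath is skewed, the window generally contains some pixels outside the ROI.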

Geospatial model


Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
  • We were on the verge of adding remapping too
  • All these features can be done as a post-filter, à la the Unix philosophy of "do one thing and do it well". V2 supports this approach
• Web service with Tomcat was a legacy of the TPAC original. Would be simpler just as a standalone Java app (and use CGI, not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don't use a giant software project to learn a new language

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL 'select's and the aggregator takes the rest of the load (very necessary for swath data)
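The post-filter idea can be illustrated with a small chain of independent functions, each doing one thing. In the real system the filters act on the output netCDF file; here in-memory lists are swapped in so the sketch is self-contained, and the filter names are invented.

```python
import csv
import io
from functools import reduce


def clip(lo, hi):
    """Filter factory: clamp every cell to [lo, hi]."""
    return lambda grid: [[min(hi, max(lo, v)) for v in row] for row in grid]


def to_csv(grid):
    """Terminal filter: render the grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


def run_filters(data, filters):
    """Apply each filter to the output of the previous one."""
    return reduce(lambda acc, f: f(acc), filters, data)


# Invented aggregator output, passed through two independent filters.
grid = [[0.2, 1.4], [-0.1, 0.9]]
print(run_filters(grid, [clip(0.0, 1.0), to_csv]))
```

New output formats or statistics then become extra filters at the end of the chain, and the core extraction code stays unchanged.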

Page 13: Edward King SPEDDEXES 2014

Aggregator ndash a system clientbull accepts a list of URLs and various metadata codified as XML

returned by the web query service bull Computes necessary index ranges to create DAP constraint URLsbull reads data from each URL and combines (aggregates) each data

array into one (or more) arrays or files for output to the user bull writes the data output file (netCDF)bull Framework supports post-processing filters on netCDF file

T

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name14 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name15 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

bull Use to examine progress and ultimately get links to output netCDF and log filesbull Can also return as JSON for easy machine

interface

User Experience ndash higher level

bull Machine interface supports simple web front-endbull Or full portal bull Or distributed clients in a

clusterbull NOTE we do NOT attempt to

deliver data in ldquoweb-timerdquo (which is not a realistic objective for GB-scale data systems)

Presentation title | Presenter name16 |

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn’t have native support for time, lon, lat etc. (no OGC fluff, so…)
  http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata
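To make the "data model in the database" idea concrete, here is a toy catalogue that records a time range and bounding box per file and answers a space-time query with a single SQL select. It is only a sketch: the talk names PostGIS, whereas this uses Python's built-in sqlite3 with plain bounding-box columns, and the table and column names are invented for the example.

```python
import sqlite3


def build_catalogue(rows):
    """Create an in-memory table of per-file temporal and spatial extents."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE granule (url TEXT, t_start TEXT, t_end TEXT,"
        " west REAL, east REAL, south REAL, north REAL)"
    )
    db.executemany("INSERT INTO granule VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return db


def find_granules(db, t0, t1, west, east, south, north):
    """Return URLs whose extent overlaps the requested time range and box.

    Dates are ISO 8601 strings, so string comparison orders them correctly.
    """
    query = (
        "SELECT url FROM granule"
        " WHERE t_start <= ? AND t_end >= ?"
        " AND west <= ? AND east >= ?"
        " AND south <= ? AND north >= ?"
        " ORDER BY t_start"
    )
    params = (t1, t0, east, west, north, south)
    return [row[0] for row in db.execute(query, params)]


db = build_catalogue([
    ("http://a/f1.nc", "2014-01-01", "2014-01-01", 110.0, 155.0, -45.0, -10.0),
    ("http://a/f2.nc", "2014-01-02", "2014-01-02", 110.0, 155.0, -45.0, -10.0),
    ("http://a/f3.nc", "2014-01-02", "2014-01-02", 160.0, 170.0, -45.0, -10.0),
])
hits = find_granules(db, "2014-01-02", "2014-01-03", 140.0, 150.0, -40.0, -30.0)
```

The query returns only URLs; turning the region of interest into array subscripts for each URL is a separate step.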

Metadata Harvester

[Figure labels: Imagery, Latitude, Longitude]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs ‘helper’ config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model
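For swath data, where latitude and longitude are lookup tables, both harvester tasks reduce to scanning those tables. The sketch below shows one plain-Python way to do it, with nested lists standing in for the arrays; the function names are invented here, and it deliberately ignores complications such as swaths that cross the antimeridian.

```python
def bounding_box(lat, lon):
    """Bounding box (west, east, south, north) of 2-D lat/lon lookup tables."""
    lats = [v for row in lat for v in row]
    lons = [v for row in lon for v in row]
    return min(lons), max(lons), min(lats), max(lats)


def roi_to_index_window(lat, lon, west, east, south, north):
    """Smallest (line0, line1, pixel0, pixel1) window covering the ROI.

    Scans every pixel because swath geolocation has no closed-form inverse.
    Returns None when no pixel falls inside the ROI.
    """
    hits = [
        (i, j)
        for i, (lat_row, lon_row) in enumerate(zip(lat, lon))
        for j, (la, lo) in enumerate(zip(lat_row, lon_row))
        if south <= la <= north and west <= lo <= east
    ]
    if not hits:
        return None
    lines = [i for i, _ in hits]
    pixels = [j for _, j in hits]
    return min(lines), max(lines), min(pixels), max(pixels)


# A tiny skewed 3 x 3 "swath" used purely for illustration.
lat = [[-10, -10, -10], [-11, -11, -11], [-12, -12, -12]]
lon = [[140, 141, 142], [140.5, 141.5, 142.5], [141, 142, 143]]
box = bounding_box(lat, lon)
window = roi_to_index_window(lat, lon, 141, 142, -12, -11)
```

The returned window is rectangular in index space, so it will usually include some pixels outside the ROI; trimming those is the kind of job a post-filter can do.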

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat the system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
  • We were on the verge of adding remapping too (!)
• All these features can be done as a post-filter, à la the Unix philosophy of “do one thing and do it well”. V2 supports this approach
• Web service with Tomcat was a legacy of the TPAC original. Would be simpler just as a standalone Java app (and use CGI, not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don’t use a giant software project to learn a new language
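The post-filter idea can be illustrated with a chain of small functions, each doing one job on the aggregated grid. This is a generic sketch of the pattern and not the V2 filter framework: the filter names are invented, nested lists stand in for netCDF arrays, and the fill value of -9999 is an assumption for the example.

```python
from functools import reduce

MISSING = -9999.0  # fill value; an assumption for this sketch


def mask_missing(grid):
    """Replace the fill value with None so later filters can skip it."""
    return [[None if v == MISSING else v for v in row] for row in grid]


def scale(factor):
    """Return a filter that multiplies every valid cell by `factor`."""
    def apply(grid):
        return [[None if v is None else v * factor for v in row] for row in grid]
    return apply


def run_filters(grid, filters):
    """Pass the grid through each filter in turn, like a Unix pipeline."""
    return reduce(lambda data, f: f(data), filters, grid)


def mean_of_valid(grid):
    """Statistics extraction as a final step: mean over non-missing cells."""
    values = [v for row in grid for v in row if v is not None]
    return sum(values) / len(values) if values else None


grid = [[0.5, -9999.0], [1.0, 0.0]]
filtered = run_filters(grid, [mask_missing, scale(2)])
average = mean_of_valid(filtered)
```

Because each filter only sees a grid and returns a grid, adding thumbnails, format conversion or remapping means adding a function to the list, not changing the aggregator.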

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL ‘select’s and the aggregator takes the rest of the load (very necessary for swath data).

[Diagram labels: (replicated); OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. Numbered flows 0 to 6, including 3: XML, 4: DAP, 5: netCDF files.]

Opportunities + Shortcomings

• This may meet some of your needs
• (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn’t handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an “appliance”
• V1 live for 2+ years; V2 is being integrated in the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF

Thank you – questions

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

A netCDF aside…

• X = Longitude (deg East): 813 elts
• Y = Latitude (deg North): 670 elts
• T = Time (days since xxx): 1 elt
• WRel2, “Relative Soil Moisture (lower layer)”: 1 x 670 x 813
  • Mean in each cell, unitless
  • 0 <= value <= 1
  • Missing data = -9999

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

[Screenshot callouts: File Download Link; DAP Links]

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

[URL callouts, left to right: Format (.ascii); Var (Wrel2); Time [0:1:0]; Latitude [160:1:165]; Longitude [200:1:205]]

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• e.g. can’t naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
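Taking such a request apart shows how little it carries. The sketch below assumes the request follows the DAP2 hyperslab form `var[start:stride:stop]...` after the `?`; the parser is written for this illustration and handles only a single variable.

```python
import re


def parse_dap_request(url):
    """Split a DAP2 request URL into (variable, [(start, stride, stop), ...])."""
    query = url.partition("?")[2]
    match = re.fullmatch(r"(\w+)((?:\[\d+:\d+:\d+\])+)", query)
    if match is None:
        raise ValueError(f"not a single-variable hyperslab request: {query!r}")
    name, slabs = match.groups()
    ranges = [
        tuple(int(part) for part in slab.split(":"))
        for slab in re.findall(r"\[([\d:]+)\]", slabs)
    ]
    return name, ranges


def hyperslab_shape(ranges):
    """Number of elements selected along each dimension (stop is inclusive)."""
    return [(stop - start) // stride + 1 for start, stride, stop in ranges]


name, ranges = parse_dap_request(
    "http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]"
)
shape = hyperslab_shape(ranges)
```

All the parser can recover is a name and three integer ranges. That the dimensions are time, latitude and longitude, and which coordinates indices 160 to 165 correspond to, has to come from somewhere else.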

Page 15: Edward King SPEDDEXES 2014

User Experience ndash Low levelbull Initiate data request via web call (CGI script)

Presentation title | Presenter name15 |

bull Returns a JSON fragment with ldquohandlerdquo (URL)

bull Use to examine progress and ultimately get links to output netCDF and log filesbull Can also return as JSON for easy machine

interface

User Experience ndash higher level

bull Machine interface supports simple web front-endbull Or full portal bull Or distributed clients in a

clusterbull NOTE we do NOT attempt to

deliver data in ldquoweb-timerdquo (which is not a realistic objective for GB-scale data systems)

Presentation title | Presenter name16 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name17 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name18 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

Presentation title | Presenter name19 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205]

Presentation title | Presenter name20 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around it

Presentation title | Presenter name21 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the database

Presentation title | Presenter name22 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the databasebull It is not necessarily that DAP is the wrong solution it just means

the hard part of the problem is not data volume but metadata

Presentation title | Presenter name23 |

Metadata Harvester

Presentation title | Presenter name24 |

Imagery Latitude Longitude

bull Has to extract spatial bounding boxes from all these different types of filebull Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) ndash each data set needs lsquohelperrsquo config filesbull These can be maintained by the data provider OR the AODAAC adminbull And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model

Presentation title | Presenter name25 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

Presentation title | Presenter name26 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quickly

Presentation title | Presenter name27 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)

Presentation title | Presenter name28 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too
• All these features can be done as a post-filter, à la Unix philosophy of “do one thing and do it well”. V2 supports this approach
• Web service with Tomcat was legacy of TPAC original. Would be simpler just as a standalone Java app (and use CGI not WSDL, for example)
• Formats with in-file compression (nc4, HDF) help
• Robust data serving is essential
• Don’t use a giant software project to learn a new language
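The post-filter idea can be sketched as a small standalone step that runs after extraction. This is only an illustration of the approach, not AODAAC code: the function name `grid_to_csv`, the row layout and the fill value are assumptions made for the example.

```python
import csv
import io

MISSING = -9999.0  # assumed fill value, for illustration only


def grid_to_csv(lats, lons, grid, out):
    """Write one CSV row (lat, lon, value) per valid cell of a 2-D grid."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon", "value"])
    for i, lat in enumerate(lats):
        for j, lon in enumerate(lons):
            value = grid[i][j]
            if value != MISSING:  # skip cells flagged as missing
                writer.writerow([lat, lon, value])


if __name__ == "__main__":
    # Tiny hypothetical extraction result: 2 latitudes x 2 longitudes.
    buf = io.StringIO()
    grid_to_csv([-10.0, -10.5], [120.0, 120.5],
                [[0.2, MISSING], [0.4, 0.6]], buf)
    print(buf.getvalue(), end="")
```

Because the filter only consumes an already-extracted grid, it can be swapped or chained without touching the extraction service itself.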

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS (Web Query Service), which became a bottleneck. In V2 the WQS is just a series of SQL ‘select’s and the aggregator takes the rest of the load (very necessary for swath data).

[Diagram: system components and numbered data flows. Components: OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. Flows are numbered 0 to 6; flow 3 is labelled XML, flow 4 DAP, and flow 5 netCDF files. One element is marked “(replicated)”.]
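A query service that is "just a series of SQL selects" might look roughly like the sketch below. It uses an in-memory SQLite table purely so the example is self-contained; the real system uses PostGIS, and the table name `granule`, its columns and the example URLs are all hypothetical.

```python
import sqlite3


def demo_catalogue():
    """Build a tiny in-memory catalogue of harvested file bounding boxes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE granule (url TEXT, obs_time TEXT, "
        "min_lon REAL, max_lon REAL, min_lat REAL, max_lat REAL)"
    )
    conn.executemany(
        "INSERT INTO granule VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("http://example.org/opendap/a.nc", "2014-01-01", 110.0, 130.0, -30.0, -10.0),
            ("http://example.org/opendap/b.nc", "2014-01-02", 140.0, 155.0, -45.0, -30.0),
            ("http://example.org/opendap/c.nc", "2014-02-01", 110.0, 130.0, -30.0, -10.0),
        ],
    )
    return conn


def find_granules(conn, lon_min, lon_max, lat_min, lat_max, t_start, t_end):
    """Return URLs of files whose bounding box overlaps the region and time range."""
    rows = conn.execute(
        """
        SELECT url FROM granule
        WHERE max_lon >= ? AND min_lon <= ?
          AND max_lat >= ? AND min_lat <= ?
          AND obs_time BETWEEN ? AND ?
        ORDER BY obs_time
        """,
        (lon_min, lon_max, lat_min, lat_max, t_start, t_end),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(find_granules(demo_catalogue(), 115, 120, -25, -20,
                        "2014-01-01", "2014-01-31"))
```

The query only returns a list of candidate URLs; everything that touches the data itself is left to a separate client such as the aggregator.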

Opportunities + Shortcomings

• This may meet some of your needs (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn’t handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an “appliance”
• V1 live for 2+ years; V2 is being integrated in the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

Thank you – questions

A netCDF aside…

[Figure: structure of an example netCDF variable]

• Variable: WRel2, “Relative Soil Moisture (lower layer)”
• Dimensions: T = Time (days since xxx), 1 elt; Y = Latitude (deg North), 670 elts; X = Longitude (deg East), 813 elts
• Shape: 1 x 670 x 813
• Values: mean in each cell, unitless, 0 <= value <= 1
• Missing data = -9999
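As a small illustration of why the missing-data value matters to any consumer of such a variable, the sketch below averages a grid while skipping cells equal to -9999. The function name `valid_mean` and the nested-list grid are assumptions for the example; no netCDF library is involved.

```python
MISSING = -9999.0  # missing-data value quoted for WRel2


def valid_mean(grid, missing=MISSING):
    """Mean of all cells in a 2-D grid, ignoring cells equal to the missing value."""
    values = [v for row in grid for v in row if v != missing]
    if not values:
        return None  # nothing valid to average
    return sum(values) / len(values)


if __name__ == "__main__":
    print(valid_mean([[0.25, MISSING], [0.75, 0.5]]))
```

Without the mask, a single missing cell would drag a quantity bounded by 0 and 1 far outside its valid range.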

OPeNDAP server – explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

File Download Link, DAP Links

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

URL parts, in order: Format (.ascii), Var (Wrel2), Time [0:1:0], Latitude [160:1:165], Longitude [200:1:205]
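The way such a request is assembled can be sketched in a few lines. The helper name `dap_ascii_url` and the example host are hypothetical; the `[start:stride:stop]` subscripts and the `.ascii` suffix follow the DAP URL form described here.

```python
def dap_ascii_url(dataset_url, variable, index_ranges):
    """Build a DAP ASCII request for a hyperslab of one variable.

    index_ranges holds one (start, stride, stop) tuple per dimension,
    in the variable's dimension order; stop is inclusive.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in index_ranges
    )
    return f"{dataset_url}.ascii?{variable}{hyperslab}"


if __name__ == "__main__":
    # Time index 0, latitude indices 160..165, longitude indices 200..205.
    print(dap_ascii_url("http://example.org/opendap/file.nc", "Wrel2",
                        [(0, 1, 0), (160, 1, 165), (200, 1, 205)]))
```

Note that the caller must already know the dimension order and which indices correspond to which times and places; the URL carries none of that.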

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
• http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
• cf. Fortran: float Wrel2(1,670,813)
• E.g. can’t naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
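For a rectangular lon/lat grid, the "extra work" of turning a bounding box into subscripts is a pair of linear index calculations. The sketch below assumes cell centres at `origin + i * step`; the function name `axis_range`, the tolerance and the grid origin and spacing in the usage lines are made up for illustration.

```python
import math

EPS = 1e-9  # tolerance for floating-point round-off at cell edges


def axis_range(origin, step, count, lo, hi):
    """Inclusive index range of cell centres (origin + i * step) lying within [lo, hi].

    Works for ascending (step > 0) and descending (step < 0) axes.
    Returns None if no cell centre falls inside the interval.
    """
    a = (lo - origin) / step
    b = (hi - origin) / step
    first = max(0, math.ceil(min(a, b) - EPS))
    last = min(count - 1, math.floor(max(a, b) + EPS))
    if first > last:
        return None
    return first, last


if __name__ == "__main__":
    # Hypothetical grid: longitudes start at 112.0 going east in 0.5 degree steps,
    # latitudes start at -10.0 going south in 0.5 degree steps.
    print(axis_range(112.0, 0.5, 813, 120.0, 122.4))    # longitude indices
    print(axis_range(-10.0, -0.5, 670, -20.0, -15.0))   # latitude indices
```

The resulting index pairs are what would go into the `[start:1:stop]` subscripts of the request; swath data and non-linear projections need a different mapping.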

Page 16: Edward King SPEDDEXES 2014

User Experience – higher level

• Machine interface supports a simple web front-end
• Or a full portal
• Or distributed clients in a cluster
• NOTE: we do NOT attempt to deliver data in “web-time” (which is not a realistic objective for GB-scale data systems)

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn’t have native support for time, lon, lat, etc. (no OGC fluff, so…) http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

[Figure: file contents labelled Imagery, Latitude, Longitude]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs ‘helper’ config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
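A minimal sketch of the two harvester tasks for explicitly geolocated (swath-like) data follows: deriving a bounding box from latitude/longitude lookup tables, and mapping a user ROI back to a window of grid indices. The config keys, variable names and function names are hypothetical, and the real helper configs are XML files whose layout is not shown here.

```python
# Hypothetical 'helper' config: nominates the variables that carry geolocation.
HELPER_CONFIG = {"latitude_variable": "lat", "longitude_variable": "lon"}


def bounding_box(variables, config=HELPER_CONFIG):
    """Bounding box of a file from its 2-D latitude/longitude lookup tables.

    Ignores antimeridian crossings and fill values, which a real harvester
    would have to handle.
    """
    lats = [v for row in variables[config["latitude_variable"]] for v in row]
    lons = [v for row in variables[config["longitude_variable"]] for v in row]
    return {"min_lat": min(lats), "max_lat": max(lats),
            "min_lon": min(lons), "max_lon": max(lons)}


def roi_to_index_window(variables, roi, config=HELPER_CONFIG):
    """Smallest (row_min, row_max, col_min, col_max) window holding every pixel inside roi."""
    lat = variables[config["latitude_variable"]]
    lon = variables[config["longitude_variable"]]
    hits = [(r, c)
            for r, row in enumerate(lat)
            for c, value in enumerate(row)
            if roi["min_lat"] <= value <= roi["max_lat"]
            and roi["min_lon"] <= lon[r][c] <= roi["max_lon"]]
    if not hits:
        return None  # ROI does not intersect this file
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), max(rows), min(cols), max(cols)
```

The bounding box is what would be stored in the spatial database at harvest time; the index window is only computed later, once a user ROI is known, and is where the per-file cost of swath data shows up.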

Geospatial model

Page 18: Edward King SPEDDEXES 2014

DAP implications

• It does one thing really well: it accesses and delivers subsets of n-dimensional data
• Semantically weak: e.g. it doesn't have native support for time, lon, lat, etc. (no OGC fluff, so…)
  • http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata
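As a rough illustration of what such a request looks like from the client side, the sketch below assembles a DAP2-style constraint expression from plain index ranges. The base URL, the variable name `lst` and the helper `hyperslab_url` are placeholders chosen for illustration, not part of AODAAC.

```python
def hyperslab_url(base_url, variable, ranges, response="ascii"):
    """Build a DAP2-style subset URL from (start, stride, stop) index triples.

    Nothing here knows about time, latitude or longitude: the caller must
    already know which array index corresponds to which coordinate value.
    """
    subscripts = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in ranges
    )
    return f"{base_url}.{response}?{variable}{subscripts}"


# Hypothetical server and file name; three index ranges, one per dimension.
url = hyperslab_url(
    "http://a/opendap/file.nc",
    "lst",
    [(0, 1, 0), (160, 1, 165), (200, 1, 205)],
)
print(url)
```

The function only formats subscripts, which is the point: everything that gives those numbers a geographic meaning has to live somewhere else.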

Metadata Harvester

[Figure panels: Imagery, Latitude, Longitude]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant): each data set needs 'helper' config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model
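For swath data, where latitude and longitude are lookup tables, one simple way to take a user ROI back to the grid is to scan the geolocation arrays and keep the smallest index window containing every pixel inside the ROI. This is a minimal sketch, assuming small in-memory 2-D lists and a hypothetical helper name `roi_to_index_window`; the slides do not describe how AODAAC itself does this.

```python
def roi_to_index_window(lat, lon, south, north, west, east):
    """Return ((line_min, line_max), (pixel_min, pixel_max)) covering the ROI.

    lat and lon are 2-D lists of equal shape (line, pixel) holding the
    geolocation of every pixel. Returns None if no pixel falls in the ROI.
    Assumes the ROI does not cross the antimeridian.
    """
    lines, pixels = [], []
    for i, (lat_row, lon_row) in enumerate(zip(lat, lon)):
        for j, (la, lo) in enumerate(zip(lat_row, lon_row)):
            if south <= la <= north and west <= lo <= east:
                lines.append(i)
                pixels.append(j)
    if not lines:
        return None
    return (min(lines), max(lines)), (min(pixels), max(pixels))


# Tiny made-up 3 x 4 swath with a slight skew between lines.
lat = [[-40.0, -40.1, -40.2, -40.3],
       [-41.0, -41.1, -41.2, -41.3],
       [-42.0, -42.1, -42.2, -42.3]]
lon = [[145.0, 146.0, 147.0, 148.0],
       [145.2, 146.2, 147.2, 148.2],
       [145.4, 146.4, 147.4, 148.4]]

window = roi_to_index_window(lat, lon, south=-42.5, north=-40.5,
                             west=146.1, east=147.5)
print(window)
```

The resulting index window can then be used as the line and pixel subscripts of a DAP request. A brute-force scan like this is only reasonable for small arrays; it is shown for clarity rather than speed.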

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too
• All these features can be done as a post-filter, à la Unix philosophy of "do one thing and do it well". V2 supports this approach
• Web service with Tomcat was legacy of TPAC original. Would be simpler just as a standalone Java app (and use CGI not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don't use a giant software project to learn a new language

Distribute the computing and modularise: V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL 'select's and the aggregator takes the rest of the load (very necessary for swath data).

[Figure: system diagram. Components: OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. One element is marked "(replicated)". Numbered flows 0 to 6, with 3 labelled XML, 4 labelled DAP and 5 labelled netCDF files.]
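To illustrate what a query service reduced to SQL selects could look like, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in for the PostGIS database. The table `granule`, its columns, the sample rows and the function `find_granules` are all invented for illustration; the real AODAAC schema is not shown in the slides.

```python
import sqlite3

# In-memory stand-in for the spatial database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE granule (
           url TEXT, t_start TEXT, t_end TEXT,
           west REAL, east REAL, south REAL, north REAL)"""
)
conn.executemany(
    "INSERT INTO granule VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("http://a/opendap/f1.nc", "2014-01-01", "2014-01-01", 110.0, 155.0, -45.0, -10.0),
        ("http://a/opendap/f2.nc", "2014-01-02", "2014-01-02", 110.0, 155.0, -45.0, -10.0),
        ("http://a/opendap/f3.nc", "2014-01-02", "2014-01-02", 160.0, 180.0, -45.0, -10.0),
    ],
)


def find_granules(conn, t0, t1, west, east, south, north):
    """Select URLs whose time range and bounding box overlap the request."""
    rows = conn.execute(
        """SELECT url FROM granule
           WHERE t_start <= ? AND t_end >= ?
             AND west <= ? AND east >= ?
             AND south <= ? AND north >= ?
           ORDER BY url""",
        (t1, t0, east, west, north, south),
    )
    return [row[0] for row in rows]


hits = find_granules(conn, "2014-01-02", "2014-01-03", 140.0, 150.0, -44.0, -40.0)
print(hits)
```

In this arrangement the query step only returns a list of candidate URLs; the heavier work of subsetting each file is left to a separate client, which mirrors the split between the WQS and the aggregator.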

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back end filters such as
  • Format conversions (e.g. GeoTIFF, csv)
  • Shape file cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn't handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an "appliance"
• V1 live for 2+ years; V2 is being integrated in IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF
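A back end filter of this kind can be a small stand-alone step that runs after extraction. The sketch below converts an already-extracted 2-D subset to CSV text; the function name `grid_to_csv`, the column names, the missing-data value and the sample numbers are assumptions made for illustration.

```python
import csv
import io


def grid_to_csv(lats, lons, values, missing=-9999.0):
    """Write one 'lat,lon,value' row per non-missing cell; return the CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon", "value"])
    for lat, row in zip(lats, values):
        for lon, value in zip(lons, row):
            if value != missing:
                writer.writerow([lat, lon, value])
    return out.getvalue()


# Made-up 2 x 2 subset with one missing cell.
text = grid_to_csv(
    lats=[-18.0, -18.05],
    lons=[122.0, 122.05],
    values=[[0.25, -9999.0], [0.5, 0.75]],
)
print(text)
```

Because the filter only consumes the extracted subset, it can be added or replaced without touching the extraction service itself.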

Thank you – questions

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

A netCDF aside…

• X = Longitude (deg East), 813 elts
• Y = Latitude (deg North), 670 elts
• T = Time (days since xxx), 1 elt
• WRel2: "Relative Soil Moisture (lower layer)", 1 x 670 x 813
  • Mean in each cell, unitless
  • 0 <= value <= 1
  • Missing data = -9999
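With a variable described like this, the missing-data value has to be excluded before any statistics are computed. Here is a small sketch, assuming the values have already been read into a nested list; the function name `valid_mean` and the sample numbers are illustrative only.

```python
MISSING = -9999.0


def valid_mean(grid, missing=MISSING, lo=0.0, hi=1.0):
    """Mean of all cells that are not missing and lie inside [lo, hi].

    Returns None when no valid cell exists.
    """
    valid = [v for row in grid for v in row if v != missing and lo <= v <= hi]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Made-up 2 x 3 subset of a relative soil moisture field.
subset = [[0.25, -9999.0, 0.75],
          [0.5, 0.5, -9999.0]]
print(valid_mean(subset))
```

Forgetting the mask would drag the mean far below zero, which is why the valid range and missing value are part of the variable's description.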

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

• File Download Link
• DAP Links

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

URL parts, in order: Format (ascii), Var (Wrel2), Time [0:1:0], Latitude [160:1:165], Longitude [200:1:205]

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• E.g. can't naturally specify a bounding box (cf. WMS)
• This is both
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
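For a regular lon/lat grid, that extra work can be as small as a linear mapping from coordinates to indices. The sketch below turns a bounding box into index ranges and then into a subset URL. The grid origin, cell size and row orientation (`LON0`, `LAT0`, `STEP`, rows running north to south) are assumed values chosen for illustration; only the dimension sizes (670 x 813) come from the slides.

```python
# Assumed grid description (not from the source): 0.05 degree cells,
# first cell centred at 112.0 E, -10.0 N, rows running north to south.
LON0, LAT0, STEP = 112.0, -10.0, 0.05
NLAT, NLON = 670, 813


def clamp(index, size):
    """Keep an index inside the valid range 0 .. size - 1."""
    return max(0, min(size - 1, index))


def bbox_to_subscripts(west, east, south, north):
    """Map a lon/lat bounding box to (lat_range, lon_range) index pairs."""
    x0 = clamp(int(round((west - LON0) / STEP)), NLON)
    x1 = clamp(int(round((east - LON0) / STEP)), NLON)
    y0 = clamp(int(round((LAT0 - north) / STEP)), NLAT)
    y1 = clamp(int(round((LAT0 - south) / STEP)), NLAT)
    return (y0, y1), (x0, x1)


def subset_url(base, variable, time_index, lat_range, lon_range):
    """Format the index ranges as a DAP2 hyperslab with stride 1."""
    parts = [(time_index, time_index), lat_range, lon_range]
    subs = "".join(f"[{a}:1:{b}]" for a, b in parts)
    return f"{base}.ascii?{variable}{subs}"


lat_range, lon_range = bbox_to_subscripts(
    west=122.0, east=122.25, south=-18.25, north=-18.0
)
print(subset_url("http://a/file.nc", "Wrel2", 0, lat_range, lon_range))
```

The grid constants play the role of the metadata that DAP itself does not carry: without them the bounding box cannot be turned into subscripts.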

Page 20: Edward King SPEDDEXES 2014

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205]

Presentation title | Presenter name20 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around it

Presentation title | Presenter name21 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the database

Presentation title | Presenter name22 |

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the databasebull It is not necessarily that DAP is the wrong solution it just means

the hard part of the problem is not data volume but metadata

Presentation title | Presenter name23 |

Metadata Harvester

Presentation title | Presenter name24 |

Imagery Latitude Longitude

bull Has to extract spatial bounding boxes from all these different types of filebull Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) ndash each data set needs lsquohelperrsquo config filesbull These can be maintained by the data provider OR the AODAAC adminbull And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model

Presentation title | Presenter name25 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

Presentation title | Presenter name26 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quickly

Presentation title | Presenter name27 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)

Presentation title | Presenter name28 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

Presentation title | Presenter name29 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

Presentation title | Presenter name30 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

Presentation title | Presenter name31 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) help

Presentation title | Presenter name32 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essential

Presentation title | Presenter name33 |

Lessons Learnedbull DAP performs well but you need to build a fair bit of infrastructure to make it

general (V1 system only handled lonlat grids)bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essentialbull Donrsquot use a giant software project to learn a new language

Presentation title | Presenter name34 |

(replicated)

Distribute the computing and modularise ndash V1 had a lot (more) of the compute in the WQS which became a bottleneck In V2 the WQS is just a series of SQL lsquoselectrsquos and the aggregator takes the rest of the load (very necessary for swath data)

Presentation title | Presenter name35 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

Job Controller (Python)

Web Server (Apache)

Temporary Data Store

1 0

3XML

6

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie-cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn’t handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an “appliance”
• V1 has been live for 2+ years; V2 is being integrated into the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF

Thank you – questions

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

A netCDF aside…

[Figure: a netCDF variable and its coordinate axes]

• X = Longitude (deg East), 813 elts
• Y = Latitude (deg North), 670 elts
• T = Time (days since xxx), 1 elt
• WRel2: “Relative Soil Moisture (lower layer)”
  • 1 x 670 x 813
  • Mean in each cell, unitless
  • 0 <= value <= 1
  • Missing data = -9999

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

[Screenshots: the server's directory listing in a browser, with callouts for the File Download Link and the DAP Links]

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…
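The DDS, DAS and DDX links shown on these slides are the same dataset URL with a different suffix appended, which is standard DAP2 behaviour. A minimal sketch of how a client might build them is below; the `example.org` URL and the helper name `dap_links` are made up for illustration.

```python
# Response types a DAP2 server exposes by appending a suffix to the dataset URL.
DAP2_SUFFIXES = {
    "dds": "data structure (variable names, types, dimensions)",
    "das": "data attributes (units, long names, fill values)",
    "ddx": "XML package combining structure and attributes",
    "ascii": "data values as plain text",
    "dods": "data values as DAP2 binary",
}

def dap_links(dataset_url):
    """Return {suffix: url} for each DAP2 response of one dataset."""
    base = dataset_url.rstrip("/")
    return {suffix: f"{base}.{suffix}" for suffix in DAP2_SUFFIXES}

links = dap_links("http://example.org/opendap/file.nc")
for suffix, url in links.items():
    print(f"{suffix:5s} {url}  # {DAP2_SUFFIXES[suffix]}")
```

A constraint expression (a variable name plus index ranges) goes after a `?` on the data responses, which is what the next slides show.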

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

[Callouts label the parts of the URL in order: Format, Var, Time, Latitude, Longitude]

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• E.g. can’t naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
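The "extra work" for a regular lon/lat grid amounts to turning a bounding box into index ranges before the request can be written. The sketch below shows one way to do that. The grid sizes (670 x 813) come from the slides, but the origin and spacing used in the example are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def index_range(lo, hi, origin, step, size):
    """Map a coordinate interval to inclusive grid indices on a regular axis.

    origin is the coordinate of index 0 and step is the (possibly negative)
    spacing between neighbouring indices.
    """
    # Round away floating-point noise before taking ceil/floor.
    a = round((lo - origin) / step, 6)
    b = round((hi - origin) / step, 6)
    first = max(0, math.ceil(min(a, b)))
    last = min(size - 1, math.floor(max(a, b)))
    if first > last:
        raise ValueError("requested interval does not intersect the grid")
    return first, last

def dap_constraint(var, t_index, lat_range, lon_range):
    """Format a DAP2 hyperslab: one [start:stride:stop] per dimension."""
    parts = [(t_index, t_index), lat_range, lon_range]
    return var + "".join(f"[{a}:1:{b}]" for a, b in parts)

# Assumed grid: row 0 at 10 deg S stepping south, column 0 at 112 deg E,
# both at 0.05 degree spacing (illustrative values, not from the slides).
lat_idx = index_range(-18.26, -17.99, origin=-10.0, step=-0.05, size=670)
lon_idx = index_range(121.99, 122.26, origin=112.0, step=0.05, size=813)
constraint = dap_constraint("Wrel2", 0, lat_idx, lon_idx)
print(constraint)
```

With these assumed grid parameters the bounding box maps to the same subscripts as the example URL. Swath data and non-rectangular projections need a lookup or an inverse transform in place of `index_range`.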

  • IMOS AODAAC – gridded data access
  • Outline
  • National Satellite Data Reception Network
  • Rectangular Grids
  • Swath Data
  • Non-rectangular projections
  • Data Access Protocol
  • Workflow
  • OPeNDAP Transport Layer A Data Standardisation Bus
  • Multi-tiered design – based on TPAC Digital Library
  • Complete System (Version 2) Can be fully distributed
  • Complete System (Version 2) Can be fully distributed (2)
  • Aggregator – a system client
  • User Experience – Low level
  • User Experience – Low level (2)
  • User Experience – higher level
  • DAP implications
  • DAP implications (2)
  • DAP implications (3)
  • DAP implications (4)
  • DAP implications (5)
  • DAP implications (6)
  • DAP implications (7)
  • Metadata Harvester
  • Geospatial model
  • Lessons Learned
  • Lessons Learned (2)
  • Lessons Learned (3)
  • Lessons Learned (4)
  • Lessons Learned (5)
  • Lessons Learned (6)
  • Lessons Learned (7)
  • Lessons Learned (8)
  • Lessons Learned (9)
  • Distribute the computing and modularise – V1 had a lot (more) o
  • Opportunities + Shortcomings
  • Thank you – questions
  • A netCDF aside…
  • OPeNDAP server – explore via a browser
  • OPeNDAP server – explore via a browser (2)
  • OPeNDAP server – explore via a browser (3)
  • OPeNDAP server – explore via a browser (4)
  • Data Structures DDS link…
  • Data Attributes DAS link…
  • XML Package DDX link…
  • And finally – data…
  • And finally – data… (2)
  • And finally – data… (3)
  • Note absence of semantics in the exchange
Page 21: Edward King SPEDDEXES 2014

DAP implications

• It does one thing really well – accesses and delivers subsets of n-dimensional data
• Semantically weak – e.g. doesn’t have native support for time, lon, lat etc. (no OGC fluff, so…)
  http://a/opendap/file.nc.ascii?lst[0:1:0][160:1:165][200:1:205]
• Needs a spatio-temporal information infrastructure around it
• We do this with a data model implemented in the database
• It is not necessarily that DAP is the wrong solution; it just means the hard part of the problem is not data volume but metadata

Metadata Harvester

[Figure labels: Imagery, Latitude, Longitude]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) – each data set needs ‘helper’ config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
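As a rough illustration of what a ‘helper’ config buys the harvester, the sketch below takes a dictionary that nominates the longitude and latitude variables and computes a bounding box from them, whether they are 1-D grid axes or 2-D swath lookup tables. The config keys (`lon_var`, `lat_var`, `fill_value`) and the sample values are invented for the example; the slides indicate the real config files are XML, and their layout is not given.

```python
def flatten(values):
    """Yield scalars from a 1-D axis or a 2-D lookup table (nested lists)."""
    for item in values:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

def bounding_box(variables, helper):
    """Return (west, south, east, north) using the variables the helper names."""
    fill = helper.get("fill_value")
    lons = [v for v in flatten(variables[helper["lon_var"]]) if v != fill]
    lats = [v for v in flatten(variables[helper["lat_var"]]) if v != fill]
    return (min(lons), min(lats), max(lons), max(lats))

# Gridded file: geolocation is implicit in two 1-D coordinate axes.
grid = {"longitude": [112.0, 112.05, 112.1], "latitude": [-10.0, -10.05]}
grid_helper = {"lon_var": "longitude", "lat_var": "latitude"}

# Swath file: geolocation is explicit, one lon/lat value per pixel.
swath = {
    "lon": [[150.1, 150.9], [150.3, -999.0]],
    "lat": [[-42.0, -41.8], [-41.5, -999.0]],
}
swath_helper = {"lon_var": "lon", "lat_var": "lat", "fill_value": -999.0}

grid_box = bounding_box(grid, grid_helper)
swath_box = bounding_box(swath, swath_helper)
print(grid_box)
print(swath_box)
```

A simple min/max box like this breaks for footprints that cross the antimeridian or a pole, which is one reason per-data-set help is needed in practice.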

Geospatial model


Page 23: Edward King SPEDDEXES 2014

DAP implicationsbull It does one thing really well ndash accesses and delivers subsets of n-

Dimensional data

bull Semantically weak ndash eg doesnrsquot have native support for time lon lat etc (no OGC fluff sohellip) httpaopendapfilencasciilst[010][1601165][2001205] bull Needs a spatio-temporal information infrastructure around itbull We do this with a data model implemented in the databasebull It is not necessarily that DAP is the wrong solution it just means

the hard part of the problem is not data volume but metadata

Presentation title | Presenter name23 |

Metadata Harvester

Presentation title | Presenter name24 |

Imagery Latitude Longitude

bull Has to extract spatial bounding boxes from all these different types of filebull Need to help the harvester identify geospatial information in files (nominating particular variables as relevant) ndash each data set needs lsquohelperrsquo config filesbull These can be maintained by the data provider OR the AODAAC adminbull And then you need to be able to take a user ROI and transform it back to the grid

Geospatial model

Presentation title | Presenter name25 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

Presentation title | Presenter name26 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quickly

Presentation title | Presenter name27 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)

Presentation title | Presenter name28 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

Presentation title | Presenter name29 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

Presentation title | Presenter name30 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

Presentation title | Presenter name31 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) help

Presentation title | Presenter name32 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essential

Presentation title | Presenter name33 |

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (the V1 system only handled lon/lat grids).
• The incredible variety of input data makes this hard very quickly.
• It is tempting to bloat the system with features (V1 could produce thumbnails for the web and output HDF and ASCII).
• We were on the verge of adding remapping too.
• All these features can be done as a post-filter, à la the Unix philosophy of "do one thing and do it well". V2 supports this approach.
• The web service with Tomcat was a legacy of the TPAC original. It would be simpler just as a standalone Java app (and use CGI, not WSDL, for example).
• Formats with in-file compression (nc4, hdf) help.
• Robust data serving is essential.
• Don't use a giant software project to learn a new language.

Distribute the computing and modularise: V1 had a lot (more) of the compute in the Web Query Service (WQS), which became a bottleneck. In V2 the WQS is just a series of SQL 'select's and the aggregator takes the rest of the load (very necessary for swath data).

[System diagram. Labelled components: OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. One component is marked "(replicated)". Numbered flows 0 to 6, including 3 (XML), 4 (DAP) and 5 (netCDF files).]

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn't handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an "appliance"
• V1 has been live for 2+ years; V2 is being integrated into the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF
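As an illustration of the back-end filter idea, here is a minimal sketch of a "statistics extraction" post-filter in Python. It is not part of AODAAC: the function name `summarise`, the plain-text input layout and the sample values are assumptions made for the example. The missing-data marker of -9999 follows the soil-moisture example elsewhere in these slides.

```python
import io
import math

MISSING = -9999.0  # missing-data marker; taken from the WRel2 example, adjust per data set


def summarise(lines, missing=MISSING):
    """Return count, min, max and mean of the valid numbers in an iterable of text lines.

    Tokens may be separated by commas or whitespace. Non-numeric tokens
    (labels, subscripts), non-finite values and the missing marker are skipped.
    """
    values = []
    for line in lines:
        for token in line.replace(",", " ").split():
            try:
                value = float(token)
            except ValueError:
                continue  # not a number, e.g. a row label
            if math.isfinite(value) and value != missing:
                values.append(value)
    if not values:
        return {"count": 0, "min": None, "max": None, "mean": None}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


# Hypothetical sample standing in for the text output of an extraction job.
sample = io.StringIO("0.25, 0.75\n-9999, 0.5\n")
stats = summarise(sample)
print(stats)
```

Passing `sys.stdin` instead of the sample stream would let the same function sit at the end of a shell pipeline, which keeps the extraction service itself small, in line with the "do one thing and do it well" lesson.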


Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

Thank you – questions

A netCDF aside…

Example variable: WRel2, "Relative Soil Moisture (lower layer)"

• Dimensions: 1 x 670 x 813 (T x Y x X)
• X = Longitude (deg East), 813 elements
• Y = Latitude (deg North), 670 elements
• T = Time (days since xxx), 1 element
• Values: mean in each cell, unitless, 0 <= value <= 1
• Missing data = -9999

OPeNDAP server – explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

[Browser screenshots with callouts: File Download Link; DAP Links]

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…

And finally – data…

http://.../afile.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

URL parts, left to right: Format (.ascii), Var (Wrel2), then subscripts for Time, Latitude and Longitude.

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://.../afile.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• E.g. can't naturally specify a bounding box (cf. WMS)
• This is both:
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
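To make the "extra work" concrete, the sketch below turns a lon/lat bounding box into DAP subscripts for a regular grid, where coordinates are a linear function of the grid indices. It is an illustration only: the grid origin and spacing (112.0 E, 10.0 S, 0.05 degrees) and the function names are assumptions chosen so that the result matches the subscripts shown above; they are not taken from the actual data set. Each bracket follows the DAP2 hyperslab form [start:stride:stop].

```python
import math


def axis_index_range(lo, hi, origin, step, size, eps=1e-9):
    """Inclusive index range of cells on one axis whose coordinate lies in [lo, hi].

    Assumes a regular axis: coordinate(i) = origin + i * step, with step != 0.
    step is negative when the axis runs from north to south.
    eps absorbs floating-point error in the division.
    """
    a = (lo - origin) / step
    b = (hi - origin) / step
    first = max(0, math.ceil(min(a, b) - eps))
    last = min(size - 1, math.floor(max(a, b) + eps))
    if first > last:
        raise ValueError("requested range does not overlap the grid")
    return first, last


def dap_constraint(var, time_index, lat_range, lon_range):
    """Build a DAP2 constraint for a (time, lat, lon) variable, stride 1 throughout."""
    t = f"[{time_index}:1:{time_index}]"
    y = f"[{lat_range[0]}:1:{lat_range[1]}]"
    x = f"[{lon_range[0]}:1:{lon_range[1]}]"
    return f"{var}{t}{y}{x}"


# Hypothetical grid geometry (an assumption, not read from the real file).
LON_ORIGIN, LON_STEP, LON_SIZE = 112.0, 0.05, 813
LAT_ORIGIN, LAT_STEP, LAT_SIZE = -10.0, -0.05, 670

# User region of interest: west, south, east, north.
west, south, east, north = 122.0, -18.25, 122.25, -18.0

lon_range = axis_index_range(west, east, LON_ORIGIN, LON_STEP, LON_SIZE)
lat_range = axis_index_range(south, north, LAT_ORIGIN, LAT_STEP, LAT_SIZE)
constraint = dap_constraint("Wrel2", 0, lat_range, lon_range)
print(constraint)
```

In practice the origin and spacing would come from the file's coordinate variables or from harvested metadata, and swath or projected data would need a lookup or an inverse projection instead of this linear mapping.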


Metadata Harvester

[Figure labels: Imagery, Latitude, Longitude]

• Has to extract spatial bounding boxes from all these different types of file
• Need to help the harvester identify geospatial information in files (nominating particular variables as relevant): each data set needs 'helper' config files
• These can be maintained by the data provider OR the AODAAC admin
• And then you need to be able to take a user ROI and transform it back to the grid
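A rough sketch of the harvesting step for explicitly geolocated (swath) data, written in Python for illustration: a 'helper' config nominates which variables hold latitude and longitude, and the bounding box is the extent of those lookup tables. The config keys, variable names and the in-memory `dataset` dictionary are all hypothetical; the real harvester is a Java app whose config format is not shown in these slides.

```python
# Hypothetical 'helper' config: nominates the variables that carry geolocation.
HELPER_CONFIG = {
    "latitude_variable": "lat",
    "longitude_variable": "lon",
    "missing_value": -9999.0,
}


def bounding_box(dataset, config):
    """Return (west, south, east, north) from 2-D latitude/longitude lookup tables.

    dataset maps variable names to 2-D lists of floats.
    Missing values are ignored. Swaths that cross the antimeridian are not
    handled in this sketch.
    """
    missing = config["missing_value"]
    lats = [v for row in dataset[config["latitude_variable"]]
            for v in row if v != missing]
    lons = [v for row in dataset[config["longitude_variable"]]
            for v in row if v != missing]
    if not lats or not lons:
        raise ValueError("no valid geolocation values found")
    return (min(lons), min(lats), max(lons), max(lats))


# Tiny made-up swath: 2 lines x 2 pixels, one latitude missing.
swath = {
    "lat": [[-10.0, -10.1], [-11.0, -9999.0]],
    "lon": [[140.0, 141.5], [139.5, 141.0]],
}
bbox = bounding_box(swath, HELPER_CONFIG)
print(bbox)
```

Boxes like this are what a spatial database can index, so that a user ROI can be matched to candidate files before any data is requested.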

Geospatial model

Page 26: Edward King SPEDDEXES 2014

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

Presentation title | Presenter name26 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quickly

Presentation title | Presenter name27 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)

Presentation title | Presenter name28 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

Presentation title | Presenter name29 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

Presentation title | Presenter name30 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

Presentation title | Presenter name31 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) help

Presentation title | Presenter name32 |

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essential

Presentation title | Presenter name33 |

Lessons Learnedbull DAP performs well but you need to build a fair bit of infrastructure to make it

general (V1 system only handled lonlat grids)bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

bull Web service with Tomcat was legacy of TPAC original Would be simpler just as a standalone Java app (and use CGI not WSDL for example)

bull Formats with infile-compression (nc4 hdf) helpbull Robust data serving is essentialbull Donrsquot use a giant software project to learn a new language

Presentation title | Presenter name34 |

(replicated)

Distribute the computing and modularise ndash V1 had a lot (more) of the compute in the WQS which became a bottleneck In V2 the WQS is just a series of SQL lsquoselectrsquos and the aggregator takes the rest of the load (very necessary for swath data)

Presentation title | Presenter name35 |

OPeNDAP Data Servers

URL Crawler and Metadata

Harvester(Java Apps)

SpatialData-base(PostGIS)

Web QueryService(Tomcat Webapp)

Aggre-gator(Java app)

Internet

2

4 DAP

5 ndash netCDF files

Job Controller (Python)

Web Server (Apache)

Temporary Data Store

1 0

3XML

6

Opportunities + Shortcomingsbull This may meet some of your needs bull (including warning you what not to do)bull It could be a lot more useful with some back end filters such as

bull Format conversions (eg geoTIFF csv)bull Shape file cookie cuttingbull Statistics extractionbull Reprojecting + resampling

bull We need tools for managing our XML config files (admin tools)bull Doesnrsquot handle 4+ dimensions yet (eg depth or height)bull Want to make easier to deploy as an ldquoappliancerdquobull V1 live for 2+ years V2 is being integrated in IMOS portal nowbull Would be fun to run against the AGDC data set when it goes netCDF

Presentation title | Presenter name36 |

Marine amp Atmospheric ResearchDr Edward KingIMOS Satellite Remote Sensing Facility Leadert +61 3 6232 5334e edwardkingcsiroauw wwwcsiroaucmar

MARINE amp ATMOSPHERIC RESEARCH

Thank you ndash questions

X=Longitude (deg East)

813 elts

Y=Latitude

(deg North)

670 elts

T=TimeDays since xxx

1 elt

WRel2

ldquoRelative Soil Moisture (lower layer)rdquo

1 x 617 x 813

Mean in each cell unitless

0 lt= value lt= 1

Missing data = -9999

A netCDF asidehellip

OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link

httpaodaac2-cbractcsiroau8080

opendapauscoverawaprun26amonthlywrel2contentshtml

OPeNDAP server ndash explore via a browser

File Download Link DAP Links

Data Structures DDS linkhellip

Data Attributes DAS linkhellip

XML Package DDX linkhellip

And finally – data…

http://afile.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

Parts of the request, in order: Format (.ascii), Var (Wrel2), Time [0:1:0], Latitude [160:1:165], Longitude [200:1:205]
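The request shown on this slide is just a dataset URL, a response suffix, a variable name and one [start:stride:stop] subscript per dimension. A minimal sketch of assembling such a request follows; the function names and the example.org dataset URL are hypothetical, and brackets are left unencoded on the assumption that the server accepts them as written.

```python
def hyperslab(start, stride, stop):
    """Format one DAP-style index range as [start:stride:stop]."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"[{start}:{stride}:{stop}]"


def dap_request_url(dataset_url, response, variable, ranges):
    """Build <dataset>.<response>?<variable>[..][..] from (start, stride, stop) tuples."""
    subscripts = "".join(hyperslab(*r) for r in ranges)
    return f"{dataset_url}.{response}?{variable}{subscripts}"


def cells_requested(ranges):
    """Number of array cells selected by the given index ranges."""
    total = 1
    for start, stride, stop in ranges:
        total *= (stop - start) // stride + 1
    return total


if __name__ == "__main__":
    # Time, latitude and longitude index ranges, as on the slide.
    ranges = [(0, 1, 0), (160, 1, 165), (200, 1, 205)]
    print(dap_request_url("http://example.org/opendap/afile.nc", "ascii", "Wrel2", ranges))
    print(cells_requested(ranges))
```

Swapping the response suffix (for example dds or das in place of ascii) would request the structure or attribute descriptions instead of the data values.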

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
    • http://afile.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
    • cf. Fortran: float Wrel2(1,670,813)
• E.g. can't naturally specify a bounding box (cf. WMS)
• This is both:
    • A weakness: geospatial handling requires extra work
    • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
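Part of that "extra work" for a rectangular lon/lat grid is turning a bounding box into index subscripts. The sketch below assumes a linear axis where coordinate = origin + index * step; the origin and spacing values in the example are hypothetical, chosen only so that the result reproduces the subscripts shown on the slide, and the small tolerance guards against floating-point rounding.

```python
import math

TOLERANCE = 1e-9  # absorbs floating-point rounding in the index arithmetic


def axis_index_range(origin, step, count, lo, hi):
    """Return (first, last) indices whose coordinates lie within [lo, hi].

    Works for ascending (step > 0) and descending (step < 0) axes.
    """
    if step == 0 or count <= 0:
        raise ValueError("step must be non-zero and count positive")
    a = (lo - origin) / step
    b = (hi - origin) / step
    first = max(math.ceil(min(a, b) - TOLERANCE), 0)
    last = min(math.floor(max(a, b) + TOLERANCE), count - 1)
    if first > last:
        raise ValueError("bounding box does not overlap the axis")
    return first, last


def bbox_to_subscripts(lon_axis, lat_axis, west, east, south, north):
    """Format latitude and longitude subscripts for a bounding box.

    Each axis is an (origin, step, count) tuple.
    """
    lat_first, lat_last = axis_index_range(*lat_axis, south, north)
    lon_first, lon_last = axis_index_range(*lon_axis, west, east)
    return f"[{lat_first}:1:{lat_last}][{lon_first}:1:{lon_last}]"


if __name__ == "__main__":
    lon_axis = (112.0, 0.05, 813)    # hypothetical origin and spacing
    lat_axis = (-10.0, -0.05, 670)   # hypothetical; latitude stored north to south
    print(bbox_to_subscripts(lon_axis, lat_axis, 122.0, 122.25, -18.25, -18.0))
```

Swath data and non-linear projections have no such closed-form mapping, which is where explicit lookup tables or projection transforms would have to take over.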

  • IMOS AODAAC – gridded data access
  • Outline
  • National Satellite Data Reception Network
  • Rectangular Grids
  • Swath Data
  • Non-rectangular projections
  • Data Access Protocol
  • Workflow
  • OPeNDAP Transport Layer A Data Standardisation Bus
  • Multi-tiered design – based on TPAC Digital Library
  • Complete System (Version 2) Can be fully distributed
  • Complete System (Version 2) Can be fully distributed (2)
  • Aggregator – a system client
  • User Experience – Low level
  • User Experience – Low level (2)
  • User Experience – higher level
  • DAP implications
  • DAP implications (2)
  • DAP implications (3)
  • DAP implications (4)
  • DAP implications (5)
  • DAP implications (6)
  • DAP implications (7)
  • Metadata Harvester
  • Geospatial model
  • Lessons Learned
  • Lessons Learned (2)
  • Lessons Learned (3)
  • Lessons Learned (4)
  • Lessons Learned (5)
  • Lessons Learned (6)
  • Lessons Learned (7)
  • Lessons Learned (8)
  • Lessons Learned (9)
  • Distribute the computing and modularise – V1 had a lot (more) o
  • Opportunities + Shortcomings
  • Thank you – questions
  • A netCDF aside…
  • OPeNDAP server – explore via a browser
  • OPeNDAP server – explore via a browser (2)
  • OPeNDAP server – explore via a browser (3)
  • OPeNDAP server – explore via a browser (4)
  • Data Structures DDS link…
  • Data Attributes DAS link…
  • XML Package DDX link…
  • And finally – data…
  • And finally – data… (2)
  • And finally – data… (3)
  • Note absence of semantics in the exchange
Page 27: Edward King SPEDDEXES 2014

Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (the V1 system only handled lon/lat grids)
• Incredible variety of input data makes this hard very quickly
• Tempting to bloat the system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too (!)
• All these features can be done as a post-filter, à la the Unix philosophy of "do one thing and do it well"; V2 supports this approach
• The web service with Tomcat was a legacy of the TPAC original. It would be simpler just as a standalone Java app (and use CGI, not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don't use a giant software project to learn a new language

Distribute the computing and modularise

V1 had a lot (more) of the compute in the Web Query Service (WQS), which became a bottleneck. In V2 the WQS is just a series of SQL 'select's and the aggregator takes the rest of the load (very necessary for swath data).

[Diagram: system components and numbered data flows. Components: OPeNDAP Data Servers (replicated); URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; all connected across the Internet. Flow labels: 0, 1, 2, 3 (XML), 4 (DAP), 5 (netCDF files), 6.]
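The split described on this slide, a query service that only runs SQL selects and an aggregator that does the heavy lifting, can be illustrated with a toy catalogue. In this sketch SQLite stands in for the PostGIS spatial database purely so the example is self-contained; the table name, column names, day numbering and example.org URLs are all hypothetical and are not the AODAAC schema.

```python
import sqlite3


def open_catalogue(rows):
    """Create an in-memory catalogue of granules with time and bounding-box columns."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE granules ("
        "url TEXT, day INTEGER, "
        "lon_min REAL, lon_max REAL, lat_min REAL, lat_max REAL)"
    )
    db.executemany("INSERT INTO granules VALUES (?, ?, ?, ?, ?, ?)", rows)
    return db


def select_granules(db, day_from, day_to, west, east, south, north):
    """Query-service role: one SQL select returning URLs that overlap the request."""
    sql = (
        "SELECT url FROM granules "
        "WHERE day BETWEEN ? AND ? "
        "AND lon_max >= ? AND lon_min <= ? "
        "AND lat_max >= ? AND lat_min <= ? "
        "ORDER BY day, url"
    )
    params = (day_from, day_to, west, east, south, north)
    return [row[0] for row in db.execute(sql, params)]


def aggregate(urls, fetch):
    """Aggregator role: fetch every selected granule and collect the results."""
    return [fetch(url) for url in urls]


if __name__ == "__main__":
    catalogue = open_catalogue([
        ("http://example.org/opendap/a.nc", 10, 110.0, 155.0, -45.0, -10.0),
        ("http://example.org/opendap/b.nc", 11, 110.0, 155.0, -45.0, -10.0),
        ("http://example.org/opendap/c.nc", 11, 160.0, 170.0, -45.0, -10.0),
    ])
    hits = select_granules(catalogue, 11, 12, 120.0, 125.0, -20.0, -15.0)
    # A real fetch would issue DAP requests; here it just echoes the URL.
    print(aggregate(hits, fetch=lambda url: url))
```

Because the select only returns candidate URLs, the per-granule subsetting work (and, for swath data, the geolocation lookups) stays in the aggregator, which can be run on a separate machine from the database.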

Page 30: Edward King SPEDDEXES 2014

Lessons Learned

bull DAP performs well but you need to build a fair bit of infrastructure to make it general (V1 system only handled lonlat grids)

bull Incredible variety of input data makes this hard very quicklybull Tempting to bloat system with features (V1 could produce thumbnails for the

web and output HDF and ASCII)bull We were on the verge of adding remapping too ()

bull All these features can be done as a post-filter a la Unix philosophy of ldquodo one thing and do it wellrdquo V2 supports this approach

Presentation title | Presenter name30 |


Lessons Learned

• DAP performs well, but you need to build a fair bit of infrastructure to make it general (the V1 system only handled lon/lat grids)
• The incredible variety of input data makes this hard very quickly
• It is tempting to bloat the system with features (V1 could produce thumbnails for the web and output HDF and ASCII)
• We were on the verge of adding remapping too
• All these features can be done as a post-filter, à la the Unix philosophy of "do one thing and do it well"; V2 supports this approach
• The web service with Tomcat was a legacy of the TPAC original. It would be simpler as a standalone Java app (and use CGI, not WSDL, for example)
• Formats with in-file compression (nc4, hdf) help
• Robust data serving is essential
• Don't use a giant software project to learn a new language

Distribute the computing and modularise – V1 had a lot (more) of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL 'select's and the aggregator takes the rest of the load (very necessary for swath data).

[Diagram: system components are OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet. One element carries the label "(replicated)". Flows are numbered 0 to 6, with 3 labelled XML, 4 labelled DAP and 5 labelled netCDF files.]
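The V2 division of labour described above can be sketched roughly as follows. This is an illustration only, not the AODAAC code: it uses Python's built-in sqlite3 as a stand-in for PostGIS, and the table `granules`, its columns, the example URLs and the function names are all hypothetical. The point is that the query service only selects the URLs of files whose extent overlaps the request; the subsetting itself is left to the aggregator.

```python
import sqlite3

def make_index(rows):
    """Build an in-memory stand-in for the spatial database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE granules ("
        "url TEXT, west REAL, east REAL, south REAL, north REAL, "
        "t_start TEXT, t_end TEXT)"
    )
    db.executemany("INSERT INTO granules VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return db

def select_granule_urls(db, west, east, south, north, t_start, t_end):
    """Return URLs of granules whose extent overlaps the requested box and period."""
    query = (
        "SELECT url FROM granules "
        "WHERE west <= ? AND east >= ? "      # longitude ranges overlap
        "AND south <= ? AND north >= ? "      # latitude ranges overlap
        "AND t_start <= ? AND t_end >= ? "    # time ranges overlap (ISO dates)
        "ORDER BY t_start"
    )
    params = (east, west, north, south, t_end, t_start)
    return [row[0] for row in db.execute(query, params)]

# Two hypothetical granules; only the first overlaps the requested region.
db = make_index([
    ("http://example.org/opendap/a.nc", 110.0, 130.0, -40.0, -20.0,
     "2014-01-01", "2014-01-31"),
    ("http://example.org/opendap/b.nc", 140.0, 155.0, -45.0, -30.0,
     "2014-01-01", "2014-01-31"),
])
urls = select_granule_urls(db, 112.0, 120.0, -35.0, -30.0,
                           "2014-01-10", "2014-01-20")
```

The list of URLs is all the aggregator would need from the query service before it starts issuing DAP requests of its own.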

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie-cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn't handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an "appliance"
• V1 live for 2+ years; V2 is being integrated in the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF
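As a rough idea of what such back-end filters might look like, here is a minimal sketch of two of them, a statistics filter and a CSV conversion, written as small independent functions in the "do one thing and do it well" spirit. The function names, the in-memory grid and the choice of -9999 as the missing-data marker are assumptions made for the example, not part of the system.

```python
import csv
import io
import statistics

MISSING = -9999.0  # assumed missing-data marker for this example

def grid_stats(rows, missing=MISSING):
    """Statistics filter: summarise the valid cells, ignoring missing ones."""
    valid = [v for row in rows for v in row if v != missing]
    if not valid:
        return {"count": 0}
    return {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": statistics.fmean(valid),
    }

def grid_to_csv(rows):
    """Format-conversion filter: render the grid rows as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()

# A tiny 2 x 2 grid with one missing cell.
grid = [[0.25, 0.5], [MISSING, 0.75]]
stats = grid_stats(grid)
text = grid_to_csv(grid)
```

Because each filter takes plain rows and returns plain values, either can be run on its own after extraction without the extraction service knowing about it.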

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

Thank you – questions

A netCDF aside…

• X = Longitude (deg East): 813 elts
• Y = Latitude (deg North): 670 elts
• T = Time (days since xxx): 1 elt
• WRel2, "Relative Soil Moisture (lower layer)": 1 x 670 x 813
  • Mean in each cell, unitless
  • 0 <= value <= 1
  • Missing data = -9999
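The structure on this slide can be written down as a small data description. The sketch below is only a way of making the layout concrete: the class, its field names and the dimension names "time", "latitude" and "longitude" are assumptions for illustration, while the lengths, value range and missing-data marker are the ones listed for X, Y, T and WRel2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GridVariable:
    """Hypothetical description of one gridded variable."""
    name: str
    long_name: str
    dims: tuple        # (dimension name, length) pairs, slowest-varying first
    valid_min: float
    valid_max: float
    missing: float

    @property
    def shape(self):
        return tuple(length for _, length in self.dims)

    def classify(self, value):
        """Label a cell value as 'missing', 'valid' or 'out_of_range'."""
        if value == self.missing:
            return "missing"
        if self.valid_min <= value <= self.valid_max:
            return "valid"
        return "out_of_range"

wrel2 = GridVariable(
    name="WRel2",
    long_name="Relative Soil Moisture (lower layer)",
    dims=(("time", 1), ("latitude", 670), ("longitude", 813)),
    valid_min=0.0,
    valid_max=1.0,
    missing=-9999.0,
)
```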

OPeNDAP server – explore via a browser

httpaodaac2-cbractcsiroau8080
opendapauscoverawaprun26amonthlywrel2contentshtml

File Download Link; DAP Links

Data Structures: DDS link…

Data Attributes: DAS link…

XML Package: DDX link…
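The links a DAP2 server page offers are the dataset URL with a response suffix appended. A minimal sketch of that convention, assuming a hypothetical dataset URL (the helper name and dictionary keys are also just for illustration):

```python
# Response suffixes in common use with DAP2 servers.
DAP2_SUFFIXES = {
    "dds": ".dds",      # data structures
    "das": ".das",      # data attributes
    "ddx": ".ddx",      # XML package combining both
    "ascii": ".ascii",  # data values as text
    "binary": ".dods",  # data values in DAP2 binary form
    "form": ".html",    # browser request form
}

def dap_links(dataset_url):
    """Return the DAP2 response URLs for one dataset URL."""
    return {name: dataset_url + suffix for name, suffix in DAP2_SUFFIXES.items()}

# Hypothetical dataset URL, not the server shown on the slide.
links = dap_links("http://example.org/opendap/wrel2.nc")
```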

And finally – data…

http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]

Format | Var | Time | Latitude | Longitude

Note absence of semantics in the exchange

• There is nothing in the URL to impart meaning, just a variable name and some subscripts…
  • http://a/file.nc.ascii?Wrel2[0:1:0][160:1:165][200:1:205]
  • cf. Fortran: float Wrel2(1,670,813)
• E.g. can't naturally specify a bounding box (cf. WMS)
• This is both
  • A weakness: geospatial handling requires extra work
  • A strength: not limited to geospatial domains
• There is always some extra work to do (or assumptions to make) in order to be able to make a meaningful request
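The "extra work" for a rectangular lon/lat grid is a conversion from a bounding box to index subscripts. The sketch below assumes each axis is linear, described by an origin, a step and a count; those axis values are hypothetical and were chosen only so that the example reproduces the subscripts used above. The function names are likewise illustrative.

```python
import math

def index_range(lo, hi, origin, step, count):
    """Indices of the cells whose centres lie within [lo, hi] on a linear axis."""
    a = (lo - origin) / step
    b = (hi - origin) / step
    first = max(0, math.ceil(min(a, b)))
    last = min(count - 1, math.floor(max(a, b)))
    if first > last:
        raise ValueError("bounding box does not intersect the axis")
    return first, last

def bbox_constraint(var, time_index, lat_axis, lon_axis, south, north, west, east):
    """Build a DAP2 subscript expression var[time][lat][lon] for a bounding box."""
    lat0, lat1 = index_range(south, north, *lat_axis)
    lon0, lon1 = index_range(west, east, *lon_axis)
    return (f"{var}[{time_index}:1:{time_index}]"
            f"[{lat0}:1:{lat1}][{lon0}:1:{lon1}]")

# Hypothetical axes as (origin, step, count); latitude runs north to south.
LAT_AXIS = (0.0, -0.0625, 670)
LON_AXIS = (100.0, 0.0625, 813)

expr = bbox_constraint("Wrel2", 0, LAT_AXIS, LON_AXIS,
                       south=-10.3125, north=-10.0,
                       west=112.5, east=112.8125)
```

Swath data and non-linear projections would need a lookup or an inverse transform in place of `index_range`, which is where most of the additional infrastructure comes from.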

Page 35: Edward King SPEDDEXES 2014

Distribute the computing and modularise: V1 had a lot more of the compute in the WQS, which became a bottleneck. In V2 the WQS is just a series of SQL 'select's, and the aggregator takes the rest of the load (very necessary for swath data).

[Diagram: system components, with numbered data flows 0 to 6]

Diagram labels: (replicated); OPeNDAP Data Servers; URL Crawler and Metadata Harvester (Java apps); Spatial Database (PostGIS); Web Query Service (Tomcat webapp); Aggregator (Java app); Job Controller (Python); Web Server (Apache); Temporary Data Store; Internet

Labelled flows: 3 = XML; 4 = DAP; 5 = netCDF files
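As a rough illustration of what "just a series of SQL 'select's" can look like, here is a minimal sketch of a bounding-box and time-range lookup over a granule index. Everything in it is an assumption made for illustration: the `granule` table, its column names, the example.org URLs, and the use of Python's built-in `sqlite3` with plain bounding-box columns as a stand-in for the PostGIS spatial database named in the diagram.

```python
import sqlite3


def find_granules(conn, west, south, east, north, t_start, t_end):
    """Return URLs of granules whose bounding box and time range overlap the query."""
    sql = """
        SELECT url FROM granule
        WHERE min_lon <= ? AND max_lon >= ?
          AND min_lat <= ? AND max_lat >= ?
          AND t_end >= ? AND t_start <= ?
        ORDER BY t_start
    """
    # Two intervals overlap when each one starts before the other ends.
    rows = conn.execute(sql, (east, west, north, south, t_start, t_end))
    return [row[0] for row in rows]


def demo():
    """Build a tiny in-memory index (hypothetical schema) and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE granule (url TEXT, min_lon REAL, max_lon REAL, "
        "min_lat REAL, max_lat REAL, t_start TEXT, t_end TEXT)"
    )
    conn.executemany(
        "INSERT INTO granule VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("http://example.org/opendap/a.nc", 110.0, 155.0, -45.0, -10.0,
             "2014-01-01", "2014-01-31"),
            ("http://example.org/opendap/b.nc", 110.0, 155.0, -45.0, -10.0,
             "2014-02-01", "2014-02-28"),
            ("http://example.org/opendap/c.nc", 160.0, 180.0, -45.0, -10.0,
             "2014-01-01", "2014-01-31"),
        ],
    )
    # ISO 8601 date strings compare correctly as plain text.
    return find_granules(conn, 140.0, -44.0, 150.0, -40.0,
                         "2014-01-15", "2014-01-20")
```

The query only selects candidate URLs; in the V2 split described above, fetching and subsetting the data itself would be the aggregator's job.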

Page 36: Edward King SPEDDEXES 2014

Opportunities + Shortcomings

• This may meet some of your needs
  • (including warning you what not to do)
• It could be a lot more useful with some back-end filters, such as:
  • Format conversions (e.g. GeoTIFF, CSV)
  • Shapefile cookie cutting
  • Statistics extraction
  • Reprojecting + resampling
• We need tools for managing our XML config files (admin tools)
• Doesn't handle 4+ dimensions yet (e.g. depth or height)
• Want to make it easier to deploy as an "appliance"
• V1 has been live for 2+ years; V2 is being integrated into the IMOS portal now
• Would be fun to run against the AGDC data set when it goes netCDF
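To make the idea of a back-end filter concrete, the following is a minimal sketch of two of the filters listed above, CSV conversion and statistics extraction, applied to a small 2-D grid held as nested lists. The function names, the row layout of the CSV, and the default missing-data value of -9999 (borrowed from the soil moisture example elsewhere in this talk) are assumptions for illustration, not part of the system described.

```python
import csv
import io


def grid_to_csv(values, lats, lons, missing=-9999.0):
    """Return CSV text with one (latitude, longitude, value) row per valid cell."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["latitude", "longitude", "value"])
    for lat, row in zip(lats, values):
        for lon, value in zip(lons, row):
            if value != missing:  # skip cells flagged as missing
                writer.writerow([lat, lon, value])
    return out.getvalue()


def grid_stats(values, missing=-9999.0):
    """Return count, min, max and mean of the valid cells, or None if there are none."""
    valid = [v for row in values for v in row if v != missing]
    if not valid:
        return None
    return {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": sum(valid) / len(valid),
    }
```

Filters of this shape could be chained after aggregation, each taking the extracted grid and returning a derived product.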

Page 37: Edward King SPEDDEXES 2014

Thank you - questions

Marine & Atmospheric Research
Dr Edward King
IMOS Satellite Remote Sensing Facility Leader
t +61 3 6232 5334
e edward.king@csiro.au
w www.csiro.au/cmar

Page 38: Edward King SPEDDEXES 2014

A netCDF aside…

Variable: WRel2, "Relative Soil Moisture (lower layer)"

• Dimensions: T x Y x X = 1 x 670 x 813
  • T = Time (days since xxx): 1 element
  • Y = Latitude (deg North): 670 elements
  • X = Longitude (deg East): 813 elements
• Values: mean in each cell, unitless, 0 <= value <= 1
• Missing data = -9999
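The structure described on this slide can be written down as a small record, which makes the dimension order and the validity rules explicit. This is only a sketch: the `GridVariable` class and its field names are hypothetical, and the latitude size is taken as 670, matching the Y axis label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridVariable:
    """Hypothetical description of one gridded variable."""
    name: str
    long_name: str
    shape: tuple      # (time, latitude, longitude)
    valid_min: float
    valid_max: float
    missing: float

    def n_cells(self):
        """Total number of values the variable holds."""
        total = 1
        for size in self.shape:
            total *= size
        return total

    def is_valid(self, value):
        """True when the value is not the missing flag and lies in the valid range."""
        return value != self.missing and self.valid_min <= value <= self.valid_max


WREL2 = GridVariable(
    name="WRel2",
    long_name="Relative Soil Moisture (lower layer)",
    shape=(1, 670, 813),
    valid_min=0.0,
    valid_max=1.0,
    missing=-9999.0,
)
```

A client that knows only this much can already size its buffers and mask missing cells, but it still knows nothing about where each cell sits on the Earth.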

Page 39: Edward King SPEDDEXES 2014

OPeNDAP server - explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

Page 40: Edward King SPEDDEXES 2014

OPeNDAP server - explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

Page 41: Edward King SPEDDEXES 2014

OPeNDAP server - explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

Callout: File Download Link

Page 42: Edward King SPEDDEXES 2014

OPeNDAP server - explore via a browser

http://aodaac2-cbr.act.csiro.au:8080/opendap/auscover/awap/run26a/monthly/wrel2/contents.html

Callouts: File Download Link; DAP Links
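The DAP links on a server's browse page follow a simple convention in DAP2: a response type is chosen by appending a suffix to the dataset URL, optionally followed by `?` and a constraint expression. A minimal sketch of that convention follows; the `dap_url` helper, the example.org dataset URL and the sample constraint are illustrative assumptions, and individual servers may support additional suffixes.

```python
# Common DAP2 response suffixes, keyed by a short label.
DAP2_SUFFIXES = {
    "dds": ".dds",      # data structure (names, types, dimension sizes)
    "das": ".das",      # data attributes (units, long names, fill values)
    "ddx": ".ddx",      # XML document combining structure and attributes
    "ascii": ".ascii",  # data values as text
    "binary": ".dods",  # data values in DAP2 binary form
    "form": ".html",    # browser data request form
}


def dap_url(dataset_url, kind, constraint=""):
    """Build the URL for one DAP2 response of a dataset."""
    url = dataset_url + DAP2_SUFFIXES[kind]
    if constraint:
        # The constraint is appended as written; brackets are left unencoded here.
        url += "?" + constraint
    return url
```

For example, `dap_url("http://example.org/opendap/afile.nc", "ascii", "Wrel2[0:1:0][160:1:165][200:1:205]")` would request a small hyperslab of one variable as text.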

Page 43: Edward King SPEDDEXES 2014

Data Structures: DDS link…
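A DDS response is plain text that declares each variable with its type and named dimension sizes. The sketch below pulls array declarations out of such text with a regular expression. The sample DDS is hand-written for illustration (the dimension names `time`, `latitude` and `longitude` and the file name are assumptions, not the server's actual output), and the parser deliberately ignores nested Grid and Structure blocks beyond the array lines they contain.

```python
import re

# One array declaration, e.g. "Float32 Wrel2[time = 1][latitude = 670];"
ARRAY_DECL = re.compile(r"(\w+)\s+(\w+)((?:\[\s*\w+\s*=\s*\d+\s*\])+)\s*;")
# One bracketed dimension, e.g. "[latitude = 670]"
DIMENSION = re.compile(r"\[\s*(\w+)\s*=\s*(\d+)\s*\]")

SAMPLE_DDS = """Dataset {
    Float32 Wrel2[time = 1][latitude = 670][longitude = 813];
} afile.nc;
"""


def parse_dds_arrays(dds_text):
    """Return {variable: (type, [(dimension, size), ...])} for array declarations."""
    arrays = {}
    for dtype, name, dims in ARRAY_DECL.findall(dds_text):
        arrays[name] = (dtype, [(d, int(n)) for d, n in DIMENSION.findall(dims)])
    return arrays
```

With the dimension sizes in hand, a client can check index ranges before it builds a data request.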

Page 44: Edward King SPEDDEXES 2014

Data Attributes: DAS link…
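The DAS response carries the per-variable attributes as typed name/value lines inside one block per variable. As a minimal sketch, the parser below handles one level of nesting and single-valued attributes only. The sample text is hand-written for illustration: the attribute names `long_name` and `missing_value` are assumptions, not taken from the server.

```python
import re

ATTRIBUTE = re.compile(r"^\s*(\w+)\s+(\w+)\s+(.+?)\s*;\s*$")  # "Type name value;"
CONTAINER = re.compile(r"^\s*(\w+)\s*\{\s*$")                 # "Name {"

SAMPLE_DAS = """Attributes {
    Wrel2 {
        String long_name "Relative Soil Moisture (lower layer)";
        Float32 missing_value -9999.0;
    }
}
"""


def convert(dtype, raw):
    """Convert an attribute value to a Python type, falling back to the raw text."""
    try:
        if dtype.startswith("Float"):
            return float(raw)
        if dtype.startswith(("Int", "UInt", "Byte")):
            return int(raw)
    except ValueError:
        return raw  # e.g. a comma-separated list of values
    return raw.strip('"')


def parse_das(das_text):
    """Return {variable: {attribute: value}} for one level of nesting."""
    result, current = {}, None
    for line in das_text.splitlines():
        opened = CONTAINER.match(line)
        if opened:
            if opened.group(1) != "Attributes":
                current = result.setdefault(opened.group(1), {})
            continue
        if line.strip() == "}":
            current = None
            continue
        attr = ATTRIBUTE.match(line)
        if attr and current is not None:
            dtype, key, raw = attr.groups()
            current[key] = convert(dtype, raw)
    return result
```

This is where any meaning attached to a variable lives; the data request itself carries none of it.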

Page 45: Edward King SPEDDEXES 2014

XML Package DDX linkhellip

And finally ndash datahellip

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

And finally ndash datahellip

httpafilencasciiWrel2[010][1601165][2001205]

Format Var Time Latitude Longitude

Note absence of semantics in the exchange

bull There is nothing in the URL to impart meaning just a variable name and some subscriptshellip

bull httpafilencasciiWrel2[010][1601165][2001205]

bull cf Fortran float Wrel2(1670813) bull eg Canrsquot naturally specify a bounding box (cf WMS)bull This is both

bull A weakness geospatial handling requires extra workbull A strength not limited to geospatial domainsbull There is always some extra work to do (or assumptions to make)

in order to be able to make a meaningful request




