THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF...

25
• THREDDS development – Dynamic Catalogs: DQC, Resolvers – IDD Data Server – ADDE Cataloger • NetCDF development – NetCDF Markup Language (NcML) – More efficient Java I/O (NIO) – NetCDF/DODS/HDF5 Data Models Recent Work in Progress John Caron, June 3, 2003

Transcript of THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF...

Page 1: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

• THREDDS development– Dynamic Catalogs: DQC, Resolvers– IDD Data Server– ADDE Cataloger

• NetCDF development– NetCDF Markup Language (NcML)– More efficient Java I/O (NIO)– NetCDF/DODS/HDF5 Data Models

Recent Work in ProgressJohn Caron, June 3, 2003

Page 2: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

HTTP Server

THREDDS Catalogs

Client

ApplicationDatasets

Catalog.xml

hostname.edu

Catalog Generator

Data ServerDODS, ADDE, FTP, HTTP

CatalogRef.xml

CatalogRef.xml

Page 3: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

HTTP Tomcat Server

Dynamic Catalogs = Services

Client

Application

Datasets

Catalog.xml

hostname.edu

Data ServerDODS, ADDE, FTP, HTTP

Query Resolver Service

DQC.xml

Catalog ServiceCatalog Generator CatalogRef.xml

Resolver Service URI

URL

Page 4: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

Dataset Query Capability (DQC)

• XML document.

• Describes what the user can ask for as a set of orthogonal “selections”.

• On the client, a “query URL” is formed based on the user’s choices, and sent to the server.

• The “query resolver” server finds which datasets satisfy the query and returns a list of real dataset URLs.

•The DQC describes the queries that the server is capable of responding to.

Page 5: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

Resolver Services

• Logical Dataset, eg “latest ETA model run”

• Dataset with Service type “Resolver”

• On the client, the URI of the logical dataset is sent to the server

• The server finds what is available and returns a list of real dataset URLs.

Page 6: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

HTTP Tomcat Server

ADDE Cataloger

Client

Application

Datasets

Catalog.xml

hostname.edu

ADDE Data Server

Catalog ServiceADDE

CatalogerCatalogRef.xml

Query Resolver Service

DQC.xml

IDD

XxxxxXxxxxXxxx

Page 7: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

Summary IDD Data Server

Get as much of the IDD Data feeds available via THREDDS as possible.– NCEP model data (catgen) (DODS)– Level 3 NEXRAD (custom server/DQC) (ADDE) – SSEC/Unidata Satellite data (ADDE

Cataloger) (ADDE)– Text Data: Metars, Surface Obs, etc

(DQC/custom server), returns text or XML.– Profiler Data (custom server/DQC) (ADDE)

Page 8: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NetCDF 3

NetCDFFile

NetCDF-3 library

API

Local file

HTTP protocol

Client

Application

OpenDAPDataset

OpenDAP

protocolNcML Dataset XML

NcML Dataset XML

Virtual dataset

Page 9: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NetCDF Markup Language

XML representation of netCDF metadata, uses XML Schema

• Core: existing netCDF data model• Coordinate System: general and

georeferencing coordinate system• Dataset: redefine, aggregate, subset

• Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan Davis, Bob Drach (LLNL), Stefano Nativi (Florence), Russ Rew

Page 10: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NcML Coordinate Systems

NcML Dataset XML

NcML Dataset XML

NetCDF FileOpenDAP Dataset

Convention Parser

•ATDRadar•AWIPS•COARDS•CF•CSM•GDV•NUWG•WRF•Zebra

Netcdf Dataset

Page 11: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

GeoGrids, GeoTiffs, Geowhiz!

NetCDF FileOpenDAP Dataset

Convention Parser

GeoGrid factory GeoGrid Dataset

Netcdf Dataset

GeoTiff WriterStrange land

of GIS

GeoTiff FileOpenGIS

WCS

WCS Server

VisAD / IDV

Page 12: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NcML Dataset : “virtual view”

NetCDF FileOpenDAP Dataset

NetCDF Dataset

NcML Dataset XML

NcML Dataset XML

Dataset XML ParserJava-netCDF 2.1

Client Application

Page 13: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NcML Dataset

• Use NcML like CDL, to declare the contents of a netCDF file.

• Add, delete or rename Variables, Attributes, and Dimensions

• Subset Variables• Reorder a Variable’s dimensions• Aggregate multiple netCDF files, a la DODS

Aggregation Server• NcML Dataset is a “virtual view” or can make

copy to a local netCDF file.

Page 14: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

2: NcML Datasets on a Server

Client

Application

Datasets

Catalog.xml

hostname.edu

DODS Agg/Netcdf ServerDODS, ADDE, FTP, HTTP

NcML Dataset XML

NcML Dataset XML

Dataset XML Parser

Page 15: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

3: NcML Datasets via Catalogs

Client

Application

Catalog.xml NcML Dataset XML

NcML Dataset XML

Catalog/Dataset XML ParserJava-netCDF v 2.1.1

NetCDF FileOpenDAP Dataset

Page 16: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NIO

• Rewrite ucar.nc2 I/O layer using java.nio package (currently using ucar.netcdf)

• Uses memory mapping, bulk I/O transfer

• Prototype has 7x speedup on large files.

• Requires JDK 1.4+

• HTTP access must be rewritten

Page 17: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NIO Current old/new

First access small (3.9 Mb) 281 671 2.4 large (240 Mb) 3334 28221 8.5

Average next 5 accesses small 54 290 5.4 large 2239 16367 7.3

• Time in millisecs to sequentially read entire file• Wintel 2GHz, 1 GB main memory• Java 1.4.2 -client

NIO vs current Java

Page 18: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NIO C C/NIO

First access

small (3.9 Mb) 281 370 1.3

large (240 Mb) 3334 193485.8

Average next 5 accesses

small 54 24 .44

large 2239 953 .43

• Java 1.4.2 –client vs. VC 6.0 /O2

NIO vs optimized C

Page 19: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NetcdfFile

NetCDF Data Model

Variable Dimension

Attribute

Attribute

DataType•byte•char•short•int•float•double

Page 20: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

BaseType

•primitive (8)

•string

•array

•grid

•structure

•sequence

Dataset

OpenDAP Data Model

BaseType

Attribute

Attribute

array

Dimension

BaseType

structure /

sequence

BaseType

Attribute

Attribute

Page 21: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

Datatype•Fixed point

•floating point

•date/time

•string

•bit field

•Opaque

•Compound

•Reference

•Enumeration

•Variable length

•Array

GroupsFile directory structure inside HDF file.

HDF5 Data Model

Data storage

•Compact

•External

•Layout

•Indexed

•Striped

Dataset

DataSpace

Attribute

Datatype

Page 22: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

Possible Extensions to netCDF data model

• Add new data types:– Strings: variable length arrays of bytes, plus an encoding

attribute.– Structures: collections of any other element types, allow nested

structures. – Vector: a variable length 1D array of any type.

• Allow reusable structure definition = user defined data type.

• Allow unnamed, undeclared dimensions = anonymous dimensions.

• Allow multiple unlimited dimensions (outer dimension only)

• Compression. Push scale/offset into library, allow variable bit sizes.

• Explicit support for coordinate variables/axes.

Page 23: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NetcdfFile

New NetCDF Data Model

Variable

Dimension

AttributeAttribute

DataType•byte•short•int•long•float•double•String•Structure•Vector

Structure

Vector

•Length

DataType

DataType

DataTypeDataType

Page 24: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.

NetCDF 4

OpenDAPDatasetHDF5

File

NetCDF 4 library

API

OpenDAP

4.0

protocol

Local file or

HTTP protocol

Client

Application

NcML Dataset XML

NcML Dataset XML

NetCDFV.1 and 2

File

Virtual dataset

Page 25: THREDDS development –Dynamic Catalogs: DQC, Resolvers –IDD Data Server –ADDE Cataloger NetCDF development –NetCDF Markup Language (NcML) –More efficient.