HDF Update

Mike Folk, National Center for Supercomputing Applications. HDF and HDF-EOS Workshop VIII, October 27, 2004.

Transcript of HDF Update

Page 1: HDF Update

HDF Update

Mike Folk

National Center for Supercomputing Applications

HDF and HDF-EOS Workshop VIII

October 27, 2004

Page 2: HDF Update

Topics

• HDF Team and Supporters

• HDF software update

• Other Activities of Interest

Page 3: HDF Update

The HDF Team

Xuan Bai, Frank Baker, Peter Cao, Vailin Choi, Mike Folk, Barbara Jones, Quincey Koziol, James Laird, Raymond Lu

John Mainzer, Robert McGrath, Pedro Nunes, Elena Pourmal, Binh-minh Ribler, Eric Shapiro, Rishi Sinha, Kent Yang

And all those wonderful folks out there who contribute ideas, requests, bug reports, code, and support.

Page 4: HDF Update

Organization

• Staff breakdown
  – User support, documentation
  – QA, maintenance, testing
  – Software development
  – System administration
  – Management

• See Thursday tutorial on HDF Software Process

[Diagram: the HDF Project comprises four groups: basic library development; support, doc, QA, and maintenance; tools and Java; parallel I/O, Grid, and big machines.]

Page 5: HDF Update

Who is supporting HDF?

• Organizations and communities with institutional and financial commitment to HDF
  – NCSA, NASA, State of IL, DOE, Boeing

• Agencies supporting R&D
  – NCSA, NASA, NARA, DOE, NSF, ONR

• Collaborators who make in-kind contributions
  – Cactus, PyTables, NeXUS, CGNS, many others

Page 6: HDF Update

HDF Software Update

Page 7: HDF Update

HDF software milestones in FY 2004

[Timeline figure, Dec 2003 through Oct 2004, marking: HDF 4.2r0; HDF5 1.6.2; HDF5 1.6.3; HDF5 Java 2.0; HDF5 High Level; Flexible parallel HDF5 (Alpha)]

Page 8: HDF Update

HDF4.2 Release 0 – Dec. 2003

• Bug fixes

• New features

• Support for new platforms and compilers

Page 9: HDF Update

HDF4.2r0 New Features

• Tools (per DAAC and Instrument Team requests)
  – hdfimport – converts float/integer data to SDS/raster
    • Replaces fp2hdf
  – hdiff – compares two HDF4 files
    • Revision of earlier hdfdiff tool
  – hrepack – makes a copy of an HDF4 file
    • Optionally rewrites objects with compression, chunking, etc.
  – h4cc, h4fc, h4redeploy
    • Helper scripts to facilitate compilation and installation

Page 10: HDF Update

HDF4.2r0 New Features

• Szip compression
  – Fast compression method
  – Available on all platforms except Crays
  – NCSA distributes Szip source and binaries
  – HDF Library binaries come with SZIP enabled
  – SZIP documentation available from http://hdf.ncsa.uiuc.edu/SZIP

Page 11: HDF Update

HDF4.2r0 New Configuration

• Addressing key needs
  – Porting to new platforms
  – New versions of JPEG and ZLIB libraries
  – Optional SZIP compression
  – Many features that were hard coded can now be set at configuration time

Page 12: HDF Update

HDF4.2r0 New Compilers and Platforms

• New compilers
  – Intel C and Fortran
  – Portland Group Compilers (C only for now)

• New OS
  – Mac OSX
  – RedHat 8/9
  – AIX 5.1 64-bit
  – OSF1
  – Linux 64 (SuSE and RH8) (JPL machines)
  – Altix (Aura Team)

Page 13: HDF Update

HDF5 1.6.2 – Feb. 2004

• New functions
  – Better user control over open/close objects

• Bug fixes

• Parallel improvements
  – h5pcc, h5pfc helper scripts for parallel compiles
  – Configure improvements
  – Improved parallel performance

• Speed improvements of data conversion routines

• Some SZIP improvements

Page 14: HDF Update

HDF5 1.6.2

• Support for new compilers and platforms
  – IBM Fortran on MacOS X
  – Support for gcc 3.3.4
  – Linux 64 (SuSE and RH) at JPL
  – Altix (Aura team) including parallel C and Fortran libraries
  – Investigated SX-6 (NEC) port

Page 15: HDF Update

HDF5 1.6.3 – Oct. 2004

• Windows
  – Improvements to the build, test, and installation

• New API routines
  – H5Fget_filesize: returns size of opened file
  – H5Fget_name: returns name of file by object ID
  – Some F90 and C++ routines added

Page 16: HDF Update

HDF5 1.6.3

• Utilities
  – h5repack utility (new)
    • Regenerates an HDF5 file from another HDF5 file
    • Optionally applies filters, chunking to new file
  – h5dump utility improvements
    • Prints new info, such as dataset filters, storage layout, fill value info
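As a rough illustration of what applying chunking and a compression filter means, the sketch below splits a byte buffer into fixed-size chunks and deflates each chunk independently with Python's standard zlib module. It does not call h5repack or the HDF5 library; the names `repack`, `read_chunk`, and `unpack` and the 4096-byte chunk size are hypothetical, chosen only for this example.

```python
import zlib


def repack(data, chunk_size, level=6):
    """Split data into fixed-size chunks and deflate each chunk on its own."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        zlib.compress(data[start:start + chunk_size], level)
        for start in range(0, len(data), chunk_size)
    ]


def read_chunk(chunks, index):
    """Decompress a single chunk without touching the others."""
    return zlib.decompress(chunks[index])


def unpack(chunks):
    """Rebuild the full buffer by decompressing every chunk in order."""
    return b"".join(zlib.decompress(chunk) for chunk in chunks)


# 16 KiB of repetitive sample data (hypothetical stand-in for a dataset).
original = bytes(range(256)) * 64
chunks = repack(original, chunk_size=4096)
```

Because each chunk is compressed separately, a reader can call `read_chunk(chunks, 1)` to recover one region without decompressing the rest, which is the usual reason to combine chunking with a compression filter.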

Page 17: HDF Update

Szip in HDF5 1.6.3

• HDF5 can now include SZIP compression with or without Szip's encoder
  – Encoder is required to create SZIP compressed files
  – Encoder is not required to read SZIP compressed files

• Info on Szip and Szip licensing: http://hdf.ncsa.uiuc.edu/doc_resource/SZIP/

Page 18: HDF Update

HDF5 1.6.3 New platforms & compilers

• PGI Fortran for Linux64 (x86-64)

• Absoft F95 for Linux 2.4 (32-bit)

• IBM XL Fortran and Absoft F95 for Mac OS X

Page 19: HDF Update

HDF Java Products 2.0 – March 2004

• Tested with HDF5-1.6.2

• Platforms
  – Windows (98/NT/2000/XP)
  – Solaris
  – Linux
  – AIX
  – IRIX 6.5
  – Mac OSX
  – OSF1

• http://hdf.ncsa.uiuc.edu/hdf-java-html/

Page 20: HDF Update

Modular HDFView

• Replaceable modules:
  – File I/O (file/data format)
  – Tree view (show file structure)
  – Table view (spreadsheet-like)
  – Text view (view/edit text dataset)
  – Image view (view/process image)
  – Palette view (view/change palette)
  – Metadata (attribute) view

• http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/

[Diagram: the application (HDFView) sits on top of interfaces (I/O, TreeView, TableView, etc.), each backed by either the default implementation or a user implementation.]

Modular HDFView – improved HDFView where I/O and GUI components are replaceable modules.
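The replaceable-module idea can be sketched in a few lines. HDFView itself is a Java application, and the sketch below is not its code or its API: the class names `TableView`, `DefaultTableView`, `CsvTableView`, and `Viewer` are hypothetical, and Python is used only to show the pattern of an application coded against an interface with a default implementation that a user can swap out.

```python
from abc import ABC, abstractmethod


class TableView(ABC):
    """Interface the application depends on (hypothetical)."""

    @abstractmethod
    def render(self, rows):
        """Return a text rendering of a list of rows."""


class DefaultTableView(TableView):
    """Default implementation: tab-separated columns."""

    def render(self, rows):
        return "\n".join("\t".join(str(cell) for cell in row) for row in rows)


class CsvTableView(TableView):
    """User implementation: comma-separated columns."""

    def render(self, rows):
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)


class Viewer:
    """Application that only knows about the TableView interface."""

    def __init__(self, table_view=None):
        # Fall back to the default module when the user supplies none.
        self.table_view = table_view if table_view is not None else DefaultTableView()

    def show(self, rows):
        return self.table_view.render(rows)
```

Calling `Viewer().show(rows)` uses the default module, while `Viewer(CsvTableView()).show(rows)` uses the user-supplied one; the application code is unchanged in both cases.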

Page 21: HDF Update

HDFView Web Browser Plug-in

• Goal: Click-and-view HDF files remotely and locally from popular web browsers.

• See poster.

Page 22: HDF Update

Parallel HDF5 in 2004

• A few performance improvements

• MPICH/MPE instrumentation feature added
  – Gives users performance analysis tools for their MPI programs

• “Flexible parallel HDF5” programming model
  – More flexible model for parallel HDF5
  – Other options currently under investigation

Page 23: HDF Update

Parallel HDF5 developments

• New parallel platforms supported
  – Solaris 2.8 (32 & 64 bits)
  – OSF 5.1
  – Cray T3E, SV1, T90
  – HPUX 11.0
  – FreeBSD

Page 24: HDF Update

Other Activities of Interest

Page 25: HDF Update

DOE/ASCI*

• Massively parallel computing and I/O

• Complex data models and big data

• HDF5 a standard format for ASCI apps

* “Advanced Simulation and Computing Program”

“ASCI provides the integrating simulation and modeling capabilities and technologies needed … for future design assessment and certification of nuclear weapons and their components”

Page 26: HDF Update

Boeing: HDF5 for real-time flight test data

• Needed for flight test data systems

• Must handle raw, real-time data

• Implemented API to read/write data

• Based on HDF5 “table” API

• Challenge: Variable length data

• Possible Boeing-wide standard

• Potential applications to many domains

• See poster

Page 27: HDF Update

NCASSR*: Indexing & viewing tables

• Opportunities arising from Boeing work
  – Make test-data features widely available
  – Common data model and API for tabular data in HDF5
  – Indexing for post-processing
  – Viewing capabilities

• Tasks
  – Identify apps to study and gather requirements
  – Develop data model and API for tabular data
  – Include general purpose indexing structures and API
  – Implement prototype API and viewer

* National Center for Advanced Secure Systems Research
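To make "indexing for post-processing" concrete, here is a minimal sketch of one possible indexing structure: a sorted list of (value, row number) pairs for a single table column, searched with binary search. This is an assumption made for illustration, not the data model or API the project planned; `ColumnIndex`, `range_query`, and the sample flight-test rows are all hypothetical.

```python
import bisect


class ColumnIndex:
    """Sorted (value, row_number) pairs for one column of a table."""

    def __init__(self, rows, column):
        self.entries = sorted((row[column], i) for i, row in enumerate(rows))
        self.keys = [value for value, _ in self.entries]

    def range_query(self, low, high):
        """Return row numbers whose column value v satisfies low <= v <= high."""
        start = bisect.bisect_left(self.keys, low)
        stop = bisect.bisect_right(self.keys, high)
        return sorted(i for _, i in self.entries[start:stop])


# Hypothetical tabular records, one dict per row.
rows = [
    {"time": 0.0, "altitude": 1200.0},
    {"time": 0.5, "altitude": 1350.0},
    {"time": 1.0, "altitude": 1100.0},
    {"time": 1.5, "altitude": 1500.0},
]
index = ColumnIndex(rows, "altitude")
hits = index.range_query(1150.0, 1400.0)
```

The index is built once, after which each range query locates its matching rows by binary search rather than scanning every record.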

Page 28: HDF Update

National Archives and Records Administration (NARA)

• Investigate HDF5 as format for records archiving

• Focus on geospatial data
  – Images (e.g. elevation models, aerial photography)
  – Features (e.g. boundaries, roads, rivers)

• Results so far
  – HDF5 data model handles all data types
  – Feature (vector) data present access and size challenges
  – Work is leading to good performance lessons

• See poster about study of vector data

Page 29: HDF Update

SciDAC/PMODEL: Arithmetic Data Transform

• Apply algebraic operations to dataset during read/write.

• Initial goal:
  – Transform individual elements (e.g., x * 1.8 + 32).
  – During reads, the transform applies to the result in memory.
  – During writes, the data in the file is changed.

• Implemented in HDF5 v1.7, to be released in v1.8

• Future
  – Transformations on attributes or multiple datasets (e.g. (A + B) / 2.0)

• http://hdf.ncsa.uiuc.edu/PMODELS/datatransform/
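The read/write behavior described above can be mimicked with a toy model. The sketch below is not the HDF5 API: `Dataset` is a hypothetical in-memory stand-in and `celsius_to_fahrenheit` is just the slide's example expression, x * 1.8 + 32, written as a function.

```python
def celsius_to_fahrenheit(x):
    """The slide's example element-wise transform."""
    return x * 1.8 + 32


class Dataset:
    """Toy stand-in for a stored dataset; not an HDF5 object."""

    def __init__(self, values):
        self.stored = list(values)

    def read(self, transform=None):
        # On read, the transform applies to the in-memory result only;
        # the stored values are left untouched.
        if transform is None:
            return list(self.stored)
        return [transform(v) for v in self.stored]

    def write(self, values, transform=None):
        # On write, the transformed values are what ends up stored.
        if transform is None:
            self.stored = list(values)
        else:
            self.stored = [transform(v) for v in values]


ds = Dataset([0.0, 100.0])
in_memory = ds.read(transform=celsius_to_fahrenheit)
```

After the read, `ds.stored` still holds the original Celsius values; calling `ds.write([0.0], transform=celsius_to_fahrenheit)` would instead store the converted value.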

Page 30: HDF Update

Weather Research and Forecasting (WRF) Model

• WRF – NCAR community standard model

• HDF5 I/O module for NCAR’s WRF

• HDF5-WRF parallel I/O studies
  – Improved performance for computations with large I/O

• Sequential HDF5-WRF studies
  – Compression can save disk space

• See the poster

• And see http://hdf.ncsa.uiuc.edu/apps/WRF-ROMS

Page 31: HDF Update

netCDF-HDF Project

• Enhanced NetCDF-4 Interface to HDF5
  – Combine features of netCDF and HDF5
  – Take advantage of their separate strengths

• Collaboration between NCSA and Unidata

• See poster: “Merging the netCDF and HDF5 libraries to achieve gains in performance and interoperability”

Page 32: HDF Update

OPeNDAP – netCDF – HDF5

• OPeNDAP
  – A system for transmitting data across the Internet
  – Supports selection of data using constraint expressions
  – Can translate data from one format to another

• NetCDF and HDF5
  – Formats of major interest to the OPeNDAP community

• All three are in heavy use in the earth sciences

• So the question is …

Page 33: HDF Update

To harmonize OPeNDAP, netCDF, HDF5?

Are the planets finally aligned?

[Figure labels: OPeNDAP, netCDF, HDF5]

Page 34: HDF Update

OPeNDAP/netCDF/HDF5 Harmonization

• Opportunity
  – Unidata is creating netCDF-4
  – Existing OPeNDAP work with netCDF and HDF5
  – OPeNDAP project working on a new spec (4.0)
  – John Caron working on new java-netCDF library (2.2)

• Creates a "common data model" which is more-or-less a union of the 3 models.

• But there are important differences
  – Different ecological niche
  – Some very different object types
  – So a union of all the models is unlikely

Page 35: HDF Update

OPeNDAP/netCDF/HDF5 Harmonization

• Goal: map between the three models, and possibly tweak the models to make them harmonize better.

• Tackle certain important differences
  – OPeNDAP Sequences
    • Hard to represent in the netCDF API
    • But seems like they might work in HDF5.
  – HDF5 attributes
    • Hard to represent in the DAP.

• Also perhaps devise a formal mapping between the three models

Page 36: HDF Update

Thank you

Acknowledgements

This report is based upon work supported in part by a Cooperative Agreement with NASA under NASA grant NAG 5-2040 and NAG NCCS-599.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.

Other support provided by NCSA and other sponsors and agencies (http://hdf.ncsa.uiuc.edu/acknowledge.html).

Made on location in Champaign, Illinois.

To the best of our knowledge, no animals were abused in the making of these slides.

Page 37: HDF Update

Questions/comments?

Page 38: HDF Update

Information Sources

• HDF website
  – http://hdf.ncsa.uiuc.edu/

• HDF5 Information Center
  – http://hdf.ncsa.uiuc.edu/HDF5/

• HDF Helpdesk
  – [email protected]

• HDF users mailing list
  – [email protected]