HDF5 Backward and Forward Compatibility Issues

05/25/22 HDF and HDF-EOS Workshop X, Landover, MD
HDF5 Backward and Forward Compatibility Issues, or What do we promise to our users?

description

The goal of this talk is to educate HDF5 users about backward and forward compatibility issues across releases of the HDF5 Library and versions of the HDF5 file format. We will discuss changes in the file format that were made to support new HDF5 features such as object creation order, compact groups, efficient access to variable-length data, UTF-8 encoding, external links, etc., and their implications for the HDF5 Library and users' applications.

Transcript of HDF5 Backward and Forward Compatibility Issues

Page 1: HDF5 Backward and Forward Compatibility Issues

04/13/23 HDF and HDF-EOS Workshop X, Landover, MD

HDF5 Backward and Forward Compatibility Issues, or

What do we promise to our users?

Page 2: HDF5 Backward and Forward Compatibility Issues

• Forward compatibility is the ability of a system to accept input intended for later versions of itself.

• In technology, a product is said to be backward compatible when it is able to take the place of an older product, by interoperating with other products that were designed for the older product.

Wikipedia

Page 3: HDF5 Backward and Forward Compatibility Issues

Outline

• Introduction
• HDF5 library versioning
• HDF5 file format versioning
• Forward compatibility
• Backward compatibility
• What to expect with 1.8.0?

Page 4: HDF5 Backward and Forward Compatibility Issues

Introduction

• Why new versions of the library and file format?
  • Bug fixes
  • Performance improvements
  • New features
  • All of the above may require
    • File format changes
    • API changes
    • New APIs
    • Public structure changes

Page 5: HDF5 Backward and Forward Compatibility Issues

Introduction

• To upgrade or not to upgrade?
  • Did they finally fix a bug I reported 5 years ago?
  • Will I be able to read my old files with the new library?
  • Do I need to re-link my application?
  • Will IDL that we bought 3 years ago work with files created by the new library?
  • My colleagues and I use different versions of HDF5 libraries. Can we modify and access each other's files?

Page 6: HDF5 Backward and Forward Compatibility Issues

Introduction

• This talk is about what HDF5 users should expect when moving from one version of the HDF5 libraries to another.
• Information
  • Backward and forward compatibility issues
    • http://hdfgroup.org/HDF5/faq/bkfwd-compat.html
  • API changes from release to release
    • http://hdfgroup.org/HDF5/doc_1.8pre/doc/ADGuide/Changes.html
  • File format changes
    • http://hdfgroup.org/HDF5/doc/H5.format.html

Page 7: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 version number has the form X.Y.Z(-suffix)
  • X is called the major version number
  • Y is called the minor version number (always an even number for a public release)
  • Z is called the release number
  • suffix is present in snapshots and release “candidates” (e.g. snap8, pre1)
• Examples:
  • Releases:
    • HDF5 1.6.5 and upcoming HDF5 1.8.0
    • source tar file names hdf5-1.6.5.tar and hdf5-1.8.0.tar
  • Snapshots (source under development)
    • HDF5 1.6.6-snap8 and HDF5 1.7.58
    • source tar file names hdf5-1.6.6-snap8.tar and hdf5-1.7.58.tar
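The X.Y.Z(-suffix) scheme can be illustrated with a small, self-contained C sketch. The `hdf5_version` struct and both functions are hypothetical helpers written for this transcript, not part of the HDF5 API. The "public release" test simply encodes the slide's rule (even minor number, no suffix) and is an assumption about how one might apply it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type; not part of the HDF5 API. */
typedef struct {
    unsigned major;    /* X */
    unsigned minor;    /* Y */
    unsigned release;  /* Z */
    char suffix[16];   /* e.g. "snap8" or "pre1"; empty when absent */
} hdf5_version;

/* Parse "X.Y.Z" or "X.Y.Z-suffix". Returns true on success. */
bool parse_hdf5_version(const char *text, hdf5_version *out)
{
    int consumed = 0;

    assert(text != NULL && out != NULL);
    out->suffix[0] = '\0';

    if (sscanf(text, "%u.%u.%u%n",
               &out->major, &out->minor, &out->release, &consumed) != 3)
        return false;

    if (text[consumed] == '\0')
        return true;                      /* plain X.Y.Z */

    /* Anything after Z must be "-suffix" with a non-empty suffix. */
    if (text[consumed] != '-' || text[consumed + 1] == '\0')
        return false;
    if (strlen(text + consumed + 1) >= sizeof out->suffix)
        return false;

    strcpy(out->suffix, text + consumed + 1);
    return true;
}

/* Slide rule: public releases have an even minor number and no suffix. */
bool looks_like_public_release(const hdf5_version *v)
{
    return (v->minor % 2 == 0) && v->suffix[0] == '\0';
}
```

With the slide's examples, "1.6.5" is classified as a public release, while "1.6.6-snap8" (suffix present) and "1.7.58" (odd minor number) are not.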

Page 8: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 release number Z in a public release X.Y.Z
  • Incremented each time a new set of bug fixes and/or performance enhancements is made available to the public
  • Upgrading/downgrading between different versions with the same X.Y and different Zs may cause bugs to disappear/appear

Page 9: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 release number Z in a public release X.Y.Z
  • No file format change
  • No changes to the existing APIs
  • No change to public data structures
  • New APIs may be added by popular demand or by demand of the funding agencies (NASA, ASC) or as a result of a bug fix.
  • Existing applications should be able to re-compile with the newest version

Page 10: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 release number Z in a public release X.Y.Z
  • Some exceptions for severe bugs
• Examples:
  • File format change
    • File format changed between 1.6.0 and 1.6.X to support control of B-trees for indexing chunked datasets (ASC)
    • 1.6.0 library couldn’t read 1.6.X files when the feature was used; 1.6.X could read 1.6.0 files
  • API change
    • APIs changed between 1.6.4 and 1.6.3 to replace signed with unsigned to improve library performance and code portability
  • Behavior change
    • Application will fail at run-time if compiled with versions greater than 1.6.5 (a rare file corruption issue was discovered when better error checking was added to the library)
    • The HDF Group provided a tool to fix corrupted files

Page 11: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 minor version number Y in a public release X.Y.Z
  • Incremented each time a new set of features is introduced
  • File format may change
  • New APIs are added
  • Old APIs may be removed or deprecated (will be removed in the next Y release)
  • Public data structures may change (handled the same way as deprecated APIs)

Page 12: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 minor version number Y in a public release X.Y.Z
  • Upgrading/downgrading between different versions with the same X may cause problems
    • X.Y1 may not be able to read X.Y2 files
    • Application written with X.Y2 features will not link with X.Y1 library
    • Application written for X.Y1 may not link with X.Y2 library due to changed or removed APIs or due to a change in public data structures
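The versioning rules on the preceding slides can be summarized as a small decision function. This is a minimal sketch, assuming versions are passed as {X, Y, Z} triples. The enum and function names are hypothetical, and the returned text only paraphrases the slides (including the caveat that Z-level releases have had exceptions).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a move between two library versions. */
typedef enum {
    SAME_VERSION,
    RELEASE_CHANGE,  /* only Z differs */
    MINOR_CHANGE,    /* Y differs, X is the same */
    MAJOR_CHANGE     /* X differs */
} version_change;

/* Each version is an {X, Y, Z} triple. */
version_change classify_version_change(const unsigned from[3],
                                       const unsigned to[3])
{
    assert(from != NULL && to != NULL);

    if (from[0] != to[0])
        return MAJOR_CHANGE;
    if (from[1] != to[1])
        return MINOR_CHANGE;
    if (from[2] != to[2])
        return RELEASE_CHANGE;
    return SAME_VERSION;
}

/* What the slides say a user should expect for each kind of change. */
const char *expectation(version_change change)
{
    switch (change) {
    case SAME_VERSION:
        return "no change";
    case RELEASE_CHANGE:
        return "bug fixes and performance work; no file format or API "
               "changes expected (rare exceptions for severe bugs)";
    case MINOR_CHANGE:
        return "file format may change; APIs may be added, deprecated "
               "or removed; public data structures may change";
    case MAJOR_CHANGE:
        return "major file format changes";
    }
    return "unknown";
}
```

For example, moving from 1.6.4 to 1.6.5 is a release change, while moving from 1.6.5 to 1.8.0 is a minor-version change and is where linking and file-reading problems may appear.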

Page 13: HDF5 Backward and Forward Compatibility Issues

HDF5 Library Versioning

• HDF5 major version number X in a public release X.Y.Z
  • Indicates major file format changes
  • It is probably HDF6; let’s talk about it in 10 years :-)

Page 14: HDF5 Backward and Forward Compatibility Issues

HDF5 File Format Versioning

• There is no HDF5 file format version number
  • Micro-versioning: each object and structure within an HDF5 file is versioned
  • Updated “File Format Specification” is available with every public release
  • There is no way to find what version of the library created or modified a particular file
  • Why did we choose such an approach?

Page 15: HDF5 Backward and Forward Compatibility Issues

HDF5 File Format Versioning

• Maximum file format compatibility principle
  • By default HDF5 files are written with the earliest version of the file format that describes the information, rather than always using the latest version possible.
  • Assures best forward compatibility with the older versions (objects in new files can be read with old libraries if that object is “known” to the old libraries)

Page 16: HDF5 Backward and Forward Compatibility Issues

HDF5 File Format Versioning

• Maximum file format compatibility principle
  • Example: Datatype header message
    • Versions 0, 1 and 2
    • Version 0 is used by the latest library for datatype messages in all situations where no array datatypes are used
    • Version 1 (introduced in 1.4.0) is used by 1.6.5 and earlier versions of the library to encode compound datatypes with explicit array fields.
    • Version 2 is used for 1.8.0 and later if requested by setting a special flag (latest file format); helps to reduce overhead in describing complex datatypes
    • By default 1.8.0 writes compound data compatible with 1.4.0 – 1.6.X libraries
    • If the feature is requested, compound data created by 1.8.0 will not be readable by earlier versions
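The datatype-message example can be read as a "pick the earliest version that works" rule. The sketch below is illustrative only: the function names are hypothetical, and the logic mirrors the three cases listed on the slide rather than the library's internal encoding code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Choose the datatype header message version following the slide's
 * description of the maximum compatibility principle:
 *   2 only when the latest file format was explicitly requested,
 *   1 when the datatype contains array fields,
 *   0 otherwise (the earliest version that can describe the data).
 */
unsigned pick_datatype_msg_version(bool has_array_field,
                                   bool latest_format_requested)
{
    if (latest_format_requested)
        return 2;
    if (has_array_field)
        return 1;
    return 0;
}

/*
 * Per the slide, versions 0 and 1 remain readable by 1.4.0 - 1.6.X
 * libraries, while version 2 is only understood by 1.8.0 and later.
 */
bool readable_by_1_6(unsigned msg_version)
{
    assert(msg_version <= 2);   /* only versions 0, 1 and 2 exist here */
    return msg_version <= 1;
}
```

Under these assumptions, a compound datatype with an array field written by 1.8.0 with default settings gets version 1 and stays readable by 1.6.X; the same datatype written with the latest-format flag gets version 2 and does not.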

Page 17: HDF5 Backward and Forward Compatibility Issues

HDF5 Forward Compatibility

• Forward compatibility or what do we promise (file format)
  • Forward compatibility is the most difficult to achieve and maintain
  • Achieved by using micro-versioning and the “maximum compatibility” principle
  • Old versions of the library will read all objects in a file created by a newer library if objects are known to the old library
    • Example: 1.6.5 library will read a group in a file created by 1.8.0 version unless new 1.8.0 features are used (e.g. external links or compact groups)
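The promise that an old library reads every object it "knows" can be modeled as a feature-set check. This is a conceptual sketch, not how the HDF5 library decides readability (the library relies on per-object version numbers). The feature flags and function name are hypothetical, chosen to match the 1.8.0 features named on the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags for features an object may depend on. */
enum {
    FEATURE_EXTERNAL_LINKS = 1,
    FEATURE_COMPACT_GROUPS = 2,
    FEATURE_CREATION_ORDER = 4,
    ALL_KNOWN_FEATURES     = 7
};

/*
 * An object is readable when every feature it uses is also known to
 * the reading library, i.e. the object needs nothing the reader lacks.
 */
bool reader_can_read(unsigned reader_features, unsigned object_features)
{
    /* Guard the sketch against flags it does not define. */
    assert((reader_features & ~(unsigned)ALL_KNOWN_FEATURES) == 0);
    assert((object_features & ~(unsigned)ALL_KNOWN_FEATURES) == 0);

    return (object_features & ~reader_features) == 0;
}
```

Modeling a 1.6.5 reader as an empty feature set, a group that uses none of the new features is readable, while a group stored compactly or containing external links is not.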

Page 18: HDF5 Backward and Forward Compatibility Issues

HDF5 Forward Compatibility

• Forward compatibility or what do we promise (APIs)
  • Application written to work with an older version will compile, link and run as expected with a newer version
    • APIs are not deleted or changed (if possible)
    • APIs do not change behavior (if possible)
    • May require configuration flag --enable-hdf5v1_Y to enable old APIs, data structures and behavior
  • Drawbacks
    • Have to keep old APIs until another major release or indefinitely
    • Cannot enable new features by default
      • H5Gcreate will create “old” style groups in 1.8.0
      • H5Gcreate2 will create “new” groups (supports creation order, compact storage, improved heap structure, low and controlled overhead, etc.)

Page 19: HDF5 Backward and Forward Compatibility Issues

HDF5 Backward Compatibility

• Backward compatibility or what do we promise
  • File Format
    • Newer version of the library will always read files created with an older version
    • Aside: HDF4 can read HDF4 files created in 1988
  • Library APIs
    • Application that doesn’t use new features will compile and link with the older library

Page 20: HDF5 Backward and Forward Compatibility Issues

What to expect with 1.8.0?

• 1.8.0 introduces file format changes, new APIs and old API changes
  • File format changes
    • Revised internal file structures to support new features such as creation order, UTF-8 encoding, external links, etc., and to reduce file overhead
  • New APIs added
    • Group API revisions to allow two types of indices on links within a group
    • New compression methods (scale+offset, n-bit)
    • Link APIs including UTF-8 encoding for links and external links
    • Etc.
  • Old API changes
    • There are several APIs (rarely used) with changed signatures or behavior
    • http://hdfgroup.org/HDF5/doc_1.8pre/doc/ADGuide/Changes.html

Page 21: HDF5 Backward and Forward Compatibility Issues

What to expect with 1.8.0?

• Application written for 1.6.5
  • Will compile, link and run with 1.8.0 library as expected, producing files compatible with 1.6.5 release
  • Will take advantage of new metadata cache, performance enhancements and bug fixes
  • New 1.8.0 features are not available unless application is modified
    • Smaller file overhead
      • Shared object messages
      • More space efficient object header storage
      • Compact groups
      • Efficient heap storage for groups with many links
    • New group and links features
      • Creation order on links
      • UTF-8 encoding
      • Groups’ compact storage
      • External links

Page 22: HDF5 Backward and Forward Compatibility Issues

What to expect with 1.8.0?

• Applications written for 1.8.0
  • Will always read older files
  • May modify 1.6.5 file in a way that 1.6.5 library will not be able to access some old objects in it
    • Example
      • Groups converted to use new format
      • Compact storage for compound datatype
  • May produce files NOT compatible with 1.6.5 !!!
    • Example
      • Root group is created using 1.8.0 features
  • Takes full advantage of the latest and greatest HDF5!

Page 23: HDF5 Backward and Forward Compatibility Issues

Example

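• How can an application create a 1.6.5-incompatible file?
  • Latest format is used for storing compound datatypes

/* Request the latest file format through a file access property list */
fapl = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_latest_format(fapl, TRUE);
file = H5Fcreate(filename, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

/* Build a compound datatype; it will be stored using the latest format */
tid = H5Tcreate(H5T_COMPOUND, sizeof(struct s1));
H5Tinsert(…);

/* Create and write a dataset of that type */
dset = H5Dcreate(file, "New compound", tid,………);
H5Dwrite(dset, …);
……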

Page 24: HDF5 Backward and Forward Compatibility Issues

Acknowledgements

Thank you!

Questions?

Page 25: HDF5 Backward and Forward Compatibility Issues

Acknowledgement

This report is based upon work supported in part by a Cooperative Agreement with NASA under NASA NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.