Migrating from HDF5 1.6 to 1.8

October 15, 2008 HDF and HDF-EOS Workshop XII 1 Migrating from HDF5 1.6 to HDF5 1.8

description

This tutorial targets HDF5 application developers and users who still use HDF5 1.6 releases, and anyone interested in the features of the HDF5 1.8.x libraries. We will discuss how applications written for versions 1.6.x and earlier can be moved seamlessly to the latest HDF5 releases. We will also cover new features of the HDF5 1.8.x library, such as the redesigned group object, links, creation order, and various performance-tuning knobs.

Transcript of Migrating from HDF5 1.6 to 1.8

Page 1: Migrating from HDF5 1.6 to 1.8


Migrating from HDF5 1.6 to HDF5 1.8

Page 2: Migrating from HDF5 1.6 to 1.8

Outline

• Status of the HDF5 1.6 and 1.8 releases
• Overview of the HDF5 1.8 features
• How to move applications to HDF5 1.8

Page 3: Migrating from HDF5 1.6 to 1.8

Status of HDF5 releases


Page 4: Migrating from HDF5 1.6 to 1.8

Current HDF5 Releases

• HDF5 1.8.0 was released in February 2008
  • Major update of HDF5 1.6.* series (stable set of features and APIs since 1998)
  • New features
  • 200 new APIs
  • Changes to file format
  • Changes to APIs
  • Backward compatible
• HDF5 1.8.1 was released in June 2008
  • Minor bug fixes
  • Included Fortran90 APIs for new C functions

Page 5: Migrating from HDF5 1.6 to 1.8

Current HDF5 Releases

• HDF5 1.6.7 was released in February 2008
  • Addressed backward compatibility bug for reading files with corrupted object header information
• New maintenance releases will be in November 2008
  • HDF5 1.6.8 and 1.8.2
  • Minor bug fixes
  • Tools improvements
• Current plans are to support HDF5 1.6 and 1.8 until November 2009

Page 6: Migrating from HDF5 1.6 to 1.8


Information About Current Releases

http://www.hdfgroup.org/HDF5

Page 7: Migrating from HDF5 1.6 to 1.8

Goal of the Tutorial

• Help with transition to the 1.8 releases
• Discuss new features beneficial to applications written for 1.6 releases
• Raise awareness about forward/backward compatibility issues with the 1.8 releases
• Get feedback from the users who already moved to 1.8 releases

Page 8: Migrating from HDF5 1.6 to 1.8

Why New Features?

• Need to address some deficiencies in initial design
• Examples:
  • Big overhead in file sizes
  • Non-tunable metadata cache implementation
  • Handling of free-space in a file

Page 9: Migrating from HDF5 1.6 to 1.8

Why New Features?

• Need to address new requirements
• Add support for
  • New types of indexing (object creation order)
  • Big volumes of variable-length data (DNA sequences)
  • Simultaneous real-time streams (fast append to one-dimensional datasets)
  • UTF-8 encoding for objects’ path names
  • Accessing objects stored in other HDF5 files (external or user-defined links)

Page 10: Migrating from HDF5 1.6 to 1.8

What Did We Do in HDF5 1.8?

• Extended File Format Specification
• Reviewed group implementations
• Introduced new link object
• Revamped metadata cache implementation
• Improved handling of datasets and datatypes
• Introduced shared object header message
• Extended error handling
• Enhanced backward/forward APIs and file format compatibility

Page 11: Migrating from HDF5 1.6 to 1.8

What Did We Do in HDF5 1.8?

And much more good stuff to make HDF5 better and faster

Page 12: Migrating from HDF5 1.6 to 1.8


HDF5 File Format Extension

Page 13: Migrating from HDF5 1.6 to 1.8

HDF5 File Format Extension

• Why:
  • Address deficiencies of the original file format
  • Address space overhead in an HDF5 file
  • Enable new features
• What:
  • New routine that instructs the HDF5 library to create all objects using the latest version of the HDF5 file format (by default, the library uses the earliest format version in which each object type became available, for example, the array datatype)

Page 14: Migrating from HDF5 1.6 to 1.8

HDF5 File Format Extension

Example

/* Use the latest version of a file format for each object created in a file */
fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
fid = H5Fcreate(…,…,…,fapl_id);
or
fid = H5Fopen(…,…,fapl_id);

Page 15: Migrating from HDF5 1.6 to 1.8


Group Revisions

Page 16: Migrating from HDF5 1.6 to 1.8

Better Large Group Storage

• Why:
  • Faster, more scalable storage and access for large groups
• What:
  • New format and method for storing groups with many links

Page 17: Migrating from HDF5 1.6 to 1.8

Informal Benchmark

• Create a file and a group in a file
• Create up to 10^6 groups with one dataset in each group
• Compare file sizes and performance of HDF5 1.8.1 using the latest group format with the performance of HDF5 1.8.1 (default, old format) and 1.6.7
• Note: Default 1.8.1 and 1.6.7 became very slow after 700000 groups

Page 18: Migrating from HDF5 1.6 to 1.8

Time to Open and Read a Dataset


Page 19: Migrating from HDF5 1.6 to 1.8

Time to Close the File


Page 20: Migrating from HDF5 1.6 to 1.8

File Size


Page 21: Migrating from HDF5 1.6 to 1.8

Access Links by Creation Order

• Why:
  • Allow iteration and lookup of group’s links (children) by creation order as well as by name order
  • Support NetCDF access model for NetCDF-4
• What:
  • Option to access objects in group according to relative creation time

Page 22: Migrating from HDF5 1.6 to 1.8


Access Links by Creation Order

Example

/* Track and index creation order of the links */

H5Pset_link_creation_order(gcpl_id, (H5P_CRT_ORDER_TRACKED | H5P_CRT_ORDER_INDEXED));

/* Create a group */

gid = H5Gcreate(fid, GNAME, H5P_DEFAULT, gcpl_id, H5P_DEFAULT);

Page 23: Migrating from HDF5 1.6 to 1.8

Example: h5dump --group=1 tordergr.h5

HDF5 "tordergr.h5" {
GROUP "1" {
   GROUP "a" {
      GROUP "a1" {
      }
      GROUP "a2" {
         GROUP "a21" {
         }
         GROUP "a22" {
         }
      }
   }
   GROUP "b" {
   }
   GROUP "c" {
   …

Page 24: Migrating from HDF5 1.6 to 1.8

Example: h5dump --sort_by=creation_order

HDF5 "tordergr.h5" {
GROUP "1" {
   GROUP "c" {
   }
   GROUP "b" {
   }
   GROUP "a" {
      GROUP "a1" {
      }
      GROUP "a2" {
         GROUP "a22" {
         }
         GROUP "a21" {
         }
      }
   }

Page 25: Migrating from HDF5 1.6 to 1.8

Compact Groups

• Why:
  • Save space and access time for small groups
  • If groups are small, don’t need B-tree overhead
• What:
  • Alternate storage for groups with few links
  • Default storage when “latest format” is specified
  • Library converts to “original” storage (B-tree based) using default or user-specified threshold

Page 26: Migrating from HDF5 1.6 to 1.8

Compact Groups

• Example
  • File with 11,600 groups
  • With original group structure, file size ~ 20 MB
  • With compact groups, file size ~ 12 MB
  • Total savings: 8 MB (40%)
  • Average savings/group: ~700 bytes

Page 27: Migrating from HDF5 1.6 to 1.8

Compact Groups

Example

/* Change storage to “dense” if number of group members is bigger than 16 and go back to compact storage if number of group members is smaller than 12 */
H5Pset_link_phase_change(gcpl_id, 16, 12);

/* Create a group */
g_id = H5Gcreate(…,…,…,gcpl_id,…);

Page 28: Migrating from HDF5 1.6 to 1.8

Intermediate Group Creation

• Why:
  • Simplify creation of a series of connected groups
  • Avoid having to create each intermediate group separately, one by one
• What:
  • Intermediate groups can be created when creating an object in a file, with one function call

Page 29: Migrating from HDF5 1.6 to 1.8

Intermediate Group Creation

• Want to create “/A/B/C/dset1”
• “A” exists, but “B/C/dset1” do not

Diagram: before, the file contains only /A; after, it contains /A/B/C/dset1.

One call creates groups “B” & “C”, then creates “dset1”

Page 30: Migrating from HDF5 1.6 to 1.8

Intermediate Group Creation

Example

/* Create link creation property list */
lcrp_id = H5Pcreate(H5P_LINK_CREATE);

/* Set flag for intermediate group creation; groups B and C will be created automatically */
H5Pset_create_intermediate_group(lcrp_id, TRUE);
ds_id = H5Dcreate(file_id, "/A/B/C/dset1",…,…, lcrp_id,…,…);

Page 31: Migrating from HDF5 1.6 to 1.8

Link Revisions


Page 32: Migrating from HDF5 1.6 to 1.8

What are Links?

• Links connect groups to their members
• “Hard” links point to a target by address
• “Soft” links store the path to a target

Diagram: in the root group, a hard link stores the <address> of a dataset, while a soft link stores the path “/target dataset”.

Page 33: Migrating from HDF5 1.6 to 1.8

Links: Before and After

• New data model for handling links
• Links may have properties (UTF-8 name encoding, creation order indexing, storage property, etc.)

Diagram: before, the link from a group to an object carried only a name; after, it carries a name and other properties.

Page 34: Migrating from HDF5 1.6 to 1.8

Anonymous Object

• Object can be created without being immediately linked into graph structure
  • Group, dataset and datatype
• See new H5*create_anon APIs
• Use H5O* APIs to manipulate the objects

Page 35: Migrating from HDF5 1.6 to 1.8

New: External Links

• Why:
  • Access objects stored in other HDF5 files in a transparent way
• What:
  • Store location of file and path within that file
  • Can link across files

Page 36: Migrating from HDF5 1.6 to 1.8

New: External Links

Diagram: file1.h5 and file2.h5
• file1.h5: the root group holds “External_link”, which stores the file name “file2.h5” and the path “/A/B/C/D/E”
• file2.h5: the root group leads to a group whose link “target object” stores an <address>

External link object “External_link” in file1.h5 points to the group /A/B/C/D/E in file2.h5

Page 37: Migrating from HDF5 1.6 to 1.8

External Links

Example

/* Create an external link */
H5Lcreate_external(TARGET_FILE, "/A/B/C/D/E", source_file_id, "External_link", …,…);

/* We will use external link to create a group in a target file */
gr_id = H5Gcreate(source_file_id, "External_link/F",…,…,…);

/* We can access group “External_link/F” in the source file and group “/A/B/C/D/E/F” in the target file */

Page 38: Migrating from HDF5 1.6 to 1.8

New: User-defined Links

• Why:
  • Allow applications to create their own kinds of links and link operations, such as
    • Create “hard” external link that finds an object by address
    • Create link that accesses a URL
    • Keep track of how often a link is accessed, or other behavior
• What:
  • Applications can create new kinds of links by supplying custom callback functions
  • Can do anything HDF5 hard, soft, or external links do

Page 39: Migrating from HDF5 1.6 to 1.8

Traversing an HDF5 File


Page 40: Migrating from HDF5 1.6 to 1.8

Traversing HDF5 File

• Why:
  • Allow applications to iterate through the objects in a group or visit recursively all objects under a group
• What:
  • New APIs to traverse a group hierarchy
  • New APIs to iterate through a group using different types of indices (name or creation order)
  • H5Giterate is deprecated in favor of new functions

Page 41: Migrating from HDF5 1.6 to 1.8

Traversing HDF5 File

Example of some new APIs

/* Check if object "A/B" exists in a root group */
H5Lexists(file_id, "A/B", …);

/* Iterate through group members of a root group using name as an index; this function doesn’t recursively follow links into subgroups */
H5Literate(file_id, H5_INDEX_NAME, H5_ITER_INC, &idx, iter_link_cb, &info);

/* Visit all objects under the root group; this function recursively follows links into subgroups */
H5Lvisit(file_id, H5_INDEX_NAME, H5_ITER_INC, visit_link_cb, &info);

Page 42: Migrating from HDF5 1.6 to 1.8

Traversing HDF5 File

• Things to remember
  • Never use H5Ldelete in any HDF5 iterate or visit callback functions
  • Always close parent object before deleting a child object

Page 43: Migrating from HDF5 1.6 to 1.8

Shared Object Header Messages


Page 44: Migrating from HDF5 1.6 to 1.8

Shared Object Header Messages

• Why: metadata duplicated many times, wasting space
• Example:
  • You create a file with 10,000 datasets
  • All use the same datatype and dataspace
  • HDF5 needs to write this information 10,000 times!

Diagram: Dataset 1, Dataset 2, and Dataset 3 each store their own data plus their own copy of the datatype and dataspace.

Page 45: Migrating from HDF5 1.6 to 1.8

Shared Object Header Messages

• What:
  • Enable messages to be shared automatically
  • HDF5 shares duplicated messages on its own!

Diagram: Dataset 1 and Dataset 2 each store their own data but refer to a single shared datatype and dataspace.

Page 46: Migrating from HDF5 1.6 to 1.8

Shared Messages

• Happens automatically
• Works with datatypes, dataspaces, attributes, fill values, and filter pipelines
• Saves space if these objects are relatively large
• May be faster if HDF5 can cache shared messages
• Drawbacks
  • Usually slower than non-shared messages
  • Adds overhead to the file
    • Index for storing shared datatypes
    • 25 bytes per instance
  • Older library versions can’t read files with shared messages

Page 47: Migrating from HDF5 1.6 to 1.8

Two Informal Tests

• File with 24 datasets, all with same big datatype
  • 26,000 bytes normally
  • 17,000 bytes with shared messages enabled
  • Saves 375 bytes per dataset
• But, make a bad decision: invoke shared messages but only create one dataset…
  • 9,000 bytes normally
  • 12,000 bytes with shared messages enabled
  • Probably slower when reading and writing, too.
• Moral: shared messages can be a big help, but only in the right situation!

Page 48: Migrating from HDF5 1.6 to 1.8


Error Handling

Page 49: Migrating from HDF5 1.6 to 1.8

Extendible Error-handling APIs

• Why: Enable application to integrate error reporting with HDF5 library error stack
• What: New error handling API
  • H5Epush – push major and minor error ID on specified error stack
  • H5Eprint – print specified stack
  • H5Ewalk – walk through specified stack
  • H5Eclear – clear specified stack
  • H5Eset_auto – turn error printing on/off for specified stack
  • H5Eget_auto – return settings for specified stack traversal

Page 50: Migrating from HDF5 1.6 to 1.8

Error-handling Programming Model

• Create new class, major and minor error messages
• Register messages with the HDF5 library
• Manage errors
  • Use default or create new error stack
  • Push error
  • Print error stack
  • Close stack

Page 51: Migrating from HDF5 1.6 to 1.8

Error-handling Example

#define ERR_CLS_NAME        "Error Test"
#define PROG_NAME           "Error Program"
#define PROG_VERS           "1.0"
……
#define ERR_MAJ_TEST_MSG    "Error in test"
#define ERR_MIN_MYFUNC_MSG  "Error in my function"
……
/* Initialize error information for application */
ERR_CLS = H5Eregister_class(ERR_CLS_NAME, PROG_NAME, PROG_VERS);
ERR_MAJ_TEST = H5Ecreate_msg(ERR_CLS, H5E_MAJOR, ERR_MAJ_TEST_MSG);
ERR_MIN_MYFUNC = H5Ecreate_msg(ERR_CLS, H5E_MINOR, ERR_MIN_MYFUNC_MSG);
……..
/* Close major and minor error messages, and unregister the class, when done */
H5Eclose_msg(ERR_MAJ_TEST);
H5Eclose_msg(ERR_MIN_MYFUNC);
H5Eunregister_class(ERR_CLS);

Page 52: Migrating from HDF5 1.6 to 1.8

Error-handling Example

/* This function creates and writes a dataset */
static herr_t my_function(hid_t fid)
{
    …….
    /* Force this function to fail and make it push error */
    H5E_BEGIN_TRY {
        dataset = H5Dcreate1(FAKE_ID, DSET_NAME, H5T_STD_I32BE, space, H5P_DEFAULT);
    } H5E_END_TRY;

    if(dataset < 0) {
        H5Epush(H5E_DEFAULT, __FILE__, FUNC_my_function, __LINE__, ERR_CLS, ERR_MAJ_IO, ERR_MIN_CREATE, "H5Dcreate failed");
        goto error;
    } /* end if */
    ……

Page 53: Migrating from HDF5 1.6 to 1.8

Error-handling Example

Error Test-DIAG: Error detected in Error Program (1.0) thread 0:
  #000: error_example.c line 160 in main(): Error stack test failed
    major: Error in test
    minor: Error in my function
  #001: error_example.c line 100 in my_function(): H5Dcreate failed
    major: Error in IO
    minor: Error in H5Dcreate
HDF5-DIAG: Error detected in HDF5 (1.8.1) thread 0:
  #002: H5Ddeprec.c line 154 in H5Dcreate1(): not a location ID
    major: Invalid arguments to routine
    minor: Inappropriate type
  #003: H5Gloc.c line 241 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value

Page 54: Migrating from HDF5 1.6 to 1.8


Metadata Cache

Page 55: Migrating from HDF5 1.6 to 1.8

Metadata Cache Improvements

• Why:
  • Improve I/O performance and memory usage when accessing many objects
• What:
  • New metadata cache APIs
    • control cache size
    • monitor actual cache size and current hit rate
  • Under the hood: adaptive cache resizing
    • Automatically detects the current working set size
    • Sets max cache size to the working set size

Page 56: Migrating from HDF5 1.6 to 1.8


Metadata Cache Improvements

• Note: most applications do not need to worry about the cache.

• See “Special topics” in the HDF5 User’s Guide for details.

• And if you do see unusual memory growth or poor performance, please contact us. We want to help you.

Page 57: Migrating from HDF5 1.6 to 1.8

Dataset and Datatype Improvements


Page 58: Migrating from HDF5 1.6 to 1.8

Text-based Datatype Descriptions

• Why:
  • Simplify data type creation
  • Make data type creation code more readable
  • Facilitate debugging by printing the text description of a data type
• What:
  • New routines to create an HDF5 data type through the text description of the data type and get a text description from the HDF5 data type

Page 59: Migrating from HDF5 1.6 to 1.8

Text Datatype Description

Example

/* Create the data type from DDL text description */
dtype = H5LTtext_to_dtype("H5T_IEEE_F32BE\n", H5LT_DDL);

/* Convert the data type back to text: first query the length, then fill the buffer */
H5LTdtype_to_text(dtype, NULL, H5LT_DDL, &str_len);
dt_str = (char*)calloc(str_len, sizeof(char));
H5LTdtype_to_text(dtype, dt_str, H5LT_DDL, &str_len);

Page 60: Migrating from HDF5 1.6 to 1.8

Serialized Datatypes and Dataspaces

• Why:
  • Allow datatype and dataspace info to be transmitted between processes
  • Allow datatype/dataspace to be stored in non-HDF5 files
• What:
  • A new set of routines to serialize/deserialize HDF5 datatypes and dataspaces.

Page 61: Migrating from HDF5 1.6 to 1.8

Serialized Datatypes and Dataspaces

Example

/* Find the buffer length and encode a datatype into buffer */
status = H5Tencode(t_id, NULL, &cmpd_buf_size);
cmpd_buf = (unsigned char*)calloc(1, cmpd_buf_size);
H5Tencode(t_id, cmpd_buf, &cmpd_buf_size);

/* Decode a binary description of a datatype and return a datatype handle */
t_id = H5Tdecode(cmpd_buf);

Page 62: Migrating from HDF5 1.6 to 1.8

Integer to Float Convert During I/O

• Why:
  • HDF5 1.6 and earlier supported conversion only within the same class (16-bit integer → 32-bit integer, 64-bit float → 32-bit float)
  • Conversion needed to support NetCDF-4 programming model
• What:
  • Integer to float conversion supported during I/O

Page 63: Migrating from HDF5 1.6 to 1.8

Integer to Float Convert During I/O

Example: conversion is transparent to application

/* Create a dataset of 64-bit little-endian floating-point type */
dset_id = H5Dcreate(loc_id, "Mydata", H5T_IEEE_F64LE, space_id, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

/* Write integer data to "Mydata" */
status = H5Dwrite(dset_id, H5T_NATIVE_INT, …);

Page 64: Migrating from HDF5 1.6 to 1.8

Revised Conversion Exception Handling

• Why:
  • Give apps greater control over exceptions (range errors, etc.) during datatype conversion
  • Needed to support NetCDF-4 programming model
• What:
  • Revised conversion exception handling

Page 65: Migrating from HDF5 1.6 to 1.8

Revised Conversion Exception Handling

• To handle exceptions during conversions, register handling function through H5Pset_type_conv_cb().
• Cases of exception:
  • H5T_CONV_EXCEPT_RANGE_HI
  • H5T_CONV_EXCEPT_RANGE_LOW
  • H5T_CONV_EXCEPT_TRUNCATE
  • H5T_CONV_EXCEPT_PRECISION
  • H5T_CONV_EXCEPT_PINF
  • H5T_CONV_EXCEPT_NINF
  • H5T_CONV_EXCEPT_NAN
• Return values: H5T_CONV_ABORT, H5T_CONV_UNHANDLED, H5T_CONV_HANDLED

Page 66: Migrating from HDF5 1.6 to 1.8

Compression Filter for N-bit Data

• Why:
  • Compact storage for user-defined datatypes
• What:
  • When data stored on disk, padding bits chopped off and only significant bits stored
  • Works with compound datatypes

Page 67: Migrating from HDF5 1.6 to 1.8

N-bit Compression Example

• In memory, one value of N-Bit datatype is stored like this:

| byte 3 | byte 2 | byte 1 | byte 0 |
|????????|????SPPP|PPPPPPPP|PPPP????|

S-sign bit  P-significant bit  ?-padding bit

• After passing through the N-Bit filter, all padding bits are chopped off, and the bits are stored on disk like this:

|    1st value    |    2nd value    |
|SPPPPPPP PPPPPPPP|SPPPPPPP PPPPPPPP|...

• Opposite (decompress) when going from disk to memory
• Limited to integer and floating-point data
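The bit layout above (precision 16, offset 4 within a 32-bit value) can be mimicked with plain shifts and masks. This sketch only illustrates the arithmetic on a single value; it is not the HDF5 filter implementation, and the helper names nbit_pack and nbit_unpack are hypothetical. In this sketch the padding bits come back as zeros after unpacking.

```c
#include <stdint.h>

/* Hypothetical helper: keep the `precision` significant bits that start
 * at bit `offset` and drop the padding bits around them. */
static uint32_t nbit_pack(uint32_t value, unsigned offset, unsigned precision)
{
    uint32_t mask = (precision >= 32) ? 0xFFFFFFFFu : ((1u << precision) - 1u);
    return (value >> offset) & mask;
}

/* Hypothetical helper: move the significant bits back to their position
 * in the 32-bit value; padding bits are restored as zeros. */
static uint32_t nbit_unpack(uint32_t packed, unsigned offset, unsigned precision)
{
    uint32_t mask = (precision >= 32) ? 0xFFFFFFFFu : ((1u << precision) - 1u);
    return (packed & mask) << offset;
}
```

For example, nbit_pack(0xFFFABCDF, 4, 16) keeps only bits 4 to 19 and yields 0xABCD, so each value needs 16 bits on disk instead of 32.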

Page 68: Migrating from HDF5 1.6 to 1.8

N-bit Compression Example

Example

/* Create a N-bit datatype */
dt_id = H5Tcopy(H5T_STD_I32LE);
H5Tset_precision(dt_id, 16);
H5Tset_offset(dt_id, 4);

/* Create and write a dataset */
dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(dcpl_id, …);
H5Pset_nbit(dcpl_id);
dset_id = H5Dcreate(…,…,…,…,…,dcpl_id,…);
H5Dwrite(dset_id,…,…,…,…,buf);

Page 69: Migrating from HDF5 1.6 to 1.8

Offset+size Storage Filter

• Why:
  • Use less storage when less precision needed
• What:
  • Performs scale/offset operation on each value
  • Truncates result to fewer bits before storing
  • Currently supports integers and floats
  • Precision may be lost

Page 70: Migrating from HDF5 1.6 to 1.8

Example with Floating-Point Type

• Data: {104.561, 99.459, 100.545, 105.644}
• Choose scaling factor: decimal precision to keep
  E.g. scale factor D = 2

1. Find minimum value (offset): 99.459
2. Subtract minimum value from each element
   Result: {5.102, 0, 1.086, 6.185}
3. Scale data by multiplying by 10^D = 100
   Result: {510.2, 0, 108.6, 618.5}
4. Round the data to integer
   Result: {510, 0, 109, 619}
5. Pack and store using min number of bits
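The five steps above can be sketched in plain C. This is an illustration of the arithmetic only, not the filter's actual implementation; the helper name scale_offset_pack is hypothetical, and the rounding shortcut assumes the shifted values are non-negative, which holds because the minimum is subtracted first.

```c
#include <stddef.h>

/* Hypothetical helper: applies steps 1-4 to `in` (n >= 1 values) with
 * decimal scale factor d, stores the rounded integers in `out`, reports
 * the offset in `min_out`, and returns the number of bits needed to
 * store the largest result (step 5). */
static unsigned scale_offset_pack(const double *in, size_t n, unsigned d,
                                  long *out, double *min_out)
{
    double   min = in[0];
    double   scale = 1.0;
    long     max_packed = 0;
    unsigned bits = 0;
    size_t   i;

    /* Step 1: find the minimum value (the offset) */
    for (i = 1; i < n; i++)
        if (in[i] < min)
            min = in[i];

    /* Scale factor 10^D */
    for (i = 0; i < d; i++)
        scale *= 10.0;

    /* Steps 2-4: subtract the offset, scale, round to nearest integer */
    for (i = 0; i < n; i++) {
        out[i] = (long)((in[i] - min) * scale + 0.5);
        if (out[i] > max_packed)
            max_packed = out[i];
    }

    /* Step 5: minimum number of bits that can hold the largest value */
    while (max_packed > 0) {
        bits++;
        max_packed >>= 1;
    }

    *min_out = min;
    return bits;
}
```

For the data on the slide with D = 2, the largest packed value is 619, which fits in 10 bits, compared with 32 or 64 bits for the original floating-point values. Reading the data back gives offset + value / 10^D, so digits beyond the second decimal place are lost.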

Page 71: Migrating from HDF5 1.6 to 1.8

Offset+size Storage Filter

Example

/* Use scale+offset filter on integer data; let library figure out the minimum number of bits necessary to store the data without loss of precision */
H5Pset_scaleoffset(dcpl_id, H5Z_SO_INT, H5Z_SO_INT_MINBITS_DEFAULT);
H5Pset_chunk(dcpl_id,…,…);
dset_id = H5Dcreate(…,…,…,…,…,dcpl_id, …);

/* Use scale+offset filter on floating-point data; compression may be lossy */
H5Pset_scaleoffset(dcpl_id, H5Z_SO_FLOAT_DSCALE, 2);

Page 72: Migrating from HDF5 1.6 to 1.8

“NULL” Dataspace

• Why:
  • Allow datasets with no elements to be described
  • NetCDF-4 needed a “place holder” for attributes
• What:
  • A dataset with no dimensions, no data

Page 73: Migrating from HDF5 1.6 to 1.8

NULL Dataspace

Example

/* Create a dataset with “NULL” dataspace */
sp_id = H5Screate(H5S_NULL);
dset_id = H5Dcreate(…,"IntArray",…,sp_id,…,…,…);

HDF5 "SDS.h5" {
GROUP "/" {
   DATASET "IntArray" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  NULL
      DATA {
      }
   }
}
}

Page 74: Migrating from HDF5 1.6 to 1.8

Part II: How to Move an Application to HDF5 1.8

Page 75: Migrating from HDF5 1.6 to 1.8

If you are new to HDF5 …

• Start with the HDF5 Tutorial and use HDF5 1.8.1 (or later) libraries
  http://www.hdfgroup.org/HDF5/Tutor/
• Look at the comprehensive set of examples
  http://www.hdfgroup.uiuc.edu/UserSupport/examples-by-api/api18-c.html
• Use HDF5 Reference Manual 1.8.1 or later to get the latest APIs
  http://www.hdfgroup.org/HDF5/doc
• Use h5cc, h5fc, h5c++ scripts to build your applications

Page 76: Migrating from HDF5 1.6 to 1.8

If you are new to HDF5 …

• To check library version
  • Talk to your system administrator
  • Try any HDF5 command line utility with the -V flag
    ./h5dump -V
    h5dump: Version 1.8.1
  • Look at the libhdf5.settings file on your system (under lib directory in the HDF5 installation directory); you should see the following lines
    HDF5 Version: 1.8.1
    Default API Mapping: v18 (can be v16)
  • API mapping indicates API version (1.8 in this case)

Page 77: Migrating from HDF5 1.6 to 1.8

If you are new to HDF5 …

• Use the latest version of API from the Reference Manual (for example, H5Gcreate or H5Gcreate2 instead of H5Gcreate1)
• Be aware of forward compatibility issues
  • If you use HDF5 features introduced in 1.8.0 and later, you may create a file that old applications and third party tools may not be able to read
    http://www.hdfgroup.org/HDF5/doc/ADGuide/CompatFormat180.html
• All HDF5 command line utilities and tools should read HDF5 files created during the last 10 years
• Report problems to [email protected]

Page 78: Migrating from HDF5 1.6 to 1.8

If you used HDF5 before….

• And would like to switch to 1.8….
• Or you build and install HDF5 Libraries for yourself or others….
• You need to know about HDF5 API versioning.

Page 79: Migrating from HDF5 1.6 to 1.8

Introduction to HDF5 API Versioning


Page 80: Migrating from HDF5 1.6 to 1.8

API Versioning Example

• Old way
  H5Gcreate ( loc_id, "New/My old group", 0 )
  H5Gcreate1( loc_id, "New/My old group", 0 )
  • No new features can be invoked
• New way
  H5Gcreate ( loc_id, "New/My new group", lcpl_id, gcpl_id, gapl_id)
  H5Gcreate2( loc_id, "New/My new group", lcpl_id, gcpl_id, gapl_id)
  • New features can be invoked
    • Creation order
    • Unicode names
    • Compact storage
    • Intermediate group creation
• In 1.8.0 and later some functions have version number suffix.

Page 81: Migrating from HDF5 1.6 to 1.8

HDF5 Macros for API Compatibility

Name: H5Gcreate
Signature:
  hid_t H5Gcreate( hid_t loc_id, const char *name, size_t size_hint )
  hid_t H5Gcreate( hid_t loc_id, const char *name, hid_t lcpl_id, hid_t gcpl_id, hid_t gapl_id )
Purpose: Creates a new empty group and links it to a location in the file.
Description: H5Gcreate is a macro that is mapped to either H5Gcreate1 or H5Gcreate2, depending on the needs of the application.

Page 82: Migrating from HDF5 1.6 to 1.8

HDF5 Library Configuration

• Mapping of APIs is set at configuration time
• Several configuration flags are provided to enable different mappings
  • Macro name is mapped to old API, e.g., H5Gcreate is mapped to H5Gcreate1
  • Macro name is mapped to new API, e.g., H5Gcreate is mapped to H5Gcreate2 (will have different parameters from 1.6 H5Gcreate)
  • Disable old APIs completely, e.g., H5Gcreate1 is not available in the library

Page 83: Migrating from HDF5 1.6 to 1.8

If you are an HDF5 library maintainer…

• If you install the HDF5 Library, there are several choices
  • Install library with 1.8 APIs as default APIs (default, recommended).
  • Install library with 1.6 APIs.
  • Install library without 1.6 APIs to get smaller footprint.
  • Install library with strict file format checking option to detect files that do not comply with the File Format Specification.

Page 84: Migrating from HDF5 1.6 to 1.8

HDF5 Library Configuration

Configure flag: version of the public APIs

• --with-default-api-version=v18 (default): 1.8 (Example: H5Gcreate is mapped to H5Gcreate2, old H5Gcreate is H5Gcreate1)
• --enable-strict-format-checks: 1.8; library checks compliance with the file format
• --disable-deprecated-symbols: 1.8 (Example: H5Gcreate is mapped to H5Gcreate2, H5Gcreate1 is not available)
• --with-default-api-version=v16: 1.6 (Example: H5Gcreate is mapped to H5Gcreate1, H5Gcreate2 is available)

Page 85: Migrating from HDF5 1.6 to 1.8

If you used HDF5 before…

• And want to use new HDF5 1.8 Library with the application written for the older versions
  h5cc -DH5_USE_16_API my_program.c
  • H5Gcreate is mapped to H5Gcreate1; all three H5Gcreate1, H5Gcreate2 and H5Gcreate can be used
• HDF5 compilation scripts and H5_USE_16_API flag can be used with GNU auto tools to build packages like HDF-EOS5.
  • The HDF Group runs daily tests for HDF-EOS5 using the latest HDF5 Libraries under development

Page 86: Migrating from HDF5 1.6 to 1.8

If you used HDF5 before …

• Other available options:
  Assuming both deprecated and new symbols are available in the library:

  h5cc my_program.c
  • All three of H5Gcreate1, H5Gcreate2 and H5Gcreate may be used
  • H5Gcreate will have new signature

  h5cc -DH5_NO_DEPRECATED_SYMBOLS my_program.c
  • Only new symbols are available for application; H5Gcreate is mapped to H5Gcreate2; application may use both, but cannot use H5Gcreate1

Page 87: Migrating from HDF5 1.6 to 1.8

Example: --with-default-api-version=v18

hid_t file_id, grp1_id, grp2_id, grp3_id; /* identifiers */
...
/* Open "file.h5" */
file_id = H5Fopen("file.h5", H5F_ACC_RDWR, H5P_DEFAULT);

/* Create several groups in a file */
grp1_id = H5Gcreate (file_id, "New/A", H5P_DEFAULT, gcpt, gapt);
grp2_id = H5Gcreate1(file_id, "/B", 0);
…
grp3_id = H5Gcreate2(file_id, "New/C", H5P_DEFAULT, gcpt, gapt);

Page 88: Migrating from HDF5 1.6 to 1.8

Example: --with-default-api-version=v16

hid_t file_id, grp1_id, grp2_id, grp3_id; /* identifiers */
...
/* Open "file.h5" */
file_id = H5Fopen("file.h5", H5F_ACC_RDWR, H5P_DEFAULT);

/* Create several groups in a file */
grp1_id = H5Gcreate (file_id, "/A", 0);
grp2_id = H5Gcreate1(file_id, "/B", 0);
grp3_id = H5Gcreate2(file_id, "New/C", H5P_DEFAULT, gcpt, gapt);

Page 89: Migrating from HDF5 1.6 to 1.8

Example: --disable-deprecated-symbols

hid_t file_id, grp1_id, grp2_id, grp3_id; /* identifiers */
...
/* Open "file.h5" */
file_id = H5Fopen("file.h5", H5F_ACC_RDWR, H5P_DEFAULT);

/* Create several groups in a file */
grp1_id = H5Gcreate (file_id, "New/A", H5P_DEFAULT, gcpt, gapt);
/* Compilation will fail */
grp2_id = H5Gcreate1(file_id, "/B", 0);
grp3_id = H5Gcreate2(file_id, "New/C", H5P_DEFAULT, gcpt, gapt);

Page 90: Migrating from HDF5 1.6 to 1.8


Thank you!

Questions?

Page 91: Migrating from HDF5 1.6 to 1.8

Acknowledgement

• This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Awards NNX06AC83A and NNX08AO77A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.
