May 30-31, 2012 HDF5 Workshop at PSI 1
HDF5 at a Glance
Quick overview of known topics
Outline
• Overview of HDF5
• Topics not covered by PSI HDF5 Tutorial
• Groups and Links
HDF5 File
lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6
An HDF5 file is a container that holds data objects.
Experiment Notes:
Serial Number: 99378920
Date: 3/13/09
Configuration: Standard 3
HDF5 File
HDF5 groups and links organize data objects.
• Every HDF5 file has a root group (“/”)
• Similar to UNIX directories
[Figure: root group “/” with groups “SimOut” and “Viz”; the objects shown include the lat/lon/temp table, Parameters (10; 100; 1000), and Timestep (36,000)]
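The directory analogy can be sketched in Python, assuming the h5py package is installed. The group names "SimOut" and "Viz", the dataset name "Parameters", and the file name are illustrative only, and the file is kept in memory so nothing is written to disk.

```python
import h5py


def build_example():
    """Create a small in-memory HDF5 file and return the paths it contains."""
    # driver="core" with backing_store=False keeps the file in memory only
    with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
        # The file object itself is the root group, addressed as "/"
        assert f.name == "/"
        # Groups nest like directories; objects are addressed by paths
        sim = f.create_group("SimOut")
        f.create_group("Viz")
        sim.create_dataset("Parameters", data=[10, 100, 1000])
        paths = []
        # visit() walks the file recursively and passes each path to the callback
        f.visit(paths.append)
        return sorted(paths)


if __name__ == "__main__":
    print(build_example())
```

Paths are reported relative to the group being visited, so the dataset appears under its parent group's name, much like a file under a directory.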
HDF5 Software Layers & Storage
[Diagram, top to bottom]
• Tools: h5dump tool, h5repack tool, HDFview tool, High Level APIs, Java Interface, …
• API
  • Language Interfaces: C, Fortran, C++
• HDF5 Library
  • HDF5 Data Model Objects: Groups, Datasets, Attributes, …
  • Tunable Properties: Chunk Size, I/O Driver, …
  • Internals: Memory Mgmt, Datatype Conversion, Filters, Chunked Storage, Version Compatibility, and so on…
  • Virtual File Layer, I/O Drivers: Posix I/O, Split Files, MPI I/O, Custom
• Storage
  • HDF5 File Format: File, Split Files, File on Parallel Filesystem, Other?
GROUPS AND LINKS
Groups and Links
• Groups are containers for links (graph edges)
• Links were added in 1.8.0
• Warning: Many APIs in the H5G interface are obsolete; use the H5L interfaces to discover and manipulate file structure
Example h5_links.py: Different kinds of links
[Figure: file links.h5 with root group “/” and groups “A” and “B”; links labeled “a”, “a”, “soft”, “dangling”, and “External”; a second file, dset.h5]
• Dataset can be “reached” using three paths: /A/a, /a, /soft
• Dataset is in a different file (dset.h5)
Links
• Name
  • Example: “A”, “B”, “a”, “dangling”, “soft”
  • Unique within a group; “/” is not allowed in names
• Type
  • Hard Link
    • Value is the object’s address in a file
    • Created automatically when an object is created
    • Can be added to point to an existing object
  • Soft Link
    • Value is a string, for example “/A/a”, but can be anything
    • Use to create aliases
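A minimal h5py sketch of hard and soft links, assuming h5py is installed. The link names follow the slide; the target "/no/such/object" of the dangling link and the in-memory file are assumptions made for illustration, not the contents of h5_links.py.

```python
import h5py


def link_demo():
    """Create hard, soft, and dangling links; report how each one resolves."""
    with h5py.File("links.h5", "w", driver="core", backing_store=False) as f:
        grp_a = f.create_group("A")
        f.create_group("B")
        # Creating the dataset also creates its first hard link, /A/a
        dset = grp_a.create_dataset("a", data=[1, 2, 3])
        # Assigning an existing object adds a second hard link, /a
        f["a"] = dset
        # A soft link stores a path string that is resolved on access
        f["soft"] = h5py.SoftLink("/A/a")
        # A soft link whose target does not exist "dangles" (hypothetical target)
        f["dangling"] = h5py.SoftLink("/no/such/object")

        # All three paths lead to the same dataset object
        same_object = f["/A/a"] == f["/a"] == f["/soft"]
        # get() returns None when the link cannot be resolved
        dangling_resolves = f.get("dangling") is not None
        # getlink=True returns the link itself rather than its target
        soft_value = f.get("soft", getlink=True).path
        return same_object, dangling_resolves, soft_value


if __name__ == "__main__":
    print(link_demo())
```

Creating a soft link never checks its target, which is why a dangling link can be stored without error and only fails when something tries to follow it.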
Links (cont.)
• Type
  • External Link
    • Value is a pair of strings, for example (“dset.h5”, “dset”)
    • Use to access data in other HDF5 files
    • HDF5 1.8.7 introduced caching of files opened via external links (H5Pset_elink_file_cache_size)
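An external link can be sketched with h5py as follows, assuming h5py is installed. The file names follow the slide; the temporary directory and the dataset contents are assumptions for illustration. The sketch relies on HDF5 looking for a relative target file name in the directory of the file that holds the link.

```python
import os
import tempfile

import h5py


def external_link_demo():
    """Link from links.h5 to a dataset stored in dset.h5 and read through it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "dset.h5")
        source = os.path.join(tmp, "links.h5")

        # The file that actually holds the data
        with h5py.File(target, "w") as f:
            f.create_dataset("dset", data=[1, 2, 3])

        # The link value is the pair (file name, object path)
        with h5py.File(source, "w") as f:
            f["External"] = h5py.ExternalLink("dset.h5", "/dset")

        with h5py.File(source, "r") as f:
            link = f.get("External", getlink=True)
            # Following the link opens dset.h5 behind the scenes
            values = f["External"][...].tolist()
            return link.filename, link.path, values


if __name__ == "__main__":
    print(external_link_demo())
```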
Links Properties
• Links Properties
  • ASCII or UTF-8 encoding for names
  • Create intermediate groups
    • Saves programming effort
• C example
    lcpl_id = H5Pcreate(H5P_LINK_CREATE);
    H5Pset_create_intermediate_group(lcpl_id, 1);
    H5Gcreate(fid, "A/B", lcpl_id, H5P_DEFAULT, H5P_DEFAULT);
  • Group “A” will be created if it doesn’t exist
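For comparison, a sketch of the same operation through h5py's high-level interface, assuming h5py is installed. As far as we are aware, h5py enables the create-intermediate-group property by default, so no property list has to be set up by hand; the file name is hypothetical and the file lives in memory only.

```python
import h5py


def intermediate_groups():
    """Create "A/B" in an empty file and report which groups now exist."""
    with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
        # Neither "A" nor "A/B" exists yet; the missing parent "A" is created too
        f.create_group("A/B")
        return "A" in f, "A/B" in f


if __name__ == "__main__":
    print(intermediate_groups())
```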
Operations on Links
• See H5L interface in Reference Manual
  • Create
  • Delete
  • Copy
  • Iterate
  • Check if exists
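The operations listed above map roughly onto h5py as sketched below, assuming h5py is installed. All names are hypothetical. Note that h5py's `copy` copies the object itself rather than just a link, so it is a loose stand-in for the H5L copy operation.

```python
import h5py


def link_operations():
    """Exercise create, copy, move, existence check, delete, and iterate."""
    with h5py.File("ops.h5", "w", driver="core", backing_store=False) as f:
        # Create: a dataset (reached by the hard link "data") and a soft link
        f.create_dataset("data", data=[1, 2, 3])
        f["alias"] = h5py.SoftLink("/data")
        # Copy: duplicates the object under a new name
        f.copy("data", "data_copy")
        # Move: renames a link without touching the object
        f.move("data_copy", "renamed")
        # Check if exists
        had_alias = "alias" in f
        # Delete: removes the link "alias"; the dataset "data" is untouched
        del f["alias"]
        # Iterate: iterating a group yields the names of its links
        return had_alias, sorted(f)


if __name__ == "__main__":
    print(link_operations())
```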
Groups Properties
• Creation properties
  • Type of links storage
    • Compact (in 1.8.* versions)
      • Used with a few members (default under 8)
    • Dense (default behavior)
      • Used with many (>16) members (default)
  • Tunable size for a local heap
    • Save space by providing an estimate for the size of the storage required for link names
  • Can be compressed (in 1.8.5 and later)
    • Many links with similar names (XXX-abc, XXX-d, XXX-efgh, etc.)
    • Requires more time to compress/uncompress data
Groups Properties
• Creation properties
  • Links may have creation order tracked and indexed
    • Indexing by name (default)
      • A, B, a, dangling, soft
    • Indexing by creation order (has to be enabled)
      • A, B, a, soft, dangling
• http://www.hdfgroup.org/ftp/HDF5/examples/examples-by-api/api18-c.html
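The two orderings can be reproduced with a short h5py sketch, assuming h5py is installed with support for the `track_order` argument of `create_group`. The parent group names "by_name" and "by_creation" are hypothetical, and plain groups stand in for the mixed objects and links on the slide.

```python
import h5py

# Creation order used on the slide
NAMES = ["A", "B", "a", "soft", "dangling"]


def iteration_orders():
    """Return member names of a default group and of an order-tracking group."""
    with h5py.File("order.h5", "w", driver="core", backing_store=False) as f:
        # Default: links are indexed, and therefore iterated, by name
        by_name = f.create_group("by_name")
        # track_order=True enables creation-order tracking for this group
        by_creation = f.create_group("by_creation", track_order=True)
        for grp in (by_name, by_creation):
            for name in NAMES:
                grp.create_group(name)
        return list(by_name), list(by_creation)


if __name__ == "__main__":
    names_sorted, names_created = iteration_orders()
    print(names_sorted)
    print(names_created)
```

Name order here is plain byte order, which is why the uppercase names come before the lowercase ones.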
Discovering HDF5 file’s structure
• HDF5 provides C and Fortran 2003 APIs for recursive and non-recursive iteration over groups and attributes
  • H5Ovisit and H5Literate (H5Giterate)
  • H5Aiterate
• Life is much easier with h5py (h5_visita.py)
    import h5py

    def print_info(name, obj):
        # Print the object's path, then each of its attributes
        print(name)
        for attr_name, value in obj.attrs.items():
            print(attr_name + ":", value)

    # Read-only access is enough for inspecting the file
    with h5py.File('GATMO-SATMS-npp.h5', 'r') as f:
        f.visititems(print_info)
Checking a path in HDF5
• HDF5 1.8.8 provides HL C and Fortran 2003 APIs for checking whether a path exists
  • H5LTpath_valid (h5ltpath_valid_f)
  • Example: Is there an object with a path /A/B/C/d?
  • TRUE if there is a path, FALSE otherwise
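In h5py, a comparable check can be sketched with the `in` operator, assuming h5py is installed. The helper name `path_exists` and the file contents are hypothetical, and this is h5py's own membership test rather than a direct wrapper around the HL routine.

```python
import h5py


def path_exists(h5file, path):
    """Return True if `path` can be followed link by link from the file root."""
    return path in h5file


def demo():
    with h5py.File("paths.h5", "w", driver="core", backing_store=False) as f:
        # Creating the dataset also creates the groups A, A/B, and A/B/C
        f.create_dataset("A/B/C/d", data=[0])
        # The second path breaks at the missing group "X"
        return path_exists(f, "/A/B/C/d"), path_exists(f, "/A/B/X/d")


if __name__ == "__main__":
    print(demo())
```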
Hints
• Use latest file format (see H5Pset_libver_bound function in RM)
  • Save space when creating a lot of groups in a file
  • Save time when accessing many objects (>1000)
  • Caution: Tools built with HDF5 versions prior to 1.8.0 will not work on files created with this property
Informal Benchmark
• Create a file and a group in a file
• Create up to 10^6 groups with one dataset in each group
• Compare file sizes and performance of HDF5 1.8.1 using the latest group format with the performance of HDF5 1.8.1 (default, old format) and 1.6.7
• Note: Default 1.8.1 and 1.6.7 became very slow after 700000 groups
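A scaled-down sketch of such a benchmark in h5py, assuming h5py is installed. It is not the script behind the reported numbers: it compares the "earliest" and "latest" format bounds within a single installed library rather than across HDF5 releases, and the group names, dataset name, and default group count are hypothetical.

```python
import os
import tempfile
import time

import h5py


def create_groups(path, n_groups, libver):
    """Create n_groups groups with one small dataset each; return seconds taken."""
    start = time.perf_counter()
    # libver="latest" asks the library to use its newest file format features
    with h5py.File(path, "w", libver=libver) as f:
        for i in range(n_groups):
            f.create_group(f"group_{i:07d}").create_dataset("d", data=[i])
    return time.perf_counter() - start


def run_benchmark(n_groups=1000):
    """Compare creation time and file size for the old and the latest format."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for libver in ("earliest", "latest"):
            path = os.path.join(tmp, f"{libver}.h5")
            seconds = create_groups(path, n_groups, libver)
            with h5py.File(path, "r") as f:
                n_found = len(f)
            results[libver] = {
                "seconds": seconds,
                "bytes": os.path.getsize(path),
                "groups": n_found,
            }
    return results


if __name__ == "__main__":
    for libver, stats in run_benchmark().items():
        print(libver, stats)
```

Timings at this scale are noisy; the differences described on the slides appear at much larger group counts.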
Time to Open and Read a Dataset
[Figure: time in milliseconds (log scale, 0.1 to 1000) versus number of groups (10000 to 1000000, log scale); series: 1.6, 1.8 (old groups), 1.8 (new groups)]
Time to Close the File
[Figure: time in milliseconds (0 to 5) versus number of groups (10000 to 1000000); series: 1.6, 1.8 (old groups), 1.8 (new groups)]
File Size
[Figure: size in kilobytes (0 to 1000000) versus number of groups (0 to 800000); series: 1.8 (old groups), 1.8 (new groups)]
DATATYPES
Datatypes
• See Tutorial examples
Thank You!
Questions?