Integrating HDF5 with SRB
February 2-3, 2006 SRB Workshop, San Diego
Peter Cao, NCSA
Mike Wan, SDSC
Sponsored by NLADR, NSF PACI Project in Support of NCSA-SDSC Collaboration
Object-level Access to Remote Files
Outline
Introduction to HDF5
The HDF-SRB model
SRB Support in HDFView
Overview of HDF5: Answering big questions …
Matter & universe
Weather & climate
[Figure: Total Column Ozone (Dobson), August 24, 2001 and August 24, 2002]
Life & nature
Overview of HDF5: Involves big data …
Overview of HDF5: On big computers …
Overview of HDF5: HDF solution …
Software & tools: open source & multiple platforms
Common models: conventions & extensions
Standard APIs: easy use
File format: for all kinds of data
Efficiency: storage & IO
Overview of HDF5: Example HDF5
Overview of HDF5: HDF Software
HDF I/O Library
Tools & Applications
HDF File
Overview of HDF5: Object model
Primary objects
  Groups
  Datasets
Additional ways to organize data
  Attributes
  Sharable objects
  Storage and access properties
Overview of HDF5: Groups
[Diagram: root group “/” with members tom, dick, harry, and temp]
A mechanism for collections of related objects
Every file starts with a root group
Similar to UNIX directories
Can have attributes
Overview of HDF5: Datasets
Data
Metadata
  Dataspace
    Rank: 3
    Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
  Datatype: IEEE 32-bit float
  Attributes: time = 32.4, pressure = 987, temp = 56
  Storage info: chunked, compressed
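The group and dataset slides above can be sketched as a small data structure. This is an illustration in plain Python, not the HDF5 library or its API: the class names `Group` and `Dataset`, the `lookup` helper, and the placement of `temp` under `tom` are all assumptions made for the example. Only the metadata values (rank 3, dimensions 4 x 5 x 7, the three attributes, the storage info) come from the slides.

```python
from dataclasses import dataclass, field
from math import prod
from typing import Dict, Tuple, Union


@dataclass
class Dataset:
    """Toy stand-in for an HDF5 dataset: metadata only, no data array."""
    dims: Tuple[int, ...]                      # dataspace dimensions
    datatype: str                              # e.g. "IEEE 32-bit float"
    attributes: Dict[str, float] = field(default_factory=dict)
    storage: Tuple[str, ...] = ()              # e.g. ("chunked", "compressed")

    @property
    def rank(self) -> int:
        return len(self.dims)

    @property
    def num_elements(self) -> int:
        return prod(self.dims)


@dataclass
class Group:
    """Toy stand-in for an HDF5 group: a named collection of members."""
    members: Dict[str, Union["Group", Dataset]] = field(default_factory=dict)
    attributes: Dict[str, float] = field(default_factory=dict)

    def lookup(self, path: str) -> Union["Group", Dataset]:
        """Resolve a UNIX-style path such as "/tom/temp" from this group."""
        node: Union[Group, Dataset] = self
        for name in filter(None, path.split("/")):
            if not isinstance(node, Group):
                raise KeyError(f"{name!r} is below a dataset, not a group")
            node = node.members[name]
        return node


# Metadata values from the Datasets slide; the tree layout is assumed.
temp = Dataset(dims=(4, 5, 7), datatype="IEEE 32-bit float",
               attributes={"time": 32.4, "pressure": 987, "temp": 56},
               storage=("chunked", "compressed"))
root = Group(members={"tom": Group(members={"temp": temp}),
                      "dick": Group(), "harry": Group()})
found = root.lookup("/tom/temp")
```

Resolving `"/"` returns the root group itself, which mirrors the point that every file starts with a root group.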
Overview of HDF5: Data subsetting
(a) Hyperslab from a 2D array to the corner of a smaller 2D array.
(b) Regular series of blocks from a 2D array to a contiguous sequence at a certain offset in a 1D array.
(c) A sequence of points from a 2D array to a sequence of points in a 3D array.
(d) Union of hyperslabs in file to union of hyperslabs in memory.
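Cases (a) and (b) can be illustrated with a short pure-Python sketch. It uses nested lists rather than the HDF5 library. The helper names `hyperslab_indices` and `select_hyperslab` are hypothetical, and the array sizes are made up for the example. The `start`, `stride`, `count`, `block` parameters follow the usual hyperslab convention: `count` blocks of `block` elements whose origins are `stride` apart.

```python
from typing import List


def hyperslab_indices(start: int, stride: int, count: int, block: int) -> List[int]:
    """Indices selected along one dimension: `count` blocks of `block`
    elements, with block origins `stride` apart, beginning at `start`."""
    return [start + i * stride + j for i in range(count) for j in range(block)]


def select_hyperslab(array, start, stride, count, block):
    """Return the selected elements of a 2D list-of-lists as a 2D list."""
    rows = hyperslab_indices(start[0], stride[0], count[0], block[0])
    cols = hyperslab_indices(start[1], stride[1], count[1], block[1])
    return [[array[r][c] for c in cols] for r in rows]


# A 6 x 8 "file" array whose element at (r, c) is 10*r + c.
file_array = [[10 * r + c for c in range(8)] for r in range(6)]

# Case (a): one contiguous 2 x 3 hyperslab starting at row 1, column 2 ...
slab = select_hyperslab(file_array, start=(1, 2), stride=(1, 1),
                        count=(2, 3), block=(1, 1))

# ... copied into the top-left corner of a smaller 3 x 4 "memory" array.
memory = [[0] * 4 for _ in range(3)]
for r, row in enumerate(slab):
    memory[r][:len(row)] = row

# Case (b): a regular series of 2 x 2 blocks, flattened into a 1D sequence.
blocks = select_hyperslab(file_array, start=(0, 0), stride=(3, 4),
                          count=(2, 2), block=(2, 2))
flat = [value for row in blocks for value in row]
```

The same four parameters describe both a single contiguous region and a regular pattern of blocks, which is why one selection mechanism covers cases (a) and (b).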
Project Description: Motivation
SRB: indexing and searching; distributed data system; access control
HDF5: large and diverse data; high-performance access; interactive access and subsetting
SRB + HDF5: high-performance distributed data system
Project Description: Goals
Working prototype of a client/server system for object-level access to HDF5 files stored in the SRB
Use SRB as middleware to transfer data between the server and client
Use object-level access for interactive and efficient access to part of the file
Remote Data Access on SRB: Methods
Normal ways to access SRB:
  Get the whole file: slow for large files (e.g. 100 TB, SCEC)
  Use POSIX low-level calls: low performance
New way:
  Implement proxy operations to access objects or parts of objects in one request
Normal SRB File Access Architecture
[Diagram: client, SRB Server, MCAT, and HDF5 storage; the HDF5 file reaches the client as a whole file or a sequence of bytes]
Object-level File Access Architecture
[Diagram, client side: Client Application, HDF5 Object (File, Group, Dataset, Subset, Attribute), HDF5-SRB Module (pack/unpack messages)]
[Diagram, server side: SRB Server with MCAT, HDF5-SRB Module (pack/unpack messages), HDF5 Object (File, Group, Dataset, Subset, Attribute), HDF5 Library, HDF5 file]
Examples of File Access
HDF5
I need to see the eye of Hurricane Bob!
Examples of File Access: Whole file transfer
[Diagram: client and HDF5 file]
Get the file
Transfer large image – slow!
Examples of File Access: SRB POSIX API
[Diagram: message exchange between client and HDF5 file]
Open file / file’s open
find image / image found
open image / image open
Many small messages – slow and complex!
Examples of File Access: Object level
[Diagram: client and HDF5 file]
Get me the eye of hurricane Bob
1 request, small transfer – fast!
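The three access examples above can be compared with a toy in-memory model that counts requests and bytes transferred. Everything here is an assumption made for illustration: `ToyServer` is not SRB, the image is a made-up 100 x 100 array with one byte per pixel, and the POSIX-like client is idealized, since it already knows the row width and issues one read per row. Real POSIX-style access would also need extra reads to locate the data.

```python
class ToyServer:
    """In-memory stand-in for a remote store holding one 2D image.
    It counts requests and payload bytes; it is not SRB."""

    def __init__(self, rows: int, cols: int):
        self.rows, self.cols = rows, cols
        # One byte per pixel, stored row-major.
        self.data = bytes((r * cols + c) % 256
                          for r in range(rows) for c in range(cols))
        self.requests = 0
        self.bytes_sent = 0

    def _send(self, payload: bytes) -> bytes:
        self.requests += 1
        self.bytes_sent += len(payload)
        return payload

    def get_whole_file(self) -> bytes:
        return self._send(self.data)

    def read_bytes(self, offset: int, length: int) -> bytes:
        """POSIX-like read of a byte range."""
        return self._send(self.data[offset:offset + length])

    def get_subset(self, r0: int, c0: int, nrows: int, ncols: int) -> bytes:
        """Object-level request: the server extracts the region itself."""
        region = b"".join(
            self.data[r * self.cols + c0: r * self.cols + c0 + ncols]
            for r in range(r0, r0 + nrows))
        return self._send(region)


def fetch_whole(server, r0, c0, nrows, ncols):
    data = server.get_whole_file()          # one big transfer, slice locally
    return b"".join(data[r * server.cols + c0: r * server.cols + c0 + ncols]
                    for r in range(r0, r0 + nrows))


def fetch_posix(server, r0, c0, nrows, ncols):
    # One small read per row of the region.
    return b"".join(server.read_bytes(r * server.cols + c0, ncols)
                    for r in range(r0, r0 + nrows))


def fetch_object(server, r0, c0, nrows, ncols):
    return server.get_subset(r0, c0, nrows, ncols)   # one request, small reply


results = {}
for name, fetch in [("whole file", fetch_whole),
                    ("POSIX-like", fetch_posix),
                    ("object-level", fetch_object)]:
    server = ToyServer(100, 100)
    region = fetch(server, 40, 30, 20, 20)            # a 20 x 20 "eye"
    results[name] = (server.requests, server.bytes_sent, region)
```

In this toy setup all three strategies return the same 400-byte region. They differ only in how many requests are made and how many bytes cross the wire, which is the trade-off the slides describe.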
HDF5-SRB Model: New objects/APIs
A new set of data objects
  H5File, H5Group, H5Dataset, H5Datatype, etc.
  Encapsulate client requests and server results
Enhanced SRB APIs
  Pack/unpack routines (exchange data between byte stream and structure) to handle complicated structs: strings, pointers, pointers to arrays, arrays of pointers, etc.
  New srbGenProxyFunct (general proxy function) handles other types of requests besides HDF5
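The pack/unpack idea, turning a structure with variable-length members into a byte stream and back, can be sketched with Python's standard `struct` module. This is not the SRB pack/unpack implementation or its wire format. The `DatasetRequest` structure, its field names, and the length-prefixed big-endian layout are assumptions chosen for the example.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DatasetRequest:
    """Hypothetical request structure; field names are illustrative only."""
    op: int                 # operation code
    path: str               # variable-length string
    start: List[int]        # variable-length integer arrays
    count: List[int]


def _pack_str(s: str) -> bytes:
    raw = s.encode("utf-8")
    return struct.pack("!I", len(raw)) + raw            # length prefix + bytes


def _pack_ints(values: List[int]) -> bytes:
    return struct.pack(f"!I{len(values)}q", len(values), *values)


def pack_msg(req: DatasetRequest) -> bytes:
    """Flatten the structure into a length-prefixed, big-endian byte stream."""
    return (struct.pack("!i", req.op) + _pack_str(req.path)
            + _pack_ints(req.start) + _pack_ints(req.count))


def _unpack_str(buf: bytes, pos: int) -> Tuple[str, int]:
    (n,) = struct.unpack_from("!I", buf, pos)
    pos += 4
    return buf[pos:pos + n].decode("utf-8"), pos + n


def _unpack_ints(buf: bytes, pos: int) -> Tuple[List[int], int]:
    (n,) = struct.unpack_from("!I", buf, pos)
    pos += 4
    values = list(struct.unpack_from(f"!{n}q", buf, pos))
    return values, pos + 8 * n


def unpack_msg(buf: bytes) -> DatasetRequest:
    """Rebuild the structure from the byte stream produced by pack_msg."""
    (op,) = struct.unpack_from("!i", buf, 0)
    path, pos = _unpack_str(buf, 4)
    start, pos = _unpack_ints(buf, pos)
    count, pos = _unpack_ints(buf, pos)
    return DatasetRequest(op, path, start, count)


request = DatasetRequest(op=1, path="/hurricane/bob",
                         start=[100, 200], count=[64, 64])
wire = pack_msg(request)
restored = unpack_msg(wire)
```

Length prefixes stand in for pointers here: a receiver cannot follow a sender's pointer, so each variable-length member carries its own size in the stream.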
HDF5-SRB Model: Data Flow
Client API: srbObjRequest(void *obj, int objID)
Server API: srbObjProcess(void *obj, int objID)
[Diagram: client and SRB Server connected through srbGenProxyFunct; the server uses the HDF5 Library to reach the HDF5 file]
1. packMsg()
2. unpackMsg()
3. H5Obj::op()
4. Access file
5. H5Object
6. packMsg()
7. unpackMsg()
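The seven steps above can be traced in a minimal single-process sketch. It is a simulation under stated assumptions, not the SRB code. The functions `client_obj_request`, `server_obj_process`, and `gen_proxy_funct` are Python stand-ins named after the slide's `srbObjRequest`, `srbObjProcess`, and `srbGenProxyFunct`, and they do not reproduce the real signatures. JSON replaces the real pack/unpack routines, a dictionary replaces the HDF5 file, and the object ID value is made up.

```python
import json

H5OBJECT_DATASET = 1   # hypothetical object ID

# Stand-in for an HDF5 file on the server: dataset path -> 2D list.
SERVER_FILE = {"/storm/image": [[10 * r + c for c in range(6)] for r in range(4)]}


def pack_msg(obj: dict) -> bytes:
    return json.dumps(obj).encode("utf-8")


def unpack_msg(buf: bytes) -> dict:
    return json.loads(buf.decode("utf-8"))


def read_dataset(obj: dict) -> dict:
    """Steps 3-5: run the operation on the 'file' and fill in the result."""
    data = SERVER_FILE[obj["path"]]
    (r0, c0), (nrows, ncols) = obj["start"], obj["count"]
    obj["value"] = [row[c0:c0 + ncols] for row in data[r0:r0 + nrows]]
    return obj


# Dispatch table keyed by object ID, so other request types can be added.
HANDLERS = {H5OBJECT_DATASET: read_dataset}


def server_obj_process(request: bytes, obj_id: int) -> bytes:
    obj = unpack_msg(request)            # step 2: unpack on the server
    result = HANDLERS[obj_id](obj)       # steps 3-5: operate on the file
    return pack_msg(result)              # step 6: pack the result


def gen_proxy_funct(request: bytes, obj_id: int) -> bytes:
    """Stand-in for the network hop; here it calls the server directly."""
    return server_obj_process(request, obj_id)


def client_obj_request(obj: dict, obj_id: int) -> dict:
    reply = gen_proxy_funct(pack_msg(obj), obj_id)   # step 1, then round trip
    return unpack_msg(reply)                         # step 7: unpack on client


answer = client_obj_request(
    {"path": "/storm/image", "start": [1, 2], "count": [2, 3]},
    H5OBJECT_DATASET)
```

The client sends one packed object and receives the same object back with its result filled in, so a single round trip carries both the request and the data.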
Running Server/Client
An SRB server that supports HDF5
  HDF5 library and other external libraries (SZIP, ZLIB)
  SRB version 3.4 or later from http://www.sdsc.edu/srb/
  Follow the instructions on how to run the SRB server in the UG packed with the SRB source release or online at http://hdf.ncsa.uiuc.edu/hdf-srb-html/HDF-SRB-UG.html
Any client application that implements HDF5-SRB objects
  No HDF5 library is required on the client
  Example client application: HDFView 2.3 or above
Short Demo
HDFView
Supports Windows and Linux
Questions / Comments?