QCDgrid User Interfaces
James Perry, Andrew Jackson, Stephen Booth, Lorna Smith
EPCC, The University Of Edinburgh
QCDgrid Summary
The QCDgrid project is developing a data and compute grid for scientists in the UKQCD collaboration
– data storage grid has been up and running for some months now
– job submission system is in early stages of development
– software developed is released as open source
– builds on Globus 2.0, the eXist XML database and various other technologies
For more information on the project in general, see Lorna’s talk this afternoon!
User Requirements
User interface is naturally driven by users’ requirements
– most QCDgrid users have a good understanding of computers
– for them, advanced scripting capabilities are more important than user-friendly GUIs
– a powerful command line interface is the top priority for QCDgrid software
GUIs are also useful for some operations
– for example, searching and browsing the metadata catalogue
C and Java APIs facilitate integration with software in many different programming languages
Datagrid Interfaces
Data management aspect of grid consists of two distinct parts
Low level data replication deals with the files themselves
– files at this level are just blocks of binary data; they could contain anything
– Globus replica catalogue maps logical filenames to actual physical locations
Metadata catalogue associates some meaningful, structured information with each file
– allows users to search for data more easily
– maps interesting characteristics of data (structured as XML) to logical filenames
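The two layers described above can be pictured as two lookups chained together. The following is only a minimal in-memory sketch, not QCDgrid code: the class name, the method names, the `key=value` form of a characteristic and the example node names are all hypothetical, and plain maps stand in for the eXist-backed metadata catalogue and the Globus replica catalogue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain maps stand in for the real catalogues.
class CatalogueSketch {
    // Metadata catalogue stand-in: a searchable characteristic -> logical filenames.
    static final Map<String, List<String>> metadata = new HashMap<>();
    // Replica catalogue stand-in: logical filename -> physical locations.
    static final Map<String, List<String>> replicas = new HashMap<>();

    // Record that a logical file has the given characteristic.
    static void describe(String characteristic, String logicalName) {
        metadata.computeIfAbsent(characteristic, k -> new ArrayList<>()).add(logicalName);
    }

    // Record one physical copy of a logical file.
    static void addReplica(String logicalName, String physicalLocation) {
        replicas.computeIfAbsent(logicalName, k -> new ArrayList<>()).add(physicalLocation);
    }

    // Chain the two lookups: characteristic -> logical names -> physical locations.
    static List<String> locate(String characteristic) {
        List<String> locations = new ArrayList<>();
        for (String logicalName : metadata.getOrDefault(characteristic, Collections.emptyList())) {
            locations.addAll(replicas.getOrDefault(logicalName, Collections.emptyList()));
        }
        return locations;
    }

    public static void main(String[] args) {
        describe("machine=hypothetical-machine", "gridfile");
        addReplica("gridfile", "nodeA.example.org:/data/gridfile");
        addReplica("gridfile", "nodeB.example.org:/data/gridfile");
        System.out.println(locate("machine=hypothetical-machine"));
    }
}
```

The point of the split is that users only ever deal with characteristics and logical names; where the copies physically live is resolved by the lower layer.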
Low Level Data Grid Interface
Low level operations are provided by command line tools and a C API
– Java interface using JNI also available
– SRM-compliant interface to some functionality
Fairly small set of basic operations
– put a file or directory on the grid
– get a file or directory from the grid
– delete a file or directory
– list files on grid
– register interest in a file or directory
User must have a valid Globus proxy initialised
Data Grid Example Commands
Some example data grid commands:
put-file-on-qcdgrid /home/username/myfile gridfile
- puts the local file ‘myfile’ onto the grid under logical name ‘gridfile’
- replication software will take care of deciding where to store the file, adding replica catalogue entries, etc.
get-file-from-qcdgrid -R griddir /tmp/mydir
- gets directory ‘griddir’ from the grid, storing it in local directory /tmp/mydir
- ‘-R’ switch means ‘recursive’, works with most QCDgrid commands
More Example Commands
More example commands:
qcdgrid-list
- lists all files on grid by logical name
i-like-this-file interestingdata
- registers interest in the file with logical name ‘interestingdata’
- replication system takes this into account, tries to store files close to where they are most often wanted
qcdgrid-delete olddata
- removes all copies of the file ‘olddata’ from grid
Data Grid APIs
APIs provide similar functionality
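– Example:
import java.io.File;
QCDgridClient grid = QCDgridClient.getClient(true);
String logicalFile = "gridfile";
File physicalFile = new File("localfile");
// fetch the grid file with logical name "gridfile" into the local file "localfile"
grid.getFile(logicalFile, physicalFile);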
Metadata
Problem: logical file names may not be meaningful
– users may have trouble finding data
Solution: metadata catalogue
– associate some meaningful information with each file on the grid
– including date produced, machine used, code used, actual physical parameters
– users can then search on these fields
– metadata is XML, stored in eXist XML database
– queried using XPath query language
Command line, GUI and Java interfaces (via standard XMLDB API) available
Metadata Interface: Commands
Command line functionality currently limited to 3 operations
– submit
– remove
– update schema
Examples:
java QCDgridMetadataClient localhost:8080/exist \
    updateSchema newschema.xsd
java QCDgridMetadataClient localhost:8080/exist \
    submit newfile.xml newdocumentid
Metadata Interface: GUI
Metadata browser GUI allows users to easily search for the data they want
- XPath queries can be built using simple graphical input methods
- GUI generated automatically from current schema
- when schema is updated, GUI updates itself
- matching data can be easily retrieved from the grid
Searching MDC, Step 1
Main browser window gives a list of saved queries
– these are stored in the user’s profile
– support for ‘libraries’ of queries is planned
Searching MDC, Step 2
Creating a new query
– first a node in the XML document structure must be selected from the tree
– tree is automatically generated from schema when browser starts up
– e.g. to find all the data produced on a certain date, user should select the ‘date’ node
Searching MDC, Step 3
Once a node has been selected, a predicate must be specified
– ‘predicate’ is simply the XPath term for the criteria that the node data must match
– predicate can be entered as raw XPath if desired
– most users will want to make use of the form to simplify the process
Searching MDC, Step 3 cont...
More complex queries can be created relatively easily
– in this example, the query is extended to search for data from 2 years
– for most queries, knowledge of XPath is not required
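For readers who do want to see what such a query looks like as raw XPath, here is a small sketch using the standard `javax.xml.xpath` API from the JDK. It does not talk to eXist or use the QCDgrid schema: the sample document, its `catalogue`, `config` and `date` element names, and the class and method names are all invented for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only: the element names are not the QCDgrid schema.
class XPathPredicateSketch {
    // Three invented metadata entries, each with a production date.
    static final String SAMPLE =
        "<catalogue>"
      + "<config id='a'><date>2001-11-03</date></config>"
      + "<config id='b'><date>2002-05-17</date></config>"
      + "<config id='c'><date>2003-01-09</date></config>"
      + "</catalogue>";

    // Evaluate an XPath query against the sample and count the matching nodes.
    static int countMatches(String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                query, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("invalid XPath query: " + query, e);
        }
    }

    public static void main(String[] args) {
        // Node selection plus a predicate: entries dated in a single year.
        String oneYear = "//config[starts-with(date, '2001')]";
        // The predicate extended with 'or' to cover two years.
        String twoYears = "//config[starts-with(date, '2001') or starts-with(date, '2002')]";
        System.out.println(countMatches(oneYear));
        System.out.println(countMatches(twoYears));
    }
}
```

The part in square brackets is the predicate; extending a query to a second year only means adding another condition inside it.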
Searching MDC, Step 4
New query now appears on list
– from here, queries can be managed
– queries can be combined together
– or can be submitted to the database backend
Searching MDC, Step 5
Matching metadata documents are displayed
– XML is parsed into easy-to-read, expandable tree format
– corresponding data files can be fetched from grid at the press of a button
Job Submission
QCDgrid job submission is still in very early stages
As with data management, users require a command line interface that can be used from scripts
– integration with data grid will simplify user interface
– unlike plain Globus, job input, output and error streams can be redirected to and from the user’s console
– this allows for interactive jobs on the grid, useful for debugging etc.
GUI or web portal interface may be added later
Example Commands
Early prototype of job submission software is up and running
Syntax quite similar to globus-job-run
– Example commands:
qcdgrid-job-submit qcdtest.epcc.ed.ac.uk \
    /bin/date
qcdgrid-job-submit doorstopper.epcc.ed.ac.uk \
    /usr/bin/program arg1 arg2 arg3 \
    --fetch-from-qcdgrid gridName localName
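Since the command line is meant to be driven from scripts and programs, a caller might assemble the argument list programmatically. The sketch below only builds the argument list in the order shown in the examples above and does not execute anything; the class and method names are hypothetical, and treating the fetch option as optional is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles, but does not run, a qcdgrid-job-submit command line.
class JobCommandSketch {
    // gridName and localName may both be null, in which case no fetch option is added.
    static List<String> buildCommand(String host, String program, List<String> programArgs,
                                     String gridName, String localName) {
        List<String> command = new ArrayList<>();
        command.add("qcdgrid-job-submit");
        command.add(host);
        command.add(program);
        command.addAll(programArgs);
        if (gridName != null && localName != null) {
            command.add("--fetch-from-qcdgrid");
            command.add(gridName);
            command.add(localName);
        }
        return command;
    }

    public static void main(String[] args) {
        List<String> command = buildCommand("doorstopper.epcc.ed.ac.uk", "/usr/bin/program",
                Arrays.asList("arg1", "arg2", "arg3"), "gridName", "localName");
        // Print the command as it would be typed in a shell.
        System.out.println(String.join(" ", command));
    }
}
```

On a machine with the QCDgrid tools installed, a list like this could be handed to `java.lang.ProcessBuilder` to launch the job.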
Administrator Interface
Previous slides have focussed on normal end users’ experience
QCDgrid software also provides tools to aid in administration
– commands to add and remove grid nodes, and change the state of existing nodes
– commands for building and maintaining the Globus replica catalogue
– commands for maintaining directory of grid users
Admin GUI to integrate many of these functions is a possibility
Some Admin Commands
Administrators are identified by their certificate subjects
– must have a valid proxy with subject listed in the config file before executing these commands
add-qcdgrid-node newnode.ed.ac.uk Edinburgh \
    /home/qcdgrid
disable-qcdgrid-node notworking.ed.ac.uk
verify-qcdgrid-rc
setup-security.sh adduser James Perry \
    /O=certificate/O=subject/CN=jtp \
    [email protected]
Interface Summary
Low level data grid has command line interface and APIs
Metadata catalogue mainly accessed through browser GUI
– this also integrates with low level data grid
Job submission currently usable from command line only
– possible GUI/web portal in future
Various admin tools exist or are in development
Better integration of the different parts of the project is planned