QCDgrid User Interfaces James Perry, Andrew Jackson, Stephen Booth, Lorna Smith EPCC, The University...


QCDgrid User Interfaces

James Perry, Andrew Jackson, Stephen Booth, Lorna Smith

EPCC, The University Of Edinburgh


QCDgrid Summary

QCDgrid project is developing a data and compute grid for scientists in the UKQCD collaboration
– data storage grid has been up and running for some months now
– job submission system is in early stages of development
– software developed is released as open source
– builds on Globus 2.0, eXist XML database and various other technologies

For more information on the project in general, see Lorna’s talk this afternoon!


User Requirements

User interface is naturally driven by users’ requirements
– most QCDgrid users have a good understanding of computers
– for them, advanced scripting capabilities are more important than user-friendly GUIs
– powerful command line interface is top priority for QCDgrid software

GUIs also useful for some operations
– for example, searching and browsing the metadata catalogue

C and Java APIs facilitate integration with software in many different programming languages


Datagrid Interfaces

Data management aspect of grid consists of two distinct parts

Low level data replication deals with files themselves
– files at this level are just blocks of binary data; they could contain anything
– Globus replica catalogue maps logical filenames to actual physical locations

Metadata catalogue associates some meaningful, structured information with each file
– allows users to search for data more easily
– maps interesting characteristics of data (structured as XML) to logical filenames
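The two-level lookup described on this slide can be pictured with a small in-memory model. This is only a sketch: the class name, the characteristic strings, the logical names and the host names are all made up for illustration, and the real catalogues are Globus and eXist services rather than Java maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy in-memory model of the two catalogues. Every name and value here is
// hypothetical; the real catalogues are Globus and eXist services.
final class CatalogueSketch {
    // Metadata catalogue: a characteristic of the data -> logical filenames.
    private final Map<String, List<String>> metadataCatalogue;
    // Replica catalogue: logical filename -> physical locations of its copies.
    private final Map<String, List<String>> replicaCatalogue;

    CatalogueSketch(Map<String, List<String>> metadataCatalogue,
                    Map<String, List<String>> replicaCatalogue) {
        this.metadataCatalogue = metadataCatalogue;
        this.replicaCatalogue = replicaCatalogue;
    }

    // Follow both mappings: characteristic -> logical names -> physical copies.
    List<String> locate(String characteristic) {
        List<String> locations = new ArrayList<>();
        for (String logicalName : metadataCatalogue.getOrDefault(characteristic, List.of())) {
            locations.addAll(replicaCatalogue.getOrDefault(logicalName, List.of()));
        }
        return locations;
    }

    // Small made-up data set for experimenting with locate().
    static CatalogueSketch example() {
        return new CatalogueSketch(
            Map.of("machine=machineA", List.of("cfg0001", "cfg0002")),
            Map.of("cfg0001", List.of("node-a.example.org:/data/cfg0001",
                                      "node-b.example.org:/data/cfg0001"),
                   "cfg0002", List.of("node-b.example.org:/data/cfg0002")));
    }
}
```

Calling `CatalogueSketch.example().locate("machine=machineA")` walks from a characteristic to logical names and then to every physical copy; an unknown characteristic yields an empty list.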


Low Level Data Grid Interface

Low level operations provided by command line tools and C API
– Java interface using JNI also available
– SRM-compliant interface to some functionality

Fairly small set of basic operations
– put a file or directory on the grid
– get a file or directory from the grid
– delete a file or directory
– list files on grid
– register interest in a file or directory

User must have a valid Globus proxy initialised


Data Grid Example Commands

Some example data grid commands:

put-file-on-qcdgrid /home/username/myfile gridfile

- puts the local file ‘myfile’ onto the grid under logical name ‘gridfile’
- replication software will take care of deciding where to store the file, adding replica catalogue entries, etc.

get-file-from-qcdgrid -R griddir /tmp/mydir

- gets directory ‘griddir’ from the grid, storing it in local directory /tmp/mydir
- ‘-R’ switch means ‘recursive’, works with most QCDgrid commands


More Example Commands

More example commands...

qcdgrid-list

- lists all files on grid by logical name

i-like-this-file interestingdata

- registers interest in the file with logical name ‘interestingdata’
- replication system takes this into account, tries to store files close to where they are most often wanted

qcdgrid-delete olddata

- removes all copies of the file ‘olddata’ from grid
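Since scripting is the priority, a program that drives these tools mostly needs to assemble argument lists. The following is a minimal sketch of that step, assuming only the command names and the ‘-R’ switch shown on the slides; the class and method names are hypothetical, and nothing here is part of the QCDgrid software itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that assembles argument lists for the command line
// tools shown on the slides. It only builds the lists; it runs nothing.
final class QcdgridCommandSketch {
    // put-file-on-qcdgrid <local path> <logical name>
    static List<String> put(String localPath, String logicalName) {
        return List.of("put-file-on-qcdgrid", localPath, logicalName);
    }

    // get-file-from-qcdgrid [-R] <logical name> <local path>
    static List<String> get(String logicalName, String localPath, boolean recursive) {
        List<String> argv = new ArrayList<>();
        argv.add("get-file-from-qcdgrid");
        if (recursive) {
            argv.add("-R"); // fetch a whole directory, as in the slide example
        }
        argv.add(logicalName);
        argv.add(localPath);
        return argv;
    }

    // i-like-this-file <logical name>
    static List<String> registerInterest(String logicalName) {
        return List.of("i-like-this-file", logicalName);
    }
}
```

A list built this way could be handed to `new ProcessBuilder(argv).inheritIO().start()`, provided the QCDgrid tools are on the PATH and a valid Globus proxy has been initialised.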


Data Grid APIs

APIs provide similar functionality

– Example:

import java.io.File;

QCDgridClient grid = QCDgridClient.getClient(true);
String logicalFile = "gridfile";
File physicalFile = new File("localfile");

// fetch the grid file 'gridfile' into the local file 'localfile'
grid.getFile(logicalFile, physicalFile);


Metadata

Problem: logical file names may not be meaningful
– users may have trouble finding data

Solution: metadata catalogue
– associate some meaningful information with each file on the grid
– including date produced, machine used, code used, actual physical parameters
– users can then search on these fields
– metadata is XML, stored in eXist XML database
– queried using XPath query language

Command line, GUI and Java interfaces (via standard XMLDB API) available
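To give a feel for what an XPath search over such metadata looks like, here is a minimal sketch using the JDK's built-in `javax.xml.xpath` package on an in-memory document. It does not talk to eXist or use the XML:DB API, and the element names (`catalogue`, `entry`, `logicalName`, `date`, `machine`) are assumptions made up for illustration; the real QCDgrid schema is not shown in these slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MetadataXPathSketch {
    // Hypothetical metadata; element names are invented for this sketch.
    static final String SAMPLE =
        "<catalogue>"
      + "<entry><logicalName>cfg0001</logicalName><date>2002-11-05</date>"
      + "<machine>machineA</machine></entry>"
      + "<entry><logicalName>cfg0002</logicalName><date>2003-02-17</date>"
      + "<machine>machineB</machine></entry>"
      + "<entry><logicalName>cfg0003</logicalName><date>2003-03-01</date>"
      + "<machine>machineA</machine></entry>"
      + "</catalogue>";

    // Evaluate an XPath query and return the text of each matching node.
    static List<String> select(String xml, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                query, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath query or XML: " + query, e);
        }
    }
}
```

For example, `select(SAMPLE, "/catalogue/entry[machine='machineA']/logicalName")` picks out the logical names of the entries produced on the hypothetical machine `machineA`; those names are what would then be passed to the low level data grid.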


Metadata Interface: Commands

Command line functionality currently limited to 3 operations
– submit
– remove
– update schema

Examples:

java QCDgridMetadataClient localhost:8080/exist \
    updateSchema newschema.xsd

java QCDgridMetadataClient localhost:8080/exist \
    submit newfile.xml newdocumentid


Metadata Interface: GUI

Metadata browser GUI allows users to easily search for the data they want

- XPath queries can be built using simple graphical input methods
- GUI generated automatically from current schema
- when schema is updated, GUI updates itself
- matching data can be easily retrieved from the grid


Searching MDC, Step 1

Main browser window gives a list of saved queries
– these are stored in the user’s profile
– support for ‘libraries’ of queries is planned


Searching MDC, Step 2

Creating a new query
– first a node in the XML document structure must be selected from the tree
– tree is automatically generated from schema when browser starts up
– e.g. to find all the data produced on a certain date, user should select the ‘date’ node


Searching MDC, Step 3

Once node has been selected, predicate must be specified
– this is just an XPath term for criteria for matching node data
– predicate can be entered as raw XPath if desired
– most users will want to make use of form to simplify process


Searching MDC, Step 3 cont...

More complex queries can be created relatively easily
– in this example, the query is extended to search for data from 2 years
– for most queries, knowledge of XPath is not required
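The node-plus-predicate model from Steps 2 and 3 maps naturally onto string assembly. The sketch below shows one way a form could turn a selected node and a set of year choices into an XPath expression; it is not the browser's actual code, and the class name, the method names, the `//date` path and the use of `starts-with` for year matching are all assumptions.

```java
import java.util.List;

// Hypothetical query builder: selected node + predicates -> XPath string.
final class QueryBuilderSketch {
    // Predicate matching nodes whose text begins with the given year,
    // assuming dates are written year-first (e.g. 2002-11-05).
    static String yearPredicate(int year) {
        return "starts-with(., '" + year + "')";
    }

    // Attach the predicates to the selected node; alternatives are joined
    // with XPath's "or" so that any one of them may match.
    static String build(String nodePath, List<String> predicates) {
        if (predicates.isEmpty()) {
            return nodePath;
        }
        return nodePath + "[" + String.join(" or ", predicates) + "]";
    }
}
```

Extending a one-year query to two years is then just a matter of adding a second predicate: `build("//date", List.of(yearPredicate(2002), yearPredicate(2003)))`. A real form would also need to escape quote characters in user-entered values, which this sketch does not attempt.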


Searching MDC, Step 4

New query now appears on list
– from here, queries can be managed
– queries can be combined together
– or can be submitted to the database backend


Searching MDC, Step 5

Matching metadata documents are displayed
– XML is parsed into easy-to-read, expandable tree format
– corresponding data files can be fetched from grid at the press of a button
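Turning an XML document into a tree view starts with a DOM parse and a recursive walk. The sketch below does that with the JDK's standard parser and renders the result as indented text rather than an expandable GUI widget; the class name and the sample element names used with it are hypothetical, and this is not the browser's own rendering code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical renderer: parses XML and prints one line per element or
// non-blank text node, indented by depth in the document tree.
final class MetadataTreeSketch {
    static String render(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            append(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static void append(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getTextContent().trim();
            if (!text.isEmpty()) {
                out.append("  ".repeat(depth)).append(text).append('\n');
            }
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments, processing instructions etc. are skipped
        }
        out.append("  ".repeat(depth)).append(node.getNodeName()).append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            append(children.item(i), depth + 1, out);
        }
    }
}
```

A GUI would attach each element to a collapsible tree node instead of appending text, but the traversal is the same.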


Job Submission

QCDgrid job submission still in very early stages

As with data management, users require command line interface that can be used from scripts
– integration with data grid will simplify user interface
– unlike plain Globus, job input, output and error streams can be redirected to and from the user’s console
– this allows for interactive jobs on the grid, useful for debugging etc.

GUI or web portal interface may be added later


Example Commands

Early prototype of job submission software is up and running

Syntax quite similar to globus-job-run

– Example commands:

qcdgrid-job-submit qcdtest.epcc.ed.ac.uk \
    /bin/date

qcdgrid-job-submit doorstopper.epcc.ed.ac.uk \
    /usr/bin/program arg1 arg2 arg3 \
    --fetch-from-qcdgrid gridName localName


Administrator Interface

Previous slides have focussed on normal end users’ experience

QCDgrid software also provides tools to aid in administration
– commands to add and remove grid nodes, and change the state of existing nodes
– commands for building and maintaining the Globus replica catalogue
– commands for maintaining directory of grid users

Admin GUI to integrate many of these functions is a possibility


Some Admin Commands

Administrators are identified by their certificate subjects
– must have a valid proxy with subject listed in the config file before executing these commands

add-qcdgrid-node newnode.ed.ac.uk Edinburgh \
    /home/qcdgrid

disable-qcdgrid-node notworking.ed.ac.uk

verify-qcdgrid-rc

setup-security.sh adduser James Perry \
    /O=certificate/O=subject/CN=jtp \
    [email protected]
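The rule that administrators are recognised by certificate subject amounts to a membership test against a configured list. The sketch below illustrates that idea only. The config format (one subject per line, ‘#’ for comments) is an assumption, the class and method names are hypothetical, and it presumes the identity subject has already been extracted from the user's proxy.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical admin check: is the caller's certificate subject listed?
final class AdminCheckSketch {
    // Assumed format: one subject per line; blank lines and lines starting
    // with '#' are ignored. The real QCDgrid config format may differ.
    static Set<String> parseAdminSubjects(String configText) {
        Set<String> subjects = new LinkedHashSet<>();
        for (String line : configText.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                subjects.add(trimmed);
            }
        }
        return subjects;
    }

    // Exact string match on the subject; no normalisation is attempted.
    static boolean isAdministrator(String subject, Set<String> adminSubjects) {
        return adminSubjects.contains(subject);
    }
}
```

With a config containing the subject from the example above, `isAdministrator("/O=certificate/O=subject/CN=jtp", admins)` would allow the admin commands to proceed, while any other subject would be refused.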


Interface Summary

Low level data grid has command line interface and APIs

Metadata catalogue mainly accessed through browser GUI
– this also integrates with low level data grid

Job submission currently usable from command line only
– possible GUI/web portal in future

Various admin tools exist or are in development

Better integration of the different parts of the project is planned