CDF-UK MINI-GRID

Ian McArthur
Oxford University, Physics Department
Ian.McArthur@physics.ox.ac.uk

HEPiX/HEPNT 2000, 3rd Nov 2000


Background

CDF collaborators in the UK applied for a JIF grant for IT equipment in 1998. Awarded £1.67M in summer 2000.
First half of grant will buy
  – Multiprocessor systems plus 1 TB of disk for 4 universities
  – 2 multiprocessors plus 2.5 TB of disk for RAL
  – A 32 CPU farm for RAL
  – 5 TB of disk and 8 high-end workstations for FNAL
Emphasis on high IO throughput ‘super-workstations’.
A dedicated network link from London to FNAL.


CDF-UK Equipment Bid


Hardware and Network

Tender document is written and the schedule is on target for equipment delivery in May 2001. Second phase starts June 2002.
Developed a scheme for transparent access to CDF systems via the US link.
  – Each system that CDF-UK needs to reach over the link has an alternative IP name and address, so that data can be sent down the dedicated link.
  – A Network Address Translation scheme ensures that return traffic takes the same path (symmetric routing).
  – Demonstrated the scheme working with 2 Cisco routers on a local network.
  – Starting to talk to network providers to implement the physical link.
  – Must try to make Kerberos work across this link.
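The alternative-address scheme can be pictured with a toy model. The sketch below is only an illustration of the idea, not the router configuration used in the demonstration: it assumes translation of both source and destination, and every address in it is a hypothetical value from the RFC 5737 documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Hypothetical addresses, not real hosts.
ALIAS_TO_REAL = {"198.51.100.10": "192.0.2.10"}  # remote host: alias -> real address
UK_TO_LINK = {"203.0.113.5": "198.51.100.205"}   # UK host: real -> link-side address

REAL_TO_ALIAS = {real: alias for alias, real in ALIAS_TO_REAL.items()}
LINK_TO_UK = {link: real for real, link in UK_TO_LINK.items()}


def via_dedicated_link(packet):
    """Only traffic addressed to an alias is sent down the dedicated link."""
    return packet.dst in ALIAS_TO_REAL


def nat_outbound(packet):
    """Rewrite both ends so the remote host replies to a link-side address."""
    return Packet(UK_TO_LINK[packet.src], ALIAS_TO_REAL[packet.dst])


def nat_return(packet):
    """Undo the translation for the reply, which arrives over the same link."""
    return Packet(REAL_TO_ALIAS[packet.src], LINK_TO_UK[packet.dst])


request = Packet("203.0.113.5", "198.51.100.10")
translated = nat_outbound(request)
reply = nat_return(Packet(translated.dst, translated.src))
```

In this model the reply is addressed to a link-side address, so it returns over the dedicated link, and the UK host sees it as coming from the alias it originally contacted.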


Software Project

JIF proposal only covered hardware, but in the meantime GRID has arrived!
Aim to provide a scheme to allow efficient use of the new equipment and other distributed resources.
Concentrate on solving real-user issues.
Develop an architecture for locating data, data transfer and job submission within a distributed environment.
Based on the GRID architecture, initially on top of the Globus toolkit.
Gives us experience in this rapidly developing field.


Some Requirements

Want an efficient environment, so automate routine tasks as much as possible.
With few resources available, must make best use of existing packages and require few or no modifications to existing software.
To make best use of the systems available:
  – data may need to be moved to where there is available CPU,
  – or a job may need to be submitted to a remote site to avoid moving the data.
Produce a simple but useful system ASAP.


Design principles

All sites are equal.
All sites hold meta-data describing only local data.
Use LDAP to publish meta-data kept in:
  – Oracle at FNAL
  – mSQL at most other places
    • may go to MySQL
  – Can introduce caching but keep it simple at first
Use local intelligence at each end of data transfer
  – allows us to take account of local idiosyncrasies, e.g. use of near-line storage, disk space management
Use existing Disk Inventory Manager
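Because each site publishes only its own meta-data, locating a dataset means asking every site in turn. A minimal sketch of that lookup follows; the in-memory catalogues stand in for each site's LDAP server, and the site, dataset and fileset names are hypothetical.

```python
# Hypothetical per-site catalogues: each site describes only its local data.
SITE_CATALOGS = {
    "oxford": {"example_dataset": ["fs001"]},
    "ral": {"example_dataset": ["fs001", "fs002"]},
    "fnal": {"example_dataset": ["fs001", "fs002", "fs003"]},
}


def search_site(site, dataset):
    """Stand-in for an LDAP search against one site's published meta-data."""
    return SITE_CATALOGS[site].get(dataset, [])


def locate(dataset, sites):
    """Return {fileset: [sites holding it]} by querying each site in order."""
    locations = {}
    for site in sites:
        for fileset in search_site(site, dataset):
            locations.setdefault(fileset, []).append(site)
    return locations


where = locate("example_dataset", ["oxford", "ral", "fnal"])
```

With no central catalogue there is nothing to keep in sync, at the cost of one query per site. A cache could later sit in front of `search_site` without changing `locate`.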


CDF Data

Dataset: a primary dataset contains all the processed data from a specific physics channel.
  – Secondary datasets are made by event selection.
  – Datasets will grow over time as more data is taken and data continues to be processed.
Fileset: smallest collection of data which can be requested from the data handling system. At Fermilab, a fileset is mapped to a single partition on a tape and contains a few files.
File: a member of a fileset. The smallest unit of data known to a filesystem, typically 1 GB.
Metadata: stores relationships between files, filesets and datasets, run conditions, luminosity etc.
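The dataset, fileset and file hierarchy can be written down as a small data model. This is a sketch for illustration only; the class and field names are assumptions, not the CDF data handling schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    name: str
    size_gb: float = 1.0  # files are typically about 1 GB


@dataclass
class Fileset:
    """Smallest unit that can be requested from the data handling system."""
    name: str
    files: List[File] = field(default_factory=list)

    def size_gb(self):
        return sum(f.size_gb for f in self.files)


@dataclass
class Dataset:
    name: str
    filesets: List[Fileset] = field(default_factory=list)
    parent: Optional[str] = None  # set for a secondary dataset (hypothetical field)

    def size_gb(self):
        return sum(fs.size_gb() for fs in self.filesets)


fileset = Fileset("fs001", [File("a.dat"), File("b.dat"), File("c.dat", 0.5)])
primary = Dataset("example_primary", [fileset])
secondary = Dataset("example_selection", parent=primary.name)
```

Keeping the size on the file and summing upwards gives the dataset sizes that a copy-time estimate needs.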


Data Location/Copy


Layers

User Interface

Dataset maintainer

Data locator

Data copier

Globus toolkit

...

Job Submission


Functionality at a site

A mechanism to allow jobs from participating sites to be run.
Publication of the local metadata.
Publication of information about other system resources (CPU, disk, batch queues etc.).
Transmission of data via network.
  – This may involve staging of data from tape to disk before transmission.
Receive data from the network or from tapes.
Copy or construct metadata.
Some sites may have reduced functionality.


Scope

Plan to install at
  – 4 UK universities (Glasgow, Liverpool, Oxford, UCL)
  – RAL
  – FNAL (although this would be reduced functionality: data and metadata exporter)
  – More non-UK sites could be included
Intend to have basic utilities in place at time of equipment installation (May 2001).


Work so far

Project plan under development; once finished, additional resources will be requested.
Globus installed at a number of sites. Remote execution of shell commands checked.
Some bits demonstrated:
  – LDAP to Oracle via Python script
    • Python is a convenient scripting language for the job
    • May use a daemon to hold the connection to Oracle
    • Only LDAP search is implemented, and even this is quite tricky because the script should support filter, base and scope.
    • LDAP schema will not reflect the full SQL schema but just what is needed.
  – Java to LDAP (via JNDI)
    • JNDI (Java Naming and Directory Interface) gives a very elegant interface to LDAP
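To show why even search alone takes some care, here is a minimal sketch of turning an LDAP search filter into a SQL query. It is not the script described above: it uses sqlite3 as a stand-in for Oracle, handles only equality terms optionally joined by `&`, ignores base and scope, and the table, column and attribute names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical mapping: only the attributes the LDAP schema needs to expose.
ATTR_TO_COLUMN = {"dataset": "dataset_name", "fileset": "fileset_name"}

_TERM = re.compile(r"\((\w+)=([^()*]+)\)")


def filter_to_where(ldap_filter):
    """Translate '(a=v)' or '(&(a=v)(b=w))' into a parameterised WHERE clause."""
    body = ldap_filter.strip()
    if body.startswith("(&") and body.endswith(")"):
        body = body[2:-1]
    terms = _TERM.findall(body)
    # Reject anything that is not exactly a sequence of equality terms.
    if not terms or "".join(f"({a}={v})" for a, v in terms) != body:
        raise ValueError(f"unsupported filter: {ldap_filter!r}")
    for attr, _ in terms:
        if attr not in ATTR_TO_COLUMN:
            raise ValueError(f"unknown attribute: {attr!r}")
    clauses = [f"{ATTR_TO_COLUMN[attr]} = ?" for attr, _ in terms]
    return " AND ".join(clauses), [value for _, value in terms]


def search(conn, ldap_filter):
    """Run the translated filter against the (hypothetical) filesets table."""
    where, params = filter_to_where(ldap_filter)
    sql = ("SELECT dataset_name, fileset_name FROM filesets "
           f"WHERE {where} ORDER BY fileset_name")
    return conn.execute(sql, params).fetchall()


def demo_connection():
    """In-memory database with a few made-up rows, for trying the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE filesets (dataset_name TEXT, fileset_name TEXT)")
    conn.executemany(
        "INSERT INTO filesets VALUES (?, ?)",
        [("example_a", "fs001"), ("example_a", "fs002"), ("example_b", "fs003")],
    )
    return conn
```

For example, `search(demo_connection(), "(dataset=example_a)")` returns the two filesets recorded for that dataset. Values are passed as query parameters rather than pasted into the SQL, and unsupported filters (OR, NOT, wildcards) are rejected rather than guessed at.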


Longer Term Goals

User Interface to be implemented as a Java application to give platform independence.
UI to automate or suggest strategies for moving data/submitting jobs
  – Need to include cost/elapsed time estimates for task completion
  – Need to look up dataset sizes, network health, time to copy from tape or disk, CPU load etc.
Look for more generic solutions.
Evaluate any new GRID tools which might standardize any parts we've implemented ourselves.
Consolidation with other GRID projects.
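The choice between moving data and submitting a job remotely comes down to comparing elapsed-time estimates. The sketch below shows one simple way such a comparison might look; the function names, parameters and the cost model itself are assumptions for illustration, and a real estimate would draw on measured network health and CPU load.

```python
def copy_time_s(size_gb, bandwidth_mbit_s, staging_s=0.0):
    """Estimated seconds to copy size_gb, plus any tape-to-disk staging time."""
    megabits = size_gb * 8000.0  # 1 GB = 8000 Mbit (decimal units)
    return staging_s + megabits / bandwidth_mbit_s


def choose_strategy(size_gb, bandwidth_mbit_s, local_run_s, remote_run_s,
                    remote_queue_s=0.0, staging_s=0.0):
    """Return (strategy, estimated elapsed seconds) for the cheaper option."""
    move_data = copy_time_s(size_gb, bandwidth_mbit_s, staging_s) + local_run_s
    move_job = remote_queue_s + remote_run_s
    if move_data < move_job:
        return ("move data", move_data)
    return ("submit remotely", move_job)


# 10 GB over a 100 Mbit/s link, a one-hour job, a 10-minute remote queue.
suggestion = choose_strategy(10, 100, local_run_s=3600, remote_run_s=3600,
                             remote_queue_s=600)
```

Here copying takes an estimated 800 s, so running the job where the data already sits (4200 s in total) comes out ahead of copying first (4400 s).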