A PPARC funded project Deploying and Configuring a Registry .
May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration...
-
Upload
jesse-chase -
Category
Documents
-
view
216 -
download
0
description
Transcript of May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration...
![Page 1: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/1.jpg)
May 2005 PPARC e-Science PG School 1
QCDgrid
Chris Maynard
A Grid for UKQCD
National collaboration for lattice QCD
![Page 2: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/2.jpg)
May 2005 PPARC e-Science PG School 2
Contents UKQCD science
– Bluffers guide to lattice QCD– Why UKQCD needs a grid
QCDgrid– QCDgrid architecture– Metadata, metadata, metadata
• XML schema
Sharing data Putting it all together
![Page 3: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/3.jpg)
May 2005 PPARC e-Science PG School 3
Lattice QCD Quantum Chromodynamics is …
– Theory of strong interaction• Properties of hadrons (proton, pion etc)• from quarks and gluons
– Strongly coupled• Perturbation theory only works in a specific regime
Replace space-time with 4d lattice Compute numerically QFT path integral
– Integrate operators over all quark and gluon fields
![Page 4: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/4.jpg)
May 2005 PPARC e-Science PG School 4
Monte Carlo integration Generate finite number of configurations of
quark and gluon fields Probability given by the integrand
– Importance sampling• Only generate configs which contribute
Integral finite sum– Compute operators on each configuration– Average over ensemble– Statistical estimate of expectation value
![Page 5: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/5.jpg)
May 2005 PPARC e-Science PG School 5
The hard bit The fermion matrix (quark-gluon coupling)
– Proportional to the volume– Large and sparse and badly conditioned
qN mC 1
min
max
Up and down quark mass nearly zero!– Calculate determinant and inverse M
• Generate Markov Chain Requires biggest computer you can find
![Page 6: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/6.jpg)
May 2005 PPARC e-Science PG School 6
QCDOC – system on a chip
![Page 7: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/7.jpg)
May 2005 PPARC e-Science PG School 7
Data and processing Many data sets ensembles of hundreds of
configurations Depend on lattice spacing, quark mass etc Very costly to produce UKQCD members distributed around UK Processing jobs on
– Workstation, Linux cluster, QCDOC data grid
![Page 8: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/8.jpg)
May 2005 PPARC e-Science PG School 8
What do I want from the grid Data security
– Replication - Multiple copies of data– Who has read/write permission
Data access– Metadata - discover what is stored– Get data without having to know location– Logistics of data moving taken care of
![Page 9: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/9.jpg)
May 2005 PPARC e-Science PG School 9
Machines – LHC terminology Tier 1 system
– 50TByte SAN system• QCDOC front end directly connected
– Hold one copy of all UKQCD data• Machine exists, not yet part of grid
Tier 2 systems – Edinburgh, Liverpool, Southampton, Swansea
• Linux server + RAID array disk• Now: 5 TByte. Upgrade to12.5 TByte
50 TByte capacity + 50 TByte replicated
![Page 10: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/10.jpg)
May 2005 PPARC e-Science PG School 10
Software Middleware: Globus 2.4 RC: Globus RC MC: Native XML database eXist Bespoke Control Thread (CT) Command line client tools GUI client tools
– Metadata browser• Based on OGSA-DIA browser
– Job submission tool
![Page 11: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/11.jpg)
May 2005 PPARC e-Science PG School 11
Control Thread
![Page 12: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/12.jpg)
May 2005 PPARC e-Science PG School 12
Command line tools For grid administrator
– Add/remove machine– Disable/enable machine– Add/remove users (see later)– Remove Data
For user– Add, retrieve, store locally, data– Job submission post-production on grid data
• Remote machine not required to be QCDgrid
![Page 13: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/13.jpg)
May 2005 PPARC e-Science PG School 13
QCDgrid GUI Same functionality as command line Metadata browser
– Search and read metadata Based on OGSA-DIA browser
– Freely available from qcdgrid.forge.nesc.ac.uk Find data via metadata Can read MC without authentificaton
![Page 14: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/14.jpg)
May 2005 PPARC e-Science PG School 14
Authenticated access QCDgrid data access
– X509 certificate from trusted CA• Currently only UK e-Science CA• Will trust other CAs! For data sharing
– QCDgrid is not closed• Not anonymous, but authenticated access.
– ILDG data sharing• Technically could move to anonymous access• Require data sharing policy from ILDG board
![Page 15: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/15.jpg)
May 2005 PPARC e-Science PG School 15
Job submission tools Can talk to any system running globus
– You need authenticated access Submit job and which data on QCDgrid Get back results
– Data can be stored on QCDgrid For instance NGS can be used to process
QCDgrid data
![Page 16: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/16.jpg)
May 2005 PPARC e-Science PG School 16
Grid concepts - Namespace Conventional namespace is a
mathematical set– Used in definition of Logical filename
XML Namespace defined by W3.org as A collection of names identified by a URI
reference XML namespace has internal structure Can amalgamate namespaces
![Page 17: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/17.jpg)
May 2005 PPARC e-Science PG School 17
Grid concepts – logical filename
Logical filename (LFN) is a name in a namespace which identifies a file
Often it is a URI– Not machine.domain:/path
Data grid LFN references a file which can have several copies
Replica catalogue maps LFN to file instances
![Page 18: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/18.jpg)
May 2005 PPARC e-Science PG School 18
Grid concepts – replica catalogue Replica catalogue maps LFN to actual file
instance Data grid - several copies of file Replica catalogue tracks number and
location of file instances Data access is via the LFN and the replica
catalogue
![Page 19: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/19.jpg)
May 2005 PPARC e-Science PG School 19
Metadata Data about data “meaningful” filenames not enough Require a scheme for organising metadata Scheme has to be extensible
– New things not previously thought of Only know full metadata when data
created Metadata capture is very important
![Page 20: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/20.jpg)
May 2005 PPARC e-Science PG School 20
D52C202K3500U010010_LL3450X_FL3400X_CMesonT00T31
A broken scheme
UKQCD filename– What does X stand for?
Wilson, Rotated, Clover– Many different clover. Scheme broken– X means none of the above!
D52C202K3500U010010_LL3450X_FL3400X_CMesonT00T31
Dynamical cSW=2.02
NP determined – no information
![Page 21: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/21.jpg)
May 2005 PPARC e-Science PG School 21
XML
XML eXtensible Markup Language www.w3.org/XML
– XML is for structuring data
– XML looks a bit like HTML
– XML is text, but isn't meant to be read
– XML is verbose by design
– XML is a family of technologies
– XML is license-free, platform-independent and well-supported
![Page 22: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/22.jpg)
May 2005 PPARC e-Science PG School 22
FAQs about XML schema
What is XML schema?– Collection of rules for XML documents– An XML schema is itself an XML document
Why do we need an XML schema?– Computers can read and understand XML IDs– <length>16</length>– Meaning of length is context dependent
![Page 23: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/23.jpg)
May 2005 PPARC e-Science PG School 23
QCDml1.1
Metadata split into two schemata– Ensemble XML <markovChain/>– Config XML <gaugeConfiguration/>
• N.B. use lowerCamelConvention
ILDG website for XML schema files– http://www.lqcd.org/ildg– Go to Metadata and follow links– Version 1.1 online and ready to use
![Page 24: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/24.jpg)
May 2005 PPARC e-Science PG School 24
Namespaces
Example XML ID for UKQCD data
XML Namespace defined by W3.org as A collection of names identified by a URI
reference
![Page 25: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/25.jpg)
May 2005 PPARC e-Science PG School 25
First namespace
URI defines namespace for QCDml
This is the default namespace All elements of QCDml belong to this
namespace
![Page 26: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/26.jpg)
May 2005 PPARC e-Science PG School 26
Second namespace
Namespace of XML schema itself
Prefix <xsi:> for elements of XML schema
XML ID is valid against WC3 XML schema
![Page 27: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/27.jpg)
May 2005 PPARC e-Science PG School 27
SchemaLocation
The namespace of the schema
The file which contains the schema URI namespace can be URL of the
schema instance – not compulsory
![Page 28: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/28.jpg)
May 2005 PPARC e-Science PG School 28
Logical filename
Unique URI for a file in a namespace
Uniquely identifies this ensemble in ILDG namespace
![Page 29: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/29.jpg)
May 2005 PPARC e-Science PG School 29
Validation
Verify XML ID is valid against a schema– Schema aware applications can use XML ID
Can write XML in vi,emacs etc CMM uses XMLSpy for schema and ID
manipulation– built in validator, create XML ID from schema
http://www.w3.org/XML/Schema– Many different tools
![Page 30: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/30.jpg)
May 2005 PPARC e-Science PG School 30
Configuration XML
Logical File name of the ensemble in the ILDG namespace
![Page 31: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/31.jpg)
May 2005 PPARC e-Science PG School 31
dataLFN
Logical File name of the configuration in the ILDG namespace
![Page 32: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/32.jpg)
May 2005 PPARC e-Science PG School 32
The markov chain
Where the configuration is in the trajectory of markov chain
![Page 33: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/33.jpg)
May 2005 PPARC e-Science PG School 33
avePlaquette
Very useful metadata, can be used to check data transformations are correct
![Page 34: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/34.jpg)
May 2005 PPARC e-Science PG School 34
Config: UKQCD example
![Page 35: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/35.jpg)
May 2005 PPARC e-Science PG School 35
Name hierarchy
Unique name in for ensemble in ILDG namespace
<markovChainURI/> Ensemble
<markovChainURI/> Config
<dataLFN/>
Replica catalogue
Actual file instances Multiple copies
![Page 36: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/36.jpg)
May 2005 PPARC e-Science PG School 36
Metadata browser
Builds Tree from schema
Researcher walks tree
Selects relevant node
![Page 37: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/37.jpg)
May 2005 PPARC e-Science PG School 37
Graphical query
Construct XPath query via GUI
![Page 38: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/38.jpg)
May 2005 PPARC e-Science PG School 38
Returned results
Read the XML
Download data
Search the config XML for this ensemble
Extract LFN from config XML for command-line tool scripting
![Page 39: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/39.jpg)
May 2005 PPARC e-Science PG School 39
Forum for data sharing amongst international collaborations– Share Code (open source)– Share Machines
• QCDOC built in collaboration with Columbia, NY– Share data!
Agree data format – done Agree metadata QCDml1.1 Agree middleware – interfaces agreed
![Page 40: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/40.jpg)
May 2005 PPARC e-Science PG School 40
QCDgrid in ILDG ILDG is grid of grids Each member will offer data,
– Not replicate each others Access other grids using own grid tools
– Via common interfaces Technical problems are hard Sociological problems
– E.g. Rules for sharing – Harder!
![Page 41: May 2005 PPARC e-Science PG School1 QCDgrid Chris Maynard A Grid for UKQCD National collaboration for lattice QCD.](https://reader036.fdocuments.net/reader036/viewer/2022062600/5a4d1b617f8b9ab0599ad675/html5/thumbnails/41.jpg)
May 2005 PPARC e-Science PG School 41
Summary QCDgrid is a secure, robust data grid
– In operation Data access is via metadata Future work
– Metadata mark-up tools• Currently rely on application codes and human
intervention– Middleware tools which conform to agreed
ILDG common interfaces