PPDG LHC Computing Review
PPDG: The Particle Physics Data Grid
Making today's Grid software work for HENP experiments,
driving Grid science and technology.
(www.ppdg.net)
Richard P. Mount
November 15, 2000
PPDG
• Who is involved?
• How is it funded?
• What has it achieved?
• How does it fit into the big Grid picture?
• How is it relevant for LHC?
PPDG Collaborators

California Institute of Technology: Harvey B. Newman, Julian J. Bunn, Koen Holtman, Asad Samar, Takako Hickey, Iosif Legrand, Vladimir Litvin, Philippe Galvez, James C.T. Pool, Roy Williams
Argonne National Laboratory: Ian Foster, Steven Tuecke, Lawrence Price, David Malon, Ed May
Berkeley Laboratory: Stewart C. Loken, Ian Hinchcliffe, Doug Olson, Alexandre Vaniachine, Arie Shoshani, Andreas Mueller, Alex Sim, John Wu
Brookhaven National Laboratory: Bruce Gibbard, Richard Baker, Torre Wenaus
Fermi National Accelerator Laboratory: Victoria White, Philip Demar, Donald Petravick, Matthias Kasemann, Ruth Pordes, James Amundson, Rich Wellner, Igor Terekhov, Shahzad Muzaffar
University of Florida: Paul Avery
San Diego Supercomputer Center: Margaret Simmons, Reagan Moore
Stanford Linear Accelerator Center: Richard P. Mount, Les Cottrell, Andrew Hanushevsky, Adil Hasan, David Millsom, Davide Salomoni
Thomas Jefferson National Accelerator Facility: Chip Watson, Ian Bird, Jie Chen
University of Wisconsin: Miron Livny, Peter Couvares, Tevfik Kosar
PPDG Collaborators    Particle Physics    Accelerator Laboratory    Computer Science
ANL                   X                                             X
LBNL                  X                                             X
BNL                   X                   X                         x
Caltech               X                                             X
Fermilab              X                   X                         x
Jefferson Lab         X                   X                         x
SLAC                  X                   X                         x
SDSC                                                                X
Wisconsin                                                           X
PPDG: A Coordination Challenge

[Diagram: PPDG at the hub of the experiments (BaBar, D0, CDF, STAR, CMS, Atlas, Nuclear Physics), their data management efforts (BaBar, CMS, Atlas, D0, CDF and Nuclear Physics Data Management), and the middleware teams and user communities (Globus, Condor, SRB, STACS).]
PPDG Funding

• FY 1999:
– PPDG NGI Project approved with $1.2M ($2M requested) from the DoE Next Generation Internet program.
• FY 2000:
– DoE NGI program not funded;
– $1.2M funded by DoE/OASCR/MICS ($470k) and HENP ($770k).
• FY 2001+:
– Proposal (to be written) for DoE/OASCR/MICS and HENP funding in the SciDAC context. Likely total FY 2001 request: ~$3M.
Initial PPDG Goals

Implement and run two services in support of the major physics experiments at BNL, Fermilab, JLab and SLAC:
– a "High-Speed Site-to-Site File Replication Service": data replication at up to 100 Mbytes/s;
– a "Multi-Site Cached File Access Service": based on deployment of file-cataloging, transparent cache-management and data-movement middleware;
using middleware components already developed by the collaborators.
PPDG Site-to-Site Replication Service

[Diagram: a PRIMARY SITE (Data Acquisition, CPU, Disk, Tape Robot) replicating data to a SECONDARY SITE (CPU, Disk, Tape Robot).]
Progress: 100 Mbytes/s Site-to-Site
• Focus on SLAC – Caltech over NTON at OC48 (2.5 gigabits/s);
• Fibers in place;
• SLAC Cisco 12000 with OC48 and 2 × OC12 in place;
• Caltech Juniper M160 with OC48 installed;
• 990 Mbits/s achieved between SC2000 and SLAC.
Throughput from SC2000 to SLAC
Up to 990 Mbits/s was achieved using two machines at each end plus multi-stream TCP with large windows.
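
As an illustration of the technique behind this result, the sketch below shows multi-stream TCP with enlarged socket buffers in Python. It is a minimal example, not the SC2000 code; the endpoint, stream count and buffer size are assumptions made for illustration.

# Minimal sketch of multi-stream TCP with large windows (illustrative only;
# the actual SC2000 demonstration code is not shown in this talk).
# The endpoint, stream count and buffer size below are assumptions.
import socket
import threading

HOST, PORT = "receiver.example.org", 5001   # hypothetical receiver
STREAMS = 8                                 # parallel TCP streams
WINDOW = 4 * 1024 * 1024                    # 4 MB socket buffer ("large window")
CHUNK = 64 * 1024

def send_stream(data: bytes) -> None:
    """Send one slice of the payload over its own TCP connection."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Enlarge the send buffer; with OS window scaling this allows a large
    # TCP window on high bandwidth-delay-product paths.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, WINDOW)
    s.connect((HOST, PORT))
    for i in range(0, len(data), CHUNK):
        s.sendall(data[i:i + CHUNK])
    s.close()

def send_parallel(data: bytes) -> None:
    """Split the payload across STREAMS connections sent concurrently."""
    slice_len = (len(data) + STREAMS - 1) // STREAMS
    threads = [threading.Thread(target=send_stream, args=(data[i:i + slice_len],))
               for i in range(0, len(data), slice_len)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    send_parallel(b"x" * (256 * 1024 * 1024))  # 256 MB test payload

Multiple parallel streams sidestep the per-connection window and loss-recovery limits of a single TCP flow, which is what makes rates approaching 1 Gbit/s reachable on a high-latency OC48 path.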
PPDG Multi-site Cached File Access System

[Diagram: a PRIMARY SITE (Data Acquisition, Tape, CPU, Disk, Robot) feeding several Satellite Sites (Tape, CPU, Disk, Robot), which in turn serve Universities (CPU, Disk, Users).]
PPDG Cached File Access Progress

• Demonstration of multi-site cached file access based mainly on SRB* (LBNL, ANL, U. Wisconsin);
• Development of the HRM storage management interface and its implementation in SRB and SAM (D0 data management).

* Storage Resource Broker (SDSC)
Test of PPDG Storage Management API (HRM)

• Two separate clients request and get files from:
– the SRB catalog and HPSS (LBNL and Wisconsin);
– the D0 SAM catalog, disk cache and Enstore storage system (Fermilab and Wisconsin). Demonstrated at SC2000.
• Agreed on a common Storage Resource Management interface.
• Next step: a client that requests and gets files from each/both storage management systems, with the goal of meeting the PPDG "multi-site cached file access" service across two existing grid components. A sketch of such a client appears below.
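
The HRM API itself is not reproduced in these slides; the following is a hypothetical sketch of what a client written against a common storage-management interface with two back ends (SRB/HPSS and SAM/Enstore) might look like. All class and method names are invented for illustration.

# Hypothetical sketch of one client speaking a common storage-management
# interface to two back ends (SRB/HPSS and SAM/Enstore). All names are
# invented for illustration; the real HRM/SRM API is not shown here.
from abc import ABC, abstractmethod

class StorageManager(ABC):
    """The agreed common interface, as this sketch imagines it."""

    @abstractmethod
    def request_file(self, logical_name: str) -> str:
        """Stage the file and return a request token."""

    @abstractmethod
    def get_file(self, token: str, local_path: str) -> None:
        """Transfer a staged file to local disk."""

class SRBManager(StorageManager):
    def request_file(self, logical_name):
        # Would look the file up in the SRB catalog and stage it from HPSS.
        return f"srb:{logical_name}"
    def get_file(self, token, local_path):
        print(f"fetched {token} -> {local_path}")  # placeholder transfer

class SAMManager(StorageManager):
    def request_file(self, logical_name):
        # Would consult the SAM catalog and stage from Enstore.
        return f"sam:{logical_name}"
    def get_file(self, token, local_path):
        print(f"fetched {token} -> {local_path}")  # placeholder transfer

def fetch(manager: StorageManager, logical_name: str, local_path: str) -> None:
    """One client, either back end: the point of a common interface."""
    token = manager.request_file(logical_name)
    manager.get_file(token, local_path)

for backend in (SRBManager(), SAMManager()):
    fetch(backend, "/hep/run1234/events.db", "/scratch/events.db")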
PPDG: Initial Architecture
Initial PPDG "System" Components

Middleware Components (Initial Choice): see PPDG Proposal, page 15.

Object and File-Based Application Services: Objectivity/DB (SLAC enhanced); GC Query Object, Event Iterator, Query Monitor; FNAL SAM System
Resource Management: start with human intervention (but begin to deploy resource discovery & management tools)
File Access Service: components of OOFS (SLAC)
Cache Manager: GC Cache Manager (LBNL)
Mass Storage Manager: HPSS, Enstore, OSM (site-dependent)
Matchmaking Service: Condor (U. Wisconsin)
File Replication Index: MCAT (SDSC)
Transfer Cost Estimation Service: Globus (ANL)
File Fetching Service: components of OOFS
File Mover(s): SRB (SDSC); site-specific
End-to-end Network Services: Globus tools for QoS reservation
Security and Authentication: Globus (ANL)
[Fig 1: Architecture for the general scenario and the needed APIs. Components: Application (data request), Client (file request), Request Interpreter, Request Manager, Logical Index service (Properties, Events, Files Index), Cache Manager, Storage Reservation service, Storage Access service, File Access service, Local Site Manager and Local Resource Manager, with a remote Matchmaking Service, File Replica Catalog and Resource Planner reached over the network through a GLOBUS Services Layer. Labeled interactions include the logical request (property predicates / event set), the files to be retrieved {file: events}, requests to reserve space {cache_location: # bytes} and requests to move files {file: from, to}; numbered arrows (1-13) trace the request flow.]
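
To make the Fig 1 flow concrete, here is a minimal sketch of the path from a logical request to local files. Every class and method name is an assumption made for illustration; these are not the PPDG APIs.

# Minimal sketch of the Fig 1 request flow; every interface here is an
# assumption for illustration, not the actual PPDG API.

class LogicalIndex:
    def resolve(self, predicates):
        """Logical request (property predicates / event set) -> file list."""
        return ["/store/f1.db", "/store/f2.db"]  # placeholder result

class StorageReservation:
    def reserve(self, cache_location, n_bytes):
        """Request to reserve space {cache_location: # bytes}."""
        return True  # assume space is granted

class FileMover:
    def move(self, src, dst):
        print(f"moving {src} -> {dst}")  # placeholder transfer

class CacheManager:
    def __init__(self):
        self.cached = set()
    def ensure_local(self, f, mover):
        """Fetch f into the cache if absent; return its local path."""
        if f not in self.cached:
            mover.move(f, "/cache" + f)  # request to move files {file: from, to}
            self.cached.add(f)
        return "/cache" + f

class RequestManager:
    """Coordinates index lookup, space reservation, caching and movement."""
    def __init__(self):
        self.index = LogicalIndex()
        self.space = StorageReservation()
        self.cache = CacheManager()
        self.mover = FileMover()
    def handle(self, predicates):
        files = self.index.resolve(predicates)
        self.space.reserve("/cache", 10**9)
        return [self.cache.ensure_local(f, self.mover) for f in files]

print(RequestManager().handle({"run": 1234}))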
Current PPDG Focus: File Replication Service

• Use cases from BaBar, D0, CMS, etc.;
• Typical target: BaBar SLAC-Lyon transfers (the current low-tech approach absorbs about 2 FTE);
• A replica catalog distinct from the Objectivity catalogs;
• GridFTP transfer;
• Globus inter-site security.
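
The slide names the ingredients without showing code; the sketch below gives one plausible shape for a catalog-driven replication pass. The replica catalog interface and host names are invented, and the transfer simply invokes the globus-url-copy client with its basic source/destination arguments (installation-specific options omitted).

# Sketch of a catalog-driven replication pass (illustrative; the replica
# catalog interface is invented, and Globus GSI authentication is assumed
# to be already established, e.g. via grid-proxy-init).
import subprocess

class ReplicaCatalog:
    """Hypothetical stand-in for a replica catalog kept distinct from the
    Objectivity catalogs, as the slide describes."""
    def missing_at(self, site):
        # Files registered at the primary but not yet replicated to `site`.
        return [("gsiftp://slac.example.org/data/f1.db",
                 f"gsiftp://{site}/data/f1.db")]
    def register(self, site, url):
        print(f"registered {url} at {site}")

def replicate(catalog: ReplicaCatalog, site: str) -> None:
    for src, dst in catalog.missing_at(site):
        # GridFTP transfer via the globus-url-copy client; options vary
        # by installation and are deliberately omitted here.
        subprocess.run(["globus-url-copy", src, dst], check=True)
        catalog.register(site, dst)

replicate(ReplicaCatalog(), "ccin2p3.example.fr")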
The Big Grid Picture

[Diagram: the landscape of Grid R&D topics, from modeling through prototypes and products to deployment in experiments: QoS and reservations; high-throughput IP; reliable object transfer; security/authentication technology and architecture; matchmaking; resource policy; resource discovery; user support and testbeds; cost/feasibility estimation; distributed transaction management; distributed replica catalog; worldwide Grid project coordination; software configuration control; derived-object definition database; mobile agents; Grid architecture and interface definition; error tracing; instrumentation.]
The Big Grid Picture
• Grid projects must become coordinated (in progress);
• Progress in the commercial world must be exploited.
PPDG in the Big Grid Picture
• Rapid deployment of Grid software in support of HENP experiments;
• Drive and contribute to Grid architecture:
– the architecture must define interfaces between evolving components;
• Design and develop new Grid middleware components (deliverables to be defined in consultation with GriPhyN, EU-DataGrid, …):
– focus on rapid delivery to HENP experiments (to validate concepts, get feedback and be useful).
PPDG and LHC? BaBar Example
[Plot: BaBar Cumulative Integrated Luminosity (fb-1) versus year, 1997-2005, with the y-axis running 0-700 inverse femtobarns, shown against Moore's Law; site labels: SLAC, CCIN2P3, RAL, CASPUR.]
PPDG, SLAC, IN2P3 and BaBar plan to implement Grid components allowing SLAC + CCIN2P3 + … to become an (adequately) integrated data analysis resource.
Delivery of a useful service is scheduled for the end of 2001.
PPDG and LHC
• US LHC groups are strong participants in PPDG;
• Computer scientists in PPDG see the LHC challenge as the leading opportunity to advance the science of data-intensive Grids;
• PPDG, GriPhyN and EU-DataGrid are creating coordinated management and joint working groups:
– interoperable systems with consistent components.