Transcript of Robin Middleton RAL/PPD DG Co-ordination Rome, 23rd June 2001.
Robin Middleton, RAL/PPD
DG Co-ordination, Rome, 23rd June 2001
GridPP e-Science Presentation Slide 2
GridPP Collaboration Meeting
1st GridPP Collaboration Meeting - Coseners House - May 24/25 2001
GridPP e-Science Presentation Slide 3
GridPP History
Collaboration formed by all UK PP Experimental Groups in 1999 to submit £5.9M JIF bid for Prototype Tier-1 centre at RAL (later superseded)
Added some Tier-2 support to become part of the PPARC LTSR - “The LHC Computing Challenge” - input to SR2000
Formed GridPP in Dec 2000, adding CERN, CLRC and UK PP Theory Groups
From Jan 2001 handling PPARC’s commitment to EU DataGrid
GridPP e-Science Presentation Slide 4
Proposal Executive Summary
• £40M 3-Year Programme
• LHC Computing Challenge = Grid Technology
• Five Components:
– Foundation
– Production
– Middleware
– Exploitation
– Value-added
• Emphasis on Grid Services and Core Middleware
• Integrated with EU DataGrid, PPDG and GriPhyN
• Facilities at CERN, RAL and up to four UK Tier-2 sites
• Centres = Dissemination
• LHC developments integrated into current programme (BaBar, CDF, D0, ...)
• Robust Management Structure
• Deliverables in March 2002, 2003, 2004
GridPP e-Science Presentation Slide 5
GridPP Component Model
[Diagram: layered component model - Foundation, Production, Middleware, Exploitation, Value-added]
Component 1: Foundation
The key infrastructure at CERN and within the UK
Component 2: Production
Built on Foundation to provide an environment for experiments to use
Component 3: Middleware
Connecting the Foundation and the Production environments to create a functional Grid
Component 4: Exploitation
The applications necessary to deliver Grid-based Particle Physics
Component 5: Value-Added
Full exploitation of Grid potential for Particle Physics
[Costs shown on diagram: £21.0M and £25.9M]
GridPP e-Science Presentation Slide 6
Major Deliverables
Prototype I - March 2002
• Performance and scalability testing of components
• Testing of the job scheduling and data replication software from the first DataGrid release

Prototype II - March 2003
• Prototyping of the integrated local computing fabric, with emphasis on scaling, reliability and resilience to errors
• Performance testing of LHC applications; distributed HEP and other science application models using the second DataGrid release

Prototype III - March 2004
• Full scale testing of the LHC computing model with fabric management and Grid management software for Tier-0 and Tier-1 centres, with some Tier-2 components
GridPP e-Science Presentation Slide 7
First Year Deliverables
Each Workgroup has detailed deliverables. These will be refined each year and build on the successes of the previous year.
The Global Objectives for the first year are:
• Deliver EU DataGrid Middleware (First prototype [M9])
• Running experiments to integrate their data management systems into existing facilities (e.g. mass storage)
• Assessment of technological and sociological Grid analysis needs
• Experiments refine data models for analyses
• Develop tools to allow bulk data transfer (a sketch appears at the end of this slide)
• Assess and implement metadata definitions
• Develop relationships across multi-Tier structures and countries
• Integrate Monte Carlo production tools
• Provide experimental software installation kits
• LHC experiments start Data Challenges
• Feedback assessment of middleware tools
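To make the bulk data transfer objective concrete, here is a minimal sketch built around the Globus globus-url-copy GridFTP client; the storage element host names, paths and file list are hypothetical placeholders, not actual GridPP endpoints.

# Minimal sketch: bulk transfer between two GridFTP storage elements using the
# Globus globus-url-copy client. Host names, paths and files are hypothetical.
import subprocess

SOURCE_SE = "gsiftp://datastore.example.rl.ac.uk"   # hypothetical Tier-1 storage element
DEST_SE = "gsiftp://se.example.ph.liv.ac.uk"        # hypothetical Tier-2 storage element

def transfer(path, streams=4):
    """Copy one file between the storage elements using parallel TCP streams."""
    # -p sets the number of parallel data streams; -vb reports transfer statistics.
    cmd = ["globus-url-copy", "-p", str(streams), "-vb", SOURCE_SE + path, DEST_SE + path]
    subprocess.check_call(cmd)

if __name__ == "__main__":
    for f in ["/babar/run1/file001.root", "/babar/run1/file002.root"]:
        transfer(f)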
GridPP e-Science Presentation Slide 8
GridPP Organisation
Software development organised around a number of Workgroups
Hardware development organised around a number of Regional Centres
• Likely Tier-2 Regional Centres
• Focus for Dissemination and Collaboration with other disciplines and Industry
• Clear mapping onto Core Regional e-Science Centres
GridPP e-Science Presentation Slide 9
GridPP Workgroups
A - Workload Management
Provision of software that schedules application processing requests amongst resources (a toy matchmaking sketch appears at the end of this slide)
B - Information Services and Data Management
Provision of software tools to provide flexible, transparent and reliable access to the data
C - Monitoring Services
All aspects of monitoring Grid services
D - Fabric Management and Mass Storage
Integration of heterogeneous resources into common Grid framework
E - Security
Security mechanisms from Certification Authorities to low level components
F - Networking
Network fabric provision through to integration of network services into middleware
G - Prototype Grid
Implementation of a UK Grid prototype tying together new and existing facilities
H - Software Support
Provide services to enable the development, testing and deployment of middleware and applications at institutes
I - Experimental Objectives
Responsible for ensuring development of GridPP is driven by needs of UK PP experiments
J - Dissemination
Ensure good dissemination of developments arising from GridPP into other communities and vice versa
Technical work broken down into several workgroups - broad overlap with EU DataGrid
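As a toy illustration of Workgroup A's role (not the DataGrid resource broker itself), the sketch below matches a job's requirements against resource information advertised by sites and picks the least-loaded candidate; all site names and numbers are invented.

# Toy workload-management sketch (Workgroup A): match a job's requirements
# against advertised site resources and pick the least-loaded match.
# Purely illustrative; the site data below is invented.

SITES = [
    {"name": "RAL",       "free_cpus": 120, "free_disk_gb": 4000, "queued_jobs": 35},
    {"name": "Liverpool", "free_cpus": 60,  "free_disk_gb": 1500, "queued_jobs": 10},
    {"name": "Glasgow",   "free_cpus": 40,  "free_disk_gb": 800,  "queued_jobs": 2},
]

def match_job(job):
    """Return the best site for a job, or None if nothing satisfies it."""
    candidates = [s for s in SITES
                  if s["free_cpus"] >= job["cpus"] and s["free_disk_gb"] >= job["disk_gb"]]
    if not candidates:
        return None
    # Rank by queue length first, then by spare CPU capacity.
    candidates.sort(key=lambda s: (s["queued_jobs"], -s["free_cpus"]))
    return candidates[0]["name"]

print(match_job({"cpus": 50, "disk_gb": 500}))   # -> "Liverpool"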
GridPP e-Science Presentation Slide 10
GridPP and CERN
UK involvement through GridPP will boost CERN investment in key areas:
– Fabric management software
– Grid security
– Grid data management
– Networking
– Adaptation of physics applications
– Computer Centre fabric (Tier-0)
For the UK to exploit the LHC to the full, substantial investment at CERN is required to support LHC computing.
GridPP e-Science Presentation Slide 11
GridPP Management Structure
GridPP e-Science Presentation Slide 12
Management Status
The Project Management Board (PMB)
The Executive Board, chaired by the Project Leader - Project Leader being appointed; Shadow Board in operation
The Collaboration Board (CB)
The governing body of the project - consists of Group Leaders of all Institutes - established and Collaboration Board Chair elected
The Technical Board (TB)
The main working forum chaired by the Technical Board Chair - interim task force in place
The Experiments Board (EB)
The forum for experimental input into the project - nominations from experiments underway
The Peer Review Selection Committee (PRSC)
Pending approval of Project
The Dissemination Board (DB)
Pending approval of Project
GridPP e-Science Presentation Slide 13
GridPP is not just LHC
GridPP e-Science Presentation Slide 14
Tier1&2 Plans
• RAL already has 300 cpus, 10TB disk, and STK tape silo which can hold 330TB
• Install significant capacity at RAL this year to meet BaBar TierA Centre requirements
• Integrate with worldwide BaBar work
• Integrate with DataGrid testbed
• Integrate Tier1 and 2 within GridPP
• Upgrade Tier2 centres through SRIF (UK university funding programme)
GridPP e-Science Presentation Slide 15
Tier1 Integrated Resources
[Chart: Tier-1 integrated resources for 2001-2003, showing CPU (kSI95), disk (TB) and tape (TB/10); vertical axis 0-120]
GridPP e-Science Presentation Slide 16
Liverpool
• MAP - 300 cpus + several TB of disk
– delivered simulation for LHCb and others for several years
• Upgrades of cpus and storage planned for 2001 and 2002
– currently adding Globus
– develop to allow analysis work also
GridPP e-Science Presentation Slide 17
Imperial College
• Currently
– 180 cpus
– 4TB disk
• 2002
– adding new cluster, shared with Computational Engineering
– 850 nodes
– 20TB disk
– 24TB tape
• CMS, BaBar, D0
GridPP e-Science Presentation Slide 18
Lancaster
[Diagram, not fully installed: 196 worker CPUs, controller nodes, 500 GB bulkservers, switches, 100 MB/s and 1000 MB/s Ethernet, fibre links; tape library capacity ~30 TB at k£11 per 30 TB]
Finalizing installation of Mass Storage System ~ 2 months
GridPP e-Science Presentation Slide 19
Lancaster
• Currently D0
– analysis data from FNAL for UK
– simulation
• Future
– upgrades planned
– Tier2 RC
– Atlas-specific
GridPP e-Science Presentation Slide 20
ScotGrid
• Tendering now
• 128 CPU at Glasgow
• 5 TB Datastore + server at Edinburgh
• ATLAS/LHCb
• Plans for future upgrades to 2006
• Linked with UK Grid National Centre
GridPP e-Science Presentation Slide 21
Network
• UK Academic Network, SuperJANET, entered phase 4 in 2001
• 2.5 Gbit/s backbone, December 2000 - April 2001
• 622 Mbit/s to RAL, April 2001
• Most MANs have plans for 2.5 Gbit/s on their backbones
• Peering with GEANT planned at 2.5 Gbit/s
GridPP e-Science Presentation Slide 22
Wider UK Grid
• Prof Tony Hey leading Core Grid Programme
• UK National Grid
– National Centre
– 9 Regional Centres
• Computer Science lead
• includes many sites with PP links
– Grid Support Centre (CLRC)
– Grid Starter Kit
• version 1 based on Globus, Condor, ...
• Common software
• e-Science Institute
• Grid Network Team
• Strong Industrial Links
• All Research Areas have their own e-Science plans
GridPP e-Science Presentation Slide 23
Summary
• UK has plans for a national grid for particle physics
– to deliver the computing for several virtual organisations (LHC and non-LHC)
• Collaboration established, proposal approved, plan in place
• Will deliver
– UK commitment to DataGrid
– prototype Tier1 and 2
– UK commitment to US experiments
• Work closely with other disciplines
• Have been working towards this project for ~ 2 years, building up hardware
• Funds installation and operation of experimental testbeds, key infrastructure, generic middleware and making application code grid-aware
GridPP e-Science Presentation Slide 24
The End
GridPP e-Science Presentation Slide 25
UK Strengths
Wish to build on UK strengths -
• Information Services
• Networking
• Security
• Mass Storage
UK Major Grid Leadership roles -
• Lead DataGrid Architecture Task Force (Steve Fisher)
• Lead DataGrid WP3 Information Services (Robin Middleton)
• Lead DataGrid WP5 Mass Storage (John Gordon)
• Strong Networking Role in WP7 (Peter Clarke, Robin Tasker)
• ATLAS Software Coordinator (Norman McCubbin)
• LHCb Grid Coordinator (Frank Harris)
Strong UK Collaboration with Globus
• Globus people gave a 2-day tutorial at RAL to the PP community
• Carl Kesselman attended UK Grid Technical meeting
• 3 UK people visited Globus at Argonne
Natural UK Collaboration with US PPDG and GriPhyN
GridPP e-Science Presentation Slide 26
BaBar
• 8 x 80-cpu farms
• 10 sites with 12TB disk and Suns
• simulation
• data mirroring from SLAC - Kanga, Objectivity
• data movement and mirroring across UK
• data location discovery across UK - MySQL
• remote job submission - Globus and PBS
• common usernames across UK - GSI gridmapfiles
• Find data - submit job to data - register output (sketched at the end of this slide)
• BaBar planning a distributed computing model
– TierA centres
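The "find data, submit job to data, register output" pattern could be scripted roughly as below. The dataset catalogue, GRAM contact strings and executable path are invented for illustration; globus-job-submit is the Globus GRAM client referred to on the slide, and in practice the location lookup would be the MySQL query mentioned above.

# Sketch of the BaBar "find data, submit job to data" loop. Dataset locations,
# contact strings and the executable are invented placeholders.
import subprocess

# Stand-in for the MySQL-based data location discovery mentioned on the slide.
DATASET_LOCATIONS = {
    "babar-run1-hadronic": {"site": "RAL", "gram": "gatekeeper.example.rl.ac.uk/jobmanager-pbs"},
    "babar-run2-hadronic": {"site": "Bristol", "gram": "gk.example.bris.ac.uk/jobmanager-pbs"},
}

# Common usernames across the UK come from GSI grid-mapfiles, i.e. lines such as:
#   "/C=UK/O=eScience/OU=RAL/CN=A Physicist" babar

def submit_to_data(dataset, executable="/grid/babar/bin/analyse.sh"):
    """Submit the analysis job to the site that already holds the dataset."""
    loc = DATASET_LOCATIONS[dataset]
    # globus-job-submit <contact> <executable> [args] prints a job-handle URL.
    handle = subprocess.check_output(
        ["globus-job-submit", loc["gram"], executable, dataset]).decode().strip()
    print("submitted to %s: %s" % (loc["site"], handle))
    return handle  # the output would then be registered back in the catalogue

submit_to_data("babar-run1-hadronic")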
GridPP e-Science Presentation Slide 27
CDF
• Similar model to BaBar with disk and cpu resources at RAL and universities plus farm for simulation.
• Development of Grid access to CDF databases
• Data replication from FNAL to UK and around UK
• Data Location Discovery through metadata (a minimal sketch follows)
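A minimal sketch of metadata-based data location discovery, assuming an invented in-memory catalogue in place of the real CDF database: a run number is matched to a dataset and a UK replica is preferred over the FNAL copy when one exists.

# Toy metadata catalogue for CDF-style data location discovery.
# Entries map dataset metadata to replica locations; all values are invented.

CATALOGUE = [
    {"dataset": "cdf-bphys-2001a", "run_range": (1000, 1999),
     "replicas": ["fnal.example.gov:/pnfs/cdf/bphys/2001a",
                  "datastore.example.rl.ac.uk:/cdf/bphys/2001a"]},
    {"dataset": "cdf-top-2001a", "run_range": (2000, 2999),
     "replicas": ["fnal.example.gov:/pnfs/cdf/top/2001a"]},
]

def locate(run_number, prefer_domain=".ac.uk"):
    """Find the dataset containing a run and prefer a UK replica if one exists."""
    for entry in CATALOGUE:
        low, high = entry["run_range"]
        if low <= run_number <= high:
            uk = [r for r in entry["replicas"] if prefer_domain in r.split(":")[0]]
            return entry["dataset"], (uk[0] if uk else entry["replicas"][0])
    return None, None

print(locate(1500))  # -> ('cdf-bphys-2001a', 'datastore.example.rl.ac.uk:/cdf/bphys/2001a')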
GridPP e-Science Presentation Slide 28
D0
• Large data centre at Lancaster
• ship data from FNAL to UK
• simulation in UK and ship data back to FNAL
• Gridify SAM access to data
– data at FNAL and Lancaster