T. Bowcock, A. Moreton, M. McCubbin
CERN-IT 5/00
29 May 2000 CERN-IT T. Bowcock
University of Liverpool
• MAP System
• COMPASS
• Grid
• Summary
MAP@Liverpool
• LHCb Experiment
– CP violation
– Rare B decays
– Signals of 10^3 to 10^6
• Backgrounds
– Potentially all 10^14 collisions/year!
About 1 × 10^12 BB pairs produced/year
LHCb Experiment
Vertex detector
LHCb Experiment
Optimize the Detector
Study the Backgrounds
Simulation
• Full GEANT3 simulation
– Event takes of order 120-200 s on a 400 MHz PC
• Put together a simulation facility
– Samples of 10^7 to 10^8 / year
– Many times more passed through GEANT
– Monte Carlo Array Processor
– Similar or larger samples
– 10^9 institute/year
• Analysis, reprocessing
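The per-event time quoted above sets the scale of the facility. The following is a minimal sketch of that arithmetic, assuming fully loaded processors running all year; the function names are illustrative and not part of the MAP software.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def events_per_day(seconds_per_event: float, n_cpus: int = 1) -> float:
    """Events simulated per day by n_cpus processors kept busy all day."""
    return n_cpus * SECONDS_PER_DAY / seconds_per_event


def cpus_needed(events_per_year: float, seconds_per_event: float) -> float:
    """Processors needed to simulate a yearly sample (assumes 100% uptime)."""
    return events_per_year * seconds_per_event / SECONDS_PER_YEAR


if __name__ == "__main__":
    for t in (120, 200):
        print(f"{t} s/event: {events_per_day(t):.0f} events/day on one PC")
    print(f"10^7 events/year at 120 s: {cpus_needed(1e7, 120):.0f} PCs")
    print(f"10^8 events/year at 200 s: {cpus_needed(1e8, 200):.0f} PCs")
```

On these assumptions a single 400 MHz PC manages a few hundred events per day, so samples of 10^7 to 10^8 per year call for tens to hundreds of processors.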
Philosophy
• Fixed Purpose (MC): simplicity
• Low Cost
– No Gbit ethernet until price falls
– Don’t buy top of range processors
– No SMP boards
• 1998/1999
– No tapes
• Develop architecture with future in mind
– Minimum maintenance/development
Using MAP
• Disposable MC (throwaway!)
• Cost
• Write out ntuple/summary information
• I/O not really limited by architecture
• Events may be written out
• Small internal disks
Hardware
• 300 processors
– 400 MHz PII
– 128 Mbytes memory
– 3 Gbytes disk/processor (IDE)
– D-Link 100BaseT ethernet + hubs
– Commercial units
• Custom boxes for packing and cooling
– Total 600 kCHF inc. 17.5% VAT, 1998/1999 (funding Jan 99). ITS
• Including installation and 3-yr next-day on-site maintenance.
MAP-OS
• Linux
– Originally RH5.2 (also tested 6.1)
– Stripped to minimum
• On disk 180 MBytes!
– Will (with FCS) reinstall/upgrade itself
– Access/security
View
High Gflops/m^2
Old Mainframe Room
Power supply (3 phase) 0.1 MW max
50 kW cooling
Architecture
[Diagram labels: External Ethernet, Master, Hub (Switch - 00), 100BaseT, MAP Slaves]
Design Features
• Mother boards/BIOS
– No keyboard etc. required on boot!
• Front panels
– All connections except power
• Access to each PC via trolley on wheels
• Cheaper than patch panel! Very convenient.
– Cooling (room air flow)
• 30 kW required, 50 kW capacity
• Power cutoff installed
• Rack Mount
– 30/rack, easy to extract
Learned…
• Prototype
• Cables
– Cheaper ethernet cables seem OK
• Would have been nice to have
– On board power/heat sensing
• Don’t really need power system
– Daisy chain in groups of 5
– Transients can be huge!
Bad things happen…
• Catastrophic power failure
– No UPS (original design had one)
– 4% needed manual intervention but no hardware failure
• Burn-in & 4 months of operation
– 1 power supply exploded
– 4 PCs with mother-board problems
– 5 HD failures (within 1 week of turn on)
– NIC cards fail
– Typically 1% of nodes may have a problem
Flow Control System
• MAP-FCS
– UDP level (frames)
– Solve packet-loss problem
• Bad hubs (D-Link)
• NIC Realtek clones with high failure rate
– Broadcast system
• 4 Mbytes/s × 300 (Master to Slaves)
– Point to point on fail
– “Standard Mode”: communication only with master
– Control up to 10,000 PCs
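The broadcast-then-repair idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the MAP-FCS implementation: it uses no real sockets, models packet loss as an independent random drop per frame, and the function name and parameters are hypothetical.

```python
import random


def broadcast_with_repair(n_frames, n_slaves, loss_rate, seed=0):
    """Simulate one lossy broadcast pass, then point-to-point repair.

    Returns the set of frames each slave holds and the number of
    point-to-point resends the master had to make.
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    rng = random.Random(seed)
    all_frames = set(range(n_frames))

    # Broadcast pass: every slave independently drops each frame
    # with probability loss_rate.
    received = [
        {f for f in all_frames if rng.random() >= loss_rate}
        for _ in range(n_slaves)
    ]

    # Repair pass: the master resends each missing frame to the
    # individual slave until that frame arrives.
    resends = 0
    for frames in received:
        for frame in all_frames - frames:
            while True:
                resends += 1
                if rng.random() >= loss_rate:
                    frames.add(frame)
                    break
    return received, resends


if __name__ == "__main__":
    slaves, resends = broadcast_with_repair(100, 300, loss_rate=0.05, seed=1)
    complete = all(len(frames) == 100 for frames in slaves)
    print(f"all slaves complete: {complete}, point-to-point resends: {resends}")
```

The design point is that the bulk of the data crosses the network once, and only the lost frames cost extra point-to-point traffic.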
Performance
• Jan/May 00
– 15 million GEANT events for optimization
– cf. 250,000 possible at CERN
– DELPHI events
• 500,000/day
• Trilinear Gauge Couplings, W-mass systematics
– ATLAS, CDF, H1
User
• Interface to master only
– Web/Grid interface
– Security
• Submission script
– Job Control File
• Sequential jobs, files to keep etc.
• Quick and easy to use
• Statically linked executable
• Toolkit
– Enables assembly/merging of 300 outputs
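Merging the per-node outputs can be pictured as summing summary counters. The sketch below assumes each of the 300 nodes returns a small dictionary of counts; the real toolkit's file format is not described in these slides, and the names here are illustrative only.

```python
from collections import Counter


def merge_outputs(outputs):
    """Sum per-node summary counters into a single total."""
    total = Counter()
    for node_summary in outputs:
        total.update(node_summary)  # Counter.update adds counts key by key
    return total


if __name__ == "__main__":
    # Hypothetical summaries from 300 nodes, 1000 events each.
    node_outputs = [{"events": 1000, "selected": i % 3} for i in range(300)]
    merged = merge_outputs(node_outputs)
    print(merged["events"], merged["selected"])
```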
Search Analysis
• As a search-engine MAP architecture is ideal
– Low search and recovery times
– Chemistry
• Centre for Innovative Catalysis (JIF ’00), promises world lead for Liverpool.
– Bio-informatics
• Compute/search farms
Extending MAP
• Wish to store events
– Part of our mindset (reevaluate?)
• With existing system
– Build an analysis and storage system
– Add on disk servers
COMPASS
COMPASS-99
DELL ITS
COMPASS-00
• 3 Tbytes
– On top of 1 TByte MAP internal
• Rack Mounted
• Prototype of 40 Tbyte system
COMPASS
• Low cost (25 kCHF/Tbyte inc. 17.5% VAT)
– SCSI disks (10 × 50 GByte)
– Dual Redundant Power Supplies
– No RAID backplane
– No hotswap
– 750 MHz processors + 512 MBytes memory
– Linux
– Act as MAP masters
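A rough check of the COMPASS figures, assuming each server carries the ten 50 GByte disks listed and using decimal units (1 Tbyte = 1000 GByte). The per-server reading of the disk count is an assumption, and the helper names are illustrative.

```python
import math


def servers_needed(target_gbyte, disks_per_server=10, disk_gbyte=50):
    """Servers required for a target capacity (assumed 10 x 50 GByte each)."""
    return math.ceil(target_gbyte / (disks_per_server * disk_gbyte))


def cost_kchf(target_tbyte, kchf_per_tbyte=25):
    """Cost at the quoted 25 kCHF per Tbyte (VAT included)."""
    return target_tbyte * kchf_per_tbyte


if __name__ == "__main__":
    for tbyte in (3, 40):
        n = servers_needed(tbyte * 1000)
        print(f"{tbyte} Tbyte: {n} servers, {cost_kchf(tbyte)} kCHF")
```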
COMPASS
• Have 3Tbytes of store for R&D on GRID and exploitation of MAP
• MAP & COMPASS are complementary…
• Originally requested 40 TBytes of store
– For H1, BaBar, ATLAS, DELPHI
MAP & COMPASS
• DST or processed data stored
– From MAP
• Reprocessed/analysed locally
– COMPASS
• Limit data movement off site
– COMPASS farm in own right
– Powerful analysis engines
– Access from remote sites
– Designed to analyse very large data sets in parallel (Data split nodes – June 00)
Data Transfer 2000-2003
• Data transfer to/from
– Liverpool-CERN/RAL
– Liverpool-SLAC/FNAL
• High Speed link may be a waste of money
– 3 MCHF for 2MBs line!
– Quality of service
– Probably not true in long term
• Transfer disks
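The case for shipping disks follows from simple arithmetic. The sketch below estimates how long 1 Tbyte takes over the quoted line; since "2MBs" could mean megabytes or megabits per second, both readings are shown as assumptions, and protocol overhead is ignored.

```python
SECONDS_PER_DAY = 86400
TBYTE = 1e12  # decimal Tbyte, in bytes


def transfer_days(n_bytes: float, bytes_per_second: float) -> float:
    """Days needed to move n_bytes over a link running flat out."""
    return n_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Reading 1: 2 Mbyte/s. Reading 2: 2 Mbit/s = 0.25 Mbyte/s.
    print(f"2 Mbyte/s: {transfer_days(TBYTE, 2e6):.1f} days per Tbyte")
    print(f"2 Mbit/s:  {transfer_days(TBYTE, 2e6 / 8):.1f} days per Tbyte")
```

Under either reading a single Tbyte occupies the line for days to weeks, which is the motivation for moving physical disks instead.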
MAP-2001
• Extension of existing architecture
– Vast underestimate of amount of MC required
– Extend to 1000 PCs
• 720 × 800 MHz PIII with 72 Gbyte disks
• 128 MBytes memory
• Switched network (& higher quality!)
• Better NICs (onboard?)
MAP-2001
• Companies more willing to discuss COTS type architecture
– Many selling BEOWULF systems
– Even IBM!
– ITS will provide a turnkey system including our version of MAP control
MAP-2001
• Capability
– Standard MAP mode
– DST transfer
– Search Engine
– Interprocess communication
– Large Internal Store
• Minimize network traffic
• Reprocessing
MAP-2001
• Increase power by factor of 5
• Aim for 1.5M LHCb events/day
– Non-volatile 1 Tbyte/day
– 50 days internal store
• Use for reprocessing data
• Disk size will increase by calendar 2001
• Multi-user and projects
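The 50-day figure is consistent with the 720 nodes and 72 Gbyte disks quoted for MAP-2001. A minimal check, assuming decimal units and that the whole of every disk is available for event storage:

```python
N_NODES = 720          # PIII nodes quoted for MAP-2001
DISK_GBYTE = 72        # disk per node
DAILY_TBYTE = 1.0      # non-volatile output per day
EVENTS_PER_DAY = 1.5e6

# Total internal store in Tbyte (1 Tbyte = 1000 Gbyte assumed).
total_tbyte = N_NODES * DISK_GBYTE / 1000
# Days of output the internal store can hold.
days_of_store = total_tbyte / DAILY_TBYTE
# Implied average size of one stored event, in bytes.
bytes_per_event = DAILY_TBYTE * 1e12 / EVENTS_PER_DAY

if __name__ == "__main__":
    print(f"internal store: {total_tbyte:.2f} Tbyte")
    print(f"days of store:  {days_of_store:.1f}")
    print(f"event size:     {bytes_per_event / 1e6:.2f} Mbyte")
```

The implied event size, about two thirds of a Mbyte, is derived here from the slide's own numbers rather than stated in the talk.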
Issues
• Authentication and Security
• Quality of Service
• Resource Allocation
Grid
• Adding Globus (June 2000)
• Access from CERN &
– Cambridge University, JMU, Liverpool, RAL
• Remote submission
Grid 2005
[Diagram: tiered Grid hierarchy with Tier 0 (CERN), Tier 1, several T2 nodes, and nodes labelled 3 and 4]
Grid-LHCb
• Aim to use MAP as an LHCb testbed
– MC production
– Data access
– Analysis
– UK and CERN sites
– Interaction with RAL
HealthGrid
• Virtual Population Laboratory
– Co-proposed by Liverpool for a “world scale met office for disease prediction”
• In collaboration with WHO
– Analysis power based on MAP
• 5000 PC system
HealthGrid
• Community Health Surveillance
– WAP, local databases
• Information
– Statistics
• Analysis
– MAP-like centres for Health Policy
• WHO Med Centre
Comments
• High Power MC systems vital for HEP
– Do we have/plan enough for LHC?
• Cost and Techniques of Storage
– Small groups can’t afford/want HSM
– Is tape obsolete?
• Problems for institutes not the same as for Tier 0/1 centres
Summary
• MAP fulfils its design goals (works!)
– MAP-FCS control up to 10,000 PCs
• Minimum manpower 0.5 FTE to date
– Maintenance and development
• COTS architecture a success
– Low cost has its ups and downs!
– MAP available off the shelf for HEP-MC
• Low cost high density storage server farm in prototype
• Grid enabled
– Access from CERN
– and UK HEP institutes soon
• MAP-2001
– Test a 1000 PC farm for LHC