Belle Computing / Data Handling
HEP GRID Workshop @ CHEP, KNU, 11/9/2002
• What is Belle, and why do we need large-scale computing?
• The current Belle computing system & data handling
• Planning for the super-B era
• A case study
Youngjoon Kwon (Yonsei Univ.) & Jysoo Lee (KISTI)
What is Belle?
• KEKB: an asymmetric-energy collider, e+ (3.5 GeV) on e- (8 GeV)
• Design luminosity = 10^34 /cm^2/s
• E(cm) = 10.58 GeV, on the Υ(4S) resonance
• Belle detector optimized for studying matter-antimatter asymmetry in the Universe
The Belle Experiment
To study matter-antimatter asymmetry in B meson decays.
Accumulated 100 million BB̄ pairs since turn-on in 1999.
Published 44 journal papers and over 200 conference contributions.
Belle's need for large-scale computing
To achieve half of Belle's physics goals: need ~10^8 events
Time required for "real data" analysis:
– 40 days / 100M events / 1 GHz
– Need 10 GHz per analysis to finish one data loop within 1 week
– Belle produces ~20 papers/year; a typical paper takes ~2 years of analysis
  => ~40 analyses being done simultaneously
– Hence, we need ~400 GHz to sustain the current "real data" analysis activity alone
But we also need a Monte Carlo sample (x4 in size):
– 10 sec/event/GHz => ~130 years/GHz
– Hence, need ~200 GHz to provide the MC sample within a year
Need almost 1 THz to sustain physics analysis activities
We also need additional CPUs for raw data processing, etc.
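These figures are simple rate arithmetic. Below is a minimal sketch re-deriving them from the quoted inputs; the only liberties are the slide's own round-ups (e.g. 40/7 ≈ 5.7 GHz rounded up to 10 GHz per analysis).

```python
# Re-derive the CPU estimates quoted above; all inputs are from the slide.
N_EVENTS = 1e8               # events for ~half of Belle's physics goals
DAYS_PER_100M_PER_GHZ = 40   # "real data": 40 days / 100M events / 1 GHz
N_ANALYSES = 40              # ~20 papers/year x ~2 years each

# GHz so that one analysis loops over the full sample within a week:
ghz_per_analysis = DAYS_PER_100M_PER_GHZ / 7     # ~5.7; slide rounds up to 10
real_data_ghz = N_ANALYSES * 10                  # ~400 GHz

# Monte Carlo: x4 the data at 10 s/event/GHz
mc_ghz_years = 4 * N_EVENTS * 10 / (3600 * 24 * 365)  # ~127 "years/GHz"

print(f"per analysis: {ghz_per_analysis:.1f} GHz (slide uses 10)")
print(f"real-data total: ~{real_data_ghz} GHz")
print(f"MC: ~{mc_ghz_years:.0f} GHz-years -> ~200 GHz delivers it within a year")
```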
Central Belle computing system
CPUs
Belle's reference platform: Sparcs running Solaris 2.7
– 9 workgroup servers (500 MHz, 4 CPU)
– 38 compute servers (500 MHz, 4 CPU)
  • LSF batch system / 40 tape drives (2 each on 20 servers)
– Fast access to disk servers
– 20 user workstations with DAT, DLT, and AIT drives
Additional Intel CPUs
– Compute servers (@KEK, Linux RH 6.2/7.2)
  • 4-CPU (Pentium Xeon 500-700 MHz) servers: ~96 units
  • 2-CPU (Pentium III 0.8-1.26 GHz) servers: ~167 units
– User terminals (@KEK, to log onto the group servers)
  • 106 PCs (~50 Win2000 + X-window s/w, ~60 Linux)
– User analysis PCs (@KEK, unmanaged)
– Compute/file servers at universities
  • A few to a few hundred at each institution
  • Used for generic MC production as well as physics analyses at each institution
  • e.g., the tau analysis center @ Nagoya U.
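For scale, a rough tally of the aggregate CPU listed above, set against the ~1 THz estimate from the earlier slide; the mid-range clock speeds are my assumption where the slide quotes a range.

```python
# Rough tally of the KEK CPU inventory above, in GHz.
# Mid-range clocks are assumed where the slide quotes a range.
sparc = (9 + 38) * 4 * 0.5    # 47 Sparc servers, 4 x 500 MHz each
xeon  = 96 * 4 * 0.6          # 4-CPU Xeons, assume ~600 MHz average
piii  = 167 * 2 * 1.0         # 2-CPU PIIIs, assume ~1.0 GHz average
print(f"Sparc ~{sparc:.0f} GHz, Intel ~{xeon + piii:.0f} GHz, "
      f"total ~{sparc + xeon + piii:.0f} GHz")   # ~0.7 THz at KEK
```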
Disk servers @ KEK
8 TB NFS file servers
120 TB HSM (4.5 TB staging disk)
– DST skims
– User data files
500 TB tape library (direct access)
– 40 tape drives on 20 Sparc servers
– DTF2: 200 GB/tape, 24 MB/s I/O speed
– Raw and DST files
– Generic MC files are stored here and read by users (batch jobs)
~12 TB of local data disks on PCs
– Not used efficiently at this point
Data storage requirements
Raw data: 1 GB/pb-1 (100 TB / 100 fb-1)
DST: 1.5 GB/pb-1/copy (150 TB / 100 fb-1)
Skims for calibration: 1.5 GB/pb-1 (150 TB / 100 fb-1)
MDST: 50 GB/fb-1 (5 TB / 100 fb-1)
Other physics skims: 30 GB/fb-1 (3 TB / 100 fb-1)
Generic MC (MDST): ~20 TB/year
Total: ~450 TB/year
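The total is consistent with the per-luminosity rates if roughly 100 fb-1 is logged per year, which the per-year total implies; a minimal sketch under that assumption:

```python
# Re-derive the ~450 TB/year total from the per-luminosity rates above,
# assuming ~100 fb^-1 is logged per year (my assumption, implied by the
# per-year total on the slide).
FB_PER_YEAR = 100

tb_per_100fb = {
    "raw":         100,   # 1 GB/pb^-1
    "DST":         150,   # 1.5 GB/pb^-1 per copy
    "calib skims": 150,   # 1.5 GB/pb^-1
    "MDST":          5,   # 50 GB/fb^-1
    "phys skims":    3,   # 30 GB/fb^-1
    "generic MC":   20,   # ~20 TB/year
}
total = sum(tb_per_100fb.values()) * FB_PER_YEAR / 100
print(f"total: ~{total:.0f} TB/year")   # ~428 TB; the slide rounds to ~450
```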
CPU requirements – DST production
Goal: 3 months to reprocess all data
– Often we have to wait for constants
– Often we have to restart due to bad constants
300 GHz (PIII) for 1 fb-1/day
CPU requirements – MC production
For every real data set, we need to generate at least x3 as many MC events
240 GB/fb-1 of data in the compressed format
No intermediate info (DC hits, ECL showers) is saved
– With every new release of the s/w library, we need to produce a new generic MC sample
400 GHz (PIII) for 1 fb-1/day
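Both production rates scale linearly with the CPU pool. A minimal sketch of that scaling follows; the 250 fb-1 data-set size and the 900 GHz pool are illustrative assumptions, not numbers from the slides.

```python
# Campaign length for a given CPU pool, scaling linearly from the quoted
# rates of 300 GHz (DST) and 400 GHz (MC) per 1 fb^-1/day.
GHZ_PER_FB_DAY = {"DST": 300, "MC": 400}

def campaign_days(task: str, dataset_fb: float, pool_ghz: float) -> float:
    """Days to process dataset_fb at the quoted rate with pool_ghz of PIII CPU."""
    fb_per_day = pool_ghz / GHZ_PER_FB_DAY[task]
    return dataset_fb / fb_per_day

# e.g. a hypothetical 250 fb^-1 sample on a 900 GHz pool:
print(f"DST: {campaign_days('DST', 250, 900):.0f} days")  # ~83 days, within the 3-month goal
print(f"MC : {campaign_days('MC', 250, 900):.0f} days")   # ~111 days
```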
Data transfer to remote users
A firewall & login servers make data transfer miserable (100 Mbps max.)
DAT tapes are used for massive data transfer
– Compressed hadron skim files
– MC events generated by outside institutions
Dedicated GbE networks to a few institutions are now being added
– A total of 10 Gbit to/from KEK is being added
The network to most other collaborators remains slow
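To see why tapes win, consider the time to move one year of MDST (5 TB, from the storage slide) through the 100 Mbps firewall path versus a dedicated GbE link; the 80% link utilization below is my assumption.

```python
# Transfer time for a data set over a shared link.
def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Days to move size_tb over a link_mbps link at the given utilization."""
    bits = size_tb * 1e12 * 8
    return bits / (link_mbps * 1e6 * efficiency) / 86400

print(f"100 Mbps: {transfer_days(5, 100):.0f} days")   # ~6 days for 5 TB
print(f"1 Gbps  : {transfer_days(5, 1000):.1f} days")  # ~0.6 days
```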
Compute problem?
Obviously, the existing computing resources are already stretched over capacity.
The data set is doubling every year, with no end in sight.
Management of data and CPU is already a major burden.
By far the most cost-effective solution is large clusters of commodity PCs running Linux.
How to manage these? GRID!
Prototype GRID-style analysis
Need to run a multi-parameter fitting program for the CP-violation measurement => a multi-CPU CP fitter (see the sketch below)
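The slides do not show the fitter itself. As a hypothetical illustration of the idea, an unbinned negative log-likelihood whose per-event sum is split across CPUs, here is a toy one-parameter fit; the pdf (1 + A*x)/2 is a stand-in for the real time-dependent CP pdf, and none of the names below come from Belle's code.

```python
# Toy sketch of a "multi-CPU CP fitter": an unbinned negative log-likelihood
# whose per-event sum is split across worker processes, then minimized.
# Illustrative only; not Belle's actual fitter.
import numpy as np
from multiprocessing import Pool
from functools import partial

def chunk_nll(a: float, x: np.ndarray) -> float:
    """Negative log-likelihood of one chunk of events for asymmetry a."""
    return -np.sum(np.log(0.5 * (1.0 + a * x)))

def total_nll(a: float, chunks, pool) -> float:
    # Scatter the per-event sums to the workers and add the partial results.
    return sum(pool.map(partial(chunk_nll, a), chunks))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Generate toy events with true A = 0.3 by accept-reject on x in [-1, 1].
    x = rng.uniform(-1, 1, 500_000)
    keep = rng.uniform(0, 1, x.size) < (1 + 0.3 * x) / 1.3
    x = x[keep]
    chunks = np.array_split(x, 8)          # one chunk per CPU
    with Pool(8) as pool:
        # Simple scan; a real fit would use MINUIT or scipy.optimize.
        grid = np.linspace(-0.9, 0.9, 181)
        nll = [total_nll(a, chunks, pool) for a in grid]
        print(f"fitted A ~ {grid[int(np.argmin(nll))]:.2f}")  # ~0.30
```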
Planning for the Super-B era
A x15 increase in luminosity is planned c. 2006
Data accumulation: ~2 PB/year
Including MC, we need 10 PB of storage to start super-B
To re-process 2 years' accumulation (2 ab-1) of data in 3 months, we need x30 CPU power
– CPU @ KEK alone is not enough
– A cluster of Local Data Centers (LDCs) connected by GRID is planned!
One LDC unit:
– 300 GHz + 60 TB + 3 MB/s to KEK
– Cost: $0.3M + $0.2M + $(network)
Can we afford one?
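A rough check of the implied cluster size, assuming LDC units parallelize perfectly and the current 300 GHz per 1 fb-1/day DST rate still holds:

```python
# Sizing the LDC cluster implied by the numbers above: reprocess 2 ab^-1
# in 3 months. Perfect parallelism across LDC units is my assumption.
DATASET_FB = 2000          # 2 ab^-1
DAYS = 90                  # 3 months
GHZ_PER_FB_DAY = 300       # from the DST-production slide
LDC_GHZ = 300              # one Local Data Center unit

need_ghz = (DATASET_FB / DAYS) * GHZ_PER_FB_DAY
print(f"need ~{need_ghz/1000:.1f} THz -> ~{need_ghz/LDC_GHZ:.0f} LDC units")
# ~22 units; the slide's x30 figure presumably includes extra headroom.
```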
Belle-GRID – a case study
Two Australian collaborators in Belle (U. Melbourne & U. Sydney) are working on a GRID prototype for Belle physics analyses.
Belle-GRID – a case study
Blueprint for Belle-GRID in Australia
Belle-GRID – a case study
Belle analysis using a Grid environment
– useful locally » adopted by Belle » wider community
– construction of a Grid node at Melbourne:
  • Certificate Authority to approve security
  • Globus toolkit...
  • GRIS (Grid Resource Information Service) - LDAP with Grid security
  • Globus Gateway - connected to the local queue (GNU Queue; PBS?)
  • GSIFTP - data resource providing access to local storage
  • Replica Catalog - LDAP for a virtual data directory
– replicate this in Sydney
– initial test of Belle code with grid node & queue
– data access via the grid (Physical File Names as stored in the Replica Catalog; see the sketch after this list)
– modification of Belle code to access the data on the grid
– test of Belle code with grid node & queue & grid data access
– connect the 2 grid nodes (Melbourne EPP and Sydney EPP)
– test of Belle code running over the separated grid clusters
– implement or build a Resource Broker
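As a hypothetical illustration of the data-access step, staging one file from a node's GSIFTP server with the Globus Toolkit command-line tools (grid-proxy-init, globus-url-copy); all host and path names below are made up.

```python
# Stage a file from a grid node's GSIFTP server into the local working area.
import subprocess

def stage_from_grid(host: str, remote_path: str, local_path: str) -> None:
    """Fetch one file over GSIFTP."""
    # A valid grid proxy (signed by the node's Certificate Authority) must
    # exist first; grid-proxy-init prompts for the user's pass phrase.
    subprocess.run(["grid-proxy-init"], check=True)
    subprocess.run(
        ["globus-url-copy",
         f"gsiftp://{host}{remote_path}",   # Physical File Name from the Replica Catalog
         f"file://{local_path}"],
        check=True,
    )

# e.g. (hypothetical names):
# stage_from_grid("epp.unimelb.example", "/belle/mdst/exp07-skim.mdst",
#                 "/data/local/exp07-skim.mdst")
```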
Belle-GRID – a case study
Belle analysis test case:
– Analysis of charmless B meson decays to 2 vector mesons, used to determine 2 angles of the CKM unitarity triangle.
Belle analysis code over Grid resources (10 files; 2 GB total):
– Data files processed serially: 95 min
– Data files processed over Globus: 35 min (a ~2.7x speedup)
Data access (2 secure protocols, GASS/GSIFTP; 100 Mbit network):
– NFS access for comparison: 8.5 MB/s
– GASS access: 4.8 MB/s
– GSIFTP access: 9.1 MB/s
Belle analysis using Grid data access:
– NFS access for comparison: 0.34 MB/s
– GSIFTP data streaming: 0.36 MB/s
Summary
Belle's computing resources are stretched over capacity.
Moreover, we are planning a x15 increase in luminosity (the so-called "super-KEKB") within a few years.
Local Data Centers connected by GRID are perhaps the only viable option.
Two Australian groups are working on a Belle-GRID analysis prototype. So far it has been working as planned.