Computing at Belle II - DESY · Computing at Belle II Thomas Kuhr KIT DESY ... Altmannshofer,...
Transcript of Computing at Belle II - DESY · Computing at Belle II Thomas Kuhr KIT DESY ... Altmannshofer,...
Computing at Belle II
Thomas KuhrKIT
DESYComputing Seminar
14.06.2010
KEKB: 1 ab-1
SuperKEKB: 50 ab-1
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 2
World Cup
Japan : Cameroon
0 : 0(at 16:00)
(20:30 Italy : Paraguay)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 3
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 4
Motivation for B Factories
● 1964, Cronin, Fitch: Discovery of CP violation in K0 system,small effect O(10-3)
➢ 1972, Kobayashi, Maskawa:
➔ CP violation possible, if there are 6 quark flavors
✔ 1974, Burton, Richter: Discovery of charm quark
✔ 1977, E288: Discovery of bottom quark
✔ 1995, CDF, D0: Discovery of top quark
➔ CP violation?
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 5
(C)KM-Matrix
● Complex elements Vij
➔ 18 parameters
● Unitarity (VV† = 1) +quark phases
➔ 4 free par.:3 angles + 1 phase (CP)
Vtb
Vcb
Vub
Vts
Vcs
Vus
Vtd
Vcd
Vud
VCKM =
➔ Flavor changing coupling of W to quark pairs:
➔ CP violation determined by just one parameter➔ Prediction of large CP violation in B0 system
➢ Experimental verification needed B factories→
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 6
Physics Objective of Belle
535 x 106 BB
PRL 98,031802 (2007)
B0
B0
✔ Confirmation of KM mechanism of CP in the Standard Model
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 7
Baryon Asymmetry in the Universe
● Big bang: Equal amount of matter and antimatter● Today: Only matter left
● Requirements (1967, Sakharov):
➢ Baryon number violation➢ Departure from thermal equilibrium➢ C and CP violation
✗ CP violation in the SM by many orders of magnitudetoo small to generate observed asymmetry
➢ Need sources of CP violation beyond the SM➔ Super B factory
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 8
Measurement of CP Violation
● Measurement of phase requires interference betweentwo diagrams
➢ Time dependent asymmetry measurement:
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 9
Time Dependent CP Measurement
e- e+Y(4S)
B0CP
B0tag
K0S
-
+
+
-
l+
-s
-
K+
z
● e+e- Y(4S) B0B0 (50%) B+B- (50%)
● Continuum background
● Asymmetric beam energies ⇒ t = z / c
● Entanglement ⇒ af(t) af(t)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 10
KEK Site
Tsukuba
Tokyo
Mt. FujiNarita
e+
e–
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 11
KEKB Performance
➢ World record luminosity: 2.1 x 1034 cm-2s-1
Twice design→➢ 1 ab-1 of integrated
luminosity
Design
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 12
Places to Look for New Physics
Time dependent CP violationin b qqs →decays
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 13
More Places to Look for New Physics
➢ Charged Higgsin B →
➢ Precision CKM angle measurements➢ Forward-backward asym. in B K→ *l+l- ➢ Direct CP violation (Kπ puzzle)➢ CP violation in D0 mixing➢ Lepton flavor violation
Signal = zero energy
Belle
2.6
X(3872)
➢ The unexpected
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 14
Flavor Physics in the LHC Era
Altmannshofer, Buras, Gori, Paradis and StraubNucl.Phys.B830, 17-94, 2010
● Complementary to direct search for NP at LHC
➔ “DNA test” of NP
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 15
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 16
Aim For 50 ab-1
8 x 1035
cm-2 s-1SuperKEKB
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 17
SuperKEKB Upgrade: Nano Beam Scheme
e- 2.6 A
e+ 3.6 A
Replace long TRISTAN dipoles with shorter ones (HER).
New damping ring
New IR
TiN coated beam pipe with antechambers
New low emmitanceelectron source
Larger crossing angle 2φ = 22 mrad → 83 mrad
Smaller asymmetry 3.5 / 8 GeV → 4 / 7 Gev
Belle IIBelle IINew Superconducting /permanent final focusingquads near the IP
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 18
Projection of Luminosity
50 ab-1
in 2020
Shutdownfor Upgrade
inte
gr a
ted
(ab
-1)
Inst
an
tan
eou
s(c
m-2s-1
)
50 timesBelledata
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 19
Belle II Detector
➔ Higher background, higher event rate
Belle II
Nano beam:Synchrotron radiation low,
Touschek background much higher
Belle
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 20
Belle II Detector: Tracking and Vertexing
Belle II
4 layer siliconstrip detector
Drift chamber
Belle
2 layer DEPFET pixel detector
→ much higher event size
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 21
Belle II Detector: Particle Identification
Belle IIECL
KLM
Belle
“Focusing”RICH
Time of propagation
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 22
Belle II Collaboration
➢ June 2004: Letter of Intent
➢ March 2008: First proto collaboration meeting
➢ December 2008: New collaboration founded
➔ ~300 members47 institutes from 13 countries
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 23
International Collaboration
● Item
You are welcome to join!
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 24
Project Status
✔ Preliminary approval by Japanese government in January
● Final decision expected soon
✔ TDR is written (~480 pages)
➔ Reviewed three weeks ago➔ To be published in July
● Belle data taking ends in June
➔ Then upgrade work can start
➢ Well on track to start data taking in 2014
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 25
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 26
Belle II Data Acquisition / High Level Trigger
➔ Designed for input rate of 30 kHz
● One challenge:PXD event datareduction1 MB ~200 kB→(all other Belle II detectors together ~100 kB)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 27
Estimated Data Rates
➔ High data rate is a challenge!
Experiment Rate [Hz] Event Size [kB] Rate [MB/s]
High rate scenario for Belle II DAQ
Belle II 6,000 300 1,800
LCG TDR (2005)
ALICE (HI) 100 12,500 1,250
ALICE (pp) 100 1,000 100
ATLAS 200 1,600 320
CMS 150 1,500 225
LHCb 2,000 25 50
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 28
Belle Computing: Centralized at KEK
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 29
Considerations for Belle II Computing
● 50 times more data in ~10 years
➔ O(50) times more computing resources
Can not expect KEK to provide all resources
Have to enable all Belle II member institutes to contribute
➔ Distributed computing system based on grid services (gLite)
➢ Can benefit from existing LCG infrastructure➢ Profit from experience of LHC experiments and their
well-established and matured solutions
● Tasks: Data processing, MC production, physics analysis
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 30
Raw Data Processing
● Huge data size (raw data rate up to 1.8 GB/s) comparable to LHC experiments→
● Write once, read only few times in managed way➔ Tape as storage medium
➢ Most cost and power efficient, but high maintenance effort
● Source of raw data is Belle II detector➔ Store and process at KEK, no replication to remote sites
➢ Would require huge network bandwidth, high operation efficiency, maintenance of tape system
➢ Simpler than LCG model
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 31
Raw Data Processing
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 32
MC Production
● Run-dependent, generic MC: B0B0, B+B-, charm, and uds events
● Corresponding to N times the real data size
● Produced in managed way
● No input data needed(except generator and background data files)
➔ Well suited for a distributed environment
➢ Including cloud computing resources
➔ Output stored at grid sites where it is produced
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 33
MC Production
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 34
Physics Analysis
● Need real / MC data in mDST format as input➔ Copy mDST real data to grid sites➔ Replicate MC datasets in a managed way if needed
➢ Produce ntuples on the grid
● Random, uncoordinated access➔ Store mDST real / MC data on disk
● Need fast turn-around time for ntuple analysis➔ Copy them to local resources
➢ Do ntuple analysis on local resources
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 35
Physics Analysis
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 36
Belle II Computing Model
Raw Data Storageand Processing
MC Productionand Ntuple Production
MC Production(optional)
NtupleAnalysis
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 37
Tasks of Computing Facilities
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 38
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 39
(Commercial) Cloud Computing
● Resource demands vary with time
● Fair-share can solve this issue only to some extent
➔ Cloud computing allows to buy resources on demand
➢ Well suited to absorb peaks in varying resource demand
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 40
Cloud Computing in Belle II
● Risk: vendor lock-in➔ No permanent data storage on the cloud➔ Much less critical for CPU resources
● Large data transfer / storage not cost efficient (now)➔ Use cloud primarily for MC production➔ No data processing➔ Maybe physics analysis
● Accounting issues
➢ Baseline of computing resources provided by the grid➢ Cloud computing is option for peak demands
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 41
DIRAC for VM management
➢ Task: Belle MC production
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 42
Belle MC Production with DIRAC
Modularity of DIRAC
allows easy integration of
different cloud providers
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 43
DIRAC Test Phase I: Only Cloud
250 VMs with 8 cores
Data transferto grid sites
● 120M events (2.7 TB) produced in 10 days
● 0.46 USD / 10k events
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 44
DIRAC Test Phase II: Cloud + Local
Cloud (60%)
● 170M events (3.6 TB) produced in 6 days● Amazon Spot Instances 0.20 USD / 10k events→
Local (40%)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 45
DIRAC Test Phase III: Cloud + Grid + Local
Cloud (8 cores)
Local (8 cores)
Grid (1 core)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 46
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 47
Metadata Catalog
● Task: Identify files matching physics selection criteria (dataset)
➔ AMGA (Arda Metadata Grid Application)
● Official gLite metadata service● Support for several
DB backends● Grid-based authentication● Replication
➢ File level metadata: O(60 GB)
➢ Event level metadata investigated: O(20 TB)
Test of AMGAscalability
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 48
Analysis Model
● Task: Process selected set of files with given analysis code
➔ Project
➢ Have to make sure each file is processed exactly once
● No solution on project level provided by grid services
● Start with implementation of simple solution Static job – data assignment Job push model
● Then gradually add more complexity Dynamic job – data assignment (a la SAM, CDF/D0) Job pull model (pilot agents)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 49
Project Client
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 50
Project Server: Dynamic Data Assignment
✔ Easier job submission (identical jobs)✔ More fine grained monitoring (processed files)✔ Only running jobs at working sites get data assigned✔ Faster nodes get more input files
Automatic load balancing, reduced overall time to finish→
Additionalcentral
component
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 51
Same Strategy for MC Production
Input data file is replaced by run number, #events
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 52
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 53
Code Management
● Developer with different level of experience,distributed around the world
➔ Need reliable, user-friendly, well-maintainable code
Central subversion code repository with access control Code split in packages, responsible librarians
Nightly and regular integration builds Feature driven releases passing quality tests Compile on different OS (with different compilers)
Documentation: repository browser, doxygen, twiki Language: only C++, python for scripts Coding conventions, code reviews, style formatting tool
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 54
Coding Conventions @ TWiki
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 55
Framework
● Inspired by frameworks of Belle (basf) + other experiments
✔ Used for simulation, reconstruction, analysis, and DAQ✔ ROOT I/O as data format✔ Software bus
with dynamically loaded modules
✔ Python steering✔ Parallel processing✔ (Backward compatibility with Belle)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 56
Parallel Processing
● roobasf
● new basf2
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 57
Python Steering Example
import osfrom basf2 import *
#Register modulesgearbox = fw.register_module("Gearbox")
#Set parametersgearbox.param("InputFileXML",os.path.join(basf2dir,"Belle2.xml"))gearbox.param("SaveToROOT", True)gearbox.param("OutputFileROOT", "Belle2.root")
#Create pathsmain = fw.create_path()
#Add modules to pathsmain.add_module(gearbox)
#Process eventsfw.process(main,1)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 58
Geometry and Conditions Database
● (Time-dep.) parameters stored in XML files and DB
➔ Unique interface to obtain parameters Gearbox→
● TGeo geometry created from parameters Geant4 simulation using G4root Same geometry and parameters used for reconstruction
and analysis
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 59
Generic Event Viewer
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 60
Outline
● Physics at a (Super) B Factory
● SuperKEKB and Belle II
● Belle II Computing Model
● Cloud Computing
● Job and Data Management
● Offline Software
● Resources and Summary
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 61
Human Resources
Computing group:● ~10 in core
software team● ~15 in core
distributed computing team(including computing scientists)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 62
Hardware Resources
● Preliminary estimates depend on many unknown parameters(accelerator performance, data reduction, performance of simulation/reconstruction, analysis requirements, ...)
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 63
Summary
➢ SuperKEKB is planned to deliver O(50) times more data than current B factories
➔ Allows Belle II to discover, identify or exclude New Physics
➔ Huge data volume is challenge for computing Distributed computing system based on grid Cloud computing option for peak demands,
already works for Belle MC production Development of user interface to distributed
resources and offline software in progress
✔ 6th Open Belle II Meeting July 5-7
➔ You are welcome to attend http://belle2.kek.jp
DESY Computing Seminar 14.06.2010Thomas Kuhr Page 64
World Cup
Japan : Cameroon
1 : 0(my guess at half time)