Distributed Computing and Data Analysis for CMS in view
of the LHC startup
Peter Kreuzer
RWTH-Aachen IIIa
International Symposium on Grid Computing (ISGC), Taipei, April 9, 2008
Outline
• Brief overview of Worldwide LHC Grid: WLCG
• Distributed Computing Challenges at CMS
  – Simulation
  – Reconstruction
  – Analysis
• The physicist view
• The road to the LHC startup
From local to distributed Analysis
• Before: centrally organised analysis
• Over the last 20 years, the amount of data and the number of physicists per experiment have each grown drastically (roughly ×10)
• Example, CMS: 4-6 PBytes of data per year, 2900 scientists, 40 countries, 184 institutes!
• Solution: a ''tiered'' computing model
Worldwide LHC Computing GRID
• Level of distribution motivated by the desire to leverage and empower resources and to share load, infrastructure and funding
• Tier-0 at CERN: prompt reconstruction, calibration and low-latency work, archiving
• Tier-1s at large national labs or universities: re-reconstruction, physics ''skimming'', data serving, archiving
• Tier-2s, primarily at universities: simulation, user analysis
• Tier-3s at institutes with modest infrastructure: local user analysis, opportunistic simulation
• Aggregate rate from CERN to Tier-1s > 1.0 GByte/s; transfer rate to Tier-2s: 50-500 MBytes/s
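As a toy illustration of the tier-to-workflow assignment listed above, a short Python sketch (the dictionary simply restates the slide; the lookup helper is hypothetical, not a CMS tool):

# Toy sketch of the tiered computing model described above: each class of
# workflow is assigned to one or more tier levels. The dictionary restates
# the slide; tiers_for() is a hypothetical helper, not a CMS tool.
TIER_ROLES = {
    "Tier-0": ["prompt reconstruction", "calibration and low-latency work", "archiving"],
    "Tier-1": ["re-reconstruction", "physics skimming", "data serving", "archiving"],
    "Tier-2": ["simulation", "user analysis"],
    "Tier-3": ["local user analysis", "opportunistic simulation"],
}

def tiers_for(workflow: str) -> list:
    """Return the tier levels at which a given type of workflow runs."""
    return [tier for tier, roles in TIER_ROLES.items() if workflow in roles]

if __name__ == "__main__":
    for wf in ("prompt reconstruction", "simulation", "archiving"):
        print(f"{wf}: {', '.join(tiers_for(wf))}")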
WLCG Infrastructure
• EGEE: Enabling Grid for E-Science
• OSG: Open Science Grid
• 1 Tier-0 + 11 Tier-1 + 67 Tier-2
• Tier-0 -- Tier-1: dedicated 10 Gb/s optical network
• CMS: 1 Tier-0 + 7 Tier-1 + 35 Tier-2
Examples of Sites
• T2 RWTH (Aachen): CPU 540 kSI2k = 360 cores; disk 100 TB; network (WAN) 2 Gbit/s (2009: 450 cores & 150 TB)
• T1 ASGC: CPU 2.4 MSI2k ~ 1800 cores; disk 930 TB → 1.5 PB; tape 586 TB → 800 TB; network 10 Gbit/s
• T2 Taiwan: CPU 150 kSI2k; disk 19 TB → 62 TB; network up to 10 Gbit/s
Pledged WLCG Resources
[Charts: pledged CPU (MSI2k) and disk storage (PetaBytes) per year, 2007-2012, broken down by CERN, Tier-1 and Tier-2]
• CPU in 2008: 66,000 cores, rising towards ~250,000 cores over the period (1 MSI2k = 670 cores)
• Disk storage in 2008: 40 PetaBytes (tape storage = 33 PBytes in 2008)
• Reference: LCG Project Planning – 1.3.08
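A quick sketch of the SI2k-to-core conversion quoted above (the factor 1 MSI2k ≈ 670 cores is the slide's rule of thumb for 2008 hardware; nothing else is assumed):

# Rough conversion between the WLCG capacity unit (SI2k) and core counts,
# using the rule of thumb quoted on the slide: 1 MSI2k ~ 670 cores.
CORES_PER_MSI2K = 670  # slide's approximate 2008 conversion factor

def msi2k_to_cores(msi2k: float) -> int:
    """Approximate number of cores corresponding to a pledge in MSI2k."""
    return round(msi2k * CORES_PER_MSI2K)

def cores_to_msi2k(cores: int) -> float:
    """Approximate pledge in MSI2k corresponding to a number of cores."""
    return cores / CORES_PER_MSI2K

if __name__ == "__main__":
    # The 2008 pledge of ~66,000 cores corresponds to roughly 100 MSI2k.
    print(f"66,000 cores ~ {cores_to_msi2k(66_000):.0f} MSI2k")
    print(f"100 MSI2k   ~ {msi2k_to_cores(100):,} cores")

With this factor, the ~66,000 cores pledged for 2008 correspond to roughly 100 MSI2k.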
Challenges for Experiments: Example CMS
• Scale up and test the distributed computing infrastructure
  – Mass storage systems and computing elements
  – Data transfer
  – Calibration and reconstruction
  – Event ''skimming''
  – Simulation
  – Distributed data analysis
• Test the CMS software analysis framework
• Operate in quasi-real data-taking conditions and simultaneously at various Tier levels
⇒ the Computing & Software Analysis (CSA) Challenge
CMS Computing and Software Analysis Challenges
• CMS scaling up over the last 4 years:

  Test     Goal (jobs/day)   Scale
  DC04          15,000          5%
  CSA06         50,000         25%
  CSA07        100,000         50%
  CSA08        150,000        100%

  (2005-2006: new data model and new software framework)

• Requires hundreds of millions of simulated events as input
The CSA07 Data Challenge
[Workflow diagram: HLT → Tier-0 (CASTOR) → Tier-1s → Tier-2s, with the CERN Analysis Facility (CAF) attached to the Tier-0]
• Tier-0: reconstruction at 100 Hz (input from the HLT at 300 MB/s)
• CAF: calibration & express analysis (~10 MB/s)
• Tier-1s: re-reconstruction and skims, 25k jobs/day
• Tier-2s: simulation at 50M events/month (100M simulated events in total); analysis at 75k jobs/day
• Tier-1 to Tier-2 transfers: 20-200 MB/s
In this presentation
• Mainly covering the CMS simulation, reconstruction and analysis challenges
• Data transfer challenges are covered in the talk by Daniele Bonacorsi in this session
CMS Simulation System
[Diagram: a CMS physicist submits a request (''Please simulate new physics'') through ProdRequest to the ProductionManager; multiple ProdAgent instances execute the production on Tier-1 and Tier-2 sites via the GRID; the output is registered in the Global Data Bookkeeping system (DBS), which answers the physicist's question ''Where are my data?'']
ProdAgent workflows
• Data processing / bookkeeping / tracking / monitoring in local scope
• Output promoted to the global-scope DBS and to the data transfer system PhEDEx
• Scaling achieved by running multiple ProdAgent instances in parallel (a minimal sketch of the two-step workflow follows the diagram below)
[Diagram, two stages:
1) Processing: the ProdAgent submits processing jobs through the Grid WMS to Tier-1 and Tier-2 sites; each job writes a small output file to a storage element (SE), registered in the local DBS.
2) Merging: merge jobs combine the small files into large output files on the SEs, register them in the local DBS and hand them over to PhEDEx.]
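To make the two-step pattern concrete, here is a minimal Python sketch of the processing-then-merging bookkeeping, using a toy in-memory catalogue; all names (LocalDBS, run_processing_jobs, merge_files, promote_and_transfer) and the file paths are hypothetical illustrations, not ProdAgent's actual API.

# Illustrative sketch (not ProdAgent's real API): many processing jobs each
# produce a small file; the small files are then merged into large files,
# which are promoted to global scope and handed to the transfer system.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LocalDBS:
    """Stands in for the local-scope bookkeeping database."""
    files: List[str] = field(default_factory=list)

    def register(self, lfn: str) -> None:
        self.files.append(lfn)

def run_processing_jobs(n_jobs: int, dbs: LocalDBS) -> List[str]:
    """Stage 1: each job writes a small output file to a storage element."""
    small_files = [f"/store/unmerged/job_{i}.root" for i in range(n_jobs)]
    for lfn in small_files:
        dbs.register(lfn)
    return small_files

def merge_files(small_files: List[str], files_per_merge: int, dbs: LocalDBS) -> List[str]:
    """Stage 2: merge jobs combine small files into large ones."""
    merged = []
    for i in range(0, len(small_files), files_per_merge):
        lfn = f"/store/merged/block_{i // files_per_merge}.root"
        dbs.register(lfn)
        merged.append(lfn)
    return merged

def promote_and_transfer(merged: List[str]) -> None:
    """Promote merged files to global scope and queue them for transfer."""
    for lfn in merged:
        print(f"promote to global DBS and inject into PhEDEx: {lfn}")

if __name__ == "__main__":
    local_dbs = LocalDBS()
    small = run_processing_jobs(n_jobs=10, dbs=local_dbs)
    large = merge_files(small, files_per_merge=5, dbs=local_dbs)
    promote_and_transfer(large)

As described above, only the large merged files are promoted to the global DBS and handed to PhEDEx.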
CMS Simulation Performance
• ~250M events in 5 months, June – November 2007 (overall 2007-08: 450M)
• Tier-2 sites alone: ~72%; OSG alone: ~50%
• 20k jobs/day reached
• Average job efficiency ~75%
[Plot: simulated events per month (M evts/month), showing a production-rate increase of ×1.8]
Utilization of CMS Resources (June – November 2007)
• Average ~50%; in the best production periods ~75%
[Plot: used job slots vs. time against a capacity of ~5000 job slots; idle periods labelled ''Missing Requests'']
CSA07 Simulation lessons
• Major boost in the scale and reliability of the production machinery
• Still too many manual operations. From 2008 on:
  – Deploy the ProdManager component (in CSA07 this role was ''human''!)
  – Deploy a Resource Monitor
  – Deploy a CleanUpSchedule component
• Further improvements in scale and reliability:
  – gLite WMS bulk submission: 20k jobs/day with 1 WMS server
  – Condor-G JobRouter + bulk submission: 100k jobs/day, able to saturate all OSG resources in ~1 hour
  – Threaded JobTracking and central job log archival
• Introduced a task force for CMS Site Commissioning:
  – helps detect site issues via a stress-test tool (enforces metrics)
  – couples the site state to the production and analysis machinery
• Regular CMS Site Availability Monitoring (SAM) checks
CMS Site Availability Monitoring
• Important tool to protect CMS use cases at sites (ARDA ''Dashboard'')
[Plot: availability ranking of CMS sites, 0% to 100%, for the period 03/22/08 – 04/03/08]
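For illustration, a minimal Python sketch of how an availability ranking of this kind can be computed from periodic test results; the test names, the sampling and the simple unweighted average are assumptions for the example, not the actual SAM algorithm:

# Minimal sketch of an availability ranking: each site runs a set of functional
# tests at regular intervals, and availability is the fraction of test samples
# that passed over the reporting period (unweighted average, for illustration).
from typing import Dict, List

def site_availability(test_results: Dict[str, List[bool]]) -> float:
    """Fraction of passed test samples over the period, in [0, 1]."""
    samples = [ok for results in test_results.values() for ok in results]
    return sum(samples) / len(samples) if samples else 0.0

def rank_sites(sites: Dict[str, Dict[str, List[bool]]]) -> List[tuple]:
    """Sort sites from most to least available."""
    return sorted(
        ((name, site_availability(tests)) for name, tests in sites.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

if __name__ == "__main__":
    # Hypothetical results for two sites over a few sampling intervals.
    sites = {
        "T2_Example_A": {"job-submission": [True, True, True], "data-access": [True, False, True]},
        "T2_Example_B": {"job-submission": [True, False, False], "data-access": [True, True, False]},
    }
    for name, availability in rank_sites(sites):
        print(f"{name}: {availability:.0%}")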
CSA07 Reconstruction & Skimming
0) Preparation of ''Primary Datasets'' (mimics real CMS detector + trigger data)
1) Archiving and reconstruction at the CERN T0, with 3 different calibrations (10 pb-1, 100 pb-1, 0 pb-1)
2) Archiving and re-reconstruction at T1s
3) Skimming at T1s
4) Express analysis & calibration at the CERN Analysis Facility
Produced CSA07 Data Volumes
• Total CSA07 event counts:
  – 80M GEN-SIM
  – 80M DIGI-RAW
  – 80M HLT
  – 330M RECO (3 different calibrations)
  – 250M AOD
  – 100M skims
  – Total: 920M events
• Total data volume: ~2 PB, corresponding to the expected 2008 volume!
• CMS data in CASTOR@CERN: 3.7 PB
[Plot: cumulative DIGI-RAW-HLT-RECO events (×10^8) vs. time, 10/'07 – 02/'08]
CSA07 Reconstruction lessons
• T0 reconstruction at 100 Hz only in bursts, mainly due to the stream-splitting activity
• Heavy load on CASTOR
• Useful feedback to the ProdAgent developers to prepare for 2008 data taking (repacker, …)
• T1 processing: the submission rate was the main limitation. Now based on gLite bulk submission and reaching 12-14k jobs/day with 1 ProdAgent instance
• Further rate improvements to be expected with the T1 resource up-scaling
[Plot: T0 and T1 processing, ~2k running jobs]
CMS Analysis System
• CRAB = CMS Remote Analysis Builder: an interface to the GRID for CMS physicists
• Challenge: match processing resources with large quantities of data = ''chaotic'' processing
[Diagram: a CMS physicist asks ''Please analyse datasets X/Y''; CRAB (via the CRAB server) locates the data in the Global Data Bookkeeping system (DBS) and submits analysis jobs to Tier-1 and Tier-2 sites through the GRID; the physicist asks CRAB ''Where are my jobs?'']
CRAB Architecture
• Easy and transparent means for CMS users to submit analysis jobs via the GRID (LCG RB, gLite WMS, Condor-G)
• CSA07 analysis: direct submission by the user to the GRID. Simple, but lacking automation and scalability. From 2008: CRAB server
• Other new feature: local DBS for ''private'' users
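To illustrate the ''match processing resources with large quantities of data'' point, here is a minimal Python sketch of dataset-driven job splitting of the kind an analysis submission tool performs; the data structures, file names and greedy packing are hypothetical, not CRAB's actual implementation:

# Minimal sketch of dataset-driven job splitting: given the files of a dataset
# (as listed in a bookkeeping catalogue) and a target number of events per job,
# produce one job specification per slice of the dataset.
from dataclasses import dataclass
from typing import List

@dataclass
class DatasetFile:
    lfn: str        # logical file name as registered in the catalogue
    n_events: int   # events contained in the file

@dataclass
class JobSpec:
    files: List[str]
    n_events: int

def split_dataset(files: List[DatasetFile], events_per_job: int) -> List[JobSpec]:
    """Greedily pack whole files into jobs of roughly events_per_job events."""
    jobs, current, current_events = [], [], 0
    for f in files:
        current.append(f.lfn)
        current_events += f.n_events
        if current_events >= events_per_job:
            jobs.append(JobSpec(files=current, n_events=current_events))
            current, current_events = [], 0
    if current:
        jobs.append(JobSpec(files=current, n_events=current_events))
    return jobs

if __name__ == "__main__":
    dataset = [DatasetFile(f"/store/data/file_{i}.root", 25_000) for i in range(12)]
    for i, job in enumerate(split_dataset(dataset, events_per_job=100_000)):
        print(f"job {i}: {len(job.files)} files, {job.n_events} events")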
CSA07 Analysis
• 100k jobs/day not achieved:
  – mainly due to a lack of data during the challenge
  – still limited by the data distribution: 55% of jobs at the 3 largest Tier-1s
  – and a failure rate that was too high
• 20k jobs/day achieved, plus regularly ~30k/day JobRobot submissions
• Job outcome: 53% successful, 20% failed, 27% unknown
• Main failure causes: data access, remote stage-out, manual user settings
[Plot: number of jobs vs. time]
CMS Grid Users over the last year
• Plot showing distinct users per month
• 300 users during February 2008
• The 20 most active users carry 1/3 of the jobs
[Plot: distinct users vs. month, with a ''CRAB Server'' annotation]
The Physicist View
• SUSY search in di-lepton + jets + MET
• Goal: simulate an excess over the Standard Model (''LM1'' at 1 fb-1)
• Infrastructure:
  – 1 desktop PC
  – CMS software environment (''CMSSW'', ''CRAB'', ''Discovery'' GUI, …)
  – GRID certificate + membership of a Virtual Organisation (CMS)
• Input data (CSA07 simulation/production):
  – Signal (RECO): 120k events = 360 GB
  – Skimmed background (AOD): 3.3M events = 721 GB (WW / WZ / ZZ / single top; ttbar / Z / W + jets)
  – Signal + skimmed background ≈ 1.1 TB
  – Unskimmed background: 27M events = 4 TB (for detailed studies only)
• Location of input data:
  – T0/T1: CERN (CH), FNAL (US), FZK (Germany)
  – T2: Legnaro (Italy), UCSD (US), IFCA (Spain)
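A quick arithmetic check of the per-event sizes implied by the sample sizes above (slide numbers only; these are not official CMS event-size figures):

# Quick check of the event sizes implied by the numbers quoted above.
samples = {
    "Signal (RECO)":            {"events": 120_000,    "size_gb": 360},
    "Skimmed background (AOD)": {"events": 3_300_000,  "size_gb": 721},
    "Unskimmed background":     {"events": 27_000_000, "size_gb": 4_000},
}

for name, s in samples.items():
    mb_per_event = s["size_gb"] * 1000 / s["events"]  # GB -> MB per event
    print(f"{name:28s}: {mb_per_event:6.2f} MB/event")

# Signal plus skimmed background together:
total_gb = samples["Signal (RECO)"]["size_gb"] + samples["Skimmed background (AOD)"]["size_gb"]
print(f"Signal + skimmed background ~ {total_gb / 1000:.1f} TB")

The RECO signal works out to about 3 MB/event, while the skimmed AOD background is roughly 0.2 MB/event, which is why the 3.3M-event background sample is not much larger on disk than the 120k-event signal sample.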
GRID Analysis Result
[Plot: mass spectrum in GeV showing the Z peak from SUSY cascades and the end-point signal; by Georgia Karapostoli, Athens Univ.]
• Analysis latency:
  – Signal + background = 322 jobs, 22 h to produce this result!
  – Detailed studies = 1300 jobs, ~3.5 days
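Taking the two quoted turnarounds at face value, the effective throughput was roughly 15 jobs per hour in both cases, i.e. the latency scaled almost linearly with the number of jobs; a quick check using only the slide's numbers:

# Quick check of the effective analysis throughput implied by the two quoted
# turnaround times (slide numbers only, not a measured Grid throughput).
runs = {
    "Signal + background": {"jobs": 322,  "hours": 22},
    "Detailed studies":    {"jobs": 1300, "hours": 3.5 * 24},
}

for name, r in runs.items():
    rate = r["jobs"] / r["hours"]
    print(f"{name:20s}: {rate:5.1f} jobs/hour over {r['hours']:.0f} h")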
CSA07 Analysis lessons
• Improve analysis scalability, automation and reliability:
  – CRAB server
  – Automate job re-submission
  – Optimize job distribution
  – Decrease the failure rate
• Move analysis to Tier-2s:
  – To protect the Tier-0/1 LSF and storage systems
  – To make use of all available GRID resources
• Encourage Tier-2 to Physics-group association:
  – In close collaboration with the sites
  – With a solid overall Data Management strategy
  – Assess local-scope DM for Physics groups & storage of user data
• Aim for 500 users by June, exceeding the capacity of several gLite WMS
Goals for CSA08 (May '08)
• ''Play through'' the first 3 months of data taking
• Simulation:
  – 150M events at 1 pb-1 (''S43'')
  – 150M events at 10 pb-1 (''S156'')
• Tier-0: prompt reconstruction
  – S43 with the start-up calibration
  – S156 with improved calibration
• CERN Analysis Facility (CAF):
  – Demonstrate low-turnaround Alignment & Calibration workflows
  – Coordinated and time-critical physics analyses
  – Proof of principle of the CAF Data and Workflow Management Systems
• Tier-1: re-reconstruction with new calibration constants
  – S43: with improved constants based on 1 pb-1
  – S156: with improved constants based on 10 pb-1
• Tier-2:
  – iCSA08 simulation (GEN-SIM-DIGI-RAW-HLT)
  – repeat CAF-based physics analyses with re-reconstructed data?
2008 Schedule (January – October)
[Timeline chart with two parallel tracks; exercises must be kept mostly non-overlapped]
• Preparation of Software, Computing and Physics analysis:
  – CCRC'08-1; 2007 physics analyses results
  – CMSSW 1.8.0 sample production; CMSSW 2.0 release [production start-up MC samples]; 2 weeks of 2.0 testing
  – iCSA08 sample generation; iCSA08 / CCRC'08-2
  – CMSSW 2.1 release [all basic software components ready for LHC, new T0 production tools]
  – fCSA08 or beam!
• Detector installation, commissioning and operation:
  – Private global runs (2 days/week) & private mini-daq
  – Cooldown of the magnet; beam-pipe baked out; pixels installed; low-i test; CMS closed; initial CMS ready for run
  – Commissioning runs: ''CROT'', ''CRAFT'', CR 0T, GRUMM, pre CR 4T, CR 4T
• CCRC = Common-VO Computing Readiness Challenge; CR = Commissioning Run
Where do we stand?
• WLCG: major up-scaling over the last 2 years!
• CMS: impressive results and valuable lessons from CSA07
  – Major boost in simulation
  – Produced ~2 PBytes of data in T0/T1 reconstruction and skimming
  – Analysis: the number of CMS Grid users is ramping up fast!
  – Software: addressed memory-footprint and data-size issues
• Further challenges for CMS: scale from 50% to 100%
  – Simultaneous and continuous operations at all Tier levels
  – Analysis distribution and automation
  – Transfer rates (see talk by D. Bonacorsi)
  – Up-scale and commission the CERN Analysis Facility (CAF)
  – via CSA08, CCRC08 and the Commissioning Runs
• Challenging and motivating goals in view of Day-1 LHC!