CMS UK Computing - Indico...QMUL - 14,934 N/A 2 TB RHUL - Opportunistic - 0 Oxford - Opportunistic -...
![Page 1](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/1.jpg)
Katy Ellis
28th August 2019
GridPP43
CMS usage of UK resources
![Page 2](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/2.jpg)
Introduction
• Katy Ellis
• New(ish) CMS / RAL Tier 1 Liaison
• Started in September 2018
• New to CMS (PhD on ATLAS, 2012)
• Included in my role – operations, improving job efficiency/failure rate, “projects”
  • DOMA TPC, Rucio integration for CMS with CTA, XRootD investigations.
![Page 3](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/3.jpg)
Contents
• CMS experiment upgrades in LS2
• Review of CMS UK computing resources
• News from Tier 2s
• Additional CMS resources
• Progress with Rucio for CMS
![Page 4](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/4.jpg)
LHC schedule
![Page 5](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/5.jpg)
CMS current status
• Preparing for Run 3
  • Hardware upgrades
  • Computing upgrades
• Preparing for the longer-term future
  • Long Shutdown 3 activities are already being planned in detail.
• HL-LHC civil-engineering work has been ongoing since June 2018: five new buildings on the surface, as well as modifications to the underground cavern and galleries.
![Page 6](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/6.jpg)
Long shutdown 2 CMS upgrades
• Installation of new beampipe
• Replacement pixel detector (innermost layer)
• Upgraded power system for the magnet
• Installation of new multi-GEM chambers for increased coverage of muon detection
![Page 7](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/7.jpg)
CMS UK Computing Resources

| Site | CPU pledge (HS06) | CPU provision (HS06) | Disk pledge | Disk provision |
|---|---|---|---|---|
| RAL Tier 1 | 52,000 | 61,408 | 5.44 PB* | 5.44 PB* |
| Brunel (London) | | | | 1.49 PB |
| Imperial College London | | 44,198 | | 4.50 PB |
| RALPP (Tier 2), South | | 24,122 | | 3.73 PB |
| Bristol | | | | 727 TB |
| QMUL | - | 14,934 | N/A | 2 TB |
| RHUL | - | Opportunistic | - | 0 |
| Oxford | - | Opportunistic | - | 41 TB |
| Glasgow | - | Opportunistic | - | 174 |
| DODAS | N/A | Opportunistic | - | N/A |
| CMS @ home | N/A | Volunteered | - | N/A |

* Includes 200 TB tape buffer
Source: REBUS

Total T2 CPU pledge: 50 kHS06
• London: CPU 32,641, disk 2.925 PB
• South: CPU 17,326, disk 975 TB
+ 17.6 PB tape pledge at T1
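As a quick sanity check on the figures above, the short sketch below adds the two regional Tier 2 CPU pledges and compares the Tier 1 provision with its pledge. The numbers are copied from the slide; the variable names and the grouping into a dictionary are illustrative assumptions, not part of any CMS or REBUS tool.

```python
# Figures copied from the resources slide (units: HS06).
# Variable names are illustrative only.
t1_cpu_pledge = 52_000
t1_cpu_provision = 61_408
regional_t2_cpu_pledge = {"London": 32_641, "South": 17_326}

# Sum of the regional Tier 2 pledges, to compare with the quoted 50 kHS06 total.
t2_cpu_pledge_total = sum(regional_t2_cpu_pledge.values())

# Ratio of Tier 1 provision to pledge (values above 1 mean the pledge is exceeded).
t1_provision_ratio = t1_cpu_provision / t1_cpu_pledge

print(f"Total T2 CPU pledge: {t2_cpu_pledge_total} HS06")
print(f"T1 provision / pledge: {t1_provision_ratio:.2f}")
```

The regional pledges add up to just under the quoted 50 kHS06, and the Tier 1 provision sits above its pledge.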
![Page 8](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/8.jpg)
CPU in last 6 months – Total completed jobs
![Page 9](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/9.jpg)
Running cores
![Page 10](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/10.jpg)
CMS Pledges
From EGI Accounting
[Plot labels: 52 kHS06, 48 kHS06; T1, T2s, Opp]
Total T2 pledge = 50 kHS06
![Page 11](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/11.jpg)
CMS Pledges
From EGI Accounting
[Plot labels: 52 kHS06, 48 kHS06]
The Katy Effect?
![Page 12](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/12.jpg)
Imperial College news
• IC have moved their data centre from Kensington to Slough!
• Coordination by Simon Fayer
• ~1 week
• Data centre is now run remotely, but tended by local system administrators - no noticeable difference
![Page 13](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/13.jpg)
RALPP news
• In the last month, RALPP has joined the LHCONE network
  • LHC Open Network Environment.
• Improve data access by flattening the T1/2/3 hierarchy so that any site may connect with any other.
• One issue connecting with FNAL FTS, but quickly resolved.
• This is a precursor to RAL T1 joining LHCONE – other T1s already connected.
![Page 14](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/14.jpg)
![Page 15](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/15.jpg)
Incorporating additional sites
• Analysis jobs submitted to RAL T1/T2, IC or Brunel will also match Glasgow, Oxford, QMUL and RHUL.
• UK sites are now a mesh structure
  • “CMS Tier 3” sites read/write data from/to RALPP
• CMS are keen to extend this to other non-CMS sites.
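The matching behaviour described above can be pictured with a toy model: a job that targets any of the main CMS UK sites also becomes eligible at the additional sites. This is only a sketch under stated assumptions. The site labels are the informal names from the slide rather than real CMS site identifiers, and `eligible_sites` is a hypothetical helper, not part of the CMS submission infrastructure, which performs the actual matching.

```python
# Toy model of the site matching described on the slide.
# Labels are the informal names used in the talk, not real CMS site identifiers.
MAIN_SITES = {"RAL T1", "RAL T2", "IC", "Brunel"}
ADDITIONAL_SITES = {"Glasgow", "Oxford", "QMUL", "RHUL"}


def eligible_sites(requested):
    """Return the sorted list of sites a job could match (hypothetical helper).

    If the job requests at least one main site, the additional sites
    are added to the set of candidates; otherwise the request is unchanged.
    """
    sites = set(requested)
    if sites & MAIN_SITES:
        sites |= ADDITIONAL_SITES
    return sorted(sites)


print(eligible_sites(["IC"]))
```

A request that names none of the main sites is returned unchanged in this model.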
![Page 16](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/16.jpg)
![Page 17](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/17.jpg)
DODAS – “Dynamic On Demand Analysis Service”
• Start a personal Tier 2 (3?) on the cloud.
• Running in ~20 minutes on OpenStack, Azure, AWS, EGI-clouds.
• Developed in Italy, adapted and tested by Riccardo di Maria at IC.
  • Looking for a new person.
  • Tested on a temporary cloud of 800 nodes at IC.
• “Useful if you have a deadline and a credit card”.
![Page 18](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/18.jpg)
CMS@home *
• Volunteers running CMS jobs on their personal computers
• They can view monit plots with CERN or affiliate credentials, or Facebook/Google/etc.
• Almost entirely single-core MC Production jobs
• Planning to make submission more automated.
• Trying to move to SLC7.
* Talk to Ivan Reid if you want to know more
![Page 19](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/19.jpg)
Move to Rucio
• Rucio will replace PhEDEx from Run 3
  • File transfer service
  • File catalogue
  • Highly scalable
  • Heterogeneous storage systems worldwide
  • Run centrally
• Used successfully by ATLAS for several years
• Now open to the wider community
• CMS activities include: integration with Production and User Analysis job submission, setup of databases for e.g. User accounts, data synchronization performance testing, setup of monitoring, etc.
![Page 20](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/20.jpg)
Rucio for CMS – current status
• Rucio components are automated in Kubernetes.
  • Monitoring will be added in Kibana.
• ‘Million file test’ progressing well – being repeated.
  • Monitoring in early stages…
• Some level of synching on many sites - NanoAOD.
  • Subscriptions.
![Page 21](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/21.jpg)
Rucio for CMS – Tape
• Able to transfer into all Tier 1 tape systems
  • RAL was most difficult – different Rucio config
• Have now fixed RAL config, and able to use the tape as source
• Cannot yet use other T1 tapes as source
  • Hoping the fix for RAL will point towards a solution
• Able to transfer into and out of CERN Tape Archive (CTA)
  • Must be done via EOS
• Started more substantial tests, ~10 TB
• Waiting for monitoring from CTA (2.155 TB were written in 3 hours)
• Working towards a ~200 TB transfer test
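For scale, the quoted CTA write figure (2.155 TB in 3 hours) can be turned into an average rate, and then into a rough duration for a ~200 TB test. This is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 10^12 bytes) and that the same average rate would be sustained, neither of which is stated in the talk.

```python
# Back-of-the-envelope throughput estimate from the figures on the slide.
# Assumptions (not from the talk): decimal terabytes, constant average rate.
TB = 1e12  # bytes per terabyte (decimal)

written_bytes = 2.155 * TB
elapsed_seconds = 3 * 3600

# Average write rate in MB/s (decimal megabytes).
rate_mb_per_s = written_bytes / elapsed_seconds / 1e6

# Time a 200 TB transfer would take at that same average rate.
hours_for_200_tb = (200 * TB) / (written_bytes / elapsed_seconds) / 3600

print(f"Average rate: {rate_mb_per_s:.1f} MB/s")
print(f"200 TB at that rate: {hours_for_200_tb:.0f} hours "
      f"({hours_for_200_tb / 24:.1f} days)")
```

Under these assumptions the rate is a little under 200 MB/s, so a 200 TB test at the same rate would take on the order of eleven to twelve days; the real test may well run at a different rate.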
![Page 22](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/22.jpg)
Summary
• CMS is preparing for Run 3 and beyond.
• UK sites are meeting their pledge, and often exceeding it.
• CMS Tier 2s are working well.
• Thanks to other sites for offering spare capacity.
• Tier 1 continues to make improvements.
• Rucio integration and testing for CMS is in full swing.
![Page 23](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/23.jpg)
Backup
![Page 24](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/24.jpg)
VO shares for UK sites
![Page 25](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/25.jpg)
Increase in data rate
![Page 26](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/26.jpg)
Long shutdown activities
• Detector upgrades
• Production
• Tape and disk cleaning
• CERN Tape Archive
• Rucio
![Page 27](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/27.jpg)
CMS detector upgrades for Run 3
Further upgrades for HL-LHC in Runs 4 and 5 are already in the detailed planning stage via TDRs.
![Page 28](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/28.jpg)
Rucio and CERN Tape Archive (CTA)
• CTA: CERN Tape Archive, which replaces CASTOR at CERN this summer
  • Meta-data migration only
• Change to the high-level structure? Possible issue with Rucio
• RAL will also be changing tape system in the medium term
  • Tender is out
• Useful to gain expertise integrating Rucio with tape systems
• Pre-production service on CTA, with a test Rucio Storage Element (RSE)
![Page 29](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/29.jpg)
Details on CMS T2s (pledge)
• Imperial (2200 TB)
  • Moving the data centre to Slough in June
  • 2 × 100 Gb/s network (one is fallback)
• RAL_PP (1100 TB, 1600 TB imminently)
  • Connecting to LHCONE soon
• Brunel (500 TB)
  • CMS using close to 100% of storage due to a bug
  • Some issues after upgrading to DOME version of DPM. Better testing before deployment would improve this situation.
  • 40 Gb/s coming in May
![Page 30](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/30.jpg)
‘Old’ CERN CASTOR tape setup
![Page 31](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/31.jpg)
‘New’ CERN CTA tape setup
![Page 32](https://reader035.fdocuments.net/reader035/viewer/2022071501/6120a2d0868ec004f1338f78/html5/thumbnails/32.jpg)
Why is it so complicated?