A. Mohapatra, HEPiX 2013 Ann Arbor 1
UW Madison CMS T2 site report
D. Bradley, T. Sarangi, S. Dasu, A. Mohapatra
HEP Computing Group
Outline
• Infrastructure
• Resources
• Management & Operation
• Contributions to CMS
• Summary
Evolution
Started out as a Grid3 site
Played a key role in the formation of the Grid Laboratory of Wisconsin (GLOW)
HEP/CS (Condor team) collaboration
• Designed a standalone MC production system, adapted CMS software, and ran it robustly in non-dedicated environments (UW grid and beyond)
Selected as one of the 7 CMS Tier-2 sites in the US
Became a member of WLCG and subsequently OSG
Serving all OSG-supported VOs besides CMS
Infrastructure
3 machine rooms, 16 racks
Power supply: 650 kW
Cooling: chilled-water-based air coolers and POD-based hot aisles
Compute / Storage Resources
Compute (SL6)
• T2 HEP Pool: 4200 cores, dedicated to CMS
• CHTC and GLOW Pool: 3200 cores, opportunistic
Storage (Hadoop)
• Migrated from dCache to Hadoop 3 years ago
• 3 PB distributed across 350 nodes
• Will add 1 PB soon
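As a rough sanity check on the resource figures quoted on this slide, the totals can be worked out directly. This is only a sketch: the names are illustrative, decimal units (1 PB = 1000 TB) are assumed, and it covers only the capacity currently quoted, not the planned addition.

```python
# Figures quoted on the slide; the names below are illustrative only.
T2_HEP_CORES = 4200      # T2 HEP pool, dedicated to CMS
CHTC_GLOW_CORES = 3200   # CHTC and GLOW pool, opportunistic
STORAGE_PB = 3.0         # current Hadoop capacity
STORAGE_NODES = 350      # nodes holding that capacity

TB_PER_PB = 1000         # assumption: decimal units


def total_cores() -> int:
    """Cores reachable when both pools are available."""
    return T2_HEP_CORES + CHTC_GLOW_CORES


def dedicated_fraction() -> float:
    """Share of the total cores that is dedicated to CMS."""
    return T2_HEP_CORES / total_cores()


def avg_tb_per_node() -> float:
    """Average storage per node, assuming an even distribution."""
    return STORAGE_PB * TB_PER_PB / STORAGE_NODES
```

Under these assumptions the two pools add up to 7400 cores, a little over half of them dedicated, and the storage averages somewhat under 9 TB per node.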
Network Configuration
[Figure: network diagram. Labels shown: Internet2, NLR, ESnet, Chicago, Purdue, FNAL, Nebraska, UW Campus, T2 LAN, servers, switches. Link speeds shown: 100G (new), 3x10G, 10G, 1G, 4x10G.]
perfSONAR (latency and bandwidth) nodes are used to debug LAN and WAN (+ USCMS matrix) issues
Network Configuration (2)
Strong support from the campus network team (DoIT)
Good rate for CMS data transfer from T1/T2 sites
Software and Services
File systems
• AFS, NFS, CVMFS, ext4
Job batch systems
• Condor
OSG software
• Globus, GUMS, glexec
Storage
• Hadoop (HDFS), BeStMan2 (SRM), GridFTP, Xrootd
Cluster management
• Local yum repo, Puppet
Cluster monitoring
• Nagios, Ganglia, and tons of home-grown scripts
Cluster Software Management
Cfengine usage until fall of 2012
Migration to Puppet
• Translation and successful validation of individual Cfengine-based modules
Cluster Health Monitoring
Nagios
• Hardware, disks, etc.
Ganglia
• Services, memory, CPU/disk usage, network, storage
OSG tools
• RSV
CMS specific
• SAM, HammerCloud
Miscellaneous scripts
• Cron'ed
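To illustrate what a small cron-driven health check of this kind might look like, here is a minimal sketch of a disk-usage probe that follows the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not one of the site's actual scripts: the function names, the default path, and the 85%/95% thresholds are all assumptions made for the example.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(used_fraction: float, warn: float = 0.85, crit: float = 0.95) -> int:
    """Map a used-space fraction to a status code (thresholds are hypothetical)."""
    if used_fraction >= crit:
        return CRITICAL
    if used_fraction >= warn:
        return WARNING
    return OK


def check_disk(path: str = "/") -> int:
    """Print a one-line status for `path` and return the matching status code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {path}: {exc}")
        return UNKNOWN
    if usage.total == 0:
        print(f"DISK UNKNOWN - {path}: reported size is zero")
        return UNKNOWN
    used = usage.used / usage.total
    status = classify(used)
    print(f"DISK {LABELS[status]} - {path} {used:.0%} used")
    return status
```

A wrapper run from cron or by Nagios would pass the return value of `check_disk(...)` to `sys.exit`, so that the scheduler or monitoring server sees the status as the process exit code.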
CMS Usage (production/analysis)
Contributions to CMS
• HTCondor and Glidein technology
Any data, Anytime, Anywhere
Experience with Amazon EC2