AGLT2 and MWT2 Networking Requirements
efi.uchicago.edu   ci.uchicago.edu
Rob Gardner
Computation and Enrico Fermi Institutes
University of Chicago
AGLT2-MWT2 Networking Meeting
June 3, 2013
Scale of resources
• MWT2 capacity
  – 6392 compute cores
  – 3614 TB
• AGLT2 capacity
  – 4756 job slots
  – 3510 TB
• This summer more compute nodes will be added
• Next 3 years more CPU and storage will be added according to ATLAS physics needs
• Resources to upgrade connectivity to 100 Gbps approved
MWT2, from 5-year proposal.
Capacities adjust to requirements, adjusted annually
CPU capacity in SpecINT 2006
Job slots in 2015 > 10K (original plan; expect to reach this in 2014)
BW/job slot ~ 10-500 Mbps depending on workload
Difficult to model resulting network load
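
The spread in per-slot bandwidth is what makes the load hard to model. A rough bound, as a minimal sketch: it assumes every job slot draws the same bandwidth at the same moment, which real workloads would not do, and the function name is illustrative only.

```python
def aggregate_gbps(job_slots: int, mbps_per_slot: float) -> float:
    """Aggregate demand in Gbps if every slot draws mbps_per_slot at once."""
    return job_slots * mbps_per_slot / 1000.0


slots = 10_000  # job-slot target quoted on the slide
low = aggregate_gbps(slots, 10)    # lightest workload quoted
high = aggregate_gbps(slots, 500)  # heaviest workload quoted
print(f"{low:.0f} Gbps to {high:.0f} Gbps")
```

Under these assumptions the bound spans a factor of 50, so the answer depends almost entirely on the workload mix.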
Evolving computing model & federation
• Over the past few years the rigid T0/T1/T2/T3 hierarchy has been changing into a flatter, mesh-like infrastructure
• Made possible by faster, reliable and affordable networks; new virtual peering projects such as LHCONE
• Offers opportunity to create new ways of accessing data from production and analysis jobs
  – E.g. remove the restriction that CPU and data be located at the same site
• Tools to allow jobs to flexibly reach datasets beyond the local site storage are being developed
A file is searched locally using the unique global name
If not found at the site the search is expanded to the region using a network of redirectors
File may be copied to the local storage, or read directly over the WAN
Network latency requires intelligent caching by the client if file is directly read
[Diagram legend: control, data]
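
The lookup order described on this slide can be sketched as follows. This is a toy model, not the federation software itself: the catalogs are plain dictionaries, and all names, paths, and the example URL are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# A lookup maps a global file name to a replica location, or None if unknown.
Lookup = Callable[[str], Optional[str]]


def locate(global_name: str,
           local: Lookup,
           redirectors: Iterable[Lookup]) -> Optional[Tuple[str, str]]:
    """Return (scope, location) for the first replica found, else None."""
    location = local(global_name)
    if location is not None:
        return ("local", location)
    # Not at the site: widen the search through the redirectors, in order.
    for redirector in redirectors:
        location = redirector(global_name)
        if location is not None:
            return ("remote", location)
    return None


# Hypothetical catalogs standing in for site storage and a regional redirector.
local_catalog = {"/atlas/file_a.root": "/pool1/file_a.root"}
regional_catalog = {"/atlas/file_b.root": "root://remote.example.org//file_b.root"}

print(locate("/atlas/file_a.root", local_catalog.get, [regional_catalog.get]))
print(locate("/atlas/file_b.root", local_catalog.get, [regional_catalog.get]))
```

When the result is remote, the caller still has to choose between copying the file to local storage and reading it directly over the WAN; the sketch leaves that decision out.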
Federation traffic
Early Tier 3 users + measurement jobs
Modest levels now; will grow when in production
700 MB/s
Comparing local to wide area performance
[Figure: ping time (ms) and read time (s); local access labelled in each plot]
Types of ATLAS network traffic (I)
• Intra-Tier2
  – Both AGLT2 and MWT2 are multi-site federations
  – Datasets resident at each site
  – Jobs at one site can read at another
  – Three transfer modes have been used:
    o Direct read access (file opened over the network)
    o File copied from the remote site to local worker node scratch disk
    o File read triggers a pool-to-pool replication to a local cache
Types of ATLAS network traffic (II)
• Tier1-Tier2
  – Replication of input data sets
  – Output of production datasets
• Tier2-Tier2
  – Driven by managed production tasks
• (Tier3, OSG, Cloud, HPC)-Tier2
  – Expect this mode to grow
  – As well as other resources – such as from Campus Grids, OSG, Cloud and HPC centers
AGLT2-MWT2
• Creation of low latency multi-federation
• 20k jobs reading 15000 TB
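
To give the last figure some scale, here is a back-of-the-envelope calculation. It is a sketch under stated assumptions: the 15000 TB is read as the total volume across all jobs, TB are decimal (10^12 bytes), and a single 100 Gbps link is taken as fully utilised with no protocol overhead. The function names are illustrative.

```python
def tb_per_job(total_tb: float, jobs: int) -> float:
    """Average volume read per job, in TB."""
    return total_tb / jobs


def days_to_transfer(total_tb: float, link_gbps: float) -> float:
    """Days to move total_tb over one link running at link_gbps, no overhead."""
    total_bits = total_tb * 1e12 * 8          # decimal TB -> bits
    seconds = total_bits / (link_gbps * 1e9)  # Gbps -> bits per second
    return seconds / 86_400


print(f"{tb_per_job(15_000, 20_000):.2f} TB per job")
print(f"{days_to_transfer(15_000, 100):.1f} days at 100 Gbps")
```

On these assumptions each job reads about 0.75 TB on average, and moving the full volume once over a saturated 100 Gbps link would take roughly two weeks.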