ATLAS Sites Jamboree 18 - 20 Jan 2017


Storage DT

•  Reminder from Alessandra
•  If storage goes in DT for >= 48h
   –  analysis queues will be set to brokeroff 120h before the DT and
   –  offline 72h before,
   –  while production queues will just be set offline 48h earlier.
•  The policies are here.
   –  https://twiki.cern.ch/twiki/bin/view/AtlasComputing/GridDataProcessing#Switcher
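The lead times above can be turned into a small schedule calculator. This is only a sketch: the function and key names are hypothetical, and it assumes all three lead times are counted back from the start of the downtime.

```python
from datetime import datetime, timedelta

# Lead times quoted on the slide, in hours before the downtime starts.
ANALYSIS_BROKEROFF_H = 120
ANALYSIS_OFFLINE_H = 72
PRODUCTION_OFFLINE_H = 48
MIN_DOWNTIME_H = 48  # the reminder applies to storage downtimes of >= 48h


def queue_schedule(dt_start, dt_end):
    """Return {action: datetime} for a storage downtime, or {} if it is shorter than 48h."""
    if dt_end - dt_start < timedelta(hours=MIN_DOWNTIME_H):
        return {}
    return {
        "analysis_brokeroff": dt_start - timedelta(hours=ANALYSIS_BROKEROFF_H),
        "analysis_offline": dt_start - timedelta(hours=ANALYSIS_OFFLINE_H),
        "production_offline": dt_start - timedelta(hours=PRODUCTION_OFFLINE_H),
    }


if __name__ == "__main__":
    start = datetime(2017, 3, 10, 8, 0)  # example date, not from the slides
    for action, when in queue_schedule(start, start + timedelta(hours=60)).items():
        print(f"{action:20s} {when:%Y-%m-%d %H:%M}")
```

A site planning a long storage intervention could use such a helper to see when its queues would start draining.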


Mature ATLAS Distributed Computing

•  Almost 3 weeks of "unattended" production over Christmas break
•  For the 1st time, the ATLAS production and data management system worked on its own
   –  Only a short glitch when a new DBRelease was installed without the setup.sh file in cvmfs
   –  No intervention was required from either central systems or from production managers
•  No reprocessing was running
•  But the derivation campaign is of the same order of complexity
•  No serious issues with sites. Big Thanks!!


Resources for 2017

            2016 pledges   2017 OLD   2017 Approved   2017 Pledges   Balance wrt pledge   Balance wrt 2016
T0 CPU               257        300             404            404                   0%                57%
T1 CPU               571        682             921            808                 -12%                42%
T2 CPU               633        846            1125            928                 -18%                47%
SUM CPU             1461       1828            2450           2140                 -14%                46%
T0 DISK               17         20              25             25                   0%                18%
T1 DISK               52         57              68             69                   1%                33%
T2 DISK               68         78              83             78                  -6%                15%
SUM DISK             137        155             176            172                  -2%                26%
T0 TAPE               42         53              77             77                   0%                83%
T1 TAPE              119        173             188            174                  -7%                46%
SUM TAPE             161        226             265            251                  -5%                56%
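The two balance columns appear to be relative differences of the 2017 pledge, first with respect to the approved 2017 figure and then with respect to the 2016 pledge. That reading is an assumption, checked here against three rows only; the function name is hypothetical.

```python
def relative_balance(value, reference):
    """Relative difference of value with respect to reference, in whole percent."""
    return round(100 * (value - reference) / reference)


# (2016 pledge, 2017 approved, 2017 pledge) for three rows of the table
rows = {
    "T1 CPU": (571, 921, 808),
    "T2 CPU": (633, 1125, 928),
    "T1 DISK": (52, 68, 69),
}

for name, (pledge_2016, approved_2017, pledge_2017) in rows.items():
    wrt_approved = relative_balance(pledge_2017, approved_2017)
    wrt_2016 = relative_balance(pledge_2017, pledge_2016)
    print(f"{name:8s} {wrt_approved:+d}% {wrt_2016:+d}%")
```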


•  From Simone Campana's talk:
   –  Flat budget foresees +20% CPU and +15% storage per year. We get much more than that
   –  We are short by only 2% in storage and 14% in CPU. FAs did listen to us (we asked to invest in storage)
   –  Some FAs did not pledge more but said they will provide more
   –  What was "opportunistic" last year might have been pledged this year (so we might see fewer opportunistic resources)
   –  Some agencies/sites invested in disk, some in CPU, as we asked one year ago already
   –  I think the situation is rather good. Many thanks to all FAs for their effort


Summary of recommendations for sites (David Cameron)

•  Worker nodes
   –  Memory - still 2GB/core
      •  Don't kill on vmem
   –  Scratch - 20GB/core or 100GB/8 cores
   –  Network – 0.25MB/s/core
   –  Software – HEP_OSlibs and cvmfs is all you need
•  OS/Virtualisation/Containers
   –  CentOS7 is still not recommended :(
      •  ATLAS is ready but middleware (EMI WN) is not
   –  Many (most?) sites using some virtualisation like Openstack
      •  We don't see this so don't worry about it
   –  Containers:
      •  Singularity looks promising, useful to manage CentOS7 transition
•  Batch system/CE
   –  A modern batch system (SLURM, HTCondor) makes things easier for ATLAS
   –  Requested fairshares
      •  Analysis: T2: 25%, T1: 5%
      •  Production: T2: 75%, T1: 95%
         –  SCORE: 20%
         –  MCORE: 80%
   –  Dynamic SCORE/MCORE partitions are recommended
   –  CEs
      •  HTCondor-CE (US) and ARC CE (rest of world) are becoming standards
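As an illustration of the requested fairshares, the sketch below splits a site's core count into target shares. It assumes the SCORE/MCORE percentages apply within the production share, which is how the slide nests them; the function and key names are hypothetical.

```python
# Requested fairshares from the slide, in percent.
SHARES = {
    "T2": {"analysis": 25, "production": 75},
    "T1": {"analysis": 5, "production": 95},
}
PRODUCTION_SPLIT = {"score": 20, "mcore": 80}  # assumed to be fractions of production


def target_cores(total_cores, tier):
    """Split a core count into analysis, single-core and multi-core production targets."""
    share = SHARES[tier]
    production = total_cores * share["production"] / 100
    return {
        "analysis": total_cores * share["analysis"] / 100,
        "production_score": production * PRODUCTION_SPLIT["score"] / 100,
        "production_mcore": production * PRODUCTION_SPLIT["mcore"] / 100,
    }


if __name__ == "__main__":
    print(target_cores(1000, "T2"))
```

These are long-term targets rather than static partitions, in line with the recommendation for dynamic SCORE/MCORE partitioning.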


Summary of recommendations for sites (David Cameron)

•  DDM/Storage
   –  Consolidation of small storage is encouraged
      •  Use an alternative remote SE
      •  Or become a cache site (xrootd/ARC cache)
   –  Non-SRM disk SE is now possible
   –  ATLAS still asks for GridFTP, HTTP and Xrootd to be supported
   –  A standard space reporting method is evolving
   –  Tapes
      •  We will work on file sizes
•  Networking
   –  No significant changes in usage foreseen for the rest of Run-2
   –  Usage is heavily influenced by job brokering strategy, recently this has improved a lot
   –  MONARC has really gone, site "closeness" is based on actual measurements
   –  For Run-3, increase will be required, 100Gbps is probably ok for Tier-1s
   –  IPv6
      •  Sites may provide IPv6-only worker nodes after 1st April 2017
      •  All ATLAS services will be dual-stack by then
      •  Full site IPv6-only: probably 4 years away at least
•  Monitoring
   –  Current dashboards will stay until new infrastructure is ready
      •  But please try to use the new one before you are forced to
   –  Heavy development and commissioning of new dashboards ongoing
      •  New framework is very flexible (maybe too flexible)
      •  Custom dashboards are useful but 1 per person is probably too much
      •  ADC will work on official/validated dashboards
•  Harvester/Event service
   –  ATLAS wants to better use the sites
   –  For this it needs more information
      •  Info will be taken from the CE or pilot
      •  This will be easier on the more modern CE/batch systems
   –  One consequence should be fewer (visible) panda queues per site
      •  Ideally one
   –  Event service
      •  Currently being commissioned on Grid sites
      •  If you have preemptable queues participation is encouraged


IPv6 Status and Plans (Alastair Dewhurst)

•  Dual Stack Storage
   –  All sites are encouraged to upgrade their storage to dual stack:
      •  10+ sites already upgraded.
      •  [email protected].
   –  Only CERN FTS is configured to allow FTS transfers via IPv6 currently.
      •  BNL will upgrade early February 2017.
      •  RAL will upgrade summer 2017
   –  If a site upgrades to dual stack contact DDM support so they can switch you to CERN FTS.
•  IPv6 only CPU
   –  From April 2017, sites can provide their CPU resources as IPv6 only.
   –  If a site wants to do this, please get in contact in advance.
   –  By 2018 we would hope this is transparent.
   –  QMUL and Brunel have IPv6 only CPU already.
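A quick way to see whether a storage endpoint is published as dual stack is to look at the address families returned by DNS. The sketch below uses only the Python standard library; the helper names are hypothetical, and having both A and AAAA records does not by itself prove that the service answers over IPv6.

```python
import socket


def classify_stack(addrinfo):
    """Classify getaddrinfo() results as dual-stack, IPv4-only, IPv6-only or unresolved."""
    families = {entry[0] for entry in addrinfo}
    has_v4 = socket.AF_INET in families
    has_v6 = socket.AF_INET6 in families
    if has_v4 and has_v6:
        return "dual-stack"
    if has_v6:
        return "IPv6-only"
    if has_v4:
        return "IPv4-only"
    return "unresolved"


def host_stack(hostname, port=443):
    """Resolve hostname and classify it; DNS failures count as unresolved."""
    try:
        info = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return "unresolved"
    return classify_stack(info)
```

Calling `host_stack()` on a site's storage head node (the host name is up to the site) gives a first indication before contacting DDM support.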


IPv6 Status and Plans (Alastair Dewhurst)

•  ATLAS status
   –  Assuming site has dual stack storage. WN will talk to the following central nodes:
      •  For Panda:
         –  Production Panda Servers: aipanda03[0-7].cern.ch
      •  For Rucio
         –  Auth nodes: rucio-auth-prod-0[1,2].cern.ch
         –  Prod nodes through 3 HAproxy frontends rucio-lb-prod-0[1-3].cern.ch
      •  All use http(s).
   –  Rucio
      •  At the end of last year, Rucio team migrated all nodes to CC7.
      •  Enabled IPv6 at the same time.
      •  All required nodes are now done!
   –  Rucio UI web front (rucio-ui.cern.ch) also made dual stack.
   –  Panda
      •  Panda production nodes still IPv4 only.
      •  aipanda007.cern.ch (dev node) is dual stack
   –  Pilots running against Brunel/QMUL.
   –  Debugging problems with pilot code.
      •  Still believe we can meet April 2017 deadline.
   –  Other services (Frontier, APF, AGIS)
      •  Plan to upgrade by 2018.
•  IPv6 deployment plan will be updated in 2018.
•  Allow sites to completely migrate to IPv6 by start of Run 3 (2021)?
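The bracket notation used for the central nodes above can be expanded into explicit host names, for instance to prepare firewall rules or connectivity tests. This is a sketch that assumes each bracket holds either a digit range (`0-7`) or a comma-separated list (`1,2`); the function names are hypothetical.

```python
import re
from itertools import product

_GROUP = re.compile(r"\[([^\]]+)\]")


def _expand_group(spec):
    """Expand '0-7' or '1,2' into a list of strings."""
    values = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            values.extend(str(n) for n in range(int(first), int(last) + 1))
        else:
            values.append(part)
    return values


def expand_hosts(pattern):
    """Expand bracket groups such as 'aipanda03[0-7].cern.ch' into host names."""
    # re.split with a capturing group yields: literal, group, literal, group, ...
    pieces = _GROUP.split(pattern)
    choices = [
        _expand_group(piece) if index % 2 else [piece]
        for index, piece in enumerate(pieces)
    ]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name in expand_hosts("rucio-lb-prod-0[1-3].cern.ch"):
        print(name)
```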


CentOS7 (Alessandra Forti)

•  CentOS7 native
   –  Work on release 21 started after the summer 2016 Physics validation
   –  Once C7 releases are available both platforms will be used until the end of Run 2
   –  C7 releases cannot be validated on SL6 nodes
   –  Resources cannot be mixed behind the same Panda Queues
•  Current testing
   –  need to create a new Panda Site for testing
   –  add SL7 to the queue names to easily identify them
   –  SW validation
   –  HC tests
•  Middleware status
   –  OSG distributes CentOS7 middleware
   –  EGI now has an rpm in the UMD testing repositories
   –  MWREADY-135
   –  UMD CentOS7 testing repository
   –  YUM repo file
   –  No need to go through recipes to get the rpms in place
   –  It needs to be tested by sites
      •  TRIUMF started to look into it
   –  Tarball version of the UMD rpm available in CVMFS
      •  /cvmfs/grid.cern.ch/centos7-wn-preview-v01


CentOS7 (Alessandra Forti)

•  Containers & Virtualisation
   –  RAL is moving towards running services inside containers, controlled by Mesos.
      •  Some services already migrated (FTS and Squid).
   –  Batch Farm will be entirely migrated by April next year.
   –  C7 machines will run "SL6 WN" inside containers.
   –  ATLAS, CMS, LHCb and ALICE jobs work
•  HOWTO migrate
   –  It is recommended to keep the SL6 and C7 resources separated even now
   –  Big bang upgrade: declare a downtime and come back with C7 worker nodes behind the same Panda Site and Panda Queues.
      •  sw releases will have to be wiped and revalidated
   –  SL5 releases will not be reinstalled
   –  Rolling upgrade: declare an "at risk" DT, create a new Panda Site with new master + slaves queues.
   –  sw releases will automatically be validated
   –  In either case the migration has to be communicated to ATLAS ([email protected])
   –  Sites using SL6 containers or VM don't need to announce
•  Recommended?
   –  Moving the WNs to CentOS7 is NOT yet recommended on EGI resources as the middleware is not in good shape yet.
      •  Testing is appreciated and would speed up its release though
   –  ATLAS SL6 applications running in compatibility mode have been validated and sites that have to move can move
   –  If you have to move please don't set up the nodes without telling ATLAS.
   –  https://twiki.cern.ch/twiki/bin/view/AtlasComputing/CentOS7Readiness
   –  Ask Alessandra


Lightweight Sites (Cedric Serfon)

•  Solutions for lightweight sites
   –  Different possibilities for lightweight sites:
•  Storage-less site
   –  Having a local Storage Element is not a requirement to run production jobs:
      •  Even easier now with new sitemover
      •  2 sites already in the process of migrating to storage-less: RO-14 and RO-16
      •  More candidates would be good. DDM ops will help in decommissioning the endpoints
•  Site using cache (arc-cache or xrootd cache)
   –  Arc cache
      •  Rucio supports the concept of caches which are controlled outside itself and may not be consistent.
      •  Arc CE cache can publish its content to Rucio through add/delete messages.
      •  The cache service can create dumps of cache content, and a separate script runs periodically to calculate the differences and send messages to Rucio.
      •  The cache RSEs are associated to the CE's PanDA queue and so PanDA can broker jobs to queues where the data is cached.
      •  Arc cache performances
         »  Queue using Arc cache running for many months in Durham
         »  Efficiency of queues using Arc-cache and local storage very similar (arc cache queue even a bit better)
   –  xrootd cache
      »  Squid like cache proxy on surface
         •  Use disk to cache the data
         •  Work around firewall
         •  Easy to use: http proxy, in future root proxy
         •  Different under the hood
         •  For static large files
         •  Multi-thread to handle data intensive load
         •  Capable of both whole file caching and file block caching
         •  Protocol to client: xroot and http
•  Distributed Storage
   –  A la Nordugrid. Works transparently for Nordic sites (dCache) for years
   –  Never tested for DPM but should be doable on well connected sites:
      »  One site runs the headnode + some disk nodes. Other sites only run disk nodes
      »  Probably needs some work on the firewall rules
   –  If one wants to consolidate existing sites, non trivial to merge the DB from the different sites (remember LFC consolidation)
•  Federated Storage
   –  WebDAV federation: DynaFed
   –  This is not distributed storage, but a hosted service is providing the federation
   –  WebDAV is now supported both in ROOT (TDavixFile) and DDM level (Rucio+Metalink)
   –  Provide single volatile RSE as "entry point" to the Federation
   –  Federated storage systems must not be registered as RSEs separately
      »  Double accounting of data
      »  Unavoidable deletion/transfer race conditions
   –  Allows the transparent use of Microsoft Azure and Amazon-style S3 cloud storages
   –  Private keys under control of site, does not need to be published to ATLAS
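The ARC cache bookkeeping described above (periodic dumps of the cache content, then a script that computes the differences and notifies Rucio) can be sketched as a set comparison. The message layout below is purely illustrative and is not the actual Rucio message schema; the function name is hypothetical.

```python
def cache_diff_messages(previous_dump, current_dump):
    """Compare two cache content dumps and return add/delete messages."""
    previous, current = set(previous_dump), set(current_dump)
    # Files that appeared since the last dump are announced as additions...
    messages = [{"operation": "add", "file": name} for name in sorted(current - previous)]
    # ...and files that disappeared (evicted from the cache) as deletions.
    messages += [{"operation": "delete", "file": name} for name in sorted(previous - current)]
    return messages


if __name__ == "__main__":
    old = ["data.0001.root", "data.0002.root"]  # made-up file names
    new = ["data.0002.root", "data.0003.root"]
    for message in cache_diff_messages(old, new):
        print(message)
```

Because the cache is managed outside Rucio, such messages only bring the catalogue closer to the cache state; they do not guarantee consistency between two runs of the script.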


Lightweight Sites (Cedric Serfon)

•  SRM-less storage
   –  Upload/download in the pilot via rucio upload allows use of non-SRM protocols (gsiftp, xroot, WebDAV)
   –  3rd party transfer with gsiftp, xrootd validated. Still need to validate WebDAV
•  Proposal
   –  Very small sites (<100TB):
      •  Keep the panda queues
      •  Decommission the Storage element
      •  Possibility to set up a cache (arc/xrootd)
   –  Small sites (>100TB and <400TB):
      •  Try to consolidate with one close site
      •  Or go to federated storage
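The proposal above amounts to a simple classification by storage capacity. In this sketch the handling of exactly 100TB (treated as "small") and of sites at or above 400TB (reported as outside the proposal) are assumptions, since the slide does not cover those cases; the function name is hypothetical.

```python
def storage_recommendation(capacity_tb):
    """Return (category, options) for a site with the given disk capacity in TB."""
    if capacity_tb < 100:
        return "very small", [
            "keep the panda queues",
            "decommission the storage element",
            "optionally set up a cache (arc/xrootd)",
        ]
    if capacity_tb < 400:
        # Assumption: a site with exactly 100TB is treated as small.
        return "small", [
            "consolidate with one close site",
            "or go to federated storage",
        ]
    # Assumption: the proposal says nothing about larger sites.
    return "outside the proposal", []


if __name__ == "__main__":
    for size in (60, 250, 900):
        category, options = storage_recommendation(size)
        print(size, category, options)
```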


Recommendations for sites (Alessandra Forti)

•  Memory
   –  from cgroups RSS
      •  Smaps PSS: physical memory used by a job without double counting
      •  cgroups RSS: physical memory used by the jobs without double counting
•  What do batch systems do?
   –  Batch systems without cgroups
      •  See the same RSS as reported in smaps
      •  Kill on vmem which is NOT a physical memory measure
      •  If you insist on this you need to set it to at least 3 times the RAM requested by the job
      •  If you kill with the scheduler it is likely to have the same problem
   –  Sites with cgroups
      •  Can set up soft and hard limits on the values the job reports
      •  Soft limit allows the kernel to decide if the job can keep on using the extra RAM or has to swap
      •  Hard limit will kill the job based on RAM
      •  Often set to 2 or 3 times the RAM requested by the job
•  Which batch system
   –  ATLAS recommends sites move to a BS supporting cgroups and other more modern features.
      •  HTCondor
      •  SLURM
•  CEs
   –  ARC-CE
      •  Most used at new sites and sites moving to HTCondor at EGI sites
      •  Well integrated with SLURM
   –  HTCondor CE
      •  Most used in the US
      •  If you have an HTCondor batch system it is just an additional layer of configuration
   –  CREAM-CE
      •  Most used in EGI for legacy reasons but ATM is best integrated with older batch systems like torque/maui and SGE
      •  If you change batch system you may want to consider reviewing also your CE
•  Shares at sites
   –  Analysis: 25%
   –  Production: 75%
      •  SCORE: 20%
      •  MCORE: 80%
   –  However this share is not constant in time
      •  Static partition setup is NOT recommended
      •  Reminder that recipes for more dynamic approaches for 3 batch systems can be found at
      •  https://twiki.cern.ch/twiki/bin/view/LCG/DeployMultiCore#Batch_system_related_information
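The memory guidance on this slide can be summarised as a few multiples of the RAM requested by the job. The sketch below is illustrative only: the key names are hypothetical, and placing the soft limit at the requested RAM is an assumption, since the slide gives factors only for the hard limit (2 or 3 times) and for a vmem kill threshold (at least 3 times).

```python
def memory_limits_mb(requested_ram_mb, hard_factor=2):
    """Suggest memory limits (in MB) for a job requesting requested_ram_mb of RAM."""
    if hard_factor not in (2, 3):
        raise ValueError("the slide quotes a hard limit of 2 or 3 times the requested RAM")
    return {
        # Assumption: the soft limit sits at the requested RAM itself.
        "cgroup_soft_mb": requested_ram_mb,
        # Hard limit: the job is killed based on RAM above this value.
        "cgroup_hard_mb": hard_factor * requested_ram_mb,
        # Only for sites that insist on killing on vmem: at least 3x the request.
        "vmem_kill_min_mb": 3 * requested_ram_mb,
    }


if __name__ == "__main__":
    print(memory_limits_mb(2000))
```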


Recommendations for sites (Alessandra Forti)

•  WN Hardware
   –  About 20GB of disk scratch space
   –  For an 8 core MCORE slot, ~80-100GB is sufficient
   –  At least 2GB of (physical) RAM
      •  Having 3-4GB would be beneficial
      •  Enough swap space such that RAM + swap >= 4GB
   –  As a rule of thumb, about 0.25Gbit/s of network bandwidth
      •  Might want higher for more powerful CPUs.
•  Storage
   –  Change of storage topology
   –  Bigger sites (T1 and T2) with satellites independently from location
   –  Evolution of sites towards caches or federations
   –  Consolidate storage
      •  75% of storage at ~30 sites
      •  Small sites <400TB discouraged from buying storage unless they can go above or aggregate with other sites
•  Squid
   –  Condition data and software are accessed through squid
   –  Frontier & CVMFS
   –  Sites are requested to install at least one squid
      •  Two for redundancy and load balancing
   –  Frontier squid or OS squid?
      •  Frontier squid has some patches to boost performance. It is also a higher version with bug fixes.
      •  OS squid is easier to maintain because it is there by default.
      •  T2s can get away with the OS squid but ATLAS recommends using the Frontier version
   –  Monitoring:
      •  http://wlcg-squid-monitor.cern.ch/snmpstats/indexatlas.html
•  Traceability
   –  Glexec has been dropped
   –  WLCG Traceability TF working on other tools and models
      •  Singularity container solution being tested at CERN and in OSG
      •  1 single executable doesn't need a daemon
      •  Can isolate payload from pilot environment
      •  Cannot do traceability, that will have to be done at VO level
         –  ATLAS already does this
            »  Site: time/(IP|host) -> VO
            »  VO: time/host -> user+payload
      •  Being integrated in slurm and HTCondor
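The WN hardware rules of thumb at the top of this slide can be checked mechanically. This sketch assumes the RAM, swap and scratch figures are meant per core (per single-core job slot), which the slide does not state explicitly; it does not model the relaxed ~80-100GB figure for 8 core MCORE slots. The function name is hypothetical.

```python
# Rules of thumb from the slide, assumed here to be per core.
MIN_RAM_GB = 2
MIN_RAM_PLUS_SWAP_GB = 4
MIN_SCRATCH_GB = 20


def check_worker_node(cores, ram_gb, swap_gb, scratch_gb):
    """Return a list of shortfalls of a worker node against the rules of thumb."""
    problems = []
    if ram_gb < MIN_RAM_GB * cores:
        problems.append(f"RAM below {MIN_RAM_GB}GB per core")
    if ram_gb + swap_gb < MIN_RAM_PLUS_SWAP_GB * cores:
        problems.append(f"RAM + swap below {MIN_RAM_PLUS_SWAP_GB}GB per core")
    if scratch_gb < MIN_SCRATCH_GB * cores:
        problems.append(f"scratch below {MIN_SCRATCH_GB}GB per core")
    return problems


if __name__ == "__main__":
    print(check_worker_node(cores=8, ram_gb=12, swap_gb=4, scratch_gb=100))
```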


Recap – AFS phaseout @ CERN

•  Driven by slow demise of upstream project
•  No "hard" deadline but expect to finalize in LS2 (~4Q2018)
•  Docs:
   •  https://twiki.cern.ch/twiki/bin/view/IT/AfsPhaseout
   •  https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/AtlasAFSPhaseOut
•  2017 plans – EOS
   –  Known problems:
      •  Number of files per instance (AFS: 3.5GB) → new EOS namespace, ETA 3Q2017
      •  "Small file" performance (create: untar, compile) → EOS FUSE rewrite, ETA 2Q2017
   –  Also:
      •  Roll-out of "citrine" EOS branch
      •  Upgrade to CC7
•  Recap 2016 AFS phaseout - "easy"
   –  Targets:
      •  Software → CVMFS
      •  Web sites → EOSWEB (setup, start)
      •  FUSE access for PLUS & BATCH
   –  Start on /project migration
   –  https://its.cern.ch/jira/browse/NOAFS
•  2017 – external AFS disconnection test
   •  2017-02-15 09:00 CET; 24h; ITSSB entry
   •  Goals: flush unknown AFS dependencies + create awareness
      –  Experiment setup - still refer to CERN AFS?
      –  Site setup:
         •  prefer AFS over CVMFS?
         •  /afs/cern.ch user home directories? Talk to us. Before.
      –  User setup:
         •  Prefer AFS over CVMFS, EOS?
         •  (Nudge towards alternatives)
   •  Will repeat, will eventually close completely (early if security issues)
•  2017 plans – AFS "harder stuff"
   •  (continue with 2016)
      –  Project migrations
      –  Actually remove software from AFS (ATLAS problem: CMT nightlies)
      –  Web space migration (also for user pages) – ATLAS Web site(s)?
   •  /work:
      –  2Q2017: Stop self-service, create via tickets (justification? things not working on EOS)
      –  4Q2017: no longer create AFS workspaces (per-experiment decision), start migration
   •  /user: groundwork for later $HOME
      –  Split "UNIX" account from "AFS" account
      –  Allow non-AFS home directories in LDAP
      –  "HEPIX scripts" - ATLAS: take from CVMFS?
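Ahead of the external AFS disconnection test, a site or user can look for leftover references to CERN AFS in setup scripts and configuration files. The sketch below scans lines of text for /afs/cern.ch paths; the function name is hypothetical and the path pattern is a simple heuristic that will miss indirect references such as symlinks or variables defined elsewhere.

```python
import re

# Heuristic: /afs/cern.ch followed by typical path characters.
AFS_PATH = re.compile(r"/afs/cern\.ch[\w./+-]*")


def find_afs_references(lines):
    """Return (line_number, path) pairs for every /afs/cern.ch path found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in AFS_PATH.finditer(line):
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_afs.py setup.sh other_config ...
    for filename in sys.argv[1:]:
        with open(filename, errors="replace") as handle:
            for number, path in find_afs_references(handle):
                print(f"{filename}:{number}: {path}")
```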
