21 May 2015 Edgar Fajardo On behalf of OSG Software and ...

21 May 2015

Shooting for the sky: Testing the limits of condor

1

Edgar FajardoOn behalf of OSG Software and Technology

HTCondor Week 2015

HTCondor Week 2015

Acknowledgement

2

Although I am the one presenting. This work is a product of a collaborative effort from:

• HTCondor development team• GlideInWMS development team• HTCondor UW CHTC (A Moate), UCSD T2 (Terrence M) and CMS FNAL T1 (Burt H), who provided the hardware for the testbed.

• CMS Computing and WLCG sites provided the worker node resources.

• OSG Software team.

HTCondor Week 2015

Vanilla Condor on a Slide

3

Legend:Schedd

Central Manger

Worker node Proudly providing HTC for more than 20 years

HTCondor Week 2015

GlideInWMS on a slide

4

Legend:Schedd

Central Manger

Happily working for more than 6

years !!!

Worker node

HTCondor Week 2015

Vanilla HTCondor vs GlideInWMS in

the numbers

5

Vanilla Condor GlideinWMS

Number of nodes O(1k) O(10k)

# Different types of machines O(10) O(100)

Location of Schedds and WN Private WAN

HTCondor Week 2015

The Challenge:

6

Wow, We would expect so. Nobody has tried

that big of a pool

CMSHi OSG Software folks

For Run II, we would like to have a single global pool of 200,000 running jobs, u think Condor can handle it?

Nobody has found the Higgs beforewe and ATLAS did

Touche

Just 20k? Didn’t u say u want a pool of 200k? Nvm we will figure it out.

Btw, u can use some of my WLCG resources to test. About 20k slots

HTCondor Week 2015

The Challenge in the numbers

7

CMS Requirements for LHC Run II

# Startd 200,000

Autoclusters (types of machines) 600

Number of Schedds <5

High Availability YES !!!

Challenge Accepted !!!

HTCondor Week 2015

Without using your credit card

How to get 200k slots?

8

Our Solution: The UberGlideIn

Gathering from a commercial cloud for example*:

*At spot pricing and without using the Uber GlideIn

HTCondor Week 2015

Uber GlideIn

9

GlideIn

1 Core

One Master

One Startd

Normal GlideIn UberGlideIn

GlideIn

1 Core

Master 1

Startd_1 Stard_2 Startd_3 Startd_n

Master 2 Master 3 Master n

pilot_cpu_benchmark.py

This works because jobs are sleep jobs.

HTCondor Week 2015

Moreover we wanted them

distributed all over

10

Why?In production, the network

latency of having Startd all over the WAN is known to cause problems. See Bradley et all

http://iopscience.iop.org/1742-6596/219/6/062036/

HTCondor Week 2015

Does it work?

11

~20k real core slots

150k Startd

Yes it does !!!

HTCondor Week 2015

Well most of the time it works:

12

Ohh dude, sorry about that. We will fix it

Hi OSG Software folks

About those tests you are running at our site. Just FYI u

brought down our firewall, while using > 10k ports

Mental note: Talk to the HTCondor dev team to reduce the long lived TCP connections from the WN to the outside (Collector, Negotiator …)

PIC Admin

HTCondor Week 2015

Now we have the resources, lets test:

13

Test

Report/ Brainstorm

FixGanglia features was key. THAnKS !!!GANGLIAD_REQUIREMENTS = MyType =!= "Machine"

For ~ 3 months

HTCondor Week 2015

Did it work? YES !!!

14

One full weekend

HTCondor Week 2015

“It was not all a bed of roses”, maybe it

was…

15

Max parallel Running Jobs single HTCondor Pool

Max

Par

alle

l Run

ning

Job

s

0

50,000

100,000

150,000

200,000

8.3.0

8.3.1

8.3.2200000

150000

92000

HTCondor Week 2015

• Non blocking GSI authentication at the Collector• Shared port at the worker node. In general reduce #

of long lived TCP connections.• Removed file locking at the Schedd• Reduced incoming TCP connections at the Schedd• Batched resources request from the Collector

HTCondor Improvements

16

For more details see Todd’s talk on What’s new in HTCondor 2015?

HTCondor Week 2015

Scale Improvements throughout “History”

17

Max parallel Running Jobs single HTCondor Pool in latest series

Max

Par

alle

l Run

ning

Job

s

0

50,000

100,000

150,000

200,000

HTCondor Series

7.0.1

7.1.3

7.3.1

7.5.5

8.3.0

8.3.1

8.3.2

200000

150000

9500090000

2000010000

2000

CHEP 08

CH

EP 09CHEP 2015

HTCondor Week 2015

Ahh, One more thing …

18

Sounds good, Which levels are u looking for?

Hi Edgar,

Since u and Jeff are working on the scaling tests, what about we scale test our new rockstar: The

HTCondor CE.

Are u kidding me? That is twice as much of what any OSG site needs?

Brian B

About the size of UNL sleeper pool ~16k parallel running jobs?

HTCondor Week 2015

Step back: What is a CE?

19

HTCondor Week 2015

HTCondor-CE in a slide

HTCondor Case

HTCondor-CE Schedd

HTCondor Schedd

Job Router Transform

CE Job

HTCondor Job (vanilla)

Submit Host

HTCondor Schedd

Job (grid universe)

HTCondor-C submit

20

PBS Case

HTCondor-CE Schedd

PBS

Job Router TransformGrid Job

Routed Job (grid uni)

PBS Job

blahp-based transform

Submit Host

HTCondor Schedd

Job (grid universe)

HTCondor-C submit

HTCondor-CE

HTCondor

HTCondor-CE

PBS

HTCondor Week 2015

Condor + Configuration

What is HTCondor-CE?

21

For more details see OSG AHM 2015 or CHEP 2015 talks.

https://indico.fnal.gov/getFile.py/access?contribId=32&sessionId=15&resId=1&materialId=slides&confId=8580

https://indico.cern.ch/event/304944/session/4/contribution/519/material/slides/0.pdf

HTCondor Week 2015

Did we make it?YES !!!

22

Better than any Gram5 known to date

UNL sleeper batch started to swap

16 k Jobs achieved

After some development cycles:

HTCondor Week 2015

HTCondor CE In the Numbers:

23

HTCondor-CE GRAM 5

Best max running jobs 16k* 10k

Network Port usage(per running job 2 4

StartUp Rate 70 jobs/min 55 jobs/min*

*Disclaimer: This tests were done on different hardware with 5 years in between them.

HTCondor Week 2015

• “The OSG Software team, in conjunction with HTCondor and GlideinWMS development teams have

collaborated to push the scalability limits of a single HTCondor pool “

• The HTCondor-CE is ready to rock and roll

Conclusions

24

HTCondor Week 2015

1-900-scale-masters

Questions?

25

Contact us at:

HTCondor Week 2015

Just Kidding

26

[email protected]

Contact us:

mailto:[email protected]

21 May 2015 Edgar Fajardo On behalf of OSG Software and ...

Documents

Transcript of 21 May 2015 Edgar Fajardo On behalf of OSG Software and ...