Wedding convenience and control with RemoteCondor
![Page 1: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/1.jpg)
Apr 2012 Remote Condor 1
UCSD HEP Group Trainings
Wedding convenience and control with RemoteCondor
by Igor Sfiligoi
RemoteCondor co-developed with J. Dost
UC San Diego
![Page 2: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/2.jpg)
The Condor Batch System
● Condor is a Workload Management System
  ● i.e. a batch system
● Strong points
  ● Fault tolerant
  ● Robust feature set
  ● Flexible
● Large community base
  ● Both commercial and scientific
http://research.cs.wisc.edu/condor/
![Page 3: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/3.jpg)
Condor Architecture
● Clearly separates
  ● Resource providers: machines (aka worker nodes) offering CPUs, memory, IO, ...
  ● Resource consumers: job queues (aka submit nodes) holding jobs submitted by users
● Each has a daemon process to represent it
  ● Startd for resource providers
  ● Schedd for resource consumers
● A central service connects them all
  ● Managed by a Collector/Negotiator pair
![Page 4: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/4.jpg)
Condor Architecture in a picture
[Diagram: several Schedd and Startd daemons, with the Collector/Negotiator as the central service]
![Page 5: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/5.jpg)
The truth about submit nodes
● Corollary
  ● The submit node is a server!
● There is no real “Condor client”
  ● The cmdline tools are just a convenience to talk to the daemon process
[Diagram: submit node hosting the Schedd together with condor_submit and condor_q; Collector/Negotiator; Startd]
![Page 8: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/8.jpg)
Implications
● Being a server has several implications
  ● Security implications
    ● Will have incoming connectivity
    ● All security configuration on the submit node
    ● Submit node controls user authentication and authorization
  ● Unfriendly to non-dedicated hardware
    ● Requires always-on operation
    ● Must be on a public and static IP address
High exploit risk
Requires high trust between all nodes in the cluster
Impossible to use on a laptop
Not suitable for an unmanaged user machine
![Page 11: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/11.jpg)
What are the alternatives?
● Out of the box, Condor provides
  ● Remote submission
  ● Condor-C
● In the contrib section, you can find
  ● RemoteCondor (this presentation argues that this is the best solution)
So what is wrong with remote submission and Condor-C?
![Page 12: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/12.jpg)
Remote submission
● Essentially, connecting to a remote Schedd
  ● condor_submit -remote … + condor_transfer_data, and
  ● condor_q -name ..., condor_rm -name ..., …
● So no daemon processes on the submit node
  ● A true client solution!
[Diagram: condor_submit, condor_q and condor_transfer_data on the submit node; Schedd on the Schedd node, reached through an "Auth" step; Collector/Negotiator; Startd]
http://research.cs.wisc.edu/condor/manual/v7.6/condor_submit.html
http://research.cs.wisc.edu/condor/manual/v7.6/condor_transfer_data.html
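As a rough illustration of the client-only workflow on this slide, the sketch below assembles the command lines a user would issue against a remote Schedd. It only builds argument lists and does not contact any Schedd. The Schedd name, submit file name and job id are hypothetical placeholders; the `-remote` and `-name` options are the ones named on the slide.

```python
from typing import List


def remote_submit_cmd(schedd: str, submit_file: str) -> List[str]:
    """Command line that queues a job on a remote Schedd."""
    return ["condor_submit", "-remote", schedd, submit_file]


def remote_query_cmd(schedd: str) -> List[str]:
    """Command line that polls the remote queue for job status."""
    return ["condor_q", "-name", schedd]


def remote_fetch_cmd(schedd: str, job_id: str) -> List[str]:
    """Command line that pulls the job output back to the submit node."""
    return ["condor_transfer_data", "-name", schedd, job_id]


def remote_remove_cmd(schedd: str, job_id: str) -> List[str]:
    """Command line that removes a job from the remote queue."""
    return ["condor_rm", "-name", schedd, job_id]


if __name__ == "__main__":
    schedd = "schedd.example.edu"  # hypothetical Schedd name
    job_id = "42.0"                # hypothetical cluster.proc id
    steps = [
        remote_submit_cmd(schedd, "job.sub"),
        remote_query_cmd(schedd),
        remote_fetch_cmd(schedd, job_id),
    ]
    for cmd in steps:
        print(" ".join(cmd))
```

Every step names the Schedd explicitly, which is the visible cost of having no local daemon to talk to.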
![Page 13: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/13.jpg)
So, what's the problem?
● No local user log file
  ● Must use condor_q to monitor progress
  ● Annoying at best
  ● High monitoring load
  ● And it does not work with DAGMan
● Fully Condor-based user authentication
  ● While rich, not what users expect (e.g. no user/password)
  ● Hard to tie into campus-wide auth
● Staged input data not shared
  ● Could be a problem with large datasets
![Page 14: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/14.jpg)
Condor-C
● Based on the Grid paradigm
  ● Submit locally, then delegate to remote Schedd
● Still running a daemon process
  ● But requires no incoming connections
  ● Secure
  ● Laptop friendly
[Diagram: condor_submit and condor_q with a local Schedd on the submit node; a second Schedd on the Schedd node, reached through an "Auth" step; Collector/Negotiator; Startd]
http://research.cs.wisc.edu/condor/manual/v7.6/5_3Grid_Universe.html#sec:Condor-C
![Page 15: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/15.jpg)
What are the drawbacks?
● Awkward syntax
  ● At least compared to Vanilla universe
  ● See the Condor manual for examples
● Has scalability problems
  ● Could likely be improved, but this is the current state of the art
● Fully Condor-based user authentication (same as remote submission)
● Staged input data not shared (same as remote submission)
Can be mitigated with Job Router (but adds another layer of complexity)
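To give a feel for the syntax difference, here is a minimal sketch that generates a Vanilla submit description and a Condor-C style one for the same executable. The attribute layout (`universe = grid`, `grid_resource = condor <schedd> <collector>`, `+remote_jobuniverse = 5` to request Vanilla on the remote side) is recalled from the Grid Universe chapter of the Condor manual and should be checked there. The host names, file names and function names are hypothetical.

```python
def vanilla_description(executable: str) -> str:
    """Plain Vanilla universe submit description."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines)


def condor_c_description(executable: str, remote_schedd: str,
                         remote_collector: str) -> str:
    """Condor-C style description (assumed layout, see the Condor manual)."""
    lines = [
        "universe = grid",
        # The job is delegated to this Schedd, found via this Collector.
        f"grid_resource = condor {remote_schedd} {remote_collector}",
        # Universe the job should run in on the remote side (5 = Vanilla).
        "+remote_jobuniverse = 5",
        f"executable = {executable}",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(vanilla_description("analysis.sh"))
    print()
    print(condor_c_description("analysis.sh",
                               "schedd.example.edu",      # hypothetical
                               "collector.example.edu"))  # hypothetical
```

Remote-side attributes need a `+remote_` prefix, which is where much of the extra syntax comes from.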
![Page 16: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/16.jpg)
Introducing RemoteCondor
![Page 17: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/17.jpg)
What's the big idea?
● Let the users log in to a remote machine
  ● And run the cmdline tools there
True client approach
![Page 19: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/19.jpg)
What's the big idea?
● Let the users log in to a remote machine
  ● And run the cmdline tools there
Advantages:
● True local Condor experience
● Standard system authentication and authorization
● No admin privileges for the users
● Trust based on “central” Schedd admin skills
● Can regulate and transform Condor submissions
● Minimize security risk
● Central handling
● Familiar to users
No exceptions
Big deal! Where's the news?
![Page 20: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/20.jpg)
What's the big idea?
● Let the users log in to a remote machine
  ● And run the cmdline tools there
  ● … while preserving the local look-and-feel
● RemoteCondor provides
  ● Wrappers around major Condor cmdline tools
  ● Integration with sshfs
https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=RemoteCondor
![Page 21: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/21.jpg)
RemoteCondor wrappers
● Provide wrappers that use ssh under the hood
  ● Users (almost) unaware of the trick
  ● But may be prompted for a password
  ● Works best with public key authentication
[Diagram: condor_submit and condor_q on the submit node; sshd, condor_submit, condor_q and the Schedd on the Schedd node, reached through an "Auth" step; Collector/Negotiator; Startd]
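The wrapper idea can be sketched in a few lines: take the Condor command the user typed locally and replay it on the Schedd node through ssh. This is not the RemoteCondor implementation, only an illustration of the technique; the helper name `wrap_over_ssh`, the user name and the host are hypothetical.

```python
import shlex
from typing import List


def wrap_over_ssh(user: str, host: str, condor_cmd: List[str]) -> List[str]:
    """Build the ssh command line that runs condor_cmd on the Schedd node."""
    # ssh hands its trailing arguments to the remote shell as one string,
    # so each argument is quoted to survive that second round of parsing.
    remote_line = " ".join(shlex.quote(arg) for arg in condor_cmd)
    return ["ssh", f"{user}@{host}", remote_line]


if __name__ == "__main__":
    cmd = wrap_over_ssh("alice", "schedd.example.edu",
                        ["condor_submit", "my job.sub"])
    print(cmd)
```

Passing the resulting list to `subprocess.call` would run the command; with public key authentication in place, no password prompt would interrupt the user.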
![Page 22: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/22.jpg)
RemoteCondor and sshfs
● But being able to talk to Condor is not enough
  ● Users must be able to create and read data!
● Using sshfs solves the problem
  ● Schedd-local disk mounted on submit node (disk local to Schedd for maximum performance)
  ● Using ssh as a tunnel
  ● All in user space (FUSE)
● RemoteCondor will properly convert paths (within certain limits)
http://fuse.sourceforge.net/sshfs.html
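Path conversion is needed because a file seen under the local sshfs mount point has a different absolute path on the Schedd node. The following is a minimal sketch of such a translation, assuming a single mount point and hypothetical directory names. The slide does not spell out the actual RemoteCondor conversion rules or their limits, so this shows only the general technique.

```python
import posixpath


def to_remote_path(local_path: str, local_mount: str, remote_root: str) -> str:
    """Map a path under the local sshfs mount to its path on the Schedd node.

    Raises ValueError for paths outside the mount, since those files
    would not be visible on the Schedd node at all.
    """
    local_path = posixpath.normpath(local_path)
    local_mount = posixpath.normpath(local_mount)
    if local_path == local_mount:
        return posixpath.normpath(remote_root)
    prefix = local_mount.rstrip("/") + "/"
    if not local_path.startswith(prefix):
        raise ValueError(f"{local_path} is not under {local_mount}")
    relative = local_path[len(prefix):]
    return posixpath.join(remote_root, relative)


if __name__ == "__main__":
    print(to_remote_path("/home/alice/condor_mnt/run1/job.sub",
                         "/home/alice/condor_mnt",   # hypothetical mount point
                         "/data/alice"))             # hypothetical remote dir
```

Rejecting paths outside the mount is one plausible reading of "within certain limits": a file that never reaches the Schedd-local disk cannot be referenced by the job.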
![Page 23: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/23.jpg)
RemoteCondor and sshfs
● But being able to talk to Condor is not enough
  ● Users must be able to create and read data!
● Using sshfs solves the problem
  ● Schedd-local disk mounted on submit node
[Diagram: real disk, sshd and Schedd on the Schedd node; sshfs mount on the submit node, reached through an "Auth" step; Collector/Negotiator; Startd]
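For orientation, mounting the Schedd-local disk by hand amounts to a single sshfs call. The sketch below only builds the mount and unmount command lines, using the standard `sshfs user@host:dir mountpoint` form and, on Linux, `fusermount -u`. The user, host and directory names are hypothetical, and how RemoteCondor itself drives sshfs is not shown on the slide.

```python
from typing import List


def sshfs_mount_cmd(user: str, host: str, remote_dir: str,
                    mount_point: str) -> List[str]:
    """Command line that mounts remote_dir from the Schedd node locally."""
    return ["sshfs", f"{user}@{host}:{remote_dir}", mount_point]


def sshfs_unmount_cmd(mount_point: str) -> List[str]:
    """Command line that unmounts a FUSE mount on Linux."""
    return ["fusermount", "-u", mount_point]


if __name__ == "__main__":
    print(" ".join(sshfs_mount_cmd("alice", "schedd.example.edu",
                                   "/data/alice", "/home/alice/condor_mnt")))
    print(" ".join(sshfs_unmount_cmd("/home/alice/condor_mnt")))
```

Both commands run as the ordinary user, which matches the "all in user space" point: no root access is needed on the submit node once FUSE is available.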
![Page 24: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/24.jpg)
Using RemoteCondor
● Distributed in the Condor src tarball
  ● In the Contrib section
● Requires a “make install”
  ● To put the proper files in place
● Plus minimal configuration
  ● Where is the remote Schedd node?
  ● What username to use?
  ● Where to mount the sshfs partition?
https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=RemoteCondor
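The three configuration questions map naturally onto three settings. The sketch below uses a made-up INI layout and made-up key names (`schedd_host`, `user`, `mount_point`) purely for illustration; the real configuration format is the one described on the RemoteCondor wiki page.

```python
import configparser
from dataclasses import dataclass


@dataclass
class RemoteCondorSettings:
    schedd_host: str   # where is the remote Schedd node?
    user: str          # what username to use?
    mount_point: str   # where to mount the sshfs partition?


# Hypothetical example file content, not the real RemoteCondor format.
EXAMPLE = """
[remote_condor]
schedd_host = schedd.example.edu
user = alice
mount_point = /home/alice/condor_mnt
"""


def load_settings(text: str) -> RemoteCondorSettings:
    """Parse the three settings; a missing key raises KeyError."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["remote_condor"]
    return RemoteCondorSettings(
        schedd_host=section["schedd_host"],
        user=section["user"],
        mount_point=section["mount_point"],
    )


if __name__ == "__main__":
    print(load_settings(EXAMPLE))
```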
![Page 25: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/25.jpg)
Summary
● Traditional Condor not suitable for user machines
● Keeping Schedd nodes professionally maintained highly desirable
  ● To minimize security risks and control job flow
● RemoteCondor allows this operation mode while preserving the local look-and-feel
  ● Requires minimal local install
![Page 26: Wedding convenience and control with RemoteCondor](https://reader034.fdocuments.net/reader034/viewer/2022051818/54b6bccc4a795956098b459f/html5/thumbnails/26.jpg)
Acknowledgements
This work is partially sponsored by
● the US National Science Foundation under Grants No. OCI-0943725 (STCI) and PHY-0612805 (CMS Maintenance & Operations), and
● the US Department of Energy under Grant No. DE-FC02-06ER41436 subcontract No. 647F290 (OSG).