Software infrastructure for the I-WAY high-performance distributed computing experiment
Ian Foster, Jonathan Geisler, Bill Nickless, Warren Smith, and Steven Tuecke
Grid Computing - Making the Global Infrastructure a Reality, chapter 4, pp. 101-106. Wiley and Sons
Outline
Introduction
The I-WAY network
I-WAY infrastructure
  Point of presence machines
  Scheduler
  Security
  Parallel programming tools
  File systems
Conclusions
I-WAY
In brief, the I-WAY was an ATM network connecting supercomputers, mass storage systems, and advanced visualization devices at 17 different sites within North America.
I-Soft, I-POP, I-WAY
Novel concepts and techniques
Point of presence machines
Scheduler proxies
Authorization proxies
Network-aware parallel programming tools
The I-WAY network
The I-WAY network connected:
display devices (CAVE, ImmersaDesk)
mass storage systems
specialized instruments
supercomputers of different architectures…
Why ATM?
ATM was chosen rather than traditional Internet connectivity because it provided higher bandwidth and handled audio, video, and data more efficiently.
Point of presence machines
I-POP
Provided uniform authentication, resource reservation, process creation, and communication functions across I-WAY resources.
I-Soft
A software environment deployed on the I-POP machines. It provided a variety of services:
1. scheduling
2. security
3. parallel programming support
4. a distributed file system
I-POP design
I-POP discussion
All I-POPs shared a single AFS cell, which proved extremely useful as a means of maintaining a single, shared copy of I-Soft code and as a mechanism for distributing I-WAY scheduling information.
We never exploited this capability to monitor or control the ATM network.
Scheduler design
Computational Resource Broker (CRB)
Requests are handled by an independent entity (the CRB), which then negotiates with the site schedulers that manage individual resources. In the I-WAY, one CRB was sufficient.
Virtual machines
Predefined disjoint subsets of I-WAY computers.
User-to-CRB and CRB-to-resource protocols
Scheduler design (cont.)
Functions of the scheduler
1. management functions
2. user functions
Central scheduler and local scheduler: a two-part strategy
1. A central scheduler daemon that managed and allocated time on the different virtual machines on a first-come, first-served basis.
2. A local scheduler daemon communicating directly with the local site scheduler. Local schedulers performed site-dependent actions in response to requests from the central scheduler to allocate resources, create processes, and deallocate resources.
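The two-part strategy can be illustrated with a minimal Python sketch. It is not the I-Soft implementation: the class names, method names, and in-memory log are hypothetical, and each virtual machine is assumed to be held by one user at a time. It only shows a central daemon granting virtual machines first come, first served and delegating allocate, create-process, and deallocate actions to per-site local schedulers.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LocalScheduler:
    """Hypothetical stand-in for a per-site daemon doing site-dependent work."""
    site: str
    log: list = field(default_factory=list)

    def allocate(self, user):
        self.log.append(("allocate", user))

    def create_process(self, user, command):
        self.log.append(("create_process", user, command))

    def deallocate(self, user):
        self.log.append(("deallocate", user))


class CentralScheduler:
    """Grants each virtual machine to one user at a time, first come, first served."""

    def __init__(self, virtual_machines):
        # virtual_machines: name -> list of LocalScheduler (predefined, disjoint)
        self.virtual_machines = virtual_machines
        self.queues = {name: deque() for name in virtual_machines}
        self.holder = {name: None for name in virtual_machines}

    def request(self, user, vm):
        """Queue a request; return True if the user holds the VM right away."""
        self.queues[vm].append(user)
        self._dispatch(vm)
        return self.holder[vm] == user

    def run(self, user, vm, command):
        """Ask every local scheduler in the VM to start a process for the holder."""
        if self.holder[vm] != user:
            raise PermissionError(f"{user} does not hold {vm}")
        for local in self.virtual_machines[vm]:
            local.create_process(user, command)

    def release(self, user, vm):
        """Deallocate the VM and hand it to the next queued user, if any."""
        if self.holder[vm] != user:
            raise ValueError(f"{user} does not hold {vm}")
        for local in self.virtual_machines[vm]:
            local.deallocate(user)
        self.holder[vm] = None
        self._dispatch(vm)

    def _dispatch(self, vm):
        # Serve the oldest waiting request once the VM is free.
        if self.holder[vm] is None and self.queues[vm]:
            user = self.queues[vm].popleft()
            self.holder[vm] = user
            for local in self.virtual_machines[vm]:
                local.allocate(user)
```

In this sketch a second request for a busy virtual machine simply waits in the queue until the current holder calls release.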
Scheduler discussion
Limitations
1. Too-restrictive interfaces between user and scheduler and between scheduler and local resources.
2. The concept of using fixed virtual machines as schedulable units was only moderately successful.
3. The long-term solution probably is to develop more sophisticated schedulers for resources that are to be incorporated into I-WAY–like systems.
Security design
Two parts
1. authentication to the I-POP environment
2. authentication to the local sites
Authentication to I-POPs was handled by using a telnet client modified to use Kerberos authentication and encryption.
The scheduler software served as an ‘authentication proxy.’
Security discussion
Authenticate once
A more fundamental limitation of the I-WAY authentication scheme as implemented was that each user had to have an account at each site to which access was required.
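The proxy idea and its limitation can be sketched in a few lines of Python. This is an illustration under assumptions, not the I-Soft code: the class, its methods, and the account table are hypothetical, and the Kerberos step is reduced to a flag. The user authenticates once to the I-POP environment, after which the scheduler looks up the matching local account at each site and fails where none exists.

```python
class AuthenticationProxy:
    """Hypothetical scheduler-side proxy mapping I-WAY users to local accounts."""

    def __init__(self, account_map):
        # account_map: site -> {iway_user: local_account}; an assumed table
        self.account_map = account_map
        self.authenticated = set()

    def login(self, user):
        """Record a successful I-POP authentication (Kerberos is not modeled)."""
        self.authenticated.add(user)

    def local_account(self, user, site):
        """Return the local account to use at a site on the user's behalf."""
        if user not in self.authenticated:
            raise PermissionError(f"{user} has not authenticated to the I-POP")
        try:
            return self.account_map[site][user]
        except KeyError:
            # The limitation noted above: no account at the site, no access.
            raise LookupError(f"{user} has no account at {site}") from None
```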
Parallel tools design
I-WAY must support the creation of processes on different processors and the communication of data between these processes.
These tools should ideally relieve the programmer of the need to consider low-level details relating to network structure.
Parallel tools design (cont.)
Irsh and ixterm
Nexus multithreaded communication library
Nexus supports automatic configuration mechanisms that allow it to use information contained in resource databases to determine which startup mechanisms, network interfaces, and protocols to use in different situations.
CAVEcomm and MPICH
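The idea of choosing a protocol from a resource database can be sketched as follows. This is not the Nexus API: the database layout, host names, method names ("vendor-mp", "atm", "tcp"), and preference order are all hypothetical, chosen only to show how a lookup could pick the best communication method shared by two endpoints.

```python
# Hypothetical resource database: the machine each host belongs to and the
# communication methods it offers.
RESOURCE_DB = {
    "sp-node1": {"machine": "sp", "methods": {"vendor-mp", "atm", "tcp"}},
    "sp-node2": {"machine": "sp", "methods": {"vendor-mp", "atm", "tcp"}},
    "cave-host": {"machine": "cave", "methods": {"atm", "tcp"}},
    "desktop": {"machine": "desk", "methods": {"tcp"}},
}

# Assumed preference order, fastest first.
PREFERENCE = ["vendor-mp", "atm", "tcp"]


def select_method(src, dst, db=RESOURCE_DB):
    """Pick the most preferred method that both endpoints support."""
    a, b = db[src], db[dst]
    common = a["methods"] & b["methods"]
    for method in PREFERENCE:
        if method not in common:
            continue
        # Vendor message passing is assumed to work only inside one machine.
        if method == "vendor-mp" and a["machine"] != b["machine"]:
            continue
        return method
    raise LookupError(f"no common method between {src} and {dst}")
```

With this table, two nodes of the same machine would use "vendor-mp", while a node talking to a host that only offers TCP would fall back to "tcp". The accuracy of such a table is exactly what the selection depends on.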
Parallel tools discussion
A significant difficulty revealed by the I-WAY experiment related to the mechanisms used to generate and maintain the configuration information used by Nexus.
Automatic discovery techniques.
File systems
I-WAY–like systems introduce three related requirements with a file-system flavor.
1. Many users require access to various status data and utility programs at many different sites.
2. Users running programs on remote computers must be able to access executables and configuration data at many different sites.
3. Application programs must be able to read and write potentially large data sets.
The I-Soft system supported only the first of these requirements.
File Systems (cont.)
An AFS cell (with three servers for reliability) was deployed and used as a shared repository for I-WAY software, and also to maintain scheduler status information.
Conclusions
SC’95
Further I-WAY–like systems.