The Network Architecture of the Connection Machine CM-5
Charles E. Leiserson et al. (Thinking Machines Corporation)
Presented by Eric Carty-Fickes, 1/28/04


major themes of CM-5

good performance (measured how?)
ease of use (by programmers), flexibility
  let programmer access nonprivileged functions
  do not involve OS if possible
availability, reliability
use commodity parts, and the same part where possible (economy of mechanism)
split system into three separate networks (data, control, diagnostic)

network interface

same interface for data and control networks
provides context-switching capability, makes processor save state
interface appears as memory-mapped FIFO registers
protection enforced by processor
users access relative processor addresses only; easy protection and error checking
users unaware of network topology
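
The relative-addressing point above can be sketched as a base-and-bounds translation. This is only an illustration under stated assumptions: the `Partition` type, its `base` and `size` fields, and the `to_physical` function are hypothetical names invented here, not taken from the paper or from any CM-5 software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical description of the block of nodes owned by one user job."""
    base: int  # physical address of the partition's first node (assumed layout)
    size: int  # number of nodes in the partition (assumed)


def to_physical(partition: Partition, relative: int) -> int:
    """Translate a relative node address, rejecting anything out of bounds."""
    if not 0 <= relative < partition.size:
        raise ValueError(f"relative address {relative} is outside the partition")
    return partition.base + relative
```

Because the user only ever supplies `relative`, one bounds check is enough to keep a job inside its own nodes, which is the sense in which protection and error checking become easy.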

data network

fat-tree architecture used
  keeps local traffic separate
  can be adapted to various bandwidth schemes
  keeps traffic balanced
  claimed near-optimal data routing
modified fat-tree uses two input and two output FIFOs to guarantee no deadlock
variable-length packets (fixed for control)
bandwidth scales linearly to 16,384 nodes
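
One way to see why a fat tree keeps local traffic separate: a message only climbs as far as the lowest common ancestor of source and destination, then descends. The sketch below assumes a complete tree with leaves numbered left to right and a default arity of 4; the function names are hypothetical, and the choice among parent links on the way up is not modelled.

```python
def levels_to_climb(src: int, dst: int, arity: int = 4) -> int:
    """Height of the lowest common ancestor of two leaves in a complete
    arity-ary tree whose leaves are numbered left to right."""
    height = 0
    while src != dst:
        # Dropping the lowest base-`arity` digit moves each leaf to its parent.
        src //= arity
        dst //= arity
        height += 1
    return height


def down_path(dst: int, height: int, arity: int = 4) -> list:
    """Child index taken at each level on the way down, topmost level first."""
    digits = [(dst // arity**level) % arity for level in range(height)]
    return digits[::-1]
```

Under these assumptions, leaves 5 and 6 share a parent, so a message between them climbs one level and never touches the upper part of the tree; leaves 3 and 4 sit under different parents and need two levels.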

network protection

flow control sent to message originator to protect buffers
central clock synchronizes everything (good idea?)
messages tagged with routing and processing info plus error check
errors traced to origin (how many simultaneous errors detected/masked?)
all-fall-down mode saves in-flight messages in random nodes

control network

synchronizes processing nodes
checks contract between processors and data network, reports errors
hybrid MIMD architecture
  combines SIMD's broadcasting with ability to run different parts of code
barrier synchronization = line of code all processors must reach before continuing
  improved with split-phase barriers
broadcast = individual processors send out mass interrupts, code, data, etc.
combining = select sets of nodes (only certain functions available)
Kirchhoff's law for messages (messages sent must equal messages received) ensures at least that no pair of messages is lost
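
The Kirchhoff-style bookkeeping can be sketched as a per-node count of messages sent and received, summed over all nodes. This is a toy model under stated assumptions: the `Node` class, the `in_flight` list standing in for the data network, and the helper names are all hypothetical, and the global sum is a plain Python `sum` rather than anything resembling the control network's combining hardware.

```python
class Node:
    """Toy processing node that only tracks its message counts."""

    def __init__(self) -> None:
        self.sent = 0
        self.received = 0


def send(src: Node, dst: Node, in_flight: list) -> None:
    """Inject one message; `in_flight` stands in for the data network."""
    src.sent += 1
    in_flight.append(dst)


def deliver(in_flight: list) -> None:
    """Deliver the oldest in-flight message to its destination."""
    dst = in_flight.pop(0)
    dst.received += 1


def router_done(nodes: list) -> bool:
    """True when the net count (sent minus received) over all nodes is zero."""
    return sum(n.sent - n.received for n in nodes) == 0
```

A check of this kind only sees the net imbalance: one message in flight makes `router_done` false, and it becomes true again once that message is delivered.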

diagnostic network

goal of functionality independence, use JTAG
individual chips and collections can be tested
network tree inherently self-testing hierarchy

Questions

are there any errors, glaring or minor, that you can see with CM-5?
do you really agree with the authors that it is okay to allow a user to cause deadlock?
  should there be a check in place to prevent it?
  might it not prevent an error in the network from being detected?
would CM-5 really work just as well as technology progressed?