CSE 8383 - Advanced Computer Architecture Week-11 April 1, 2004 engr.smu.edu/~rewini/8383.

30
CSE 8383 - Advanced Computer Architecture Week-11 April 1, 2004 engr.smu.edu/~rewini/8383

Transcript of CSE 8383 - Advanced Computer Architecture Week-11 April 1, 2004 engr.smu.edu/~rewini/8383.

CSE 8383 - Advanced Computer Architecture

Week-11April 1, 2004

engr.smu.edu/~rewini/8383

Contents Message Passing (Distributed

Memory) Systems Static Networks Dynamic Networks

MIMD Distributed Memory Systems

Interconnection Networks

M M M M

P P P P

Distributed Memory Multiple address spaces Communication via send & receive Synchronization via message

passing

Interconnection Network Several IN classification criteriaSeveral IN classification criteria

Mode of operation: synchronous vs. Asynchronous

Control strategy: centralized vs. decentralized

Switching techniques: circuit vs. packet switching

Topology: static vs. dynamic

Interconnection Network Taxonomy

Interconnection Network

Static Dynamic

Bus-based Switch-based1-D 2-D HC

Single Multiple SS MS Crossbar

Static Interconnection Networks

Direct links, which are fixed once built

Fixed Connections, unidirectional or bi-directional

Fully Connected (Completely Connected Network (CCN))

Limited Connection Network (LCN).

Dynamic Interconnection Networks

Communication patterns are based on program demands

Connections are established on the fly during program execution

Multistage Interconnection Network (MIN) and Crossbar

Static Network Topology

Linear arrays Ring (Loop) networks Two-dimensional arrays Tree networks Cube network ….

Static Network Analysis

Graph Representation Parameters

Cost Degree Diameter Fault tolerance

Graph Review G = (V,E) -- V: nodes, E: edges Directed vs. Undirected Weighted Graphs Path, path length, shortest path Cycles, cyclic vs. acyclic Connectivity: connected, weakly

connected, strongly connected, fully connected

Linear Array

N nodes, N-1 edgesNode Degree:

Diameter:

Cost:

Fault Tolerance:

Ring

N nodes, N edges

Node Degree:

Diameter:

Cost:

Fault Tolerance:

Chordal Ring

N nodes, N edges

Node Degree:

Diameter:

Cost:

Fault Tolerance:

Barrwl Shifter Number of nodes N = 2n

Start with a ring Add extra edges from each node to

those nodes having power of 2 distance

i & j are connected if |j-I| = 2r, r = 0, 1, 2, …, n-1

Barrel Shifter N = 16

Node Degree:

Diameter:

Cost:

Fault Tolerance:

Tree and Star

Node Degree:

Diameter:

Cost:

Fault Tolerance:

Mesh and Torus

Node Degree: Internal 4Other 3, 2

Diameter: 2(n-1)

N = n*n

Node Degree: 4

Diameter: 2* floor(n/2)

Fully Connected

6 3

1 2

5 4

Node Degree:

Diameter:

Cost:

Fault Tolerance:

Hypercubes N = 2d

d dimensions (d = log N) A cube with d dimensions is made out

of 2 cubes of dimension d-1 Symmetric Degree, Diameter, Cost, Fault

tolerance Node labeling – number of bits

Hypercubes

d = 0 d = 1 d = 2 d = 3

0

1

0100

1110

000

001

100 110

111

011

101

010

Hypercubes

1110 1111

1010 1011

0110 0111

0010 0011

1101

1010

1000 1001

0100 0101

0010

0000 0001

S

d = 4

Hypercube of dimension d

N = 2d d = log n

Node degree = d

Number of bits to label a node = d

Diameter = d

Number of edges = n*d/2

Hamming distance!

Routing

Subcubes and Cube Fragmentation What is a subcube? Shared Environment Fragmentation Problem Is it Similar to something you

know?

Cube Connected Cycles (CCC) k-cube 2k nodes k-CCC from k-cube, replace each

vertex of the k cube with a ring of k nodes

K-CCC k* 2k nodes Degree, diameter 3, 2k Try it for 3-cube

K-ary n-Cube n = cube dimension K = # nodes along each dimension N = kn

Wraparound Hupercube binary n-cube Tours k-ary 2-cube

Analysis and performance metricsstatic networks

Performance characteristics of static networks

Network Degree(d) Diameter(D)Cost(#lin

k)Symmetr

yWorst delay

Fully connected

N-1 1 N(N-1)/2 Yes 1

Linear Array 2 N-1 N-1 No N

Binary Tree 3 2(log2N –1) N-1 No log2N

n-cube log2N log2N nN/2 Yes log2N

2D-Mesh 4 2(n-1) 2(N-n) No N

K-ary n-cube 2n nk/2 nN Yes K x log2N

Dynamic Network Analysis

Parameters: Cost: number of switches Delay: latency Blocking characteristics Fault tolerance

Switch Modules A x B switch module A inputs and B outputs In practice, A = B = power of 2 Each input is connected to one or

more outputs (conflicts must be avoided)

One-to-one (permutation) and one-to-many are allowed

Binary Switch

2x2Switch

Legitimate States = 4

Permutation Connections = 2