Exploiting superior tensile properties of a novel network ...
Program and Network Properties
Transcript of Program and Network Properties
-
7/24/2019 Program and Network Properties
1/94
Chapter 2: Program and Network Properties
Conditions of parallelism
Program partitioning and scheduling
Program flow mechanisms
System interconnect architectures
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
2/94
EENG-630 - Chapter 2
Conditions of Parallelism
The exploitation of parallelism in computing requires
understanding the basic theor associated !ith it"
#rogress is needed in se$eral areas%computation models for parallel computing
interprocessor communication in parallel architectures
integration of parallel sstems into general en$ironments
-
7/24/2019 Program and Network Properties
3/94
Data dependences
The ordering relationship between statements is indicated by the data
dependence.
low dependence
!nti dependence
"utput dependence
#$" dependence
%nknown dependence
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
4/94
Data Dependence & '
low dependence: S' precedes S2( and at least one output of S' is
input to S2.
!ntidependence: S' precedes S2( and the output of S2 o)erlaps the
input to S'.
"utput dependence: S' and S2 write to the same output )ariable.
#$" dependence: two #$" statements *read$write+ reference the
same )ariable( and$or the same file.
-
7/24/2019 Program and Network Properties
5/94
Data Dependence & 2
%nknown dependence:
o The subscript of a )ariable is itself subscripted.
o The subscript does not contain the loop inde, )ariable.
o ! )ariable appears more than once with subscripts ha)ing different
coefficients of the loop )ariable *that is( different functions of the loop
)ariable+.
o The subscript is nonlinear in the loop inde, )ariable.
Parallel e,ecution of program segments which do not ha)e total data
independence can produce non&deterministic results.
-
7/24/2019 Program and Network Properties
6/94
Data dependence e,ample
EENG-630 - Chapter 2
&'% (oad )'* +&2% +dd )2* )'
&3% ,o$e )'* )3
&% &tore .* )'
&'
&2 &
&3
-
7/24/2019 Program and Network Properties
7/94
#$" dependence e,ample
EENG-630 - Chapter 2
&'% )ead /* +/1&2% )e!ind /
&3% rite /* ./1
&% )e!ind /
&' &314
-
7/24/2019 Program and Network Properties
8/94
Control dependence
The order of e,ecution of statements cannot be determined before
run time
o Conditional branches
o Successi)e operations of a looping procedure
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
9/94
Control dependence e,amples
Do2- # '( N
!*#+ C*#+
#*!*#+ ./T. -+ !*#+'
2- Continue
Do'- # '( N
#*!*#&'+ .01. -+ !*#+-
'- Continue
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
10/94
esource dependence
Concerned with the conflicts in using shared resources
o #nteger units
o loating&point units
o egisters
o 3emory areas
o !/%
o 4orkplace storage
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
11/94
5ernstein6s conditions
Set of conditions for two processes to e,ecute in parallel
#'"2 7
#2"' 7
"'"2 7
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
12/94
5ernstein6s Conditions & 2
#n terms of data dependencies( 5ernstein6s conditions imply that
two processes can e,ecute in parallel if they are flow&independent(
antiindependent( and output&independent.
The parallelism relation 88 is commutati)e *Pi88Pj impliesPj 88Pi +(
but not transiti)e *Pi88PjandPj88Pkdoes not implyPi88Pk+ .
Therefore( 88 is not an e9ui)alence relation.
#ntersection of the input sets is allowed.
-
7/24/2019 Program and Network Properties
13/94
%tiliing 5ernstein6s conditions
P': C D , 0
P2: 3 ; < C
P=: ! 5 < C
P>: C / < 3
P?: ; $ 0
EENG-630 - Chapter 2
#'
#3
#2 #
#5
-
7/24/2019 Program and Network Properties
14/94
%tiliing 5ernstein6s conditions
-
7/24/2019 Program and Network Properties
15/94
@ardware parallelism
! function of cost and performance tradeoffs
Displays the resource utiliation patterns of
simultaneously e,ecutable operations Denote the number of instruction issues per
machine cycle: k-issueprocessor
! multiprocessor system with n k&issue processorsshould be able to handle a ma,imum number of nk
threads of instructions simultaneously
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
16/94
Software parallelism
Defined by the control and data dependence of programs
! function of algorithm( programming style( and compiler
organiation
The program flow graph displays the patterns of simultaneously
e,ecutable operations
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
17/94
3ismatch between software and hardware parallelism & '
(' (2 (3 (
' 2
7 -
+ .
Maximum softwareparallelism (L=load,
X/+/- = arithmetic).
Ccle '
Ccle 2
Ccle 3
-
7/24/2019 Program and Network Properties
18/94
3ismatch between software and hardwareparallelism & 2
('
(2
(
(3'
2
7
-+
.
Ccle '
Ccle 2
Ccle 3
Ccle
Ccle 5
Ccle 6
Ccle 8
Same problem, but
cosideri! the
parallelism o a two-issuesuperscalar processor.
-
7/24/2019 Program and Network Properties
19/94
3ismatch between software and hardwareparallelism & =
('
(2
&'
'
7
(5
(3
(
&2
2
-
(6
.+
Ccle '
Ccle 2
Ccle 3
Ccle
Ccle 5
Ccle 6
Same problem,with two si!le-
issue processors
= iserted for
s"chroi#atio
-
7/24/2019 Program and Network Properties
20/94
Software parallelism
Control parallelism A allows two or more operations to be
performed concurrently
o Pipelining( multiple functional units
Data parallelism A almost the same operation is performed o)er
many data elements by many processors concurrently
o Code is easier to write and debug
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
21/94
Types of Software Parallelism
Control Parallelism A two or more operations can be performedsimultaneously. This can be detected by a compiler( or aprogrammer can e,plicitly indicate control parallelism by usingspecial language constructs or di)iding a program into multiple
processes. Data parallelism A multiple data elements ha)e the same operations
applied to them at the same time. This offers the highest potentialfor concurrency *in S#3D and 3#3D modes+. Synchroniation inS#3D machines handled by hardware.
-
7/24/2019 Program and Network Properties
22/94
Sol)ing the 3ismatch Problems
De)elop compilation support
edesign hardware for more efficient e,ploitation by compilers
%se large register files and sustained instruction pipelining.
@a)e the compiler fill the branch and load delay slots in code
generated for #SC processors.
-
7/24/2019 Program and Network Properties
23/94
The ole of Compilers
Compilers used to e,ploit hardware features to impro)eperformance.
#nteraction between compiler and architecture design is a necessityin modern computer de)elopment.
#t is not necessarily the case that more software parallelism willimpro)e performance in con)entional scalar processors.
The hardware and compiler should be designed at the same time.
-
7/24/2019 Program and Network Properties
24/94
Program Partitioning B Scheduling
The sie of the parts or pieces of a program that can be considered
for parallel e,ecution can )ary.
The sies are roughly classified using the term granule sie( or
simply granularity.
The simplest measure( for e,ample( is the number of instructions in
a program part.
;rain sies are usually described asfine( mediumor coarse(depending on the le)el of parallelism in)ol)ed.
-
7/24/2019 Program and Network Properties
25/94
/atency
Latencyis the time re9uired for communication between different
subsystems in a computer.
Memory latency( for e,ample( is the time re9uired by a processor to
access memory.
Synchronization latencyis the time re9uired for two processes to
synchronie their e,ecution.
Computational granularity and communicatoin latency are closelyrelated.
-
7/24/2019 Program and Network Properties
26/94
/e)els of Parallelism
9obs or programs
1nstructions
or statements
Non-recursi$e loopsor unfolded iterations
#rocedures* subroutines*
tas:s* or coroutines
&ubprograms* ;ob steps or
related parts of a program
cycles *se9uential program on one processor+
o M>' cycles *L processors+ & speedup '.'
o >> cycles *> processors+ & speedup '.O>
-
7/24/2019 Program and Network Properties
41/94
Static multiprocessor scheduling
;rain packing may not be optimal
Dynamic multiprocessor scheduling is an NP&hard problem
Node duplication is a static scheme for multiprocessor scheduling
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
42/94
Node duplication
Duplicate some nodes to eliminate idle time and reduce
communication delays
;rain packing and node duplication are often used Eointly to
determine the best grain sie and corresponding schedule
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
43/94
Schedule without node duplication
EENG-630 - Chapter 2
+*a*'
b*' c*?
a*?
c*'
C*'.*'
@*2 E*2
e*d*
#' #2 #2#'
+
@
.
1
E
C
1
6
'3
2'
28
23
20
'6
'
'2
-
7/24/2019 Program and Network Properties
44/94
Schedule with node duplication
EENG-630 - Chapter 2
+*a*'
b*' c*'
a*'
c*'
CA*'.*'
@*2 E*2
#' #2 #2#'
+
@
.
E
C
+
6
'0
'
'3
B
6
C*'
+A*
a*'
C8
-
7/24/2019 Program and Network Properties
45/94
;rain determination and scheduling optimiation
Step ': Construct a fine&grain program graph
Step 2: Schedule the fine&grain computation
Step =: ;rain packing to produce coarse grains
Step >: ;enerate a parallel schedule based on
the packed graph
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
46/94
Program low 3echanisms
Con)entional machines used control flow mechanism in which order of
program e,ecution e,plicitly stated in user programs.
Dataflow machineswhich instructions can be e,ecuted by determining
operand a)ailability.
Reduction machinestrigger an instruction6s e,ecution based on the
demand for its results.
-
7/24/2019 Program and Network Properties
47/94
Control low )s. Data low
Control flow machines used shared memory for instructions and
data. Since )ariables are updated by many instructions( there may
be side effects on other instructions. These side effects fre9uently
pre)ent parallel processing. Single processor systems are
inherently se9uential.
#nstructions in dataflow machines are unordered and can be
e,ecuted as soon as their operands are a)ailableF data is held in
the instructions themsel)es. Data tokensare passed from an
instruction to its dependents to trigger e,ecution.
-
7/24/2019 Program and Network Properties
48/94
Data low eatures
No need for
o shared memory
o program counter
o
control se9uencer
Special mechanisms are re9uired to
o detect data a)ailability
o match data tokens with instructions needing them
o enable chain reaction of asynchronous instruction e,ecution
-
7/24/2019 Program and Network Properties
49/94
! Dataflow !rchitecture & '
The !r)ind machine *3#T+ has N P0s and an N&by&Ninterconnection network.
0ach P0 has a token&matching mechanism that dispatchesonly instructions with data tokens a)ailable.
0ach datum is tagged witho address of instruction to which it belongso conte,t in which the instruction is being e,ecuted
Tagged tokens enter P0 through local path *pipelined+( andcan also be communicated to other P0s through the routing
network.
-
7/24/2019 Program and Network Properties
50/94
! Dataflow !rchitecture & 2
#nstruction address*es+ effecti)ely replace the program counter in a
control flow machine.
Conte,t identifier effecti)ely replaces the frame base register in a
control flow machine.
Since the dataflow machine matches the data tags from one
instruction with successors( synchronied instruction e,ecution is
implicit.
-
7/24/2019 Program and Network Properties
51/94
! Dataflow !rchitecture & =
!nI-structurein each P0 is pro)ided to eliminate e,cessi)e copying
of data structures.
0ach word of the #&structure has a two&bit tag indicating whether the
)alue is empty( full( or has pending read re9uests.
This is a retreat from the pure dataflow approach.
0,ample 2. shows a control flow and dataflow comparison.
Special compiler technology needed for dataflow machines.
-
7/24/2019 Program and Network Properties
52/94
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
53/94
Demand&Dri)en 3echanisms
Data&dri)en machines select instructions for e,ecution based on the
a)ailability of their operandsF this is essentially a bottom&up approach.
Demand&dri)en machines take a top&down approach( attempting to
e,ecute the instruction *a demander+ that yields the final result. This
triggers the e,ecution of instructions that yield its operands( and so
forth.
The demand&dri)en approach matches naturally with functional
programming languages *e.g. /#SP and SC@030+.
-
7/24/2019 Program and Network Properties
54/94
eduction 3achine 3odels
String&reduction model:o each demander gets a separate copy of the e,pression string to
e)aluateo each reduction step has an operator and embedded reference to
demand the corresponding operandso each operator is suspended while arguments are e)aluated
;raph&reduction model:o e,pression graph reduced by e)aluation of branches or subgraphs(
possibly in parallel( with demanders gi)en pointers to results ofreductions.
o
based on sharing of pointers to argumentsF tra)ersal and re)ersal ofpointers continues until constant arguments are encountered.
-
7/24/2019 Program and Network Properties
55/94
System #nterconnect !rchitectures
Direct networks for static connections
#ndirect networks for dynamic connections
Networks are used for
o internal connections in a centralied system among
processors
memory modules
#$" disk arrays
o distributed networking of multicomputer nodes
-
7/24/2019 Program and Network Properties
56/94
;oals and !nalysis
The goals of an interconnection network are to pro)ideo low&latencyo high data transfer rateo wide communication bandwidth
!nalysis includeso
latencyo bisection bandwidtho data&routing functionso scalability of parallel architecture
-
7/24/2019 Program and Network Properties
57/94
Network Properties and outing
Static networks: point&to&point direct connections that will not
change during program e,ecution
Dynamic networks:
o switched channels dynamically configured to match user program
communication demands
o include buses( crossbar switches( and multistage networks
5oth network types also used for inter&P0 data routing in S#3D
computers
-
7/24/2019 Program and Network Properties
58/94
Network Parameters
Network sie: The number of nodes in the graph used to represent
the network
Node Degree d: The number of edges incident to a node. Sum of in
degree and out degree
Network Diameter D: The ma,imum shortest path between any two
nodes
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
59/94
Network Parameters *cont.+
5isection 4idth:o Channel bisection width b: The minimum number of edges along the cut
that di)ides the network in two e9ual hal)eso 0ach channel has w bit wireso 4ire bisection width: 5bwF 5 is the wiring density of the network. #t
pro)ides a good indicator of tha ma, communication bandwidth along thebisection of the network
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
60/94
Terminology & '
Network usually represented by a graph with a finitenumber of nodes linked by directed or undirected edges.
Number of nodes in graph network size. Number of edges *links or channels+ incident on a node
node degree d *also note in and out degrees whenedges are directed+. Node degree reflects number of #$"ports associated with a node( and should ideally be smalland constant.
DiameterDof a network is the ma,imum shortest pathbetween any two nodes( measured by the number of linkstra)ersedF this should be as small as possible *from acommunication point of )iew+.
T i l
-
7/24/2019 Program and Network Properties
61/94
Terminology & 2
Channel isection width minimum number of edges cutto split a network into two parts each ha)ing the samenumber of nodes. Since each channel has wbit wires( thewire isection width! w. 5isection width pro)idesgood indication of ma,imum communication bandwidthalong the bisection of a network( and all other cross sections
should be bounded by the bisection width. "ire*or channel+ length length *e.g. weight+ of edges
between nodes. Network is symmetric if the topology is the same looking
from any nodeF these are easier to implement or to
program. "ther useful characteriing properties: homogeneousnodesI buffered channelsI nodes are switchesI
-
7/24/2019 Program and Network Properties
62/94
Data outing unctions
Shifting otating Permutation *one to one+ 5roadcast *one to all+ 3ulticast *many to many+ Personalied broadcast *one to many+ Shuffle 0,change 0tc.
-
7/24/2019 Program and Network Properties
63/94
Permutations
or n obEects there are nQ permutations by whichthe n obEects can be reordered.The set of allpermutations form a permutation group with
respect to a composition operation. Cyclenotation can be used to specify a permutationoperation.
Permutation *a( b( c+*d( e+ means: a&Rb( b&Rc(c&Ra( d&Re and e&Rd in a circular fashion. The
cycle *a( b( c+ has a period of =( and the cycle *d(e+ has a period of 2. will ha)e a period e9ual to2 , = .
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
64/94
Permutations *cont.+
Can be implemented using crossbar switches( multistage networks
or with shifting or broadcast operations.
Permutation capability is an indication of network6s data routing
capabilities
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
65/94
Perfect Shuffle and 0,change
Stone suggested the special permutation that entries according to
the mapping of the k&bit binary number a # kto c # k a*that is(
shifting ' bit to the left and wrapping it around to the least
significant bit position+.
The in)erse perfect shuffle re)erses the effect of the perfect shuffle.
-
7/24/2019 Program and Network Properties
66/94
Perfect Shuffle
Special permutation function
n 2kobEectsF each obEect representation re9uires k bits
Perfect shuffle maps , to y where:
o , * ,k&'( ( ,'( ,- +
o y * ,k&2( ( ,'( ,-( ,k&'+
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
67/94
0,change
n 2kobEectsF each obEect representation re9uires k bits
The e,change maps , to y where:
o , * ,k&'( ( ,'( ,- +
o y * ,k&'( ( ,'( ,-6 +
@ypercube routing functions are e,changes
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
68/94
@ypercube outing unctions
#f the )ertices of a n&dimensional cube are labeled with n&bit
numbers so that only one bit differs between each pair of adEacent
)ertices( then nrouting functions are defined by the bits in the node
*)erte,+ address.
or e,ample( with a =&dimensional cube( we can easily identify
routing functions that e,change data between nodes with addresses
that differ in the least significant( most significant( or middle bit.
-
7/24/2019 Program and Network Properties
69/94
5roadcast and 3ulticast
5roadcast: "ne&to&all mapping
3ulticast: one subset to another subset
Personalied 5roadcast: Personalied messages to only selected
recei)ers
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
70/94
Network Performance
unctionality
Network latency
5andwidth
@ardware comple,ity
Scalability
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
71/94
actors !ffecting Performance
unctionality A how the network supports data routing( interrupt
handling( synchroniation( re9uest$message combining( and
coherence
Network latency A worst&case time for a unit message to betransferred
5andwidth A ma,imum data rate
@ardware comple,ity A implementation costs for wire( logic(
switches( connectors( etc. Scalability A how easily does the scheme adapt to an increasing
number of processors( memories( etc.I
-
7/24/2019 Program and Network Properties
72/94
Static Networks
/inear !rray
ing and Chordal ing
5arrel Shifter
Tree and Star
at Tree
3esh and Torus
-
7/24/2019 Program and Network Properties
73/94
Static Networks A /inear !rray
N nodes connected by n&' links *not a bus+F segments between
different pairs of nodes can be used in parallel.
#nternal nodes ha)e degree 2F end nodes ha)e degree '.
Diameter n&'
5isection '
or small n( this is economical( but for large n( it is ob)iously
inappropriate.
-
7/24/2019 Program and Network Properties
74/94
Static Networks A ing( Chordal ing
/ike a linear array( but the two end nodes are connected by an n thlinkF the ring can be uni& or bi&directional. Diameter is n$2for abidirectional ring( or n for a unidirectional ring.
5y adding additional links *e.g. chords in a circle+( the node degreeis increased( and we obtain a chordal ring. This reduces the network
diameter. #n the limit( we obtain a fully&connected network( with a node
degree of n &' and a diameter of '.
-
7/24/2019 Program and Network Properties
75/94
Static Networks A 5arrel Shifter
/ike a ring( but with additional links between all pairs of nodes thatha)e a distance e9ual to a power of 2.
4ith a network of sie$ 2n( each node has degree d 2n &'( andthe network has diameterD n $2.
5arrel shifter connecti)ity is greater than any chordal ring of lower
node degree. 5arrel shifter much less comple, than fully&interconnected network.
-
7/24/2019 Program and Network Properties
76/94
Static Networks A Tree and Star
! k&le)el completely balanced binary tree will ha)e$ 2kA ' nodes(
with ma,imum node degree of = and network diameter is 2*k A '+.
The balanced binary tree is scalable( since it has a constant
ma,imum node degree.
! star is a two&le)el tree with a node degree d$A ' and a constant
diameter of 2.
-
7/24/2019 Program and Network Properties
77/94
Static Networks A at Tree
! fat tree is a tree in which the number of edges between nodes
increases closer to the root *similar to the way the thickness of limbs
increases in a real tree as we get closer to the root+.
The edges represent communication channels *wires+( and since
communication traffic increases as the root is approached( it seems
logical to increase the number of channels there.
-
7/24/2019 Program and Network Properties
78/94
Static Networks A 3esh and Torus
Pure meshA$ n k nodes with links between each adEacent pair of
nodes in a row or column *or higher degree+. This is not a
symmetric networkF interior node degree d 2k( diameter k *nA
'+. Illiac mesh*used in #lliac #H computer+ A wraparound is allowed(
thus reducing the network diameter to about half that of the
e9ui)alent pure mesh.
! torushas ring connections in each dimension( and is symmetric.!n n n binary torus has node degree of > and a diameter of 2 n $
2.
-
7/24/2019 Program and Network Properties
79/94
Static Networks A Systolic !rray
! systolic array is an arrangement of processing elements and
communication links designed specifically to match the
computation and communication re9uirements of a specific
algorithm *or class of algorithms+.
This specialied character may yield better performance than more
generalied structures( but also makes them more e,pensi)e( and
more difficult to program.
-
7/24/2019 Program and Network Properties
80/94
Static Networks A @ypercubes
! binary n&cube architecture with$ 2nnodes spanning along n
dimensions( with two nodes per dimension.
The hypercube scalability is poor( and packaging is difficult for
higher&dimensional hypercubes.
-
7/24/2019 Program and Network Properties
81/94
Static Networks A Cube&connected Cycles
k&cube connected cycles *CCC+ can be created from a k&cube by
replacing each )erte, of the k&dimensional hypercube by a ring of k
nodes.
! k&cube can be transformed to a k&CCC with k2knodes.
The maEor ad)antage of a CCC is that each node has a constant
degree *but longer latency+ than in the corresponding k&cube. #n
that respect( it is more scalable than the hypercube architecture.
-
7/24/2019 Program and Network Properties
82/94
Static Networks A k&ary n&Cubes
ings( meshes( tori( binary n&cubes( and "mega networks *to be
seen+ are topologically isomorphic to a family of k&ary n&cube
networks.
nis the dimension of the cube( and kis the radi,( or number of of
nodes in each dimension.
The number of nodes in the network($( is k n.
olding *alternating nodes between connections+ can be used to
a)oid the long end&around delays in the traditional
implementation.
-
7/24/2019 Program and Network Properties
83/94
Static Networks A k&ary n&Cubes
The cost of k&ary n&cubes is dominated by the amount of wire( not
the number of switches.
4ith constant wire bisection( low&dimensional networks with wider
channels pro)ide lower latecny( less contention( and higher hot&
spot throughput than higher&dimensional networks with narrower
channels.
-
7/24/2019 Program and Network Properties
84/94
Network Throughput
$etwork through%utA number of messages a network can handle in
a unit time inter)al.
"ne way to estimate is to calculate the ma,imum number of
messages that can be present in a network at any instant *itsca%acity+F throughput usually is some fraction of its capacity.
! hot s%otis a pair of nodes that accounts for a disproportionately
large portion of the total network traffic *possibly causing
congestion+.
&ot s%ot through%utis ma,imum rate at which messages can be
sent between two specific nodes.
-
7/24/2019 Program and Network Properties
85/94
3inimiing /atency
/atency is minimied when the network radi, kand dimension nare
chose so as to make the components of latency due to distance * of
hops+ and the message aspect ratioL $ "*message length / di)ided
by the channel width " + appro,imately e9ual.
This occurs at a )ery low dimension. or up to '-2> nodes( the best
dimension *in this respect+ is 2.
-
7/24/2019 Program and Network Properties
86/94
Dynamic Connection Networks
Dynamic connection networks can implement all communication
patterns based on program demands.
#n increasing order of cost and performance( these include
o
bus systemso multistage interconnection networks
o crossbar switch networks
Price can be attributed to the cost of wires( switches( arbiters( and
connectors. Performance is indicated by network bandwidth( data transfer rate(
network latency( and communication patterns supported.
-
7/24/2019 Program and Network Properties
87/94
Dynamic Networks A 5us Systems
! ussystem *contention us( time-sharing us+ has
o a collection of wires and connectors
o multiple modules *processors( memories( peripherals( etc.+ which
connect to the wireso data transactions between pairs of modules
5us supports only one transaction at a time.
5us arbitration logic must deal with conflicting re9uests.
/owest cost and bandwidth of all dynamic schemes.
3any bus standards are a)ailable.
-
7/24/2019 Program and Network Properties
88/94
Dynamic Networks A Switch 3odules
!n aswitch module has a inputs and b outputs. ! binary switch
has a 2.
#t is not necessary for a ( but usually a 2k( for some integer
k.
#n general( any input can be connected to one or more of the
outputs. @owe)er( multiple inputs may not be connected to the
same output.
4hen only one&to&one mappings are allowed( the switch is called a
crossar switch.
-
7/24/2019 Program and Network Properties
89/94
3ultistage Networks
#n general( any multistage network is comprised of a collection of a
switch modules and fi,ed network modules. The aswitch
modules are used to pro)ide )ariable permutation or other
reordering of the inputs( which are then further reordered by thefi,ed network modules.
! generic multistage network consists of a se9uence alternating
dynamic switches *with relati)ely small )alues for aand 'with
static networks *with larger numbers of inputs and outputs+. The
static networks are used to implement interstage connections *#SC+.
-
7/24/2019 Program and Network Properties
90/94
"mega Network
! 2 2 switch can be configured foro Straight&througho Crosso)ero %pper broadcast *upper input to both outputs+o /ower broadcast *lower input to both outputs+o *No output is a somewhat )acuous possibility as well+
4ith four stages of eight 2 2 switches( and a staticperfect shuffle for each of the four #SCs( a ' by '"mega network can be constructed *but not allpermutations are possible+.
#n general ( an n&input "mega network re9uires log 2nstages of 2 2 switches and n $ 2 switch modules.
-
7/24/2019 Program and Network Properties
91/94
EENG-630 - Chapter 2
-
7/24/2019 Program and Network Properties
92/94
5aseline Network
! baseline network can be shown to be topologically e9ui)alent to
other networks *including "mega+( and has a simple recursi)e
generation procedure.
Stage k *k -( '( + is an m m switch block *where m N $ 2k+
composed entirely of 2 2 switch blocks( each ha)ing two
configurations: straight through and crosso)er.
-
7/24/2019 Program and Network Properties
93/94
> > 5aseline Network
-
7/24/2019 Program and Network Properties
94/94
Crossbar Networks
! m n crossbar network can be used to pro)ide a constant latency
connection between de)icesF it can be thought of as a single stage switch.
Different types of de)ices can be connected( yielding different
constraints on which switches can be enabled.o 4ith m processors and n memories( one processor may be able to generate re9uests
for multiple memories in se9uenceF thus se)eral switches might be set in the same
row.
o or m m interprocessor communication( each P0 is connected to both an input
and an output of the crossbarF only one switch in each row and column can be
turned on simultaneously. !dditional control processors are used to manage the
crossbar itself.