Computer Science and Engineering Advanced Computer Architecture CSE 8383 April 17, 2008 Session 11.
Computer Science and Engineering
Copyright by Hesham El-Rewini
Advanced Computer Architecture
CSE 8383
April 4, 2006
Session 21
Contents
Message Passing Systems (Chapters 5 & 7)
Communication Patterns
Client/Server Systems
Clusters
Message Passing Mechanisms: Message Format
Message: an arbitrary number of fixed-length packets
Packet: the basic unit, containing the destination address. A sequence number is needed
A packet can be further divided into flits (flow control digits)
Routing information and the sequence number occupy the header flit
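The message format above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Packet`, `packetize`, `to_flits`, and `reassemble` are hypothetical, the last packet is zero-padded to keep packets fixed length, and flits are modeled as simple tuples.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    dest: int        # destination address (carried in the header flit)
    seq: int         # sequence number, used to put packets back in order
    payload: bytes   # fixed-length data (the last packet is zero-padded)


def packetize(message: bytes, dest: int, packet_size: int) -> List[Packet]:
    """Split a message into fixed-length packets, padding the last one."""
    packets = []
    for seq, start in enumerate(range(0, len(message), packet_size)):
        chunk = message[start:start + packet_size].ljust(packet_size, b"\x00")
        packets.append(Packet(dest, seq, chunk))
    return packets


def to_flits(packet: Packet, flit_size: int) -> List[tuple]:
    """The header flit carries routing info and sequence; data flits follow."""
    flits = [("HEAD", packet.dest, packet.seq)]
    for start in range(0, len(packet.payload), flit_size):
        flits.append(("DATA", packet.payload[start:start + flit_size]))
    return flits


def reassemble(packets: List[Packet], length: int) -> bytes:
    """Restore the message even if the packets arrive out of order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)[:length]
```

For example, a 14-byte message with `packet_size=8` yields two packets, and each packet with `flit_size=4` yields one header flit followed by two data flits.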
Message, Packets, Flits
[Figure: a message divided into packets; each packet is divided into flits, with the destination and sequence fields followed by data flits]
Store and Forward Routing
Packets are the basic units of information flow
Each node uses a packet buffer
A packet is transferred from source (S) to destination (D) through a sequence of intermediate nodes
Channel and buffer must be available
Wormhole Routing
Flits are the basic units of information flow
Each node uses a flit buffer
Flits are transferred from S to D through a sequence of intermediate routers in order (pipeline)
Can be visualized as a railroad train
Flits from different packets cannot be mixed up
Latency Analysis
L: packet length (in bits)
W: channel bandwidth (bits/sec)
D: distance (number of hops)
F: flit length (in bits)
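Store and Forward Latency
[Figure: timing diagram of store-and-forward routing over D hops; each hop takes $L/W$, and the total latency is $T_{SF}$]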
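Wormhole (WH) Latency
[Figure: timing diagram of wormhole routing over D hops; the flits are pipelined across the hops, and the total latency is $T_{WH}$]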
Latency Analysis
L: packet length (in bits)
W: channel bandwidth (bits/sec)
D: distance (number of hops)
F: flit length (in bits)
$T_{SF} = D \cdot \frac{L}{W}$
$T_{WH} = \frac{L}{W} + D \cdot \frac{F}{W} \approx \frac{L}{W}$ if $L \gg F$ (independent of D)
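A small numeric check makes the difference between the two latency formulas concrete. The sketch below simply evaluates the two expressions from this slide; the function names and the sample values (L = 1000 bits, W = 1000 bits/sec, F = 10 bits, D = 5 hops) are made up for illustration.

```python
def store_and_forward_latency(L, W, D):
    """T_SF = D * L/W: the whole packet is retransmitted on every hop."""
    return D * L / W


def wormhole_latency(L, W, D, F):
    """T_WH = L/W + D * F/W: only one flit time is paid per hop."""
    return L / W + D * F / W


# Hypothetical values: 1000-bit packet, 1000 bits/sec channel, 10-bit flits.
t_sf = store_and_forward_latency(L=1000, W=1000, D=5)
t_wh = wormhole_latency(L=1000, W=1000, D=5, F=10)
print(t_sf, t_wh)
```

With these values store and forward takes 5 seconds, while wormhole routing stays close to L/W = 1 second because the per-hop term D * F/W is small when L is much larger than F.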
Communication Patterns
Point to Point: 1 - 1
Multicast: 1 - n
Broadcast: 1 - all
Conference: n - n
Routing: Potential Problems
Deadlock:
When two messages each hold the resources required by the other in order to move, both messages are blocked (a cyclic dependency for resources)
A straightforward (but inefficient) solution is rerouting
Another solution is to avoid deadlock by imposing a strict monotonic order on network resources
The channel dependency graph (CDG) is a technique for developing a deadlock-free routing algorithm.
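The usual way a CDG is applied is to check it for cycles: if the graph of "channel X may wait for channel Y" dependencies has no cycle, a cyclic wait cannot form. The sketch below is a generic depth-first cycle check, not a procedure taken from the slides; `has_cycle` and the two example graphs are hypothetical and are not the CDGs from the course figure.

```python
from typing import Dict, List


def has_cycle(cdg: Dict[str, List[str]]) -> bool:
    """Return True if the channel dependency graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on the stack, finished
    color = {c: WHITE for c in cdg}

    def visit(channel: str) -> bool:
        color[channel] = GRAY
        for nxt in cdg.get(channel, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:             # back edge: cyclic dependency
                return True
            if state == WHITE and visit(nxt):
                return True
        color[channel] = BLACK
        return False

    return any(color[c] == WHITE and visit(c) for c in list(cdg))


# Hypothetical CDG in which four channels wait on each other in a ring.
ring = {"c1": ["c2"], "c2": ["c3"], "c3": ["c4"], "c4": ["c1"]}
# Hypothetical CDG in which channels are only requested in increasing order.
ordered = {"c1": ["c2"], "c2": ["c3"], "c3": ["c4"], "c4": []}
```

Here `has_cycle(ring)` is true and `has_cycle(ordered)` is false; the second graph illustrates the strict monotonic ordering of resources mentioned above.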
[Figure: A 4-node network and its CDGs. (a) A 4-node network (nodes 0 to 3, channels c1 to c8); (b) channel dependency graph (CDG); (c) CDG for a deadlock-free version of the network]
Livelock: A message goes around the network and never reaches its destination
It results from using adaptive routing algorithms with dynamic injection, where nodes inject their messages into the network at arbitrary times
Policies to avoid livelock are based on assigning a priority to each message injected into the network:
Messages are routed according to their priorities
Once a message is injected, only a finite number of messages will be injected with higher or equal priority.
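One simple policy that satisfies both conditions is age-based priority: each message receives an increasing injection stamp, and older messages win. The sketch below is an assumption chosen for illustration, not a policy named in the slides; `AgePriorityArbiter` and its methods are hypothetical.

```python
import heapq
import itertools


class AgePriorityArbiter:
    """Older messages (smaller injection stamp) are routed first."""

    def __init__(self):
        self._stamp = itertools.count()   # strictly increasing stamps
        self._heap = []

    def inject(self, message):
        # The stamp is the priority: a smaller stamp means higher priority.
        heapq.heappush(self._heap, (next(self._stamp), message))

    def next_message(self):
        # Route the highest-priority (oldest) waiting message.
        _, message = heapq.heappop(self._heap)
        return message
```

A message stamped k can only be outranked by the k messages injected before it, so the number of messages with higher priority is finite no matter how many are injected later.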
Starvation: A node suffers from starvation if it has a message to inject into the network but is never allowed to do so.
The simplest policy to avoid starvation is to allow each node to have an injection queue that competes with the queues of the incoming links to the same node.
The main disadvantage is that a node with a high message injection rate can slow down all the other nodes in the network.
Routing Efficiency
Two Parameters
Channel Traffic (number of channels used to deliver the message involved)
Communication Latency (distance)
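Both parameters are easy to compute for a multicast that is implemented as separate unicasts on a mesh. The sketch below assumes shortest-path routing (so a route uses as many channels as the Manhattan distance) and measures latency in hops, as the slide does; the function names and the node coordinates are hypothetical and are not taken from the course figures.

```python
def hops(src, dst):
    """Channels on a shortest mesh path between two (row, col) nodes."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def unicast_multicast_cost(src, dests):
    """Traffic and latency when a multicast is sent as separate unicasts."""
    distances = [hops(src, d) for d in dests]
    traffic = sum(distances)   # every unicast occupies its own channels
    latency = max(distances)   # the farthest destination finishes last
    return traffic, latency


# Hypothetical source and five destinations on a mesh.
source = (0, 0)
destinations = [(0, 1), (1, 2), (2, 2), (3, 0), (1, 1)]
print(unicast_multicast_cost(source, destinations))
```

A multicast pattern that shares channels between destinations lowers the traffic figure, possibly at the cost of a longer path to some destination.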
Multicast on a mesh (5 unicasts)
Traffic ?
Latency ?
Multicast on a mesh (multicast pattern 1)
Traffic ?
Latency ?
Multicast on a mesh (multicast pattern 2)
Traffic ?
Latency ?
Broadcast (tree structure)
3 2 3 4
2 1 2 3
1 1 2
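The node labels above are consistent with tree levels (hop counts) from a source in the bottom row, second column of a 3 x 4 mesh; the source itself appears to be unlabeled. Under that assumption, a breadth-first traversal reproduces the labels. The function name `broadcast_levels` is hypothetical.

```python
from collections import deque


def broadcast_levels(rows, cols, source):
    """Breadth-first broadcast on a mesh; returns each node's tree level."""
    level = [[None] * cols for _ in range(rows)]
    level[source[0]][source[1]] = 0
    frontier = deque([source])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and level[nr][nc] is None:
                level[nr][nc] = level[r][c] + 1
                frontier.append((nr, nc))
    return level


for row in broadcast_levels(3, 4, (2, 1)):
    print(row)
# [3, 2, 3, 4]
# [2, 1, 2, 3]
# [1, 0, 1, 2]
```

The largest level (4 here) is the broadcast latency in hops, and a spanning tree over 12 nodes uses 11 channels.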
Message Passing in PVM (Revisit)
[Figure: a sending task and a receiving task, each shown as user application, library, and daemon; steps 1 to 4 are on the sending side and steps 5 to 8 on the receiving side]
Client/Server Systems
[Figure: clients connected through an interconnection network to a server running server threads]
A Client Server Framework for Parallel Applications
[Figure: a client connected through an interconnection network to a master (supervisor), which coordinates Server 1 through Server n acting as slaves (workers)]
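The supervisor/worker structure in this framework can be sketched with threads and queues. This is only an illustration: threads stand in for the server nodes, the queues stand in for the interconnection network, and the squaring step is a placeholder computation. The names `master` and `worker` are hypothetical.

```python
import queue
import threading


def worker(tasks, results):
    """A server (worker): process requests until the master sends None."""
    while True:
        item = tasks.get()
        if item is None:
            break
        index, value = item
        results.put((index, value * value))   # placeholder computation


def master(values, n_workers=3):
    """The master (supervisor): split a client request among the workers.

    values must be a list; the results come back in the same order.
    """
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in enumerate(values):
        tasks.put(item)
    for _ in threads:
        tasks.put(None)                       # one stop signal per worker
    for t in threads:
        t.join()
    collected = dict(results.get() for _ in values)
    return [collected[i] for i in range(len(values))]
```

Calling `master([1, 2, 3, 4])` returns the squares in request order, regardless of which worker handled each item.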
Clusters
[Figure: cluster architecture; programming environment and tools sit on top of middleware, which spans several nodes, each with its own OS, M, C, P, and I/O, connected by an interconnection network]
Interconnection Networks in Clusters

| Interconnection Network | Data Rate | Switching | Routing |
|---|---|---|---|
| Ethernet | 10 Mbit/sec | Packet | Table-based |
| Fast Ethernet | 100 Mbit/sec | Packet | Table-based |
| Gigabit Ethernet | 1 Gbit/sec | Packet | Table-based |
| Myrinet | 1.28 Gbit/sec | Wormhole | Source-path |
| Quadrics | 7.2 Gbyte/sec | Wormhole | Source-path |
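Source-Path versus Table Based
[Figure: two switches, each with Port 0 through Port 7. Source-path routing: the packet header carries the path (605 in the figure). Table-based routing: the packet's Dest-id is looked up in a routing table that yields the output port (6 in the figure).]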
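The two routing styles named in this slide's title differ in where the route is stored: in the packet header or in the switches. The sketch below illustrates that difference on a made-up three-switch path. The switch names, the `links` and `tables` structures, the destination name `host9`, and the reading of a header as a list of output ports (6, 0, 5) are all assumptions for illustration.

```python
def source_path_route(links, start, header):
    """Each switch strips the leading port from the header and uses it."""
    header = list(header)
    switch, visited = start, []
    while header:
        port = header.pop(0)
        visited.append((switch, port))
        switch = links[(switch, port)]
    return visited


def table_route(links, tables, start, dest_id):
    """Each switch looks up the destination id in its own routing table."""
    switch, visited = start, []
    while switch is not None:
        port = tables[switch][dest_id]
        visited.append((switch, port))
        switch = links[(switch, port)]
    return visited


# Hypothetical topology: (switch, output port) -> next switch (None = host).
links = {("A", 6): "B", ("B", 0): "C", ("C", 5): None}
# Hypothetical per-switch routing tables: destination id -> output port.
tables = {"A": {"host9": 6}, "B": {"host9": 0}, "C": {"host9": 5}}

print(source_path_route(links, "A", [6, 0, 5]))
print(table_route(links, tables, "A", "host9"))
```

Both calls follow the same path here. With source-path routing the sender must know the whole route, while table-based routing keeps the header small and moves that knowledge into the switches.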