By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo...

24
By Matthew Johnson

Transcript of By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo...

Page 1: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

By Matthew Johnson

Page 2: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

HistoryImportanceComparison to TomasuloDemoDataflow Overview:◦ Static◦ Dynamic◦ HybridsProblems:◦ CAM◦ I-StructuresPotential ApplicationsConclusion

Page 3: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

1970- MIT Static Dataflow Processors: Tokens trigger next instruction1970- Kahn Dataflow Processors: Sequential Processes communicate witheach via messages through FIFO’s (Hybrid)

1967- Tomasulo

1980- MIT Dynamic Dataflow Processors

1964- 1st Scoreboard Computer: CDC 6600

1986- Restricted Out of Order Execution > RISC: Yale Pratt publishes paper on in which he simulates a restricted OO architecture and it outperforms RISC benchmarks

1985 – Exception Solutions in Out of Order Execution : Smith & Pleszkun devise precise exception handling

1990– IBM POWER1: 1st Out of Order Microprocessor (floating point only)1995– Out of Order goes Mainstream

1988– MIT Monsoon Processors

KEYData FlowHybridOut of Order

Page 4: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Naturally “latency tolerant” because of its ability to dynamically switch between threads

Eliminate processor switching problems

Eliminates Data dependencies

Naturally takes advantage of Inherent parallelism in code

Page 5: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications
Page 6: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Token- Output of an operation and its destination

Operation Packet- contains instructions, destination and 2 operands

Instruction Cell- The location in main memory that holds the operation packet

Page 7: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

MemoryMemory

ADDADDYY

11 33

MultipleMultipleXX

44 YY

MultipleMultipleZZ

44 YY

Distribution Network

Arbitration Network

Y=4Y=4

Y=4Y=4

Y=4Y=4

Y = 1 + 3X = 4 * YZ = 4 * X

Page 8: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

MemoryMemory

MultipleMultipleXX

44 44

MultipleMultipleZZ

44 44

Distribution Network

Arbitration Network

Y = 1 + 3X = 4 * YZ = 4 * X

Page 9: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

SWITCH node

T

T

F

F

T

T

F

F

T

F

execution

executionSWITCH SWITCH

SWITCH SWITCH

MERGE node

execution

executionX X

XX

Page 10: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

X

f g

T F

x

bni

nj

nk

SWITCH P

Page 11: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Describes dataflow graphData tokens propagate along the arcSingle Assignment Rule: variables can only be assigned onceExamples: VAL, LUCID, …

22 2

2

z y x

+

+

+

+

-

sqrt

Mean

StDev

3

3

÷

÷

Mean = (x + y + z)/3;StDev = SQRT( (x^2 + y^2 + z^2 )/3

– Mean^2);

Page 12: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Static

Dynamic

Page 13: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Allows at most 1 data token on the arc of data flowTo Ensure 1 token per arc, must use hand shakingCan only run 1 iteration of a loop at a timeStatic Pipeline:o Check instructions for

operandso Select subset of available

instructiono Update instructions that are

completeExample: MIT Static Dataflow Machines

*

data token

acknowledge signal

data arc

acknowledgement arc

sqrt

x y

z

n i

n j

32

Page 14: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Pros:Simple Model

Cons:Loops can not be run in parallelToken Traffic DoubleHand shaking wastes timeNo support for procedure calls

Page 15: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Allows multiple data tokens on the arc of data flowAllows multiple loop iterations at one timeAn extra tag is added to tokens and contains address context and loop numberInstruction Cells are changed to accept multiple tokens of different tagsOverall effect is the creation of reentrant sub-graphs Examples: MIT Tagged-Token Data Flow, Manchester Dataflow

Page 16: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Pros:Greater Performance( No handshaking,

multiple tokens/arc, loops can run in parallel)Supports procedure calls

Cons:Tokens need to be buffered because matching

tokens takes LOTS of time(Large Buffer Needed)No instruction locality (can’t use registers)

Page 17: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Threaded Dataflow- subgraphs with low degrees of parallelisms are converted into sequential thread

Monsoon- Replaces associative memory with ram◦ Uses eight stage pipeline to inject instructions and

execute enabled instructions

Page 18: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Lack of locality make current storage hierarchy ineffective

Matching tokens takes a large amount of time and requires buffering

Building CAMs large enough to hold dependencies of a real program

Data Structures are difficult to implement

Page 19: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Content Addressable MemoryRead- returns the address of the dataSize is small, RAM is 8X (DRAM: 64 MB CAM: 8MB –single chip )ExpensiveLarge footprintExcessive power

Page 20: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Main problem: every write to a structure requires the entire structure to be rewritten

Solution:◦ Define size of data

structure◦ Each element has a series

of status bits (states)◦ Can only write an element

once

Page 21: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Digital Signal Processing

Network routing

Graphics processing

Telemetry

Data warehousing

Everywhere Parallel computing is used

Page 22: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

Potential to unlock the inherent parallelism of the algorithm

Naturally eliminates data hazards

Current technologies in CAM are not advanced enough to machines based in RAM

Serious Problems still remain such as implementing data structures

Page 23: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications
Page 24: By Matthew Johnsonmeseec.ce.rit.edu/551-projects/spring2013/3-1.pdf · 2013. 5. 10. · `Demo `Dataflow Overview: Static Dynamic Hybrids `Problems: CAM I-Structures `Potential Applications

SILC, JURIJ , BORUT ROBIC, and THEO UNGERER, eds. "DATAFLOW ARCHITECTURES."DATAFLOW ARCHITECTURES. Jožef Stefan Institute. Web. 7 May 2013.

Wikipedia contributors. "DATAFLOW ARCHITECTURES." Wikipedia, The Free Encyclopedia. <http://en.wikipedia.org/wiki/Dataflow_architecture>.

Wikipedia contributors. "Out of Order Execution."Wikipedia, The Free Encyclopedia. <http://en.wikipedia.org/wiki/Out-of-order_execution>.

K. Pagiamtzis, A. Sheikholeslami, “Content-Addressable Memory (CAM) Circuits and Architectures: A Tutorial and Survey,” IEEE J. of Solid-state circuits. March 2006

Rajaraman, V. Elements of Parallel Computing. EE. New Delhi: PHI Learning, 1990. 86-108. Web. <http://books.google.com/books?id=MprnhSS1ucoC&pg=PA102&lpg=PA102&dq=data flow "acknowledgement“tokens&source=bl&ots=1akpb6Pd8T&sig=mpFUClFJ1ogBpJO3STs1bYZBS8Y&hl=en&sa=X&ei=kSyJUc23MIjM0wH_iYHoDA&ved=0CEQQ6AEwBA

Dennis, Jack, and David Misunas. "A preliminary architecture for a basic data-flow processor." ACM SIGARCH Computer Architecture News. 3.4 (1974): 126-132. Web. 7 May. 2013. <http://dl.acm.org/citation.cfm?id=642111>.

Najjar, Walid , Edward Lee, and Guang Gao . "Advances in the Dataflow Computational Model." Parallel Computing. 25. (1999): 1907-1929. Web. 7 May. 2013.