Delay-Optimal Technology Mapping by DAG Covering Yuji Kukimoto Robert K Brayton Prashant Sawkar...


Page 1: Delay-Optimal Technology Mapping by DAG Covering Yuji Kukimoto Robert K Brayton Prashant Sawkar Presented by Bret Victor, 4/5/00.

Delay-Optimal Technology Mapping by DAG Covering

Yuji Kukimoto, Robert K. Brayton, Prashant Sawkar

Presented by Bret Victor, 4/5/00

Page 2: Abstract

• An algorithm for minimal-delay library-based technology mapping
• Subject graphs can be mapped directly as DAGs, without decomposition into trees
• The algorithm runs in polynomial time
• Experiments show that it works

Page 3: Outline

• Review of standard library-based tech mapping
• FlowMap algorithm for FPGA tech mapping
• Application of FlowMap to library-based designs
• Extensions to the algorithm
• Experimental results

Page 4: Algorithmic tech mapping

• Decompose everything into NAND2s and inverters
• The decomposed circuit is the “subject graph”
• The decomposed library gates are the “pattern graphs”
• Cover the subject graph using pattern graphs
• Choose the cover that optimizes some cost, such as area or delay

Page 5: Tech mapping for area

• If the subject graph is a DAG, optimizing for minimum area is NP-hard
• But if the subject and pattern graphs are trees, an optimal cover can be found in linear time
• So, decompose the DAG into trees
• Cover each tree
• Glue the results together

Page 6: Tree decomposition

• Trees can’t have multiple-fanout nodes!
• Snip the DAG at multiple-fanout nodes to form trees

[Figure: two alternative decompositions, joined by “or”.]
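The snipping step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular mapper. It assumes the DAG is given as a hypothetical dict `fanins` that maps each gate to its fanin list, with primary inputs appearing only as fanins. Every primary output and every multiple-fanout gate becomes the root of its own tree.

```python
from collections import Counter

def snip_into_trees(fanins, outputs):
    """Split a DAG into trees by cutting at multiple-fanout nodes.

    fanins:  dict gate -> list of fanin names (primary inputs are not keys)
    outputs: iterable of primary output gates
    Returns a dict mapping each tree root to the set of gates in its tree.
    """
    fanout = Counter(f for ins in fanins.values() for f in ins)
    # Roots: primary outputs plus every gate that drives more than one gate.
    roots = set(outputs) | {g for g in fanins if fanout[g] > 1}
    trees = {}
    for root in roots:
        members, stack = set(), [root]
        while stack:
            g = stack.pop()
            members.add(g)
            for f in fanins[g]:
                # Primary inputs and other roots are leaves of this tree.
                if f in fanins and f not in roots:
                    stack.append(f)
        trees[root] = members
    return trees

# Hypothetical example: n1 drives both n2 and n3, so it is snipped off.
example = {"n1": ["a", "b"], "n2": ["n1", "c"], "n3": ["n1", "d"], "n4": ["n2", "e"]}
print(snip_into_trees(example, ["n3", "n4"]))
```

Because each tree is covered on its own, no library gate can span the snipped node `n1`.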

Page 7: Tech mapping for delay

• There is also a linear-time algorithm that optimizes delay while tree mapping
• Further work included loading effects and buffer trees
• But what about directly mapping DAGs for minimum delay?

Page 8: FPGA mapping

• The fundamental node in FPGAs is the lookup table (LUT)
• A LUT implements any function of up to k inputs (k depends on the FPGA technology)
• FPGA mapping to minimize area is NP-hard for k > 3
• FPGA mapping to minimize delay can be solved in polynomial time using the FlowMap algorithm

Page 9: FlowMap algorithm

• Maps the circuit directly as a DAG (no tree decomposition)
• Two steps:
  – Labeling
    • Visit nodes in topological order (input to output)
    • Clump nodes into a LUT such that delay is minimized
    • Label each node with its best clump and best delay
  – Clumping
    • Visit nodes in reverse topological order (output to input)
    • Create a LUT for each clump, then for the clumps of its fanins

Page 10: FlowMap: Labeling

• Primary inputs are labeled “0” (because they are available at time t = 0)
• At each intermediate node, investigate all cuts of size at most k and find the one that gives the best delay (smallest depth)
• Label the node with the best cut and its depth

$$\mathrm{depth}(i) = \min_{\text{cut } X:\ |X| \le k} \; \max_{x \in X} \, \mathrm{depth}(x) + 1$$
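The labeling rule can be tried out with a small Python sketch. FlowMap itself finds the best cut with a max-flow computation. The sketch below swaps in brute-force enumeration of k-feasible cuts, which gives the same depth labels on small graphs but is not the FlowMap procedure. The names `fanins`, `order`, and `label_depths` are made up for illustration, and every gate is assumed to have between 1 and k fanins.

```python
from itertools import product

def label_depths(fanins, order, k):
    """Return (depth, best_cut) for every gate.

    fanins: dict gate -> list of fanins (primary inputs are not keys)
    order:  gates in topological order (inputs toward outputs)
    k:      maximum number of LUT inputs
    """
    depth, best_cut, cuts = {}, {}, {}

    def cuts_of(n):
        if n not in fanins:              # primary input: available at t = 0
            depth.setdefault(n, 0)
            return [frozenset([n])]
        return cuts[n]

    for g in order:
        # Merge one cut per fanin, then keep merges with at most k leaves.
        merged = {frozenset().union(*combo)
                  for combo in product(*(cuts_of(f) for f in fanins[g]))}
        feasible = [c for c in merged if len(c) <= k]
        best_cut[g] = min(feasible, key=lambda c: max(depth[x] for x in c))
        depth[g] = max(depth[x] for x in best_cut[g]) + 1
        cuts[g] = feasible + [frozenset([g])]   # trivial cut, used by fanouts
    return depth, best_cut

# Hypothetical four-gate example.
example = {"n1": ["a", "b"], "n2": ["c", "d"], "n3": ["n1", "n2"], "n4": ["n3", "e"]}
depth, best_cut = label_depths(example, ["n1", "n2", "n3", "n4"], k=3)
print(depth["n4"], sorted(best_cut["n4"]))
```

With k = 3, gate `n4` absorbs `n3` and reaches depth 2; with k = 2 it cannot, and its depth is 3.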

Page 11: FlowMap

[Figure: example circuit DAG with primary inputs a through h.]

Page 12: FlowMap

[Figure: the example DAG with internal nodes numbered 1 through 10 above primary inputs a through h.]

Topological ordering of nodes

Page 13: FlowMap

[Figure: the example DAG with each of nodes 1 through 10 annotated with its optimum depth, either 1 or 2.]

Labeling optimum depths, using k = 3

Page 14: FlowMap

[Figure: the example DAG with depth labels and one cut for node 9 highlighted.]

The shown cut has a maximum fanin depth of 1. So node 9 gets labeled with “2”.

Page 15: FlowMap: Clumping

• Start at a primary output node and form a LUT to implement the cut that the node was labeled with
• Move to the cut’s fanin nodes, and use their labels to form LUTs
• Continue until you are down to primary inputs
• Repeat for each primary output
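The clumping walk is sketched below in Python. It assumes labeling has already stored the chosen cut of every node in a hypothetical dict `best_cut`; that name and `build_luts` are illustrative only.

```python
def build_luts(best_cut, outputs):
    """Walk back from the primary outputs, creating one LUT per needed node.

    best_cut: dict node -> collection of cut nodes chosen during labeling
              (primary inputs are not keys)
    outputs:  iterable of primary output nodes
    Returns a dict mapping each LUT root to the sorted list of its inputs.
    """
    luts, pending = {}, list(outputs)
    while pending:
        n = pending.pop()
        if n in luts or n not in best_cut:   # already built, or a primary input
            continue
        luts[n] = sorted(best_cut[n])
        pending.extend(best_cut[n])          # the cut's fanins need LUTs too
    return luts

# Hypothetical labels: n4's cut reaches past n3, so n3 never gets its own LUT.
labels = {"n1": {"a", "b"}, "n2": {"c", "d"},
          "n3": {"n1", "n2"}, "n4": {"n1", "n2", "e"}}
print(build_luts(labels, ["n4"]))
```

A node that lies inside the cones of several LUTs is implemented once in each of them, which is where duplication comes from.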

Page 16: FlowMap

[Figure: the labeled example DAG with the cut of node 10 highlighted.]

Start at last node (node 10) and turn 3-input cut into 3-input LUT

Page 17: FlowMap

[Figure: the labeled example DAG with a second LUT highlighted.]

Make another LUT for its fanin gate

Page 18: FlowMap

[Figure: the labeled example DAG with node 9 highlighted.]

Next primary output is node 9

Page 19: FlowMap

[Figure: the labeled example DAG with all LUTs drawn.]

And form the LUTs for its fanins. Note that nodes 6, 7, and 8 have been duplicated!

Page 20: Back to library-based design

• Ideas behind FlowMap’s labeling procedure can be used for library-based minimum-delay DAG mapping too
• Two main changes to FlowMap:
  – Instead of k-input cuts, look at library patterns that match at the node
  – Pin-to-pin delays of library patterns must be used instead of the unit delay assumed for an FPGA LUT

Page 21: Pattern matching

• Three types of pattern matches:
  – Standard match: one-to-one mapping of pattern graph nodes into subject graph nodes
  – Exact match: standard match, except a subject node covered by an intermediate pattern node cannot fan out to other nodes not covered by the pattern
  – Extended match: standard match, except not necessarily one-to-one
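The difference between a standard and an exact match can be expressed as a short check. This is a sketch under assumptions: the match is summarized by the hypothetical set `covered` of subject gates that the pattern covers (root included), and `fanouts` maps each subject node to the nodes it drives.

```python
def is_exact_match(covered, root, fanouts):
    """Return True if a standard match is also an exact match.

    covered: set of subject gates covered by the pattern, including the root
    root:    subject node matched by the pattern's output
    fanouts: dict subject node -> list of nodes it drives
    """
    for n in covered:
        if n == root:
            continue                     # the root may fan out freely
        for target in fanouts.get(n, []):
            if target not in covered:    # intermediate node escapes the pattern
                return False
    return True

# Hypothetical subject graph: x feeds y (inside the match) and z (outside it).
print(is_exact_match({"x", "y"}, "y", {"x": ["y", "z"]}))
```

A DAG mapper that accepts the match anyway must keep a separate copy of `x` for `z`, so accepting non-exact matches implies duplication.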

Page 22: Pattern matching

[Figure: example subject graph.]

Page 23: Pattern matching

[Figure: subject graph with node x marked, next to the pattern graph.]

This is a standard match, but not an exact match, because node x fans out to a node not covered by the pattern.

Page 24: Pattern matching

[Figure: subject graph with node x marked, next to a pattern graph with nodes a and b.]

This is an extended match, because pattern nodes a and b can both be mapped to node x in the subject graph.

Page 25: Pattern matching

• Conventional tree mapping requires exact matches
• FlowMap-based DAG mapping can use either standard or extended matches

Page 26: DAG library mapping

• Like FlowMap, step one is labeling
• Visit nodes in topological order
• For each node, find the best library gate to implement that node (the gate that minimizes total delay to the node)
• Label the node with the best gate and best delay information
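The labeling pass can be sketched in Python as follows. Pattern matching itself is not shown: the sketch assumes a hypothetical precomputed dict `matches` that lists, for each subject node, the library gates matching there, with a pin-to-pin delay for each leaf. All names are illustrative, and primary inputs are assumed to arrive at time 0.

```python
def label_arrivals(matches, order, inputs):
    """Label every subject node with its best gate and arrival time.

    matches: dict node -> list of (gate_name, [(leaf, pin_delay), ...])
    order:   subject nodes in topological order
    inputs:  primary inputs, assumed to arrive at time 0
    Returns (arrival, best) with best[node] = (gate_name, [leaves]).
    """
    arrival = {pi: 0 for pi in inputs}
    best = {}
    for n in order:
        candidates = []
        for gate, pins in matches[n]:
            # Output arrival = latest (leaf arrival + pin-to-pin delay).
            t = max(arrival[leaf] + d for leaf, d in pins)
            candidates.append((t, gate, pins))
        t, gate, pins = min(candidates, key=lambda c: c[0])
        arrival[n] = t
        best[n] = (gate, [leaf for leaf, _ in pins])
    return arrival, best

# Hypothetical matches using gates X (delay 3), W (delay 4), Y (delay 6).
matches = {
    "n1": [("X", [("a", 3), ("b", 3)])],
    "n2": [("W", [("n1", 4), ("c", 4)]), ("Y", [("a", 6), ("b", 6), ("c", 6)])],
    "n3": [("X", [("n2", 3), ("a", 3)])],
}
arrival, best = label_arrivals(matches, ["n1", "n2", "n3"], ["a", "b", "c"])
print(arrival["n3"], best["n2"])
```

At `n2` the larger gate Y wins: it arrives at 6, while W stacked on X arrives at 3 + 4 = 7.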

Page 27: DAG library mapping

[Figure: the example DAG with nodes 1 through 10 and primary inputs a through h.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Page 28: DAG library mapping

[Figure: the example DAG with each node labeled by its best gate and arrival time. Labels shown: W 4, X 3, X 3, W 4, Y 6, “Y 10 or W 10”, X 7, W 11, Z 14, Y 13.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Page 29: DAG library mapping

[Figure: the example DAG partway through labeling. Labels shown: W 4, X 3, X 3, W 4, Y 6, X 7, Z 14, Y 13. The values 6, 3, 7, and 0 are also annotated.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Page 30: DAG library mapping

• Step two is committing nodes to library gates
• Choose a primary output, and implement the gate that it was labeled with
• Go to the nodes that are the fanins of that gate, and implement their labeled gates
• Continue until down to primary inputs
• Repeat for each primary output
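The commit step can be sketched in Python, together with a report of which subject nodes end up duplicated. It assumes labeling left a hypothetical dict `best` that records, for each subject node, the chosen gate, its leaves, and the subject nodes that gate covers. These names are illustrative only.

```python
from collections import Counter

def commit_cover(best, outputs):
    """Instantiate the labeled gates reachable from the primary outputs.

    best:    dict node -> (gate_name, leaves, covered_subject_nodes)
    outputs: iterable of primary output nodes
    Returns (chosen, duplicated): the gates actually used, keyed by root,
    and the sorted subject nodes covered by more than one chosen gate.
    """
    chosen, pending = {}, list(outputs)
    while pending:
        n = pending.pop()
        if n in chosen or n not in best:     # done already, or a primary input
            continue
        chosen[n] = best[n]
        pending.extend(best[n][1])           # visit the chosen gate's fanins
    # A subject node covered by several chosen gates is duplicated.
    uses = Counter(s for _, _, covered in chosen.values() for s in covered)
    duplicated = sorted(s for s, count in uses.items() if count > 1)
    return chosen, duplicated

# Hypothetical labels: both outputs chose a gate that swallows n1.
best = {
    "n1": ("X", ["a", "b"], ["n1"]),
    "n2": ("Y", ["a", "b", "c"], ["n1", "n2"]),
    "n3": ("Y", ["a", "b", "d"], ["n1", "n3"]),
}
chosen, duplicated = commit_cover(best, ["n2", "n3"])
print(sorted(chosen), duplicated)
```

The gate labeled at `n1` is never instantiated, because no chosen gate uses `n1` as a leaf.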

Page 31: DAG library mapping

[Figure: the fully labeled example DAG. Labels shown: W 4, X 3, X 3, W 4, Y 6, X 7, Z 14, W 15, “Y 10 or W 10”, W 11.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Page 32: DAG library mapping

[Figure: the fully labeled example DAG, repeated as the next step of the slide build.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Page 33: DAG library mapping

[Figure: the fully labeled example DAG, repeated as the next step of the slide build.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Page 34: DAG library mapping

[Figure: the fully labeled example DAG with the committed gates drawn: Z, Y, X, W, X, W, W.]

Library (name and delay): W 4, X 3, Y 6, Z 7

Note that node 8 is duplicated!

Page 35: Complexity

• Finding all matches at a given node is O(p), where p is the number of nodes in the pattern graphs
• In step one, this is done once for every subject node, so step one is O(sp), where s is the number of subject nodes
• Step two only visits subject nodes once: O(s)
• The algorithm is O(sp)
• p is a constant defined by the library, so the algorithm is linear with respect to the number of subject nodes

Page 36: Comparison: DAG and tree mapping

• Subject graphs with multiple-fanout nodes must be snipped before tree mapping and reglued afterward. Thus, multiple-fanout points in the subject graph are completely preserved in the final mapping.
• DAG mapping does not mind multiple-fanout nodes, and can map across them in order to optimize the delay.

Page 37: Comparison: DAG and tree mapping

• In tree mapping, pattern nodes and subject nodes match one-to-one. No duplication is allowed.
• In DAG mapping, subject nodes may be duplicated in order to take advantage of fancy library patterns and minimize delay.

Page 38: Extensions

• Lehman-Watanabe mapping (mapping graphs, choice nodes, etc.) is compatible with this DAG mapping technique and can be used in conjunction with it
• A FlowMap-like method developed for finding the minimum cycle time for an FPGA design can be easily adapted for use on library-based designs

Page 39: Experiment

• SIS’s technology mapper was modified to do delay-optimal mapping using DAG covering
• Standard matches (not extended matches) were used

Page 40: Experiment

circuit    Delay             Area              CPU time
           tree     DAG      tree     DAG      tree     DAG
C2670      11.54    9.43     1552     2008     2.3      2.6
C3540      17.20    14.00    2075     2926     3.1      3.7
C5315      16.55    13.04    3687     4275     5.4      6.0
C6288      56.99    41.95    4107     9291     4.9      5.9
C7552      14.23    11.06    4983     6452     6.8      8.4

Results with a standard-sized library (lib2.genlib)
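To read the trade-off off the lib2.genlib numbers above, a few lines of Python can compute the relative change in delay and area when going from tree to DAG mapping. The helper name `relative_change` is made up for illustration; the values are copied from the table.

```python
def relative_change(tree, dag):
    """Fractional change going from the tree result to the DAG result."""
    return (dag - tree) / tree

# (tree, DAG) pairs copied from the lib2.genlib results above.
delay = {"C2670": (11.54, 9.43), "C3540": (17.20, 14.00), "C5315": (16.55, 13.04),
         "C6288": (56.99, 41.95), "C7552": (14.23, 11.06)}
area = {"C2670": (1552, 2008), "C3540": (2075, 2926), "C5315": (3687, 4275),
        "C6288": (4107, 9291), "C7552": (4983, 6452)}

for name in delay:
    d = relative_change(*delay[name])
    a = relative_change(*area[name])
    print(f"{name}: delay {100 * d:+.1f}%, area {100 * a:+.1f}%")
```

Every circuit in this table gets a lower delay and a larger area under DAG mapping, which is consistent with duplication being the price of mapping across multiple-fanout nodes.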

Page 41: Experiment

circuit    Delay             Area              CPU time
           tree     DAG      tree     DAG      tree     DAG
C2670      27       18       2998     4568     2.0      2.0
C3540      42       30       4007     6640     2.7      2.8
C5315      46       33       6817     8352     4.6      4.8
C6288      125      120      7782     7121     4.3      4.4
C7552      39       28       9552     11149    6.0      6.3

Results with a tiny library (44-1.genlib) with 7 gates

Page 42: Experiment

circuit    Delay             Area              CPU time
           tree     DAG      tree     DAG      tree     DAG
C2670      22       10       2314     3943     92.2     159.7
C3540      28       13       2983     6148     128.2    255.6
C5315      31       15       5115     6685     220.4    341.5
C6288      125      42       7694     14775    155.1    229.5
C7552      27       13       7062     13267    248.7    491.0

Results with a big “rich” library (44-3.genlib) with 625 gates

Page 43: Conclusion

• Delay-optimal library-based technology mapping of DAGs can be solved in linear time
• Experiments confirm that using DAGs instead of trees gives a significant improvement in delay