Clock Gating Methodology
Uploaded by girish-babu
Clock Gating Methodology for Power and CTS QoR
2
Agenda
• Objective
• Introduction to clock gating
• Clock gating methodology
  – Overview
  – RTL synthesis
  – Physical synthesis
  – Clock tree synthesis
  – Summary of recommendations
• Sample results
• Planned enhancements
• Summary
3
Objective
• Describe the clock gating methodology to meet target
  – Skew
  – Insertion delay
  – Power
• Discuss recommendations during
  – RTL synthesis using Design Compiler
  – Physical synthesis using IC Compiler or Physical Compiler
  – Clock tree synthesis using IC Compiler or Astro
5
What is Clock Gating?
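• Register banks disabled during some clock cycles
  – Typical implementation uses multiplexers
  – Clock gating cell replaces multiplexers
[Figure: a register (D, Q) clocked directly by CLK, with EN controlling the data (high clock activity), compared with a register clocked by gclk from a clock gate driven by EN and CLK (low clock activity)]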
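The equivalence between the multiplexer style and the clock-gated style can be illustrated with a small behavioral model. This is a Python sketch rather than tool code; the function name `simulate` and the stimulus values are assumptions made for illustration. It shows that both styles produce the same register contents while the gated register sees far fewer clock edges.

```python
def simulate(enables, data, gated):
    """Return (q_history, clock_edges_seen_by_register) for one register."""
    q = 0
    edges = 0
    history = []
    for en, d in zip(enables, data):
        if gated:
            # Clock gating: the register only receives a clock edge when EN is 1.
            if en:
                edges += 1
                q = d
        else:
            # Multiplexer style: the register is clocked every cycle and
            # recirculates its own output when EN is 0.
            edges += 1
            q = d if en else q
        history.append(q)
    return history, edges


# Hypothetical stimulus: the bank is enabled in 3 of 8 cycles.
enables = [1, 0, 0, 1, 0, 0, 0, 1]
data = [5, 6, 7, 8, 9, 1, 2, 3]

mux_q, mux_edges = simulate(enables, data, gated=False)
cg_q, cg_edges = simulate(enables, data, gated=True)
print(mux_q == cg_q, mux_edges, cg_edges)
```

With this stimulus the two styles hold identical values, but the gated register is clocked 3 times instead of 8, which is where the dynamic power saving comes from.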
6
Benefits of Clock Gating
• Dynamic power savings
  – With low toggle rate on clock pin, internal power of registers is reduced
  – Gated by the enable signal, the clock network has less switching activity and consumes less switching power
• Area savings
  – Eliminating multiplexers saves area
• Easy to implement
  – No RTL code change is required
  – Clock gating is automatically inserted by the tool
  – Technology independent
8
Clock Gating Methodology Overview
Tool versions: Design Compiler X-2005.09, IC Compiler v1.1, Physical Compiler X-2005.09, Astro X-2005.09
Design Compiler (input RTL)
  1. Insert clock gating
  2. Compile
Physical Compiler
  1. Merge clock gates
  2. Placement and placement optimization
Astro
  1. Replicate clock gates
  2. Clock tree synthesis
  3. Detail routing
IC Compiler (unified flow in IC Compiler)
  1. Merge clock gates
  2. Placement and placement optimization
  3. Replicate clock gates [BETA]
  4. Clock tree synthesis
  5. Detail routing
10
Clock Gating Methodology During RTL Synthesis
Input RTL
  1. Read in Verilog: read_verilog
  2. Define the clocks: create_clock
  3. Set the clock gating style: set_clock_gating_style
  4. Insert clock gating: insert_clock_gating
  5. Compile: compile
RTL Synthesis
11
Specify Clock Gating Options
• Use the set_clock_gating_style command
• Maximum fanout
  – This value is the maximum fanout of each clock gating element
  – By default, the fanout is unlimited
• Minimum bitwidth
  – This is the minimum bitwidth of register banks that will be gated
  – By default, the minimum bitwidth is 3
  – No area or power benefit with register banks with bitwidth less than 3
RTL Synthesis
12
Insert Clock Gating During RTL Synthesis
• Use the insert_clock_gating command
  – The -global option looks across hierarchical boundaries for the common enable
RTL Synthesis
[Figure: Top contains Module A and Module B, with clk, EN, and signals a, b, d1, d2. Regular clock gating: a separate clock gate (CG) for each module. Hierarchical clock gating: one CG shared across the modules, with extra ports added.]
13
Measure the Quality of Inserted Clock Gating: Report Power and Clock Gating
• Use the report_power command
    Cell Internal Power  = 160.6544 mW  (61%)
    Net Switching Power  = 102.5581 mW  (39%)
                           ---------
    Total Dynamic Power  = 263.2125 mW (100%)
    Cell Leakage Power   =   3.0961 mW
• Use the report_clock_gating command
    Clock Gating Summary
    ------------------------------------------------------------
    | Number of Clock gating elements | 222                    |
    | Number of Gated registers       | 167512 (99.92%)        |
    | Number of Ungated registers     | 137 (0.08%)            |
    | Total number of registers       | 167649                 |
    ------------------------------------------------------------
RTL Synthesis
15
Clock Gating Considerations
• Clock gate styles
• Enable signal timing
  – Ensure that you meet the setup and hold time on the enable pin of clock gate
• Impact of clock gate fanout on
  – Power and enable pin timing
  – Clock tree structure
RTL Synthesis
16
Clock Gate Styles
• Integrated, latch-based clock gate (ICG) is recommended
• Discrete, latch-based or latch-free (simple AND or OR-AND gate) clock gates are also supported
  – Discrete clock gates are not recommended (details on next slide)
• Latch-based clock gates prevent a glitch on the enable from being propagated to the gated clock
[Figure: latch-based clock gate with inputs EN and CLK and output GCLK, with CLK, EN, and GCLK waveforms showing no glitches on the gated clock]
RTL Synthesis
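Why the latch matters can be seen in a small waveform model. The following Python sketch assumes an AND-type gate for rising-edge registers, with a latch that is transparent while CLK is low; the function names and the sample waveforms are illustrative assumptions, not library behavior.

```python
def gated_clock(clk, en, use_latch):
    """Return the GCLK waveform for sampled CLK and EN waveforms."""
    latched = 0
    out = []
    for c, e in zip(clk, en):
        if use_latch:
            if c == 0:
                latched = e      # latch is transparent while CLK is low
            gate = latched       # value is held while CLK is high
        else:
            gate = e             # latch-free: EN goes straight to the AND gate
        out.append(c & gate)
    return out


def rising_edges(wave):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)


# Hypothetical waveforms: EN glitches at sample 3 (CLK high),
# then is properly asserted during the next low phase.
clk = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
en = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]

latch_free = gated_clock(clk, en, use_latch=False)
latch_based = gated_clock(clk, en, use_latch=True)
print(rising_edges(latch_free), rising_edges(latch_based))
```

In this model the latch-free gate produces an extra, spurious clock pulse from the glitch, while the latch-based gate produces only the one intended pulse.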
17
Integrated Versus Discrete Clock Gating
RTL Synthesis
Integrated clock gating is recommended
Integrated clock gate (EN, CLK to GCLK)
– No clock skew between latch and AND gate
– Timing analysis and CTS handle the clock gate automatically
– Setup and hold check modeled in library
– Easy to use in the flow
Discrete clock gate (EN, CLK to GCLK)
– Ensure minimum skew between latch and AND gate
– Specify latch clock pin as a non stop pin for CTS
– Specify the setup and hold time
– This adds complexity to the flow
18
Enable Signal Timing
• Setup time on the enable pin of clock gate
• Synthesis assumes that the clock signal arrives at all registers and clock gates at the same time (within skew)
• In the actual clock tree, the clock signal reaches the clock gating cell earlier than it reaches the registers
• Timing constraints on the enable signals need to be adjusted
Note: The closer the clock gating cell is to the registers, the less constrained the enable signal
[Figure: CLK driving a clock gate (CG) with enable EN, annotated with the clock delay to the clock gate and the larger clock delay to the registers]
RTL Synthesis
19
Impact of Clock Gate Fanout
• Clock gate fanout is determined by
  – The -max_fanout option of the set_clock_gating_style command in Design Compiler
  – By default, the fanout is unlimited
• Impact of clock gate fanout on
  – Power and enable pin timing
  – Clock tree structure
RTL Synthesis
20
Impact of Clock Gate Fanout on Power and Timing
RTL Synthesis
Large max fanout (one ICG driving all the registers)
– Fewer clock gating cells
– Better power reduction
– More constrained enable
Small max fanout (several ICGs, each driving part of the registers)
– Easier to meet enable pin timing
– Power might be affected
21
Impact of Clock Gate Fanout on Clock Tree Structure
RTL Synthesis
Large max fanout
– Unbalanced clock structure
– Depending on design skew requirement, may need processing for CTS QoR
Small max fanout
– More balanced clock structure
– Easier to meet CTS QoR
[Figure: example ICG fanouts of 108, 8, 300, and 60 with a large max fanout, compared with ICG fanouts such as 8, 60, 30, 30, 27, and 27 with a small max fanout]
22
Impact of Clock Gate Fanout Summary
RTL Synthesis
• By default, max fanout is unlimited
  – Results in best power savings and reasonable CTS QoR
• If CTS QoR is a higher priority,
  – Make your clock structure as balanced as possible
    set_clock_gating_style -minimum_bitwidth value \
        -max_fanout value
    - Use similar value for min_bitwidth and max_fanout
    - Balance fanout of each clock gate
    - Eliminate small fanout
    - Select the value based on your design
Experiments have shown that using a balanced fanout of 128 or 256 results in improved CTS QoR
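The effect of the minimum bitwidth and maximum fanout settings on the resulting clock structure can be explored with a rough planning model. This Python sketch is not how Design Compiler partitions register banks; the even split within a bank, the function name `plan_clock_gates`, and the bank sizes are assumptions made for illustration.

```python
import math


def plan_clock_gates(bank_sizes, min_bitwidth=3, max_fanout=None):
    """Return (fanout of each clock gate, number of registers left ungated).

    Assumption: a bank narrower than min_bitwidth is not gated, and a bank
    wider than max_fanout is split evenly across the fewest gates possible.
    """
    fanouts = []
    ungated = 0
    for size in bank_sizes:
        if size < min_bitwidth:
            ungated += size
            continue
        n_gates = 1 if max_fanout is None else math.ceil(size / max_fanout)
        base, extra = divmod(size, n_gates)
        fanouts += [base + 1] * extra + [base] * (n_gates - extra)
    return fanouts, ungated


# Hypothetical register banks (sizes in bits).
banks = [108, 8, 300, 60, 2]

unlimited, ungated_default = plan_clock_gates(banks)
limited, ungated_limited = plan_clock_gates(banks, max_fanout=30)
print(unlimited, len(limited), max(limited), min(limited))
```

With unlimited fanout the model yields four gates with fanouts from 8 to 300. With a maximum fanout of 30 it yields 17 gates whose fanouts range from 8 to 30, which is a more balanced structure at the cost of more clock gating cells.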
24
Clock Gating Usage During Placement Optimization
• Large or unlimited fanout
  – By default, no group bounds are created for the clock gate and its fanout during placement
    - Avoid congestion around the clock gate
    - You will get better overall timing QoR: placement of the registers is based on timing and is not constrained by location of clock gate
• Small fanout
  – To keep the clock gate and its register fanout together during placement, use
    set physopt_disable_auto_bound_for_gated_clock false
    - Helps meet timing of the enable pin
Physical Synthesis
25
Optimizing the Clock Structure in a Gate-Level Design
• Consider the following scenarios:
  – Clock gate insertion done during RTL synthesis with small fanout
  – Gate-level netlist with clock gates from a third party and with small clock gate fanout
• To improve power, you can
  – Optimize or minimize the clock gates in your design
    - Run merge_clock_gates on your design
Physical Synthesis
26
Merging Clock Gates
Physical Synthesis
Gate-level design
  1. Identify clock gates: identify_clock_gates (only required in a Verilog-based flow)
  2. Merge clock gates: merge_clock_gates (merges clock gates that share a common enable)
  3. Placement optimization
  4. Clock tree synthesis
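Merging clock gates that share a common enable can be pictured as a grouping step. The Python sketch below is a conceptual model only, not the algorithm behind merge_clock_gates; the tuple layout, the function name, and the instance names are assumptions made for illustration.

```python
from collections import defaultdict


def merge_by_enable(gates):
    """Merge gates with the same (clock, enable) into one gate.

    gates: list of (name, clock, enable, registers).
    The first gate in each group is kept and takes over all registers.
    """
    groups = defaultdict(list)
    for name, clock, enable, regs in gates:
        groups[(clock, enable)].append((name, regs))
    merged = []
    for (clock, enable), members in groups.items():
        keep = members[0][0]
        all_regs = [r for _, regs in members for r in regs]
        merged.append((keep, clock, enable, all_regs))
    return merged


# Hypothetical netlist: cg_a and cg_b share clk and en1.
gates = [
    ("cg_a", "clk", "en1", ["r0", "r1"]),
    ("cg_b", "clk", "en1", ["r2"]),
    ("cg_c", "clk", "en2", ["r3", "r4"]),
]
merged = merge_by_enable(gates)
print(merged)
```

Three small gates become two, and the surviving en1 gate has a larger fanout, which is the intended effect when the starting netlist has many small-fanout gates.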
28
Prepare the Clock Structure for CTS
Complex clock gating presents a challenge for CTS. You can
– Insert “always enabled” clock gates
  - Add “always enabled” clock gates to create a more balanced tree
– Replicate clock gates
Clock Tree Synthesis
[Figure: a clock structure with ICG fanouts of 108, 25, 8, 300, and 60, shown before and after processing; after replication the large fanouts are split across ICGs with fanouts such as 31, 28, 34, and 28]
Creating More Balanced Clock Structures During RTL Synthesis
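– To enable, use
  set power_cg_all_registers true
– Also set the following variable
  set power_remove_redundant_clock_gates false
RTL Synthesis
[Figure: registers gated by ICGs with enables EN1 and EN2, shown before and after an additional ICG with its enable tied active high is inserted]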
30
What is Replicate Clock Gates?
• Balances fanout by fixing DRC at the output of the ICG
• Same engine used for clustering in clock tree synthesis and clock gate replication
• Adds buffers to drive registers that are not gated
Clock Tree Synthesis
[Figure: before replication, ICG fanouts of 108, 25, and 25; after replication, ICG fanouts of 31, 20, 25, 25, 32, and 25]
31
What Does Replicate Clock Gates in Astro and IC Compiler Do?
• Replicates clock gate with new instances using the same reference cell
• Balances the fanout of clock gates based on design rule constraints
• Considers the location of registers
• In Astro, marks the output net of the clock gate as “synthesized”
  – Astro CTS does not modify the net
  – IC Compiler CTS checks the net for a DRC violation, but does not modify the net if it is DRC clean
• Inserts buffers to drive registers that are not gated
• The number of clock gates increases
  – Clock gates are larger than clock buffers and consume more power
  – Impact on power and area
Clock Tree Synthesis
32
When to Replicate Clock Gates?
Clock Tree Synthesis
Replicate clock gates only when needed:
  1. Start from the placed design and run clock tree synthesis
  2. Meet target skew?
     – Yes: continue to detail routing
     – No: go to step 3
  3. Unbalanced clock structure?
     – Yes: replicate clock gates, then run clock tree synthesis again
     – No: check other factors
33
Prerequisites for Replicating Clock Gates in Astro
1. Ensure that you have logically equivalent cells (LEQs) in the reference library
   – This allows the sizing of ICGs
2. Set the DRC constraints
   – Use the astClockOptions command
3. To enable the insertion of buffers to drive registers that are not gated, use the following command:
   axSetIntParam "acts" "push down clock ports" 1
4. If you want to prevent the tool from using certain ICG cells
   – Define the design LEQs (see the appendix for details)
Clock Tree Synthesis
34
Prerequisites for Replicating Clock Gates in IC Compiler
1. Ensure that you have logically equivalent cells (LEQs) in the reference library
   – This allows the sizing of ICGs
2. Set the DRC constraints
   – Use the set_clock_tree_options command
3. To enable insertion of buffers to drive registers that are not gated, set the following variable:
   set cts_push_down_buffer true
4. If you want to prevent the tool from using certain ICG cells, set dont_use on the cells
Clock Tree Synthesis
35
Using astSplitClockNet in Astro
– File contains either
  - Instance names of the cells to be replicated
  - Net names (all fanout on specified nets are processed)
astSplitClockNet
setFormField "Split Clock Net" "Clock Gated Cells File Name" "split.txt"
formOK "Split Clock Net"
Clock Tree Synthesis
36
Using split_clock_net in IC Compiler
split_clock_net -objects object_list
    -gate_sizing
    -gate_relocation
– The object_list is a list of instances or nets whose fanout is to be replicated
– The -gate_sizing and -gate_relocation options enable sizing or relocation of ICGs
Clock Tree Synthesis
37
Creating Balanced Clock Fanout at RTL Versus Replicate Clock Gates Before CTS
Balanced Clock Fanout at RTL
– When? Insert clock gating at RTL synthesis.
– Why? CTS QoR is a priority. Enable pin timing is a priority.
– Based on: Clock gate fanout
Replicate Clock Gates
– When? Replicate clock gates before CTS.
– Why? Selected maximum fanout at RTL synthesis for maximum power savings. Need to preprocess clock structure to meet target skew.
– Based on: DRC at output of clock gate (includes input capacitance of registers and net capacitance). Clustering based on placement location.
39
Recommendations for RTL Synthesis
– Select the maximum fanout based on your design priority
  - Large fanout gives you more power savings
  - Balanced fanout gives good CTS QoR
– Use integrated, latch-based clock gating cells
40
Recommendations for Physical Synthesis/CTS
• Physical synthesis
  – Use group bounds only when the maximum fanout is small
• Clock tree synthesis
  – Replicate clock gates only if necessary
  – Use DRC constraints to control the number of replicated clock gates
42
Sample Results: Design 1
Achieved target skew with replication of clock gates
Design details
– 90nm, 160MHz clock, 181K instances, 37 macros
– Target skew: 150ps
– Total power without clock gating: 48mW
Flow highlights
– RTL synthesis: insert clock gating; no max fanout constraint (default: unlimited); insert always active clock gating cells
– Physical synthesis: no group bounds
– Clock tree synthesis: with replication of clock gates
Results
– Final skew: 141ps
– Final power: 27mW
*See sample scripts in the appendix
43
Sample Results: Design 2
Achieved target skew without replication of clock gates
Design details
– 90nm, 85MHz clock, 39K instances, 1 macro
– Target skew: 100ps
– Total power without clock gating: 21mW
Flow highlights
– RTL synthesis: insert clock gating; no max fanout constraint (default: unlimited); insert always active clock gating cells
– Physical synthesis: no group bounds
– Clock tree synthesis: no replication of clock gates
Results
– Final skew: 91ps
– Final power: 16mW
*See sample scripts in the appendix
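The two sets of sample results can be compared as relative savings. This short Python sketch only does arithmetic on the figures quoted for Design 1 and Design 2; it assumes that the "total power without clock gating" and "final power" figures are directly comparable, which the slides do not state explicitly.

```python
def power_saving_percent(before_mw, after_mw):
    """Relative power reduction, in percent of the starting power."""
    return 100.0 * (before_mw - after_mw) / before_mw


def skew_margin_ps(target_ps, final_ps):
    """Positive margin means the final skew is inside the target."""
    return target_ps - final_ps


design1_saving = power_saving_percent(48, 27)
design2_saving = power_saving_percent(21, 16)
design1_margin = skew_margin_ps(150, 141)
design2_margin = skew_margin_ps(100, 91)
print(round(design1_saving, 1), round(design2_saving, 1),
      design1_margin, design2_margin)
```

Under that assumption, Design 1 saves about 44% and Design 2 about 24% of the power, and both designs meet their skew target with 9ps to spare.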
45
Planned Enhancements for Clock Gating Methodology
• Astro and IC Compiler
  – Improved QoR with clock gating
    - Create a more balanced clock structure before doing CTS
    - Create a clock tree with equal levels of logic to each sink
• IC Compiler only
  – Use clock gate optimization to optimize the timing of the enable pin after CTS
47
Summary
• Understand the power and CTS requirements of your design
• Choose the clock gating methodology based on your design requirements
  – Use integrated clock gating
  – Process the clock structure based on your CTS and power requirements
    - Select the right fanout of clock gates during RTL synthesis
    - Use merge and replication of clock gates only if necessary
48
Appendix
• Sample scripts
• Summary of clock gating methodologies
• Overview of clock gating methodology using ASCII interchange format
• How to handle enable signal timing
• Equivalence checking in Formality
• Clock gating and design-for-test
• Details on replicate clock gates
• Additional considerations with discrete clock gating
49
Sample DC Script
#Set clock gating options, max_fanout default is unlimited
set_clock_gating_style -sequential_cell latch \
    -positive_edge_logic {integrated} \
    -control_point before \
    -control_signal scan_enable
#Create a more balanced clock tree by inserting “always enabled” ICGs
set power_cg_all_registers true
#Keep the “always enabled” ICGs (see Creating More Balanced Clock Structures During RTL Synthesis)
set power_remove_redundant_clock_gates false
read_db design.gtech.db
current_design top
link
source design.cstr.tcl
#Insert clock gating
insert_clock_gating
compile
#Generate a report on clock gating inserted
report_clock_gating
50
Sample IC Compiler Script
#Open the Milkyway design
open_mw_lib design_lib.mw
open_mw_cel top
current_design top
link
#Placement & placement optimization
place_opt
#Set clock tree options
set_clock_tree_options -clock_tree Clk \
    -max_capacitance 0.3 \
    -max_transition 0.3
#Replicate clock gates
split_clock_net -object_list "*latch*" -gate_sizing -gate_relocation
#Clock tree synthesis and optimization
clock_opt
51
Sample Astro Script
#Open the Milkyway design
geOpenLib
setFormField "Open Library" "Library Name" "design.mw"
formOK "Open Library"
geOpenCell
setFormField "Open Cell" "Cell Name" "top"
formOK "Open Cell"
#Set clock tree options
astClockOptions
setFormField "Clock Common Options" "Maximum Transition Delay" "0.3"
setFormField "Clock Common Options" "Maximum Load Capacitance" "0.3"
formOK "Clock Common Options"
#Replicate clock gates
astSplitClockNet
setFormField "Duplicate Clock Gated Cells" "Clock Gated Cells File Name" "split.lst"
formOK "Duplicate Clock Gated Cells"
#Clock tree synthesis
astCTS
formOK "Clock Tree Synthesis"
52
Format of file for astSplitClockNet
• Line separated list of instances or net names
• Allows wildcard “.*”
• Example:
  cg_latch_inst_1
  cg_latch_inst_2
  cg_latch_inst_3
53
Design LEQs in Astro
• Define design LEQs
  astLoadDesignLEQ file_name
  – Example:
    cell1 cell2
    cell2 cell3
    cell4 cell5
  – cell1, cell2, and cell3 are in the same class
  – cell4 and cell5 are in the same class
• Clear/dump design LEQs
  – astClearDesignLEQ
  – astDumpDesignLEQ
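The way pairwise entries combine into classes in the example (cell1, cell2, and cell3 end up together because cell2 appears in two pairs) can be reproduced with a small union-find routine. This Python sketch illustrates the grouping only; it is not Astro's implementation, and the function name `leq_classes` is an assumption.

```python
def leq_classes(pairs):
    """Group cells into equivalence classes from pairwise LEQ entries."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for cell in parent:
        classes.setdefault(find(cell), set()).add(cell)
    return sorted(sorted(members) for members in classes.values())


# The example file contents from the slide, one pair per line.
leq_file = """cell1 cell2
cell2 cell3
cell4 cell5"""
pairs = [tuple(line.split()) for line in leq_file.splitlines()]
classes = leq_classes(pairs)
print(classes)
```

The result contains two classes, one with cell1, cell2, and cell3 and one with cell4 and cell5, matching the description on the slide.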
54
Summary of Clock Gating Methodologies
Unlimited Clock Fanout at RTL
– When? Insert clock gating at RTL synthesis.
– Why? Power is a priority. CTS QoR, enable pin constraints more flexible.
– Based on: Clock gate fanout
Balanced Clock Fanout at RTL
– When? Insert clock gating at RTL synthesis.
– Why? CTS QoR is a priority. Enable pin timing is a priority.
– Based on: Clock gate fanout
Replicate Clock Gates
– When? Replicate clock gates before CTS.
– Why? Selected maximum fanout at RTL synthesis for maximum power savings. Need to preprocess clock structure to meet target skew.
– Based on: DRC at output of clock gate (includes input capacitance of registers and net capacitance). Clustering based on placement location.
55
Clock Gating Methodology Overview Using ASCII Interchange Format (Verilog)
Design Compiler (input RTL)
  1. Insert clock gating
  2. Compile
Physical Compiler
  1. Identify clock gating cells
  2. Merge clock gates
  3. Placement and placement optimization
Astro
  1. Replicate clock gates (astSplitClockNet)
  2. Clock tree synthesis
  3. Detail routing
  4. Skew analysis
IC Compiler
  1. Identify clock gating cells
  2. Merge clock gates
  3. Placement and placement optimization
  4. Replicate clock gates [BETA] (split_clock_net)
  5. Clock tree synthesis
  6. Detail routing
  7. Skew analysis
56
How to Handle Enable Signal Timing
• Estimate delay of clock tree after clock gating cell before synthesis to avoid timing problems later
  – It can be modeled through the clock gate setup check
    set_clock_gating_style -setup (ideal_setup + Δ)
    propagate_constraints -gate_clock
  – It can also be modeled by specifying a clock latency for the clock and then a modified clock latency for all the clock gate clock pins
    set_clock_latency 1.7 CLK
      This is the delay seen at the input of any ungated register
    set_clock_latency 1.1 $ICGClkInputPins
      This is the delay seen at the input of the clock gates
    set_clock_latency 1.7 $ICGClkOutputPins
      This is the delay seen at the input of the gated registers
[Figure: CLK driving a clock gate (CG) and then the registers, annotated with the clock delay to the clock gate and the larger clock delay to the registers]
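How much the earlier clock arrival at the clock gate tightens the enable path can be estimated with a simple setup-slack formula. This Python sketch uses the latencies 1.7 and 1.1 from the commands above; the clock period, clock-to-Q delay, enable logic delay, and setup time are hypothetical values chosen for illustration, and the formula is a simplified single-cycle setup check rather than a full static timing analysis.

```python
def enable_setup_slack(period, reg_latency, icg_latency,
                       clk_to_q, logic_delay, icg_setup):
    """Setup slack at the ICG enable pin for an enable launched by a register."""
    arrival = reg_latency + clk_to_q + logic_delay   # enable arrives at the ICG
    required = period + icg_latency - icg_setup      # next clock edge at the ICG
    return required - arrival


# Hypothetical path numbers (ns); only the latencies come from the slide.
common = dict(period=5.0, reg_latency=1.7, clk_to_q=0.2,
              logic_delay=3.0, icg_setup=0.1)

# Synthesis view: clock gate and registers see the same latency.
ideal_slack = enable_setup_slack(icg_latency=1.7, **common)
# Modeled view: the clock reaches the clock gate earlier (1.1 versus 1.7).
modeled_slack = enable_setup_slack(icg_latency=1.1, **common)
lost_margin = ideal_slack - modeled_slack
print(round(ideal_slack, 3), round(modeled_slack, 3), round(lost_margin, 3))
```

The enable path loses 0.6 of slack, which is the latency difference between the registers and the clock gate. That difference is the Δ to add to the ideal setup time when the first modeling approach is used.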
57
Formal Verification
• The Synopsys formal verification tool, Formality, can perform equivalence checking when the design has inserted clock gating cells
• The following command instructs Formality to account for clock gating logic
… …
fm_shell > set verification_clock_gate_hold_mode any
… …
58
Clock Gating and Test
• Controllability• Observability• Test signal connections
59
Potential Loss of Coverage
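[Figure: enable logic, spread across levels of design hierarchy, drives the EN pin of a latch-based clock gate whose output ENCLK clocks flip-flops between data in and data out. Shading marks logic as fully tested, partially tested, or not tested. Annotations: clock is not controllable; logic not observable.]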
60
Test Coverage With Scan Enable
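[Figure: control logic, spread across levels of design hierarchy, drives the EN pin of a latch-based clock gate through a control point that is also driven by scan_enable; scan_enable is “0” during capture. The gate output ENCLK clocks the register bank between data in and data out. Shading marks logic as fully tested, partially tested, or not tested.]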
61
Test Coverage With Test Mode
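[Figure: enable logic, spread across levels of design hierarchy, drives the EN pin of a latch-based clock gate through a control point that is also driven by test_mode, which is held at “1”. The gate output ENCLK clocks the register bank between data in and data out. Shading marks logic as fully tested, partially tested, or not tested.]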
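The difference between the scan_enable and test_mode control points can be expressed as a small truth-table model. This Python sketch assumes the control point is an OR of the functional enable with the test signal, placed before the latch; the function name and the phase labels are illustrative assumptions.

```python
def icg_enable(functional_en, test_signal):
    """Control point: the clock gate is enabled by either signal."""
    return functional_en | test_signal


def clock_reaches_registers(functional_en, control, phase):
    """control is 'scan_enable' or 'test_mode'; phase is 'shift' or 'capture'."""
    if control == "scan_enable":
        test_signal = 1 if phase == "shift" else 0   # "0" during capture
    else:
        test_signal = 1                              # test_mode held at "1"
    return icg_enable(functional_en, test_signal)


# With the functional enable low, compare the two control signals.
scan_shift = clock_reaches_registers(0, "scan_enable", "shift")
scan_capture = clock_reaches_registers(0, "scan_enable", "capture")
tm_capture = clock_reaches_registers(0, "test_mode", "capture")
print(scan_shift, scan_capture, tm_capture)
```

In this model both signals make the clock controllable during shift. With scan_enable the functional enable still decides the clock during capture, so the enable logic is exercised; with test_mode the gate is always open, so the enable logic cannot affect the clock during test.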
62
Complete Observability
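[Figure: enable signals EN1, EN2, and EN3 and other observability nodes are combined, under control of testmode, into an observe flop clocked by CLK with output dataout, so that the otherwise unobservable point at the enable of the clock gating latch can be observed]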
63
Test Signal Connections
hookup_testports [-verbose] [-se_port port] [-tm_port port] [-se_pin pin] [-tm_pin pin]
Example: hookup_testports -se_port SE3
[Figure: clock gates (CG1) driving flip-flops (FF), with scan-enable ports SE1, SE2, and SE3]
64
Details on Replicate Clock Gates: Pictorial Description
[Figure: before replication, one ICG with a load of 2pF; after replication, 8 ICGs, each with a load of 0.25pF (< max cap of 0.3pF), and a buffer inserted to drive ungated registers]
• Replication of ICG
• Insertion of buffer to drive ungated registers
• DRC fixed on the output of each instance
• In Astro, net is marked as “synthesized”
• In IC Compiler, net is not marked as “synthesized”
65
Details on Replicate Clock Gates: Inputs, Constraints and Behavior
• Inputs
  – Requires a list of nets or instances
    - If a net is specified, all instances on the fanout of the net are processed
• Constraints
  – The replication of the specified instances is based on fixing DRC at the output of each instance
  – The DRC constraints considered are maximum fanout, maximum capacitance and maximum transition
    - The tool converts maximum fanout and maximum transition into equivalent capacitance values, and uses the tightest of the three capacitance values as the maximum capacitance constraint
• Behavior
  – The tool splits the specified instance as many times as is necessary to fix the DRC on the output of each clock gate
66
Details on Replicate Clock Gates: Example 1
• Consider the following scenario:
  – Root clock net clk drives
    - 1000 ungated registers
    - Clock gate cg1, which drives 2000 registers
    - Clock gate cg2, which drives 3000 registers
  – You would like the clock gates driven by net clk to be balanced based on a maximum capacitance constraint of 0.35
• Solution
  – Set the following DRC constraints:
    set_clock_tree_options -max_capacitance 0.35
    split_clock_net -object clk
[Figure: after replication, the 2000 registers are driven by ~80 ICGs and the 3000 registers by ~120 ICGs; load on each ICG < 0.35pF, fanout of each ICG ~25; the 1000 ungated registers are also shown]
67
Details on Replicate Clock Gates: Example 2
• Consider the following scenario:
  – Root clock net clk drives
    - 1000 ungated registers
    - Clock gate cg1, which drives 2000 registers
    - Clock gate cg2, which drives 3000 registers
  – You would like the clock gates driven by net clk to be balanced based on a maximum capacitance constraint of 0.35
  – You would like to make the clock structure more balanced by inserting a buffer to drive the ungated registers
• Solution
  – Set the following DRC constraints:
    set_clock_tree_options -max_capacitance 0.35
    set cts_push_down_buffer true
    split_clock_net -object clk
[Figure: after replication, the 2000 registers are driven by ~80 ICGs and the 3000 registers by ~120 ICGs; load on each ICG < 0.35pF, fanout of each ICG ~25; the 1000 ungated registers are also shown]
68
Details on Replicate Clock Gates: Example 3
• Consider the following scenario:
  – Root clock net clk drives
    - 1000 ungated registers
    - Clock gate cg1, which drives 2000 registers
    - Clock gate cg2, which drives 3000 registers
  – You would like the clock gates driven by net clk to be balanced based on a maximum fanout constraint of ~1000
• Solution
  – Set the following DRC constraints (specify a large maximum capacitance and maximum transition constraint, so that the tool chooses the maximum fanout constraint as the tightest constraint)
    set_clock_tree_options \
        -max_capacitance 10000 \
        -max_transition 10000 \
        -max_fanout 1000
    split_clock_net -object clk
[Figure: after replication, the 2000 registers are driven by 2 ICGs and the 3000 registers by 3 ICGs; fanout of each ICG ~1000; the 1000 ungated registers are also shown]
69
Details on Replicate Clock Gates: Example 4
• Consider the following scenario:
  – Root clock net clk drives
    - 1000 ungated registers
    - Clock gate cg1, which drives 200 registers
    - Clock gate cg2, which drives 3000 registers
    - Clock gate cg3, which drives 195 registers
  – You would like the clock gates driven by net clk to be balanced based on a maximum fanout constraint of ~200
• Solution
  – Replicate the clock gate cg2 such that the fanout of each replicated instance is ~200
    set_clock_tree_options \
        -max_capacitance 10000 \
        -max_transition 10000 \
        -max_fanout 200
    split_clock_net -object cg2
[Figure: after replication, the 3000 registers are driven by ~15 ICGs with a fanout of ~200 each; cg1 (200 registers), cg3 (195 registers), and the 1000 ungated registers are unchanged]
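The ICG counts quoted in the four examples follow from dividing the register count by the fanout each replicated gate ends up with. This Python sketch reproduces that arithmetic only; it does not model how the tool derives the fanout from capacitance or transition constraints, and the function name is an assumption.

```python
import math


def replicated_gate_count(registers, fanout_per_gate):
    """Number of ICG instances needed to keep each fanout at or below the limit."""
    return math.ceil(registers / fanout_per_gate)


# Examples 1 and 2: fanout of each ICG ~25.
cg1_cap = replicated_gate_count(2000, 25)
cg2_cap = replicated_gate_count(3000, 25)
# Example 3: maximum fanout of 1000.
cg1_fanout = replicated_gate_count(2000, 1000)
cg2_fanout = replicated_gate_count(3000, 1000)
# Example 4: only cg2 is split, with a maximum fanout of 200.
cg2_split = replicated_gate_count(3000, 200)
print(cg1_cap, cg2_cap, cg1_fanout, cg2_fanout, cg2_split)
```

The results are 80, 120, 2, 3, and 15, which agree with the approximate ICG counts shown in the examples.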
70
Additional Consideration With Discrete Clock Gating Cells
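• Clock skew between latch and AND gate
  – Clock at B later than A
  – Skew > latch delay
[Figure: discrete clock gate in which CLK reaches the latch at point A and the AND gate at point B, with EN1 as the latch output; waveforms of CLK@A, EN, EN1, CLK@B, and GCLK show the skew and the latch delay, and a glitch on GCLK]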
71
Using Discrete Clock Gating Cells
• In Design Compiler and Physical Compiler,
  – Do not ungroup the clock gating hierarchy
  – Enable group bounds to place the elements of the clock gate (latch and AND gate) close together
    set physopt_disable_auto_bound_for_gated_clock false
• In Astro,
  – Place the latch and AND gates close together
    - Specify a large netweight on the net
  – Get the clock to go through the latch, that is, ignore the CLK pin of the latch as a sync pin
    - Use the astSetClockNonStop command
    - Refer to SolvNet article 003097