Predictable Success
Useful Skew Application Note Presentation
IC Compiler Version Z-2007.03-SP3
07/31/2007
© 2007 Synopsys, Inc. (2)
Contents
• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
• Case Study
What is Useful Skew?
[Figure: registers between IN and OUT clocked by CLK. Path slacks are (-1), (+2), (-1), and (0); the paths with negative slack and the path with positive slack are marked.]
• Most timing violations are fixed by data path optimization
• With useful skew, you fix timing violations by adjusting clock arrival times at the registers or latches
Fixing Timing Violations By Using Clock Skew
[Figure: before adjustment, the path slacks are (-1), (+2), (-1), and (0). The clock arrival time is decreased at one register clock pin, increased at another, and left unchanged at the third; afterward all path slacks are (0).]
• Fixed timing violations
• Increased clock skew
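The slack bookkeeping behind this slide can be checked with a few lines of arithmetic. The sketch below is plain Python for illustration only, not an IC Compiler script. It assumes that the four slacks in the figure belong, in order, to the paths IN to U1, U1 to U2, U2 to U3, and U3 to OUT, and it borrows the register names U1, U2, and U3 from the clock exception example later in this note; both are assumptions about the figure.

```python
# Setup slack after a clock shift: a later capture clock adds slack,
# a later launch clock removes the same amount.
def adjusted_setup_slack(slack, launch_shift, capture_shift):
    return slack + capture_shift - launch_shift

# Assumed reading of the figure (hypothetical path order), slacks in ns.
paths = [
    ("IN", "U1", -1.0),
    ("U1", "U2", +2.0),
    ("U2", "U3", -1.0),
    ("U3", "OUT", 0.0),
]

# Clock arrival shifts in ns: increase at U1, decrease at U2, no change at U3.
# Ports (IN, OUT) have no adjustable clock pin, so their shift is 0.
shift = {"U1": +1.0, "U2": -1.0, "U3": 0.0}

new_slacks = [
    adjusted_setup_slack(slack, shift.get(src, 0.0), shift.get(dst, 0.0))
    for src, dst, slack in paths
]
added_skew = max(shift.values()) - min(shift.values())
print(new_slacks, added_skew)
```

Under these assumptions every path ends at zero slack, and the price is 2 ns of intentional skew between the U1 and U2 clock pins.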
Two Approaches to Implementing Useful Skew
• Apply useful skew on your design before clock is synthesized
+ Clock tree synthesis can achieve larger latency adjustment targets; design can have more useful skew
– At pre clock tree synthesis stage, parasitics are estimated based on virtual routing with more scope for miscorrelation
– Factors such as timing derate on clock path are not considered since the clock is ideal
• Apply useful skew incrementally to fix timing violations in the post clock tree synthesis or post route stage
+ After detail routing, timing should be most accurate; therefore applying useful skew should be effective
– Clock tree optimization can only make small latency adjustments
– Pre route clock tree optimization allows sizing, relocation and delay insertion. However, the ability to use these techniques to meet the latency adjustment is limited.
– Post route clock tree optimization only allows sizing
Known Limitations of the Useful Skew Approach
• Useful skew approach cannot improve timing for
    Looping paths from a register to itself
    Feedthrough paths from input to output port
[Figure: the example circuit (IN, OUT, CLK; path slacks (-1), (+2), (-1), (0)) with a feedthrough path from IN2 to OUT2 and a looping path from a register to itself.]
Contents
• Overview of Useful Skew
• Useful Skew in IC Compiler
    Overview of Useful Skew in IC Compiler
    Analyzing the Timing of the Design
    Determining the Pins to Be Optimized
    Understanding the Solution File Generated by skew_opt
    Sourcing the Solution File
    Prerequisites for Running Useful Skew
    How skew_opt Works With Clock Gating, I/O Paths, Hold Fixing, and Scan Chains
    Known Issues With the Current Useful Skew Implementation
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
• Case Study
Overview of Useful Skew in IC Compiler
Analyzes the timing of the design
Determines the pins to be optimized
Determines the optimal solution
Writes the solution to a file
Sources the solution file onto the design
A simple look at what skew_opt does “under the hood”
Analyzing the Timing of the Design
• Multiple paths to each end point
• Multiple paths from each start point
• Determine interclock relationships
[Figure: network of registers, each clocked by CK1 or CK2.]
Determining the Pins to Be Optimized
• Based on the paths to be optimized, skew_opt determines the pins whose latency should be adjusted
• The following pins are not optimized:
    “Fixed”: nonoptimized pins that are not written into the solution file
        • I/O ports
        • Nonstop pins
            Clock pins of clock-gating cells
            Clock pins of registers with generated clock definitions
            Explicit nonstop pins
    “Fragile”: nonoptimized pins that are written into the solution file
        • Unconstrained pins
        • Clock pins inside interface logic models (ILMs)
        • Level-sensitive latches
            Clock pins of level-sensitive latches
            Clock pins of registers on paths to or from level-sensitive latches
• When skew_opt_optimize_to_clock_gates is false (default is true), registers generating enable signals for clock gates are not optimized. See “Using skew_opt on Designs That Have Clock Gates: Scenario 2” for more details
Nonstop Pins Are Not Optimized by skew_opt
• skew_opt sets float pin exceptions on clock pins whose latency needs to be adjusted (set_clock_tree_exception -float)
• Clock tree synthesis stops traversal when it sees this exception on clock pins
    The portion of the clock structure beyond these pins is not optimized for skew, causing incorrect results
[Figure: CLK drives an integrated clock-gating cell (ICG) whose output ECLK clocks registers U1 and U2. If a float pin exception is set on the ICG clock pin, the registers U1 and U2 are not considered part of the clock tree.]
Pins Inside ILMs Are Not Optimized by skew_opt
[Figure: top level with CLK, IN, and OUT, containing an ILM with pins P1 and O1.]
• Clock tree synthesis can only adjust latency to ILM clock pins
• Clock tree synthesis at top level cannot adjust latencies to pins inside ILMs; skew_opt therefore considers them as “fragile” pins and sets the same float pin exception on these pins
Understanding the Useful Skew Solution File
check_error -reset
__scl 0.129037 {STACK_BLK/MEM_reg_5__9_/CK}
__scte -float_pin_capacitance 0 -float_pin_max_delay_rise -0.09 -float_pin_min_delay_rise -0.09 -float_pins {STACK_BLK/MEM_reg_5__9_/CK}
__sicdo -balance_group { CLOCK }
• __scl: Clock latency set by skew_opt. If clock tree synthesis is able to implement the clock exceptions defined by skew_opt, you should expect to see the propagated clock latency on this pin very close to this value (__scl is aliased to set_clock_latency in the Tcl file)
• __scte: Clock exception equivalent to the clock latency set by skew_opt. See “Determining the Clock Exception Values” for how skew_opt determines the clock exception value from the clock latency values (__scte is aliased to set_clock_tree_exceptions in the Tcl file)
• __sicdo: Interclock delay balancing options set by skew_opt based on the timing relationship between the clock domains (__sicdo is aliased to set_inter_clock_delay_options in the Tcl file)
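For a quick look at what a solution file asks for, the three kinds of lines can be pulled apart with a short script. The following is a minimal Python sketch that assumes every line has exactly the shape of the sample above (one pin per line, the same option names); real solution files may contain other options or layouts, and the function name parse_solution is made up for this illustration.

```python
import re

SAMPLE = """\
__scl 0.129037 {STACK_BLK/MEM_reg_5__9_/CK}
__scte -float_pin_capacitance 0 -float_pin_max_delay_rise -0.09 -float_pin_min_delay_rise -0.09 -float_pins {STACK_BLK/MEM_reg_5__9_/CK}
__sicdo -balance_group { CLOCK }
"""

def parse_solution(text):
    """Collect latencies, float pin delays, and balance groups (assumed line shapes)."""
    latencies, float_delays, groups = {}, {}, []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "__scl":
            # __scl <latency> {<pin>}
            latencies[tokens[2].strip("{}")] = float(tokens[1])
        elif tokens[0] == "__scte":
            pin = re.search(r"-float_pins \{([^}]*)\}", line).group(1).strip()
            delay = re.search(r"-float_pin_max_delay_rise (\S+)", line).group(1)
            float_delays[pin] = float(delay)
        elif tokens[0] == "__sicdo":
            names = re.search(r"-balance_group \{([^}]*)\}", line).group(1)
            groups.append(names.split())
    return latencies, float_delays, groups

latencies, float_delays, groups = parse_solution(SAMPLE)
print(latencies, float_delays, groups)
```

Having the latency and the float pin delay side by side per pin makes it easier to compare the solution against post clock tree synthesis timing later in the flow.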
Why Three Sets of Tcl Commands?
• set_clock_latency
    Understood by the timer; can be used to measure skew_opt QoR
    Clock tree synthesis does not honor clock latencies set at clock pins
• set_clock_tree_exceptions
    Not understood by timer
    Clock tree synthesis honors these constraints
• set_inter_clock_delay_options
    Interclock delay constraints based on skew_opt analysis
Sourcing the Solution File
• By default, all three sets of Tcl commands are sourced:
    set_clock_latency
    set_clock_tree_exceptions
    set_inter_clock_delay_options
• Use the following variable settings to control which Tcl commands are sourced from the solution file:
    skew_opt_skip_ideal_clocks
    skew_opt_skip_propagated_clocks
    skew_opt_skip_clock_balancing
• For example:
    If skew_opt_skip_ideal_clocks is set to true, set_clock_latency commands are not sourced
    If skew_opt_skip_propagated_clocks is set to true, set_clock_tree_exceptions commands are not sourced
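The relationship between the three variables and the three command sets can be written down as a small lookup. This Python sketch only models the behavior described on this slide and is not tool code; the function name is hypothetical, and the mapping of skew_opt_skip_clock_balancing to set_inter_clock_delay_options is inferred from the variable's description in this note rather than stated in the two examples above.

```python
def sourced_commands(skip_ideal_clocks=False,
                     skip_propagated_clocks=False,
                     skip_clock_balancing=False):
    """Return the command sets that would be sourced from the solution file."""
    commands = []
    if not skip_ideal_clocks:
        commands.append("set_clock_latency")
    if not skip_propagated_clocks:
        commands.append("set_clock_tree_exceptions")
    if not skip_clock_balancing:  # inferred mapping, see lead-in
        commands.append("set_inter_clock_delay_options")
    return commands

default_set = sourced_commands()
post_cts_set = sourced_commands(skip_ideal_clocks=True)
print(default_set, post_cts_set)
```

The second call mirrors the post clock tree synthesis flow, where skew_opt_skip_ideal_clocks is set to true so that only the clock exceptions and interclock options are applied.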
Determining the Clock Exception Values
[Figure: registers U1, U2, and U3 between IN and OUT, clocked by CLK. Path slacks are (-1), (+2), (-1), and (0). The clock arrival time is increased at one clock pin and decreased at another.]
SDC:
    set_clock_latency 4.0 CLK
skew_opt clock latencies:
    set_clock_latency 5.0 U1/CK
    set_clock_latency 3.0 U2/CK
    set_clock_latency 4.0 U3/CK
Calculating the clock exception value:
    1. Find min (all clock latency values)
    2. Float pin value for pin = (Min latency - latency specified for pin)
skew_opt clock exceptions:
    set_clock_tree_exceptions -2.0 -float_pin U1/CK
    set_clock_tree_exceptions 0.0 -float_pin U2/CK
    set_clock_tree_exceptions -1.0 -float_pin U3/CK
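The two-step rule on this slide (find the minimum latency, then subtract each pin's latency from it) is easy to reproduce. The Python sketch below applies it to the three latencies shown on the slide; the function name float_pin_values is made up for this illustration, and the script is a worked example rather than anything skew_opt itself runs.

```python
def float_pin_values(latencies):
    """Float pin value = minimum latency - latency specified for the pin."""
    min_latency = min(latencies.values())
    return {pin: min_latency - latency for pin, latency in latencies.items()}

# Latencies from the slide, in the slide's time unit.
values = float_pin_values({"U1/CK": 5.0, "U2/CK": 3.0, "U3/CK": 4.0})
print(values)
```

The pin with the smallest latency always gets 0.0, and every other pin gets a negative value equal to how much later its clock should arrive.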
Prerequisites for Running Useful Skew (1/4)
1. Check your clock tree for missing or incorrect constraints or definitions using check_clock_tree
2. Check for preexisting exceptions such as ignore, stop, or float pins set on your clock tree by using report_clock_tree -exceptions
• skew_opt does not consider clock exceptions during analysis and optimization. They are honored only during clock tree synthesis and clock tree optimization
• If the pin with a preexisting exception is an optimizable endpoint, skew_opt overrides the ignore, float, or stop pin exception with the new float pin exception
• Preexisting nonstop exceptions are not overridden
Prerequisites for Running Useful Skew (2/4)
3. Ensure that the constraints are correct
    a. Use a quick run of clock tree synthesis to determine and apply clock latencies before running skew_opt
        • By default, update_clock_latency does not create a set_clock_latency command for generated clocks. Because clock tree synthesis balances the registers on the master clock with those on the generated clock, it is essential that the clock latency of the generated clock be specified before running skew_opt
        • You can do one of the following:
            Manually apply the clock latency of the master clock to the generated clock
            Use set_latency_adjustment_options to set the latency of the generated clock with respect to its master before running update_clock_latency
    b. Running update_clock_latency after a skew_opt flow is incorrect
        • Median latency calculated will be incorrect (all clock pins have clock exceptions set by skew_opt solution)
        • Changing the constraints after skew_opt will lead to convergence issues
Prerequisites for Running Useful Skew (3/4)
3. Ensure that the constraints are correct (continued)
    c. Ensure that the clock latency specifications on clock pins in the clock structure are correct
        • For example, the clock pins of clock gating cells
    d. Remove any ideal latencies set on the clock network (remove_ideal_latency)
    e. Ensure that the clocks are ideal before running pre clock tree synthesis skew_opt flow (remove_propagated_clock)
    f. Ensure that high-fanout nets are marked as ideal or run high-fanout net synthesis on these nets
    g. Remove any pin_load constraints set on clock ports. For example, set_load -min -pin_load 0.0 Clk
Prerequisites for Running Useful Skew (4/4)
4. Optimize the design to minimize timing violations
    Useful skew can impact global skew and insertion delay
    The smaller the useful skew introduced, the smaller the impact on clock tree synthesis metrics
Sample Script: Preparing the Design for skew_opt
#All clocks are ideal before CTS
open_mw_cel placed_cel
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all

#Run clock_opt to get updated latencies
set_inter_clock_delay_balance -balance_groups {clk1 clk2}
set_latency_adjustment_options -from_clock clk1 -to_clock vclk
clock_opt -inter_clock_balance -update_clock_latency
write_sdc updated.sdc
sh grep set_clock_latency updated.sdc > updated.sdc.1
sh grep get_clock updated.sdc.1 > updated.tcl
close_mw_cel

#Load updated constraints into placed CEL and optimize the design
#before running skew_opt
open_mw_cel placed_cel
source updated.tcl
extract_rc -estimate
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all
place_opt

Note on set_latency_adjustment_options: Generated clock latencies are not updated by update_clock_latency
Using skew_opt on Designs That Have Clock Gates: Scenario 1
[Figure: CLK, registers U1 and U2, and an integrated clock-gating cell (ICG) with gated clock output ECLK.]
• The ideal latency specified for CLK is considered as the clock arrival time for all the pins on that clock domain, such as the clock pins of U1, U2, and the integrated clock-gating (ICG) cell
    The enable timing seen by skew_opt is therefore optimistic
• After clock tree synthesis, the clock arrival time at the ICG clock pin will be less than that at the clock pin of U2 (the clock pin of the ICG is a nonstop pin for clock tree synthesis)
    To avoid this, you can explicitly set the clock latency at the clock pins of the clock gates, taking into consideration the delay from the clock gate to the endpoints
• Use a quick run of clock tree synthesis to determine these latencies
• By default, skew_opt adjusts the clock arrival time at the clock pins of U1 and U2, and leaves the clock arrival time at the ICG unchanged
Using skew_opt on Designs That Have Clock Gates: Scenario 2
[Figure: CLK, registers U1 and U2, and an integrated clock-gating cell (ICG) with gated clock output ECLK.]
• In the above scenario, the register generating the enable signal for the clock gate has a data path to registers that are gated by the same clock gate
• By default, skew_opt adjusts the latency to the clock pins of both U1 and U2
During clock tree synthesis, the float pin exception on these pins causes the clock arrival time at the clock gate to change (as compared to the initial quick run of clock tree synthesis to estimate the clock arrival time at the clock gate), thus invalidating the skew_opt solution
• When skew_opt_optimize_to_clock_gates is set to false, skew_opt does not optimize the latency on the clock pin of the register generating the enable signal
skew_opt and I/O timing (-fix_boundary_pins)
• By default, skew_opt optimizes boundary paths
Registers on boundary paths therefore might have adjusted clock latencies
• With the -fix_boundary_pins option, skew_opt keeps the clock arrival times for registers on boundary paths unchanged
skew_opt and Hold Timing
• By default, skew_opt optimizes for setup
    Optimizing for setup can degrade hold
    skew_opt minimizes the latency adjustment to minimize impact on hold
• When both the -setup and -hold options are specified, skew_opt tracks WNS for both setup and hold for each startpoint and endpoint; the worst WNS governs the solution
• Specify minimum libraries for more realistic hold timing analysis
skew_opt and Scan Chains
• In a skew_opt flow, there will be larger “real” skew between clock pins after clock tree synthesis
optimize_dft currently assumes zero skew between clock pins and can lead to larger hold violations
Known Issues With the Current Useful Skew Implementation
• Current solution excludes paths that end on nonstop pins such as
    Clock gating cells
    Registers with a generated clock at the output
    Pins with explicit nonstop exception
• This is because setting a float pin exception on these clock pins causes clock tree synthesis not to traverse beyond these pins
• Pins inside interface logic models are not optimized
    Clock tree synthesis cannot adjust latencies to pins inside ILMs
• Level-sensitive latch support is limited
    Will not optimize paths to or from level-sensitive latches
Contents
• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
    Known Issues With the Current Useful Skew Implementation
    Useful Skew User Interface
• Analyzing and Debugging Useful Skew Results
• Case Study
Useful Skew Flows: Pre Clock Tree Synthesis and Post Clock Tree Synthesis
Starting point: IC Compiler placed CEL view (prepared for useful skew)

Pre Clock Tree Synthesis Flow
    skew_opt
    clock_opt -inter_clock_balance
    route_opt

Post Clock Tree Synthesis Flow
    clock_opt -inter_clock_balance -no_clock_route
    set skew_opt_skip_ideal_clocks true
    skew_opt
    optimize_clock_tree
    route_opt

• set skew_opt_skip_ideal_clocks true: This setting is required to avoid losing the propagated attribute that is annotated on the clocks by compile_clock_tree
• Postroute flow has QoR limitations
Sample Script: Pre Clock Tree Synthesis skew_opt Flow
#All clocks are ideal before clock tree synthesis
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all

#Run skew_opt
skew_opt

#Run clock_opt
set_inter_clock_delay_balance -balance_group {clk1 clk2}
set_clock_tree_options -gate_sizing true -gate_relocation true \
    -buffer_sizing true -delay_insertion false -buffer_relocation true
clock_opt -inter_clock_balance

#Run route_opt
set_fix_hold [all_clocks]
route_opt
Sample Script: Post Clock Tree Synthesis skew_opt Flow
#All clocks are ideal before clock tree synthesis
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all

#Run clock_opt
set_inter_clock_delay_balance -balance_group {clk1 clk2}
set_clock_tree_options -gate_sizing true -gate_relocation true \
    -buffer_sizing true -delay_insertion false -buffer_relocation true
clock_opt -inter_clock_balance -no_clock_route

#Run skew_opt followed by clock tree optimization
set skew_opt_skip_ideal_clocks true
skew_opt
optimize_clock_tree
route_group -all_clock_nets

#Run route_opt
set_fix_hold [all_clocks]
route_opt
Predictable Success
Variables for skew_opt
• Variables that control which Tcl commands are sourced from the solution file
skew_opt_skip_ideal_clocks
skew_opt_skip_propagated_clocks
skew_opt_skip_clock_balancing
These variables do not affect the solution generated by skew_opt. They control only which Tcl commands are sourced in from the solution file.
Description of Variables
• skew_opt_skip_ideal_clocks
    Default: false
    When set to true, skew_opt does not set ideal clock latencies on the clock pins
• skew_opt_skip_propagated_clocks
    Default: false
    When set to true, skew_opt does not set clock exceptions on the clock pins
• skew_opt_skip_clock_balancing
    Default: false
    When set to true, skew_opt does not set interclock balancing options
skew_opt Command Options
skew_opt
-setup
-hold
-pins pin_list
-fix_boundary_pins
-ignore_boundary_paths
-path_groups path_group_list
-output file_name
-no_auto_source
-no_optimization
-setup_margin setup_margin_value
-hold_margin hold_margin_value
-adjustment_limit adjustment_limit_value
-decrease_factor decrease_factor_value
-improvement_threshold improvement_threshold_value
-resolution resolution_value
Description of Options (1/4)
-setup
Optimize WNS for setup constraints; on by default. When only the -setup option is specified, skew_opt optimizes for setup but minimizes impact on hold
-hold
Optimize WNS for hold constraints; off by default. When both -setup and -hold are specified, the setup solution is constrained by the hold slack. It is possible that setup improvement achieved is not as much as when only the -setup option is used.
-pins pin_list
Specifies a list of pins to optimize; by default, all adjustable clock pins are considered for optimization
-fix_boundary_pins
Do not optimize clock arrival time for registers on boundary paths
Description of Options (2/4)
-ignore_boundary_paths
Do not consider I/O paths during optimization. However, the tool can degrade I/O paths while optimizing other register-register paths. By default, I/O paths are included
-path_groups path_group_list
Specifies the path groups considered for optimization; by default, all path groups are considered
-output file_name
Specifies a file name for the solution file; by default, the solution file name is skew_opt.tcl
-no_auto_source
Do not source the solution file at the end of skew_opt; by default, the solution file is sourced at the end of skew_opt
Description of Options (3/4)
-setup_margin setup_margin_value
The margin is subtracted from the setup slack to allow you to influence skew_opt to improve paths with positive slack. Default is 0 ns. Unit is ns.
-hold_margin hold_margin_value
The margin is subtracted from the hold slack to allow you to influence skew_opt to improve paths with positive slack. Default is 0 ns. Unit is ns.
-adjustment_limit adjustment_limit_value
Sets a limit on the latency adjustment that can be set on any pin. Default is no limit. Unit is ns.
-decrease_factor decrease_factor_value
Sets a fractional limit on latency decreases by using a value between zero and one. Default is 0.5. For designs with many clock tree levels, a larger decrease factor (e.g. 0.75) might yield more slack improvement.
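The effect of a margin is easiest to see with numbers. The Python sketch below illustrates only the "margin is subtracted from the slack" statement for -setup_margin and -hold_margin; the function name and the example values (0.05 ns slack, 0.1 ns margin) are made up, and nothing here models how skew_opt uses the result internally.

```python
import math

def slack_seen_by_skew_opt(slack_ns, margin_ns=0.0):
    # The margin is subtracted from the slack before optimization,
    # so a path with small positive slack can look like a violation.
    return slack_ns - margin_ns

no_margin = slack_seen_by_skew_opt(0.05)
with_margin = slack_seen_by_skew_opt(0.05, margin_ns=0.1)
print(no_margin, with_margin)
```

With the default margin of 0 ns the path keeps its positive slack and is left alone; with a 0.1 ns margin the same path appears to violate by about 0.05 ns and becomes a candidate for improvement.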
Description of Options (4/4)
-improvement_threshold improvement_threshold_value
Do not generate a solution if the solution cannot improve timing QoR (WNS) by at least this value. Default is 0.01 ns. Unit is ns.
-resolution resolution_value
Snaps the clock tree exception value to a multiple of this value. Default is 0.001 ns. Unit is ns. The minimum allowed value is 0.0001 ns.
-no_optimization
Use the clock latencies set at the clock pins. For example, the tool takes the set_clock_latency commands you specified and converts them into clock exceptions; by default, this is disabled
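The -improvement_threshold and -resolution descriptions can be illustrated with two small helper functions. This is a Python sketch under stated assumptions: the function names are hypothetical, and "snaps to a multiple" is modeled as rounding to the nearest multiple, which this note does not confirm (the tool could equally round up or down).

```python
from decimal import Decimal

def worth_generating(initial_wns_ns, final_wns_ns, threshold_ns=0.01):
    """A solution is generated only if WNS improves by at least the threshold."""
    return (final_wns_ns - initial_wns_ns) >= threshold_ns

def snap(value_ns, resolution_ns="0.001"):
    """Snap a value to a multiple of the resolution (nearest multiple: an assumption)."""
    step = Decimal(resolution_ns)
    multiples = (Decimal(str(value_ns)) / step).to_integral_value()
    return float(multiples * step)

improved = worth_generating(-0.9167, -0.235)   # WNS values taken from the log example in this note
unchanged = worth_generating(0.0, 0.0)
snapped = snap(-0.0894)                        # made-up exception value
print(improved, unchanged, snapped)
```

Decimal arithmetic is used for the snapping so that values such as 0.001 ns are handled exactly instead of picking up binary floating-point noise.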
Contents
• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
    Measuring Useful Skew QoR Without Running Clock Tree Synthesis
    Understanding the Log File
    Debugging QoR Degradation in a Useful Skew Flow
• Case Study
Measuring Useful Skew QoR Without Running Clock Tree Synthesis (Pre Clock Tree Synthesis Only)
Starting point: IC Compiler placed CEL view (prepared for useful skew), pre clock tree synthesis flow
1. skew_opt -no_auto_source
    Run skew_opt without sourcing the solution file
2. set skew_opt_skip_propagated_clocks true
   source skew_opt.tcl
    Set the variable to disable loading of clock exceptions in the solution file, then source the solution file
3. report_timing
    Analyze timing with ideal clock latencies defined by skew_opt
4. Timing acceptable?
    Y: If timing improvement is acceptable, continue with the skew_opt flow
    N: Run skew_opt with a different set of options or go with the default flow
Understanding the Log File: Initial Analysis
Using boundary paths.
Adjusting boundary pins.
Using all clock pins.
Using all path-groups.
2744398 initial constraints
==================================================
30850 loop constraints
32 feedthrough constraints
1367137 non-worst setup constraints
0 non-worst hold constraints
--------------------------------------------------
346379 remaining setup constraints
0 remaining hold constraints
31176 initial pins
==================================================
557 latencies at I/O ports
0 latencies at clock-gating cells
9 latencies at level-sensitive latches
0 latencies inside interface logic models
--------------------------------------------------
30610 latencies will be optimized
566 latencies will be kept fixed
• The first four lines indicate the paths and pins that skew_opt will work on
• Constraints section: skew_opt processes the constraints to determine the ones it will work on
• Pins section: based on the constraints, skew_opt determines the pins to work with
• Pins on clock-gating cells, I/O ports, ILMs, and level-sensitive latches are excluded
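The counts in this part of the log can be cross-checked by simple subtraction. The Python sketch below copies the numbers from the log excerpt above; the dictionary keys are just labels chosen for this illustration.

```python
# Pin accounting, numbers copied from the log excerpt.
initial_pins = 31176
fixed = {
    "I/O ports": 557,
    "clock-gating cells": 0,
    "level-sensitive latches": 9,
    "interface logic models": 0,
}
kept_fixed = sum(fixed.values())
optimized = initial_pins - kept_fixed

# Constraint accounting, numbers copied from the log excerpt.
initial_constraints = 2744398
discarded = {
    "loop": 30850,
    "feedthrough": 32,
    "non-worst setup": 1367137,
    "non-worst hold": 0,
}
remaining = initial_constraints - sum(discarded.values())
print(kept_fixed, optimized, remaining)
```

The pin counts reconcile exactly with the log (566 kept fixed, 30610 optimized). For the constraints the subtraction gives 1346379, whereas the excerpt shows 346379 remaining setup constraints; a leading digit may have been lost in the excerpt, but that is an observation from the arithmetic, not a confirmed correction.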
Understanding the Log File: Settings
Settings
--------
setup_margin = 0 (ns)
hold_margin = 0 (ns)
adjustment_limit = 1e+30 (ns)
decrease_factor = 0.5
improvement_threshold = 0.01 (ns)
resolution = 0.001 (ns)
Log file indicates all the variable settings
Understanding the Log File: Optimization and Results
Optimizing latencies:
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-9.167e-01 -1.982e+02 +0.000e+00 +0.000e+00
-8.718e-01 -1.693e+02 +0.000e+00 +0.000e+00
-8.290e-01 -1.577e+02 +0.000e+00 +0.000e+00
.
.
-2.316e-01 -1.791e+01 +0.000e+00 +0.000e+00
Minimizing latency adjustments:
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-2.310e-01 -1.791e+01 +0.000e+00 +0.000e+00
-2.350e-01 -1.870e+01 +0.000e+00 +0.000e+00
Maximum latency increase: +1.075 --> +0.061
Maximum latency decrease: -0.500 --> -0.089
Indicates the starting QoR and the improvement after each iteration
Cumulative negative slack (CNS): the sum of negative slacks for all the constraints skew_opt is considering
Final QoR achieved by skew_opt
Latency adjustments are minimized; this might have a small impact on the setup WNS
Indicates that latency increases have been reduced from 1.075 ns to 0.061 ns; latency decreases reduced from 0.5 ns to 0.089 ns. The smaller the latency adjustment, the easier it is for clock tree synthesis to meet this target
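When comparing several skew_opt runs, it can help to extract the WNS trend and the final latency adjustments from the log automatically. The following Python sketch assumes the log has exactly the layout shown above (four numeric columns per iteration, and "Maximum latency ..." summary lines); the elided iterations marked with "." are left out of the sample text, and the variable names are made up.

```python
import re

LOG = """\
Optimizing latencies:
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-9.167e-01 -1.982e+02 +0.000e+00 +0.000e+00
-8.718e-01 -1.693e+02 +0.000e+00 +0.000e+00
-8.290e-01 -1.577e+02 +0.000e+00 +0.000e+00
-2.316e-01 -1.791e+01 +0.000e+00 +0.000e+00
Minimizing latency adjustments:
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-2.310e-01 -1.791e+01 +0.000e+00 +0.000e+00
-2.350e-01 -1.870e+01 +0.000e+00 +0.000e+00
Maximum latency increase: +1.075 --> +0.061
Maximum latency decrease: -0.500 --> -0.089
"""

def parse_rows(log):
    """Return (setup WNS, setup CNS, hold WNS, hold CNS) tuples, one per iteration."""
    rows = []
    for line in log.splitlines():
        parts = line.split()
        if len(parts) == 4:
            try:
                rows.append(tuple(float(p) for p in parts))
            except ValueError:
                pass  # separator line made of dashes
    return rows

rows = parse_rows(LOG)
wns_gain = rows[-1][0] - rows[0][0]  # final setup WNS minus starting setup WNS
limits = {
    m.group(1): (float(m.group(2)), float(m.group(3)))
    for m in re.finditer(r"Maximum latency (increase|decrease): (\S+) --> (\S+)", LOG)
}
print(round(wns_gain, 4), limits)
```

For this excerpt the setup WNS improves by about 0.68 ns, and the summary lines give the latency adjustments before and after minimization.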
Understanding the Log File: Optimization and Results
Writing set_clock_latency commands ... done.
There are clocks shared by data ports and sink pins in this design
Using a phase resolution of 1 ps will achieve 89% dominant phase.
The number of unique phases will be 207.
Writing set_clock_tree_exception commands ... done.
Writing set_inter_clock_delay_options commands ... done.
Sourcing optimizations from "skew_opt.tcl".
--> sourcing set_clock_latency
--> sourcing set_clock_tree_exceptions
--> sourcing set_inter_clock_delay_options
skew_opt completed successfully.
• The "sourcing" lines indicate which settings are sourced
• The message "There are clocks shared by data ports and sink pins in this design" indicates that I/O constraints have been specified with respect to a real clock (instead of a virtual clock)
Understanding the Log File: skew_opt Unable to Optimize the Design
Optimizing latencies:
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
+0.000e+00 +0.000e+00 +0.000e+00 +0.000e+00
+0.000e+00 +0.000e+00 +0.000e+00 +0.000e+00
Resources used for optimization:
1.22e-04 cpu hours
0.00e+00 gigabytes
This design could not be further optimized.
When QoR improvement is less than the threshold, skew_opt does not generate a solution file
Tips for Debugging QoR Degradation in a Useful Skew Flow (1/2)
1. Review the skew_opt log file
    Check if the starting QoR reported by skew_opt is what you expect
    Check if skew_opt is able to improve the QoR
2. Run timing analysis before and after skew_opt and after running clock tree synthesis
    Use report_qor and report_timing commands
    Will help identify where the degradation occurs
3. Create path groups to isolate paths that skew_opt will not optimize
    Feedthrough paths
    Nonstop clock pins such as clock pins of integrated clock-gating cells
Tips for Debugging QoR Degradation in a Useful Skew Flow (2/2)
4. Write out clock latencies before and after clock tree synthesis
    Comparing the two will help identify if miscorrelation between skew_opt and clock tree synthesis is the cause of the degradation
Debugging QoR Degradation in a Useful Skew (Pre Clock Tree Synthesis) Flow
Compare the postroute CEL from the baseline flow with the postroute CEL from the skew_opt flow.
1. skew_opt flow timing worse than baseline?
    N: No QoR degradation
    Y: Go to step 2
2. skew_opt flow timing after clock tree synthesis worse than baseline?
    N: Indicates correlation issue between post clock tree synthesis and postroute timing
    Y: Go to step 3
3. Does post skew_opt timing correlate with post clock tree synthesis? (See “Comparing the Clock Tree Synthesis Implementation to the skew_opt Solution in a Pre Clock Tree Synthesis Useful Skew Flow” for details on checking if the clock tree synthesis implementation correlates with the skew_opt solution)
    N: Indicates clock tree synthesis is not able to implement the skew_opt solution. Check if interclock balancing tool issued any messages about clocks it could not balance
    Y: Go to step 4
4. Did you follow the prerequisites for the skew_opt flow?
    N: Rerun flow after following recommended methodology
    Y: skew_opt should not degrade QoR; file a STAR
Comparing the Clock Tree Synthesis Implementation to the skew_opt Solution in a Pre Clock Tree Synthesis Useful Skew Flow
• skew_opt solution file (skew_opt.tcl)
• Post-clock tree synthesis clock timing report (report_clock_timing -nosplit -type latency -nworst 1000000)
• Compare clock latency in the skew_opt solution file to the clock timing after clock tree synthesis
• Differences between the two point to possible convergence issues
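Once the target latencies from the solution file and the measured latencies from the clock timing report are available per pin, the comparison itself is a simple difference check. The Python sketch below uses hand-written dictionaries rather than parsing real reports, because the report layout is not shown in this note; the pin names, the measured values, and the 0.05 ns tolerance are all hypothetical.

```python
def compare_latencies(expected, measured, tolerance_ns=0.05):
    """Return pins whose post-CTS latency deviates from the skew_opt target."""
    deviations = {}
    for pin, target in expected.items():
        if pin not in measured:
            continue  # pin missing from the report; nothing to compare
        delta = measured[pin] - target
        if abs(delta) > tolerance_ns:
            deviations[pin] = round(delta, 6)
    return deviations

# Hypothetical data: targets as set by skew_opt, latencies as measured after CTS.
expected = {"U1/CK": 5.0, "U2/CK": 3.0, "U3/CK": 4.0}
measured = {"U1/CK": 5.02, "U2/CK": 3.40, "U3/CK": 3.99}

deviations = compare_latencies(expected, measured)
print(deviations)
```

In this made-up data only U2/CK misses its target by more than the tolerance, which is the kind of pin to examine when clock tree synthesis does not appear to implement the skew_opt solution.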
Case Study: Design Information, Initial Timing, Timing with Baseline Flow
• Design information
    65 nm, 325K instances
• Initial timing (after place_opt)
    Clock  WNS       TNS         # Violations  # Hold violations
    Clk1   -0.35 ns  -661.73 ns  6000          43067
    Clk2   0.87 ns   0.00 ns     0             261
• Final timing with baseline flow (flow without skew_opt)
    Clock  WNS       TNS         # Violations  # Hold violations
    Clk1   -0.76 ns  -720.39 ns  5114          41849
    Clk2   0.86 ns   0.00 ns     0             282
Case Study: Preparing the Design for skew_opt (1/4)
• Run check_clock_tree on the design
icc_shell> check_clock_tree
1
• Check for clock exceptions on the clock structure
report_clock_tree -exceptions
Clock Tree Exceptions Summary
=============================
1. Clock: Clk1
.
Implicit ignore pins: 34
Default sink pins: 36948
.
.
2. Clock: Clk2
.
Default sink pins: 350
.
.
Note: The implicit ignore pins connect to clock pins of gates with unconnected outputs; they should not impact the skew_opt flow
Case Study: Preparing the Design for skew_opt (2/4)
• Check for interclock relationships
icc_shell> report_timing -from Clk1 -to Clk2
****************************************
Report : timing
-path full
-delay max
-max_paths 1
Design : test_design
Version: Z-2007.03-ICC-SP3-CS1
Date : Tue Jul 17 20:21:21 2007
****************************************
* Some/all delay information is back-annotated.
# A fanout number of 1000 was used for high fanout net computations.
Operating Conditions: WCIND Library: tsmc65lp_108125
No paths.
1
icc_shell> report_timing -from Clk2 -to Clk1
.
.
Case Study: Preparing the Design for skew_opt (3/4)
• Perform a quick clock tree synthesis run to estimate clock latencies
icc_shell> clock_opt -inter_clock_balance -update_clock_latency
.
.
============= Clock Tree Summary ==============
Clock  Sinks  CTBuffers  ClkCells  Skew   LongestPath  TotalDRC  BufferArea
-----------------------------------------------------------------------------------
Clk1   36957  1059       1786      0.306  2.139        0         7213.620
Clk2   350    42         71        0.034  1.820        0         108.800
Updating the latencies on clock objects.(*psynopt*)
Information: Latency computed from clock Clk1 will be applied on clock Clk1. (CTS-530)
Information: Updating the latency of clock Clk1 to 1.804981 (max) 0.735533 (min). (CTS-531)
Information: Latency computed from clock Clk2 will be applied on clock Clk2. (CTS-530)
Information: Updating the latency of Clk2 to 2.000936 (max) 0.848920 (min). (CTS-531)
• Check for high-fanout nets
icc_shell> all_high_fanout -nets
1
Case Study: Preparing the Design for skew_opt (4/4)
• Create path groups (not done for this case study)
group_path -name INPUTS -from [all_inputs] -to [all_registers]
group_path -name OUTPUTS -from [all_registers] -to [all_outputs]
group_path -name REG2REG -from [all_registers] -to [all_registers]
group_path -name FEEDTHROUGH -from [all_inputs] -to [all_outputs]
group_path -name ENABLE -to $enable_pins
Case Study: Running skew_opt
• Script to run useful skew flow
set_clock_latency -max 1.80 [get_clocks Clk1]
set_clock_latency -min 0.74 [get_clocks Clk1]
set_clock_latency -max 2.00 [get_clocks Clk2]
set_clock_latency -min 0.85 [get_clocks Clk2]
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all
extract_rc -estimate
report_qor
skew_opt
report_qor
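The set_clock_latency values in the script above are the latencies reported by the quick clock tree synthesis run, rounded to two decimals. As a minimal sketch (Python, used only for post-processing and not part of the IC Compiler flow), the commands could be derived from those log messages as follows. The function name `latency_commands`, the regular expression, and the two-decimal rounding are illustrative assumptions; the message wording is taken from the log excerpt in this case study.

```python
import re

# Latency messages as worded in the clock_opt log excerpt of this case study
# (message IDs omitted; they are not needed for parsing).
LOG = """\
Information: Updating the latency of clock Clk1 to 1.804981 (max) 0.735533 (min).
Information: Updating the latency of Clk2 to 2.000936 (max) 0.848920 (min).
"""

# The word "clock" before the clock name is optional, as in the log above.
PATTERN = re.compile(
    r"Updating the latency of (?:clock )?(\S+) to ([\d.]+) \(max\) ([\d.]+) \(min\)"
)


def latency_commands(log_text, digits=2):
    """Build set_clock_latency command strings from latency update messages."""
    commands = []
    for name, max_lat, min_lat in PATTERN.findall(log_text):
        commands.append(
            f"set_clock_latency -max {float(max_lat):.{digits}f} [get_clocks {name}]"
        )
        commands.append(
            f"set_clock_latency -min {float(min_lat):.{digits}f} [get_clocks {name}]"
        )
    return commands


for command in latency_commands(LOG):
    print(command)
```

Generating the lines this way avoids transcription errors when a design has many clocks; the output should still be reviewed before it is sourced in the tool.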
Case Study: Check skew_opt Log
Timing Path Group 'Clk1'
-----------------------------------
Critical Path Slack: -0.35
Critical Path Clk Period: 4.00
Total Negative Slack: -662.24
No. of Violating Paths: 6001.00
No. of Hold Violations: 43068.00
Timing Path Group 'Clk2'
-----------------------------------
Critical Path Slack: 0.87
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 261.00
.
.
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-3.525e-01 -7.615e+03 +0.000e+00 +0.000e+00
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00
Minimizing latency adjustments:
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00
Timing Path Group 'Clk1'
-----------------------------------
Levels of Logic: 32.00
Critical Path Length: 3.87
Critical Path Slack: -0.21
Critical Path Clk Period: 4.00
Total Negative Slack: -7.01
No. of Violating Paths: 1016.00
No. of Hold Violations: 44286.00
-----------------------------------
Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic: 9.00
Critical Path Length: 2.80
Critical Path Slack: 0.87
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 269.00
-----------------------------------
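Should correlate: the Clk1 critical path slack reported by report_qor before skew_opt (-0.35) with the first Setup WNS value in the skew_opt log (-3.525e-01), and the Clk1 critical path slack reported after skew_opt (-0.21) with the last Setup WNS value (-2.090e-01)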
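The correlation check on this slide is simple arithmetic and can be sketched in Python (post-processing only, not part of the IC Compiler flow). The sketch assumes the skew_opt log prints Setup WNS as the first of four numeric columns, as in the excerpt above. The function names and the 0.01 ns tolerance are illustrative assumptions; the tolerance was chosen because report_qor prints slack to two decimals.

```python
# Setup/hold table rows as they appear in the skew_opt log excerpt above.
SKEW_OPT_LOG = """\
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-3.525e-01 -7.615e+03 +0.000e+00 +0.000e+00
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00
"""


def setup_wns_values(log_text):
    """Collect the Setup WNS column from every four-column numeric row."""
    values = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # header rows and unrelated lines
        try:
            row = [float(field) for field in fields]
        except ValueError:
            continue  # separator rows made of dashes
        values.append(row[0])
    return values


def correlates(report_qor_slack, skew_opt_wns, tolerance=0.01):
    """True if the two slack values agree within the tolerance (ns)."""
    return abs(report_qor_slack - skew_opt_wns) <= tolerance


wns = setup_wns_values(SKEW_OPT_LOG)
# report_qor critical path slack for Clk1 before and after skew_opt.
print("before skew_opt:", correlates(-0.35, wns[0]))
print("after skew_opt: ", correlates(-0.21, wns[-1]))
```

If either check fails, the timing seen by skew_opt differs from the timing seen by report_qor, and the setup of the flow should be reviewed before the result is trusted.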
Case Study: Running clock_opt and route_opt
• Script to run clock_opt and route_opt
clock_opt -inter_clock_balance
report_qor
set_fix_hold [all_clocks]
route_opt
report_qor
Case Study: QoR After clock_opt
Timing Path Group 'Clk1'
-----------------------------------
Levels of Logic: 18.00
Critical Path Length: 3.80
Critical Path Slack: -0.48
Critical Path Clk Period: 4.00
Total Negative Slack: -885.28
No. of Violating Paths: 6494.00
No. of Hold Violations: 38738.00
-----------------------------------
Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic: 9.00
Critical Path Length: 2.95
Critical Path Slack: 0.81
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 274.00
-----------------------------------
Case Study: Check Final QoR
                              Useful skew flow   Baseline flow
Timing Path Group 'Clk1'
----------------------------------------------------------------
Levels of Logic:              11.00              11.00
Critical Path Length:         0.75               0.78
Critical Path Slack:          -0.58              -0.76
Critical Path Clk Period:     4.00               4.00
Total Negative Slack:         -359.50            -720.42
No. of Violating Paths:       3483.00            5113.00
No. of Hold Violations:       40152.00           41851.00
----------------------------------------------------------------
Timing Path Group 'Clk2'
----------------------------------------------------------------
Levels of Logic:              9.00               9.00
Critical Path Length:         2.92               2.93
Critical Path Slack:          1.00               0.86
Critical Path Clk Period:     8.00               8.00
Total Negative Slack:         0.00               0.00
No. of Violating Paths:       0.00               0.00
No. of Hold Violations:       281.00             282.00
----------------------------------------------------------------
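The Clk1 figures from the two final QoR reports can be compared numerically. The Python below is a minimal sketch for post-processing only; the function name `percent_improvement` and the choice to express each change as a percentage reduction in magnitude relative to the baseline are illustrative assumptions, not a metric defined by this application note.

```python
def percent_improvement(baseline, useful_skew):
    """Percentage reduction in magnitude relative to the baseline value."""
    return 100.0 * (abs(baseline) - abs(useful_skew)) / abs(baseline)


# Clk1 values from the final QoR reports: (baseline flow, useful skew flow).
clk1 = {
    "WNS (ns)": (-0.76, -0.58),
    "TNS (ns)": (-720.42, -359.50),
    "Violating paths": (5113, 3483),
    "Hold violations": (41851, 40152),
}

for metric, (baseline, useful_skew) in clk1.items():
    change = percent_improvement(baseline, useful_skew)
    print(f"{metric}: {change:.1f}% lower in magnitude than baseline")
```

A positive percentage means the useful skew flow ended with a smaller violation than the baseline flow for that metric.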