![Page 1: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/1.jpg)
Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient
Hao Yu, Yu Hu, Chun-Chen Liu and Lei He
EE Department, UCLA
Presented by Yu Hu
Partially supported by SRC task 1116.
![Page 2: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/2.jpg)
Introduction
- Both process and operation variations cause uncertainty and may lead to design failure or over-design.
- Process variations have been actively studied:
  - Statistical timing analysis
  - Stochastic optimization
  - Post-silicon configuration
- Stochastic optimization for the operation variations below has been largely ignored:
  - Fluctuation of crosstalk noise and P/G network noise due to different input vectors
  - Time-variant on-chip temperature maps over different workloads
- This work is the first in-depth study of clock synthesis considering time-variant temperature variations.
![Page 3: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/3.jpg)
Limitation of Existing Work
- The existing work [Cho:ICCAD05] ignores time-variant temperature variations and assumes a fixed temperature map.
- Different workloads lead to different temperature maps (e.g., two SPEC2000 applications: Ammp and Gzip).
- Optimizing skew for one application hurts the skew for another. This work resolves that conflict.
[Figure: source S driving sinks A and B, two cases. Left: D_SA = 7 ns, D_SB = 7 ns, skew = 0 ns. Right: D_SA = 2 ns, D_SB = 6 ns, skew = 4 ns.]
![Page 4: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/4.jpg)
Outline
Modeling and Problem Formulation
Algorithms
Experimental Results
Conclusions
![Page 5: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/5.jpg)
Stochastic Temperature Model
- The temperature map is unique for each application or program phase.
  - It can be obtained by uArch-level simulation.
- For each region of the chip, temperature is characterized by its mean and variance over a number of maps.
  - Principal component analysis (PCA) is used to decide the number of maps.
- Temperature correlation, measured as the covariance between regions, is high over the SPEC2000 benchmark set.
- Considering temperature correlations during optimization can compress the search space!
[Figure: temperature correlation map; entry (i,j) is the correlation between regions i and j.]
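The per-region statistics on this slide can be illustrated with a short sketch. It assumes the temperature maps are stacked in an array with one row per map and one column per chip region. The synthetic data, the function names and the 95% variance threshold are assumptions made for illustration; the slides do not say how the PCA cut-off is chosen.

```python
import numpy as np

def region_statistics(maps):
    """maps: array of shape (num_maps, num_regions), one row per temperature map."""
    maps = np.asarray(maps, dtype=float)
    mean = maps.mean(axis=0)                  # per-region mean temperature
    var = maps.var(axis=0, ddof=1)            # per-region sample variance
    cov = np.cov(maps, rowvar=False)          # covariance between regions
    corr = np.corrcoef(maps, rowvar=False)    # normalized correlation (i, j)
    return mean, var, cov, corr

def num_principal_components(maps, energy=0.95):
    """Number of principal components needed to explain `energy` of the variance.

    The 0.95 threshold is an assumed value, not one taken from the slides.
    """
    maps = np.asarray(maps, dtype=float)
    centered = maps - maps.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(explained, energy) + 1)

# Synthetic stand-in for simulated maps: 100 maps, 8 regions, 2 hidden workload factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 2))
loading = rng.uniform(0.5, 1.5, size=(2, 8))
maps = 70.0 + 5.0 * factors @ loading + rng.normal(scale=0.1, size=(100, 8))

mean, var, cov, corr = region_statistics(maps)
print("mean temperature per region:", np.round(mean, 1))
print("dominant components:", num_principal_components(maps))
```

Because the synthetic regions share only two hidden factors, their correlation is high and very few components carry almost all of the variance, which is the property the slide relies on.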
![Page 6: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/6.jpg)
Problem Formulation
- Given:
  - The source, sinks and an initial tree embedding
  - A set of temperature maps for a benchmark set
- Design freedoms:
  - Re-embedding of the clock tree
  - Cross-link insertion
- Objective: minimize the worst-case skew among the given temperature maps.
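The objective can be written down in a few lines. The sketch below uses the two sets of source-to-sink delays quoted on the "Limitation of Existing Work" slide as two temperature maps; the function names and the dictionary layout are illustrative assumptions.

```python
def skew(delays):
    """Skew under one temperature map: largest minus smallest sink delay."""
    values = list(delays.values())
    return max(values) - min(values)

def worst_case_skew(delays_per_map):
    """Objective of the formulation: the largest skew over all temperature maps."""
    return max(skew(delays) for delays in delays_per_map)

# Source-to-sink delays in ns for sinks A and B under two maps.
delays_per_map = [
    {"A": 7.0, "B": 7.0},  # skew = 0 ns
    {"A": 2.0, "B": 6.0},  # skew = 4 ns
]
print(worst_case_skew(delays_per_map))  # 4.0
```

A tree tuned only for the first map looks perfect there, yet the objective still reports 4 ns because the second map decides the worst case.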
![Page 7: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/7.jpg)
Outline
Modeling and Problem Formulation
Algorithms
Experimental Results
Conclusions
![Page 8: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/8.jpg)
Bottom-up Greedy-based Re-embedding
[Figure: clock tree with nodes a, b, c, d, v, x and y. Legend: sink, original merging point, re-embedding option.]
![Page 9: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/9.jpg)
Bottom-up Greedy-based Re-embedding
[Figure: the same clock tree with nodes a, b, c, d, v, x and y after re-embedding. Legend: new merging point.]
![Page 10: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/10.jpg)
Delay and Skew with Re-embedding
- Perturbed Modified Nodal Analysis (MNA)
  - x is the state for the source, sinks and merging points
  - L selects the sink responses
- Define a new state variable containing both the nominal response (x) and the sensitivity (Δx) [key to triangularizing the system]
- Structured and parameterized state matrix
- The number of re-embedding options, I = 5^N, is huge! (N is the number of merging points)
![Page 11: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/11.jpg)
Compressing Solution Space by Temperature Correlation
- Motivation
  - Highly correlated merging points should be re-embedded in the same fashion.
- Solution
  - Calculate the correlation between two merging points based on temperature correlations.
  - Cluster merging points based on correlation strength.
  - Perform the same re-embedding for all points within one cluster.
![Page 12: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/12.jpg)
Temperature Correlation Driven Clustering
- The correlation matrix C of the merging points is low-rank, and singular value decomposition (SVD) reveals the rank K.
- Partition the merging points into K clusters (K-means).
  - Maximize the correlation strength within each of the K clusters.
- Low-rank approximation: C ≈ C_K
- Example: K = 4, N = 70. The number of re-embedding options is reduced from 5^70 to 5^4.
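A minimal sketch of the SVD plus K-means step follows. It assumes the rank is read off by thresholding singular values against the largest one, and that K-means runs on the rows of the leading singular vectors. Both choices, the 0.1 threshold, the function names and the synthetic data are assumptions; the slides do not give these details.

```python
import numpy as np

def estimate_rank(C, rel_tol=0.1):
    """Count singular values above rel_tol times the largest (assumed rule)."""
    s = np.linalg.svd(C, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations with farthest-point initialization (deterministic)."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def cluster_merging_points(C, rel_tol=0.1):
    """Estimate the rank K of C, then split the merging points into K clusters."""
    k = estimate_rank(C, rel_tol)
    u, s, _ = np.linalg.svd(C)
    embedding = u[:, :k] * s[:k]  # one row per merging point
    return k, kmeans(embedding, k)

# Synthetic example: 12 merging points whose temperatures follow 3 hidden factors.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1, 2], 4)
factors = rng.normal(size=(200, 3))
temps = factors[:, groups] + 0.05 * rng.normal(size=(200, 12))
C = np.corrcoef(temps, rowvar=False)

k, labels = cluster_merging_points(C)
print("rank K:", k)
print("cluster of each merging point:", labels)
```

With one shared re-embedding decision per cluster and five options per decision, the search space has 5**K entries instead of 5**N. For the slide's numbers (K = 4, N = 70) that is 5**4 = 625 combinations.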
![Page 13: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/13.jpg)
Recap of Skew Calculation with Re-embedding
[Figure: flow of the skew calculation.
1. Structured state matrix built from nominal blocks G0 (M×M), perturbation blocks ΔG1, ΔG2, ..., ΔGN (M×M) and zero blocks 0 (M×M).
2. Cluster-based reduction (SVD + K-means) keeps only ΔG1, ..., ΔGK (M×M), with K << N.
3. Block-wise MOR [Yu et al, DAC'06] (best paper award nominee) reduces every block from M×M to m×m.
4. Transient time analysis (backward Euler) gives the time-domain voltage response.
5. Delay and skew are obtained from the response.]
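The last two stages of the flow, backward-Euler transient analysis followed by delay and skew extraction, can be sketched on a toy circuit. The example assumes a three-node RC tree (a merging point feeding two sinks), a unit-step clock source, and delay measured at the 50% crossing. It leaves out the sensitivity-augmented state and the model order reduction described in the slides, and every element value is made up for illustration.

```python
import numpy as np

def build_rc_tree(r_drv, r_a, r_b, c_m, c_a, c_b):
    """Nodes: 0 = merging point, 1 = sink A, 2 = sink B (ohms and farads)."""
    g0, ga, gb = 1.0 / r_drv, 1.0 / r_a, 1.0 / r_b
    G = np.array([
        [g0 + ga + gb, -ga, -gb],
        [-ga, ga, 0.0],
        [-gb, 0.0, gb],
    ])
    C = np.diag([c_m, c_a, c_b])
    b = np.array([g0, 0.0, 0.0])  # the source drives node 0 through r_drv
    return G, C, b

def sink_delays(G, C, b, sinks, h, steps, threshold=0.5):
    """Backward Euler on C dv/dt + G v = b u(t) with a unit step u(t)."""
    A = G + C / h
    v = np.zeros(G.shape[0])
    delays = {}
    for n in range(1, steps + 1):
        v_next = np.linalg.solve(A, C @ v / h + b)
        for s in sinks:
            if s not in delays and v_next[s] >= threshold:
                # Linear interpolation of the crossing time inside this step.
                frac = (threshold - v[s]) / (v_next[s] - v[s])
                delays[s] = (n - 1 + frac) * h
        v = v_next
        if len(delays) == len(sinks):
            break
    return delays

# Illustrative values only: the branch to sink B is more resistive than the one to A.
G, C, b = build_rc_tree(100.0, 200.0, 300.0, 50e-15, 100e-15, 100e-15)
delays = sink_delays(G, C, b, sinks=(1, 2), h=0.1e-12, steps=5000)
skew = max(delays.values()) - min(delays.values())
print({s: round(d * 1e12, 2) for s, d in delays.items()}, "ps")
print("skew:", round(skew * 1e12, 2), "ps")
```

Backward Euler is unconditionally stable for this kind of RC system, so the step size only affects accuracy. Changing a branch resistance is a simple stand-in for the effect of a different temperature map on wire resistance.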
![Page 14: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/14.jpg)
Simultaneous Re-embedding and Cross Link Insertion
1. Decide cross-link candidates according to [Rajaram, DAC04].
2. Cluster the cross-link candidates, again based on temperature correlation.
3. Calculate skew sensitivities w.r.t. the cross-link and re-embedding candidates, in a fashion similar to the previous triangular block-wise MOR.
4. Bottom-up, select the best cross-link or re-embedding.
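Steps 3 and 4 can be pictured with a small greedy loop. The sketch assumes a first-order model in which each candidate option adds a known signed change (its sensitivity) to the skew under every temperature map. The option names, cluster names and all numbers are hypothetical; the slides obtain the sensitivities from the block-wise MOR rather than from a table like this one.

```python
def worst_skew(base, sens, choice):
    """First-order worst-case skew over all maps for the chosen options."""
    worst = 0.0
    for m, s0 in enumerate(base):
        s = s0 + sum(sens[c][opt][m] for c, opt in choice.items())
        worst = max(worst, abs(s))
    return worst

def greedy_select(base, sens, order):
    """Visit clusters bottom-up and keep the option that lowers the worst skew most."""
    choice = {}
    for c in order:
        best_opt, best_val = None, None
        for opt in sens[c]:
            trial = dict(choice)
            trial[c] = opt
            val = worst_skew(base, sens, trial)
            if best_val is None or val < best_val:
                best_opt, best_val = opt, val
        choice[c] = best_opt
    return choice, worst_skew(base, sens, choice)

# Signed skew in ps under two temperature maps before any change (made-up values).
base = [12.0, -3.0]
# Per cluster: option -> skew change under each map. "keep" leaves the tree as is.
sens = {
    "leaf": {"keep": [0.0, 0.0], "shift": [-8.0, 1.0], "xlink": [-5.0, 2.0]},
    "root": {"keep": [0.0, 0.0], "shift": [-3.0, 3.0]},
}
order = ["leaf", "root"]  # bottom-up

choice, worst = greedy_select(base, sens, order)
print(choice, worst)  # {'leaf': 'shift', 'root': 'shift'} 1.0
```

Including a "keep" option for every cluster guarantees that a greedy step never makes the estimated worst-case skew larger than it was before the step.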
![Page 15: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/15.jpg)
Outline
Modeling and Problem Formulation
Algorithms
Experimental Results
Conclusions
![Page 16: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/16.jpg)
Experimental Settings
- Temperature maps are obtained by a micro-architecture level power-temperature transient simulator [Liao,TCAD'05] with 6 SPEC2000 applications.
  - 100 temperature maps, one for every 10 million clock cycles
- Compare four algorithms (two categories):
  - Traditional optimization under nominal temperature and Elmore delay
    - DME: deferred merging-point embedding to minimize wire length for zero skew
    - xlink: cross-link insertion [Rajaram, ICCAD'04]
  - The proposed algorithms with temperature variation and a high-order delay model
    - re-embed: re-embedding
    - xlink+re-embed: simultaneous re-embedding and cross-link insertion
![Page 17: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/17.jpg)
Skew Distribution Over 100 Temperature Maps
[Figure: skew distribution over the 100 temperature maps. Legend: X+R = cross-link insertion + re-embedding; DME = deferred merging-point embedding.]
![Page 18: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/18.jpg)
Worst-case Skew
- For tree structures, re-embed reduces the worst-case skew by 3x on average (up to 20x) compared to DME.
- For non-tree structures, xlink+re-embed reduces the worst-case skew by 30% on average (up to 7x) compared to xlink.
[Figure: worst-case skew in ps (log scale, 1 to 10000) for benchmarks r1 to r5; series: DME, xlink, re-embed, xlink+re-embed.]
![Page 19: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/19.jpg)
Wire Length
- For tree structures, re-embed has less than 1% wire length overhead compared to DME.
- For non-tree structures, xlink+re-embed has 5% LESS wire length compared to xlink.
[Figure: wire length (axis from 0.00E+00 to 1.40E+07) for benchmarks r1 to r5; series: DME, re-embed, xlink, xlink+re-embed.]
![Page 20: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/20.jpg)
Runtime
- Temperature-aware optimizations (re-embed and xlink+re-embed) are about 10x slower compared to DME and xlink, respectively, but:
  - Our work uses a high-order delay model.
  - DME and xlink use Elmore delay.

| Input (nodes) | DME | xlink | re-embed | xlink+re-embed |
|---|---|---|---|---|
| r1 (267) | 0.5 | 1.1 | 1.1 | 1.4 |
| r2 (598) | 1.0 | 3.2 | 4.5 | 5.3 |
| r3 (862) | 1.4 | 4.7 | 6.1 | 13.2 |
| r4 (1903) | 2.1 | 5.5 | 33.6 | 59.0 |
| r5 (3101) | 6.2 | 11.4 | 86.6 | 191.4 |
| GeoMean | 1.6 | 4.0 | 9.7 | 16.2 |
| ratio | 1 | 2x | 6x | 10x |
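The GeoMean row can be checked directly from the per-benchmark runtimes in the table. The short script below is only a verification aid; the dictionary layout and function name are illustrative.

```python
import math

# Runtimes for r1..r5, copied from the runtime table.
runtimes = {
    "DME": [0.5, 1.0, 1.4, 2.1, 6.2],
    "xlink": [1.1, 3.2, 4.7, 5.5, 11.4],
    "re-embed": [1.1, 4.5, 6.1, 33.6, 86.6],
    "xlink+re-embed": [1.4, 5.3, 13.2, 59.0, 191.4],
}

def geomean(values):
    """Geometric mean computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

for name, values in runtimes.items():
    print(f"{name}: {geomean(values):.1f}")
```

The printed values match the GeoMean row (1.6, 4.0, 9.7 and 16.2). The ratio row then follows by dividing each geometric mean by the DME value.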
![Page 21: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/21.jpg)
Conclusions
- Studied clock optimization for workload-dependent temperature variation.
  - Reduced the worst-case skew by up to 7x with LESS wire length compared to the best existing method.
- The correlation-aware modeling and optimization paradigm can be extended to handle PVT variations and more design freedoms:
  - "Temperature Aware Microprocessor Floorplanning Considering Application Dependent Power Load" [Chu et al, ICCAD07]
  - "Efficient Decoupling Capacitance Budgeting Considering Operation and Processing Variations" [Shi et al, finalist for Best Paper, ICCAD07]
![Page 22: Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.](https://reader035.fdocuments.net/reader035/viewer/2022062407/56649d0c5503460f949e0d5d/html5/thumbnails/22.jpg)
Thank you!
SRC TechCon 2007
Hao Yu (graduated), Yu Hu (presenter),
Chun-Chen Liu and Lei He (PI)
Minimal Skew Clock Embedding Considering Time Variant Temperature Gradient