Asynchronous Circuits
Jordi Cortadella
Universitat Politècnica de Catalunya, Barcelona
Collège de France
May 14th, 2013
Goals
• Convince ourselves that:
– designing an asynchronous circuit is easy
– synchronous and asynchronous circuits are similar
– asynchronous circuits bring new advantages
• Not to discourage designers with exotic and sophisticated asynchronous schemes
Collège de France 2013 Asynchronous circuits 2
Clocking
Nvidia Kepler™ GK110
28 nm, 7.1B transistors, 550 mm², 2688 CUDA cores, base clock: 836 MHz, memory clock: 6 GHz
• How to distribute the clock?
• How to determine the clock frequency?
• How to implement robust communications?
• How to reduce and manage energy?
Synchronous circuits
Synchronous circuit
[Figure: combinational logic between two banks of flip-flops, clocked by a PLL]
Synchronous circuit
[Figure: combinational logic (CL) between flip-flops, clocked by a PLL]
Two competing paths:
• Launching path
• Capturing path
Launching path < Capturing path + Period
CLKtree + CL < CLKtree + Period
CL < Period (no clock skew)
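The timing constraint above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `meets_setup` and all delay values are hypothetical, and flip-flop setup time and clock-to-output delay are ignored, as they are in the inequality on the slide.

```python
def meets_setup(launch_clk_tree, comb_logic, capture_clk_tree, period):
    """Return True if: launching path < capturing path + period."""
    launching_path = launch_clk_tree + comb_logic
    capturing_path = capture_clk_tree
    return launching_path < capturing_path + period


# Hypothetical delays in nanoseconds.
# No skew: both clock-tree branches take 0.5 ns, so the check reduces to CL < Period.
print(meets_setup(0.5, 1.0, 0.5, period=1.2))  # CL = 1.0 < 1.2
print(meets_setup(0.5, 1.0, 0.5, period=0.9))  # CL = 1.0 is not < 0.9
# Skew: the capturing branch is 0.3 ns shorter, which eats into the period.
print(meets_setup(0.5, 1.0, 0.2, period=1.2))  # 1.5 is not < 1.4
```

With skew, the two clock-tree terms no longer cancel, which is why the last case fails even though CL < Period.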
Source-synchronous
[Figure: pipeline in which CLKgen drives a chain of matched delays; the launching path and the capturing path are marked]
• No global clock required
• More tolerance to PVT variations
• Period > longest combinational path
• Good for acyclic pipelines
Source-synchronous with forks and joins
[Figure: CLKgen driving a pipeline with forks and joins; the synchronization point is marked "?"]
How to synchronize incoming events?
C element (Muller 1959)
[Figure: C element with inputs A and B and output C, with a timing diagram of A, B and C]
A B | C
0 0 | 0
0 1 | C (holds its previous value)
1 0 | C (holds its previous value)
1 1 | 1
C element (Muller 1959)
[Figure: C element implemented with a majority gate (MAJ) on A, B and the fed-back output C]
(many implementations exist)
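The truth table and the majority-gate implementation can be modeled in a few lines. This is an illustrative behavioral sketch, assuming the majority gate takes A, B and the previous output C as its three inputs; the class name `CElement` is hypothetical.

```python
def majority(a, b, c):
    """Three-input majority function on 0/1 values."""
    return (a & b) | (a & c) | (b & c)


class CElement:
    """Two-input Muller C element: C = MAJ(A, B, previous C)."""

    def __init__(self, initial=0):
        self.c = initial

    def update(self, a, b):
        # The output changes only when both inputs agree; otherwise it holds.
        self.c = majority(a, b, self.c)
        return self.c


gate = CElement()
trace = [gate.update(a, b) for a, b in [(0, 1), (1, 1), (1, 0), (0, 0)]]
print(trace)  # [0, 1, 1, 0]
```

The output rises only after both inputs are 1 and falls only after both are 0, which is the behavior needed to synchronize incoming events.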
Multi-input C element
[Figure: tree of two-input C elements combining the inputs a1 … a7 into a single output c]
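A multi-input C element can be sketched as a tree of two-input C elements. The code below is a behavioral illustration only; the class `CTree` and its pairing order are assumptions, and the equivalence with a true n-input C element relies on the usual discipline that all inputs rise before any of them falls again.

```python
def c_element(a, b, prev):
    """Two-input C element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


class CTree:
    """n-input C element built from n-1 two-input C elements (hypothetical helper)."""

    def __init__(self, n):
        self.n = n
        self.state = [0] * (n - 1)  # one stored output per two-input element

    def update(self, inputs):
        assert len(inputs) == self.n
        level = list(inputs)
        k = 0  # index of the next two-input element in the tree
        while len(level) > 1:
            nxt = []
            for i in range(0, len(level) - 1, 2):
                self.state[k] = c_element(level[i], level[i + 1], self.state[k])
                nxt.append(self.state[k])
                k += 1
            if len(level) % 2:  # an odd signal passes to the next level unchanged
                nxt.append(level[-1])
            level = nxt
        return level[0]


tree = CTree(7)
print(tree.update([1] * 7))                # all inputs high: output rises
print(tree.update([1, 1, 1, 0, 1, 1, 1]))  # one input low: output holds
print(tree.update([0] * 7))                # all inputs low: output falls
```

For seven inputs the tree uses six two-input elements.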
Completion detection
Completion detection
[Figure: CLKgen with a fixed delay]
The fixed delay must be longer than the worst-case logic delay (plus variability)
Q: Could we detect when a computation has completed ASAP?
Delay-insensitive codes: Dual Rail
• Dual rail: every bit encoded with two signals
A.t A.f | A
0 0 | Spacer (SP)
0 1 | 0
1 0 | 1
1 1 | Not used
[Figure: waveforms of A.t and A.f for the sequence A = 1, SP, 0, SP, 1, SP, 1, SP]
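The encoding table translates directly into a pair of helper functions. This is a small sketch with hypothetical names (`encode`, `decode`), using `None` to stand for the spacer.

```python
SPACER = (0, 0)


def encode(bit):
    """Return the rail pair (A.t, A.f) for a bit, or the spacer for None."""
    if bit is None:
        return SPACER
    return (1, 0) if bit else (0, 1)


def decode(t, f):
    """Return 0, 1, or None (spacer); the codeword 11 is rejected."""
    if (t, f) == (1, 1):
        raise ValueError("codeword 11 is not used in dual rail")
    if (t, f) == SPACER:
        return None
    return 1 if t else 0


# The sequence shown on the slide: valid values separated by spacers.
sequence = [1, None, 0, None, 1, None, 1, None]
wires = [encode(b) for b in sequence]
print(wires)
```

Because every valid value is separated from the next by a spacer, a receiver can tell when new data has arrived by observing the wires alone.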
Dual-Rail AND gate
A B | C
SP SP | SP
0 - | 0
- 0 | 0
SP 1 | SP
1 SP | SP
1 1 | 1
[Figure: AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]
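One realization consistent with this table is C.t = A.t AND B.t and C.f = A.f OR B.f. The sketch below assumes that realization (the gate-level figure is not recoverable from the transcript) and checks it against every row; rail pairs are written as (t, f).

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # dual-rail codewords as (t, f)


def dr_and(a, b):
    """Dual-rail AND, assuming C.t = A.t & B.t and C.f = A.f | B.f."""
    (at, af), (bt, bf) = a, b
    return (at & bt, af | bf)


# Rows of the table; "-" (don't care) is expanded to SP, 0 and 1.
for other in (SP, ZERO, ONE):
    assert dr_and(ZERO, other) == ZERO  # 0 -  -> 0
    assert dr_and(other, ZERO) == ZERO  # - 0  -> 0
assert dr_and(SP, SP) == SP
assert dr_and(SP, ONE) == SP
assert dr_and(ONE, SP) == SP
assert dr_and(ONE, ONE) == ONE
print("all table rows match")
```

The rows with "-" show early evaluation: a single 0 input is enough to produce a valid 0 output, even while the other input is still a spacer.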
Dual-Rail Inverter
A | Z
SP | SP
0 | 1
1 | 0
[Figure: inverter with inputs A.t, A.f and outputs Z.t, Z.f]
Dual-Rail AND/OR gate
[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f, next to the same circuit with the t and f rails of A, B and C swapped, which gives the OR gate]
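Swapping the two rails of a signal inverts it, so an inverter costs no logic and an OR gate can be obtained from an AND gate by swapping the rails of its inputs and output (De Morgan). The sketch below illustrates this under the assumption that the AND gate is realized as C.t = A.t AND B.t, C.f = A.f OR B.f; all function names are hypothetical.

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # dual-rail codewords as (t, f)


def swap(x):
    """Exchange the t and f rails of a dual-rail signal."""
    t, f = x
    return (f, t)


def dr_not(a):
    """Dual-rail inverter: only a wire swap."""
    return swap(a)


def dr_and(a, b):
    (at, af), (bt, bf) = a, b
    return (at & bt, af | bf)


def dr_or(a, b):
    """Dual-rail OR: the AND circuit with every rail pair swapped."""
    return swap(dr_and(swap(a), swap(b)))


print(dr_not(ZERO), dr_not(ONE), dr_not(SP))
print(dr_or(ZERO, ZERO), dr_or(ZERO, ONE), dr_or(SP, ONE), dr_or(SP, ZERO))
```

The spacer (0, 0) is unchanged by the swap, so inverted signals still return to the spacer between values.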
Dual rail: completion detection
[Figure: block of dual-rail logic whose rail pairs move from the spacer (00) to valid codewords (01 or 10)]
Dual rail: completion detection
[Figure: the outputs of the dual-rail logic feed a completion detection tree ending in a C element that produces the signal "done"]
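A common way to build such a tree is to OR the two rails of every bit (valid versus spacer) and combine the results with a multi-input C element. The sketch below assumes that scheme, which the transcript does not spell out; the class name `CompletionDetector` is hypothetical.

```python
class CompletionDetector:
    """'done' rises when every bit is valid and falls when every bit is a spacer."""

    def __init__(self):
        self.done = 0

    def update(self, word):
        # word is a list of rail pairs (t, f); a bit is valid when t OR f is 1.
        valid = [t | f for t, f in word]
        # Multi-input C element over the per-bit validity signals.
        if all(valid):
            self.done = 1
        elif not any(valid):
            self.done = 0
        return self.done


cd = CompletionDetector()
print(cd.update([(0, 0), (0, 0), (0, 0)]))  # all spacers: not done
print(cd.update([(0, 1), (0, 0), (1, 0)]))  # partially valid: still not done
print(cd.update([(0, 1), (1, 0), (1, 0)]))  # all valid: done
print(cd.update([(0, 0), (1, 0), (0, 0)]))  # returning to spacer: done holds
print(cd.update([(0, 0), (0, 0), (0, 0)]))  # all spacers again: done falls
```

The hysteresis of the C element is what makes "done" safe to use as a handshake signal: it does not fall until the whole word has returned to the spacer.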
Dual rail: completion detection
[Figure: dual-rail AND, OR, INV and AND gates with CLKgen]
Dual rail: completion detection
[Figure: dual-rail AND, OR, INV and AND gates with C elements]
Single rail data vs. dual rail
Some back-of-the-envelope estimations:
              | Single rail | Dual rail
Area          | 1           | 2
Delay         | 1           | << 1
Static power  | 1           | 2
Dynamic power | < 0.2       | 2
Dual rail:
• Good for speed
• Large area
• High power consumption
Handshaking
Handshaking
[Figure: CLKgen with an unknown delay]
Assume that the source module can provide data at any rate:
• When should the CLK generator send an event if the internal delays of the circuit are unknown?
Solution: handshaking
Handshaking
[Figure: a sender ("I have data") and a receiver ("I want data") connected by Data, Request and Acknowledge]
Asynchronous elastic pipeline
[Figure: chain of C elements between ReqIn/AckIn and ReqOut/AckOut]
• David Muller's pipeline (late 1950s)
• Sutherland's Micropipelines (Turing Award lecture, 1989)
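The elasticity of such a pipeline can be illustrated with a behavioral model in which each stage is a C element driven by its left neighbor and by the inverted output of its right neighbor. This is a sketch under that standard assumption about the wiring; the function `settle` and the way the environment is modeled are hypothetical.

```python
def c_element(a, b, prev):
    """Two-input C element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


def settle(state, req_in, ack_out):
    """Update the stages in place until no C element changes its output."""
    n = len(state)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            left = req_in if i == 0 else state[i - 1]
            right = ack_out if i == n - 1 else state[i + 1]
            new = c_element(left, 1 - right, state[i])
            if new != state[i]:
                state[i] = new
                changed = True
    return state


stages = [0, 0, 0, 0]
# ack_out held at 1 models a consumer that is not ready, so the last stage cannot fire.
print(settle(stages, req_in=1, ack_out=1))  # [1, 1, 1, 0]
print(settle(stages, req_in=0, ack_out=1))  # [0, 0, 1, 0]
print(settle(stages, req_in=1, ack_out=1))  # [1, 0, 1, 0]
```

Each boundary between neighboring stages with different values is a stored event. In the last line the values alternate along the whole pipeline, so it is full and holds further requests back until the consumer acknowledges.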
Multiple inputs and outputs
[Figure: pipeline stage with multiple input and output channels, including a delay element]
Channel-based communication
• A channel contains data and handshake wires
[Figure: channel made of Data, Req and Ack]
Two-phase protocol
• Every edge is active
• It may require double-edge triggered flip-flops or pulse generators
[Figure: waveforms of Req, Ack and Data (Data 1, Data 2, Data 3) with the data transfers marked]
Four-phase protocol
• Valid data on the active edge of Req
• Req/Ack must return to zero before the next transfer
• Different variations of the 4-phase protocol exist
[Figure: waveforms of Req, Ack and Data (Data 1, Data 2, Data 3) with the data transfers marked]
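The difference between the two protocols shows up when the Req/Ack events are listed per transfer. The sketch below is illustrative: the function names are hypothetical, and it models one common four-phase ordering (Req+, Ack+, Req-, Ack-), while other variations exist.

```python
def two_phase(transfers):
    """(req, ack) after each event: one Req edge and one Ack edge per transfer."""
    req = ack = 0
    trace = []
    for _ in range(transfers):
        req ^= 1                 # any edge of Req signals new data
        trace.append((req, ack))
        ack ^= 1                 # any edge of Ack confirms the transfer
        trace.append((req, ack))
    return trace


def four_phase(transfers):
    """(req, ack) after each event: Req+, Ack+, Req-, Ack- per transfer."""
    req = ack = 0
    trace = []
    for _ in range(transfers):
        for signal, value in (("req", 1), ("ack", 1), ("req", 0), ("ack", 0)):
            if signal == "req":
                req = value
            else:
                ack = value
            trace.append((req, ack))
    return trace


print(two_phase(2))   # [(1, 0), (1, 1), (0, 1), (0, 0)]
print(four_phase(1))  # [(1, 0), (1, 1), (0, 1), (0, 0)]
```

The same four events carry two transfers in the two-phase protocol and one transfer in the four-phase protocol, where the second half is the return to zero.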
How to memorize?
[Figure: combinational logic between latches (L), with a delay and C elements in the control path; the latch controls are marked "?"]
2-phase or 4-phase?
How to memorize?
[Figure: the same circuit with pulse generators]
2-phase
How to memorize?
[Figure: combinational logic, latches (L), delay and C elements]
4-phase
Performance analysis
Ring oscillators
[Figure: network of C elements forming several rings, with nodes numbered 1 to 7]
• Every ring requires an odd number of inverters
• The cycle period is determined by the slowest ring
• The cycle period is adapted to the operating conditions (temperature, voltage)
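The first two bullets can be turned into a small calculation. The sketch below treats each ring as a plain ring oscillator whose period is twice the sum of its element delays (every node must rise and fall once per cycle); that model, the function names and the delay values in picoseconds are all assumptions for illustration.

```python
def oscillates(ring):
    """A ring oscillates only with an odd number of inverting elements."""
    return sum(1 for _, inverting in ring if inverting) % 2 == 1


def ring_period(ring):
    """Period of one ring: two traversals of the loop (rise and fall)."""
    return 2 * sum(delay for delay, _ in ring)


def cycle_period(rings):
    """The slowest ring determines the cycle period of the whole circuit."""
    assert all(oscillates(r) for r in rings), "every ring needs an odd number of inverters"
    return max(ring_period(r) for r in rings)


# Each element is (delay in ps, inverting?). Values are hypothetical.
ring_a = [(100, True), (150, False), (120, False)]
ring_b = [(200, True), (300, False)]
print(ring_period(ring_a), ring_period(ring_b))  # 740 1000
print(cycle_period([ring_a, ring_b]))            # 1000
```

Since the delays themselves depend on temperature and voltage, the resulting period follows the operating conditions without any external reference.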
Why asynchronous?
Modularity
• Time-independent functional composability
– Performance may be affected (but not functionality)
[Figure: modules A and B connected by a channel (Data, Req, Ack), with an alternative module B']
Tracking variability
[Figure: matched delay]
Tracking variability
[Figure: delay of the critical paths and of a multi-corner matched delay at the best, typical and worst corners]
Good correlation for:
• Process variability (systematic)
• Global voltage fluctuations
• Temperature
• Aging (partially)
Margins
Rigid clocks:
[Figure: the cycle period covers gate and wire delays (typ) plus margins for P, V, T, aging, PLL jitter and skew]
Elastic clocks:
[Figure: the cycle period covers gate and wire delays (typ) plus margins for P, V, T, aging and skew]
Margin reduction
Speed-up / Power savings
Clock elasticity
Rigid clock:
[Figure: each cycle period is split into computation time and wasted time]
Elastic clock:
[Figure: each cycle period is taken up by computation time]
Voltage scaling and power savings
[Figure: results for 3 ARM926 cores on the same die, labeled -24% and -14%]
Design Automation
Design automation paradigms
• Synthesis of asynchronous controllers
– Logic synthesis from Petri nets or asynchronous FSMs
• Syntax-directed translation
– Correct-by-construction composition of handshake components
• De-synchronization
– Automatic transformation from synchronous to asynchronous
Synthesis of asynchronous controllers
[Figure: controller with signals DSr, LDS, LDTACK, D and DTACK, specified by a Petri net over the transitions LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK- and DSr+]
Synthesis of asynchronous controllers
[Figure: the same specification next to a circuit with signals DTACK, D, DSr, LDS and LDTACK]
Example: Petrify
Syntax-directed translation
P = (A || B) ; C
[Figure: tree of handshake components: a seq component over a par component (with A and B) and C]
Syntax-directed translation
P = (A ; B)
[Figure: seq component over A and B]
Syntax-directed translation
c := a + b
[Figure: adder (+) connected to the variables a, b and c]
Syntax-directed translation
[Figure: handshake circuit for gcd, with components labeled SEQ, do, RWMUX and DMX, the variables x and y, the operators -, <> and <, and the output out]
int = type [0..255]
& gcd: main proc (in? chan <<int,int>> & out! chan int)
begin x, y: var int
| forever do
    in?<<x,y>>
  ; do x <> y then
      if x < y then y:=y-x else x:=x-y fi
    od
  ; out!x
  od
end
Sources:
J. Kessels and A. Peeters. DESCALE: A Design Experiment for a Smart Card Application Consuming Low Energy. In Principles of Asynchronous Circuit Design: A Systems Perspective, J. Sparsø and S. Furber (Eds.), Kluwer Academic Publishers, 2001.
P. A. Beerel, R. O. Ozdag and M. Ferretti. A Designer's Guide to Asynchronous VLSI. Cambridge University Press, 2010.
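For readers unfamiliar with the notation, the body of the gcd process corresponds to the subtractive form of Euclid's algorithm. The Python sketch below mirrors only the inner loop; the channel communication (`in?`, `out!`) and the `forever` loop are left out, and the function name is hypothetical. The range check is an assumption added here: the declared type allows 0, but the loop would not terminate if exactly one operand were 0.

```python
def gcd_by_subtraction(x, y):
    """Subtractive GCD on 8-bit values, mirroring 'do x <> y then ... od'."""
    if not (1 <= x <= 255 and 1 <= y <= 255):
        raise ValueError("operands are assumed to be in 1..255")
    while x != y:
        if x < y:
            y = y - x
        else:
            x = x - y
    return x


print(gcd_by_subtraction(12, 18))   # 6
print(gcd_by_subtraction(255, 34))  # 17
```

In the handshake circuit each construct of this loop (the sequencing, the iteration, the comparisons and the subtractions) becomes one component, which is what makes the translation syntax-directed.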
De-synchronization
• Strategy: replace the clock tree with local clocks and handshakes
• Combinational logic and latches are not modified
• More tolerance to variability
– Similar area, less power and/or more speed
• Cortadella, Kondratyev, Lavagno and Sotiriou. Desynchronization: Synthesis of asynchronous circuits from synchronous specifications. IEEE TCAD, Oct. 2006.
Synchronous operation
[Figure: synchronous circuit clocked by CLKgen]
Transforming a synchronous circuit into an asynchronous one (automatically)
De-synchronization
Transforming a synchronous circuit into an asynchronous one (automatically)
Conclusions
• Asynchrony offers flexibility in time
– Modularity
– Dynamic adaptability
– Tolerance to variability
• Better optimization of power/performance
• Why isn't it an important trend in circuit design?
– Lack of commercial EDA support (timing sign-off)
– Designers do not feel comfortable with "unpredictable" timing
– Other aspects: testing, verification, …
• De-synchronization might be a viable solution