Asynchronous Design: Ready for Prime Time?
Jim Frenzel
University of Idaho
[email protected]/~jfrenzel
©James F. Frenzel - 2002
J. Frenzel Asynchronous Design 2
Outline
• Introduction
• The Wine Shop
• Design Methods
• Applications
• Conclusions
• Credits
What is “Asynchronous?”
• Circuits and systems that operate without a clock. (i.e., self-timed)
• Asynchronous sequential circuits, e.g., controllers.
• Circuits which generate completion signals, e.g., datapaths.
• Systems consisting of FSM and datapaths.
Taxonomies
• Fundamental Mode: Bounded gate and wire delays. (Huffman circuits)
• Speed-Independent: Unbounded gate delays, zero wire delays. (Muller circuits)
• Delay-Insensitive: Unbounded gate and wire delays.
• Quasi-Delay-Insensitive and Self-timed: Some wires have matched delays. (isochronic forks)
Promises, Promises!
• Clock Skew? No problem.
• Average case performance, rather than worst case performance.
• Adapts to process and environment.
• “Plug-N-Play” systems.
• Lower power.
• Reduced noise and EMI.
Reality?
• Well, asynchronous design is hard.
• Asynchronous verification is even harder.
• Past asynchronous designs have been relatively small.
• There have been few “head to head” comparisons.
Then Why Bother?
• Understanding asynchronous design will make you a better digital designer.
• Guaranteeing signal arrival/ordering in the presence of clock skew is increasingly difficult.
• Asynchronous does appear to have real advantages for certain applications.
Milestones
• 1940 - Turing et al. reject asynchronous
• 1960 - Asynchronous lives on in ivory towers
• 1989 - Caltech builds asynchronous MIPS. Sutherland gives Turing Award speech.
• 1994 - Manchester builds asynchronous ARM.
• 1997 - Intel builds asynchronous instr decoder.
• 1998 - Philips ships asynchronous pagers.
• 2001 - Pentium 4 incorporates clockless elements.
The Wine Shop (Myers)
• Three concurrent processes
• Communication via channels
• Processes must remain “synchronized”
• Shop behavior + environment = specification of a closed or complete system.
[Figure: Winery, Shop, and Patron blocks connected by channels]
Channel Communication
Shop: process
begin
receive (winery_shop, shelf);
send (shop_patron, shelf);
end process;
Shop’s Environment
Winery: process
begin
send (winery_shop, bottle);
end process;
Patron: process
begin
receive (shop_patron, bag);
end process;
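The three processes and their two channels can be mimicked in software. The following is only a rough Python sketch, not the slides' VHDL: the Channel class, the thread-per-process structure, and the bottle names are assumptions made for illustration. A send blocks until the matching receive has taken the item, which approximates the "synchronized" channel behavior described above.

```python
import queue
import threading


class Channel:
    """Hypothetical rendezvous-style channel: send() returns only after a receive()."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, item):
        self._q.put(item)
        self._q.join()  # wait until the receiver has acknowledged the item

    def receive(self):
        item = self._q.get()
        self._q.task_done()  # acknowledge, releasing the blocked sender
        return item


def winery(winery_shop, n):
    for i in range(n):
        winery_shop.send(f"bottle{i}")


def shop(winery_shop, shop_patron, n):
    for _ in range(n):
        shelf = winery_shop.receive()  # receive (winery_shop, shelf)
        shop_patron.send(shelf)        # send (shop_patron, shelf)


def patron(shop_patron, n, bag):
    for _ in range(n):
        bag.append(shop_patron.receive())


def run_wine_shop(n=3):
    """Run the closed system (shop plus environment) for n bottles."""
    winery_shop, shop_patron = Channel(), Channel()
    bag = []
    threads = [
        threading.Thread(target=winery, args=(winery_shop, n)),
        threading.Thread(target=shop, args=(winery_shop, shop_patron, n)),
        threading.Thread(target=patron, args=(shop_patron, n, bag)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bag
```

Calling run_wine_shop(3) returns the bottles in the order the winery produced them, since each channel has a single sender and a single receiver.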
Communication Protocol
• Implementation requires channel protocol
• Protocol defines the control mechanism
• Control can be implicit or explicit
• Specific input/output transitions
• Coded data (e.g., dual-rail)
• Explicit request/acknowledge
Dual-Rail Encoding
• 2 physical bits for every logical bit.
• Most applications use three of the four combinations.
• Also known as delay-insensitive coding.
Code  Meaning
11    Not Used
10    1
01    0
00    Null
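A small Python sketch of dual-rail coding may help here. It assumes the code assignment 00 = null, 01 = logical 0, 10 = logical 1, with 11 unused; the function names are illustrative and not taken from the slides. A word is "complete" once every bit position has left the null code, and that condition is what a completion detector would test.

```python
NULL = "00"  # spacer code: no data present on this bit position


def encode_bit(bit):
    """Encode one logical bit as a two-rail code word (assumed: 1 -> '10', 0 -> '01')."""
    return "10" if bit else "01"


def decode_pair(pair):
    """Return 0 or 1 for a valid code word, None for the null code."""
    if pair == NULL:
        return None
    if pair == "01":
        return 0
    if pair == "10":
        return 1
    raise ValueError(f"illegal dual-rail code: {pair}")  # '11' is not used


def encode_word(bits):
    return [encode_bit(b) for b in bits]


def is_complete(pairs):
    """True when every bit position carries valid data (no nulls remain)."""
    return all(p in ("01", "10") for p in pairs)
```

For example, encode_word([1, 0, 1]) gives ["10", "01", "10"], and is_complete(["10", "00"]) is False because the second bit is still null.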
2-Phase versus 4-Phase
4-phase: more transitions per cycle, but simpler circuitry.
[Figure: req/ack timing diagrams for the 2-phase and 4-phase handshakes]
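To make the transition count concrete, here is a minimal Python sketch that lists the req/ack events for a number of transfers under each protocol. The function names and the event notation (signal name followed by + or -) are assumptions for illustration. In 4-phase signaling both wires return to zero after every transfer; in 2-phase signaling every transition, rising or falling, is an event.

```python
def four_phase(n_transfers):
    """Return-to-zero handshake: four transitions per transfer."""
    events = []
    for _ in range(n_transfers):
        events += ["req+", "ack+", "req-", "ack-"]
    return events


def two_phase(n_transfers):
    """Transition signaling: two transitions per transfer."""
    events = []
    req = ack = 0
    for _ in range(n_transfers):
        req ^= 1  # any change on req is a request
        events.append("req+" if req else "req-")
        ack ^= 1  # any change on ack is an acknowledge
        events.append("ack+" if ack else "ack-")
    return events
```

Two transfers take eight transitions with four_phase(2) but only four with two_phase(2).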
Active versus Passive
• Each channel must have one active and one passive interface.
• If the shop calls the winery, then calls the patron, the shop is the active participant in both.
• If the winery calls the shop, then the shop calls the patron, the shop is passive/active.
• If the patron orders the bottle first, then the shop is passive/passive.
4 Phase Active/Active
Shop_4PAA: process
begin
-- Request to winery
assign (req_wine,'1'); guard (ack_wine,'1');
assign (req_wine,'0'); guard (ack_wine,'0');
-- Send to patron
assign (req_patron,'1'); guard (ack_patron,'1');
assign (req_patron,'0'); guard (ack_patron,'0');
end process;
Implementation?
• Protocol affects concurrency, delay sensitivity, circuitry, and performance.
• Not all protocols are realizable.
• What happens next if <aw,ap/rw,rp> = <0,0/0,0>?
Channel Communication
Shop_4PAA: process
begin
assign (req_wine,'1'); guard (ack_wine,'1');
assign (req_wine,'0'); guard (ack_wine,'0');
assign (req_patron,'1'); guard (ack_patron,'1');
assign (req_patron,'0'); guard (ack_patron,'0');
end process;
Should the shop contact the winery? Or the Patron?!?!
Explanation
• As is, the protocol is unrealizable.
• The desired response of the shop cannot be uniquely determined by the I/O state.
• The state graph reveals this.
State Graph
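States are labeled <aw,ap/rw,rp>.
00/00 → 00/10 (rw+)
00/10 → 10/10 (aw+)
10/10 → 10/00 (rw-)
10/00 → 00/00 (aw-)
00/00 → 00/01 (rp+)
00/01 → 01/01 (ap+)
01/01 → 01/00 (rp-)
01/00 → 00/00 (ap-)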
Solution
• An internal state variable must be added to resolve the ambiguity, or
• The protocol must be shuffled.
Shuffled Protocol
Shop_4PAAS: process
begin
assign (req_wine,'1'); guard (ack_wine,'1');
assign (req_patron,'1'); guard (ack_patron,'1');
assign (req_wine,'0'); guard (ack_wine,'0');
assign (req_patron,'0'); guard (ack_patron,'0');
end process;
Notify patron before releasing winery
Shuffled State Graph
[Figure: shuffled state graph with states 00/00, 00/10, 10/10, 10/11, 11/11, 11/01, 01/01, 00/01 and transitions rw+, aw+, rp+, ap+, rw-, aw-, rp-, ap-]
Each state is uniquely encoded by the inputs and outputs.
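The unique-encoding check can be sketched in a few lines of Python. This is an illustrative sketch only: the transition sequences are read off the Shop_4PAA and Shop_4PAAS process listings (not transcribed from the state-graph figures), and the function names are assumptions. The code walks each cycle from the all-zero state, labels every state as <aw,ap/rw,rp>, and reports any label at which the shop's next output move is ambiguous.

```python
SIGNALS = ("aw", "ap", "rw", "rp")  # inputs aw, ap / outputs rw, rp
OUTPUTS = {"rw", "rp"}

# Transition orders taken from the two process listings.
UNSHUFFLED = ["rw+", "aw+", "rw-", "aw-", "rp+", "ap+", "rp-", "ap-"]
SHUFFLED = ["rw+", "aw+", "rp+", "ap+", "rw-", "aw-", "rp-", "ap-"]


def label(values):
    return "{aw}{ap}/{rw}{rp}".format(**values)


def walk(cycle):
    """Return (state label, next transition) pairs for one pass around the cycle."""
    values = dict.fromkeys(SIGNALS, 0)
    steps = []
    for t in cycle:
        steps.append((label(values), t))
        values[t[:-1]] = 1 if t.endswith("+") else 0
    return steps


def coding_conflicts(cycle):
    """Map each state label to its competing output moves, if there is more than one."""
    seen = {}
    for state, t in walk(cycle):
        out = t if t[:-1] in OUTPUTS else None  # None: shop waits for an input
        seen.setdefault(state, set()).add(out)
    return {s: outs for s, outs in seen.items() if len(outs) > 1}
```

For the original ordering, coding_conflicts(UNSHUFFLED) reports that label 00/00 enables both rw+ and rp+, which is the ambiguity the slides point out. For the shuffled ordering the result is empty.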
Circuit Implementation
[Figure: circuit schematic connecting rw, aw, rp, and ap]
Completely delay-insensitive, but no concurrency.
Passive/Active Protocol
Shop_4PPA: process
begin
guard (req_wine,'1'); assign (ack_wine,'1');
guard (ack_patron,'0'); assign (req_patron,'1');
guard (req_wine,'0'); assign (ack_wine,'0');
guard (ack_patron,'1'); assign (req_patron,'0');
end process;
ack_patron allowed to fall concurrently with the setting of req_wine and ack_wine.
“Decouples” patron from winery.
P/A Circuit Implementation
[Figure: circuit with two Muller C-elements (C) and signals aw, ap, rw, rp]
Muller C elements represent synchronization points.
Also delay-insensitive!
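For readers unfamiliar with the Muller C-element, a behavioral Python sketch follows. The class name and interface are assumptions for illustration; the behavior is the standard one for a two-input C-element: the output follows the inputs when they agree and holds its previous value when they differ (next state c' = a·b + c·(a + b)).

```python
class CElement:
    """Behavioral model of a two-input Muller C-element (illustrative sketch)."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        # Inputs agree: output follows them. Inputs differ: output holds.
        if a == b:
            self.out = a
        return self.out
```

Starting from 0, the output stays at 0 for inputs (1, 0), rises to 1 only at (1, 1), stays at 1 for (0, 1), and falls again at (0, 0). This waiting for both inputs is what makes it a synchronization point.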
Petri Net Representation
Hazards
Redundant gates are added to mask “glitches” on outputs.
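As an illustration of how a redundant gate masks a glitch, the Python sketch below checks a two-level sum-of-products cover for static-1 hazards under single-input changes. The example function f = a·b + a'·c is an assumption chosen for illustration and does not come from the slides. A hazard is reported whenever two adjacent input points both give f = 1 but no single product term covers both.

```python
from itertools import product


def covers(term, point):
    """A product term (dict of variable -> required value) covers an input point."""
    return all(point[v] == val for v, val in term.items())


def evaluate(cover, point):
    return any(covers(t, point) for t in cover)


def static1_hazards(cover, variables):
    """List (changing variable, fixed inputs) pairs with an uncovered 1-to-1 transition."""
    hazards = []
    for bits in product((0, 1), repeat=len(variables)):
        p = dict(zip(variables, bits))
        for v in variables:
            if p[v] == 1:
                continue  # visit each adjacent pair once, from the side where v = 0
            q = dict(p, **{v: 1})
            if evaluate(cover, p) and evaluate(cover, q):
                if not any(covers(t, p) and covers(t, q) for t in cover):
                    hazards.append((v, {k: p[k] for k in variables if k != v}))
    return hazards


# Assumed example: f = a.b + a'.c, then the same cover plus the consensus term b.c
COVER = [{"a": 1, "b": 1}, {"a": 0, "c": 1}]
COVER_WITH_CONSENSUS = COVER + [{"b": 1, "c": 1}]
```

With the original cover, a change on a while b = c = 1 is flagged as a hazard. Adding the redundant term b·c holds the output at 1 across that transition, and the check then returns an empty list.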
Races
• A “race” exists if two or more state variables are “excited” under a state transition. (i.e., will change)
• A “critical race” exists if the outcome of the race determines the next state of the FSM.
• A “critical race-free” state assignment ensures that next state equations are only a function of stable state variables, for a given transition.
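The race definition above reduces to counting how many state variables differ between the current and next state codes. The Python sketch below uses hypothetical state names and code assignments. It only detects races; deciding whether a race is critical would also require the flow table.

```python
def excited_variables(code_from, code_to):
    """Indices of state variables that must change for this transition."""
    return [i for i, (x, y) in enumerate(zip(code_from, code_to)) if x != y]


def find_races(assignment, transitions):
    """Return the transitions in which two or more state variables are excited."""
    return [
        (s, t)
        for s, t in transitions
        if len(excited_variables(assignment[s], assignment[t])) > 1
    ]


# Hypothetical four-state machine with a Gray-code style assignment.
ASSIGNMENT = {"A": "00", "B": "01", "C": "11", "D": "10"}
RING = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
```

Every transition in RING changes exactly one variable, so find_races(ASSIGNMENT, RING) is empty. Adding a direct A to C transition would excite both variables and be reported as a race.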
Huffman Design Procedure
• Classical method assumes single input change, fundamental mode operation.
• First, develop a primitive row flow table.
• Then, reduce the flow table.
• Next, select a race-free state assignment.
• Finally, form the next state equations and implement with hazard-free logic.
Limitations
• Original methods focused on single-input changes, fundamental mode operation.
• Circuit must fully absorb one input change before the next is presented.
• This restriction limits concurrency.
• Burst Mode (BM) allows selected input transitions to occur simultaneously.
Muller Design Procedure
• Unlike the Huffman method, Muller circuit design requires a complete circuit.
• Synthesis procedure starts with a high level specification (e.g., signal transition graph).
• This is then translated into a state graph.
• State graph is examined for unique state encoding and the protocol altered or state variables added.
• Finally, the SG is mapped to the target technology.
Comparison
• SI circuits are more robust, portable, and easier to verify.
• However, they may be larger and slower.
• Burst Mode circuits have robust circuitry, but impose timing restrictions on the environment and the feedback paths.
• BM specifications limit concurrency.
CAD Tools
• Certain conventional tools, such as simulators and layout tools, are effective.
• Synthesis tools, however, must prevent the introduction of hazards during mapping.
• Most synthesis tools were developed in universities or are proprietary.
• Most begin from a state table or graph representation.
Verification & Testing
• Inherent redundancy masks defects
• Absence of clock reduces controllability
• Some efforts to incorporate scan-based test.
• Asynchronous circuits suffer from “state explosion” (total state = input state + internal state).
• Formal methods may help.
Primary Applications
• High Performance
• Low Power
• Low Noise and EMC
• Featured projects
High Performance
• Data-dependent delays (e.g., adders)
• Elastic pipelines
• Difficult to quantify performance and compare directly to synchronous.
• Completion signal represents overhead
• Difficult to translate local advantage into system-level improvement
Low Power
• Dissipate only when active
  – Reed-Solomon error corrector
  – Infrared receiver
  – Digital filter
  – Pager
• Low-power processors
  – ARM-compatible processor
  – Multimedia and DSP processors
Low Noise and EMC
• In a synchronous system, the clock modulates the supply current, peaking near the productive edge.
• This generates noise in the power system.
• Similar effects can be observed in the frequency domain, with peaks at the fundamental frequency and higher harmonics.
• Inductive parasitics may lead to excessive emissions, disrupting radio circuits.
Companies
• Sun Microsystems
• Philips
• IBM
• Intel
• Theseus Logic (www.theseus.com)
• Fulcrum Logic (www.avlsi.com)
Amulet
• Amulet1 (1995): code-compatible ARM using 2-phase bundled-data.
• 2-phase chip interface made debug difficult.
• 2-phase circuitry used many translations to 4-phase. Not efficient.
• Creating pipes was easy. As a result …
• Execution pipes were too deep.
Amulet2e
• Delivered in 1996.
• Used 4-phase signaling and shorter pipes.
• Conventional chip interface simplified debug and system development.
• Exhibited lower emissions than clocked versions.
Amulet2e Performance
Amulet3 aims to close the gap, emphasizing performance over power.
Intel RAPPID
• Pentium/IA32 Instruction Decoder.
• Started in 1995; completed in 1998.
• Three times the throughput and half the latency at half the power.
• Optimized for the common case, using timing information.
• Hindered by lack of CAD tools.
• Design deemed too “risky.”
Philips Pager
• Over fifteen years invested in research.
• Circuits synthesized from a HLL, Tangram.
• Components use 4-phase handshake.
• Asynchronous portions are 20% larger, but 5x more efficient in power.
• Achieved 100% stuck-at fault coverage.
IBM IPCMOS
• Exploring the use of interlocked pipelines.
• Locally asynchronous modules are interconnected into a globally synchronous processor.
• Power savings of 5-10x
• Staggered clocks reduce L di/dt noise.
• Currently running at 3.3-4.5 GHz in 1.5 V, 0.18 um CMOS process.
Sun FIFO
• Focused on building FIFOs that are faster than a clocked shift register.
• Compared two control methods: asP* (pulsed) and micropipelining (2-phase, bundled).
• asP* was comparable in area and performance.
• The micropipelined method was 50% faster and 50% larger, due in part to larger latches.
• FIFOs are a key element in the FLEETzero chip.
FLEETzero
• An asynchronous switch fabric, delivering 8-bit data items from any of 8 sources to any of 8 destinations.
• Sources are outputs of logic, arithmetic, and memory; destinations are the inputs.
• This paradigm emphasizes data movement over operation-centric computation.
Performance
• Fabricated in a 0.35 um process, it delivers 1.2 Giga-Data-Items/sec.
• Latency through seven stages from source to destination is under 4 ns.
• Each module cycles at about the speed of a 3-stage ring oscillator.
The Future?
• Technology may be “stabilizing”.
• CAD tools will improve.
• Initial usage will be low power, data driven applications and systems on a chip.
• Widespread acceptance may require a computing paradigm shift. (FLEETzero?)
Obstacles
• Asynchronous design faces many of the same challenges as formal methods.
• Engineers don’t have the time to learn obscure specification & design methods.
• Need industry CAD support.
• Definition of “optimum” changes with technology, assumptions, and goals.
Still Hungry for More?
• EE 540, Asynchronous Circuit Design!
• Fifteen weeks of in-depth study
• Available on VHS and DVD!
• Order in time for the holidays!
• See www.uidaho.edu/~jfrenzel/540
Course Outline
• Follows Myers’ outline
• Communication channels and protocols
• Graphical representations
• Huffman Circuits
• Muller Circuits
• Timing Circuits
• Verification
• Applications and new results
Prefer to Read the Book?
• “Asynchronous Circuit Design,” Chris Myers, Wiley & Sons, 2001.
http://async.elen.utah.edu/book
• Proceedings of the IEEE, February 1999, vol. 87, no. 2, pp. 223-233 and 234-242.
(Special Issue on asynchronous design)
Prefer to Surf?
• The University of Manchester maintains The Asynchronous Logic homepage.
• Comprehensive set of links to publications, tools, and research groups.
http://www.cs.man.ac.uk/async
Primary Sources
• The Wine Shop example: “Asynchronous Circuit Design,” Myers, Wiley & Sons, 2001.
• Methods material: “Modeling and Design of Asynchronous Circuits,” Josephs et al., Proceedings of the IEEE, vol. 87, no. 2, Feb 1999, pp. 234-242.
• Applications material: “Applications of Asynchronous Circuits,” van Berkel et al., Proceedings of the IEEE, vol. 87, no. 2, Feb 1999, pp. 223-233.