Delay models (I) A B C Real (analog) behaviorAbstract behavior A B C Abstractions are necessary to...

Post on 31-Mar-2015

224 views 0 download

Tags:

Transcript of Delay models (I) A B C Real (analog) behaviorAbstract behavior A B C Abstractions are necessary to...

Delay models (I)

A

BC

Real (analog) behavior Abstract behavior

A

B

C

Abstractions are necessary to define delay models manageable fordesign, synthesis and verification. Abstractions introduce optimisticand pessimistic simplifications that must be carefully taken into account.

Delay models (II)

Separation between functionality and timing [Muller] Every gate has a zero-delay atomic evaluator

(Boolean function) A delay is associated to every output (gate delay

model)or every input (wire delay model)

Delays can be: Unbounded (arbitrary finite delays) Bounded (within given min/max bounds)

Gate delay model Wire delay model

Delay models (III)

Gate delay model: delays in gates, no delays in wires

Wire delay model: delays in gates and wires

Delay models (IV)

Speed-independent circuitSpeed-independent circuit:hazard-free under the unbounded gate delay model

Delay-insensitive circuitDelay-insensitive circuit:hazard-free under the unbounded wire delay model

Quasi-delay -insensitive circuitQuasi-delay -insensitive circuit:delay-insensitive with some isochronic forks

Speed-independent model

Pessimistic, since delays are typically bounded

Optimistic, since it assumes isochronic forks(negligible skew wrt the receiving gate delays)

Efficient synthesis methods exist

Isochronic fork

Fundamental mode of operation

Circuit

Inputs Outputs

[Huffman 1964] : The circuit/environment interact with two phases(1) The environment sends inputs to the circuit(2) The circuit computes the outputs and the state signals

The environment does not send new inputs until the circuit stabilizes

Normal Fundamental Mode: Only one input changes at each communication cycle

Input/Output mode of operation Computation and communication can

overlap (according to some specified protocol)

Event-based specification models are often used to describe the behavior (e.g., Petri nets).

This tutorial will cover the synthesis of speed-independentcircuits that work under the I/O mode of operation

and are specified using Petri nets.

Delay models for async. circuits

Bounded delays (BD): realistic for gates and wires. Technology mapping is easy, verification is

difficult

Speed independent (SI): Unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires. Technology mapping is more difficult, verification

is easy

Delay insensitive (DI): Unbounded (pessimistic) delays for gates and wires. DI class (built out of basic gates) is almost empty

Quasi-delay insensitive (QDI): Delay insensitive except for critical wire forks (isochronic forks). In practice it is the same as speed independent

BD

SI QDI

DI

Outline Overview of the synthesis flow Specification State graph and next-state functions State encoding Implementability conditions Speed-independent circuit

Complex gates C-element architecture

Review of some advanced topics

Book and synthesis tool

J. Cortadella, M. Kishinevsky, A. Kondratyev,L. Lavagno and A. Yakovlev,Logic synthesis for asynchronouscontrollers and interfaces,Springer-Verlag, 2002

petrify:http://www.lsi.upc.es/petrify

Design flow

Specification

(STG)

State Graph

SG withCSC

Next-state functions

Decomposed functions

Gate netlist

Reachability analysis

State encoding

Boolean minimization

Logic decomposition

Technology mapping

Specification

x

y

z

x+

x-

y+

y-

z+

z-

Signal Transition Graph (STG)

xy

z

Token flow

x

y

z

x+

x-

y+

y-

z+

z-

State graph

x+

x-

y+

y-

z+

z-

xyz000

x+

100y+z+

z+y+

101 110

111

x-

x-

001

011y+

z-

010

y-

Next-state functions

x z x y ( )

y z x

z x y z

xyz000

x+

100y+z+

z+y+

101 110

111

x-

x-

001

011y+

z-

010

y-

Gate netlist

x

z

y

x z x y ( )

y z x

z x y z

Design flow

Specification

(STG)

State Graph

SG withCSC

Next-state functions

Decomposed functions

Gate netlist

Reachability analysis

State encoding

Boolean minimization

Logic decomposition

Technology mapping

VME bus

DeviceLDS

LDTACK

D

DSr

DSw

DTACK

VME BusController

DataTransceiver

BusDSr

LDS

LDTACK

D

DTACK

Read Cycle

STG for the READ cycle

LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

LDS

LDTACK

D

DSr

DTACK

VME BusController

Choice: Read and Write cycles

DSr+

LDS+

LDTACK+

D+

DTACK+

DSr-

D-

DTACK-

LDS-

LDTACK-

DSw+

D+

LDS+

LDTACK+

D-

DTACK+

DSw-

DTACK-

LDS-

LDTACK-

Choice: Read and Write cycles

DTACK-

DSr+

LDS+

LDTACK+

D+

DTACK+

DSr-

D-

LDS-

LDTACK-

DSw+

D+

LDS+

LDTACK+

D-

DTACK+

DSw-

Circuit synthesis

Goal: Derive a hazard-free circuit

under a given delay model andmode of operation

Speed independence

Delay model Unbounded gate / environment delays Certain wire delays shorter than certain paths

in the circuit

Conditions for implementability: Consistency Complete State Coding Persistency

Design flow

Specification

(STG)

State Graph

SG withCSC

Next-state functions

Decomposed functions

Gate netlist

Reachability analysis

State encoding

Boolean minimization

Logic decomposition

Technology mapping

STG for the READ cycle

LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

LDS

LDTACK

D

DSr

DTACK

VME BusController

Binary encoding of signals

DSr+

DSr+

DSr+

DTACK-

DTACK-

DTACK-

LDS-LDS-LDS-

LDTACK- LDTACK- LDTACK-

D-

DSr-DTACK+

D+

LDTACK+

LDS+

Binary encoding of signals

DSr+

DSr+

DSr+

DTACK-

DTACK-

DTACK-

LDS-LDS-LDS-

LDTACK- LDTACK- LDTACK-

D-

DSr-DTACK+

D+

LDTACK+

LDS+

10000

10010

10110 01110

01100

0011010110

(DSr , DTACK , LDTACK , LDS , D)

Excitation / Quiescent Regions

QR (LDS+)QR (LDS+)

QR (LDS-)QR (LDS-)

ER (LDS+)ER (LDS+)

ER (LDS-)ER (LDS-)

LDS-LDS-

LDS+

LDS-

Next-state function

0 1

LDS-LDS-

LDS+

LDS-

1 0

0 0

1 1

1011010110

Karnaugh map for LDS

DTACKDSrD

LDTACK 00 01 11 10

00

01

11

10

DTACKDSrD

LDTACK 00 01 11 10

00

01

11

10

LDS = 0 LDS = 1

0 1-0

0 0 0 0 0 0/1?

1

111

-

-

-

---

- - - -

-

- ---

- - -

Design flow

Specification

(STG)

State Graph

SG withCSC

Next-state functions

Decomposed functions

Gate netlist

Reachability analysis

State encoding

Boolean minimization

Logic decomposition

Technology mapping

Concurrency reduction

LDS-LDS-

LDS+

LDS-

1011010110

DSr+

DSr+

DSr+

Concurrency reduction

LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

State encoding conflicts

LDS-

LDTACK-

LDTACK+

LDS+

10110

10110

Signal Insertion

LDS-

LDTACK-

D-

DSr-

LDTACK+

LDS+

CSC-

CSC+

101101

101100

Design flow

Specification

(STG)

State Graph

SG withCSC

Next-state functions

Decomposed functions

Gate netlist

Reachability analysis

State encoding

Boolean minimization

Logic decomposition

Technology mapping

Complex-gate implementation

)(csccsc

csc

csc

LDTACKDSr

LDTACKD

DDTACK

DLDS

Implementability conditions Consistency

Rising and falling transitions of each signal alternate in any trace

Complete state coding (CSC) Next-state functions correctly defined

Persistency No event can be disabled by another

event (unless they are both inputs)

Implementability conditions Consistency + CSC + persistency

There exists a speed-independent circuit that implements the behavior of the STG

(under the assumption that ay Boolean function can be implemented with one complex gate)

Persistency

100 000 001a- c+

b+ b+

a

cb

a

c

b

is this a pulse ?

Speed independence glitch-free output behavior under any delay

a+

b+

c+

d+

a-

b-

d-

a+

c-a-

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

abcd 00 01 11 10

00

01

11

10 1

1 1 11

10

0 000

ER(d+)

ER(d-)

abcd 00 01 11 10

00

01

11

10 1

1 1 11

10

0 000

adcd

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

Complex gate

Implementation with C elements

CR

S z

• • • S+ z+ S- R+ z- R- • • •

• S (set) and R (reset) must be mutually exclusive

• S must cover ER(z+) and must not intersect ER(z-) QR(z-)

• R must cover ER(z-) and must not intersect ER(z+) QR(z+)

abcd 00 01 11 10

00

01

11

10 1

1 1 11

10

0 000

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

CS

Rdc

ca

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

CS

Rdc

ca

but ...

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

CS

Rdc

ca

Assume that R=ac has an unbounded delay

Starting from state 0000 (R=1 and S=0):

a+ ; R- ; b+ ; a- ; c+ ; S+ ; d+ ;

R+ disabled (potential glitch)

abcd 00 01 11 10

00

01

11

10 1

1 1 11

10

0 000

0000

1000

1100

0100

0110

0111

1111

1011

0011 1001

0001

a+

b+

c+

a-

b-

c-

a+

c-

a-

a-

d-d+

CS

Rdc

cba

Monotonic covers

C-based implementations

CS

Rdc

cbaC

d

ab

c

a

b

cd

weak

a

cd

generalized C elements (gC)

weak

Speed-independent implementations Implementability conditions

Consistency Complete state coding Persistency

Circuit architectures Complex (hazard-free) gates C elements with monotonic covers ...

Synthesis exercisey-

z- w-

y+ x+

z+

x-

w+

1011

0111

0011

1001

1000

1010

0001

0000 0101

0010 0100

0110

y-

y+

x-

x+w+

w-

z+

z-

w-

w-

z-

z-y+

y+

x+

x+

Derive circuits for signals x and z (complex gates and monotonic covers)

Synthesis exercise

1011

0111

0011

1001

1000

1010

0001

0000 0101

0010 0100

0110

y-

y+

x-

x+w+

w-

z+

z-

w-

w-

z-

z-y+

y+

x+

x+

wxyz 00 01 11 10

00

01

11

10

-

-

-

-

Signal x

1

0

1

1

1

1

1

0 0

0

0

0

Synthesis exercise

1011

0111

0011

1001

1000

1010

0001

0000 0101

0010 0100

0110

y-

y+

x-

x+w+

w-

z+

z-

w-

w-

z-

z-y+

y+

x+

x+

wxyz 00 01 11 10

00

01

11

10

-

-

-

-

Signal z

1

0 0

0

0

11 1

0

0 0

0

Logic decomposition: example y-

z- w-

y+ x+

z+

x-

w+

1001 1011

1000

1010

0001

0000 0101

0010 0100

0110 0111

0011

y-

y+

x-

x+w+

w-

z+

z-

w-

w-

z-

z-y+

y+

x+

x+

Logic decomposition: example

yz=1yz=0

1001 1011

1000

1010

0001

0000 0101

0010 0100

0110 0111

0011

y-

y+

x-

x+w+

w-

z+

z-

w-

w-

z-

z-y+

y+

x+

x+

1001 1011

1000

1010

0001

0000 0101

0010 0100

0110 0111

0011

y-

y+

x-

x+w+

w-

z+

z-

w-

w-

z-

z-y+

y+

x+

x+

C

C

x

y

x

y

w

z

xyz

y

zw

z

w

z

y

Logic decomposition: example

s-

s+

s-

s-

s=1

s=0

1001 1011

1000

1010

0111

0011y+

x-

w+

z+

z-

0001

0000 0101

0010 0100

0110

x+

w-

w-

w-

z-

z-y+

y+

x+

x+

1001

1000

1010

y+

z-

0111

C

C

x

y

x

y

w

z

x

y

z

w

z

w

z

y

sy-

Logic decomposition: example y-

z- w-

y+ x+

z+

x-

w+

s-

s+

s-

s+

s-

s-

s=1

s=0

1001 1011

1000

1010

0111

0011y+

x-

w+

z+

z-

0001

0000 0101

0010 0100

0110

x+

w-

w-

w-

z-

z-y+

y+

x+

x+

1001

1000

1010

y+

z-

0111

y-

Speed-independent Netlist

LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

DTACKD

DSr

LDS

LDTACK

csc

map

Adding timing assumptions LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

DTACKD

DSr

LDS

LDTACK

csc

map

LDTACK- before DSr+

FAST

SLOW

Adding timing assumptions

DTACKD

DSr

LDS

LDTACK

csc

map

LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

LDTACK- before DSr+

State space domain

LDTACK- before DSr+

LDTACK-

DSr+

State space domain

LDTACK- before DSr+

LDTACK-

DSr+

State space domain

LDTACK- before DSr+

LDTACK-

DSr+

Two more unreachable states

Boolean domain

DTACKDSrD

LDTACK 00 01 11 10

00

01

11

10

DTACKDSrD

LDTACK 00 01 11 10

00

01

11

10

LDS = 0 LDS = 1

0 1-0

0 0 0 0 0 0/1?

1

111

-

-

-

---

- - - -

-

- ---

- - -

Boolean domain

DTACKDSrD

LDTACK 00 01 11 10

00

01

11

10

DTACKDSrD

LDTACK 00 01 11 10

00

01

11

10

LDS = 0 LDS = 1

0 1-0

0 0 - 0 0 1

1

111

-

-

-

---

- - - -

-

- ---

- - -

One more DC vector for all signalsOne state conflict is removed

Netlist with one constraintLDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

DTACKD

DSr

LDS

LDTACK

csc

map

Netlist with one constraint

LDS+ LDTACK+ D+ DTACK+ DSr- D-

DTACK-

LDS-LDTACK-

DSr+

DTACK D

DSr LDS

LDTACK

LDTACK- before DSr+

TIMING CONSTRAINT

Signal insertion

New signals need to be inserted to solve some synthesis problems (e.g., state encoding, logic decomposition)

For each signal s, the events s+ and s- must be inserted while preserving certain behavioral properties (consistency, persistency).

Each new signal determines a new partition of states (s=0, s=1)

Signal insertion

s=0s=1

s+

s-

From state graphs to Petri nets A state graph may require

transformations to meet certain properties (e.g., state encoding).

The visualization of a state graph is not very informative. Event-based specifications explicitly represent the relations between events.

Resort to the theory of regions

From state graphs to Petri nets

0 1

2

3

4

5

6

a

b

c

c d

d

e

e

f

f

d

a

b

c

d

e f

Region: {2,3}

From state graphs to Petri nets

0 1

2

3

4

5

6

a

b

c

c d

d

e

e

f

f

d

a

b

c

d

e f

Region: {1}

From state graphs to Petri nets

0 1

2

3

4

5

6

a

b

c

c d

d

e

e

f

f

d

a

b

c

d

e f

Not a region: {3,5}

From state graphs to Petri nets

0 1

2

3

4

5

6

a

b

c

c d

d

e

e

f

f

d

a

b

c

d

e f

Region: {0,3,5}

Theory of regions

Region: all arcs of any event have the same relationship with the region (enter, exit, no cross).

Minimal region: not included in any other region

Pre-/post-region of an event: region such that the event exits/enters the region

Property: excitation closure The intersection of all pre-regions of an event is the

excitation region of the event

Thanks to Steve Nowick (Columbia Univ.)

Burst-Mode SpecificationsHow to specify “burst-mode” behavior?:

Hazard-FreeCombinational

Network

inputsoutputs

state

(several bits)

A

B

C

X

Y

Z

input burst output burst

A+ C-/Y- Z+

1

current state

input burst/ output burst 2

next state

Note: -input bursts: must be non-empty (at least 1 input per burst)

-output bursts: may be empty (0 or more outputs per burst)

Burst-Mode Specifications

Example: Burst-Mode (BM) Specification:

- Inputs in specified “input burst” can arrive in any order and at any time

- After all inputs arrive, generate“output burst”

0

1

3

2

4

5

A+ C+/Z-

C-/ Z+

C+/ Y+

A-/ Y-

A+ B+/Y+ Z-

B- C+/Z+

C-/--

Initial Values:ABC = 000

YZ = 01

Burst-Mode Specifications

“Extended Burst-Mode” (XBM):[Yun/Dill ICCAD-93/95]

1. “directed don’t cares” (Rin*): allow concurrent inputs & outputs 2. “conditionals” (<Cnd>): allow “sampling” of level signals

Handles glitchy inputs, mixed sync/async inputs, etc.

0

1

2

3

4

5

6

ok+ Rin*/ FRout+

FAin+ Rin*/ FRout-

FAin- Rin+/ Aout+

Rin* FAin+/ FRout-

<Cnd+> Rin-/ Aout- FRout+

<Cnd-> Rin-/ Aout-

ok- Rin*/ --

Rin+ FAin-/Aout+

New Features:

Syntax-directed translation

2-Place “Ripple Register” (= FIFO) [van Berkel]

proc (a?T & b!T) begin

x0, x1: var T | forever do

b! x1;x1 := x0;a? x0

od end

Tangram Program Intermediate “Handshake Circuit”

A Larger Example

Intermediate“Handshake Circuit”

Components communicate using “4-phase

handshaking” O1: initiates communication O2: completes communication

Channel impltn. => use 2 wires:req => start operationack => operation done

(… can be extended to handle data)

Background: Channel-Based Communication

O1 O2

Channel A

req

ack

Active phase

Return-to-zero (RTZ) phase

passive portactive port

Handshake Components: Sequencer2-Way Sequencer: activated on channel P; then activates 2 processes in sequence on channels A1 and A2

P

A1

SEQ

A2

Goal:Goal: activate two sequential processes (i.e. operations)

Process X1

Process X2

Operation X1; X2

Handshake Components: PAR Component

PAR Component: activated on channel P;

then activates 2 processes in parallel on channels A1 and A2

P

A1

PAR

A2

Goal:Goal: activate two parallel processes

Process X1

Process X2

Operation X1 || X2

procedure Buf1 (

input i: byte;

output o: byte) is

local variable x : byte

begin

loop begin

i -> x ;

o <- x

end

end

Intermediate Representation

;

#

StartStart

XI OO

Tangram SpecTangram Spec Handshake Circuit Handshake Circuit Syntax-Directed TranslationSyntax-Directed Translation

unoptimizedunoptimized

Conclusions

STGs have a high expressiveness power at a low level of granularity (similar to FSMs for synchronous systems)

Synthesis from STGs can be fully automated

Synthesis tools often suffer from the state explosion problem (symbolic techniques are used)

The theory of logic synthesis from STGs can be found in:

J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno and A. Yakovlev,J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno and A. Yakovlev,Logic Synthesis of Asynchronous Controllers and InterfacesLogic Synthesis of Asynchronous Controllers and Interfaces ,,Springer Verlag, 2002.Springer Verlag, 2002.