Fast Asynchronous Shift Register for Bit-Serial Communication

ASYNC06
Fast Asynchronous Shift Register for Bit-Serial Communication
Rostislav (Reuven) Dobkin, Dr. Ran Ginosar, Dr. Avinoam Kolodny
Technion – Israel Institute of Technology
Electrical Engineering Department – VLSI Lab

Transcript of Fast Asynchronous Shift Register for Bit-Serial Communication

ASYNC06, Dobkin Reuven, March 2006

Presentation Outline

• On-Chip Communication Challenges

• Fast Asynchronous Serial Communication

  • Transmitter

  • Channel

  • Receiver

• High Data Rate Asynchronous Shift Register

• Performance

• Conclusions

A Future SoC + NoC Structure

On-Chip Communication Challenges

• Low Power

• High Throughput (bit-per-second)

• Low-Latency Synchronization

• Total Low Latency (fly time from source to sink)

• Low Area (routability)

• High Reliability

Parallel NoC Links and SoC Buses

• Constructed of multiple wires and repeaters

• Incur high leakage power (line drivers and repeaters)

• Often have low utilization (leakage again)

• Occupy large chip area (routing difficulty)

• Incur cross-coupling (delay matching & skew problems)

• Present a significant capacitive load

Bit-Serial Interconnect

• Fewer lines, fewer line drivers and fewer repeaters

• Reduced leakage power

• Reduced chip area

• Better routability

• Reduced coupling

• To match an N-wire parallel link, the serial link should work N times faster!

$F_{SER} = N \times F_{PAR}$
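The rate requirement can be checked with a line of arithmetic. The sketch below is illustrative only: the function name and the 16-wire, 1 GHz link are hypothetical values chosen for the example, not figures from the talk.

```python
def required_serial_rate(parallel_rate_hz: float, n_wires: int) -> float:
    """Bit rate one serial wire needs to match n_wires parallel wires.

    Implements F_SER = N * F_PAR from the slide.
    """
    return n_wires * parallel_rate_hz


# Hypothetical example: a 16-wire parallel link running at 1 GHz per wire.
f_ser = required_serial_rate(1e9, 16)
print(f"{f_ser / 1e9:.0f} Gbps")  # prints "16 Gbps"
```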

Fast Serial Interconnect

• Synchronous Approach:

  • Source-synchronous, with Clock Data Recovery (CDR)

  • Problem: CDR power and area

• Asynchronous Approach:

  • Normal methods are too slow

  • We consider a faster one

[Figure: DLL- or PLL-Based Clock Recovery]

Serial Link – Top Structure

• Transition signaling instead of sampling: two-phase NRZ Level Encoded Dual Rail (LEDR) asynchronous protocol

• Acknowledge per word instead of per bit

• Wave-pipelining over channel

• Differential encoding

• Low-latency synchronizers
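To see why acknowledging per word rather than per bit matters, here is a rough first-order timing model. It is not taken from the talk: the two formulas, the function names, and the numbers (16-bit word, 15 ps bit cycle, 200 ps round trip) are simplifying assumptions made for illustration.

```python
def word_time_per_bit_ack(n_bits: int, bit_cycle_ps: float, round_trip_ps: float) -> float:
    """Assumed model: every bit waits for its own acknowledge round trip."""
    return n_bits * (bit_cycle_ps + round_trip_ps)


def word_time_per_word_ack(n_bits: int, bit_cycle_ps: float, round_trip_ps: float) -> float:
    """Assumed model: bits are wave-pipelined back to back, one acknowledge per word."""
    return n_bits * bit_cycle_ps + round_trip_ps


# Hypothetical numbers, chosen only to show the trend.
per_bit = word_time_per_bit_ack(16, 15.0, 200.0)
per_word = word_time_per_word_ack(16, 15.0, 200.0)
print(per_bit, per_word)  # 3440.0 440.0
```

Under these assumptions the channel round trip is paid once per word instead of once per bit, which is what lets the bit cycle approach the gate delay rather than the wire delay.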

[Figure: serial link block diagram with Sender, Synchronizer, Encoder & Serializer, Bit-Serial Line, De-Serializer & Decoder, Synchronizer, Receiver, and a Word Ack signal]

Encoding – Two-Phase NRZ LEDR

• Two Phase Non-Return-to-Zero Level Encoded Dual Rail

• “delta” encoding (one transition per bit)

[Figure: LEDR waveforms for the uncoded bit (B), state bit (S), and phase bit (P), with example bit values 0 0 1 1 0 0 0 0 1 0]
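A minimal sketch of LEDR encoding follows, under one common convention that is assumed here rather than confirmed by the slide: the state rail S carries the bit value, and the phase rail P is chosen so that S XOR P alternates with each bit. The function names, the initial rail values, and the reuse of the slide's digits as the uncoded input are all illustrative choices.

```python
def ledr_encode(bits):
    """Return one (state, phase) rail pair per input bit.

    Assumed convention: S equals the bit; P = S XOR (bit index parity),
    so the parity S XOR P alternates 0, 1, 0, 1, ...
    """
    return [(b, b ^ (i & 1)) for i, b in enumerate(bits)]


def ledr_decode(pairs):
    """The data value is read directly from the state rail."""
    return [s for s, _ in pairs]


def rail_transitions(pairs, initial=(0, 1)):
    """Count rail toggles per bit, starting from an odd-parity idle state."""
    counts = []
    prev = initial
    for s, p in pairs:
        counts.append(int(s != prev[0]) + int(p != prev[1]))
        prev = (s, p)
    return counts


bits = [0, 0, 1, 1, 0, 0, 0, 0, 1, 0]
encoded = ledr_encode(bits)
print(rail_transitions(encoded))  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Exactly one rail toggles per bit: S when the data value changes, P when it repeats. That is the "one transition per bit" property, and it holds without any spacer between bits.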

Seeking Highest Speed

• No “spacers” (NRZ)

• One transition per bit

• Fast signaling

• Targeted Speed: One gate delay between bits

• For 65 nm technology this results in 67 Gbps

One Gate Delay = FO4 INV Delay
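The quoted data rates follow directly from the data cycle, since one bit is sent per cycle. The helper below is a small illustrative check (the function name is not from the talk), using the 15 ps and 14 ps data cycles reported in the performance slides.

```python
def data_rate_gbps(data_cycle_ps: float) -> float:
    """Bit rate in Gbps when one bit is sent every data_cycle_ps picoseconds."""
    return 1000.0 / data_cycle_ps


print(round(data_rate_gbps(15)))  # 67
print(round(data_rate_gbps(14)))  # 71
```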

Transmitter – Top Architecture

• Main Functions:

  • Synchronizer (Standard, LDL)

  • Dual-Rail Two-Phase Encoder (AFSM)

  • Serializer (A Novel Solution)

  • Differential Encoder (Differential Driver)

[Figure: transmitter block diagram, divided into Slow Parallel Operation and Fast Serial Operation]

Serializer – Fast SR Approach

[Figure: serializer circuit, including a Transition Generator]

• Same circuit drives S, S lines

RX – De-Serializer and Decoder

[Figure: receiver block diagram with a Data Shift-Register, divided into Fast Serial Operation and Slow Parallel Operation]

Fast Asynchronous Shift Register

Shift-Register Configurations

[Figure: shift-register configurations, with panels "TX – Parallel Load" and "RX – Parallel Output"]

Circuitry & Performance

• Under typical conditions, the SR worked down to a 14 ps data cycle (71 Gbps)

• Data is shifted inside the sub-pipes at half the transition rate
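One way to picture the half-rate sub-pipes is as two interleaved lanes that take alternate bits. The behavioral sketch below assumes exactly two sub-pipes and a simple even/odd split. The slides state only that the sub-pipes shift at half the transition rate, so this is an illustration and not the authors' circuit, and all names are hypothetical.

```python
def deinterleave(bits):
    """Route alternate incoming bits to two sub-pipes (assumed even/odd split).

    Each sub-pipe receives a new bit only every second data cycle.
    """
    return bits[0::2], bits[1::2]


def interleave(even, odd):
    """Merge the two sub-pipes back into the original bit order."""
    out = []
    for i in range(len(even) + len(odd)):
        out.append(even[i // 2] if i % 2 == 0 else odd[i // 2])
    return out


stream = [1, 0, 1, 1, 0, 0, 1, 0]
even, odd = deinterleave(stream)
print(even, odd)  # [1, 1, 0, 1] [0, 1, 0, 0]
```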

[Figure: shift-register circuit schematic]

Performance

• SPICE simulations show correct operation at the target data cycle of 15 ps

• The SR is asynchronous, so it allows non-uniform delay intervals between successive bits

• Verified at 24 PVT corners (0.7 to 1.35 V, -10 to 110 °C)

• Verified up to 10σ in-die variation in L_EFF and V_T (Monte Carlo simulations)

Conclusions

• Novel Asynchronous Communication scheme

  • Wave-pipelining over the channel

  • Word-acknowledgement instead of bit-acknowledgement

  • LEDR encoding and differential signaling

• Fast serializers and de-serializers

  • Based on the fast shift register

• Data cycle of a single FO4 inverter delay

  • 15 ps on 65 nm process, 67 Gbps

• Robust to in-die variations

  • Thanks to asynchronous design

• The fast SERDES is useful for high-bandwidth long on-chip interconnects, where bit-serial communication is preferred thanks to reduced area, easier routing and reduced leakage