
Analog Front-End Design for 2x Blind ADC-based Receivers

by

Tina Tahmoureszadeh

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science

Graduate Department of Electrical and Computer Engineering
University of Toronto

© Copyright by Tina Tahmoureszadeh 2010

Analog Front-End Design for 2x Blind ADC-Based Receivers

Tina Tahmoureszadeh

Master of Applied Science, 2010

Graduate Department of Electrical and Computer Engineering

University of Toronto

Abstract

This thesis presents the design, implementation, and fabrication of an analog front-end (AFE) targeting 2x blind ADC-based receivers. The front-end consists of a combined anti-aliasing filter (AAF) and 2-tap feed-forward equalizer (FFE), referred to as the AAF/FFE, the required clock generation circuitry (Ck Gen), four time-interleaved 4-b ADCs, and a DeMUX. The contributions of this design are the AAF/FFE and the Ck Gen. The overall front-end optimizes the channel/filter characteristics for data-rates of 2-10 Gb/s. The bandwidth of the AAF is scalable with the data-rate, and the analog 2-tap FFE is designed without the need for noise-sensitive analog delay cells. The test-chip is implemented in 65-nm CMOS; the AAF/FFE occupies 152×86 μm² and consumes 2.4 mW at 10 Gb/s. Measured frequency responses at data-rates of 10, 5, and 2 Gb/s confirm the scalability of the front-end bandwidth. The FFE achieves 11 dB of high-frequency boost at 10 Gb/s.


Acknowledgments

I would like to thank my supervisor, Professor Ali Sheikholeslami, for his support

and guidance throughout this research work. Thanks for making this journey worth

taking.

I would like to thank my colleagues at Fujitsu, notably Hirotaka Tamura, Yasumoto

Tomita, Masaya Kibune, and Bill Walker for their technical help and support over

the course of this project.

Special thanks to my defense committee: Professor Tony Chan Carusone, Professor

Roman Genov, and Professor Teng Joon Lim for their time and valuable feedback.

I would like to thank my parents, Farahnaz and Darioush, and my sisters, Tila and

Taraneh, for their endless love and support. Even if I could give you not only the

whole world, but the whole universe, with all its planets and stars, it would be nothing

compared to the sacrifices you have made for me.

I am forever grateful for the help and support I constantly received from my research

group members, Shayan Shahramian, Siamak Sarvari, Oleksiy Tyshchenko, David

Halupka, and Behrooz Abiri. Thanks for always having the time to offer your help.

To my girl buddies, Ruslana, Farzaneh, and Azadeh, I wouldn’t have made it

without you. I owe this to the long supporting, encouraging, and inspiring chats with

you. I owe this to the endless tea breaks and our adventurous long walks around the

campus.

Thanks to Hamed, Karim, Mike, Dustin, and Kentaro for the rest of the unforgettable memories.

Last but not least, my dearest ‘koochooloo’ warmed up my heart, day and night, so I can be here, writing to close this chapter of my life.


Contents

Acknowledgments
List of Figures
List of Tables
List of Acronyms

1 Introduction
  1.1 Motivation
  1.2 Thesis Objectives
  1.3 Thesis Outline

2 Background
  2.1 Wire-line Communication System
    2.1.1 Transceiver
    2.1.2 Communication Channel
    2.1.3 Binary versus ADC-based Receivers
  2.2 ADC-based Receivers
    2.2.1 Phase Tracking versus Blind Sampling
    2.2.2 1x versus 2x Sampling Rate
    2.2.3 2x blind ADC-based Receiver
  2.3 Aliasing
    2.3.1 Necessity for an Anti-Aliasing Filter
    2.3.2 Previous Work
  2.4 Equalization
    2.4.1 Feed-Forward Equalization (FFE)
    2.4.2 Previous Work
  2.5 Summary

3 Anti-Aliasing Filter (AAF) Design
  3.1 Anti-Aliasing Filters
    3.1.1 Active RC Filters
    3.1.2 Gm-C Filters
    3.1.3 Integration and Dump (I&D) Filters
  3.2 Proposed I&D Scheme
  3.3 I&D Design Methodology and Modeling
    3.3.1 Design Methodology
    3.3.2 Behavioural Modeling
    3.3.3 Behavioural Simulation Results
  3.4 Implementation
    3.4.1 Circuit Design
    3.4.2 Circuit Simulation Results
  3.5 Summary

4 Analog Front-End Design (AFE)
  4.1 AFE Architecture
  4.2 AFE Design Methodology and Modeling
    4.2.1 Design Methodology
    4.2.2 Behavioural Modeling
    4.2.3 Behavioural Simulation Results
  4.3 AFE Implementation
    4.3.1 Cascode-Switching Architecture
    4.3.2 Reset Cell Architecture
    4.3.3 Clock Generation Design
  4.4 Circuit Simulation Results
    4.4.1 Clock Generation
    4.4.2 Analog Front-End (AFE)
  4.5 Summary

5 Experimental Results
  5.1 Circuit Layout and Equipment Setup
  5.2 Channel Measurements
  5.3 AFE Performance
    5.3.1 Frequency Response of AFE
    5.3.2 FFE Performance
    5.3.3 AAF Performance
  5.4 Summary

6 Conclusions and Future Directions
  6.1 Thesis Contributions
  6.2 Future Work

References


List of Figures

2.1 An example of a two connector backplane from Tyco Electronics [12].
2.2 The effect of the limited channel bandwidth on the ideal NRZ signal.
2.3 ADC-based receiver architectures.
2.4 Extraction of the instantaneous phase (φinst).
2.5 Data decision scheme.
2.6 An example where the sampling theorem is satisfied.
2.7 An example where the sampling theorem is violated (Aliasing).
2.8 Zero-crossing estimations based on the linear interpolation.
2.9 Equalization in frequency domain.
2.10 Feed-forward and feedback equalization.
2.11 A generic implementation of CTLE.
2.12 FIR implementation of FFE.

3.1 Active RC filter presented in [15].
3.2 Gm-C filter presented in [17].
3.3 Comparison between output samples of a rectangular filter and an I&D filter.
3.4 LTI Model of an I&D filter.
3.5 I&D Response.
3.6 Previous I&D works in the front-end.
3.7 Proposed AFE for a 2x blind ADC-based receiver including AAF.
3.8 Bandwidth scalability of the I&D scheme.
3.9 φerror plot for phase extraction with/without I&D.
3.10 Input data used for simulations with various Tini.
3.11 Tini = 0.
3.12 Tini = 12.5% UI = 12.5 ps.
3.13 Tini = 25% UI = 25 ps.
3.14 Frequency response of the I&D filter from Simulink simulations.
3.15 AAF system block diagram.
3.16 Pulses that drive the 4-way time-interleaved I&D system.
3.17 Architectures considered for the I&D filter design.
3.18 Frequency response of I&D filter generated from circuit simulations.

4.1 A 2x blind ADC-based receiver with the proposed AFE.
4.2 AAF/FFE system block diagram.
4.3 AFE behavioural modeling.
4.4 Zero-pole map of the proposed AFE.
4.5 Cascode-switching implementation.
4.6 Reset cell implementation.
4.7 Clock generation block diagram.
4.8 CMOS logic implementation.
4.9 Half-rate CMOS clock generator implementation.
4.10 CML divider implementation.
4.11 A 2x blind ADC-based receiver with the proposed AFE.
4.12 Timing diagram.
4.13 Simulated nodes in clock generation circuitry.
4.14 Simulated clock generation waveforms at 10 Gb/s.
4.15 Simulated clock generation waveforms at 5 Gb/s.
4.16 Simulated clock generation waveforms at 2 Gb/s.
4.17 Simulated time-domain waveforms of the AFE.
4.18 Simulated frequency response at 10 Gb/s.
4.19 Simulated frequency response at 5 Gb/s.
4.20 Simulated frequency response at 2 Gb/s.

5.1 AFE micrograph.
5.2 Measurement setup.
5.3 S21 plots.
5.4 Measured vs simulated frequency response (10 Gb/s).
5.5 Measured vs simulated frequency response (5 Gb/s).
5.6 Measured vs simulated frequency response (2 Gb/s).
5.7 Data-rate = 10 Gb/s - Channel loss = 13.3 dB @ 5 GHz.
5.8 Data-rate = 5 Gb/s - Channel loss = 13 dB @ 2.5 GHz.
5.9 Data-rate = 2 Gb/s - Channel loss = 11.7 dB @ 1 GHz.
5.10 Verification of the anti-aliasing filter.
5.11 Jitter tolerance comparison with AAF on/off.


List of Tables

3.1 Simulated 3-dB bandwidths of I&D filter.

4.1 Simulated results of the clock generation circuitry.
4.2 Simulated results of AFE.

5.1 Description of the pin-list.
5.2 Description of the test channels.

6.1 Performance summary.


List of Acronyms

AAF anti-aliasing filter

AFE analog front-end

BER bit error rate

BERT bit error rate tester

CDR clock and data recovery

CML current mode logic

CMOS complementary MOS

CTLE continuous time linear equalizer

DAC digital-to-analog converter

DCD duty-cycle distortion

DeMUX demultiplexer

DFE decision feedback equalizer

DSP digital-signal processor

FFE feed-forward equalizer

FIR finite impulse response

Gb/s gigabits per second

HDMI High-Definition Multimedia Interface

IC integrated circuit

I&D integration and dump

ISI intersymbol interference

LSB least-significant bit


LTI linear time-invariant

MOSFET metal oxide semiconductor field effect transistor

MSB most-significant bit

NMOS negative-channel metal oxide semiconductor

NRZ non-return-to-zero

PCB printed circuit board

PCIe Peripheral Component Interconnect Express

PD phase detector

PI phase interpolator

PMOS positive-channel metal oxide semiconductor

PLL phase-locked loop

PVT process-voltage-temperature

PRBS pseudo-random binary sequence

S&H sample-and-hold

SATA Serial Advanced Technology Attachment

SerDes serializer/deserializer

SNR signal-to-noise ratio

SONET synchronous optical network

UI unit interval

USB Universal Serial Bus

VCO voltage-controlled oscillator

VNA vector network analyzer


1 Introduction

High-definition television, voice over Internet, and even gaming systems are creating a large demand for faster data transmission over distances ranging from chip-to-chip to continental scale. Established industry standards such as High-Definition Multimedia Interface (HDMI), Peripheral Component Interconnect Express (PCIe), Universal Serial Bus (USB), and Serial Advanced Technology Attachment (SATA) have driven this market for years. New circuit innovations are sought every day to carve out a path for multi-gigabit-per-second (Gb/s) data transmission.

1.1 Motivation

In a typical high-speed transceiver, serial data streams are sent from a transmitter to a receiver via a communication channel. Existing low-cost channel materials exhibit low-pass behaviour at data-rates of 1 Gb/s and above. This causes each transmitted pulse to spread over more than one unit interval (UI) and interfere with its neighbouring symbols. Known as intersymbol interference (ISI), this phenomenon complicates the task of data recovery on the receiver side.

The most common practice to counteract the deteriorating channel effects is to use equalization. Equalizers can be implemented in the analog or digital domain. Binary receivers [1, 2, 3] use only 1 bit (i.e. the sign) of the received data to recover the data and the embedded clock. ADC-based receivers [4, 5, 6, 7, 8, 9], on the other hand, have access to more than 1 bit (i.e. the sign and the magnitude) of the received signal. In these receivers, a front-end ADC digitizes the input signal and enables more complex equalization circuitry in the digital domain.

Blind ADC-based receivers are a sub-category of the ADC-based receivers. They

utilize a feed-forward architecture and enable the design of a fully digital receiver.

Digital implementation is advantageous since it has low noise sensitivity compared to

the analog circuits, facilitates power/area scaling, and improves the flexibility of the

design. The focus of this thesis is to explore ways to improve the performance of the


blind ADC-based receivers.

The industry standards mentioned previously must support a wide range of data-rates and various channel characteristics. Usually, they are also required to be fully compatible with prior generations. For example, PCIe 3.0, which targets 8 GT/s, is required to be backward compatible with PCIe 2.0 and 1.0, which support 5 GT/s and 2.5 GT/s, respectively [10]. This shows the significance of designing receivers that cover a large range of data-rates.

Current blind ADC-based receivers [4, 5] are limited in their operating data-rate. The main problem is that they rely on the communication channel to perform the necessary anti-aliasing. In a system that covers a wide range of data-rates, the channel does not attenuate the transmitted signal enough at the lower data-rates to prevent aliasing. To overcome this disadvantage, our work proposes an analog front-end (AFE) whose bandwidth automatically adjusts with the data-rate.

The proposed AFE consists of a combined anti-aliasing filter (AAF) and equalizer, which extends the operating data-rate range to 2-10 Gb/s. The equalizer turns on when the channel imposes severe attenuation, which occurs at the higher data-rates. The AAF, as explained above, turns on at low data-rates, where the channel does not provide enough attenuation to prevent aliasing.

1.2 Thesis Objectives

This thesis presents the design of an AFE for 2x blind ADC-based receivers. The

main objectives of this thesis are as follows:

• Exploring AAF solutions with adjustable bandwidth to expand the applications of 2x blind ADC-based receivers to support multiple data-rates.

• Investigating the incorporation of an equalizer in conjunction with the bandwidth-scalable AAF.

• Design, fabrication, and measurement of the proposed AFE to prove functionality.


1.3 Thesis Outline

The remaining chapters of this thesis are organized as follows. Chapter 2 provides background on wire-line communication systems, ADC-based receivers, and the significance of equalization. It serves as a foundation for the discussions in the following chapters. Chapter 3 presents the design methodology, modeling, and simulation results of the AAF. The design of the complete AFE, including both the AAF and the equalizer, is presented in Chapter 4, followed by the simulation results. Chapter 5 discusses the measurements of the test-chip. Chapter 6 concludes this thesis and outlines future directions for this work.

2 Background

The rapid increase of speed in high-capacity networks and computer systems has

created a large demand for high-speed data transmission. Gigabit Ethernet, long-haul

optical channels, memory, and chip-to-chip interconnect are applications that directly

benefit from the multi-Gb/s serial link technology. This chapter presents the main

challenges in high-speed signaling along with their commonly used solutions. This

material serves as a background to frame the discussions in the following chapters.

Section 2.1 introduces a typical wire-line communication system consisting of a transmitter and a receiver block communicating through a wired backplane. Channel impairments that distort the transmitted signal, and therefore complicate the task of the receiver, are also discussed in this section. Binary and ADC-based receivers are introduced towards the end of this section as the two well-known receiver architectures. Section 2.2 discusses ADC-based receivers in more detail to provide the necessary context for the upcoming sections. Sections 2.3 and 2.4 present the necessity for an anti-aliasing filter (AAF) and for equalization as parts of currently employed ADC-based receivers. Section 2.5 concludes this chapter.

2.1 Wire-line Communication System

High-speed signaling refers to the exchange of information from a transmitting device to a receiving device at data-rates in excess of 1 Gb/s. The data is transmitted via a physical medium, based on which the communication system can be classified as wire-line, wireless, optical, etc. In the remainder of this section, we discuss the building blocks of a wire-line communication system.

2.1.1 Transceiver

A generic high-speed transceiver consists of a transmitter on one chip and a receiver

on another. The task of the transceiver is to transfer the data from the transmitter to

the receiver through a communication channel. This communication channel, which


could range from hundreds of feet of cable in network interfaces to less than one foot

of a PCB trace in chip-to-chip signaling interfaces, suffers from non-idealities which

distort the transmitted signal. The role of the transceiver is to compensate for the

losses introduced by the physical channel and recover the data with an acceptable bit

error rate (BER).

The evolving integrated circuit (IC) technology has enabled the design of high-speed transceivers. The wire-line backplane, on the other hand, does not advance at the same pace and continues to be the bottleneck. Although a transceiver incorporates other blocks such as drivers, serializers, deserializers, and samplers [11], its main design challenge is compensating for the deteriorating channel effects. In the next section, we study the channel impairments.

2.1.2 Communication Channel

In a wire-line communication system, which is the focus of this thesis, the communication channel carries electrical signals from the source to the destination. This channel could be a twisted pair, a coaxial cable, an Ethernet cable, or a USB cable. Although these wire-line channels differ in nature, they impose similar challenges on designers.

A typical channel in serial high-speed signaling consists of connectors, a PCB trace, and cables. A backplane channel provided by Tyco Electronics [12] is shown in Fig. 2.1(a). It comprises two line-cards, each 10" long, connected through a 20" PCB trace. The material used in both the line-cards and the trace is Nelco 4000-13SI. This channel can be modeled as a linear time-invariant (LTI) system with the frequency response plotted in Fig. 2.1(b).

The limited bandwidth of this channel, which is an example of a generic wire-line

link, attenuates the higher frequency content of the transmitted signal. In the time-

domain, this translates to the spreading of the data bit, which consequently interferes

with its adjacent bits. This phenomenon, known as intersymbol interference (ISI), is

more clearly explained in the following example.

Figure 2.1: An example of a two connector backplane from Tyco Electronics [12]. (a) Physical dimensions of the channel. (b) The channel transfer function, |S21| (dB) versus frequency (GHz).

If an ideal non-return-to-zero (NRZ) signal is applied to a channel with infinite bandwidth, an undistorted NRZ output is obtained, as shown in Fig. 2.2(a). The corresponding histogram of the samples, shown on the right side, includes an impulse at the zero level and another at the one level, indicating only two possible sample values. Fig. 2.2(b), in contrast, presents the same case with a practical channel that has a limited bandwidth. The samples in this scenario depend not only on the current bit value but also on the bits before and after it. The histogram on the right now consists of more than two impulses, indicating a range of possible sample values around the one and zero levels. The samples affected by ISI vary with time and can be misinterpreted by the receiver.

The narrower the channel bandwidth, the further each transmitted pulse extends in the time domain beyond its own UI, and the more severely ISI affects the transmitted waveform. The waveform will also fail to reach the full zero and one levels due to destructive interference from the neighbouring bits. These effects, if not compensated for, result in erroneous data detection and an increased BER at the receiver. The most common practice to compensate for ISI is to use equalization to flatten the combined frequency response of the channel and the equalizer. We will see this in more detail in Section 2.4.
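The spreading of sample values caused by ISI can be reproduced with a short numerical sketch. The example below is only an illustration under stated assumptions: the channel is modeled as a single-pole low-pass filter rather than the measured backplane, and the function names, the oversampling factor, and the bandwidth values are hypothetical choices made for this example.

```python
import numpy as np

def nrz_through_rc(bits, samples_per_ui=32, bandwidth_over_baud=0.2):
    """Pass a +/-1 NRZ waveform through a first-order low-pass channel.

    bandwidth_over_baud is the assumed 3-dB bandwidth divided by the baud rate.
    """
    levels = np.repeat(2.0 * np.asarray(bits) - 1.0, samples_per_ui)
    # Discrete-time single-pole filter: y[n] = y[n-1] + a * (x[n] - y[n-1])
    a = 1.0 - np.exp(-2.0 * np.pi * bandwidth_over_baud / samples_per_ui)
    out = np.empty_like(levels)
    y = levels[0]
    for n, x in enumerate(levels):
        y += a * (x - y)
        out[n] = y
    return out

def end_of_ui_samples(waveform, samples_per_ui=32):
    """Take one sample per UI, at the end of each bit period."""
    return waveform[samples_per_ui - 1 :: samples_per_ui]

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=2000)

# Very wide channel: samples sit at the two ideal levels.
wide = end_of_ui_samples(nrz_through_rc(bits, bandwidth_over_baud=10.0))
# Narrow channel: samples spread over a range around each level.
narrow = end_of_ui_samples(nrz_through_rc(bits, bandwidth_over_baud=0.2))

print("wide channel   |sample| range:", np.abs(wide).min(), np.abs(wide).max())
print("narrow channel |sample| range:", np.abs(narrow).min(), np.abs(narrow).max())
```

With the wide channel the sample magnitudes collapse onto a single value, which corresponds to the two-impulse histogram. With the narrow channel the magnitudes depend on the preceding bits, which corresponds to the broadened histogram.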

Figure 2.2: The effect of the limited channel bandwidth on the ideal NRZ signal. (a) Output of a channel with infinite bandwidth. (b) Output of a channel with finite bandwidth. Both panels plot voltage (V) versus time (ns).

2.1.3 Binary versus ADC-based Receivers

The task of a receiver is to correctly detect the transmitted signal. Ideally, the high-

speed serial data stream would be transmitted across the channel in parallel with the

corresponding clock signal. Non-idealities of the communication channels, however,

distort the data and the clock differently to the extent that they will no longer be

phase-aligned. Furthermore, the increased link cost for carrying the clock signal makes

this receiver topology undesirable. Consequently, in today’s high-speed transceivers,

the clock is not transmitted along with the data and the task of the clock recovery is

solely left to the receiver.

Generally, receivers are categorized as binary or ADC-based according to their front-end sampler. The more traditional receivers are binary, as they use a flip-flop at the front-end to sample the incoming signal. The binary sample carries the sign information of the data while discarding the magnitude information. This requires the necessary compensation to be performed before the sampling. Usually, an analog equalizer that precedes the binary sampler takes care of this compensation.

ADC-based receivers, however, sample the received signal with an ADC. Each

sample is now represented with a set of more than one bit, preserving both the sign

and the magnitude information. The extra information about the received signal

enables the ADC-based receivers to employ more complex equalization in the digital

domain in addition to the analog equalization. As the advancement in wire-line

channels lags the rapid increase in the data-rate, the need for intensive compensation

of the channel impairments grows. Accordingly, ADC-based receivers which allow

for a higher degree of equalization are a promising solution for wire-line multi-Gb/s

transceivers above 20 Gb/s [13].

2.2 ADC-based Receivers

As discussed in the previous section, ADC-based receivers offer extensive channel loss compensation in the digital domain, as they use more than one bit to represent each sample. ADC-based receivers can be categorized based on their sampling topology and sampling rate. By sampling topology, as will be discussed in Section 2.2.1, the receivers can be grouped into phase-tracking and blind. By sampling rate, as will be discussed in Section 2.2.2, the receivers can be classified as baud-rate sampling (i.e. 1x) or twice-the-baud-rate sampling (i.e. 2x).

2.2.1 Phase Tracking versus Blind Sampling

The more well-known ADC-based receivers, known as phase-tracking receivers, align the sampling clock of the front-end to the phase of the incoming signal via internal feedback [6, 7, 8, 9]. These receivers, shown in Fig. 2.3(a), recover the clock embedded in the incoming signal with a phase recovery unit that drives a DAC to generate a control signal for the clock generator unit. The clock generator unit, which is either a voltage-controlled oscillator (VCO) or a phase interpolator (PI), is an analog block that produces a sampling clock in phase with the input signal. The design of such analog blocks for multi-Gb/s signaling is the main disadvantage of the phase-tracking architectures, since these blocks are sensitive to noise, prevent fast production-level testing, and do not port easily to new technologies.

Figure 2.3: ADC-based receiver architectures. (a) Phase-tracking ADC-based receiver architecture. (b) Feed-forward (blind) ADC-based receiver architecture.

In an attempt to design an all-digital receiver, blind ADC-based architectures have been introduced [4, 5]. In these architectures, shown in Fig. 2.3(b), a blind sampling clock (i.e. a clock signal with no fixed phase relation to the data) samples the incoming data. Two newly introduced digital blocks, a phase detector and a digital filter, replace the undesirable analog VCO/PI blocks of the phase-tracking receiver. As the input is sampled blindly, there is no need for a feedback loop to recover the clock. This is why these architectures are also known as feed-forward ADC-based receivers.

2.2.2 1x versus 2x Sampling Rate

ADC-based receivers are also classified based on the number of samples per unit interval (UI) taken from the received signal. The most common sampling rates are twice per UI (i.e. 2x) or once per UI (i.e. 1x). Either a phase-tracking or a blind ADC-based receiver can sample at the baud rate or at twice the baud rate, resulting in a total of four possible architectures: 1x phase-tracking, 2x phase-tracking, 1x blind, and 2x blind.

As explained in the previous section, phase-tracking receivers depend on a feedback

loop to align the sampling clock to the received signal. This feedback loop often

demands considerable design and verification resources. One solution to simplify

the design is to remove the feedback phase recovery loop and investigate the blind

architectures. The motivation to go from the 2x to the 1x sampling rate is to relax

the ADC conversion rate, allowing for increased baud rate. The 1x blind ADC-based

receivers, however, are not feasible since in the worst case, the samples can fall on

the zero-crossings of the input and make the task of the data recovery impossible. In

the next section, 2x blind ADC-based receivers are discussed in more detail.
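The worst case for 1x blind sampling can be illustrated numerically. The sketch below assumes, purely for illustration, that a band-limited alternating 1010... pattern can be approximated by a sinusoid with a period of 2 UI. The function name and the chosen sampling phases are hypothetical.

```python
import numpy as np

def sample_alternating_pattern(samples_per_ui, phase_ui, n_ui=8):
    """Blindly sample sin(pi * t / UI), an idealized band-limited 1010... pattern.

    phase_ui is the (unknown to the receiver) sampling phase in units of UI.
    """
    t = phase_ui + np.arange(n_ui * samples_per_ui) / samples_per_ui
    return np.sin(np.pi * t)

# 1x sampling with the clock landing exactly on the zero-crossings:
one_x = sample_alternating_pattern(1, phase_ui=0.0)
print("largest 1x sample magnitude:", np.max(np.abs(one_x)))

# 2x sampling: for any phase, at least one of the two samples in each UI
# is well away from the zero-crossing.
for phase in np.linspace(0.0, 0.5, 6):
    two_x = sample_alternating_pattern(2, phase_ui=phase)
    best_per_ui = np.abs(two_x).reshape(-1, 2).max(axis=1)
    print(f"phase = {phase:.1f} UI, weakest usable 2x sample: {best_per_ui.min():.3f}")
```

In this idealized model, every 1x sample is numerically zero at the worst-case phase, so no data can be recovered. With 2x sampling, the larger of the two samples in each UI never falls below about 0.707 of the full amplitude.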

2.2.3 2x blind ADC-based Receiver

A block diagram of a blind-sampling ADC-based receiver [4, 5] is presented in Fig. 2.3(b). A blind clock samples the received signal twice per UI. The phase detector uses these samples to approximate the instantaneous zero-crossing phase, φinst, of each data transition. The value of φinst is further filtered to generate the average phase, φave. Both phase values, along with the digital samples, are sent to the decision block for data recovery. In this section, we explain the functionality of the phase detector and the decision block to provide the context for the remaining chapters.

Fig. 2.4 illustrates the task of the phase detector, which relies on linear interpolation between two consecutive samples of opposite signs. If X and Y are the two samples, taken 0.5 UI apart, the instantaneous phase of the zero-crossing measured from X (i.e. φinst) can be derived from similar triangles, as shown in Equation 2.1, where Xamp and Yamp are the absolute values of the amplitudes of X and Y, respectively.

\[
\phi_{inst} = \left( \frac{0.5\, X_{amp}}{X_{amp} + Y_{amp}} \right) \mathrm{UI} \tag{2.1}
\]
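Equation 2.1 can be evaluated directly in a few lines. The following is a minimal sketch, not the receiver's actual implementation: the function name is hypothetical, and the sign check is an assumption added so that the interpolation is only applied across a data transition.

```python
def instantaneous_phase(x, y, ui=1.0):
    """Estimate the zero-crossing position after sample x using Eq. 2.1.

    x and y are consecutive 2x samples (0.5 UI apart) with opposite signs.
    The result is the distance from x to the zero-crossing, in units of `ui`.
    """
    if x * y >= 0:
        raise ValueError("samples must have opposite signs (a data transition)")
    x_amp, y_amp = abs(x), abs(y)
    return 0.5 * ui * x_amp / (x_amp + y_amp)

# Equal magnitudes: the crossing lies midway between the samples.
print(instantaneous_phase(1.0, -1.0))   # 0.25
# A larger X pushes the estimated crossing toward Y.
print(instantaneous_phase(3.0, -1.0))   # 0.375
```

Because the two samples are 0.5 UI apart, the estimate always lies between 0 and 0.5 UI from X.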

The filter that follows the phase detector subtracts the current value of φave from φinst to generate a phase error (φer) for every UI. Similar to the phase-tracking receiver, φer is low-pass filtered to recover φave. The decision block uses φinst, φave, and the values of 3 consecutive samples to extract the data. First, the average eye-center phase, known as the data-picking phase, φpick, is calculated by adding 0.5UI to φave. Second, φinst and φpick are compared, and the sample closest to φpick, or farthest from φinst, is chosen as the decided data value. Fig. 2.5 illustrates this data decision scheme.
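A simplified Python sketch of the phase filtering and data-picking steps is given here. It is an illustration under stated assumptions rather than the decision logic of [4, 5]: the low-pass filter is assumed to be a first-order update with a hypothetical `gain`, phase wrap-around in φer is ignored, and the decision is reduced to choosing the sample whose phase is closest to φpick.

```python
def update_average_phase(phi_ave, phi_inst, gain=0.05):
    """One low-pass step: phi_er = phi_inst - phi_ave, then a small correction.

    A first-order filter and the gain value are assumptions for illustration.
    """
    phi_er = phi_inst - phi_ave
    return phi_ave + gain * phi_er


def pick_sample(samples, sample_phases, phi_ave, ui=1.0):
    """Return the sample whose phase is closest to phi_pick = phi_ave + 0.5 UI."""
    phi_pick = (phi_ave + 0.5 * ui) % ui

    def circular_distance(phase):
        # Distance between two phases on a circle of circumference one UI.
        d = abs(phase - phi_pick) % ui
        return min(d, ui - d)

    best = min(range(len(samples)), key=lambda i: circular_distance(sample_phases[i]))
    return samples[best]


# Three consecutive samples A, B, C taken 0.5 UI apart.
print(pick_sample(["A", "B", "C"], [0.0, 0.5, 1.0], phi_ave=0.1))  # B
```

With φave = 0.1 UI, φpick is 0.6 UI, so the sample taken at 0.5 UI is selected.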

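Figure 2.4: Extraction of the instantaneous phase (φinst). [Sketch: samples X and Y, 0.5UI apart on the time axis t, with amplitudes Xamp and Yamp and the interpolated zero-crossing at φinst.]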

Figure 2.5: Data decision scheme. [Two cases showing consecutive samples A, B, and C with the phases φinst and φpick marked: picked data = A in one case and picked data = B in the other.]

The next section discusses the limitations of the phase recovery scheme used in 2x

blind ADC-based receivers. This serves as a necessary background to motivate our

proposed design in this work.

2.3 Aliasing

A typical signal processing system samples and digitizes the incoming analog signal,

performs the necessary digital signal processing, and converts the final output back

to the continuous domain to interface with the analog world. The sampling theo-

rem [14] states that a continuous signal, g(t), strictly band-limited to B Hz, can be

reconstructed from its samples only if the sampling frequency is more than 2B Hz.

Otherwise, the sampling process will not be reversible.

Figure 2.6: An example where the sampling theorem is satisfied. [Spectra versus f: G(f), band-limited to ±B; Gs(f), with replicas at multiples of 2B; and the recovery filter H(f) with cutoff B.]

Suppose that g(t), with frequency spectrum G(f), is sampled at 2B Hz. The frequency spectrum of the sampled signal, Gs(f), is the sum of replicas of G(f) centered at the integer multiples of 2B Hz, as shown in Fig. 2.6. A low-pass filter, H(f), with a cutoff frequency of B can then be used to recover the original signal, g(t).

The sampling theorem relies on the assumption that g(t) is strictly band-limited.

No practical signal, however, is strictly band-limited, with the result that under-

sampling always occurs to some degree. This phenomenon, known as aliasing, refers

to the overlap of the frequency content as highlighted in Fig. 2.7. Once the signal

is aliased, it is impossible to differentiate between the frequencies in band and out of

band.
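The ambiguity caused by aliasing can be illustrated with a small Python example. The frequencies used here (a 10 Hz sampling rate with 3 Hz and 7 Hz tones) are arbitrary values chosen for illustration; the point is that a tone above half the sampling rate produces exactly the same samples as an in-band tone.

```python
import math


def sample_cosine(freq_hz, fs_hz, n):
    """Return n samples of cos(2*pi*freq*t) taken at a rate of fs_hz."""
    return [math.cos(2 * math.pi * freq_hz * k / fs_hz) for k in range(n)]


fs = 10.0                                # sampling rate, so fs/2 = 5 Hz
in_band = sample_cosine(3.0, fs, 8)      # below fs/2
out_of_band = sample_cosine(7.0, fs, 8)  # above fs/2, aliases onto 3 Hz

max_diff = max(abs(a - b) for a, b in zip(in_band, out_of_band))
print(max_diff < 1e-9)  # True
```

Once sampled, the two tones cannot be told apart, which is why the out-of-band content has to be attenuated before the sampler.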

To combat the effects of aliasing, low-pass anti-aliasing filters are placed prior to

the sampler to attenuate the higher frequency content of the signal. Although ADC-

based receivers recover one bit per UI and not the actual transmitted pulse waveform,

being a sampled system, they need to deal with aliasing. The next section discusses

the aliasing issues specific to the design of the 2x blind ADC-based receivers.

Figure 2.7: An example where the sampling theorem is violated (Aliasing). [Spectra G(f) and Gs(f) versus f, with the overlapping replicas in Gs(f) highlighted.]

2.3.1 Necessity for an Anti-Aliasing Filter

The phase recovery scheme employed in 2x blind ADC-based receivers [4, 5] was

described in Section 2.2.3. The scheme relies on the linear interpolation between

the two consecutive samples of opposite sign. This interpolation leads to erroneous

estimation of the zero crossings and reduced jitter tolerance if the received signal

contains sharp transitions. Fig. 2.8 compares the results of the interpolation on an

ideal signal against a filtered one. Linear interpolation gives a far better estimation

of the zero-crossings when the signal is filtered as opposed to when it is not.
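The effect of edge sharpness on the interpolated zero-crossing can be sketched in Python. The tanh-shaped transition, the `steepness` values, and the crossing position of 0.1 UI are assumptions made for illustration and are not taken from Fig. 2.8; a large steepness stands in for an unfiltered signal and a small one for a filtered signal.

```python
import math


def interpolation_error(steepness, t_cross=0.1, ui=1.0):
    """Error (in UI) of the linearly interpolated zero-crossing.

    The data transition is modelled as tanh(steepness * (t - t_cross)),
    sampled at t = 0 and t = 0.5 UI (hypothetical edge model).
    """
    def edge(t):
        return math.tanh(steepness * (t - t_cross))

    x, y = edge(0.0), edge(0.5 * ui)
    estimate = 0.5 * abs(x) / (abs(x) + abs(y)) * ui
    return abs(estimate - t_cross)


sharp = interpolation_error(50.0)   # nearly ideal, square-like edge
smooth = interpolation_error(2.0)   # band-limited, gradual edge
print(sharp > smooth)  # True
```

For the sharp edge both samples sit near full amplitude, so the estimate collapses toward 0.25 UI regardless of where the crossing actually is; the gradual edge keeps the sample amplitudes informative.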

The 2x blind ADC-based receivers reported so far [4, 5] leave the task of anti-aliasing to the communication channel. Therefore, they cannot be applied to standards in which the backplane trace is as short as a few centimeters, as in PCIe.

To expand the applications of the 2x blind ADC-based receivers, it is important to

incorporate an anti-aliasing filter at the front-end to reduce aliasing and improve the

jitter tolerance of the receiver.

2.3.2 Previous Work

To date, no anti-aliasing filter has been incorporated in the design of the 2x blind

ADC-based receivers. Anti-aliasing filters, however, are used in almost every elec-

tronic circuit. In audio systems, they are used for preamplification, equalization, and

tone control. Communication systems use them for tuning to specific frequencies. In

digital signal processing, the filters avoid the aliasing of the out-of-band noise and

interference. These systems primarily utilize low-pass filters prior to the ADC to eliminate the undesired aliased information in the signal path.

Figure 2.8: Zero-crossing estimations based on the linear interpolation. (a) Ideal signal. (b) Filtered signal. [Axes: Time (ns) versus Voltage (V).]

Op-amp RC filters [15, 16] are attractive anti-aliasing solutions as they offer low

noise and high dynamic range. While feedback is mainly responsible for these desired

features, it limits the bandwidth of the filters. Gm-C (transconductance-C) filters

[17, 18] are more suited for high-frequency performance as they eliminate the feedback.

Section 3.1 discusses the different types of anti-aliasing filters in more detail.

2.4 Equalization

As explained in Section 2.1, the signal that travels from the transmitter to the receiver

is distorted by the non-idealities of the channel. Limited channel bandwidth disperses

the current UI in the time-domain such that it interferes with its neighbouring UIs.

The most common way to cancel ISI is to use equalization to make the cascade of the

channel and the equalizer have a flat frequency response, as shown in Fig. 2.9.

Figure 2.9: Equalization in frequency domain. [Frequency responses versus f: channel × equalizer = channel and equalizer combined.]

Equalization can be performed on the transmitter side [19, 20, 21], the receiver

side [5, 22, 23], or both [9, 24, 25]. In the remainder of this section we focus on the

receiver-side implementations of the equalization.

2.4.1 Feed-Forward Equalization (FFE)

Equalization [26] at the receiver can be performed in a linear or a non-linear manner. The former has a feed-forward architecture while the latter uses feedback, as shown in Fig. 2.10. Feed-forward equalizers (FFE) do not have a feedback path and

can be implemented either as a continuous time linear equalizer (CTLE) or a finite

impulse response (FIR) filter. A common way to implement CTLE is a differential

pair with source degeneration consisting of a capacitor in parallel with a resistor [27].

The capacitor becomes a short at high frequencies which increases the gain and com-

pensates for the channel losses. Fig. 2.11 shows a generic schematic of this approach

along with the corresponding frequency response. This approach is limited by the

gain-bandwidth product of the source-coupled differential pair and if designed well

can provide 4-6 dB gain/stage at 10 Gb/s in 90-nm CMOS technology [28].
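The high-frequency boost of a degenerated differential pair can be estimated with a first-order Python sketch. The transfer function used here is the common textbook small-signal model (one zero and one pole set by the degeneration network, output pole ignored), not the specific circuit of [27], and all component values are hypothetical.

```python
import math


def ctle_gain_db(f_hz, gm=10e-3, r_d=200.0, r_s=400.0, c_s=200e-15):
    """Gain in dB of H(s) = gm*Rd*(1 + s*Rs*Cs) / (1 + gm*Rs/2 + s*Rs*Cs).

    Rs and Cs are the degeneration resistor and capacitor between the two
    sources; all values are illustrative assumptions.
    """
    s = 1j * 2 * math.pi * f_hz
    h = gm * r_d * (1 + s * r_s * c_s) / (1 + gm * r_s / 2 + s * r_s * c_s)
    return 20 * math.log10(abs(h))


low_freq_gain = ctle_gain_db(1e6)
high_freq_gain = ctle_gain_db(100e9)
boost_db = high_freq_gain - low_freq_gain
print(round(boost_db, 1))
```

In this model the boost approaches 20·log10(1 + gm·Rs/2) once the capacitor shorts the degeneration resistor, so a larger boost costs low-frequency gain.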

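Figure 2.10: Feed-forward and feedback equalization. [Blocks: FFE in the feed-forward path and FIR in the feedback path, with a summing node between the input data and the recovered data.]

Figure 2.11: A generic implementation of CTLE. [Schematic with input Vi and its high-frequency (HF) equivalent, alongside the magnitude response |H(f)| versus f.]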

A 2-tap FIR implementation of the FFE is shown in Fig. 2.12. In this realization,

equalization is achieved by subtracting a fraction of the previously sampled data from

the current sample. This fraction (α2/α1) can be adjusted to obtain the required

high-frequency boost to counteract the channel attenuation. For severe cases of ISI, the FIR filter can be generalized to have more taps to account for the disturbance of more neighbouring bits. Analog FIR filters are often not a suitable choice since analog delay cells are sensitive to noise and process-voltage-temperature (PVT) variations. ADC-based receivers, on the other hand, facilitate the use of digital FIR filters as FFE. In digital FIR filters, equalization is performed on the digital representations of the incoming data.
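The 2-tap operation described above can be sketched in Python. The tap values and the assumption that the sample before the first input is zero are illustrative; the boost expression follows from evaluating y[n] = α1·x[n] − α2·x[n−1] at DC and at the Nyquist frequency.

```python
import math


def ffe_2tap(samples, alpha1=1.0, alpha2=0.25):
    """Subtract a fraction of the previous sample from the current one."""
    out, prev = [], 0.0  # the sample before the first input is assumed zero
    for x in samples:
        out.append(alpha1 * x - alpha2 * prev)
        prev = x
    return out


def ffe_boost_db(alpha1, alpha2):
    """Gain at the Nyquist frequency relative to DC, in dB.

    At DC the filter gain is alpha1 - alpha2; at Nyquist it is alpha1 + alpha2.
    """
    return 20 * math.log10((alpha1 + alpha2) / (alpha1 - alpha2))


print(ffe_2tap([1.0, 1.0, 1.0]))  # [1.0, 0.75, 0.75]
print(round(ffe_boost_db(1.0, 0.25), 2))
```

A constant input is attenuated to α1 − α2 after the first sample, while an alternating input would see α1 + α2; increasing α2/α1 widens that gap.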

Figure 2.12: FIR implementation of FFE. [Input data passes through a chain of delay cells; tap weights α1, α2, α3, ..., αn feed a summer Σ that produces the equalized data. The 2-tap and n-tap portions are marked.]

As mentioned above, linear equalization amplifies the high-frequency content of the data; this also amplifies the high-frequency noise, which reduces the signal-to-noise ratio (SNR) and increases the BER. The noise enhancement problem can be avoided by employing non-linear equalization such as the decision feedback equalizer (DFE) presented in Fig. 2.10. In this technique, the decided bit drives the equalization, eliminating the noise from the received signal. In an ADC-based receiver, on the other hand, DFE enhances the quantization noise introduced by the ADC. Therefore, it is a good idea to employ both an analog FFE and a digital FFE or DFE in ADC-based receivers to minimize BER degradations due to both the high-frequency and quantization noise.

2.4.2 Previous Work

ADC-based receivers are potential solutions for data-rates above 20 Gb/s [13] since

they enable complex equalization to cancel the deteriorating channel effects. The

2x blind ADC-based receiver presented in [5] employs a linear analog equalizer prior

to the ADC plus a digital FFE following the ADC. The analog equalizer provides

a nominal gain of 6 dB at 2.5 GHz by using an RC-degenerated differential pair

designed in 65-nm CMOS. The digital FFE, implemented as a half-a-UI-spaced 2-

tap FIR filter, further equalizes the digital signal. The combined analog and digital

equalization is capable of compensating 15 dB of signal loss caused by the cable.

Future chapters explain our proposed architecture to achieve almost the same boost-

ing effect except with an all-analog implementation. This is useful when the ADC

quantization noise becomes a limiting factor in the design; this occurs when the ADC

resolution is chosen to be low for power-saving purposes.

2.5 Summary

This chapter provided an introduction to wire-line communication systems, explaining

the roles of a receiver, a transmitter, and an equalizer. The signal degradations due to

the communication channel were studied and the methods to counteract these effects

were presented. The ADC-based receiver architecture was introduced as a potential

solution for data-rates of 20 Gb/s and above. It was explained that the current ADC-based receivers lack anti-aliasing filters at the front-end. In the following chapters, we present our solution to this problem, expanding the applications of the ADC-based receivers to standards that support multiple data-rates and a variety of channels.

3 Anti-Aliasing Filter (AAF) Design

Anti-aliasing filters are widely used in today’s data acquisition systems. These types

of systems consist of a front-end sampler, an ADC, and digital-signal processor (DSP)

circuitry. The anti-aliasing filter, which is placed prior to the sampler, ensures that

the input signal does not contain frequencies higher than half the sampling rate. This

guarantees the reconstruction of the input signal. In 2x blind ADC-based receivers,

aliasing prevents the accurate phase recovery of the input. As a result, it is impor-

tant to investigate solutions to prevent aliasing in these receivers and expand their

operating input frequency range.

This chapter presents the design of the anti-aliasing filter (AAF) portion of the

proposed analog front-end (AFE). Section 3.1 studies various ways to implement

the desired AAF. Section 3.2 selects the integration and dump (I&D) scheme as

the desired solution and presents the proposed architecture. Section 3.3 discusses

the design methodology for the I&D filter along with the behavioural modeling and

simulation results. Section 3.4 describes the circuit implementation of the I&D filter

followed by the simulation results. Section 3.5 concludes this chapter.

3.1 Anti-Aliasing Filters

As discussed in Section 2.3, in order for 2x blind ADC-based receivers to accommodate

data-rates ranging from 2-10 Gb/s, it is crucial that the AFE design includes an AAF

whose bandwidth adjusts with the data-rate. This section investigates the use of

active RC filters, Gm-C filters, and I&D filters as possible candidates for the desired

AAF. I&D is selected since it offers easy bandwidth programmability controlled by

the data-rate.

3.1.1 Active RC Filters

Active RC filters have largely been used to implement an AAF in data-acquisition

systems. Operational amplifiers, resistors, and capacitors constitute the building

blocks of such filters [29, 30]. While operational amplifiers provide voltage gain and

high dynamic range, they are bandwidth limited and therefore not suitable for high-

speed (Gb/s) systems. To combat this disadvantage, research continues to explore

more circuit techniques to design op-amps with large bandwidth.

A CMOS AAF with RC feedback is presented in [15] that achieves a maximum pole frequency of 500 MHz. The op-amp utilized in this design (shown in Fig. 3.1) consists of three main stages with three local common-mode feedback loops and two feed-forward stages for compensation. Although it operates at a comparatively high speed amongst active RC filters, the design suffers from a low phase margin of 20°.

The tunability of the bandwidth of an active RC filter can be achieved by digitally

selecting a number of parallel resistors or capacitors. There is no well-known approach

to adjust the bandwidth according to the data-rate. For this reason and due to the

low frequency operation of the op-amps, active RC filters are not suitable for the AFE

targeting high-speed ADC-based receivers.

Figure 3.1: Active RC filter presented in [15]. [Schematic with differential inputs In+/In- and outputs Out+/Out-.]

3.1.2 Gm-C Filters

Gm-C filters (transconductance-C filters) offer higher bandwidth than their active

RC counterparts [29, 30]. In Gm-C filters a differential input voltage is converted to

current by the transconductance cell and integrated on a capacitor. Therefore, the

key to designing fast filters of this type is to use fast transconductors.

A transconductor circuit with a 3-dB bandwidth of 900 MHz for the design of Gm-

C filters is presented in [17]. The circuit consists of a fixed transconductor cascaded

with a variable gain cell (shown in Fig. 3.2). This topology takes advantage of the

current-mode signal processing to increase the operational bandwidth. Tunability of

the bandwidth is achieved by the variable gain stage.

Although Gm-C filters offer higher bandwidth compared to the active RC filters,

they are still unsuitable choices for ADC-based receivers covering 2-10 Gb/s. More-

over, as was the case with active RC filters, Gm-C filters are also unable to provide

bandwidth scalability with the data-rate.

Figure 3.2: Gm-C filter presented in [17]. [Transconductor Gm with input Vi and output current io, cascaded with a variable gain cell producing Vo.]

3.1.3 Integration and Dump (I&D) Filters

I&D is a well-known scheme for optimum detection in digital communication [14].

The input waveform is integrated for one full period, sampled, and reset before the

integration of the next bit commences. While I&D improves performance by averaging

the noise and therefore lowering the bit error rate (BER), it introduces the following

problems to high-speed systems. One is that the integration must be exactly phase-

aligned to the input data so that the entire unit interval (UI) is integrated. Another

issue is that the integration result is required to be reset immediately before the

next integration starts. This, however, can be relaxed by using time-interleaved

architectures [31].

For a linear time-invariant (LTI) system with an impulse response of h(t), the input-output relationship can be expressed as:

\[ y(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau \tag{3.1} \]

Assuming that h(t) is a rectangular filter of pulse-width equal to UI seconds, we can write:

\[ y(t) = \int_{t-\mathrm{UI}}^{t} x(\tau)\, d\tau \tag{3.2} \]

y(t) is sampled at the maximum eye-opening, which happens at the end of each UI

integration. The resulting samples can be formulated by the relationship 3.3:

\[ y(n\,\mathrm{UI}) = \int_{(n-1)\mathrm{UI}}^{n\,\mathrm{UI}} x(\tau)\, d\tau, \quad n = 1, 2, \ldots \tag{3.3} \]

For a specific input x(t), shown in Fig. 3.3(a), the outputs of the rectangular filter

and the I&D are presented in Fig. 3.3(b) and Fig. 3.3(c), respectively. Although

the two output waveforms are different, their corresponding output samples are the

same. This result shows that the I&D is a practical implementation of a rectangular

filter [26].
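The equivalence between the two sets of output samples can be checked numerically with a short Python sketch. The discretisation (100 time steps per UI), the test waveform, and the function names are assumptions made for illustration; the rectangular filter is evaluated as a running integral over the last UI, while the I&D integrates, is sampled at each UI boundary, and is then reset.

```python
import math


def rect_filter_at(x, n_end, steps_per_ui, dt):
    """Running integral of x over the last UI, evaluated at index n_end."""
    return sum(x[n_end - steps_per_ui:n_end]) * dt


def integrate_and_dump(x, steps_per_ui, dt):
    """Integrate x, sample at every UI boundary, then reset (dump)."""
    samples, acc = [], 0.0
    for k, value in enumerate(x, start=1):
        acc += value * dt
        if k % steps_per_ui == 0:
            samples.append(acc)
            acc = 0.0
    return samples


steps, dt = 100, 0.01                       # 100 steps per UI, UI = 1
x = [math.sin(0.07 * k) for k in range(4 * steps)]

iad = integrate_and_dump(x, steps, dt)
rect = [rect_filter_at(x, m * steps, steps, dt) for m in range(1, 5)]
max_gap = max(abs(a - b) for a, b in zip(iad, rect))
print(max_gap < 1e-9)  # True
```

Between sampling instants the two waveforms differ, since the I&D restarts from zero every UI, but at the UI boundaries both have integrated exactly the same window.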

The above analyses show that an I&D scheme can be modeled as an LTI system

with a rectangular impulse response of width equal to the duration of the integration.

This model, shown in Fig. 3.4, is followed by its corresponding impulse/frequency

response in Fig. 3.5. In the frequency domain, the I&D filter response is a sinc function with nulls at integer multiples of the data-rate (fb = 1/UI) and a 3-dB bandwidth equal to 0.443fb [26].
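The sinc response and its 3-dB bandwidth can be reproduced with a few lines of Python. This is a sketch of the ideal rectangular-window model only; the bisection search and the function names are illustrative choices, and the integration window is set to one full UI at 10 Gb/s as an example.

```python
import math


def iad_magnitude(f_hz, t_int):
    """|H(f)| of a rectangular window of width t_int, normalised to 1 at DC."""
    x = math.pi * f_hz * t_int
    return 1.0 if x == 0 else abs(math.sin(x) / x)


def bandwidth_3db(t_int):
    """Bisection search for |H(f)| = 1/sqrt(2) below the first null."""
    lo, hi = 0.0, 1.0 / t_int
    target = 1.0 / math.sqrt(2.0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if iad_magnitude(mid, t_int) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


ui = 100e-12          # one UI at 10 Gb/s
fb = 1.0 / ui
print(round(bandwidth_3db(ui) / fb, 3))   # 0.443
print(iad_magnitude(fb, ui) < 1e-12)      # True: null at the data-rate
```

Because the bandwidth depends only on the window width, shortening or lengthening the integration time moves the whole response proportionally.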

Figure 3.3: Comparison between output samples of a rectangular filter and an I&D filter. (a) Input (x(t)) to both systems (rectangular filter and I&D). (b) Output of the rectangular filter. (c) Output of the I&D. [Axes: Time (ps), marked from UI to 8UI, versus Voltage (V).]


Figure 3.4: LTI Model of an I&D filter. [Input Vi(t) passes through h(t) and is sampled at Ts = (n)UI, n = 1, 2, ..., to give Vo(n).]

Figure 3.5: I&D Response. (a) I&D impulse response: h(t) versus Time (s). (b) I&D frequency response: |H(f)| versus Frequency (Hz), with the frequency axis marked at fb, 2fb, ..., 5fb.

3.2 Proposed I&D Scheme

Fig. 3.6 presents two examples of the previous works where I&D was utilized as

the front-end sampler for serializer/deserializer (SerDes) applications. Both of these

architectures take advantage of the I&D to filter the high-frequency noise and improve

the signal-to-noise ratio (SNR). However, integrating the entire bit period in these

two architectures necessitates a clock recovery unit to accurately align the phase of

the input to the integrating clock. In [32], the clock signal and data are sent from

the transmitter to the receiver. Although feasible at data-rates as low as 700 Mb/s,

this method is not applicable to today’s multi-Gb/s signaling. The reason is that the

skew introduced due to the losses of the channel is large enough to disturb the phase

relation between the input and the clock. To resolve this, [33] eliminates the clock

wire and recovers the clock from the received signal by a clock recovery unit. The

recovered clock drives the I&D which is phase-aligned to the input signal.

The I&D scheme has also been employed in the design of a decision feedback equalizer (DFE) as presented in [34]. This receiver enjoys the power savings offered by this scheme as opposed to standard DFE summing amplifiers. The I&D, which is phase-aligned to the input, integrates the signal for an entire UI. The anti-aliasing feature of the I&D is undesired in this design since it closes the output eye. In a later design [35], a similar group of authors employed a sample-and-hold (S&H) to avoid the loss introduced by the I&D. As discussed in Chapter 2, our goal is to make use of the anti-aliasing feature of the I&D rather than suppress it.

Figure 3.6: Previous I&D works in the front-end. (a) Receiver using I&D in the front-end [32]: the TX sends clk and data to the RX, where a DLL/PLL clocks the I&D blocks. (b) Receiver using I&D in the front-end [33]: the TX sends data only, and a clock recovery unit in the RX clocks the I&D blocks.


Since we target 2x blind ADC-based receivers, the I&D clock no longer needs

to be phase-aligned to the input data. Therefore, the feedback loop from the clock

recovery unit to the AFE is eliminated. Furthermore, we take advantage of the

bandwidth programmability of the I&D to cover data-rates in the range of 2-10 Gb/s.

Fig. 3.7 presents the proposed AFE architecture for 2x blind ADC-based receivers.

This filter blindly integrates the incoming signal for 0.5UI. A 4-way time-interleaved

architecture is employed to relax the speed requirements of the reset switches. Each

interleaved branch is followed by a 4-b half-rate ADC that quantizes the AAF output

samples. These digital samples are further DeMUXed and sent to the 2x blind ADC-

based CDR [5]. The ADC clock has a certain phase relationship with respect to the

AAF/FFE clock which is discussed in Section 4.3.3.

Figure 3.7: Proposed AFE for a 2x blind ADC-based receiver including AAF. [Blocks: AAF with AAF Ck Gen (this work), 4-bit fb/2 GS/s ADCs with ADC Ck Gen, and the 2x blind ADC-based CDR producing DOUT; bus labels read 16 for fb <= 5 Gb/s and 32 for fb > 5 Gb/s; fb = 2-10 Gb/s.]

The impulse/frequency responses of an I&D are plotted in Fig. 3.8. This figure

shows that the I&D bandwidth linearly scales with the integration duration (Ti).

As described in Section 2.2.3, the phase (zero-crossing) of the incoming signal in 2x

ADC-based receivers is derived by a linear interpolation between the two consecutive

opposite samples. In order to see the effectiveness of the I&D as an AAF, Fig. 3.9

plots the phase error (φerror) versus the initial integration time (Tini). φerror is defined

as the difference between the interpolated phase and the actual input phase. Fig. 3.9

shows that the I&D scheme reduces this error by about 21 %.


Figure 3.8: Bandwidth scalability of the I&D scheme. (a) I&D impulse response for various integration times: h(t) versus Time (ps). (b) I&D frequency response for various integration times: |H(f)| in dB versus Frequency (GHz), for data-rates of 2, 5, and 10 Gb/s.


Figure 3.9: φerror plot for phase extraction with/without I&D. [φerror (%UI) versus Tini (%UI), with one curve without I&D and one with I&D.]

It is beneficial to study the effect of the blind clock on the I&D scheme. To do so, Tini is varied and the output samples from the rectangular filter and the I&D are compared. Fig. 3.10 shows the input used to simulate the different values of Tini. Figures 3.11, 3.12, and 3.13 show the results for Tini = 0, Tini = 0.125UI, and Tini = 0.25UI, respectively. One important observation is that, regardless of Tini, the output samples from the rectangular filter and the I&D are always the same. As a result, blindness of the clock with respect to the incoming data does not affect the frequency response of the I&D filter.

Figure 3.10: Input data used for simulations with various Tini. [Axes: Time (ps) versus Voltage (V).]


Figure 3.11: Tini = 0. (a) Samples from the rectangular filter. (b) Samples from I&D. [Axes: Time (ps) versus Voltage (V).]

Figure 3.12: Tini = 12.5% UI = 12.5ps. [Two panels; axes: Time (ps) versus Voltage (V).]

Figure 3.13: Tini = 25% UI = 25ps. [Two panels; axes: Time (ps) versus Voltage (V).]


3.3 I&D Design Methodology and Modeling

To initiate the design process, a linear model of the I&D filter was incorporated with

the rest of the front-end components of a 2x blind ADC-based receiver as shown in

Fig. 3.7. While this model serves as an approximation, it provides insights about the

combination of the proposed I&D filter and the receiver front-end.

3.3.1 Design Methodology

A linear and event-driven [36] model of the I&D filter was built with Matlab’s Simulink

tool [37]. Implemented as a 4-way time-interleaved architecture, the proposed filter

was integrated with four 4-b ADCs, each followed by a DeMUX to constitute a com-

plete front-end for 2x blind ADC-based receivers. This model was used to analyze

the time-domain behaviour and verify the frequency response of the system. Once

the functional verification in Simulink was completed, the design was transferred to

transistor-level implementations in Cadence.

3.3.2 Behavioural Modeling

Due to the unavailability of a 2x blind ADC-based CDR at the time, a verification

methodology was proposed to evaluate the front-end as a stand-alone block without

the CDR. The components were modeled to ensure that the front-end can be measured

and verified independently. Simulink models for the I&D, 4-b ADC, and DeMUX were

created.

The following procedure was used to extract the frequency response of the front-end.

As the system is discrete, straight-forward ac-simulations are not able to generate the

frequency response. To resolve this issue, sinusoidal inputs at various frequencies were

provided. At each frequency, an eye-diagram of the DeMUXed output samples was

generated. The amplitudes of the resulting eye-diagrams were converted to dB values

and plotted versus their corresponding frequencies to obtain the frequency response.

As sinusoidal inputs were utilized for evaluation purposes, relation 3.4 was used to model the I&D topology in Simulink. The variable tcur refers to the current time of the simulation. When tcur is triggered, equation 3.4 generates the result of integrating a sine of frequency w (rad/s) over the range (tcur - 0.5UI) to tcur. The final results of the I&D are sampled and quantized by the 4-b ADCs and sent to the DeMUXes. The DeMUXed output samples are employed to construct an eye-diagram from which the frequency response is obtained. The next section presents the frequency response plots generated by the Simulink behavioural simulation.

\[ \int_{t_{cur}-0.5\mathrm{UI}}^{t_{cur}} \sin(wt)\, dt = \frac{1}{w}\Big( \cos\big(w(t_{cur} - 0.5\mathrm{UI})\big) - \cos(w\, t_{cur}) \Big) \tag{3.4} \]
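The closed form of equation 3.4 and the frequency sweep described above can be sketched in Python. This is not the Simulink model itself: the ADC quantization and DeMUX stages are left out, the function names are hypothetical, and the peak amplitude is taken analytically rather than from an eye-diagram, so the numbers it produces are those of the ideal integrator only.

```python
import math


def iad_sine_output(w, t_cur, ui):
    """Integral of sin(w*t) over [t_cur - 0.5*UI, t_cur], per equation 3.4."""
    return (math.cos(w * (t_cur - 0.5 * ui)) - math.cos(w * t_cur)) / w


def numeric_integral(w, t_cur, ui, steps=20000):
    """Midpoint-rule check of the same integral."""
    a = t_cur - 0.5 * ui
    h = 0.5 * ui / steps
    return sum(math.sin(w * (a + (k + 0.5) * h)) for k in range(steps)) * h


def response_db(f_hz, ui):
    """Peak output amplitude over all t_cur, normalised to the DC gain 0.5*UI."""
    w = 2 * math.pi * f_hz
    peak = 2 * abs(math.sin(w * 0.25 * ui)) / w
    return 20 * math.log10(peak / (0.5 * ui))


ui = 100e-12                      # one UI at 10 Gb/s
w = 2 * math.pi * 3e9
closed = iad_sine_output(w, 1.23e-9, ui)
numeric = numeric_integral(w, 1.23e-9, ui)
print(math.isclose(closed, numeric, rel_tol=1e-6))  # True
```

Sweeping `response_db` over frequency for different values of `ui` gives the ideal counterpart of the eye-amplitude procedure, with the response scaling with the data-rate.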

3.3.3 Behavioural Simulation Results

The Simulink behavioural model outlined in the previous section was simulated for

three data-rates of 2, 5, and 10 Gb/s. The simulated frequency responses are provided

in Fig. 3.14. This figure shows that the bandwidth of the I&D filter scales with the

data-rate. Table 3.1, provided in Section 3.4.2, summarizes the simulated bandwidths

derived from Simulink simulations and circuit simulations.

Figure 3.14: Frequency response of the I&D filter from Simulink simulations. [|H(f)| in dB versus Frequency (GHz) on a logarithmic axis, for fb = 10 Gb/s, 5 Gb/s, and 2 Gb/s.]

3.4 Implementation

Once the I&D scheme was verified by the Simulink behavioural model, circuit schematics and layout were designed in Fujitsu’s 65-nm CMOS design-kit. This section presents the circuit implementation of the I&D scheme, circuit simulation results, and comparisons with the Simulink model.


3.4.1 Circuit Design

The I&D filter is implemented as a clocked Gm-C filter with 4 outputs that are

interleaved in time. A system block diagram of the implementation is presented in

Fig. 3.15. Each output node goes through the four phases of integration (Int), hold

(Hld), dump (Rst), and idle (Idle). For example, when SC0 turns on, the current is

steered into the front-most block, integrated on the corresponding CL, and generates

a differential voltage at Vo1. During SC1 this voltage is held constant to be sampled

by the following ADC. Next, SC2 activates the dump operation and resets Vo1 to

zero. Finally, SC3 defines the idle state which will be replaced by a more important

phase as will be explained in Chapter 4.
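The rotation of the four phases across the interleaved branches can be tabulated with a small Python sketch. The sequence for the first branch (Int, Hld, Rst, Idle over SC0 to SC3) follows the description above; the assumption that each further branch is staggered by exactly one clock slot is an illustration and is not stated explicitly in the text.

```python
PHASES = ("Int", "Hld", "Rst", "Idle")


def branch_phase(branch, slot):
    """Phase of interleaved branch 0..3 during clock slot SC<slot % 4>.

    Branch 0 corresponds to Vo1. The one-slot stagger between branches
    is an assumption made for this sketch.
    """
    return PHASES[(slot - branch) % 4]


for slot in range(4):
    print(f"SC{slot}:", [branch_phase(b, slot) for b in range(4)])
```

Under this assumption exactly one branch integrates in every slot, so the shared input current always has a branch to flow into while the others hold, reset, or idle.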

Figure 3.15: AAF system block diagram. [A Gm stage driven by Vi feeds four branches with load capacitors CL and outputs Vo1-Vo4, clocked by CKMI and CKr; during the 0.5UI slots SC0-SC3 each output cycles through the Int, Hld, Rst, and Idle phases.]

To implement the 4-way time-interleaved I&D filter, two different architectures were

considered as shown in Fig. 3.17. Pulses that drive the 4-way time-interleaved I&D

system, SC0-SC3, are shown in Fig. 3.16. Section 4.3.3 explains the circuitry that

was used in order to generate the desired pulses to operate the 4-way time-interleaved

I&D circuitry. The output capacitances, CLs, model the input capacitances of the

4-b ADCs that follow the I&D system.

Fig. 3.17(a) shows a topology referred to as the source-switching architecture. It consists of a differential input pair with source degeneration. The current steering switches are located at the source of the input devices. In this configuration, the input transistors are included in each interleaved branch, which increases the input capacitance of the overall system. Moreover, any abrupt change in the input is coupled to the output nodes through the gate-to-drain capacitances of the input transistors.

Fig. 3.17(b), on the other hand, shows the cascode-switching architecture, which is also a differential input pair with source degeneration except that the current steering switches are placed at the drain of the input devices. In this configuration, the source-degenerated input differential pair is shared between the four interleaved branches, reducing the input capacitance and the overall area of the system. In addition, changes in the incoming signal are no longer coupled to the output nodes, and hence the output nodes are not disturbed while they are being held constant to be sampled by the ADC.

As the above discussion suggests, the cascode-switching architecture was selected

as the better design for the 4-way time-interleaved I&D system. This is the design

that was used for simulations as presented in the following section.

Figure 3.16: Pulses that drive the 4-way time-interleaved I&D system. [Timing diagram of SC0, SC1, SC2, and SC3.]


Figure 3.17: Architectures considered for the I&D filter design. (a) I&D filter implementation as a source-switching architecture. (b) I&D filter implementation as a cascode-switching architecture. [Both schematics show the input Vi, bias current Ib, switches SC0-SC3, load capacitors CL, reset cells, and outputs Vo1-Vo4.]


3.4.2 Circuit Simulation Results

The design of the AAF was implemented using Fujitsu’s 65-nm CMOS process. The device models do not support fast or slow corners, nor do they model transistors of gate lengths larger than 100 nm. These models, targeting RF frequencies (2.5-60 GHz), offer a close match between the pre-layout and post-layout simulations as they include parasitic capacitances and resistances due to the metal, contacts, and vias. This feature was very helpful as extracted simulations were also not supported. Layout techniques such as using dummy gates at each side of the transistor and maintaining symmetry in the design were employed to reduce mismatches. All simulations were performed on the typical device models at 40 °C.

Fig. 3.18 presents the frequency responses that were obtained from Cadence simulations. The resulting bandwidths are compared against their Simulink-simulated values in Table 3.1. The circuit-simulation bandwidths differ from the Simulink ones by 16.9-19.6 %.

Figure 3.18: Frequency response of I&D filter generated from circuit simulations. [|H(f)| in dB versus Frequency (GHz) on a logarithmic axis, for fb = 10 Gb/s, 5 Gb/s, and 2 Gb/s.]

Table 3.1: Simulated 3-dB bandwidths of I&D filter.

Data-rate       2 Gb/s      5 Gb/s      10 Gb/s
From Simulink   2.076 GHz   5.191 GHz   10.371 GHz
From Cadence    2.427 GHz   6.068 GHz   8.34 GHz
Error           16.9 %      16.9 %      19.6 %
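The error row of Table 3.1 can be reproduced with a short Python calculation. The sketch assumes that each error is the absolute difference between the Cadence and Simulink bandwidths taken relative to the Simulink value, which is the convention that matches the tabulated percentages.

```python
simulink_ghz = {"2 Gb/s": 2.076, "5 Gb/s": 5.191, "10 Gb/s": 10.371}
cadence_ghz = {"2 Gb/s": 2.427, "5 Gb/s": 6.068, "10 Gb/s": 8.34}


def percent_difference(reference, value):
    """Absolute difference relative to the reference, in percent."""
    return abs(value - reference) / reference * 100.0


errors = {
    rate: round(percent_difference(simulink_ghz[rate], cadence_ghz[rate]), 1)
    for rate in simulink_ghz
}
print(errors)  # {'2 Gb/s': 16.9, '5 Gb/s': 16.9, '10 Gb/s': 19.6}
```

Note that the circuit-level bandwidth is higher than the Simulink value at 2 and 5 Gb/s but lower at 10 Gb/s, so the sign of the deviation changes even though its magnitude is similar.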


3.5 Summary

This chapter investigated the design of an AAF that is suitable for 2x blind ADC-

based receivers. Active-RC filters, Gm-C filters, and I&D filters were considered as

possible solutions. I&D proved to be the better architecture as it offers the highest

bandwidth that is easily adjustable with the data-rate. Furthermore, I&D design

methodology, behavioural modeling, implementation, and simulation results were

presented. The next chapter discusses the modifications to the proposed AAF to

complete the AFE design for ADC-based receivers.

4 Analog Front-End Design (AFE)

Current 2x blind ADC-based receivers [4, 5] sample the received signal at twice the

baud rate. If the two samples have opposite signs, they are linearly interpolated

to estimate the input zero-crossings. This interpolation is valid only if the input

signal does not contain frequencies above the baud rate, which is half of the sampling

frequency. Otherwise, aliasing occurs, which leads to erroneous estimates of the

zero-crossings. To date, 2x ADC-based receivers have relied on the channel to perform

anti-aliasing.
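The interpolation step described above can be sketched as follows. This is only an illustration of linear interpolation between two consecutive samples of opposite sign; the function name and the example values are assumptions and are not taken from the receivers in [4, 5].

```python
def interpolate_zero_crossing(t0, v0, t1, v1):
    """Estimate where the straight line through (t0, v0) and (t1, v1) crosses zero.

    Returns None when both samples have the same sign (no crossing to estimate).
    """
    if v0 == 0.0:
        return t0
    if v1 == 0.0:
        return t1
    if (v0 > 0) == (v1 > 0):
        return None
    # Linear interpolation: fraction of the interval at which the line hits zero.
    return t0 + (t1 - t0) * v0 / (v0 - v1)


# Two samples spaced 0.5 UI apart at 10 Gb/s (UI = 100 ps), symmetric about zero.
crossing_ps = interpolate_zero_crossing(0.0, 0.25, 50.0, -0.25)
print(crossing_ps)  # 25.0
```

The estimate is only as good as the straight-line assumption between samples, which is why the input must be band-limited before sampling.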

On the other hand, if the channel bandwidth is less than 60% of the data-rate,

equalization is required [26]. The receivers presented in [4, 5] employ a 2-tap digital

FFE. The receiver in [5] also uses an RC-degenerated differential pair as an analog

equalizer to obtain 6 dB boost at half the data-rate. The digital FFE which is located

after the ADCs has the disadvantage of enhancing the ADC quantization noise.

To address both the anti-aliasing problem and quantization noise enhancement of

digital equalization, we propose a new AFE as shown in Fig. 4.1. The front-end

consists of a combined anti-aliasing filter and a 2-tap FFE (AAF/FFE), four half-

rate time-interleaved ADCs, and a DeMUX. The AFE outputs are further sent to the

CDR for clock and data recovery.

This chapter is organized as follows: Section 4.1 presents the analog front-end

(AFE) architecture. Section 4.2 discusses the design methodology. Section 4.3 de-

scribes the circuit implementation of the AFE and the required clock generation

circuitry. Simulation results of the clock circuitry and the AFE are presented in

Section 4.4. A summary is provided in Section 4.5.

4.1 AFE Architecture

The contribution of our work, highlighted in Fig. 4.1, is an AAF/FFE block placed

at the front of the AFE. The idea is to program the AAF/FFE bandwidth according

to the data-rate. This avoids aliasing which occurs when the channel is unable to


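[Figure 4.1 image: block diagram of the receiver for fb = 2-10 Gb/s. The AFE comprises the AAF/FFE (this work), 4-bit fb/2-GS/s ADCs, and a DeMUX (16 outputs for fb <= 5 Gb/s, 32 for fb > 5 Gb/s) feeding the 2x blind ADC-based CDR (DOUT). A 4-phase Ck Gen (/2) derives SC0-SC3 from the 2-phase fb clock, and the ADC clock (fb/2) is offset by Δ = 0.25UI.]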

Figure 4.1: A 2x blind ADC-based receiver with the proposed AFE.

bandlimit the input. On the other hand, to accommodate channels of higher loss,

FFE coefficients are adjusted to enhance the higher frequency content of the received

signal.

The proposed AAF/FFE architecture is illustrated in Fig. 4.2. Two transconductor

cells (i.e. Gm1 and Gm2) are employed in this architecture to construct the main and

post-cursor taps. The Gm cells convert the input voltage to an output current which

is integrated on CL during two consecutive clock pulses. For example, observing the

node (Vo1) in Fig. 4.2, when SC3 turns on, io2 is integrated on CL creating an output

voltage. Next, when SC0 is activated, io1 is integrated on CL adding to the previous

output voltage with opposite polarity. During SC1, this result is held constant and

sampled by the following ADC. Finally SC2 activates the reset phase. As shown in

Fig. 4.1, the ADC sampling is required to be phase-aligned to the blind AAF/FFE

clock. The ideal phase difference is a quarter of a UI which is explained in Section

4.3.3.

The remaining output nodes undergo the same four phases of post-cursor tap inte-

gration (PI), main tap integration (MI), hold (Hld), and dump (Rst). Since Gm1 is

turned on after Gm2, it adds the result of the integration from the current data to

that of the previous data. Therefore, Gm1 and Gm2 are referred to as the main tap

and the post-cursor tap, respectively. This topology avoids the use of analog delay

cells which are often sensitive to PVT variation and require delay calibration. Various

FFE boosting levels can be achieved by adjusting the transconductance of Gm1 and

Gm2.
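The PI and MI phases on a single output node can be sketched in discrete time as follows. This is a behavioural illustration only: the input is represented by evenly spaced samples, polarity is normalised so that the main tap is positive, and the names and default values (g1, g2, c_load, dt) are assumptions rather than design values from this work.

```python
def aaf_ffe_output(vi, samples_per_half_ui, start, g1=1.0, g2=0.25, c_load=1.0, dt=1.0):
    """Voltage held on one output node after the PI and MI integration windows.

    vi                  -- list of input samples
    samples_per_half_ui -- number of samples in one 0.5 UI integration window
    start               -- index where the post-cursor integration (PI) begins
    """
    n = samples_per_half_ui
    pi_window = vi[start:start + n]          # earlier half UI, weighted by Gm2
    mi_window = vi[start + n:start + 2 * n]  # following half UI, weighted by Gm1
    charge = g1 * sum(mi_window) * dt - g2 * sum(pi_window) * dt
    return charge / c_load                   # value held during Hld, cleared in Rst


# A constant input: the post-cursor tap removes a fraction g2/g1 of the result.
held = aaf_ffe_output([1.0] * 8, samples_per_half_ui=4, start=0)
print(held)  # 3.0
```

Setting g2 to zero reduces the sketch to a plain integrate-and-dump filter, which corresponds to the AAF with the FFE turned off.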


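[Figure 4.2 image: the input Vi drives Gm1 (main tap) and Gm2 (post-cursor tap); their output currents io1 and io2 are switched by SC0-SC3 (clocks CKMI, CKPI, CKr) onto the load capacitors CL of four output nodes Vo1-Vo4. A timing inset shows each node cycling through the PI, MI, Hld, and Rst phases of 0.5UI each, offset by one phase from node to node.]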

Figure 4.2: AAF/FFE system block diagram.

4.2 AFE Design Methodology and Modeling

This section explains the methodology that was followed to design the AFE. The

linearized integration and dump (I&D) model introduced in Section 3.3 is expanded

to allow for equalization in addition to anti-aliasing. The behavioural model along

with the simulation results in this section confirm the operation of the AFE.

4.2.1 Design Methodology

An LTI model of the AFE was constructed with Matlab’s Simulink tool to analyze

the proposed architecture. The LTI model reveals information about the poles and

zeros of the AFE system. Once verified in Matlab simulations, the design was ported

into transistor-level Cadence implementations. To speed up the initial stages of the

design, programmable Verilog-A functional descriptions [38] were used to generate the

clocking required to operate the AFE system. However, the actual clock generation

circuitry replaced the Verilog-A models once the design was finalized.


4.2.2 Behavioural Modeling

As mentioned in Section 3.1, the I&D can be modeled as an LTI system followed by

sampling. The impulse response of such a system is a rectangular filter of pulse-width

equal to the integration duration. The integration time is 0.5UI since the received

signal is sampled at twice the baud rate.

Fig. 4.3 presents an LTI model of the AFE architecture that was shown in Fig.

4.2. The main tap is modeled as a gain stage (G1) cascaded with a rectangular filter

(hID(t)) of magnitude one and pulse width 0.5UI. The post-cursor tap is modeled

in the same way except with a different gain stage (G2); it also includes a delay cell

(δ(t−0.5UI)) that delays the incoming signal by 0.5UI. The result of the post-cursor

tap is subtracted from the main tap to generate Vo(t). Vo(t) is further sampled to

construct the output samples of the 4-way time-interleaved architecture presented in

Fig. 4.2.

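[Figure 4.3 image: Vi(t) feeds the main tap (gain G1 followed by hID(t)) and the post-cursor tap (delay δ(t - 0.5UI), gain G2, then hID(t)); the post-cursor result is subtracted from the main-tap result to form Vo(t). Vo(t) is sampled at Ts = Tini + (2n)UI, Tini + (2n + 0.5)UI, Tini + (2n + 1)UI, and Tini + (2n + 1.5)UI to produce Vo1[n], Vo2[n], Vo3[n], and Vo4[n], respectively, for n = 0, 1, 2, ...]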

Figure 4.3: AFE behavioural modeling.

4.2.3 Behavioural Simulation Results

The transfer function of the AFE, H(f), can be written as in equations 4.1 and 4.2.

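$$H_{ID}(f) = \frac{1}{2f_b} \times \mathrm{sinc}\left(\frac{f}{2f_b}\right) \times e^{-j\pi f/(2f_b)} \qquad (4.1)$$

$$H(f) = \frac{V_o(f)}{V_i(f)} = H_{ID}(f)\left(G_1 - G_2 \times e^{-j\pi f/f_b}\right) \qquad (4.2)$$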


Fig. 4.4 plots the zero-pole map of H(f). Zeros on the imaginary axis are

introduced by the sinc function which is the result of the I&D scheme. FFE, on

the other hand, introduces a new zero on the real axis which equals 2fbln(α), where

α is the ratio of G2 over G1. As α increases from its minimum value of zero to a

maximum of 1, this zero moves towards the origin, thereby de-emphasizing the

lower frequencies to a greater extent.
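The effect of α on the ideal response can be explored numerically with equations 4.1 and 4.2. The sketch below is an assumption-laden illustration: it normalises G1 to 1, uses the normalised sinc, sin(πx)/(πx), and defines "boost" as the gain at fb/2 relative to DC, which is a convenience for this example rather than a definition taken from the measurements. The values it produces describe the ideal LTI model only, not the fabricated circuit.

```python
import cmath
import math


def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def afe_response(f, fb, g1, g2):
    """H(f) of the ideal AFE model, following equations 4.1 and 4.2."""
    h_id = (1.0 / (2.0 * fb)) * sinc(f / (2.0 * fb)) * cmath.exp(-1j * math.pi * f / (2.0 * fb))
    return h_id * (g1 - g2 * cmath.exp(-1j * math.pi * f / fb))


def boost_db(fb, alpha):
    """Gain at fb/2 relative to DC, in dB, for G1 = 1 and G2 = alpha (0 <= alpha < 1)."""
    dc = abs(afe_response(0.0, fb, 1.0, alpha))
    nyquist = abs(afe_response(fb / 2.0, fb, 1.0, alpha))
    return 20.0 * math.log10(nyquist / dc)


def ffe_zero(fb, alpha):
    """Real-axis zero introduced by the FFE, 2*fb*ln(alpha), for 0 < alpha <= 1."""
    return 2.0 * fb * math.log(alpha)


for alpha in (0.0, 0.2, 0.5):
    print(f"alpha = {alpha}: boost = {boost_db(10e9, alpha):.2f} dB")
```

With α = 0 the relative gain at fb/2 is slightly negative because only the sinc roll-off remains; increasing α raises it, and the zero returned by ffe_zero moves from large negative values towards the origin.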

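[Figure 4.4 image: zero-pole map in the σ-ω plane, with imaginary-axis zeros at ±2fb and ±4fb and a real-axis zero at 2fb(ln α) that moves as α varies between 0 and 1.]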

Figure 4.4: Zero-pole map of the proposed AFE.

4.3 AFE Implementation

This section presents the circuit implementations of various components of the AFE

including the AAF/FFE, the reset cell, and the clock generation circuitry. The design

was implemented in Fujitsu’s 65-nm CMOS process and operates from a 1.2-V power

supply. The simulation results are presented in Section 4.4.

4.3.1 Cascode-Switching Architecture

As explained in Section 3.4, the AAF is implemented as a clocked Gm-C filter. Pros

and cons of the two implementations, namely source-switching and cascode-switching,

were also discussed there. The latter was selected for the final design as it eliminates

the coupling from input to output while offering smaller area.

Fig. 4.5 is an expansion of Fig. 3.17(b) to incorporate the analog FFE into the

front-end architecture. A second transconductor (Gm2), implemented as a differential


input pair with source degeneration, is added to the original design with opposite

polarity. This new differential pair is responsible for subtracting a fraction of the

previous integration result from that of the current one. This fraction, which

controls the amount of equalization, is set through Ib1 and Ib2,

which bias the main-tap and post-cursor-tap transconductors, respectively.

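[Figure 4.5 image: cascode-switching schematic. The main-tap and post-cursor-tap differential pairs, biased by Ib1 and Ib2, share the input Vi; cascode switches clocked by SC0-SC3 steer their currents onto the load capacitors CL of the outputs Vo1-Vo4, each of which has a reset cell.]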

Figure 4.5: Cascode-switching implementation.

4.3.2 Reset Cell Architecture

The reset operation is an important part of the proposed AFE since both the anti-

aliasing and feed-forward equalization depend on the ability to reset the output to

the desired value. This sets an upper bound on the maximum speed. During the

reset phase which lasts 0.5UI, the output terminals are pulled up to Vdd to discharge

the output capacitances (CLs). The proposed reset cell is presented in Fig. 4.6 which

employs three PMOS transistors (M1-M3) to charge the output nodes to Vdd. The

transistors are sized such that, at the maximum speed (10 Gb/s), the RC

time-constant allows the output nodes to reach within 1% of Vdd. This time-constant

is formed by the PMOS on-resistance and the total output capacitance. M3 equates

both terminals of the output differential voltage (Vo1) during the reset phase such that

the following integration phase starts with identical voltage levels on both terminals.
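The settling requirement gives a rough upper bound on the reset time-constant. The sketch below assumes a single-pole RC response and interprets the 1% target as the residual fraction of the initial deviation from Vdd; both are simplifications made for illustration, and the function name is hypothetical.

```python
import math


def max_reset_time_constant(data_rate_bps, reset_fraction_ui=0.5, residual=0.01):
    """Largest RC time-constant (s) that settles to `residual` within the reset phase.

    Assumes first-order settling: error(t) = exp(-t / tau).
    """
    ui = 1.0 / data_rate_bps
    reset_time = reset_fraction_ui * ui
    return reset_time / math.log(1.0 / residual)


tau = max_reset_time_constant(10e9)
print(f"tau <= {tau * 1e12:.2f} ps")
```

At 10 Gb/s the reset phase lasts 50 ps, so under these assumptions the time-constant must be roughly a factor of ln(100), or about 4.6, shorter than that.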

To further increase the precision of the reset operation, effects of charge injection


from transistors (M1-M3) need to be suppressed when they turn off. Otherwise, they

introduce differential voltage errors comparable to the LSB of the ADC, corrupting the

integration results. Dummy transistors (M4 & M5) are responsible for canceling both

the charge injection from the channel and the clock feed-through errors [29]. When

M1-M3 turn off, half of the channel charge of M3 plus half of the channel charge of

M1 or M2 is injected into M4 or M5, respectively. Therefore, M4 and M5 are designed

to have the same size as M1-M3 to more effectively suppress charge injection effects.

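[Figure 4.6 image: reset cell for the differential output Vo1. PMOS transistors M1 and M2 pull the two output terminals (each loaded by CL) to Vdd, M3 connects the two terminals, and M4 and M5 are dummy transistors; the cell is clocked by SC2 and its complement.]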

Figure 4.6: Reset cell implementation.

4.3.3 Clock Generation Design

To characterize the AFE system proposed for 2-10 Gb/s, we designed clock generation

circuitry that produces 4 half-rate pulses (SC0-SC3) with 25% duty cycle. These

pulses drive the AFE by defining its four phases of operation: PI, MI, Hld, and

Rst (described in Section 4.1). As Fig. 4.7 suggests, these pulses can be produced

by performing the necessary logic on the 4-phase (0°, 90°, 180°, 270°) half-rate (fb/2)

waveforms. As a result, the design was divided into two separate stages: the first stage

(half-rate CMOS clock generator) generates 4-phase half-rate output waveforms from

the external 2-phase full-rate inputs; the second stage (CMOS logic) takes the 4-phase

half-rate waveforms, performs the necessary logic operations on them, and produces

the desired half-rate pulses.
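One way to obtain four non-overlapping 25% pulses from four 50% duty-cycle phases is sketched below. The gate pairing used here (each phase ANDed with the complement of the next phase) is an assumption chosen for illustration; the actual circuit uses NAND/NOR gates and buffers, and its exact phase-to-pulse mapping is not reproduced here. Time is discretised into 8 slots per half-rate clock period.

```python
SLOTS_PER_PERIOD = 8  # one half-rate clock period (2 UI) split into 8 slots


def phase_clock(slot, phase_deg):
    """Ideal 50% duty-cycle clock that goes high at `phase_deg` of the period."""
    shift = phase_deg * SLOTS_PER_PERIOD // 360
    return ((slot - shift) % SLOTS_PER_PERIOD) < SLOTS_PER_PERIOD // 2


def quarter_pulses(slot):
    """Four 25% pulses: pulse k is CK(90k) AND NOT CK(90(k + 1))."""
    return [
        phase_clock(slot, 90 * k) and not phase_clock(slot, (90 * (k + 1)) % 360)
        for k in range(4)
    ]


for slot in range(SLOTS_PER_PERIOD):
    print(slot, ["1" if p else "0" for p in quarter_pulses(slot)])
```

Each pulse is high for a quarter of the half-rate period, i.e. 0.5UI, and exactly one pulse is high at any time, which is the property the four AFE phases rely on.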

The goal of this design was to drive the AFE with pulses that have rise and fall

times (tr/tf) equal to 10 %UI, with the positive and negative pulses crossing at Vdd/2.

With the AFE design finalized, the load of the ‘CMOS logic’ stage was known. As

shown in Fig. 4.8, this load is driven by a set of NAND/NOR gates followed by two

buffers which are sized accordingly to output inverted/non-inverted pulses satisfying


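[Figure 4.7 image: the 2-phase full-rate inputs (Fb at 0° and 180°) enter the half-rate CMOS clock generator, which outputs 4-phase half-rate clocks (Fb/2 at 0°, 90°, 180°, and 270°); the CMOS logic stage converts these into the four pulses SC0-SC3.]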

Figure 4.7: Clock generation block diagram.

the specification. Inverted pulses are required since the reset cell consists of PMOS

transistors, as described in the previous section. The duty-cycles of the CMOS pulses are

corrected by the cross-coupled inverters.

[Figure 4.8 image: CMOS logic stage. The four half-rate phases (Fb/2 at 0°, 90°, 180°, and 270°) drive NAND/NOR gates followed by Inverter Chain 2, producing SC0-SC3 and their complements.]

Figure 4.8: CMOS logic implementation.

The 4-phase half-rate inputs to the ‘CMOS logic’ stage are generated by the ‘half-

rate CMOS clock generator’ block as shown in Fig. 4.9. This block divides the fre-

quency of the external full-rate differential clock by 2 via a current mode logic (CML)

divider which is described later. The half-rate CML clocks are further converted to

CMOS levels by CML-to-CMOS converters for area reduction purposes. Smaller tran-

sistors can be used with CMOS signaling to achieve the same speed as with the CML

signaling. The reason is that speed is proportional to the product of the transistor

width and the gate signal swing. Therefore, with higher clock swing, we can afford

to use smaller transistors without limiting the speed.


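[Figure 4.9 image: half-rate CMOS clock generator. The full-rate differential clock (Fb at 0° and 180°) drives the CML divider (/2), followed by CML-to-CMOS converters and Inverter Chain 1, which output Fb/2 at 0°, 90°, 180°, and 270°.]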

Figure 4.9: Half-rate CMOS clock generator implementation.

A chain of buffers and back-to-back inverters (see Fig. 4.9) boosts the driving capability

and corrects the duty-cycles of the CML-to-CMOS converter outputs [39].

The fan-out utilized to design the buffers is in the range of 1-2 to maximize per-

formance. The simulation results which are presented in Section 4.4.1 illustrate the

overall performance of the design with reduced duty-cycle distortion (DCD).

The CML divider as shown in Fig. 4.10 consists of a CML flip-flop connected in

negative feedback. In-phase and quadrature-phase outputs of both latches in the

flip-flop are buffered to drive the same load and reduce clock skews. Buffers also

help clean the latch outputs without using large devices inside the latch. For speed

considerations, when fully switched, the transistors are biased as close to 0.3 mA/μm

as allowed by the voltage head-room to keep them in saturation. This corresponds to

the peak-fT current density of nMOSFETs [40]. The combination of the CML flip-flop

and buffers provides the above-mentioned CML-to-CMOS converters with 500-mVpp

single-ended voltage swings.

As a final note to this section, we will explain why the ADC clock needs to be

phase-aligned to the AAF/FFE clock. The ideal value for this phase difference is

0.25UI as shown in Fig. 4.12. The nodes shown in this timing diagram correspond

to Fig. 4.11 which is repeated here for convenience. A quarter of a UI phase-shift

between the 2-phase full-rate (fb) inputs to the AAF/FFE and ADC ensures that the

edges of the ADC sampling clock ((fb/2)Δ) fall at the mid-points of the hold phases

for accurate sampling. The next section verifies our design choices by showing the

corresponding simulated waveforms.


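[Figure 4.10 image: two latches (L) with data (d), clock (ck), and output (q) terminals, connected as a divide-by-2 flip-flop.]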

Figure 4.10: CML divider implementation.

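[Figure 4.11 image: repeat of the Fig. 4.1 block diagram, showing the AAF/FFE (this work), the 4-bit fb/2-GS/s ADCs, the DeMUX (16 outputs for fb <= 5 Gb/s, 32 for fb > 5 Gb/s), the 2x blind ADC-based CDR (DOUT), and the 4-phase Ck Gen (/2) with the ADC clock (fb/2) offset by Δ = 0.25UI, for fb = 2-10 Gb/s.]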

Figure 4.11: A 2x blind ADC-based receiver with the proposed AFE.


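[Figure 4.12 image: timing diagram showing the data bits D1-D4 (one UI each), the full-rate clock fb, the 0.5UI-wide pulses SC3, SC2, SC0, and SC1, the half-rate clock fb/2, and the ADC clock (fb/2)Δ shifted by 0.25UI.]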

Figure 4.12: Timing diagram.


4.4 Circuit Simulation Results

This section presents the simulation results of the complete AFE. Section 4.4.1 is

dedicated to the time-domain simulations of the clock generation circuitry. In Section

4.4.2, we verify the functionality of the complete AFE by looking at both the time-

domain output waveforms and the resulting frequency responses.

4.4.1 Clock Generation

Figures 4.14, 4.15, and 4.16 show the simulated waveforms in the clock generation

circuitry at data-rates of 10, 5, and 2 Gb/s respectively. These waveforms, which

correspond to the nodes shown in Fig. 4.13, refer to the external input clocks (A/A)

and the outputs of the CML divider (B/B), the CML-to-CMOS converter (C/C), the

Inverter Chain 1 (D/D), the NAND/NOR stage (E/E), and finally the outputs of

the CMOS logic stage (F/F). The waveforms (F/F), which are the final outputs

of the clock generation circuitry, drive the AFE circuitry.

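[Figure 4.13 image: clock generation chain with the simulated nodes marked: A/A at the input, B/B after the CML divider (/2), C/C after the CML-to-CMOS converter, D/D after Inverter Chain 1, E/E after the NAND/NOR stage, and F/F after Inverter Chain 2 (the last two forming the CMOS logic).]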

Figure 4.13: Simulated nodes in clock generation circuitry.

Table 4.1 summarizes the results for tr, tf , and pulse-width (PW) of the F/F

waveforms for data-rates of 10, 5, and 2 Gb/s. The cross voltage, which refers to the

voltage where the negative and positive pulses intersect, is also shown in this table.

We had to over-design for lower data-rates since we wanted to ensure functionality at

10 Gb/s. At 10 Gb/s, tr, tf, and PW are 8.69, 8.61, and 50.36 ps, respectively.

The results show that we meet our specification as the pulses have rise/fall times of

less than 10 %UI and a duty cycle of 25.15 %.
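The entries of Table 4.1 can be normalised to the unit interval to check them against the specification. The sketch below assumes that the duty cycle is the pulse width divided by the half-rate clock period (2 UI); the dictionary and function names are illustrative.

```python
# Table 4.1 entries: data-rate (Gb/s) -> (tr in ps, tf in ps, pulse width in ps).
table_4_1 = {
    2: (8.81, 8.71, 250.8),
    5: (8.82, 8.7, 100.8),
    10: (8.69, 8.61, 50.36),
}


def clock_metrics(rate_gbps, tr_ps, tf_ps, pw_ps):
    """Express rise/fall times in %UI and the pulse width as a duty cycle in %."""
    ui_ps = 1000.0 / rate_gbps   # one unit interval
    period_ps = 2.0 * ui_ps      # the pulses repeat at the half-rate clock period
    return {
        "tr_pct_ui": 100.0 * tr_ps / ui_ps,
        "tf_pct_ui": 100.0 * tf_ps / ui_ps,
        "duty_pct": 100.0 * pw_ps / period_ps,
    }


metrics = {rate: clock_metrics(rate, *values) for rate, values in table_4_1.items()}
for rate, m in metrics.items():
    print(f"{rate} Gb/s: tr = {m['tr_pct_ui']:.2f} %UI, "
          f"tf = {m['tf_pct_ui']:.2f} %UI, duty = {m['duty_pct']:.2f} %")
```

Because the edge rates are nearly constant in picoseconds, they occupy a much smaller fraction of the UI at 2 and 5 Gb/s, which reflects the over-design at the lower data-rates.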


Table 4.1: Simulated results of the clock generation circuitry.

Data-rate    2 Gb/s    5 Gb/s    10 Gb/s
tr (ps)      8.81      8.82      8.69
tf (ps)      8.71      8.7       8.61
PW (ps)      250.8     100.8     50.36
Cross (V)    0.64      0.66      0.65

[Figure 4.14 plot: six stacked panels of voltage (V, 0 to 1.5) versus time (ps, 100 to 500) for nodes A/A, B/B, C/C, D/D, E/E, and F/F.]

Figure 4.14: Simulated clock generation waveforms at 10 Gb/s.


[Figure 4.15 plot: six stacked panels of voltage (V, 0 to 1.5) versus time (ps, 100 to 1000) for nodes A/A, B/B, C/C, D/D, E/E, and F/F.]

Figure 4.15: Simulated clock generation waveforms at 5 Gb/s.



[Figure 4.16 plot: six stacked panels of voltage (V, 0 to 1.5) versus time (ps, 100 to 2500) for nodes A/A, B/B, C/C, D/D, E/E, and F/F.]

Figure 4.16: Simulated clock generation waveforms at 2 Gb/s.



4.4.2 Analog Front-End (AFE)

Once the circuit verification of the clock generation was complete, we replaced the

Verilog-A models in the AFE design with the actual circuitry. The complete system was

simulated at data-rates of 10, 5, and 2 Gb/s and the frequency responses were derived

as shown in Fig. 4.18, Fig. 4.19, and Fig. 4.20 respectively.

At each data-rate, three sets of Ib1 and Ib2 were used to obtain 3 boosting levels of

about 0, 5, and 11 dB. Ib1 and Ib2, which control the bias currents of the main tap and

the post-cursor tap, adjust the coefficient of the FFE. At each of these bias settings,

7 different input frequencies were applied to the AAF/FFE. Fig. 4.17 presents an

example of a differential input to the AAF/FFE along with the resulting differential

output waveform at each data-rate.

The outputs of the AAF/FFE were further processed by the Simulink models of

4 time-interleaved 4-b ADCs and DeMUXes. The DeMUX under-sampled the 10,

5, and 2 Gb/s outputs by factors of 32, 16, and 16, respectively. The reason for the higher

DeMUX level at 10 Gb/s was to ensure testability with the available measurement

equipment. The digital DeMUXed samples taken from one DeMUX output were

used to construct an output eye. The amplitude of this eye was converted to dB

and plotted versus the corresponding frequency to generate the frequency responses

shown in Fig. 4.18, Fig. 4.19, and Fig. 4.20.
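The conversion from eye amplitudes to a frequency response can be sketched as below. The normalisation to the largest amplitude (so that the peak sits at 0 dB) and the example amplitudes are assumptions made for illustration; the reference level actually used for the plots is not stated here.

```python
import math


def frequency_response_db(eye_amplitudes):
    """Map {frequency: eye amplitude} to {frequency: gain in dB relative to the peak}."""
    reference = max(eye_amplitudes.values())
    return {
        freq: 20.0 * math.log10(amplitude / reference)
        for freq, amplitude in eye_amplitudes.items()
    }


def boost_from_response(response_db):
    """Peak gain minus the gain at the lowest measured frequency, in dB."""
    lowest_freq = min(response_db)
    return max(response_db.values()) - response_db[lowest_freq]


# Hypothetical amplitudes (in ADC LSBs) at two input tones, in GHz.
example = frequency_response_db({1.0: 4.0, 5.0: 8.0})
print(example, boost_from_response(example))
```

In this made-up example the amplitude doubles between the two tones, which corresponds to about 6 dB of boost.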

Table 4.2 summarizes the simulation results. By adjusting the bias currents (Ib1&Ib2)

as shown in the table, we can achieve 0-11.3 dB of boost at 10 Gb/s, 0-14 dB of boost

at 5 Gb/s, and 0-12.7 dB of boost at 2 Gb/s. When the FFE is off (i.e. Ib2 = 0),

the AAF has a 3-dB bandwidth of 2.427, 6.068, and 8.34 GHz at 2, 5, and 10 Gb/s

respectively. Simulations show that the frequency response scales with the data-rate

as expected. In the next chapter, we compare these simulation results against the

measurements.

Table 4.2: Simulated results of AFE.

Data-rate        2 Gb/s                5 Gb/s                10 Gb/s
Ib1 (μA)         404    317    247     724    436    272     1300   600    320
Ib2 (μA)         0      152    211     0      161    204     0      170    210
3-dB BW (GHz)    2.427                 6.068                 8.34
Boosting (dB)    0      5.4    12.7    0      4.4    14      0      5.4    11.3


[Figure 4.17 plots: differential input and output voltage (V, -0.5 to 0.5) versus time in three panels.]

(a) Data-rate = 10 Gb/s. Input frequency = 1 GHz.

(b) Data-rate = 5 Gb/s. Input frequency = 500 MHz.

(c) Data-rate = 2 Gb/s. Input frequency = 200 MHz.

Figure 4.17: Simulated time-domain waveforms of the AFE.


[Figure 4.18 plot: frequency response (dB, -25 to 0) versus frequency (GHz, log scale, 1 to 10) for Ib1/Ib2 = 1.3/0 mA, 600/170 μA, and 320/210 μA.]

Figure 4.18: Simulated frequency response at 10 Gb/s.

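[Figure 4.19 plot: frequency response (dB, -25 to 0) versus frequency (GHz, log scale, 1 to 10) for Ib1/Ib2 = 724/0 μA, 436/131 μA, and 272/204 μA.]

Figure 4.19: Simulated frequency response at 5 Gb/s.

[Figure 4.20 plot: frequency response (dB, -25 to 0) versus frequency (GHz, log scale, 0.1 to 1) for Ib1/Ib2 = 404/0 μA, 317/152 μA, and 247/211 μA.]

Figure 4.20: Simulated frequency response at 2 Gb/s.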


4.5 Summary

This chapter presented the architecture proposed for a combined AAF and FFE

targeting 2x blind ADC-based receivers. The design methodology was explained and

the final simulation results were presented. The plots of the simulated frequency

responses at data-rates of 10, 5, and 2 Gb/s confirmed the functionality of the

AFE. In the following chapter, we present the measurement results while making a

comparison with the simulations.

5 Experimental Results

This chapter presents the experimental results of the analog front-end (AFE) fabri-

cated in Fujitsu’s 7-metal 65-nm CMOS technology. Section 5.1 presents the circuit

layout along with its pin-list and the measurement setup used to perform the verifica-

tion tests. The detailed description of the test channels for measurements is provided

in Section 5.2. The measurement results verifying the anti-aliasing filter (AAF) and

the feed-forward equalizer (FFE) performance are discussed in Section 5.3. Finally,

Section 5.4 concludes this chapter.

5.1 Circuit Layout and Equipment Setup

A micrograph of the test-chip along with its pin names is shown in Fig. 5.1. The

test-chip consists of the AFE, the clock generation, the four 4-bit ADCs, and the

DeMUX. The AFE and the clock generation, the contributions of this work, occupy

152×86 μm2 and 243×140 μm2, respectively. All the measurements on this test-chip

were performed with on-die probing.

Fig. 5.2 shows the measurement setup that was used to verify the AFE functionality.

The measurement equipment is listed below:

• Probe Station: Cascade Microtech Summit 9000

• Sig Gen (1): Agilent E8257D PSG analog signal generator (250 kHz - 67 GHz)

• Sig Gen (2): HP 83620B synthesized sweeper (10 MHz - 20 GHz)

• Sig Gen (3): HP 83650B swept signal generator (10 MHz - 50 GHz)

• Sig Gen (4): Rohde&Schwarz SMT 03 signal generator (5 kHz - 3 GHz)

• Centellax OTB3P1A 10-Gb/s PRBS generator

• Sony/Tektronix DG2020A data generator

• Agilent Infiniium DCA-J 86100C digital communication analyzer

• HP 8565E spectrum analyzer (30 Hz - 50 GHz)

• Tektronix TLA 714 logic analyzer

• Agilent E3631A/E3620A dual output power supplies (×8)

• Narda 4346 180° Hybrid (2-18 GHz) (×3)

• Mini-Circuits ZX86-12G-S+ Bias-T (×4)

• Picosecond 5828A Ultra-Broadband amplifier (10 dB gain/ 14 GHz BW)

[Figure 5.1 image: die micrograph (1900 μm × 1900 μm) with labelled pads, showing the Freq. Div. & PI, the AAF/FFE (152×86 μm2), the Ck Gen (243×140 μm2), the 4 ADCs (4-b), and the DeMUX (4:16).]

Figure 5.1: AFE micrograph.

Table 5.1 provides a description for each pin. Two separate clocks, CLKA and

CLKB, are required for measurements. The differential CLKA/CLKAX signals drive

the AFE while the differential CLKB/CLKBX operate the 4 time-interleaved ADCs.

Both clocks have the same frequency but their phase difference is adjusted manually

to a quarter of a UI. The reason, which was explained in Section 4.3.3, is to align


[Figure 5.2 image: measurement setup connecting signal generators (1)-(4), the PRBS generator, the amplifier, the attenuator, the backplane, the 180° hybrids, the bias-Ts, the DC power supplies, the data generator, and the logic analyzer to the probe-card and the DUT pins (RXIP/RXIN, CLKA/CLKAX, CLKB/CLKBX, CLKIN, DOUT[7:0], CLKOUT, and the supply and control pins).]

Figure 5.2: Measurement setup.


the edge of ADC clock to the mid-point of the hold phase for accurate sampling.

This is achieved by the phase-change capability of the two signal generators. The

input signal (RXIP/RXIN) and the clock signals (CLKA/CLKAX, CLKB/CLKBX)

are provided differentially to the chip with the help of 180° hybrids.

The pins, IBIASM and IBIASP, control the bias current through the main tap and

post-cursor tap of the AFE. The ADCs can be tuned by REFH and REFL. Power to

the AFE, the clock generation, the ADCs, the logic, and the I/Os is supplied by

AVDF, AVD, VDN, VDD3, and VDDO, respectively.

Table 5.1: Description of the pin-list.

Pin name       Description
RXIP/RXIN      Input differential signal
CLKA/CLKAX     AFE differential clock
CLKB/CLKBX     ADC differential clock
CLKIN          Register clock
VCMD           Input data common-mode level
VCMCA          AFE clock common-mode level
VCMCB          ADC clock common-mode level
VBIASF         Bias voltage for AFE clock
VBIASA         Bias voltage for ADC clock
IBIASM         Bias current for main-tap
IBIASP         Bias current for post-cursor tap
REFH           ADC reference voltage (high)
REFL           ADC reference voltage (low)
ADREN          Enable address line in test register
DATAEN         Enable data line in test register
DIN [5:0]      Input to test register
DOUT [7:0]     Parallel output
CLKOUT         Parallel synchronous clock
RSTX           Reset (active low)
PRDN           Power down
AVDF/AVSF      AFE power supply
VDN/VSN        ADC power supply
AVD/AVS        Clock power supply (ADC+AFE)
VDD3/VSS3      Logic power supply
VDDO/VSSO      I/O power supply


5.2 Channel Measurements

Three test channels were used to evaluate the FFE performance. The channels, listed

in Table 5.2, are chosen such that they provide 11-13 dB of loss at the Nyquist

frequency (half the data-rate). All the channels include 76” of SMA cable and an

FR4 backplane trace. The FR4 backplane, defined as n”-m”-n”, refers to 2 daughter

cards of length n” and a motherboard trace of length m”.

Table 5.2: Description of the test channels.

Name     Description                              fb         Measured Loss at fb/2
Ch. 1    5”-4”-5” backplane + 76” SMA cables      10 Gb/s    13.3 dB
Ch. 2    5”-24”-5” backplane + 76” SMA cables     5 Gb/s     13 dB
Ch. 3    5”-48”-5” backplane + 76” SMA cables     2 Gb/s     11.7 dB

S-parameters of the three channels (ch1, ch2, and ch3) are measured using a vector

network analyzer (VNA) and the results are plotted in Fig. 5.3. These plots also

include the effects of a 10-dB broadband amplifier with a 6-dB attenuator that are

placed in the data-path in order to cover the entire range of the ADC.


[Figure 5.3 plots: S21 in dB (-70 to 10) versus frequency (GHz, 1 to 9) in three panels.]

(a) Ch1 S21.

(b) Ch2 S21.

(c) Ch3 S21.

Figure 5.3: S21 plots.


5.3 AFE Performance

We have evaluated the AFE performance as a stand-alone block without the CDR.

We have measured the AFE frequency response which is discussed in Section 5.3.1.

This frequency response shows that the AAF bandwidth scales with the data-rate and

also that the FFE can be configured to obtain the desired boosting levels. Section 5.3.2

explores the equalization capability by trying to adjust the FFE coefficient to open

the output eye which is closed otherwise. Finally, Section 5.3.3 shows the significance

of the AAF for cases when the channel bandwidth is not enough to band-limit the

input data.

The digital eye-diagrams, used in the measurements, correspond to the ADC sam-

ples taken at one DeMUX output. This data is imported to Matlab where it is

rearranged in time to construct the eye. The frequency of the receiver sampling

clock is set to have an offset with respect to the input data-rate so that the eye can

be scanned in the same way as the sampling head of the digital oscilloscope. This

frequency offset is chosen such that the entire eye is swept with 0.64%UI resolution.
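The eye-scanning idea can be illustrated with a short calculation. In the sketch below, the number of unit intervals between retained samples (128) and the nominal receiver rate (10 Gb/s) are hypothetical values chosen for illustration and are not taken from the measurement setup; only the offset data-rate of 10.0005 Gb/s appears in this chapter.

```python
def eye_scan_step_ui(f_data, f_rx, ui_between_samples):
    """Fraction of a UI by which consecutive retained samples slide across the eye.

    f_data             -- input data-rate (b/s)
    f_rx               -- nominal rate that the receiver sampling clock is locked to (b/s)
    ui_between_samples -- receiver UIs between consecutive retained samples (hypothetical)
    """
    drift = ui_between_samples * (f_data / f_rx - 1.0)
    return drift % 1.0


def fold_into_eye(num_samples, step_ui):
    """Position of each retained sample within one UI after folding back in time."""
    return [(k * step_ui) % 1.0 for k in range(num_samples)]


step = eye_scan_step_ui(10.0005e9, 10e9, 128)
positions = fold_into_eye(200, step)
print(f"step = {step * 100:.2f} %UI, samples needed for one UI = {round(1 / step)}")
```

With zero frequency offset the step would be zero and every folded sample would land at the same position, so the eye could not be scanned; a small offset spreads the samples uniformly across the UI.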

5.3.1 Frequency Response of AFE

The frequency response of the AFE (at 10, 5, and 2 Gb/s) was measured and compared

against the simulations. At each data-rate, 3 sets of (Ib1, Ib2) were chosen based on

the simulations to obtain 0, 5, and 11 dB of boost at high-frequency. As explained in

Chapter 4, Ib1 and Ib2 control the bias currents of the main tap and the post-cursor

tap of the FFE. At each of these bias settings, 7 input tones were applied to the AFE,

the digital output eye was constructed, and the amplitude of the eye was converted to

dB value. Finally, the frequency responses were generated by plotting the eye-heights

in dB values versus their corresponding frequencies.

The 7 input tones are chosen such that their periods are close (but not equal) to

an integer multiple of the receiver sampling clock (Ckrx/N). If the period is exactly

an integer multiple of the receiver sampling clock, the output samples will coincide

when folded back in time. Therefore, they will not be able to scan the eye and the

output eye cannot be constructed. To prevent this case, the input tone needs to have

a frequency offset with Ckrx/N which is chosen small enough to scan the output eye

with 0.64 %UI resolution.

In the examples that follow, the simulations were performed by importing the output

samples of the AFE circuit schematic to Matlab’s Simulink. These samples were

processed by the Simulink models of 4 time-interleaved 4-b ADCs and the DeMUXes.

Figures 5.4, 5.5, and 5.6 show the measured versus simulated frequency responses for

data-rates of 10, 5, and 2 Gb/s. Similar bias currents have been used both in mea-

surements and simulations. The reason for using slightly different input frequencies

in simulations was to shorten the simulation time, that is, the time needed to scan the entire eye.

This was achieved by reducing the DeMUX level in Simulink and therefore allowing

for higher time resolution.

[Figure 5.4 plot: measured and simulated frequency response (dB, -25 to 0) versus frequency (GHz, log scale, 1 to 10) for (Ib1, Ib2) = (1.3 mA, 0 mA), (600 μA, 170 μA), and (320 μA, 210 μA).]

Figure 5.4: Measured vs simulated frequency response (10 Gb/s).

Comparing the frequency responses at 10, 5, and 2 Gb/s shows that the bandwidth of

the AFE scales linearly with the data-rate. In all these cases, there is a close match

between the simulations and the measurements. The slight discrepancies in the low-

frequency portion of the response at 2 Gb/s (and partially at 5 Gb/s) are artifacts of

the measurement equipment as these frequencies were outside the operating range of

the 180° hybrid.

5.3.2 FFE Performance

The feed-forward equalizer (FFE) has been verified at three data-rates: 10, 5, and 2 Gb/s. The backplane employed for each of these data-rates is described in Table 5.2. Figures 5.7, 5.8, and 5.9 show the FFE operation for 10, 5, and 2 Gb/s.


[Figure 5.5, frequency response at 5 Gb/s. Plot: frequency response (dB), 0 to -25 dB, versus frequency (GHz, log scale); measured and simulated curves for (Ib1, Ib2) = (724 μA, 0 mA), (436 μA, 161 μA), and (272 μA, 204 μA).]

[Figure 5.6, frequency response at 2 Gb/s. Plot: frequency response (dB), 0 to -25 dB, versus frequency (GHz, log scale); measured and simulated curves for (Ib1, Ib2) = (404 μA, 0 mA), (317 μA, 152 μA), and (247 μA, 211 μA).]



The input data is a 2^7-1 PRBS sequence at 10.0005, 5.0005, and 2.0004 Gb/s. In each figure, plot (a) shows the output eye when the FFE is off, while plot (b) shows the eye-opening achieved by turning the FFE on.

The eye-openings achieved at 10, 5, and 2 Gb/s were 5 LSBs (223 mV), 5 LSBs (281.5 mV), and 6 LSBs (380.4 mV), respectively. The bias currents used to achieve these eye-openings were taken from the frequency response plots (Figs. 5.4, 5.5, and 5.6) and correspond to the setting with the highest high-frequency boost.

Figure 5.7: Data-rate = 10 Gb/s - Channel loss = 13.3 dB @ 5 GHz. [Eye diagrams: ADC output levels (0 to 14) versus time (0 to 100 ps); (a) FFE OFF, (b) FFE ON.]

Figure 5.8: Data-rate = 5 Gb/s - Channel loss = 13 dB @ 2.5 GHz. [Eye diagrams: ADC output levels (0 to 14) versus time (0 to 200 ps); (a) FFE OFF, (b) FFE ON.]


Figure 5.9: Data-rate = 2 Gb/s - Channel loss = 11.7 dB @ 1 GHz. [Eye diagrams: ADC output levels (0 to 14) versus time (0 to 500 ps); (a) FFE OFF, (b) FFE ON.]

5.3.3 AAF Performance

The need for an anti-aliasing filter arises at the lower end of the data-rate range (i.e., 2 Gb/s), where the interconnect is unable to band-limit the input signal. To create this scenario, we used a total of 44" of SMA cable in the data-path. The resulting attenuation of this channel was measured to be 0.9 dB at 1 GHz. Next, we observed the output eye with and without the AAF. To measure the case without the AAF, we used a test-chip that connects the input directly to the ADCs (i.e., no AAF).

Fig. 5.10(a) shows the output eye with the AAF off, while Fig. 5.10(b) shows the output eye with the AAF on. In both cases, 2^7-1 PRBS data is transmitted at 2.0004 Gb/s while the receiver sampling frequency is 2 GHz. It is clear from the figures that the slopes of the eye opening are reduced when the AAF is on. Quantitatively, the slope is reduced by a factor of 2.4. To illustrate the significance of the AAF, we simulated the same conditions with a 2x ADC-based receiver [4]. Fig. 5.11 shows that excluding the AAF prevents the CDR from locking. The proposed AAF restores the jitter tolerance to its expected values.


Figure 5.10: Verification of the anti-aliasing filter. [Eye diagrams: ADC output levels (0 to 14) versus time (0 to 500 ps); (a) AAF OFF, (b) AAF ON.]

Figure 5.11: Jitter tolerance comparison with AAF on/off. [Plot: jitter tolerance (UI PP) versus jitter frequency (Hz), log-log axes; curve labeled "With AAF"; annotation "No locking without AAF".]


5.4 Summary

This chapter presented the circuit layout of the entire AFE test-chip, which was designed and fabricated in Fujitsu’s 7-metal 65-nm CMOS technology. The measurement setup and the verification procedure were explained in detail. The frequency response plots of the AFE were extracted from the measured output samples. These results, which closely matched the simulated ones, confirmed that the frequency response of the AFE scales linearly with the data-rate.

A 2^7-1 PRBS sequence was transmitted to the test-chip to verify the operation of the anti-aliasing filter (AAF) and the feed-forward equalizer (FFE). Turning on the AAF at 2 Gb/s, when the attenuation of the cables was only 0.9 dB, band-limited the received signal. To verify the FFE at 10, 5, and 2 Gb/s, a backplane was employed to introduce about 13.3, 13, and 11.7 dB of attenuation at the corresponding Nyquist frequencies. Without the FFE, the output eye was completely closed in each case. With the FFE, we were able to open the eye by 5 LSBs (223 mV), 5 LSBs (281.5 mV), and 6 LSBs (380.4 mV) at 10, 5, and 2 Gb/s, respectively.

6 Conclusions and Future Directions

This thesis presented the design of an analog front-end (AFE) targeting 2x blind

ADC-based receivers. The front-end consists of a combined anti-aliasing filter (AAF)

and 2-tap feed-forward equalizer (FFE), the required clock generation circuitry, 4

time-interleaved 4-b ADCs, and DeMUX. This design overcomes the limited data-

rate coverage of 2x blind ADC-based receivers and extends it to cover 2-10 Gb/s.

Current 2x blind ADC-based receivers [4, 5] linearly interpolate 2 samples, blindly taken from the input, and extract the zero-crossing information. This interpolation, however, causes erroneous estimates of the zero-crossings if the input contains sharp transitions. To date, such receivers have relied on the communication channel to prevent aliasing. Our proposed front-end, in contrast, adjusts its bandwidth based on the input data-rate. As a result, it does not depend on the channel to perform the anti-aliasing and is able to cover a large range of data-rates.
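The sensitivity of linear interpolation to sharp transitions can be seen in a minimal sketch. The clipped-ramp waveform, the function names, and the numerical values are hypothetical and serve only as an illustration; they do not reproduce the receivers in [4, 5].

```python
def zero_crossing(t1: float, v1: float, t2: float, v2: float) -> float:
    """Estimate the zero-crossing time by linear interpolation between
    two samples of opposite sign."""
    if v1 * v2 >= 0:
        raise ValueError("samples must have opposite signs")
    return t1 + (t2 - t1) * v1 / (v1 - v2)


def ramp(t: float, t_cross: float, slope: float) -> float:
    """Hypothetical transition through zero at t_cross, clipped to +/-1."""
    return max(-1.0, min(1.0, slope * (t - t_cross)))


# 2x sampling: two samples per UI, with times expressed in UI.
T1, T2 = 0.0, 0.5
TRUE_CROSSING = 0.1


def estimate(slope: float) -> float:
    """Interpolated crossing for a transition of the given slope (per UI)."""
    return zero_crossing(T1, ramp(T1, TRUE_CROSSING, slope),
                         T2, ramp(T2, TRUE_CROSSING, slope))


slow = estimate(2.0)    # transition still linear at both sample instants
sharp = estimate(20.0)  # transition has saturated at both sample instants

print(f"true crossing: {TRUE_CROSSING} UI")
print(f"slow edge estimate:  {slow:.3f} UI")
print(f"sharp edge estimate: {sharp:.3f} UI")
```

When the edge is slow enough to remain linear between the two samples, the estimate recovers the true crossing. When the edge saturates before both sample instants, the estimate collapses to the midpoint between the samples regardless of where the crossing actually occurred.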

The front-end employs an integration and dump (I&D) scheme to implement the

AAF and FFE in one block. The FFE is implemented without the need to design

noise-sensitive delay cells, which require delay calibration. The bandwidth of the AFE

is controlled by the integration time which is set to be half a unit interval (UI). As a

result, the data-rate automatically adjusts the front-end bandwidth. Once the design

was confirmed by Matlab’s Simulink tool and transistor-level Cadence simulations, it

was laid out and fabricated using Fujitsu’s 65-nm CMOS process.
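The link between integration time and bandwidth can be sketched numerically. The model below assumes an ideal rectangular integration window, whose magnitude response is |sin(πfT)/(πfT)|; this is an idealization for illustration, not the measured response of the fabricated circuit, and the function names are hypothetical.

```python
import math


def integrate_dump_gain(freq_hz: float, window_s: float) -> float:
    """Magnitude response of an ideal rectangular integration window,
    normalized to 1 at DC: |sin(pi f T) / (pi f T)|."""
    x = math.pi * freq_hz * window_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def bandwidth_3db(data_rate_bps: float) -> float:
    """-3 dB bandwidth for an integration window of half a UI."""
    window_s = 0.5 / data_rate_bps
    target = 1.0 / math.sqrt(2.0)
    # The gain falls monotonically from 1 to 0 between DC and 1/T,
    # so bisection on this interval finds the unique -3 dB point.
    lo, hi = 0.0, 1.0 / window_s
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if integrate_dump_gain(mid, window_s) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for rate in (10e9, 5e9, 2e9):
    print(f"{rate / 1e9:g} Gb/s -> f_3dB ~ {bandwidth_3db(rate) / 1e9:.2f} GHz")
```

Because the window length is tied to the UI, the -3 dB frequency of this idealized model is a fixed multiple of the data-rate, which is the scaling behavior described above.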

The test-chip was measured and the results validated the proposed design. Digital

output eyes were constructed by taking samples from one DeMUX output. These

eyes were employed to generate plots of frequency response at 10, 5, and 2 Gb/s. At

each data-rate, 3 sets of (Ib1, Ib2) were chosen based on the simulations to obtain 0,

5, and 11 dB of boost at high-frequency. These plots validated the simulation results

by showing that the bandwidth of the front-end scales with input data-rate.
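To give a feel for the boost values quoted above, the following sketch uses a textbook discrete-time 2-tap FFE, y[n] = x[n] - alpha * x[n-1], with taps spaced by one UI. This is an assumption for illustration: the tap weight alpha and the function names are hypothetical, and the mapping from the bias currents (Ib1, Ib2) to an equivalent alpha in the fabricated circuit is not modeled here.

```python
import math


def boost_db(alpha: float) -> float:
    """Gain at Nyquist relative to DC, in dB, for y[n] = x[n] - alpha * x[n-1].
    The gain is (1 - alpha) at DC and (1 + alpha) at Nyquist, for 0 <= alpha < 1."""
    return 20.0 * math.log10((1.0 + alpha) / (1.0 - alpha))


def alpha_for_boost(target_db: float) -> float:
    """Post-cursor weight that yields the requested high-frequency boost."""
    ratio = 10.0 ** (target_db / 20.0)
    return (ratio - 1.0) / (ratio + 1.0)


for target in (0.0, 5.0, 11.0):
    alpha = alpha_for_boost(target)
    print(f"{target:4.1f} dB boost -> alpha = {alpha:.3f}")
```

In this idealized model a boost of 0 dB corresponds to the post-cursor tap being turned off, and larger boosts require a larger post-cursor weight relative to the main tap.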

A 2^7-1 PRBS sequence was used at 2.0004 Gb/s to verify the anti-aliasing filter (AAF) operation. The channel consisted of a total of 44" of SMA cables that introduced only 0.9 dB of attenuation at 1 GHz. Simulating the same conditions showed that without the AAF, the CDR was unable to lock. Turning on the AAF band-limited the received signal, such that the simulated jitter tolerance was restored to its expected values.

A 2^7-1 PRBS sequence at 10.0005, 5.0005, and 2.0004 Gb/s was employed to verify the FFE at 10, 5, and 2 Gb/s, respectively. To ensure that the output eye was closed without the FFE, an external backplane was used to impose 13.3, 13, and 11.7 dB of attenuation at the corresponding Nyquist frequencies. With the FFE on, we obtained vertical eye-openings of 5 LSBs (223 mV), 5 LSBs (281.5 mV), and 6 LSBs (380.4 mV) at 10, 5, and 2 Gb/s, respectively.

Table 6.1 summarizes the measurements. The AAF/FFE consumes a total of 2.4 mW at 10 Gb/s and occupies 0.013 mm^2 of chip area. The clock generation circuitry, which was not optimized and was designed only to verify the AAF/FFE functionality, consumes 97.2 mW at 10 Gb/s and occupies an area of 0.034 mm^2.

Table 6.1: Performance summary.

Technology         65-nm CMOS
Data-rate          2-10 Gb/s
Supply             1.2 V

                   AAF/FFE           Ck Gen
Power @ 10 Gb/s    2.4 mW            97.2 mW
Power @ 5 Gb/s     2.2 mW            66 mW
Power @ 2 Gb/s     1.6 mW            42 mW
Area               152 × 86 μm^2     243 × 140 μm^2
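The figures in Table 6.1 can be cross-checked with a few lines of arithmetic. The sketch below converts the layout dimensions to mm^2 and derives energy per bit from the reported power; energy per bit is a derived metric computed here for illustration and is not a figure reported in the thesis. The function and variable names are hypothetical.

```python
# Power figures in mW copied from Table 6.1, keyed by data-rate in Gb/s.
POWER_MW = {
    "AAF/FFE": {10: 2.4, 5: 2.2, 2: 1.6},
    "Ck Gen": {10: 97.2, 5: 66.0, 2: 42.0},
}

# Layout dimensions in micrometres copied from Table 6.1.
DIMENSIONS_UM = {"AAF/FFE": (152, 86), "Ck Gen": (243, 140)}


def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ; mW divided by Gb/s equals pJ/bit."""
    return power_mw / rate_gbps


def area_mm2(width_um: float, height_um: float) -> float:
    """Area in mm^2 from dimensions given in micrometres."""
    return width_um * height_um * 1e-6


for block, powers in POWER_MW.items():
    width, height = DIMENSIONS_UM[block]
    print(f"{block}: area = {area_mm2(width, height):.3f} mm^2")
    for rate, power in powers.items():
        print(f"  {rate:>2} Gb/s: {energy_per_bit_pj(power, rate):.2f} pJ/bit")
```

The computed areas round to the 0.013 mm^2 and 0.034 mm^2 quoted in the text.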

6.1 Thesis Contributions

The contributions of this thesis are:

• Proposal of an AFE whose bandwidth scales with the data-rate targeting 2x

blind ADC-based receivers.

• Design, implementation, and simulation of the proposed AFE that consists of

a combined AAF and 2-tap FFE.

• Measurement of the fabricated test-chip to validate simulations.

• The schematic and layout of the clock generation circuitry were mostly done by Siamak Sarvari.

6.2 Future Work

This section presents what we envision as possible future directions for the designed AFE targeting 2x blind ADC-based receivers.

The first area of improvement is to explore new circuits for the clock generation of the AFE. Currently, the power of the test-chip is dominated by the clock generation. For example, at 10 Gb/s the clock generation consumes 97.2 mW while the AAF/FFE consumes only 2.4 mW. During the design, the clock power consumption was not optimized since the goal was to verify the functionality of the proposed AAF/FFE first. Instead, priority was given to ensuring that the generated pulses driving the AFE are clean and have sharp transitions. This was achieved by employing several stages of power-hungry buffers and back-to-back inverters.

Secondly, the necessary logic can be implemented to make the equalization adaptive. In the designed front-end, the equalization coefficient is set manually by adjusting the bias currents of the main and post-cursor taps. The next step would be to control these currents via a DAC that is driven by the adaptation logic.

Thirdly, the phase difference between the ADC clock and the AAF/FFE should be set automatically. As mentioned in Chapter 5, this phase difference is currently set manually using the signal generator available in the lab. The phase difference has to be set to a quarter of a UI so that the ADC samples the AAF/FFE output when it is valid. This manual adjustment can be eliminated by using a phase-adjusting circuit, which would make the system more robust.

Lastly, the entire AFE should be integrated with a 2x blind ADC-based receiver

on a test-chip. The bit error rate (BER) and jitter tolerance should be measured to

confirm the functionality of the front-end in conjunction with a real receiver.

References

[1] J.E. Rogers and J.R. Long. A 10-Gb/s CDR/DEMUX with LC delay line VCO. IEEE Journal of Solid-State Circuits, 37(12):1781–1789, 2002.

[2] R. Farjad-Rad, A. Nguyen, J. M. Tran, T. Greer, J. Poulton, W. J. Dally, J. H. Edmondson, R. Sentheinathan, R. Rathi, E. Lee, and H. T. Ng. A 33-mW 8-Gb/s CMOS clock multiplier and CDR for highly integrated I/Os. IEEE Journal of Solid-State Circuits, 39(9):1553–1561, 2004.

[3] M. Hossain and A.C. Carusone. A 6.8mW 7.4Gb/s clock-forwarded receiver with up to 300MHz jitter tracking in 65nm CMOS. Proceedings of the 2010 International Solid State Circuits Conference (ISSCC), pages 158–159, 2010.

[4] O. Tyshchenko, A. Sheikholeslami, H. Tamura, M. Kibune, H. Yamaguchi, J. Ogawa, and C. Sannomiya. A 5-Gb/s ADC-based feed-forward CDR in 65nm CMOS. IEEE Journal of Solid-State Circuits, 2010.

[5] H. Yamaguchi, H. Tamura, Y. Doi, Y. Tomita, T. Hamada, M. Kibune, S. Ohmoto, K. Tateishi, O. Tyshchenko, A. Sheikholeslami, T. Higuchi, J. Ogawa, T. Saito, H. Ishida, and K. Gotoh. A 5Gb/s transceiver with an ADC-based feedforward CDR and CMA adaptive equalizer in 65nm CMOS. Proceedings of the 2010 International Solid State Circuits Conference (ISSCC), 2010.

[6] H.-M. Bae, J.B. Ashbrook, J. Park, N.R. Shanbhag, A.C. Singer, and S. Chopra. An MLSE receiver for electronic dispersion compensation of OC-192 fiber links. Proceedings of the 2006 International Solid State Circuits Conference (ISSCC), pages 874–883, 2006.

[7] J. Cao, B. Zhang, U. Singh, D. Cui, A. Vasani, A. Garg, W. Zhang, N. Kocaman, D. Pi, B. Raghavan, H. Pan, I. Fujimori, and A. Momtaz. A 500mW digitally calibrated AFE in 65nm CMOS for 10 Gb/s serial links over backplane and multimode fiber. Proceedings of the 2009 International Solid State Circuits Conference (ISSCC), pages 370–371, 2009.

[8] O. Agazzi, D. Crivelli, M. Hueda, H. Carrer, G. Luna, A. Nazemi, C. Grace, B. Kobeissy, C. Abidin, M. Kazemi, M. Kargar, C. Marquez, S. Ramprasad, F. Bollo, V. Posse, S. Wang, G. Asmanis, G. Eaton, N. Swenson, T. Lindsay, and P. Voois. A 90nm CMOS DSP MLSD transceiver with integrated AFE for electronic dispersion compensation of multi-mode optical fibers at 10 Gb/s. Proceedings of the 2008 International Solid State Circuits Conference (ISSCC), pages 232–233, 2008.

[9] M. Harwood, N. Warke, R. Simpson, T. Leslie, A. Amerasekera, S. Batty, D. Colman, E. Carr, V. Gopinathan, S. Hubbins, P. Hunt, A. Joy, P. Khandelwal, B. Killips, T. Krause, S. Lytollis, A. Pickering, M. Saxton, D. Sebastio, G. Swanson, A. Szczepanek, T. Ward, J. Williams, R. Williams, and T. Willwerth. A 12.5Gb/s SerDes in 65nm CMOS using a baud rate ADC with digital receiver equalization and clock recovery. Proceedings of the 2007 International Solid State Circuits Conference (ISSCC), pages 436–437, 2007.

[10] PCI Express 3.0 Frequently Asked Questions, 2010. Available at HTTP: http://www.pcisig.com/news_room/faqs/pcie3.0_faq/.

[11] M. Horowitz, C. Yang, and S. Sidiropoulos. High-speed electrical signaling: Overview and limitations. IEEE Micro, pages 12–24, 1998.

[12] IEEE P802.3ap task force channel model material, 2006. Available at HTTP: http://www.ieee802.org/3/ap/public/channel_model/index.html.

[13] I. Fujimori. Will ADCs overtake binary frontends in backplane signaling? International Solid State Circuits Conference (ISSCC), 2009. Evening Session.

[14] E.A. Lee and D.G. Messerschmitt. Digital Communications. Kluwer, third edition, 2004.

[15] J. Harrison and N. Weste. A 500MHz CMOS anti-alias filter using feed-forward op-amps with local common-mode feedback. Proceedings of the 2003 International Solid State Circuits Conference (ISSCC), 2003.

[16] T. Laxminidhi, V. Prasadu, and S. Pavan. Widely programmable high-frequency active RC filters in CMOS technology. IEEE Journal of Solid-State Circuits, 56(2):327–336, 2009.

[17] A. Gharbiya and M. Surzycki. Highly linear, tunable, pseudo differential transconductor circuit for the design of Gm-C filters. Proceedings of the 2002 IEEE Canadian Conference on Electrical and Computer Engineering, 2002.

[18] P. Pandey, J. Silva-Martinez, and X. Liu. A CMOS 140-mW fourth-order continuous-time low-pass filter stabilized with a class AB common-mode feedback operating at 550 MHz. IEEE Journal of Solid-State Circuits, 53(4):811–820, 2006.

[19] R. Yuen, M. van Ierssel, A. Sheikholeslami, W.W. Walker, and H. Tamura. A 5Gb/s transmitter with reflection cancellation for backplane transceivers. Custom Integrated Circuits Conference (CICC), pages 413–416, 2006.

[20] A.C. Carusone, H. Cheng, and F.A. Musa. A 32/16-Gb/s dual-mode pulsewidth modulation pre-emphasis (PWM-PE) transmitter with 30-dB loss compensation using a high-speed CML design methodology. IEEE Transactions on Circuits and Systems I: Regular Papers, 56(8):1794–1806, 2009.

[21] M. El Said, J. Sitch, and M. Elmasry. A 0.5μm SiGe pre-equalizer for 10Gb/s single-mode fiber optic links. Proceedings of the 2005 International Solid State Circuits Conference (ISSCC), pages 224–225, 2005.

[22] J. S. Choi, M. S. Hwang, and D. K. Jeong. A 0.18-um CMOS 3.5-Gb/s continuous-time adaptive cable equalizer using enhanced low-frequency gain control method. IEEE Journal of Solid-State Circuits, 39(3):419–425, 2004.

[23] D.H. Shin, J.E. Jang, F. O'Mahony, and C.P. Yue. A 1-mW 12-Gb/s continuous-time adaptive passive equalizer in 90-nm CMOS. Custom Integrated Circuits Conference, pages 117–120, 2009.

[24] Y. Hidaka, W. Gai, T. Horie, J.H. Jiang, Y. Koyanagi, and H. Osone. A 4-channel 1.25–10.3 Gb/s backplane transceiver macro with 35 dB equalizer and sign-based zero-forcing adaptive control. IEEE Journal of Solid-State Circuits, 44(12):3547–3559, 2009.

[25] N. Krishnapura, M. Barazande-Pour, Q. Chaudhry, J. Khoury, and K. Lakshmikumar. A 5 Gb/s NRZ transceiver with adaptive equalization for backplane transmission. Proceedings of the 2005 International Solid State Circuits Conference, pages 60–61, 2005.

[26] E. Sackinger. Broadband Circuits for Optical Fiber Communication. John Wiley & Sons, second edition, 2008.

[27] Y. Tomita, M. Kibune, J. Ogawa, W.W. Walker, H. Tamura, and T. Kuroda. A 10-Gb/s receiver with series equalizer and on-chip ISI monitor in 0.11-um CMOS. IEEE Journal of Solid-State Circuits, 40(4):986–993, 2005.

[28] J. Zerbe. High-performance wireline equalization: Issues, designs, and tradeoffs. International Solid State Circuits Conference (ISSCC), 2009. Forum 5.

[29] B. Razavi. Design of Analog CMOS Integrated Circuits. McGraw Hill, 2001.

[30] D. A. Johns and K. Martin. Analog Integrated Circuit Design. Wiley, 1996.

[31] B. Razavi. Design of Integrated Circuits for Optical Communications. McGraw Hill, 2003.

[32] S. Sidiropoulos and M. Horowitz. A 700-Mb/s/pin CMOS signaling interface using current integrating receivers. IEEE Journal of Solid-State Circuits, 32(5):681–690, May 1997.

[33] F. Yang, J.H. O’Neill, D. Inglis, and J. Othmer. A CMOS Low-Power Multiple 2.5-3.125-Gb/s Serial Link Macrocell for High IO Bandwidth Network ICs. IEEE Journal of Solid-State Circuits, 37(12):1813–1821, December 2002.

[34] M. Park, J. Bulzacchelli, M. Beakes, and D. Friedman. A 7Gb/s 9.3mW 2-tap current-integrating DFE receiver. Proceedings of the 2007 International Solid State Circuits Conference, pages 230–231, 2007.

[35] T.O. Dickson, J.F. Bulzacchelli, and D.J. Friedman. A 12-Gb/s 11-mW half-rate sampled 5-tap DFE with current-integrating in 45-nm SOI CMOS technology. Symposium of VLSI Circuits Digest of Technical Papers, pages 58–59, 2008.

[36] M. van Ierssel. Circuit Techniques for High-Speed Chip-to-Chip Signaling. PhD thesis, University of Toronto, 2006.

[37] The Mathworks, Inc. Using Simulink, 2002.

[38] Open Verilog International. Verilog-A Language Reference Manual, 1996.

[39] A. Emami-Neyestanak, A. Varzaghani, J.F. Bulzacchelli, A. Rylyakov, C.K. Ken Yang, and D.J. Friedman. A 6.0-mW 10.0-Gb/s receiver with switched-capacitor summation DFE. IEEE Journal of Solid-State Circuits, 42(4):889–896, 2007.

[40] T.O. Dickson, K.H.K. Yau, T. Chalvatzis, A.M. Mangan, E. Laskin, R. Beerkens, P. Westergaard, M. Tazlauanu, M.T. Yang, and S.P. Voinigescu. The invariance of characteristic current densities in nanoscale MOSFETs and its impact on algorithmic design methodologies and design porting of Si(Ge) (Bi)CMOS high-speed building blocks. IEEE Journal of Solid-State Circuits, 41(8):1830–1845, 2006.
