Analog Front-End Design for 2x Blind ADC-based
Receivers
by
Tina Tahmoureszadeh
A thesis submitted in conformity with the requirements
for the degree of Master of Applied Science
Graduate Department of Electrical and Computer Engineering
University of Toronto
© Copyright by Tina Tahmoureszadeh 2010
Analog Front-End Design for 2x Blind ADC-Based Receivers
Tina Tahmoureszadeh
Master of Applied Science, 2010
Graduate Department of Electrical and Computer Engineering
University of Toronto
Abstract
This thesis presents the design, implementation, and fabrication of an analog front-
end (AFE) targeting 2x blind ADC-based receivers. The front-end consists of a
combination of an anti-aliasing filter (AAF) and a 2-tap feed-forward equalizer (FFE)
(AAF/FFE), the required clock generation circuitry (Ck Gen), 4 time-interleaved
4-b ADCs, and DeMUX. The contributions of this design are the AAF/FFE and
the Ck Gen. The overall front-end optimizes the channel/filter characteristics for
data-rates of 2-10 Gb/s. The bandwidth of the AAF is scalable with the data-rate
and the analog 2-tap feed-forward equalizer (FFE) is designed without the need for
noise-sensitive analog delay cells. The test-chip is implemented in 65-nm CMOS and
the AAF/FFE occupies 152×86 μm² and consumes 2.4 mW at 10 Gb/s. Measured
frequency responses at data-rates of 10, 5, and 2 Gb/s confirm the scalability of the
front-end bandwidth. The FFE achieves 11 dB of high-frequency boost at 10 Gb/s.
Acknowledgments
I would like to thank my supervisor, Professor Ali Sheikholeslami, for his support
and guidance throughout this research work. Thanks for making this journey worth
taking.
I would like to thank my colleagues at Fujitsu, notably Hirotaka Tamura, Yasumoto
Tomita, Masaya Kibune, and Bill Walker for their technical help and support over
the course of this project.
Special thanks to my defense committee: Professor Tony Chan Carusone, Professor
Roman Genov, and Professor Teng Joon Lim for their time and valuable feedback.
I would like to thank my parents, Farahnaz and Darioush, and my sisters, Tila and
Taraneh, for their endless love and support. Even if I could give you not only the
whole world, but the whole universe, with all its planets and stars, it would be nothing
compared to the sacrifices you have made for me.
I am forever grateful for the help and support I constantly received from my research
group members, Shayan Shahramian, Siamak Sarvari, Oleksiy Tyshchenko, David
Halupka, and Behrooz Abiri. Thanks for always having the time to offer your help.
To my girl buddies, Ruslana, Farzaneh, and Azadeh, I wouldn’t have made it
without you. I owe this to the long supporting, encouraging, and inspiring chats with
you. I owe this to the endless tea breaks and our adventurous long walks around the
campus.
Thanks to Hamed, Karim, Mike, Dustin, and Kentaro for the rest of the unforgettable
memories.
Last but not least, my dearest ‘koochooloo’ warmed up my heart, day and
night, so I can be here, writing to close this chapter of my life.
Contents
Acknowledgments
List of Figures
List of Tables
List of Acronyms
1 Introduction
    1.1 Motivation
    1.2 Thesis Objectives
    1.3 Thesis Outline
2 Background
    2.1 Wire-line Communication System
        2.1.1 Transceiver
        2.1.2 Communication Channel
        2.1.3 Binary versus ADC-based Receivers
    2.2 ADC-based Receivers
        2.2.1 Phase Tracking versus Blind Sampling
        2.2.2 1x versus 2x Sampling Rate
        2.2.3 2x blind ADC-based Receiver
    2.3 Aliasing
        2.3.1 Necessity for an Anti-Aliasing Filter
        2.3.2 Previous Work
    2.4 Equalization
        2.4.1 Feed-Forward Equalization (FFE)
        2.4.2 Previous Work
    2.5 Summary
3 Anti-Aliasing Filter (AAF) Design
    3.1 Anti-Aliasing Filters
        3.1.1 Active RC Filters
        3.1.2 Gm-C Filters
        3.1.3 Integration and Dump (I&D) Filters
    3.2 Proposed I&D Scheme
    3.3 I&D Design Methodology and Modeling
        3.3.1 Design Methodology
        3.3.2 Behavioural Modeling
        3.3.3 Behavioural Simulation Results
    3.4 Implementation
        3.4.1 Circuit Design
        3.4.2 Circuit Simulation Results
    3.5 Summary
4 Analog Front-End Design (AFE)
    4.1 AFE Architecture
    4.2 AFE Design Methodology and Modeling
        4.2.1 Design Methodology
        4.2.2 Behavioural Modeling
        4.2.3 Behavioural Simulation Results
    4.3 AFE Implementation
        4.3.1 Cascode-Switching Architecture
        4.3.2 Reset Cell Architecture
        4.3.3 Clock Generation Design
    4.4 Circuit Simulation Results
        4.4.1 Clock Generation
        4.4.2 Analog Front-End (AFE)
    4.5 Summary
5 Experimental Results
    5.1 Circuit Layout and Equipment Setup
    5.2 Channel Measurements
    5.3 AFE Performance
        5.3.1 Frequency Response of AFE
        5.3.2 FFE Performance
        5.3.3 AAF Performance
    5.4 Summary
6 Conclusions and Future Directions
    6.1 Thesis Contributions
    6.2 Future Work
References
List of Figures
2.1 An example of a two connector backplane from Tyco Electronics [12].
2.2 The effect of the limited channel bandwidth on the ideal NRZ signal.
2.3 ADC-based receiver architectures.
2.4 Extraction of the instantaneous phase (φinst).
2.5 Data decision scheme.
2.6 An example where the sampling theorem is satisfied.
2.7 An example where the sampling theorem is violated (Aliasing).
2.8 Zero-crossing estimations based on the linear interpolation.
2.9 Equalization in frequency domain.
2.10 Feed-forward and feedback equalization.
2.11 A generic implementation of CTLE.
2.12 FIR implementation of FFE.
3.1 Active RC filter presented in [15].
3.2 Gm-C filter presented in [17].
3.3 Comparison between output samples of a rectangular filter and an I&D filter.
3.4 LTI Model of an I&D filter.
3.5 I&D Response.
3.6 Previous I&D works in the front-end.
3.7 Proposed AFE for a 2x blind ADC-based receiver including AAF.
3.8 Bandwidth scalability of the I&D scheme.
3.9 φerror plot for phase extraction with/without I&D.
3.10 Input data used for simulations with various Tini.
3.11 Tini = 0.
3.12 Tini = 12.5% UI = 12.5 ps.
3.13 Tini = 25% UI = 25 ps.
3.14 Frequency response of the I&D filter from Simulink simulations.
3.15 AAF system block diagram.
3.16 Pulses that drive the 4-way time-interleaved I&D system.
3.17 Architectures considered for the I&D filter design.
3.18 Frequency response of I&D filter generated from circuit simulations.
4.1 A 2x blind ADC-based receiver with the proposed AFE.
4.2 AAF/FFE system block diagram.
4.3 AFE behavioural modeling.
4.4 Zero-pole map of the proposed AFE.
4.5 Cascode-switching implementation.
4.6 Reset cell implementation.
4.7 Clock generation block diagram.
4.8 CMOS logic implementation.
4.9 Half-rate CMOS clock generator implementation.
4.10 CML divider implementation.
4.11 A 2x blind ADC-based receiver with the proposed AFE.
4.12 Timing diagram.
4.13 Simulated nodes in clock generation circuitry.
4.14 Simulated clock generation waveforms at 10 Gb/s.
4.15 Simulated clock generation waveforms at 5 Gb/s.
4.16 Simulated clock generation waveforms at 2 Gb/s.
4.17 Simulated time-domain waveforms of the AFE.
4.18 Simulated frequency response at 10 Gb/s.
4.19 Simulated frequency response at 5 Gb/s.
4.20 Simulated frequency response at 2 Gb/s.
5.1 AFE micrograph.
5.2 Measurement setup.
5.3 S21 plots.
5.4 Measured vs simulated frequency response (10 Gb/s).
5.5 Measured vs simulated frequency response (5 Gb/s).
5.6 Measured vs simulated frequency response (2 Gb/s).
5.7 Data-rate = 10 Gb/s - Channel loss = 13.3 dB @ 5 GHz.
5.8 Data-rate = 5 Gb/s - Channel loss = 13 dB @ 2.5 GHz.
5.9 Data-rate = 2 Gb/s - Channel loss = 11.7 dB @ 1 GHz.
5.10 Verification of the anti-aliasing filter.
5.11 Jitter tolerance comparison with AAF on/off.
List of Tables
3.1 Simulated 3-dB bandwidths of I&D filter.
4.1 Simulated results of the clock generation circuitry.
4.2 Simulated results of AFE.
5.1 Description of the pin-list.
5.2 Description of the test channels.
6.1 Performance summary.
List of Acronyms
AAF anti-aliasing filter
AFE analog front-end
BER bit error rate
BERT bit error rate tester
CDR clock and data recovery
CML current mode logic
CMOS complementary MOS
CTLE continuous time linear equalizer
DAC digital-to-analog converter
DCD duty-cycle distortion
DeMUX demultiplexer
DFE decision feedback equalizer
DSP digital-signal processor
FFE feed-forward equalizer
FIR finite impulse response
Gb/s gigabits per second
HDMI high-definition multi-media interface
IC integrated circuit
I&D integration and dump
ISI intersymbol interference
LSB least-significant bit
LTI linear time-invariant
MOSFET metal oxide semiconductor field effect transistor
MSB most-significant bit
NMOS negative-channel metal oxide semiconductor
NRZ non-return-to-zero
PCB printed circuit board
PCIe Peripheral Component Interconnect Express
PD phase detector
PI phase interpolator
PMOS positive-channel metal oxide semiconductor
PLL phase-locked loop
PVT process-voltage-temperature
PRBS pseudo-random binary sequence
S&H sample-and-hold
SATA Serial Advanced Technology Attachment
SerDes serializer/deserializer
SNR signal-to-noise ratio
SONET synchronous optical network
UI unit interval
USB Universal Serial Bus
VCO voltage-controlled oscillator
VNA vector network analyzer
1 Introduction
High-definition television, voice over Internet, and even gaming systems are creating
a large demand for faster data transmission over distances that range from chip-to-chip
links to intercontinental spans. Established industry standards such as high-definition
multi-media interface (HDMI), Peripheral Component Interconnect Express (PCIe),
Universal Serial Bus (USB), and Serial Advanced Technology Attachment (SATA)
have driven this market for years. New circuit innovations are sought every day to
carve out a path for multi-gigabit-per-second (Gb/s) data transmission.
1.1 Motivation
In a typical high-speed transceiver, serial data streams are sent from a transmitter
to a receiver via a communication channel. The existing low-cost channel materials
demonstrate low-pass behaviour at 1 Gb/s and above. This causes the transmitted
pulse to spread to over one unit interval (UI) and interfere with its neighbouring
symbols. Known as intersymbol interference (ISI), this phenomenon complicates the
task of the data recovery on the receiver side.
The most common practice to counteract the deteriorating channel effects is to use
equalization. Equalizers can be implemented in the analog or digital domain. Binary
receivers [1, 2, 3] use only 1 bit (i.e. the sign) of the received data to recover the
data and the embedded clock. ADC-based receivers [4, 5, 6, 7, 8, 9], on the other
hand, have access to more than 1 bit (i.e. the sign and the magnitude) of the received
signal. In these receivers, a front-end ADC digitizes the input signal and enables
more complex equalization circuitry in the digital domain.
Blind ADC-based receivers are a sub-category of the ADC-based receivers. They
utilize a feed-forward architecture and enable the design of a fully digital receiver.
Digital implementation is advantageous since it is less sensitive to noise than
analog circuits, facilitates power/area scaling, and improves the flexibility of the
design. The focus of this thesis is to explore ways to improve the performance of
blind ADC-based receivers.
The industry standards mentioned previously demand support for a wide range
of data-rates and various channel characteristics. Usually, they are required to be
fully compatible with prior generations. For example, PCIe 3.0, which targets 8 GT/s,
is required to be backward compatible with PCIe 2.0 and 1.0, which support 5 GT/s
and 2.5 GT/s, respectively [10]. This shows the significance of designing receivers that
cover a large range of data-rates.
Current blind ADC-based receivers [4, 5] are limited in their operating data-rate.
The main problem is that they rely on the communication channel to perform the
necessary anti-aliasing. For a system that covers a wide range of data-rates, the
channel does not attenuate the transmitted signal enough at the lower data-rates,
where the channel bandwidth is large relative to the sampling rate.
To overcome this disadvantage, our work proposes an analog front-end (AFE) whose
bandwidth automatically adjusts with the data-rate.
The proposed AFE consists of a combined anti-aliasing filter (AAF) and an equalizer,
which extends the operating data-rate range to 2-10 Gb/s. The equalizer turns
on when the channel imposes severe attenuation, which occurs at the higher data-rates.
The AAF, as explained above, turns on at low data-rates, where the channel does
not provide enough filtering to prevent aliasing.
1.2 Thesis Objectives
This thesis presents the design of an AFE for 2x blind ADC-based receivers. The
main objectives of this thesis are as follows:
• Exploring AAF solutions with adjustable bandwidth to expand the applications
of 2x blind ADC-based receivers to support multiple data-rates.
• Investigating the incorporation of an equalizer in conjunction with the band-
width scalable AAF.
• Design, fabrication, and measurement of the proposed AFE to prove function-
ality.
1.3 Thesis Outline
The remaining chapters of this thesis are organized as follows. Chapter 2 provides
background on wire-line communication systems, ADC-based receivers, and the
significance of equalization. It serves as a foundation for the discussions in the
following chapters. Chapter 3 presents the design methodology, modeling, and simulation
results of the AAF. The design of the complete AFE, including both the AAF and
the equalizer, is presented in Chapter 4, followed by the simulation results. Chapter
5 discusses the measurements of the test-chip. Chapter 6 concludes this thesis and
outlines the future directions for this work.
2 Background
The rapid increase of speed in high-capacity networks and computer systems has
created a large demand for high-speed data transmission. Gigabit Ethernet, long-haul
optical channels, memory, and chip-to-chip interconnect are applications that directly
benefit from the multi-Gb/s serial link technology. This chapter presents the main
challenges in high-speed signaling along with their commonly used solutions. This
material serves as a background to frame the discussions in the following chapters.
Section 2.1 introduces a typical wire-line communication system consisting of a
transmitter and a receiver block communicating through a wired backplane. Channel
impairments that distort the transmitted signal, therefore complicating the task of
the receiver, are also discussed in this section. Binary and ADC-based receivers are
introduced towards the end of this section as the two well-known receiver architec-
tures. Section 2.2 discusses the ADC-based receivers in more detail to provide the
necessary context for the upcoming sections. Sections 2.3 and 2.4 present the ne-
cessity for an anti-aliasing filter (AAF) and equalization as a part of the currently
employed ADC-based receivers. Section 2.5 concludes this chapter.
2.1 Wire-line Communication System
High-speed signaling refers to the exchange of information from a transmitting device
to a receiving device at data-rates in excess of 1 Gb/s. The data is transmitted via
a physical medium based on which the communication system can be classified as
wire-line, wireless, optical, etc. In the remainder of this section we discuss the
building blocks of a wire-line communication system.
2.1.1 Transceiver
A generic high-speed transceiver consists of a transmitter on one chip and a receiver
on another. The task of the transceiver is to transfer the data from the transmitter to
the receiver through a communication channel. This communication channel, which
could range from hundreds of feet of cable in network interfaces to less than one foot
of a PCB trace in chip-to-chip signaling interfaces, suffers from non-idealities which
distort the transmitted signal. The role of the transceiver is to compensate for the
losses introduced by the physical channel and recover the data with an acceptable bit
error rate (BER).
The evolving integrated circuit (IC) technology has enabled the design of high-
speed transceivers. The wire-line backplane, on the other hand, does not advance
at the same pace and continues to be the bottleneck. Although a transceiver
incorporates other blocks such as drivers, serializers, deserializers, and samplers [11],
its main design challenge is attributed to compensating for deteriorating channel
effects. In the next section, we study the channel impairments.
2.1.2 Communication Channel
In a wire-line communication system, which is the focus of this thesis, the communica-
tion channel carries electrical signals from the source to the destination. This channel
could be a twisted pair, coaxial cable, an Ethernet cable, or a USB cable. Although
these wire-line channels are of various natures, they impose similar challenges to the
designers.
A typical channel in serial high-speed signaling consists of connectors, a PCB trace,
and cables. A backplane trace, provided by Tyco Electronics [12], is shown in Fig.
2.1(a). It comprises two line-cards of length 10" connected through a PCB trace of
length 20". The material used in both the line-cards and the trace is Nelco 4000-13SI.
This channel can be modeled as a linear time-invariant (LTI) system with a frequency
response plotted in Fig. 2.1(b).
The limited bandwidth of this channel, which is an example of a generic wire-line
link, attenuates the higher frequency content of the transmitted signal. In the time-
domain, this translates to the spreading of the data bit, which consequently interferes
with its adjacent bits. This phenomenon, known as intersymbol interference (ISI), is
more clearly explained in the following example.
Figure 2.1: An example of a two connector backplane from Tyco Electronics [12]. (a) Physical dimensions of the channel (10" line-cards, 20" trace). (b) The channel transfer function, |S21| (dB) versus frequency (GHz).
If an ideal non-return-to-zero (NRZ) signal is applied to a channel with infinite
bandwidth, an undistorted NRZ output is obtained, as shown in Fig. 2.2(a). The
corresponding histogram of the samples, shown on the right side, includes an impulse
at the zero level and another at the one level, indicating only two possible sample
values. Fig. 2.2(b), in contrast, presents a similar case except with a practical channel
that has a limited bandwidth. The samples, in this scenario, depend not only on the
current bit value but also on the bits before and after it. Now, the histogram on the
right consists of more than two impulses, indicating a range of possible sample values
around the one and zero levels. The samples affected by ISI vary with time and
can be misinterpreted by the receiver.
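The two cases described above can be reproduced with a small behavioural sketch. It is only an illustration under stated assumptions: the channel is modeled as a one-pole low-pass filter rather than the measured backplane, and the names `nrz_through_lowpass`, `distinct_levels`, and the parameter `bw_over_baud` (3-dB bandwidth divided by the baud rate) are hypothetical and do not come from this thesis.

```python
import math
import random

def nrz_through_lowpass(bits, samples_per_ui=32, bw_over_baud=0.3):
    """Pass an ideal +/-1 NRZ waveform through a one-pole low-pass filter.

    bw_over_baud is an illustrative parameter (assumed, not from the thesis).
    Returns the filter output observed near the centre of every UI.
    """
    # Discrete-time one-pole filter: y[n] = y[n-1] + a * (x[n] - y[n-1])
    a = 1.0 - math.exp(-2.0 * math.pi * bw_over_baud / samples_per_ui)
    y = 0.0
    centre_samples = []
    for bit in bits:
        x = 1.0 if bit else -1.0
        for n in range(samples_per_ui):
            y += a * (x - y)
            if n == samples_per_ui // 2:
                centre_samples.append(y)
    return centre_samples

def distinct_levels(samples, decimals=3):
    """Sorted set of sample values after rounding (a crude histogram support)."""
    return sorted({round(s, decimals) for s in samples})

random.seed(0)
bits = [random.randint(0, 1) for _ in range(2000)]
wide = nrz_through_lowpass(bits, bw_over_baud=10.0)   # nearly ideal channel
narrow = nrz_through_lowpass(bits, bw_over_baud=0.2)  # band-limited channel
print("levels, wide channel:  ", len(distinct_levels(wide)))
print("levels, narrow channel:", len(distinct_levels(narrow)))
```

With the wide channel the samples collapse onto two levels, while the band-limited channel spreads them over many levels because each sample also depends on earlier bits. A causal one-pole model only captures interference from preceding bits, so it understates the ISI of a real backplane.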
The narrower the channel bandwidth is, the further each pulse spreads in the time
domain, and the more severely ISI affects the transmitted waveform. The waveform
will also fail to reach the full levels of zero and one due to the destructive interference
from the neighbouring bits. These effects, if not compensated for, result in erroneous
data detection and increased BER of the receiver. The most common practice to
compensate for ISI is to utilize equalization to flatten the combined frequency response
of the channel and the equalizer. We will see this in more detail in Section 2.4.
Figure 2.2: The effect of the limited channel bandwidth on the ideal NRZ signal. (a) Output of a channel with infinite bandwidth. (b) Output of a channel with finite bandwidth. Both panels plot voltage (V) versus time (ns).
2.1.3 Binary versus ADC-based Receivers
The task of a receiver is to correctly detect the transmitted signal. Ideally, the high-
speed serial data stream would be transmitted across the channel in parallel with the
corresponding clock signal. Non-idealities of the communication channels, however,
distort the data and the clock differently to the extent that they will no longer be
phase-aligned. Furthermore, the increased link cost for carrying the clock signal makes
this receiver topology undesirable. Consequently, in today’s high-speed transceivers,
the clock is not transmitted along with the data and the task of the clock recovery is
solely left to the receiver.
Generally, receivers are categorized as binary or ADC-based according
to their front-end sampler. The more traditional receivers are binary,
as they use a flip-flop at the front-end to sample the incoming signal. The binary
sample carries the sign information of the data while discarding the magnitude
information. This requires the necessary compensation to be performed before the
sampling. Usually, an analog equalizer that precedes the binary sampler takes care
of this compensation.
ADC-based receivers, however, sample the received signal with an ADC. Each
sample is now represented with a set of more than one bit, preserving both the sign
and the magnitude information. The extra information about the received signal
enables the ADC-based receivers to employ more complex equalization in the digital
domain in addition to the analog equalization. As the advancement in wire-line
channels lags the rapid increase in the data-rate, the need for intensive compensation
of the channel impairments grows. Accordingly, ADC-based receivers which allow
for a higher degree of equalization are a promising solution for wire-line multi-Gb/s
transceivers above 20 Gb/s [13].
2.2 ADC-based Receivers
As discussed in the previous section, ADC-based receivers offer extensive channel
loss compensation in the digital domain as they use more than one bit to represent each
sample. ADC-based receivers can be categorized based on their sampling topology
and sampling rate. By sampling topology, as will be discussed in Section 2.2.1, the
receivers can be grouped into phase-tracking and blind. Based on the sampling rate,
as will be discussed in Section 2.2.2, the receivers can be classified as baud-rate
sampling (i.e. 1x) or twice-the-baud-rate sampling (i.e. 2x).
2.2.1 Phase Tracking versus Blind Sampling
The more well-known ADC-based receivers, known as the phase-tracking receivers,
align the sampling clock of the front-end to the phase of the incoming signal via
internal feedback [6, 7, 8, 9]. These receivers, shown in Fig. 2.3(a), recover the embedded
clock in the incoming signal by a phase recovery unit that drives a DAC to generate
a control signal for the clock generator unit. The clock generator unit, which is either
a voltage-controlled oscillator (VCO) or a phase interpolator (PI), is an analog block
that produces a sampling clock in phase with the input signal. The design of such
analog blocks in multi-Gb/s signaling is the main disadvantage of the phase-tracking
architectures since they are sensitive to noise, prevent fast production-level testing,
and do not easily port to new technologies.
Figure 2.3: ADC-based receiver architectures. (a) Phase-tracking ADC-based receiver architecture (ADC, phase recovery unit, DAC, clock generator, decision block). (b) Feed-forward (blind) ADC-based receiver architecture (ADC driven by a blind sampling clock, phase detector producing φinst, filter producing φave, decision block).
In an attempt to design an all-digital receiver, blind ADC-based architectures have
been introduced [4, 5]. In these architectures, shown in Fig. 2.3(b), a blind sampling clock
(i.e. a clock signal with no phase relation with respect to the data) samples the
incoming data. The newly introduced digital blocks, a phase detector and a digital
filter, replace the undesirable analog VCO/PI blocks in the phase-tracking receiver.
As the input is sampled blindly, there is no need for a feedback loop to recover the
clock. This is why these architectures are also known as feed-forward ADC-based
receivers.
2.2.2 1x versus 2x Sampling Rate
ADC-based receivers are also classified based on the number of samples per unit
interval (UI) taken from the received signal. The most common sampling rates are
twice per UI (i.e. 2x) or once per UI (i.e. 1x). Either a phase-tracking or a blind
ADC-based receiver can sample at the baud rate or at twice the baud rate,
resulting in a total of four possible architectures: 1x phase-tracking, 2x phase-
tracking, 1x blind, and 2x blind.
As explained in the previous section, phase-tracking receivers depend on a feedback
loop to align the sampling clock to the received signal. This feedback loop often
demands considerable design and verification resources. One solution to simplify
the design is to remove the feedback phase recovery loop and investigate the blind
architectures. The motivation to go from the 2x to the 1x sampling rate is to relax
the ADC conversion rate, allowing for increased baud rate. The 1x blind ADC-based
receivers, however, are not feasible since in the worst case, the samples can fall on
the zero-crossings of the input and make the task of the data recovery impossible. In
the next section, 2x blind ADC-based receivers are discussed in more detail.
2.2.3 2x blind ADC-based Receiver
A block diagram of a blind-sampling ADC-based receiver [4, 5] is presented in Fig. 2.3(b).
A blind clock samples the received signal twice per UI. The phase detector uses
these samples to approximate the instantaneous zero-crossing phase, φinst, of each data
transition. The value of φinst is further filtered to generate the average
phase, φave. Both phase values along with the digital samples are sent to the decision
block for data recovery. In this section, we explain the functionality of the phase
detector and the decision block to provide the context for the remaining chapters.
Fig. 2.4 illustrates the task of the phase detector which relies on the linear inter-
polation between the two consecutive samples of opposite signs. If X and Y are the
two samples, the instantaneous phase of X (i.e. φinst) can be derived by the similar
triangles theorem as shown in equation 2.1. Xamp and Yamp are the absolute values
of the amplitudes of X and Y respectively.
\[
\phi_{inst} = \left( \frac{0.5\, X_{amp}}{X_{amp} + Y_{amp}} \right) \mathrm{UI} \tag{2.1}
\]
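As a minimal sketch of Eq. 2.1, the linear interpolation can be written as a short function. The function name `instantaneous_phase` and the convention of returning `None` when the two samples do not straddle zero are assumptions made for illustration; the thesis does not specify how the no-transition case is signalled.

```python
def instantaneous_phase(x, y):
    """Estimate the zero-crossing position between two consecutive 2x samples.

    x and y are consecutive samples taken 0.5 UI apart. The result is
    phi_inst in UI, measured from the instant of sample x, following Eq. 2.1.
    Returns None when the samples have the same sign (no data transition).
    """
    x_amp, y_amp = abs(x), abs(y)
    if x * y > 0 or (x_amp + y_amp) == 0:
        return None  # no sign change, so there is nothing to interpolate
    return 0.5 * x_amp / (x_amp + y_amp)

# Equal magnitudes place the crossing midway between the two samples.
print(instantaneous_phase(1.0, -1.0))
# A larger |x| pushes the crossing towards y.
print(instantaneous_phase(3.0, -1.0))
```

The estimate is exact only for a waveform that is linear between the two samples; for a band-limited signal it is an approximation.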
The filter that follows the phase detector subtracts the current value of φave
from φinst to generate a phase error (φer) for every UI. Similar to the phase-tracking
receiver, φer is low-pass filtered to recover φave. The decision block uses
φinst, φave, and the values of 3 consecutive samples to extract the data. First,
the average eye-center phase, known as the data-picking phase, φpick, is calculated by
adding 0.5UI to φave. Second, the values of φinst and φpick are compared, and
the sample closest to φpick or furthest from φinst is chosen as the decided data
value. Fig. 2.5 illustrates this data decision scheme.
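The phase filtering and the data-picking phase described above can be sketched as follows. The first-order update with a fixed gain, the wrap-around of phases to a single UI, and all names (wrap, track_phase, pick_phase, gain) are assumptions made for illustration; the text only states that φer is low-pass filtered. Phases are expressed in UI and are assumed to be referenced to a common origin.

```python
def wrap(phase_ui):
    """Wrap a phase difference into the range [-0.5, 0.5) UI."""
    return (phase_ui + 0.5) % 1.0 - 0.5


def track_phase(phi_inst_seq, gain=0.125, phi_ave0=0.0):
    """First-order low-pass tracking of phi_ave (assumed filter structure)."""
    phi_ave = phi_ave0
    history = []
    for phi_inst in phi_inst_seq:
        phi_er = wrap(phi_inst - phi_ave)           # phase error for this UI
        phi_ave = (phi_ave + gain * phi_er) % 1.0   # accumulate a fraction of it
        history.append(phi_ave)
    return history


def pick_phase(phi_ave):
    """Data-picking phase: the average eye centre, 0.5 UI after phi_ave."""
    return (phi_ave + 0.5) % 1.0


history = track_phase([0.25] * 200)
print(history[-1], pick_phase(history[-1]))
```

With a constant φinst the tracked φave settles to that value, and φpick lands half a UI away from the average zero-crossing.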
Figure 2.4: Extraction of the instantaneous phase (φinst). [Diagram labels: samples X and Y, 0.5UI apart on the time axis t, with amplitudes Xamp and Yamp, and φinst.]
Figure 2.5: Data decision scheme. [Two cases with samples A, B, C and phases φinst and φpick: picked data = A in one case and picked data = B in the other.]
The next section discusses the limitations of the phase recovery scheme used in 2x
blind ADC-based receivers. This serves as a necessary background to motivate our
proposed design in this work.
2.3 Aliasing
A typical signal processing system samples and digitizes the incoming analog signal,
performs the necessary digital signal processing, and converts the final output back
to the continuous domain to interface with the analog world. The sampling theo-
rem [14] states that a continuous signal, g(t), strictly band-limited to B Hz, can be
reconstructed from its samples only if the sampling frequency is more than 2B Hz.
Otherwise, the sampling process will not be reversible.
Figure 2.6: An example where the sampling theorem is satisfied. [Spectra: G(f), limited to ±B; Gs(f), with replicas around multiples of 2B; and the low-pass filter H(f) with cutoff B.]
Suppose that g(t) with a frequency spectrum, G(f), is sampled at 2B Hz. The
frequency spectrum of the sampled signal, Gs(f), would be the sum of the replications
of the G(f) around the integer multiples of 2B Hz, as shown in Fig. 2.6. The low-pass
filter, H(f), with cutoff frequency of B can be used to recover the original signal, g(t).
The sampling theorem relies on the assumption that g(t) is strictly band-limited.
No practical signal, however, is strictly band-limited, with the result that under-
sampling always occurs to some degree. This phenomenon, known as aliasing, refers
to the overlap of the frequency content as highlighted in Fig. 2.7. Once the signal
is aliased, it is impossible to differentiate between the frequencies in band and out of
band.
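A short numerical illustration of this ambiguity is given below. It is a sketch with assumed values (a 10 GS/s sampling rate and tones at 3 GHz and 7 GHz) and an illustrative helper name, sample_cos; it shows that a tone at f and a tone at fs − f produce identical samples.

```python
import math


def sample_cos(freq_hz, fs_hz, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]


fs = 10e9                               # assumed sampling rate
in_band = sample_cos(3e9, fs, 16)       # below fs/2
out_of_band = sample_cos(7e9, fs, 16)   # fs - 3 GHz, above fs/2

# The two sample sequences agree to within floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(in_band, out_of_band))
print(max_diff)
```

Once sampled, the 7 GHz tone is indistinguishable from the 3 GHz tone, which is why the out-of-band content has to be attenuated before the sampler.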
To combat the effects of aliasing, low-pass anti-aliasing filters are placed prior to
the sampler to attenuate the higher frequency content of the signal. Although ADC-
based receivers recover one bit per UI and not the actual transmitted pulse waveform,
being a sampled system, they need to deal with aliasing. The next section discusses
the aliasing issues specific to the design of the 2x blind ADC-based receivers.
Figure 2.7: An example where the sampling theorem is violated (Aliasing). [Spectra: G(f) and Gs(f), with overlapping replicas.]
2.3.1 Necessity for an Anti-Aliasing Filter
The phase recovery scheme employed in 2x blind ADC-based receivers [4, 5] was
described in Section 2.2.3. The scheme relies on the linear interpolation between
the two consecutive samples of opposite sign. This interpolation leads to erroneous
estimation of the zero crossings and reduced jitter tolerance if the received signal
contains sharp transitions. Fig. 2.8 compares the results of the interpolation on an
ideal signal against a filtered one. Linear interpolation gives a far better estimation
of the zero-crossings when the signal is filtered as opposed to when it is not.
The 2x blind ADC-based receivers reported so far [4, 5] leave the task of anti-aliasing
to the communication channel. Therefore, they cannot be applied to
standards where the backplane trace can be as short as a few centimeters, as in PCIe.
To expand the applications of the 2x blind ADC-based receivers, it is important to
incorporate an anti-aliasing filter at the front-end to reduce aliasing and improve the
jitter tolerance of the receiver.
2.3.2 Previous Work
To date, no anti-aliasing filter has been incorporated in the design of the 2x blind
ADC-based receivers. Anti-aliasing filters, however, are used in almost every elec-
tronic circuit. In audio systems, they are used for preamplification, equalization, and
tone control. Communication systems use them for tuning to specific frequencies. In
digital signal processing, the filters avoid the aliasing of the out-of-band noise and
interference. These systems primarily utilize low-pass filters prior to the ADC to
eliminate the undesired aliased information in the signal path.
Figure 2.8: Zero-crossing estimations based on the linear interpolation: (a) ideal signal, (b) filtered signal. [Axes: voltage (V) versus time (ns).]
Op-amp RC filters [15, 16] are attractive anti-aliasing solutions as they offer low
noise and high dynamic range. While feedback is mainly responsible for these desired
features, it limits the bandwidth of the filters. Gm-C (transconductance-C) filters
[17, 18] are more suited for high-frequency performance as they eliminate the feedback.
Section 3.1 discusses the different types of anti-aliasing filters in more detail.
2.4 Equalization
As explained in Section 2.1, the signal that travels from the transmitter to the receiver
is distorted by the non-idealities of the channel. Limited channel bandwidth disperses
the current UI in the time-domain such that it interferes with its neighbouring UIs.
The most common way to cancel ISI is to use equalization to make the cascade of the
channel and the equalizer have a flat frequency response, as shown in Fig. 2.9.
Figure 2.9: Equalization in frequency domain. [Channel response × equalizer response = combined channel and equalizer response, each plotted versus f.]
Equalization can be performed on the transmitter side [19, 20, 21], the receiver
side [5, 22, 23], or both [9, 24, 25]. In the remainder of this section we focus on the
receiver-side implementations of the equalization.
2.4.1 Feed-Forward Equalization (FFE)
Equalization [26] at the receiver can be performed in a linear or a non-linear manner.
The former has a feed-forward architecture while the latter uses feedback,
as shown in Fig. 2.10. Feed-forward equalizers (FFE) do not have a feedback path and
can be implemented either as a continuous-time linear equalizer (CTLE) or a finite
impulse response (FIR) filter. A common way to implement a CTLE is a differential
pair with source degeneration consisting of a capacitor in parallel with a resistor [27].
The capacitor becomes a short at high frequencies, which increases the gain and
compensates for the channel losses. Fig. 2.11 shows a generic schematic of this approach
along with the corresponding frequency response. This approach is limited by the
gain-bandwidth product of the source-coupled differential pair and, if designed well,
can provide 4-6 dB of gain per stage at 10 Gb/s in 90-nm CMOS technology [28].
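The peaking behaviour of such a stage can be illustrated with a textbook small-signal model. The sketch below is an assumption-laden example, not the circuit of [27] or [28]: it uses the half-circuit of a degenerated differential pair (degeneration resistor r_s and capacitor c_s connected between the two sources), ignores the output pole, and uses hypothetical component values.

```python
import math


def ctle_gain(freq_hz, gm, r_d, r_s, c_s):
    """Small-signal gain of an RC-degenerated differential pair (half-circuit).

    The half-circuit sees r_s/2 in parallel with 2*c_s at the source, so the
    gain is gm*r_d / (1 + gm*Z_s). Load capacitance is ignored.
    """
    s = 2j * math.pi * freq_hz
    z_s = (r_s / 2) / (1 + s * r_s * c_s)
    return gm * r_d / (1 + gm * z_s)


def db(x):
    return 20 * math.log10(abs(x))


# Hypothetical values: gm = 10 mS, r_d = 200 ohm, r_s = 200 ohm, c_s = 200 fF.
params = dict(gm=10e-3, r_d=200.0, r_s=200.0, c_s=200e-15)
for f in (0.0, 1e9, 5e9, 20e9):
    print(f, round(db(ctle_gain(f, **params)), 2))
```

With these assumed values gm·r_s/2 = 1, so the gain rises from gm·r_d/2 at DC towards gm·r_d at high frequencies, an ideal boost of a factor of 2 (about 6 dB) before the output pole, omitted here, takes over.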
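Figure 2.10: Feed-forward and feedback equalization. [Block diagram labels: input data, FFE (feed-forward), summing node with an FIR filter (feedback), recovered data.]
Figure 2.11: A generic implementation of CTLE. [Schematic with input Vi and its high-frequency (HF) equivalent; plot of |H(f)| versus f.]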
A 2-tap FIR implementation of the FFE is shown in Fig. 2.12. In this realization,
equalization is achieved by subtracting a fraction of the previously sampled data from
the current sample. This fraction (α2/α1) can be adjusted to obtain the required
high-frequency boost to counteract the channel attenuation. For severe cases of ISI,
the FIR filter can be generalized to have more taps to account for the disturbance of
more neighbouring bits. Analog FIR filters are often not a suitable choice since the
analog delay cells exhibit sensitivity to noise and process-voltage-temperature (PVT)
variations. ADC-based receivers, on the other hand, facilitate the use of digital
FIR filters as FFE. In digital FIR filters, equalization is performed on the digital
representations of the incoming data.
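The 2-tap operation described above can be sketched in a few lines. The function names, the tap values, and the choice of a one-UI tap delay are illustrative assumptions; the sketch subtracts α2 times the previous sample from α1 times the current one and evaluates the resulting frequency response.

```python
import cmath
import math


def ffe_2tap(samples, a1, a2):
    """y[n] = a1*x[n] - a2*x[n-1], with x[-1] taken as 0."""
    out, prev = [], 0.0
    for x in samples:
        out.append(a1 * x - a2 * prev)
        prev = x
    return out


def ffe_gain_db(freq_hz, delay_s, a1, a2):
    """Magnitude of H(f) = a1 - a2*exp(-j*2*pi*f*delay) in dB."""
    h = a1 - a2 * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20 * math.log10(abs(h))


a1, a2 = 1.0, 0.25        # hypothetical tap weights
delay = 100e-12           # assumed tap delay: one UI at 10 Gb/s
print(ffe_2tap([1, 1, 1, -1], a1, a2))
boost_db = ffe_gain_db(1 / (2 * delay), delay, a1, a2) - ffe_gain_db(0.0, delay, a1, a2)
print(boost_db)
```

The gain is α1 − α2 at DC and α1 + α2 at a frequency of 1/(2·delay), so the boost is (α1 + α2)/(α1 − α2); for the assumed weights this is about 4.4 dB, and it grows as α2/α1 increases.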
Figure 2.12: FIR implementation of FFE. [Block diagram labels: input data, chain of delay cells, tap weights α1, α2, α3, ..., αn, summing node (Σ) with subtraction, equalized data; the 2-tap and n-tap portions are marked.]
As mentioned above, linear equalization amplifies the high-frequency content of the
data; this also amplifies the high-frequency noise which reduces the signal-to-noise
ratio (SNR) and increases the BER. The noise-enhancement problem can be avoided
by employing non-linear equalization such as the decision feedback equalizer (DFE)
presented in Fig. 2.10. In this technique, the decided bit drives the equalization,
eliminating the noise from the received signal. In an ADC-based receiver, on the
other hand, DFE enhances the quantization noise introduced by the ADC. Therefore,
it is a good idea to employ both an analog FFE and a digital FFE or DFE in ADC-
based receivers to minimize BER degradations due to both the high-frequency and
quantization noise.
2.4.2 Previous Work
ADC-based receivers are potential solutions for data-rates above 20 Gb/s [13] since
they enable complex equalization to cancel the deteriorating channel effects. The
2x blind ADC-based receiver presented in [5] employs a linear analog equalizer prior
to the ADC plus a digital FFE following the ADC. The analog equalizer provides
a nominal gain of 6 dB at 2.5 GHz by using an RC-degenerated differential pair
designed in 65-nm CMOS. The digital FFE, implemented as a half-a-UI-spaced 2-
tap FIR filter, further equalizes the digital signal. The combined analog and digital
equalization is capable of compensating 15 dB of signal loss caused by the cable.
Later chapters explain our proposed architecture, which achieves almost the same
boosting effect with an all-analog implementation. This is useful when the ADC
quantization noise becomes a limiting factor in the design; this occurs when the ADC
resolution is chosen to be low for power-saving purposes.
2.5 Summary
This chapter provided an introduction to wire-line communication systems, explaining
the roles of a receiver, a transmitter, and an equalizer. The signal degradations due to
the communication channel were studied and the methods to counteract these effects
were presented. The ADC-based receiver architecture was introduced as a potential
solution for data-rates of 20 Gb/s and above. It was explained that the current ADC-
based receivers lack anti-aliasing filters at the front-end. In the following chapters,
we present our solution to this problem, expanding the applications of the ADC-based
receivers to standards that support multiple data-rates and a variety of channels.
3 Anti-Aliasing Filter (AAF) Design
Anti-aliasing filters are widely used in today’s data acquisition systems. These types
of systems consist of a front-end sampler, an ADC, and digital-signal processor (DSP)
circuitry. The anti-aliasing filter, which is placed prior to the sampler, ensures that
the input signal does not contain frequencies higher than half the sampling rate. This
guarantees the reconstruction of the input signal. In 2x blind ADC-based receivers,
aliasing prevents the accurate phase recovery of the input. As a result, it is impor-
tant to investigate solutions to prevent aliasing in these receivers and expand their
operating input frequency range.
This chapter presents the design of the anti-aliasing filter (AAF) portion of the
proposed analog front-end (AFE). Section 3.1 studies various ways to implement
the desired AAF. Section 3.2 selects the integration and dump (I&D) scheme as
the desired solution and presents the proposed architecture. Section 3.3 discusses
the design methodology for the I&D filter along with the behavioural modeling and
simulation results. Section 3.4 describes the circuit implementation of the I&D filter
followed by the simulation results. Section 3.5 concludes this chapter.
3.1 Anti-Aliasing Filters
As discussed in Section 2.3, in order for 2x blind ADC-based receivers to accommodate
data-rates ranging from 2-10 Gb/s, it is crucial that the AFE design includes an AAF
whose bandwidth adjusts with the data-rate. This section investigates the use of
active RC filters, Gm-C filters, and I&D filters as possible candidates for the desired
AAF. I&D is selected since it offers easy bandwidth programmability controlled by
the data-rate.
3.1.1 Active RC Filters
Active RC filters have largely been used to implement an AAF in data-acquisition
systems. Operational amplifiers, resistors, and capacitors constitute the building
blocks of such filters [29, 30]. While operational amplifiers provide voltage gain and
high dynamic range, they are bandwidth limited and therefore not suitable for high-
speed (Gb/s) systems. To combat this disadvantage, research continues to explore
more circuit techniques to design op-amps with large bandwidth.
A CMOS AAF with RC feedback is presented in [15] that achieves a maximum pole
frequency of 500 MHz. The op-amp utilized in this design (shown in Fig. 3.1) consists
of three main stages with three local common-mode feedbacks and two feed-forward
stages for compensation. Although comparatively fast amongst
active RC filters, the design suffers from a low phase margin of 20°.
The bandwidth of an active RC filter can be tuned by digitally
selecting a number of parallel resistors or capacitors. However, there is no well-known
approach to adjust the bandwidth according to the data-rate. For this reason, and due
to the low-frequency operation of the op-amps, active RC filters are not suitable for
the AFE targeting high-speed ADC-based receivers.
Figure 3.1: Active RC filter presented in [15]. [Schematic with differential inputs In+/In- and outputs Out+/Out-.]
3.1.2 Gm-C Filters
Gm-C filters (transconductance-C filters) offer higher bandwidth than their active
RC counterparts [29, 30]. In Gm-C filters a differential input voltage is converted to
current by the transconductance cell and integrated on a capacitor. Therefore, the
key to designing fast filters of this type is to use fast transconductors.
A transconductor circuit with a 3-dB bandwidth of 900 MHz for the design of Gm-
C filters is presented in [17]. The circuit consists of a fixed transconductor cascaded
with a variable gain cell (shown in Fig. 3.2). This topology takes advantage of the
current-mode signal processing to increase the operational bandwidth. Tunability of
the bandwidth is achieved by the variable gain stage.
Although Gm-C filters offer higher bandwidth compared to the active RC filters,
they are still unsuitable choices for ADC-based receivers covering 2-10 Gb/s. More-
over, as was the case with active RC filters, Gm-C filters are also unable to provide
bandwidth scalability with the data-rate.
Figure 3.2: Gm-C filter presented in [17]. [Block diagram labels: input Vi, Gm cell, variable gain cell, output current io, output voltage Vo.]
3.1.3 Integration and Dump (I&D) Filters
I&D is a well-known scheme for optimum detection in digital communication [14].
The input waveform is integrated for one full period, sampled, and reset before the
integration of the next bit commences. While I&D improves performance by averaging
the noise and therefore lowering the bit error rate (BER), it introduces the following
problems to high-speed systems. One is that the integration must be exactly phase-
aligned to the input data so that the entire unit interval (UI) is integrated. Another
issue is that the integration result is required to be reset immediately before the
next integration starts. This, however, can be relaxed by using time-interleaved
architectures [31].
For a linear time-invariant (LTI) system with an impulse response of h(t), the
input-output relationship can be expressed as:
$$y(t) = \int_{-\infty}^{\infty} x(\tau)\,h(t-\tau)\,d\tau \tag{3.1}$$
Assuming that h(t) is a rectangular filter of pulse-width equal to UI seconds, we
can write:
$$y(t) = \int_{t-\mathrm{UI}}^{t} x(\tau)\,d\tau \tag{3.2}$$
y(t) is sampled at the maximum eye-opening, which happens at the end of each UI
integration. The resulting samples can be formulated by the relationship 3.3:
$$y(n\,\mathrm{UI}) = \int_{(n-1)\mathrm{UI}}^{n\,\mathrm{UI}} x(\tau)\,d\tau, \qquad n = 1, 2, \ldots \tag{3.3}$$
For a specific input x(t), shown in Fig. 3.3(a), the outputs of the rectangular filter
and the I&D are presented in Fig. 3.3(b) and Fig. 3.3(c), respectively. Although
the two output waveforms are different, their corresponding output samples are the
same. This result shows that the I&D is a practical implementation of a rectangular
filter [26].
The above analysis shows that an I&D scheme can be modeled as an LTI system
with a rectangular impulse response of width equal to the duration of the integration.
This model, shown in Fig. 3.4, is followed by its corresponding impulse/frequency
response in Fig. 3.5. In the frequency domain, the magnitude response of the I&D filter
is a sinc function with nulls at integer multiples of the data-rate (fb = 1/UI) and a
3-dB bandwidth equal to 0.443fb [26].
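The 0.443fb figure can be checked numerically from the sinc-shaped magnitude response of a rectangular impulse response. The sketch below is an illustration only; the function names and the bisection search are assumptions, and the response is normalized to its DC value.

```python
import math


def iand_mag(freq_hz, t_int_s):
    """Normalized |H(f)| of a rectangular impulse response of width t_int_s."""
    x = math.pi * freq_hz * t_int_s
    return 1.0 if x == 0 else abs(math.sin(x) / x)


def bandwidth_3db(t_int_s):
    """Bisection for the frequency where |H(f)| drops to 1/sqrt(2).

    Between DC and the first null at 1/t_int_s the response falls
    monotonically from 1 to 0, so the crossing is unique.
    """
    lo, hi = 0.0, 1.0 / t_int_s
    target = 1 / math.sqrt(2)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if iand_mag(mid, t_int_s) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


fb = 10e9                 # data-rate, so UI = 1/fb
ui = 1 / fb
print(bandwidth_3db(ui) / fb)   # ratio of the 3-dB bandwidth to fb
print(iand_mag(fb, ui))         # response at the first null
```

For an integration time of one UI the ratio evaluates to approximately 0.443, and the response vanishes at fb, in agreement with the statement above.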
Figure 3.3: Comparison between output samples of a rectangular filter and an I&D filter: (a) input (x(t)) to both systems (rectangular filter and I&D), (b) output of the rectangular filter, (c) output of the I&D. [Axes: voltage (V) versus time, marked from UI to 8UI.]
Figure 3.4: LTI Model of an I&D filter. [Labels: Vi(t), h(t), sampling at Ts = (n)UI, n = 1, 2, ..., output Vo(n).]
Figure 3.5: I&D Response: (a) I&D impulse response, h(t) versus time (s), marked from UI to 5UI; (b) I&D frequency response, |H(f)| versus frequency (Hz), marked from fb to 5fb.
3.2 Proposed I&D Scheme
Fig. 3.6 presents two examples of the previous works where I&D was utilized as
the front-end sampler for serializer/deserializer (SerDes) applications. Both of these
architectures take advantage of the I&D to filter the high-frequency noise and improve
the signal-to-noise ratio (SNR). However, integrating the entire bit period in these
two architectures necessitates a clock recovery unit to accurately align the phase of
the input to the integrating clock. In [32], the clock signal and data are sent from
the transmitter to the receiver. Although feasible at data-rates as low as 700 Mb/s,
this method is not applicable to today’s multi-Gb/s signaling. The reason is that the
skew introduced due to the losses of the channel is large enough to disturb the phase
relation between the input and the clock. To resolve this, [33] eliminates the clock
wire and recovers the clock from the received signal by a clock recovery unit. The
recovered clock drives the I&D which is phase-aligned to the input signal.
The I&D scheme has also been employed in the design of a decision feedback equalizer
(DFE), as presented in [34]. This receiver enjoys the power saving offered by this
scheme as opposed to standard DFE summing amplifiers. The I&D, which is phase-aligned
to the input, integrates the signal for an entire UI. The anti-aliasing feature
of the I&D is undesired in this design since it closes the output eye. In a
later design [35], a similar group of authors employed a sample-and-hold (S&H) to avoid
the loss introduced by the I&D. As discussed in Chapter 2, our goal, in contrast, is to
make use of the anti-aliasing feature of the I&D rather than to suppress it.
Figure 3.6: Previous I&D works in the front-end: (a) receiver using I&D in the front-end [32] (labels: TX, RX, data, clk, DLL/PLL, I&D); (b) receiver using I&D in the front-end [33] (labels: TX, RX, data, clk, FF, clk recovery unit, I&D).
Since we target 2x blind ADC-based receivers, the I&D clock no longer needs
to be phase-aligned to the input data. Therefore, the feedback loop from the clock
recovery unit to the AFE is eliminated. Furthermore, we take advantage of the
bandwidth programmability of the I&D to cover data-rates in the range of 2-10 Gb/s.
Fig. 3.7 presents the proposed AFE architecture for 2x blind ADC-based receivers.
This filter blindly integrates the incoming signal for 0.5UI. A 4-way time-interleaved
architecture is employed to relax the speed requirements of the reset switches. Each
interleaved branch is followed by a 4-b half-rate ADC that quantizes the AAF output
samples. These digital samples are further DeMUXed and sent to the 2x blind ADC-
based CDR [5]. The ADC clock has a certain phase relationship with respect to the
AAF/FFE clock which is discussed in Section 4.3.3.
Figure 3.7: Proposed AFE for a 2x blind ADC-based receiver including AAF. [Block diagram labels: fb = 2-10 Gb/s input; AAF and AAF Ck Gen (this work); 4-bit fb/2 GS/s ADC and ADC Ck Gen; DeMUX outputs of 16 for fb <= 5 Gb/s and 32 for fb > 5 Gb/s; 2x blind ADC-based CDR; DOUT.]
The impulse/frequency responses of an I&D are plotted in Fig. 3.8. This figure
shows that the I&D bandwidth scales inversely with the integration duration (Ti),
and therefore linearly with the data-rate.
As described in Section 2.2.3, the phase (zero-crossing) of the incoming signal in 2x
ADC-based receivers is derived by a linear interpolation between the two consecutive
opposite samples. In order to see the effectiveness of the I&D as an AAF, Fig. 3.9
plots the phase error (φerror) versus the initial integration time (Tini). φerror is defined
as the difference between the interpolated phase and the actual input phase. Fig. 3.9
shows that the I&D scheme reduces this error by about 21 %.
Figure 3.8: Bandwidth scalability of the I&D scheme: (a) I&D impulse response, h(t) versus time (ps), for various integration times; (b) I&D frequency response, |H(f)| in dB versus frequency (GHz), for data-rates of 2, 5, and 10 Gb/s.
Figure 3.9: φerror plot for phase extraction with/without I&D. [Axes: φerror (%UI) versus Tini (%UI); curves: without I&D, with I&D.]
It is beneficial to study the effect of the blind clock on the I&D scheme. To do
so, Tini is modified and the output samples from the rectangular filter and the I&D
are compared. Fig. 3.10 shows the input used to simulate the different values of Tini.
Figures 3.11, 3.12, and 3.13 show the results for Tini = 0, Tini = 0.125UI, and Tini =
0.25UI, respectively. One important observation is that, regardless of Tini, the
output samples from both the rectangular filter and I&D systems are always the same.
As a result, blindness of the clock with respect to the incoming data does not affect
the frequency response of the I&D filter.
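This observation can be reproduced with a small discrete-time experiment. The sketch below is not the Simulink model used in this work; the bit pattern, the time step, the 0.5UI window, and all function names are assumptions. It compares a sliding-window (rectangular) filter with an integrate-and-reset accumulator started at different offsets Tini.

```python
def nrz(bits, samples_per_ui):
    """Ideal NRZ waveform: +1 for a one, -1 for a zero."""
    return [1.0 if b else -1.0 for b in bits for _ in range(samples_per_ui)]


def rect_filter(x, window, dt):
    """Sliding-window integral: y[i] = dt * sum(x[i-window+1 .. i])."""
    csum = [0.0]
    for v in x:
        csum.append(csum[-1] + v)
    return [dt * (csum[i + 1] - csum[max(0, i + 1 - window)]) for i in range(len(x))]


def integrate_and_dump(x, window, dt, start):
    """Integrate `window` samples, record the result, reset, and repeat."""
    dumps = []                    # (index of last integrated sample, integral)
    acc, count = 0.0, 0
    for i in range(start, len(x)):
        acc += x[i] * dt
        count += 1
        if count == window:
            dumps.append((i, acc))
            acc, count = 0.0, 0
    return dumps


def max_mismatch(start, samples_per_ui=40, ui=100e-12):
    """Largest difference between I&D dumps and the filter output at the same instants."""
    dt = ui / samples_per_ui
    window = samples_per_ui // 2              # 0.5UI integration window
    x = nrz([1, 0, 1, 1, 0, 0, 1, 0], samples_per_ui)
    y = rect_filter(x, window, dt)
    dumps = integrate_and_dump(x, window, dt, start)
    return max(abs(value - y[i]) for i, value in dumps)


for start in (0, 5, 10):                      # Tini = 0, 0.125UI, 0.25UI
    print(start, max_mismatch(start))
```

For every offset the dumped values coincide with the rectangular-filter output at the dump instants, up to floating-point rounding, so the offset only selects which samples of the filtered waveform are taken.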
Figure 3.10: Input data used for simulations with various Tini. [Axes: voltage (V) versus time (ps).]
Figure 3.11: Tini = 0: (a) samples from the rectangular filter, (b) samples from I&D.
Figure 3.12: Tini = 12.5% UI = 12.5ps: (a) samples from the rectangular filter, (b) samples from I&D.
Figure 3.13: Tini = 25% UI = 25ps: (a) samples from the rectangular filter, (b) samples from I&D.
3.3 I&D Design Methodology and Modeling
To initiate the design process, a linear model of the I&D filter was incorporated with
the rest of the front-end components of a 2x blind ADC-based receiver as shown in
Fig. 3.7. While this model serves as an approximation, it provides insights about the
combination of the proposed I&D filter and the receiver front-end.
3.3.1 Design Methodology
A linear and event-driven [36] model of the I&D filter was built with Matlab’s Simulink
tool [37]. Implemented as a 4-way time-interleaved architecture, the proposed filter
was integrated with four 4-b ADCs, each followed by a DeMUX to constitute a com-
plete front-end for 2x blind ADC-based receivers. This model was used to analyze
the time-domain behaviour and verify the frequency response of the system. Once
the functional verification in Simulink was completed, the design was transferred to
transistor-level implementations in Cadence.
3.3.2 Behavioural Modeling
Due to the unavailability of a 2x blind ADC-based CDR at the time, a verification
methodology was proposed to evaluate the front-end as a stand-alone block without
the CDR. The components were modeled to ensure that the front-end can be measured
and verified independently. Simulink models for the I&D, 4-b ADC, and DeMUX were
created.
The following procedure was used to extract the frequency response of the front-end.
As the system is discrete, straightforward AC simulations are not able to generate the
frequency response. To resolve this issue, sinusoidal inputs at various frequencies were
provided. At each frequency, an eye-diagram of the DeMUXed output samples was
generated. The amplitudes of the resulting eye-diagrams were converted to dB values
and plotted versus their corresponding frequencies to obtain the frequency response.
As sinusoidal inputs were utilized for evaluation purposes, the relation 3.4 was
used to model the I&D topology in Simulink. The variable, tcur, refers to the current
time of the simulation. When tcur is triggered, equation 3.4 generates the result of
the sine integration of frequency (w rad/s) over the range of (tcur - 0.5UI) to tcur.
Final results of the I&D are sampled and quantized by the 4-b ADCs and sent to
the DeMUXes. The DeMUXed output samples are employed to construct an eye-diagram,
based on which the frequency response is obtained. The next section presents
the frequency response plots generated by the Simulink behavioural simulation.
$$\int_{t_{cur}-0.5\mathrm{UI}}^{t_{cur}} \sin(wt)\,dt = \frac{1}{w}\Big(\cos\big(w(t_{cur}-0.5\mathrm{UI})\big) - \cos(w\,t_{cur})\Big) \tag{3.4}$$
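The extraction procedure built on equation 3.4 can be sketched outside Simulink as follows. This is a simplified illustration under stated assumptions: the 4-b quantization and the DeMUX are omitted, the output is sampled every 0.5UI, the eye amplitude is taken as the largest sample magnitude, and the result is normalized to the DC gain of 0.5UI. The function names are hypothetical.

```python
import math


def iand_sine(t_cur, w, ui):
    """Closed-form integral of sin(w*t) over [t_cur - 0.5*ui, t_cur] (eq. 3.4)."""
    return (math.cos(w * (t_cur - 0.5 * ui)) - math.cos(w * t_cur)) / w


def response_db(freq_hz, ui, n_samples=4000):
    """Peak I&D output for a unit-amplitude sine, relative to the DC gain, in dB."""
    w = 2 * math.pi * freq_hz
    # One I&D result every 0.5UI (2x sampling); the peak stands in for the
    # amplitude read from the eye-diagram.
    peak = max(abs(iand_sine(n * 0.5 * ui, w, ui)) for n in range(1, n_samples + 1))
    return 20 * math.log10(peak / (0.5 * ui))


ui = 100e-12                                  # 10 Gb/s
for f in (0.5e9, 1.2345e9, 4.321e9, 7.321e9):
    print(f, round(response_db(f, ui), 3))
```

Input frequencies that are simple fractions of the sampling rate revisit only a few phases of the sinusoid, so a sweep of this kind is best run at frequencies that are not commensurate with the sampling clock.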
3.3.3 Behavioural Simulation Results
The Simulink behavioural model outlined in the previous section was simulated for
three data-rates of 2, 5, and 10 Gb/s. The simulated frequency responses are provided
in Fig. 3.14. This figure shows that the bandwidth of the I&D filter scales with the
data-rate. Table 3.1, provided in Section 3.4.2, summarizes the simulated bandwidths
derived from Simulink simulations and circuit simulations.
Figure 3.14: Frequency response of the I&D filter from Simulink simulations. [|H(f)| in dB versus frequency (GHz) for fb = 2, 5, and 10 Gb/s.]
3.4 Implementation
Once the I&D scheme was verified by the Simulink behavioural model, circuit schematics
and layout were designed in Fujitsu’s 65-nm CMOS design-kit. This section presents
the circuit implementation of the I&D scheme, circuit simulation results, and
comparisons with the Simulink model.
3.4.1 Circuit Design
The I&D filter is implemented as a clocked Gm-C filter with 4 outputs that are
interleaved in time. A system block diagram of the implementation is presented in
Fig. 3.15. Each output node goes through the four phases of integration (Int), hold
(Hld), dump (Rst), and idle (Idle). For example, when SC0 turns on, the current is
steered into the front-most block, integrated on the corresponding CL, and generates
a differential voltage at Vo1. During SC1 this voltage is held constant to be sampled
by the following ADC. Next, SC2 activates the dump operation and resets Vo1 to
zero. Finally, SC3 defines the idle state which will be replaced by a more important
phase as will be explained in Chapter 4.
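The rotation of the four phases across the interleaved outputs can be summarized with a small scheduling sketch. The text above defines the sequence for Vo1 only; the assumption made here is that each subsequent branch follows the same sequence delayed by one 0.5UI slot. The names PHASES and branch_phase are illustrative.

```python
PHASES = ("Int", "Hld", "Rst", "Idle")


def branch_phase(branch, slot):
    """Phase of interleaved branch 0..3 (Vo1..Vo4) during clock slot `slot`.

    Slot k corresponds to pulse SC(k mod 4); each slot lasts 0.5UI.
    Branch b is assumed to lag branch 0 by b slots.
    """
    return PHASES[(slot - branch) % 4]


for slot in range(4):
    print(f"SC{slot}:", [branch_phase(b, slot) for b in range(4)])
```

Under this assumption exactly one branch integrates in every slot, and each branch produces one held sample every four slots (2UI), which matches the fb/2 conversion rate of the ADC that follows each branch.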
Figure 3.15: AAF system block diagram. [Labels: input Vi, Gm cell, clocks CKMI and CKr, load capacitors CL, outputs Vo1-Vo4, pulses SC0-SC3 of 0.5UI each, and the Int, Hld, Rst, and Idle phases of each output.]
To implement the 4-way time-interleaved I&D filter, two different architectures were
considered as shown in Fig. 3.17. Pulses that drive the 4-way time-interleaved I&D
system, SC0-SC3, are shown in Fig. 3.16. Section 4.3.3 explains the circuitry that
was used in order to generate the desired pulses to operate the 4-way time-interleaved
I&D circuitry. The output capacitances, CLs, model the input capacitances of the
4-b ADCs that follow the I&D system.
Fig. 3.17(a) shows a topology referred to as the source-switching architecture. It
consists of a differential input pair with source degeneration. The current steering
switches are located at the source of the input devices. In this configuration, the
input transistors are included in each interleaved branch, which increases the input
capacitance of the overall system. Moreover, any abrupt changes in the input are
coupled to the output nodes through the gate-to-drain capacitances of the input
transistors.
On the other hand, Fig. 3.17(b), which is referred to as the cascode-switching
architecture, is also a differential input pair with source degeneration except that
the current steering switches are placed at the drain of the input devices. In this
configuration, the source-degenerated input differential pair is shared between the
four interleaved branches, reducing the input capacitance and the overall area of the
system. In addition, the change in the incoming signal is no longer coupled to the
output nodes and hence the output nodes will not be disturbed while they are being
held constant to be sampled by the ADC.
As the above discussion suggests, the cascode-switching architecture was selected
as the better design for the 4-way time-interleaved I&D system. This is the design
that was used for simulations as presented in the following section.
Figure 3.16: Pulses that drive the 4-way time-interleaved I&D system. [Waveforms of SC0, SC1, SC2, and SC3.]
Figure 3.17: Architectures considered for the I&D filter design: (a) I&D filter implementation as a source-switching architecture, (b) I&D filter implementation as a cascode-switching architecture. [Schematic labels in both panels: input Vi, bias current Ib, switches SC0-SC3, outputs Vo1-Vo4, load capacitors CL, reset cell.]
3.4.2 Circuit Simulation Results
The design of the AAF was implemented using Fujitsu’s 65-nm CMOS process. The
device models do not support fast or slow corners nor do they model transistors of gate
lengths larger than 100 nm. These models, targeting RF frequencies (2.5-60 GHz),
offer a close match between the pre-layout and post-layout simulations as they include
parasitic capacitances and resistances due to the metal, contacts, and vias. This
feature was very helpful as extracted simulations were also not supported. Layout
techniques such as using dummy gates at each side of the transistor and maintaining
symmetry in the design were employed to reduce mismatches. All simulations were
performed on the typical device models at 40 °C.
Fig. 3.18 presents the frequency responses that were obtained from Cadence simulations.
The resulting bandwidths are compared against their Simulink-simulated
values in Table 3.1. The circuit-simulation bandwidths differ from the Simulink ones by
16.9-19.6 %.
Figure 3.18: Frequency response of I&D filter generated from circuit simulations. [|H(f)| in dB versus frequency (GHz) for fb = 2, 5, and 10 Gb/s.]
Table 3.1: Simulated 3-dB bandwidths of I&D filter.
Data-rate        2 Gb/s       5 Gb/s       10 Gb/s
From Simulink    2.076 GHz    5.191 GHz    10.371 GHz
From Cadence     2.427 GHz    6.068 GHz    8.34 GHz
Error            16.9 %       16.9 %       19.6 %
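The error row of Table 3.1 can be reproduced from the two bandwidth rows. The short check below assumes that the error is the magnitude of the difference relative to the Simulink value; the variable and function names are illustrative.

```python
# 3-dB bandwidths from Table 3.1, in GHz, keyed by data-rate in Gb/s.
simulink_ghz = {2: 2.076, 5: 5.191, 10: 10.371}
cadence_ghz = {2: 2.427, 5: 6.068, 10: 8.34}


def percent_error(reference, value):
    """Magnitude of the deviation of `value` from `reference`, in percent."""
    return abs(value - reference) / reference * 100


errors = {
    rate: round(percent_error(simulink_ghz[rate], cadence_ghz[rate]), 1)
    for rate in simulink_ghz
}
print(errors)
```

Under this assumption the computed values round to 16.9 %, 16.9 %, and 19.6 %. At 2 and 5 Gb/s the circuit bandwidth is higher than the Simulink value, whereas at 10 Gb/s it is lower.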
3.5 Summary
This chapter investigated the design of an AAF that is suitable for 2x blind ADC-
based receivers. Active-RC filters, Gm-C filters, and I&D filters were considered as
possible solutions. I&D proved to be the better architecture as it offers the highest
bandwidth that is easily adjustable with the data-rate. Furthermore, I&D design
methodology, behavioural modeling, implementation, and simulation results were
presented. The next chapter discusses the modifications to the proposed AAF to
complete the AFE design for ADC-based receivers.
4 Analog Front-End Design (AFE)
Current 2x blind ADC-based receivers [4, 5] sample the received signal at twice the
baud rate. If the two samples have opposite signs, they are linearly interpolated
to estimate the input zero-crossings. This interpolation is valid only if the input
signal does not contain frequencies above the baud rate, which is half of the sampling
frequency. Otherwise, aliasing occurs which leads to erroneous estimations of the
zero-crossings. To date, the 2x ADC-based receivers relied on the channel to perform
anti-aliasing.
On the other hand, if the channel bandwidth is less than 60% of the data-rate,
equalization is required [26]. The receivers presented in [4, 5] employ a 2-tap digital
FFE. The receiver in [5] also uses an RC-degenerated differential pair as an analog
equalizer to obtain 6 dB boost at half the data-rate. The digital FFE which is located
after the ADCs has the disadvantage of enhancing the ADC quantization noise.
To address both the anti-aliasing problem and the quantization noise enhancement of
digital equalization, we propose a new AFE, shown in Fig. 4.1. The front-end
consists of a combined anti-aliasing filter and 2-tap FFE (AAF/FFE), four
half-rate time-interleaved ADCs, and a DeMUX. The AFE outputs are then sent to the
CDR for clock and data recovery.
This chapter is organized as follows: Section 4.1 presents the analog front-end
(AFE) architecture. Section 4.2 discusses the design methodology. Section 4.3 de-
scribes the circuit implementation of the AFE and the required clock generation
circuitry. Simulation results of the clock circuitry and the AFE are presented in
Section 4.4. A summary is provided in Section 4.5.
4.1 AFE Architecture
The contribution of our work, highlighted in Fig. 4.1, is an AAF/FFE block placed
at the front of the AFE. The idea is to program the AAF/FFE bandwidth according
to the data-rate. This avoids the aliasing that occurs when the channel is unable to
bandlimit the input. On the other hand, to accommodate channels of higher loss,
the FFE coefficients are adjusted to enhance the higher-frequency content of the
received signal.
[Block diagram. Labels: AAF/FFE and Ck Gen (this work), 4-bit fb/2-GS/s ADCs, 2x blind ADC-based CDR, DOUT; fb = 2-10 Gb/s; 16 for fb <= 5 Gb/s, 32 for fb > 5 Gb/s; 2-phase fb clock, /2 divider, 4-phase (fb/2)Δ clock with Δ = 0.25UI; clock pulses SC0-SC3.]
Figure 4.1: A 2x blind ADC-based receiver with the proposed AFE.
The proposed AAF/FFE architecture is illustrated in Fig. 4.2. Two transconductor
cells (i.e. Gm1 and Gm2) are employed in this architecture to construct the main and
post-cursor taps. The Gm cells convert the input voltage to an output current, which
is integrated on CL during two consecutive clock pulses. For example, observing the
node Vo1 in Fig. 4.2, when SC3 turns on, io2 is integrated on CL, creating an output
voltage. Next, when SC0 is activated, io1 is integrated on CL, adding to the previous
output voltage with opposite polarity. During SC1, this result is held constant and
sampled by the following ADC. Finally, SC2 activates the reset phase. As shown in
Fig. 4.1, the ADC sampling clock is required to be phase-aligned to the blind AAF/FFE
clock. The ideal phase difference is a quarter of a UI, as explained in Section
4.3.3.
The remaining output nodes undergo the same four phases of post-cursor tap
integration (PI), main tap integration (MI), hold (Hld), and dump (Rst). Since Gm1 is
turned on after Gm2, it adds the result of the integration of the current data to
that of the previous data. Therefore, Gm1 and Gm2 are referred to as the main tap
and the post-cursor tap, respectively. This topology avoids the use of analog delay
cells, which are often sensitive to PVT variation and require delay calibration. Various
FFE boosting levels can be achieved by adjusting the transconductances of Gm1 and
Gm2.
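The PI/MI sequence can be illustrated with a small discrete-time sketch in Python. It is a simplification under stated assumptions: the transconductors are ideal, the factor Gm·T/CL of each tap is folded into the dimensionless gains g1 and g2, and the input is represented by its average over each 0.5-UI window. The function name and all values are hypothetical.

```python
def aaf_ffe_outputs(window_averages, g1, g2):
    """Combine consecutive 0.5-UI integration windows as the two-tap FFE does.

    window_averages[k] is the input averaged over the k-th 0.5-UI window.
    Each output adds the main-tap integration of the current window (gain g1)
    to the opposite-polarity post-cursor integration of the previous window
    (gain g2).
    """
    outputs = []
    for k in range(1, len(window_averages)):
        post_cursor = -g2 * window_averages[k - 1]  # PI phase, previous window
        main = g1 * window_averages[k]              # MI phase, current window
        outputs.append(main + post_cursor)          # value held during Hld
    return outputs

constant = aaf_ffe_outputs([1.0, 1.0, 1.0, 1.0], g1=1.0, g2=0.5)
alternating = aaf_ffe_outputs([1.0, -1.0, 1.0, -1.0], g1=1.0, g2=0.5)
print(constant)     # [0.5, 0.5, 0.5]
print(alternating)  # [-1.5, 1.5, -1.5]
```

A constant input is scaled by g1 - g2 while an input that flips sign between windows is scaled by g1 + g2, which is the low-frequency de-emphasis that produces the FFE boost.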
[Block diagram: Vi drives Gm1 (main tap, output current io1) and Gm2 (post-cursor tap, io2); clocks CKMI, CKPI, and CKr control integration and reset on CL at the outputs Vo1-Vo4. Timing: each of SC0-SC3 lasts 0.5UI, and each output cycles through the PI, MI, Hld, and Rst phases.]
Figure 4.2: AAF/FFE system block diagram.
4.2 AFE Design Methodology and Modeling
This section explains the methodology that was followed to design the AFE. The
linearized integration and dump (I&D) model introduced in Section 3.3 is expanded
to allow for equalization in addition to anti-aliasing. The behavioural model along
with the simulation results in this section confirm the operation of the AFE.
4.2.1 Design Methodology
An LTI model of the AFE was constructed with Matlab’s Simulink tool to analyze
the proposed architecture. The LTI model reveals information about the poles and
zeros of the AFE system. Once verified in Matlab simulations, the design was ported
to a transistor-level Cadence implementation. To speed up the initial stages of the
design, programmable Verilog-A functional descriptions [38] were used to generate the
clocking required to operate the AFE system. The actual clock generation
circuitry replaced the Verilog-A models once the design was finalized.
4.2.2 Behavioural Modeling
As mentioned in Section 3.1, the I&D can be modeled as an LTI system followed by
sampling. The impulse response of such a system is a rectangular pulse whose width
equals the integration duration. The integration time is 0.5UI since the received
signal is sampled at twice the baud rate.
Fig. 4.3 presents an LTI model of the AFE architecture that was shown in Fig.
4.2. The main tap is modeled as a gain stage (G1) cascaded with a rectangular filter
(hID(t)) of magnitude one and pulse width 0.5UI. The post-cursor tap is modeled
in the same way except with a different gain stage (G2); it also includes a delay cell
(δ(t − 0.5UI)) that delays the incoming signal by 0.5UI. The output of the post-cursor
tap is subtracted from that of the main tap to generate Vo(t). Vo(t) is then sampled to
construct the output samples of the 4-way time-interleaved architecture presented in
Fig. 4.2.
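[Block diagram: Vi(t) feeds the main tap (G1, hID(t)) and the post-cursor tap (δ(t − 0.5UI), G2, hID(t)); the post-cursor output is subtracted from the main-tap output to form Vo(t). Vo(t) is sampled at Ts = Tini + (2n)UI, Tini + (2n + 0.5)UI, Tini + (2n + 1)UI, and Tini + (2n + 1.5)UI, with n = 0, 1, 2, ..., to produce the four outputs Vo1[n]-Vo4[n].]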
Figure 4.3: AFE behavioural modeling.
4.2.3 Behavioural Simulation Results
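The transfer function of the AFE, H(f), can be written as in equations 4.1 and 4.2.
\[ H_{ID}(f) = \frac{1}{2f_b} \times \mathrm{sinc}\left(\frac{f}{2f_b}\right) \times e^{-j\pi f/(2f_b)} \tag{4.1} \]
\[ H(f) = \frac{V_o(f)}{V_i(f)} = H_{ID}(f)\left(G_1 - G_2 \times e^{-j\pi f/f_b}\right) \tag{4.2} \]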
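Equations 4.1 and 4.2 can be evaluated numerically to see how the tap ratio sets the boost. The Python sketch below is a hedged illustration: it assumes the normalized sinc, sin(πx)/(πx), and it defines the boost as the gain at fb/2 relative to DC, which is a choice made for this example. The function names are hypothetical, and the numbers describe the ideal model only, not the simulated circuit.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def h_id(f, fb):
    """Equation 4.1: integrate-and-dump response for a 0.5-UI window."""
    return sinc(f / (2 * fb)) / (2 * fb) * cmath.exp(-1j * math.pi * f / (2 * fb))

def h_afe(f, fb, g1, g2):
    """Equation 4.2: I&D response times the two-tap FFE term."""
    return h_id(f, fb) * (g1 - g2 * cmath.exp(-1j * math.pi * f / fb))

def boost_db(fb, g1, g2):
    """Gain at fb/2 relative to DC, in dB (requires g2 < g1)."""
    return 20 * math.log10(abs(h_afe(fb / 2, fb, g1, g2)) / abs(h_afe(0.0, fb, g1, g2)))

fb = 10e9  # 10 Gb/s
for alpha in (0.0, 0.3, 0.6):
    print(alpha, round(boost_db(fb, 1.0, alpha), 2))
```

In this model the boost grows with α = G2/G1 because the DC gain falls as G1 − G2 while the gain at fb/2 follows the magnitude of G1 + jG2.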
Fig. 4.4 plots the zero-pole map of H(f). Zeros on the imaginary axis are
introduced by the sinc function, which results from the I&D scheme. The FFE, on
the other hand, introduces a new zero on the real axis at 2fb ln(α), where
α is the ratio of G2 to G1. As α increases from its minimum value of zero to a
maximum of 1, this zero moves towards the origin, thereby de-emphasizing the
lower frequencies to a greater extent.
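[Zero-pole map (axes σ and ω): imaginary-axis zeros at ±2fb and ±4fb; real-axis zero at 2fb ln(α), which moves toward the origin as α goes from 0 to 1.]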
Figure 4.4: Zero-pole map of the proposed AFE.
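The location of the real-axis zero can be checked directly from Equation 4.2. The short derivation below is a sketch that assumes the substitution s = j2πf, so that the 0.5-UI delay term becomes e^{-s/(2f_b)}, and that keeps only the real solution.

```latex
\begin{aligned}
G_1 - G_2\, e^{-s/(2 f_b)} &= 0 \\
e^{-s/(2 f_b)} &= \frac{G_1}{G_2} = \frac{1}{\alpha} \\
s &= 2 f_b \ln(\alpha)
\end{aligned}
```

For 0 < α < 1, ln(α) is negative, so the zero lies in the left half-plane and approaches the origin as α approaches 1.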
4.3 AFE Implementation
This section presents the circuit implementations of various components of the AFE
including the AAF/FFE, the reset cell, and the clock generation circuitry. The design
was implemented in Fujitsu’s 65-nm CMOS process and operates from a 1.2-V power
supply. The simulation results are presented in Section 4.4.
4.3.1 Cascode-Switching Architecture
As explained in Section 3.4, the AAF is implemented as a clocked Gm-C filter. The pros
and cons of two implementations, namely source-switching and cascode-switching,
were discussed there. The latter was selected for the final design as it eliminates
the coupling from input to output while offering smaller area.
Fig. 4.5 is an expansion of Fig. 3.17(b) that incorporates the analog FFE into the
front-end architecture. A second transconductor (Gm2), implemented as a differential
input pair with source degeneration, is added to the original design with opposite
polarity. This new differential pair is responsible for subtracting a fraction of the
previous integration result from the current one. This fraction, which controls
the amount of equalization, is set through Ib1 and Ib2,
which bias the main-tap and post-cursor-tap transconductors, respectively.
[Schematic: main-tap and post-cursor-tap differential pairs, biased by Ib1 and Ib2, share the input Vi; switches clocked by SC0-SC3 connect them to the outputs Vo1-Vo4, each loaded by CL and a reset cell.]
Figure 4.5: Cascode-switching implementation.
4.3.2 Reset Cell Architecture
The reset operation is an important part of the proposed AFE since both the
anti-aliasing and the feed-forward equalization depend on the ability to reset the output to
the desired value. This sets an upper bound on the maximum speed. During the
reset phase, which lasts 0.5UI, the output terminals are pulled up to Vdd to discharge
the output capacitances (CLs). The proposed reset cell, presented in Fig. 4.6,
employs three PMOS transistors (M1-M3) to charge the output nodes to Vdd. The
transistors are sized such that, at the maximum speed (10 Gb/s), the RC
time-constant allows the output nodes to reach within 1% of Vdd. This time-constant
is formed by the PMOS on-resistance and the total output capacitance. M3 equalizes
both terminals of the differential output voltage (Vo1) during the reset phase so that
the following integration phase starts with identical voltage levels on both terminals.
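The 1% settling requirement translates into a bound on the reset time-constant. The estimate below is a rough sketch, assuming single-pole RC settling and, as a worst case, an initial deviation from Vdd equal to Vdd itself; neither assumption is stated in the text.

```latex
\begin{aligned}
V_{dd}\, e^{-t_{rst}/\tau} &\le 0.01\, V_{dd} \\
\tau &\le \frac{t_{rst}}{\ln(100)} \approx \frac{t_{rst}}{4.6} \\
t_{rst} = 0.5\,\mathrm{UI} = 50\ \mathrm{ps}\ \text{at}\ 10\ \mathrm{Gb/s}
  \;&\Rightarrow\; \tau \lesssim 10.9\ \mathrm{ps}
\end{aligned}
```

A smaller initial deviation relaxes this bound, so the figure should be read as a conservative guide for sizing the PMOS on-resistance against the total output capacitance.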
To further increase the precision of the reset operation, the effects of charge injection
from transistors M1-M3 need to be suppressed when they turn off. Otherwise, they
introduce differential voltage errors comparable to the LSB of the ADC, corrupting the
integration results. Dummy transistors (M4 & M5) are responsible for canceling both
the charge injection from the channel and the clock feed-through errors [29]. When
M1-M3 turn off, half of the channel charge of M3 plus half of the channel charge of
M1 or M2 is injected into M4 or M5, respectively. Therefore, M4 and M5 are designed
to have the same size as M1-M3 to more effectively suppress charge injection effects.
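[Schematic: PMOS reset transistors M1-M3 and dummy transistors M4 and M5, clocked by SC2 and its complement, connected to the differential output Vo1 and its load capacitors CL.]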
Figure 4.6: Reset cell implementation.
4.3.3 Clock Generation Design
To characterize the AFE system proposed for 2-10 Gb/s, we designed clock generation
circuitry that produces 4 half-rate pulses (SC0-SC3) with 25% duty cycle. These
pulses drive the AFE by defining the four phases of operation, namely PI, MI, Hld, and
Rst (described in Section 4.1). As Fig. 4.7 suggests, these pulses can be produced
by performing the necessary logic on the 4-phase (0°, 90°, 180°, 270°) half-rate (fb/2)
waveforms. As a result, the design was divided into two stages: the first stage
(half-rate CMOS clock generator) generates 4-phase half-rate output waveforms from
the external 2-phase full-rate inputs; the second stage (CMOS logic) takes the 4-phase
half-rate waveforms, performs the necessary logic operations on them, and produces
the desired half-rate pulses.
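One way to see how 25% duty-cycle pulses follow from four 50% duty-cycle phases is to AND adjacent phases. The Python sketch below uses ideal clocks and plain AND operations; it is an assumption made for illustration, since the actual gates are NAND/NOR and the assignment of each product to SC0-SC3 is not specified here. The function names are hypothetical.

```python
def half_rate_clock(t_ui, phase_deg):
    """Ideal 50% duty-cycle clock at fb/2 (period 2 UI); True when high."""
    shifted = (t_ui - phase_deg / 360.0 * 2.0) % 2.0
    return shifted < 1.0

def quarter_pulses(t_ui):
    """Four 25% duty-cycle pulses formed by ANDing adjacent clock phases."""
    c0, c90, c180, c270 = (half_rate_clock(t_ui, p) for p in (0, 90, 180, 270))
    return (c270 and c0, c0 and c90, c90 and c180, c180 and c270)

# Sample each 0.5-UI slot at its centre; exactly one pulse is high per slot.
for t in (0.25, 0.75, 1.25, 1.75):
    print(t, quarter_pulses(t))
```

Each product is high for 0.5UI out of the 2-UI period, and the four products tile the period without overlap, which matches the four-phase operation described above.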
The goal of this design was to drive the AFE with pulses that have rise and fall
times (tr/tf) equal to 10 %UI and whose positive and negative pulses cross at Vdd/2.
With the AFE design finalized, the load of the ‘CMOS logic’ stage was known. As
shown in Fig. 4.8, this load is driven by a set of NAND/NOR gates followed by two
buffers, which are sized to output inverted/non-inverted pulses satisfying
the specification. Inverted pulses are required since the reset cell consists of the PMOS
transistors described in the previous section. The duty-cycle of the CMOS pulses is
corrected by the cross-coupled inverters.
[Block diagram: the 2-phase full-rate inputs (Fb at 0° and 180°) drive the half-rate CMOS clock generator, whose 4-phase half-rate outputs (Fb/2 at 0°, 90°, 180°, and 270°) drive the CMOS logic that produces SC0-SC3.]
Figure 4.7: Clock generation block diagram.
[Schematic: the four half-rate phases (Fb/2 at 0°, 90°, 180°, and 270°) drive NAND/NOR gates followed by Inverter Chain 2, producing SC0-SC3 and their complements.]
Figure 4.8: CMOS logic implementation.
The 4-phase half-rate inputs to the ‘CMOS logic’ stage are generated by the
‘half-rate CMOS clock generator’ block shown in Fig. 4.9. This block divides the
frequency of the external full-rate differential clock by 2 via a current mode logic (CML)
divider, which is described later. The half-rate CML clocks are then converted to
CMOS levels by CML-to-CMOS converters to reduce area: smaller transistors
can be used with CMOS signaling to achieve the same speed as with CML
signaling. The reason is that speed is proportional to the product of the transistor
width and the gate signal swing. Therefore, with a higher clock swing, we can afford
to use smaller transistors without limiting the speed.
[Block diagram: the full-rate differential clock (Fb at 0° and 180°) drives the CML divider (/2); CML-to-CMOS converters and Inverter Chain 1 then deliver the four half-rate phases (Fb/2 at 0°, 90°, 180°, and 270°).]
Figure 4.9: Half-rate CMOS clock generator implementation.
A chain of buffers and back-to-back inverters (see Fig. 4.9) boosts the driving
capability and corrects the duty-cycles of the CML-to-CMOS converter outputs [39].
The fan-out used in designing the buffers is in the range of 1-2 to maximize
performance. The simulation results presented in Section 4.4.1 illustrate the
overall performance of the design with reduced duty-cycle distortion (DCD).
The CML divider, shown in Fig. 4.10, consists of a CML flip-flop connected in
negative feedback. The in-phase and quadrature-phase outputs of both latches in the
flip-flop are buffered to drive the same load and reduce clock skew. The buffers also
help clean the latch outputs without using large devices inside the latch. For speed
considerations, when fully switched, the transistors are biased as close to 0.3 mA/μm
as allowed by the voltage head-room to keep them in saturation. This corresponds to
the peak-fT current density of nMOSFETs [40]. The combination of the CML flip-flop
and buffers provides the above-mentioned CML-to-CMOS converters with 500-mVpp
single-ended voltage swings.
As a final note to this section, we explain why the ADC clock needs to be
phase-aligned to the AAF/FFE clock. The ideal value for this phase difference is
0.25UI, as shown in Fig. 4.12. The nodes shown in this timing diagram correspond
to those in Fig. 4.11, which repeats the receiver block diagram for convenience. A
quarter-UI phase shift between the 2-phase full-rate (fb) inputs to the AAF/FFE and
the ADC ensures that the edges of the ADC sampling clock ((fb/2)Δ) fall at the
mid-points of the hold phases for accurate sampling. The next section verifies our
design choices by showing the corresponding simulated waveforms.
[Schematic: two latches (L) forming a flip-flop, with data input d, clock input ck, and output q.]
Figure 4.10: CML divider implementation.
[Block diagram. Labels: AAF/FFE and Ck Gen (this work), 4-bit fb/2-GS/s ADCs, 2x blind ADC-based CDR, DOUT; fb = 2-10 Gb/s; 16 for fb <= 5 Gb/s, 32 for fb > 5 Gb/s; 2-phase fb clock, /2 divider, 4-phase (fb/2)Δ clock with Δ = 0.25UI; clock pulses SC0-SC3.]
Figure 4.11: A 2x blind ADC-based receiver with the proposed AFE.
[Timing diagram: data bits D1-D4 (one UI each); full-rate clock fb; pulses SC3, SC2, SC0, and SC1, each 0.5UI wide; half-rate clock fb/2; and the ADC clock (fb/2)Δ, shifted by 0.25UI.]
Figure 4.12: Timing diagram.
4.4 Circuit Simulation Results
This section presents the simulation results of the complete AFE. Section 4.4.1 is
dedicated to the time-domain simulations of the clock generation circuitry. In Section
4.4.2, we verify the functionality of the complete AFE by looking at both the time-
domain output waveforms and the resulting frequency responses.
4.4.1 Clock Generation
Figures 4.14, 4.15, and 4.16 show the simulated waveforms in the clock generation
circuitry at data-rates of 10, 5, and 2 Gb/s, respectively. These waveforms, which
correspond to the nodes shown in Fig. 4.13, are the external input clocks (A/Ā),
the outputs of the CML divider (B/B̄), the CML-to-CMOS converter (C/C̄),
Inverter Chain 1 (D/D̄), the NAND/NOR stage (E/Ē), and finally the outputs of
the CMOS logic stage (F/F̄). The F/F̄ waveforms, which are the final product
of the clock generation circuitry, drive the AFE circuitry.
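[Block diagram: node pairs A/Ā (external inputs), B/B̄ (CML divider outputs), C/C̄ (CML-to-CMOS converter outputs), D/D̄ (Inverter Chain 1 outputs), E/Ē (NAND/NOR outputs), and F/F̄ (Inverter Chain 2 outputs); the NAND/NOR stage and Inverter Chain 2 form the CMOS logic.]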
Figure 4.13: Simulated nodes in clock generation circuitry.
Table 4.1 summarizes the results for tr, tf, and pulse-width (PW) of the F/F̄
waveforms for data-rates of 10, 5, and 2 Gb/s. The cross voltage, which refers to the
voltage where the negative and positive pulses intersect, is also shown in this table.
We had to over-design for the lower data-rates since we wanted to ensure functionality at
10 Gb/s. At 10 Gb/s, the tr, tf, and PW are 8.69, 8.61, and 50.36 ps, respectively.
The results show that we meet our specification, as the pulses have rise/fall times of
less than 10 %UI and a duty cycle of about 25.15 %.
Table 4.1: Simulated results of the clock generation circuitry.
Data-rate   2 Gb/s   5 Gb/s   10 Gb/s
tr (ps)     8.81     8.82     8.69
tf (ps)     8.71     8.7      8.61
PW (ps)     250.8    100.8    50.36
Cross (V)   0.64     0.66     0.65
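The entries of Table 4.1 can be related to the specification with a short calculation. The Python sketch below assumes that the duty cycle is the pulse width divided by the half-rate period (2 UI); the helper name is hypothetical. Under that assumption the three duty cycles average to about 25.15 %.

```python
# Values from Table 4.1: rise time, fall time, and pulse width, all in ps.
results = {
    2: {"tr": 8.81, "tf": 8.71, "pw": 250.8},
    5: {"tr": 8.82, "tf": 8.7, "pw": 100.8},
    10: {"tr": 8.69, "tf": 8.61, "pw": 50.36},
}

def summarize(rate_gbps, row):
    """Express edge times in %UI and pulse width as a % of the half-rate period."""
    ui_ps = 1000.0 / rate_gbps   # one unit interval in ps
    period_ps = 2.0 * ui_ps      # SC0-SC3 repeat at fb/2 (assumption)
    return {
        "tr_pct_ui": 100.0 * row["tr"] / ui_ps,
        "tf_pct_ui": 100.0 * row["tf"] / ui_ps,
        "duty_pct": 100.0 * row["pw"] / period_ps,
    }

for rate, row in results.items():
    print(rate, {k: round(v, 2) for k, v in summarize(rate, row).items()})
```

Because the edge times stay near 8.7 ps at every data-rate, the 10 %UI limit is tightest at 10 Gb/s and is met with increasing margin at 5 and 2 Gb/s.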
[Plots: voltage (0 to 1.5 V) versus time (100 to 500 ps) at node pairs A through F.]
Figure 4.14: Simulated clock generation waveforms at 10 Gb/s.
[Plots: voltage (0 to 1.5 V) versus time (100 to 1000 ps) at node pairs A through F.]
Figure 4.15: Simulated clock generation waveforms at 5 Gb/s.
[Plots: voltage (0 to 1.5 V) versus time (100 to 2500 ps) at node pairs A through F.]
Figure 4.16: Simulated clock generation waveforms at 2 Gb/s.
4.4.2 Analog Front-End (AFE)
Once the circuit verification of the clock generation was complete, we replaced the
Verilog-A models in the AFE design with the actual circuitry. The complete system was
simulated at data-rates of 10, 5, and 2 Gb/s and the frequency responses were derived
as shown in Fig. 4.18, Fig. 4.19, and Fig. 4.20, respectively.
At each data-rate, three sets of Ib1 and Ib2 were used to obtain 3 boosting levels of
about 0, 5, and 11 dB. Ib1 and Ib2, which control the bias currents of the main tap and
the post-cursor tap, adjust the coefficient of the FFE. At each of these bias settings,
7 different input frequencies were applied to the AAF/FFE. Fig. 4.17 presents an
example of a differential input to the AAF/FFE along with the resulting differential
output waveform at each data-rate.
The outputs of the AAF/FFE were further processed by the Simulink models of
4 time-interleaved 4-b ADCs and DeMUXes. The DeMUX under-sampled the 10-,
5-, and 2-Gb/s data by factors of 32, 16, and 16, respectively. The reason for the higher
DeMUX level at 10 Gb/s was to ensure testability with the available measurement
equipment. The digital DeMUXed samples taken from one DeMUX output were
used to construct an output eye. The amplitude of this eye was converted to dB
and plotted versus the corresponding frequency to generate the frequency responses
shown in Fig. 4.18, Fig. 4.19, and Fig. 4.20.
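The conversion from eye amplitude to a frequency-response point can be sketched as follows. This Python example is illustrative only: the amplitudes are hypothetical, and normalizing to the lowest-frequency point is an assumption, since the reference level used for the plots is not stated.

```python
import math

def response_db(eye_amplitudes, reference=None):
    """Convert eye amplitudes to dB relative to a reference amplitude.

    `reference` defaults to the first (lowest-frequency) amplitude; that
    normalization is an assumption made for this sketch.
    """
    if reference is None:
        reference = eye_amplitudes[0]
    return [20.0 * math.log10(a / reference) for a in eye_amplitudes]

# Hypothetical eye amplitudes (in ADC codes) for increasing tone frequency.
freqs_ghz = [1.0, 2.0, 5.0]
amplitudes = [10.0, 10.0, 5.0]
for f, db in zip(freqs_ghz, response_db(amplitudes)):
    print(f, round(db, 2))
```

With a 4-b ADC the eye amplitude is quantized to a small number of codes, so points obtained this way carry a correspondingly coarse dB resolution at low amplitudes.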
Table 4.2 summarizes the simulation results. By adjusting the bias currents (Ib1 and Ib2)
as shown in the table, we can achieve 0-11.3 dB of boost at 10 Gb/s, 0-14 dB of boost
at 5 Gb/s, and 0-12.7 dB of boost at 2 Gb/s. When the FFE is off (i.e. Ib2 = 0),
the AAF has a 3-dB bandwidth of 2.427, 6.068, and 8.34 GHz at 2, 5, and 10 Gb/s,
respectively. Simulations show that the frequency response scales with the data-rate
as expected. In the next chapter, we compare these simulation results against the
measurements.
Table 4.2: Simulated results of AFE.
Data-rate       2 Gb/s            5 Gb/s            10 Gb/s
Ib1 (μA)        404   317   247   724   436   272   1300  600   320
Ib2 (μA)        0     152   211   0     161   204   0     170   210
3-dB BW (GHz)   2.427             6.068             8.34
Boosting (dB)   0     5.4   12.7  0     4.4   14    0     5.4   11.3
[Plots: differential input and output voltages (-0.5 to 0.5 V) versus time.]
(a) Data-rate = 10 Gb/s. Input frequency = 1 GHz.
(b) Data-rate = 5 Gb/s. Input frequency = 500 MHz.
(c) Data-rate = 2 Gb/s. Input frequency = 200 MHz.
Figure 4.17: Simulated time-domain waveforms of the AFE.
[Plot: frequency response (dB, -25 to 0) versus frequency (GHz, log scale) for the three Ib1/Ib2 bias settings.]
Figure 4.18: Simulated frequency response at 10 Gb/s.
[Plot: frequency response (dB, -25 to 0) versus frequency (GHz, log scale) for the three Ib1/Ib2 bias settings.]
Figure 4.19: Simulated frequency response at 5 Gb/s.
[Plot: frequency response (dB, -25 to 0) versus frequency (GHz, log scale) for the three Ib1/Ib2 bias settings.]
Figure 4.20: Simulated frequency response at 2 Gb/s.
4.5 Summary
This chapter presented the architecture proposed for a combined AAF and FFE
targeting 2x blind ADC-based receivers. The design methodology was explained and
the final simulation results were presented. The plots of the simulated frequency
responses at each data-rate of 10, 5, and 2 Gb/s confirmed the functionality of the
AFE. In the following chapter, we present the measurement results while making a
comparison with the simulations.
5 Experimental Results
This chapter presents the experimental results of the analog front-end (AFE) fabri-
cated in Fujitsu’s 7-metal 65-nm CMOS technology. Section 5.1 presents the circuit
layout along with its pin-list and the measurement setup used to perform the verifica-
tion tests. The detailed description of the test channels for measurements is provided
in Section 5.2. The measurement results verifying the anti-aliasing filter (AAF) and
the feed-forward equalizer (FFE) performance are discussed in Section 5.3. Finally,
Section 5.4 concludes this chapter.
5.1 Circuit Layout and Equipment Setup
A micrograph of the test-chip along with its pin names is shown in Fig. 5.1. The
test-chip consists of the AFE, the clock generation, the four 4-bit ADCs, and the
DeMUX. The AFE and the clock generation, the contributions of this work, occupy
152×86 μm2 and 243×140 μm2, respectively. All the measurements on this test-chip
were performed with on-die probing.
Fig. 5.2 shows the measurement setup that was used to verify the AFE functionality.
The measurement equipment is listed below:
• Probe Station: Cascade Microtech Summit 9000
• Sig Gen (1): Agilent E8257D PSG analog signal generator (250 kHz - 67 GHz)
• Sig Gen (2): HP 83620B synthesized sweeper (10 MHz - 20 GHz)
• Sig Gen (3): HP 83650B swept signal generator (10 MHz - 50 GHz)
• Sig Gen (4): Rohde&Schwarz SMT 03 signal generator (5 kHz - 3 GHz)
• Centellax OTB3P1A 10-Gb/s PRBS generator
• Sony/Tektronix DG2020A data generator
[Chip micrograph with pad names; die dimensions 1900 μm × 1900 μm. Labeled blocks: Freq. Div. & PI; AAF/FFE (152×86 μm2); Ck Gen (243×140 μm2); 4 ADCs (4-b); DeMUX (4:16).]
Figure 5.1: AFE micrograph.
• Agilent Infiniium DCA-J 86100C digital communication analyzer
• HP 8565E spectrum analyzer (30 Hz - 50 GHz)
• Tektronix TLA 714 logic analyzer
• Agilent E3631A/E3620A dual output power supplies (×8)
• Narda 4346 180◦ Hybrid (2-18 GHz) (×3)
• Mini-Circuits ZX86-12G-S+ Bias-T (×4)
• Picosecond 5828A Ultra-Broadband amplifier (10 dB gain/ 14 GHz BW)
Table 5.1 provides a description for each pin. Two separate clocks, CLKA and
CLKB, are required for measurements. The differential CLKA/CLKAX signals drive
the AFE while the differential CLKB/CLKBX signals operate the 4 time-interleaved ADCs.
Both clocks have the same frequency but their phase difference is adjusted manually
to a quarter of a UI. The reason, which was explained in Section 4.3.3, is to align
the edge of the ADC clock to the mid-point of the hold phase for accurate sampling.
This is achieved by the phase-change capability of the two signal generators. The
input signal (RXIP/RXIN) and the clock signals (CLKA/CLKAX, CLKB/CLKBX)
are provided differentially to the chip with the help of 180° Hybrids.
[Measurement setup diagram. Blocks: Sig Gen (1)-(4) with 10-MHz reference connections, PRBS generator, amplifier, attenuator, backplane, 180° Hybrids, Bias-Ts (DC levels of 0.6, 0.8, and 0.9 V), DC power supplies (AVDF, VDN, AVD, VDD3, VDDO), data generator (ADREN, DATAEN, DIN[5:0], RSTX, PRDN), logic analyzer (DOUT[7:0], CLKOUT), and the DUT on a probe card with inputs RXIP/RXIN, CLKA/CLKAX, CLKB/CLKBX, and CLKIN.]
Figure 5.2: Measurement setup.
The IBIASM and IBIASP pins control the bias currents through the main tap and
post-cursor tap of the AFE. The ADCs can be tuned through REFH and REFL. Power to
the AFE, the clock generation, the ADCs, the logic, and the I/Os is supplied by
AVDF, AVD, VDN, VDD3, and VDDO, respectively.
Table 5.1: Description of the pin-list.
Pin name      Description
RXIP/RXIN     Input differential signal
CLKA/CLKAX    AFE differential clock
CLKB/CLKBX    ADC differential clock
CLKIN         Register clock
VCMD          Input data common-mode level
VCMCA         AFE clock common-mode level
VCMCB         ADC clock common-mode level
VBIASF        Bias voltage for AFE clock
VBIASA        Bias voltage for ADC clock
IBIASM        Bias current for main-tap
IBIASP        Bias current for post-cursor tap
REFH          ADC reference voltage (high)
REFL          ADC reference voltage (low)
ADREN         Enable address line in test register
DATAEN        Enable data line in test register
DIN [5:0]     Input to test register
DOUT [7:0]    Parallel output
CLKOUT        Parallel synchronous clock
RSTX          Reset (active low)
PRDN          Power down
AVDF/AVSF     AFE power supply
VDN/VSN       ADC power supply
AVD/AVS       Clock power supply (ADC+AFE)
VDD3/VSS3     Logic power supply
VDDO/VSSO     I/O power supply
5.2 Channel Measurements
Three test channels were used to evaluate the FFE performance. The channels, listed
in Table 5.2, were chosen such that they provide about 11-13 dB of loss at the Nyquist
frequency (half the data-rate). All the channels include 76” of SMA cable and an
FR4 backplane trace. The FR4 backplane, defined as n”-m”-n”, refers to 2 daughter
cards of length n” and a motherboard trace of length m”.
Table 5.2: Description of the test channels.
Name    Description                              fb        Measured loss at fb/2
Ch. 1   5”-4”-5” backplane + 76” SMA cables      10 Gb/s   13.3 dB
Ch. 2   5”-24”-5” backplane + 76” SMA cables     5 Gb/s    13 dB
Ch. 3   5”-48”-5” backplane + 76” SMA cables     2 Gb/s    11.7 dB
The S-parameters of the three channels (Ch. 1, Ch. 2, and Ch. 3) were measured using a
vector network analyzer (VNA) and the results are plotted in Fig. 5.3. These plots also
include the effects of a 10-dB broadband amplifier and a 6-dB attenuator that are
placed in the data-path in order to cover the entire range of the ADC.
[Plots: S21 magnitude (dB, -70 to 10) versus frequency (1 to 9 GHz).]
(a) Ch1 S21.
(b) Ch2 S21.
(c) Ch3 S21.
Figure 5.3: S21 plots.
5.3 AFE Performance
We have evaluated the AFE performance as a stand-alone block without the CDR.
We have measured the AFE frequency response, which is discussed in Section 5.3.1.
This frequency response shows that the AAF bandwidth scales with the data-rate and
also that the FFE can be configured to obtain the desired boosting levels. Section 5.3.2
explores the equalization capability by adjusting the FFE coefficient to open
an output eye that is otherwise closed. Finally, Section 5.3.3 shows the significance
of the AAF for cases when the channel bandwidth is not low enough to band-limit the
input data.
The digital eye-diagrams, used in the measurements, correspond to the ADC sam-
ples taken at one DeMUX output. This data is imported to Matlab where it is
rearranged in time to construct the eye. The frequency of the receiver sampling
clock is set to have an offset with respect to the input data-rate so that the eye can
be scanned in the same way as the sampling head of the digital oscilloscope. This
frequency offset is chosen such that the entire eye is swept with 0.64%UI resolution.
5.3.1 Frequency Response of AFE
The frequency response of the AFE (at 10, 5, and 2 Gb/s) was measured and compared
against the simulations. At each data-rate, 3 sets of (Ib1, Ib2) were chosen based on
the simulations to obtain 0, 5, and 11 dB of boost at high frequency. As explained in
Chapter 4, Ib1 and Ib2 control the bias currents of the main tap and the post-cursor
tap of the FFE. At each of these bias settings, 7 input tones were applied to the AFE,
the digital output eye was constructed, and the amplitude of the eye was converted to
a dB value. Finally, the frequency responses were generated by plotting the eye-heights
in dB versus their corresponding frequencies.
The 7 input tones are chosen such that their periods are close (but not equal) to
an integer multiple of the period of the receiver sampling clock (Ckrx/N). If the period
is exactly an integer multiple, the output samples coincide
when folded back in time. They are then unable to scan the eye, and the
output eye cannot be constructed. To prevent this, the input tone needs to have
a frequency offset relative to Ckrx/N, chosen small enough to scan the output eye
with 0.64 %UI resolution.
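The relation between the frequency offset and the scan resolution can be written as a one-line calculation. The Python sketch below is hypothetical: the function name, the 128-UI spacing between successive samples at the observed output, and the example frequencies are assumptions chosen for illustration, not values taken from the measurement setup.

```python
def scan_step_ui(f_in, f_ref, ui_between_samples):
    """Eye-scan step, in UI, between successive samples at one output.

    f_in is the input rate and f_ref the rate the receiver clock corresponds
    to; ui_between_samples is how many reference UIs separate two successive
    samples at the observed DeMUX output (an assumed parameter).
    """
    return ui_between_samples * (f_in - f_ref) / f_ref

# Hypothetical example: 128-UI sample spacing, 10.0005-Gb/s input, 10-Gb/s reference.
step = scan_step_ui(10.0005e9, 10.0e9, 128)
print(f"{100 * step:.2f} %UI")  # prints "0.64 %UI"
```

A zero offset gives a zero step, which is the degenerate case described above in which the folded samples coincide and the eye cannot be scanned.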
In the examples that follow, the simulations were performed by importing the output
samples of the AFE circuit schematic into Matlab’s Simulink. These samples were
processed by the Simulink models of 4 time-interleaved 4-b ADCs and the DeMUXes.
Figures 5.4, 5.5, and 5.6 show the measured versus simulated frequency responses for
data-rates of 10, 5, and 2 Gb/s. Similar bias currents were used in both the
measurements and the simulations. Slightly different input frequencies were used
in the simulations to shorten the simulation time, that is, the time needed to scan
the entire eye. This was achieved by reducing the DeMUX level in Simulink, thereby
allowing for higher time resolution.
[Plot: measured and simulated frequency response (dB, -25 to 0) versus frequency (GHz, log scale) for Ib1/Ib2 = 1.3 mA/0 mA, 600/170 μA, and 320/210 μA.]
Figure 5.4: Measured vs simulated frequency response (10 Gb/s).
Comparing the frequency responses at 10, 5, and 2 Gb/s shows that the bandwidth of
the AFE scales linearly with the data-rate. In all these cases, there is a close match
between the simulations and the measurements. The slight discrepancies in the
low-frequency portion of the response at 2 Gb/s (and partially at 5 Gb/s) are artifacts of
the measurement equipment, as these frequencies were outside the operating range of
the 180° Hybrid.
5.3.2 FFE Performance
The feed-forward equalizer (FFE) has been verified at 3 data-rates: 10, 5, and 2
Gb/s. The backplane that was employed for each of these data-rates is listed in
Table 5.2. Figures 5.7, 5.8, and 5.9 show the FFE operation for 10, 5, and 2 Gb/s.
[Plot: measured and simulated frequency response (dB, -25 to 0) versus frequency (GHz, log scale) for Ib1/Ib2 = 724/0 μA, 436/161 μA, and 272/204 μA.]
Figure 5.5: Measured vs simulated frequency response (5 Gb/s).
[Plot: measured and simulated frequency response (dB, -25 to 0) versus frequency (GHz, log scale) for Ib1/Ib2 = 404/0 μA, 317/152 μA, and 247/211 μA.]
Figure 5.6: Measured vs simulated frequency response (2 Gb/s).
The input data is a 27-1 PRBS sequence at 10.0005, 5.0005, and 2.0004 Gb/s. In
each figure, plot (a) shows the output eye when the FFE is off while plot (b) shows
the eye-opening achieved by turning the FFE on.
The eye-openings achieved at 10, 5, and 2 Gb/s were 5 LSBs (223 mV),
5 LSBs (281.5 mV), and 6 LSBs (380.4 mV), respectively. The bias currents used to
achieve these eye-openings were taken from the frequency response plots (Figs. 5.4,
5.5, and 5.6) and correspond to the setting with the highest frequency boost.
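The reported eye-openings pair an LSB count with a voltage, so the voltage per LSB implied at each data-rate follows by simple division. The short sketch below does only that arithmetic; it assumes uniform LSB steps across the opening, which this section does not state, and the helper name `implied_lsb_mv` is ours.

```python
# Eye openings reported in the text: (data-rate in Gb/s, opening in LSBs, opening in mV)
eye_openings = [(10, 5, 223.0), (5, 5, 281.5), (2, 6, 380.4)]

def implied_lsb_mv(opening_lsb, opening_mv):
    """Voltage per LSB implied by one reported eye opening (assumes uniform LSBs)."""
    return opening_mv / opening_lsb

for rate, n_lsb, mv in eye_openings:
    print(f"{rate:>2} Gb/s: {implied_lsb_mv(n_lsb, mv):.1f} mV/LSB")
```

The implied LSB size differs between data-rates, so the voltage figures, rather than the LSB counts, are the quantities to compare across rates.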
[Figure 5.7 plots: ADC output levels (0 to 14) versus time (0 to 100 ps). (a) FFE OFF. (b) FFE ON.]
Figure 5.7: Data-rate = 10 Gb/s - Channel loss = 13.3 dB @ 5 GHz.
[Figure 5.8 plots: ADC output levels (0 to 14) versus time (0 to 200 ps). (a) FFE OFF. (b) FFE ON.]
Figure 5.8: Data-rate = 5 Gb/s - Channel loss = 13 dB @ 2.5 GHz.
[Figure 5.9 plots: ADC output levels (0 to 14) versus time (0 to 500 ps). (a) FFE OFF. (b) FFE ON.]
Figure 5.9: Data-rate = 2 Gb/s - Channel loss = 11.7 dB @ 1 GHz.
5.3.3 AAF Performance
The need for an anti-aliasing filter arises at the lower end of the data-rate range
(i.e. 2 Gb/s), when the interconnect is unable to band-limit the input signal. To
create this scenario, we used a total of 44" of SMA cable in the data-path. The
attenuation of this channel was measured to be 0.9 dB at 1 GHz. Next, we observed
the output eye with and without the AAF. To measure the case without the AAF, we
used a test-chip which directly connects the input to the ADCs (i.e. no AAF).
Fig. 5.10(a) shows the output eye when the AAF was off while Fig. 5.10(b) shows the
output eye when the AAF was on. In both cases, 2^7-1 PRBS data is transmitted at
2.0004 Gb/s while the receiver sampling frequency is 2 GHz. It is clear from the figures
that the slopes of the eye opening are reduced when the AAF is on. Quantitatively,
the slope is reduced by a factor of 2.4. To illustrate the significance of the AAF, we
simulated the same conditions with a 2x ADC-based receiver [4]. Fig. 5.11 shows
that excluding the AAF prevents the CDR from locking. The proposed AAF is able
to restore the jitter tolerance to the accepted values.
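The small offset between the 2.0004 Gb/s data-rate and the 2 GHz sampling frequency means the sampling instant slides slowly through the data eye. The sketch below estimates how long one full sweep of a unit interval takes. It assumes the eye is scanned purely by this frequency offset, which is our reading of the setup rather than a statement made in this section; the function names are hypothetical.

```python
def eye_scan_period(data_rate_bps, sample_rate_hz):
    """Time for the sampling instant to slide through one full UI.

    Assumes the eye is swept only by the small offset between the data-rate
    and the sampling clock (an assumption of this sketch).
    """
    beat_hz = abs(data_rate_bps - sample_rate_hz)
    return 1.0 / beat_hz

def samples_per_scan(data_rate_bps, sample_rate_hz):
    """Number of sampling-clock cycles in one eye scan."""
    return sample_rate_hz * eye_scan_period(data_rate_bps, sample_rate_hz)

period = eye_scan_period(2.0004e9, 2.0e9)
print(f"scan period = {period * 1e6:.2f} us, "
      f"samples per scan = {samples_per_scan(2.0004e9, 2.0e9):.0f}")
```

Under this assumption, a smaller offset gives finer time resolution per sample at the cost of a longer scan.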
[Figure 5.10 plots: ADC output levels (0 to 14) versus time (0 to 500 ps). (a) AAF OFF. (b) AAF ON.]
Figure 5.10: Verification of the anti-aliasing filter.
[Figure 5.11 plot: jitter tolerance (UI PP) versus jitter frequency (Hz), both on logarithmic axes. Curve shown: "With AAF". Annotation: "No locking without AAF".]
Figure 5.11: Jitter tolerance comparison with AAF on/off.
5.4 Summary
This chapter presented the circuit layout of the entire AFE test-chip that was
designed and fabricated in Fujitsu’s 7-metal 65-nm CMOS technology. The
measurement setup and the verification procedure were explained in detail. The plots
for frequency responses of the AFE were extracted from the measured output samples.
These results, which closely matched the simulated ones, confirmed that the
frequency response of the AFE scales linearly with the data-rate.
A 2^7-1 PRBS sequence was transmitted to the test-chip to verify the anti-aliasing
filter (AAF) and feed-forward equalizer (FFE) operation. Turning on the AAF at 2
Gb/s, when the attenuation of the cables was only 0.9 dB, band-limited the received
signal. To verify the FFE at 10, 5, and 2 Gb/s, a backplane was employed to introduce
about 13.3, 13, and 11 dB of attenuation at the corresponding Nyquist frequencies.
Without the FFE, the output eye for each case was completely closed. With the FFE,
we were able to open the eye by 5 LSBs (223 mV), 5 LSBs (281.5 mV), and 6 LSBs
(380.4 mV) at 10, 5, and 2 Gb/s, respectively.
6 Conclusions and Future Directions
This thesis presented the design of an analog front-end (AFE) targeting 2x blind
ADC-based receivers. The front-end consists of a combined anti-aliasing filter (AAF)
and 2-tap feed-forward equalizer (FFE), the required clock generation circuitry, 4
time-interleaved 4-b ADCs, and DeMUX. This design overcomes the limited data-
rate coverage of 2x blind ADC-based receivers and extends it to cover 2-10 Gb/s.
Current 2x blind ADC-based receivers [4, 5] linearly interpolate 2 samples, blindly
taken from the input, and extract the zero-crossing information. This interpolation,
however, causes erroneous estimations of the zero-crossings if the input contains sharp
transitions. To date, such receivers relied on the communication channel to prevent
aliasing. Our proposed front-end, in contrast, adjusts its bandwidth based on the
input data-rate. As a result, it does not depend on the channel to perform the anti-
aliasing and is able to cover a large range of data-rates.
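The effect of sharp transitions on the interpolated zero-crossing can be illustrated with a small sketch. It places two samples half a UI apart around a rising edge, draws a straight line through them, and compares the line's zero-crossing with the true one. The tanh edge shape, the values of `tau`, and the sample positions are assumptions chosen for illustration only; they are not taken from the receivers in [4, 5].

```python
import math

def interp_zero_crossing(t1, s1, t2, s2):
    """Zero crossing of the straight line through (t1, s1) and (t2, s2)."""
    return t1 - s1 * (t2 - t1) / (s2 - s1)

def estimate_error(t0, tau, t1=0.0, t2=0.5):
    """Error (in UI) of the interpolated crossing for a tanh-shaped edge.

    t0 is the true zero-crossing time; tau sets the edge sharpness
    (small tau = sharp transition). Samples are taken at t1 and t2.
    """
    s1 = math.tanh((t1 - t0) / tau)
    s2 = math.tanh((t2 - t0) / tau)
    return interp_zero_crossing(t1, s1, t2, s2) - t0

for tau in (0.5, 0.05):  # band-limited edge vs. sharp edge (hypothetical values)
    err = estimate_error(t0=0.1, tau=tau)
    print(f"tau = {tau:4.2f} UI -> zero-crossing error = {err:+.3f} UI")
```

With the sharp edge both samples sit near the saturated levels, so the interpolated crossing is pulled toward the midpoint between the samples regardless of where the edge actually occurred.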
The front-end employs an integration and dump (I&D) scheme to implement the
AAF and FFE in one block. The FFE is implemented without the need to design
noise-sensitive delay cells, which require delay calibration. The bandwidth of the AFE
is controlled by the integration time which is set to be half a unit interval (UI). As a
result, the data-rate automatically adjusts the front-end bandwidth. Once the design
was confirmed by Matlab’s Simulink tool and transistor-level Cadence simulations, it
was laid out and fabricated using Fujitsu’s 65-nm CMOS process.
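To see why tying the integration time to the unit interval makes the bandwidth track the data-rate, consider an idealized model: a rectangular integration window of length T has a sinc-shaped magnitude response, so its -3 dB frequency is inversely proportional to T. The sketch below evaluates this for a window of half a UI. It models only an ideal integrator, not the fabricated circuit, and the function names are ours.

```python
import math

def iad_gain(f_hz, t_int_s):
    """Normalized magnitude response of an ideal integrator with window t_int_s."""
    x = math.pi * f_hz * t_int_s
    return 1.0 if x == 0 else abs(math.sin(x) / x)

def iad_bandwidth_hz(data_rate_bps, window_ui=0.5):
    """-3 dB frequency of the ideal window, found by bisection."""
    t_int = window_ui / data_rate_bps
    lo, hi = 0.0, 1.0 / t_int  # gain falls monotonically to the first null at 1/t_int
    target = 1.0 / math.sqrt(2.0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if iad_gain(mid, t_int) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for rate in (10e9, 5e9, 2e9):
    bw = iad_bandwidth_hz(rate)
    print(f"{rate / 1e9:>4.0f} Gb/s: f_3dB = {bw / 1e9:.2f} GHz, ratio = {bw / rate:.3f}")
```

In this idealized model the ratio of bandwidth to data-rate is the same at every rate (roughly 0.89 for a half-UI window), which is the scaling behaviour the measurements are meant to confirm.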
The test-chip was measured and the results validated the proposed design. Digital
output eyes were constructed by taking samples from one DeMUX output. These
eyes were employed to generate plots of frequency response at 10, 5, and 2 Gb/s. At
each data-rate, 3 sets of (Ib1, Ib2) were chosen based on the simulations to obtain 0,
5, and 11 dB of boost at high-frequency. These plots validated the simulation results
by showing that the bandwidth of the front-end scales with input data-rate.
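For intuition on how a 2-tap FFE produces 0, 5, or 11 dB of boost, the sketch below uses a textbook discrete-time model, y[n] = x[n] - alpha * x[n-1], with the taps spaced one UI apart. The normalized post-cursor weight `alpha` is a hypothetical quantity: the mapping from the bias currents (Ib1, Ib2) to an effective tap weight in the actual circuit is not given in this chapter, and the circuit's tap spacing may differ from the one assumed here.

```python
import math

def ffe_boost_db(alpha):
    """Nyquist-to-DC gain ratio of y[n] = x[n] - alpha * x[n-1], in dB."""
    # DC gain is (1 - alpha); gain at the Nyquist frequency is (1 + alpha).
    return 20.0 * math.log10((1.0 + alpha) / (1.0 - alpha))

def alpha_for_boost(boost_db):
    """Post-cursor tap weight that gives the requested boost in this model."""
    g = 10.0 ** (boost_db / 20.0)
    return (g - 1.0) / (g + 1.0)

for boost in (0.0, 5.0, 11.0):
    a = alpha_for_boost(boost)
    print(f"{boost:4.1f} dB boost -> alpha = {a:.3f}")
```

In this model, 5 dB and 11 dB of boost correspond to tap weights of about 0.28 and 0.56 of the main tap.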
A 2^7-1 PRBS sequence was used at 2.0004 Gb/s to verify the anti-aliasing filter
(AAF) operation. The data-path consisted of a total of 44" of SMA cables that
introduced only 0.9 dB of attenuation at 1 GHz. Simulating the same conditions showed
that without the AAF, the CDR was unable to lock. Turning on the AAF band-limited
the received signal such that the simulated jitter tolerance was restored to
its expected values.
A 2^7-1 PRBS sequence at 10.0005, 5.0005, and 2.0004 Gb/s was employed to verify
the FFE at 10, 5, and 2 Gb/s, respectively. To ensure that the output eye is closed
without the FFE, an external backplane was used to impose 13.3, 13, and 11 dB of
attenuation at the corresponding Nyquist frequencies. With the FFE on, we obtained
vertical eye-openings of 5 LSBs (223 mV), 5 LSBs (281.5 mV), and 6 LSBs (380.4 mV)
at 10, 5, and 2 Gb/s, respectively.
Table 6.1 summarizes the measurements. The AAF/FFE consumes a total of 2.4 mW
at 10 Gb/s and occupies 0.013 mm^2 of the chip area. The clock generation circuitry,
which was not optimized and was only designed to verify the AAF/FFE functionality,
consumes 97.2 mW at 10 Gb/s and occupies an area of 0.034 mm^2.
Table 6.1: Performance summary.
Technology         65-nm CMOS
Data-rate          2-10 Gb/s
Supply             1.2 V
                   AAF/FFE           Ck Gen
Power @ 10 Gb/s    2.4 mW            97.2 mW
Power @ 5 Gb/s     2.2 mW            66 mW
Power @ 2 Gb/s     1.6 mW            42 mW
Area               152 × 86 μm^2     243 × 140 μm^2
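The power figures in Table 6.1 can be put on a common footing by dividing by the data-rate (mW per Gb/s equals pJ/bit) and by computing the share of the combined power taken by the clock generation. The sketch below is plain arithmetic on the table values; the helper names are ours.

```python
# Power figures from Table 6.1: data-rate (Gb/s) -> (AAF/FFE mW, Ck Gen mW)
power_mw = {10: (2.4, 97.2), 5: (2.2, 66.0), 2: (1.6, 42.0)}

def energy_per_bit_pj(power_mw_value, rate_gbps):
    """Energy per bit in pJ: mW divided by Gb/s gives pJ/bit."""
    return power_mw_value / rate_gbps

def clock_share(afe_mw, clk_mw):
    """Fraction of the combined AAF/FFE + Ck Gen power drawn by the clock generation."""
    return clk_mw / (afe_mw + clk_mw)

for rate, (afe, clk) in power_mw.items():
    print(f"{rate:>2} Gb/s: AAF/FFE = {energy_per_bit_pj(afe, rate):.2f} pJ/bit, "
          f"clock share = {100 * clock_share(afe, clk):.1f}%")
```

At every data-rate the clock generation accounts for well over 90% of the combined power.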
6.1 Thesis Contributions
The contributions of this thesis are:
• Proposal of an AFE whose bandwidth scales with the data-rate targeting 2x
blind ADC-based receivers.
• Design, implementation, and simulation of the proposed AFE that consists of
a combined AAF and 2-tap FFE.
• Measurement of the fabricated test-chip to validate simulations.
• The schematic and layout of the clock generation circuitry were mostly done by
Siamak Sarvari.
6.2 Future Work
This section presents what we envision as the possible future directions for the de-
signed AFE targeting 2x blind ADC-based receivers.
The first area of improvement is to explore new circuits for the clock generation of
the AFE. Currently, the power of the test-chip is dominated by the clock generation.
For example, at 10 Gb/s the clock generation consumes 97.2 mW while the AAF/FFE
consumes only 2.4 mW. During the design, the clock power consumption was not
optimized since the goal was to verify the functionality of the proposed AAF/FFE
first. Instead, the priority was to ensure that the generated pulses driving the AFE
are clean and have sharp transitions. This was achieved by employing several stages
of power-hungry buffers and back-to-back inverters.
Secondly, the necessary logic can be implemented to make the equalization
adaptive. In the designed front-end, the equalization coefficient is set manually by
adjusting the bias currents of the main and post-cursor taps. The next step would
be to control these currents via a DAC that is driven by the adaptation logic.
Thirdly, the phase difference between the ADC clock and the AAF/FFE should be
set automatically. As mentioned in Chapter 5, this phase difference is currently set
manually using the signal generator available in the lab. The phase difference has
to be set to a quarter of a UI so that the ADC samples the AAF/FFE output when it is
valid. This manual adjustment can be avoided by using a phase-adjusting circuit,
which would make the system more robust.
Lastly, the entire AFE should be integrated with a 2x blind ADC-based receiver
on a test-chip. The bit error rate (BER) and jitter tolerance should be measured to
confirm the functionality of the front-end in conjunction with a real receiver.
References
[1] J.E. Rogers and J.R. Long. A 10-Gb/s CDR/DEMUX with LC delay line VCO. IEEE Journal of Solid-State Circuits, 37(12):1781–1789, 2002.
[2] R. Farjad-Rad, A. Nguyen, J. M. Tran, T. Greer, J. Poulton, W. J. Dally, J. H. Edmondson, R. Sentheinathan, R. Rathi, E. Lee, and H. T. Ng. A 33-mW 8-Gb/s CMOS clock multiplier and CDR for highly integrated I/Os. IEEE Journal of Solid-State Circuits, 39(9):1553–1561, 2004.
[3] M. Hossain and A.C. Carusone. A 6.8mW 7.4Gb/s clock-forwarded receiver with up to 300MHz jitter tracking in 65nm CMOS. Proceedings of the 2010 International Solid State Circuits Conference (ISSCC), pages 158–159, 2010.
[4] O. Tyshchenko, A. Sheikholeslami, H. Tamura, M. Kibune, H. Yamaguchi, J. Ogawa, and C. Sannomiya. A 5-Gb/s ADC-based feed-forward CDR in 65nm CMOS. IEEE Journal of Solid-State Circuits, 2010.
[5] H. Yamaguchi, H. Tamura, Y. Doi, Y. Tomita, T. Hamada, M. Kibune, S. Ohmoto, K. Tateishi, O. Tyshchenko, A. Sheikholeslami, T. Higuchi, J. Ogawa, T. Saito, H. Ishida, and K. Gotoh. A 5Gb/s transceiver with an ADC-based feedforward CDR and CMA adaptive equalizer in 65nm CMOS. Proceedings of the 2010 International Solid State Circuits Conference (ISSCC), 2010.
[6] H.-M. Bae, J.B. Ashbrook, J. Park, N.R. Shanbhag, A.C. Singer, and S. Chopra. An MLSE receiver for electronic dispersion compensation of OC-192 fiber links. Proceedings of the 2006 International Solid State Circuits Conference (ISSCC), pages 874–883, 2006.
[7] J. Cao, B. Zhang, U. Singh, D. Cui, A. Vasani, A. Garg, W. Zhang, N. Kocaman, D. Pi, B. Raghavan, H. Pan, I. Fujimori, and A. Momtaz. A 500mW digitally calibrated AFE in 65nm CMOS for 10 Gb/s serial links over backplane and multimode fiber. Proceedings of the 2009 International Solid State Circuits Conference (ISSCC), pages 370–371, 2009.
[8] O. Agazzi, D. Crivellil, M. Huedal, H. Carrerl, G. Luna, A. Nazemil, C. Grace, B. Kobeissyl, C. Abidin, M. Kazemil, M. Kargarl, C. Marquez, S. Ramprasad, F. Bollol, V. Posse, S. Wang, G. Asmanis, G. Ealton, N. Swenson, T. Lindsay, and P. Vooisr. A 90nm CMOS DSP MLSD transceiver with integrated AFE for electronic dispersion compensation of multi-mode optical fibers at 10 Gb/s. Proceedings of the 2008 International Solid State Circuits Conference (ISSCC), pages 232–233, 2008.
[9] M. Harwood, N. Warke, R. Simpson, T. Leslie, A. Amerasekera, S. Batty, D. Colman, E. Carr, V. Gopinathan, S. Hubbins, P. Hunt, A. Joy, P. Khandelwal, B. Killips, T. Krause, S. Lytollis, A. Pickering, M. Saxton, D. Sebastio, G. Swanson, A. Szczepanek, T. Ward, J. Williams, R. Williams, and T. Willwerth. A 12.5Gb/s SerDes in 65nm CMOS using a baud rate ADC with digital receiver equalization and clock recovery. Proceedings of the 2006 International Solid State Circuits Conference (ISSCC), pages 436–437, 2007.
[10] PCI Express 3.0 Frequently Asked Questions, 2010. Available at HTTP: http://www.pcisig.com/news_room/faqs/pcie3.0_faq/.
[11] M. Horowitz, C. Yang, and S. Sidiropoulos. High-speed electrical signaling: Overview and limitations. IEEE Micro, pages 12–24, 1998.
[12] IEEE P802.3ap task force channel model material, 2006. Available at HTTP: http://www.ieee802.org/3/ap/public/channel_model/index.html.
[13] I. Fujimori. Will ADCs overtake binary frontends in backplane signaling? International Solid State Circuits Conference (ISSCC), 2009. Evening Session.
[14] E.A. Lee and D.G. Messerschmitt. Digital Communications. Kluwer, third edition, 2004.
[15] J. Harrison and N. Weste. A 500MHz CMOS anti-alias filter using feed-forward op-amps with local common-mode feedback. Proceedings of the 2003 International Solid State Circuits Conference (ISSCC), 2003.
[16] T. Laxminidhi, V. Prasadu, and S. Pavan. Widely programmable high-frequency active RC filters in CMOS technology. IEEE Journal of Solid-State Circuits, 56(2):327–336, 2009.
[17] A. Gharbiya and M. Surzycki. Highly linear, tunable, pseudo differential transconductor circuit for the design of Gm-C filters. Proceedings of the 2002 IEEE Canadian Conference on Electrical and Computer Engineering, 2002.
[18] P. Pandey, J. Silva-Martinez, and X. Liu. A CMOS 140-mW fourth-order continuous-time low-pass filter stabilized with a class AB common-mode feedback operating at 550 MHz. IEEE Journal of Solid-State Circuits, 53(4):811–820, 2006.
[19] R. Yuen, M. van Ierssel, A. Sheikholeslami, W.W. Walker, and H. Tamura. A 5Gb/s transmitter with reflection cancellation for backplane transceivers. Custom Integrated Circuits Conference (CICC), pages 413–416, 2006.
[20] A.C. Carusone, H. Cheng, and F.A. Musa. A 32/16-Gb/s dual-mode pulsewidth modulation pre-emphasis (PWM-PE) transmitter with 30-dB loss compensation using a high-speed CML design methodology. IEEE Transactions on Circuits and Systems I: Regular Papers, 56(8):1794–1806, 2009.
[21] M. El Said, J. Sitch, and M. Elmasry. A 0.5μm SiGe pre-equalizer for 10Gb/s single-mode fiber optic links. Proceedings of the 2005 International Solid State Circuits Conference (ISSCC), pages 224–225, 2005.
[22] J. S. Choi, M. S. Hwang, and D. K. Jeong. A 0.18-um CMOS 3.5-Gb/s continuous-time adaptive cable equalizer using enhanced low-frequency gain control method. IEEE Journal of Solid-State Circuits, 39(3):419–425, 2004.
[23] D.H. Shin, J.E. Jang, F. O'Mahony, and C.P. Yue. A 1-mW 12-Gb/s continuous-time adaptive passive equalizer in 90-nm CMOS. Custom Integrated Circuits Conference, pages 117–120, 2009.
[24] Y. Hidaka, W. Gai, T. Horie, J.H. Jiang, Y. Koyanagi, and H. Osone. A 4-channel 1.25–10.3 Gb/s backplane transceiver macro with 35 dB equalizer and sign-based zero-forcing adaptive control. IEEE Journal of Solid-State Circuits, 44(12):3547–3559, 2009.
[25] N. Krishnapura, M. Barazande-Pour, Q. Chaudhry, J. Khoury, and K. Lakshmikumar. A 5 Gb/s NRZ transceiver with adaptive equalization for backplane transmission. Proceedings of the 2005 International Solid State Circuits Conference, pages 60–61, 2005.
[26] E. Sackinger. Broadband Circuits for Optical Fiber Communication. John Wiley & Sons, second edition, 2008.
[27] Y. Tomita, M. Kibune, J. Ogawa, W.W. Walker, H. Tamura, and T. Kuroda. A 10-Gb/s receiver with series equalizer and on-chip ISI monitor in 0.11-um CMOS. IEEE Journal of Solid-State Circuits, 40(4):986–993, 2005.
[28] J. Zerbe. High-performance wireline equalization: Issues, designs, and tradeoffs. International Solid State Circuits Conference (ISSCC), 2009. Forum 5.
[29] B. Razavi. Design of Analog CMOS Integrated Circuits. McGraw Hill, 2001.
[30] D. A. Johns and K. Martin. Analog Integrated Circuit Design. Wiley, 1996.
[31] B. Razavi. Design of Integrated Circuits for Optical Communications. McGraw Hill, 2003.
[32] S. Sidiropoulos and M. Horowitz. A 700-Mb/s/pin CMOS signaling interface using current integrating receivers. IEEE Journal of Solid-State Circuits, 32(5):681–690, May 1997.
[33] F. Yang, J.H. O’Neill, D. Inglis, and J. Othmer. A CMOS Low-Power Multiple 2.5-3.125-Gb/s Serial Link Macrocell for High IO Bandwidth Network ICs. IEEE Journal of Solid-State Circuits, 37(12):1813–1821, December 2002.
[34] M. Park, J. Bulzacchelli, M. Beakes, and D. Friedman. A 7Gb/s 9.3mW 2-tap current-integrating DFE receiver. Proceedings of the 2007 International Solid State Circuits Conference, pages 230–231, 2007.
[35] T.O. Dickson, J.F. Bulzacchelli, and D.J. Friedman. A 12-Gb/s 11-mW half-rate sampled 5-tap DFE with current-integrating in 45-nm SOI CMOS technology. Symposium of VLSI Circuits Digest of Technical Papers, pages 58–59, 2008.
[36] M. van Ierssel. Circuit Techniques for High-Speed Chip-to-Chip Signaling. PhD thesis, University of Toronto, 2006.
[37] The Mathworks, Inc. Using Simulink, 2002.
[38] Open Verilog International. Verilog-A Language Reference Manual, 1996.
[39] A. Emami-Neyestanak, A. Varzaghani, J.F. Bulzacchelli, A. Rylyakov, C.K. Ken Yang, and D.J. Friedman. A 6.0-mW 10.0-Gb/s receiver with switched-capacitor summation DFE. IEEE Journal of Solid-State Circuits, 42(4):889–896, 2007.
[40] T.O. Dickson, K.H.K. Yau, T. Chalvatzis, A.M. Mangan, E. Laskin, R. Beerkens, P. Westergaard, M. Tazlauanu, M.T. Yang, and S.P. Voinigescu. The invariance of characteristic current densities in nanoscale MOSFETs and its impact on algorithmic design methodologies and design porting of SiGe (Bi)CMOS high-speed building blocks. IEEE Journal of Solid-State Circuits, 41(8):1830–1845, 2006.