
ANALOG CIRCUIT DESIGN

Analog Circuit Design
Scalable Analog Circuit Design, High Speed D/A Converters, RF Power Amplifiers

Edited by

Johan H. Huijsing
Delft University of Technology

Michiel Steyaert
KU Leuven

and

Arthur van Roermund
Eindhoven University of Technology

KLUWER ACADEMIC PUBLISHERS
NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW

eBook ISBN: 0-306-47950-8
Print ISBN: 0-7923-7621-8

©2003 Kluwer Academic Publishers
New York, Boston, Dordrecht, London, Moscow

Print ©2002 Kluwer Academic Publishers

All rights reserved

No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher

Created in the United States of America

Visit Kluwer Online at: http://kluweronline.com
and Kluwer's eBookstore at: http://ebooks.kluweronline.com


Table of Contents

Preface

Part I: Scalable Analog Circuit Design

Introduction

Scalable High-Speed Analog Circuit Design
M. Vertregt and P. Scholtens

Scalable High Resolution Mixed Mode Circuit Design
R.J. Brewer

Scalable “High Voltages” Integrated Circuit Design for XDSL Type of Applications
D. Rossi

Scalability of Wire-Line Analog Front-Ends
K. Bult

Reusable IP Analog Circuit Design
J. Hauptmann, A. Wiesbauer and H. Weinberger

Process Migration Tools for Analog and Digital Circuits
K. Francken and G. Gielen

Part II: High-Speed D/A Converters

Introduction

Introduction to High-Speed Digital-to-Analog Converter Design
R. van de Plassche

Design Considerations for a Retargetable 12b 200MHz CMOS Current-Steering DAC
J. Vital, A. Marques, P. Azevedo and J. Franca

High-Speed CMOS DA Converters for Upstream Cable Applications
R. Roovers

Solving Static and Dynamic Performance Limitations for High Speed D/A Converters
A. Van den Bosch, M. Steyaert and W. Sansen

High Speed Digital-Analog Converters – The Dynamic Linearity Challenge
A.R. Bugeja

A 400-MHz, 10-bit Charge Domain CMOS D/A Converter for Low-Spurious Frequency Synthesis
K. Khanoyan, F. Behbahani and A.A. Abidi

Part III: RF Power Amplifiers

Introduction

Design Considerations for RF Power Amplifiers demonstrated through a GSM/EDGE Power Amplifier Module
P. Baltus and A. van Bezooijen

Class-E High-Efficiency RF/Microwave Power Amplifiers: Principles of Operation, Design Procedures, and Experimental Verification
N.O. Sokal

Linear Transmitter Architectures
L. Sundström

GaAs Microwave SSPA’s: Design and characteristics
A.P. de Hek and F.E. van Vliet

Monolithic Transformer-Coupled RF Power Amplifiers in Si-Bipolar
W. Simbürger, D. Kehrer, A. Heinz, H.D. Wohlmuth, M. Rest, K. Aufinger and A.L. Scholtz

Low Voltage PA Design in Standard CMOS
K. Mertens and M. Steyaert

Preface

This book contains the revised contributions of the 18 tutorial speakers at AACD 2001, the tenth workshop in the series, held in Noordwijk, the Netherlands, on April 24-26.

The conference was organized by Marcel Pelgrom, Philips Research Eindhoven, and Ed van Tuijl, Philips Research Eindhoven and Twente University, Enschede, the Netherlands.

The program committee consisted of:
Johan Huijsing, Delft University of Technology
Arthur van Roermund, Eindhoven University of Technology
Michiel Steyaert, Catholic University of Leuven

The program was concentrated around three main topics in analog circuit design. Each of these topics has been covered by six papers. The three main topics are:

Scalable Analog Circuit Design
High-Speed D/A Converters
RF Power Amplifiers

Other topics covered before in this series:

2000  High-Speed Analog-to-Digital Converters
      Mixed Signal Design
      PLL’s and Synthesizers

1999  XDSL and other Communication Systems
      RF MOST Models
      Integrated Filters and Oscillators

1998  1-Volt Electronics
      Mixed-Mode Systems
      Low-Noise and RF Power Amplifiers for Telecommunication

1997  RF A-D Converters
      Sensor and Actuator Interfaces
      Low-Noise Oscillators, PLL’s and Synthesizers

1996  RF CMOS Circuit Design
      Bandpass Sigma Delta and other Converters
      Translinear Circuits

1995  Low-Noise, Low-Power, Low-Voltage
      Mixed Mode with CAD Trials
      Voltage, Current and Time References

1994  Low-Power Low Voltage
      Integrated Filters
      Smart power

1993  Mixed-Mode A/D Design
      Sensor Interfaces
      Communications Circuits

1992  Op Amps
      ADC’s
      Analog CAD

We hope to serve the analog design community with this series of books and plan to continue the series in the future.

Johan H. Huijsing

Scalable high-speed analog circuit design

Maarten Vertregt and Peter Scholtens

Philips Research Eindhoven, The Netherlands
[email protected]

Abstract

The impact of scaling on the analog performance of MOS circuits was studied. The solution space for analog scaling was explored between two dimensions: a “standard digital scaling” axis and an “increased bandwidth and dynamic-range” axis. Circuit simulation was applied to explore trends in noise and linearity performance under analog operating conditions at device level and for a basic circuit block. It appears that a single scaling rule is not applicable in the analog circuit domain.

1 Introduction

The two-year cycle of successive technology generations [1] has enabled an ever increasing amount of system integration per chip. For a long time, this increase in integration density was absorbed by adding extra digital functions and memory. Nowadays, interfaces to the analog world (both base-band and RF) are also packed onto these systems-on-chip.

J. H. Huijsing et al. (eds.), Analog Circuit Design, 3-21.
© 2002 Kluwer Academic Publishers. Printed in the Netherlands.


In addition to the dominant “constant field” CMOS scaling trend, and the associated continuous decrease of the power supply voltage, there are other major hurdles for system integration. Increasing demands for extended dynamic range and signal bandwidth of modern integrated systems must also be met (Figure 1, where the dynamic range is plotted as the resolution in bits of an A/D converter).


It is not necessarily true that the most advanced technology generation will have the highest value for the product of the dynamic range and signal bandwidth (scaling towards the upper right-hand corner of the graph in Figure 1) [2]. Additional devices (highly linear capacitors, gate-oxide for MOS transistors) can facilitate system-on-chip integration, since those “high quality” passives enable a performance increase, and “previous generation” analog blocks can easily be re-used (voltage levels are maintained).

The combination of doing a trend analysis and having additional devices available creates two problems. Firstly (when the total function remains in a previous technology generation, because of the time needed to create and characterize high quality passives), the digital part of the system-to-be-integrated suffers from a lack of function-density and an elevated supply voltage. This has a quadratic effect on the dynamic power dissipation through $P \propto C V_{DD}^2 f$. Secondly (with combined use of state-of-the-art MOS transistors for digital functions, and previous-generation MOS transistors for analog functions), the potential of the new technology is not exploited for these analog functions. The approach of adding devices is therefore useful for porting functions, but is not interesting when identifying scaling issues.
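The quadratic supply-voltage penalty can be made concrete with a small calculation. The sketch below assumes the usual first-order expression for switching power (activity factor times switched capacitance times supply squared times clock frequency). The function name, the capacitance, the clock rate and the 2.5 V versus 1.8 V comparison are illustrative assumptions, not values taken from this chapter.

```python
def dynamic_power(c_switched, v_supply, f_clock, activity=1.0):
    """First-order switching power: activity * C * V^2 * f (assumed model)."""
    return activity * c_switched * v_supply ** 2 * f_clock


# Hypothetical digital block: 1 nF of switched capacitance clocked at 100 MHz.
C_SWITCHED = 1e-9
F_CLOCK = 100e6

p_previous = dynamic_power(C_SWITCHED, 2.5, F_CLOCK)  # kept at the older supply
p_scaled = dynamic_power(C_SWITCHED, 1.8, F_CLOCK)    # moved to the newer supply

# Penalty for staying at the elevated supply; C and f cancel in the ratio.
penalty = p_previous / p_scaled
print(f"power penalty at elevated supply: {penalty:.2f}x")
```

Because capacitance and frequency cancel, the penalty depends only on the square of the supply ratio, which is why an elevated supply is costly for the digital part even when nothing else changes.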

2 Scaling goals


Scaling of digital functions is directly coupled to feature size reduction. Per function, this yields a combination of continuous area reduction, speed increase and dynamic power reduction (see [3] for an example). Static power dissipation becomes a major limitation with the integration of more functions at an increased density. Speed and power improvements for digital functions are then obtained concurrently by selecting the optimum on/off ratio of the MOS transistor for a certain application domain. The scaling space basically narrows down to two dimensions: on/off ratio vs feature size [4].

For analog functions, the goals of scaling are diverse. The focus can be on area efficiency, with the continued availability of a function at a fixed power dissipation, bandwidth and dynamic range. Alternatively, the focus can be on the exploration of the ultimate bandwidth capability, without limiting the power or area. It could also be on pushing the combined limits of dynamic range, bandwidth and power. The preferred scaling scenario heavily depends on the goal, and we must sacrifice performance in directions that have a lower priority to obtain a feasible solution.

The basic quadratic MOS current/voltage relationships (see [5] for example) are used to choose the relative change of the operating points across technology generations, as well as to approximate (for a limited bias range only) the analog scaling rules of Table 1:

To first order, the quasi-DC distortion depends on the variation across the signal amplitude through modulation of the first-order derivatives [...].
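The equations referred to here did not survive in this copy of the text. As a stand-in, the sketch below uses the generic textbook square-law saturation model with a channel-length modulation parameter, which is an assumption and not necessarily the authors' notation. All parameter values (beta, overdrive, drain-source voltage, lambda) are hypothetical.

```python
def drain_current(beta, v_gt, v_ds, lam):
    """Textbook square-law saturation current: beta/2 * V_GT^2 * (1 + lam*V_DS)."""
    return 0.5 * beta * v_gt ** 2 * (1.0 + lam * v_ds)


def transconductance(beta, v_gt, v_ds, lam):
    """gm = dI/dV_GT for the model above."""
    return beta * v_gt * (1.0 + lam * v_ds)


def output_conductance(beta, v_gt, lam):
    """gds = dI/dV_DS for the model above."""
    return 0.5 * beta * v_gt ** 2 * lam


# Hypothetical operating point: 0.3 V gate overdrive, 0.9 V drain-source voltage.
BETA, V_GT, V_DS, LAM = 2e-3, 0.3, 0.9, 0.1

gm = transconductance(BETA, V_GT, V_DS, LAM)
gds = output_conductance(BETA, V_GT, LAM)
intrinsic_gain = gm / gds

# Cross-check gm against a central-difference derivative of the current.
STEP = 1e-6
gm_numeric = (drain_current(BETA, V_GT + STEP, V_DS, LAM)
              - drain_current(BETA, V_GT - STEP, V_DS, LAM)) / (2 * STEP)

print(f"gm = {gm * 1e3:.3f} mS, gds = {gds * 1e6:.1f} uS, gm/gds = {intrinsic_gain:.1f}")
```

In this model both derivatives depend on the bias point, so a signal that moves the gate overdrive or the drain-source voltage modulates gm and gds, which is the first-order distortion mechanism the text describes.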


3 Scaling scenarios

We have applied several methods to explore analog scalability. The focus varies from general “power and SNR” considerations [6], to concurrent “power, SNR, and linearity” optimization for a fixed building block [7]. The focus also ranges from practical device artifacts, through compact model simulation [8, 9], to trend analysis at the functional block level [10, 11].

Here, the solution space for scaling (expressed in the well-known linear scaling factor s=0.7 from generation to generation) is explored using three different cases:

Relaxed Dynamic Range (Digital Scaling I)
Standard digital scaling, as in [3] for example. The focus is on area and power reduction per function. The performance metrics being sacrificed are linearity [9] and the signal-to-noise ratio SNR (a power ratio, assumed to be dominated by thermal noise in the denominator). Neither linearity nor SNR degradation has to be a limiting factor when scaling a circuit. However, the fact that the SNR will degrade by a factor [...] per generation under this scaling regime requires attention for wide-band circuits. The linearity degradation is consistent with the third-harmonic intercept voltage findings in [9]. In case of a dominant third harmonic, the expected signal-to-distortion ratio deteriorates by [...] due to a combination of loss of intrinsic MOS gain [...] and [...] insufficiently scaled with respect to the supply scaling (s).

Relaxed area and power (Analog Scaling II)
The major consideration is that analog circuits only occupy a minor portion of a system-on-chip. Area reduction is therefore not ranked as a top priority. Instead, with the application demands in mind, the focus during scaling is on a concurrent performance increase in terms of bandwidth and of dynamic range at a fixed frequency. Maintaining the linearity part of the dynamic range requires the signal amplitude at least to scale with the supply voltage. The effect of the noise part of the dynamic range is treated as follows, where [...] is the bandwidth of the circuit and [...] is the rms signal level:

[...]

Thus, for constant [...], [...] should remain constant, and the active “Area” is now the metric to be sacrificed:

[...]

Applying the [...] part of equation (4), we learn that a constant SNR requires a cubic increase of the transconductance, i.e. a cubic decrease of the impedance level, to compensate for the lower signal amplitude and the higher bandwidth. To reach this goal (bound by the feature-size scaling of L), equation (2) subsequently defines the [...] to scale with [...] and the W with [...]. From equation (3), it follows that the foremost sacrificed item is now gds, with [...]. For the signal-carrying parts of the circuit, the overall decrease in the impedance level on the circuit nodes compensates for this sacrifice.

Constant area and constant power (Analog Scaling III)
The third scenario is a mixture of the previous two. A minor loss of SNR is accepted to avoid an increased area and power dissipation, whilst the linearity performance is maintained at the level of the “Analog Scaling II” scenario.
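The cubic transconductance requirement of the “Analog Scaling II” case can be checked numerically. The sketch below assumes a thermal-noise model in which the input-referred noise density is 4kT/gm, that the signal amplitude scales with s, and that the bandwidth scales with 1/s. These assumptions stand in for the equations and table entries that are missing from this copy. The starting values are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def thermal_snr(v_rms, gm, bandwidth, temperature=300.0):
    """SNR as a power ratio, assuming an input-referred noise density of 4kT/gm."""
    noise_power = 4.0 * K_BOLTZMANN * temperature * bandwidth / gm
    return v_rms ** 2 / noise_power


def scale_one_generation(v_rms, gm, bandwidth, s=0.7):
    """Assumed step: amplitude * s, transconductance / s^3, bandwidth / s."""
    return v_rms * s, gm / s ** 3, bandwidth / s


# Hypothetical starting point: 0.3 V amplitude sine, 1 mS, 100 MHz bandwidth.
start = (0.3 / math.sqrt(2.0), 1e-3, 100e6)
scaled = scale_one_generation(*start)

snr_start_db = 10.0 * math.log10(thermal_snr(*start))
snr_scaled_db = 10.0 * math.log10(thermal_snr(*scaled))
gm_increase = scaled[1] / start[1]

print(f"SNR before: {snr_start_db:.2f} dB, after: {snr_scaled_db:.2f} dB")
print(f"required gm increase per generation: {gm_increase:.2f}x")
```

Under these assumptions the signal power drops by s squared and the noise bandwidth grows by 1/s, so only a transconductance increase of 1/s cubed (about 2.9x for s=0.7) keeps the SNR unchanged.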

The results of these three scaling strategies were evaluated using circuit simulation (on both the device and the basic circuit block level). The use of circuit simulation safeguards the inclusion of higher-order impairments on performance (such as moderate inversion), and gives insight into the performance latitude (in signal amplitude, distortion and noise) of the scaled circuit. The scaling rules applied within these three strategies are summarized in Table 1:


4 Results of scaling

4.1 Single device scaling

Compact model simulation of single MOS devices [9] was applied for various technology generations. This identifies the impact of higher-order impairments on these approximate relationships and checks the validity of the selected operating range.


Figures 2 and 3 give an example of how this validity check on the usable operating range of the basic square-law model is done. This is shown for an NMOS device with [...] in the [...] generation: for [...] and the trans-conductance as a function of gate drive (Figure 2), and for [...] and the output conductance as a function of source-drain voltage (Figure 3).

We use the gate overdrive [...] for all device biasing. This choice prevents variations in threshold voltage (for different device geometries, or for successive technology generations) from influencing the actual analog operating point. At first glance, we see for this example in Figure 2 a more or less linear relationship of [...] for the limited range [...] (and for the obvious condition of sustained saturation [...]).

At the edges of this range, part of the distortion caused by this deviation can be overcome by circuit design techniques such as a differential circuit topology and/or feedback. We will take a closer look at the trans-conductance linearity behavior for successive technology generations later on, and first inspect the output conductance for an NMOS example in [...] technology.


For the [...] the variation for a 300mV signal swing is approximately an order of magnitude, at relatively low [...]. Multiple devices have to be stacked in the 1.8V supply for circuit reasons. A higher value for [...] can therefore not be accommodated. Reliability reasons are not the critical issue. This lack of headroom explains why the signal level and the nominal gate overdrive in subsequent technology generations have to scale down, at least proportionally with the power supply.

As shown in Figures 2 and 3, the trans-conductance and output conductance derivatives give a better view on relevant analog device properties (such as linearity) than the current characteristics. We now rearrange these characteristics in a form allowing easy comparisons across technology generations. We do this by looking at the relative variation of these conductance curves with respect to the drain-source current. Figure 4 shows the trend for an NMOS [...] (normalized to q/kT units) against [...] for four subsequent technology generations. According to equation (2), we expected proportionality of [...] with 1/s through [...]. This curve is also plotted in Figure 4. Higher-order impairments present in the simulation model show up as a loss of improvement for [...] (with an increasing deviation for smaller [...]). Figure 5 shows the consequence of this effect when an NMOS device is scaled through successive technology generations. The gate overdrive is scaled with s, according to the scaling rule choices of Table 1 (from 300mV at the [...] generation to 100mV at the [...] generation).


Higher-order impairments, present in the simulation model, show up as a deterioration of the expected scaling across technology generations, with a multiplication factor of approximately [...] (for [...]).

We performed a similar exercise for the output conductance. Figure 6 shows the trend for the “early voltage” [...] as a function of [...], where [...] is scaled with [...] in the simulation. According to equation (3) we would expect proportionality of [...] with [...] through L.

We therefore verify this relationship in Figure 7 with a graph of [...] vs minimum feature size of successive technology generations. Higher-order impairments show up as a deterioration of scaling, with a multiplication factor of approximately [...].


To summarize these observations on conductance scaling, we found that the intrinsically “obtainable gain” [...] of the MOS transistor is a factor [...] per generation lower than expected from the square-law model. This affects all three scaling rule scenarios. We will now verify the consequences of this deviation on circuit block level.
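As a point of reference for the square-law expectation mentioned above, the sketch below tabulates how the gate overdrive and the textbook ratio gm/I = 2/V_GT would evolve when the overdrive starts at 300mV and is multiplied by s=0.7 each generation. The function names are hypothetical, and the simulated deviations reported in the text are not modeled here.

```python
S = 0.7              # linear scaling factor per generation
V_GT_START = 0.300   # starting gate overdrive in volts


def overdrive(generation, v_start=V_GT_START, s=S):
    """Gate overdrive after a number of generations, scaled with s each time."""
    return v_start * s ** generation


def square_law_gm_over_id(v_gt):
    """Textbook square-law ratio gm/I_D = 2 / V_GT (ideal expectation only)."""
    return 2.0 / v_gt


table = [(n, overdrive(n), square_law_gm_over_id(overdrive(n))) for n in range(4)]
for n, v_gt, ratio in table:
    print(f"generation +{n}: V_GT = {v_gt * 1e3:6.1f} mV, gm/I_D = {ratio:5.2f} 1/V")
```

Three scaling steps take the overdrive from 300mV to about 103mV, consistent with the 300mV to 100mV range quoted in the text, and the ideal gm/I improves by 1/s per step. The simulations in Figures 4 and 5 show less improvement than this ideal curve.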

4.2 Circuit block scaling

A basic voltage follower was applied to explore the bandwidth and dynamic range consequences at circuit level (Figure 8). As a starting point, an implementation in [...] with a signal amplitude of 0.3V, targeted for a signal-to-noise-and-distortion ratio of approximately 50dB, was used. The three scenarios listed in Table 1 were subsequently applied, scaling backward to [...] and forward to [...] technology.
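To relate the 50dB target to the “resolution in bits” axis used for Figure 1, the standard conversion for a full-scale sine wave can be applied. This conversion is a common convention and is not stated in the chapter itself. The function name is hypothetical.

```python
def effective_bits(sinad_db):
    """Effective number of bits for a full-scale sine: (SINAD - 1.76) / 6.02."""
    return (sinad_db - 1.76) / 6.02


target_bits = effective_bits(50.0)
print(f"50 dB SINAD corresponds to about {target_bits:.2f} effective bits")
```

On this convention the starting circuit sits at roughly 8 bits, which gives a sense of how much linearity margin the later plots are measuring against.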


Figures 9 to 11 show the results of the voltage-follower simulation exercise across four technology generations for the performance criteria bandwidth, signal-to-noise ratio (SNR), and signal-to-distortion ratio (SDR). The circuit is current driven (with a diode generating the gate voltage of the tail current source), which means that we expect the deviations from the applied scaling rule primarily to show up in the linearity, and not in the [...] or the [...] (being representative for the circuit bandwidth).

For this voltage-follower circuit, Figure 9 shows that the “digital I” scaling scenario gives the best bandwidth improvement across technology generations. The improvement factor of 1.75x more or less equals the expected value from Table 1. The bandwidth improvement for the [...] generation is slightly less than expected. This means that the parasitic capacitances outside the scaled MOS devices and the scaled external load are not scaling as well as before.

Figure 10 shows that, as expected from Table 1, the “digital I” scaling scenario is worst for thermal noise in this wide-band circuit. The aim to keep SNR constant is clearly fulfilled by the “analog II” scenario. The “analog III” scaling scenario creates the expected, minor, degradation in SNR. Due to the combination of a relatively reduced bandwidth and a lower impedance level (due to an increased [...]), the [...] generation consistently shows a slightly better SNR.


What is striking in the signal-to-distortion plot of Figure 11 is the tremendous impairment of linearity that occurs throughout technology generations when “digital I” scaling is applied to this kind of wide-band circuit block.

The drop in [...] circuit linearity at higher frequencies is a consequence of this generation’s lower bandwidth. We do not get the low-frequency circuit linearity beyond 70dB in “reverse” scaling from the initial circuit definition in [...]. We attribute this to undefined higher-order impairments far beyond the initial circuit linearity. The severe degradation across two generations (from [...] to [...]) is approximately 10dB/generation (with the [...] generation lying slightly off the trend). This is a combination of insufficient scaling of [...] (by [...] instead of [...]) and an additional degrading factor affecting both analog scaling scenarios as well (see below).

Contrary to the original expectation for the “Analog II” and “Analog III” scaling scenarios, we see no systematic linearity improvement across technology generations (and some degradation for the DC-linearity in the [...] case). We attribute this lack of voltage-follower linearity improvement to the additional loss of a factor s per generation of the intrinsically “obtainable gain” [...] of a voltage-driven MOS transistor (shown by the device level simulations of Figures 4 and 5). The proper current, bandwidth, and conductance levels are maintained at the expense of an increased gate drive and thus a degraded [...] scaling. Within the context of the voltage follower circuit block the impairment therefore mainly shows up in the linearity performance.


5 Discussion and functional block examples

The application demands from Figure 1 have been confronted with the practical consequences of analog scaling. Concurrent analog building block improvements in area, bandwidth and dynamic range cannot be created through feature size scaling alone.

A drawback of using simulation as a forward-looking scaling tool is the lack of model card parameters with an “analog” quality. This lack is a major reason for the time shift in applied technology generation between state-of-the-art digital circuits and high dynamic range analog circuits.

Circuit topology plays a major role in attacking high-dynamic range application problems with low-dynamic range circuitry. Examples are in sigma-delta conversion (where single bit quantization is capable of delivering high dynamic range), in spread spectrum techniques (where a very low or negative SNR is allowed without hampering proper communication), or in applying dynamic correction techniques to compensate for degraded device properties (as in wide-band Nyquist A/D conversion). To highlight the last point: a 10 bit A/D function was implemented using the [...] technology generation in [12]. In a scaled [...] technology, this A/D function occupies 1 mm² at a 12 bit resolution, thanks to the “mixed-signal chopping and calibration” circuit topology technique [13]. This scaled realization results in a 4-fold increase in dynamic range, and twice the bandwidth. Together with [...] times more power consumption and a [...] larger area this means a [...] performance improvement in one generation.
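The two ratios quoted for this example follow directly from the converter specifications. The sketch below uses the resolutions given in the text and the sample rates that appear in the titles of [12] and [13] (25 MSample/s and 54 MSample/s). Treating the sample-rate ratio as the bandwidth ratio is an assumption (Nyquist operation in both cases), and the function names are hypothetical.

```python
import math


def dynamic_range_ratio(bits_old, bits_new):
    """Amplitude dynamic-range ratio between two ideal resolutions."""
    return 2.0 ** (bits_new - bits_old)


def to_db(amplitude_ratio):
    """Amplitude ratio expressed in decibels."""
    return 20.0 * math.log10(amplitude_ratio)


range_gain = dynamic_range_ratio(10, 12)   # 10 bit in [12], 12 bit in [13]
range_gain_db = to_db(range_gain)
sample_rate_ratio = 54e6 / 25e6            # sample rates from the reference titles

print(f"dynamic range: {range_gain:.0f}x ({range_gain_db:.1f} dB)")
print(f"sample rate:   {sample_rate_ratio:.2f}x")
```

Two extra bits give exactly a factor of four in amplitude resolution (about 12dB), and the sample-rate ratio of 2.16 matches the “twice the bandwidth” statement.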


6 Conclusions

We can therefore draw the following conclusions:

Concurrent improvement of bandwidth and dynamic range by a feature-size scaling rule results in a power and area increase of signal-carrying devices in critical blocks.

Porting fixed functions benefits most from previous generation device availability and/or a higher power supply voltage (maintaining the original operating points and signal levels). However, the scaled technology is not exploited and scaling trends cannot be identified.

Down-scaling analog circuits by applying a feature-size scaling rule does not fulfill the application demand. Circuit topology improvement does.

Therefore, new application domain demands are best served by employing a mixture of scaling rules and by optimal system level choices.

7 Acknowledgments

We are grateful to Anne Johan Annema, Pierre Woerlee and Ronald van Langevelde for their constructive discussions and supporting material.

8 References

[1] ITRS 2000, http://public.itrs.net/Files/2000UpdateFinal/2kUdFinal.cfm
[2] Kelly, D. et al.: "A 3V 340mW 14b 75MSPS CMOS ADC with 85dB SFDR at nyquist", Technical Digest ISSCC, 2001, pp. 134-439.
[3] Veendrick, Harry: "Digital goes Analog", Proceedings ESSCIRC 1998, pp. 44-50.
[4] Jurczak, M. et al.: "Dielectric pockets - a new concept of the junctions for deca-nanometric CMOS devices", IEEE Transactions on Electron Devices (USA), vol. 48, no. 8, pp. 1776-82, Aug. 2001.
[5] Razavi, B.: "Design of Analog CMOS Integrated Circuits", McGraw-Hill, 2001.
[6] Vittoz, E.A.: "Low-power design: ways to approach the limits", Proceedings of ISSCC '94, San Francisco, CA, USA, 16-18 Feb. 1994, pp. 14-18.
[7] Annema, Anne-Johan: "Analog circuit performance and process scaling", IEEE Trans. on Circuits and Systems II, Vol. 46, No. 6, June 1999, pp. 717-725.
[8] Pelgrom, M.J.M. et al.: "CMOS Technology for mixed signal ICs", Solid-State Electronics, Vol. 41, No. 7, 1997, pp. 967-974.
[9] Woerlee, P. et al.: "RF-CMOS Performance Trends", IEEE Transactions on Electron Devices (USA), vol. 48, no. 8, pp. 1776-82, Aug. 2001.
[10] Walden, R.H.: "Analog-to-Digital Converter Survey and Analysis", IEEE Journal on Selected Areas in Communications, Vol. 17, No. 4, April 1999, pp. 539-550.
[11] Bult, Klaas: "Analog Design in Deep Sub-Micron CMOS", Proceedings ESSCIRC 2000, pp. 11-17.
[12] Ploeg, Hendrik van der, et al.: "A 3.3-V, 10-b, 25-MSample/s Two-Step ADC in 0.35-µm CMOS", IEEE Journal of Solid-State Circuits (JSSC), Vol. 34, No. 12, December 1999, pp. 1803-1811.
[13] Ploeg, Hendrik van der, et al.: "A 2.5V, 12b, 54MSample/s 0.25um CMOS ADC in 1mm²", Technical Digest ISSCC, 2001, pp. 132-439.

SCALABLE HIGH RESOLUTION MIXED MODE CIRCUIT DESIGN

R.J. Brewer
Analog Devices
Pembroke Road, Newbury RG14 1BX, U.K.
bob. [email protected]

ABSTRACT

This paper discusses architectures for analog to digital interchange which are suitable for implementation in deep sub-micron CMOS mixed mode technologies. Discussed in detail are successive approximation and low over-sampling ratio sigma-delta converters giving >12 bits resolution at bandwidths of the order of MHz. Also discussed are architectures potentially suitable for operational amplifiers buffering such converters, integrated in the same technology.

1. INTRODUCTION

The topic of “scalable high resolution mixed mode circuit design” is potentially broad. The focus of this paper is on architectures suitable for fabrication in deep sub-micron (DSM) CMOS technology which implement analog to digital interchange at bandwidths from DC to several MHz and with resolutions of 12 bits and above. Analog design in DSM is dominated by the reality that the process driver is digital. Typically, a mixed mode DSM technology will lag in development by about a year behind its digital substrate and comprise a digital process with the addition of reasonably linear double polysilicon capacitors, with a layer of medium-resistivity polysilicon available to create non-trimmable resistors with matching no better than a few tenths of a percent and realistic values up to several tens of kohms. Moore’s Law famously identifies a trend line of a 70% linear geometry shrink per 18 months; the current range of widely available mixed mode processes runs as follows: 0.5um/5V to 0.35um/3V to 0.25um/2.5V to 0.18um/1.8V. In many cases higher voltage devices are also offered on the lower voltage processes. It is tempting to assume these may be used for the analog sections of a mixed-mode circuit, but in some cases these higher voltage devices are intended for digital I/O and have poor electrical properties which make them less, not more, suitable for analog circuits. Thus this paper assumes that scalable means using the small geometry devices. Another subtle assumption is that application forces are driving up signal processing information bandwidth, and that a driver in scalable design is to use the speed of these processes, so that analog bandwidths of MHz are more interesting than kHz.

J. H. Huijsing et al. (eds.), Analog Circuit Design, 23-42.
© 2002 Kluwer Academic Publishers. Printed in the Netherlands.

Although this paper addresses scalable design, it is worth remarking that in many cases this may not be the economically optimum design approach to the implementation of systems combining complex logic and high performance analog. There are several approaches to avoiding rather than solving the problem. In some DSM technologies, high performance analog devices with thicker gate oxide and higher supply voltage are made available, permitting what is essentially a hybrid design approach on one substrate. However, this means the digital section must carry the cost overhead of the dual gate oxide and other analog components, such as double polysilicon. If the number of interconnects required between the analog and digital sections is small, it may actually be better to split the die. Package engineers are becoming increasingly comfortable with dual paddle packages with die-to-die bonding, or alternatively low pin-count packages are becoming increasingly small: e.g. an MSOP-8 at just 3mm x 4.9mm.

2. TRENDS IN DEVICE PROPERTIES

The most obvious effect of scaling is an approximately linear shrink in permitted supply voltage, allowing about 1 volt per 0.1um of minimum gate length. Unfortunately shrinking does little if anything for one of the dominating noise sources in CMOS data conversion design: kT/C noise. This results in an approximately linear compression of signal to noise ratio with scaling. However, process designers do usually scale the threshold voltages to some degree with the shrink, so that the ability to stack devices in, for example, an opamp design, does become compressed with process scaling but relatively softly. Thus scalable design implies maximising the p-p signal range within the supply voltage, whilst some stacking of devices is still acceptable but becomes increasingly undesirable. Some processes may also offer optional low threshold devices for analog design, but note these may be leaky if used as switches. A major problem is a rising 1/f (flicker) noise corner frequency: for example, for a minimum length NMOS with W/L=10 at 100uA the 1/f corner may be around 1MHz at 0.35um and escalate to tens of MHz at 0.18um. Amplifier stage gains are low (e.g. 25dB) and decline with scaling and, as already mentioned, stacking devices for cascoding is permitted but increasingly difficult. Finally, leakage currents (between all terminals of the MOS device) rise with scaling. Scalable design implies accommodating all these trends.
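The supply and noise trends above can be illustrated with the process list given in the introduction. The sketch below checks the “1 volt per 0.1um” rule of thumb and estimates the signal to noise ratio of a sampled full-scale sine against kT/C noise. The 1pF sampling capacitor, the 300K temperature and the use of the full supply as the p-p signal range are assumptions for illustration only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

# (minimum gate length in um, supply in V), as listed in the introduction.
PROCESSES = [(0.5, 5.0), (0.35, 3.0), (0.25, 2.5), (0.18, 1.8)]


def volts_per_tenth_micron(length_um, supply_v):
    """Supply voltage divided by gate length expressed in units of 0.1 um."""
    return supply_v / (length_um / 0.1)


def ktc_noise_rms(c_farads, temperature_k=300.0):
    """RMS voltage noise sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(K_BOLTZMANN * temperature_k / c_farads)


def sampled_snr_db(v_pp, c_farads):
    """SNR of a sine with peak-to-peak amplitude v_pp against kT/C noise."""
    signal_rms = v_pp / (2.0 * math.sqrt(2.0))
    return 20.0 * math.log10(signal_rms / ktc_noise_rms(c_farads))


C_SAMPLE = 1e-12  # hypothetical 1 pF sampling capacitor

for length_um, supply_v in PROCESSES:
    print(f"{length_um:4.2f} um: "
          f"{volts_per_tenth_micron(length_um, supply_v):.2f} V per 0.1 um, "
          f"SNR = {sampled_snr_db(supply_v, C_SAMPLE):.1f} dB")
```

With a fixed capacitor the noise floor stays put while the available swing shrinks with the supply, so under these assumptions the SNR falls by 20·log10(5/1.8), roughly 9dB, between the 0.5um and 0.18um processes.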

3. ANALOG to DIGITAL CONVERSION

Two architectures are coming to dominate sub-micron and deep-sub-micron CMOS design in the resolutions and bandwidths discussed here: SAR (successive-approximation) converters with electrically-trimmable capacitor-array DACs [1,2] and low-oversampling-ratio sigma-deltas [14-30].

For very high resolution conversion at low bandwidths, sigma-delta converters with high oversampling ratios (>16, perhaps typically 128) are the predominant technology. These typically have architecturally simple single loops with single bit quantization and noise shaping of 2-4 orders. These work very well and are very scalable to very deep sub-micron. It may be expected that they will remain a commercially very important class of converter, but they will not be discussed further here, where the focus is on medium bandwidth (of order MHz) leading edge architectures. At higher bandwidths (10-1000MHz) flash and pipelined converters dominate; again, these will not be discussed further here.
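To give a feel for why high oversampling ratios deliver very high resolution from a single bit quantizer, the sketch below evaluates the commonly quoted ideal estimate of peak signal to quantization noise ratio for a noise-shaping loop with a pure differentiator noise transfer function. This formula is a textbook idealization and is not taken from this paper. The function name is hypothetical.

```python
import math


def ideal_sqnr_db(osr, order, quantizer_bits=1):
    """Ideal peak SQNR of an order-L noise-shaping loop with an N-bit quantizer."""
    return (6.02 * quantizer_bits + 1.76
            + (20 * order + 10) * math.log10(osr)
            - 10 * math.log10(math.pi ** (2 * order) / (2 * order + 1)))


high_osr = ideal_sqnr_db(osr=128, order=2)  # typical high oversampling ratio
low_osr = ideal_sqnr_db(osr=16, order=2)    # boundary of the low-ratio regime

print(f"2nd order, 1 bit, OSR 128: {high_osr:.1f} dB")
print(f"2nd order, 1 bit, OSR 16:  {low_osr:.1f} dB")
```

The large gap between the two results indicates why converters restricted to low oversampling ratios cannot rely on a simple single bit loop alone to reach resolutions above 12 bits.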


In the early to mid 1990s there was a wave of great interest in self-calibrated ADCs. Typical lithography permits analog element matching to around 12 bits resolution. To achieve high yields at 12 bits, and performance beyond, wafer level laser trimming was the dominant technology. Self-calibration became an attractive alternative, apparently better suited to volume-manufactured CMOS technologies [3-5], giving a high resolution ADC capability to designers without access to laser trimming. There are two fundamental approaches. The first uses some slow but high linearity method, such as an integrator, to create a very linear sequence of voltage levels which are then used to calibrate the capacitor or resistor array which forms the working converter. The second relies on the observation that in an ideal linear binary-weighted element (e.g. capacitor) array each element equals the sum of all the lesser weight elements (plus 1 LSB). Thus a calibration algorithm can clearly be devised which relies solely upon establishing internal equalities, without reference to any absolute calibration standards. In either case, some on-chip memory is then required to store calibration constants which are applied dynamically to some form of trim-DAC during conversions. This all works; however, the trend seems to be away from self-calibration, as it increasingly appears that similar performance can be achieved more economically and more conveniently for the user with one-time electrical trimming as discussed below.
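The internal-equality property used by the second approach can be sketched as follows. The code compares each element of a binary-weighted array with the sum of all lesser elements plus one unit, and reports the residual that a calibration routine would have to trim out. The array size, the 0.5% mismatch on the MSB and the function names are hypothetical, and the measurement itself is idealized (no comparator noise or offset).

```python
def ideal_binary_array(n_bits, unit=1.0):
    """Binary-weighted element values, LSB first: unit * 2^k."""
    return [unit * 2 ** k for k in range(n_bits)]


def equality_residuals(weights, unit=1.0):
    """For each element, its value minus (sum of all lesser elements + one unit)."""
    residuals = []
    running_sum = 0.0
    for weight in weights:
        residuals.append(weight - (running_sum + unit))
        running_sum += weight
    return residuals


ideal = ideal_binary_array(12)
ideal_residuals = equality_residuals(ideal)

# Hypothetical mismatch: the MSB element is 0.5% too large.
mismatched = list(ideal)
mismatched[-1] *= 1.005
mismatch_residuals = equality_residuals(mismatched)

print("ideal residuals all zero:", all(r == 0.0 for r in ideal_residuals))
print(f"MSB residual with 0.5% error: {mismatch_residuals[-1]:.2f} LSB")
```

Only relative comparisons inside the array are needed, which is why no absolute calibration standard is required. A practical routine would work down from the MSB and store one trim value per calibrated element.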

3.1 Capacitor Array Successive Approximation ADCs

The core element is a DAC implemented with an array of binary weighted double-polysilicon capacitors. This is usually broken into two similar sections, making the upper and lower bits, linked by a series capacitor which de-weights the lower bits. To achieve >12 bits performance various of these capacitors are trimmed by further small arrays of capacitors which are switched in or out at test [1,2] (Fig 1). The capacitors are usually double plates of polysilicon with silicon dioxide dielectric, which are very mechanically and electrically stable. Since these structures are very stable post-manufacturing, the trim is usually once-only with a small on-chip PROM, often comprising electrically-blown polysilicon fuses. This architecture is relatively cheap to manufacture and easy to use in the end application as no calibration cycles are needed. There are very few publications describing the internal detail of such converters but they are of great commercial importance.

A problem with SAR converters is that all bit trials are critical and errors are non-recoverable. A popular trend is to incorporate redundancy into the bit trials with an algorithm which permits errors made in the earlier (MSB) bit trials to be corrected later [8-13]. This gives improved noise immunity and permits a higher sampling rate by permitting accelerated bit trials. This also plays well to DSM scaling and mixed mode design. Various methods can be used, although all ultimately serve the same purpose and overcome the same weakness in a conventional binary weighted successive approximation search algorithm.

The problem is illustrated in Fig 2a. Assume the true input voltage is slightly below mid-scale, but the comparator makes an error and believes it lies slightly above. This error could be due to allowing insufficient time for DAC settling (e.g. settling to the 10 time constants required for 0.005% accuracy), analog noise, digital noise or reference or ground bounce. It is obvious that with a simple binary weighted search path there is no path through the search space which recovers the error. However, consider the search path of Fig. 2b with “one bit per bit redundancy”. After the first bit trial (with erroneous result) the search space is shifted by one quarter of its span in the direction of the result and the bit trial repeated. After this second bit trial the search space is halved normally according to the comparator result, but it is now certain that the true input value lies within the new reduced search space even with an error in the first comparator result of ¼ of the span of the search space. This algorithm converges on the correct result with a tolerance for error of ¼ of the current search space at each bit trial. Twice as many bit trials are required but each may be many times faster with, for example, DAC settling of only a few time constants.
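The two search paths described above can be compared with a small behavioural sketch. It assumes an idealised, normalised input range of 0 to 1, models the comparator error simply by inverting the first decision, and lets the shifted search window extend slightly beyond the nominal range; the function names are hypothetical and no particular circuit is implied.

```python
def sar_binary(vin, n_bits, first_trial_wrong=False):
    """Conventional binary search: one trial per bit, no recovery path."""
    lo, span = 0.0, 1.0
    for i in range(n_bits):
        above = vin >= lo + span / 2
        if i == 0 and first_trial_wrong:
            above = not above  # inject a comparator error on the MSB trial
        if above:
            lo += span / 2
        span /= 2
    return lo + span / 2  # centre of the final search space


def sar_redundant(vin, n_bits, first_trial_wrong=False):
    """Search with "one bit per bit redundancy": two trials per bit."""
    lo, span = 0.0, 1.0
    for i in range(n_bits):
        # First (fast, possibly erroneous) trial at the centre of the space.
        above = vin >= lo + span / 2
        if i == 0 and first_trial_wrong:
            above = not above
        # Shift the search space by a quarter span in the direction found.
        lo += span / 4 if above else -span / 4
        # Repeat the trial at the new centre, then halve the space normally.
        if vin >= lo + span / 2:
            lo += span / 2
        span /= 2
    return lo + span / 2


vin = 0.49  # slightly below mid-scale, as in the example above
print(abs(sar_binary(vin, 10, first_trial_wrong=True) - vin))
print(abs(sar_redundant(vin, 10, first_trial_wrong=True) - vin))
```

With the first decision inverted, the binary search ends about 1% of full scale away from the input, while the redundant search still finishes within half an LSB, at the cost of twice as many trials.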

In practice, one redundant bit per bit may be excessive. Some designers favour only one redundant bit in the whole array, typically around half way down the sequence of bit trials. Others favour perhaps one redundant bit per 4 bits; this latter still allows large errors but only increases the number of bit trials by a factor of 5/4.

A conceptually elegant alternative which achieves the same result is a successive approximation with an array with slightly less than binary weighting, as in Fig 2c. It will be apparent from inspection that this also allows a search path which converges on the correct result despite moderate errors on the way. However, it suffers the major disadvantage of using non-binary-weighted elements. The analog elements, typically capacitors, are difficult to make with accurate non-binary weighting, and the results of the non-binary bit trials require significant computation to map onto a binary coded output word.

As supply voltages are reduced with scaling there is generally a less-than-proportional decrease in MOS threshold voltages, together with a need to make the signal range as close to the full supply voltage as possible, so that on-resistance in the input sampling switches becomes increasingly important as the effective over-voltage reduces. This is eased by pumping or boot-strapping the gate voltage on the NMOS switches, but of course this should ideally be as much as possible without exceeding the process's absolute maximum voltage rating. Gate boot-strapping methods have been developed to do this accurately [6,7,17].

The above collection of design features make for cheap, easy to use and robust converters which are suitable for scaling to deep sub-micron CMOS processes. Successive approximation converters are the subject of very little published work but of great commercial importance which is likely to continue into the foreseeable future. For every paper published on the SAR solutions there are likely ten on the sigma-delta; but for every sigma-delta ADC manufactured there are likely ten SAR. Converters with signal bandwidths in the range 1-10MHz and resolutions in the range 12-16 bits are likely achievable in the near future at time of writing using both the SAR architectures discussed above and the low oversampling ratio sigma-delta discussed below as the two approaches increasingly converge.

3.2. Low Oversampling Ratio Sigma-Deltas

There is an observable architectural evolution or fashion trend at the leading edge:

- very high order single-bit single-loop: e.g. -order shaping such as the established CS5396 or AD7722
- single-bit multi-loop: e.g. 4 loops of order 2+1+1+1 (5th order) such as the more recent AD7723
- multi-bit single-loop or multi-loop designs using bit-shuffling to suppress non-linearity in the multi-bit DAC: e.g. 3 loops of order 2+1+1 (4th order), (3x4)-bit [15]

For high resolution at low bandwidths simple (e.g. order) single-bit single-loop modulators appear to have a commercially very important long term future. However, we focus here on the information-bandwidth leading edge, where the performance of interest could be 16 bits at 2Ms/s or 20 bits at 100ks/s (which have the same resolvable information rate).

Increasing the order of the noise shaping in single-bit single-loop designs has apparently reached its natural limit as instability limits the benefit of increasing orders of noise shaping. Converters have been made (and are commercially successful products from more than one company) with order loop filtering, but in truth order probably represents a useful maximum. Multi-loop and multi-bit designs achieve higher performance but there exists a very wide range of architectural alternatives. We will now discuss the optimisation of such architectures.

The principle of multi-loop architectures (Fig 3.) is that the quantization noise in the first loop is copied as an analog voltage into a second loop where it is measured, to be subtracted in the digital domain. This process may be repeated or cascaded indefinitely, resulting in a theoretical possibility of any desired SNR.
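A minimal behavioural sketch of this principle is given below for two cascaded first-order, single-bit loops. This topology is an assumption chosen for brevity and is not one of the architectures discussed in the text; the analog copy of the first-loop quantization error is taken to be perfect, no inter-stage scaling is applied, and all names are hypothetical.

```python
def mash_1_1(u):
    """Two cascaded first-order 1-bit loops with digital error cancellation."""
    s1 = s2 = 0.0              # integrator states of loop 1 and loop 2
    y1_prev = y2_prev = 0.0    # previous quantizer outputs
    out, e2_hist = [], []
    for x in u:
        y1 = 1.0 if s1 >= 0 else -1.0
        e1 = y1 - s1           # quantization error of loop 1
        y2 = 1.0 if s2 >= 0 else -1.0
        e2 = y2 - s2           # quantization error of loop 2
        # Digital recombination: delay loop 1, differentiate loop 2.
        out.append(y1_prev + (y2 - y2_prev))
        e2_hist.append(e2)
        s1 += x - y1           # loop 1 integrates the input minus its feedback
        s2 += -e1 - y2         # loop 2 digitises the (ideal) copy of -e1
        y1_prev, y2_prev = y1, y2
    return out, e2_hist


u = [0.3] * 200                # constant test input, illustrative value
out, e2 = mash_1_1(u)
# After start-up, out[n] = u[n-2] + e2[n] - 2*e2[n-1] + e2[n-2]:
# the first-loop error has cancelled and e2 is second-order shaped.
residual = [out[n] - u[n - 2] - (e2[n] - 2 * e2[n - 1] + e2[n - 2])
            for n in range(2, len(u))]
print(max(abs(r) for r in residual))
```

The cancellation is exact here only because the copy of the first-loop error is ideal; any gain error in that copy leaks first-loop noise into the output.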

The main limitation is the accuracy of the analog copy from the first to second loop; thus the lower the quantization noise in the first loop, the lower the accuracy requirement of this analog amplifier. This suggests that multiple orders of noise shaping and multiple bits of quantization be pushed into the first loop. However, aggressive noise shaping in the first loop may result in stability problems which limit the achievable shaping, as is well known. Also, if multi-bit quantization is used in the first loop, it must be bit-shuffled as its integral non-linearity otherwise appears directly in the digitised output. If the multi-bit quantizers are in second or subsequent loops, which are converting quantization noise and not signal, bit-shuffling is likely not needed. Further, the number of comparators needed in the multi-bit flash converters is minimised if the multi-bit quantization is spread across multiple flash converters in multiple loops: for example, if 6 bits of quantization are chosen, this requires 63 comparators (2^6 - 1) in one converter but ideally only 3x3=9 comparators if split into three 2-bit converters. Whether dither is required in the first loop is also arguable: for very low residual tones (e.g. <-100dB) in the digitised output, dither may be useful to ease the demands on the difference amplifier which must copy any tones generated by the first loop into the second loop for digitisation and cancellation.

Considering scalability to very deep sub-micron geometries, it is generally true that the analog difference amplifier which copies the quantization noise out of the first loop into the second becomes increasingly difficult to make accurate. At the same time, the multi-bit flash converters tend to take less die area and power; similarly the bit-shuffling required for multi-bit quantizers in the first loop. Thus scaling tends to favour pushing more noise shaping and more bits of quantization into the first loop, possibly to the extreme of moving back to a single loop.

It will be clear from the above that there are contradictory arguments in optimising a sigma-delta architecture. All will thus be compromises and the optimisation appears quite flat and broad, permitting different sensible designers to adopt very different solutions. Extreme architectural optimisations seen from different designers within one company are illustrated entertainingly by the AD7722 (unpublished) and AD9260 [17]: one is a 1-loop, 1-bit order design; the other has one-and-a-half loops with a 5-bit order loop followed by a 12-bit unshaped pipeline converter – thus 1.5 loops, 17-bit order (Fig. 4). Both doubtless seemed sensible decisions at the time (and work).

The above examples probably represent extremes of sensible design optimisations. A reasonable view could be that a good scalable optimised architecture has parameters in the following ranges:

- a total of 4-5 orders of noise shaping: fewer than 4 is undoubtedly “leaving potential performance on the table” while greater than 5 orders probably buys little improvement
- a total of some 4-8 bits of quantization: a few bits of quantization bring significant improvements, perhaps particularly in loop stability characteristics, but circuit complexity increases rather non-linearly with too many bits
- two or maybe three cascaded loops (although a single noise-shaped loop could also be a well optimised solution, as multi-bit quantization eases the loop stability criteria and permits very aggressive higher order noise shaping)
- at least 2 orders of noise shaping and 2-3 bits of quantization in the first loop, to ease the required accuracy of the amplifier which copies the quantization noise into the second loop
- the remaining bits of quantization spread as much as possible and probably not bit-shuffled
- bit-shuffling of the first loop multi-bit quantizer of course
- possibly dither in the first loop.

Such an architecture has characteristics which should make it potentially scalable to very deep sub-micron CMOS.

Bit-shuffling algorithms are the subject of much research and are becoming increasingly complex but effective. The principle is that a multi-bit flash ADC and then DAC in the noise shaping loop increases the SNR and also, importantly, increases the loop stability under normal and overload conditions. However, in the first loop where the signal is being directly digitised, any integral non-linearity in the loop feedback DAC will appear directly in the digitised signal. This non-linearity may nevertheless be converted to shaped noise by various versions of the old idea of dynamic element matching [31]. If a 3-bit DAC is made of 8 supposedly matched elements, and these elements are used in randomised combinations to create the 8 voltage levels, then it is obvious that this maps non-linearity due to element mis-match into noise. As always the detail is more complex but work in this area is relatively well published [14-30].
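The randomisation idea can be sketched in a few lines. The example below is a simplification under stated assumptions: plain random selection is used, which turns the mismatch into white rather than shaped noise (the published algorithms shape it further), the 1% mismatch is an arbitrary illustrative figure, and the names are hypothetical.

```python
import random


def dac_output(code, elements, rng=None):
    """Sum `code` unit elements: the first ones if rng is None, else a random set."""
    if rng is None:
        chosen = elements[:code]            # fixed (thermometer) selection
    else:
        chosen = rng.sample(elements, code)  # randomised selection
    return sum(chosen)


rng = random.Random(1)
# Eight nominally equal unit elements with about 1% random mismatch (assumed).
elements = [1.0 + rng.gauss(0.0, 0.01) for _ in range(8)]
unit = sum(elements) / len(elements)        # average element defines the ideal step

code = 4
fixed_error = dac_output(code, elements) - code * unit
trials = [dac_output(code, elements, rng) - code * unit for _ in range(20000)]
mean_error = sum(trials) / len(trials)

print("fixed selection error  :", fixed_error)
print("randomised, mean error :", mean_error)
```

With fixed selection the same error appears every time the code is used, which is a static non-linearity. With randomised selection the error averages towards zero and appears instead as noise from sample to sample.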


4. OPAMP ARCHITECTURES

The A/D converters discussed above are very difficult to drive. The capacitor-based successive approximation converters typically sample the input voltage directly onto the array capacitance to form a simple inherent sample-and-hold action. The capacitance is of order a few tens of picoFarads and this will not reduce with scaling as kT/C noise is typically the determinant of the noise floor. Sigma-delta converters are different but not necessarily more benign in that sampling is on a slightly smaller capacitance but at a higher (over-) sampling rate. This puts a very challenging charge-gulp recovery requirement on the driving opamp in addition to the straightforward challenge of achieving SINAD in the range 70-100dB at several MHz signal bandwidth with near rail-to-rail signal swing. Inherently, switched capacitor A/D converters, whether SAR or sigma-delta, do not have very good DC accuracy due to MOS switch injection. However it is quite easy to self-calibrate this out in the ADC, to create a DC accuracy specification which cannot be met by any opamp which is not trimmed or auto-zeroed. There are few discrete bipolar opamps which can meet this challenge and fewer which can be integrated in CMOS; and fewer still which are scalable.
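The reason the sampling capacitance cannot shrink follows from the kT/C relation, which a short calculation makes concrete. The 20 pF value matches the order of magnitude quoted above; the 2.5 V full scale and the 300 K temperature are assumptions for illustration only.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(capacitance, temperature=300.0):
    """RMS voltage noise sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(BOLTZMANN * temperature / capacitance)


def lsb_volts(full_scale, bits):
    """Size of one LSB for a given full-scale range and resolution."""
    return full_scale / 2 ** bits


noise = ktc_noise_vrms(20e-12)   # about 14.4 uV rms at 300 K
lsb16 = lsb_volts(2.5, 16)       # about 38 uV for the assumed 2.5 V range
print(noise, lsb16, noise / lsb16)
```

At 16 bits the sampled noise is already a sizeable fraction of an LSB. Halving the capacitance raises the noise by a factor of sqrt(2), and a lower supply shrinks the LSB, so scaling works against reducing the capacitor.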

The process scaling issues which directly impact opamp design are: supply voltage compression, low stage gain with reducing stackability and rising 1/f noise corner frequency. Conventionally CMOS opamps are designed using an architecture originally developed for bipolar implementation with +/-15v supplies (which in turn derives from vacuum tube designs): this typically comprises two gain stages with a differential input pair transconductance stage driving a single stage Miller integrator via a current mirror differential to single ended converter. We assume this is very familiar. However, it is badly affected by all the scaling issues summarised above. The following discussion assumes an objective of a scalable opamp architecture suited to buffering A/D and D/A converters with >12 bits resolution over a signal bandwidth from DC to several MHz.

In the light of the scaling issues summarised above, we postulate that an opamp architecture is “scalable” if it offers:


- “rail-to-rail” signal swing: this is most important at the output where a reasonable target is a signal p-p swing of >75% of VDD but is also desirable as a common mode range at the input for high impedance unity gain buffering
- multiple gain stages: e.g. 5 inverter stages
- suppression of low frequency flicker noise (which of course also brings good DC performance).

Rail-to-rail output stage design is well known [32] and the textbook methods appear adequately scalable with stacking of threshold and saturation voltages which can be accommodated within the shrinking supplies. However the equivalent textbook input stages [ibid.] with rail-to-rail common mode (CM) range appear challenging for high performance (MHz bandwidth and low THD) applications because of input offset shifting over the CM range as the input stage transitions between N- and P-MOS conduction.

Two scalable architectures are shown. Both can use rail-to-rail output stages to maximise p-p signal swing and thus SNR. Both are scalable in that they further meet the requirements of multiple (five) gain stages at low frequencies and suppression of flicker noise whilst retaining wide signal bandwidth and the capability of low distortion at order MHz bandwidths.

The design shown in Fig. 5 (unpublished) essentially splits the signal by frequency into three paths which are recombined by summing the outputs of transconductance stages, with the lowest frequencies passing through a chopped or auto-zeroed path with 5 gain stages while the high frequencies have a short un-chopped 2-stage path. This design only works well over a limited input CM range as the input differential pairs require some voltage headroom and if used in a non-inverting configuration the harmonic distortion will be limited by the input pair common mode rejection ratio. Further, the chopped or auto-zeroed path has a bandwidth much lower than the full signal bandwidth and cancellation of DC offsets and low frequency flicker noise effectively relies on time-averaging the input offset to zero. It is thus intolerant of non-linear transient overloads which will generally not average correctly to zero.

The low frequency path in this design can use any of the wide variety of chopping and auto-zeroing methods. This field has itself been the subject of a comprehensive review paper so will not be discussed further here [33].

It is thus suited to applications where the input common mode range and frequency spectrum are well defined and thus known to lie within the architecture’s limitations. It has, to the author’s knowledge, been used successfully as a D/A converter output reconstruction buffer and driver, with application-specific implementations in both 0.6um with 5v supply and 0.35um with 3v supply. The application in 0.6um delivers 6v p-p (differential) into 1 kohm with –85dB distortion at 300kHz; the 0.35um driver application delivers 4v p-p (differential) into just 8 ohms load with a class A/B output stage with order 10mm wide output MOS devices with –75dB distortion at 140kHz [Hurrell – private communication]. This architecture should scale well to smaller geometries and lower supply voltages.


The design in Fig. 6 (unpublished at time of writing) has an inherently rail-to-rail wide bandwidth input CM range and is tolerant of non-linear transients, which makes it better suited as an A/D converter buffer where the input signal is undefined. However it contaminates the signal with some (few millivolts) level of modulator frequency (>100MHz) noise which must then be suppressed by further filtering: for example, when driving an A/D converter with a 20pF capacitance a series resistance of around 500 ohms is necessary to provide the necessary attenuation of modulator feed-through. Alternatively, with some ADC architectures the ADC may be operated synchronously with the modulation at a somewhat reduced modulation frequency and the output of the opamp sampled twice per clock cycle with the signal voltage taken as being the sum or average of the two samples.
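The effect of the series resistor quoted above can be checked with a first-order estimate. The sketch treats the 500 ohm resistor and the 20 pF ADC input capacitance as a simple single-pole RC low-pass filter, which is an assumption; the real input network is switched and more complex.

```python
import math


def rc_corner_hz(resistance, capacitance):
    """-3 dB frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance * capacitance)


def attenuation_db(frequency, resistance, capacitance):
    """Magnitude response of the RC filter in dB at the given frequency."""
    ratio = frequency / rc_corner_hz(resistance, capacitance)
    return -10.0 * math.log10(1.0 + ratio ** 2)


R, C = 500.0, 20e-12
print(rc_corner_hz(R, C))           # about 15.9 MHz
print(attenuation_db(100e6, R, C))  # about -16 dB at a 100 MHz modulator frequency
print(attenuation_db(1e6, R, C))    # about -0.02 dB at a 1 MHz signal frequency
```

On this estimate the filter leaves a signal band of a few MHz essentially untouched while attenuating 100 MHz feed-through by roughly a factor of six.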

The input voltage is modulated up to a frequency well beyond the signal bandwidth; in this example a modulation frequency in the range 20-200MHz is practical in geometries in the range 0.6-0.25um with signal bandwidths of a few MHz. It is AC amplified by a 2-stage amplifier and demodulated back to base-band with a demodulator which incorporates a differential to single ended conversion, followed by a 3-stage integrator. With 3 gain stages the integrator requires an internal nested pole to preserve stability. The use of two gain stages in the AC amplifier does not greatly degrade the loop stability as the first stage must be run at quite high current levels to achieve adequately low thermal noise, resulting in it having very low delay.


Whilst this is a functional design, it benefits greatly from two key improvements, as follows (Fig. 7). It will be apparent that, with 5 gain stages and a nested pole, the amplifier’s transient response will tend to be poor. A transconductor (Gm stage) is thus added leapfrogging the 3-stage integrator. The 3-stage integrator is merged with the transconductor via a resistor with a value R=1/Gm. An analysis of this compound structure shows that it combines much of the low frequency gain of the 3 stages and the transient behaviour of the simple transconductor.

A further improvement is to incorporate the passive current filtering network shown between the demodulator and the integrator. Analysis will show that this network has a band-stop current transfer function with zero phase shift at a selected high frequency (Fig. 8), chosen to be the amplifier’s unity gain frequency.


In this example, optimised for an opamp with unity gain bandwidth of 40MHz and a maximum signal frequency of 1 MHz, it is seen that the effect of the filter is to permit a factor 3 reduction in integrator time constant to give 3x loop gain increase at the maximum signal frequency with zero phase loss at the unity gain bandwidth. With suitable optimisation of component values this permits a significant reduction in the value of the integrator time constant without loss of overall loop phase margin, with a corresponding increase in gain and thus reduction in harmonic distortion at the higher end of the signal spectrum.

This architecture has been implemented in 0.6um with 5v supply, achieving –80dB THD at 500kHz as a 2.5v p-p follower.

5. CONCLUSION

This review paper has identified the issues facing the designer of ADCs, DACs and buffering opamps which are: inherently robust in DSM CMOS from 0.5um / 5v down to 0.18um / 1.8v and potentially further; and achieve resolutions of >12 bits at bandwidths up to several MHz. Architectures which meet these requirements have been discussed.

6. REFERENCES

(successive approximation converters)
1) “A Two-Stage Weighted Capacitor Network for D/A-A/D Conversion” Yee, Terman and Heller, IEEE Jnl. of Solid State Circuits, Vol. 14, pp. 778-781, Aug. 1979
2) “A Low Power 12b Analog to Digital Converter with On-Chip Precision Trimming” de Wit et al., IEEE Jnl. of Solid State Circuits, Vol. 28, pp. 455-461, Apr. 1993

(self-calibration)
3) “A Self-Calibrating 15 bit CMOS A/D Converter” Lee, Hodges and Gray, IEEE Jnl. of Solid State Circuits, Vol. 19, pp. 813-819, Dec. 1984
4) “Architecture and Algorithm for Fully Digital Correction of Monolithic Pipelined ADCs” Soenen and Geiger, IEEE Trans. Circuits and Systems II, Vol. 42, pp. 143-153, March 1995
5) “200mW 1Ms/s 16-b Pipelined Converter with an On-chip 32-b Microcontroller” Mayes et al., IEEE Jnl. of Solid State Circuits, Vol. 31, pp. 1862-1872, Dec. 1996

(pumped switches)
6) “Two-phase Bootstrapped CMOS Switch Drive Technique and Circuit” Singer and Brooks, USP 6118326, Sep. 2000
7) “Very Low-Voltage Digital-Audio Delta-Sigma Modulator with 88dB Dynamic Range Using Local Switch Bootstrapping” Dessouky and Kaiser, IEEE Jnl. of Solid State Circuits, Vol. 36, pp. 349-355, Mar. 2001

(bit trial error correction algorithms)
8) “A 16 bit 500ks/s 2.7v 5mW ADC/DAC in 0.8um CMOS using Error-correcting Successive Approximation” Schofield, Dedic and Kemp, Proc. European Solid-State Circuits Conference, Southampton, 1997
9) “Successive Approximation Type Analog to Digital Converter with Repetitive Conversion Cycles” Dedic and Beckett, USP 5870052, Feb. 1999
10) “Method for Successive Approximation A/D Conversion” Cooper and Bacrania, USP 4620179, Oct. 1986
11) “Analog to Digital Conversion with Multiple Charge Balance Conversions” Cotter and Garavan, USP 5621409, Apr. 1997
12) “Charge Redistribution Analog to Digital Converter with Reduced Comparator Hysteresis Effects” Hester and Bright, USP 5675340, Oct. 1997
13) “Algorithmic Analog to Digital Converter Having Redundancy and Digital Calibration” Kerth and Green, USP 5644308, July 1997

(multibit sigma delta modulators)
14) “An Audio ADC Delta-Sigma Modulator with 100dB Peak SINAD and 102dB DR Using a Second-Order Mismatch-Shaping DAC” Fogleman et al., IEEE Jnl. of Solid State Circuits, Vol. 36, pp. 339-348, Mar. 2001
15) “A 90dB SNR 2.5MHz Output Rate ADC Using Cascaded Multibit Delta Sigma Modulation at 8x Oversampling Ratio” Fujimori et al., IEEE Jnl. of Solid State Circuits, Vol. 35, pp. 1820-1828, Dec. 2000
16) “113dB SNR Oversampling DAC with Segmented Noise shaped Scrambling” Adams, Nguyen and Sweetland, IEEE Jnl. of Solid State Circuits, Vol. 33, pp. 1871-1878, Dec. 1998
17) “Cascaded Sigma-Delta Pipeline A/D Converter with 1.25MHz Signal Bandwidth and 89dB SNR” Brooks et al., IEEE Jnl. of Solid State Circuits, Vol. 32, pp. 1896-1906, Dec. 1997
18) “Tree Structure for Mismatch Noise-Shaping Multibit DAC” Keady and Lyden, Elec. Letters, Vol. 33, pp. 1431-1432, Aug. 1997
19) “A 74dB Dynamic Range 1.1 MHz Signal Band Order 2-1-1 Cascade Multibit CMOS Sigma Delta Modulator” Madeiro et al., Proc. European Solid-State Circuits Conference, Southampton, 1997
20) “Delta-Sigma Data Converters” Norsworthy, Schreier and Temes, IEEE Press, 1997
21) “A Monolithic 19 bits 800kHz Low Power Multibit Sigma Delta Modulator CMOS ADC Using Data Weighted Averaging” Nys and Henderson, Proc. European Solid-State Circuits Conference, pp. 252-255, Southampton, 1996
22) “A Low Oversampling Ratio 14-b 500kHz Delta-Sigma ADC with a Self-Calibrated Multibit DAC” Baird and Fiez, IEEE Jnl. of Solid State Circuits, Vol. 31, pp. 312-320, Mar. 1996
23) “Linearity Enhancements of Multi Bit Delta-Sigma D/A and A/D Converters using Data Weighted Averaging” Baird and Fiez, IEEE Trans. Circuits and Systems II, Vol. 42, pp. 753-762, Dec. 1995
24) “A high Resolution Multi Bit Sigma Delta Modulator with Individual Level Averaging” Chen and Leung, IEEE Jnl. of Solid State Circuits, Vol. 30, pp. 453-460, Apr. 1995
25) “Data-directed Scrambler for Multi-Bit Noise-Shaping D/A Converters” Adams and Kwan, USP 5404142, Apr. 1995
26) “Noise Shaped Multi Bit D/A Converter Employing Unit Elements” Schreier and Zhang, Elec. Letters, Vol. 31, pp. 1712-1713, 1995
27) “A High Resolution Multi Bit Sigma Delta Modulator with Digital Correction and Relaxed Amplifier Requirements” Sarhang-Nejad and Temes, IEEE Jnl. of Solid State Circuits, Vol. 28, pp. 648-660, June 1993
28) “Fourth Order Two Stage Delta Sigma Modulator using both 1 Bit and Multi Bit Quantizers” Tan and Eriksson, Elec. Letters, Vol. 29, pp. 937-938, May 1993
29) “Multi Bit Sigma Delta A/D Converter Incorporating a Novel Class of Dynamic Element Matching Technique” Leung and Sutarja, IEEE Trans. Circuits and Systems II, Vol. 39, pp. 35-51, Jan. 1992
30) “A 50MHz Multi Bit Sigma Delta Modulator for 12 Bit 2MHz A/D Conversion” Brandt and Wooley, IEEE Jnl. of Solid State Circuits, Vol. 26, pp. 1746-1756, Dec. 1991
31) “Current Distribution Arrangement for Realising a Plurality of Currents having a Specific Very Accurately Defined Ratio Relative to Each Other” van de Plassche, USP 4125803, Nov. 1978

(operational amplifiers)
32) “Design of Low-power Low-voltage Operational Amplifier Cells” Hogervorst and Huijsing, Kluwer Academic Pub., 1996
33) “Circuit Techniques for Reducing the Effects of Opamp Imperfections: Autozeroing, Correlated Double Sampling and Chopper Stabilisation” Enz and Temes, Proc. IEEE, Vol. 84, pp. 1584-1614, Nov. 1996

SCALABLE “HIGH VOLTAGES” INTEGRATED CIRCUIT DESIGN FOR XDSL TYPE OF APPLICATIONS

Domenico ROSSI
Telecommunication and Peripheral/Automotive Group
Wireline Communication Division
ST Microelectronics, 20041 Agrate Brianza, V. Olivetti 2, Italy

ABSTRACT

Service providers and telcos are widely adopting ADSL technology to deliver high-speed data communication over traditional copper twisted pair. Continuous growth of this market has led to new requirements for lower cost, higher transmission bandwidth, improved power efficiency and longer reach. Most of these targets depend heavily on the electrical performance of XDSL line drivers and receivers which, for cost reasons, are nowadays often embedded with other functions. This paper describes the most recent advances in semiconductor technology and design techniques specifically adopted to comply with these technical demands. Practical examples of line drivers realized in different technologies and adopting different circuit architectures are also reported.

INTRODUCTION AND TUTORIAL ON SYSTEM REQUIREMENTS.

XDSL technology features significant improvements in data transmission compared to traditional analog modems by combining advanced signal processing techniques (digital modulation, digital equalization, error correction, etc.) with high performance analog interfaces.
To better understand what such analog interfaces (from the hybrid to the line drivers) ask for, and how this translates into specific requirements for semiconductor technologies and design skills, a short tutorial on XDSL system top level requirements is reported here.
For the sake of simplicity, this tutorial is limited to ADSL, but most of the considerations made here are easily extendable to any XDSL transmission.



Moreover, the electrical characteristics of an ADSL analog front-end, such as line driver linearity, are particularly stressed in the case of DMT, the modulation/demodulation technique typical of this kind of transmission.
As said before, ADSL relies on DMT modulation to carry digital data. For instance, the ADSL spectrum is composed of individual QAM-modulated sub-bands, uniformly spaced 4.3125 kHz apart in frequency and extending up to 1.1 MHz (see Figure 1-a).

Viewed in the time domain, a DMT signal appears as pseudo-random noise typically having a low rms voltage level (see Figure 1-b), but ADSL line drivers must also be capable of delivering the high voltage peaks that sometimes occur.
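The combination of low rms level and occasional high peaks can be reproduced with a simplified model of one DMT symbol. The sketch assumes 255 equal-amplitude tones on the 4.3125 kHz grid (reaching about 1.1 MHz), each with a random phase in place of real QAM data and bit loading; the names and parameters are illustrative only.

```python
import math
import random


def dmt_symbol(num_tones=255, samples=512, seed=0):
    """One period of a sum of equal-amplitude tones with random phases."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_tones)]
    return [
        sum(math.cos(2.0 * math.pi * k * n / samples + phases[k - 1])
            for k in range(1, num_tones + 1))
        for n in range(samples)
    ]


wave = dmt_symbol()
rms = math.sqrt(sum(v * v for v in wave) / len(wave))
peak = max(abs(v) for v in wave)
crest_factor = peak / rms

print("rms          :", rms)   # sqrt(255 / 2) for unit-amplitude tones
print("peak         :", peak)
print("crest factor :", crest_factor)
```

The rms value is fixed by the number of tones, while the peak depends on how the phases happen to line up. Over many symbols, rare alignments produce peaks several times the rms level, which is what sets the voltage handling required of the driver.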


Apart from voltage ratings, intermodulation is another key feature to look at carefully. To preserve signal integrity, the information contained in each sub-band must not be corrupted by any signal from other sub-bands. MTPR (multi-tone power ratio), defined as the difference, expressed in dBc, between the measured power in a sub-band left empty and the power of another sub-band, is the parameter used to quantify this feature.
As a consequence, a good line driver is a component featuring high voltage handling capability, high slew-rate and bandwidth and very good linearity.

Summarizing, the minimum specifications required to start the line driver design are:

1. Average power level required on the line (PL),
2. Crest Factor for the modulation chosen,
3. Line impedance assumed for the average power specification (RL),
4. Transmission frequency band (BW),
5. Target harmonic distortion.

The first three of these parameters may be used to compute the initial requirements for any line driver, which at a minimum has to deliver both the required voltage and current output swings. The maximum required line voltage, VLPP, might be computed by stepping through the following equations

The maximum VLPP on the line has to be taken as a primary design goal. For a given VLPP, the voltage handling capability of the line drivers depends on the characteristics of the hybrid used.
The hybrid, as anyone familiar with communications over twisted copper pair knows, is the component used to separate Rx from Tx signals, perform line termination, isolate the line from the modem, and optimize, when possible, the power delivered to the line.
Even if examples of fully monolithic line transceivers exist [1, 2], most of today's hybrids are transformer based (see Figure 2-a).
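The line-voltage calculation from PL, crest factor and RL can be sketched as follows. The steps used here (dBm to watts, rms voltage from power and impedance, peak from crest factor, peak-to-peak as twice the peak) are the usual relations and are assumed rather than taken from this text; the crest factor of 5.3 and the 100 ohm line impedance are illustrative assumptions.

```python
import math


def line_voltage_pp(pl_dbm, crest_factor, rl_ohms):
    """Peak-to-peak line voltage (VLPP) from power, crest factor and impedance."""
    p_watts = 1e-3 * 10 ** (pl_dbm / 10)   # dBm -> W
    v_rms = math.sqrt(p_watts * rl_ohms)   # P = Vrms^2 / RL
    v_peak = crest_factor * v_rms          # crest factor = Vpeak / Vrms
    return 2.0 * v_peak                    # peak-to-peak


# Assumed illustrative values: 100 ohm line, crest factor 5.3.
print(line_voltage_pp(20.4, 5.3, 100.0))   # downstream power level, about 35 V p-p
print(line_voltage_pp(13.0, 5.3, 100.0))   # upstream power level, about 15 V p-p
```

Under these assumptions the downstream case needs roughly 35 V peak-to-peak on the line, which is why the transformer turns ratio and the driver technology become central design choices.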


The transformer is, in fact, an “almost perfect” component, long used to match the load impedance while meeting the obvious voltage and current constraints of the component driving it (by changing the transformer turns ratio).
In practice, for a given VLPP and impedance, the amplifier’s output voltage and current can be traded off against the transformer’s turns ratio.
Increasing the turns ratio will not only decrease the required voltage swing (at the expense of a higher output current), but will also allow a lower supply voltage and, in turn, the use of low-voltage components/technologies.
There are, however, limitations to increasing the transformer turns ratio. For instance:

- High peak output currents will start to limit the available voltage swing, impacting the power efficiency of the power supply.
- A high turns ratio in transformers can limit bandwidth and be more prone to distortion.
- Often the transformer is also in the path of the received signal coming down the line. A high step-up ratio will impose a high step-down on the received signal, impacting the noise characteristics of the RX path and, hence, reach.

Examples of peak voltage and current output requirements for an ADSL line driver vs. different output power and transformer turns ratio are reported in Table 1. It must be noted that 13dBm corresponds to the power transmitted upstream, 20.4dBm to the power transmitted downstream.
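The trade-off between driver voltage and current can be illustrated with an idealised calculation. The sketch assumes a lossless 1:n transformer, a 100 ohm line, a crest factor of 5.3 and no series termination between amplifier and transformer, so its figures are not expected to reproduce the entries of Table 1; all names are hypothetical.

```python
import math


def driver_peaks(pl_dbm, turns_ratio, crest_factor=5.3, rl_ohms=100.0):
    """Peak voltage and current at the driver side of an ideal 1:n transformer."""
    v_rms = math.sqrt(1e-3 * 10 ** (pl_dbm / 10) * rl_ohms)
    v_line_peak = crest_factor * v_rms
    i_line_peak = v_line_peak / rl_ohms
    # An ideal step-up of n divides the voltage and multiplies the current.
    return v_line_peak / turns_ratio, i_line_peak * turns_ratio


for power in (13.0, 20.4):        # upstream and downstream power levels
    for n in (1.0, 2.0, 4.0):     # illustrative turns ratios
        v_peak, i_peak = driver_peaks(power, n)
        print(power, n, round(v_peak, 2), round(i_peak, 3))
```

Doubling the turns ratio halves the peak voltage the driver must deliver and doubles its peak current, so the peak output power is unchanged while the supply voltage and device voltage ratings can be relaxed.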

This said, it is also mandatory to match the line impedance. This can be achieved by adopting either passive impedance synthesis, as implemented in Figure 2-a, or active impedance synthesis (Figure 2-b).

While for passive impedance synthesis a series resistor is generally put at the output of the amplifier (but this resistor dissipates significant power, while the load is fed by just a part of the amplifier’s voltage swing), it is nowadays common practice to use active impedance synthesis to match the line impedance while minimizing the maximum output voltage swing and the dissipated power.
This driver uses both voltage and current feedback (through R3) to independently set the output impedance Rout and voltage gain G.
Calculating output voltage and current, it is possible to determine both the line driver output impedance and the gain, given as:

By using active impedance synthesis it is then possible to minimize the output voltage swing of the line driver.

WHAT HIGH VOLTAGE TECHNOLOGY FOR XDSL?

From the outset, microelectronics has been driven by the insatiable requirement for better performance and lower cost. This translates not only into smaller size for a given function but also into proper integration of different functions on a single semiconductor substrate.
Both approaches have often proven feasible and, on a case-by-case basis, economically applicable to different applications and market segments; both have also been adopted in the case of XDSL. Common to the two approaches is the requirement for high-speed components showing high fT and minimum parasitic capacitances even when withstanding high voltage conditions. To better understand the implications involved in realizing such components it is worth referring, for the sake of simplicity, to the voltage limitations of the npn transistor [3].


As shown in Figure 3, its collector-to-emitter breakdown voltage with the base shorted (BVces, usually made equal to BVcbo) mainly depends on the breakdown voltages of diodes D1 and D2. The net epitaxial layer thickness W1, its resistivity and the reach-through mechanism define the breakdown voltage of D1, while the breakdown of D2 mainly depends on the radius of curvature of the base diffusion.
In standard bipolar processes, an increased maximum sustainable voltage is achieved by increasing the thickness and the resistivity of the epitaxial layer. The bigger these two values, the bigger the lateral diffusion of the isolation layer and the bigger the size of all the junction-isolated components.
Minimizing the size of all these components means minimizing the epitaxial thickness and the out-diffusion of the buried layer during all thermal steps following the epitaxial growth.
In the following, two examples, not mutually exclusive, are reported.
Dielectric isolation is another technique often used. It often offers significant advantages over junction-isolated processes for high-speed analog circuits. Trench lateral isolation of SOI bonded wafers drastically improves circuit density for thick epitaxy because the lateral diffusion of the isolation diffusion is eliminated.

Table 2 details the difference between devices featuring the same current density and realized in the two technologies. The area of the junction-isolated PNP transistor is roughly four times that of a comparable dielectrically isolated PNP, while a junction-isolated NPN is roughly 1.5 times as large as the equivalent dielectrically isolated NPN.


Another way of minimizing the size of high-voltage components is the adoption of high-voltage DMOS components.
Once BVcbo is fixed, the voltage capability of a bipolar technology is, in fact, defined by the breakdown voltage BVceo (emitter-to-collector breakdown voltage with open base). [4]
Since BVceo is lower than BVcbo, a bipolar transistor cannot fully exploit the maximum voltage the technology is capable of.
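The relation between BVceo and BVcbo referred to above is commonly written in device textbooks in the following empirical form. It is given here as a standard approximation, not as the exact expression used in the original text; the symbols β (common-emitter current gain) and n (an empirical exponent, typically quoted as a small number in the range of roughly 2 to 6 for silicon) are introduced for this illustration.

```latex
BV_{ceo} \approx \frac{BV_{cbo}}{\sqrt[n]{\beta}}
```

As a rough numerical illustration, β = 100 and n = 4 give BVceo ≈ BVcbo / 3.2, so a considerable part of the junction breakdown capability is not usable by the bipolar device.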

By contrast, a DMOS component (see Figure 4) is capable of working at a breakdown voltage BVdss equal to the BVcbo of the parasitic npn component, provided that the base-to-emitter short circuit is good enough. To some extent, DMOS-capable junction-isolated technologies therefore feature small component size.
As an additional advantage (see the following), DMOS technologies can also be made compatible with CMOS transistors, which in turn enables the realization of highly complex mixed ICs.


INTEGRATION OF HIGH VOLTAGE DMOS COMPONENTS INTO SUBMICRON TECHNOLOGIES

Designing highly complex Smart Power ICs requires taking advantage of available low-voltage IPs, ideally by adding high-voltage devices to an already existing VLSI process platform.
Unfortunately, the evolution of smart power technologies toward finer and finer microlithography requires conflicting requirements to be reconciled. Drive-in steps for high-voltage power components are usually long and at high temperature, whereas sub-micron technologies require “low temperature” steps to guarantee good yield and process reproducibility (mainly for thin oxide layers).
This has been achieved, for instance, by exploiting innovative technology steps that have made possible the realization of fully complementary high-voltage N-channel and P-channel DMOS components in a standard VLSI CMOS technology. [5]
High-voltage lateral DMOS are implemented by realizing the body region by means of a large-angle tilt implantation masked by the gate layer, without requiring any specific thermal treatment.
The implant energy and tilt angle depend on the compromise between lateral/vertical junction depth and doping charge, i.e. between the required source-to-drain punch-through voltage and the component threshold voltage (while large tilt angles are more effective in pushing charge into the DMOS active channel, low tilt angles reduce channel charge and length, causing premature punch-through). A 45° angle is usually found to be the best compromise between these two opposing requirements.
In BCD6 (0.35um) the N-LDMOS P-body layer is directly embedded in CMOS epi-pockets.
Scaling down the gate oxide thickness also requires proper engineering of the LDMOS drain structure. In BCD6, LDMOS and CMOS share exactly the same gate oxide (70nm).
To avoid dangerous crowding of the equipotential lines at the drain side, it is possible to adopt a gate layout stepping over the field oxide, while changing the doping profile of the N-Well makes it possible to size the drain extension region properly.
With different DMOS drain solutions, voltage capabilities from 16V to 20V are achievable.
When higher operating voltages are required, a dedicated low-doping N-Well is added. In this way, breakdown voltages in excess of 60V are achievable.
To further increase BVdss, the heavily doped N+ buried layer is replaced with a low-doping buried well and the RESURF technique is adopted. In this case a breakdown voltage in excess of 100V is easily achievable.

Table 3 summarizes the main features of the N-channel lateral DMOS realized in BCD6 (0.35 um CMOS).

Exploiting the flexibility offered by the large-tilt implant technique used to realize the N-channel DMOS P-body region, it is possible to implement an N-type body region to build P-channel DMOS transistors.
As a matter of fact, fully complementary N-channel and P-channel components are nowadays available in low-voltage semiconductor processes.

COMPLEMENTARY, DIELECTRICALLY ISOLATED BIPOLAR TECHNOLOGY ON SOI

For XDSL applications, SOI and dielectric isolation are nowadays gaining more and more acceptance because of their good characteristics in terms of speed. Minimizing the component size translates automatically into reduced parasitic capacitances. [6]
P and N buried collectors are usually formed by ion implantation, after which an n-type epitaxial layer is grown to form the intrinsic collector of the NPN. A p-well is added to form the intrinsic collector of the PNP. Lateral isolation is achieved by etching trenches down to the buried oxide. The trenches are usually filled with LPCVD oxide and polysilicon. Transistor emitters can be either silicon or poly. Referring again to Table 2, the base-to-collector junction capacitance of the junction-isolated NPN is roughly twice that of the dielectrically isolated NPN, and its substrate capacitance is three times bigger. The same applies to the PNPs.
Measurements reveal that the cut-off frequency of the dielectrically isolated bipolar transistors is much higher. Nowadays it is possible to easily obtain NPN and isolated-collector PNP transistors featuring Ft of more than 2 / 6 GHz for NPN and 2 / 4 GHz for PNP.

AN EXAMPLE OF ADSL LINE DRIVER REALIZED IN MIXED BIPOLAR, CMOS, DMOS TECHNOLOGY

Even if most of today’s available line drivers for the C.O. (Central Office) are realized in fully complementary high-speed bipolar processes, an example of a line driver realized in Multipower BCD (Bipolar, CMOS, DMOS) technology is reported here. The functional diagram is shown in Figure 5.

It consists of a differential gain stage followed by a class AB output stage. The input stage is a simple emitter-coupled pair in which low-voltage, high-speed (ft=7GHz) npn transistors are used to achieve low input-referred noise. The intermediate stage is a classical class AB stage used to guarantee high slew rate with low quiescent current. While low-voltage npn transistors (in fact cascoded) are still used here for low noise, the unavailability of a pnp counterpart led to the use of PDMOS components. The outputs of the class AB intermediate stage (Vp and Vn) directly drive the gates of a push-pull common-drain output stage (PDMOS M13 and NDMOS M14). The quiescent current of M13 and M14 is controlled by current mirroring between M12 and M13, closed through the OTA. The key features of this ADSL line driver are reported in Table 4.

AN EXAMPLE OF ADSL LINE DRIVER REALIZED IN HIGH SPEED COMPLEMENTARY SOI TECHNOLOGY

Advanced complementary, SOI-isolated bipolar processes, which sometimes also allow the integration of submicron CMOS, have recently been developed to allow the realization of high-performance ADSL line drivers. [7] High-voltage (BVces>30V) semiconductor technologies offering transistors with ft in excess of 4GHz for pnp and in excess of 6GHz for npn are, as a matter of fact, nowadays available. In these technologies, current feedback is very often adopted (see Figure 6).

The superior characteristics of SOI in terms of ft and parasitic capacitances easily allow high small-signal bandwidth and slew rate, while small base resistance (often found in SOI technologies) and reduced biasing current result in low input voltage and current noise. Some key features of a commercially available current-feedback C.O. driver realized in SOI are reported in Table 5.

Moreover, examples of SOI technologies that also allow the fabrication of accurate laser-trimmed analog filters have recently been announced [XX].


CONCLUSIONS

The analog front-ends (AFE) of XDSL modems are typically partitioned into two technologies. Data converters, analog filters and Rx amplifiers are fabricated in low-voltage technologies, while XDSL line drivers employ higher-voltage processes. However, the high-voltage processes available nowadays, which often embed submicron CMOS components, make it possible to conceive a different system partitioning, with data converters, analog filters and Rx amplifiers integrated together with the line drivers. Examples exist of ICs that economically integrate all these functions and are realized either in a fully complementary bipolar technology or in a CMOS/DMOS-centered technology.

REFERENCES

(1) Zojer et al., “A Broadband High-Voltage SLIC for a Splitter and Transformerless Combined ADSL-Lite/POTS Line Card”, ISSCC Digest of Technical Papers, pp. 304-305, Feb. 2000.

(2) Berton et al., “A High Voltage Line Driver (HVLDR) for Combined Data and Voice Services”, ISSCC Digest of Technical Papers, pp. 302-303, Feb. 2001.

(3) P. Antognetti, Editor, “Power Integrated Circuits: Physics, Design, and Applications”, McGraw-Hill, pp. 4.13-4.17.

(4) B. Murari, F. Bertotti, G.A. Vignola, “Smart Power ICs: Technologies and Applications”, Springer, pp. 179-180.

(5) C. Contiero et al., “LDMOS Implementation by Large Tilt Implant in 0.6 BCD Process, Flash Memory Compatible”, Proceedings ISPSD’99.

(6) R. Patel et al., “A 30V Complementary Bipolar Technology on SOI for High Speed Precision Analog Circuits”, IEEE BCTM 2.3, pp. 48-50.

(7) M. Cresi et al., “An ADSL Central Office Analog Front-End Integrating Actively-Terminated Line Driver, Receiver and Filters”, ISSCC Digest of Technical Papers, pp. 304-305, Feb. 2001.

SCALABILITY OF WIRE-LINE ANALOG FRONT-ENDS

Klaas BULT
Broadcom Netherlands B.V.
Bunnik, The Netherlands.

ABSTRACT

Analog design in deep sub-micron technologies is a reality now and poses severe challenges to the circuit designer. Trends in technologies as well as their effects on circuit design are discussed. It is shown that, specifically for Wire-Line AFE’s, the power required for a certain dynamic range and bandwidth decreases with minimum feature size as long as a constant ratio between signal swing and supply voltage can be maintained. However, below a certain channel length, predictions of the threshold voltage endanger that requirement.

1. INTRODUCTION

In Wire-Line applications (like Ethernet, Gigabit, Set-Top Boxes, Cable Modems, etc.), analog integration in deep sub-micron CMOS has become an economic necessity. Several papers have already discussed the problems and design challenges of analog circuits integrated in purely digital deep sub-micron CMOS technologies [1] - [16]. This paper will discuss trends in technologies and their effects on circuit design, specifically focussed on Analog Front-Ends (AFE’s) for Wire-Line applications. Emphasis will be on the effect of supply voltage scaling on circuit design and performance.

After discussing a generic Wire-Line Analog Front-End in section 2, section 3 deals with process scaling. Section 4 then deals with the effect of process scaling on Power Dissipation and in section 5 experimental data from literature corroborates the findings of section 4. Section 6 puts the previous results in perspective by discussing some details and caveats. In section 7, finally, the scalability of Wire-Line AFE’s is discussed. Section 8 summarizes the conclusions.



2. WIRE-LINE ANALOG FRONT-END’S

Wire-Line IC’s are a typical example of ULSI integration dominated by digital circuitry, with some peripheral analog circuitry. Analog signal processing is usually kept to a minimum. A generic AFE is depicted in Fig. 1. The analog input signal either comes from the wire-line hybrid (as in Ethernet) or through an RF tuner (as in Cable applications). Gain, gain control and filtering may or may not be applied, depending on the application. The Track and Hold (T&H) and ADC functions, however, are mandatory and form the core of the AFE.

3. PROCESS SCALING

Of all the aspects of design in deep sub-micron technologies, the scaling of the supply voltage is the most obvious and the one that most severely affects analog circuit design [5, 10, 12, 14, 15]. Fig. 2 shows the 1999 International Technology Roadmap for Semiconductors predicting a maximum 0.6V supply voltage for the year 2010 [17]. To get a feeling for how process parameters have changed over time and will change in the near future, Table 1 gives an overview of 14 different processes, of which only the last two are predictions (data from [15, 17, 18, 21]). The supply voltage, oxide thickness, threshold voltage and matching parameter of these 14 processes are plotted on a log-log scale in Fig. 3. For the larger technologies the supply voltage stays flat and equals 5.0V. In smaller technologies it scales roughly linearly with minimum feature size (although it follows a staircase function). Fig. 3 shows that both oxide thickness and matching scale down linearly with technology. Fig. 3 also shows the threshold voltage clearly not scaling linearly, but more like a square-root function. The effect of that on voltage headroom is still not that strong, as the threshold voltage is still only 25% of the supply voltage. This might change in the smallest technologies, as preliminary estimates show the threshold voltage to have a lower limit of approximately 300mV.

4. VOLTAGE SCALING AND POWER DISSIPATION

Due to the continual down-scaling of the supply voltage over time, the Dynamic Range (DR) requires extra attention in circuit design. It has been shown that, especially in ADC front-end circuitry, matching is more dominant than noise in determining the low end of the dynamic range [4,5]. Matching is reported to scale with oxide thickness, which is clearly visible in Fig. 3. Defining the Dynamic Range (DR) as the ratio between the signal swing and the mismatch-induced offset (Fig. 4a), and taking that offset as n times the standard deviation of the mismatch [20,21], where n is the number of sigmas necessary for a certain yield, we find expression (1) for the DR, in which the voltage efficiency is the ratio between signal swing and supply voltage. Fig. 4a depicts the DR and the terms it consists of, where sqrt(WL) (height of lower shaded area) is adjusted such that a constant DR is obtained over all processes. Using the inverse of (1), expression (2), the gate capacitance may be derived as (3); it contains a constant depending purely on technology. This capacitance is the gate capacitance of the transistor with a matching requirement to support a certain Dynamic Range (DR). Assuming a maximum signal frequency, the slew-rate current needed to support the signal swing at that frequency is given by (4). If this current is delivered by a driver with a certain current efficiency, then the power dissipation P follows as (5).

The relationship described by (5) is depicted in Fig. 4b. The process data of Table 1 is used in Fig. 5, where power (P) is plotted against technology according to equation (5). As can be seen from this figure, the data follows the same shape as predicted by the curves in Fig. 4a and 4b.
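The chain of reasoning from dynamic range to power can be sketched as follows. This is a reconstruction under assumed notation, not the chapter's own equations: the symbols A_VT (matching parameter), k_m (proportionality between A_VT and oxide thickness), η_v and η_i (voltage and current efficiency), V_sig, f_sig and ε_ox are introduced here for illustration, and constant factors relating peak and rms quantities are ignored.

```latex
\[
\begin{aligned}
\sigma_{VT} &= \frac{A_{VT}}{\sqrt{WL}}, \qquad A_{VT} = k_m\, t_{ox} \\
DR &= \frac{V_{sig}}{n\,\sigma_{VT}}
    = \frac{\eta_v V_{dd}\sqrt{WL}}{n\,A_{VT}},
    \qquad \eta_v = \frac{V_{sig}}{V_{dd}} \\
WL &= \left(\frac{n\,A_{VT}\,DR}{\eta_v V_{dd}}\right)^{2} \\
C_{gate} &= \frac{\varepsilon_{ox}}{t_{ox}}\,WL
    = \varepsilon_{ox}\,k_m^{2}\,t_{ox}\left(\frac{n\,DR}{\eta_v V_{dd}}\right)^{2} \\
I_{SR} &= 2\pi f_{sig}\,V_{sig}\,C_{gate} \\
P &= V_{dd}\,\frac{I_{SR}}{\eta_i}
    = \left(\varepsilon_{ox}\,k_m^{2}\,t_{ox}\right)\cdot\frac{1}{\eta_v\,\eta_i}\cdot n^{2}\cdot 2\pi f_{sig}\,DR^{2}
\end{aligned}
\]
```

In this sketch the supply voltage cancels out of the final expression, and the result splits into a process term proportional to oxide thickness, an efficiency term, a yield term and a system term, which is consistent with the four-term description of (5) given in the text.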

Expression (5) consists of 4 separate terms. The first term is process dependent only and is mainly dependent on oxide thickness. The second term is the product of voltage efficiency and current efficiency and reflects circuit “smartness”. The third term is a result of yield requirements and the last term depicts the system needs. These components are also visible in Fig. 4b. If constant circuit smartness, yield and system requirements are assumed, power scales down with oxide thickness. From this point of view the future of analog design in deep submicron does not seem so bleak. The crux of the above assumption obviously lies in maintaining a constant product of voltage efficiency and current efficiency. As can be clearly seen from Fig. 3, the voltage efficiency will be negatively affected by the fact that the threshold voltage is not scaling linearly with technology.
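The four-term structure of the power estimate can be explored numerically. The Python sketch below is an illustration only: it assumes the power takes the form (process term) x 1/(efficiency product) x (yield term) x (system term), with a matching parameter proportional to oxide thickness. The function name, the default efficiencies, the 3-sigma yield figure and the matching slope k_match (about 1 mV·um per nm of oxide, a common rule of thumb) are assumptions, not values from the chapter.

```python
import math

EPS_OX = 3.9 * 8.854e-12  # permittivity of SiO2 in F/m

def matching_limited_power(t_ox, dr, f_sig, n_sigma=3.0,
                           eta_v=0.5, eta_i=0.5, k_match=1.0):
    """Per-transistor power estimate when matching and slew rate dominate.

    t_ox     oxide thickness in m
    dr       required dynamic range as a plain ratio
    f_sig    maximum signal frequency in Hz
    k_match  assumed slope A_VT / t_ox in V (1.0 V equals 1 mV*um per nm)
    """
    process_term = EPS_OX * k_match ** 2 * t_ox       # scales with t_ox
    smartness_term = 1.0 / (eta_v * eta_i)            # circuit efficiency
    yield_term = n_sigma ** 2                         # number of sigmas
    system_term = 2.0 * math.pi * f_sig * dr ** 2     # bandwidth and DR
    return process_term * smartness_term * yield_term * system_term

# Halving the oxide thickness at constant efficiencies halves the estimate
p_thick = matching_limited_power(7e-9, 64, 100e6)
p_thin = matching_limited_power(3.5e-9, 64, 100e6)
print(p_thin / p_thick)
```

The supply voltage does not appear as a parameter, which reflects the point made in the text: power scales with oxide thickness only as long as the product of voltage and current efficiency can be held constant.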

5. EXPERIMENTAL DATA FROM LITERATURE

As a test of the above derivation of circuit performance versus technology, data was gathered from 15 different 6-bit ADC’s [24-38]. An overview of this data is shown in Table 2. A figure of merit for 6-bit ADC’s can be defined and is plotted against technology in Fig. 6. Assuming the majority of the layout scales with technology (i.e. source and drain diffusion areas, contacts and wiring), and with (5) predicting the power P to scale with oxide thickness, the figure of merit is expected to scale with technology, as is shown clearly in Fig. 6. Compensating for this effect yields a technology-independent figure of merit, shown in Fig. 6 by the open dots. The best-fitting straight line is indeed independent of technology (i.e. has a slope of 0).

6. DISCUSSION

The derivation of Power Dissipation as a function of Technology scaling given in section 4 was done under the assumption of Matching and Slew-Rate being the dominant design issues. Fig. 7 shows the Power Estimate versus Technology based on this assumption (curve a). It also shows 2 other Power Estimates. Curve b) is based on the assumption that Matching and Bandwidth are dominant. It can be shown that this requirement is basically independent of Technology and is currently still significantly less important than Bandwidth and Slew-Rate. Curve c) shows the required Power Dissipation to meet the Thermal Noise specifications. As is also shown by other authors [4,5], the required Power under this condition is currently still several orders of magnitude lower than curve a), but increases with smaller Technologies. Flicker Noise still has to be added to that, but Flicker Noise predictions for future technologies have proven to be hard. In any case, the effect of Flicker Noise is that curve c) will be raised dramatically and, ultimately, Noise will be the dominant requirement as far as Power Dissipation is concerned. The question is how many Technology generations we are away from this point.

Moreover, all of the above estimates are the required Power for one single Transistor meeting either the Matching, Slew-Rate, Bandwidth or Noise specifications. To obtain the Power dissipation of a complete circuit, one has to multiply this estimate by the number of Transistors (or rather branches) in the circuit that have to comply with these requirements. Furthermore, the estimated power dissipation assumes no circuit tricks such as Dynamic Element Matching [39], Chopping [39], Auto-Zero Techniques [29, 34] or Averaging [40]. Use of such techniques can reduce the power requirements based on matching by as much as an order of magnitude and will lower curves a) and b) in Fig. 7 equivalently. Noise is usually affected at lower frequencies only and as a result curve c) remains more or less in its place. Although the effect on the current situation is not dramatic, it does move the cross-over point several Technology generations earlier.
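The bookkeeping described above (per-transistor estimates, multiplied by the number of branches, with matching-related power reduced by circuit techniques while the noise requirement stays put) can be sketched in a few lines of Python. The function name circuit_power and all numbers in the usage lines are hypothetical; the per-branch estimates are inputs, not values read from Fig. 7.

```python
def circuit_power(p_matching_branch, p_noise_branch, branches,
                  matching_reduction=1.0):
    """Total power and dominant requirement for a circuit.

    p_matching_branch   per-branch power needed for matching (curve a or b)
    p_noise_branch      per-branch power needed for noise (curve c)
    branches            number of branches that must meet the requirements
    matching_reduction  factor by which techniques such as averaging or
                        auto-zeroing lower the matching requirement
    """
    p_match = p_matching_branch / matching_reduction
    dominant = "matching" if p_match >= p_noise_branch else "noise"
    return branches * max(p_match, p_noise_branch), dominant

# Without circuit techniques the matching requirement dominates
print(circuit_power(1e-3, 2e-4, 64))
# A tenfold reduction of the matching requirement lets noise take over
print(circuit_power(1e-3, 2e-4, 64, matching_reduction=10.0))
```

The second call illustrates why lowering curves a) and b) moves the cross-over with the noise curve to an earlier technology generation: the matching requirement drops while the noise requirement is unchanged.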

7. SCALING OF WIRE-LINE ANALOG FRONT-ENDS

Consider again the generic block diagram of Wire-Line AFE’s in Fig. 1. The PGA and the LPF, if present, are usually primarily passive and do not contribute considerably to the overall Power Dissipation. The main blocks to consider are the Track & Hold Amplifier and the ADC. As discussed above, the ADC is a perfect example of a circuit dominated by Matching, and Noise is much less of a problem. Therefore, ADC Power Dissipation will follow curve a) as a result of Technology scaling. Amplifier design is usually not affected by Matching and is usually governed by its Noise requirements. However, as the Load-Capacitance of the Track & Hold circuit is formed by the input capacitance of the ADC, and hence is dominated by Matching requirements, T&H Power Dissipation will also follow curve a) as a result of Technology scaling. This leads to the conclusion that Wire-Line AFE’s will require less Power as a result of Technology scaling. Flicker Noise, however, may change that picture at some point in the future.

8. CONCLUSION

Analog design in deep sub-micron technologies has become a reality now and poses severe challenges to the circuit designer. Trends in technologies and their effects on circuit design have been discussed. It has been shown that, specifically for Wire-Line AFE’s, the power required for a certain Dynamic Range and Bandwidth decreases with minimum feature size. This is primarily due to the fact that Wire-Line AFE’s are dominated by the ADC design, which in turn is dominated by Matching requirements, and Matching improves with thinner Oxides. The reduction of power dissipation with Technology scaling is based, however, on a constant voltage and current efficiency. This is where the design challenge lies, as below a certain channel length predictions of the threshold voltage endanger that requirement.

REFERENCES

[1] W. Sansen, “Mixed Analog-Digital Design Challenges”, IEEE Colloq. System on a Chip, pp. 1/1 - 1/6, Sept. 1998.

[2] B. Hosticka et al., “Low-Voltage CMOS Analog Circuits”, IEEE Trans. on Circ. and Syst., vol. 42, no. 11, pp. 864-872, Nov. 1995.

[3] W. Sansen, “Challenges in Analog IC Design in Submicron CMOS Technologies”, Analog and Mixed IC Design, IEEE-CAS Region 8 Workshop, pp. 72-78, Sept. 1996.

[4] Peter Kinget and Michiel Steyaert, “Impact of transistor mismatch on the speed-accuracy-power trade-off of analog CMOS circuits”, Proc. IEEE Custom Integrated Circuit Conference, CICC96, pp. 333-336, 1996.

[5] M. Steyaert et al., “Custom Analog Low Power Design: The problem of low-voltage and mismatch”, Proc. IEEE Custom Int. Circ. Conf., CICC97, pp. 285-292, 1997.

[6] V. Prodanov and M. Green, “Design Techniques and Paradigms Toward Design of Low-Voltage CMOS Analog Circuits”, Proc. 1997 IEEE International Symposium on Circuits and Systems, pp. 129-132, June 1997.

[7] W. Sansen et al., “Towards Sub 1V Analog Integrated Circuits in Submicron Standard CMOS Technologies”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 186-187, Feb. 1998.

[8] Q. Huang et al., “The Impact of Scaling Down to Deep Submicron on CMOS RF Circuits”, IEEE J. Solid-State Circuits, vol. 33, no. 7, pp. 1023-1036, July 1998.

[9] R. Castello et al., “High-Frequency Analog Filters in Deep-Submicron CMOS Technologies”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 74-75, Feb. 1999.

[10] Klaas Bult, “Analog Broadband Communication Circuits in Deep Sub-Micron CMOS”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 76-77, Feb. 1999.

[11] J. Fattaruso, “Low-Voltage Analog CMOS Circuit Techniques”, Proc. Int. Symp. on VLSI Tech., Syst. and Appl., pp. 286-289, 1999.

[12] Daniel Foty, “Taking a Deep Look at Analog CMOS”, IEEE Circuits & Devices, pp. 23-28, March 1999.

[13] D. Buss, “Device Issues in the Integration of Analog/RF Functions in Deep Submicron Digital CMOS”, IEDM Techn. Dig., pp. 423-426, 1999.

[14] A. J. Annema, “Analog Circuit Performance and Process Scaling”, IEEE Trans. on Circ. and Syst., vol. 46, no. 6, pp. 711-725, June 1999.

[15] M. Steyaert et al., “Speed-Power-Accuracy Trade-off in high-speed Analog-to-Digital Converters: Now and in the future...”, Proc. AACD, Tegernsee, April 2000.

[16] J. Burghartz et al., “RF Potential of a 0.18-um CMOS Logic Device Technology”, IEEE Trans. on Elec. Dev., vol. 47, no. 4, pp. 864-870, April 2000.

[17] Abrishami et al., “International Technology Roadmap for Semiconductors”, Semiconductor Industry Assoc., 1999.

[18] C. Hu, “Future CMOS Scaling and Reliability”, IEEE Proceedings, vol. 81, no. 5, pp. 682-689, May 1993.

[19] B. Davari et al., “CMOS Scaling for High Performance and Low-Power - The Next Ten Years”, IEEE Proceedings, vol. 83, no. 4, pp. 595-606, April 1995.

[20] K. Lakshmikumar et al., “Characterization and Modelling of Mismatch in MOS Transistor for Precision Analog Design”, IEEE J. of Solid-State Circ., vol. SC-21, no. 6, pp. 1057-1066, Dec. 1986.

[21] M. Pelgrom et al., “Matching Properties of MOS Transistors”, IEEE J. of Solid-State Circ., vol. 24, no. 5, pp. 1433-1439, Oct. 1989.

[22] T. Mizuno et al., “Experimental Study of Threshold Voltage Fluctuation Due to Statistical Variation of Channel Dopant Number in MOSFET’s”, IEEE Trans. on Elec. Dev., vol. 41, no. 11, pp. 2216-2221, Nov. 1994.

[23] M. Pelgrom et al., “Transistor matching in analog CMOS applications”, IEEE IEDM Techn. Dig., pp. 915-918, 1998.

[24] K. McCall et al., “A 6-bit 125 MHz CMOS A/D Converter”, Proc. IEEE Custom Int. Circ. Conf., CICC, 1992.

[25] M. Flynn and D. Allstot, “CMOS Folding ADCs with Current-Mode Interpolation”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 274-275, Feb. 1995.

[26] F. Paillardet and P. Robert, “A 3.3 V 6 bits 60 MHz CMOS Dual ADC”, IEEE Trans. on Cons. Elec., vol. 41, no. 3, pp. 880-883, Aug. 1995.

[27] J. Spalding and D. Dalton, “A 200MSample/s 6b Flash ADC in 0.6µm CMOS”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 320-321, Feb. 1996.

[28] R. Roovers and M. Steyaert, “A 175 Ms/s, 6b, 160 mW, 3.3 V CMOS A/D Converter”, IEEE J. of Solid-State Circ., vol. 31, no. 7, pp. 938-944, July 1996.

[29] S. Tsukamoto et al., “A CMOS 6-b, 200 MSample/s, 3 V-Supply A/D Converter for a PRML Read Channel LSI”, IEEE J. of Solid-State Circ., vol. 31, no. 11, pp. 1831-1836, Nov. 1996.

[30] D. Dalton et al., “A 200-MSPS 6-Bit Flash ADC in 0.6-µm CMOS”, IEEE Trans. on Circ. and Syst., vol. 45, no. 11, pp. 1433-1444, Nov. 1998.

[31] M. Flynn and B. Sheahan, “A 400-MSample/s 6-b CMOS Folding and Interpolating ADC”, IEEE J. of Solid-State Circ., vol. 33, no. 12, pp. 1932-1938, Dec. 1998.

[32] S. Tsukamoto et al., “A CMOS 6-b, 400-MSample/s ADC with Error Correction”, IEEE J. of Solid-State Circ., vol. 33, no. 12, pp. 1939-1947, Dec. 1998.

[33] Y. Tamba and K. Yamakido, “A CMOS 6b 500MSample/s ADC for a Hard Disk Drive Read Channel”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 324-325, Feb. 1999.

[34] K. Yoon et al., “A 6b 500MSample/s CMOS Flash ADC with a Background Interpolated Auto-Zero Technique”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 326-327, Feb. 1999.

[35] I. Mehr and D. Dalton, “A 500-MSample/s, 6-Bit Nyquist-Rate ADC for Disk-Drive Read-Channel Applications”, IEEE J. of Solid-State Circ., vol. 34, no. 7, pp. 912-920, July 1999.

[36] K. Nagaraj et al., “Efficient 6-Bit A/D Converter Using a 1-Bit Folding Front End”, IEEE J. of Solid-State Circ., vol. 34, no. 8, pp. 1056-1062, Aug. 1999.

[37] K. Nagaraj et al., “A 700MSample/s 6b Read Channel A/D Converter with 7b Servo Mode”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 426-427, Feb. 2000.

[38] K. Sushihara et al., “A 6b 800MSample/s CMOS A/D Converter”, IEEE Int. Solid-State Circ. Conf., Dig. Tech. Papers, pp. 428-429, Feb. 2000.

[39] R. v. d. Plassche, “Integrated Analog-to-Digital and Digital-to-Analog Converters”, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1994.

[40] K. Bult and A. Buchwald, “An embedded 240-mW 10-b 50-MS/s CMOS ADC in 1-mm²”, IEEE J. of Solid-State Circ., vol. 32, no. 12, pp. 1887-1895, Dec. 1997.

Reusable IP Analog Circuit Design

Jörg Hauptmann, Andreas Wiesbauer, Hubert Weinberger

Infineon Technologies, Design Centers Austria GmbH
Villach, Austria

ABSTRACT

As ‘Time to market’ plays a crucial role for successful System on Chip (SoC) business, all chip companies try to drastically reduce development cycle times. Especially in analog circuit design this is an extraordinarily challenging target. Decreasing supply voltages, along with the fast introduction of new sub-micron technologies and increased performance and functionality, would rather suggest an increase of design effort. But making use of IP reuse can help a lot to achieve development cycle time reduction. A review of possible reuse methods and comments on their feasibility are presented in this paper.

1) INTRODUCTION

In the last 10 years the development and introduction of new sub-micron technologies has been very aggressive, with a new technology released every other year. Today’s sub-micron technologies allow integration of millions of digital gates on one silicon die, thereby creating the complex SoC designs requested by the market. Due to the cost-saving potential, the market demands migration of existing system solutions into the most recent and smallest technologies available, while additionally trying to further increase the on-chip functionality.



Not only is the size of the transistors scaled down, but the supply voltage also has to be drastically reduced. Coming from 5V for 0.5u technologies and 3.3V for 0.35u technologies, the voltage has been reduced to 1.8V for 0.18u, or even below for 0.13u and 0.1u (see Fig. 1). Deep sub-micron processes are optimized for digital circuits, making it more difficult for analog designers to shrink designs into more recent technologies. Down to a certain feature size, the threshold voltage decreased almost proportionally with the supply voltage. For smaller feature sizes the threshold voltage decreases more slowly, leaving less room for linear analog voltage swing. In addition, the specific capacitance is reduced and gds is increased. There are also some beneficial changes, such as increased speed and improved matching properties, which help to implement the analog functionality [4]. All these circumstances, however, call for changes in building-block topologies in order to fulfill the specified performance. Since direct shrinking without architectural changes is almost impossible, maintaining an efficient reuse strategy is difficult.

Many of these systems, such as xDSL transceivers, Ethernet PHYs or first-IF wireless receivers, need complex analog functions on the same die as complex digital circuits. In other cases the analog functionality is rather simple, e.g. in micro controllers one or two analog building blocks are sufficient. According to the complexity of the analog functions, different levels of reuse can be defined.

Section 2 deals with the reuse of complete analog front-ends (AFE) for SoC designs, basically showing that there is a huge challenge for the system architects and concept engineers to define several SoCs such that the same AFE can be reused without major changes. Another type of reuse, focusing on standard building blocks, is described in Section 3. Here, the strategy is to design one analog building block with some overhead for reusability and make it available to many SoC designers. Very often the reused block is not optimized for the specific application and therefore consumes more power and/or more silicon area than necessary. Whenever the power consumption and silicon area of the analog functionality are much smaller than those of the digital part, this approach seems to be feasible. Limits of this strategy, such as competitiveness, power optimization, performance optimization and the number of necessary reuses, are discussed. Section 4 describes possibilities of reuse for high-volume state-of-the-art designs where, for reasons of competitiveness, compromises in performance or power consumption are not acceptable. Usually these AFEs take a significant part of the area and/or power consumption of the SoC. Their performance is also typically close to the physical limits of the sub-micron technology used. Thus optimum AFE design is required, challenging the designers to find efficient reuse possibilities.

Within each reuse level we will discuss different types of reuse. Plug & play reuse is given if a specific building block can be inserted in the design without changing anything inside the macro. Of course, there might be some programming features to adapt the module to the specific application. Essentially, the designer does not need module-specific know-how. Mix & match reuse, on the other hand, is somewhat less restrictive. A module from a different design is taken as a basis and then adapted to the new requirements. The designer needs to know the building block very well and can change it at the required nodes, e.g. by changing aspect ratios or bias currents. While plug & play reuse requires library-type modules with all kinds of different views, mix & match reuse can be handled less formally and relies on the designer’s interpersonal contacts for IP reuse. In terms of quality assurance, mix & match reuse is much more person-specific than plug & play reuse.

The ‘time to market’ issue, together with a limited number of available analog resources, leads to a very strong demand for IP reuse in analog circuit design. Additional aspects of cycle time reduction, such as the use of appropriate design tools and the need for innovative project structures, are discussed in Section 5.

2) Reuse of complete analog front-ends for SoC

In this section several examples of the reuse of complex analog front-ends (AFEs) are presented. Several applications allow the definition of one analog module (front-end macro) that can be reused in all these applications in the same manner. A macro can even be standardized, as is done for 10/100base-T Ethernet PHYs. Once a macro has been defined carefully by the system engineers, it can be used in several derivatives of a whole product family. An often-proven example is the standard analog voice macro, which could be used in many different ISDN or plain old telephony service (POTS) applications. A common AFE design specification is possible only with strict discipline in system definition, and sometimes at the price of drawbacks in the digital design.

Technology roadmap, supply voltage, functionality and power consumption are only a few of the parameters that have to be aligned across all the different applications. Bug-fixing problems may also occur if several of these projects are carried out concurrently, together with the macro itself.

Figure 2 shows the block diagram of a cable-modem AFE designed for SoC usage [5]. It consists of two downstream channels, one upstream channel, a biasing block, an automatic filter tuning and a low-jitter PLL, designed in 0.18µm CMOS with a 1.8V power supply only. The architecture was defined in such a way that its downstream part also fits a digital terrestrial TV receiver (DVB-T) SoC application [1]. Since the PLL, the central biasing and the filter tuning could also be reused, a total reuse of 95% could be achieved. The most significant changes were the use of a different sampling rate (PLL) and a different filter order for the anti-aliasing filter. This AFE also fits the requirements of a HiperLAN or LMDS SoC application quite nicely, such that a reuse rate of more than 90% would be possible.


Some system solutions may have similar architectures by definition, as in the family of xDSL products. Then it is possible to adapt the circuits to the new system requirements with little effort. For example, it was possible to design a complete analog front-end for HDSL2 within 2 months by reusing an existing ADSL analog front-end. Fig. 3 shows the architecture of the ADSL front-end chip [2].

The main difference between the analog front-ends (AFEs) for ADSL and HDSL2 from a system point of view is the analog bandwidth, which is 1.1MHz for ADSL and 450kHz for HDSL2. Of course, the modulation schemes are different (PAM for HDSL2 and DMT for ADSL), and so are the data rates. However, the requirements for the AFEs are nearly the same: we need 14-bit A/D and D/A converters and a harmonic distortion better than 75dBc at half full-scale signals. Table 1 shows the key performance data of both systems.

As can be seen in Table 1, the AGC has the same dynamic range for both AFEs, so no change in the topology was necessary. We only reduced the power consumption, exploiting the lower bandwidth. Since this was done simply by reducing the bias current of the opamp, no layout effort was necessary for this adaptation. The PREFI had to be redesigned for the lower corner frequency, and therefore a new layout also had to be done. In the A/D converter, a third-order multibit sigma-delta converter, we just redesigned the OTAs of the integrators for the lower clock frequency, thereby reducing the power consumption dramatically. Only minor layout changes were necessary. In the D/A converter, a 7-bit current-steering DAC, we only optimized the power consumption by reducing the bias current of the opamp, so again no layout effort had to be spent. For the POFI the same effort had to be spent as for the PREFI. No change was necessary for the line driver either, apart from a change in the supply voltage due to different output voltage swings. The reuse rate in this case was very high and came close to 80%.

Reuse of AFEs always requires the concept engineers, digital designers and analog designers to work closely together when specifying the system requirements. For all the mentioned products the necessary changes could be made in a very short time frame, because the analog design team stayed the same within one product family.

3) Reuse of analog standard building blocks

In Section 2 we briefly discussed the reuse of complete analog cores. Another approach to reuse can be found in the building blocks themselves. Here we have to distinguish between standard building blocks, which have moderate performance and can be standardized, and high-performance, state-of-the-art building blocks, which have to be designed specifically for each project.

Standard building blocks that can be designed as ‘ready to use’ modules are, for example, comparators, bandgaps, power-on resets, standard PLLs and oscillators. These basic plug & play building blocks need additional design effort in order to guarantee quality and reusability without requiring special knowledge of the block or even analog design knowledge. Due to this additional effort, the reuse rate (number of reuses per technology) must be larger than 3 to benefit from such a library element. Table 2 shows elementary building blocks and gives an estimate of their reuse number within the same process technology. The numbers are for a mid-size SoC group with approximately 100 employees, including designers, concept engineers, layout and product definition.

Especially high potential for saving effort can be seen for standard PLLs and standard ADCs, used for example in microcontrollers as a standard interface to the analog world.


A 10-bit SAR ADC was designed once with a design effort of 15MM (man-months). This was the basis for 28 ADC modules with an average effort of about 2MM per module. Fig. 4 shows the tree of all these converters. The converters were designed in different technologies by means of the mix & match strategy, and were reused in most of these technologies several times in a plug & play manner by digital designers. All these modules have been delivered with very high quality, an important aspect for plug & play macros in digital projects.
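As a rough check on these figures, the following minimal sketch totals the effort with reuse and compares it with a purely hypothetical baseline in which every module would cost as much as the first design. The 15 MM, 28 modules and 2 MM values come from the text; the from-scratch baseline and the function names are assumptions made only for illustration.

```python
# Effort figures quoted in the text (MM = man-months).
BASE_DESIGN_MM = 15    # first 10-bit SAR ADC, designed once
DERIVED_MODULES = 28   # ADC modules derived from that design
AVG_DERIVED_MM = 2     # average effort per derived module


def reuse_effort(base_mm, n_modules, avg_mm):
    """Total effort when one base design is adapted n_modules times."""
    return base_mm + n_modules * avg_mm


def scratch_effort(base_mm, n_modules):
    """Hypothetical baseline: every module costs as much as the base design."""
    return base_mm * n_modules


with_reuse = reuse_effort(BASE_DESIGN_MM, DERIVED_MODULES, AVG_DERIVED_MM)
from_scratch = scratch_effort(BASE_DESIGN_MM, DERIVED_MODULES)
print(f"with reuse: {with_reuse} MM, hypothetical from scratch: {from_scratch} MM")
```

Under this assumed baseline the reuse tree costs about 71 MM in total, against 420 MM if each converter had been a new design. The real saving depends on how expensive an independent design would actually have been.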


A similar strategy is possible for standard PLLs, used for generating appropriate clocks for digital ICs. The design must have some overhead for flexible programming of the output in order to offer more plug & play reuse possibilities for one dedicated design.

Digital crystal oscillators and central biasing are candidates for almost 100% reuse. But again, care has to be taken with the grounding strategies of the central bias, which may differ from application to application.

4) Reusable IP in high volume, state of the art designs

In products designed for state-of-the-art technologies (xDSL, cable applications, fiber optics), the probability of finding modules ready for reuse is low. The performance of the blocks has to be close to the physical limits of state-of-the-art circuitry, and the area and power consumption must be fully optimized in order to be competitive. Too many parameters besides the technology used have to match the specification: bandwidth, signal swing, appropriate load, supply voltage and open-loop gain for opamps on the one hand, or bit accuracy, signal bandwidth, clock rate and supply voltage for ADCs and DACs on the other hand. This makes it hardly possible to reuse blocks in different projects. But this does not mean that there is no reuse at all and everything has to be designed from scratch. The IP in analog design teams is usually very high and can be reused in all the different blocks.

Opamps: The IP reuse in opamps is very high (about 70%), so that new opamps are ready, including layout, within 2 days. Mathematical documents and schematics from former projects ease the design of new opamps drastically, so that the design can be done within one day. 60% of this effort is simulation, which can be reduced further by using automatic simulation shells. Since the pin structure of opamps has not changed at all (2 inputs and 2 outputs for differential opamps), automatic simulation shells for opamps can always be reused and will remain suitable in the future.

The IP in opamp design can be further programmed into commercial tools for automatic design including layout, but this is limited to fixed structures, which may change with decreasing supply voltage and in this way quickly limit the capability of such tools.

Anyway, compared to the overall design effort of a project (about 40 to 60 MM for the initial design and 100 MM until production release), the contribution of the opamp design effort is minor.

Amplifiers, Filters: For designing amplifiers and filters, nearly the same mix & match approach can be used, only the IP reuse is on average lower in this case (40-50%) and the design takes about 1 to 2 man-weeks. Automatic simulations are also not very useful in this case.

Converters: ADCs and DACs are usually the most critical parts in high-performance products (e.g. xDSL, cable modem, ...) and need a lot of design effort. New technologies with low supply voltages and state-of-the-art specifications always require new circuit structures and circuit improvements.

Nevertheless, Fig. 5 shows that there are only a few ADC topologies commonly used for telecommunication applications.


Sigma-delta converters are widely used in high-resolution, medium-bandwidth applications like ADSL, HDSL and SDSL, whereas 2-step flash sub-ranging converters are best suited for medium resolution (up to 11 bits) and high bandwidth, as needed in VDSL, cable modem, DVB-T and Gigabit Ethernet. Although each of the mentioned products has more or less different specifications, IP can be reused to a high degree (60 to 70%). Once one converter type has been designed in a typical technology, the design effort and risks are significantly reduced for each additional converter of the same topology and technology.

A typical IP reuse is described next. A 2-step flash converter (Fig. 6) with 10-bit effective resolution and a 150MHz sampling rate at 1.8V supply voltage was designed for a cable modem front-end, using the IP of a version with 5V supply. The identical converter could be reused afterwards in a COFDM project, a terrestrial receiver for digital TV (DVB-T). By introducing oversampling and adding digital filters, the same converter core was again suitable for VDSL with 11-bit effective resolution and 12 MHz signal bandwidth. Only the driver circuit and the reference buffers had to be optimized for the VDSL requirements.
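The step from 10 to 11 effective bits can be made plausible with the usual oversampling relation. The sketch below assumes that the converter noise is white and evenly spread up to half the sampling rate, and that an ideal digital filter removes everything outside the signal band. Both are simplifying assumptions, not statements about the original design, and the function name is hypothetical.

```python
import math


def oversampling_gain_bits(f_sample_hz, signal_bw_hz):
    """Return (OSR, extra bits) for a white-noise-limited converter.

    Filtering to the signal band improves the SNR by 10*log10(OSR),
    and 6.02 dB of SNR corresponds to one bit of resolution.
    """
    osr = f_sample_hz / (2.0 * signal_bw_hz)
    gain_db = 10.0 * math.log10(osr)
    return osr, gain_db / 6.02


# Values from the text: 150 MHz sampling rate, 12 MHz VDSL signal bandwidth.
osr, extra_bits = oversampling_gain_bits(150e6, 12e6)
print(f"OSR = {osr:.2f}, resolution gain = {extra_bits:.2f} bits")
```

With these numbers the oversampling ratio is 6.25, worth roughly 1.3 bits under the stated assumptions, which is consistent with moving from about 10 to about 11 effective bits.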

Fig. 7 shows the layout of this converter type in a) the technology with 5V supply voltage and b) the technology with 1.8V supply voltage. Both converters have the same performance of about 11 bits effective for 12 MHz signal bandwidth. The power consumption could be reduced from 250mW to 180mW, and the silicon area is drastically reduced for the 0.18µm version.


Taking available parts of this converter and adapting them to new specifications (different bit resolution, different bandwidth) is a good mix & match approach to arrive easily at converters that are also suitable for a QPSK satellite receiver (DVB-S) or Gigabit Ethernet. It is the same mix & match strategy as for opamps, amplifiers or filters.

Similar reuse is possible for multi-bit sigma-delta converters, needed in xDSL products, which require adjustments to different resolutions and bandwidths.

The reference design was a third-order multi-bit sigma-delta converter used in an ADSL-RT (Remote Terminal) chip; see Fig. 8 for the block diagram. A cascaded 2-1 structure with 3-bit resolution in the first stage and 5-bit resolution in the second stage was chosen [3]. The analog bandwidth is 1.1 MHz with 14-bit resolution and a sampling frequency of 26 MHz. The first design was done in a technology with 5V supply, with an effort of 15 MM.

Then this converter was redesigned for an HDSL2 application with 450kHz bandwidth in the same technology. Due to the smaller bandwidth, the second-stage resolution could be reduced to a 3-bit structure and the sampling frequency was reduced to 16MHz, which resulted in a smaller area and lower power consumption of the converter. The design and layout effort for this converter was only 5MM.

The next step was a redesign for an ADSL-COT (Central Office Terminal) application with a bandwidth of 250kHz. Again we changed the structure of the second stage, to 4-bit resolution, and the sampling frequency to 4MHz. The effort was reduced to 3MM.

These two reuse designs were done in the same technology; the next step was a technology change for the ADSL-COT converter. Due to the very fast technology we could again change the topology. We increased the sampling frequency to 53MHz and decreased the converter order from 3 to 2 with 3-bit resolution, resulting in a very small silicon area, as can be seen in Fig. 9. Since we made a technology change, each block had to be designed anew, and a completely new layout was also made. The effort for this new converter increased to 7MM.
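The four converter variants can be compared through their oversampling ratio, OSR = fs / (2 x bandwidth). The sketch below uses only the bandwidths, sampling frequencies and efforts quoted in the text. It assumes that the ported ADSL-COT converter keeps the 250 kHz bandwidth, which the text does not state explicitly, and the variant labels are informal names chosen here.

```python
# (label, signal bandwidth [Hz], sampling frequency [Hz], design effort [MM])
VARIANTS = [
    ("ADSL-RT reference design", 1.1e6, 26e6, 15),
    ("HDSL2 redesign", 450e3, 16e6, 5),
    ("ADSL-COT redesign", 250e3, 4e6, 3),
    # Assumption: bandwidth unchanged after the technology change.
    ("ADSL-COT, new technology", 250e3, 53e6, 7),
]


def oversampling_ratio(bandwidth_hz, f_sample_hz):
    """Sampling frequency divided by the Nyquist rate of the signal band."""
    return f_sample_hz / (2.0 * bandwidth_hz)


for label, bw, fs, effort in VARIANTS:
    print(f"{label:26s} OSR = {oversampling_ratio(bw, fs):6.1f}  effort = {effort} MM")

total_effort = sum(effort for _, _, _, effort in VARIANTS)
print(f"total effort over all four designs: {total_effort} MM")
```

The jump in OSR after the technology change (from 8 to 106 under the stated assumption) is what makes the lower converter order possible, since more of the resolution comes from oversampling instead of loop-filter order.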


As a summary, Table 3 lists several projects to demonstrate the percentage of IP and schematic reuse by means of mix & match. The percentage of reuse can differ from block to block, from 5% up to 90%.

Although the probability of 100% reuse of designed blocks is pretty low, a reuse capability of 60% to 90% in some projects is still very high; it is achieved just by using available IP, schematics, simulation shells, layout cells, testing facilities etc. and doing the new design by means of mix & match.


5) Additional aspects towards cycle time reduction

Up to now, the paper has focused on cycle time reduction in analog layout and analog design by reuse within these tasks. However, the product development speed can also benefit from improvements in tooling and from speeding up other tasks besides analog layout and analog design.

High potential for tooling lies in the support and automation of standard design tasks such as: definition and execution of block-specific simulation runs, efficient (higher-level) modeling of analog building blocks, interactive back-annotation of layout data and re-simulation, efficient modeling and simulation of substrate effects, thermal coupling of building blocks and packaging impact. Some of the tooling aspects target quality improvement, which can help to make the design first-time-right. Tools for the automatic design of specific building blocks are not very efficient due to their restriction to low circuit complexity and predefined circuit topologies, and due to the minor saving potential in design time. A typical example, the opamp, was discussed in Section 4.

For a typical project, Fig. 9 shows the percentage of effort for design, definition, architecture, layout and management with respect to the overall project effort. Design and layout make up about half of the product development effort. Clearly, minimum cycle time can be achieved only by attacking all tasks in the product development. Certainly, a lot of IP reuse is possible in definition and architectural work.

Reuse strategies require good cooperation within a design team and between different design teams. Thus the human component must be considered as well. Team building, motivation and information flow are essential to make reuse work. Besides reuse and adaptation work, each project should have some innovative parts. This helps to keep engineers motivated and their know-how up to date.

6) Conclusions

Driven by the technology roadmap, increasing system requirements and ‘time to market’ targets, reuse of analog IP is nowadays very important. But this does not only mean using plug & play analog modules or macros; it also means IP reuse with a so-called mix & match strategy. Architectural considerations should also not be neglected as an important factor in this strategy. The key enabler for IP reuse is the team spirit within a company, and thus special attention has to be paid to interpersonal relations. Last but not least, with all the reuse, do not forget to design new and innovative circuits in order to have innovative steps in the product roadmap and to keep up with the leading edge of mixed-signal design.

7) Acknowledgements

Special thanks to B. Seger for the contributions concerning layout, F. Cepl for providing the SAR ADC reuse tree, and R. Schledz for contributing the table of reuse categories. Furthermore, we appreciate the valuable discussions with M. Clara, Ch. Fleischhacker and Ch. Sandner.

8) References

[1] M. Christian, et al., “0,35u CMOS COFDM Receiver Chip for terrestrial Digital Video Broadcasting”, ISSCC 2000, pp. 76-77

[2] H. Weinberger, et al., “A 800mW, Full-Rate ADSL-RT Analog Frontend IC with integrated Line Driver”, CICC 2001

[3] A. Wiesbauer, et al., “A 13.5 Bit Cost Optimized Multi-Bit Delta Sigma ADC for ADSL”, Proceedings of ESSCIRC, September 1999, pp. 82-88

[4] K. Bult, “Analog Design in Deep sub micron CMOS”, Invited Paper, Proceedings of ESSCIRC, September 2000, pp. 11-17

[5] A. Wiesbauer, et al., “A Fully Integrated Analog Front-end Macro for Cable Modem Applications in CMOS”, submitted to ESSCIRC 2001, unpublished.

PROCESS MIGRATION TOOLS FOR ANALOG AND DIGITAL CIRCUITS

Kenneth FRANCKEN, Georges GIELEN

Katholieke Universiteit Leuven, ESAT-MICAS
Kasteelpark Arenberg 10, B-3001 Leuven, Belgium


ABSTRACT

The rapid progress in CMOS VLSI technologies, together with the shortening time-to-market constraints of a competitive market and the shortage of designers, necessitates the use of computer-aided design (CAD) tools for the automatic porting of existing designs from one technology process to another. Both horizontal and vertical technology porting are considered, where during vertical porting the intrinsically better capabilities of the new process can be exploited to either improve the performance of the circuit, or to keep the same performance while reducing power and/or chip area consumption.

This paper presents CAD techniques for the automatic porting of both analog and digital circuits. Both the circuit resizing and the layout regeneration are discussed. For the circuit resizing, a scaling step is followed by a finetuning step. For the layout regeneration, a template-based approach is suggested. Experimental results illustrate the capabilities of the presented methods. Finally, the importance of proper design documentation will be stressed as a necessary means to facilitate easy technology porting.

1. INTRODUCTION

Advances in very deep submicron CMOS VLSI integrated circuit processing technologies offer the possibility to integrate more and more functionality on one and the same die, enabling today the integration onto a single piece of silicon of complete systems that before occupied one or more printed circuit boards. An increasing part of these integrated systems contains digital as well as analog circuits, in application areas like telecommunications, automotive and multimedia, among others.

The growing complexity of these integrated systems, in combination with the tightening time-to-market constraints, however, poses a serious challenge to the designers’ productivity. That is why new design methodologies are being developed, such as the use of platform-based design, object-oriented system-level design refinement flows, hardware-software co-design, and IP reuse, on top of the already established use of CAD tools for logic synthesis and digital place & route. For analog circuits the basic level of design abstraction, however, is still the transistor level, although commercial CAD tool support for cell-level circuit and layout synthesis is emerging [1], allowing designers to concentrate more on the high-level architectural design issues as well as on the design of key critical blocks only.

One serious problem that challenges both analog and digital designers is the extremely fast pace at which new, ever deeper submicron CMOS technologies are introduced, a rate which is even faster than the predicted technology roadmaps [2]. Before any new process can be used, however, a library of digital standard cells or selected IP blocks, such as a processor core or a memory generator, has to be developed and qualified. Developing this from scratch is very time-consuming and expensive, and delays the production use of the new process. At the same time, many existing analog and digital blocks are reused in new system designs for new applications, or in newer versions of an existing system that is redone in a newer process to reduce cost. The effort, however, that is needed to guarantee at least the same performance for these blocks in the new technology is not negligible and is not at all regarded as very creative by designers.

Computer-aided or even automated technology porting of integrated circuit blocks, both analog and digital, is therefore getting more and more attention today. Two types of process migration or process retargeting can be distinguished, as shown in Fig. 1. The first one is called horizontal porting, where the same cell performance has to be obtained in a process with the same minimum transistor length but from a different foundry (e.g. CMOS of company ABC to CMOS of company EDF). The second one is called vertical porting, where the same cell performance or better has to be obtained in a process with a smaller minimum transistor length from the same or another foundry (e.g. CMOS of company GHI to CMOS of company JKL). For the vertical porting, the intrinsically better capabilities of the new process can be exploited to either improve the performance of the circuit, or to keep the same performance while reducing power and/or chip area consumption.
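The distinction between the two porting types depends only on the minimum transistor lengths of the source and target processes. A minimal sketch of that classification follows; the function name and the example feature sizes are hypothetical and serve only as illustration.

```python
import math


def porting_type(source_lmin_um, target_lmin_um):
    """Classify a process migration by comparing minimum transistor lengths."""
    if math.isclose(source_lmin_um, target_lmin_um):
        return "horizontal"   # same minimum length, typically another foundry
    if target_lmin_um < source_lmin_um:
        return "vertical"     # smaller minimum length, same or another foundry
    raise ValueError("target process has a larger minimum length than the source")


# Hypothetical feature sizes in micrometers.
print(porting_type(0.35, 0.35))
print(porting_type(0.5, 0.35))
```

A migration to a process with a larger minimum length is not covered by either definition, so the sketch rejects it instead of guessing a label.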

This paper will discuss techniques for the automatic process porting (both vertical and horizontal) of both analog and digital cells. Both advantages and limitations will be presented. Section 2 will describe a possible flow for an automatic porting tool, distinguishing between the sizing retuning phase and the layout retargeting phase. Section 3 will then illustrate this for an analog design case (a ΔΣ modulator), while Section 4 will illustrate this for the porting of a digital standard cell library. Guidelines or measures to be taken into account during design to facilitate an easy porting of that design later on will be discussed in Section 5. Finally, conclusions will be drawn in Section 6.

2. PORTING METHODOLOGY

While digital circuits can often be retargeted to a new technology by geometrically scaling the layout, this procedure does not automatically guarantee success for analog circuits and not even for digital standard cells. This is due to the different scaling needed for different components in the circuit to keep at least the same performance. We can distinguish two steps in the technology porting task (see Fig. 2): 1) the circuit tuning or resizing, in which the device sizes and biasing are modified such that at least the same performance is obtained in the new process, and 2) the regeneration of the layout with the new layout rules and the updated device sizes. Since new technology parameters can influence the circuit performances, it is imperative to perform simulations at the circuit level to verify the correct performance of the circuit after tuning and after layout. Both the tuning and the layout steps will be further discussed in the next sections. Note that for a more complex circuit, like an analog-to-digital converter, this process will be performed hierarchically, first at the level of the converter and secondly at the level of the circuit blocks, as will be illustrated for the ΔΣ modulator later on.

In the remainder of this paper, we assume that the new technology process (called the target process) is compatible with the original process (called the source process) in the sense that we can use the same circuit topology. If this is not the case, for instance because the target process has a supply voltage that is much lower than the source process (e.g. 1.8 V versus 3.3 V), then new topology structures (e.g. low-voltage structures) will have to be used and the design becomes a completely new design instead of the process migration of an existing design. We exclude such cases here.

The largest difference between the porting of a design on the one hand and the creation of a new design, be it by manual handcrafting or with a circuit and layout synthesis tool, on the other hand, is that, in the case of porting, a good reference design exists that serves as a basis to start from. This is not the case with a new design, which basically starts from scratch.

In the case of porting, the existing design has already proven to be working, and only needs updating in the sense of slight modifications to the device sizes and a regeneration of the layout tailored to the new layout technology rules and the tuned device sizes. In many cases the designers even prefer the new layout to look very much like the old one, which makes it easier for them to “read” the new layout. Therefore, advantage can be taken of both the existing device sizes and the existing layout to reduce the complexity of the porting task. This will be discussed in detail in the next section.

3. PORTING OF ANALOG CELLS

For the porting of analog cells, we perform the resizing in two steps. Keeping the same topology, the first step is to perform an initial scaling of the original design, which gives us a starting point already close to the final solution in the target technology process. Following this step, a finetuning phase using optimization takes place to correct for possible violations of certain performance specifications or to reduce power and/or area while keeping the same performance. This is graphically illustrated in Fig. 3. Both steps can be automated, although they require some information from the original design.


3.1. Sizing step

3.1.1. Initial scaling

The first step in the resizing process is an initial scaling. The existing netlist is linked to the new technology file and all transistor model parameters are updated. Then the biasing currents and the transistor widths W are altered. If the supply voltage stays the same and the bias currents and W's are scaled, we choose not to alter the bias voltages. The scaling factor is determined by writing down the current equations (we neglect the Early effect) for the source process (index A) and the target process (index B) under a constraint that is held constant between the two processes. (Note that other scaling factors are obtained if other constraints are used.)

The minimal transistor lengths of the two processes are known. The parameters KP (for both nMOS and pMOS) and VT (consisting of VT0 and a term dependent on the body effect) are, on the contrary, typically not given in the technology file of deep-submicron CMOS processes. They can be obtained by simulating a test circuit in SPICE and then fitting the output data. For a digital circuit a fixed operating point can be assumed, but for analog circuits this depends on the sizing. We assume that the supply voltage is the same in both technologies and that the transistors will be designed with the same gate overdrive voltage in the new technology, which is equivalent to setting the corresponding factor in the scaling relation to 1. Hence the transistor widths are scaled by factors whose numerical values are different for nMOS and pMOS transistors.
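A minimal sketch of how such a width scaling factor can be derived, assuming the simple square-law saturation model (Early effect neglected, as in the text). The symbol $s_W$ and the grouping of terms are illustrative assumptions and are not taken from the original equations; the constraint that fixes the current ratio $I_B/I_A$ is left open here.

```latex
% Square-law drain current in the source process (A) and target process (B)
I_A = \frac{KP_A}{2}\,\frac{W_A}{L_A}\,\left(V_{GS,A}-V_{T,A}\right)^2 ,
\qquad
I_B = \frac{KP_B}{2}\,\frac{W_B}{L_B}\,\left(V_{GS,B}-V_{T,B}\right)^2

% Dividing the two equations and solving for the width ratio
\frac{W_B}{W_A}
  = \frac{I_B}{I_A}\cdot\frac{KP_A}{KP_B}\cdot\frac{L_B}{L_A}\cdot
    \left(\frac{V_{GS,A}-V_{T,A}}{V_{GS,B}-V_{T,B}}\right)^{2}

% Equal gate overdrive in both processes makes the last factor equal to 1
W_B = s_W\,W_A ,
\qquad
s_W = \frac{I_B}{I_A}\cdot\frac{KP_A}{KP_B}\cdot\frac{L_B}{L_A}
```

Because KP differs between nMOS and pMOS devices, a factor of this form takes a different numerical value for each device type.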

Capacitor values, on the other hand, are scaled depending on the function of the capacitor. To keep poles and zeros at the same frequency, their value is kept constant, and their area is updated according to the new per-unit capacitance value. When the matching of the capacitors is important (as for sampling and integration capacitors in switched-capacitor circuits), the scaling is performed according to the new mismatch data (i.e. keep the ratio, but alter the size to meet the matching specification). The choice depends on the function of the capacitor in the original design, and this information should therefore be available from the original designer (see Section 5).
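When the capacitance value is kept constant, the new area follows directly from the per-unit capacitance of the target process. The following is a minimal sketch of that update; the function name, the example capacitor value and both capacitance densities are hypothetical illustration values, not data of any real process.

```python
def capacitor_area_um2(value_ff, density_ff_per_um2):
    """Plate area (um^2) needed for a capacitance (fF) at a given density."""
    if density_ff_per_um2 <= 0:
        raise ValueError("capacitance density must be positive")
    return value_ff / density_ff_per_um2


C_VALUE_FF = 3000.0      # example capacitor of 3 pF, value kept constant
SOURCE_DENSITY = 1.0     # fF/um^2, hypothetical poly-poly capacitor
TARGET_DENSITY = 0.25    # fF/um^2, hypothetical metal-sandwich capacitor

area_source = capacitor_area_um2(C_VALUE_FF, SOURCE_DENSITY)
area_target = capacitor_area_um2(C_VALUE_FF, TARGET_DENSITY)
print(f"source: {area_source:.0f} um^2, target: {area_target:.0f} um^2, "
      f"ratio: {area_target / area_source:.1f}x")
```

For matching-critical capacitors this value-preserving rule is not sufficient on its own, since the size must then be chosen from the mismatch data of the target process.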

Capacitors can cause even more difficulty during porting when the same type of capacitor implementation (e.g. poly-poly capacitor) is not present in the new technology. New deep-submicron CMOS processes will always be available first in a digital-only version. Analog extensions or analog options to this technology (like a poly/poly capacitor) will only be available in a later phase, if at all. Therefore, if the original design was implemented with poly-poly capacitors, these might now have to be replaced with metal-sandwich (MiM) capacitors (if linearity is important) or MOS gate capacitors (otherwise, because of their small area).

Similar problems can arise for other passive components like resistors, and certainly for on-chip spiral inductors, which have to be regenerated completely based on the new technology data (the substrate resistivity is especially important).

Table 1 summarizes the initial scaling factors for different device parameters. Applying these formulas results in a sized circuit in the new process. The circuit is verified with numerical simulations (e.g. SPICE), which yield numerical data for all performances of concern. We can then compare these performances with the specifications to see whether they are all satisfied or not. If they are, the whole porting process has finished successfully, unless we want to additionally reduce power and/or chip area. In that case, and in the case when not all the specifications are satisfied, we have to continue with the finetuning step.
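The comparison of simulated performances against specifications, and the way it can be turned into a single number for an optimizer to minimize, can be sketched as follows. This is only an illustrative penalty formulation under stated assumptions: the quadratic penalty, the weights, the data layout and all function names are hypothetical, and the performance values would in practice come from a circuit simulator.

```python
def violated_specs(specs, perf):
    """Return the names of all specifications that are not met.

    specs maps a name to (kind, limit), where kind is "min" or "max".
    perf maps the same names to simulated values.
    """
    failed = []
    for name, (kind, limit) in specs.items():
        value = perf[name]
        ok = value >= limit if kind == "min" else value <= limit
        if not ok:
            failed.append(name)
    return failed


def cost(specs, perf, power_w, area_mm2, w_viol=1e3, w_power=1.0, w_area=1.0):
    """Penalty for spec violations plus weighted implementation cost."""
    penalty = 0.0
    for name, (kind, limit) in specs.items():
        value = perf[name]
        shortfall = (limit - value) if kind == "min" else (value - limit)
        if shortfall > 0:
            # Assumed penalty shape: quadratic in the relative violation.
            penalty += w_viol * (shortfall / abs(limit)) ** 2
    return penalty + w_power * power_w + w_area * area_mm2


# OTA limits quoted later in the text; simulated values are made up.
specs = {"gain_db": ("min", 80.0), "pole_hz": ("min", 190e6)}
perf = {"gain_db": 78.0, "pole_hz": 200e6}

print(violated_specs(specs, perf))
print(cost(specs, perf, power_w=0.010, area_mm2=0.2))
```

An empty list from `violated_specs` corresponds to the case where porting is finished after the initial scaling; otherwise a local optimizer would adjust device sizes and biasing to reduce `cost`.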

95

where is a penalty function that assumes a large value when

3.1.3. Example : resizing of a modulator

The approach is now illustrated for the resizing of a modulatordesigned for ADSL specifications from the Alcatel Microelectronics

CMOS technology with analog options (poly-poly capacitor) to thedigital CMOS technology of Alcatel Microelectronics. As nopoly-poly capacitors could be used in this target process, a 5-metal-layersandwich capacitor was chosen instead for sampling and integrationcapacitors.

performance does not satisfy specification and and areweighting coefficients.

As compared to circuit optimization during synthesis [1], theoptimization algorithm used here is preferrably a local method, since agood starting point is already available from the original design after thescaling. In addition, the method can even be speeded up usinginformation from qualitative reasoning, which indicates in a tabularformat, called dependency matrix, which parameter has to be improved toimprove a certain performance [3]. This is close to the human way ofthinking. The information in the table can for instance be generated usingsensitivity analysis, and the different possible parameter changes areprioritized according to their impact on the violated specifications butalso considering their (possibly negative) impact on already satisfiedspecifications.

96

After the initial scaling, the next step is to simulate the circuit blocks,verify their performances and - if necessary - adjust some device sizes.This problem can be cast as an optimization problem in which anoptimization algorithm minimizes a cost function. This cost functionconsists of terms that penalize any violations of the performancecharacteristics compared to the original design, and possibly in additioncontains the implementation cost of the solution, i.e. power consumptionand/or chip area, that has to be minimized. The optimization variables arethe device sizes and biasing. The mathematical formulation is as follows:

with

3.2.2. Finetuning step


The ADSL specification requires an accuracy of 12 bits, but the goal of the prototype was set higher, namely a 13-bit accuracy. This means a dynamic range of 80 dB. The required signal bandwidth for ADSL is 1.1 MHz. Furthermore, an oversampling ratio (R) of 24 was chosen for the original design, resulting in a sampling frequency of 52.8 MHz. A 2-1-1 (4th-order) cascade structure was selected for the original implementation as shown in Fig. 4. The complete block diagram is shown in Fig. 5. More details on this original design can be found in [4].
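The two numbers quoted above can be checked with a few lines of arithmetic. The sketch assumes the usual relations, namely a sampling frequency of twice the signal bandwidth times the oversampling ratio, and the ideal-quantizer rule of 6.02 dB per bit plus 1.76 dB.

```python
signal_bandwidth_hz = 1.1e6
oversampling_ratio = 24

# Nyquist rate times the oversampling ratio.
sampling_frequency_hz = 2 * signal_bandwidth_hz * oversampling_ratio

def ideal_dynamic_range_db(bits):
    """Ideal quantizer SNR for a full-scale sine wave: 6.02*n + 1.76 dB."""
    return 6.02 * bits + 1.76

print(sampling_frequency_hz / 1e6, "MHz")
print(ideal_dynamic_range_db(13), "dB")
```

Both results agree with the text: 52.8 MHz for the sampling frequency and about 80 dB for 13 bits.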

Circuit structure


In the original design, the size of the sampling capacitor of the first stagewas determined by the required kT/C noise floor. Fromwith a and OSR=24, the minimum size of C turned out to be3pF. The capacitor size of the last stage was mainly determined bymatching considerations. A matching of 1% was sufficient for thesecapacitors, so a unit capacitance of 0.25 pF was chosen, which has amatching that is smaller than 1%. The other capacitors were scaled downfrom the first to the last integrator. This means that the capacitive loadsof all the integrators are reduced and the OTA’s were scaled down as

well over the different stages. From behavioral simulations a minimumOTA gain of 80 dB, a closed-loop OTA pole of 190 MHz, a minimumslew rate of 400 and a maximum switch resistance of werederived. For the comparator the requirements for offset and hysteresiswere maximum 100 mV and 40 mV, respectively. All building blockspecifications are summarized in Table 2.

We will now discuss the integrator and the comparator. We used three scaling factors to perform the initial scaling: scale_n (for nMOS transistors), scale_p (for pMOS transistors) and scale (for capacitances and bias currents). All three have been calculated using Table 1 as 0.54065, 0.51639 and 0.7 respectively.



The integrator. Table 3 illustrates the specifications and theperformances of the integrator building block in the and the

process after the initial scaling step. As can be seen, the switch resistance specification is violated, and therefore a second step, namely the finetuning phase, is necessary to correct for this violated specification. The finetuning is also used to reduce the power consumption of the integrator, as Table 3 shows that there is margin on performances like GBW and slew rate. The schematic of the employed OTA, a gain-boosted differential folded-cascode, is shown in Fig. 6 (without common-mode feedback or biasing circuitry).

The sizes for the OTA, for the gain-boosting stages and for the switched-capacitor integrator, together with their scaling factors, are shown in Tables 4, 5 and 6 respectively. Note that, due to the finetuning, the effective scaling factors are different from the initial ones, precluding a simple geometric scaling. The bias current of the OTA was changed by a factor of 0.7, from 2.5 mA to 1.75 mA. Due to the different parasitics of the metal-sandwich capacitor, Cload was changed from 18 pF to 12 pF. The 3 other integrators were scaled in the original design by factors 0.5, 0.35 and 0.35 compared to the first integrator (due to the sampling capacitances decreasing in each stage). The same factors were applied during the porting.
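The initial scaling step amounts to multiplying every design variable by the factor for its kind. A minimal sketch follows; the three factors and the 2.5 mA bias current are taken from the text, while the device names and widths are hypothetical placeholders.

```python
# Initial scaling factors quoted in the text (derived there from Table 1).
SCALE_N = 0.54065   # nMOS transistors
SCALE_P = 0.51639   # pMOS transistors
SCALE = 0.7         # capacitances and bias currents

FACTORS = {"nmos": SCALE_N, "pmos": SCALE_P, "cap": SCALE, "ibias": SCALE}

def initial_scaling(devices):
    """Scale each (kind, value) entry by the factor for its kind."""
    return {name: (kind, value * FACTORS[kind])
            for name, (kind, value) in devices.items()}

# Hypothetical source-process values; only the bias current is from the text.
source = {
    "M1_width_um": ("nmos", 100.0),
    "M3_width_um": ("pmos", 200.0),
    "Ibias_mA": ("ibias", 2.5),
}
target = initial_scaling(source)
print(target["Ibias_mA"][1])
```

The finetuning step then adjusts individual values away from these uniform factors, which is why the effective scaling factors in the tables differ from the initial ones.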


Part of the qualitative dependency matrix that can be used to finetune the performance is shown in Table 7. Possible parameters to tune the performance are the bias current and the width of the input transistors, as illustrated by the qualitative numbers in the table.

An additional finetuning step was performed to correct for the violated switch resistance specification (see Table 2) and to additionally reduce the power consumption by means of the OTA bias current, while keeping the performances at least equivalent to the ones in the original design. The scaling factors of the switches are determined to be 0.6758 and 0.6455 for nMOS and pMOS transistors respectively (see Table 6). This results in the same switch on-resistance as in the original design.

As can be seen from Table 3, the limiting performance was the slew rate. If we want to make the slew rate in C035 equal to that of the original design in C05, the current can be reduced by 15% down to 1.5 mA. After comparing the simulated results of this final design with the performances of the original design, we can see that the tuned version performs at least as well as the original design with a 40% lower power consumption for the integrator.
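The quoted savings follow directly from the three bias-current values. The short check below assumes that the integrator power scales linearly with the OTA bias current at an unchanged supply voltage, which is an assumption of this sketch rather than a statement from the text.

```python
original_ibias_ma = 2.5   # source-process design
scaled_ibias_ma = 1.75    # after the initial scaling by 0.7
tuned_ibias_ma = 1.5      # after finetuning on the slew rate

def relative_reduction(before, after):
    """Fractional reduction when going from 'before' to 'after'."""
    return 1.0 - after / before

step_saving = relative_reduction(scaled_ibias_ma, tuned_ibias_ma)
total_saving = relative_reduction(original_ibias_ma, tuned_ibias_ma)
print(f"finetuning step: {step_saving:.1%}, overall: {total_saving:.1%}")
# prints "finetuning step: 14.3%, overall: 40.0%"
```

The finetuning step alone gives roughly the 15% mentioned in the text, and the overall reduction relative to the original design is 40%.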

The comparator. Table 8 illustrates the specifications and theperformances for both processes of the comparator building block. Theschematic of the comparator is shown in Fig. 7. Table 9 summarizes alldevice sizes for both the source and the target process, together with the

effective scaling that was applied. To avoid kickback noise, the input ofthe comparator is sampled on a 0.25 pF capacitor of which the value isleft unaltered. The design variable Ibias was scaled from 100 toLike in the original design, the second comparator is a scaled version ofthe first one to reduce the load on the C2a clock signal; the third one isidentical to the second one.


existing layout as much as possible, a template-based approach is preferred in this case [5,6]. The layout then looks like the original layout, and the parasitics will likely be similar (in a scaled sense). The template fixes the relative position and interconnection of the devices. The layout is then completed by correctly regenerating the devices (with the possibly updated device sizes) and the wires for the new process according to this fixed geometric template, thereby trying to use the area as efficiently as possible. These approaches work best when the changes to the circuit’s device sizes are not too large, so that there is little need for global alterations in the general circuit layout structure and hence the existing template can be used. Fig. 8 shows for example three different instances of the layout of a circuit generated with a template-driven layout methodology [7].

The main problem, however, as already stated above, is the automatic extraction of the template from an existing layout. Most template-based approaches published in the past [5,6] generated a template a priori for every circuit and stored it in some library, to be used during layout generation. If no such template is available, then it will have to be extracted from the layout, which is much more difficult. In practice, this will often be the case, unless the design has been documented properly, as will be discussed in section 5 later on.


3.2. Layout step


to keep the relative positions of the building blocks

to keep the aspect ratios of the building blocks

All of this should be done as far as practically manageable.

3.2.2. Template-based cell layout generation

Analogous to the sizing problem, we want to take as much advantage of the original layout as possible to synthesize the new layout based on the new device sizes. This means that we want to generate the new layout as much as possible with the original layout as guide or reference, called a “template”. The preferred layout approach here is therefore “template-driven” layout.

There are however a few practical limitations. One of them is that, with the original layout at our disposal, we are still confronted with the fact that we must be able to automatically recognize all the devices of the original circuit and their interconnections on that layout. This is a very complex task that to some extent is also performed in LVS tools, but these tools don’t provide the full information needed to build up a template from an existing layout. Another practical inconvenience that one is likely to encounter when trying to recognize and resynthesize devices is a possibly different technological implementation of certain devices. For analog circuits this can be the case for special resistor and capacitor layers available in one technology but not in the other. For both analog and digital circuits the number of metal layers (used for interconnection) can differ between the two processes.

We will distinguish between the top-level layout or floorplan, which is needed for more complex cells, and the layout regeneration of the basic cells.

3.2.1. The floorplan

For more complex cells like a modulator, the layout is generated hierarchically according to a floorplan that is defined first. The layout of the floorplan is an important step which has impact on all blocks that are part of it. Mostly, a lot of reasoning has preceded the final floorplan of the original layout. It is therefore only logical to reuse the original floorplan in the new target process, or more specifically:

Once the floorplan has been determined, the layout of the different blocks can be generated accordingly. In order to take advantage of the

Digital standard cell libraries are a key element in every modern VLSI design flow. The most important issues are compactness and speed of the cells. Therefore, the performance of these cells and their layout are individually tuned. This job is not only complex, but also very time-consuming considering the fact that this is mostly handcrafted work. Of course, this only needs to be done once for every technology. But also in the case where multiple foundries are used for reasons of multi-sourcing or where different flavors of the same process (e.g. with or without germanium option) are used, a new optimized library is needed. Let us also keep in mind that new and smaller feature-size technologies are becoming available at an increasingly faster rate [2] and that even “older” processes get tweaked over time to increase performance and yield. On the other hand, market pressure demands quick product introductions, and the availability of the standard cell library is therefore often a bottleneck in adopting a new technology today.

It would therefore be beneficial to have very quick access to a first version of the new library, generated by the computer from an existing previous library, and which can still be tuned manually afterwards if the need arises to squeeze out the last square micrometer. We will now discuss such a porting methodology for digital standard cell sizing based on a genetic algorithm. Since we use a SPICE-level circuit simulator for the transient delay simulations of the cells, accurate performance results are guaranteed.

Our approach is optimization-based in combination with SPICE simulations, as this is the only approach that provides the necessary accuracy for the library cell performances, similar to what is normally used in hand-crafted sizing. The optimizer therefore iterates over different values of the device sizes to tune the cell’s performances to the required specifications while minimizing cost such as power and/or chip area. At each iteration transient SPICE simulations are performed to extract the desired performance characteristics (propagation delays, rise or fall times, etc.). To this end, parameterizable netlist descriptions for the different cells have been developed. These descriptions are standard SPICE syntax and the desired performances are also represented in each netlist as measured variables, which are automatically parsed by a tool implementing the porting approach.

For the porting itself, the user can choose either to keep the performances in the target process equal to those in the source process, or to tune them by relaxing some specifications or making them more stringent. Of course, in practice, one will set the specifications – mainly in terms of


4. PORTING OF DIGITAL CELLS

delays – more stringent for the target technology; otherwise there would be no need for the new process.

As optimization algorithm guiding the parameter selections we employed the differential-evolution genetic-based program described in [8], which we altered slightly. It is a genetic algorithm that searches for a global optimum and uses continuous parameter values. Among the changes compared to [8] are the inclusion of parameter bounding and stop criteria.
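To make the idea concrete, the sketch below shows a basic differential-evolution loop with parameter bounding and two stop criteria (a cost threshold and a generation limit). It uses the common DE/rand/1/bin scheme, which is an assumption here and not necessarily the variant of [8] or the authors' modified program. The function name, default settings and the toy cost function are hypothetical; in the real flow the cost would come from SPICE simulations.

```python
import random

def differential_evolution(cost, bounds, pop_size=20, f=0.8, cr=0.9,
                           max_generations=300, cost_threshold=1e-8, seed=0):
    """Minimize cost(x) over box bounds with a basic DE/rand/1/bin loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(x) for x in pop]
    for _ in range(max_generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    gene = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    gene = min(max(gene, lo), hi)  # parameter bounding
                else:
                    gene = pop[i][d]
                trial.append(gene)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:  # greedy selection
                pop[i], costs[i] = trial, trial_cost
        if min(costs) < cost_threshold:  # stop criterion on the cost
            break
    best = min(range(pop_size), key=lambda k: costs[k])
    return pop[best], costs[best]

def toy_cost(x):
    """Stand-in for a simulation-based cost; optimum at (0.7, 0.5)."""
    wn, wp = x
    return (wn - 0.7) ** 2 + (wp - 0.5) ** 2

best_x, best_cost = differential_evolution(toy_cost, [(0.1, 1.0), (0.1, 1.0)])
print(best_x, best_cost)
```

Each population member is a list of continuous parameter values, which matches the description of genes as transistor dimensions.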

Every population member in the genetic algorithm is represented as shown in Fig. 10. The different genes represent the lengths and widths of the transistors in the circuit. These parameters are passed to the simulator, which performs the requested analyses. The simulation results together with the specifications are then used to evaluate the fitness of the


4.1. Flow of the tool

Fig. 9 shows the flow of the tool. The user provides the specifications of the performances, mainly delays and rise/fall times, that have to be evaluated by means of the measurement statements in the SPICE netlist. These specifications for the target technology can be chosen to be the same as in the source technology, or other, more stringent, values can be specified. The properties of both source and target technology are specified in an ASCII configuration file. The tool then returns the optimum cell sizes that ensure that every performance satisfies its specification.

This is a minimax problem formulation. The algorithm will try to minimize the cost, which is equal to the maximum normalized deviation of a performance from its specification. Each performance is thus normalized to have an equally important influence. Also, a weight factor W is included that takes a different value depending on whether the specification is met (100) or not (100000). Note that with W = 100 and a cost threshold stop criterion of 1, a tolerance of 1% is achieved. It is, however, also possible that the genetic algorithm proposes bad combinations of parameters (e.g. out of range). Then, a “high” cost (e.g. 1e+8) is assigned to such solutions.
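A minimal sketch of such a minimax cost is given below. The weights 100 and 100000 and the out-of-range cost 1e+8 come from the text; the function name, the data layout and the rule that a delay-type specification counts as met when the simulated value does not exceed it are assumptions of this example.

```python
W_MET = 100.0          # weight when the specification is met
W_VIOLATED = 100000.0  # weight when it is not
INVALID_COST = 1e8     # assigned to out-of-range parameter combinations

def cell_cost(performances, specs, params=None, bounds=None):
    """Maximum weighted, normalized deviation of any performance from its spec."""
    if params is not None and bounds is not None:
        for name, value in params.items():
            lo, hi = bounds[name]
            if not lo <= value <= hi:
                return INVALID_COST
    worst = 0.0
    for name, spec in specs.items():
        deviation = abs(performances[name] - spec) / spec
        # Assumption: delay-type specs are met when the value is at or below spec.
        weight = W_MET if performances[name] <= spec else W_VIOLATED
        worst = max(worst, weight * deviation)
    return worst

# Hypothetical delay specifications in picoseconds.
specs = {"PLH": 100.0, "PHL": 80.0}
accepted = cell_cost({"PLH": 99.5, "PHL": 79.84}, specs)    # 0.5% and 0.2% fast
rejected = cell_cost({"PLH": 100.5, "PHL": 79.84}, specs)   # 0.5% too slow
print(accepted, rejected)
```

With a stop threshold of 1, the first candidate is accepted because its worst deviation is below 1%, while the second is far above the threshold even though its deviation is equally small, because it violates the specification.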


member by means of the following cost function :

4.2. Digital standard cell porting examples

We will now demonstrate the capabilities of the tool to automatically find the scaling factors for the transistor widths (nMOS and pMOS) that are necessary to migrate digital standard cells from one technology to a newer one. To have an optimal performance, the scaling factors are not necessarily the same for each type of cell. The source technology is a

CMOS process and the target process has a gate length. Since all cells have minimum gate length, we don’t optimize the transistor lengths.

A first experiment keeps the performance specifications of the originalcell. We migrate a simple inverter cell from a CMOS to a

process, where we try to keep the performances. So, the question is: how small can the transistors be sized in the target technology so as to still have the same performance as in the source technology? Note that the scaling factor for the transistor length is 0.714 (1/1.4). The results are given in Table 10, where a comparison is made with actual (“real”) data of cells that were hand-crafted by the manufacturer in the same target technology process. The final cost function value is given together with the time taken by the tool and the number of generations of the genetic algorithm. Although a genetic algorithm is very well suited for parallel execution, the numbers presented here are the results of execution on a single host computer (SUN Ultra 30). We can conclude from the table that the optimized performances are within the given tolerance of 1% (0.5% for low-to-high (PLH) and 0.2% for high-to-low propagation delay (PHL) respectively). Nevertheless, the optimized parameters, i.e. the nMOS and pMOS scaling factors, deviate by as much as 62% from the “real” values that we had in the manufacturer’s library. This is, of course, due to the fact that the speed specification for a standard cell library in practice is always increased when moving to a faster technology process; otherwise no advantage of the faster process would be taken.

Hence, in the second experiment, the speed specification is increased as well. In Table 11, we present the results of an experiment similar to the first, but where the target performance specification is set equal to the real simulated target cell specifications from the manufacturer’s hand-crafted library. This is done for three different cells (inverter, 2-input AND, EXOR). Again, a comparison has been made between results from the tool and the actual cell data. It is clear that the scaling factors now match better with the real values. Nevertheless, they deviate by 3 to 8%, even though the optimized performance is within 1% of the specification (as requested). This is likely due to extra design margins that are taken in a real design.

The above mentioned experiments show that the migration flow for digital standard cells works and that the user can arbitrarily set the target specifications. The performances of the optimized cells are within the accuracy specified by the user (1% in our example). Also, the optimization times are well within reasonable limits since the library migration will be done only once for every new process. In addition, we didn’t make use of parallel execution on different host computers, which would speed up the optimization even further. Therefore, by making templates of the library cells only once, a fast migration at the level of cell sizing is possible for every subsequent technology. Again, it is assumed that the cell topology does not change when porting to the new process.

In order to allow an easy porting, the original design should be somehow “prepared” for the porting. It is difficult for another designer unfamiliar with the previous design, and even more so for a computer tool, to understand all the details, the intent and the little “twists” in the mind of the original designer when completing his/her original design. Therefore, to facilitate porting, a minimum amount of documentation should be generated by the original designer and should be delivered together with the design itself. This small initial overhead certainly pays off in the long run for the company when the design has to be ported to other processes later on. And it is only the original designer who has the information that is needed for this and who therefore has to provide it. Besides, “design flow capturing” tools that operate in the background could be set up here to help the designer in this job. Also, standardized verification tools that generate a standardized datasheet for each circuit would certainly be useful here.

5. DOCUMENTATION FOR PORTING

As a start in this discussion, we will specify here what kind of information should be included in the documentation accompanying the original design. For a design to be portable, we propose the following mandatory set:

1. System specifications + derivation of the specifications of each block (top-down) in order to ensure that the system will work, together with other essential specifications/performances

2. Top-level architecture + external PIN connections + the topology of all blocks in the hierarchy + their interconnections to form the system (= hierarchical netlist)

3. The circuit sizes for each block together with a list of the critical devices, possible problems and the relation between important performances and the device sizes having the most impact on these specifications/performances (e.g. the GBW increases with increasing channel width of the input transistors, etc.)

4. Simulation or verification method, applied inputs, outputs to be checked, how to verify that the specification is met, simulation examples (graphs)

We understand the extra effort needed for the original designer of the circuit to document all this information in an orderly fashion. On the other hand, almost all of the information in the list is generated at some point in time in one file or another during the course of the design anyway. Moreover, the designer him/herself can also benefit from this documentation, either for a next design or for some kind of reporting.

Finally, we want to point to the fact that documentation will play an increasingly important role in the trend towards complex integrated systems-on-a-chip. Organisations like the VSI (Virtual Socket Interface) Alliance [9] have acknowledged this need and have proposed an open interface to make design re-use possible. The circuit that is being reused is then a so-called VC (Virtual Component) and will have to be accompanied by a minimum standardized set of documentation. Retargeting benefits from the same documentation.

6. CONCLUSIONS

This paper has presented CAD techniques for the automatic horizontal and vertical porting of both analog and digital circuits. Both the circuit resizing and the layout regeneration are discussed. In both cases, advantage is taken as much as possible of the existing design as a reference to start from. For the analog circuit resizing, a scaling step was followed by a finetuning step. For the layout regeneration, a template-based approach was presented. For the digital standard cells, a simulation-based optimization approach was adopted. Experimental results have illustrated the capabilities of the presented methods. Also the importance of proper design documentation has been discussed as a necessary means to facilitate easy technology porting.

Future work will have to concentrate on improving the methods and integrating them into a flawless automated environment for both analog and digital circuit porting. Also the role of documentation and techniques to minimize the overhead of design for reuse will have to be further investigated and implemented.


ACKNOWLEDGEMENTS

This work has been supported in part by the ESPRIT project NAOMI and the IWT project FRONTENDS.

REFERENCES

[1] G. Gielen, R. Rutenbar, “Computer-aided design of analog and mixed-signal integrated circuits,” Proceedings of the IEEE, Vol. 88, No. 12, pp. 1825-1854, December 2000.

[2] “International Technology Roadmap for Semiconductors,” 1999 version + 2000 update, http://public.itrs.net.

[3] K. Francken, G. Gielen, “Methodology for analog technology porting including performance tuning,” proceedings International Symposium on Circuits and Systems (ISCAS), June 1999.

[4] Y. Geerts, A. Marques, M. Steyaert, W. Sansen, “A 3.3 V 15-bit ADC with a signal bandwidth of 1.1 MHz for ADSL applications,” IEEE Journal of Solid-State Circuits, Vol. 34, No. 7, pp. 927-936, 1999.

[5] G. Beenker, J. Conway, G. Schrooten, A. Slenter, “Analog CAD for consumer ICs,” chapter 15 in “Analog circuit design” (edited by J. Huijsing, R. van der Plassche and W. Sansen), Kluwer Academic Publishers, pp. 347-367, 1993.

[6] H. Koh, C. Séquin, P. Gray, “OPASYN: a compiler for CMOS operational amplifiers,” IEEE Transactions on Computer-Aided Design, Vol. 9, No. 2, pp. 113-125, February 1990.

[7] R. Castro-López, M. Delgado-Restituto, F. Fernández, A. Rodríguez-Vázquez, “Reusability methodology for IC layouts,” proceedings Workshop on Advanced Mixed-Signal Tools, ESD-MSD Mixed-Signal Design Cluster initiative of the European Union, March 2001.

[8] R. Storn, “On the usage of differential evolution for function optimization,” in NAFIPS, pp. 519-523, 1996.

[9] Virtual Socket Interface Alliance, several documents including VSIA Architecture Document and Analog/Mixed-Signal Extension Document, http://www.vsi.org.

Introduction to High-speed Digital-to-Analog Converter Design

Rudy van de Plassche
Broadcom Netherlands BV
Bunnik

Abstract

In this paper limitations in static linearity (INL, DNL) and dynamic range (Effective Number of Bits, ENOB’s) of digital-to-analog converters due to clock jitter, finite linearity, component matching and switching uncertainty will be calculated. Secondly quantization error spectra are analyzed and the influence on distortion and cross modulation effects is derived. Practical design examples will be discussed.

1 Introduction

Digital-to-analog converters are the link between digital signal processing and the analog world. In Fig. 1 the different signal conditions present in a converter are given. From Fig. 1 it is seen that a digital signal is a discrete time, discrete amplitude signal.

J. H. Huijsing et al. (eds.), Analog Circuit Design, 115-150.© 2002 Kluwer Academic Publishers. Printed in the Netherlands.


An analog signal is a time continuous, amplitude continuous signal. To convert from digital into analog a signal reconstruction takes place. Sampling limits the maximum frequency range to the Nyquist frequency. A filtering operation is needed to limit the maximum signal frequency and avoid aliasing. Amplitude quantization discretizes the amplitude into well known steps. A quantization error is introduced. This quantization error limits the dynamic range of a system. The quantization error depends on the number of steps used in the system.

2 Ideal converter

In an ideal converter the sampling time is fixed and constant anddoes not introduce any error. Only the amplitude quantizationerror causes a limitation to the system. Because all errors dis-cussed in this paper will be referred to the quantization error thiserror will be calculated first. In Fig. 2 the quantization of a signalat the sampling moment to the amplitude level is shown.The quantization error determines the error between the analogsignal and the quantization level In the lower part of Fig. 2the probability density of the error over the quantization interval

is shown. Here is the quantization step of the con-verter. The uniform probability error function shows that thereis no correlation between the signal frequency and the samplingfrequency. The quantization error power can be calculated using

as the quantization step. The average quantization error powerbecomes:

Solving this equation we get the well known result:

Applying a sine wave with a peak-to-peak amplitude ofto an n-bit system, then the RMS signal am-

plitude can be calculated as:

The dynamic range (Signal-to-Noise ratio) of the n-bit systembecomes:

Inserting values for amplitude and quantization error we get:

Converting this into Decibels we obtain:

These results give a global analysis of quantization error. To obtain a better understanding of quantization errors, an analysis of the quantization error spectra will be given. This model can be applied to analog-to-digital and digital-to-analog converters.
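The chain of results in this section follows the standard derivation for an ideal quantizer. The block below is a reconstruction under stated assumptions, not a copy of the chapter's numbered equations: $q$ denotes the quantization step, $e$ the quantization error uniformly distributed over one step, and the sine wave is assumed to span the full scale of $2^n q$ peak-to-peak.

```latex
% Average quantization error power for a uniform error density 1/q
\overline{e_q^2} = \frac{1}{q}\int_{-q/2}^{q/2} e^2 \, de = \frac{q^2}{12}

% RMS amplitude of a full-scale sine wave (peak-to-peak 2^n q)
A_{rms} = \frac{2^n q}{2\sqrt{2}}

% Dynamic range as the ratio of RMS signal to RMS quantization error
\frac{S}{N} = \frac{A_{rms}}{\sqrt{\overline{e_q^2}}}
            = \frac{2^n q / (2\sqrt{2})}{q / \sqrt{12}}
            = 2^n \sqrt{\tfrac{3}{2}}

% Expressed in decibels
\left.\frac{S}{N}\right|_{dB} = 20\log_{10}\!\left(2^n\right) + 10\log_{10}\!\left(\tfrac{3}{2}\right)
                              \approx 6.02\,n + 1.76 \ \text{dB}
```

For $n = 10$ this gives about 62 dB, the familiar figure for an ideal 10-bit converter.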



2.1 Quantization error spectra

Suppose a quantized ramp signal shown in Fig. 3 is reconstructedthen the error signal can be determined as a sawtooth with am-plitude and repetitions as shown in Fig. 4.Note that at this moment ONLY AMPLITUDE QUANTIZA-TION is used. SAMPLING of the signal will be performed at alater stage.By shifting the DC value of the signal as shown in Fig. 4 then aFourier analysis of that signal gives only odd harmonics describedas:

In case a sine wave is applied then the output spectrum becomes

more complex. From [1] we obtain:


Simplifying this equation we get:

With defined by:

The amplitude of the harmonic with index is given by fromequation 10. The quantization error spectra can be plotted usingthis equation. In Fig. 5 a spectrum with up to 30.000 odd com-ponents of a 10-bit quantizer is shown. The spectrum slowly de-creases with increasing number of harmonics and has a length ofinfinity. The spectrum shows furthermore peaks that can math-ematically be determined to occur at the harmonicof the input signal. A more detailed part of the frequency spec-trum of the same quantizer of Fig. 5 is shown in Fig. 6. Lowerorder odd harmonic amplitudes can be estimated from this figure.Third harmonic is about 90 dB down with respect to full scale.A relation between for example the third harmonic componentand the number of bits of a converter needs some mathematicalmanipulations that go beyond the scope of this material. Theresult for the third order distortion, however, can be expressedas:

As a result a 10-bit converter has a third order distortion com-ponent 90 dB down with respect to full scale. Increasing the


resolution of the converter with 1-bit, then the distortion compo-nent reduces with In Fig. 7 the quantization error, thethird order distortion component and the intermodulation com-ponent (IM3) as a function of the number of bits are shown. The

IM3 products will be described in one of the following sections.

2.2 Amplitude dependence of the quantization compo-nents

So far the calculation of the quantization spectra has been per-formed for signals exactly fitting within the quantization levels.In case a signal varies within a quantization level then the totalspectrum changes. Suppose that the amplitude varies as [01], then equation 10 is modified into:

With

Taking as an example p=3 and p=31, the result for a 6-bit converter is shown in Fig. 8. This figure shows that, depending on the signal amplitude, distortion components can be reduced to zero. Maximum third order distortion is found roughly at the quantization levels.

2.3 Multiple signal distortion

At the moment two or more signals are quantized, it is important to know what the so-called cross-modulation or intermodulation products (IM3) will be. A formula for two input signals will be given. Suppose we apply the following signals

Using the previously described analysis and using some mathe-matical manipulations we get for the cross-modulation:

Applying signals with equal amplitude then the inter-modulation product changes with the number of bits of


the converter as This means that with an increasein resolution of 1-bit, the intermodulation product will decreasewith 12 dB. Again this needs some mathematical manipulationsof the equations to obtain this result.Quantization of analog signals results in errors that are corre-lated with the signal mostly as odd harmonics of the signal. Theanalyzed spectra show basically an infinite number of spectralcomponents.

3 Sampling of a quantizer

In a converter a quantization and a sampling operation is per-formed as has been shown in the introduction. At the momentthe quantizer spectra are sampled using a sampling frequencythen a large distortion called aliasing occurs for all frequenciesoutside the band As a result of this aliasing all higherorder frequency components are shifted to this baseband. Thisoperation is shown in Fig. 9. This figure shows a single sidedfrequency picture of the sampling operation. From this figure


it can be seen that if a correlation between the sampling frequency and the signal frequency exists, then distortion products will add, while in the case of uncorrelated sampling and signal frequencies these components fall in between the harmonics and result in a "noise-like" quantization error. Increasing the sampling frequency will result in a reduction of the amplitude of the quantization error components, however, the total power of all the error components in the baseband never exceeds
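The folding of higher-order harmonics into the baseband can be illustrated with a short sketch. The sampling frequency and input frequencies below are arbitrary illustrative values (in the same arbitrary unit), and only the odd harmonics are listed, in line with the earlier observation that the quantization error consists mostly of odd harmonics.

```python
def alias_frequency(f, fs):
    """Fold a frequency into the baseband [0, fs/2]."""
    f = f % fs
    return fs - f if f > fs / 2 else f

fs = 100.0
odd_orders = (1, 3, 5, 7, 9, 11, 13)

# Signal frequency not simply related to fs: aliases land on distinct
# frequencies, spreading out between the low-order harmonics.
uncorrelated = [alias_frequency(k * 9.0, fs) for k in odd_orders]

# Signal frequency at fs/10: aliased harmonics fall back onto the same
# few frequencies, so their contributions add.
correlated = [alias_frequency(k * 10.0, fs) for k in odd_orders]

print(uncorrelated)  # [9.0, 27.0, 45.0, 37.0, 19.0, 1.0, 17.0]
print(correlated)    # [10.0, 30.0, 50.0, 30.0, 10.0, 10.0, 30.0]
```

In the correlated case only three distinct frequencies appear, which is why the distortion products pile up instead of forming a noise-like floor.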

4 Non-ideal converter

Practical converters show deviations from the analysis given above. Especially the finite matching of practical elements such as resistors and transistors has a big influence on the maximum performance of a converter. These non-idealities can be split up into timing errors and non-linearity errors. The timing errors can be random, due to noise and jitter introduced by the sampling clock, or systematic, mostly introduced by the layout of a converter. These systematic errors must be avoided as much as possible, but at a certain point are unavoidable. Sometimes a change in converter architecture is needed to avoid systematic timing errors. In a layout for example an interconnection wire of about 100 introduces a time error of about 1 psec. At high signal and clock speeds these errors will introduce glitches and/or mostly third order distortion products of the signal frequency. First we will start with timing errors.

4.1 Timing

Timing in a practical system is not ideal. Timing errors can be:

Random timing errors due to clock jitter or noise on clockcircuits

Systematic errors due to layout, differences in wire lengthsor the system architecture


Random timing errors result mostly in an increase of quantization errors at high signal frequencies. Systematic errors result in distortion that in turn results in a reduction of dynamic range. Mostly systematic errors are architecture or layout dependent. Wires from the sampling clock to the bit-switches can have different lengths in case a layout is not carefully designed.

4.2 Random timing errors

In practical systems sampling clocks show a certain instability called clock jitter, while in addition noise in the clock distribution circuitry can increase this jitter. In Fig. 10 the influence of clock jitter on the sampling of an analog signal is shown. The influence of this clock jitter on the amplitude quantization of an analog signal can be calculated. From Fig. 10 it is seen that clock jitter only has influence on the amplitude quantization at high input frequencies. This random clock jitter exhibits itself as an extra quantization error and thus reduces the dynamic range of the converter. This effect will be calculated for an n-bit converter with 2^n quantization steps. A sine wave will be applied as signal because it is the highest frequency possible in a band limited system avoiding aliasing. With

we obtain after differentiation:


With and A the amplitude of the signal. The step size of the converter equals:

Inserting 26 into 25 gives:

A quick indication of the allowable peak-to-peak clock jitter is obtained by requiring that the amplitude uncertainty does not exceed the quantization step. After rearranging the equation we obtain:

The worst case condition is found if so this equation simplifies into:

With MHz and n = 10 bit, the peak-to-peak clock uncertainty must be below 65 psec. However, we want to know the influence of the RMS clock jitter on the Effective Number Of Bits (ENOB’s) of a converter. With ENOB defined as:

Here is the “measured” effective resolution of a converter in dB and includes all the non-ideality effects of a practical converter. The error power due to jitter equals:

In this power equation the slope of the signal determines the sensitivity of the converter to clock jitter. We can average this slope over the signal period and get:


Where defined as:

With

The total error power due to quantization and jitter becomes:

Using equation 28 we can rearrange equation 29 and obtain for

We will call the sample clock phase noise (rms). The dynamic range of the converter due to clock jitter noise power changes according to:

The reduction in ENOB’s as a result of jitter then equals:

If then the ENOB’s reduce by 0.5 bit or 3 dB. The ratio between the clock jitter and the signal frequency can be calculated as a function of the number of bits in the converter. In Fig. 11 the decrease in ENOB’s as a function of is shown for converters having a resolution between 4 and 16 bits.
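The jitter estimates of this section can be sketched numerically. The sketch below is not taken from the text: it assumes a full-scale sine wave with amplitude A, a step size of 2A/2^n, a quantization error power of one twelfth of the squared step, and a conversion of 6.02 dB per bit. The function names are illustrative only.

```python
import math

def max_pp_jitter(n_bits, f_signal_hz):
    """Peak-to-peak jitter that keeps the amplitude error below one step.

    Assumes v = A*sin(2*pi*f*t) and a step q = 2*A / 2**n_bits.
    The worst case is the zero crossing, where the slope is 2*pi*f*A.
    """
    return 1.0 / (2 ** n_bits * math.pi * f_signal_hz)

def enob_loss_from_jitter(n_bits, f_signal_hz, sigma_jitter_s):
    """Reduction in ENOB caused by RMS clock jitter (same assumptions)."""
    # Jitter power averaged over a period: (2*pi*f*A*sigma)**2 / 2
    # Quantization power: q**2 / 12 with q = 2*A / 2**n_bits
    ratio = 1.5 * (2 * math.pi * f_signal_hz * sigma_jitter_s) ** 2 * 4 ** n_bits
    loss_db = 10 * math.log10(1 + ratio)
    return loss_db / 6.02  # assumed 6.02 dB per bit

if __name__ == "__main__":
    # Hypothetical operating point: 10 bits, 10 MHz signal, 5 ps RMS jitter
    print(max_pp_jitter(10, 10e6))
    print(enob_loss_from_jitter(10, 10e6, 5e-12))
```

Under these assumptions the loss is half a bit when the jitter error power equals the quantization error power, which agrees with the 3 dB statement above.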

4.3 Glitches

Glitches can be seen as a systematic error occurring in the reproduction of an analog signal via a digital-to-analog converter. A glitch can be produced especially when a binary weighted converter architecture is used and a small signal around a major carry transition is converted. As an example of this phenomenon, consider a binary weighted converter with offset binary coding reproducing an LSB code step at the 011111.. to 100000.. transition. At this code transition the MSB value is switched on or off at the same time that all other values are switched off or on. If switching time errors occur, the output code can reach full scale (1111..) or all zeros (0000..) during a short period of time. This produces an unwanted signal glitch. Filtering off this glitch will reduce the amplitude, but will NOT reduce the amount of distortion produced by this glitch. Suppose that the MSB switch is faster in switching than all the other bits, then the influence of the glitch energy can be calculated. With is the LSB step size, then half scale equals The glitch area becomes:

With is the sample time, then the LSB area is found as:

Suppose that an acceptable reduction in dynamic range is obtained when the glitch energy equals the LSB energy, then we have:


With and n = 14 bits we obtain that psec. Such a small value indicates that an accurate layout of the converter concerning the switching is needed. Changing the architecture into a step-by-step switching of the information, for example using 1023 switches with unit currents in a 10-bit converter, avoids the glitch problem. However, at high output frequencies close to the Nyquist frequency a switching time uncertainty is introduced by the layout of a converter. Every single switch can be seen as having a certain switching uncertainty compared to an ideal switching system. As a result the reproduced signal moves in time, giving a third order distortion after filtering. Again this third order distortion depends on the signal frequency and the timing accuracy that can be designed in the layout.
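The glitch criterion of this section can be sketched as follows. The expressions are an assumed reading of the text: the glitch has a height of half scale (2^(n-1) LSB steps) and lasts for the MSB timing skew, while the LSB area is one step held for one sample time. The sample time used in the example is hypothetical and is not a value from the text.

```python
def max_msb_skew(n_bits, sample_time_s):
    """Largest MSB timing skew for which the glitch area does not exceed
    the area of one LSB step held for one sample period.

    Assumed model: glitch area = 2**(n_bits - 1) * q * skew,
                   LSB area    = q * sample_time_s.
    The step size q cancels out of the ratio.
    """
    return sample_time_s / 2 ** (n_bits - 1)

if __name__ == "__main__":
    # Hypothetical 14-bit converter with a 10 ns sample time
    print(max_msb_skew(14, 10e-9))
```

The allowed skew halves for every extra bit of resolution, which is why the binary weighted architecture becomes so demanding on layout at high resolution.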

4.4 Linearity

In a non-ideal (practical) converter the quantization steps have a limited accuracy because of finite matching of components. This results in an Integral Non-Linearity (INL) and a Differential Non-Linearity (DNL) of the converter. The INL is important for large signals because it determines the overall linearity, while the DNL of a converter is important for small signals [2]. Basically the DNL determines the accuracy of the quantization step from quantization level to quantization level. This non-ideality results in distortion of the signal and thus in a reduction of the dynamic range. Mostly the INL is specified as ± LSB. DNL depends on the construction of the converter but is at maximum 2*INL in case of a binary weighted converter. In Fig. 12 the INL and DNL characteristics of a converter are shown. Note that sometimes codes are missing, giving a large DNL, or the output signal can even step back with an increase in digital input code. Non-monotonicity of the converter is observed at that moment.
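As an illustration of these definitions, the sketch below computes DNL and INL from a list of output levels, one level per input code. It assumes an end-point definition of the LSB (full range divided by the number of steps), which is one common convention and not necessarily the one used for Fig. 12. The 3-bit example with a too small MSB weight is hypothetical.

```python
def dnl_inl(levels):
    """Return (dnl, inl) in LSB for output levels listed per input code.

    Assumes an end-point fit: 1 LSB = (last - first) / number of steps.
    """
    n_steps = len(levels) - 1
    lsb = (levels[-1] - levels[0]) / n_steps
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1.0 for k in range(n_steps)]
    inl = [(levels[k] - levels[0]) / lsb - k for k in range(len(levels))]
    return dnl, inl

def is_monotonic(levels):
    """True if the output never steps back for an increasing input code."""
    return all(b >= a for a, b in zip(levels, levels[1:]))

def binary_levels(weights):
    """Output level for every code of a binary weighted converter."""
    n = len(weights)
    return [sum(w for i, w in enumerate(weights) if (code >> i) & 1)
            for code in range(2 ** n)]

if __name__ == "__main__":
    # Hypothetical 3-bit converter whose MSB weight is 2.5 instead of 4
    levels = binary_levels([1.0, 2.0, 2.5])
    dnl, inl = dnl_inl(levels)
    print(min(dnl), max(abs(x) for x in inl), is_monotonic(levels))
```

In this example the output steps back at the major carry transition, so the DNL at that code is below -1 LSB and the converter is non-monotonic.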


4.5 Matching accuracy of converter elements

When designing converters with resolutions from 8 to 16 bits the following question arises:

How accurate do I need to design the unit currents or resistors to obtain a certain INL and/or DNL?

To answer this question a Matlab program has been used to obtain information about the INL and ENOB’s of converters. The converter has been modeled using unit current sources or unit resistors to determine every quantization level. In Fig. 13 the results of this program for a 10-bit converter are shown. A total number of 1000 “converters” have been analyzed using this program. At the same time the ratio between the largest distortion component and the signal component, defined as Spurious Free Dynamic Range (SFDR), is analyzed too. The matching of the unit elements has a σ of 2.5 % in this simulation. In Fig. 14 a histogram of the converters as a function of the INL is shown. This histogram shows that of the converters reaches ± LSB INL and are within 1 LSB INL. In the range of 8 to 12 bits of resolution of a converter identical simulations have been performed. As a result of these simulations Fig. 15 shows the relation between the required unit element matching and the number of effective bits (ENOB’s). Results for and yield of a converter are shown too. It must be noted that if a segmented or binary weighting is used in a converter, the matching accuracy between the segmented elements or the binary weighted elements increases according to the value or the number of elements used to obtain the required weight. In practice mostly a number of elements is put in parallel to increase the unit value. As a result the accuracy increases with

The finite matching of components in a converter results in a limited linearity of such a converter. A very useful relation between INL and reduction in ENOB’s of a converter is proposed. The finite INL results in a systematic error signal that introduces distortion. Because the INL is directly related to the LSB of a converter, the distortion introduced will be related to the quantization error using a simple “fitting” model.
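The kind of Monte Carlo experiment described in this section can be sketched as follows. This is not the Matlab program used for the figures: it is a minimal Python stand-in that assumes Gaussian unit elements, a converter built from 2^n - 1 equal units switched in a fixed order, and an end-point INL definition. The function names, the seed and the default trial count are illustrative choices.

```python
import random

def max_inl_lsb(n_bits, sigma_unit, rng):
    """Worst-case |INL| in LSB of one randomly drawn unit-element converter."""
    n_units = 2 ** n_bits - 1
    units = [rng.gauss(1.0, sigma_unit) for _ in range(n_units)]
    lsb = sum(units) / n_units  # end-point fit: average unit value
    worst = 0.0
    level = 0.0
    for k, u in enumerate(units, start=1):
        level += u  # output level for input code k
        worst = max(worst, abs(level / lsb - k))
    return worst

def inl_yield(n_bits, sigma_unit, inl_limit_lsb=0.5, trials=200, seed=1):
    """Fraction of simulated converters with max |INL| within the limit."""
    rng = random.Random(seed)
    passed = sum(max_inl_lsb(n_bits, sigma_unit, rng) <= inl_limit_lsb
                 for _ in range(trials))
    return passed / trials

if __name__ == "__main__":
    # 10-bit converter, unit element sigma of 2.5 % as in the text
    print(inl_yield(10, 0.025))
```

Sweeping `sigma_unit` and `n_bits` in such a loop gives the type of yield versus matching relation discussed around Fig. 14 and Fig. 15; the numbers obtained depend on the assumed INL definition and switching order.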

Identical to what has been done with clock jitter, the dynamic range of a converter changes according to:

Here is the peak to peak systematic signal distortion component due to finite converter accuracy. This value gives the worst case condition, because it is not known how the INL curve as a function of the signal value behaves. A Fourier analysis would give exactly the value of the different distortion components and in that way a better estimation of the total distortion can be obtained. To verify the model the ENOB reduction will be calculated. The ENOB’s reduce with:


This model is valid for yield of converters. In Fig. 16 the simplified model is inserted into the ENOB simulation using Matlab. In this figure only a limited range of INL is shown. However, the model has been verified over larger variations of INL. In Fig. 17 the worst case reduction in ENOB’s of a converter as a function of the INL is shown. This graph is very useful to get quick information about the converter resolution and the linearity.

5 MOS matching models

When designing converters with unit elements, the matching accuracy between the elements as a function of the number of bits is known. A 10-bit converter needs for yield and ± LSB INL a matching of 2.5 %. In case we want to increase the yield to then the matching must be improved to 1.25 %. In MOS technology information about matching of components is available as a function of technology and a limited amount of model parameters [3].

Suppose the MOS devices are in saturation then:

The first condition that will be considered is:

1) Equal drain currents, so:

This results in:

Defining a small difference between the two MOS devices using:

We obtain:

Working out this equation we obtain:


The matching of an MOS pair is equally influenced by the threshold matching or by the slope mismatch if:

In practical MOS technologies

In Fig. 18 the threshold matching of MOS devices having a by device size versus the gate oxide thickness of the technology is shown. From this figure it can be seen that the mismatch reduces with decreasing gate oxide thickness. The validity of this relation has been proven even for submicron technologies. The gain mismatch of MOS devices versus gate oxide is shown in Fig. 19. From this figure we see that the gain mismatch is nearly independent of the gate oxide thickness. This means that with increasing drain current the mismatch of a differential pair or a current mirror will become independent of technology ( limited). If

then

In practice this means that the current density in the MOS device must be below a value corresponding with the given gate-source voltage. The calculated offset is valid for MOS devices with a by gate size. When the size of the devices is increased, it is known from literature that the offset decreases with increasing device area or:

In Fig. 20 the measured threshold mismatch has been plotted against the device size of an MOS transistor. Increasing the size reduces the mismatch according to equation 57. The designer has the option to size the devices regarding offset. In a practical situation the device size variations are limited. A 1 to 100 size variation is still possible; however, the capacitance of the devices increases. This results mostly in an increase in biasing current and thus power.

2) Equal gate-source voltage:

With the devices in saturation we obtain:

Solving these equations for a difference in drain currents:


Inserting a small difference between the drain currents using:

we obtain

Using a first order approximation for the square root we get:

The variable can be replaced by:

Then we obtain for the current mismatch:

At small current densities we have that:

The current mismatch at small current densities can be simplifiedinto:

At large current densities the matching is determined by:

Note that the calculated current offset is again valid for 1x1 sized devices! When the size of the devices is changed then the offset varies depending on of the gate area.


The final mismatch at small current densities, including the device size dependence, becomes:

In case of large current densities we obtain finally for the currentmismatch:

In Fig. 21 the measured gain mismatch versus the device size WL is shown. The designer has again the option to scale the device size to reduce the current offset.
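The area scaling and the two current-density regimes discussed in this section can be sketched with the area law of [3]. The sketch assumes that the threshold and gain mismatch are independent, that both scale with the inverse square root of the gate area WL, and that the square-law model holds. The coefficient names `a_vt` and `a_beta` and their example values are hypothetical and do not come from Fig. 18 to Fig. 21.

```python
import math

def sigma_vt(a_vt, w, l):
    """Threshold mismatch (V) of a pair; a_vt in V*um, w and l in um."""
    return a_vt / math.sqrt(w * l)

def sigma_beta(a_beta, w, l):
    """Relative gain mismatch of a pair; a_beta in um (fraction times um)."""
    return a_beta / math.sqrt(w * l)

def sigma_current(a_vt, a_beta, w, l, v_overdrive):
    """Relative drain-current mismatch at equal gate-source voltage.

    First-order square-law result: dI/I = dbeta/beta - 2*dVT/(VGS - VT),
    with both terms assumed independent so that their variances add.
    """
    vt_term = 2.0 * sigma_vt(a_vt, w, l) / v_overdrive
    return math.hypot(vt_term, sigma_beta(a_beta, w, l))

if __name__ == "__main__":
    a_vt, a_beta = 0.010, 0.02  # hypothetical: 10 mV*um and 2 %*um
    for vov in (0.1, 1.0, 3.0):
        print(vov, sigma_current(a_vt, a_beta, 1.0, 1.0, vov))
```

At a small overdrive voltage the threshold term dominates, at a large overdrive voltage the result approaches the gain mismatch, and a four times larger gate area halves the mismatch in both regimes.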

6 Digital-to-analog converter architectures

In this part different architectures to construct digital-to-analog converters in CMOS technology will be given. Which architecture will be used depends on the application field and the choices a designer makes. Only a few examples can be discussed. Output signals can be a current or a voltage. Mostly differential structures will be used, producing the converted output signal and its complement. In a differential operation of a system mostly a very good symmetry exists, resulting in the absence of even order distortion components in the output signal and in the quantization error. Differential systems can furthermore apply a twice as large output signal to the load. This is important in CMOS submicron technologies that have smaller breakdown voltages (about 1 V). Single ended systems on the other hand might show odd and even distortion components at half the output swing. A large output swing is preferred to improve the dynamic range in a system application. Cross-talk from other system parts may limit the dynamic range in such an application. Differential operation improves the performance by rejecting part of the cross-talk.

6.1 10-bit current mode digital-to-analog converter

Suppose we want to design a 10-bit digital-to-analog converter with a 1σ INL of ± .5 LSB. The technology we have shows a of 2 % and we want to use 1023 equal devices to generate all the current steps [4]. The DC current is set at a value such that the threshold mismatch equals the gain mismatch. This means that the average element mismatch becomes:

To obtain a 1σ INL of ± .5 LSB an element matching accuracy of 2.5 % is required. This means that we have to increase the unit device size to at least Depending on how the output signal is generated, a cascode current source construction might be needed to make the matching independent of the drain-source voltage of the current generating elements. Mostly cascoded stages are used to avoid output signal dependent matching problems. The next design choice is: switching unit current sources using 1023 switches or using a binary weighted construction of the digital-to-analog converter. The unit current switching has the advantage of generating small glitches and having a good Differential Non-Linearity. A problem is the systematic error that can be introduced because of different lengths in clock wires to control the switches in the layout. A very careful layout is needed, having in mind that 100 µm of metal interconnect gives a systematic timing error of 1 psec.

The binary weighting of the currents, obtained by connecting current sources that are distributed over the layout area, mostly causes larger glitches because the on and off switching of the currents needs more accurate timing, and it increases the DNL to about ± 1 LSB. In many designs a combination of segmented current sources (equal to 8 or 16 times the LSB current) and unit weighting is used. In Fig. 22 an example of a 10-bit digital-to-analog current generating network is shown. In this network only equal sized MOS devices are used. Note that in a layout of such a network at least one row of dummy transistors must be added at the outside to improve the overall matching. If a current-to-voltage converter is used to sum the output currents of this network, a cascode current source may not be needed. However, the voltage drop across the switches must be equal in all cases to avoid current modulation due to a variation in the drain-source voltage of this network.


6.2 10-bit Coarse-Fine voltage mode digital-to-analog converter

In most IC processes the matching of resistors is a lot better than the matching of the active elements. Resistor matching depends on size and mask accuracy of the technology [5, 6]. Without too much difficulty a resistor matching better than 0.25 % is obtained. This means that the resolution and accuracy limits of resistor matching dominated designs are between 12 and 14 bits without needing special precautions. In Fig. 23 an example of a 10-bit coarse-fine resistor matched digital-to-analog converter is shown. As is seen from Fig. 23 the system consists of a coarse ladder using rather low valued resistors to obtain the coarse converter levels. Across each coarse converter level a fine “ladder” is connected to obtain the fine steps. At each step a switch has been connected that is controlled by the input digital data, and in this way an output voltage is generated. Analyzing this system we can see that the output impedance of the total system depends on the digital code applied to the converter. Furthermore all these switches are connected together at the output terminal, giving a large output capacitance. As a result of this variable output impedance a signal dependent delay of the analog output signal is found, resulting in distortion. Secondly a high impedance loading is required. A buffer amplifier can be used to decouple the output load from the converter; however, this buffer can introduce distortion due to slew rate limitations and finite bandwidth. Furthermore generating output voltages from 0 to makes the buffer difficult to design.

6.3 Continuous current calibration converter

When the resolution of a converter increases, the matching accuracy of the individual elements must increase. However, so far the increase in matching can only be obtained by increasing the device size WL. In submicron technology this increase in size can become impractically large. At that moment other techniques are required to obtain the high accuracy [7]. Furthermore scaling of technology does NOT reduce the size of the current network because the gain mismatch dominates the accuracy. As has been shown the gain mismatch is technology independent and therefore the sizing of the devices can not be used. Calibration or Dynamic Element Matching techniques can be used to improve matching accuracy without increasing size. The continuous current calibration principle is another possibility. In Fig. 24 the basic idea of current calibration is shown. As is seen from Fig. 24 the calibration principle has two states. During calibration the MOS device is via connected as a diode and switch supplies the calibration current to the diode. At this moment across the gate input capacitor a voltage is generated that fits exactly the input current During the operation of the system, the switch is opened and the switch connects the drain of to the output terminal. The voltage on the gate in principle remains fixed, resulting in an output current to be exactly equal to In a practical situation, however, the operation of the system is not as expected before. Because of leakage currents introduced by the drain-substrate diode of the switching MOS and the charge feed through of this switch, a rather large error is found. To overcome these problems the basic system has to be modified into the circuit shown in Fig. 25. The basic operation of the system is identical to the circuit shown in Fig. 24. However, in this system an extra constant current being 95% of is added. The calibration now takes place on the ERROR signal and NOT on the full signal This means that errors only influence the accuracy of the calibrated ERROR signal. An improvement of at least a factor 20 with respect to the original system is obtained.

The application of the continuous current calibration system into a 16-bit digital-to-analog converter is shown in Fig. 26. The 16-bit converter consists of 65 current sources that are continuously calibrated using an interchanging system. One output current of this high-accuracy 6-bit coarse converter network is subdivided using a MOS only 1024 element binary weighted current divider. The output currents of the coarse and the fine elements are supplied to the output switches. These switches are controlled by the digital input signals and so the digital-to-analog conversion takes place. Depending on the practical design limitations, the switching spikes and small calibrated current mismatches introduce extra quantization errors. As long as the sampling clock and the calibration/interchanging clock are not correlated these errors can be below the quantization error. In that case only a slight deterioration of the dynamic range of the converter is found.

7 Conclusion

The following conclusions can be drawn from this paper:

Spectra of quantization errors and the influence of the amplitude on distortion and cross-modulation products have been calculated.

Quantization errors have minor influence on the performance of practical converters with finite linearity.

A relation between element matching and overall linearity (INL and DNL) has been practically determined.

A practical “fitting” model giving the relation between linearity and Effective Number of Bits has been demonstrated.

Distortion in a converter is dominated by the matching accuracy of the elements used.

The influence of sampling clock jitter on the Effective Number of Bits of a converter has been determined.

Systematic layout problems resulting in timing errors have been determined and analyzed.

Matching parameters of MOS devices have been determined.

Practical solutions for converters using element matching parameters and system solutions to obtain a very high accuracy have been discussed.

Depending on the required performance of a digital-to-analog converter a designer can find a number of design rules to help with architectural and circuit design issues.


8 Acknowledgment

The author wants to thank Frank van der Goes of Broadcom Netherlands for the Matlab programming.


References

[1] N.M. Blachman, “The Intermodulation and Distortion due to Quantization of Sinusoids,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-33, No. 6, pp. 1417-1426, December 1985.

[2] R.J. van de Plassche, “Integrated Analog-to-Digital and Digital-to-Analog Converters,” Kluwer Academic Publishers, ISBN 0-7923-9436-4, 1994.

[3] M.J.M. Pelgrom, A.C.J. Duinmaijer, A.P.G. Welbers, “Matching properties of MOS transistors,” IEEE Journal of Solid-State Circuits, vol. 24, pp. 1433-1439, October 1989.

[4] H.J. Schouwenaars, D.W.J. Groeneveld, H. Termeer, “A stereo 16-bit CMOS D/A converter for digital audio,” IEEE Journal of Solid-State Circuits, vol. SC-23, pp. 1290-1297, December 1988.

[5] M.J.M. Pelgrom, “A 10-b 50-MHz CMOS D/A converter with buffer,” IEEE Journal of Solid-State Circuits, vol. 25, pp. 1347-1352, December 1990.

[6] P. Holloway, “A trimless 16-bit digital potentiometer,” ISSCC Digest of Technical Papers, pp. 66-67, February 1984.

[7] D.W.J. Groeneveld, H.J. Schouwenaars, H. Termeer, “A self calibration technique for monolithic high-resolution D/A converters,” IEEE Journal of Solid-State Circuits, vol. SC-24, pp. 1517-1522, December 1989.

[8] A.W.M. van den Enden, N.A.M. Verhoekx, “Discrete-time signal processing,” Prentice Hall, 1989.

[9] W.R. Bennett, “Spectra of quantized signals,” Bell System Technical Journal, vol. 27, pp. 446-472, July 1948.

[10] M. Schwartz, “Information transmission, modulation, and noise,” McGraw-Hill, 1980.

[11] A.B. Carlson, “Communication systems,” McGraw-Hill, 1975.

[12] K-C. Hsieh, Th.A. Knotts, G.L. Baldwin, T. Hornak, “A 12-bit 1-Gword/s GaAs digital-to-analog converter system,” IEEE Journal of Solid-State Circuits, vol. 22, pp. 1048-1055, December 1987.

[13] G. Wegmann, E.A. Vittoz, “Analysis and improvements of accurate dynamic current mirrors,” IEEE Journal of Solid-State Circuits, vol. 25, pp. 699-706, June 1990.

Design Considerations for a Retargetable 12b 200MHz CMOS Current-Steering DAC

J. Vital, A. Marques1, P. Azevedo, J. Franca

ChipIdea-Microelectrónica, S.A., Porto Salvo, Portugal

Abstract

This paper addresses design considerations for high-speed moderate-to-high resolution current-steering digital-to-analogue converters (DACs) in CMOS technology. A design example of a 12b 200MHz DAC in CMOS digital technology is used to illustrate the design techniques, which are then validated through experimental results obtained from the integrated prototypes. Additionally, some techniques used to render the layout of this DAC easily retargetable are also explained.

1. Introduction

High-speed, medium-to-high resolution digital-to-analog converters (DACs) are essential blocks in graphical interfaces and in many transmit ports of modern communication systems. In these applications, the current-steering DAC architecture has become a widely used platform, owing to its linearity, dynamic behaviour, robustness and power efficiency.

This paper gives an overview of the best-known techniques for designing current-steering DACs, and describes in more detail a specific implementation of a 12b 200 MHz DAC in a CMOS technology. Section 2 is dedicated to various aspects of importance for architecture selection, and in Section 3 the requirements for static performance are analysed. Section 4 is dedicated to the circuit design alternatives, focusing on the implementations used in the 12-bit design example. Finally, the integrated prototype is described in Section 5, together with the experimental results.

1 Augusto Marques was with ChipIdea - Microelectrónica, S.A. until May 2000. Since then he has been with Silicon Laboratories, TX, U.S.A.

2. Architecture Selection

2.1 Basic Architecture

The basic topologies of a current-steering DAC are shown in Fig. 1. The simpler forms of this type of DAC are just digitally programmable current sources/sinks that dump their output current into the load. The resistive part of the load is responsible for the static current-to-voltage conversion function, whereas the capacitive part represents the ultimate limitation for the settling behaviour of the resulting output voltage. As a remark, it is important to notice that most of the implementations use, in fact, two complementary outputs, such that the internal elementary current sources can be steered from one output to the other without the need to be shut off. This is extremely important for achieving the best performance at high frequency.

The basic topologies can be further refined by adding a fixed half-scale current sink/source, such that the total output current can assume both positive and negative values, as represented in Fig. 1(c). Finally, a generic topology employing an additional output block can also be considered, as shown in Fig. 1(d). This block can represent a transimpedance amplifier for current-to-voltage conversion, allowing more flexible driving capabilities, and/or an output re-sampler for more sophisticated output formats other than the zero order hold, which have benefits from the frequency domain performance point of view [1].

In this paper only the simplest topologies will be considered, since they are the most suitable alternatives for very high update rates.

2.2 Decoding Options

One of the most important distinguishing factors in current-steering DACs is the way the digital input data is decoded to drive the internal elementary current sources and implement the digitally programmable output current. The two opposite alternatives for an N-bit DAC are represented in Fig. 2. The simplest scheme is obtained by organising the current sources in N elementary binary weighted current sources, as represented in Fig. 2(a). This requires no decoding, because the bit weights are directly assigned to the current sources, resulting in a digital part with low complexity. However, there are significant drawbacks of this scheme, especially related to major bit transitions [2]. In this topology, major bit transitions result in the most significant bit (MSB) current source being switched to one output and all the others being switched to the other. On one hand, it is very difficult to guarantee good differential nonlinearity (DNL) and monotonicity, since it is necessary to ensure that the MSB current source matches to within 0.5 least significant bit (LSB) the sum of all the other current sources plus one unit. This imposes a very stringent requirement on the allowable mismatch on the elementary current sources. On the other hand, the fact that all the elementary current sources are simultaneously switched at these major bit transitions produces large glitch areas, resulting in large spurious components in the frequency domain.

The opposite decoding scheme uses a thermometer decoder to individually control all the elementary current sources. This has advantages in terms of required matching to guarantee good DNL and monotonicity, because the unit current sources are incrementally switched from one output to the other as required. In addition, the glitch area is proportional to the amplitude of the transition steps in the output current. This means that glitches are linearly related to the signal, resulting in signal filtering rather than distortion [2]. The only drawback of this decoding scheme is area, because the number of decoding elements is now proportional to 2^N. The best alternative for thermometer decoding is achieved by organising the current sources in a matrix and by using fast row-column decoders, which, in turn, address local decoders associated to the current sources.
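The difference between the two decoding options at a major bit transition can be illustrated with a short sketch. It only counts how many current sources change state between two consecutive codes; the function names are illustrative and the 4-bit word is a hypothetical example.

```python
def binary_controls(code, n_bits):
    """Switch states of the N binary weighted sources (LSB first)."""
    return [(code >> i) & 1 for i in range(n_bits)]

def thermometer_controls(code, n_bits):
    """Switch states of the 2**N - 1 unit sources (1 = steered to output)."""
    return [1 if i < code else 0 for i in range(2 ** n_bits - 1)]

def toggles(before, after):
    """Number of sources that change state between two codes."""
    return sum(a != b for a, b in zip(before, after))

if __name__ == "__main__":
    n = 4
    a, b = 0b0111, 0b1000  # major bit transition of a 4-bit word
    print(toggles(binary_controls(a, n), binary_controls(b, n)))
    print(toggles(thermometer_controls(a, n), thermometer_controls(b, n)))
```

With binary weighting every source toggles at the major transition, whereas with thermometer decoding exactly one unit source toggles per LSB step, at the cost of 2^N - 1 control signals instead of N.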

2.3 Segmentation

In order to get the advantages of thermometer decoding without penalising the area of the DAC too much, some sort of segmentation of the input digital word is normally introduced. In the most popular segmentation scheme, the M MSBs address elementary current sources of value using thermometer decoding, whereas the L least significant current sources have a binary weighted arrangement directly addressed by the LSBs [3].


The selection of the right segmentation into thermometer decoding and binary-weighted arrangement depends on the trade-off between such factors as required area for DAC implementation, simplicity of the decoding scheme and tolerable level of dynamic non-idealities. The decoding scheme must be kept simple and compact to allow very high update rates without significant degradation of the dynamic behaviour [4]. A 6-bit (3+3) row-column decoder uses 3-input gates, whereas an 8-bit (4+4) row-column decoder uses 4-input gates. As the number of inputs required in the basic gates is increased, their intrinsic speed is progressively decreased if no pipelining schemes are employed. Therefore, in high-speed practical implementations the number of MSBs involved in the thermometer decoding scheme has been limited to a maximum of 6 to 8.
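To make the segmentation idea concrete, the sketch below splits an input word into thermometer decoded MSBs, thermometer decoded intermediate bits and binary weighted LSBs, and expresses the output in LSB units. The 6+2+4 split is the one adopted for the 12-bit design example of this paper; the function names and the mapping onto ideal weights are illustrative assumptions and say nothing about the physical implementation.

```python
def split_word(code, n_msb=6, n_mid=2, n_lsb=4):
    """Split an input word into (msb, mid, lsb) fields."""
    assert 0 <= code < 1 << (n_msb + n_mid + n_lsb)
    lsb_field = code & ((1 << n_lsb) - 1)
    mid_field = (code >> n_lsb) & ((1 << n_mid) - 1)
    msb_field = code >> (n_lsb + n_mid)
    return msb_field, mid_field, lsb_field

def output_in_lsb(code, n_msb=6, n_mid=2, n_lsb=4):
    """Ideal output in LSB units of the segmented converter."""
    msb_field, mid_field, lsb_field = split_word(code, n_msb, n_mid, n_lsb)
    msb_unit = 1 << (n_lsb + n_mid)  # 64 LSB per MSB source for 6+2+4
    mid_unit = 1 << n_lsb            # 16 LSB per intermediate source
    # msb_field and mid_field count how many thermometer sources are on;
    # lsb_field is the sum of the binary weighted sources that are on.
    return msb_field * msb_unit + mid_field * mid_unit + lsb_field

if __name__ == "__main__":
    print(split_word(0xABC))     # fields of a hypothetical input code
    print(output_in_lsb(0xABC))  # equals the input code for ideal weights
```

For the 6+2+4 case this gives 63 MSB sources, 3 intermediate sources and 4 binary weighted sources, with a factor of 4 between the MSB and intermediate unit values.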

The selected segmentation for the 12-bit design example considered here is 6+2+4 [4]. The M=6 MSBs use a 3+3 row-column decoding to simultaneously address four 6-bit DACs connected in parallel and arranged in a fully symmetrical way with respect to the centre. This specific arrangement, together with the adopted switching scheme, implements an effective compensation for the systematic errors present across the matrix of current sources. This will be further analysed in the next section. The I=2 intermediate bits also use thermometer decoding and are implemented with the non-used current source in the 8×8 matrix of each of the four 6-bit DACs. The required scaling factor of 4 between the MSB elementary current source and the one for the intermediate bits is intrinsically obtained in this way [4]. The remaining L=4 LSBs directly address 4 binary weighted current sources, which are obtained by subdivision of the elementary current sources in the matrices. The overall segmentation is, in fact, logically identical to the one presented in [4], but its electrical and physical implementation was simplified and made more compact.

3. Static Performance

The static behaviour of the DAC is affected by a number of factors, the most important of which are random mismatches in the current sources, systematic errors due to gradients on process, stress and temperature, errors due to voltage drop in the power distribution lines, and finite output impedance. These factors are discussed in the following sections.

3.1 Random Mismatch of Current Sources

Due to the adopted segmentation, the current source matching requirements to satisfy the condition DNL<0.5 LSB are largely relaxed. In fact, the most critical transitions for DNL occur when bit transitions in the MSB segment correspond to a simultaneous change of state of all the other least significant bits. In this particular segmentation scheme, unit current sources must be matched to within 0.5 LSB to other unrelated unit current sources. The condition to satisfy can be written as [5]

The relative standard deviation of the unit current source is obtainedby considering L=4 and I=2 in (1), leading to

The requirements to guarantee a good INL can be obtained with the help of a Monte Carlo analysis that relates the relative standard deviation of the unit current sources to the yield for INL<0.5 LSB. The results of such an analysis with a simple MATLAB model using Gaussian distributions for the unit current sources are presented in Fig. 3, for 8-bit, 10-bit, 12-bit and 14-bit DACs. Closed form analytical expressions to obtain such results have also been derived in [6]. These results depend only on the equivalent number of total unit current sources employed in the DAC, and are independent of the type of segmentation adopted. It can be concluded that the relative standard deviation of the unit current sources must be limited to the bound read from Fig. 3 for a 12-bit DAC to obtain an INL yield of more than 99%. The requirements imposed by the INL condition are, therefore, more stringent than those imposed by the DNL condition, owing to the use of thermometer decoding in the MSBs.
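The Monte Carlo experiment described above can be sketched in a few lines. The following is a minimal illustration and not the authors' MATLAB model: it assumes independent Gaussian unit current sources, an end-point INL definition, and a hypothetical function name `inl_yield`; the trial count and the seed are arbitrary choices.

```python
import random


def inl_yield(n_bits, sigma_rel, trials=200, limit_lsb=0.5, seed=0):
    """Fraction of simulated DACs whose worst-case |INL| stays below limit_lsb.

    Assumption: the DAC is modelled as 2**n_bits - 1 unit sources that are
    switched on one after the other (the segmentation is ignored).
    """
    rng = random.Random(seed)
    n_units = 2 ** n_bits - 1
    passed = 0
    for _ in range(trials):
        # Each unit source has nominal value 1 with relative spread sigma_rel.
        units = [1.0 + sigma_rel * rng.gauss(0.0, 1.0) for _ in range(n_units)]
        lsb = sum(units) / n_units  # end-point fit: average step size
        acc = 0.0
        worst = 0.0
        for code, unit in enumerate(units, start=1):
            acc += unit
            inl = (acc - code * lsb) / lsb  # deviation from the ideal line, in LSB
            worst = max(worst, abs(inl))
        if worst < limit_lsb:
            passed += 1
    return passed / trials


if __name__ == "__main__":
    for sigma in (0.001, 0.005, 0.02):
        print(sigma, inl_yield(10, sigma, trials=100))
```

Sweeping `sigma_rel` for several values of `n_bits` gives yield curves of the kind plotted in Fig. 3; a 12-bit run simply uses `n_bits=12` at the cost of a longer run time.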

Given the above constraint and assuming that a statistical mismatch characterisation of the process [7] is available, the minimum gate area to be used in the MOS transistor that defines the unit current source can be estimated by

On one hand, the overdrive voltage of the MOS transistor defining the current source must be as large as possible to minimise the area of the current sources. On the other hand, the maximum overdrive voltage of these transistors is limited by the headroom available between the supply voltage and the combination of the output full-scale voltage and the various drain-to-source voltages of the saturated transistors in the current source and switch structure. The present 12-bit DAC case study must operate from a minimum supply of 2.7 V. This value, together with an output swing of 0.5 V, limits the overdrive voltage of the current source to a maximum of 0.8 V. The resulting transistor dimensions are then fixed for a full-scale current of 20 mA.
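The trade-off between overdrive voltage and gate area can be made concrete with the form of the area estimate that is commonly derived from the mismatch model of [7]. The sketch below is only an illustration under stated assumptions: the function name `min_gate_area` is hypothetical, and the matching parameters `a_vt` and `a_beta` are placeholder values, not the process data used for the prototype.

```python
def min_gate_area(sigma_rel, a_vt, a_beta, v_ov):
    """Minimum gate area W*L (um^2) for a target relative current spread.

    Assumed model: (sigma_I / I)^2 = (a_beta^2 + 4 * a_vt^2 / v_ov^2) / (2 * W * L)
    sigma_rel : target sigma(I)/I as a fraction
    a_vt      : threshold mismatch parameter in V*um
    a_beta    : current-factor mismatch parameter as fraction*um
    v_ov      : overdrive voltage in V
    """
    return 0.5 * (a_beta ** 2 + 4.0 * a_vt ** 2 / v_ov ** 2) / sigma_rel ** 2


if __name__ == "__main__":
    # Placeholder numbers: 10 mV*um, 2 %*um and a 0.3 % target spread.
    for v_ov in (0.2, 0.4, 0.8):
        print(v_ov, min_gate_area(0.003, 0.010, 0.02, v_ov))
```

The threshold term falls with the square of the overdrive voltage, which is why the largest overdrive allowed by the headroom gives the smallest current-source area.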

3.2 Systematic Errors and Switching Strategy
Systematic errors produced by various processing and environmental factors are well known disturbing elements in current-steering DACs. However, their effects can be partially compensated by using spatial arrangements in the matrix of current sources controlled by the MSBs, together with specific sequences for switching the current sources as a function of the MSB code. Many different strategies have been proposed so far, with different properties of error accumulation depending on the type of spatial error profile considered [2, 4, 8, 9, 10]. From a brief analysis we can conclude that complex switching sequences can be very effective in improving the DC performance of the DAC, but they suffer from a fast degradation of the characteristics when the input signal frequency is increased. A good trade-off between DC performance and switching scheme complexity must be obtained.

The present 12-bit design example uses the spatial arrangement and switching sequence proposed in [4]. The matrix is arranged in four quadrants fully symmetric with respect to the central horizontal and vertical axes, as shown in Fig. 4. This topology implements a two-dimensional (2D) centroid-switching scheme capable of better compensating for 2D linear and parabolic errors [4]. In fact, 2D linear-type gradients are fully cancelled, as is expected from a common-centroid arrangement. For a 2D parabolic gradient, as shown in Fig. 5(a), the resulting INL is represented in Fig. 5(b). This represents an improvement by a factor of more than two when compared to a conventional hierarchical switching scheme.
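The cancellation of linear gradients by the four-quadrant arrangement can be checked numerically. The sketch below is an illustration only: it assumes that each MSB current source is split into four parts placed at positions mirrored about both axes, and the gradient functions and their coefficients are hypothetical.

```python
def centroid_error(x, y, gradient):
    """Average error of one source split over four mirrored positions."""
    mirrored = [(x, y), (-x, y), (x, -y), (-x, -y)]
    return sum(gradient(px, py) for px, py in mirrored) / len(mirrored)


def linear_gradient(x, y):
    # Hypothetical slopes along the two axes.
    return 0.01 * x + 0.02 * y


def parabolic_gradient(x, y):
    # Hypothetical curvature, symmetric about the centre of the matrix.
    return 0.001 * (x * x + y * y)


if __name__ == "__main__":
    print(centroid_error(1.5, 2.5, linear_gradient))     # linear part averages out
    print(centroid_error(1.5, 2.5, parabolic_gradient))  # parabolic part remains
```

The residual parabolic error is the reason why the switching sequence still matters once the common-centroid placement has removed the linear component.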


3.3 I×R Drop Effects
High speed current-steering DACs are normally designed for large full-scale currents, to be able to generate moderate voltage output swings on small resistive loads. Typically, full-scale currents of 10 mA to 20 mA are considered, which means that significant voltage drops can be generated along the power distribution lines in the matrix of current sources if these lines are not well sized. This fact can be responsible for a degradation of the INL characteristic in high-speed high-resolution current steering DACs.

To correctly estimate the sizing of the power lines in the 12-bit case study DAC under analysis, a model of the interconnections was used. Fig. 6 shows the distribution of the voltage drop of the positive supply across the matrix of current sources, and its influence on the INL with the adopted 2D centroid-switching scheme.
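A very reduced version of such an interconnection model is sketched below. It is not the model used for Fig. 6: it assumes a single supply rail fed from one end, identical cell currents and identical segment resistances, and the function name `rail_voltage_drops` is hypothetical.

```python
def rail_voltage_drops(n_cells, cell_current, seg_resistance):
    """Voltage drop seen by each cell along a rail fed from one end.

    Segment k carries the current of cell k and of every cell after it,
    so the drop accumulates fastest near the feed point.
    """
    drops = []
    drop = 0.0
    for k in range(n_cells):
        drop += seg_resistance * cell_current * (n_cells - k)
        drops.append(drop)
    return drops


if __name__ == "__main__":
    # Placeholder values: 8 cells of 0.3 mA each, 50 mOhm per rail segment.
    for position, drop in enumerate(rail_voltage_drops(8, 0.3e-3, 0.05)):
        print(position, drop)
```

The resulting profile is parabolic along the rail, which is the kind of systematic error that the 2D centroid-switching scheme is intended to attenuate.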


3.4 Output Impedance
The finite output impedance of the current source is the last effect considered in this paper for static performance degradation. It is a source of linearity degradation, since the value of each unit current source will be a function of the output voltage of the DAC. It can easily be concluded that the DAC output voltage can be expressed as [3]

where the input code corresponds to the number of unit current sources switched to the output, and the remaining quantities are the resistive output load and the output conductance of the unit current source. The condition to meet INL<0.5 LSB is then given by

This condition is normally easy to satisfy using cascode current source topologies, due to the large channel lengths required to satisfy the conditions imposed by matching on the current sources. Furthermore, the nonlinearity present in (3) predominantly generates second order harmonic distortion, which in differential applications is further suppressed. In the 12-bit DAC under analysis, the output load is a doubly terminated cable, which sets the minimum total output impedance required of the DAC.

4. Circuit Design for Dynamic Performance

4.1 Current-Source Design
In order to satisfy the output impedance requirements for the 12-bit DAC under consideration, a cascode current source must be considered. In addition, it was decided to implement the DAC current sources with PMOS transistors since the output voltage is naturally built on a grounded load. Moreover, improved substrate noise isolation could be achieved, although this may only be effective at low frequencies. Fig. 7 shows two possible alternatives for PMOS cascode current source implementation. The alternative in Fig. 7(b) was proposed in [4] as a means to reduce the feedthrough of the control signals q and qz to the outputs. However, this topology has an inherent asymmetry between the falling and rising transitions, because the pole at the source of the cascode transistor starts to move to lower frequencies when the current in the corresponding branch is cut off. The falling transition therefore settles more slowly than the rising transition, which is a potential source of distortion. The implementation presented in this paper adopted the more conventional current source topology represented in Fig. 7(a).

4.2 Switch Drivers
The control nodes q and qz must swing in a limited voltage range, sufficient to steer the tail current from one branch to the other while keeping the feedthrough from these nodes to the outputs at controlled levels. In addition, the impedance defined at these nodes is important, as it may be responsible for low-frequency poles degrading the settling behaviour of the DAC. In this design the lower level of this range was set to a clean analogue ground to simplify the switch driving scheme. For a single-ended 0.5 V output swing, this choice guarantees that the PMOS switches are saturated when steering the current to the output. This represents another contribution to increase the output impedance of the DAC. In addition, the low impedance of the ground node results in a good settling behaviour of the steering action. The generation of the upper level, less critical for the settling behaviour, is performed by diode connected NMOS transistors biased at a constant current. The complete switch driver scheme is shown in Fig. 8.

4.3 Synchronization and Timing Equalization
The synchronization of the switching instants in all the elementary current sources in the DACs is fundamental to get a good dynamic performance and a low glitch area. Additionally, as PMOS switches are employed, the crossing point of the control signals q and qz must be kept low to guarantee that the switches never cut off simultaneously. A clocked latch scheme can be used to provide the required synchronisation together with the necessary overlapping between q and qz. In this implementation a ratioed-logic latch was used, as depicted in Fig. 9 [4].


This scheme easily guarantees that synchronisation is achieved in the elementary current cells in the matrix. However, to further reduce the glitch area in code transitions, it is also necessary to equalise the switching timing in the binary-weighted LSB current sources to the timing present in the matrix, because the circuitry is scaled down. In theory, this could be performed by scaling down the latch according to the load imposed by the switch drivers in the LSB section. As this is not easy to achieve, a different strategy was adopted in this implementation. The latches and switch drivers in the LSB section are exactly the same as in the matrix, and the switches, whose gate widths are scaled down in the same proportion as the current sources, are complemented with the geometries removed in the scaling-down process as dummy structures. This results in an effective timing equalisation and also in a simple and very regular implementation in the layout, which also improves timing.

5. Integrated Prototype and Measured Results

5.1 Retargetable DAC Prototype
The layout of the DAC prototype in a digital CMOS technology is shown in Fig. 10. It follows the same basic principle as the DAC described in [4], which consists of moving the latches and local decoders out of the matrix for improved matching of the current sources and reduced coupling of digital circuitry into the sensitive analogue part. The organization of the matrix closely follows the explanation in Fig. 4. Two additional columns and rows of dummy cells have been added to the surrounding edges of the matrix for improved matching. The LSB section of the current sources is implemented in one of these dummy rows on the top, while the current mirrors for bias generation are implemented in the dummy rows on the bottom. The switches, switch drivers, latches, local decoders and column and row decoders were organised in a stack along the left side of the matrix, leading to a very compact layout. The core cell area is 1 mm × 2 mm.


The layout of this DAC was developed for easy retargetability. This was made possible by its own modularity and by conceiving a number of parameterised cells integrated in Cadence Design Framework. The full layout can be instantly modified for different sizing of current sources, decoding logic and driving circuitry. Some examples of different instantiations of parameterised cells used in the DAC layout are presented in Fig. 11.


5.2 Test Set-up
An RF test set-up was prepared for the characterisation of the high-speed DAC prototypes. The die was mounted on a ceramic substrate containing local terminating resistors for the digital signals and for the output voltages, and also decoupling capacitors for the supplies. This assembly was enclosed in a metallic case, as shown in Fig. 12.


The full-scale current was set by an external precision current source, and the output load was defined by the local terminating resistor together with a cable terminated by the equipment.

The input data was supplied to the DAC by a high-speed pattern generator, and the measurements of the output were performed in either single-ended or fully-differential mode, depending on the test.

5.3 Static Characteristics
For the static characteristics, the DAC was clocked at the nominal rate of 200 MHz and a very slow staircase was applied to its input code, while the single-ended output was measured by a digital multimeter. The resulting INL and DNL characteristics obtained with a best straight line method are presented in Fig. 13. The INL is within ±0.65 LSB, while the DNL is less than ±0.3 LSB. The good results are an indication that the adopted measures for improving static performance were effective.

5.4 Frequency Domain Performance
The frequency domain characteristics have been obtained by programming full-scale sinewaves of various frequencies in the pattern generator running at the nominal rate of 200 MHz, and by coupling the differential output of the DAC to a high frequency spectrum analyser by means of a wide-band transformer.

The first results presented in Fig. 14 were obtained with a low noise-floor spectrum analyser, and reflect the performance of the DAC operating at the nominal update rate of 200 MHz with output frequencies up to 20 MHz. In these conditions the spurious free dynamic range (SFDR) is always above 70 dBc.


For higher output frequencies a high-bandwidth spectrum analyser was used. The result in Fig. 15(a) corresponds to the same situation indicated in Fig. 14(e) and is included here to compare the type of noise floor existing in the high-bandwidth spectrum analyser. Fig. 15(b) shows that the SFDR of the DAC clocked at 200 MHz starts to fall very quickly for output frequencies above 20 MHz. At 40 MHz the SFDR is 51 dBc.

Although the DAC was designed for a nominal clock rate of 200 MHz, the design had to satisfy all the corners of process, temperature and supply voltage. This means that in typical conditions the frequency of operation can be higher. Fig. 15(c) and Fig. 15(d) show the type of performance that can be reached at 500 MHz and 800 MHz clock rate.

5.5 Power Dissipation
The current consumption can be divided into a static part, which is independent of the clock rate and input activity, and a dynamic part. In this prototype the measured static current consumption is 40 mA, while the dynamic current consumption is 20 mA for a clock rate of 200 MHz and an output frequency of 20 MHz. This leads to a total power dissipation of 180 mW at 3 V power supply. The overall characteristics of the DAC are summarised in Table I.

6. Conclusions
Design considerations have been presented for high-speed current-steering DACs, with a special focus on a specific implementation of a 12-bit 200 MHz DAC in a CMOS digital technology. The presented design techniques were supported by experimental results of the integrated prototype. Some considerations for layout retargetability of such DACs were also introduced in this presentation.

Acknowledgements
The authors would like to thank ESAT-MICAS, K.U. Leuven, and in particular Prof. W. Sansen, for having kindly supported the experimental characterisation of the prototypes at the Laboratory of the University. The contributions of P. Jesus to the design and characterisation of the prototypes are also acknowledged.


References
[1] A. Bugeja, B.-S. Song, P. Rakers, S. Gillig, "A 14b 100Msample/s CMOS DAC Designed for Spectral Performance", Proc. ISSCC 1999, pp. 148-149, Feb. 1999.
[2] C.-H. Lin, K. Bult, "A 10-b, 500-Msample/s CMOS DAC in 0.6 mm²", IEEE JSSC, Vol. 33, No. 12, pp. 1948-1958, Dec. 1998.
[3] T. Miki, Y. Nakamura, M. Nakaya, S. Asai, Y. Akasaka, Y. Horiba, "An 80-MHz 8-bit CMOS D/A Converter", IEEE JSSC, Vol. SC-21, No. 6, pp. 983-988, Dec. 1986.
[4] J. Bastos, A. Marques, M. Steyaert, W. Sansen, "A 12-bit Intrinsic Accuracy High-Speed CMOS DAC", IEEE JSSC, Vol. 33, No. 12, pp. 1959-1969, Dec. 1998.
[5] A. Bosch, M. Borremans, M. Steyaert, W. Sansen, "A 10-bit 1-GSample/s Nyquist Current-Steering CMOS D/A Converter", IEEE JSSC, Vol. 36, No. 3, pp. 315-324, Mar. 2001.
[6] A. Bosch, M. Steyaert, W. Sansen, "An Accurate Statistical Yield Model for CMOS Current-Steering D/A Converters", Proc. IEEE ISCAS 2000, Vol. IV, pp. 105-108, May 2000.
[7] M. Pelgrom, A. Duinmaijer, A. Welbers, "Matching Properties of MOS Transistors", IEEE JSSC, Vol. 24, No. 5, Oct. 1989.
[8] Y. Nakamura, T. Miki, A. Maeda, H. Kondoh, N. Yazawa, "A 10-b 70-MS/s CMOS D/A Converter", IEEE JSSC, Vol. 26, No. 4, pp. 637-642, Apr. 1991.
[9] T.-Y. Wu, C.-T. Jih, J.-C. Chen, C.-Y. Wu, "A Low-Glitch 10 bit 75-MHz CMOS Video D/A Converter", IEEE JSSC, Vol. 30, No. 1, pp. 68-72, Jan. 1995.
[10] J. Vandenbussche, G. Plas, A. Bosch, W. Daemens, G. Gielen, M. Steyaert, W. Sansen, "A 14b 150Msample/s Update Rate Q² Random Walk CMOS DAC", Proc. ISSCC 1999, pp. 146-147, Feb. 1999.

HIGH SPEED CMOS DA CONVERTERS FOR UPSTREAM CABLE APPLICATIONS

Raf ROOVERS

Philips Research, Prof. Holstlaan 4, 5656 AA Eindhoven, The Netherlands.

ABSTRACT

In the first part of this paper the function of Analog to Digital (AD) and Digital to Analog (DA) converters in communication systems is discussed. The relation between the system data rate and the converter data rate is explored and the impact on the power dissipation in the AD and DA converter is shown. The second part describes the realisation of a DA converter for cable upstream application.

1. INTRODUCTION

The demand for higher data bandwidth delivered to the home is a driving factor for AD and DA converter design. While the developments in AD and DA converter design evolved from audio and video signal converters in the eighties, a driving force in converter design is nowadays found in digital communication systems, both wired and wireless. To avoid the costs and delays associated with laying new cables to the homes, systems are built to get the highest bandwidth out of the existing physical links to the homes: telephone and cable wires. The increasing possibilities of digital signal processing made systems feasible that maximise the transmitted data rate up to the theoretical limits. These are imposed by the physical constraints of the transmission medium related to the bandwidth and signal to noise limitations.



Various standards have emerged that coexist with already present services over copper wires and put specific requirements on AD and DA converters. After some general considerations on the AD and DA converters for digital communication systems, a specific design for cable upstream is discussed.

2. HISTORY AND EVOLUTION OF SYSTEMS

The evolution of copper wired digital data communication to the home is shown in figure 1. It started in the eighties with low bit rate voice band modem communication and led to modem speeds of 33.6 kbit/s, which is close to what can be theoretically achieved according to the Shannon information theorem in the telephone voice band. To increase data rates over existing phone wires, ISDN was defined, with data rates that are still far below the theoretical limits of the telephone wires. Presently the channel limits are nearly reached with the developments in xDSL technology with its different flavours. The only way to increase the data rate any further is to use another type of wire with higher bandwidth and/or better SNR.
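The Shannon limit mentioned above can be evaluated directly from bandwidth and SNR. The sketch below is only an illustration: the function name is hypothetical, and the voice band figures in the example (3.1 kHz and 35 dB) are assumed values that are not taken from this paper.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Channel capacity in bit/s: C = BW * log2(1 + SNR)."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


if __name__ == "__main__":
    # Assumed voice band channel: 3.1 kHz bandwidth, 35 dB SNR.
    print(shannon_capacity(3.1e3, 35.0))
```

With these assumed figures the capacity comes out in the same range as the quoted modem speed, and the formula shows why only more bandwidth or a better SNR can raise the data rate further.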

Apart from telephone wire, cable networks connect homes with a high bandwidth cable, offering a theoretical capacity that is orders of magnitude larger than phone wires. Originally the cable system was intended for one way broadcast services. When this cable is used to deliver bits to and from the home, its wide bandwidth has to be shared with other services and users. The effective available bandwidth for a single user is therefore orders of magnitude lower than the total cable capacity, but it is usually higher than the telephone wire capacity.

In contrast to the past, the physical constraints of the wires have become the limitation of the data rate. In order to get the maximum amount of data through the wire, both analog and digital signal processing are needed on both sides, as shown in figure 2. Moore's law made digital signal processing abundantly available to implement complex coding and filtering, while coding schemes have been developed that maximise data throughput and minimise symbol interference in the link. The digital signal has to be converted to an analog signal and conditioned to the required levels according to the wire properties. Both the DA and AD converters have data rates (n·fs) that are a factor above the net system data rate and can even exceed the wire capacity.

3. AD AND DA CONVERTER: BIT/S , BW AND SNR

As AD and DA converters link the analog world to the digital world, it is here that analog bandwidth (BW) and signal to noise ratio (SNR) meet digital bandwidth or bit/s. This is shown in figure 3. The AD or DA converter can be regarded as a limitation to the analog BW and SNR, which is shown in the SNR versus input signal frequency graph. This SNR (or SINAD) defines an effective number of bits (enob) while the effective resolution bandwidth (ERBW) defines the useful bandwidth of the converter. For an ideal converter the conversion capacity equals the number of bits at the output of the converter: i.e. every bit is a unit of information. For a real AD converter, this is somewhat lower as both enob<n and ERBW<fs/2.

4. AD AND DA CONVERTER PERFORMANCE

The performance of AD and DA converters is graphically plotted in a bandwidth-resolution graph (see figure 4). In this type of graph the bandwidth (X-axis) is defined as the useful bandwidth of the converter, i.e. the minimum of effective resolution bandwidth and Nyquist rate, while the resolution (Y-axis) is defined as the effective resolution for low frequency input signals: enob=(SINAD-1.76dB)/6.02 with SINAD the signal to noise and distortion for low frequency input signals. As a reference point, state of the art AD and DA converters from JSSC 89-90 and JSSC 99-00 are plotted [JSSC]. Steady progress has been made during the decade towards higher resolution and bandwidth.
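The two axes of the bandwidth-resolution graph can be computed from measured quantities as follows. This is a small sketch with hypothetical function names; it assumes that "Nyquist rate" in the definition above refers to half the sample rate.

```python
def enob_from_sinad(sinad_db):
    """Effective number of bits from low-frequency SINAD (dB)."""
    return (sinad_db - 1.76) / 6.02


def useful_bandwidth(erbw_hz, sample_rate_hz):
    """Useful bandwidth: the smaller of ERBW and half the sample rate (assumed)."""
    return min(erbw_hz, sample_rate_hz / 2)


if __name__ == "__main__":
    print(enob_from_sinad(61.96))
    print(useful_bandwidth(120e6, 200e6))
```

A converter is then represented in the graph by the point (`useful_bandwidth`, `enob_from_sinad`).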

The two regions audio and video indicate the performance required for AD-DA conversion of these signals, which is determined by the BW and SNR of microphone and image sensor signals. The audio converters became mainstream in the early eighties and video converters in the late eighties and early nineties. Also plotted in the same graph are the requirements for direct AD and DA conversion of a transmitted signal: twisted pair, coax cable. It is clear that direct conversion of a cable signal (1 GHz, 60 dB) with a single AD or DA converter is not yet feasible. Hence for cable signals, analog signal processing blocks are required to bring part of the cable signal into the bandwidth-resolution area where AD-DA conversion is feasible. On the other hand, even if conversion with a single AD or DA converter is feasible, this is not always the optimal solution from a power consumption point of view.

5. ANALOG-DIGITAL SIGNALS IN A COMMUNICATION LINK

The effective AD or DA bit rate is at least equal to the net system data rate, and its maximum is related to the capacity of the transmission medium. In general the effective data rate of the AD and DA converters is positioned somewhere in between, depending on the architectural choices. The position of the AD and DA within the signal path determines the factor by which the AD or DA effective bit rate exceeds the actual net system data rate. This factor is called the implementation factor, as shown in figure 5. The implementation factor indicates how much of the signal conditioning is done in either the analog or the digital domain. Wired single user systems show the lowest factors while cellular (multi-user, wireless) systems have the highest factor. For some system realisations the implementation factor is indicated in table 1. As long as AD-DA power dissipation is not dominant in the system, architectures with a higher factor are preferred as these reduce the analog functionality in favour of more flexible digital functionality. Hence it is important to know the power dissipation as a function of data rate.


6. POWER DISSIPATION OF CONVERTERS IN A COMMUNICATION LINK

The estimation of power dissipation of AD and DA converters is a rather difficult task as bandwidth and resolution can span several decades. The presented formulas take into account only the effective resolution, the bandwidth and the power consumption, and exclude all other parameters (input range, technology used, need for external components or trimming, ...), as shown in figure 6. A figure of merit is calculated from existing AD and DA realisations based on the formula in figure 6.

The same formula can be used to predict the power dissipation in an AD converter based on the effective data rate:

This formula states that power dissipation is proportional to the effective AD performance in signal to noise ratio and useful bandwidth. A state of the art realisation has a figure of merit ranging from 1 to 10 pJ. For DA converters a similar figure of merit is defined, with state of the art values ranging from 0.5 to 10 pJ.
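As the formula of figure 6 is not reproduced here, the sketch below assumes the widely used figure of merit of the form FoM = P / (2^enob · 2·BW), expressed in joules per conversion step. The function name and the example operating point are hypothetical.

```python
def converter_power(fom_joule, enob, bandwidth_hz):
    """Estimated power (W), assuming P = FoM * 2**enob * 2 * BW."""
    return fom_joule * (2 ** enob) * 2 * bandwidth_hz


if __name__ == "__main__":
    # Hypothetical operating point: 10 effective bits over 100 MHz.
    for fom in (1e-12, 10e-12):
        print(fom, converter_power(fom, 10, 100e6))
```

Under this assumption every additional effective bit doubles the power, while the bandwidth enters only linearly.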

The power dissipation can be plotted in the bandwidth-resolution graph as lines of equal power dissipation, as shown in figure 7. It is interesting to compare these lines with lines of equal information capacity (bit/s): both the power dissipation and the information capacity are linearly proportional to the bandwidth, but the SNR proportionality is different due to the log function in the information capacity expression.

Based on figure 7 an energy per information bit can be calculated by dividing the power dissipation by the information capacity and normalising it for BW. This is an ideal case with the assumption that the implementation factor is 1: i.e. an effective output bit of the AD-DA equals an information bit. This also implies theoretically optimal coding and an ideal analog circuit implementation without any losses.

7. AD-DA CONVERSION POWER PER BIT

Finally all these assumptions can be used to make a prediction of the power dissipation of AD-DA converters in a telecom link as a function of the net data rate, the implementation factor and the bandwidth used, together with the introduced FoM. In figure 8 the power dissipation (Y-axis) is plotted as a function of data rate (X-axis) for a given bandwidth and implementation factor. It shows that for relatively low data rates the implementation factor is not that important and can be chosen high. However, for high data rate systems, the right choice of implementation factor can make a big difference in power dissipation.
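The dependence of converter power on net data rate and implementation factor can be explored with a small model. This sketch is not the calculation behind figure 8: it assumes that the effective converter bit rate equals the implementation factor times the net data rate, that this bit rate equals 2·BW·enob, and that the power follows P = FoM · 2^enob · 2·BW. All names and numbers are hypothetical.

```python
def link_converter_power(net_rate_bps, impl_factor, bandwidth_hz, fom_joule):
    """Estimated converter power (W) for a given net data rate.

    Assumptions: effective bit rate = impl_factor * net_rate_bps = 2 * BW * enob,
    and P = FoM * 2**enob * 2 * BW.
    """
    effective_rate = impl_factor * net_rate_bps
    enob = effective_rate / (2 * bandwidth_hz)
    return fom_joule * (2 ** enob) * 2 * bandwidth_hz


if __name__ == "__main__":
    bandwidth = 10e6   # hypothetical 10 MHz channel
    fom = 1e-12        # hypothetical 1 pJ figure of merit
    for rate in (1e6, 10e6, 50e6):
        low = link_converter_power(rate, 1, bandwidth, fom)
        high = link_converter_power(rate, 4, bandwidth, fom)
        print(rate, low, high, high / low)
```

In this model the ratio between the two implementation factors stays close to one at low data rates and grows exponentially at high data rates, which matches the qualitative conclusion drawn from figure 8.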


8. CABLE MODEM UP-STREAM: FUNCTIONALITY AND STANDARDS

A part of the cable frequency spectrum is reserved for upstream data communication as shown in figure 9. This spectrum is shared by several homes and different standards have emerged (Davic, Harmony, Docsis, ..) that define the channel bandwidths, access schemes, modulation scheme, power control, out of band spurious, ... These standards are developed to coexist with other services on the cable. QPSK and QAM modulation schemes have been proposed with different data rates, channel allocation and power control.


9. WHERE TO PLACE THE DA IN SIGNAL PATH ?

The balance of what part of the signal processing is done in the analog or the digital domain is determined by several factors: the feasibility of the DA converter function; the digital complexity versus the analog complexity of the function (that is, silicon area); and the power dissipation for a digital versus an analog implementation of the function.

For a QAM 16 modulation scheme the different options are shown in figure 10.

A first option is to use two DA converters with a relatively low number of bits and sample rate. The implementation factor is low but additional analog circuits are required for upconversion, filtering and amplification. A second approach is to directly generate the modulated carrier in the digital domain, requiring a DA converter with about 10 bit resolution and a sample rate of about 200 MS/s. The implementation factor is higher but the upconversion is done in the digital domain, offering a higher flexibility. A third approach is to implement also the complete variable gain range in the digital domain, requiring more than 16 bit of resolution. This last option is not feasible from a DA converter design point of view. The second is very well feasible and turns out to be a cost effective and flexible solution: modulation and upconversion are implemented digitally and the variable gain is analog.


10. CONDITIONS FOR DA SPECIFICATIONS
DA converters for telecom applications require dynamic performance. These converters are often specified by distortion and spurious free dynamic range when a single full scale sinewave is applied. In the actual application a modulated signal is generated with the DA, requiring spectral purity under these conditions. The use of a single full-scale sinewave for testing or specifying the DA converter is not completely adequate. This is illustrated in figure 11 by showing a QPSK signal and its square component both in time and frequency domain. The square component has a power of 0.62 and a power density of 0.47 compared to the QPSK signal.

Figure 12 shows the second and third order distortion components of -50 dBc of a single sinewave applied to a DAC together with the distortion components of a modulated signal applied to the same DAC.


11. EXAMPLE OF UPSTREAM DA REALISATION
The realisation of an integrated upstream signal path is shown in figure 13. It is based on a 10 bit DA core and a variable gain amplifier (VGA). The DA and VGA are realised in a CMOS technology and require no process options such as thick oxide transistors, dual or double poly. The power supply is 2.5 V for the complete design. The nominal sample rate is 162 MS/s, which is sufficient for channel frequencies up to 42 MHz. The 10 bit DAC core provides a 0.8 Vpp differential signal level and the VGA adds 12 dB scaling. This results in a 0.3-1.2 Vpp differential signal swing in 75 ohm load resistors. The remaining part of the power control is realised in the external line driver. The complete realisation of power control on CMOS is not feasible due to the required voltage levels.
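The numbers quoted above can be cross-checked with two small helpers. This is only a worked check with hypothetical function names: it converts the 12 dB scaling range to a voltage ratio, compares it with the 0.3-1.2 Vpp swing, and verifies that the 42 MHz channel lies below half the 162 MS/s sample rate.

```python
import math


def db_to_voltage_ratio(gain_db):
    """Voltage ratio corresponding to a gain in dB."""
    return 10 ** (gain_db / 20)


def below_nyquist(sample_rate_hz, signal_hz):
    """True if the signal frequency is below half the sample rate."""
    return signal_hz < sample_rate_hz / 2


if __name__ == "__main__":
    print(db_to_voltage_ratio(12.0))     # ratio provided by 12 dB of scaling
    print(20 * math.log10(1.2 / 0.3))    # dB range of the 0.3-1.2 Vpp swing
    print(below_nyquist(162e6, 42e6))
```

The 12 dB range corresponds to a voltage ratio of about four, which agrees with the ratio between the largest and smallest output swing.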


12. CIRCUIT REALISATIONS

The integrated upstream signal path required a gain of 12 dB and a resolution of 10 bit in the converter. The DAC core and the VGA are separated by a Track and Hold (T/H) to de-couple the static and dynamic performance requirements. The timing accuracy is now localised in one switch (clk2) while the actual DAC core can operate with a less demanding clock signal. Only a 2.5 V power supply was available, which significantly reduces the available internal signal swing. The realisation consists of three circuit parts: the DAC core, a Track and Hold (T/H) as de-glitcher and the VGA, as shown in figure 14.

The 10 bit DAC core circuit is shown in figure 15. It consists of an array of 32x32 p-MOS current sources combined in a 6b binary / 4b segmented configuration. Transistors are sized for a static accuracy level (INL) of 0.3 LSB. The output of the current source array is internally converted into a voltage with a swing of 0.8 Vpp differential.
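The relation between the segmentation and the size of the current source array can be verified by counting unit sources. The sketch below interprets "6b binary / 4b segmented" as 4 thermometer-coded MSBs and 6 binary-weighted LSBs, which is an assumption, and the function name is hypothetical.

```python
def segmented_unit_counts(thermo_bits, binary_bits):
    """Unit-source counts per switched element of a segmented DAC.

    Returns (thermometer sources, binary sources), each as a list of
    weights expressed in LSB unit sources.
    """
    thermo_weight = 2 ** binary_bits
    thermo_sources = [thermo_weight] * (2 ** thermo_bits - 1)
    binary_sources = [2 ** bit for bit in range(binary_bits)]
    return thermo_sources, binary_sources


if __name__ == "__main__":
    thermo, binary = segmented_unit_counts(4, 6)
    print(len(thermo), sum(thermo), sum(binary), sum(thermo) + sum(binary))
```

With this interpretation the converter uses 1023 unit sources in total, which fits in the 32x32 array of 1024 cells.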


The T/H is added as a de-glitcher as shown in figure 16, reducing the demands on the clock accuracy for the DAC core. The DAC core switches can be driven by a digital clock, while the T/H uses the only accurate clock signal required. The T/H is based on a simple n-MOST switch in between the input and output buffers.

The variable gain amplifier / output buffer consists of a degenerated voltage to current converter and the gain steps are realised by switching additional transistors in a current mirror. Figure 17 shows the transistor implementation of the VGA.


The dynamic performance is limited by the harmonics in the VGA. Figure 19 shows a measured QPSK signal with 160 kSymbols/s. The carrier frequency is such that aliased distortion components occur in the neighbouring channel. A measurement of a single sine wave is shown in figure 20.


13. CONCLUSIONS

Some considerations on AD and DA power consumption are made based on the figure of merit, the implementation factor in a digital communication system frontend and the net data rate. Very high data rate communication systems will require a low or moderate implementation factor to keep the power consumption in the AD or DA affordable, which puts a limit on the digitisation of the frontend. A cable upstream signal path implementation is shown that uses a Track and Hold as a de-glitcher and has a part of the required power control range on chip.

14. REFERENCES

[JSSC] Selected AD and DA papers from Vol. 24, Vol. 25, Vol. 34 and Vol. 35 of IEEE Journal of Solid State Circuits.

SOLVING STATIC AND DYNAMIC PERFORMANCE LIMITATIONS FOR HIGH SPEED D/A CONVERTERS

Anne Van den Bosch, Michiel Steyaert and Willy Sansen
Katholieke Universiteit Leuven, Afdeling ESAT-MICAS
Kasteelpark Arenberg 10, 3001 Heverlee, BELGIUM

ABSTRACT

In this paper the factors determining the static and the dynamic performance of a current-steering CMOS D/A converter are discussed. The impact of these factors is translated into design guidelines that have to be followed in order to realize a D/A converter with state-of-the-art performance.

1. INTRODUCTION

The recent growth of the telecommunication market pushes the designer to put an increasing amount of effort into the integration of digital and analog systems on one chip. Consequently, the interface between these systems is becoming one of the most challenging blocks to design in the telecommunication devices of today. High performance D/A converters find applications in the area of broadband and wireless communications. Because they are inherently fast and cost effective, CMOS current-steering D/A converters are the ideal candidates for such applications.

Until a few years ago, open literature used to mention mainly static specifications of D/A converters [1,2]. Recent publications [3,4,5] have revealed that a combination of a high update rate, a high resolution and a good linearity up to the Nyquist frequency is difficult to achieve. Except in [6], where 60 dB of spurious free dynamic range (SFDR) was achieved up to a 200 MSample/s clock, the degradation of the output signal linearity starts at a few (tens of) MHz.

In this paper both the static and the dynamic performance of a current-steering D/A converter will be discussed in detail. In the first section, the D/A converter's basic operation principles and the topology selection are discussed. Based on this analysis, some high speed 10 and 12-bit D/A converters have been realized that show a significant dynamic performance improvement. A 12-bit realization will be presented in section 6 of this paper as an example. Finally, a figure of merit will be defined that provides a fair method for comparing the performance of D/A converters.

2. CURRENT-STEERING TOPOLOGY TRADE-OFF

Current steering D/A converters are based on an array of matched current sources that are switched to the output according to the digital input code. Three different architectures are possible depending on the implementation of this array, namely the binary, the unary and the segmented architecture. Each architecture will be briefly discussed, including some advantages and disadvantages. A comprehensive schematic overview is given in Fig. 1.

2.1. Specifications

The DNL error is the worst case deviation from an ideal one LSB step between two subsequent output codes. The INL error is defined as the maximum deviation from the D/A converter's ideal transfer function. Both the INL and the DNL errors are static non-linearity specifications that determine the limit of the D/A converter's performance at low frequencies.

The INL specification imposes the same requirement independent of the architecture. The influence of this specification on the design of a current-steering D/A converter is described in detail in section 3.1, where the relation between the technological matching properties and the performance of the D/A converter is presented.


2.2. The Binary Weighted Architecture

In the binary implementation, every switch controls a current that is twice as large as that of the next less significant bit. The digital input code directly steers these switches. The advantages of this architecture are its simplicity (since no decoding logic is necessary) and the small required silicon area. On the other hand, a large DNL error and an increased dynamic error are intrinsically linked with this architecture. At the half scale transition, 2^(N-1) unit sources are switched on/off and 2^(N-1) - 1 other, independent sources are switched off/on. Assuming a normal distribution for the unit current sources with a standard deviation sigma(I), this step has a standard deviation determined by:

$$\sigma(\Delta I) = \sqrt{2^{N-1} + \left(2^{N-1} - 1\right)}\;\sigma(I) = \sqrt{2^{N} - 1}\;\sigma(I) \qquad (1)$$

This sigma is a good approximation for the DNL. The sigma at this most significant bit transition is larger than at the other bit transitions, by a factor that depends on N and x (with N the total number of bits and x the number of the most significant switching bit).

2.3. The Unary Decoded Architecture

In the unary decoded architecture every unit current source is addressed separately. The digital input code is converted to a thermometer code that controls the switches. The advantages of this architecture are its good DNL error and the small dynamic switching errors. In this architecture, the D/A converter has a guaranteed monotonic behavior, since only one additional current source has to be switched to the output for one extra LSB. The major disadvantage of the unary decoded architecture is the complexity, the area and the power consumption of the thermometer decoder.

Performing similar calculations as in (1) for the unary architecture leads to the following result:

$$\sigma(\Delta I) = \sigma(I) \qquad (2)$$

This formula mathematically represents the idea behind the unary decoding. The error between two consecutive codes is just the deviation on the additional unity current source. The DNL was defined as the maximum deviation at a single LSB transition. For an N bit converter, this means that the DNL is determined by the maximum over the samples, one per LSB transition, taken from a normal distribution with the sigma defined in (2).

2.4 The Segmented Architecture

To get the best of both worlds, most current-steering D/A converters are implemented using a segmented architecture. In this case, the D/A converter is divided into two sub-DACs: the B LSBs (least significant bits) are implemented using a binary architecture, while the (N-B) MSBs (most significant bits) are implemented in a unary way. In this architecture, a balance can be found between good static and dynamic specifications on the one hand and a reasonable decoder power, area and complexity on the other. Since the segmented architecture is a mixture of the previous two architectures, the result for the most critical transition is of the same form:

$$\sigma(\Delta I) = \sqrt{2^{B+1} - 1}\;\sigma(I) \qquad (3)$$

Note that formula (3) for the segmented architecture is a general formula that is also valid for the binary (B=N-1) and the unary (B=0) implementation.
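The trade-off between the three architectures can be made concrete with a short numerical sketch. It assumes that at the most critical transition of a segmented converter one unary source of 2^B units switches against the 2^B - 1 units of the binary part, all independent, which reproduces the binary (B=N-1) and unary (B=0) limits stated above. The function name and the example sigma value are illustrative only.

```python
import math


def worst_step_sigma(num_binary_bits: int, sigma_unit: float) -> float:
    """Sigma of the most critical code step, expressed in unit currents.

    Assumes 2**B units (one unary source) switch on while the
    2**B - 1 units of the binary part switch off, all independent.
    """
    switching_units = 2 ** (num_binary_bits + 1) - 1
    return math.sqrt(switching_units) * sigma_unit


if __name__ == "__main__":
    n_bits = 12
    sigma_unit = 0.0026  # hypothetical relative unit-current sigma
    for b in (0, 7, n_bits - 1):  # unary, 5+7 segmented, fully binary
        print(f"B={b:2d}: step sigma = {worst_step_sigma(b, sigma_unit):.4f} LSB")
```

Moving bits from the binary to the unary part lowers the step sigma by roughly a factor sqrt(2) per bit, at the cost of a larger thermometer decoder.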

3. THE STATIC PERFORMANCE OF A CURRENT-STEERING CMOS D/A CONVERTER

3.1 The influence of random mismatch

Due to the mismatch of the current source transistors, the INL specification of different D/A converters made in the same process technology will vary randomly. It is therefore important to be able to predict this specification within certain boundaries. For this purpose, the concept of the D/A converter's INL_yield has been introduced. This yield is defined as the percentage of functional D/A converters with an INL specification smaller than half an LSB (least significant bit). To obtain an accurate estimation of the INL_yield, Monte Carlo simulation [7] is frequently used, since the available yield expressions [8,9] do not provide sufficiently accurate results. However, these simulations need a large amount of CPU time. Running a Monte Carlo simulation for a high-resolution D/A converter takes several hours, which is a major drawback of this approach.

The statistical relationship has been investigated analytically, resulting in a new and accurate formula that directly expresses the relationship between the INL_yield specification, the resolution and the relative unit current standard deviation of the D/A converter. The theory is based on the assumption that if at any point the error between the calculated and the ideal output value reaches half an LSB, there is a 50% chance that the error increases and a 50% chance that it decreases again, since a normal distribution with mean value zero is used. For the mathematical derivation, the reader is referred to [10]. The result is given by eq.4, in which sigma(I)/I is the relative unit current standard deviation, N is the resolution of the D/A converter and C is a coefficient determined by the INL_yield through the inverse function of the normal cumulative function, integrated from minus infinity to x.

In fig.2 the INL_yield of an 8 bit, a 10 bit and a 12 bit D/A converter, calculated using the new formula (eq.4) and simulated using the Monte Carlo approach, is depicted. From this figure, it can be concluded that the formula is in good agreement with the Monte Carlo simulations.
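The two approaches compared here can be sketched in a few lines. The closed-form expression used below, sigma(I)/I = 1/(2·C·sqrt(2^N)) with C = inv_norm(0.5 + INL_yield/2), is an assumption: it is the form recalled from the authors' published yield model [10] and is not printed in this copy of the text. The Monte Carlo routine is likewise a minimal illustration (fully unary array, end-point corrected INL), not the MATLAB code used for table 1.

```python
import random
from statistics import NormalDist


def required_sigma(n_bits: int, inl_yield: float) -> float:
    """Relative unit-current sigma for a target INL yield (0 < yield < 1).

    ASSUMED form of eq.4: sigma(I)/I = 1 / (2 * C * sqrt(2**N)),
    with C = inv_norm(0.5 + yield / 2).
    """
    c = NormalDist().inv_cdf(0.5 + inl_yield / 2.0)
    return 1.0 / (2.0 * c * (2 ** n_bits) ** 0.5)


def monte_carlo_inl_yield(n_bits: int, sigma_rel: float,
                          trials: int = 200, seed: int = 1) -> float:
    """Fraction of simulated converters with max |INL| below 0.5 LSB."""
    rng = random.Random(seed)
    n_units = 2 ** n_bits - 1
    passed = 0
    for _ in range(trials):
        units = [1.0 + rng.gauss(0.0, sigma_rel) for _ in range(n_units)]
        lsb = sum(units) / n_units  # end-point corrected LSB size
        level = 0.0
        worst = 0.0
        for code, unit in enumerate(units, start=1):
            level += unit
            worst = max(worst, abs(level / lsb - code))
        if worst < 0.5:
            passed += 1
    return passed / trials


if __name__ == "__main__":
    for n in (8, 10, 12):
        print(n, "bit:", f"{100 * required_sigma(n, 0.997):.3f} %")
    print("8 bit MC yield:", monte_carlo_inl_yield(8, required_sigma(8, 0.997)))
```

The closed-form call returns immediately for any resolution, whereas the cost of the Monte Carlo loop grows with 2^N per trial, which is the practical point made in the text.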


To gain more insight into eq.4, the unit current standard deviation is plotted on a logarithmic scale versus the resolution of the D/A converter (fig.3). As can be seen from fig.3, these are straight lines, since for a fixed INL_yield the logarithm of the required sigma(I)/I depends linearly on the resolution N (eq.6).

Furthermore, one can easily conclude from this figure that for the design of a high accuracy current-steering CMOS D/A converter the matching parameters play a significant role. A small deviation from the required sigma(I)/I can lead to a severe yield degradation. The time needed to create a figure like fig.3 using Monte Carlo simulations in MATLAB is given in table 1. In this table the results for an INL_yield from 100% to 10% for current-steering D/A converters with different resolutions can be found. For all the simulations twenty values for the relative unit current standard deviation were taken. This can be understood as follows. In a first coarse approximation, a simulation using 10 values for sigma(I)/I that span a wide range is run. From the obtained result, the interval of sigma(I)/I values that give a high INL_yield can be specified. In this interval another 10 points are simulated. In almost all cases this procedure gives accurate results.

Constructing fig.3 using the new formula takes only a few minutes; writing the short program is, so to speak, the most time-consuming part. It is also worth noting that the time necessary to calculate the yield is independent of the resolution of the D/A converter, while the time consumption of the Monte Carlo simulations "explodes" with increasing D/A converter accuracy.

Based on these results and the size versus matching relation [11] for MOS transistors, the dimensions of the current source transistors are given by (7). Increasing the gate overdrive voltage of the current sources reduces the area consumed by the current source array. However, the value of the gate overdrive voltage is limited by the fact that the switch transistors have to operate in the saturation region (fig.4.a).
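The sizing step can be illustrated with a small sketch. Since expression (7) is not reproduced in this copy, the code assumes one commonly used form of the Pelgrom-based sizing relations: W·L = [A_beta² + 4·A_VT²/(V_GS-V_T)²] / (2·(sigma(I)/I)²), together with the square-law relation W/L = 2·I/(K'·(V_GS-V_T)²). All parameter values in the usage example are hypothetical placeholders, not data of any particular technology.

```python
import math


def size_current_source(sigma_rel: float, i_unit: float, v_ov: float,
                        a_vt: float, a_beta: float, k_prime: float):
    """Return (W, L) in metres for one unit current source.

    ASSUMED relations (not taken from this text):
      W*L = (a_beta**2 + 4*a_vt**2 / v_ov**2) / (2 * sigma_rel**2)
      W/L = 2 * i_unit / (k_prime * v_ov**2)
    a_vt in V*m, a_beta in m (relative), k_prime in A/V^2, v_ov in V.
    """
    area = (a_beta ** 2 + 4.0 * a_vt ** 2 / v_ov ** 2) / (2.0 * sigma_rel ** 2)
    aspect = 2.0 * i_unit / (k_prime * v_ov ** 2)
    return math.sqrt(area * aspect), math.sqrt(area / aspect)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    w, l = size_current_source(sigma_rel=0.0026, i_unit=5e-6, v_ov=0.5,
                               a_vt=1e-8, a_beta=2e-8, k_prime=100e-6)
    print(f"W = {w * 1e6:.1f} um, L = {l * 1e6:.1f} um, "
          f"area = {w * l * 1e12:.0f} um^2")
```

Under these assumptions a larger overdrive voltage shrinks the area, and tightening sigma(I)/I by a factor sqrt(2), as one extra bit of resolution requires, doubles the area of each unit source.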

3.2 The influence of systematic errors


Apart from the random errors, the static performance of the D/A converter is determined by the following systematic errors:

- Although the transistor mismatch effect of the current source has already been taken into account during the sizing of these transistors, it can still have a negative influence on the static performance of the D/A converter due to the "edge effect" [12]. This effect states that the mismatch behavior of a transistor depends heavily on its immediate surroundings. To avoid this error, the current source array has to be expanded by inserting dummy rows and columns so as to provide identical surroundings for all the active current source transistors.

- The voltage drop along the ground line will slightly change the output current of the different current source transistors placed on the same row. This error is given in [13] in terms of the derivative of the full scale current with respect to the current source bias voltage, the total resistance R of the ground line and a factor f that depends on the switching scheme used. If all the current sources are switched sequentially from left to right, the value for f equals 9. This error can be reduced either by using sufficiently wide power supply lines (reducing the resistance R) or by using a special switching scheme (increasing the factor f).

- If the resolution of the D/A converter increases by a single bit, the number of current sources in the current source array doubles. The area occupied by a single unity current source also doubles because of the random matching constraint. This leads to a four-fold area increase of the current source array for each additional bit. For D/A converters with a resolution of 10 bits and higher, the dimensions of the current source array become so large that process and temperature gradients have to be considered. The non-linearity errors introduced by these gradients can be (partially) compensated by the introduction of a special switching scheme.


If the error contributions of the current sources are totally random and uncorrelated, the yield of the D/A converter dictates the minimal requirement for the matching precision of these current sources, as indicated in the previous section. The random error can then be kept within the specified boundaries (INL<0.5LSB) by adjusting the active area. This implies that, in order to guarantee a good static performance of the D/A converter, the systematic errors introduced by linear and/or symmetrical gradients have to be compensated so that the random errors remain dominant. This is done using optimized switching schemes for the current sources. A switching scheme is actually a layout technique that determines the interconnection between the thermometer decoder and the inputs of the switches of the current source matrix. Several switching schemes have been presented in literature [5,13,14]. In section 6, an example of an advanced switching scheme is given for a 12 bit current steering CMOS D/A converter.

4. THE DYNAMIC PERFORMANCE OF A CURRENT-STEERING CMOS D/A CONVERTER

4.1 Introduction

To obtain a thorough understanding of the behavior of current steering D/A converters, system designers nowadays are interested not only in the static performance of these converters but also in their frequency-domain performance, since both the dynamic and the static non-linearities are visible in the frequency domain as noise and distortion. Where open literature used to mention only static specifications [1,2], recent papers [3,4,5] reveal that high speed Nyquist D/A converters are difficult to design. The limited spurious free output signal bandwidth is the major bottleneck in high speed, high resolution designs.

4.2 The influence of the timing errors of the switch control signal

If the control signals of the two switch transistors are not exactly matched in time, a glitch error will be directly visible at the output of the D/A converter. This problem can be solved by placing a synchronization block immediately in front of the switch transistors. In this way any delay introduced by the digital decoding logic is canceled and the timing error is minimized. However, one should keep in mind that at the layout level, the implementation of this circuit is of no use unless identical connections are drawn between the synchronizing circuit and the switching transistors.

4.3 The influence of capacitive feed-through

The gate-drain capacitances of the switch transistors form a feedthrough path that allows the digital control signals to have a direct impact on the output of the D/A converter. The glitch energy error generated in this way can be significantly lowered by using a reduced voltage swing at the input of the switches, or it can be minimized by placing a cascode transistor on top of the switch transistors [3]. Since the introduction of these cascode transistors (which also have to be switched on/off) does not solve the problem entirely and leads to a higher area consumption and a distortion of the fully symmetrical operating principle of the basic current cell, recent D/A converter designs opt for the first solution. In some designs the reduced voltage swing can be implemented by the same synchronization circuit used to solve the problem described in the previous section. Hence, no extra layout work is necessary.

4.4 The influence of voltage stability errors

Depending on where the crossing point of the switch control signals is situated, the following problem can occur: a time interval exists in which both switch transistors are simultaneously in the off-state. Since the current source transistor is still delivering current, the capacitance at its drain node will discharge. At the moment one of the switches starts conducting, an extra amount of current will flow through it so as to restore the DC voltage at that node. This results in a glitch error at the output of the D/A converter, leading to a deterioration of the dynamic performance. This problem can be solved by the use of a special switch driver circuit [2]. However, for this building block too there is a trend towards integration with the synchronization circuit [4,5].


4.5 The influence of the output impedance

As is generally known, the output impedance (fig.4.a) of each current cell has to be made large so that its influence on the INL (integral non-linearity) specification of the D/A converter is negligible. The relation between this output resistance and the achievable INL specification is given in [15] in terms of the load resistor, the LSB current and the total number T of unit current sources. In most cases the cascode configuration of the switch and current source achieves the INL specification. However, this is only true over a limited frequency bandwidth, as can be concluded from the following calculation. Fig.4.a shows the unit current cell of a current-steering D/A converter, in which the parasitic capacitance is indicated.


The impedance seen from the output node into the drain of the switch transistor can be calculated (fig.4.b). The resulting expression (eq.10) indicates that the impedance has a pole and a zero, at the frequencies given in eq.(11).

The possibility to shift this pole and zero to a higher frequency is determined by the flexibility in adjusting the following four parameters: the output resistance of the current source transistor, the output resistance of the switch transistor, the transconductance of the switch and the parasitic capacitance indicated in fig.4.a. According to eq.(11) the pole can be shifted towards a higher frequency by minimizing the output resistance of the current source transistor. However, the value of this resistance cannot be freely adjusted, since the gate-length L of the current source transistor is dictated by matching considerations [11] and the current through this transistor is determined by the full scale output signal. Since the current through the switches equals the current through the current source transistors and the gate-length L of the switch transistor is chosen to be minimal for speed reasons, nothing can be gained from the output resistance of these transistors. The transconductance is also fully determined, since the gate overdrive voltage of the switches is the result of an optimization process between the area occupied by the current sources and the optimum settling time of the D/A converter.
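The frequency behaviour described above can be reproduced with a first-order small-signal sketch. It assumes the textbook cascode model (body effect and all capacitances other than the one at the current source drain neglected): the switch, with output resistance r_sw and transconductance gm_sw, cascodes the parallel combination of the current source output resistance r_cs and the parasitic capacitance c_par. This is a generic model standing in for expressions (10) and (11), which are not reproduced in this copy; all symbol names and numerical values are illustrative.

```python
import math


def cell_output_impedance(freq_hz: float, r_cs: float, r_sw: float,
                          gm_sw: float, c_par: float) -> complex:
    """Impedance looking into the switch drain of one current cell."""
    s = 2j * math.pi * freq_hz
    z_cs = r_cs / (1.0 + s * r_cs * c_par)  # current source || parasitic cap
    return r_sw + (1.0 + gm_sw * r_sw) * z_cs


def pole_zero_frequencies(r_cs: float, r_sw: float,
                          gm_sw: float, c_par: float):
    """Pole and zero (in Hz) of the impedance expression above."""
    pole = 1.0 / (2.0 * math.pi * r_cs * c_par)
    zero = (r_sw + (1.0 + gm_sw * r_sw) * r_cs) / (
        2.0 * math.pi * r_sw * r_cs * c_par)
    return pole, zero


if __name__ == "__main__":
    params = dict(r_cs=1e6, r_sw=1e4, gm_sw=1e-3, c_par=100e-15)  # hypothetical
    pole, zero = pole_zero_frequencies(**params)
    print(f"pole = {pole / 1e6:.2f} MHz, zero = {zero / 1e9:.2f} GHz")
    for f in (1e3, 1e6, 1e7, 1e8):
        z = abs(cell_output_impedance(f, **params))
        print(f"{f:10.0f} Hz: |Z| = {z / 1e6:.3f} MOhm")
```

In this model the impedance starts at r_sw + (1 + gm_sw·r_sw)·r_cs, begins to fall at the pole 1/(2·pi·r_cs·c_par) and levels off at r_sw above the zero, so a smaller r_cs or c_par moves the pole upwards, in line with the discussion above.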


At this point, the frequency dependency of the output impedance has been discussed in detail, but the question remains whether this impedance has a significant effect on the dynamic performance of the D/A converter. In the remainder of this section the required minimal output impedance of a unit current switch is discussed as a function of the resolution of the D/A converter. It will then become clear that for high resolutions and designs with a large interconnect capacitance, the non-linearity introduced by the output impedance severely limits the output signal bandwidth.

For the mathematical derivation of the required impedance, the reader is referred to [16]. Here the resulting formulas are presented and evaluated. The ratio Q gives a value for the SFDR determined by the second order harmonic (eq.12). From this formula the output impedance required for a given resolution can easily be determined (eq.13).

Eq.(13) is plotted in fig.5 for a D/A converter with a resolution between 8 and 16 bits. For a resolution of 10 bits, the required impedance is still relatively easy to implement. However, for a 12 bit current steering DAC the ratio Q has to be at least 72 dB. If the load resistor is a double terminated cable and N equals 4095, the required impedance is considerably higher and has to be reached over the Nyquist frequency range. This is no longer a straightforward design specification, since for high speed, high accuracy circuits the effect of the interconnect capacitances on the output impedance can no longer be neglected.
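An order-of-magnitude check of this requirement can be sketched as follows. The relation used is a first-order single-ended estimate, not necessarily the exact expressions (12) and (13) of [16]: with n of the N unit cells switched to a load R_L, the output is approximately n·I·R_L·(1 - n·R_L/|Z|), and a full-scale sine then gives a fundamental to second harmonic ratio Q = 4·|Z|/(N·R_L). The 25 Ohm load assumes a 50 Ohm cable terminated at both ends.

```python
import math


def required_unit_impedance(sfdr_db: float, n_units: int, r_load: float) -> float:
    """Unit-cell impedance magnitude needed for a given HD2-limited SFDR.

    First-order estimate: Q = 4 * |Z| / (N * R_L).
    """
    q = 10.0 ** (sfdr_db / 20.0)
    return q * n_units * r_load / 4.0


def hd2_limited_sfdr_db(z_unit: float, n_units: int, r_load: float) -> float:
    """Inverse relation: SFDR in dB set by the second harmonic."""
    return 20.0 * math.log10(4.0 * z_unit / (n_units * r_load))


if __name__ == "__main__":
    r_load = 25.0  # assumed: doubly terminated 50 Ohm cable
    for bits, sfdr in ((10, 60.0), (12, 72.0)):
        n_units = 2 ** bits - 1
        z = required_unit_impedance(sfdr, n_units, r_load)
        print(f"{bits} bit, {sfdr:.0f} dB: |Z| >= {z / 1e6:.1f} MOhm")
```

Under these assumptions the 12 bit case needs an impedance in the order of 100 MOhm per unit cell, roughly sixteen times the 10 bit value, which makes clear why the parasitic capacitance at the current source drain becomes the limiting factor.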

5. LAYOUT ISSUES

Having a good D/A converter in the design phase does not necessarily lead to a good D/A converter at the measurement stage if not enough attention has been paid to the layout of the circuit. Several aspects that are worth mentioning are the following:

- The coupling between the digital and the analog part of the chip has to be minimized. This is done not only by using different power supply lines but also by placing guard rings around the analog and the digital part of the chip and by using a separate array for the switches together with their drivers. Another advantage of these separate arrays is that the layout area of a unity cell in the current source array can be minimized. In this way the distances between the transistors are reduced, resulting in improved matching properties.

- To reduce the voltage drop in the ground line of the current source transistors, wide supply lines are used. These are drawn on top of the transistors together with the interconnections needed to implement the switching scheme. In this way, a very compact current source array can be realized.

- To avoid any edge effects, the current source array has to be expanded with a number of additional rows and columns, as was already mentioned earlier (section 3.2).

- Multiple bonding pads are used at the output of the D/A converter so as to lower the inductance of the wire bonding and, as a result, minimize any ringing effects that could otherwise occur.

- Wherever possible, all interconnections have been made identical. In this way, no timing and/or load differences have been introduced.

6. DESIGN EXAMPLE : A 12-BIT CURRENT STEERINGCMOS D/A CONVERTER

In this section, a high speed, 12 bit CMOS current-steering D/A converter with a segmented architecture is presented [22]. Fig.6 shows the floorplan of the realised chip. The 5 MSBs are converted in a unary way, while the 7 LSBs are converted using the binary approach, where the digital input bits directly control the switches. To minimise any latency problems and to optimise the dynamic performance of the D/A converter, a dummy decoder has been inserted between the inputs and the switch transistors.
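The 5+7 segmentation can be illustrated with a behavioural sketch. It models only the code mapping (a thermometer code for the 5 MSBs driving 31 unary sources of 128 LSB each, and the 7 LSBs driving binary weighted sources directly), not the actual decoder circuit or its timing; the function names are illustrative.

```python
UNARY_BITS = 5
BINARY_BITS = 7


def decode(code: int):
    """Split a 12 bit code into thermometer (MSB) and binary (LSB) controls."""
    if not 0 <= code < 2 ** (UNARY_BITS + BINARY_BITS):
        raise ValueError("code out of range for a 12 bit converter")
    msb_value = code >> BINARY_BITS
    lsb_value = code & (2 ** BINARY_BITS - 1)
    # One control line per unary source: the first msb_value lines are on.
    thermometer = [1 if i < msb_value else 0 for i in range(2 ** UNARY_BITS - 1)]
    # The LSBs steer the binary weighted switches directly (bit 0 first).
    binary = [(lsb_value >> bit) & 1 for bit in range(BINARY_BITS)]
    return thermometer, binary


def output_in_lsb(thermometer, binary) -> int:
    """Ideal output current in units of the LSB current."""
    unary_part = sum(thermometer) * 2 ** BINARY_BITS
    binary_part = sum(bit << weight for weight, bit in enumerate(binary))
    return unary_part + binary_part


if __name__ == "__main__":
    thermometer, binary = decode(2048)  # mid-scale: 16 unary sources on
    print(sum(thermometer), binary, output_in_lsb(thermometer, binary))
```

Stepping from code 127 to 128 switches one unary source on and all seven binary sources off, which is the most critical transition discussed in section 2.4.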


Based on the combination of a 99.7% yield specification for the D/A converter and the transistor mismatch equations [11], the dimensions of the unity current source have been determined.

Apart from the random matching errors, the systematic errors caused by technological, electrical and temperature gradients over the die have been compensated by the implementation of a special triple centroid switching scheme. Since the first 7 LSBs are implemented in a binary way, the value of the unary current source equals 128 times the LSB current. This unary current source has been split up into 16 current sources, each with a value of 8 times the LSB current. The current source array has been divided into 16 squares and the current sources are placed symmetrically around the center of each square, as is indicated in fig.7. As a result, any two dimensional symmetrical or graded error is fully compensated.

Four additional dummy rows and columns have been added to create identical surroundings for the current sources situated at the edge of the current source array.
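Why symmetric placement removes gradient errors can be shown with a deliberately simplified sketch. It is not the triple centroid scheme of fig.7: it only demonstrates the underlying principle that, when the parts of a split current source are placed at positions mirrored about a common centre, a linear (graded) error across the array sums to zero. Coordinates, gradient values and function names are hypothetical.

```python
def linear_gradient_error(x: float, y: float, gx: float, gy: float) -> float:
    """Relative current error at position (x, y) for a linear gradient."""
    return gx * x + gy * y


def total_error(positions, gx: float, gy: float) -> float:
    """Summed error of all sub-sources that together form one unary source."""
    return sum(linear_gradient_error(x, y, gx, gy) for x, y in positions)


def mirrored(positions):
    """Add the point-mirrored partner of every position (centre at 0, 0)."""
    return list(positions) + [(-x, -y) for x, y in positions]


if __name__ == "__main__":
    placement = [(1.0, 2.0), (3.5, -0.5)]  # arbitrary one-sided placement
    gx, gy = 0.01, -0.02                   # hypothetical gradient per unit length
    print("one-sided:", total_error(placement, gx, gy))
    print("mirrored :", total_error(mirrored(placement), gx, gy))
```

Each mirrored pair contributes errors of equal magnitude and opposite sign, so the residual depends only on higher-order terms of the gradient; distributing the parts over several squares, as in the scheme described above, addresses those as well.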


The dynamic performance of the D/A converter has been obtained by the use of a well designed synchronised switch driver and by a careful design of the DAC's output impedance so as to minimise any non-linearity caused by its frequency dependent value. To obtain a second order harmonic distortion that is better than 72 dB, the output impedance of the D/A converter has to satisfy the requirement derived in section 4.5. The chip has been realised in a single-poly five-metal layer standard CMOS technology with a total active area of only 1 mm². Extra attention has been paid to the layout.

All measurements are single ended and have been performed with a 3 V analog power supply and a 2.2 V digital power supply. The measured INL error is better than 0.3 LSB, proving the 12-bit accuracy. To give a more complete picture of the dynamic performance of the presented 12 bit current steering D/A converter, fig.9 is given. The first part of this figure shows the SFDR as a function of the update rate for a 1 MHz output signal. The SFDR for the 1 MHz output signal remains above 70 dB up to a 700 MS/s update rate for the presented DAC, whereas previous designs reach this limit for update rates smaller than 300 MS/s [1] and 200 MS/s [5], respectively. The second part of fig.9 shows the SFDR as a function of the output signal frequency for an update rate of 300 MS/s. Figures 8 and 9 clearly show the good static and dynamic performance of the presented DAC.


7. THE FIGURE OF MERIT

To be able to compare the performance of the presented D/A converter with recently presented current-steering D/A converters, a figure of merit is introduced. It is defined in terms of the resolution N, the power consumption P of the D/A converter and the output signal frequency at which the SFDR has dropped by 6 dB (= 1 bit) in comparison with the expected result. For a 12 bit DAC, this is the output signal frequency at which the SFDR equals 66 dB.

In fig.10 this figure of merit is plotted versus the inverse of the normalized area. In the same figure, the lines of equal FOM/normalized area ratio are shown. It can be concluded from this figure that the presented 12 bit D/A converter achieves state-of-the-art performance in comparison with recently published 10, 12 and 14-bit D/A converters [1,6,17,18,19,20,21].


8. CONCLUSION

Since high resolution current-steering D/A converters are strongly dependent on the matching characteristics of the technology in which they are processed, it is important to know the number of functional chips in a set of fabricated devices. It is shown in the first part of this paper that time consuming Monte Carlo simulations are no longer necessary to obtain results for the INL_yield with a good accuracy. An accurate formula has been presented that directly gives the INL_yield of a current-steering D/A converter as a function of the transistor mismatch parameters of the current sources, without any loss of design time.

In the second part of this paper the SFDR-bandwidth limitations encountered with high resolution D/A converters have been analyzed. A main fundamental limitation is identified to be the dynamic output impedance of the circuit. The impact of this output impedance on the SFDR has been calculated. Based on this analysis, the requirements for the value of the output impedance of each unit current branch have been derived.


Applying the results of the presented analysis has led to an important performance improvement of our recently developed current-steering CMOS D/A converters [20,22].

9. REFERENCES

[1] J. Bastos et al., “A 12-bit Intrinsic Accuracy High-Speed CMOS DAC,” Journal of Solid-State Circuits, Vol. 33, No. 12, pp. 1959-1969, Dec. 1998
[2] H. Kohno, Y. Nakamura et al., “A 350-MS/s 3.3-V 8-bit CMOS D/A Converter Using a Delayed Driving Scheme,” IEEE Proc. of CICC 1995, pp. 10.5.1-10.5.4
[3] A. Marques, J. Bastos et al., “A 12-bit Accuracy 300 MS/s Update Rate CMOS DAC,” Proc. IEEE 1998 Int. Solid State Circuits Conf. (ISSCC), pp. 216-217, Feb. 1998
[4] N. Van Bavel, “A 325 MHz 3.3V 10-bit CMOS D/A Converter Core with Novel Latching Driver Circuit,” Proc. of the IEEE Custom Integrated Circuits Conf. (CICC), pp. 11.6.1-11.6.4, May 1998
[5] A. Van den Bosch et al., “A 12 bit 200MHz Low Glitch CMOS D/A Converter,” Proc. IEEE CICC 1998, pp. 11.7.1-11.7.4
[6] C.-H. Lin and K. Bult, “A 10-b, 500-Msample/s CMOS DAC in 0.6 mm²,” IEEE Journal of Solid-State Circuits, Vol. 33, No. 12, pp. 1948-1958, Dec. 1998
[7] C. Conroy, W. Lane and M. Moran, “A Comment on ‘Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design,’” IEEE Journal of Solid State Circuits, vol. 23, Feb. 1988, pp. 294-296
[8] K. Lakshmikumar et al., “Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design,” IEEE Journal of Solid State Circuits, vol. 21, Dec. 1986, pp. 1057-1066
[9] K. Lakshmikumar et al., “Reply to ‘A Comment on: Characterization and Modeling of Mismatch in MOS Transistors for Precision Analog Design,’” IEEE Journal of Solid State Circuits, vol. 23, Feb. 1988, pp. 296
[10] A. Van den Bosch, M. Steyaert and W. Sansen, “An Accurate Statistical Yield Model for CMOS Current Steering D/A Converters,” Proc. IEEE 2000 Int. Symposium on Circuits and Systems (ISCAS), pp. IV.105-IV.108, May 2000
[11] M. J. M. Pelgrom et al., “Matching Properties of MOS Transistors,” IEEE Journal of Solid-State Circuits, Vol. SC-24, pp. 1433-1439, Oct. 1989
[12] S. Wong, J. Ting and S. Hsu, “Characterization and Modelling of MOS Mismatch in Analog CMOS Technology,” Proc. of the IEEE Int. Conference on Microelectronics Test Structures (ICMTS), pp. 171-176, March 1995
[13] T. Miki, Y. Nakamura et al., “An 80-MHz 8-bit CMOS D/A Converter,” IEEE Journal of Solid State Circuits, vol. 21, December 1986, pp. 983-988
[14] Y. Nakamura, T. Miki et al., “A 10-b 70-MS/s CMOS D/A Converter,” IEEE Journal of Solid State Circuits, vol. 26, April 1991, pp. 637-642
[15] B. Razavi, “Principles of Data Conversion System Design,” IEEE Press, ISBN 0-7803-1093-4, 1995
[16] A. Van den Bosch, M. Steyaert and W. Sansen, “SFDR-Bandwidth Limitations for High Speed High Resolution Current Steering CMOS D/A Converters,” Proc. IEEE 1999 Int. Conf. on Electronics, Circuits and Systems (ICECS), pp. 1193-1196, Sept. 1999
[17] G. Van der Plas et al., “A 14-bit Intrinsic Accuracy Random Walk CMOS DAC,” Journal of Solid-State Circuits, Vol. 34, No. 12, pp. 1708-1718, Dec. 1999
[18] A. Bugeja et al., “A 14b 100Msample/s CMOS DAC Designed for Spectral Performance,” Journal of Solid-State Circuits, Vol. 34, No. 12, pp. 1719-1732, Dec. 1999
[19] A. Bugeja and Bang-Sup Song, “A Self-Trimming 14b 100MS/s CMOS DAC,” Proc. IEEE ISSCC, Feb. 2000
[20] A. Van den Bosch et al., “A 10-bit 1GSample/s Nyquist Current-Steering D/A Converter,” Proc. of IEEE CICC 2000, May 2000, pp. 11.6.1-11.6.4
[21] K. Khanoyan et al., “A 10b, 400 MS/s Glitch-Free CMOS D/A Converter,” Symp. VLSI Circuits Dig. Tech. Papers, paper 8-1, 1999
[22] A. Van den Bosch, M. Borremans et al., “A 12b 500 Msample/s Current-Steering CMOS D/A Converter,” IEEE Proc. Int. Solid-State Circuits Conference (ISSCC01), Feb. 2001, pp. 366-367

HIGH SPEED DIGITAL-ANALOG CONVERTERS – THE DYNAMIC LINEARITY CHALLENGE

Alex R. Bugeja
Texas Instruments, Dallas, TX 75243, USA.

ABSTRACT

In this paper we examine the need for high dynamic linearity in high speed digital-analog converters for communications applications, and the challenges facing DAC designers attempting to maximize it. A brief discussion of a DAC designed for high dynamic linearity is then presented, followed by some predictions of future trends.

1. INTRODUCTION

High dynamic linearity is crucial for the digital-analog converters (DACs) used in the transmission paths of modern cellular and wireless LAN basestations. Such DACs typically exhibit significant roll-off of their SFDR performance with increasing input frequency for a given clock rate, introducing spurs in the output spectrum which limit their use in such environments. In this paper we focus on current switched digital-analog converters, which have been generally demonstrated to be the most feasible architecture for high speed operation, are capable of driving resistive loads and passive filters directly without the need of any high speed output buffers, and may also be easily reduced to minimal power consumption designs.

This paper first examines the challenges facing designers attempting to maximize dynamic linearity in current mode DACs. A practical case study from the authors’ own research is then presented, followed by extrapolation to some future trends which may be anticipated.



2. PRACTICAL DYNAMIC LINEARITY ISSUES

For high speed and high resolution applications (>10 bits, >50MHz), the current source switching architecture is preferred since it can drive a resistive load directly without the need for a voltage buffer. Such architectures can also be reduced to minimal power designs whereby most of the power consumption is actually the signal current [1]. A conventional high performance DAC architecture as used in such applications is shown in Fig. 1. The DAC consists of m-1 thermometer (linearly) decoded most significant bits (MSBs), u-1 thermometer decoded upper least significant bits (ULSBs) and l-1 binary decoded lower least significant bits (LLSBs). The current sources, which are implemented differentially, are taken directly to a pair of differential resistive loads. Modern high speed and high resolution DACs all use variations of this basic architecture [1-10]. The ULSB/LLSB array is sometimes driven by an mth MSB to ensure that the sum of the LSBs is one MSB. Also, the ULSBs are sometimes omitted, so that the DAC has an upper array of thermometer decoded bits (MSBs) and a lower array of binary decoded bits (LSBs). Thermometer decoding has the well known advantages of monotonicity and reduction of glitch at major carries, but fully thermometer decoded architectures are impractical to implement for high resolution [3].


The static performance of such DACs is well characterized by traditional measures such as integral non-linearity (INL) and differential non-linearity (DNL), and various techniques have been used to attain full n-bit static linearity for n-bit DACs. In particular such techniques have included sizing the devices appropriately for intrinsic matching and utilizing certain layout techniques [2, 6, 11], trimming [8, 9], calibration [12, 13], and dynamic element matching/averaging techniques [14]. The dynamic performance of current-switched DACs, however, has not scaled in proportion to the number of their bits. In particular, examination of the references will show dynamic performance as measured by SFDR falling off rapidly with increasing signal frequency. Effectively the larger number of bits only gives lower quantization noise at higher signal frequencies, not higher SFDR. There are several causes for this behavior; the major ones are summarized below:

1. Code-dependent settling time constants: The time constants of the MSBs, ULSBs, and LLSBs are typically not proportional to the currents switched, owing to voltage headroom and parasitic capacitance considerations in the switch devices; the problem is worse if R/2R ladders are employed in place of current dividers [15].

2. Code-dependent switch feedthrough: This results from signal feedthrough across switches which are not sized proportionately to the currents they are carrying, again owing to voltage headroom and parasitic capacitance considerations, and therefore shows up as code-dependent glitches at the output.

3. Timing skew between current sources: Imperfect synchronization of the control signals of the switching transistors will cause dynamic nonlinearities [6]. Synchronization problems occur both because of delays across the die and because of improperly matched switch drivers. Thermometer decoding can actually make the timing skew worse because of the larger number of segments [9].

4. Major carry glitch: This glitch, which occurs when an MSB is switched into or out of circuit in place of a bank of LSBs, can be minimized by thermometer decoding, but in higher resolution designs where full thermometer decoding is not practical, it cannot be entirely eliminated [3]. Increased thermometer decoding also brings other problems, such as timing skew.

5. Current source switching: Voltage fluctuations occur at the internal switching node at the sources of the switching devices during the switching process. Since the size of the fluctuation is not proportional to the current being switched, and is particularly dependent on second-order nonlinearities arising from the switching device physics, it again gives rise to a nonlinearity proportional in size to the parasitic capacitance at the switching device sources.

6. On-chip passive analog components: Drain/source junction capacitances are nonlinear, and any on-chip analog resistors also exhibit nonlinear voltage transfer characteristics. These devices therefore cause dynamic nonlinearities when they occur in analog signal paths. ESD protection on the output pads typically contributes substantial additional nonlinear parasitic capacitance.

7. Mismatch considerations: Device mismatch is usually considered in discussions of static linearity, but it also contributes to dynamic nonlinearity because switching behavior is dependent on switch transistor parameters such as threshold voltage and oxide thickness. These differ for devices at different points on the die [16], introducing code dependencies in the switching transients. Of course, any static nonlinearities (in the current-generating transistors) will also show up as dynamic nonlinearities.

Dynamic nonlinearities increase in magnitude with increasing signal frequency since the outputs change value more frequently and a larger proportion of the clock cycles is occupied by nonlinear switching transients. This explains the pronounced frequency degradation of SFDR observed for the DACs cited above.
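As a rough illustration of items 3 and 4 in the list above, the following sketch models a binary-weighted array in which one bit switches later than the rest. The 4-bit width, the choice of which bit is delayed, and all function names are assumptions made purely for illustration; they are not taken from any of the cited designs.

```python
def code_to_bits(code, n):
    """Return the n-bit binary representation of code, MSB first."""
    return [(code >> (n - 1 - i)) & 1 for i in range(n)]


def dac_level(bits):
    """Sum binary-weighted currents in LSB units; bits[0] is the MSB."""
    n = len(bits)
    return sum(b << (n - 1 - i) for i, b in enumerate(bits))


def transient_level(old_code, new_code, n, late_bit):
    """Level seen while every bit except `late_bit` has already switched."""
    old_bits = code_to_bits(old_code, n)
    mid = code_to_bits(new_code, n)
    mid[late_bit] = old_bits[late_bit]  # the late bit still holds its old value
    return dac_level(mid)


N_BITS = 4
# Major carry 0111 -> 1000 with the MSB (index 0) arriving late.
glitch = transient_level(7, 8, N_BITS, late_bit=0)
# Ordinary step 1000 -> 1001 with the LSB (index 3) arriving late.
benign = transient_level(8, 9, N_BITS, late_bit=3)
print(glitch, benign)
```

In this toy model the major carry momentarily drives the output to zero even though both the old and the new code sit at mid-scale, whereas the ordinary step never leaves the range between its two endpoints. The size of the transient therefore depends on the code, which is what makes it a source of distortion rather than noise.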

Alternatives to the current-mode DAC have been proposed in the literature [17], but they are limited by the use of opamps and/or low impedance followers as output buffers. Opamps introduce several dynamic nonlinearities of their own, owing to their nonlinear transconductance transfer functions (slew limiting in the extreme case). High gain opamps connected in feedback configurations also require buffers to drive lower impedance resistive loads. Buffers introduce further distortion, due to factors such as signal dependence of the bias current in the buffer devices and nonlinear buffer output resistance.

One conceptual solution to the dynamic linearity problem is to eliminate the dynamic nonlinearities of the DAC, all of which are associated with the switching and subsequent settling behavior, by placing a track/hold circuit at the DAC output. The track/hold would hold the output constant whilst the switching is occurring, and track only once the current sources have settled to their dc value. Thus only the static characteristics of the DAC would show up at the output, and the dynamic ones would be attenuated or eliminated. The problem with this approach is that the track/hold circuit in practice introduces dynamic nonlinearities of its own which tend to be comparable to or worse than those of the DAC alone. These problems include track-to-hold step, droop rate error, hold mode feedthrough, and track mode errors. A more detailed discussion of these nonlinearity sources is given in [4]. Because of these nonlinearities, track/hold circuits are not commonly used at the outputs of high speed DACs.


3. CASE STUDY – A 14b 100MS/s DAC

A – Introduction

In this section a 14b 100MS/s CMOS DAC designed for both high static and dynamic linearity is briefly presented as a case study. The DAC is composed of a segmented current-source core driving a specialized track/attenuate output circuit. Static linearity of the DAC core is enhanced by means of a calibration technique. The track/attenuate circuit is designed to enhance the dynamic linearity of the outputs. A more detailed discussion is given in [5].

The chip architecture is outlined in Fig. 3. The main DAC is a segmented current source design with 4 most significant bits (MSBs), 5 upper least significant bits (ULSBs), and 5 lower least significant bits (LLSBs). After passing through a 14b wide input latch array, the MSBs and ULSBs are thermometer decoded and used to drive 15 MSB current sources and 31 ULSB current sources respectively. The LLSBs are left in binary format since the LLSB current sources are binary weighted. An additional bank of latches resynchronizes the data prior to the current sources, and is followed by switch drivers/buffers to drive the current source switches. Thermometer decoding of the MSBs makes calibration straightforward whilst thermometer decoding of the ULSBs enhances static linearity and guarantees monotonicity within the ULSB range. The MSBs are calibrated but the ULSBs are not, so that intrinsic matching at the 10b level is required and built into the ULSB circuit by careful layout techniques. A 16th MSB current source is used to drive the ULSB array, and a 32nd ULSB current source drives the LLSB array.
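The 4/5/5 segmentation described above can be sketched behaviorally as follows. This is only an idealized model under assumed names (`segment_code`, `thermometer`, `output_in_lsbs` are hypothetical helpers, not part of the design); it uses the fact that an n-bit thermometer code needs 2^n - 1 equal sources, and it ignores latches, calibration and all analog effects.

```python
def thermometer(value, n_sources):
    """Turn on the first `value` of `n_sources` equal-weight sources."""
    return [1 if i < value else 0 for i in range(n_sources)]


def segment_code(code, msb_bits=4, ulsb_bits=5, llsb_bits=5):
    """Split an input word into MSB/ULSB thermometer and LLSB binary controls."""
    total = msb_bits + ulsb_bits + llsb_bits
    if not 0 <= code < (1 << total):
        raise ValueError("code out of range")
    llsb = code & ((1 << llsb_bits) - 1)
    ulsb = (code >> llsb_bits) & ((1 << ulsb_bits) - 1)
    msb = code >> (llsb_bits + ulsb_bits)
    return (
        thermometer(msb, (1 << msb_bits) - 1),        # 2^4 - 1 MSB sources
        thermometer(ulsb, (1 << ulsb_bits) - 1),      # 2^5 - 1 ULSB sources
        [(llsb >> i) & 1 for i in range(llsb_bits)],  # binary, LSB first
    )


def output_in_lsbs(msb_on, ulsb_on, llsb_on, ulsb_bits=5, llsb_bits=5):
    """Ideal output in LLSB units, assuming perfectly matched sources."""
    msb_weight = 1 << (ulsb_bits + llsb_bits)
    ulsb_weight = 1 << llsb_bits
    binary = sum(bit << i for i, bit in enumerate(llsb_on))
    return sum(msb_on) * msb_weight + sum(ulsb_on) * ulsb_weight + binary


msb_on, ulsb_on, llsb_on = segment_code(0x2ABC)
print(len(msb_on), len(ulsb_on), len(llsb_on))
print(output_in_lsbs(msb_on, ulsb_on, llsb_on) == 0x2ABC)
```

Because the two upper segments are thermometer coded, increasing the code within a segment only ever adds sources, which is the origin of the monotonicity property mentioned in the text.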

The self-trimming circuit is composed of a number of measurement resistors and a sigma-delta modulator for accurate dc voltage measurement, a digital correction circuit which includes memory storage of the calibration error corrections, and a 12b calibration DAC (CALDAC) which reads these calibration corrections and converts them to an analog form in such a way that they can be used to trim the MSB current sources. The measurement resistors are used to measure the current in the sum of the ULSBs and in the MSBs respectively, by converting it to a voltage which the sigma-delta modulator can measure. The dummy resistors are connected to the MSBs not being measured, since only one MSB can be connected to the single measurement resistor at a time. The digital output of the sigma-delta modulator is analyzed in the digital correction circuit, which uses an iterative measurement process to compute appropriate corrections for each of the MSBs so as to change them to the value of the sum of the ULSBs.

The detailed design and operation of the self-trimming circuit [5] is beyond the scope of this paper, but the motivation for it is strong in terms of dynamic linearity. Although static linearity can be obtained by means of transistor sizing alone (e.g. [2, 6]), such designs result in considerably larger MSB cells which exhibit larger parasitics. More seriously, owing to the need to spread the MSB cells around the die, and couple them together with metal wires, a large degree of parasitic crosstalk between MSBs results at the switching instants. These factors contribute to significant dynamic linearity degradation. In the design presented here, the MSB cells are small and isolated from each other, thus taking advantage of the calibration circuit to improve dynamic performance.


As also shown in Fig. 3, the current sources are taken to the output of the DAC by means of a current folding stage which folds the total current and makes the n-type DAC current sources capable of driving a switching stage composed of n-type switches. This allows the use of fast n-type switches in both the current sources and the output stage, and reduces the power supply voltage requirement, at the expense of an extra 60mW in power consumption. The folding circuit also includes a feedback loop to regulate the current source outputs, both enhancing their static linearity, and isolating them from the switching waveforms at the outputs which would otherwise disturb their settled currents.


The current folding stage drives a track/attenuate circuit composed of a number of switches which attenuate the current outputs during the first half of the clock cycle while the current sources settle, and track them during the second half of the clock cycle. The design of the switches in this track/attenuate stage is optimized as described in the next sub-section. In a similar way to return-to-zero, the dynamic nonlinearities associated with current source switching are therefore greatly reduced. The track/attenuate stage drives a pair of differential current outputs to which resistive loads of 50Ω or lower ohmic value may be connected as with conventional current mode DACs. The DAC full scale current is 20mA, making for a 2Vp-p differential output signal when the two differential outputs are combined.

B – Track/Attenuate Circuit

The track/attenuate concept is illustrated in Fig. 4. Conventional DAC outputs are full cycle as shown in the top half of the figure, with the DAC output being valid for the whole clock period T, albeit corrupted by dynamic nonlinearities at the switching instant at the start of T. The track/attenuate output stage modifies the output waveform to that shown in the lower half of the figure.


The output is attenuated during the first half of the clock cycle by lowering the effective output impedance by the parallel connection of a low impedance with the output load. Although the output signal on the load, including the dynamic nonlinearities, is not reduced entirely to zero (the low impedance load still has a finite impedance), it is greatly reduced, hence improving the SFDR. During the second half of the clock cycle the low impedance load is removed and the output tracks the DAC output. The effect is similar to return-to-zero in that the SFDR is improved (as well as the sin(x)/x rolloff) at the cost of halving the signal power [4]. RZ implementations inherently also increase the effect of clock jitter, but assuming that this is random in nature and not related to the signal source, the effect is only to raise the noise floor and degrade SNR, not SFDR. This is acceptable for communications applications involving the Nyquist baseband being split into several channels, as is typically the case, since no one channel has a high noise floor. A single spur in such a channel due to poor SFDR, on the contrary, would effectively wipe out the channel information and has to be avoided.
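The sin(x)/x remark above can be checked with a small calculation. The sketch below compares an ideal full-period hold with an ideal half-period (return-to-zero) output; it assumes the attenuated half cycle is exactly zero, which the text notes is only approximately true for the track/attenuate stage, and the function names are hypothetical.

```python
import math


def hold_gain(f_over_fs, duty):
    """Magnitude response of a rectangular hold lasting `duty` of a clock period,
    normalized to the dc response of a full-period hold (duty = 1)."""
    x = math.pi * f_over_fs * duty
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return duty * sinc


def db(x):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(x)


for duty in (1.0, 0.5):
    dc = hold_gain(0.0, duty)
    nyq = hold_gain(0.5, duty)
    print(f"duty={duty}: dc {db(dc):6.2f} dB, droop at Nyquist {db(nyq / dc):6.2f} dB")
```

Under these assumptions the droop at the Nyquist frequency falls from roughly 3.9 dB for the full-period output to roughly 0.9 dB for the half-period output, while the low-frequency tone amplitude is halved in this idealized model.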


The track/attenuate output stage is shown in Fig. 5 in its differential implementation. It consists of 3 attenuate switches which function as low impedance loads in parallel with the external output load during the first half of the clock cycle when the ATTEN signal is brought high. Two of the switches are single-ended, shorting the outputs to signal ground, whilst the third is differential and shorts the outputs together. The use of all nMOS switches makes the track/attenuate action unipolar, since only a single clock signal (ATTEN) is required to drive the switches. This avoids problems with matching rising and falling clock waveforms. The analysis behind optimizing the performance of the switches will be presented in this section.

The folding current sources, though not strictly a part of the track/attenuate action, are also shown in Fig. 5, as well as the regulated cascode circuit that keeps the drains of these folding sources and the outputs of the DAC current sources at approximately constant potential as required for correct static linearity. The unity gain bandwidth of the regulated cascode is kept in excess of 600MHz for all values of output current by forcing a fixed dc current component through each side of the differential circuit. This dc component is obtained by excess biasing of the folding sources, and maintains the minimum acceptable bandwidth for settling the regulated nodes, even at the zero DAC current position. Finally the impedance looking into the regulated cascode is made sufficiently large so as to ensure that the DAC current sources are adequately isolated from the switching at midcycle and are not disturbed significantly from their settled position at that point.

Special sizing of the switches is carried out to maximize the dynamic linearity performance. Consider first the circuit shown in Fig. 6(a). This shows the differential current outputs of the current mode DAC being sent to output loads on either side. The track/attenuate circuit in this first case is composed simply of the two MOS switches connected across the loads to ground, and we call this circuit the two-switch circuit for convenience. Also, since the circuit is shown in the attenuate phase we represent the switches, which are operated in linear mode, by their on-resistance as shown. We consider the effects of other parameters such as channel charge and threshold voltage shortly. During the track phase the switches are turned off and the DAC currents flow solely to the loads. We are concerned with the attenuation introduced in the DAC signal by the switches during the attenuate phase as compared to the track phase. For maximum dynamic linearity this attenuation should be as large as possible.

We define the attenuation factor, or AF, as the resistance during the attenuate phase for the differential output divided by the load resistance. For the two-switch circuit For

we get that , so that for an attenuation of 50 times we require that

Consider now the single switch circuit shown in Fig. 6(b). It can be shown that and that for

. Comparing the 2-switch scheme and the 1-switch scheme, furthermore, we see that if we consider the same total amount of MOS switch size, we can make the MOS switch twice as large in the single switch scheme (without increasing the clock driver load necessary to drive the switches). We therefore get that

On the basis of attenuation factor alone, the choice is clearly in favor of the single switch scheme over the two-switch scheme, although so far factors such as charge injection and threshold voltage, which degrade the intrinsic linearity of the output stage, have not yet been compared.
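As a hedged illustration of the comparison above, the sketch below evaluates AF directly from its stated definition (differential resistance in the attenuate phase divided by the differential load resistance) using plain resistor-network analysis. The symbol names, the 50 ohm load and the 1 ohm switch resistance are assumptions chosen for illustration, not values from the design.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def af_two_switch(r_load, r_switch):
    """Each output is loaded by R_L in parallel with its own grounded switch."""
    return parallel(r_load, r_switch) / r_load


def af_single_switch(r_load, r_switch):
    """One switch bridges the two outputs, in parallel with 2*R_L."""
    return parallel(2.0 * r_load, r_switch) / (2.0 * r_load)


R_LOAD = 50.0   # ohms per side (assumed)
R_UNIT = 1.0    # on-resistance of one switch in the two-switch circuit (assumed)

af2 = af_two_switch(R_LOAD, R_UNIT)
# Same total gate width: the single switch can be twice as wide, so its
# on-resistance is taken as half that of each switch in the two-switch circuit.
af1 = af_single_switch(R_LOAD, R_UNIT / 2.0)
print(f"two-switch AF    = {af2:.5f}  (attenuation ~{1 / af2:.0f}x)")
print(f"single-switch AF = {af1:.5f}  (attenuation ~{1 / af1:.0f}x)")
print(f"ratio AF2/AF1    = {af2 / af1:.2f}")
```

With a switch resistance much smaller than the load, the single switch comes out close to four times better for the same total width in this model: a factor of two from the doubled width and a factor of two from bridging the outputs rather than grounding each one.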

Consider now the three-switch scheme formed by combining the single and two-switch schemes as shown in the circuit of Fig. 6(c). During the attenuate phase the two outputs are connected together by means of the differential switch and are also each connected to ground by means of the single-ended switches. The attenuation factor is now given by

, where . We have already seen that for the same total switch size . In the three switch scheme, for the same total switch size, we have allocated some portion of the MOS switch size to the single-ended switches and some portion to the differential switch, so that

Based on this simple analysis alone, there is no motivation to use anything but the single switch scheme, to minimize the attenuation factor.

This simple analysis however ignores the effect on differential switch resistance of adding the single-ended switches as in the three-switch scheme. In particular, this addition will result in the common-mode voltage of the output nodes being reduced from

in the single switch scheme to in the three-switch scheme. Since we design

, the common-mode voltage is close to zero during the attenuate phase. In a p-well CMOS process as used here, and for common mode voltages of around 0.5-1V in the single-switch scheme, as also resulting from the DAC output currents in this design, this reduction in the common mode voltage increases the gate drive of the differential switch by 0.5-1V, decreases the threshold voltage because of the lower body effect, and hence reduces the switch resistance by a factor of approximately 1/3. When we factor this into the analysis we get that the attenuation factor of the 3-switch scheme is approximately the same as that of the single switch scheme for the same total switch size.

In general if W was the original single differential switch width corresponding to , if we split the switch to obtain a three-switch scheme by keeping kW width in the differential switch and creating two single-ended switches of width lW each (such that k + 2l = 1), we then get that . This equation is valid so long as enough switching capacity is allocated to the single-ended switches to obtain the 1/3 improvement in differential switch resistance as described above; in practice this is satisfied so long as l is not very small. We will examine optimal allocation shortly, and quantify how small l can be. For k=0.5 and l=0.25, we get

again.


This being the case, AF alone narrows down the selection to either the single switch or the three switch scheme but is insufficient to choose between the two. To decide between these schemes we now consider other factors, in particular charge injection and threshold voltage effect, which introduce nonlinearities in the output stage dynamic performance.

The first order models for , the channel charge of the switches in linear region, and

the switch resistance in linear region are used here since they are fairly accurate for large switches where

. From these equations, comparing the 2-switch scheme with the single switch scheme, we observe that the former will have superior channel charge injection and switch resistance characteristics from the linearity viewpoint. In the two-switch scheme, the switches have their source node grounded. To first order, therefore, the channel charge and the switch resistance remain constant since the gate-source voltage is a constant dependent only on the clock waveform voltage and the threshold voltage is constant and signal independent because there is no signal on the source node. Therefore the charge injection when the switches are turned off, as well as the charge uptake when they are turned on (significant in a current-limited DAC output), are both constant, and the switch resistance is also constant. In the single switch case, however, there is a signal component on both the switch nodes. The channel charge is therefore signal-dependent, as is the threshold voltage due to the backgate effect, thus reducing the linearity of the output stage.
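The argument above can be made concrete with the usual textbook first-order triode expressions, assumed here to be R_on = 1 / (mu C_ox (W/L) (V_GS - V_T)) and Q_ch = W L C_ox (V_GS - V_T), together with the standard body-effect formula for V_T. Every device parameter in the sketch is a hypothetical value chosen only to make the trend visible; none is taken from the chip described in the text.

```python
import math

# Hypothetical device parameters (not from the design in the text).
MU_COX = 100e-6                     # A/V^2, mobility times oxide capacitance
COX = 2.5e-3                        # F/m^2, oxide capacitance per unit area
W, L = 500e-6, 0.5e-6               # m, switch width and length
VT0, GAMMA, PHI2 = 0.7, 0.5, 0.7    # V, sqrt(V), V (PHI2 stands for 2*phi_F)
V_GATE = 3.3                        # V, clock high level


def threshold(v_sb):
    """Threshold voltage including the body (backgate) effect."""
    return VT0 + GAMMA * (math.sqrt(PHI2 + v_sb) - math.sqrt(PHI2))


def overdrive(v_source):
    """V_GS - V_T for a switch whose source sits at v_source, bulk at ground."""
    return V_GATE - v_source - threshold(v_source)


def on_resistance(v_source):
    """First-order triode on-resistance."""
    return 1.0 / (MU_COX * (W / L) * overdrive(v_source))


def channel_charge(v_source):
    """First-order channel charge W*L*Cox*(V_GS - V_T)."""
    return W * L * COX * overdrive(v_source)


signal_levels = (0.5, 0.75, 1.0)    # V, assumed output voltages
# Grounded-source switch: the source stays at 0 V whatever the signal does.
grounded_r = [on_resistance(0.0) for _ in signal_levels]
# Differential switch: its terminals ride on the output voltage.
floating_r = [on_resistance(v) for v in signal_levels]
floating_q = [channel_charge(v) for v in signal_levels]
print(grounded_r)
print(floating_r)
print(floating_q)
```

In this model the grounded-source switch shows the same resistance for every signal level, while the floating switch shows a resistance that rises and a channel charge that falls as the signal voltage increases, which is the signal dependence the text identifies as a linearity penalty.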

It therefore makes sense to allocate as much switch size as possible to the single-ended switches instead of the differential switch, so long as the AF is not degraded significantly. This suggests an optimization process. A computer program was written for this purpose; this program tracks the switch resistance of the differential switch as k is reduced and l is increased and calculates AF for each position accordingly. Based on this optimization, the current-switching output stage implemented for this chip was a track/attenuate three switch circuit as shown in Fig. 5 with k=0.5, i.e. the differential switch size is twice as large as the single-ended switch size.
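The kind of sweep described above might look like the following sketch. It is not the authors' program: it uses a simple resistor model, reads the "1/3" improvement as a one-third reduction in differential switch resistance (a multiplier of 2/3), and holds that multiplier fixed for every k, whereas the text indicates the real program tracked how it varies. All names and component values are assumptions.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def af_single_switch(r_load, r_full):
    """AF with one full-width differential switch of resistance r_full."""
    return parallel(2.0 * r_load, r_full) / (2.0 * r_load)


def af_three_switch(k, r_load, r_full, body_gain=2.0 / 3.0):
    """AF of the three-switch circuit for a differential width fraction k.

    r_full    : on-resistance of a full-width differential switch used alone
    body_gain : assumed resistance multiplier for the differential switch once
                the single-ended switches pull the common mode toward ground
    """
    l_frac = (1.0 - k) / 2.0            # width fraction per single-ended switch
    r_diff = body_gain * r_full / k
    r_single = r_full / l_frac
    # Differential half-circuit: R_L, one grounded switch, half the bridge switch.
    return parallel(r_load, r_single, r_diff / 2.0) / r_load


R_LOAD, R_FULL = 50.0, 0.5              # ohms, assumed values
reference = af_single_switch(R_LOAD, R_FULL)
for k in (0.9, 0.7, 0.5, 0.3):
    ratio = af_three_switch(k, R_LOAD, R_FULL) / reference
    print(f"k={k:.1f}  l={(1 - k) / 2:.2f}  AF3/AF1={ratio:.2f}")
```

Under these assumptions the three-switch AF at k=0.5 stays within about 15% of the single-switch value, and it worsens steadily as more width is moved to the single-ended switches, which is the tradeoff the optimization has to balance against the linearity benefit of grounded-source switches.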

C – Measurement Results

Fig. 7 shows a die photo of the fabricated chip. The die occupies an area of 3.44mm x 3.44mm in a CMOS process. The main DAC occupies the central third of the die, and is composed of the MSB, ULSB, and LLSB current sources and their associated latches, buffers, and bias circuitry. Their current outputs are collected and taken to the folding sources and output stage on the right side of the die. The self-trimming circuitry, composed of the sigma-delta modulator circuit, the calibration DAC, and the digital calibration logic, is shown on the left side of the die.

A summary of the measured chip characteristics is given in Table 1. It can be seen that after calibration the INL and DNL are within the 14b specification as designed for. The dynamic linearity at the design clock rate of 100MHz is around 6dB higher at frequencies close to Nyquist (42.5MHz) than similar DACs without a track/attenuate output stage. The effectiveness of the circuit however falls off for higher clock rates where nonlinearities due to the current mode DAC driving the circuit can no longer be expected to settle completely by mid-cycle.


5. FUTURE TRENDS

The importance of the communications DAC market is such that interest in this area of development can only be extrapolated to grow significantly in the next few years. The ultimate goal, so far unrealizable, is the full software radio with the data converter being the only component between the antenna and the digital signal processing circuitry. Certainly, new circuits and architectures will be needed to meet even subsets of this challenge. Some trends can be predicted with what the author hopes is reasonable accuracy:

(1) Basic DAC cores will move towards more thermometer decoding as experience in dealing with the practical layout complexities grows. As the least significant bits are pushed into lower significance compared to the most significant ones, any dynamic mismatches between the two have a reduced impact on the overall DAC linearity.

(2) Dynamic Element Matching (DEM), currently mostly used only in unit element (full thermometer) implementations in multibit sigma delta feedback loops, will move into a position of greater importance in segmented communications DAC implementations. DEM cannot correct for MSB-LSB mismatch, but thermometer coding of more MSBs as in (1) makes this less important. The advantage of DEM is that it averages out dynamic mismatches between switches and parasitic capacitances in current sources, besides static mismatches, again improving dynamic linearity.

(3) Calibration will continue to be used to correct for static mismatches because of its advantages over intrinsic transistor-based matching in terms of lower parasitics and MSB crosstalk, and its greater degree of process independence. Calibration remains advantageous even in a DEM environment where it reduces the random white noise floor otherwise introduced by the DEM process due to static mismatches. New current measurement techniques such as the use of accurate sigma delta modulators open up new calibration possibilities in terms of measurement accuracy obtainable. As regards current correction, the implementation convenience of gate-charge storage indicates that such methods will retain preference over correction DACs tied to the outputs of the main DAC. Such correction DACs are severely dynamically mismatched to the main DAC and are thus unsuitable in communications applications.

(4) The use of output stages correcting in some way the current outputs of DAC cores remains a possibility in areas where the higher power consumption and complexity can be afforded; such methods however inherently carry the disadvantages of increased noise due to clock jitter and a drop in performance at higher clock rates.

(5) On-chip isolation of signals, both analog-analog and analog-digital, will have to be emphasized for better dynamic linearity. Analog-analog isolation is particularly important in the MSB array and can be addressed by calibration, increased layers of metal providing extra shielding, higher resistivity substrates, etc. Analog-digital isolation will likely be addressed by techniques such as higher resistivity substrates, custom-specific digital coding schemes, and differential digital encoding close to the MSBs. Digital input signals are correlated to the output signal and can produce harmonic distortion besides noise if care is not exercised.

(6) From the process standpoint, BiCMOS and pure CMOS implementations appear to be the choice communications DAC technologies of the future. CMOS obviously has strong cost considerations driving it, whereas BiCMOS offers the possibility of retaining CMOS DAC cores unchanged to a large extent, but exchanging the CMOS switches for bipolar ones to increase the switching speeds.

(7) Packaging and board design will become increasingly important. Low inductance packages which offer little “voltage kickback” will be necessary in fast current mode DACs where full scale current changes can push the current sources into low impedance ranges of operation and thus adversely impact dynamic linearity. Fortunately modern BGA packages and chip-on-board (COB) implementations are now starting to approach the sub-nH/pin specification. From the board standpoint, the greatest challenge will likely remain that of isolating the digital inputs and clocks from the analog outputs; it appears that new driver schemes such as LVDS will become helpful here.
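To make trend (2) above more concrete, the sketch below contrasts a fixed thermometer selection with one simple DEM variant, random element selection, on a unit-element array with static mismatch. The array size, the 1% rms mismatch, the seed and the function names are all assumptions for illustration; practical DEM schemes are typically more structured than pure random selection.

```python
import random

N_UNITS = 15
rng = random.Random(1)
# Unit currents with a hypothetical 1% rms mismatch around a nominal value of 1.
units = [1.0 + rng.gauss(0.0, 0.01) for _ in range(N_UNITS)]
mean_unit = sum(units) / N_UNITS


def fixed_thermometer(code):
    """Always use the first `code` elements: the error depends only on the code."""
    return sum(units[:code])


def randomized(code):
    """Pick `code` elements at random each sample: the error varies per sample."""
    return sum(rng.sample(units, code))


code = 7
ideal = code * mean_unit                     # gain-corrected ideal output
fixed_error = fixed_thermometer(code) - ideal
dem_errors = [randomized(code) - ideal for _ in range(20000)]
mean_dem_error = sum(dem_errors) / len(dem_errors)
print(f"fixed selection error: {fixed_error:+.5f}")
print(f"DEM mean error       : {mean_dem_error:+.5f}")
```

With fixed selection the same error recurs every time the code appears, so it is correlated with the signal and shows up as distortion. With random selection the average error tends to zero and the mismatch is spread into a noise-like floor, which is consistent with the remark in (3) that calibration remains useful alongside DEM to lower that floor.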

6. CONCLUSIONS

The sources of dynamic nonlinearities in high speed and high resolution DACs as required for modern communications applications have been summarized. A case study of a DAC which uses a special track/attenuate stage to improve dynamic performance has been reviewed. The communications market ensures that the requirements placed on the DAC component will continue to increase in the coming years. Future trends which the author expects will be visible in this area over the next few years have been presented.

7. REFERENCES

[1] M.P. Tiilikainen, “A 1.8V 20mW 14b 100MS/s CMOS DAC”, Proceedings of the European Solid State Circuits Conference, June 2000.

[2] G. Van der Plas et al., “A 14-bit Intrinsic Accuracy Random Walk CMOS DAC”, IEEE Journal of Solid-State Circuits, vol. 34, pp. 1708-1718, Dec. 1999.

[3] C. Lin and K. Bult, “A 10bit 500Ms/s CMOS DAC in 0.6 mm²”, IEEE Journal of Solid-State Circuits, vol. 33, pp. 1948-1958, Dec. 1998.

[4] A.R. Bugeja et al., “A 14-b 100-MS/s CMOS DAC Designed for Spectral Performance”, IEEE Journal of Solid-State Circuits, vol. 34, pp. 1719-1732, Dec. 1999.

[5] A.R. Bugeja and B.-S. Song, “A Self-Trimming 14-b 100-MS/s CMOS DAC”, IEEE Journal of Solid-State Circuits, vol. 35, pp. 1841-1852, Dec. 2000.

[6] J. Bastos et al., “A 12bit Intrinsic Accuracy High Speed CMOS DAC”, IEEE Journal of Solid-State Circuits, vol. 33, pp. 1959-1969, Dec. 1998.

[7] A. Van den Bosch et al., “A 10bit 1Gsample/s Nyquist Current Steering CMOS D/A Converter”, Proceedings of the IEEE 2000 Custom Integrated Circuits Conference, pp. 265-268.

[8] B. Tesch and J. Garcia, “A Low Glitch 14bit 100MHz D/A Converter”, IEEE Journal of Solid-State Circuits, vol. 32, pp. 1465-1469, Sept. 1997.

[9] D. Mercer, “A 16bit D/A Converter with Increased Spurious Free Dynamic Range”, IEEE Journal of Solid-State Circuits, vol. 29, pp. 1180-1185, Oct. 1994.

[10] D. Mercer and L. Singer, “12bit 125Ms/s CMOS D/A Designed for Spectral Performance”, International Symposium on Low Power Electronics and Design, pp. 243-246, 1996.

[11] G. Van der Plas et al., “Systematic Design of a 14b 150MS/s CMOS Current Steering D/A Converter”, Proceedings of the 2000 Design Automation Conference, pp. 452-457.

[12] R. Hester et al., “CODEC for Echo-Canceling, Full-Rate ADSL Modems”, ISSCC Digest of Technical Papers, pp. 242-243, 1999.

[13] D. Groeneveld et al., “A Self-Calibration Technique for Monolithic High-Resolution D/A Converters”, IEEE Journal of Solid-State Circuits, vol. 24, pp. 1517-1522, Dec. 1989.

[14] M. Moyal et al., “A 25kft 768kb/s CMOS Transceiver for Multiple Bit-Rate DSL”, ISSCC Digest of Technical Papers, pp. 244-245, 1999.

[15] P. Hendriks, “Specifying Communications DACs”, IEEE Spectrum, vol. 34, pp. 58-69, July 1997.

[16] M. Pelgrom et al., “Matching Properties of MOS Transistors”, IEEE Journal of Solid-State Circuits, vol. 24, pp. 1433-1440, Oct. 1989.

[17] K. Khanoyan et al., “A 10b, 400MS/s Glitch-Free CMOS D/A Converter”, 1999 Symposium on VLSI Circuits, Digest of Technical Papers.

A 400-MHz, 10-bit Charge Domain CMOS D/A Converter for Low-Spurious Frequency Synthesis

K. Khanoyan, F. Behbahani, and A. A. Abidi

Electrical Engineering Department
University of California
Los Angeles, CA 90095-1594

Introduction

Modern integrated wireless transceivers increasingly use digital circuits in critical building blocks. One example is the direct digital synthesis of sinewaves in a frequency agile transmitter [1]. A discrete-time sinewave and its quadrature phase are generated as a sequence of digital words by table lookup in a ROM. The accumulation rate, programmed by a control word, on the left of the block diagram, sets the sinewave frequency. The frequency can be instantly changed to any arbitrary value. Two D/A Converters (DACs) convert the output words into discrete-time analog waveforms. These DACs must be high-speed, compact, and most importantly for communication systems, they must not suffer from dynamic nonlinearity. Such a DAC is the subject of this paper [2].

Sources of Dynamic Distortion

It is widely believed that the current-steering DAC is the only feasible circuit for operation at 100’s of MHz. This DAC is fast because the input data after being latched is merely required to steer an array of binary-weighted currents into a differential line. There the currents sum to form the analog output. A binary-weighted DAC (Figure 1 (a)) needs only N latches to convert an N-bit word. However, its DC linearity is limited by the accuracy of the Most Significant Current Source relative to the sum of all other current sources. Segmenting the current source array into units of independently switched least significant currents (Figure 1 (b)) greatly relaxes the accuracy required on the individual current source. This arrangement now needs latches to convert an N-bit word, and the binary input word must be expanded into a thermometer code to drive these latches. In practice, the explosion in the number of latches limits use of the segmented DAC to only a few bits. To satisfy DC accuracy most high resolution DACs will segment the upper few bits, and binary weight the remaining lower bits.

Dynamic accuracy is however another matter. The problem stems from the fact that a clock edge must latch the data word at every current switch. It is fundamentally impossible to switch an array of current sources simultaneously. In practice, because of distributed RC delay in the clock lines, the clock edge arrives at some current cells a few picoseconds later than at others (Figure 2(a)). As a result, momentarily the DAC output current over- or undershoots its final value. This current glitch is worst at the mid-scale transition. An actual glitch waveform is shown in Figure 2(b). What is worse is that the glitch dynamics are code dependent, and are largely unaffected by efforts to improve the DAC’s static accuracy. The waveform of the discrete-time sinewave synthesized by the DAC now contains code-dependent glitches, as shown in Figure 2(c). Departure from a linear settling constitutes dynamic distortion. When the synthesized sinewave frequency exceeds half Nyquist, the jumps between successive samples are larger and so are the glitches. As the clock rate rises, the glitch transient occupies a larger fraction of the clock period. Thus, the worst-case distortion arises when synthesizing sinewaves above half Nyquist at high clock rates. Figure 2(d) shows the spectrum of a commercial 10b DAC as it synthesizes a 65 MHz sinewave at 2/3 Nyquist. The glitch-produced harmonics are aliased in-band, and here the largest harmonic is only 45 dB below the fundamental tone. This is unacceptable for most wireless applications, which require SFDR of more than 60 dB.

Charge Domain D/A Conversion

An entirely different approach to D/A conversion is by charge redistribution [3,4] (Figure 3(a)). Depending on the data bit, a capacitor is pre-charged to either the full-scale or to zero, and on the next clock phase an equal capacitor bisects the charge by redistribution. A three-phase clock forces charge to flow from left to right. Progressing to the right, each capacitor is pre-charged by increasingly significant bits of the input word. Therefore, charge introduced on later stages is bisected fewer times. This means that charge arriving on the last stage represents the required binary D/A conversion. Because charge is naturally sampled and held at each stage before being passed to the next stage, this DAC is glitch free. Furthermore, the operation may be pipelined so that a new conversion completes every clock period. The simple circuit shown here is a binary weighted DAC. We reported a 100 MHz prototype of this DAC at ISSCC ’94 [4]. This DAC can be segmented by replacing the series pipeline with a parallel set of equal capacitors switched into a summing node.
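The precharge-and-bisect sequence can be captured in an idealized behavioral model. The sketch below is an assumption-laden illustration, not the circuit of Figure 3(a): it assumes perfectly equal capacitors, no parasitics, a discharged capacitor feeding the first cell, and it ignores the three-phase pipelining; `charge_domain_dac` is a hypothetical name.

```python
def charge_domain_dac(code: int, n_bits: int, v_ref: float = 1.0) -> float:
    """Idealized charge-redistribution conversion, least significant bit first.

    Each cell is precharged to v_ref or zero according to its bit and then
    shares charge with the equal capacitor holding the running result, so
    the two voltages average.
    """
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    v = 0.0                                  # assumed discharged starting capacitor
    for i in range(n_bits):                  # progress from LSB to MSB
        v_precharge = ((code >> i) & 1) * v_ref
        v = (v + v_precharge) / 2.0          # equal capacitors bisect the charge
    return v


print(charge_domain_dac(512, 10))   # MSB only: 0.5 of full scale
print(charge_domain_dac(1, 10))     # LSB only: 1/1024 of full scale
```

Because the MSB is bisected once and the LSB ten times, the final value equals `code / 2**n_bits` times the reference, which is the binary weighting described above.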

The charge output at the last, most-significant stage of this DAC must be buffered to voltage to drive the later circuits. An op amp-based switched capacitor amplifier is used for this purpose (Figure 3(b)). The D/A conversion capacitors are actually two quasi-differential arrays, which are differentially sampled by the balanced amplifier. This op amp is intended to drive an on-chip capacitor load. Over two of the three clock phases it acquires and holds the DAC output charge, and resets during the third clock phase. Reset after every sample means that the amplifier dynamics, and therefore dynamic distortion, remain the same whether the DAC produces a DC output or a high frequency sinewave. Uniform behaviour under all conditions is desirable.

Let us now turn to practical aspects in realizing 10b accuracy in a charge-redistribution DAC. Clearly, matching between unit capacitors limits achievable accuracy. Conversion involves precharge to one of two possible values, followed by charge bisection. Non-zero voltage coefficient on two matched capacitors means that charge bisects, but voltages do not (Figure 4(a)). However, voltage coefficient is unimportant in the core of the DAC because conversion takes place in the charge domain. When the converted charge is buffered as voltage at the output, voltage coefficient in the feedback capacitor of the op amp distorts the output voltage (Figure 4(b)). All capacitors in this circuit are poly over thin oxide over heavy diffusion. The voltage coefficient is 0.1 %/V. For –60dB THD, this limits the maximum voltage swing to 0.5V, which is now the full-scale output. Unit capacitors of 0.5 pF guarantee RMS spread of < 0.1% [5] and noise < ½ LSB at 10b.
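As a rough plausibility check on the noise figure, the thermal (kT/C) noise of a single 0.5 pF capacitor can be compared with half an LSB of the 0.5 V full scale. This is a sketch under stated assumptions: the text does not say which noise mechanism it refers to, so kT/C sampling noise at 300 K on one unit capacitor is assumed here, and the helper names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(c_farads: float, temp_kelvin: float = 300.0) -> float:
    """RMS thermal noise voltage sampled onto a capacitor (assumed mechanism)."""
    return math.sqrt(K_BOLTZMANN * temp_kelvin / c_farads)


def lsb_volts(full_scale_volts: float, n_bits: int) -> float:
    """Size of one LSB for a given full scale and resolution."""
    return full_scale_volts / 2 ** n_bits


noise = ktc_noise_vrms(0.5e-12)          # 0.5 pF unit capacitor from the text
half_lsb = lsb_volts(0.5, 10) / 2.0      # 0.5 V full scale, 10 bits
print(f"kT/C noise: {noise * 1e6:.0f} uV rms, half LSB: {half_lsb * 1e6:.0f} uV")
```

With these assumptions the noise comes to about 91 uV rms against a half LSB of about 244 uV, which is consistent with the claim, although the real circuit combines the noise of several capacitors and the op amp.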

Accuracy Issues in Charge Domain DAC

Let’s take a closer look at one cell in the DAC core (Figure 5 (a)). At every node there is a stray capacitance to ground. As long as the cell capacitance, including the stray, matches well at every node, DAC accuracy is preserved. However, strays between cells, shown by the inter-cell capacitor in the figure, are troublesome. For example, during the clock phase in which node n1 pre-charges to a reference voltage while node n2 bisects charge with the cell to its right, charge will leak through this stray into node n2 and corrupt the bisected charge. The resulting distortion worsens as the frequency of a synthesized sinewave approaches Nyquist. This stray originates mainly in the fringing capacitance across the inter-cell switch. The photomicrograph and graphic in Figure 5(b) show an unconventional layout of the switch FET to alleviate this. Metal contacts opposite halves of the source and drain diffusions, lowering fringing capacitance between the metal sidewalls to an estimated 0.2fF, which is less than 0.1 % of the unit DAC capacitor. To the first order, charge injection by the switches does not contribute distortion. This is because switches connected to the reference voltages always turn off at the same voltage, independent of the eventual sample value, and therefore inject the same signal-independent charge. The switch shorting adjacent cells, by contrast, connects to capacitors only, so whatever charge it injects through its inversion layer at turn on is almost all removed at turn off. There is a second-order error due to the fact that the switch FET’s source and drain voltages at onset of turn on are unequal, but at turn off are equal.

Compared to the previously reported prototype, this DAC improves on the precharge logic as well. Instead of using two pass gates in series (Figure 6(a)) to select the precharge reference voltage and then enable precharge in a particular clock phase, now the output of separate AND gates drives a single pass transistor (Figure 6(b)). This lowers the RC time constant at precharge to about 65 ps, guaranteeing 10τ settling at 400 MHz. It also eliminates a troublesome interline stray capacitance in the previous design which couples clocks into the DAC cell.
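The margin implied by 10τ settling can be checked with a single-pole model. The sketch below assumes ideal first-order exponential settling and uses hypothetical helper names; only the 65 ps time constant and the 10-bit resolution come from the text.

```python
import math


def settling_error(t_seconds: float, tau_seconds: float) -> float:
    """Residual fractional error of a single-pole response after time t."""
    return math.exp(-t_seconds / tau_seconds)


def time_constants_needed(n_bits: int) -> float:
    """Number of time constants needed to settle within half an LSB."""
    return math.log(2 ** (n_bits + 1))


TAU = 65e-12                              # precharge time constant from the text
error_10tau = settling_error(10 * TAU, TAU)
half_lsb_fraction = 0.5 / 2 ** 10         # half an LSB relative to full scale

print(f"10 tau = {10 * TAU * 1e12:.0f} ps, residual error = {error_10tau:.2e}")
print(f"half LSB at 10 b = {half_lsb_fraction:.2e}, "
      f"needs {time_constants_needed(10):.2f} tau")
```

In this model about 7.6 time constants already reach half an LSB at 10 bits, so 10τ (650 ps) leaves roughly a tenfold margin in residual error.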

The reference voltage to precharge the DAC cells is provided from off-chip. The parasitic LC network formed by the bondwire and package inductance and the cell capacitance is in fact underdamped by the very small ON resistance of the precharge switch. Simulations show that at 300 MHz clock rate the cell voltage can be in error by 5-10 LSBs because of ringing (Figure 7(a)). After considering several methods to damp the ringing, a 500-pF on-chip decoupling capacitor was found to work best at this high clock rate. In place of the external voltage source, this now delivers charge to the DAC unit cells. The reference voltages are differential, so the charge flowing through the decoupling capacitor circulates entirely on chip. The decoupling capacitor is built into vacant parts of the chip. Connecting the external reference through two bondwires and package pins also halves the series inductance. Simulations show that the settling error is now lower than 1 LSB at the end of the precharge phase (Figure 7(b)).

At these high conversion rates, clock skews are also a concern. The waveforms in Figure 8 (a) illustrate skew between the timing of the latch carrying data to the DAC and the three-phase clock, which samples this data into the DAC cell. One of the clock phases straddles a clock transition in the latch, which means that the DAC cells that precharge on this phase may be corrupt. As shown in Figure 8 (b), delaying the pipeline register clocks to synchronize them to the rising edge of the appropriate clock phase eliminates skew. The waveforms at the bottom show that this guarantees a safety margin between the conclusion of DAC cell precharge and the update in the latch contents.

In this application, the op amp used in the output buffer must be fast, and should not slew-rate limit; otherwise the discrete-time output waveform will distort. A standard single-cascode op amp is used (Figure 9 (a)). An adequate gate overdrive voltage on the input stage FETs ensures that the 0.5V ptp differential voltage applied to the input stage does not drive it into slew-rate limiting. The output swing of the DAC is also 0.5V ptp. The continuous-time common-mode feedback circuit (Figure 9 (b)) is designed to operate with this signal swing. The plot in Figure 9 (c) shows the decay in time of the op amp differential input voltage on a logarithmic axis. The dashed lines correspond to perfect exponential settling with different time constants during the two clock phases in which the amplifier acquires and holds the sample. The 50 dB DC gain determines the steady-state error. This graph shows that the op amp settling is close to a piecewise exponential, which means low dynamic distortion at the DAC output.

Experimental Results

The DAC is integrated in a CMOS process with linear MOS capacitors (Figure 10). The multiphase clock generation is on-chip. The total active area is 1.2 sq. mm. A digital frequency synthesizer to drive the DAC is also integrated on the same chip, although it is not shown in the photomicrograph. During testing, the chip is mounted in a standard large cavity ceramic package, which in turn is attached to a PC board with a large ground plane. The externally supplied reference voltages are capacitively decoupled to ground via low inductance connections.

Figure 11 shows measured spectra of sinewaves synthesized at the DAC output as it clocks at 250 MHz. A 12 MHz synthesized sinewave is accompanied by a harmonic 58 dB below the fundamental. When the synthesized frequency goes up to 112 MHz, the largest spurious tone rises by only 3 dB, to 55 dB below the fundamental. Figure 12 summarizes the measured spurious-free dynamic range (SFDR) as a function of synthesized frequency over the Nyquist band, at conversion rates ranging from 50 to 300 MS/s. At low clock rates and synthesized frequencies, the peak SFDR is 64 dB. Two trends are apparent in this plot. At any conversion rate, the SFDR declines by only 3 dB over the full Nyquist band. This is proof that the D/A conversion is glitch free. Also, the per-sample resetting action of the DAC output buffer guarantees uniformity in the output waveform independent of synthesized frequency. On the other hand, when the clock rate is raised from 50 to 300 MS/s, peak SFDR falls by 10 dB. This is most likely due to small departures from perfect exponential settling in the op amp, which at high conversion rates take up an increasing fraction of each clock cycle.

This DAC’s SFDR is compared with two other CMOS DACs. Figure 13 (a) compares SFDR at 300 MS/s rate with a 12b current steering DAC [6]. At low synthesized frequencies the comparison DAC shows superior SFDR, although it falls quickly with sinewave frequency. Our DAC’s SFDR is better for synthesized frequencies greater than 15 MHz. Figure 13 (b) compares the performance with a 10 b DAC clocked at 250 MS/s [7,8]. The comparison DAC, also current steering, was more carefully segmented and laid out for good dynamic performance. In this case our DAC shows higher SFDR for sinewave frequencies beyond 50 MHz. Figure 13(c) plots the same comparison DAC’s worst-case SFDR over Nyquist versus clock rate against our DAC’s. Beyond 100 MS/s, our DAC shows higher SFDR.

Summary

This paper describes a 10b DAC implemented in CMOS, which converts at rates up to 400 MS/s. The DAC and associated circuits occupy 1.2 sq. mm of active area. DNL is less than 0.25 LSB and INL less than 0.35 LSB. The DAC consumes 95 mW total from 3.3V, of which 25 mW is in the buffer op amp. This DAC’s unique feature is its relatively flat SFDR over the full Nyquist range of synthesized frequencies. At conversion rates beyond 100 MHz, op amp dynamics limit peak SFDR.

This work shows that for communication applications sensitive to SFDR, D/A conversion in the charge domain is an important alternative to conventional conversion in the current domain.

References

[1] A. Rofougaran, G. Chang, J. J. Rael, J. Y.-C. Chang, M. Rofougaran, P. J. Chang, M. Djafari, M. K. Ku, E. Roth, A. A. Abidi, and H. Samueli, “A Single-Chip 900 MHz Spread-Spectrum Wireless Transceiver in CMOS (Part I: Architecture and Transmitter Design),” IEEE J. of Solid-State Circuits, vol. 33, no. 4, pp. 515-534, 1998.

[2] K. Khanoyan, F. Behbahani, and A. A. Abidi, “A 10 b, 400 MS/s glitch-free CMOS D/A converter,” in Symp. on VLSI Circuits, Kyoto, Japan, pp. 73-76, 1999.

[3] F.-J. Wang, G. C. Temes, and S. Law, “A Quasi-Passive CMOS Pipeline D/A Converter,” IEEE J. of Solid-State Circuits, vol. 24, no. 6, pp. 1752-1756, 1989.

[4] G. Chang, A. Rofougaran, M. K. Ku, A. A. Abidi, and H. Samueli, “A Low-Power CMOS Digitally Synthesized 0-13 MHz Agile Sinewave Generator,” in Int’l Solid-State Circuits Conf., San Francisco, pp. 32-33, 1994.

[5] M. J. McNutt, S. LeMarquis, and J. L. Dunkley, “Systematic Capacitance Matching Errors and Corrective Layout Procedures,” IEEE J. of Solid-State Circuits, vol. 29, no. 5, pp. 611-616, 1994.

[6] A. Marques, J. Bastos, A. Van den Bosch, J. Vandenbussche, M. Steyaert, and W. Sansen, “A 12 b Accuracy 300 Msample/s Update Rate CMOS DAC,” in Int’l Solid-State Circuits Conf., San Francisco, CA, pp. 216-217, 1998.

[7] C.-H. Lin and K. Bult, “A 10b, 250 MS/s CMOS DAC in ,” in Int’l Solid-State Circuits Conf., San Francisco, CA, pp. 214-215, 1998.

[8] C.-H. Lin, A 10b 500MSamples/s CMOS DAC in , PhD Thesis in Electrical Engineering. University of California, Los Angeles: 1998.

Design Considerations for RF Power Amplifiers demonstrated through a GSM/EDGE Power Amplifier Module

Peter Baltus and André van Bezooijen
Philips Semiconductors

Gerstweg 2
6534 AE Nijmegen

Abstract

This paper describes the design considerations for RF power amplifiers in general, including trends in systems, linearity and efficiency, the PA environment, implementation issues and technology.

As an example a triple-band (900/1800/1900MHz) dual mode (GSM/Edge) power amplifier module is described in this article. The RF transistors and biasing circuitry are implemented in silicon bipolar technology. A multi-layer LTCC substrate is used as carrier.

1. Introduction

Currently, many cellular systems are in use in different regions of the world, and in many places more than one system is in use simultaneously. In Europe and Asia the dominant system is currently GSM, in the US it is IS95, but also AMPS and GSM-like systems co-exist, and in Japan PHS, PDC and IS95 co-exist.

The handsets for these systems use a low-power transceiver to communicate with a network of base stations using radio transmissions in the frequency range of 800MHz to 2500MHz, and transmit power levels in the range of 10mW to 2W. Table 1 below shows an overview of important cellular systems.

New so-called third generation (3G) systems are being introduced which will allow for higher capacity (more users and higher user bit rates), and better compatibility across the world. Several of these systems are covered by the IMT2000 standard and include W-CDMA/UMTS (Japan and Europe) and CDMA2000 (USA).

1.1 Trends in cellular systems

In most countries, there will be a gradual change from existing second generation (2G) systems to future 3G systems, with both systems coexisting for at least a few years. For this reason, and because no single standard will achieve world-wide coverage in the near future, handsets that can connect to multiple systems will be required. Such handsets will provide access to the features of the 3G networks where available, while still providing access to 2G networks where this is not the case. This implies that there will be a need for multi-mode and multi-band power amplifiers as well.

1.2 Challenges for PA design

The desired functionality for these RF power amplifiers is easily described and modeled: it should accurately amplify an incoming RF signal by a fixed (or programmable) gain:

$$P_{out} = G \cdot P_{in}$$

with $P_{out}$ the output power, $P_{in}$ the input power, and $G$ the power gain.
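In practice the gain relation is usually applied in logarithmic units, where the product of gain and input power becomes a sum. The sketch below is a small illustration with hypothetical helper names; the 10 mW and 2 W values are the handset power range quoted in the introduction, and the 0 dBm drive level is the typical transceiver output mentioned in section 3.

```python
import math


def watts_to_dbm(power_watts: float) -> float:
    """Convert power in watts to dBm (decibels relative to 1 mW)."""
    return 10.0 * math.log10(power_watts / 1e-3)


def output_power_dbm(input_dbm: float, gain_db: float) -> float:
    """The linear gain relation expressed in dB: the gain simply adds."""
    return input_dbm + gain_db


low = watts_to_dbm(10e-3)    # lower end of the handset range
high = watts_to_dbm(2.0)     # upper end of the handset range
print(f"10 mW = {low:.1f} dBm, 2 W = {high:.1f} dBm")

# Gain needed to reach 2 W from an assumed 0 dBm drive level.
required_gain_db = high - 0.0
print(f"required gain: {required_gain_db:.1f} dB")
```

The handset range of 10 mW to 2 W corresponds to about 10 dBm to 33 dBm, so a PA driven at 0 dBm needs roughly 33 dB of power gain at full output.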

The simplicity of this desired functionality is apparent from many modern implementations, which consist of relatively few active devices (starting at 2 transistors).

Therefore, it might seem that the design of such a simple function requires little effort and deserves little attention. As with many other RF circuits in cellular phones, the justification for all the effort that goes into their design is derived from the combination of:

• the importance of these circuits to the overall performance of the handset
• the many specifications that need to be achieved simultaneously

The power amplifier is important to the overall performance of the handset since it typically consumes the largest part of the power in a handset when active, and is therefore the most important factor in the talk time of a handset. For that reason, power efficiency is a very important specification of a PA.

This efficiency has to be achieved while still meeting the many specifications required to have the handset work well within the system:

• Linearity is becoming an important issue especially in newer systems that use advanced modulation schemes to achieve better bandwidth efficiency
• Robustness is important since the handset is part of a rather variable environment, in which power supply voltage, load impedance, temperature, transmit frequency, and output power can vary quickly and sometimes over large ranges. Since the optimization of efficiency often results in voltages and currents close to the reliability limits of the technology, significant changes in any of these parameters can result in performance degradation or even complete failure of the device. Conversely, preventing such robustness problems often results in designs with voltages and currents that cannot be optimized for efficiency.
• Stability, especially under load mismatch conditions. Such conditions can for example arise when the antenna environment changes.
• Noise, especially in the receive band of the system, since this affects the sensitivity of receivers in the system
• Spurious emissions, which can interfere with other electronic equipment or with transmissions from other handsets or from basestations in the same system
• Thermal behavior, including performance impact of temperature changes on the handset, and impact on reliability
• Multi-mode multi-band functionality, which requires adjustable properties of the power amplifier, and/or switches and adjustable circuits around a number of individual power amplifiers. This added complexity affects in turn the other specifications such as gain, linearity, output power, etc.

This paper will give an overview of these issues. In the next section, the relation between bandwidth efficiency, power amplifier linearity and power efficiency will be discussed at system and circuit level. Section 3 describes the environment of the power amplifier, which is the basis for relating the systems considerations of the previous section (2) to the PA issues in the next section (4).

Implementing the power amplifier is discussed in section 5, using a Si/LTCC integrated GSM/EDGE power amplifier module to demonstrate the relevant issues.

2. Bandwidth Efficiency, Power Efficiency and PA Linearity

In the past more capacity could be found by moving up in frequency with improvements in device technology (recently into the 2GHz range). This trend is not likely to continue in the future because the link budget is unfavorable for such higher frequencies (fig. 1). Instead, the increased capacity is achieved by more efficient use of available bandwidth through advanced modulation schemes and access methods.

An important consequence for power amplifiers is that these efficient modulation schemes (such as QPSK, QAM, etc) are not constant envelope anymore, and therefore require more linearity from the power amplifier.

It is not likely that customers will accept significant reductions in talk time or larger and more expensive batteries and handsets. Therefore, the efficiency of the PA cannot be compromised too much by the new linearity requirements.

At the system level these amplifier non-linearities result in so-called spectral re-growth: in-band energy is transformed into out-of-band energy that might disturb reception in adjacent frequency channels.

Figure 2 and Figure 3 show the effect of non-linearity (in this case hard-limiting) on vector diagrams and transmission spectra (spectral re-growth) of advanced modulation schemes (in this case UMTS).

Spectral re-growth is quantified through a parameter called adjacent channel power rejection or ACPR. This parameter defines the amount of energy in the adjacent channel relative to the energy of the transmitted signal.

Non-linearities also have an impact on the wanted signal in the sense that amplitude and phase information, modulated onto the carrier, are disturbed and therefore demodulation on the receiving side can result in incorrect data at base-band. This data is often visualised as a set of amplitude normalised discrete I and Q values that represent the symbols being transmitted. Due to distortion the I and Q values of each symbol shift. The Error Vector Magnitude (EVM) is used to quantify this shift.
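The shift of the I and Q values can be turned into a number with a few lines of code. The sketch below uses one common definition, the RMS error vector normalised to the RMS reference amplitude; individual standards define the normalisation and averaging differently, so this is an assumption, as are the function name and the four-point test constellation.

```python
import cmath
import math


def evm_rms_percent(measured: list[complex], reference: list[complex]) -> float:
    """RMS error vector magnitude in percent, relative to RMS reference power."""
    if len(measured) != len(reference) or not reference:
        raise ValueError("need two non-empty symbol lists of equal length")
    error_power = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    reference_power = sum(abs(r) ** 2 for r in reference)
    return 100.0 * math.sqrt(error_power / reference_power)


# Hypothetical four-point constellation, I + jQ per symbol.
ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]

# Pure amplitude compression of 5 % (an AM-to-AM effect).
compressed = [0.95 * s for s in ideal]
# Pure phase rotation of 2 degrees (an AM-to-PM effect).
rotated = [s * cmath.exp(1j * math.radians(2.0)) for s in ideal]

print(f"EVM, 5 % compression: {evm_rms_percent(compressed, ideal):.2f} %")
print(f"EVM, 2 degree rotation: {evm_rms_percent(rotated, ideal):.2f} %")
```

A 5 % amplitude error gives 5 % EVM and a 2 degree phase error about 3.5 %, which illustrates that amplitude and phase distortion both feed the same system-level figure.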

ACPR and EVM are determined by the amplifier non-linearities in combination with the signal. For the various systems both the requirements on ACPR and EVM as well as the properties of the signal (Pout, peak-to-average, power density distribution, ..) are different. This makes comparison of linearity requirements for the various standards difficult.

On the other hand, AM-to-AM and AM-to-PM conversion are inherent properties of the circuit. For a given protocol ACPR and EVM are related to the combination of AM-to-AM and AM-to-PM. Therefore a maximum allowable AM-to-AM requirement cannot be defined independent of the maximum allowable AM-to-PM and vice versa. In practice a whole set of AM-to-AM and AM-to-PM combinations can be found that fulfil the ACPR and EVM requirements, and a much larger set that does not. Consequently, PA circuit optimisation can best be done by optimising for the system parameters ACPR and EVM rather than for AM-to-AM and AM-to-PM [2].

A single stage bipolar amplifier has several causes of non-linearity. We can distinguish contributions due to voltage saturation at the collector, transistor current density variations, supply voltage variations at the transistor base and supply voltage variations at the collector.

2.1 Voltage saturation

Power amplifiers are optimised for power added efficiency and linearity. The collector load impedance at the fundamental frequency and harmonics is chosen such that at the maximum required output power the complete signal voltage headroom at the collector is used. Consequently, the transistor is driven into saturation as much as possible, up to the point where saturation becomes too severe. This trade-off is limited by the emitter ballast resistors needed for thermal stability and by the transistor collector resistance and quasi saturation behaviour related to that.

2.2 Current density variations

Although power amplifiers for applications like cdmaOne, Edge, W-CDMA etc. are often referred to as linear amplifiers, their behaviour is non-linear. Due to class A/B operation the current through an RF transistor, biased typically at 50mA, can increase up to an average of 500mA at maximum output power. Under influence of the carrier envelope the transistor operating point changes drastically. As a result the transistor input impedance varies with the carrier envelope [3]. The impedance match with the source is power dependent and thus varies with the envelope. This effect can be used advantageously. At power levels close to saturation the amplitude of the gain tends to drop. When the source impedance match is optimal around this power level and less optimal at lower power levels, the gain can be flattened out over a wider power level range [4].

2.3 Supply voltage variations

Non-linearity of the RF-transistor results in low frequency components in the collector and base current. These low frequency components are related to the data modulated onto the carrier. For Edge the modulation bandwidth is approximately 100kHz. Any resistance in the power supply of the collector will result in low frequency supply voltage variations at the collector. At high power levels this drives the RF-transistor further into saturation. Therefore, proper LF-decoupling of the collector supply voltage is necessary for achieving maximum linearity.

In the base of the RF-transistor low frequency current components are present that are beta times smaller than in the collector. Any resistance in the voltage supply of the base (output resistance of the biasing circuit) results in low frequency supply variations at the base. The voltage drop due to the output resistance of the biasing circuit changes the operating point of the RF-transistor, which results in additional distortion. In a TDMA system, like GSM, the amplifier is switched on and off in bursts by switching the biasing circuitry. Consequently, LF-decoupling of the base cannot be applied because it would disturb the amplifier turn-on and turn-off behaviour.

Together, this behavior results in a trade-off of linearity and efficiency as shown in the figure below (Figure 5):

3. Environment

The relation between the system requirements and the PA requirements is determined by the environment of the PA. The environment of the PA typically consists of:

• a transceiver IC at the input. This transceiver IC generates the RF transmitter signal at a low power level, often around 0dBm. A filter is often placed between the transceiver IC output and the PA input to eliminate noise from the transmitter IC outside the transmission band.
• antenna interface circuits that can include matching circuits, isolator, duplexer, switches, and diplexers to connect the PA to one or more antennas
• a control IC that sets gain, biasing, and/or output power levels through a number of control pins
• a power supply, which is often directly coming from the battery, but can also be provided through a DC/DC converter

The duplexer is responsible for connecting the antenna to both the transmitter and receiver in such a way that the energy from the transmitter is sent to the antenna only (and not the receiver), whereas the energy received by the antenna is sent to the receiver only (and not the transmitter). Depending on the access and duplex methods of the system, the duplexer can be implemented either as a traditional duplexer filter and/or through switches. The diplexer connects transceivers for different systems and/or bands to the antenna. Again, depending on the properties of the systems, this can be implemented through filters and/or switches.

A typical PA environment is shown in the figure below (Fig. 6):

The isolator serves to protect the power amplifier from impedance mismatches at the duplexer input. This is not always necessary, e.g. for GSM type systems this component is typically left out. By presenting the PA a fixed and nominal load impedance independent of the actual input impedance of the duplexer, the design of the PA can be further optimized since the influence of the load impedance on linearity, reliability and stability does not have to be taken into account.

To give a first impression of the type of load change that can be expected from an antenna through changes in the environment, the figure below (Figure 7) shows simulation results of a dipole with and without a conducting body at 2cm distance, not untypical for a handset antenna near the head or in free space.

As shown by the simulation results, the impedance change of the antenna is quite dramatic and results in large changes of the return loss, e.g. from –10dB to –2dB around 1.37GHz. This results in a reduction of transmitted power from 90% to 37%. Considering all the effort spent on optimizing the efficiency of the PA, these numbers are very significant. Measurements on various antennas show that these numbers do occur in practice as well, and in some cases can be even worse.
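The step from return loss to transmitted power follows from the fact that the reflected power fraction is ten to the power of the return loss in dB divided by ten. A minimal sketch, assuming a lossless mismatch so that everything not reflected is transmitted; the function name is hypothetical and the sign convention (negative dB values) follows the text.

```python
def transmitted_fraction(return_loss_db: float) -> float:
    """Fraction of incident power delivered to the load.

    Accepts the return loss as written in the text (e.g. -10 dB); the sign
    is ignored so that +10 dB gives the same result.
    """
    reflected_fraction = 10.0 ** (-abs(return_loss_db) / 10.0)
    return 1.0 - reflected_fraction


for rl_db in (-10.0, -2.0):
    print(f"return loss {rl_db:.0f} dB -> "
          f"{100.0 * transmitted_fraction(rl_db):.0f} % transmitted")
```

This reproduces the 90 % and 37 % quoted above.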

Since efficiency is such an important parameter, it is very useful to find out where power is lost in the total system. The figure below shows a typical situation for a GSM PA in a multi-mode system. The numbers represent the power consumption in Watt.

From this figure, it becomes clear that there is a very large power loss between power drained from the battery and power ultimately delivered to the electromagnetic field: the overall efficiency in this not so unrealistic scenario is 8%, and is composed of the following major items:

• PA proper: 58% efficiency
• Antenna interface (matching, duplexer): 43% efficiency
• Antenna: 37% efficiency

Considering that the theoretical efficiency of an ideal class A/B amplifier is 78%, it is obvious that the potential for improving overall efficiency by improving the PA proper (e.g. by going to more expensive active devices) is limited. Instead, passive devices and the antenna are more obvious candidates for overall efficiency improvement.
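The argument can be made concrete by multiplying the stage efficiencies and asking which improvement pays off most. The sketch below is illustrative: the three stage values are the major items listed above, the 78% limit is taken as pi/4 (the ideal class B figure), and the improved antenna efficiency of 90% is a purely hypothetical target. The product of the three major items alone is about 9%, slightly above the quoted 8%, presumably because the figure also contains smaller losses not listed here.

```python
import math

# Major loss items quoted in the text.
stages = {"PA proper": 0.58, "antenna interface": 0.43, "antenna": 0.37}

overall = math.prod(stages.values())

# What-if 1: replace the PA by an ideal one at the theoretical limit (pi/4).
with_ideal_pa = math.prod({**stages, "PA proper": math.pi / 4}.values())

# What-if 2: improve only the antenna to a hypothetical 90 %.
with_better_antenna = math.prod({**stages, "antenna": 0.90}.values())

print(f"major items only:      {overall:.1%}")
print(f"ideal PA:              {with_ideal_pa:.1%}")
print(f"antenna at 90 % (hyp): {with_better_antenna:.1%}")
```

Even a theoretically perfect PA lifts the chain to only about 12%, whereas a well-matched antenna alone would more than double the overall efficiency, which supports the conclusion drawn in the text.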

4. Power amplifier

After taking into account the environment of the PA, what remains is a number of issues and specifications that need to be achieved in the PA itself, through careful choices in the partitioning, implementation and technologies.

It is rather common to implement GSM power amplifiers as hybrids. This allows for usage of the best combinations of active and passive technologies in order to be able to meet severe specifications on reliability, ruggedness, stability, power added efficiency, size and cost. Moreover, a hybrid power amplifier solution is attractive because, due to the matching networks at input and output and the on-module power supply decouplings, the amplifier function is well defined and therefore easily applicable.

Reliability (life-time) of a GSM amplifier is mainly related to the maximum temperatures that occur. Especially for the recently defined class 12 operation, with an on/off duty cycle of 50%, the solder between PCB and hybrid module, the glue to attach the die on the LTCC substrate and the Aluminium interconnect of the die might approach critical temperature values.

Moreover, the amplifier has to survive very severe conditions that might happen occasionally. For instance, the amplifier should not be damaged when the antenna is being disconnected while the battery is being charged and collector voltages up to 20V may occur. This poses rather severe requirements on the collector-base breakdown voltage.

In a GSM handset the power amplifier dissipates a significant amount of power and thus determines the standby and talk time to a great extent. In particular the final RF-transistor geometry and the output matching network have to be designed for maximum power added efficiency [1]. The output matching network provides an optimum collector load impedance for the fundamental frequency as well as for the harmonics. It is realised by means of high-Q microstrip lines, integrated on LTCC, and high-Q SMD capacitors in order to minimise insertion losses.

For a dual mode GSM/Edge power amplifier additional requirements with respect to linearity have to be met. There is not much design freedom for optimising the linearity of a GSM/Edge amplifier when typical GSM specifications have to be met anyway. In this example, the biasing of the RF transistor in GSM mode has been made independent of the biasing in Edge mode. Optimum linearity is achieved by optimising the DC operating point of each of the three cascaded RF-transistors.

5. Implementation

In this section we will discuss implementation details of the GSM/EDGE PA. This PA is relevant for a number of reasons:

• This combination of systems in a single handset is likely to become popular
• It is an optimised combination of a saturated, strongly non-linear PA for GSM mode and a linear PA for EDGE mode, with integrated mode switching
• It is typical for many of the multi-mode multi-band power amplifiers that will be needed in the transition period between 2G and 3G systems.

The Edge protocol has been adopted as an evolutionary path for enhanced data-rates and increased capacity in GSM. Edge is compatible with GSM in the sense that it operates in the same frequency bands and that it makes use of the same channel bandwidth and channel spacing. The data-rate, however, has been made a factor 3 higher by applying offset 8-PSK (non-constant envelope) modulation and appropriate modulation filtering.

The amplifier module consists of two fully independent RF line-ups (see Figure 9). Each line-up consists of a 50 ohm input matching circuit, three cascaded RF transistors with interstage matching circuits in between and a 50 ohm output matching circuit. The module can operate either in GSM-mode or Edge-mode by activating the GSM biasing circuits or Edge biasing circuits respectively.

In GSM mode the output power can be controlled with Vcntrl. In Edge mode, however, the output power is determined by the input power. The gain of the amplifier is constant. The biasing circuits of Edge-mode are activated by applying a stabilised voltage Vstab.

As shown in the block diagram, the module contains an output power detector for 900MHz and for 1800/1900MHz. These outputs can be used to close a power control loop for smooth up and down ramping of the power. To study and optimise the linearity in Edge-mode the bias current of the 2nd and 3rd RF-transistor can be enhanced by applying a current Iref2 and Iref3 respectively.


Figure 10 shows a photograph of the triple-band GSM/Edge power amplifier. The 900 MHz line-up is visible at the left-hand side and the wide-band 1800/1900 MHz line-up at the right-hand side. The die at the bottom forms the driver IC for the final stage, which is positioned at the top. 0402 SMDs are used for decoupling of the power supply lines feeding the RF transistors and biasing circuits. Input, interstage and output matching networks are built up with discrete capacitors and microstrip line inductors on top of the ceramic substrate. In order to reduce DC and/or RF losses, relatively wide traces are used for the RF choke that feeds the final stage, as well as for the output matching microstrip lines.

In a final product the module is encapsulated with a 0.25 mm thin plastic cap. The module size is 11 x 13.75 x 1.8 mm.


5.1 Output matching network

At the top left of the module the RF choke that feeds the 900 MHz final stage is visible. The choke is RF decoupled at the supply side and made resonant using a capacitor located close to the collector bondwires. The output matching network, located just to the right of the RF choke, consists of several sections that transform the 50 ohm load, in several steps, to an optimum collector impedance of about 2 Ohm at the fundamental frequency and 0.5 + 10j Ohm at the second harmonic. Rejection of the second and third harmonic is obtained by series resonance of the matching capacitors with their series self-inductance plus via inductance, which gives notches in the transfer function. In simulations a typical insertion loss of 0.8 dB can be obtained. The attenuation at the second and third harmonic is typically 25 dB and 35 dB respectively.
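The harmonic notches described above follow from the usual series-resonance condition f = 1/(2π√(LC)). The sketch below is only an illustration: the 0.7 nH parasitic inductance and the function name `notch_capacitance` are hypothetical values chosen for the example, not data from the module.

```python
import math

def notch_capacitance(f_notch_hz, l_series_h):
    """Capacitance that series-resonates with l_series_h at f_notch_hz."""
    omega = 2.0 * math.pi * f_notch_hz
    return 1.0 / (omega ** 2 * l_series_h)

def resonance_frequency(c_farads, l_series_h):
    """Series-resonance frequency of a capacitor with its parasitic inductance."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_series_h * c_farads))

# Hypothetical numbers: 900 MHz fundamental, 0.7 nH of capacitor
# self-inductance plus via inductance (assumed, not from the text).
f0 = 900e6
l_parasitic = 0.7e-9

c_second = notch_capacitance(2 * f0, l_parasitic)  # notch at 1.8 GHz
c_third = notch_capacitance(3 * f0, l_parasitic)   # notch at 2.7 GHz

print(f"C for 2nd-harmonic notch: {c_second * 1e12:.2f} pF")
print(f"C for 3rd-harmonic notch: {c_third * 1e12:.2f} pF")
```

For a fixed parasitic inductance, the capacitor that places the notch at the third harmonic is smaller than the second-harmonic one by a factor of (3/2)² = 2.25.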

5.2 Thermal design

Under nominal operating conditions the amplifier module dissipates, during the power burst, approximately 3.5 W when the amplifier output power is at a maximum of approx. 3.5 W. Under antenna mismatch conditions combined with high battery supply voltage, the power dissipation can be even twice that value. Thermal stability is ensured by applying emitter ballast resistors.

The heat, mainly generated in the emitters of the final stage, is spread by the 200 um thick silicon die and flows through the glue toward the die attach area on top of the LTCC. Internal LTCC layers are used to partly spread out the heat horizontally. The heat flows further through the copper-filled vias of the LTCC substrate towards the PCB, which contains several layers of copper to further spread out the heat into the telephone set. The module is designed for a thermal resistance of less than 30 K/W in order to keep the maximum die temperature below 125°C, for a maximum mounting base temperature of 85°C and a power dissipation of 7 W in the pulse with an on/off duty cycle of 25%.

5.3 Biasing circuit topology

Figure 11 shows the circuit topology for biasing the RF transistor in Edge mode. The current drives, via the PNP mirror T60/T61, the NPN current mirror formed by T62 and T1. Emitter degeneration resistances R60/R61 are added to increase the output resistance of the PNP current mirror T60/T61 in order to reduce the supply voltage dependency and to improve matching of this current mirror [5]. T63 improves the accuracy of the NPN current mirror factor with its current multiplication factor of beta.

The resistors R63 and R62 are added to make the topology of the left-hand side of the NPN mirror equal to the topology of the right-hand side, where R16 provides RF isolation between T1 and the biasing circuit and R1 is used to degenerate T1. Summing voltages around the loop including T62 and T1 we find

Making the assumption that we find that

Since defined by emitter area ratios, the solution to (2) is

which is achieved by and

As a result the last term in (2) goes to zero, which makes the biasing of T1 almost temperature independent.


To guarantee stability of the circuit, a resistor Rdamp is added to give resistive loading at the high-ohmic point at RF frequencies. The dotted transistor T13 is part of the GSM biasing circuit and is not active in Edge mode.

Conclusions

Power amplifiers are very relevant components of a handset transmitter, since they consume a large part of the total power dissipation. Moreover, the linearity requirements in newer systems are difficult to combine with high efficiency. The total performance depends for a larger part on the passive components and antenna than on the active part. The difficulty in designing a good PA is in achieving good efficiency while meeting many other specifications (stability, reliability, linearity, gain, power, etc.) simultaneously.

The GSM/Edge power amplifier module used as an example throughout this paper illustrates that multi-mode multi-band power amplifiers can be realised well in Si-bipolar technology combined with a multi-layer LTCC substrate.


Acknowledgements

This paper is based on insights built up in several teams throughout Philips, including the Philips Semiconductor PA development teams in Sagamihara (Japan), Mansfield (U.S.A.) and Nijmegen (The Netherlands), as well as the Philips Research Integrated Transceiver group in Eindhoven (The Netherlands).

The GSM/Edge power amplifier module could only be realised with the help of enthusiastic team members. Being aware that their contributions to the success of this project were essential, I would like to thank Dima Prikhodko for his work on IC circuit development, Skule Pramm and Gerd Kahmen for designing the substrate and optimising the module, and Christophe Chanlo for simulating ACPR and EVM. Last but not least I would like to thank Reza Mahmoudi for the enlightening discussions we had and for developing dedicated ACPR and EVM simulation tools.

References:

[1] F. van Rijs, R. Dekker, H.A. Visser, H.G.A. Huizing, D. Hartskeerl, P.H.C. Magnee, R. Dondero, “Influence of output impedance on power added efficiency of Si-bipolar power transistor”, International Microwave Symposium Digest, Volume 3, June 11-16, 2000.

[2] Private communication with Reza Mahmoudi.

[3] Keng Leong Fong and Robert G. Meyer, “High-Frequency Nonlinearity Analysis of Common-Emitter and Differential-Pair Transconductance Stages”, IEEE Journal of Solid-State Circuits, vol. 33, no. 4, April 1998.

[4] R. Mahmoudi, “Multi-Disciplinary design method for 2.5 generation of mobile communication systems”, to be published in September 2001, Twente University Press.

[5] Paul R. Gray and Robert G. Meyer, “Analysis and Design of Analog Integrated Circuits”, second edition, John Wiley & Sons, 1984.

CLASS-E HIGH-EFFICIENCY RF/MICROWAVE POWER AMPLIFIERS: PRINCIPLES OF OPERATION, DESIGN PROCEDURES, AND EXPERIMENTAL VERIFICATION

Nathan O. Sokal, IEEE Life Fellow
Design Automation, Inc.
4 Tyler Road
Lexington, MA 02420-2404
U. S. A.

ABSTRACT

Class-E power amplifiers [1]-[6] achieve significantly higher efficiency than conventional Class-B or -C amplifiers. Class E operates the transistor as an on/off switch and shapes the voltage and current waveforms to prevent simultaneous high voltage and high current in the transistor; that minimizes the power dissipation, especially during the switching transitions. In the published low-order Class-E circuit, a transistor performs well at frequencies up to about 70% of its frequency of good Class-B operation (an unpublished higher-order Class-E circuit operates well up to about double that frequency). This paper covers circuit operation, improved-accuracy explicit design equations for the published low-order Class-E circuit, optimization principles, experimental results, tuning procedures, and gate/base driver circuits. Previously published analytically derived design equations did not include the dependence of output power (P) on load-network loaded Q (Q_L); as a result, the output power was 38% to 10% less than expected, for Q_L values in the usual range of 1.8 to 5. This paper includes an accurate new equation for P that includes the effect of Q_L.


1. "WHAT CAN CLASS E DO FOR ME?"

Typically, Class-E amplifiers [1]-[6] can operate with power losses smaller by a factor of about 2.3, as compared with conventional Class-B or -C amplifiers using the same transistor at the same frequency and output power. For example, a Class-B or -C power stage operating at 65% collector or drain efficiency (losses = 35% of input power) would have an efficiency of about 85% (losses = 15% of input power) if changed to Class E (35%/15% = 2.3). Class-E amplifiers can be designed for narrow-band operation or for fixed-tuned operation over frequency bands as wide as 1.8:1, such as 225-400 MHz. (If harmonic outputs must be well below the carrier power, any amplifier other than Class A or push-pull Class AB cannot operate over a band wider than about 1.8:1 with only one fixed-tuned harmonic-suppression filter.) Harmonic output of Class-E amplifiers is similar to that of Class-B amplifiers. Another benefit of using Class E is that the amplifier is a priori designable; explicit design equations are given here. The effects of component and frequency variations are defined a priori [4, Figs. 5 and 6] and [7], and are small. When the amplifier is built as designed, it works as expected, without need for "tweaking" or "fiddling."

2. PHYSICAL PRINCIPLES FOR ACHIEVING HIGH EFFICIENCY

Efficiency is maximized by minimizing power dissipation, while providing a desired output power. In most RF and microwave power amplifiers, the largest power dissipation is in the power transistor: the product of transistor voltage and transistor current at each point in time during the RF period, integrated and averaged over the RF period. Although the transistor must sustain high voltage during part of the RF period, and must also conduct high current during part of the RF period, the circuit can be arranged so that high voltage and high current do not exist at the same time. Then the product of transistor voltage and current will be low at all times during the RF period. Fig. 1 shows conceptual "target" waveforms of transistor voltage and current that meet the high-efficiency requirements. The transistor is operated as a switch. The voltage-current product is low throughout the RF period because:

1. "On" state: The voltage is nearly zero when high current is flowing, i.e., the transistor acts as a low-resistance "on" switch during the "on" part of the RF period.
2. "Off" state: The current is zero when there is high voltage, i.e., the transistor acts as an "off" switch during the "off" part of the RF period.

Switching transitions: Although the designer makes the on/off switching transitions as fast as feasible, a high-efficiency technique must accommodate the transistor's practical limitation for RF and microwave applications: the transistor switching times will, unavoidably, be appreciable fractions of the RF period. We avoid a high voltage-current product during the switching transitions, even though the switching times can be appreciable fractions of the RF period, by the following two strategies:

3. The rise of transistor voltage is delayed until after the current has reduced to zero.
4. The transistor voltage returns to zero before the current begins to rise.

The timing requirements of 3 and 4 are fulfilled by a suitable load network (the network between the transistor and the load that receives the RF power), to be examined shortly. Two additional waveform features reduce power dissipation:

5. The transistor voltage at turn-on time is nominally zero (or is the saturation offset voltage Vo for a bipolar junction transistor, hereafter "BJT"). Then the turning-on transistor does not discharge a charged shunt capacitance (C1 of Fig. 2), thus avoiding dissipating the capacitor's stored energy of C1·V²/2, f times per second, where V is the capacitor's initial voltage at transistor turn-on and f is the operating frequency. (C1 comprises the transistor output capacitance and any external capacitance in parallel with it.)
6. The slope of the transistor voltage waveform is nominally zero at turn-on time. Then the current injected into the turning-on transistor by the load network rises smoothly from zero at a controlled moderate rate, resulting in low power dissipation while the transistor conductance is building up from zero during the turn-on transition, even if the turn-on transition time is as long as 30% of the RF period.
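The stored-energy argument in the list above can be made concrete with a short calculation: a capacitance charged to V at turn-on loses C·V²/2 per cycle, so the dissipated power is C·V²·f/2. The component values below are hypothetical and serve only to show the scaling.

```python
def shunt_discharge_loss(c_farads, v_turn_on, f_hz):
    """Power dissipated by discharging a shunt capacitance once per RF cycle.

    Energy C*V^2/2 is lost at each turn-on, f times per second.
    """
    return 0.5 * c_farads * v_turn_on ** 2 * f_hz

# Hypothetical example: 10 pF of shunt capacitance at 900 MHz.
c_shunt = 10e-12
f_rf = 900e6

for v in (0.0, 1.0, 5.0):
    loss = shunt_discharge_loss(c_shunt, v, f_rf)
    print(f"V at turn-on = {v:.1f} V -> loss = {loss * 1e3:.1f} mW")
```

Because the loss grows with the square of the turn-on voltage, bringing that voltage to zero removes this loss term entirely.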

Result: The waveforms never have high voltage and high current simultaneously. The voltage and current switching transitions are time-displaced from each other, to accommodate transistor switching transition times that can be substantial fractions of the RF period, e.g., turn-on transition up to about 30% of the period and turn-off transition up to about 20% of the period.

The low-order Class-E amplifier of Fig. 2 generates voltage and current waveforms that approximate the conceptual "target" waveforms in Fig. 1; Fig. 3 shows the actual waveforms in that circuit. Note that those actual waveforms meet all six criteria listed above and illustrated in Fig. 1. Unpublished higher-order versions of the circuit approximate more closely the target waveforms of Fig. 1, making the circuit even more tolerant of component parasitic resistances and nonzero switching transition times.

Differences from conventional Class B and C: The load network is not intended to provide a conjugate match to the transistor output impedance. The load-network design equations come from the solution of a set of simultaneous equations for the steady-state periodic time-domain response, of a network containing non-ideal inductors and capacitors, to periodic operation of a non-ideal switch at the load-network input port, at frequency f, to provide (a) an input-port voltage of zero value and zero slope at transistor turn-on time, (b) a first-order approximation to a time delay of the voltage rise at transistor turn-off, and (c) a nearly sinusoidal voltage across the load resistance R, delivering a specified RF power P from a specified dc supply voltage Vcc.


The transistor's operating locus on the voltage-current plane is not a tilted straight line (resistance) or a tilted ellipse (resistance + reactance). The operation during the "on" state of the switch is a nearly vertical line whose lower end is at the origin (0, 0); the "off" state of the switch is a horizontal line whose left end is at the origin. By design, the operating locus avoids the remainder of the voltage-current plane, the region of simultaneous high voltage and high current, i.e., of high power dissipation and consequent reduced efficiency; that region is where conventional Class B and C circuits operate.

3. ANALYTICAL AND NUMERICAL DERIVATIONS OF DESIGN EQUATIONS

Analytical derivations of design equations for the circuit of Fig. 2 can be made only by assuming that the current in L2-C2 is sinusoidal. That assumption is strictly true only if the load network has infinite loaded Q (Q_L, defined as 2πfL2/R), and yields progressively less-accurate results for Q_L values progressively lower than infinity. (Q_L is a free-choice design variable (footnote 2), subject to the condition Q_L ≥ 1.7879 (obtained from exact numerical analysis [4], [6]) to be able to obtain the nominal (footnote 3) switch-voltage waveform, for the usual choice of the switch "on" duty ratio (footnote 4) D being 50%.) The amplifier's output power P depends primarily (derivable analytically) on the collector/drain dc-supply voltage Vcc and the load resistance R, but secondarily (not derivable analytically) on the value chosen for Q_L. Previously published analytically derived design equations did not include the dependence of P on Q_L. As a result, the output power is 38% to 10% less than had been expected, for Q_L values in the usual range of 1.8 to 5. This paper includes an accurate new equation for P that includes the effect of Q_L. Similar restrictions apply to the analytical derivations of design equations for C1, C2, and R. However, the needed component values can be found by numerical methods. Table I gives normalized exact numerical solutions for output power (hence the needed value of R), C1, and C2, for eight values of Q_L over the entire possible range of 1.7879 to infinity, for the usual choice of D = 50%.

The design equations in the next section are continuous mathematical functions fitted to those eight sets of data. (Having the numerical values of Table I, readers can derive other mathematical functions to fit the data, if they wish, to substitute for the equations given below.)

Kazimierczuk and Puczko [5] published a tabulation similar to Table I here (using a different mathematical technique, but the two sets of tables agree well; see Section 5, below), but they did not include continuous-function design equations based on their tabular data. As a result, a designer using [5] can produce an accurate design at any chosen tabulated value of Q_L, but the designer lacks accurate design information for use at values of Q_L in between the tabulated values. Avratoglou and Voulgaris [8] gave an analysis, and numerical solutions as graphs, but no tables of computed values and no design equations fitted to the numerical results. Precise design values cannot be read from the graphs.

To be able to make accurate circuit designs and a priori design evaluations at any arbitrary value of Q_L, the designer needs design equations comprising continuous mathematical functions, rather than a set of tabulated values as in Table I or [5]. The equations should give accurate results, and should be simple enough to be easy for the designer to manipulate. Such equations are given below, for lossless components. The losses are accounted for in [2], [4], [9], [10], and unpublished notes; the author intends to publish equations for all components of power loss and the resulting collector/drain efficiency. Briefly: Calculate R from (6) or (6a), using for P the desired output power divided by the expected collector/drain efficiency (see (2) below for collector/drain efficiency). Then the needed load resistance is

where R_on is the "on" resistance of the transistor. R_on is a generic term that represents R_DS(on) of a MOSFET or a MESFET, or R_CE(sat) of a BJT. The expected collector/drain efficiency is approximately

where t_f is the 100%-to-0% fall time of the assumed linear fall of the collector/drain current at transistor turn-off, T = 1/f is the period of the operating frequency f, and "0.01" allocates about 1% loss of efficiency for the power losses in the dc and RF resistances of the dc-feed choke (substitute a different numerical value, if you wish).
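Equations (1) and (2) are not legible in this text, so the following is only a sketch under stated assumptions. It assumes that each parasitic resistance is weighted by the squared ratio of its rms current to the rms load current for ideal 50%-duty Class-E waveforms (1.365 for the switch, 0.2116 for the shunt capacitor, 1 for the series branch), that the turn-off loss for a linear current fall is (2πA)²/12 with A = (1 + 0.82/Q_L)(t_f/T), and that 0.01 is the dc-feed choke allowance named above. The function and parameter names are hypothetical.

```python
import math

def parasitic_resistance(r_on, esr_l2, esr_c2, esr_c1):
    """Effective series loss resistance (assumed weighting, see lead-in)."""
    return esr_l2 + esr_c2 + 1.365 * r_on + 0.2116 * esr_c1

def load_resistance(r_total, r_on, esr_l2, esr_c2, esr_c1):
    """External load resistance left after subtracting the parasitics from R."""
    return r_total - parasitic_resistance(r_on, esr_l2, esr_c2, esr_c1)

def class_e_efficiency(r_load, r_on, esr_l2, esr_c2, esr_c1,
                       t_fall, f_hz, q_l, choke_loss=0.01):
    """Approximate collector/drain efficiency (assumed form, not verified)."""
    r_par = parasitic_resistance(r_on, esr_l2, esr_c2, esr_c1)
    a = (1.0 + 0.82 / q_l) * t_fall * f_hz          # t_f / T, scaled
    turn_off_loss = (2.0 * math.pi * a) ** 2 / 12.0
    return r_load / (r_load + r_par) - turn_off_loss - choke_loss

# Hypothetical example values.
eta = class_e_efficiency(r_load=9.0, r_on=0.2, esr_l2=0.1, esr_c2=0.05,
                         esr_c1=0.05, t_fall=50e-12, f_hz=900e6, q_l=5.0)
print(f"estimated efficiency: {eta:.3f}")
```

The coefficients should be checked against the author's published equations before being used in a design.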

4. EXPLICIT DESIGN EQUATIONS

The explicit design equations given below yield the low-order lumped-element Class-E circuit that operates with the nominal waveforms of Fig. 3. (Distributed-element circuits are discussed briefly at the end of Section 9.) In the equations below, Vcc is the dc supply voltage, P is the power delivered to the total effective circuit resistance lumped into a single resistor R (see (1) above), f is the operating frequency, L1 (dc-feed choke), C1, C2, and L2 are the load network shown in Fig. 2, and Q_L is the network loaded Q, chosen by the designer as a trade-off among competing evaluation criteria (footnote 2). In a nominal-waveforms circuit operating with the usual choice of D = 50%, the minimum possible value of Q_L is 1.7879 (the circuit can work well with lower values of Q_L, but the transistor-voltage waveform will be off-nominal: larger than zero at the transistor turn-on time); the maximum possible value of Q_L is less than the network's unloaded Q. The design procedure is as follows:

The chosen safety factor (e.g., 0.75) allows for not exceeding the transistor’s breakdown voltage by a higher-than-nominal peak voltage (in this example, up to 1/0.75 = 133% of nominal) that could result from off-nominal load impedance. Choose Vcc as determined by the transistor’s breakdown voltage or the available power-supply voltage. The relationship among P, R, Q_L, Vcc, and the transistor saturation offset voltage Vo is least-squares fitted to the data in Table I, over the entire range of Q_L from 1.7879 to infinity, within a deviation of ±0.15%, by a second-order polynomial function of Q_L:

Hence

Alternatively, a third-order polynomial in Q_L gives a closer least-squares fit to the data, to within -0.0089% to +0.0072%:


Hence

The effective dc-supply voltage is the actual voltage, Vcc, less the transistor saturation offset voltage, Vo. Vo is zero for a field-effect transistor. For a BJT, Vo is of the order of 0.1 V at low frequencies, and up to a few volts (depending on the transistor fabrication) at frequencies higher than about
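The fitted power equations themselves are not legible in this text. As a sketch, the code below assumes the ideal infinite-Q_L factor 2/(π²/4 + 1) = 0.576801 multiplied by polynomial corrections in 1/Q_L; the polynomial coefficients are taken as assumptions (they reproduce the 38% and 10% power shortfalls quoted in Section 3) and should be checked against the author's published equations. All function names are hypothetical.

```python
import math

K_IDEAL = 2.0 / (math.pi ** 2 / 4.0 + 1.0)  # 0.576801, infinite-Q_L case

def factor_2nd(q_l):
    """Assumed second-order fit of the Q_L dependence of output power."""
    return 1.001245 - 0.451759 / q_l - 0.402444 / q_l ** 2

def factor_3rd(q_l):
    """Assumed third-order fit of the Q_L dependence of output power."""
    return (1.0000086 - 0.414395 / q_l - 0.577501 / q_l ** 2
            + 0.205967 / q_l ** 3)

def output_power(vcc, vo, r, q_l, factor=factor_2nd):
    """Output power delivered to R for effective supply voltage vcc - vo."""
    return (vcc - vo) ** 2 / r * K_IDEAL * factor(q_l)

def load_resistance_for_power(vcc, vo, p, q_l, factor=factor_2nd):
    """Resistance R needed to obtain output power p."""
    return (vcc - vo) ** 2 / p * K_IDEAL * factor(q_l)

for q in (1.8, 3.0, 5.0, 10.0):
    shortfall = 1.0 - factor_2nd(q)
    print(f"Q_L = {q:4.1f}: power is {shortfall * 100:.1f}% below the "
          f"infinite-Q_L estimate")
```

With these assumed coefficients the shortfall is roughly 37% at Q_L = 1.8 and 10% at Q_L = 5, in line with the range stated in the text.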

The design equations for C1 and C2 that fit the data in Table I are given below. The last terms in (7), (8), and (9) are adjustments to the expressions fitted to the Table-I data, to account for the small effects of the nonzero susceptance of L1. The numerical coefficients in those last terms depend slightly on Q_L, and those dependencies will be the subject of a planned future article. For the example case of Q_L = 5 and the usual choice of the reactance of L1 being 30 or more times the unadjusted value of the reactance of C1, the adjustments for the susceptance of L1 add 2% or less to the unadjusted value of C1 and subtract 0.5% or less from the unadjusted value of C2.

Finally, L2 is determined by (a) the designer's choice (footnote 2) for Q_L and (b) the value of R from (6) or (6a):


Equations (4) through (9) are more accurate than the older versions in [1], [2], [4], and [6].
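The design procedure of this section can be sketched end to end as follows. Because the equations are not legible in this text, every numerical coefficient in the code is an assumption: 3.56 is taken as the ratio of peak switch voltage to Vcc for the nominal waveform, the R polynomial is an assumed second-order fit, and the C1, C2 expressions (including the L1 susceptance terms 0.6 and 0.2) are assumed forms that should be verified against the author's published equations. The function name and the example values are hypothetical.

```python
import math

Q_L_MIN = 1.7879  # minimum Q_L for the nominal waveform at D = 50%

def design_class_e(p_out, f_hz, q_l, bv, l1, vo=0.0, safety=0.75):
    """Return Vcc, R, L2, C1, C2 for a low-order Class-E stage (sketch)."""
    if q_l <= Q_L_MIN:
        raise ValueError("Q_L must exceed 1.7879 for the nominal waveform")
    w = 2.0 * math.pi * f_hz
    # Supply voltage limited by breakdown voltage and safety factor.
    vcc = bv / 3.56 * safety
    # Load resistance from the assumed second-order power fit.
    r = ((vcc - vo) ** 2 / p_out * 0.576801
         * (1.001245 - 0.451759 / q_l - 0.402444 / q_l ** 2))
    # Series inductor from the definition Q_L = 2*pi*f*L2/R.
    l2 = q_l * r / w
    # Shunt and series capacitors, with assumed L1 susceptance adjustments.
    c1 = ((0.99866 + 0.91424 / q_l - 1.03175 / q_l ** 2)
          / (34.2219 * f_hz * r) + 0.6 / (w ** 2 * l1))
    c2 = ((1.00121 + 1.01468 / (q_l - Q_L_MIN))
          / (w * r * (q_l - 0.104823)) - 0.2 / (w ** 2 * l1))
    return {"vcc": vcc, "r": r, "l2": l2, "c1": c1, "c2": c2}

# Hypothetical example: 1 W at 900 MHz, Q_L = 5, 20 V breakdown, 250 nH choke.
design = design_class_e(p_out=1.0, f_hz=900e6, q_l=5.0, bv=20.0, l1=250e-9)
print(f"Vcc = {design['vcc']:.2f} V, R = {design['r']:.2f} ohm")
print(f"L2 = {design['l2'] * 1e9:.2f} nH, "
      f"C1 = {design['c1'] * 1e12:.2f} pF, C2 = {design['c2'] * 1e12:.2f} pF")
```

In a real design, R would first be reduced by the parasitic-resistance allowance discussed at the end of Section 3 before the load-transforming network is designed.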

5. ACCURACY OF DESIGN EQUATIONS

The maximum deviations of (5) from the tabulated values in Table I are ±0.15%; those of (5a) are -0.0089% and +0.0072%; those of (7) and (8) are ±0.13%; and those of (9) are ±0.072%. Kazimierczuk and Puczko [5] give tables of numerical data (similar to Table I here), obtained by a Newton's-method numerical solution of a system of analytical circuit equations they derived, and other useful numerical and graphical data. The tabulated values of P in [5] are within -0.13% to +0.47% of the values obtained from the continuous function (5) above. Those differences include (a) the error in the fitting of the continuous function in (5) to the discrete values in Table I (±0.15%) and (b) the differences (if any) between the numerical results of [5] and of Table I here. Those two sets of tabulated values can be compared directly at only their two values of Q_L in common: infinity (identical results) and 1.7879 ([5] has the same capacitance values and 0.28% lower P). The independently computed sets of data here and in [5] agree well (a maximum difference of about 0.3%), giving confidence in the validity of both.

6. HARMONIC FILTERING AND ASSOCIATED CHANGES TO DESIGN EQUATIONS

The power in (5) or (5a) is the total output power, at the fundamental and harmonic frequencies. Most of the power is at the fundamental frequency. The strongest harmonic is the second, with a voltage or current amplitude at R of 0.51/Q_L relative to the fundamental. For example, with Q_L = 5.1, the second-harmonic power is -20 dBc (1% of the fundamental power) without any filtering. Even-order harmonics can be canceled with a push-pull circuit, if desired. In that case, the strongest harmonic is the third, at an amplitude of 0.08/Q_L relative to the fundamental, hence -36 dBc (0.025% of the fundamental power) without filtering, for the same example of Q_L = 5.1. Sokal and Raab [11] give the harmonic spectrum as a function of the chosen Q_L.
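The harmonic levels quoted above can be checked with a few lines of arithmetic. The sketch assumes relative amplitudes of 0.51/Q_L for the second harmonic and 0.08/Q_L for the third, which are the values consistent with the -20 dBc and -36 dBc figures stated for Q_L = 5.1; the function name is hypothetical.

```python
import math

def harmonic_level_dbc(q_l, coefficient):
    """Unfiltered harmonic level at R in dBc for amplitude coefficient/Q_L."""
    return 20.0 * math.log10(coefficient / q_l)

def harmonic_power_fraction(q_l, coefficient):
    """Harmonic power as a fraction of the fundamental power."""
    return (coefficient / q_l) ** 2

q_example = 5.1
second_dbc = harmonic_level_dbc(q_example, 0.51)
third_dbc = harmonic_level_dbc(q_example, 0.08)

print(f"2nd harmonic: {second_dbc:.1f} dBc "
      f"({harmonic_power_fraction(q_example, 0.51) * 100:.3f}% of fundamental)")
print(f"3rd harmonic: {third_dbc:.1f} dBc "
      f"({harmonic_power_fraction(q_example, 0.08) * 100:.3f}% of fundamental)")
```

Raising Q_L lowers both harmonics by 20 dB per decade of Q_L, at the cost of higher circulating currents in the series branch.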

If the circuit includes a low-pass or band-pass filter between R and the C2-L2 branch, instead of being a direct connection as in Fig. 2, the fractions of the output power contained in each of the harmonics will decrease, according to the transmission function of the filter at the harmonic frequencies. As a small side-effect, the total output power and the waveforms of switch voltage and current will change slightly, requiring small changes to the numerical coefficients in (6) through (9) above, and in Table I and [5]. New sets of numerical values can be calculated quickly with the help of a computer program such as HEPA-PLUS [7], described briefly in Sections 7 and 8 below, and available from the author's employer.

7. OPTIMIZING EFFICIENCY

The highest efficiency is obtained by minimizing the total power dissipated while the amplifier is delivering a desired output power. That can be done by modifying the waveforms slightly away from the nominal ones shown in Fig. 3, allowing some of the components of power dissipation to increase, while having other components of power dissipation decrease by larger amounts. For example, allowing the minimum of the voltage waveform to be at about 20% of the peak voltage, instead of at zero, increases the C1-discharge power loss, but it reduces the rms/average ratio of the current waveform and the peak/average ratio of the voltage waveform. Both of those effects can be exploited to obtain a specified output power with a specified safe peak transistor voltage, with lower rms currents in the transistor, L1, L2, C1, and C2. That reduces their dissipations. If their series resistances are large enough, the decrease in their resistive power losses can outweigh the increase of C1-discharge power loss.

The power loss in the transistor and in discharging a partially charged C1 are not functions of the design frequency (C1 is inversely proportional to frequency, so the product C1·f is independent of frequency). For given types of C or L components, losses in capacitor ESRs (including that in the transistor's output capacitance) increase with design frequency, inductor core losses increase, and inductor winding losses decrease.

The optimum trade-off depends on the specific combination of parameter values of the types of components being considered in a particular design. (It does not vary appreciably from one unit to another of a given design.) No a priori explicit analytical method yet exists for achieving the optimum trade-off among all of the components of power loss. Optimization is a numerically intensive task, too difficult to do by explicit analytical methods. But computerized optimization is practical. For example, running on an IBM-PC-compatible computer with a Pentium III/667-MHz processor, a commercially available program HEPA-PLUS [7], developed specifically for high-efficiency power amplifiers, designs a nominal-waveforms Class-E amplifier in a time too short to observe, simulates the circuit in 0.008 seconds, and optimizes the design automatically, according to user-specified criteria, in about 2.4 seconds. The program uses double-precision computation for accuracy and robustness, yielding the circuit voltage and current waveforms and their spectra, dc input power, RF output power, and all components of power dissipation.

8. EFFECTS OF NON-IDEAL COMPONENTS

Many of the non-idealities of the circuit components can be included in an analytical solution if the circuit is operating with the nominal switch-voltage waveform, but the task becomes progressively more difficult as one attempts to include more of those effects simultaneously, and becomes impossible if the circuit is not operating at the nominal-waveforms conditions. The HEPA-PLUS computer program [7], mentioned above, simulates an expanded version of the circuit of Fig. 2 in any arbitrary operating condition (nominal or non-nominal waveforms). It includes all important "real-world" non-idealities of the transistor, the finite-Q power losses of all inductors and capacitors, and parasitic wiring inductances in series with C1 and in series with the transistor. Details are available from the author's employer.

9. APPLICABLE FREQUENCY RANGE IS ABOUT 3 MHz TO 10 GHz (as of 1999)

The Class-E amplifier can operate at arbitrarily low frequencies, but below about 3 MHz, one of the three types of switching-mode Class-D amplifier might be preferred because it can provide as good efficiency as the Class-E, with about 1.6 times as much output power per transistor, but with the possible disadvantage that transistors must be used in pairs, vs. the single Class-E transistor. Class E is preferable to Class D at frequencies higher than about 3 MHz, for its higher efficiency, easier driving of the transistor input port, and less-detrimental effects from parasitic inductance in the output-port circuit. The upper end of the useful frequency range for the low-order Class E is the frequency at which the achievable turn-off switching time is of the order of 17% of the RF period. In a Class-B amplifier, the turn-off transition time is 25% of the period. Therefore a low-order Class-E circuit will work well with a particular transistor at frequencies up to about 17%/25% ≈ 70% of the frequency at which that transistor works well in a Class-B amplifier. (Unpublished higher-order Class-E circuits can operate efficiently at frequencies up to about double that of the low-order version.) Class-E circuits have been made successfully at frequencies up to 10 GHz [42]. Several microwave designers have reported achieving remarkably high efficiency by driving the amplifier into saturation and using a favorable combination of series inductance to the load resistance [13] or fundamental and harmonic load impedances [14]-[20]. (The authors of [13]-[20] found the favorable tuning condition by using an automatic tuner and/or a circuit-simulation program to make an exhaustive search over the multi-dimensional impedance space to discover a favorable combination of circuit-element values, rather than by using a priori explicit design equations.) Secchi [13] and Mallet et al. [14] provided oscillograms of their drain-voltage and collector-voltage waveforms. Inspection of the waveform in [13, Fig. 2] shows a nominal Class-E waveform with

The waveforms in Fig. 2(b) of [14] are Class E, but with an unusually small conduction angle. Probably higher output power could be obtained by increasing the conduction angle and modifying the load-network impedance accordingly. This author does not know the operating mode of [15]-[20]; very likely those amplifiers are distributed-elements versions (see below) of Class E, achieved empirically.
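The upper frequency limit given earlier in this section (turn-off time of about 17% of the RF period for low-order Class E, versus 25% for Class B) translates directly into a frequency estimate. The sketch below is a minimal illustration; the 100 ps turn-off time is a hypothetical value.

```python
CLASS_E_TURN_OFF_FRACTION = 0.17  # turn-off time / RF period, low-order Class E
CLASS_B_TURN_OFF_FRACTION = 0.25  # turn-off time / RF period, Class B

def max_frequency(t_turn_off_s, fraction):
    """Highest frequency at which t_turn_off is the given fraction of the period."""
    return fraction / t_turn_off_s

# Hypothetical transistor with a 100 ps achievable turn-off time.
t_off = 100e-12
f_class_e = max_frequency(t_off, CLASS_E_TURN_OFF_FRACTION)
f_class_b = max_frequency(t_off, CLASS_B_TURN_OFF_FRACTION)

print(f"Class-B limit: {f_class_b / 1e9:.2f} GHz")
print(f"Class-E limit: {f_class_e / 1e9:.2f} GHz "
      f"({f_class_e / f_class_b * 100:.0f}% of the Class-B limit)")
```

The ratio is 0.68, which the text rounds to about 70%.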

Distributed vs. lumped elements: High-efficiency waveforms similar to those in Figs. 1 or 3 can be generated with lumped and/or distributed elements. At a given frequency, the choice depends on the available components and the trade-offs among their sizes, costs, quality factors, and parasitic effects. [12], [21]-[23], [41], and [42] were transmission-line versions of Class E, operating at 10, 8.35, 5, 2, 1, and 0.5 GHz. The 5-, 2-, and 1-GHz circuits were described as having been designed a priori by explicit design procedures, worked as expected, and were operated and measured without making any experimental adjustment.

10. EXPERIMENTAL RESULTS

Table II summarizes representative reported Class-E performance (as of 1999), from 44 kW PEP at 0.52-1.7 MHz to 1.41 W at 8.35 GHz and 100 mW at 10 GHz.


11. TUNING PROCEDURE

After adjusting the antenna tuner or the load-impedance-transforming network (located between the antenna or other load and the right-hand end of L2 in Fig. 2) so as to provide an input-port resistance of R, there might be residual series inductive and/or capacitive reactances in series with R. The series inductive reactance adds to the reactance of L2, and the series capacitive reactance adds to the reactance of C2. Then the amplifier would operate with an off-nominal V_CE waveform, and possibly an off-nominal value of output power, because the effective values of L2 and C2 would differ from the design values. To correct for that, the reactances of L2 and C2 should be reduced by the amounts of the residual inductive and capacitive series reactances of the input-port impedance of the tuner or impedance-transforming network. The following text and figures explain how to make those adjustments to the circuit, if needed, without needing to know, a priori, the values of those residual series reactances. The text is in terms of a BJT; for a FET, substitute V_DS for V_CE.

Fig. 3 shows the nominal Class-E transistor-voltage waveform in the low-order circuit of Fig. 2: at the transistor's turn-on time, the waveform has zero slope, and has zero voltage for a FET or V_CE(sat) for a BJT. An actual circuit, or a circuit in the HEPA-PLUS computer program [7], can be brought from an off-nominal condition to that nominal-waveform condition by adjusting C1 and/or C2 and/or L2, and the load resistance R if R is not already the value that will provide the desired output power. The desired value of R is obtained from (6) or (6a) after having applied the allowance for parasitic resistances discussed in the last paragraph of Section 3 above (footnote 6).

The circuit parameters were chosen, via equations (2) through (10), to meet a chosen set of requirements. The circuit will operate with the nominal Class-E waveform, while delivering the specified output power at the specified frequency, if the chosen parameter values are installed in the actual hardware. The possible need for tuning results from (a) tolerances on the component values (normally not a problem, because Class E has low sensitivity to component tolerances) and (b) the possibility of unknown-value inductive and capacitive reactances being inserted in series with R (hence in series with L2 and C2) after the load resistance has been transformed to the chosen value of R. Those series reactances require that the reactances of L2 and C2 be reduced by the amounts of the unknown inserted inductive and capacitive series reactances. But how to do that when those inserted reactances are unknown?

Fig. 4 shows a V_CE waveform for an amplifier with off-nominal tuning, with the waveform features labeled for subsequent reference in the text. If we know a priori how changes of L2 and C2 will affect that waveform, we can adjust those two parameters so as to meet two criteria at the operating frequency: (a) achieve the nominal waveform of Fig. 3 and (b) deliver the specified value of output power.

Fig. 5 shows how L2 and C2 affect the V_CE waveform. We know also that increasing L2 reduces the output power, and vice versa. With the preceding information, and with (a) an oscilloscope displaying the V_CE waveform and (b) a directional power meter indicating the power being delivered to the load, we can adjust L2 and C2 so as to fulfill simultaneously the two desired conditions (nominal waveform and desired output power) even if the inductive and capacitive reactances in series with R are unknown.

If C1 (comprising the transistor output capacitance and the external capacitor connected in parallel with it) is within about 10% of the intended value, C1 will normally not need adjustment. But in case of a possible large deviation from the design value, C1 can also be adjusted so as to achieve the nominal V_CE waveform, using the information in Fig. 5 about the effects of C1 on the waveform. In that case, the three components C1, C2, and L2 can be adjusted so as to achieve three conditions simultaneously at the operating frequency: desired output power, transistor voltage of V_CE(sat) just before transistor turn-on, and zero slope of the V_CE waveform just before turn-on. The following diagrams and text explain how to use C1, C2, L2, and R (if desired) to adjust the shape of the V_CE waveform.

Changes in the values of the load-network components affect the V_CE waveform as follows, illustrated in Fig. 5:

Increasing C1 moves the trough of the waveform upwards and to the right.

Increasing C2 moves the trough of the waveform downwards and to the right.

Increasing L2 moves the trough of the waveform downwards and to the right.

Increasing R moves the trough of the waveform upwards (R is not normally an adjustable circuit element).
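The four effects just listed can be kept at hand as a small lookup table while tuning. The Python sketch below is only an illustration: the element names C1, C2, and L2 follow the usual labeling of the low-order Class-E circuit of Fig. 2 and are assumed here, the function name is hypothetical, and the sketch assumes that decreasing an element has the opposite effect of increasing it.

```python
# Effect of INCREASING each element on the trough of the transistor-voltage
# waveform, copied from the list above.
# Tuple is (vertical, horizontal): +1 = up or right, -1 = down, 0 = not stated.
TROUGH_SHIFT_ON_INCREASE = {
    "C1": (+1, +1),  # upwards and to the right
    "C2": (-1, +1),  # downwards and to the right
    "L2": (-1, +1),  # downwards and to the right
    "R": (+1, 0),    # upwards; no horizontal shift is stated
}


def changes_that_move_trough(vertical):
    """Return the change to each element that moves the trough up (+1) or down (-1).

    Assumption: decreasing an element has the opposite effect of increasing it.
    """
    if vertical not in (+1, -1):
        raise ValueError("vertical must be +1 (up) or -1 (down)")
    return {
        name: "increase" if shift[0] == vertical else "decrease"
        for name, shift in TROUGH_SHIFT_ON_INCREASE.items()
    }


# Which changes would lower a trough that sits too high?
lower_the_trough = changes_that_move_trough(-1)
print(lower_the_trough)
```

The table records directions only; it says nothing about how large a change is needed.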

Knowing these effects, you can adjust the load network for nominal Class-E operation by observing the waveform. (Depending on the settings of the circuit component values, the zero-slope point and/or the negative-going jump at transistor turn-on might be hidden from view, as in some of the waveforms in Fig. 6. If that occurs, the locations of those features on the waveform can be estimated by extrapolating from the part of the waveform that can be seen.) The adjustment procedure is as follows:

1. Set R to the desired value or accept what exists.

2. Set L2 for the desired Q_L or accept what exists.

3. Set the frequency as desired.

4. Set the duty ratio to the desired value (usually 50%), with V_CC set to approximately 20% of the intended final value. If the transistor turn-on is visible on the V_CE waveform (as in Fig. 4), measure the duty ratio. Otherwise, observe the V_BE waveform and assume that turn-on occurs when the positive-going edge of V_BE reaches +0.8 V and turn-off occurs when the negative-going edge of V_BE reaches 0 V. (The preceding voltage values are for a silicon NPN transistor at room temperature. For other types of transistors, make appropriate modifications to the voltage values.)

5. Observe the trough of the V_CE waveform:

A. At the zero-slope point: What is the voltage relative to V_CE(sat)? More positive, more negative, or equal?

B. At transistor turn-on: What is the slope? Positive, negative, or zero?

If these points are unobservable because they lie below the zero-voltage axis, the voltage at zero slope is “more negative.” Estimate the slope at turn-on by extrapolation of the waveform.

If the voltage at zero slope is unobservable because transistor turn-on occurs before zero slope is reached, the slope at turn-on is “negative.” Estimate the voltage at zero slope by extrapolation of the waveform.

If you cannot estimate the voltage or the slope by extrapolation, assume that the voltage is “equal” or that the slope is “zero.”

6. Adjust C1 and/or C2 as shown in Fig. 5, and in expanded form in Fig. 6.

7. If V_CC is now the desired value, go to Step 8. If V_CC is less than the desired value, increase V_CC by up to 50% and readjust the duty ratio, C1, and C2 as needed. (The V_CC increase will decrease the effective value of the voltage-dependent transistor output capacitance, causing the effective value of C1 to be reduced. Therefore C1 will need to be increased slightly.)

8. For a final check of the adjustments, increase C1 slightly to generate an easily visible marker of transistor turn-on: the small negative-going step of V_CE. Verify that the duty ratio is the desired value (usually 50%) and that the waveform slope is zero at turn-on time. Now return C1 to the value that brings the waveform to V_CE(sat) at turn-on time (and also eliminates the marker).
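The two observations of Step 5 can be turned into a first guess at which capacitor to move. The Python sketch below is an inference from the qualitative trough-movement statements only, not a transcription of Fig. 5 or Fig. 6, which remain the authoritative guide. It assumes that the adjusted elements are C1 and C2, that increasing C1 moves the trough up and to the right, that increasing C2 moves it down and to the right, that decreasing does the opposite, and (arbitrarily) that both effects have equal size. The function names are hypothetical.

```python
def _sign_to_action(value):
    """Map a signed amount to a plain-language action."""
    if value > 0:
        return "increase"
    if value < 0:
        return "decrease"
    return "leave"


def suggest_adjustment(voltage_at_zero_slope, slope_at_turn_on):
    """Suggest a direction of change for C1 and C2 from the Step 5 observations.

    voltage_at_zero_slope: "more positive", "more negative", or "equal"
                           (relative to the nominal turn-on voltage)
    slope_at_turn_on:      "positive", "negative", or "zero"
    """
    # Trough too high -> move it down; trough too low -> move it up.
    wanted_up = {"more positive": -1, "more negative": +1, "equal": 0}[voltage_at_zero_slope]
    # Negative slope at turn-on: the trough lies after turn-on -> move it left.
    # Positive slope at turn-on: the trough lies before turn-on -> move it right.
    wanted_right = {"positive": +1, "negative": -1, "zero": 0}[slope_at_turn_on]

    # Assumed unit effects: +C1 -> (up, right) = (+1, +1); +C2 -> (-1, +1).
    # Solving up = dC1 - dC2 and right = dC1 + dC2 gives (twice) the changes:
    c1_change = wanted_up + wanted_right
    c2_change = wanted_right - wanted_up
    return {"C1": _sign_to_action(c1_change), "C2": _sign_to_action(c2_change)}


print(suggest_adjustment("more positive", "negative"))
```

Because only the signs are used, the suggestion gives a direction, not an amount; the waveform has to be observed again after each small change.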

12. GATE- AND BASE-DRIVER CIRCUITS

A simplistic view of the driver stage is that its design is much less important than the design of the output stage, because the power level at the driver stage is much lower than that at the output stage, by a factor equal to the power gain of the output stage, typically a factor of about 10 to 100. That simplistic view is not correct, because the output transistor will not operate as intended if its input is not driven properly. If the output transistor does not operate as intended, the output stage will not operate as intended, either. The resulting output-stage performance might or might not be acceptable. The output-stage transistor will operate properly as a switch, as intended, if its input port (Gate-Source of a FET or Base-Emitter of a BJT) is driven properly by the output of its driver stage. The driver stage must provide the output specified below. Symbols for FETs are used below; you can convert to BJT symbols if you wish.

1. Enough "off" bias during the "off" interval to maintain the drain or collector current at an acceptably small value. If you are willing to tolerate a power loss of a fraction x of the normal dc input power due to non-zero "off"-state current, the drain or collector current during the "off" interval can be up to x·I_DC/(1 − D), where I_DC is the dc current drawn from the dc drain-voltage supply, and D is the output-transistor's "on" duty ratio (usually 0.50, but it can be any value you choose and provide for in the choice of R, L, and C values in the load network). Example: If you are willing to tolerate 1% additional power consumption from the voltage supply caused by the non-zero "off"-state current, if I_DC is 5 A, and if D is the usual value of 0.50, you can tolerate an "off"-state drain current of 0.01 [5 A] [1/(1 − 0.50)] = 0.1 A = 100 mA. That's easy. For example, the International Rectifier IRF540 (rated 100 V, 28 A) is specified for 0.25 mA maximum under its data-sheet test conditions, a factor of 400 smaller than the 100 mA you are willing to accept in this example.

2. Enough "on" drive during the latter 3/4 of the "on" interval to maintain a low-enough on-resistance. You can choose what is "low-enough" for your purposes; refer to the last three sentences of Section 3. Why "the latter 3/4 of the 'on' interval": The current i(t) during the first 1/4 of the "on" interval is small enough that the conduction power loss can be acceptably small for a fairly high on-resistance, because the small i(t) during the first 1/4 of the "on" interval causes an even smaller i²(t) (the square of a small number is even smaller than the number before squaring).

3. Enough turn-off drive to turn off the drain or collector current from 100% to 0% in a fall time fast enough to make the turn-off power dissipation an acceptably small fraction of the output power. That fraction is (2πA)²/12, where A = (1 + 0.82/Q_L)(t_f/T), t_f is the current fall time, and T is the period of the operating frequency f. Choose the acceptable fraction of the output power to be dissipated during the non-zero turn-off switching time. Then calculate the required drain- or collector-fall time that must result from the "enough turn-off drive." Then provide sufficient turn-off drive to accomplish your chosen objective, according to the characteristics of the chosen output transistor. (That is the subject of an intended future publication.) For example, if you are willing to have the turn-off power dissipation be 6% of the output power, and if Q_L = 3, the allowable value for t_f/T is 0.106, i.e., t_f can be as large as 10.6% of the period.

4. Enough turn-on drive to turn on the output transistor fast enough to make the power dissipation during turn-on switching acceptably small. That has never been a problem with any of the drivers I have seen. Most driver circuits turn the transistor "on" and "off" with about the same switching times. If the more-important turn-off switching time is fast enough, the accompanying turn-on switching time will be more than fast enough.
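The arithmetic in requirements 1 and 3 can be checked with a few lines of Python. This is a sketch under stated assumptions, not the author's design program. The off-state limit x·I_DC/(1 − D) follows directly from the worked example in requirement 1. The turn-off loss expression (2πA)²/12 with A = (1 + 0.82/Q_L)(t_f/T), and the value Q_L = 3, are assumptions: they are not legible in this copy of the text, and are used because together they reproduce the quoted 10.6% figure. The function names are hypothetical.

```python
import math


def max_off_state_current(loss_fraction, i_dc, duty_ratio):
    """Requirement 1: largest tolerable off-state current, x * I_DC / (1 - D)."""
    if not 0.0 <= duty_ratio < 1.0:
        raise ValueError("duty_ratio must be in [0, 1)")
    return loss_fraction * i_dc / (1.0 - duty_ratio)


def max_fall_time_fraction(loss_fraction, q_l):
    """Requirement 3: largest t_f / T for a given turn-off loss fraction.

    ASSUMED loss expression: loss_fraction = (2*pi*A)**2 / 12,
    with A = (1 + 0.82 / Q_L) * (t_f / T).
    """
    a = math.sqrt(12.0 * loss_fraction) / (2.0 * math.pi)
    return a / (1.0 + 0.82 / q_l)


i_off = max_off_state_current(0.01, 5.0, 0.50)  # amperes
margin = i_off / 0.25e-3                        # versus the 0.25 mA quoted for the IRF540
tf_over_t = max_fall_time_fraction(0.06, 3.0)   # Q_L = 3 is an assumed value

print(f"off-state current limit: {i_off * 1e3:.0f} mA, margin: {margin:.0f}x")
print(f"allowable t_f/T: {tf_over_t:.3f}")
```

With the example values from the text (1% loss, 5 A, D = 0.50, and 6% turn-off loss), the sketch gives an off-state limit of 100 mA, a margin of 400, and an allowable fall time of about 10.6% of the period.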

The input-port characteristics of BJTs, MOSFETs, and MESFETs differ enough that different types of driver circuits should be used to drive those three types of transistors.⁷ I intend to publish one or more future articles that discuss in detail driver circuits that meet criteria 1-4 above, for MOSFETs, MESFETs, and BJTs. A brief summary of driving a MOSFET or a MESFET follows. The polarity descriptions assume N-channel or NPN; reverse the polarity descriptions for P-channel or PNP.

The best gate-voltage drive is a trapezoid waveform, with the falling transition occupying 30% or less of the period. (Trade-off: The shorter the turn-off transition time, the smaller will be the power dissipation in the output transistor during turn-off switching, but the larger will be the power consumption of the driver stage. For both MOSFETs and MESFETs, the optimum drive minimizes the sum of the output-stage power dissipation and the driver-stage power consumption.) The upper level of the drive waveform should be safely below the MOSFET's gate-source maximum voltage rating, or the MESFET's gate-source voltage at which the gate-source diode conducts enough current to cause either of two undesired effects: (a) metal migration of the gate metalization at an undesirably rapid rate (making the transistor operating lifetime shorter than desired) or (b) enough power dissipation to reduce the overall efficiency more than the efficiency is increased by the lower dissipation in the lower on-resistance that results from a higher upper level of the drive waveform. The lower level of the trapezoid should be low enough to result in a satisfactorily small current during the transistor's “off” state, discussed in requirement 1 above.

A sine-wave is a usable (but not optimum) approximation to the trapezoid waveform described above. To obtain an output-transistor “on” duty ratio of 50% (usually the best choice, but a larger or smaller duty ratio can be used if appropriate component values are used in the load network), the zero-level of the sine-wave should be positioned slightly above the FET's turn-on threshold voltage.

A better approximation is to remove the part of the sine-wave that goes below the value that ensures fully “off” operation, replacing it with a constant voltage at that value. This reduces the input-drive power by slightly less than 50%, almost doubling the power gain of the output stage. A planned future article will discuss in detail a simple circuit that generates such a waveform.
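How the position of the sine-wave's zero-level sets the “on” duty ratio can be illustrated with an idealized model. The Python sketch below assumes that the transistor is “on” exactly while the gate voltage exceeds a fixed threshold, with no switching delay; that idealization is an assumption of this sketch, not a statement from the text. In that model a zero-level exactly at the threshold gives 50%, so the model does not capture whatever device effects lead to the recommendation of a zero-level slightly above threshold. It only shows the trend. The function name is hypothetical.

```python
import math


def on_duty_ratio(zero_level_minus_threshold, sine_peak):
    """Fraction of the period during which a sine-wave drive exceeds the threshold.

    zero_level_minus_threshold: dc level of the sine-wave minus the threshold, volts
    sine_peak: peak amplitude of the sine-wave, volts (must be positive)

    Idealized assumption: the device is "on" exactly when the drive voltage is
    above the threshold.
    """
    if sine_peak <= 0:
        raise ValueError("sine_peak must be positive")
    x = max(-1.0, min(1.0, zero_level_minus_threshold / sine_peak))
    return 0.5 + math.asin(x) / math.pi


for offset in (0.0, 0.5, 1.0):
    print(f"zero-level {offset:.1f} V above threshold, 5 V peak: "
          f"duty ratio {on_duty_ratio(offset, 5.0):.3f}")
```

Raising the zero-level lengthens the “on” interval and lowering it shortens the interval, which is the handle the text uses to set the duty ratio.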

ACKNOWLEDGEMENTS

The author thanks Prof. Alan D. Sokal of the Physics Department, New York University, for many helpful discussions and for producing the numerical solutions in Table I and the initial set of equations that fit the data in Table I; John E. Donohue, formerly of Design Automation, Inc., for computing the coefficients in (4) to fit the data in Table I, yielding (5) and (6); and Dr. Richard Redl of ELFI S.A., for computing the improved-accuracy functions in (5a), (7), and (9) that fit the data of Table I.

This text is an edited version [including correction of a typographical error in (1)] of “Class E RF Power Amplifiers,” published in QEX magazine, Jan./Feb. 2001, Issue No. 204, pp. 9-20, copyright 2000, American Radio Relay League, Inc. That article added significant new information to text taken from “Class-E High-Efficiency Power Amplifiers, from HF to Microwave,” presented by this author at the IEEE International Microwave Symposium, June 1998, Baltimore, Maryland, U.S.A., and “Class-E Switching-Mode High-Efficiency Tuned RF/Microwave Power Amplifier: Improved Design Equations,” presented by this author at the IEEE International Microwave Symposium, June 2000, Boston, Massachusetts, U.S.A. The texts of the presented papers are included in the printed and CD-ROM versions of the Proceedings of the conferences, copyright 1998 and 2000, respectively, by IEEE. The author thanks ARRL and IEEE for permission to use the previously published material.

FOOTNOTES

¹Most papers on the Class E amplifier of Fig. 2 (including this one) use one definition of Q_L; a few papers, e.g., [3], use a different one. Kazimierczuk and Puczko [5], to their credit, give both values in their tabulations.

²The choice of Q_L involves a trade-off among operating bandwidth (wider with low Q_L), harmonic content of the output power [11] (lower with high Q_L), and power loss in the parasitic resistances of the load-network inductor and capacitor (lower with low Q_L).

³The nominal switch-voltage waveform has zero voltage and zero slope at the time the switch will be turned on. [1]-[4], and papers by other authors, referred to that nominal waveform as the “optimum” waveform, a misnomer. That waveform is “optimum” for yielding high efficiency in the case of a switch with negligibly small series resistance. But if the switch has appreciable resistance, the efficiency can be increased by moving away slightly from the nominal waveform, to a waveform whose voltage at the switch turn-on time is of the order of 20% of the peak voltage. No analytical optimization procedure yet exists, but the circuit can be optimized numerically, by a computer program such as HEPA-PLUS [7], discussed briefly in Sections 7 and 8.

⁴Beware: A few publications define D as the fraction of the period that the switch is off.

⁵Updates to [11]: (a) Delete the column in Table I for the Q_L value below 1.7879, because Q_L must be at least 1.7879 to obtain the nominal Class-E collector/drain-voltage waveform in the circuit described in [1]-[6], when the switch duty ratio D is 50%. (b) In (2), change the factor 1.42 to 1.0147; the factor 2.08 to 1.7879; and the factor 0.66 to 0.773. (c) Recalculate the numerical values obtained from (2), using the revised factors.

⁶The 1997 two-part QST article [43] by Eileen Lau (KE6VWU) et al., about 300-watt and 500-watt 40-metre transmitters, discussed tuning in Part 2, but without a description of how to adjust the load-network component values to obtain the nominal Class-E voltage waveform, as is included in Section 11 here.

⁷In the early 1980s, I made a driver circuit that would drive a BJT or a MOSFET interchangeably, with no change needed in the driver or in the power-amplifier's input circuit. That driver was used in a Class-E demonstrator circuit, so that a person evaluating Class-E technology could insert any type of transistor for test purposes, be it an NPN BJT or an N-channel MOSFET, and observe that the changes of power-amplifier output power and efficiency were almost unnoticeably small, when inserting any of thirty transistors of different type numbers and manufacturers, some BJT and some MOSFET. Some of those people, accustomed to working with conventional Class-C power amplifiers, were astonished when they witnessed the results of that test.

REFERENCES

[1] N. O. Sokal and A. D. Sokal, "High-efficiency tuned switching power amplifier," U. S. Patent 3,919,656, Nov. 11, 1975 (now expired). [Includes a detailed technical description.]

[2] N. O. Sokal and A. D. Sokal, "Class E – a new class of high-efficiency tuned single-ended switching power amplifiers," IEEE J. Solid-State Circuits, vol. SC-10, no. 3, pp. 168-176, June 1975. [The text of [1] cut to half-length; retains the most-useful information. Text corrections are available from N. O. Sokal.]

[3] F. H. Raab, "Idealized operation of the Class E tuned power amplifier," IEEE Trans. Circuits and Systems, vol. CAS-24, no. 12, pp. 725-735, Dec. 1977.

[4] N. O. Sokal and A. D. Sokal, "Class E switching-mode RF power amplifiers - low power dissipation, low sensitivity to component tolerances (including transistors), and well-defined operation," Proc. 1979 IEEE ELECTRO Conference, Session 23, New York, NY, 25 April 1979; reprinted in R.F. Design, vol. 3, no. 7, pp. 33-38, 41, July/Aug. 1980. [Includes plots of efficiency vs. frequency and of efficiency vs. variations of all circuit parameters.]

[5] M. K. Kazimierczuk and K. Puczko, "Analysis of Class E tuned power amplifier at any Q and switch duty cycle," IEEE Trans. Circuits and Systems, vol. CAS-34, no. 2, pp. 149-159, Feb. 1987.

[6] N. O. Sokal, "Class E high-efficiency switching-mode power amplifiers, from HF to microwave," 1998 IEEE MTT-S International Microwave Symposium Digest, June 1998, Baltimore, MD, CD-ROM IEEE Catalog No. 98CH36192 and also 1998 Microwave Digital Archive, IEEE Microwave Theory and Techniques Society, CD-ROM IEEE Product # JP-180-0-081999-C-0.

[7] HEPA-PLUS computer program, available from the author's employer, Design Automation, Inc., 4 Tyler Rd., Lexington, MA 02420-2404, U.S.A.

[8] Ch. P. Avratoglou and N. C. Voulgaris, "A new method for the analysis and design of the Class E power amplifier taking into account the Q_L factor," IEEE Trans. Circuits & Systems, vol. CAS-34, no. 6, pp. 687-691, June 1987.

[9] F. H. Raab and N. O. Sokal, "Transistor power losses in the Class E tuned power amplifier," IEEE J. Solid-State Circuits, vol. SC-13, no. 6, pp. 912-914, Dec. 1978.

[10] N. O. Sokal and R. Redl, "Power transistor output port model for analyzing a switching-mode RF power amplifier or resonant converter," RF Design, June 1987, pp. 45-48, 50-53.

[11] N. O. Sokal and F. H. Raab, "Harmonic output of Class-E RF power amplifiers and load coupling network design," IEEE J. Solid-State Circuits, vol. SC-12, no. 1, pp. 86-88, Feb. 1977. [Text corrections are available from N. O. Sokal.]

[12] E. W. Bryerton, W. A. Shiroma, and Z. B. Popović, "A 5-GHz high-efficiency Class-E oscillator," IEEE Microwave and Guided Wave Letters, vol. 6, no. 12, Dec. 1996, pp. 441-443. [300 mW to external load at 5 GHz at 59% conversion efficiency (remaining RF output from transistor was used for input-drive to oscillator).]

[13] F. N. Sechi, "High efficiency microwave FET amplifiers," Microwave J., Nov. 1981, pp. 59-62, 66. [Several "saturated Class B and Class AB" amplifiers at 2.45 GHz, using several types of GaAs MESFETs: 0.97 W at 71% PAE, 1.2 W at 72% PAE, 1.27 W at 72% PAE. The waveform in Fig. 2 is a low-order Class-E waveform with an apparent resistance of (2.7 V)/(0.688 A) = 3.9 ohms. All drain-current waveforms are sinusoidal; that seems to be inconsistent with the non-sinusoidal drain-voltage waveforms. Perhaps the bandwidth of the current-sensing instrumentation was sufficient to display only the fundamental component of the probably non-sinusoidal current waveforms.]

[14] A. Mallet, D. Floriot, J. P. Viaud, F. Blache, J. M. Nebus, and S. Delage, "A 90% power-added-efficiency GaInP/GaAs HBT for L-band and mobile communication systems," IEEE Microwave and Guided Wave Letters, vol. 6, no. 3, pp. 132-134, March 1996. [Fig. 1 is well-annotated with the HBT parameter values, but it omits some values.]

[15] S. R. Mazumder, A. Azizi, and F. E. Gardiol, "Improvement of a Class-C transistor power amplifier by second-harmonic tuning," IEEE Trans. MTT, vol. MTT-27, no. 5, pp. 430-433, May 1979. [800 mW output at 865 MHz, 53.3% collector efficiency, coupled-TEM-bar circuit. In a similar paper at the 9th European Microwave Conference, Sept. 1979, the same authors reported 64% collector efficiency at 800 mW output at 850 MHz.]

[16] J. J. Komiak, S. C. Wang, and T. J. Rogers, "High efficiency 11 watt octave S/C-band PHEMT MMIC power amplifier," Proc. IEEE 1997 MTT-S International Microwave Symp., Denver, CO, June 8-13, 1997, IEEE Catalog No. 0-7803-3814-6/97, pp. 1421-1424. [17 W at 5.1 GHz, 54.5% PAE, harmonic tuning]

[17] J. J. Komiak and L. W. Yang, "5 watt high efficiency wideband 7 to 11 GHz HBT MMIC power amplifier," Proc. IEEE 1995 Microwave and Millimeter-Wave Monolithic Circuits Symp., Orlando, FL, May 15-16, 1995, IEEE Catalog No. 95CH3577-7, pp. 17-20.

[18] W. S. Kopp and S. D. Pritchett, "High efficiency power amplification for microwave and millimeter frequencies," 1989 IEEE MTT-S Digest, IEEE Catalog No. CH2725-0/89/0000, pp. 857-858.

[19] Bill Kopp and D. D. Heston, "High-efficiency 5-watt power amplifier with harmonic tuning," 1988 IEEE MTT-S Digest, pp. 839-842. [12 FETs in parallel produced (from Table 3) 5.27 W output (apparently 0.27 W of that is lost in power-combining network) at 10 GHz with 35.3% PAE (Abstract says 5 W at 36% PAE). Exhaustive search for best combination of impedance vs. frequency. Built with distributed elements.]

[20] L. C. Hall and R. J. Trew, "Maximum efficiency tuning of microwave amplifiers," 1991 IEEE MTT-S Digest, IEEE Catalog No. CH2870-4/91/0000, pp. 123-126. [Circuit simulations of optimum design found by exhaustive search of 12-dimensional parameter space; the resulting design appears to be higher-order Class E with 3rd-harmonic resonator (Class F3).]

[21] T. Mader, M. Marković, Z. B. Popović, and R. Tayrani, "High-efficiency amplifiers for portable handsets," Conference Record, IEEE PIMRC'95 (Personal, Indoor, & Mobile Radio Communications), Sept. 1995, Toronto, Ontario, Canada, IEEE publication 0-7803-3002-1/95, pp. 1242-1245. [Class E, 0.94 W at 1 GHz, at 75% drain efficiency, 73% PAE, Siemens CLY5 GaAs MESFET]

[22] T. B. Mader and Z. B. Popović, "The transmission-line high-efficiency Class-E amplifier," IEEE Microwave and Guided Wave Letters, vol. 5, no. 9, Sept. 1995, pp. 290-292. [0.94 W at 1 GHz at 75% drain efficiency, 73% PAE; 0.55 W at 0.5 GHz at 83% drain efficiency, 80% PAE; Siemens CLY5 GaAs MESFET]

[23] T. B. Mader, "Quasi-optical Class-E power amplifiers," PhD thesis, 1995, Univ. of Colorado, Boulder, CO. [Class E with transmission lines: 0.55 W at 0.5 GHz at 83% drain efficiency, 80% PAE from Siemens CLY5 MESFET; 0.61 W at 5 GHz at 81% drain efficiency, 72% PAE from Fujitsu FLK052WG MESFET; four of the latter into a quasi-optical power combiner gave 2.4 W at 5.05 GHz at 74% efficiency, 64% PAE.]

[24] T. Sowlati, C. A. T. Salama, J. Sitch, G. Rabjohn, and D. Smith, "Low voltage, high efficiency GaAs Class E power amplifiers for wireless transmitters," IEEE J. Solid-State Circuits, vol. 30, no. 10, pp. 1074-1080, Oct. 1995; same authors and almost-identical title and text in Proc. IEEE GaAs IC Symposium, Philadelphia, PA, Oct. 18-19, 1994, IEEE Catalog No. 0-7803-1975-3/94, pp. 171-174. [24 dBm = 0.25 W output at 835 MHz, at >50% power-added efficiency using integrated impedance-matching networks (PAE would be 75% with hybrid matching networks), from a GaAs IC at 2.5 Vdc]

[25] T. Sowlati, Y. Greshishchev, C. A. T. Salama, G. Rabjohn, and J. Sitch, "Linear transmitter design using high efficiency Class E power amplifier," Conference Record, IEEE PIMRC'95 (Personal, Indoor, & Mobile Radio Communications), Sept. 27-29, 1995, Toronto, Ontario, Canada, IEEE publication 0-7803-3002-1/95, pp. 1233-1237. [24 dBm = 251 mW at 835 MHz, 65% PAE]

[26] J. Imbornone, R. Pantoja, and W. Bosch, "A novel technique for the design of high efficiency power amplifiers," European Microwave Conference, Cannes, France, Sept. 1994. [32.1 dBm = 1.6 W output at 850 MHz, at 62.3% power-added efficiency, from a GaAs IC (output stage and driver stage) with high-Q lumped elements, at 5 Vdc. Simulated waveforms for the optimized output stage are Class E with a turn-on voltage of about 18% of the 27.4 V peak, as discussed in Section 7.]

[27] K. Siwiak, "A novel technique for analyzing high-efficiency switched-mode amplifiers," Proc. RF Expo East '90, Nov. 1990, pp. 49-56. [higher-order Class E with 3rd-harmonic resonator (Class F3)]

[28] C. Duvanaud, S. Dietsche, G. Pataut, and J. Obregon, "High-efficient Class F GaAs FET amplifiers operating with very low bias voltages for use in mobile telephones at 1.75 GHz," IEEE Microwave and Guided Wave Letters, vol. 3, no. 8, pp. 268-270, Aug. 1993. [higher-order Class E with 3rd-harmonic resonator (Class F3)]

[29] R. M. Porter and M. L. Mueller, "High power switch-mode radio frequency amplifier method and apparatus," U. S. Patent 5,187,580, Feb. 16, 1993. [Class E with substantial voltage at turn-on, as in Section 4 here.]

[30] Y-O Tam and C-W Cheung, "High efficiency power amplifier with travelling-wave combiner and divider," Int. J. Electronics, vol. 82, no. 2, pp. 203-218, 1997. [Class E 450 MHz/5 W with 89.4% collector efficiency. The outputs of four such amplifiers were combined with a traveling-wave power-combiner, yielding 14.96 W output at 89.5% collector efficiency.]

[31] J. E. Mitzlaff, "High efficiency RF power amplifier," U. S. Patent 4,717,884, Jan. 5, 1988. [1.6 W at 76% drain efficiency at 840 MHz. At least 1.5 W output with [at least?] 74% efficiency over 50-MHz band centered at 840 MHz (6% band). Described as Class F. Appears to be high-order Class E with lumped and transmission-line resonators. Shows transistor voltage and current waveforms for three "prior-art" circuits, but not for the circuit covered by this patent. Detailed explanation of how to synthesize load network to produce desired input-port impedance vs. frequency.]

[32] M. Kessous and J.-F. Zürcher, "Amplificateur VHF en classe E utilisant un transistor à effet de champ (FET) VMOS de puissance" (VHF Class E amplifier using VMOS power FET), AGEN-Mitteilungen (Switzerland), no. 30, pp. 45-49, Oct. 1980. [2.58 W output at 145 MHz at 96.5% drain efficiency, 81.3% total efficiency, using Siliconix VMP-4 MOSFET]

[33] N. O. Sokal, "Design of a Class E RF power amplifier for operation at 2.45 GHz, and tests on a scaled-frequency model at 122.5 MHz" [1/20 frequency], Oct. 1979, unpublished report of Design Automation, Inc. Project 4198. [Used Raytheon RPC3315 GaAs MESFET intended to be used at 2.45 GHz. Initial test with frequency scaled-down by factor of 20, all inductors and capacitors (including transistor capacitances and expected wiring parasitic inductances) scaled-up by factor of 20, and all resistances, voltages, and currents at intended final values. 210 mW output, 77% drain efficiency, 24 mW input drive, 9.4 dB power gain, 71% overall efficiency, 68% PAE.]

[34] D. W. Cripe, "Improving the efficiency and reliability of AM broadcast transmitters through Class-E power," National Association of Broadcasters annual convention, May 1992, 7 pp.

[35] S. Hinchliffe and L. Hobson, "High power Class-E amplifier for high-frequency induction heating applications," Electronics Letters, vol. 24, no. 14, pp. 886-888, July 7, 1988. [>550 W at 3-4 MHz at >92% efficiency across the band, 450 W at 3.3 MHz at 96% efficiency from 104 Vdc, IRF450 MOSFET.]

[36] R. Redl and N. O. Sokal, "A 14-MHz 100-watt Class E resonant [dc/dc] converter: principles, design considerations and measured performance," Power Electronics Conf., San Jose, CA, Oct. 1986. [Class E dc/dc converter had 87% drain efficiency at 100 W dc output. IRF540 RF power stage supplied estimated 105 W at 91.4% efficiency because of estimated 5 W loss in coupling transformer and rectifier associated with 100-W dc load.]

[37] N. O. Sokal and Ka-Lon Chu, "Class-E power amplifier delivers 24 W at 27 MHz, at 89-92% efficiency, using one transistor costing $0.85," Proc. RF Expo East, Tampa, FL, Oct. 1993, pp. 118-127, and presented at RF Expo West, San Jose, CA, March 1993 but not in Proc. [International Rectifier (89%) and Harris Semiconductor (92%) IRF510 SMPS MOSFET; Harris device slightly larger die, lower on-resistance, and higher efficiency. Silicon-gate resistance (about 1-2 ohms, but never specified by vendor) was borderline-acceptable at 27.12 MHz for input-drive power; it would have been quite acceptable at 13.56 MHz.]

[38] N. O. Sokal and I. Novak, "Tradeoffs in practical design of Class-E high-efficiency RF power amplifiers," Proc. RF Expo East, Tampa, FL, Oct. 1993, pp. 100-117, and presented at RF Expo West, San Jose, CA, March 1993, but not in Proc.

[39] P. J. Poggi, "Application of high efficiency techniques to the design of RF power amplifier and amplifier control circuits in tactical radio equipment," Proc. MILCOM'95, San Diego, CA, Nov. 5-8, 1995, pp. 743-747.

[40] S. C. Cripps, RF Power Amplifiers for Wireless Communications, Artech House, Norwood, MA, 1999, ISBN 0-89006-989-1, pp. 170-177. [Fig. 6.19 on p. 176: GaAs MESFET, 840 MHz, 79% efficiency at 1.24 W output, 15 dBm (31.6 mW) input, power gain = 1.24 W/0.0316 W = 39.2 = 15.9 dB.]

[41] M. D. Weiss, M. H. Crites, E. W. Bryerton, J. F. Whitaker, and Z. Popović, "Time-domain optical sampling of switched-mode microwave amplifiers and multipliers," IEEE Trans. MTT, vol. 47, no. 12, pp. 2599-2604, Dec. 1999.

[42] M. D. Weiss and Z. Popović, "A 10 GHz high-efficiency active antenna," 1999 IEEE MTT-S International Microwave Symposium Digest, June 13-19, 1999, Anaheim, CA, file TU4B_5.PDF on CD-ROM IEEE Catalog No. 99CH36282C.

[43] E. Lau (KE6VWU), K-W Chiu (KF6GHS), J. Qin (KF6GHY), J. Davis (KF6EDB), K. Potter (KC6OKH), and D. Rutledge (KN6EK), "High-efficiency Class-E power amplifiers - Part 1," QST, vol. 81, no. 5, pp. 39-42, May 1997, and "... Part 2," vol. 81, no. 6, pp. 39-42, June 1997.


See next page for Fig. 6.

LINEAR TRANSMITTER ARCHITECTURES

Lars Sundström

Ericsson Mobile Communications AB/Lund University

Lund, Sweden

ABSTRACT

The need for linear transmitter architectures in current and future wireless systems is briefly discussed. The principles and properties of various linear transmitter architectures based on power amplifier linearization and direct modulation are given with a focus on analog implementation and battery-operated user equipment.

1. INTRODUCTION

For more than a decade there has been a migration from frequency and phase modulation towards more spectrally efficient modulation schemes in wireless systems. First, analog FM-based systems were replaced by digital standards, and now digital standards are being replaced or extended to include more efficient and flexible modulation schemes as we enter the era of third-generation mobile systems and a more diversified use of wireless communication in general. Common to these modulation techniques is that they require more or less linear transmitters to produce a sufficiently accurate waveform at high power levels that, in particular, preserves the desired spectral properties. Let us exemplify with some important standards:



D-AMPS (US) and PDC (Japan) - based on DQPSK narrowband modulation and root raised-cosine (RRC) filtering.

GSM - based on GMSK modulation (constant envelope, does not require a linear transmitter), but the extended standard (GSM Phase 2+) includes linear modulation (EDGE) that is spectrally compatible with the GMSK signal.

WCDMA - based on a QPSK-like wideband modulation scheme with RRC filtering. An extension of the standard is under development to provide higher data rates by the use of adaptive modulation and multicode schemes.

Bluetooth - based on FM today (constant envelope, does not require a linear transmitter), but an extension currently being developed (Bluetooth 2) is intended to improve the data rate substantially by the use of linear adaptive modulation that will require a linear transmitter.

Hiperlan II - one of the alternatives is based on OFDM (linear modulation).

For each standard there is a detailed specification for the transmitter. It regulates the output RF spectrum emissions (spectrum emission mask, adjacent channel leakage power ratio, etc.) to ensure compatibility in the frequency domain in general, and the error vector magnitude (EVM) to ensure that the waveform is sufficiently accurate to avoid significant loss in the link budget. The linearity of the transmitter is very important as it affects both output RF spectrum emissions and EVM.
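To make the EVM requirement concrete, the following Python sketch computes an rms EVM from measured and ideal constellation points. The normalization used here (rms error divided by rms reference) is one common convention and is an assumption of this sketch; each standard defines its own reference level and measurement filter. The function name and the example data are hypothetical.

```python
import math


def evm_rms(measured, reference):
    """RMS error vector magnitude, as a fraction of the rms reference amplitude."""
    if len(measured) != len(reference) or not reference:
        raise ValueError("need two equally long, non-empty symbol sequences")
    error_power = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    reference_power = sum(abs(r) ** 2 for r in reference)
    return math.sqrt(error_power / reference_power)


# Ideal QPSK points and a hypothetical measurement compressed by 10% in amplitude.
ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = [0.9 * s for s in ideal]

evm = evm_rms(received, ideal)
print(f"EVM = {100 * evm:.1f} %")
```

A pure 10% amplitude error on every symbol gives an EVM of 10% under this normalization; phase errors and noise add to the error vector in the same way.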

The linearity of a transmitter or the power amplifier (usually the dominating nonlinearity in a transmitter) may be specified as amplitude-to-amplitude (AM-AM) and amplitude-to-phase (AM-PM) conversion, that is, the output amplitude and phase through the nonlinearity as a function of the input amplitude. Note that this models only intermodulation distortion (IMD) and not harmonic distortion. IMD is, however, the most serious distortion product as it coincides with the desired signal (in frequency), and the techniques discussed below mainly reduce IMD.
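As a minimal sketch of this description, the following Python snippet applies an assumed AM-AM and AM-PM characteristic to the complex envelope of a two-tone signal and measures the third-order intermodulation product. The compression coefficient `c3`, the phase coefficient `p2` and all function names are hypothetical and chosen only for illustration; they do not describe any particular amplifier.

```python
import cmath
import math

def pa_envelope_model(x, c3=0.1, p2=0.2):
    """Memoryless PA acting on one complex envelope sample x.

    AM-AM (assumed): output amplitude r * (1 - c3 * r^2)
    AM-PM (assumed): phase shift of p2 * r^2 radians
    """
    r = abs(x)
    gain = (1.0 - c3 * r * r) * cmath.exp(1j * p2 * r * r)
    return gain * x

def dft_bin(samples, k):
    """Single DFT coefficient, normalised by the number of samples."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n)
               for i, s in enumerate(samples)) / n

N = 256
# Two-tone envelope: tones at bins +1 and -1 relative to the carrier.
two_tone = [complex(math.cos(2 * math.pi * i / N)) for i in range(N)]
out = [pa_envelope_model(x) for x in two_tone]

carrier = abs(dft_bin(out, 1))   # one of the two wanted tones
im3 = abs(dft_bin(out, 3))       # third-order intermodulation product
im3_dbc = 20 * math.log10(im3 / carrier)
print(f"IM3 relative to each tone: {im3_dbc:.1f} dBc")
```

Because the model operates on the envelope only, products at multiples of the carrier frequency never appear, which is the sense in which AM-AM and AM-PM conversion capture IMD but not harmonic distortion.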

Besides linearity, the transmitter should of course have a high power efficiency, in particular if we consider battery-operated user equipment. Furthermore, the efficiency will be of increasing importance as we go from voice transmission to data transmission. As more stringent linearity requirements are introduced it will be increasingly difficult to maintain a high efficiency, and this is where the linear transmitter architectures discussed below will be important.

Linear transmitter architectures can be divided into linearization architectures and direct modulation architectures. The former is defined as a technique where the nonlinear characteristic of the (final) power amplifier (PA) is exercised such that distortion is generated. A counteracting circuit (linearizer) is added to cancel or reduce the distortion. As opposed to this, direct modulation techniques avoid the nonlinear behaviour of the PAs altogether by applying modulation components directly to or at the PAs.

Besides linearity, we may also gain in efficiency when using a linearizer, as the PA can be operated deeper into its nonlinear region where the efficiency is higher. However, the boost in efficiency is limited because the linearizer will basically rescale the signal amplitude range back to regions with less efficiency (although the peak amplitude will remain the same). Thus, if further improvement in efficiency is desired, direct modulation should be considered instead, as it can exploit high-efficiency switched amplifiers for all output amplitudes.

A huge number of linearization architectures can be found in the literature. Most if not all are based on one or a combination of three basic techniques: predistortion, feedback and feedforward.

The classical feedforward technique is used in multicarrier basestation amplifiers as it provides an unprecedented combination of linearity and bandwidth, but also large volume and very low efficiency. The complexity of a practical feedforward amplifier is usually quite high, as accurate control of phase/delay and gain balance between parallel signal paths is required for proper operation. From a cost point of view this is usually not a problem, as the major cost is connected to the PA devices.

If we instead turn our focus towards linearization techniques that can be implemented with a high level of integration and are well suited for battery-operated and small user equipment, we find predistortion and feedback. As for direct modulation techniques, envelope elimination and restoration (EER) and linear amplification with nonlinear components (LINC) are the two most prominent ones in this respect. These four techniques will be discussed in more detail below.


2. PREDISTORTION

Adding a complementary nonlinearity in series with the PA is probably the most obvious technique for linearization. The basic principle of predistortion as well as predistortion at baseband, IF and RF are illustrated in figure 1. Note that these figures do not define the location of the interface between the digital and the analog domain. In practice, this interface may be located anywhere from the baseband signal generation up to and including an IF predistorter.

Predistortion is widespread, in some cases in the form of simple first-order compensation circuits [1][2] that cancel or reduce, say, the third-order term of the PA nonlinearity, and in other cases as a complex architecture where the actual predistortion takes place in the digital domain [3]. Predistortion can also be used together with other linearization schemes such as feedforward to further improve the linearity in a basestation amplifier.

In cases where a high degree of linearization is required, say 20dB and higher, continuous adaptation of the predistorter is a must to track the varying characteristics of the PA, caused by varying temperature, load, transmit power and aging. Thus, the complexity of predistortion tends to become rather high, as some means of monitoring the PA output signal is required to guide the adaptation process. Another important issue is memory effects. For narrowband signals it can be assumed that the PA is memoryless; consequently the predistorter can be memoryless too. This assumption holds to some extent, but for wideband basestation amplifiers with a stringent spectrum mask this is not the case.

Digital baseband and IF predistortion techniques receive a significant amount of attention nowadays as they are believed to have the potential of giving more cost-effective, smaller and more flexible basestation amplifiers compared with a feedforward implementation. While it is easier to represent an arbitrary nonlinearity in the digital domain and also do the necessary adaptation processing etc., the interface between the analog and digital domain will be moved closer to the PA, where the requirements on resolution and sampling rate will be substantially higher [11]. This is of course even more true if the digital part directly generates an IF signal instead of baseband components.

More advanced analog predistortion circuits do also exist. In theory there are many options to generate a predistorter function in the analog domain, such as the multi-tanh and translinear principles. However, a programmable predistorter should have properties that allow for adaptation towards a global minimum of distortion at the output of the PA. Thus it helps if the predistorter synthesizes a simple analytical function. In addition to this, both AM-AM and AM-PM distortion must be corrected for, and this increases the complexity. All this suggests that an analog predistorter should be implemented as a complex-valued polynomial gain function of low order [4][5].

Recently, integrated circuits in both BiCMOS and CMOS technology have been presented [6][7]. Both of them implement a fifth-order complex-valued polynomial with adjustable coefficients. The circuits can be configured to operate with baseband signals, in which case the input and output signals are complex-valued (see figure 2), or with IF/RF bandpass signals (see figure 3).
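A fifth-order complex polynomial gain function is commonly written as v_out = v_in (a1 + a3 |v_in|^2 + a5 |v_in|^4). The sketch below assumes that form; the coefficient names, the toy PA model and its coefficient `c3` are hypothetical and are not taken from [6] or [7]. It illustrates how adding the third-order and then the fifth-order term progressively flattens the complex gain of the predistorter-PA cascade.

```python
def predistort(v, a1=1 + 0j, a3=0j, a5=0j):
    """Fifth-order complex polynomial gain predistorter (assumed form)."""
    p = abs(v) ** 2
    return v * (a1 + a3 * p + a5 * p * p)

def toy_pa(v, c3):
    """Hypothetical weakly nonlinear PA: unit gain plus a third-order term."""
    return v * (1 + c3 * abs(v) ** 2)

def worst_gain_error(chain, amplitudes):
    """Largest deviation of the complex gain from 1 over the amplitude range."""
    return max(abs(chain(a) / a - 1) for a in amplitudes)

c3 = -0.05 + 0.02j                       # assumed AM-AM and AM-PM of the PA
amps = [0.1 * k for k in range(1, 11)]   # input amplitudes 0.1 ... 1.0

# No predistortion.
raw = worst_gain_error(lambda v: toy_pa(v, c3), amps)

# a3 = -c3 cancels the third-order term of the cascade.
third = worst_gain_error(lambda v: toy_pa(predistort(v, a3=-c3), c3), amps)

# With a3 = -c3 the remaining fifth-order term of the cascade is
# a5 - c3^2 - 2*c3*Re(c3), so this choice of a5 cancels it as well.
a5 = c3 * c3 + 2 * c3 * c3.real
fifth = worst_gain_error(
    lambda v: toy_pa(predistort(v, a3=-c3, a5=a5), c3), amps)

print(f"gain error: none={raw:.4f} third={third:.4f} fifth={fifth:.4f}")
```

In a real circuit the coefficients would be found by adaptation rather than from a known PA model, since the PA characteristic is not available in closed form.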


Experimental results with both these circuits have shown that the third-order intermodulation products could be reduced by more than 30dB. The amplifiers that were used were of class A type driven deep into saturation. However, as discussed in [9], the attainable improvement quickly drops as more nonlinear power amplifiers are used, i.e. class B and C amplifiers, when driven by a two-tone signal. When using signals with limited modulation depth the results will improve, though.

With wideband signals such as WCDMA, the memory effects in the power amplifier and in the surrounding circuitry such as matching networks and filters may have a detrimental effect on distortion cancellation. As discussed in [8], it is possible to introduce memory effects in the predistorter as well by rather simple means and obtain significant improvement in linearity.

Adaptation of the coefficients can be made quite easily. The adjacent channel power can be detected to guide an optimization algorithm as in [5]. The disadvantage with such a solution is slow adaptation that may not be able to track variations in the PA characteristics. Variations due to change in load and device temperature (due to change in output power) may call for faster adaptation schemes, and this requires demodulation of e.g. the I and Q components of the output signal. Together with the reference signal these can be used to calculate new coefficients for the polynomial. One such scheme with a rather high complexity is described in [10].

3. FEEDBACK

Negative feedback is a well-established technique to build accurate and linear amplifiers. With RF power amplifiers, however, it is difficult if not impossible to obtain a reasonable amount of loopgain and at the same time preserve stability (although the evolution of process technology might change this). Therefore, at RF, feedback usually means modulation feedback and not feedback of the entire broadband RF signal with all its harmonics. That is, the modulation components of the transmit signal are detected and compared with the corresponding components of the reference signal. This can be done using Cartesian or polar components at baseband or IF/RF. The techniques are illustrated in figures 4 and 5.

All of these techniques have been studied in detail for many years. In particular, Cartesian feedback has been put into products and proven to work well with narrowband signals. Furthermore, the technique is considered to be appropriate for TETRA¹ equipment where the spectrum mask is one of the most stringent.

1. TErrestrial Trunked RAdio, ETSI standard for digital land mobile radio

For wideband signals, however, modulation feedback appears to be less promising. The reason is the loop delay and how it affects stability. The stability can be studied by means of the phase margin. If the loop is characterised by a DC loop gain of $A_0$ and a single-pole cut-off frequency $f_0$, the phase margin can be calculated as

$$\phi_m = 90^\circ - 360^\circ \cdot A_0 f_0 \tau$$

where $\tau$ is the loop delay. Here, it is assumed that the pole contributes with 90°. Typically, the phase margin should be 60° or more to avoid noise peaking far from the carrier [12]. At the same time the loopgain-bandwidth product must be sufficiently large to suppress distortion products to levels below the spectrum mask.

We may assume that the baseband part of the loop may be given any transfer function we desire. The delay in the RF part of the loop, however, appears to be rather fixed. Matching networks in the PA and, if present, filters, couplers etc. contribute to a group delay that can be quite significant. The design of the matching networks is dictated by technology, device size, acceptable loss in the matching networks etc. Thus, the matching networks are usually of low order, and a first-order network may be studied by means of its equivalent RLC circuit, which exhibits a group delay of

$$\tau_g = \frac{2Q}{\omega_c} = \frac{Q}{\pi f_c}$$

where $Q$ is the quality factor and $f_c = \omega_c / 2\pi$ is the centre frequency of the network. Thus, a matching network tuned for 1GHz and with a Q of 3 gives 1ns of group delay. The total delay of a power amplifier is typically several ns, and if we consider an example with 5ns of delay and a DC loopgain of 100 (40dB), then a 60° phase margin is obtained for a loop bandwidth of approximately 167kHz. This bandwidth is typically chosen to be the same as the signal bandwidth or larger. Note that the signal bandwidth here refers to the bandwidth of the modulation components. That is, while the Cartesian components have the same spectral properties as the undistorted RF bandpass signal, polar components exhibit a much larger bandwidth.
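The numbers in this example can be checked with a few lines of Python. This is a sketch under the assumptions stated in the text: a single pole that contributes 90° of phase, a unity-gain frequency approximately equal to the DC loop gain times the cut-off frequency, and a pure delay for the rest of the loop. The function names are hypothetical.

```python
import math

def rlc_group_delay_s(q, f_centre_hz):
    """Group delay at resonance of a first-order (RLC) network: 2Q / omega_c."""
    return q / (math.pi * f_centre_hz)

def max_loop_bandwidth_hz(dc_loop_gain, loop_delay_s, phase_margin_deg=60.0):
    """Largest single-pole cut-off frequency that still meets the phase margin.

    Assumes the pole contributes 90 degrees and that the unity-gain
    frequency is approximately dc_loop_gain * cut-off frequency.
    """
    delay_budget_deg = 90.0 - phase_margin_deg
    unity_gain_hz = delay_budget_deg / (360.0 * loop_delay_s)
    return unity_gain_hz / dc_loop_gain

tau_match = rlc_group_delay_s(q=3, f_centre_hz=1e9)
f0 = max_loop_bandwidth_hz(dc_loop_gain=100, loop_delay_s=5e-9)

print(f"matching network delay: {tau_match * 1e9:.2f} ns")       # 0.95 ns
print(f"loop bandwidth for 60 deg margin: {f0 / 1e3:.0f} kHz")   # 167 kHz
```

Halving the loop delay doubles the attainable bandwidth for the same loop gain and phase margin, which is why the delay of the PA and its matching networks is the quantity to attack.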

In practice, the picture becomes more complicated as several additional (parasitic) poles present in the loop will contribute more phase shift, thus leaving less headroom for the delay. To some extent small delays, e.g. in the PA, can be compensated for, though.

It is worth noting that the fundamental limitation imposed by the delay completely disqualifies the use of most higher-order filters in the loop, e.g. SAW filters, as they exhibit very large group delays (several hundred nanoseconds and higher).

Nevertheless, in [13] a Cartesian feedback system was reported to give a 20dB reduction of third-order intermodulation distortion at 1.5MHz away from the carrier using a 500kHz two-tone test.

There is an additional aspect of the stability of Cartesian feedback that makes it more complicated compared to regular feedback [12]. The synchronism of the reference and output modulation components can only be maintained by a control loop that ensures the proper phase shift along the RF path [14][15]. Any deviation in phase shift from the optimal value will directly degrade the phase margin by the same amount. Note that this includes the varying AM-PM conversion in the power amplifier, which can easily extend to 20° over the entire amplitude range.

If the required improvement in linearity is modest, a more self-sufficient and less complex feedback technique can be exploited, such as envelope feedback and power feedback [16][17]. These only correct for AM-AM distortion, but in many cases this is sufficient. The basic technique is illustrated in figure 6.

To counteract the AM-AM distortion generated in the PA, a variable gain amplifier (VGA) with an appropriate gain control precedes the PA. The gain control is obtained from the error signal, given by the difference of the detected RF input signal and the detected, attenuated output signal. The error signal is amplified and filtered, and finally an optional offset is applied. The input and attenuated output signals are detected individually using exactly the same operation D, in practice an envelope detector or power detector. In a practical realization the VGA function might be a separate entity as in [17], but it could also be incorporated into the PA to reduce complexity and delay. In any case, the VGA function must not introduce any AM-PM distortion as there is no compensation for phase variations. This is usually not a problem, though, as the required gain control range of the VGA is rather limited.

A closer look at this scheme reveals that it is not as intuitive as regular feedback. The input-output amplitude relationship is given by (4) for envelope feedback and by (5) for power feedback, both expressed in terms of the input and output amplitudes. In both cases a loopgain can be defined. Note, however, that this is not entirely correct: if the gain variations in e.g. the PA are large, we should instead consider the differential loopgain.

In both cases it is, however, readily seen that the expression may be simplified if the loopgain is sufficiently large. But as the loopgain is dependent on the input signal amplitude, there will be no control of the loop for small input amplitudes, and this is where the optional offset will be effective. Assuming that the PA is linear with a constant gain over the range where the loopgain is insufficient, the offset may be set to obtain the desired gain for small input amplitudes as well. If the offset is not properly set, the architecture will actually become nonlinear even when a linear PA is used. To further illustrate the behaviour of this scheme, the amplitude gain derived from (4) and (5) is illustrated in figure 7 with various sets of parameters and a weakly nonlinear PA.

From figure 7 it can be seen that a proper setting of the offset is as important as having a sufficient loopgain. As a matter of fact, there is no reason to increase the loopgain if it is not followed by a corresponding increase in the accuracy of the offset.
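The role of the offset can be explored with a small static model. The topology below is an assumption made for illustration and is not the chapter's equations (4) and (5): the VGA gain is taken as `offset + k * (x - y / g_target)`, the detector D is an ideal envelope detector, and the PA is linear with gain `g_pa`. All names and parameter values are hypothetical.

```python
def envelope_feedback_output(x, g_pa, g_target, k, offset):
    """Static solution of y = g_pa * x * (offset + k * (x - y / g_target)).

    x        input amplitude
    g_pa     gain of the (here linear) PA
    g_target desired overall gain, set by the output attenuator 1 / g_target
    k        gain of the error amplifier
    offset   VGA gain when the error signal is zero
    """
    loop_gain = g_pa * k * x / g_target   # proportional to the input amplitude
    return g_pa * x * (offset + k * x) / (1.0 + loop_gain)

g_pa, g_target, k = 8.0, 10.0, 50.0
inputs = [0.001, 0.01, 0.1, 1.0]

for offset in (g_target / g_pa, 1.0):
    gains = [envelope_feedback_output(x, g_pa, g_target, k, offset) / x
             for x in inputs]
    print(f"offset={offset:.2f}: " + " ".join(f"{g:.3f}" for g in gains))
```

With the offset equal to `g_target / g_pa` the overall gain stays at `g_target` for every input amplitude. With any other offset the gain drifts from `offset * g_pa` at small amplitudes towards `g_target` at large amplitudes, so the loop itself introduces AM-AM distortion even though the PA in this model is perfectly linear.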


From figure 7 there appears to be no reason to use a power detector, as the loopgain drops much more rapidly with decreasing input signal compared to when using envelope detectors. But an envelope detector is a very nonlinear block. As such it is difficult to obtain matched detectors for the input and output signals, respectively. Mismatch between the detectors will degrade the linearization performance. Furthermore, a signal that runs close to the origin will result in large spectral expansion, which will increase the loop bandwidth requirements. A power detector or squaring function, on the contrary, provides both good matching and limited bandwidth expansion.

Practical results are promising: with a fairly linear PA as starting point, a 10dB reduction of distortion was obtained in [17] with a narrowband signal. Provided that the offset is properly set, power and envelope feedback are both self-sufficient schemes.

4. EER - Envelope Elimination and Restoration

Linearization of RF power amplifiers will improve efficiency as it will be possible to operate the PA deeper into the saturation region. However, as the output signal should be a replica of the reference signal, the output signal must have the same modulation depth as the reference signal. Thus, substantially less efficient regions of the PA will still be exercised and the boost in efficiency will be limited. To improve efficiency further the PA must always be operated in saturation or as a switched device. This is the main driver for direct modulation techniques. Envelope elimination and restoration (EER) is one such technique [18] and there are many similar architectures. The basic idea is illustrated in figure 8.

The PA is fed with a constant-envelope signal that only contains the phase component of the reference signal, and the amplitude component is applied by means of supply modulation. These components may be obtained using a limiter and an envelope detector, respectively, as shown in figure 8. Another option is to generate these components directly from the digital baseband circuitry.

As the supply modulation path is far from linear in practice, such an open-loop approach is only feasible for some standards with relaxed requirements on spectral emission and EVM. The linearity can be improved with feedback, though, in similar fashion to the polar feedback technique illustrated in figure 5 [19][20], at the expense of reduced bandwidth.

Even without a feedback loop the bandwidth of this scheme is rather limited in practice. One reason is found in the spectral expansion of the reference signal as it is decomposed into its polar components. This is exemplified in figure 9 with a WCDMA signal¹. Thus the bandwidth of the amplitude and phase paths of the EER architecture must be very much larger than the bandwidth of the reference signal; at baseband it can correspond to 3-4 times the symbol rate or more, depending on the spectral emission and EVM requirements. Furthermore, as the bandwidth of the various signal paths expands, the system becomes more sensitive, as interfering signals and noise will more easily couple into the circuitry and result in unwanted modulation of the output signal. Finally, as the signal bandwidth is increased, the spectral density of the desired components is reduced (assuming that the power of the signals remains the same).
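The spectral expansion caused by the polar decomposition can be illustrated with a much simpler signal than WCDMA. The sketch below uses a two-tone complex envelope as a stand-in (an assumption made for brevity, not the signal of figure 9) and prints the magnitudes of the first few DFT bins of the reference, its envelope and its phase component. The helper names are hypothetical.

```python
import cmath
import math

def dft_bin(samples, k):
    """Single DFT coefficient, normalised by the number of samples."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n)
               for i, s in enumerate(samples)) / n

N = 512
# Two-tone reference: energy only in bins +1 and -1.
ref = [complex(math.cos(2 * math.pi * i / N)) for i in range(N)]

# Polar components: amplitude |s| and constant-envelope phase term s / |s|.
envelope = [abs(s) for s in ref]
phase = [s / abs(s) if abs(s) > 0 else 1 + 0j for s in ref]

for name, sig in (("reference", ref), ("envelope", envelope), ("phase", phase)):
    levels = [abs(dft_bin(sig, k)) for k in range(8)]
    print(f"{name:9s} " + " ".join(f"{v:.3f}" for v in levels))
```

The reference occupies a single bin on each side of the carrier, whereas the envelope has slowly decaying components at every even bin and the phase term at every odd bin. Both polar paths therefore need far more bandwidth than the signal they jointly represent.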

Another bandlimiting factor is the varying supply. For optimal efficiency it is tempting to use a switched DC/DC converter, but the control signal bandwidth of such a converter is rather limited as it must be much lower than the switching frequency. Note also that it is the amplitude component that controls the DC/DC converter and, as shown earlier, the associated bandwidth is much larger than the bandwidth of the reference signal.

1. The modulation format for WCDMA is complex and varies with data rate and spreading factor. For simplicity all examples are based on a regular QPSK signal with filtering according to standard specifications. This corresponds to a special case when the same spreading factor is used for data and control, but it does not include complex scrambling (HPSK spreading).

The signal paths for the amplitude and phase components are fundamentally different in nature. This means that delay mismatch will be a potential problem as the two paths will not automatically track very well. Furthermore, as opposed to a delay mismatch between the I and Q components, the effect of delay mismatch between amplitude and phase components may result in severe spectral expansion at the output of the PA. The sensitivity to delay mismatch varies from one modulation scheme to another. The effect is illustrated for a WCDMA signal in figure 10, where the spectra are shown for delay mismatch of 1%, 2%, 5% and 10% of the chip period (260ns = 1/3.84Mcps). For modulation schemes such as the ones used in WCDMA and EDGE, the maximum tolerable delay mismatch is typically a few percent or less of the chip and symbol period, respectively, depending on the design margins.
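The effect of a delay mismatch between the two paths can be reproduced qualitatively with a two-tone stand-in signal (an assumption; figure 10 uses WCDMA). The sketch recombines the phase component with an envelope that is delayed by a few samples and measures the spurious component that appears three bins from the carrier. The function and variable names are hypothetical.

```python
import cmath
import math

def dft_bin(samples, k):
    """Single DFT coefficient, normalised by the number of samples."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n)
               for i, s in enumerate(samples)) / n

N = 512
ref = [complex(math.cos(2 * math.pi * i / N)) for i in range(N)]
envelope = [abs(s) for s in ref]
phase = [s / abs(s) if abs(s) > 0 else 1 + 0j for s in ref]

def eer_output(delay_samples):
    """Recombine the phase path with an envelope delayed by delay_samples."""
    return [envelope[(i - delay_samples) % N] * phase[i] for i in range(N)]

for d in (0, 2, 4):
    spur = abs(dft_bin(eer_output(d), 3))
    print(f"delay {d} samples: level at bin 3 = {spur:.2e}")
```

With perfectly aligned paths the recombined signal equals the reference and the spur vanishes. Any misalignment makes the sign of the output wrong for a short interval around each zero crossing of the envelope, and the resulting out-of-band energy grows quickly with the delay.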


5. LINC - LInear amplification with Nonlinear Components

LINC [21][22], like EER, can be considered as a direct modulation technique. The input signal is divided into two constant-envelope phasors that are separately amplified. The RF output signal is obtained by combining these two phasors after the PAs, see figure 11. The first proposed architectures were based on analog circuit techniques [22], but later on the use of digital techniques has been assumed to be the best option [23].

As is the case with EER, the advantage of LINC is its high efficiency potential, as switched amplifiers can be used to amplify the two phasors. LINC also shares the disadvantage of EER in that the nonlinear functions involved result in substantial frequency expansion of the internal signals compared with the reference signal. The phasors in the frequency domain may be represented by the sum of two signals, where the reference signal is a narrowband signal and e(f) is a wideband signal. The spectra of these two signals are shown in figure 12 for a WCDMA signal; e(f) has about the same power spectral density as the reference signal within the signal bandwidth and decays slowly with increasing frequency. Thus, the linearity of LINC relies on an accurate subtraction of two large quantities, and the effect of a small phase or gain imbalance between the two branches can be detrimental. As the power spectral density of e(f) is as high as -10 to -20dBc in adjacent channels and the requirements on spectral emission for this region could be as low as -60dBc, the accuracy of the subtraction should result in a residue equal to e(f) suppressed by some 40 to 50dB. This residue is obtained by scaling e(f) with a factor determined by the relative gain imbalance and the phase imbalance between the two branches. For example, a 40dB reduction of e(f) is obtained with 0.1 dB gain imbalance or 0.5° phase imbalance. If the phasors are generated at baseband and separately upconverted in quadrature modulators, we must also consider the imbalances and offsets within these building blocks [24]. To avoid this problem, the phasors can be generated at IF [25][26] (or even at RF). This would then, on the other hand, prevent us from using accurate digital techniques for the nonlinear functions involved unless a low digital IF is used [27]. With an analog (IF/RF) solution the bandwidth of the nonlinear function is one of the main obstacles for wideband signal generation, but experimental results show that it is at least feasible up to 1MHz of reference signal bandwidth [26].
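The signal separation and the sensitivity to branch imbalance can be sketched numerically. The code assumes the usual LINC decomposition into the phasors s + e and s - e, with e in quadrature to s, and models the imbalance as a complex gain g applied to the second branch, so that the wideband term e is scaled by |1 - g| at the output. Both the function names and this normalisation are assumptions made for illustration.

```python
import cmath
import math

def linc_phasors(s, a_max):
    """Split a sample s (|s| <= a_max) into two phasors of amplitude a_max."""
    r = abs(s)
    if r == 0:
        e = 1j * a_max   # any direction works when s is zero
    else:
        e = 1j * s * math.sqrt((a_max / r) ** 2 - 1)
    return s + e, s - e

def residue_db(gain_imbalance_db=0.0, phase_imbalance_deg=0.0):
    """Level of the uncancelled e term relative to e itself, in dB."""
    g = (10 ** (gain_imbalance_db / 20)
         * cmath.exp(1j * math.radians(phase_imbalance_deg)))
    return 20 * math.log10(abs(1 - g))

s = 0.3 + 0.4j
s1, s2 = linc_phasors(s, a_max=1.0)
print(abs(s1), abs(s2), (s1 + s2) / 2)   # both amplitudes 1, mean equals s

print(f"0.1 dB gain imbalance:  {residue_db(gain_imbalance_db=0.1):.1f} dB")
print(f"0.5 deg phase imbalance: {residue_db(phase_imbalance_deg=0.5):.1f} dB")
```

Under this model the two imbalances give a suppression of about 38.7 dB and 41.2 dB, respectively, which is in line with the roughly 40dB quoted in the text.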

While several authors have reiterated the 100% efficiency potential of LINC, very few have addressed this particular topic in more detail. Instead, prototype circuits have consistently been based on the use of traditional power combiners for the recombination process. Thereby, sufficient isolation between the two branches is obtained to avoid cross-products due to a nonlinear interaction of the two PAs. The penalty is loss. It can be shown that the “efficiency” of the power combiner alone is given by $\eta_c = 1/\mathrm{PAR}$, where PAR is the peak-to-average power ratio of the modulated signal, which typically gives $\eta_c$ below 50%.
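The combiner loss follows directly from the signal statistics. The sketch below assumes an isolating combiner fed by two constant-envelope branches, so that the total PA output power is constant and equal to the peak output power, and the combiner efficiency is the ratio of average to peak output power. The two-tone test signal and the function name are illustrative choices.

```python
import math

def combiner_efficiency(samples):
    """Average output power divided by peak output power, i.e. 1 / PAR."""
    powers = [abs(s) ** 2 for s in samples]
    return (sum(powers) / len(powers)) / max(powers)

N = 256
two_tone = [math.cos(2 * math.pi * i / N) for i in range(N)]

eff = combiner_efficiency(two_tone)
par_db = -10 * math.log10(eff)
print(f"PAR = {par_db:.2f} dB, combiner efficiency = {eff:.0%}")
```

A two-tone signal has a PAR of 3 dB and therefore reaches only 50% even with ideal amplifiers. Modulated signals with a higher PAR fare worse, since the difference between peak and average power is dissipated in the combiner.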

Despite its obvious shortcomings the LINC technique has been successfully implemented in multicarrier basestation amplifiers that cover the whole CDMA band of 60MHz. It is also worth mentioning that several techniques have been proposed that are based on LINC with global feedback (encompassing the PAs) to generate the phasors through individually modulated VCOs in each branch [28][29]. By doing so, less linear but power-efficient recombination techniques can be used as the loop will correct for the errors, but as with any feedback system the attainable bandwidth is limited.

The Neoteric signal concept [30] also derives from the LINC technique. As described above, in LINC the e(t)-signal in the two phasors is cancelled by subtraction at the output. This is the only way to cancel e(t) as this signal has the same center frequency as the desired signal. In the Neoteric signal concept, however, the signal that is added to the reference signal has a completely different center frequency but, still, the properties that ensure a constant-envelope phasor. Now, as the undesired signal is located around another frequency, it may be removed by filtering and, furthermore, only one amplifier is necessary. The major disadvantage with this technique is the wideband properties of the Neoteric signal, which must be preserved with high accuracy through the whole transmitter chain.

6. DISCUSSION AND SUMMARY

Linearity is dictated by the standards, efficiency is not. Sufficiently high linearity can always be obtained by means of a properly backed-off PA operating in class A/AB at the expense of poor efficiency. This leaves us with a large potential in efficiency improvement, which can be translated to increased talk time, reduced energy per transmitted bit or whatever measure is the most applicable. This efficiency potential can only be exploited by means of more advanced transmitter architectures.

For systems with moderate linearity requirements, direct modulation techniques will be the desired choice as they will provide higher efficiency compared with linearized PAs. Here the starting point is a very nonlinear but high-efficiency switched PA. Similarly, when the linearity specification is stringent, linearized PAs will be the preferred or only solution, as the starting point is a moderately nonlinear PA with moderate efficiency.

There are, however, a number of important issues that make the adoption of linear transmitter architectures non-trivial. For example, the available headroom in power consumption is limited if efficiency is still to be improved, especially for systems with large power control range and those with small output power levels in general. In addition to this, the large bandwidth associated with many of the new systems cannot easily be handled by most of the techniques discussed. Production cost is another important factor. For example, techniques that are based on IF signals may require additional SAW filters, which are generally avoided whenever possible.

For the various techniques that have been described we may summarize some of the most important disadvantages that should be addressed for a successful adoption of each technique. Predistortion can be very effective in improving linearity for wideband signals. One major disadvantage with this technique is the need for adaptation of the predistorter nonlinearity, which requires a feedback path from the PA output and most likely some digital signal processing as well. Feedback techniques are attractive as they, in principle, can be made self-sufficient. Here, however, more focus must be put on lowering the delay in the PA to allow for larger signal bandwidths. EER and its derivatives are very attractive from an efficiency point of view. The large bandwidths associated with the signal components make this technique sensitive to interfering signals and spectral shaping. Also, the need for an accurate and fast DC/DC converter for supply modulation is a critical issue. LINC does not require supply modulation, but like EER the technique operates with internal signals that have large bandwidths. Otherwise, the main problem with LINC is the recombination of the PA output signals. No one so far has identified a sufficiently linear technique that can preserve the high efficiency of a switched PA while operating with frequencies in the GHz range.


7. REFERENCES

[1] C. S. Yu, W. S. Chan, and W. L. Chan, “Linearised 2 GHz amplifier for IMT-2000”, In Proceedings of IEEE 51st Vehicular Technology Conference, 2000, pp. 245-248.

[2] M. Nakayama, K. Mori, K. Yamauchi, Y. Itoh, and T. Takagi, “A novel amplitude and phase linearizing technique for microwave power amplifiers”, In IEEE MTT-S International Microwave Symposium Digest, 1995, pp. 1451-1454.

[3] J. K. Cavers, “Amplifier linearization using a digital predistorter with fast adaptation and low memory requirements”, IEEE Transactions on Vehicular Technology, vol. 39, Nov. 1990, pp. 374-382.

[4] J. Namiki, “An automatically controlled predistorter for multilevel quadrature amplitude modulation”, IEEE Transactions on Communications, vol. 31, no. 5, May 1983, pp. 707-712.

[5] S. P. Stapleton, G. S. Kandola and J. K. Cavers, “Simulation and analysis of an adaptive predistorter utilizing a complex spectral convolution”, IEEE Transactions on Vehicular Technology, vol. 41, no. 4, November 1992, pp. 387-394.

[6] T. Rahkonen, T. Kankaala, and M. Neitola, “A programmable analog polynomial predistortion circuit for linearising radio transmitters”, In Proceedings of the 24th European Solid-State Circuits Conference, ESSCIRC, 1998, pp. 276-279.

[7] E. Westesson and L. Sundström, “A complex polynomial predistorter chip in CMOS for baseband or IF linearization of RF power amplifiers”, In Proceedings of the 1999 International Symposium on Circuits and Systems, ISCAS’99, 1999, pp. 206-209.

[8] J. Vuolevi, J. Manninen, and T. Rahkonen, “Cancelling the memory effects in RF power amplifiers”, to appear in Proc. of International Symposium on Circuits and Systems, ISCAS’01.

[9] T. Kankaala, V. Jutila, A. Heiskanen, and T. Rahkonen, “Using analog predistortion for linearizing class A - C amplifiers”, In Proc. 1998 Norchip Seminar, Lund, Sweden, November 9-10, 1998, pp. 257-263.

[10] M. Ghaderi, S. Kumar, and D. E. Dodds, “Fast adaptive polynomial I and Q predistorter with global optimisation”, In IEE Proceedings on Communications, vol. 143, no. 2, April 1996, pp. 78-86.

[11] L. Sundström, M. Faulkner and M. Johansson, “Effects of reconstruction filters in digital predistortion linearizers for RF power amplifiers”, IEEE Transactions on Vehicular Technology, vol. 44, Feb. 1995, pp. 131-139.

[12] M. A. Briffa and M. Faulkner, “Stability analysis of Cartesian feedback linearisation for amplifiers with weak nonlinearities”, IEE Proceedings on Communications, vol. 143, no. 4, August 1996, pp. 212-218.

[13] M. Johansson and T. Mattsson, “Linearised high-efficiency power amplifier for PCN”, Electronics Letters, vol. 27, no. 9, April 1992, pp. 762-764.

[14] M. Faulkner, “An automatic phase adjustment scheme for RF and Cartesian feedback linearizers”, IEEE Transactions on Vehicular Technology, vol. 49, no. 3, May 2000, pp. 956-964.

[15] J. L. Dawson and T. H. Lee, “Automatic phase alignment for high bandwidth Cartesian feedback power amplifiers”, In Proc. of IEEE Radio and Wireless Conference 2000, pp. 71-74.

[16] T. Arthanayake and H. B. Wood, “Linear amplification using envelope feedback”, Electronics Letters, vol. 7, no. 7, April 1971, pp. 145-146.

[17] B. Shi and L. Sundström, “A 3.3V power feedback chip for linearization of RF power amplifiers”, Journal of Analog Integrated Circuits and Signal Processing, vol. 26, January 2001, pp. 37-44.

[18] L. R. Kahn, “Single-sideband transmission by envelope elimination and restoration”, Proceedings of the IRE, July 1952, pp. 803-806.

[19] V. Petrovic and W. Gosling, “Polar-loop transmitter”, Electronics Letters, vol. 15, no. 10, May 1979, pp. 286-288.

[20] D. K. Su and W. J. McFarland, “An IC for linearizing RF power amplifiers using envelope elimination and restoration”, IEEE Journal of Solid-State Circuits, vol. 33, no. 12, December 1998, pp. 2252-2258.

[21] H. Chireix, “High power outphasing modulation”, Proceedings of the IRE, vol. 23, no. 11, November 1935, pp. 1370-1392.

[22] D. C. Cox, “Linear amplification with nonlinear components”, IEEE Transactions on Communications, vol. 22, no. 12, December 1974, pp. 1942-1945.

[23] S. A. Hetzel, A. Bateman, and J. P. McGeehan, “LINC transmitter”, Electronics Letters, vol. 27, no. 10, May 1991, pp. 844-846.

[24] L. Sundström, “Spectral sensitivity of LINC transmitters to quadrature modulator misalignments”, IEEE Transactions on Vehicular Technology, vol. 49, no. 4, July 2000, pp. 1474-1487.

[25] B. Shi and L. Sundström, “A translinear-based chip for linear LINC transmitters”, In Digest of Technical Papers, Symposium on VLSI Circuits 2000, pp. 58-61.

[26] B. Shi and L. Sundström, “An IF CMOS signal component separator chip for LINC transmitters”, IEEE Custom Integrated Circuits Conference, May 2001, pp. 49-52.

[27] C. P. Conradi, “LINC transmitter linearization techniques”, M.Sc. thesis, University of Calgary, Canada, January 2000.

[28] M. K. DaSilva, “Vector locked loop”, U.S. Patent 5,105,168, Apr. 14, 1992.

[29] A. Bateman, “The combined analogue locked loop universal modulator (CALLUM)”, In Proc. 42nd IEEE Vehicular Technology Conference, 1992, pp. 759-763.

[30] R. E. Schemel, “Neoteric signal: method for linearising narrow-band amplifiers or signal paths up to their peak powers”, Electronics Letters, vol. 36, no. 7, March 2000, pp. 666-668.

GaAs Microwave SSPA’s: Design and characteristics

A.P. de Hek and F.E. van Vliet
TNO Physics and Electronics Laboratory
P.O. Box 96864, 2509 JG The Hague, The Netherlands

Email: [email protected]; [email protected]

ABSTRACT

The performance of GaAs SSPA’s is crucial to a rapidly increasing number of systems. This tutorial aims at clarifying the design choices and trade-offs, and at warning the new designer of pitfalls and unexpected problems.

The tutorial starts, after a brief introduction, with a survey of the relevant GaAs technologies. After this, the tutorial follows the steps of a normal broadband microwave GaAs SSPA design: the transistor unit cell and then the operating point are chosen, and the trade-off between power and gain for the load impedance is made. The topology for the total amplifier is determined, based on paralleling sufficient transistors for the required output power, and adding stages to achieve the required gain. Finally the matching is performed, starting at the output and working back to the input. Everywhere, the stability of the transistors and the total amplifier is of concern. Oscillations pose the biggest threat to SSPA designers.

Finally, the design steps are illustrated with a recent example of a 5-7 Watt, 30 dB gain HFET amplifier. At the end of the tutorial, a relatively long list of references is included, especially to assist the new designer in finding his way in the literature.


1. INTRODUCTION

The design of broadband microwave GaAs SSPA’s is a difficult job.

The performance of such an SSPA is strongly dependent on the topology of the SSPA, on the (non-linear) models of active and passive components used during the design, on the choice of the transistor sizes and shapes and on the technology chosen at the start. Furthermore, the design is often started with incomplete information and the parameters mentioned before cannot be changed independently of each other. As a result, the performance of the amplifier will be moderate at best unless highly accurate modelling is used for everything in the amplifier. In addition, as if all that were not enough, the nature of common SSPA’s is such that a variety of typical oscillations can occur.

However, for a number of systems the performance of the SSPA is of crucial importance for the performance of the system, giving a legitimate reason to spend sufficient time on its design. Historically, this is definitely the case for expensive military phased-array radar systems. Here a sufficient bandwidth is essential for guaranteed radar performance in the presence of a jamming signal, while the maximum output power secures the detection range of the radar and the power-added efficiency requirements limit the amount of DC power needed to operate the radar. Modern low-cost handheld telephones, to take the opposite application, are in a similar way dependent on the SSPA performance. Talk time is directly related to the efficiency of the SSPA, and stringent linearity and power control requirements give good reasons to invest in design time.

There is much more to SSPA’s that falls outside the scope of this tutorial, which is limited to the design. The characterisation methods of non-linear devices are not discussed, although especially for power transistors with their characteristic low output impedance this should be done properly. The thermal aspects of SSPA’s are not discussed, but are definitely important for reliability reasons and have a direct impact on the packages as well. Both thermal issues and packages, but also compact and physical models, and pulsed vs. continuous operation will not be discussed here.

2. TECHNOLOGY

2.1. Introduction

As was made clear by the title, this tutorial focuses on GaAs. The reason for this is the frequency range considered. At microwave frequencies above a few GHz, GaAs devices dominate the market; between one and a few GHz, GaAs coexists with Si devices. The reason for this lies in the high cut-off frequency of GaAs transistors, in turn caused by the high electron mobility and saturation velocity.

The majority of the technologies are either MESFET, HFET or HBT. Until recently, the reliability of HBT transistors was questionable, and we therefore focus on MESFET and HFET technologies. A lot of the techniques are however directly applicable to HBT designs.

2.2. Transistor modelling

First and most important is the modelling of the power transistor. SSPA’s have very specific requirements for the exact transistor used. Standard foundry models therefore are seldom satisfactory. In order to have maximum flexibility, a matrix of transistors with variations in all design parameters must be available and characterised.

First, S-parameters and IV-curves must be measured. From these, an equivalent circuit is deduced. Standard simulation software has a large variety of models available. Experience so far has shown that the EEFET3 and EEHEMT1 models were satisfactory for our designs for MESFETs and HFETs respectively. As an illustration, the equivalent circuit of an HFET power transistor is shown in figure 2.1.


For the chosen power transistor, the optimum load impedance must then be determined, based on measurements. This is not an easy measurement, due to the very low output impedance of the transistor. For a measurement system based on passive tuners, it is hard to measure at the right output impedance; the losses in probes and cables prevent the low impedance from being presented to the transistor. An active load-pull set-up [1] is preferable; our own set-up is shown in the figure below for illustration purposes [2].


2.3. Modelling of passive components

The only components left to be modelled are the passive components. In SSPA’s, the passive components considered are usually transmission line structures, capacitors, resistors and via holes. Again, the foundry-supplied models are seldom of sufficient accuracy and should therefore not be trusted. Measurements rule, but fortunately the availability of EM simulators, such as Agilent’s Momentum and Sonnet Software’s EM, has made the modelling task much easier. The use of EM simulations in combination with proper verification structures has turned out to be the right approach.

The importance of these simulations is illustrated with two cases: the first compares, for a microstrip 45° bend, the standard equivalent circuit models with EM simulations, and the second compares the measured values of a parallel capacitor embedded in microstrip lines with its EM counterpart. Note that with the meshing of capacitors, it is crucial that the meshes of top- and bottom-plate match.


With the design increasing in complexity, bigger parts of, for example, the matching can be simulated as a whole in the EM simulator. Be prepared to perform a lot of these simulations!

For some structures the metal thickness plays a role. This is generally difficult to take into account with planar EM simulators. It may require the use of a full three-dimensional simulator, such as Agilent’s HFSS, for the problem to be solved.

3. UNIT TRANSISTOR CELL

3.1. Introduction

The first step in the high-power amplifier design is the determination of the amplifier topology, see chapter 4, and the selection of the unit transistor cell. In this chapter the selection of the unit transistor cell and the factors that influence the performance of the transistor are discussed in section 3.2. In section 3.3 the effect of the operating class on both output power and power added efficiency is discussed. Finally, in section 3.4 the stability of the unit transistor cell is discussed.

3.2. Selection unit transistor cell

The size of the unit transistor cell is determined by the required output power and the number of transistors used in parallel. These numbers result in a required output power per transistor. Consequently, the total amount of gate width needed is known for a given transistor technology. The degree of freedom that is left to the designer for the realisation of the required gate width is the selection of the number of gate fingers and the unit gate width of these fingers. Other factors that influence the performance of the unit transistor cell are:

- The gate-to-gate spacing of the transistor.
- The layout of the transistor.

The effect of the aforementioned items is strongly dependent on the frequency band of interest. In general, the higher the frequency, the more important these factors become.
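The sizing argument of this section can be sketched numerically. The following is a minimal first-order sketch, not the authors' procedure: the function name, the assumed output power density (W/mm of gate width) and the assumed limit on the unit gate width are illustrative assumptions, and losses in the output matching network are ignored. Only the limit of 50 parallel gate fingers is taken from the text.

```python
import math


def size_unit_cell(p_out_total_w, n_parallel, power_density_w_per_mm,
                   max_unit_gate_width_um, max_fingers=50):
    """Estimate gate width, finger count and unit gate width per transistor.

    First-order estimate only: output-network losses are ignored and the
    power density is treated as independent of the transistor geometry.
    """
    inputs = (p_out_total_w, n_parallel, power_density_w_per_mm,
              max_unit_gate_width_um, max_fingers)
    if any(value <= 0 for value in inputs):
        raise ValueError("all inputs must be positive")

    # Required output power per transistor and the gate width it implies.
    p_per_transistor_w = p_out_total_w / n_parallel
    gate_width_um = 1000.0 * p_per_transistor_w / power_density_w_per_mm

    # Smallest finger count that keeps each finger within the width limit.
    n_fingers = math.ceil(gate_width_um / max_unit_gate_width_um)
    if n_fingers > max_fingers:
        raise ValueError("finger limit exceeded: use more transistors in parallel")

    return gate_width_um / 1000.0, n_fingers, gate_width_um / n_fingers


# Hypothetical example: 7 W from eight cells at an assumed 0.7 W/mm,
# with an assumed 100 um limit on the unit gate width.
width_mm, fingers, unit_width_um = size_unit_cell(7.0, 8, 0.7, 100.0)
print(f"{width_mm:.2f} mm per transistor: {fingers} fingers of {unit_width_um:.1f} um")
```

If the finger limit is exceeded the sketch raises an error, which mirrors the design choice described in the text: the remaining option is to place more transistors in parallel.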

Examples of the two most commonly encountered transistor layouts are shown in figure 3.1. At higher frequencies (f > 5 GHz) the transistors based on a fishbone layout will have better performance because the unit gate width is smaller; this will be discussed in more detail in the remainder of this section. The use of fishbone transistors can also result in amplifiers that occupy a smaller chip area for a given output power. This is because fewer transistors in parallel have to be used to realise the required output power.


The gain of a transistor decreases as the unit gate width of the transistor increases. This gain degradation is frequency dependent [3]. The higher the frequency, the more the gain decreases. An example of the gain reduction as function of the unit gate width is shown in figure 3.2.

The results show that the output power is more or less constant and the gain decreases with increasing unit gate width. Therefore, there is a maximum to the unit gate width. When this limit is reached, the only way to increase the output power any further is using more gate fingers in parallel. This is possible only to a certain extent. When more gate fingers are used in parallel, the phase difference between the inner and outer fingers starts to increase. Consequently, the gain of the transistor will be reduced. Practical limits at 10 GHz are a maximum unit gate width of and a maximum number of gate fingers in parallel of 50.


Another important issue that can be influenced by a designer is the gate-to-gate spacing of the transistor. Reducing this spacing has the obvious advantage that the size of the transistor diminishes and therefore the amplifier size is reduced. A disadvantage is the decreasing mean time to failure when the junction temperature is increased [4]. In figure 3.3, an example of the calculated junction temperature as function of the gate-to-gate spacing is shown. These calculations are performed with the help of Hotpac [5]. A maximum junction temperature of 125 °C is a generally accepted upper limit. From the results depicted in figure 3.3 it can be concluded that the minimum gate-to-gate spacing is A larger spacing will not give a much lower temperature but will only result in a larger unit transistor cell.

3.3. Operating class and load impedance

After the dimensions and layout of the unit transistor cell are chosen, it is time to select the operating point of the transistor and determine the load impedance. The drain voltage is bounded by the minimum of the voltage swing, set by the knee voltage of the transistor, and the maximum voltage, set by the breakdown voltage. A drain voltage in the middle of these two extreme values is a good choice. As a next step the operating class must be determined. This operating class is directly related to the Power Added Efficiency (PAE) of the transistor,

$$\mathrm{PAE} = \frac{P_{out} - P_{in}}{P_{DC}} = \frac{P_{out}}{P_{DC}}\left(1 - \frac{1}{G_p}\right) \qquad (3.1)$$

where $P_{out}$ is the RF output power, $P_{in}$ the RF input power, $P_{DC}$ the dissipated DC power and $G_p = P_{out}/P_{in}$ the power gain.

Equation 3.1 shows that not only a good ratio of output power to dissipated DC power is essential, but also that the power gain of the transistor should be as high as possible. In [6] an equation for the PAE is given in which the class A power gain is related to the operating class. The depicted results show that the operating point that will give maximum PAE moves from class A for a transistor with a low power gain to class B for one that has a high power gain. In general, a class AB operating point will give the maximum PAE. The operating class also influences the output power. In [6] it is shown that the maximum obtainable output power is found for an operating point which lies between class A and B, and that it decreases towards class C. The choice of operating class is also limited by the application in which the amplifier will be used. The use of, for instance, class C is not allowed for amplifiers that have high demands with respect to linearity.
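The dependence of the PAE on the power gain can be illustrated with a few lines of code. This is a minimal sketch using the standard definition PAE = (P_out − P_in)/P_DC; the function names and the example numbers are illustrative assumptions and are not taken from the tutorial.

```python
def pae(p_out_w, p_in_w, p_dc_w):
    """Power-added efficiency from RF output, RF input and DC power (watts)."""
    if p_dc_w <= 0:
        raise ValueError("DC power must be positive")
    return (p_out_w - p_in_w) / p_dc_w


def pae_from_gain(output_to_dc_ratio, gain_db):
    """Same quantity written as (P_out / P_DC) * (1 - 1 / G_p)."""
    gain = 10.0 ** (gain_db / 10.0)
    return output_to_dc_ratio * (1.0 - 1.0 / gain)


# Hypothetical example: the same P_out / P_DC ratio of 50 % gives a
# noticeably lower PAE when the power gain is low.
for gain_db in (6.0, 10.0, 15.0):
    print(f"gain {gain_db:4.1f} dB -> PAE {100 * pae_from_gain(0.5, gain_db):.1f} %")
```

The loop shows why a high power gain matters: the input power that has to be supplied is subtracted from the useful output power, so at low gain a good output-to-DC power ratio alone is not enough.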

The discussed result is valid for amplifiers that are operated in their ‘linear’ region. Other ways to improve the PAE are driving the transistor into compression [7] and/or applying harmonic terminations [8,9]. Whether harmonic tuning is applicable depends on the required frequency band of interest. If the bandwidth of the application is large, it will become difficult if not impossible [3] to realise the required load impedances. For higher frequencies, the influence of the capacitor between drain and source starts to increase. This capacitor starts to act as a harmonic termination (short) [10]. Consequently, the effect of harmonic terminations is reduced.


The next step is the determination of the optimum load impedance. Commonly encountered methods are:
1. The Cripps method [10,11].
2. Load-pull simulation with a large-signal transistor model.
3. Load-pull measurements.

The latter method gives the most accurate results but involves the use of costly measurement equipment, which is not always available. For these cases, method 1 or 2 can give a reasonable estimation of the load impedance.
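As an illustration of the first method, the sketch below computes a first-order class A load-line estimate in the spirit of the Cripps approach, combined with the mid-point choice of drain voltage described earlier in this section. It is a simplified sketch under stated assumptions: ideal sinusoidal voltage and current swings, no output capacitance or other parasitics, and hypothetical device values. The function name is illustrative.

```python
def load_line_estimate(v_knee, v_breakdown, i_max):
    """First-order class A load-line estimate.

    Returns (drain bias voltage, optimum load resistance, output power).
    Assumes the voltage swings between knee and breakdown voltage and the
    current between zero and i_max, with all parasitics ignored.
    """
    if not (0 <= v_knee < v_breakdown) or i_max <= 0:
        raise ValueError("need 0 <= v_knee < v_breakdown and i_max > 0")

    v_drain = 0.5 * (v_knee + v_breakdown)   # bias in the middle of the swing
    v_amplitude = v_drain - v_knee           # peak RF voltage amplitude
    i_amplitude = 0.5 * i_max                # peak RF current amplitude

    r_opt = v_amplitude / i_amplitude
    p_out = 0.5 * v_amplitude * i_amplitude
    return v_drain, r_opt, p_out


# Hypothetical device: 1 V knee, 17 V breakdown, 0.5 A maximum current.
v_d, r_opt, p_out = load_line_estimate(1.0, 17.0, 0.5)
print(f"Vd = {v_d:.1f} V, Ropt = {r_opt:.1f} ohm, Pout = {p_out:.2f} W")
```

Such an estimate is only a starting point; as the text notes, load-pull simulation or measurement is needed for accurate values.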

3.4. Stability unit transistor cell

The final step that needs to be taken before the overall amplifier design can start is the analysis and, if necessary, improvement of the stability of the unit transistor cell. At microwave frequencies, the stability analysis is commonly performed with the help of the K-factor [12]. Note that the K-factor is not sufficient for the analysis of the complete amplifier, see also chapter 4. Examples of other methods that can be used to analyse the stability of the transistor can be found in [13,14]. In general, the transistors are not unconditionally stable over the entire frequency band of interest. Therefore, networks that improve the stability of the transistor should be applied. The most effective way is the application of a series RC network at the input of the transistor, see figure 3.5.
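The K-factor test can be written down compactly. The sketch below evaluates the Rollett stability factor K and the determinant Δ of a two-port S-matrix at a single frequency and applies the usual condition K > 1 and |Δ| < 1 for unconditional stability. The function name and the S-parameter values are illustrative assumptions; in practice the check is repeated over the full frequency range, including frequencies outside the band of interest.

```python
import math


def rollett_stability(s11, s12, s21, s22):
    """Return (K, |delta|, unconditionally_stable) for a two-port.

    K     = (1 - |S11|^2 - |S22|^2 + |delta|^2) / (2 |S12 S21|)
    delta = S11 S22 - S12 S21
    """
    delta = s11 * s22 - s12 * s21
    feedback = abs(s12 * s21)
    numerator = 1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2
    if feedback == 0.0:
        # Unilateral device: K is unbounded, its sign follows the numerator.
        k = math.inf if numerator > 0 else -math.inf
    else:
        k = numerator / (2.0 * feedback)
    return k, abs(delta), (k > 1.0 and abs(delta) < 1.0)


# Hypothetical S-parameters at one frequency point.
print(rollett_stability(0.5, 0.1, 2.0, 0.5))   # low feedback
print(rollett_stability(0.5, 0.3, 3.0, 0.5))   # more gain and more feedback
```

As stated in the text, this two-port test applies to the unit transistor cell only; it does not reveal loop or odd-mode oscillations inside a complete amplifier.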

Stability improvement with the help of parallel or series feedback is not applicable in the case of a microwave power amplifier because too much gain and output power will be lost. In addition, the use of a network in series with the output of the transistor results in a reduction of both the output power and the PAE. The aforementioned RC network is also helpful in the suppression of parametric oscillations [15].

4. HIGH-POWER AMPLIFIER DESIGN

4.1. Introduction

In the previous chapter the selection of the unit transistor cell and the selection of the operating point were discussed. In this chapter the amplifier topology is discussed in section 4.2. In section 4.3, the design of the matching networks is discussed. Finally, this chapter concludes in section 4.4 with some remarks regarding the stability analysis of the complete power amplifier.


4.2. Amplifier topology

In the previous chapter, it was discussed that the performance of a transistor is limited by a maximum number of gate fingers and a maximum unit gate width. Therefore, a number of transistors have to be used in parallel to realise the required output power level, see figure 4.1.

The overall PAE of the complete two-stage amplifier can be calculated as a function of the losses of the matching networks and the transistor parameters.

From this equation it can be concluded that for a high overall PAE of the amplifier:
1. The loss of the output matching network should be as low as possible. Of course this demand is also essential for an as high as possible output power.
2. The gain and the PAE of the last stage transistors should be as high as possible.
3. The PAE and gain of the first stage transistors and the losses of the input and interstage matching networks cannot be neglected.
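The three conclusions above can be checked with simple power bookkeeping. The sketch below is not the tutorial's own equation, which is not reproduced here; it is a straightforward cascade calculation under the assumptions that each stage is described by a power gain and a PAE, and each matching network by an insertion loss in dB. All names and example values are illustrative.

```python
def db_to_ratio(value_db):
    """Convert a dB value to a linear power ratio."""
    return 10.0 ** (value_db / 10.0)


def two_stage_pae(p_out_w, gain1_db, pae1, gain2_db, pae2,
                  loss_in_db=0.0, loss_inter_db=0.0, loss_out_db=0.0):
    """Overall PAE of a two-stage amplifier with lossy matching networks.

    Works backwards from the amplifier output: every network loss (given
    as a positive dB value) increases the power the preceding stage has
    to deliver.
    """
    g1, g2 = db_to_ratio(gain1_db), db_to_ratio(gain2_db)

    # Second (output) stage.
    p2_out = p_out_w * db_to_ratio(loss_out_db)   # power at the transistor output
    p2_in = p2_out / g2
    p_dc2 = (p2_out - p2_in) / pae2

    # First (driver) stage.
    p1_out = p2_in * db_to_ratio(loss_inter_db)
    p1_in = p1_out / g1
    p_dc1 = (p1_out - p1_in) / pae1

    p_in = p1_in * db_to_ratio(loss_in_db)        # power at the amplifier input
    return (p_out_w - p_in) / (p_dc1 + p_dc2)


# Hypothetical stages: 10 dB gain and 50 % PAE each, 1 W output power.
print(two_stage_pae(1.0, 10.0, 0.5, 10.0, 0.5))                   # lossless networks
print(two_stage_pae(1.0, 10.0, 0.5, 10.0, 0.5, loss_out_db=1.0))  # 1 dB output loss
print(two_stage_pae(1.0, 10.0, 0.5, 10.0, 0.5, loss_inter_db=1.0))
```

Comparing the last two calls shows the pattern stated in the list: a given loss in the output network costs far more overall PAE than the same loss in the interstage network, although the latter is not negligible.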

4.3. Design matching networks

The design of the matching networks starts at the output of the amplifier. After this network is realised, the interstage matching network(s) are designed. Finally, the input matching network is designed. The matching networks must perform the following functions:
1. Present the required source and load impedances to the input and output of both the transistors and the complete power amplifier.
2. Divide/combine power to and from the transistors.
3. Supply the bias voltages to the transistors.
4. Enhance the stability of the transistors.

As a first step in the design of the matching networks, a model of the source and load impedance of the matching networks is determined. For frequencies up to at least 12 GHz, the equivalent schematics depicted in figure 4.2 for the source and load impedances are valid for relative frequency bandwidths up to 40%.

When the component values are known, the maximum obtainable bandwidth and/or matching ratio can be analysed with the help of the Bode-Fano limit [16,17]. For the source and load impedances depicted in figure 4.2 the matching network can be synthesised with e.g. the theory described in [17]. Another approach, first described in [18] and known as the real frequency matching technique, does not require any knowledge regarding the model of the load and source impedance.
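To give a feeling for the Bode-Fano limit, the sketch below evaluates it for the simplest case of a parallel RC load, for which the integral of ln(1/|Γ|) over angular frequency is bounded by π/(RC). Assuming an idealised match with constant |Γ| inside the band and total reflection outside, the lowest achievable in-band reflection coefficient is exp(−1/(2·B·R·C)) with B the bandwidth in Hz. The parallel RC model, the function names and the example values are assumptions for illustration; they are not the models of figure 4.2.

```python
import math


def bode_fano_min_gamma(r_ohm, c_farad, bandwidth_hz):
    """Lowest constant in-band |Gamma| for a parallel RC load (ideal case)."""
    if min(r_ohm, c_farad, bandwidth_hz) <= 0:
        raise ValueError("all inputs must be positive")
    return math.exp(-1.0 / (2.0 * bandwidth_hz * r_ohm * c_farad))


def return_loss_db(gamma):
    """Return loss in dB for a reflection coefficient magnitude."""
    return -20.0 * math.log10(gamma)


# Hypothetical load: 50 ohm in parallel with 1 pF.
for bandwidth_ghz in (2.0, 4.0, 10.0):
    gamma = bode_fano_min_gamma(50.0, 1e-12, bandwidth_ghz * 1e9)
    print(f"{bandwidth_ghz:4.1f} GHz: |Gamma| >= {gamma:.4f} "
          f"(return loss <= {return_loss_db(gamma):.1f} dB)")
```

The bound applies to a lossless matching network of unlimited complexity; practical networks with a few elements stay well short of it, which is why the limit is mainly used to judge whether a bandwidth target is realistic at all.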

The design of the matching networks is demonstrated here for the output matching network. The first thing that is observed for this network is the need to combine transistors in parallel. In other words, there must be some kind of physical connection between the transistors. Therefore, the use of a low-pass matching network seems to be the correct choice. Another aspect that must be taken into account from the start is the way the bias voltages must be applied to the transistors. Wherever possible this is done with the help of a parallel inductor. This inductor should be applied at the point that has the lowest impedance level and therefore has the least influence on the overall performance of the matching network. In general, this point is located directly at the output of a transistor. The resulting equivalent schematic of the output matching network is depicted in figure 4.3.

At the output of the matching network a DC blocking capacitor is added. The analytical techniques from [17, 18] are not directly applicable for the determination of the component values of the matching network due to the existence of the bias inductor. To circumvent this problem the following procedure is used. As a first step, the source impedance is made real at the centre frequency with the help of the bias inductor. The influence of the blocking capacitor is considered negligible. As a second step, the component values according to a Tchebychev approximation are calculated with the equations given in [19]. In the final step, the overall performance is optimised as function of frequency. With the help of this approach, excellent results are obtained.
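The first step of this procedure can be sketched as follows, under the assumption that the transistor output is modelled as a resistance in parallel with a capacitance (an illustrative model, not necessarily that of figure 4.2). A parallel bias inductor L = 1/(ω0²·C) then cancels the capacitive susceptance at the centre frequency, leaving a real source impedance. Function names and values are hypothetical.

```python
import math


def bias_inductor_for_real_impedance(c_farad, f0_hz):
    """Parallel inductance that resonates out c_farad at f0_hz."""
    omega0 = 2.0 * math.pi * f0_hz
    return 1.0 / (omega0 ** 2 * c_farad)


def parallel_rlc_impedance(r_ohm, l_henry, c_farad, f_hz):
    """Impedance of R, L and C connected in parallel at frequency f_hz."""
    omega = 2.0 * math.pi * f_hz
    admittance = 1.0 / r_ohm + 1j * omega * c_farad + 1.0 / (1j * omega * l_henry)
    return 1.0 / admittance


# Hypothetical transistor output: 20 ohm in parallel with 1 pF, 10 GHz centre.
r_out, c_out, f0 = 20.0, 1e-12, 10e9
l_bias = bias_inductor_for_real_impedance(c_out, f0)
print(f"L_bias = {l_bias * 1e9:.3f} nH")
for f in (8e9, 10e9, 12e9):
    z = parallel_rlc_impedance(r_out, l_bias, c_out, f)
    print(f"{f / 1e9:4.1f} GHz: Z = {z.real:.2f} {z.imag:+.2f}j ohm")
```

The impedance is purely real only at the centre frequency; the residual reactance towards the band edges is what the subsequent low-pass network and the final optimisation step have to absorb.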

At this point, it is time to convert the ideal component values into their layout equivalent. The inductors are realised with the help of microstrip lines. This is necessary due to the high drain currents that will flow through the lines, which prevent the use of integrated inductors. The capacitors are realised with the help of MIM capacitors. An example of the layout of an output matching network is shown in figure 4.4.


Figure 4.4 shows that there is coupling between the various parts of the layout. Therefore, the use of an electromagnetic field simulator for a final optimisation of the layout is mandatory. In the aforementioned approach the effect of the losses has been accounted for with the help of the optimiser. Examples of approaches where the component losses are taken into account from the start of the design can be found in [21,22]. The design of the interstage and input matching networks can be performed in a similar way. The only exception might be the need for a frequency dependent loss to compensate for the frequency dependent gain roll-off of the transistors. The way to realise such a frequency dependent loss is not discussed here. Information regarding this subject can be found for instance in [23].

4.4. Stability analysis

After the design of the matching networks is completed, it is time to analyse the stability of the amplifier as much as possible at all thinkable operating conditions. The transistor cells have been stabilised for different load impedances. Unfortunately it is not possible to realise sufficient on-chip decoupling lower than 1 GHz. Therefore off-chip decoupling must be applied to guarantee stability.

The amplifiers can also become unstable due to the existence of on- and off-chip feedback loops. Methods to analyse this kind of instability are described in [24,25]. Inequalities in the transistors or matching networks can give rise to odd-mode oscillation [24].


This type of oscillation can be prevented by the use of odd-mode suppression resistors in between the transistors, see figure 4.5.

The final type of oscillation that needs attention is the so-called subharmonic or parametric oscillation [15]. In [26,27] an analysis method and insight into this type of oscillation are given. The stabilisation RC network discussed in the previous chapter is also very useful in the prevention of subharmonic oscillations [15].


5. DESIGN EXAMPLE SOLID STATE POWER AMPLIFIER

To conclude the described design procedure, we summarise a solid-state power amplifier design that was made with the help of the techniques described in the previous chapters. The amplifier design aimed at an output power between 5 and 7 Watt with a gain of 30 dB at X-band and maximum PAE. The required output power is realised by placing eight transistors in parallel, see figure 5.1.

At the input of the transistors RC stabilisation networks are placed. The amplifier is developed with the help of the HFET technology of the Fraunhofer Institute for Applied Solid State Physics (FhG-IAF) [28]. The results of this amplifier are depicted in figure 5.2.

The results show that the target goals mentioned at the beginning of this chapter have been reached. More information regarding amplifiers designed by TNO-FEL with methods described in this paper can be found in [29-34].

6. REFERENCES

[1] Y. Takayama, “A New Load-pull Characterization Method for Microwave Power Transistors”, 1976 IEEE MTT-S Symposium Digest, pp. 218-220, June 1976.

[2] A.P. de Hek, “A Novel Fast Search Algorithm for an Active Load-pull measurement system”, GAAS98 Symposium Digest, pp. 268-275, October 1998.

[3] J.L.B. Walker, “High-Power GaAs FET amplifier”, Artech House, 1993.

[4] W.J. Roesch, “Thermo-Reliability Relationships of GaAs ICs”, GaAs IC Symposium Digest, pp. 61-64, 1988.

[5] J.A. Albers, “HOTPAC: Programs for Thermal Analysis Including Version 3.0 of the TXYZ Program, TXYZ30, and the Thermal Multilayer Program, TML”, NIST special publication 400-96, August 1995.

[6] Y. Takayama, “Considerations for High-Efficiency Operation of Microwave Transistor Power Amplifiers”, IEICE Trans. Electron., vol. E80-C, pp. 726-732, June 1997.

[7] D. M. Snider, “A theoretical analysis and experimental confirmation of the optimally loaded and over-driven RF power amplifier”, IEEE Trans. Electron Devices, vol. ED-14, pp. 851-857, June 1967.

[8] H.L. Kraus, C.W. Bostian and F.H. Raab, “Solid State Radio Engineering”, chapters 12-14, John Wiley & Sons, 1981.

[9] F. H. Raab, “Class-F Power Amplifiers with Maximally Flat Waveforms”, IEEE Trans. Microwave Theory Tech., vol. MTT-45, pp. 2007-2012, November 1997.

[10] S.C. Cripps, “RF Power Amplifiers for Wireless Communications”, Artech House, 1999.

[11] S.C. Cripps, “A Theory for the Prediction of GaAs FET Load-pull Power Contours”, IEEE MTT-S Symposium Digest, pp. 221-223, 1983.

[12] J.M. Rollett, “Stability and Power Gain Invariants of Linear Two Ports”, IRE Trans. on Circuit Theory, vol. 9, pp. 29-32, March 1962.

[13] A. Platzker, W. Struble and K.T. Hetzler, “Instabilities Diagnosis and the Role of K in microwave Circuits”, IEEE MTT-S Symposium Digest, pp. 1185-1188, June 1993.

[14] W. Struble and A. Platzker, “A Rigorous Yet Simple Method for Determining Stability of linear N-port Networks”, GaAs IC Symposium Digest, pp. 251-254, 1993.

[15] D. Teeter, A. Platzker and R. Bourque, “A Compact Network for Eliminating Parametric Oscillations in High Power MMIC Amplifiers”, IEEE MTT-S Symposium Digest, pp. 967-970, June 1999.

[16] H.W. Bode, “Network Analysis and Feedback Amplifier Design”, D. Van Nostrand Company Inc., 1945.

[17] R.M. Fano, “Theoretical Limitations on the Broadband Matching of Arbitrary Impedances”, Journal of the Franklin Institute, vol. 249, pp. 57-83 and 139-154, January/February 1950.

[18] R.M. Cottee and W.T. Joines, “Synthesis of Lumped and Distributed Networks for Impedance Matching of Complex Loads”, IEEE Trans. Circuits Syst., vol. CAS-26, pp. 316-329, May 1979.

[19] H.J. Carlin, “A New Approach to Gain-Bandwidth Problems”, IEEE Trans. Circuits Syst., vol. CAS-24, pp. 170-175, April 1977.

[20] Y.S. Zhu and W.K. Chen, “Low-pass impedance transformation networks”, IEE Proc.-Circuits Devices Syst., vol. 144, pp. 284-288, October 1997.

[21] L.C.T. Liu and W.H. Ku, “Computer-Aided Synthesis of Lumped Lossy Matching Networks for Monolithic Microwave Integrated Circuits (MMIC’s)”, IEEE Trans. Microwave Theory Tech., vol. MTT-32, pp. 282-290, March 1984.

[22] L. Zhu, “A Novel Approach to the Synthesis of Mixed and Distributed Lossy Networks”, IEEE MTT-S Symposium Digest, pp. 1355-1358, June 1992.

[23] W.H. Ku and W.C. Peterson, “Optimum Gain-bandwidth Limitations of Transistor Amplifiers as Reactively Constrained Two-port Networks”, IEEE Trans. Circuits Syst., vol. CAS-22, pp. 523-533, June 1975.

[24] R.G. Freitag, “A Unified Analysis of MMIC Power Amplifier Stability”, IEEE MTT-S Symposium Digest, pp. 297-300, May 1992.

[25] M. Ohtomo, “Stability Analysis and Numerical Simulation of Multidevice Amplifiers”, IEEE Trans. Microwave Theory Tech., vol. MTT-41, pp. 983-991, June/July 1993.

[26] T. Takagi, M. Mochizuki, Y. Tarui, Y. Itoh, S. Tsuji and Y. Mitsui, “Analysis of High Power Amplifier Instability due to Loop Oscillations”, IEICE Trans. Electron., vol. E78-C, pp. 936-943, August 1995.

[27] J. Imbornone, M. Murphy, R.S. Donahue and E. Heaney, “New Insight Into Subharmonic Oscillation Mode of GaAs Power Amplifiers Under Severe Output Mismatch Condition”, IEEE Journal of Solid-State Circuits, vol. 32, pp. 1319-1325, September 1997.

[28] W. Marsetz, A. Hülsmann, K. Köhler, M. Demmler and M. Schlechtweg, “GaAs PHEMT with 1.6W/mm output power density”, Electronics Letters, vol. 35, pp. 748-749, April 1999.

[29] F.L.M. van den Bogaart, A.P. de Hek and A. de Boer, “MESFET High-power High-Efficiency Amplifiers at X-band with 30% bandwidth”, GAAS’96 proceedings, pp. 3A2-1 - 3A2-4, June 1996.

[30] A.P. de Hek, F.L.M. van den Bogaart, “Broadband High Efficient X-band MMIC Power Amplifiers for Future Radar Systems”, Wocsdice 97, Workshop on Compound Semiconductor Devices and Integrated Circuits proceedings, pp. 63-64, May 1997.

[31] F.L.M. van den Bogaart, A.P. de Hek, “First-pass Design Strategy for High-Power Amplifiers at X-band”, IEE Tutorial Colloquium on “Design of RFICs and MMICs” digest, pp. 8/1 - 8/6, November 1997.

[32] A.P. de Hek, F.L.M. van den Bogaart, “Optimisation of High-Power Amplifiers using non linear models”, IEEE European Workshop on: Non-Linear Device Characterisation and Use in RFIC and MMIC Power Amplifier Design, July 1999.

[33] A.P. de Hek, P.A.H. Hunneman, M. Demmler, A. Hülsmann, “A Compact Broadband High Efficient X-band 9-Watt PHEMT MMIC High Power Amplifier for Phased Array Radar Applications”, GAAS’99 Symposium Digest, pp. 276-280, October 1999.

[34] A.P. de Hek, P.A.H. Hunneman, “Small sized high-gain power amplifiers for X-band applications”, GAAS’00 Symposium Digest, pp. 221-223, October 2000.

MONOLITHIC TRANSFORMER-COUPLED RF POWER AMPLIFIERS IN SI-BIPOLAR

W. Simbürger, D. Kehrer¹, A. Heinz, H.D. Wohlmuth, M. Rest, K. Aufinger, A.L. Scholtz¹

INFINEON Technologies AG, Corporate Research, High Frequency Circuits
Otto-Hahn-Ring 6, D-81739 Munich, Germany

¹ Technical University of Vienna, Institute of Communications and Radio-Frequency Engineering
Gusshausstrasse 25/389, A-1040 Vienna, Austria

ABSTRACT

Monolithic integrated lumped planar transformers were introduced more than ten years ago. We present a comprehensive review of the electrical characteristics which results in an accurate lumped low-order equivalent model. Amplifiers, mixers and Meissner-type voltage controlled oscillators using monolithic transformers were published a few years ago. For the first time, integrated transformer-coupled power amplifiers with a high performance up to 2 GHz are demonstrated. This presentation gives an introduction to monolithic transformer and circuit design of push-pull type power amplifiers. Two designs were realized:

1. A monolithic 2.5 V, 1 W Si-bipolar power amplifier with 55% power-added efficiency at 1.9 GHz.
2. A monolithic 2.8 V, 3.2 W Si-bipolar power amplifier with 54% power-added efficiency at 900 MHz.

1 INTRODUCTION

Transformers have been used in radio frequency (rf) circuits since the early days of telegraphy. Normally transformers are relatively large and expensive components in a circuit or system. But there are several outstanding advantages of using transformers in circuit design: direct current (dc) isolation between primary and secondary winding, balanced-unbalanced (balun) function, impedance transformation and no power consumption.

The requirements of today’s telecommunication systems demand a high degree of monolithic integration. Today it is possible to integrate lumped planar transformers in silicon-based integrated circuit (IC) technologies which have excellent performance characteristics in the 1-20 GHz frequency range. The outer dimensions are in the range of about down to diameter depending on the frequency of operation and the IC technology.

Monolithic integrated lumped planar transformers were introduced in e.g. [1]. A review of the electrical performance of passive planar transformers in IC technology was presented in [2]. Amplifiers and mixers using monolithic transformers are presented in [3, 4]. A monolithic 2 GHz Meissner-type voltage controlled oscillator is realized in e.g. [5].

The transformer-coupled push-pull type rf power amplifier was invented in the early days of tubes and has survived into the semiconductor era because of its benefits. There is a 4:1 load-line impedance benefit for a push-pull combining scheme in an equal-power comparison to simple parallel device connection [6]. In general, impedance mismatch losses at the output of the power amplifier, due to the decrease of the impedance required at low supply voltages, limit the output power and the power-added efficiency (PAE). Due to the balanced amplifier design each output transistor contributes only half the total output power. The emitter bondwire inductance is not so critical because of the differential output stage. However, this approach requires a balun at the output of the power amplifier. For the first time, monolithic integration in Si-based technologies has become successful [7, 8, 9]. But up to now there was no way to get an accurate prediction and model of the electrical characteristics of on-chip transformers.
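The 4:1 load-line benefit can be made plausible with a short calculation. The sketch below assumes ideal sinusoidal swings, the same peak voltage amplitude per device in both schemes and a negligible saturation voltage; the function name and the example numbers are illustrative assumptions. In the parallel connection both devices drive one node in phase, while in the push-pull connection the load sees the difference of two antiphase swings, i.e. twice the amplitude.

```python
def load_line_resistances(p_total_w, v_peak_per_device):
    """Load resistance for parallel and push-pull combining at equal power.

    Parallel:  P = V^2 / (2 R)       ->  R_parallel  = V^2 / (2 P)
    Push-pull: P = (2 V)^2 / (2 R)   ->  R_push_pull = 4 V^2 / (2 P)
    """
    if p_total_w <= 0 or v_peak_per_device <= 0:
        raise ValueError("power and voltage amplitude must be positive")
    r_parallel = v_peak_per_device ** 2 / (2.0 * p_total_w)
    r_push_pull = (2.0 * v_peak_per_device) ** 2 / (2.0 * p_total_w)
    return r_parallel, r_push_pull


# Hypothetical example: 1 W total output with a 2 V peak swing per device.
r_par, r_pp = load_line_resistances(1.0, 2.0)
print(f"parallel: {r_par:.1f} ohm, push-pull: {r_pp:.1f} ohm, ratio {r_pp / r_par:.0f}:1")
```

The higher load resistance means a smaller transformation ratio down from the system impedance, which is what reduces the mismatch losses mentioned in the text at low supply voltages.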

Section 2 presents a new winding scheme, modelling and model verification of an integrated lumped planar transformer in silicon which has excellent performance characteristics. A lumped low-order model, which consists of 24 elements, gives an accurate prediction of the electrical behaviour and ensures a fast transient analysis, because of the low complexity. The method of parameter extraction for the equivalent circuit is based on a tool developed by the authors which uses a new expression for the substrate loss and two finite element method (FEM) cores called FastHenry [10] and FastCap [11].

In Section 3 a monolithic rf power amplifier for 1.8–2 GHz is presented which has been realized in a Si bipolar technology. The chip operates down to supply voltages as low as 1.2 V. The balanced 2-stage power amplifier uses two on-chip transformers as input-balun and for interstage matching, with a high coupling coefficient. At 1.2 V, 2.5 V, and 3 V supply voltage an output power of 0.22 W, 1 W and 1.4 W is achieved, at a PAE of 47%, 55% and 55%, respectively, at 1.9 GHz. The small-signal gain is 28 dB.

Section 4 shows a power amplifier design optimized for high output power at 900 MHz and low supply voltages. The chip operates from 2.8 V to 4.5 V. At 2.8 V the output power is 3.2 W with a PAE of 54%. The maximum output power of 7.7 W with an efficiency of 57% is achieved at 4.5 V supply voltage. The small-signal gain is 38 dB.

In Section 5 a lumped LC-balun as output matching network is reviewed and extended to a dual-band balun.

2 MONOLITHIC TRANSFORMER DESIGN

Monolithic transformers have been presented in various geometric designs and many different kinds have been realized. A special planar winding scheme for monolithic transformers which results in a very high coupling coefficient is discussed in this section. To realize turn ratios other than N=1:1, different numbers of primary and secondary turns are used. This implies that some adjacent conductors belong to the same winding, which results in a lower coupling coefficient. A solution for this problem is to use an interlaced winding scheme. One winding (e.g. the secondary) is sectioned into a number of individual turns connected in parallel rather than one continuous winding. Each segment of the secondary winding is interlaced with a primary turn. The line width of each segment is designed to carry the same current to obtain a homogeneous magnetic field distribution. The monolithic transformer shown in Fig. 1 consists of six primary turns P1-P6 and two secondary turns S1-S2. The turn ratio is N=6:2. The center taps PCT and SCT are available.

Fig. 2 shows a three-dimensional top view of the transformer. The primary ports, P+, PCT and P-, are located on the left side. The secondary ports, S+, SCT and S-, are located on the right side. The transformer design is nearly symmetric about a line. The outer diameter is and the inner diameter is The lateral spacing between the turns is about and has different values for each metal layer because of different design rules (Fig. 4). The conductor width on the primary side is about The conductor width on the secondary side is about and different for each winding to get the same series resistance of each segment of the secondary turns.

Fig. 3 shows a cross-section of this transformer. The primary winding consists of metal 3 and metal 2 connected in parallel and therefore lies at a greater distance from the substrate than the secondary winding, which consists of metal 1-3 connected in parallel to decrease ohmic loss.

A cross section of the substrate and metal layer stack is shown in Fig. 4.

2.1 Lumped Low-Order Equivalent Model

An electrical model of a transformer can be recognized from the physical layout. The circuit devices in Fig. 3 are the basic elements of the equivalent circuit shown in Fig. 5 and can be identified as: multiple coupled inductors, ohmic loss in the conductor material, parasitic capacitive coupling between the windings and into the substrate, and finally substrate losses. With these basic elements a lumped low-order equivalent model was constructed.

Limits of the Transformer Model

In general the transformer model is valid down to dc. The upper frequency limit of the model depends on the transformer geometry. For valid simulation results the maximum outer dimensions of the transformer must be small compared with the guided wavelength. In most cases the upper limit of the proposed model is about 3/2 times the self resonant frequency of the transformer.

2.2 Parameter Extraction

The lumped low-order equivalent model (Fig. 5) describes the electrical behaviour of the monolithic integrated lumped transformer. This section gives the background details about extraction of all elements used in the equivalent circuit.

Inductance and Series Resistance

Transformers composed of straight conductors can be treated with the summation of self- and mutual-inductances of all individual conductor elements. The whole transformer geometry built up of straight conductors is the input to the FEM-core FastHenry [10]. The exact modeling of the planar construction is an important task for an accurate inductance extraction. The exact modeling of the layer construction is less important for the inductance calculation. Each inductance is coupled mutually with every other inductance, denoted by the coupling coefficients $k_{ij} = M_{ij}/\sqrt{L_i L_j}$, where $M_{ij}$ is the extracted mutual inductance. Ohmic losses in the conductor material due to skin effect, current crowding and finite conductivity are modeled by the series resistances at the frequency of operation of the transformer.
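As a minimal sketch of this step, assuming the usual definition k_ij = M_ij / sqrt(L_i L_j): the snippet below turns an extracted inductance matrix (self inductances on the diagonal, mutual inductances off the diagonal) into a matrix of coupling coefficients. The function name and the numerical values are hypothetical and chosen only for illustration; they are not taken from a FastHenry run.

```python
import math

def coupling_matrix(inductance):
    """Return k[i][j] = M_ij / sqrt(L_i * L_j) for a square inductance matrix (henry)."""
    n = len(inductance)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            k[i][j] = inductance[i][j] / math.sqrt(inductance[i][i] * inductance[j][j])
    return k

# Hypothetical two-winding example: L_P = 4 nH, L_S = 1 nH, M = 1.8 nH
L = [[4.0e-9, 1.8e-9],
     [1.8e-9, 1.0e-9]]
k = coupling_matrix(L)
print(f"k_PS = {k[0][1]:.2f}")
```

The diagonal of the result is 1 by construction, so only the off-diagonal entries carry information.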

Capacity Extraction

Capacities are difficult to determine accurately and capacitive effects are best investigated in mesh point analysis. The exact modeling of the layer construction is important to get accurate results. In order to reach short processing times only a small part of the transformer’s cross section is the input to the FEM-core FastCap [11]. The static specific capacities from primary to secondary and from primary and secondary to substrate are extracted. The capacities of the transformer are

and where and are the mean perimetersof the included transformer turns. In the case of a circular transformer asshown in Fig. 3 they are calculated as

to of Fig. 5 are determined asand The sum of the capacities

for each winding is the static capacity and to the substrate.The parasitic capacitive coupling between primary and secondary wind-

ing are determined as

Substrate Loss

Fig. 6 shows a conductor (i.e. a turn of a transformer) suspended in adielectric. Capacitive coupling causes a current flow down to the groundplane shown in Fig. 6 as lines of constant current density.

From Fig. 6 it is clear that the current-feed-in area at the substrate edge has a greater width than the physical width W. We define an effective feed-in width Weff depending on the distance and conductor height T using the approximation:

The specific resistance in from a single conductor to groundas shown in Fig. 6 can be written as

The error of (4) is always smaller than 3 % in the range ofof a complete transformer winding is based on (3) and (4) where

W is the complete width of the primary or secondarywinding as shown in Fig. 3. for the primary windingcan be written as

and similar for the secondary winding, respectively. toof Fig. 5 are determined as and

More detailed information on modelling and parameter extraction of monolithic transformers can be found in [12].

2.3 Transformer Model Verification

The transformer is placed on silicon using two test structures including deembedding structures to measure the scattering parameters. One test structure is used to evaluate the primary-to-secondary transmission coefficient, where one input terminal of the primary and of the secondary winding is grounded. The second test structure is used to characterize the primary winding and secondary winding separately, where the opposite winding is left open in each case. The center taps are always left open. In general, a 4-port measurement setup would give slightly better accuracy, but would require much more measurement effort.

The equivalent circuit of the high coupling performance transformer is shown in Fig. 7. All parameter values are extracted by using the method described in Section 2.2. Node 1 is connected to the substrate. The values of the primary and secondary self inductance are

The strength of magnetic coupling between primary and secondary side denoted by the k-factor is

The series resistance of the conductors on the primary side is and on the secondary side Due to the greater distance to the substrate the parasitic capacity of the primary winding is less than the capacity of the secondary winding The substrate resistances of both windings are in the same range of about

Fig. 8 shows the measured and simulated reflection coefficients S11 and S22 of the high coupling performance transformer. Measurement and model show excellent agreement up to 5 GHz. Fig. 9 shows S21. The insertion loss is about 9 dB at 1.9 GHz. The difference between measurement and model is negligible. The S-parameters describe the electrical behavior of a monolithic transformer completely. However, not only the scattering parameters must be observed: the Z-parameters, Y-parameters, and Q-factor derived directly from the S-parameters also give fundamental insight into the transformer’s characteristics.

Fig. 10 shows the primary and secondary inductance as a function of frequency. The self inductances are analyzed using

The simulated and measured self resonance is at 4 GHz.

Analyzing the coupling coefficient as a function of frequency the relations

are useful. Then the coupling coefficient can be written as

Fig. 11 shows the coupling coefficient versus frequency. A coupling coefficient of 0.9 at 1.9 GHz is a very high value for monolithic lumped planar transformers.
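One common way to carry out this kind of analysis, given here as a sketch rather than as the chapter's exact expressions, is to read the self inductances, the coupling coefficient and a quality factor from the two-port Z-parameters at a single frequency. The function name, the dictionary keys and the example element values are assumptions made for illustration; the relations are the standard low-frequency ones and are only meaningful below the self resonance.

```python
import math

def analyze_two_port(z11, z12, z21, z22, f_hz):
    """Extract transformer figures of merit from Z-parameters at one frequency."""
    w = 2 * math.pi * f_hz
    l_p = z11.imag / w                      # primary self inductance
    l_s = z22.imag / w                      # secondary self inductance
    # Coupling coefficient from the imaginary parts (valid below self resonance)
    k = math.sqrt((z12.imag * z21.imag) / (z11.imag * z22.imag))
    q_open = z11.imag / z11.real            # primary Q, secondary open
    z_sc = z11 - z12 * z21 / z22            # primary input impedance, secondary shorted
    q_short = z_sc.imag / z_sc.real         # primary Q, secondary shorted
    return {"L_p": l_p, "L_s": l_s, "k": k,
            "Q_open": q_open, "Z_sc": z_sc, "Q_short": q_short}

# Synthetic example (hypothetical values): lossy coupled inductors at 1.9 GHz
f = 1.9e9
w = 2 * math.pi * f
Lp, Ls, k_true, Rp, Rs = 4e-9, 1e-9, 0.9, 2.0, 1.0
M = k_true * math.sqrt(Lp * Ls)
res = analyze_two_port(complex(Rp, w * Lp), complex(0, w * M),
                       complex(0, w * M), complex(Rs, w * Ls), f)
print(f"L_p = {res['L_p'] * 1e9:.2f} nH, k = {res['k']:.2f}")
```

With measured data, the Z-parameters would first be obtained from the deembedded S-parameters; that conversion is omitted here.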

The input impedance of the transformer with short-circuited secondary is of particular importance because of the low input impedance of the driver stage and output stage of the power amplifier. Fig. 12 shows the measured and simulated real part of this impedance and Fig. 13 shows the imaginary part. Simulation and measurement agree very well up to 3 GHz. The quality factor of the transformer with the secondary winding open circuit and short circuit can be analyzed using the following expressions

and of this transformer at 2 GHz. In most cases the upper limit of the model is about 2/3 times the self resonant frequency of the transformer.

2.4 Transformer Tuning

In many applications, e.g. input matching and interstage matching of a power amplifier, a high current transfer ratio of the on-chip transformer is desired. In contrast to an ideal transformer the current transfer ratio of a lossy transformer is not equal to the value of the turn ratio. Fig. 14 shows a secondary short-circuit transformer. It consists of a primary winding and a secondary winding, which are mutually coupled. In most cases the input impedance of the driver stage and the output stage is very low. Therefore, without loss of generality, the secondary winding of the transformer in Fig. 14 is short-circuited. The ohmic loss of the primary winding, the ohmic loss of the secondary winding and the input impedance of the transistors (assumed real valued) are accounted for by the admittance G. The transformer is connected as a parallel resonant device using the capacitor C.

Then the resonant frequency of the tuned transformer can be derived as

The quality factor Q of the resonant circuit is

The inner current transfer ratio of the ideal transformer is

Now the total current transfer ratio of the parallel resonant trans-former can be expressed by

This relation shows, that in contrast to the untuned transformer, thetotal current transfer ratio can be increased by a quality factor of Q > 1.

3 A MONOLITHIC 2.5 V, 1 W SI-BIPOLAR POWER AMPLIFIER WITH 55 % PAE AT 1.9 GHZ

This section presents a circuit design using the transformer described in Section 2. Fig. 15 shows the schematic diagram of the power amplifier for 1.9 GHz. The circuit consists of a transformer X1 as input-balun, a driver stage T1 and T2, a transformer X2 as interstage matching network and a power output stage T3 and T4. The transformers X1 and X2 are of the same kind (Sect. 2). X1 is connected as a parallel resonant device using a MOS capacitor (Sect. 2.4). The transformer acts as balun as well as input matching network. The interstage power transformer X2 is likewise connected as a parallel resonant device. Each resonance capacitor is realized using two MOS capacitors connected in antiseries.

The effective emitter area of the driver stage is two times The emitter area of the output stage is two times The bias operating point of the driver stage and the output stage is adjusted using the current mirrors R1, D1 and R2, D2 respectively, connected via the center taps of the transformers.

Fig. 16 shows the die photograph of the amplifier. The chip size is The power amplifier has been fabricated in an advanced near-production silicon bipolar technology [13]. The transistors have a double-polysilicon self-aligned emitter-base configuration similar to many current production technologies of various companies. As only standard process tools are used the technology is highly manufacturable at low cost. The minimum lithographic feature size is The doping of the speed-limiting base profile is done by low-energy ion implantation and subsequent diffusion using rapid thermal processing. This enables a final base width of only 50 nm at an intrinsic base sheet resistance of The devices have transit frequencies and maximum oscillation frequencies (extracted from the maximum available gain) of 50 GHz and provide an ECL gate delay of 16 ps. The collector-base breakdown voltage is and the collector-emitter breakdown voltage is A supply voltage of more than is possible, if low impedance driving conditions are present [14].

The power amplifier was tested at to 2 GHz using chip-on-board packaging on a two-sided Rogers RO4003 test board. Conductive epoxy is used for the die attach. The input of the amplifier chip is connected via a micro-strip line to the input signal. The supply-voltage line of the output stage consists of two lines, translating a low impedance at to the output transistors. The optimum load impedance at is translated by a balanced odd-mode micro-strip line. Two lumped capacitors are used to set the real part and the imaginary part of the optimum load impedance at A compensated semi-rigid line acts as balun.

Fig. 17 shows the measured output power and efficiency as a function of rf input power at 1.9 GHz, and as a function of power supply voltage. The matching network is unchanged for all supply voltages and the complete frequency range. The power amplifier is operating in a pulsed mode with a duty cycle of 12.5%. The pulse width is 0.577 ms. The bias operating current, without rf excitation, is two times 20 mA at the driver stage and two times 75 mA at the output stage. The bias operating currents are adjusted to these values at each level of supply voltage. When operating from a 1.2 V supply, the amplifier has a maximum output power of 0.22 W (23.4 dBm), and a power-added efficiency of 47 % at 1.9 GHz. At 3 V supply voltage, the output power is 1.4 W (31.5 dBm) at a power-added efficiency of 55 %. Fig. 18 shows the output power and PAE versus the frequency from 1.8 GHz to 2 GHz.

Fig. 19 shows the two-tone intermodulation performance of the power amplifier at 1.9 GHz and 2.5 V supply voltage. The 3rd-order output intercept point is +30 dBm. The 7th-order signal-to-intermodulation ratio extracted from the measurements in Fig. 19, is shown in Fig. 20. The ratio is 8.5 dB in the fully saturated region. Table 1 summarizes the measurement results of the power amplifier.
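The output powers above are quoted both in watts and in dBm. As a small worked check of that conversion (dBm is decibels relative to 1 mW), the helper names below are illustrative only:

```python
import math

def watt_to_dbm(p_watt):
    """Convert power in watts to dBm (decibels relative to 1 mW)."""
    return 10 * math.log10(p_watt / 1e-3)

def dbm_to_watt(p_dbm):
    """Convert power in dBm back to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)

for p in (0.22, 1.4):
    print(f"{p} W = {watt_to_dbm(p):.1f} dBm")
# prints:
# 0.22 W = 23.4 dBm
# 1.4 W = 31.5 dBm
```

The same helpers can be used to express the +30 dBm intercept point as 1 W.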


4 A MONOLITHIC 3.2 W SI-BIPOLAR POWER AMPLIFIER WITH 54% PAE AT 0.9 GHZ AND 2.8 V

In this section a circuit design for 900 MHz, optimized for high output power at low supply voltages around 3 V, is presented using the transformer described in Section 2, except that the transformer is enlarged by a factor of about two. The outer diameter of the transformer is now. Fig. 21 shows the model of the enlarged transformer. The series resistance of the conductors on the primary side is and on the secondary side The primary inductance is 7 nH, the secondary inductance is 1 nH. The coupling factor is The self resonant frequency is 1.8 GHz. The frequency of operation of this transformer should be less than 1.2 GHz for good circuit performance.

Fig. 22 shows the simplified schematic diagram of the balanced 2-stage power amplifier. The rf-part of the power amplifier consists of an on-chip transformer X1 as input-balun, a driver stage T1, T2, two transformers X2, X3 as interstage matching network and a power output stage T3, T4. The effective emitter area of the output stage is two times The input-transformer is connected as a parallel resonant device using two MOS capacitors connected in antiseries. The transformer acts as balun as well as input matching network.

The interstage matching network of the power amplifier consists of two transformers X2 and X3 connected in parallel, to get a high current transfer ratio at a low signal voltage swing.

To diminish break-down effects at high supply voltages a closed loop bias operating point circuit is implemented. The maximum usable output voltage of the driver and the power stage depends on the driving conditions [14]. Thus, the source impedance of the bias driver should be as low as possible. The bias current of the driver stage is set by an operational amplifier U1 and T7, T8 via the secondary center tap of X1. T5 acts as current sensing device. The collector current of T5 is compared with the bias operating point reference current. This closed loop ensures a low impedance driving condition and a constant collector bias current over a wide range of supply voltage, for the driver stage T1, T2. R1 matches the output characteristic (breakdown) of the sensing device T5 to the driver stage transistors T1, T2. The bias circuit of the power stage T3, T4 is of the same kind.

Fig. 23 shows a die photograph of the power amplifier. The chip measures The chip is fabricated in a standard 3-layer-interconnect silicon bipolar production technology of Infineon B6HF [15]. The collector-base breakdown voltage is and the collector-emitter breakdown voltage is

For measurements the chip is bonded on a FR4 test board (see Fig. 24, Fig. 25, Fig. 26). The input of the amplifier chip is connected via a micro-strip line to the input signal of The supply-voltage line of the output stage consists of two lines translating a low impedance at to the output transistors. The optimum load impedance at is translated by a balanced micro-strip line. The real and imaginary part of the load impedance are set nearly orthogonally by two capacitors. A compensated semi-rigid line acts as balun. This balun-line can be replaced by a lumped LC-balun with slight loss of performance. A more detailed description and evaluation of performance of this matching network compared to a lumped LC balun is presented in [7].

Fig. 27 shows the output power and PAE versus input power as a function of supply voltage at 900 MHz. The matching network is unchanged for all supply voltages. At 2.8 V supply voltage an output power of 3.2 W with 54% PAE is achieved. The small-signal gain is 38 dB. The maximum output power at 4.5 V supply voltage is 7.5 W at a PAE of 57%. The collector efficiency of the output stage is 68 % in this case. But at this high level of output power, load impedance mismatch can result in damage of the output stage. However, at an output VSWR=10 the maximum usable supply voltage is 3.5 V. Output power and PAE versus frequency are shown in Fig. 28. The 3rd-order output intercept point is +41.3 dBm at 900 MHz and 3 V supply voltage. Table 2 gives a summary of the power amplifier performance.

5 A LUMPED LC-BALUN AS OUTPUT MATCHING NETWORK

Fig. 29 shows a lumped LC balun, which was originally used as an antenna balun [6, 16, 17]. This circuit can be used as a simple output matching network for push-pull type power amplifiers. However, the PAE performance of the power amplifier is decreased due to inappropriate impedances at the harmonic frequencies. The performance of a lumped LC balun at 900 MHz and 4 W output power is evaluated in [7].

The bridge-type circuit (Fig. 29) consists of two inductors and two capacitors. An rf-choke coil and a dc-block capacitor are used to feed the supply voltage.

The bridge presents a balanced input impedance to the output stage and drives the load resistor. Each collector is loaded by a fraction of this input impedance. L and C can be calculated from the characteristic impedance of the bridge-type circuit and the frequency of operation. The input impedance and the load resistance are assumed to be real valued. If the input impedance should be complex valued, matching is possible, but then the bridge becomes more or less imbalanced. Better performance and less sensitivity against changes in component values can be achieved, if the imaginary part of the optimum load impedance is matched separately using a simple additional transformation network (L, C or LC) connected in series or in parallel to the output of the power amplifier.
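For the single-band bridge of Fig. 29, a minimal sizing sketch can be written down under the assumption that the commonly quoted lattice-balun relations apply: characteristic impedance Z = sqrt(R_in · R_L), L = Z/ω and C = 1/(ωZ), with both impedances real. The symbol names, the function name and the example impedances are assumptions for illustration and are not taken from the chapter.

```python
import math

def lattice_balun(r_in, r_load, f_hz):
    """Return (L, C) of a lattice LC balun for real r_in (balanced) and r_load (single-ended)."""
    z_char = math.sqrt(r_in * r_load)   # assumed characteristic impedance of the bridge
    w = 2 * math.pi * f_hz
    inductance = z_char / w             # reactance of L equals z_char at f_hz
    capacitance = 1 / (w * z_char)      # reactance of C equals z_char at f_hz
    return inductance, capacitance

# Hypothetical example: 10 ohm balanced input, 50 ohm load, 900 MHz
L_b, C_b = lattice_balun(10.0, 50.0, 900e6)
print(f"L = {L_b * 1e9:.2f} nH, C = {C_b * 1e12:.2f} pF")
```

Because both reactances equal the characteristic impedance, the LC pair resonates exactly at the design frequency, which is a quick sanity check on any computed values.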

If the inductors are replaced by a parallel resonant circuit and the capacitors are replaced by a series resonant circuit in Fig. 29, then a lumped dual-band LC balun, shown in Fig. 30, is available.

The circuit provides a balanced input impedance at each of two frequencies, so that independent matching and balun conversion at two different frequencies can be done. The element values can be calculated from the characteristic impedances of the bridge at the two frequencies, which are assumed to be real valued. Note, that is a must, using the design equations above.

6 CONCLUSION

A study is presented of the electrical characteristics of lumped planar transformers. A precise lumped low-order equivalent model is derived from the physical layout. Measurement and model show excellent agreement.

For the first time, transformer-coupled push-pull type power amplifiers with a high performance are integrated in Si-bipolar at 900 MHz and 2 GHz.

REFERENCES

[1] G. Rabjohn, Balanced Planar Transformers. United States Patent with Patent number 4,816,784, 1989.

[2] J. Long, “Monolithic Transformers for Silicon RF IC Design,” IEEE Journal of Solid-State Circuits, vol. 35, pp. 1368–1382, September 2000.

[3] J. McRory et al., “Transformer Coupled Stacked FET Power Amplifiers,” IEEE Journal of Solid-State Circuits, vol. 34, pp. 157–161, February 1999.

[4] J. Long et al., “A 5.1-5.8 GHz Low-Power Image-Reject Downconverter in SiGe Technology,” in Proceedings of the 1999 Bipolar/BiCMOS Circuits and Technology Meeting, (Minneapolis), pp. 67–70, September 1999.

[5] H. Wohlmuth et al., “2 GHz Meissner VCO in Si Bipolar Technology,” in 29th European Microwave Conf., Conf. Proc., Vol. 1, (Munich), pp. 190–193, October 1999.

[6] S. Cripps, RF Power Amplifiers for Wireless Communications. Norwood, MA 02062: Artech House, first ed., 1999.

[7] W. Simbürger et al., “A Monolithic Transformer Coupled 5 W Silicon Power Amplifier with 59 % PAE at 0.9 GHz,” IEEE Journal of Solid-State Circuits, vol. 34, pp. 1881–1892, December 1999.

[8] W. Simbürger et al., “A Monolithic 2.5 V, 1 W Silicon Bipolar Power Amplifier with 55% PAE at 1.9 GHz,” in IEEE MTT-S International Microwave Symposium Digest, (Boston), pp. 853–856, IEEE, June 2000.

[9] A. Heinz et al., “A Monolithic 2.8 V, 3.2 W Silicon Bipolar Power Amplifier with 54% PAE at 900 MHz,” in IEEE Radio Frequency Integrated Circuits (RFIC) Symposium Digest of Papers, (Boston), pp. 117–120, IEEE, June 2000.

[10] MIT, FastHenry USER’s GUIDE, Version 3.0. Massachusetts Institute of Technology, 1996.

[11] MIT, FastCap USER’s GUIDE. Massachusetts Institute of Technology, 1992.

[12] D. Kehrer, “Design of Monolithic Integrated Lumped Transformers in Silicon-based Technologies up to 20 GHz,” Master’s thesis, Technical University of Vienna, Institute of Communications and Radio-Frequency Engineering, Gusshausstrasse 25/389, A-1040 Vienna, Austria, December 2000.

[13] J. Böck et al., “A 50 GHz Implanted Base Silicon Bipolar Technology with 35 GHz Static Frequency Divider,” in Symposium on VLSI Technology, Digest of Technical Papers, pp. 108–109, 1996.

[14] M. Rickelt and H.-M. Rein, “Impact-Ionization Induced Instabilities in High-Speed Bipolar Transistors and their Influence on the Maximum Usable Output Voltage,” in Bipolar/BiCMOS Circuits and Technology Meeting, (Minneapolis), pp. 54–57, IEEE, September 26-28 1999.

[15] H. Klose et al., “B6HF: A 0.8 Micron 25GHz/25ps Bipolar Technology for “Mobile Radio” and “Ultra Fast Data Link” IC Products,” in IEEE Bipolar Circuits and Technology Meeting, pp. 125–127, IEEE, 1993.

[16] A. Krischke, Rothammels Antennenbuch. Stuttgart: Franck-Kosmos, 11th ed., 1995.

[17] P. Vizmuller, RF Design Guide - Systems, Circuits, and Equations. Norwood, MA 02062: Artech House, first ed., 1995.

Low Voltage PA design in standard CMOS

Koen Mertens, Michiel Steyaert
K.U.Leuven, ESAT-MICAS
Kasteelpark Arenberg 10
B-3001 Heverlee, Belgium

Abstract

In recent years there has been a trend toward low-voltage, single-supply amplifiers. When going to lower supply voltages there is a strong decrease in output power and efficiency, and this holds for all classic power devices such as MESFETs, PHEMTs, HBTs, .... For these low supply voltages standard CMOS technology can be competitive with other technologies. The prospect of having one technology for all the RF and digital building blocks is very attractive: only one technology has to be supported, which results in lower production cost. With the aid of some selected papers we will discuss some design aspects of CMOS PA design. In the last chapter a comparison with the classical devices concerning power and efficiency will be made.

1. Introduction

The problem of delivering output power in CMOS has led to a lot of investigation into CMOS power amplifier design. First attempts were presented in [1]. That paper revived the use of Sokal’s class E switching power amplifier [2], which was presented in 1975. Only recently has the number of papers about CMOS power amplifiers increased drastically. More than 90 percent of the recently published papers still use the class E amplifier as the basic topology. The reason for this lies in three facts.

J. H. Huijsing et al. (eds.), Analog Circuit Design, 373-394. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.

The first reason is that this type of amplifier guarantees the best efficiency when a minimum of lumped elements is used in the output matching network. Bearing in mind that high efficiency and low supply voltage are not compatible, the most efficient type of amplifier must be selected. The efficiency of the power amplifier typically dominates the power consumption in portable radio devices, which is directly related to the battery lifetime.

The second reason is that the drain-source capacitance of the device can be quite high compared to non-CMOS power amplifiers. The total drain-source capacitance is the sum of the drain-source junction capacitance of the device and the capacitance of the metal traces. The metal capacitance on its own can easily be a few pF. This can be well understood if we know that high output power demands a high RMS current. The output interconnection conducting this RMS current has to comply with electromigration rules, and for this reason wide traces are used. As will be seen later, the total drain-source capacitance is a crucial design parameter.

Finally, the third fact is that hot-electron degradation or time-dependent dielectric breakdown can be avoided, because current and voltage are in the ideal case not present at the same time. This guarantees that there is no performance degradation during the lifetime of the device. For safe operation, the maximum allowable voltage stress over the switch transistor is the only remaining specification of interest.

2. Driving the NMOS switch transistor

The switch transistor must be driven with the aid of a driving circuit. Making an efficient driving stage to drive the capacitance of the NMOS switch is a challenge. Using digital buffers to switch the power NMOS transistor is not a good idea. The total power consumed by the buffers depends on the dynamic and short-circuit dissipation. For a typically designed buffer, the short-circuit dissipation can become larger than the dynamic dissipation. As shown in [3], a tapering factor different from the normal factor ‘e’, which is derived from optimization towards the propagation delay, has to be chosen. Even when we use the recommended tapering factor of 11.5 the power consumption will still be too high. A better method is to tune out the gate capacitance of the switch transistor, giving up the broadband character of the driver. The power consumption will be less, so the efficiency of the driving stage is increased. The driving signal is now a sine wave of two times the supply voltage. Driving the NMOS switch with this large signal is an additional benefit over a digital buffer design. For a class E amplifier driven by a sine wave the class E conditions (Vds=0 and dVds/dt=0), at the moment of closing the NMOS switch, stay valid. This means that the switching losses from the off to the on state can be kept minimal, so that the sine-wave drive leads to only a minor performance degradation of two percent. In a fabricated power amplifier this is not the only source of power loss. For that reason the different losses are investigated in the next section.
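To see why a tapered digital buffer chain remains costly, a rough first-order estimate of its dynamic dissipation can be sketched as follows. The model counts only the gate capacitance switched by each stage (the last stage drives the full load, each earlier stage a load smaller by the tapering factor) and uses P = C·V²·f. Short-circuit dissipation and parasitic self-loading are deliberately ignored, so this is a lower bound under stated assumptions, not the analysis of [3]. The load capacitance, supply voltage and frequency are hypothetical.

```python
import math

def chain_dynamic_power(c_load, v_dd, f_hz, taper, n_stages):
    """First-order dynamic power of a tapered buffer chain (gate capacitances only)."""
    # Stage i, counted back from the load, drives c_load / taper**i
    c_total = sum(c_load / taper ** i for i in range(n_stages))
    return c_total * v_dd ** 2 * f_hz

# Hypothetical example: 10 pF switch gate, 2 V supply, 1 GHz, four stages
c_gate, v_supply, freq = 10e-12, 2.0, 1e9
p_load_only = c_gate * v_supply ** 2 * freq
for t in (math.e, 11.5):
    p = chain_dynamic_power(c_gate, v_supply, freq, t, 4)
    print(f"taper {t:.2f}: chain power is {p / p_load_only:.2f} x the load term")
```

Even with a large tapering factor the term C·V²·f of the final gate itself remains, which is the part a resonant (tuned) driver avoids dissipating in full.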

3. Causes of power loss

In figure 1, a simplified schematic of a class E amplifier withlosses is given.

The following values for Cshunt and Lx should be used:

to satisfy the well-known class E conditions for a certain target load resistance R. For the calculated values the amplifier operates under peak power output capability Furthermore, negative voltage and current waveforms are eliminated. The target load resistance R is quite small and an upward LC impedance transformation network is employed to transform this to By lumping the components, we eventually come to a single inductor and two capacitors. The global drain efficiency of the schematic can be expressed in terms of individual efficiencies, which are considered to be independent of each other. This means that for each individual loss the class E stage is supposed to be working in perfect class E conditions. This implies that the drop in efficiency can be assigned to the loss under consideration, giving rise to an intermediate efficiency. For the five losses represented in figure 1, the intermediate efficiencies are discussed in the following paragraphs.
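As an illustration of such a sizing step, the sketch below uses the classical closed-form values for an ideal class E stage with 50 % duty cycle, an ideal RF choke and a high-Q series tank, as usually attributed to Sokal and Raab. Whether these coincide exactly with the expressions intended here is an assumption; the function name and the example numbers are hypothetical.

```python
import math

def class_e_ideal(r_load, f_hz, v_dd):
    """Ideal class E values: shunt capacitance, excess inductance, output power, g."""
    w = 2 * math.pi * f_hz
    pi2 = math.pi ** 2
    c_shunt = 8 / (math.pi * (pi2 + 4) * w * r_load)   # about 0.1836 / (w R)
    l_x = math.pi * (pi2 - 4) / 16 * r_load / w        # about 1.1525 R / w
    p_out = 8 / (pi2 + 4) * v_dd ** 2 / r_load         # about 0.5768 Vdd^2 / R
    g = math.sqrt(1 + pi2 / 4)                         # RF current amplitude / DC current
    return c_shunt, l_x, p_out, g

# Hypothetical example: 2 ohm target load, 900 MHz, 2 V supply
c_sh, l_ex, p_o, g = class_e_ideal(2.0, 900e6, 2.0)
print(f"Cshunt = {c_sh * 1e12:.1f} pF, Lx = {l_ex * 1e9:.3f} nH, "
      f"Pout = {p_o:.2f} W, g = {g:.3f}")
```

The constant g evaluates to about 1.862, and g²/2 equals the ratio of the DC resistance seen by the supply to the load resistance, which follows from power conservation in the lossless case.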

3.1. Ron loss and the loss in the driving stage

Due to the ‘on’-resistance, power is consumed in the NMOS transistor; therefore the output power in the target resistor R is not equal to:

but lowered to:

The variable g used in the above formulas is called the DC-current to RF-voltage transfer constant. The value for g is usually 1.862, but can be calculated by solving a set of differential equations [4]. The intermediate efficiency is given by [5].

Summing the target load resistance R with the parasitic resistance of the excess inductor forms the actual load resistance Ra, used in expression (5). For large output powers the actual load resistance must be low. To guarantee a high efficiency, an NMOS transistor with a large width has to be selected. The drawback of this is that a large gate capacitance is formed. A large gate capacitance lowers the equivalent parallel resistance seen by the driver tank, increasing the power consumption of the driver circuit. In publications [1,6-8], the gate capacitance is tuned out with the aid of a bond wire, hence the equivalent parallel resistance yields:

The factor used in the above expression stands for the parasitic resistance of the bond wire given in When a class C amplifier (with a typical efficiency of around ten-percent) is used for the driver stage, then the intermediate efficiency can be written as follows:

From the explanation above we can conclude that if one designs for maximum output power, both intermediate efficiencies will be low, resulting in a low overall efficiency. To increase the efficiency, the specification of maximum output power must be relaxed. This means that the target resistance must be made larger to increase the Ron-related intermediate efficiency. A PA with optimal efficiency is obtained when both intermediate efficiencies are designed to be equal. To do so, scaling the NMOS transistor is necessary. The conclusion is that for a PA in CMOS two different approaches can be followed: it can be designed for maximum output power [1,6], or it can be designed for maximum power added efficiency [7]. The power added efficiency is defined as the output power minus the input power, divided by the supply power. These approaches are substantially different and must be well understood.
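The definition of power added efficiency given above translates directly into a short calculation. The sketch below is only an illustration; the function names and the example numbers are hypothetical and are not taken from any of the cited designs.

```python
def drain_efficiency(p_out: float, p_supply: float) -> float:
    """Output power divided by the DC supply power."""
    if p_supply <= 0:
        raise ValueError("supply power must be positive")
    return p_out / p_supply


def power_added_efficiency(p_out: float, p_in: float, p_supply: float) -> float:
    """(output power - input power) / supply power, as defined in the text."""
    if p_supply <= 0:
        raise ValueError("supply power must be positive")
    return (p_out - p_in) / p_supply


if __name__ == "__main__":
    # Hypothetical example: 1 W out, 50 mW of drive power, 2 W from the supply.
    print(drain_efficiency(1.0, 2.0))
    print(power_added_efficiency(1.0, 0.05, 2.0))
```

The PAE is always below the drain efficiency once the drive power is non-zero, which is why a large, hard-to-drive switch transistor can reduce the PAE even though it improves the Ron-related efficiency.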

3.2. Inductor Losses

For each of the inductors given in figure 1, a parasitic resistance can be specified. Over these resistances a voltage drop can be measured. In the case of the DC feed inductor, this voltage drop lowers the supply voltage to

As the output power is proportional to the square of the DC voltage seen by the drain, the above expression can be transformed into:

For the excess inductance value calculated with (1), the class E conditions no longer apply. To fulfill the class E conditions for the actual load resistance, an adjusted excess inductance can be calculated. The expression for the excess inductance is given by:

This leads to an intermediate efficiency, which is defined as:

A fraction of the output power is absorbed in the parasitic resistance of the excess inductance. The degraded output power in the target resistor R is given by converting formula (4) to

In the above formula the influence of the excess inductance is taken into account. In figure 2 the efficiency for the excess inductance is plotted.

The picture clearly shows that the parasitic resistance has to be small for high efficiencies. On-chip inductors on today's silicon substrates typically have a parasitic resistance that is too high for this. This implies that if standard CMOS is aimed for, only bonding wire inductors can be selected for the excess inductance and the DC feed inductance. Only when going to Silicon-On-Insulator structures or GaAs substrates can these inductors be made with low enough resistance. With these technologies a fully integrated power amplifier, as demonstrated in [10] and [11], can be made. However, due to the ever-increasing number of metal layers [12] and Cu metallization in deep-submicron CMOS, integration of the inductors may become an option in the near future.

3.3 Dirac Losses due to imperfections

A practically built class E amplifier will have tolerances on its components. These non-idealities produce a Dirac impulse in the current characteristic when the switch closes; see fig. 3.

The Dirac pulse is smeared out in the time domain, because the current through the NMOS has a finite response time. The result is that voltage and current overlap during a short time. This loss can be estimated by varying the component values and measuring the power loss. An intermediate efficiency of 95% due to the Dirac impulse is a realistic assumption.

3.4 Combining the efficiencies

The maximum overall efficiency of the amplifier can be found by combining the intermediate efficiencies of the previous sections. This results in the expression:

The efficiencies that are related to the power dissipated in the transistor are summed; all the others are multiplied. Once again we emphasize that for a maximum efficiency design the intermediate efficiencies have to be equal.

4. Impact of the transistor junction capacitance

The junction capacitance of the transistor cannot be ignored in the design process. This non-linear capacitor can be modeled by using the model:

where V is the reverse voltage over the junction, and the model parameters are the built-in voltage Vbi of the junction, the zero-bias capacitance and the grading coefficient. The tolerated current per contact area does not scale proportionally with the technology. This keeps the drain area relatively constant. The resulting zero-bias capacitance, when scaling down in technology, is therefore practically the same, so a typical zero-bias capacitance can be taken as a good reference value. The grading coefficient typically ranges from 0.55 to 0.9 for sub-micron processes. In [13] it was shown that the peak voltage becomes higher for a class E amplifier with a non-linear shunt capacitance.
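To make the voltage dependence of the junction capacitance concrete, the sketch below evaluates the standard reverse-biased junction model C(V) = Cj0 / (1 + V/Vbi)^m. It is assumed here that this is the model referred to in the text, whose expression did not survive. The parameter names `c_j0`, `v_bi` and `m` and all numeric values are hypothetical.

```python
def junction_capacitance(v_reverse: float, c_j0: float, v_bi: float, m: float) -> float:
    """Standard junction model: C(V) = Cj0 / (1 + V / Vbi) ** m.

    v_reverse : reverse voltage over the junction (V)
    c_j0      : zero-bias capacitance (F)
    v_bi      : built-in voltage (V)
    m         : grading coefficient
    """
    if v_bi <= 0:
        raise ValueError("built-in voltage must be positive")
    if v_reverse <= -v_bi:
        raise ValueError("model is not valid at or beyond forward bias of Vbi")
    return c_j0 / (1.0 + v_reverse / v_bi) ** m


if __name__ == "__main__":
    # Hypothetical parameters: Cj0 = 10 pF, Vbi = 0.8 V, m = 0.5.
    for v in (0.0, 0.8, 2.4, 7.2):
        c = junction_capacitance(v, 10e-12, 0.8, 0.5)
        print(f"V = {v:4.1f} V  ->  C = {c * 1e12:5.2f} pF")
```

The capacitance is largest when the drain voltage is near zero and shrinks as the drain swings high, which is the mechanism behind the higher peak drain voltage reported in [13].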

The current waveform, the output voltage and the load network component values are unaffected. The above expression can be normalized with respect to the supply voltage. This normalized maximum drain voltage is plotted in figure 4 as a function of Vdd/Vbi.

From the expression of the peak drain voltage above, a correction factor can be introduced. The factor 3.56 is the normalized maximum drain voltage for an ideal class E amplifier with infinite DC feed. E.g., for a Vdd/Vbi of 4, the correction factor takes a value between 1.15 and 1.30 (see Fig. 4). This required correction has serious consequences for the design. The amplifier has to remain under the junction and oxide breakdown voltages of the device to guarantee its reliability. This means that the supply voltage has to be lowered, leading to less output power and efficiency.

4.1. Increasing the shunt capacitance

The unavoidable wiring capacitance will help to linearize the total shunt capacitance. Making the wiring capacitance large with respect to the non-linear junction capacitance of the NMOS switch leads to optimal performance. So, once again, sizing of the switch is required. To maintain a low Ron resistance, methods have to be used for increasing the shunt capacitance. This is certainly true when going to higher frequencies. For higher frequencies the shunt capacitance from (2) becomes smaller, and as a result the non-linear shunt capacitance plays a more dominant role. Two possible ways of increasing the shunt capacitance are discussed in the following sections.

4.2. Method one

The first method is used in paper [7]. In this design the robustness of the class E topology to component variations is exploited [14]. Using a higher shunt capacitance than calculated with expression (2) improves the power added efficiency. The normalized maximum drain voltage is additionally lowered, because the ratio between the linear capacitance and the non-linear capacitance increases. Raising the supply voltage, which the lower peak drain voltage permits, can compensate for the drop in output power. Maintaining the same output power as before is therefore not a problem. Figure 5 illustrates the effects of varying the shunt susceptance for a class E amplifier with infinite DC choke and constant supply voltage. We observe that the output power is lowered for increasing shunt susceptance, without a substantial loss in efficiency. This can only be true when the DC resistance seen by the power supply is increased. Consequently, the parameter g, which links the load resistance with the DC resistance, must be raised as well. The resulting improvement of one intermediate efficiency is canceled by the influence of another, leaving the corresponding term in expression (13) almost unaffected. The improvement of the overall efficiency therefore stems from the elevated intermediate efficiency (10). So, this design method maximizes the output power, the shunt capacitance and the power added efficiency.

4.3. Method two

The second method is reported in [6]. This design uses a more deeply scaled CMOS technology than the previous example. When scaling down, the oxide capacitance increases, because the oxide thickness is reduced.

As a consequence, a lower supply voltage and gate voltage have to be used to satisfy the breakdown requirements. The current flowing through the switch transistor, operating in its linear region, is approximated by

This means that for small transistor lengths more current is sourced. To exploit the better current capability of the scaled NMOS transistor, the class E amplifier must be moved to an operating region that favors the current property of the switch. This can be done by reducing the DC feed inductor. From [15] we know that the output power, RMS current and shunt capacitance then increase. We can clearly see in fig. 6 that the shunt capacitance is increased compared with the value calculated using formula (2). The total shunt capacitance of 37 pF is large enough to give a good ratio between the wiring and junction capacitance. The achieved efficiency of 42 percent, with 0.9 W of output power, points to a design that is optimized for maximum output power. The design achieves the same maximum output power and efficiency as with the older CMOS technology, while the supply voltage is lowered to 1.8 V.
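The statement that shorter transistors source more current can be illustrated with the usual long-channel linear-region (triode) equation, I_D = μCox (W/L) [(Vgs − Vt) Vds − Vds²/2]. This is assumed here to be the approximation meant in the text; the function name and all parameter values below are hypothetical and chosen only to show the 1/L dependence.

```python
def triode_current(mu_cox: float, w: float, l: float,
                   v_gs: float, v_t: float, v_ds: float) -> float:
    """Long-channel MOS drain current in the linear (triode) region.

    mu_cox : mobility times oxide capacitance per unit area (A/V^2)
    w, l   : transistor width and length (m)
    """
    v_ov = v_gs - v_t
    if v_ov <= 0 or v_ds < 0 or v_ds > v_ov:
        raise ValueError("operating point is outside the linear region")
    return mu_cox * (w / l) * (v_ov * v_ds - 0.5 * v_ds ** 2)


if __name__ == "__main__":
    # Hypothetical device: same width and bias, two different channel lengths.
    common = dict(mu_cox=200e-6, w=1e-3, v_gs=1.8, v_t=0.5, v_ds=0.1)
    i_long = triode_current(l=0.35e-6, **common)
    i_short = triode_current(l=0.25e-6, **common)
    print(f"I(L = 0.35 um) = {i_long:.3f} A")
    print(f"I(L = 0.25 um) = {i_short:.3f} A")
    print(f"ratio = {i_short / i_long:.2f}")
```

For equal width and bias the current ratio equals the inverse ratio of the channel lengths. In a real scaled process the higher oxide capacitance adds to this gain, while the lower permitted gate voltage works against it.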

5. Voltage compensation for Ron

In case the linear bulk capacitance dominates over the non-linear bulk capacitance, the maximum peak drain voltage changes from

into

The effective voltage used in expression (20) is the DC drain voltage minus the voltage drop over the on-resistance when the switch is closed. Writing out the effective voltage results in

From (21) we know that the effective voltage is always lower than Vdrain,dc. The result is that the supply voltage can be taken higher for the same voltage stress on the device. For the CMOS technology used in [7], the maximum allowable peak drain voltage is situated around 7.5 V. For the parameters of this design, including its Ron, the maximum permitted supply voltage is set to 2.3 V. The following values, among others, were used to calculate the output power and efficiency for one side of the differential structure of figure 8: Vdd = 2.3 V; g = 2.2; L = 1.8 nH. The calculated output power equals 0.52 W with an efficiency of 66%. Due to the differential structure the total output power is multiplied by two, which gives a total output power of 1.04 W.
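The numbers quoted above can be cross-checked with a small calculation. The sketch assumes that the peak drain voltage is the ideal factor 3.56 times an effective voltage, optionally multiplied by the correction factor for a non-linear shunt capacitance, and that the effective voltage is a fixed fraction of the supply voltage. That fraction (`v_eff_ratio`) is a hypothetical parameter introduced only for illustration; it stands in for the lost expressions (20) and (21) and ignores the drop over the DC feed inductor.

```python
IDEAL_PEAK_FACTOR = 3.56  # normalized peak drain voltage, ideal class E, infinite DC feed


def max_supply_voltage(v_peak_max: float, v_eff_ratio: float = 1.0,
                       correction: float = 1.0) -> float:
    """Largest Vdd for which 3.56 * correction * (v_eff_ratio * Vdd) <= v_peak_max."""
    if not 0.0 < v_eff_ratio <= 1.0:
        raise ValueError("v_eff_ratio must lie in (0, 1]")
    return v_peak_max / (IDEAL_PEAK_FACTOR * correction * v_eff_ratio)


def differential_output_power(p_single_side: float) -> float:
    """Total output power of a differential stage built from two equal halves."""
    return 2.0 * p_single_side


if __name__ == "__main__":
    print(f"no compensation:      Vdd <= {max_supply_voltage(7.5):.2f} V")
    print(f"effective ratio 0.916: Vdd <= {max_supply_voltage(7.5, 0.916):.2f} V")
    print(f"total power: {differential_output_power(0.52):.2f} W")
```

Without any Ron drop a 7.5 V limit would allow roughly 2.1 V of supply. Under the stated assumption, a supply of 2.3 V corresponds to an effective voltage of about 92% of Vdd.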

Consequently the output power and efficiency calculated with formulas (12) and (13) closely approach the measured peak output power and PAE of figure 7. Notice that in the measurement the supply voltage is plotted up to the maximum allowable supply voltage of 2.3 V.

6. Cross-coupled output stage

An additional trick for raising the total efficiency is the use of a differential cross-coupled pair as output stage. The benefit of such a structure is clearly demonstrated in the 1.9 GHz design published in [8]. The inner transistors are driven with a gate voltage of 3.6 times the supply voltage. This means that the inner transistors are roughly three times better than the outer transistors. The on-resistance of the transistor is therefore lowered, while the gate capacitance and non-linear shunt capacitance are kept as low as possible. The differential output is not a handicap, because in portable applications the amplifier can be placed close to the antenna.

7. Frequencies above 1GHz

In spite of some differences between the 700 MHz design [7] and the 1.9 GHz design [8], the following observation can be made. In the 700 MHz design a 1.8 nH inductor is used for the driver stage. When moving this design to a frequency of 1.9 GHz, the new inductor value for the driver stage becomes equal to 0.25 nH. Such an inductance value is difficult to manufacture. In the 1.9 GHz design, see fig. 9, an inductor value of 0.37 nH is used instead. Hence, to operate at the same resonance frequency, a smaller switch transistor must be selected. Comparing the outer transistor sizes used in the two designs reveals that this has indeed been done. The transistor width is reduced from 6.5 mm to 3.6 mm. This reduction has a large effect on the overall efficiency, because the product of the intermediate efficiencies decreases. The power added efficiency for the 1.9 GHz design is lowered to 48%, while an output power of 1.1 W is achieved.
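The inductor values quoted above follow from the resonance condition of the driver tank: for a fixed gate capacitance, the inductance needed scales with the inverse square of the frequency. The sketch below reproduces that arithmetic. The function names are illustrative, and the assumption that the tuned capacitance is simply proportional to the transistor width is a simplification.

```python
def scaled_inductance(l_ref: float, f_ref: float, f_new: float) -> float:
    """Inductance that resonates with the same capacitance at f_new (L ~ 1 / f^2)."""
    return l_ref * (f_ref / f_new) ** 2


def capacitance_scale(l_needed: float, l_available: float) -> float:
    """Factor by which the tuned capacitance must shrink when a larger
    inductor (l_available) replaces the ideal one (l_needed)."""
    return l_needed / l_available


if __name__ == "__main__":
    l_new = scaled_inductance(1.8e-9, 700e6, 1.9e9)
    print(f"ideal inductor at 1.9 GHz: {l_new * 1e9:.3f} nH")
    print(f"capacitance scale with 0.37 nH: {capacitance_scale(l_new, 0.37e-9):.2f}")
    print(f"reported width ratio: {3.6 / 6.5:.2f}")
```

The ideal value comes out at about 0.244 nH, consistent with the 0.25 nH quoted in the text. Using 0.37 nH instead asks for roughly two thirds of the original capacitance, which is of the same order as the reported width reduction from 6.5 mm to 3.6 mm.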

The 0.37 nH inductors in figure 9 are made by placing bonding wires in parallel. The inductance value for parallel bonding wires is given by:

The mutual term in this expression represents the coupling between two inductors. When the currents through two neighboring bonding wires flow in opposite directions, this term takes a negative sign. A lower inductance and a higher parasitic resistance for the inductor are the result. The maximum number of bonding wires that can be placed is limited by the parasitic capacitance of the bonding pads. Achieving very small inductance values therefore requires additional measures. In the 1.9 GHz design the die was thinned, lowering the distance between the die and the substrate.
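The effect of mutual coupling on parallel bond wires can be illustrated with the textbook result for two coupled inductors in parallel, L = (L1·L2 − M²) / (L1 + L2 − 2M). This two-wire formula is used here as a stand-in and is not necessarily the expression from the text, which did not survive; the wire and coupling values are hypothetical.

```python
def parallel_coupled_inductance(l1: float, l2: float, m: float) -> float:
    """Equivalent inductance of two magnetically coupled inductors in parallel.

    m > 0 : currents in the two wires flow in the same direction
    m < 0 : currents flow in opposite directions
    """
    denom = l1 + l2 - 2.0 * m
    if denom <= 0:
        raise ValueError("non-physical coupling: L1 + L2 - 2M must be positive")
    return (l1 * l2 - m * m) / denom


if __name__ == "__main__":
    # Hypothetical 1 nH bond wires with a mutual inductance of 0.3 nH.
    l_wire, m = 1e-9, 0.3e-9
    print(f"uncoupled:          {parallel_coupled_inductance(l_wire, l_wire, 0.0) * 1e9:.2f} nH")
    print(f"same direction:     {parallel_coupled_inductance(l_wire, l_wire, m) * 1e9:.2f} nH")
    print(f"opposite direction: {parallel_coupled_inductance(l_wire, l_wire, -m) * 1e9:.2f} nH")
```

For two equal wires the result reduces to (L + M)/2, so positive coupling between closely spaced parallel wires limits how far the inductance can be lowered, while a negative mutual term lowers it further.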

8. CMOS versus other technologies

Since high efficiency CMOS PAs are non-linear class E topologies, they can only be compared with other non-linear PAs. Table 1 gives a summary for a few leading class E amplifiers. The table reveals that the output power of a fully Integrated Power Amplifier (IPA) is about 6 to 9 dB lower than that of a discrete power amplifier. Also, the consumed chip area is a factor 1.5 to 3 larger than that of a discrete power amplifier. The CMOS PA designed for maximum efficiency [7] can be regarded as equivalent to the IPAs given in table 1. After all, if the on-chip inductors were replaced by off-chip inductors, the output power and efficiency would increase to equivalent values. The benefit offered by technologies such as GaAs and SOI would of course vanish. Hence, for higher output powers it would always be better, for a given supply voltage, to design with off-chip inductors. For low voltage applications that require large output powers, CMOS is the best choice.

9. Linearization

Due to its high non-linearity, the operation of the CMOS PA is restricted to constant envelope modulation schemes, such as GMSK. Two linearization schemes can be used to solve this problem.

The first method is the use of the LINC topology. In this method two matched power amplifiers are combined. Each of the nonlinear amplifiers produces a sine wave with a certain phase. When a coupler sums the sine waves, the phase relation between the two signals defines the amplitude of the resulting output signal. The problem is to isolate the two amplifier outputs sufficiently. Only when a micro-strip coupler is used can enough isolation be achieved. The size of the micro-strip coupler is determined by the operating frequency; for frequencies lower than 2 GHz this micro-strip structure is too big to be practical. Another disadvantage is that the resulting efficiency is reduced to half of the efficiency of a single power amplifier [17].

The second method seems to be more elegant. In the Envelope Elimination and Restoration (EE&R) technique, see figure 10, the power supply is modulated to produce different output powers (amplitudes) [18]. We notice from figure 7 that the relation between output power and supply voltage for a class E amplifier is not linear. A fraction of the output power is therefore fed back to the input of a differential amplifier. There it is compared with the wanted output signal. The resulting error signal is used to adjust the supply voltage of the nonlinear class E amplifier. A class S switching power supply generates the supply voltage for the class E PA. This low frequency power supply modulator can easily be made in the same standard CMOS technology as the class E amplifier.
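The envelope feedback used in the supply-modulation scheme can be sketched as a simple discrete-time loop: the detected output amplitude is compared with the wanted amplitude, and the error nudges the supply voltage until the two agree. Everything in the sketch is an assumption made for illustration: the mildly non-linear supply-to-amplitude characteristic, the loop gain and the function names are hypothetical and do not model the circuit of [18].

```python
def class_e_amplitude(vdd: float) -> float:
    """Hypothetical, mildly non-linear output amplitude versus supply voltage."""
    return vdd - 0.05 * vdd ** 2


def track_envelope(target: float, vdd_start: float = 1.0,
                   gain: float = 0.5, steps: int = 50) -> float:
    """Adjust the supply voltage until the output amplitude matches the target."""
    vdd = vdd_start
    for _ in range(steps):
        error = target - class_e_amplitude(vdd)  # output of the differential amplifier
        vdd += gain * error                      # supply modulator integrates the error
    return vdd


if __name__ == "__main__":
    vdd = track_envelope(1.5)
    print(f"settled supply voltage: {vdd:.4f} V")
    print(f"output amplitude:       {class_e_amplitude(vdd):.4f}")
```

Because the loop acts on the measured output rather than on an assumed linear law, the non-linear relation between supply voltage and output is corrected automatically, provided the loop is fast enough for the envelope bandwidth.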


The class E amplifier always operates under the recommended nominal power supply. This gives the class S amplifier enough headroom to produce the maximum output voltage for the class E amplifier. The efficiency of the class S amplifier depends on the bandwidth, output swing and allowable distortion. For a small bandwidth and low distortion, good efficiencies can be achieved. In [18] an efficiency of 80% was achieved for a bandwidth of 20 kHz, a peak sinusoidal signal of 0.8 V and a distortion of -55 dBc. The resulting efficiency for the combined system of the class S and class E amplifier is the product of both efficiencies. When an efficiency of 42% (maximum output power design) is assumed for the class E amplifier, a resulting efficiency of about 33.5% can be achieved. This is far better than amplifiers that use Output power Back-Off (OBO) to linearize their output power. For linear amplifiers using orthogonal frequency division multiplexing (OFDM), an OBO of 6 dB must be used to accommodate the strongly fluctuating envelope of the OFDM signal. The linear amplifier will accordingly have an efficiency below 13%. Hence, a CMOS power amplifier with EE&R is a strong candidate for this type of application [19]. A supplementary advantage of using EE&R is that the amplifier and the power supply modulator can be used and designed separately. Modulations other than OFDM can also be used. The only restriction is that transitions through the origin in the constellation diagram must be avoided, because the swing of the class S amplifier is limited. The following modulations can be used: π/4-DQPSK (NADC/IS-54), OQPSK (CDMA) and offset 8-PSK (GSM EDGE). In the example of [18] the maximum swing is 1.6 V, setting the maximum supply voltage equal to 2.3 V. The resulting maximum output power of 28 dBm is necessary to comply with the North American Digital Cellular standard. The corresponding output power for the minimum supply voltage of 0.7 V is 6 dBm. This means that signals with amplitude variations of 22 dB, which is more than actually needed, can be transmitted. Figure 11 presents two constellation diagrams for the example of [18]. A large improvement in performance, due to the EE&R, can be observed.
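The efficiency and dynamic range figures in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only. In particular, the back-off estimate assumes an ideal class A linear amplifier with a peak efficiency of 50%, whose efficiency falls in proportion to the output power; that assumption is not stated in the text and is introduced here to show where a figure below 13% can come from.

```python
def combined_efficiency(eta_modulator: float, eta_pa: float) -> float:
    """Overall efficiency of a supply modulator feeding a switching PA."""
    return eta_modulator * eta_pa


def dynamic_range_db(p_max_dbm: float, p_min_dbm: float) -> float:
    """Difference between two power levels in dBm, expressed in dB."""
    return p_max_dbm - p_min_dbm


def class_a_efficiency_at_backoff(obo_db: float, eta_peak: float = 0.5) -> float:
    """Ideal class A efficiency at a given output back-off (assumed model)."""
    return eta_peak * 10.0 ** (-obo_db / 10.0)


if __name__ == "__main__":
    print(f"class S x class E: {combined_efficiency(0.80, 0.42):.3f}")
    print(f"dynamic range:     {dynamic_range_db(28.0, 6.0):.0f} dB")
    print(f"class A at 6 dB:   {class_a_efficiency_at_backoff(6.0):.3f}")
```

The product 0.80 × 0.42 gives 0.336, close to the figure of about 33.5% quoted above, and the ideal class A estimate at 6 dB back-off is about 12.6%, in line with the statement that the linear amplifier stays below 13%.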

10. Conclusion

As explained, CMOS PAs can be designed for maximum output power or for maximum efficiency. Independent of the choice of operating point, CMOS PAs can compete with low voltage designs in other technologies. When moving to output powers higher than 25 dBm, CMOS becomes the best candidate. Using more expensive technologies loses its attractiveness, because the inductors for the IPA designs cannot be integrated.

Figure 11: QPSK constellation of the CMOS PA.

References

[1] D. Su and W. McFarland, “A 2.5V, 1W Monolithic CMOS RF Power Amplifier”, Proceedings of the Custom Integrated Circuits Conference, IEEE, May 1997, pp. 189-192.

[2] N. Sokal and A. Sokal, “Class E - A new class of high-efficiency tuned single-ended switching power amplifiers”, IEEE JSSC, Vol. 10, No. 3, June 1975, pp. 168-176.

[3] H. J. M. Veendrick, “Short circuit dissipation of static CMOS circuitry and its impact on the design of buffer circuits”, IEEE JSSC, Vol. sc-19, No. 4, August 1984, pp. 468-473.

[4] Chen Wen, H. Floberg and Qiu Shui-Sheng, “A New Analytical method for analysis and design of class E power amplifiers taking into account the switching device on resistance”, International Journal of Circuit Theory and Applications, 27, 1999, pp. 421-436.

[5] F. H. Raab and N. Sokal, “Transistor Power Losses in the class E tuned power amplifier”, IEEE JSSC, Vol. sc-13, No. 6, December 1978, pp. 912-914.

[6] C. Yoo and Q. Huang, “A Common-Gate switched, 0.9W Class E power amplifier with 41% PAE in CMOS”, VLSI Circuits Symposium, June 2000, pp. 56-57.

[7] K. Mertens, M. Steyaert and B. Nauwelaers, “A 700MHz, 1W fully differential Class E power amplifier in CMOS”, ESSCIRC, September 2000, pp. 104-107.

[8] K. C. Tsai and P. R. Gray, “A 1.9GHz 1W CMOS Class E power amplifier for wireless communications”, IEEE JSSC, Vol. 34, No. 7, July 1999, pp. 962-970.

[9] Thomas H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits, ISBN 0521639220, Cambridge University Press, 1998, pp. 52.

[10] T. Sowlati, C. Salama, J. Sitch, G. Rabjohn and D. Smith, “Low voltage, High efficiency GaAs class E power amplifiers for wireless transmitters”, IEEE JSSC, Vol. 30, No. 10, October 1995, pp. 1074-1079.

[11] Y. Tan, M. Kumar, J. Sin, L. Shi and J. Lau, “A 900 MHz fully integrated SOI power amplifier for single-chip wireless transceiver applications”, IEEE JSSC, Vol. 35, No. 10, October 2000, pp. 1481-1486.

[12] A. Jain, W. Anderson, et al., “A 1.2GHz Alpha Microprocessor with 44.8GB/S Chip Pin Bandwidth”, Digest of Technical Papers ISSCC, 6 February 2001, pp. 240-241.

[13] P. Alinikula, K. Choi and S. I. Long, “Design of Class E power amplifier with nonlinear parasitic output capacitance”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 46, No. 2, February 1999, pp. 114-119.

[14] F. H. Raab, “Effects of circuit variations of the Class E tuned power amplifier”, IEEE JSSC, Vol. sc-13, No. 2, April 1978, pp. 239-247.

[15] R. E. Zulinski, “Class E power amplifiers and frequency multipliers with finite DC-Feed inductance”, IEEE Transactions on Circuits and Systems, Vol. cas-34, No. 9, September 1987, pp. 1074-1087.

[16] S. L. Wong, H. Bhimnathwala, S. Luo, B. Halali and S. Navid, “A 1W 830MHz Monolithic BiCMOS power amplifier”, Digest of Technical Papers ISSCC 1996, pp. 52-53.

[17] S. Tomisato, K. Chiba and K. Murota, “Phase error free LINC modulator”, Electronics Letters, Vol. 25, No. 9, April 1989, pp. 576-577.

[18] D. K. Su and W. J. McFarland, “An IC for linearizing RF Power amplifiers using envelope elimination and restoration”, IEEE JSSC, Vol. 33, No. 12, December 1998, pp. 2252-2258.

[19] W. Liu, J. Lau and R. S. Cheng, “Considerations on applying OFDM in a highly efficient power amplifier”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 46, No. 11, November 1999, pp. 1329-1336.
