Lecture Notes on Mixed Signal Circuit Design by Prof. Dinesh K. Sharma

Basics of Semiconductor Devices

Dinesh Sharma

Microelectronics group

EE Department, IIT Bombay

October 13, 2005


http://www.satishkashyap.com/


In this booklet, we review the fundamentals of semiconductor physics and the basics of device operation. We shall concentrate largely on elemental semiconductors such as silicon or germanium, and most numerical values used for examples are specific to silicon.

1 Semiconductor fundamentals

A semiconductor has two types of mobile charge carriers: negatively charged electrons and positively charged holes. We shall denote the concentrations of these charge carriers by $n$ and $p$ respectively. The discussions in this booklet apply to elemental semiconductors (like silicon) which belong to group IV of the periodic table. We can intentionally add impurities from groups III and V to the semiconductor. These impurities are called dopants. Impurities from group III are called acceptors while those from group V are called donors.

Each donor atom has an extra electron, which is very loosely bound to it. At room temperature, there is sufficient thermal energy present for the loosely bound electron to break free from the donor, leaving the donor positively charged. This contributes an additional electron to the free charge carriers in the semiconductor, and a positive ionic charge at a fixed location in the semiconductor. Similarly, an acceptor atom captures an electron, thus producing a mobile hole and becoming negatively charged itself. A semiconductor without any dopants is called intrinsic.

An unperturbed semiconductor must be charge neutral as a whole. If we denote the concentration of ionised donors by $N_d^+$ and the concentration of ionised acceptors by $N_a^-$, we can write the net charge density at any point in the semiconductor as:

$$\rho = q(N_d^+ - N_a^- + p - n) \tag{1}$$

where $q$ is the absolute value of the electronic charge. In an unperturbed semiconductor, $\rho$ will be zero everywhere. Electrons and holes are generated thermally: the availability of energy equal to the band gap of the semiconductor results in the generation of an electron-hole pair. Simultaneously, electrons and holes can recombine to annihilate each other, giving out energy which is equal to the band gap of the semiconductor. Thus we have the reversible reaction:

$$e^- + h^+ \rightleftharpoons E_g$$

where $E_g$ is the band gap energy of the semiconductor. Applying the law of mass action to the above reaction, we can write for the equilibrium concentrations of holes and electrons:

$$n \cdot p = \text{constant}$$

The above relation applies to doped as well as intrinsic semiconductors. But for an intrinsic semiconductor,

$$n = p \equiv n_i$$

Therefore, the constant in the equation connecting $n$ and $p$ must be $n_i^2$. Thus, for a semiconductor in equilibrium,

$$n \cdot p = n_i^2 \tag{2}$$

Since $n$ and $p$ are not independent, but are constrained by the above relation, we can define a single independent variable, the Fermi potential, by

$$\Phi_F \equiv \frac{K_B T}{q}\ln\frac{p}{n_i} = \frac{K_B T}{q}\ln\frac{n_i}{n} \tag{3}$$




where $K_B$ is the Boltzmann constant, $T$ is the absolute temperature and $q$ is the absolute value of the electronic charge. At room temperature, $K_B T/q$ is approximately 26 mV and $n_i$ is of the order of $10^{10}/\text{cm}^3$ for silicon. The electron and hole concentrations are then given by:

$$n = n_i e^{-q\Phi_F/K_B T}$$

$$p = n_i e^{q\Phi_F/K_B T} \tag{4}$$

To simplify these relations, we define a dimensionless Fermi potential by:

$$u_F \equiv \frac{q\Phi_F}{K_B T} = \ln(p/n_i) = \ln(n_i/n)$$

then:

$$n = n_i e^{-u_F}$$

$$p = n_i e^{u_F} \tag{5}$$

Generally, a semiconductor will be doped with only one kind of impurity. A semiconductor doped with donors will have many more electrons than holes. This type of semiconductor is called N type, and electrons are the majority carriers in this type of semiconductor. Similarly, holes are the majority carriers in a semiconductor doped with acceptors and it is termed P type. If both types of dopants are present, the one present in higher concentration determines the 'type' of the semiconductor. The net doping is defined as the difference in the concentrations of the more abundant and the less abundant dopants.

In most practical cases, the ratio of majority to minority carriers is very high. The concentration of majority carriers is then very nearly equal to the net dopant concentration. To take a typical example, consider P type silicon with a boron concentration of $10^{16}$ atoms/cm$^3$. This gives:

$$p = N_a = 10^{16}/\text{cm}^3$$

$$n = n_i^2/p \approx (10^{20}/10^{16})/\text{cm}^3 = 10^{4}/\text{cm}^3$$

so that $p/n \approx 10^{12}$!
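The arithmetic of this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `p_type_equilibrium` and the constant names are illustrative choices, and the values of $K_B T/q$ and $n_i$ are the rounded room-temperature figures quoted above.

```python
import math

VT = 0.026   # K_B*T/q at room temperature in volts (rounded value quoted in the text)
NI = 1e10    # intrinsic carrier concentration of silicon per cm^3 (order of magnitude)

def p_type_equilibrium(na, ni=NI, vt=VT):
    """Return (p, n, phi_f) for P type silicon with net acceptor density na per cm^3."""
    p = na                          # majority carriers ~ net dopant concentration
    n = ni ** 2 / p                 # law of mass action, eq. (2)
    phi_f = vt * math.log(p / ni)   # Fermi potential, eq. (3)
    return p, n, phi_f

p, n, phi_f = p_type_equilibrium(1e16)
print(f"p = {p:.1e} /cm^3, n = {n:.1e} /cm^3, p/n = {p / n:.1e}")
print(f"Fermi potential = {phi_f:.3f} V")
```

The Fermi potential comes out positive, as expected for a P type sample where holes are the majority carriers.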

1.1 Band Diagrams

The above concepts are often visualised with the help of band diagrams. The arrangement of atoms in a semiconductor results in certain electron energies which are not permitted. Thus, the energy range is divided into bands of permitted energy values alternating with forbidden gaps.

The highest such band which is nearly filled with electrons is called the valence band. Unoccupied levels in this band correspond to holes. For stability, electrons seek the lowest energy level available. If a vacancy is available at a lower energy, an electron at a higher energy will drop to this level. The vacancy thus bubbles up to a higher level. Therefore, holes seek the highest electron energy available.

The band just above the valence band is called the conduction band. In a semiconductor, this is partially filled. Conduction in a semiconductor is caused by electrons in the conduction band (which are normally to be found at the lowest energy in the conduction band) or holes in the valence band (found at the highest electron energy in the valence band).

Band diagrams are plots of electron energies as a function of position in the semiconductor. Typically, the top of the valence band (corresponding to minimum hole energy) and the bottom of the conduction band are plotted. We can show the Fermi potential and the corresponding Fermi energy ($= -q\Phi_F$) in the band diagram of silicon as a level in the band gap. We use the halfway point between the conduction and the valence band as the reference for energy and potential. When $n = p = n_i$, the Fermi potential is 0 (from eq. 3) and correspondingly, the Fermi energy lies at the intrinsic Fermi level halfway in the band gap. (Actually, this level can be slightly away from the middle of the band gap depending on the density of allowed states in the conduction and valence bands, but for now we'll ignore this.) When holes are the majority carriers, $\Phi_F$ is positive and the Fermi energy ($= -q\Phi_F$) lies below the mid gap level, as shown in the adjoining figure. When electrons are the majority carriers, $\Phi_F$ is negative, and the Fermi energy lies above the mid gap level.

[Figure: band diagram showing the levels Ec, Ei, EF and Ev, with the Fermi energy -qΦF marked]

1.2 A semiconductor in the presence of an electric field

In the presence of an electric field, the electrostatic potential is different at different positions. The energy of an electron has an extra component $= -q\phi$, where $\phi$ is the electrostatic potential. Consequently, in the band diagram the conduction, valence and intrinsic levels are bent. In equilibrium, the Fermi level is still straight. (We shall see later that in the absence of a current, the slope of the Fermi level must vanish.) Relations for $n$ and $p$ must now take the electrostatic potential as well as the Fermi potential into account, and the electron and hole concentrations are not uniform over the semiconductor.

Figure 1: Potential distribution and Band Diagram in the presence of a field [potential V versus position X, with the levels Ec, Ei, EF and Ev]

If we represent the concentrations of electrons and holes without any applied field by $n_0$ and $p_0$ respectively, then in the presence of a field (but in equilibrium),

$$n = n_0 e^{q\phi/K_B T}$$

$$p = p_0 e^{-q\phi/K_B T} \tag{6}$$

where $\phi$ is the electrostatic potential. If we define a dimensionless electrostatic potential by:

$$u \equiv \frac{q\phi}{K_B T} \tag{7}$$

we can write the above relations as:

$$n = n_0 e^{u} = n_i e^{(u - u_F)}$$

$$p = p_0 e^{-u} = n_i e^{-(u - u_F)} \tag{8}$$

Since there is equilibrium, even though the electron and hole concentrations are not uniform, the product of $n$ and $p$ is still constant and equal to $n_i^2$ everywhere.




1.3 Non-equilibrium case

The above relations assume a semiconductor in equilibrium. It is possible to create excess carriers in the semiconductor over those dictated by equilibrium considerations. For example, if we shine light on a semiconductor, electron-hole pairs will be created. Since the value of $n$ as well as that of $p$ goes up, the $np$ product will exceed $n_i^2$, till equilibrium is restored after the light is turned off (by enhanced recombination).

If the number of excess carriers is small compared to the majority carriers, we may assume that the carrier concentrations are still described by relations like those given above. However, the concentrations of electrons and holes are not constrained by relation (2) any more. Therefore, we cannot use the same value of $u_F$ for describing electron as well as hole concentrations. We now have separate values of $\Phi_F$ for electrons and holes. These are called quasi Fermi levels (or imrefs) for electrons and holes, $\Phi_{Fn}$ and $\Phi_{Fp}$, defined by the relations

$$n = n_i e^{(u - u_{Fn})}$$

$$p = n_i e^{-(u - u_{Fp})} \tag{9}$$

where $u_{Fn}$ and $u_{Fp}$ are the dimensionless versions of the quasi Fermi levels $\Phi_{Fn}$ and $\Phi_{Fp}$, defined as in equation (7). The $np$ product is now given by

$$np = n_i^2 e^{(u_{Fp} - u_{Fn})} \tag{10}$$

and is no longer constant. Because the number of additional carriers is assumed to be small compared to the majority carriers, the concentration of majority carriers and hence its quasi Fermi level is very close to the equilibrium value. The relative change in the concentration of minority carriers could, however, be large and consequently the minority carrier quasi Fermi level could be substantially different from the equilibrium Fermi level.
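A short numerical sketch can illustrate how differently the two quasi Fermi levels respond to excess carriers. It assumes a field-free sample ($u = 0$), the rounded values of $K_B T/q$ and $n_i$ used earlier, and a hypothetical excess density of $10^{12}$ electron-hole pairs per cm$^3$ chosen only for illustration; the function name is likewise an illustrative choice.

```python
import math

VT = 0.026   # K_B*T/q in volts (rounded room-temperature value)
NI = 1e10    # intrinsic carrier concentration of silicon per cm^3

def quasi_fermi_potentials(p0, n0, excess, ni=NI, vt=VT):
    """Quasi Fermi potentials (volts) at u = 0 after adding `excess` pairs per cm^3."""
    n = n0 + excess
    p = p0 + excess
    phi_fn = -vt * math.log(n / ni)   # from n = ni*exp(u - uFn) with u = 0
    phi_fp = vt * math.log(p / ni)    # from p = ni*exp(-(u - uFp)) with u = 0
    return n, p, phi_fn, phi_fp

# P type silicon, Na = 1e16 /cm^3, so p0 = 1e16 and n0 = ni^2/p0 = 1e4
n, p, phi_fn, phi_fp = quasi_fermi_potentials(1e16, 1e4, excess=1e12)
phi_f_eq = VT * math.log(1e16 / NI)   # equilibrium Fermi potential

print(f"equilibrium Phi_F        = {phi_f_eq:.4f} V")
print(f"hole quasi Fermi level   = {phi_fp:.4f} V")
print(f"electron quasi Fermi lvl = {phi_fn:.4f} V")
print(f"np / ni^2                = {n * p / NI**2:.3e}")
```

The majority carrier (hole) quasi Fermi potential barely moves from its equilibrium value, while the minority carrier (electron) one shifts by several tenths of a volt, in line with the argument above.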

2 The p-n diode

We shall analyse the abrupt pn junction, in reverse and forward bias. We assume that the doping density is constant and its value is $N_a$ on the P side and $N_d$ on the N side, changing abruptly at the metallurgical junction as shown.

Figure 2: The abrupt p-n junction [P and N regions with depletion edges Xdp and Xdn, and band diagram showing Ec, EF, Ei and Ev]

Because there is a strong concentration gradient for electrons and holes at the junction, there will be a diffusion current of holes towards the N side and of electrons towards the P side. As these carriers leave behind ionised dopants, small regions on either side of the junction acquire a charge. The P side, from where positively charged holes have left (leaving behind negatively charged acceptor ions), acquires a negative potential. Similarly, the N side becomes positively charged. The regions from where mobile charges have left are called depletion regions.

The potential difference resulting from this charge redistribution (called the built-in voltage) opposes further diffusion of carriers. A dynamic equilibrium is reached when the drift current due to this potential difference and the diffusion current due to the concentration gradient become equal and opposite. In equilibrium, the electron as well as hole currents must be zero individually (principle of detailed balance). Writing the electron and hole current densities as sums of their respective drift and diffusion current densities:

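$$J_n = nq\mu_n\left(-\frac{\partial\phi}{\partial x}\right) + qD_n\frac{\partial n}{\partial x}$$

$$J_p = pq\mu_p\left(-\frac{\partial\phi}{\partial x}\right) - qD_p\frac{\partial p}{\partial x} \tag{11}$$

From equation (9),

$$\frac{\partial n}{\partial x} = n_i e^{(u - u_{Fn})}\frac{\partial}{\partial x}(u - u_{Fn})$$

$$\frac{\partial p}{\partial x} = n_i e^{(u_{Fp} - u)}\frac{\partial}{\partial x}(u_{Fp} - u)$$

or

$$\frac{\partial n}{\partial x} = n\frac{q}{K_B T}\frac{\partial}{\partial x}(\phi - \Phi_{Fn})$$

$$\frac{\partial p}{\partial x} = p\frac{q}{K_B T}\frac{\partial}{\partial x}(\Phi_{Fp} - \phi)$$

Using the Einstein relations ($\frac{q}{K_B T}D = \mu$) and substituting in the relations for $J_n$ and $J_p$,

$$J_n = -nq\mu_n\frac{\partial\phi}{\partial x} + nq\mu_n\frac{\partial}{\partial x}(\phi - \Phi_{Fn})$$

$$J_p = -pq\mu_p\frac{\partial\phi}{\partial x} - pq\mu_p\frac{\partial}{\partial x}(\Phi_{Fp} - \phi)$$

which leads to

$$J_n = -nq\mu_n\frac{\partial\Phi_{Fn}}{\partial x}; \qquad J_p = -pq\mu_p\frac{\partial\Phi_{Fp}}{\partial x} \tag{12}$$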

When there is no flow of current, $\Phi_{Fn} = \Phi_{Fp} = \Phi_F$. According to the relations derived above, the derivative of $\Phi_F$ must vanish everywhere for zero current. Thus, the Fermi level is constant and the same on the two sides of the junction. The Fermi potentials before being put in contact were:

$$\Phi_F = \frac{K_B T}{q}\ln(N_a/n_i) \qquad \text{P side: } x < 0$$

$$\Phi_F = -\frac{K_B T}{q}\ln(N_d/n_i) \qquad \text{N side: } x > 0$$

The Fermi potential difference was, therefore, $\frac{K_B T}{q}\ln\left(\frac{N_d N_a}{n_i^2}\right)$. Since, after being put in contact, the Fermi levels have equalised on the two sides, the built-in voltage must be equal and opposite to this potential, taking the P side to a negative potential and the N side to a positive potential. We can write for the magnitude of the built-in voltage:

$$V_{bi} = \frac{K_B T}{q}\ln\left(\frac{N_a N_d}{n_i^2}\right) \tag{13}$$
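As a quick numerical illustration of equation (13), the sketch below evaluates the built-in voltage for one pair of doping densities. The doping values ($N_a = 10^{16}$, $N_d = 10^{18}$ per cm$^3$) are hypothetical and chosen only for the example; $K_B T/q$ and $n_i$ are the rounded room-temperature values used earlier in these notes.

```python
import math

VT = 0.026   # K_B*T/q in volts (rounded room-temperature value)
NI = 1e10    # intrinsic carrier concentration of silicon per cm^3

def built_in_voltage(na, nd, ni=NI, vt=VT):
    """Magnitude of the built-in voltage of an abrupt pn junction, eq. (13)."""
    return vt * math.log(na * nd / ni ** 2)

# Hypothetical doping densities, per cm^3
vbi = built_in_voltage(na=1e16, nd=1e18)

# The same number obtained as the difference of the two Fermi potentials
phi_p = VT * math.log(1e16 / NI)     # P side, positive
phi_n = -VT * math.log(1e18 / NI)    # N side, negative

print(f"V_bi = {vbi:.3f} V, Phi_F(P) - Phi_F(N) = {phi_p - phi_n:.3f} V")
```

Because the dependence on doping is logarithmic, changing either doping density by a factor of ten shifts $V_{bi}$ by only about 60 mV at room temperature.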




2.1 pn Diode in Reverse Bias

The diode is reverse biased when we apply a voltage such that the n side is more positive as compared to the p side. In this case, the applied voltage is in the same direction as the built-in field, which opposes the movement of majority carriers and widens the depletion regions on either side of the junction. We analyse the reverse biased diode by making the depletion approximation. We assume that in reverse bias, the depletion regions have zero carrier density, and the field is completely confined to the depletion regions. Solving Poisson's equation in the P region ($x < 0$) and the N region ($x > 0$),

$$\frac{\partial^2\phi}{\partial x^2} = \frac{qN_a}{\varepsilon_{si}} \qquad (\text{for } x < 0)$$

$$\frac{\partial^2\phi}{\partial x^2} = -\frac{qN_d}{\varepsilon_{si}} \qquad (\text{for } x > 0)$$

Integrating with respect to $x$,

$$\frac{\partial\phi}{\partial x} = \frac{qN_a}{\varepsilon_{si}}x + c_1 \qquad (\text{for } x < 0)$$

$$\frac{\partial\phi}{\partial x} = -\frac{qN_d}{\varepsilon_{si}}x + c_2 \qquad (\text{for } x > 0)$$

where $c_1$ and $c_2$ are constants of integration, which can be evaluated from the condition that the field vanishes at the edges of the depletion regions at $-X_{dp}$ and at $X_{dn}$. This leads to

$$\frac{\partial\phi}{\partial x} = \frac{qN_a}{\varepsilon_{si}}(x + X_{dp}) \qquad (\text{for } x < 0)$$

$$\frac{\partial\phi}{\partial x} = -\frac{qN_d}{\varepsilon_{si}}(x - X_{dn}) \qquad (\text{for } x > 0) \tag{14}$$

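Since the value of the field must match at $x = 0$,

$$N_a X_{dp} = N_d X_{dn} \tag{15}$$

Integrating equation (14) once again with respect to $x$, we get

$$\phi = \frac{qN_a}{\varepsilon_{si}}\left(\frac{x^2}{2} + X_{dp}x\right) + c_3 \qquad (\text{for } x < 0)$$

$$\phi = -\frac{qN_d}{\varepsilon_{si}}\left(\frac{x^2}{2} - X_{dn}x\right) + c_4 \qquad (\text{for } x > 0)$$

where the constants of integration $c_3$ and $c_4$ can again be evaluated from the boundary conditions at $-X_{dp}$ and $X_{dn}$. If we require that the potential is 0 at $-X_{dp}$ and $V$ at $X_{dn}$,

$$c_3 = \frac{qN_a}{2\varepsilon_{si}}X_{dp}^2$$

$$c_4 = V - \frac{qN_d}{2\varepsilon_{si}}X_{dn}^2$$

Substituting these values, we get:

$$\phi = \frac{qN_a}{\varepsilon_{si}}\left(\frac{x^2 + X_{dp}^2}{2} + X_{dp}x\right) \qquad (\text{for } x < 0)$$

$$\phi = V - \frac{qN_d}{\varepsilon_{si}}\left(\frac{x^2 + X_{dn}^2}{2} - X_{dn}x\right) \qquad (\text{for } x > 0) \tag{16}$$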




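Since the potential at $x = 0$ should be continuous,

$$\frac{qN_a}{2\varepsilon_{si}}X_{dp}^2 = V - \frac{qN_d}{2\varepsilon_{si}}X_{dn}^2$$

so,

$$V = \frac{q}{2\varepsilon_{si}}\left(N_a X_{dp}^2 + N_d X_{dn}^2\right) \tag{17}$$

Making use of equation (15), we can write

$$V = \frac{qN_a X_{dp}^2}{2\varepsilon_{si}N_d}(N_d + N_a) = \frac{qN_d X_{dn}^2}{2\varepsilon_{si}N_a}(N_d + N_a)$$

which leads to

$$X_{dp} = \sqrt{\frac{2\varepsilon_{si}V}{q(N_d + N_a)}\frac{N_d}{N_a}}$$

$$X_{dn} = \sqrt{\frac{2\varepsilon_{si}V}{q(N_d + N_a)}\frac{N_a}{N_d}} \tag{18}$$

From which the total depletion width can be calculated as:

$$X_d \equiv X_{dp} + X_{dn} = \sqrt{\frac{2\varepsilon_{si}V}{q(N_d + N_a)}}\left(\sqrt{\frac{N_d}{N_a}} + \sqrt{\frac{N_a}{N_d}}\right)$$

which gives

$$X_d = \sqrt{\frac{2\varepsilon_{si}V}{q}\left(\frac{1}{N_a} + \frac{1}{N_d}\right)} \tag{19}$$

The voltage $V$ in the above expressions is the total voltage across the junction. Since there is a reverse bias of $V_{bi}$ for zero applied voltage, it adds (in magnitude) to the applied reverse voltage. Using equation (13) we can write:

$$V = V_{bi} + V_{appl} = V_{appl} + \frac{K_B T}{q}\ln\left(\frac{N_a N_d}{n_i^2}\right) \tag{20}$$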
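Equations (15) and (18) to (20) can be exercised numerically as in the sketch below. The doping densities and the 5 V reverse bias are hypothetical example inputs, and the silicon permittivity ($11.7 \times 8.854 \times 10^{-14}$ F/cm) is an assumed textbook value that is not stated in these notes; lengths come out in cm because densities are per cm$^3$.

```python
import math

Q = 1.602e-19               # electronic charge, C
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon, F/cm (assumed value)
VT = 0.026                  # K_B*T/q in volts (rounded)
NI = 1e10                   # intrinsic carrier concentration, per cm^3

def built_in_voltage(na, nd, ni=NI, vt=VT):
    """Eq. (13)."""
    return vt * math.log(na * nd / ni ** 2)

def depletion_widths(na, nd, v_appl, ni=NI, vt=VT):
    """Return (Xdp, Xdn, V) in cm, cm, volts using eqs. (18) and (20)."""
    v = v_appl + built_in_voltage(na, nd, ni, vt)    # total junction voltage, eq. (20)
    common = 2.0 * EPS_SI * v / (Q * (na + nd))
    xdp = math.sqrt(common * nd / na)
    xdn = math.sqrt(common * na / nd)
    return xdp, xdn, v

# Hypothetical example: Na = 1e16, Nd = 1e18 per cm^3, 5 V applied reverse bias
NA, ND, V_REV = 1e16, 1e18, 5.0
xdp, xdn, v_total = depletion_widths(NA, ND, V_REV)
xd_eq19 = math.sqrt(2.0 * EPS_SI * v_total / Q * (1.0 / NA + 1.0 / ND))   # eq. (19)

print(f"total junction voltage = {v_total:.3f} V")
print(f"Xdp = {xdp * 1e4:.4f} um, Xdn = {xdn * 1e4:.4f} um")
print(f"Xdp + Xdn = {(xdp + xdn) * 1e4:.4f} um, eq. (19) gives {xd_eq19 * 1e4:.4f} um")
```

With these inputs nearly all of the depletion width falls on the lightly doped P side, which is what the charge balance of equation (15) requires.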

3 The pn diode in forward bias

If we apply an external voltage such that the P side is made positive with respect to the N side, the applied voltage will reduce the built-in voltage across the junction. The magnitude of the built-in voltage is such that it balances the drift and diffusion currents, resulting in zero net current. But if the voltage across the junction is reduced, a net current will flow through the diode. This is the forward mode of operation. Because of this flow of current, electrons are injected into the P side and holes into the N side. Consequently, the concentration of carriers is no longer at the equilibrium value. We denote the equilibrium values of electron and hole concentrations on the P and N sides by $n_{p0}$, $n_{n0}$, $p_{p0}$, $p_{n0}$ respectively. Since the majority carrier concentration in equilibrium is equal to the doping density, we have:

$$n_{n0} \approx N_d, \quad p_{p0} \approx N_a \quad \text{and} \quad n_{p0} = n_i^2/N_a, \quad p_{n0} = n_i^2/N_d$$




According to equation (10),

$$np = n_i^2 e^{(u_{Fp} - u_{Fn})}$$

As we make the potential of the P type more positive compared to the N type, the $np$ product in forward bias is greater than $n_i^2$. From relations (12), we see that the change in quasi Fermi levels is small wherever the carrier concentration is high. Thus, we can assume that the quasi Fermi levels of the majority carriers on either side of the junction remain at their equilibrium values. Hence the voltage across the junction is given by

$$V = \Phi_{Fp} - \Phi_{Fn}$$

and therefore the non-equilibrium $np$ product is given by

$$np = n_i^2 e^{qV/K_B T}$$

therefore,

$$n_p = \frac{n_i^2}{p_p}e^{qV/K_B T} = n_{p0}e^{qV/K_B T} \tag{21}$$

$$p_n = \frac{n_i^2}{n_n}e^{qV/K_B T} = p_{n0}e^{qV/K_B T} \tag{22}$$

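The continuity equation for any particle flow can be written as

$$\nabla\cdot(\text{particle current density}) = -\frac{\partial}{\partial t}(\text{particle concentration})$$

Applying it in steady state to the electron and hole currents in 1 dimension on the n side, with carriers being lost through recombination,

$$\frac{\partial}{\partial x}\left(\frac{J_n}{-q}\right) = -U$$

$$\frac{\partial}{\partial x}\left(\frac{J_p}{q}\right) = -U$$

where $U$ is the net recombination rate. Using relation (11), we have

$$\frac{\partial}{\partial x}\left(n_n\mu_n\frac{\partial\phi}{\partial x} - D_n\frac{\partial n_n}{\partial x}\right) = -U$$

$$\frac{\partial}{\partial x}\left(p_n\mu_p\frac{\partial\phi}{\partial x} + D_p\frac{\partial p_n}{\partial x}\right) = U$$

or

$$\mu_n\frac{\partial n_n}{\partial x}\frac{\partial\phi}{\partial x} + \mu_n n_n\frac{\partial^2\phi}{\partial x^2} - D_n\frac{\partial^2 n_n}{\partial x^2} = -U$$

$$\mu_p\frac{\partial p_n}{\partial x}\frac{\partial\phi}{\partial x} + \mu_p p_n\frac{\partial^2\phi}{\partial x^2} + D_p\frac{\partial^2 p_n}{\partial x^2} = U$$

Assuming the regions outside the small depletion regions to be charge neutral,

$$(n_n - n_{n0}) \approx (p_n - p_{n0})$$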




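We define the ambipolar diffusion constant and lifetime by the relations

$$D_a \equiv \frac{n_n + p_n}{n_n/D_p + p_n/D_n} \tag{23}$$

$$\tau_a \equiv \frac{p_n - p_{n0}}{U} = \frac{n_n - n_{n0}}{U} \tag{24}$$

Multiplying the electron continuity equation by $\mu_p p_n$ and the hole continuity equation by $\mu_n n_n$ and combining (so that the terms in $\partial^2\phi/\partial x^2$ cancel), we get

$$-\frac{p_n - p_{n0}}{\tau_a} + D_a\frac{\partial^2 p_n}{\partial x^2} + \frac{n_n - p_n}{n_n/\mu_p + p_n/\mu_n}\frac{\partial p_n}{\partial x}\frac{\partial\phi}{\partial x} = 0 \tag{25}$$

If we make the low injection assumption ($p_n \ll n_n \approx n_{n0}$), this reduces to

$$-\frac{p_n - p_{n0}}{\tau_p} + D_p\frac{\partial^2 p_n}{\partial x^2} + \mu_p\frac{\partial p_n}{\partial x}\frac{\partial\phi}{\partial x} = 0 \tag{26}$$

In the neutral region, $\frac{\partial\phi}{\partial x}$ is zero, so the above simplifies further to

$$\frac{\partial^2 p_n}{\partial x^2} - \frac{p_n - p_{n0}}{D_p\tau_p} = 0 \tag{27}$$

This can be solved with the boundary condition at the depletion edge given by relations (21) and (22), and noting that $p_n = p_{n0}$ at $x = \infty$, to give:

$$p_n - p_{n0} = p_{n0}\left(e^{qV/K_B T} - 1\right)e^{-(x - x_n)/L_p} \tag{28}$$

where $x_n$ is the position of the depletion edge on the N side and

$$L_p \equiv \sqrt{D_p\tau_p} \tag{29}$$

Evaluating the hole current at $X_{dn}$, we get

$$J_p = -qD_p\frac{\partial p_n}{\partial x} = \frac{qD_p p_{n0}}{L_p}\left(e^{qV/K_B T} - 1\right) \tag{30}$$

Similarly, we can evaluate the electron current on the p side as

$$J_n = qD_n\frac{\partial n_p}{\partial x} = \frac{qD_n n_{p0}}{L_n}\left(e^{qV/K_B T} - 1\right) \tag{31}$$

which gives the total current density as

$$J = J_p + J_n = J_s\left(e^{qV/K_B T} - 1\right) \tag{32}$$

where

$$J_s \equiv \frac{qD_p p_{n0}}{L_p} + \frac{qD_n n_{p0}}{L_n} \tag{33}$$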
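Equations (29), (32) and (33) translate directly into a small calculation, sketched below. The diffusion constants and lifetimes passed in are hypothetical placeholder values chosen only to make the example run (they are not given in these notes), as are the doping densities; the function names are illustrative.

```python
import math

Q = 1.602e-19   # electronic charge, C
VT = 0.026      # K_B*T/q in volts (rounded room-temperature value)
NI = 1e10       # intrinsic carrier concentration, per cm^3

def saturation_current_density(na, nd, d_p, d_n, tau_p, tau_n, ni=NI):
    """J_s of eq. (33) in A/cm^2; D in cm^2/s, tau in s, doping per cm^3."""
    l_p = math.sqrt(d_p * tau_p)     # hole diffusion length, eq. (29)
    l_n = math.sqrt(d_n * tau_n)     # electron diffusion length (same form)
    p_n0 = ni ** 2 / nd              # equilibrium holes on the N side
    n_p0 = ni ** 2 / na              # equilibrium electrons on the P side
    return Q * d_p * p_n0 / l_p + Q * d_n * n_p0 / l_n

def diode_current_density(v, js, vt=VT):
    """Total current density of eq. (32); expm1 keeps accuracy near v = 0."""
    return js * math.expm1(v / vt)

# Hypothetical material parameters, for illustration only
js = saturation_current_density(na=1e16, nd=1e18,
                                d_p=12.0, d_n=35.0,
                                tau_p=1e-6, tau_n=1e-6)

for v in (-1.0, 0.0, 0.3, 0.6):
    print(f"V = {v:+.1f} V  ->  J = {diode_current_density(v, js):.3e} A/cm^2")
```

In reverse bias the current saturates at $-J_s$, and in forward bias it grows by a factor of ten for roughly every 60 mV, both of which follow from the exponential in equation (32).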

4 The MOS Capacitor

It is important to understand the MOS capacitor in order to understand the behaviour of the MOS transistor. Before we describe the MOS structure, it is useful to review the basic electrostatics as applied to parallel plate capacitors. We shall then go on to analyse the MOS structure.




4.1 The Parallel Plate Capacitor

The parallel plate capacitor consists of two parallel metallic plates of area $A$, separated by an insulator of thickness $t_i$ and dielectric constant $\varepsilon$. If we place a charge $Q$ on the upper plate, it attracts charges of opposite sign in the bottom plate, while repelling charges of the same sign.

If the bottom plate is connected to ground, the repelled charge flows to ground. Now the two capacitor plates hold equal and opposite charge. This charge resides just next to the insulator on either side of it. This is true, whatever the quantity or sign of charge placed on the upper plate. The inducing and induced charge are always separated by the thickness of the insulator, $t_i$. Therefore this structure has a constant capacitance given by:

$$C_{total} = \frac{A\varepsilon}{t_i}$$

Since there are no charges inside the dielectric, the electric field in the insulator is constant and the electrostatic potential changes linearly from one plate to the other.

4.2 The MOS capacitor

[Figure: parallel plate capacitor with charge +Q on the upper plate and -Q on the lower plate, separated by an insulator of thickness ti]

In a MOS capacitor, we replace the lower plate by a semiconductor. Unlike a metal, a semiconductor can have charges distributed in its bulk.

[Figure: MOS capacitor cross-section showing metal, insulator (oxide), semiconductor with its depletion region, and metal]

For the sake of an example, let us consider a P type semiconductor (Si) doped to $10^{16}$ atoms/cm$^3$. As we know, holes outnumber electrons in this semiconductor by an extremely large factor. If we place a negative charge on the upper plate, holes will be attracted by this charge, and will accumulate near the silicon-insulator interface. This situation is analogous to the parallel plate capacitor and thus the capacitance will be the same as that for a parallel plate capacitor.

If, however, we place a positive charge on the upper plate, negative charges will be attracted by it and positive charges will be repelled. In a P type semiconductor, there are very few electrons. The negative charge is provided by the ionised acceptors after the holes have been pushed away from them. But the acceptors are fixed in their locations and cannot be driven to the edge of the insulator. Therefore, the distance between the induced and inducing charges increases, so the capacitance is lower as compared to the parallel plate capacitor. As more and more positive charge is placed on the upper plate, holes from a thicker slice of the semiconductor are driven away, and the incremental induced charge is farther from the inducing charge. Thus the capacitance continues to decrease.

This does not, however, continue indefinitely. We know from the law of mass action that as the hole density reduces, the electron density increases. At some point, the hole density is reduced and the electron density increased to such an extent that electrons now become the "majority" carriers near the interface. This is called inversion. Beyond this point, more positive charge on the upper plate is answered by more electrons in the semiconductor. But the electrons are mobile, and will be attracted to the silicon-insulator interface. Therefore, the capacitance quickly increases to the parallel plate value.




Figure 3: Low frequency capacitance for a MOS capacitor [capacitance versus voltage V, with the accumulation, depletion and inversion regions marked]

4.3 Quantitative Analysis

Consider a one dimensional representation of the MOS structure as shown in the figure below.

[Figure: one dimensional representation of the M O S M structure]

The origin is assumed to be at the silicon-oxide interface and the positive x direction is into the bulk of silicon. Using a one dimensional analysis, we want to relate the semiconductor charge to the applied gate voltage. In a practical case, there is a potential difference between two dissimilar materials in contact. Also, the silicon-oxide interface will have some fixed charge sitting there. However, we consider the ideal case first, where there is no built-in contact potential between the semiconductor and the metal, and there is no interface charge.

4.3.1 Ideal Case

Let the back surface of Si be at zero potential and the voltage applied to the gate terminal be $V_g$. Let the electrostatic potential at any point $x$ be denoted by $\phi(x)$ and let the potential at the silicon-oxide interface be $\phi_s$.

[Figure: M O S M structure with the Gaussian box]

We construct a Gaussian box passing through the interface and extending to $+\infty$. According to Gauss's law, the integral of the outward pointing D vector around the box should be equal to the charge contained inside. The only boundary where D is non zero is the one passing through the interface. Therefore,

$$\text{Area} \times \varepsilon_{ox}\frac{\phi_s - V_g}{t_{ox}} = \text{Total charge in silicon}$$

If we define $Q_{si}$ to be the semiconductor charge per unit area, and $C_{ox}$ to be the parallel plate capacitance per unit area, we get

$$V_g = \phi_s - \frac{Q_{si}}{C_{ox}}$$

Thus, the surface potential and the applied gate voltage can be related to each other. If the surface potential is known, we can evaluate the semiconductor charge by integrating Poisson's equation in the semiconductor once.




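We can write Poisson's equation in the semiconductor as

$$\nabla\cdot D = \rho$$

or

$$-\varepsilon_{si}\frac{\partial^2\phi}{\partial x^2} = q(N_d^+ - N_a^- + p - n)$$

Since the electrostatic potential is dependent only on $x$, we can change partial derivatives to total derivatives.

$$-\frac{d^2\phi}{dx^2} = \frac{d}{dx}\left(-\frac{d\phi}{dx}\right) = \frac{d}{dx}(E)$$

where $E$ is the electrostatic field. Changing the variable from $x$ to $\phi$,

$$-\frac{d^2\phi}{dx^2} = \frac{dE}{dx} = \left(\frac{d\phi}{dx}\right)\frac{d}{d\phi}(E) = -E\frac{d}{d\phi}(E) = -\frac{1}{2}\frac{d}{d\phi}\left(E^2\right)$$

If we define $u \equiv \beta\phi$ where $\beta \equiv \frac{q}{K_B T}$, we get

$$-\frac{d^2\phi}{dx^2} = -\frac{1}{2}\frac{d}{d\phi}\left(E^2\right) = -\frac{\beta}{2}\frac{d}{du}\left(E^2\right) \tag{34}$$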

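The right hand side of Poisson's equation represents the charge density. In the absence of an applied voltage, this must be zero everywhere. Therefore,

$$q(N_d^+ - N_a^- + p_0 - n_0) = 0$$

where $p_0$ and $n_0$ represent the hole and electron density in the absence of an applied field. Therefore,

$$N_d^+ - N_a^- = -(p_0 - n_0)$$

Substituting equation (34) and the above in Poisson's equation,

$$-\frac{\beta\varepsilon_{si}}{2}\frac{d}{du}\left(E^2\right) = q\left[p - p_0 - (n - n_0)\right]$$

so

$$\frac{d}{du}\left(E^2\right) = -\frac{2qp_0}{\beta\varepsilon_{si}}\left[\frac{p}{p_0} - 1 - \frac{n_0}{p_0}\left(\frac{n}{n_0} - 1\right)\right]$$

From equation (8), $n = n_0 e^{u}$ and $p = p_0 e^{-u}$. So,

$$\frac{d}{du}\left(E^2\right) = -\frac{2qp_0}{\beta\varepsilon_{si}}\left[e^{-u} - 1 - \frac{n_0}{p_0}\left(e^{u} - 1\right)\right]$$

This can be integrated from $x = \infty$ (where $E = 0$ and $u = 0$) to $x$ to give

$$E^2 = \frac{2qp_0}{\beta\varepsilon_{si}}\left[e^{-u} - 1 + u + \frac{n_0}{p_0}\left(e^{u} - 1 - u\right)\right]$$

Therefore

$$E = \pm\sqrt{\frac{2qp_0}{\beta\varepsilon_{si}}}\left[e^{-u} - 1 + u + \frac{n_0}{p_0}\left(e^{u} - 1 - u\right)\right]^{\frac{1}{2}}$$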




And thus, the displacement vector D can be evaluated as:

$$D = \varepsilon_{si}E = \pm\sqrt{\frac{2qp_0\varepsilon_{si}}{\beta}}\left[e^{-u} - 1 + u + \frac{n_0}{p_0}\left(e^{u} - 1 - u\right)\right]^{\frac{1}{2}} \tag{35}$$

This equation permits us to calculate $D$ $\left(= -\frac{\varepsilon_{si}}{\beta}\frac{\partial u}{\partial x}\right)$ from $u$. In fact, if $u$ is very small, the exponentials in $u$ can be expanded to second order. The first two terms cancel with 1 and $u$, leaving

$$\frac{\partial u}{\partial x} \simeq \mp\sqrt{\frac{q\beta p_0}{\varepsilon_{si}}\left(1 + \frac{n_0}{p_0}\right)}\;u$$

If we take $n_0 \ll p_0$, we get exponential solutions for $u$ with a characteristic length $L_D = \sqrt{\frac{\varepsilon_{si}}{q\beta p_0}}$. This implies that small local perturbations in potential tend to decrease exponentially, with this characteristic length. This length is known as the extrinsic Debye length.
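To get a feel for the size of this length, the following sketch evaluates $L_D$ for P type silicon doped to $10^{16}$/cm$^3$. The silicon permittivity used ($11.7 \times 8.854 \times 10^{-14}$ F/cm) is an assumed textbook value not given in these notes, and $K_B T/q$ is the rounded 26 mV figure; the function name is an illustrative choice.

```python
import math

Q = 1.602e-19               # electronic charge, C
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon, F/cm (assumed value)
VT = 0.026                  # K_B*T/q in volts, i.e. 1/beta (rounded)

def debye_length(p0, vt=VT):
    """Extrinsic Debye length in cm: L_D = sqrt(eps_si / (q * beta * p0))."""
    return math.sqrt(EPS_SI * vt / (Q * p0))

ld = debye_length(1e16)
print(f"L_D = {ld:.3e} cm = {ld * 1e7:.1f} nm")

# A small perturbation u_s at x = 0 decays as u(x) = u_s * exp(-x / L_D)
u_s = 0.1
for multiple in (1, 2, 5):
    print(f"x = {multiple} L_D: u = {u_s * math.exp(-multiple):.4f}")
```

Since $L_D$ scales as $1/\sqrt{p_0}$, heavier doping screens potential perturbations over a shorter distance.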

By putting $u = u_s$ in eq. 35, we get the D vector at the surface. We construct a Gaussian box passing through the interface and enclosing the semiconductor (as described in section 4.3.1). The charge contained in the box is then the integral of the outward pointing D vector over the surface of the box. D is non zero only at the interface. The outward pointing D is along the negative x axis. Therefore, by application of Gauss's theorem,

$$\text{Sem. charge} = \text{Area} \times (-D)$$

Hence the charge in the semiconductor per unit area is:

$$Q_{si} = \mp\sqrt{2}\,\frac{\varepsilon_{si}}{\beta L_D}\left[e^{-u_s} - 1 + u_s + \frac{n_0}{p_0}\left(e^{u_s} - 1 - u_s\right)\right]^{\frac{1}{2}} \tag{36}$$

where

$$u_s \equiv \beta\phi_s, \qquad \beta \equiv \frac{q}{K_B T}$$

and

$$L_D \equiv \sqrt{\frac{\varepsilon_{si}}{q\beta p_0}} = \text{the extrinsic Debye length}$$

Notice that $Q_{si}$ is the charge in the semiconductor per unit area. In this treatment, we shall use symbols of the type $Q$ and $C$ with various subscripts to denote the corresponding charges and capacitance values per unit area. $Q_{si}$ consists of mobile as well as fixed charge. The mobile charge is contributed by holes when $u_s < 0$ and by electrons when $u_s > 0$ (for a P type semiconductor). As we shall see later, the mobile electron charge is substantial only when the positive surface potential exceeds a threshold value.

The fixed charge is contributed by the depletion charge when the surface potential is positive. The depletion charge per unit area can be calculated by the depletion formula.

$$Q_{depl} = -qN_aX_d = -\sqrt{2qN_a\varepsilon_{si}\phi_s} \qquad (\phi_s > 0)$$

A somewhat more accurate expression for depletion charge accounts for the slightly lower charge density at the edge of the depletion region by subtracting $K_B T/q$ from $\phi_s$.

$$Q_{depl} = -qN_aX_d = -\sqrt{2qN_a\varepsilon_{si}(\phi_s - K_B T/q)} \qquad (\phi_s > K_B T/q) \tag{37}$$
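The relation between the total charge of equation (36) and the depletion charge of equation (37) can be explored numerically, as in the sketch below. It assumes P type silicon with $N_a = 10^{16}$/cm$^3$, the rounded values of $K_B T/q$ and $n_i$ used earlier, and an assumed silicon permittivity of $11.7 \times 8.854 \times 10^{-14}$ F/cm; the function names are illustrative, and the sign is chosen so that the charge is negative for positive surface potential.

```python
import math

Q = 1.602e-19               # electronic charge, C
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon, F/cm (assumed value)
VT = 0.026                  # K_B*T/q in volts, i.e. 1/beta (rounded)
NI = 1e10                   # intrinsic carrier concentration, per cm^3

def semiconductor_charge(phi_s, na, ni=NI, vt=VT):
    """Total semiconductor charge per unit area (C/cm^2) from eq. (36), P type."""
    p0 = na
    n0 = ni ** 2 / na
    us = phi_s / vt
    ld = math.sqrt(EPS_SI * vt / (Q * p0))            # extrinsic Debye length
    f = math.exp(-us) - 1 + us + (n0 / p0) * (math.exp(us) - 1 - us)
    magnitude = math.sqrt(2) * EPS_SI * vt / ld * math.sqrt(max(f, 0.0))
    return -math.copysign(magnitude, us)              # negative for phi_s > 0

def depletion_charge(phi_s, na, vt=VT):
    """Depletion charge per unit area (C/cm^2) from eq. (37); needs phi_s > vt."""
    return -math.sqrt(2 * Q * na * EPS_SI * (phi_s - vt))

NA = 1e16
two_phi_f = 2 * VT * math.log(NA / NI)
print(f"2*Phi_F = {two_phi_f:.3f} V")
for phi_s in (0.2, 0.4, 0.6, 0.8, 0.9):
    q_tot = semiconductor_charge(phi_s, NA)
    q_dep = depletion_charge(phi_s, NA)
    print(f"phi_s = {phi_s:.1f} V: Q_si = {q_tot:.3e}, Q_depl = {q_dep:.3e} C/cm^2")
```

Below $2\Phi_F$ the two expressions agree closely, while above it the exponential electron term makes the total charge grow much faster than the depletion charge.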




Figure 4: semiconductor charge as a function of surface potential [absolute semiconductor charge (C/cm^2) on a logarithmic axis from 1e-09 to 1e-05, over the range -0.4 V to 1 V; curves labelled Qtotal, QDepl. and Maj. Carrier Charge]

Calculated values for the total semiconductor charge per unit area (i.e. inclusive of depletion and mobile charge) and just the depletion charge per unit area have been plotted in figure 4 for a P type semiconductor doped to $10^{16}/\text{cm}^3$. For small positive surface potential, the total semiconductor charge contains only depletion charge. However, beyond a surface potential near $2\Phi_F$, the total charge exceeds the depletion charge very rapidly. This additional charge is due to mobile minority carriers (in this case, electrons).

4.3.2 Practical case

A practical MOS structure will differ from the ideal case assumed above in a few respects. There is a built-in potential difference between the metal used and Si, due to the difference between their work functions. This shifts the relationship between $V_g$ and $\phi_s$. Also, there is a fixed oxide charge which resides essentially at the silicon-oxide interface. Thus, the total charge in the Gaussian box includes this fixed charge and the semiconductor charge. These two non-idealities can be accounted for by modifying the relationship between $V_g$ and $\phi_s$ to be

$$V_g = \Phi_{ms} + \phi_s - \frac{Q_{si} + Q_{ox}}{C_{ox}} \tag{38}$$

where $\Phi_{ms}$ is the metal to semiconductor work function difference.

Figure 5 shows the surface potential as a function of applied voltage for a MOS capacitor with oxide thickness of 22.5 nm, substrate doping of $10^{16}$/cc, oxide charge of $4 \times 10^{10}q$ and aluminium as the gate metal. The surface potential changes quite slowly as a function of gate voltage in the accumulation and inversion regions.

Figure 5: Surface potential as a function of gate voltage [surface potential (V), from -0.2 to 1.0, versus gate voltage (V), from -4.0 to 4.0]

The absolute value of semiconductor charge has been plotted as a function of applied gate voltage in figure 6. (The charge is actually negative for positive gate voltages.) As one can see, for small positive gate voltages, the entire semiconductor charge is depletion charge. As the voltage exceeds a threshold voltage, the total charge becomes much larger than the depletion charge. The excess charge is provided by mobile electron charges. This is the inversion region of operation, where electrons become the majority carriers near the surface in a p type semiconductor. Notice that the depletion charge is practically constant in this region. This region begins when the surface potential exceeds $2\Phi_F$.

Figure 6: Semiconductor charge as a function of gate voltage [absolute semiconductor charge (C/cm^2) on a logarithmic axis from 1e-09 to 1e-06, versus gate voltage from -2 V to 5 V; curves labelled Q depletion, Qtotal and Qinv]
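The mapping from surface potential to gate voltage in equation (38) can be sketched as below for the depletion regime. Several inputs are assumptions made only for illustration: the oxide permittivity ($3.9 \times 8.854 \times 10^{-14}$ F/cm) and silicon permittivity are assumed textbook values, the oxide charge of $4 \times 10^{10}q$ is interpreted as a density per cm$^2$, $\Phi_{ms}$ is left at zero because its value is not given in these notes, and the semiconductor charge is taken from the simple depletion formula rather than the full expression (36).

```python
import math

Q = 1.602e-19               # electronic charge, C
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon, F/cm (assumed value)
EPS_OX = 3.9 * 8.854e-14    # permittivity of the oxide, F/cm (assumed value)

def gate_voltage(phi_s, q_si, c_ox, phi_ms=0.0, q_ox=0.0):
    """Eq. (38): Vg = Phi_ms + phi_s - (Q_si + Q_ox)/C_ox, charges per unit area."""
    return phi_ms + phi_s - (q_si + q_ox) / c_ox

T_OX = 22.5e-7              # oxide thickness of 22.5 nm expressed in cm
NA = 1e16                   # substrate doping, per cm^3
c_ox = EPS_OX / T_OX        # parallel plate capacitance per unit area, F/cm^2

phi_s = 0.4                                          # surface potential in depletion, V
q_si = -math.sqrt(2 * Q * NA * EPS_SI * phi_s)       # simple depletion formula
q_ox = 4e10 * Q                                      # oxide charge, assumed per cm^2

vg_ideal = gate_voltage(phi_s, q_si, c_ox)           # ideal case: no Phi_ms, no Q_ox
vg_with_qox = gate_voltage(phi_s, q_si, c_ox, q_ox=q_ox)

print(f"C_ox = {c_ox:.3e} F/cm^2")
print(f"ideal Vg = {vg_ideal:.3f} V, with oxide charge Vg = {vg_with_qox:.3f} V")
```

Positive fixed oxide charge lowers the gate voltage needed to reach a given surface potential by $Q_{ox}/C_{ox}$; a work function difference would add a further constant shift $\Phi_{ms}$.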

5 The MOS Transistor

Inversion converts a p type semiconductor to n type at the surface. We can use this fact to construct a transistor. We place semiconductor regions strongly doped to N type on either side of a MOS capacitor made using P type silicon. Now if we try to pass a current between these two N regions when inversion has not occurred, we encounter series connected NP and PN diodes on the way. Whatever the polarity of the voltage applied to pass current, one of these will be reverse biased and practically no current will flow.

Figure 7: A MOS Transistor [P type Si substrate with two n+ regions, labelled S and D, and the GATE between them]

However, after inversion, the intervening P region would have been converted to N type. Now there are no junctions as the whole surface region is n type. Current can now be easily passed between the two n regions. This structure is an n channel MOS transistor. PMOS transistors can be similarly made using P regions on either side of a MOS capacitor made on n type silicon.

When current flows in an n channel transistor, electrons are supplied by the more negative of the two n+ contacts. This is called the source electrode. The more positive n+ contact collects the electrons and is called the drain. The current in the transistor is controlled by the metal electrode on top of the oxide. This is called the gate electrode.

6 I-V characteristics of a MOS transistor

A quantitative derivation of the current-voltage characteristics of the MOS device is complicated by the fact that it is inherently a two dimensional device. The vertical field due to the gate voltage sets up a mobile charge density in the channel region as seen in figure 6. The horizontal field due to the source-drain voltage causes these charges to move, and this constitutes the drain current. Therefore, a two dimensional analysis is required to calculate the transistor current, which can be quite complex. However, reasonably simple models can be derived by making several simplifying assumptions.

6.1 A simple MOS model

We make the following simplifying assumptions:

• The vertical field is much larger than the horizontal field. Then, the resultantfield is nearly vertical, and the results derived for the 1 dimensional analysisfor the MOS capacitor can be used to calculate the point-wise charge densityin the channel. This is known as the gradual channel approximation. Accuratenumerical simulations have shown that this approximation is valid in most cases.

• The source is shorted to the bulk.

• The gate and drain voltages are such that a continuous inversion region exists all the way from the source to the drain.

• The depletion charge is constant along the channel.

• The total current is dominated by drift current.

• The mobility of carriers is constant along the channel.

Figure 8 shows the co-ordinate system used for evaluating the drain current. The x axis points into the semiconductor, the y axis is from source to the drain and the z axis is along the width of the transistor. The origin is at the source end of the channel. We represent the channel voltage as V(y), which is 0 at the source end and Vd at the drain end. We assume the current to be made up of just the drift current. Since we are carrying out a quasi 2 dimensional analysis, all variables are assumed to be constant along the z axis. Let n(x,y) be the concentration of mobile carriers (electrons for an n channel device) at the position x,y (for any z). The drift current density at a point is

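$$
\begin{aligned}
J &= \text{no. of carriers} \times \text{charge per carrier} \times \text{velocity} \\
  &= n(x,y) \times (-q) \times \mu \times \left(-\frac{\partial V(y)}{\partial y}\right) \\
  &= \mu\, n(x,y)\, q\, \frac{\partial V(y)}{\partial y}
\end{aligned}
$$

[Figure 8: Coordinate system used for analysing the MOS transistor. The sketch shows source S, drain D, the X, Y and Z axes, channel width W, channel length L and an element dy.]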

Integrating the current density over a semi-infinite plane at the channel position y (as shown in figure 8) will then give the drain current.

$$I_d = \int_{x=0}^{\infty}\int_{z=0}^{W} \mu\, n(x,y)\, q\, \frac{\partial V(y)}{\partial y}\, dz\, dx$$

Since there is no dependence on z, the z integral just gives a multiplication by W. Therefore,

$$I_d = \mu W q \int_{x=0}^{\infty} n(x,y)\, \frac{\partial V(y)}{\partial y}\, dx$$

The value of n(x,y) is non zero only in a very narrow channel near the surface. We can assume that $\partial V(y)/\partial y$ is constant over this depth. Then,

$$I_d = \mu W q\, \frac{\partial V(y)}{\partial y} \int_{x=0}^{\infty} n(x,y)\, dx$$

But $q\int_{x=0}^{\infty} n(x,y)\,dx = -Q_n(y)$, where $Q_n(y)$ is the electron charge per unit area in the semiconductor at point y in the channel. ($Q_n(y)$ is negative, of course.) Therefore,

$$I_d = -\mu W\, \frac{\partial V(y)}{\partial y}\, Q_n(y) \tag{39}$$

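Integrating the drain current along the channel gives

$$\int_0^L I_d\, dy = -\mu W \int_0^L Q_n(y)\, \frac{\partial V(y)}{\partial y}\, dy$$

$$I_d \times L = -\mu W \int_0^{V_d} Q_n(y)\, dV(y)$$

$$\text{So,}\quad I_d = -\mu \frac{W}{L} \int_0^{V_d} Q_n(y)\, dV(y)$$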



We now use the assumption that the surface potential due to the vertical field saturates around $2\Phi_F$ if we are in the inversion region. Therefore, the total surface potential at point y is $V(y) + 2\Phi_F$. Now, by Gauss law and continuity of the normal component of D at the interface,

$$C_{ox}\left(V_g - \Phi_{MS} - \phi_s\right) = -(Q_{si} + Q_{ox})$$

therefore,

$$-Q_{si} = C_{ox}\left(V_g - \Phi_{MS} - V(y) - 2\Phi_F + Q_{ox}/C_{ox}\right)$$

However, $Q_{si} = Q_n + Q_{depl}$. So

$$-Q_n(y) = -Q_{si}(y) + Q_{depl} = C_{ox}\left(V_g - \Phi_{MS} - V(y) - 2\Phi_F + (Q_{ox} + Q_{depl})/C_{ox}\right)$$

We have assumed the depletion charge to be constant along the channel. Let us define

$$V_T \equiv \Phi_{MS} + 2\Phi_F - \frac{Q_{ox} + Q_{depl}}{C_{ox}}$$

then

$$-Q_n(y) = C_{ox}\left(V_g - V_T - V(y)\right)$$

and therefore,

$$I_d = \mu C_{ox} \frac{W}{L} \int_0^{V_d} \left(V_g - V_T - V(y)\right) dV(y) = \mu C_{ox} \frac{W}{L}\left[(V_g - V_T)V_d - \frac{1}{2}V_d^2\right] \tag{40}$$

This derivation gives a very simple expression for the drain current. However, it requires a lot of simplifying assumptions, which limit the accuracy of this model. If we do not assume a constant depletion charge along the channel, we can apply the depletion formula to get its dependence on V(y).

$$Q_{depl} = -\sqrt{2\varepsilon_{si} q N_a \left(V(y) + 2\Phi_F\right)}$$

then,

$$-Q_n = C_{ox}\left(V_g - \Phi_{MS} - V(y) - 2\Phi_F\right) + Q_{ox} - \sqrt{2\varepsilon_{si} q N_a \left(V(y) + 2\Phi_F\right)}$$

which leads to

$$I_d = \mu C_{ox} \frac{W}{L}\left[\left(V_g - \Phi_{MS} - 2\Phi_F + \frac{Q_{ox}}{C_{ox}}\right)V_d - \frac{1}{2}V_d^2 - \frac{2}{3}\,\frac{\sqrt{2\varepsilon_{si} q N_a}}{C_{ox}}\left((V_d + 2\Phi_F)^{3/2} - (2\Phi_F)^{3/2}\right)\right]$$

This is a more complex expression, but gives better accuracy.
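The 3/2-power term in the expression above comes from integrating the depletion charge over the channel voltage. As a worked step, using only the symbols already defined (the integration variable V stands for V(y)):

```latex
\int_0^{V_d} \sqrt{2\varepsilon_{si} q N_a \,(V + 2\Phi_F)}\; dV
  = \frac{2}{3}\sqrt{2\varepsilon_{si} q N_a}\,
    \left[(V_d + 2\Phi_F)^{3/2} - (2\Phi_F)^{3/2}\right]
```

Dividing this result by $C_{ox}$ and subtracting it inside the bracket gives the last term of the drain current expression.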




6.2 Modeling the saturation region

The treatment in the previous section is valid only if there is an inversion layer all the way from the source to the drain. For high drain voltage, the local vertical field near the drain is not adequate to take the semiconductor into inversion. Several models have been used to describe the transistor behaviour in this regime. The simplest of these defines a saturation voltage at which the channel just pinches off at the drain end. The current calculated for this voltage by the above models is then supposed to remain constant at this value for all higher drain voltages. The pinchoff voltage is the drain voltage at which the channel just vanishes near the drain end. Therefore, at this point the gate voltage Vg is just one threshold voltage above the drain voltage Vd. Thus, at this point,

$$V_{dsat} = V_g - V_T$$

The current calculated at $V_{dsat}$ will be denoted as $I_{dss}$. Thus,

$$I_{dss} = \mu C_{ox} \frac{W}{L}\left[(V_g - V_T)^2 - \frac{1}{2}(V_g - V_T)^2\right]$$

for the simple transistor model. Thus

$$I_{dss} = \frac{1}{2}\mu C_{ox} \frac{W}{L}(V_g - V_T)^2 \tag{41}$$

The drain current is supposed to remain constant at this Vd independent value for all drain voltages > Vg − VT.
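The simple model of eqs. 40 and 41 can be sketched in a few lines of Python. This is only an illustration: the function name, the lumped parameter `k` (standing for µCoxW/L) and the numerical values are assumptions, and returning zero current below threshold is an added convention that the derivation above does not cover.

```python
def drain_current(vg, vd, vt, k):
    """Drain current of the simple model; k stands for mu * Cox * W / L."""
    v_ov = vg - vt                  # gate overdrive, equal to Vdsat
    if v_ov <= 0:
        return 0.0                  # assumed: no inversion layer, no current
    if vd <= v_ov:
        return k * (v_ov * vd - 0.5 * vd ** 2)   # inversion all along the channel, eq. 40
    return 0.5 * k * v_ov ** 2                   # pinched off at the drain, eq. 41


if __name__ == "__main__":
    k, vt = 1e-3, 1.0               # assumed values: k in A/V^2, VT in volts
    for vd in (0.5, 2.0, 4.0):
        print(vd, drain_current(3.0, vd, vt, k))
```

Both branches give the same value at Vd = Vg − VT, so the current is continuous at the changeover point.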

6.2.1 Early Voltage approach

Assuming a constant current in the saturation region leads to an infinite output resistance. This can lead to exaggerated estimates of gain from an amplifier. Therefore, we need a more realistic model for the transistor current in the saturation region. One of these is a generalisation of the model proposed by James Early for bipolar transistors. This model is not strictly applicable to MOS transistors. However, due to its numerical simplicity, it is often used in compact models for circuit simulation.

A geometrical interpretation of the Early model states that the drain current increases linearly in the saturation region with drain voltage, and if saturation characteristics for different gate voltages are extrapolated backwards, they will all cut the drain voltage axis at the same (negative) drain voltage point. The absolute value of this voltage is called the Early Voltage VE.

The current equations in saturation mode now become:

$$I_{dss} \equiv I_d(V_g, V_{dss})$$

$$I_d = I_{dss}\,\frac{V_d + V_E}{V_{dss} + V_E} \qquad \text{for } V_d > V_{dss} \tag{42}$$

Any model can be used for calculating the drain current for Vd < Vdss. The value of Vdss will be determined by considerations of continuity of the drain current and its derivative at the changeover point from linear to saturation regime. For example, if we use the simple model described in eq. 40,

$$\frac{\partial I_d}{\partial V_d} = \mu C_{ox} \frac{W}{L}\left(V_g - V_T - V_d\right) \qquad \text{for } V_d \le V_{dss}$$

and

$$\frac{\partial I_d}{\partial V_d} = \frac{I_{dss}}{V_{dss} + V_E} \qquad \text{for } V_d \ge V_{dss}$$

where

$$I_{dss} \equiv \mu C_{ox} \frac{W}{L}\left[(V_g - V_T)V_{dss} - \frac{1}{2}V_{dss}^2\right]$$

On matching the value of $\partial I_d/\partial V_d$ on both sides of $V_{dss}$, we get

$$V_{dss} = V_E\left(\sqrt{1 + \frac{2(V_g - V_T)}{V_E}} - 1\right)$$

In practice, VE is much larger than Vg − VT. If we expand the above expression, we find that to first order the value of Vdss remains the same as the one used in the simple model - that is, Vg − VT. Expansion to second order gives

$$V_{dss} \simeq (V_g - V_T)\left(1 - \frac{V_g - V_T}{2V_E}\right) \tag{43}$$

6.2.2 Simulation Model

Since the value of Vdss does not change substantially from the ideal saturation case, a simpler approach can be tried. The drain current is calculated using the ideal saturation model and its value is multiplied by a correction factor = (1 + λVd) in saturation as well as in linear regime. This automatically assures continuity of Id and its derivative. λ is a fit parameter, whose value is ≈ 1/VE. This approach is used in SPICE, a popular circuit simulation program.
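The correction factor approach can be sketched as follows. This is an illustrative reading of the description above and not the actual SPICE source: the function name, the lumped parameter `k` (for µCoxW/L), the zero current below threshold and the numerical values are all assumptions.

```python
def drain_current_lambda(vg, vd, vt, k, lam):
    """Ideal simple-model current multiplied by (1 + lam * vd) in both regimes."""
    v_ov = vg - vt
    if v_ov <= 0:
        return 0.0                                  # assumed: off below threshold
    if vd <= v_ov:
        ideal = k * (v_ov * vd - 0.5 * vd ** 2)     # linear regime, eq. 40
    else:
        ideal = 0.5 * k * v_ov ** 2                 # ideal saturation, eq. 41
    return ideal * (1.0 + lam * vd)


if __name__ == "__main__":
    k, vt, lam = 1e-3, 1.0, 0.02    # assumed values; lam is roughly 1/VE
    for vd in (1.0, 2.0, 3.0, 4.0):
        print(vd, drain_current_lambda(3.0, vd, vt, k, lam))
```

Because the slope of the ideal current is zero at Vd = Vg − VT, the slope of the corrected current on either side of that point is λ times the ideal saturation current, which is why the derivative stays continuous.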




Hardware Description Languages: Basic Concepts

Dinesh Sharma

Microelectronics Group, EE Department, IIT Bombay, Mumbai

May 2006

Outline: The Design Process; Basic HDL concepts; Concurrent and sequential Descriptions





The Design Process

We ask ourselves the question: What is Electronic Design?

Given specifications, we want to develop a circuit by connecting known electronic devices, such that the circuit meets the given specifications.

“Specifications” refer to the description of the desired behaviour of the circuit.

“Known” devices are those whose behaviour can be modeled by known equations or algorithms, with known values of parameters.






Electronic Design

Electronic Design is the process of converting a behavioural description (What happens when ..)

to

a structural description (What is connected to what and how ..)

After conversion to a structural description, we may need to do “Physical Design” which involves choosing device sizes, placement of blocks, routing of interconnect lines etc.

This part is already done for us in FPGA based design.






Conquest over Complexity

The main challenge for modern electronic design is that the circuits being designed these days are extremely complex.

While IC technology has moved at a rapid pace, capabilities of the human brain have remained the same :-(

The human mind cannot handle too many objects at the same time. So a complex design has to be broken down into a small number of ‘manageable’ objects.

If each object is still too complex to handle, the above process has to be repeated recursively. This leads to hierarchical design.

Systematic procedures have to be developed to handle complexity.






A page out of the software designer’s book

We must learn from the experience of software designers for handling complexity.

We must adopt:

Hierarchical Design.

Modular architecture.

Text based, rather than pictorial descriptions.

Re-use of existing resources






Abstraction Levels

[Figure: Y chart (Gajski and Kuhn) of the types and levels of modeling. The three axes are Functional, Structural and Geometric; the level of abstraction runs from low to high along each axis.]

Abstraction levels refer to functional, structural or geometric views of the design.

Top down design begins with higher levels of abstraction.

As we go to lower levels of abstraction, the level of detail goes up.

It is advantageous to do as much work as possible at higher levels of abstraction, when the detail is low.






Abstraction Levels: Geometric



[Figure: geometric axis of the Y chart, from high to low abstraction: Floor Plan, Unit Cells, Stick Diagrams, Polygons]

At high levels of geometric abstraction, we view the layout as a floor plan with blocks.

At lower levels, we look at basic cells.

At lower levels still, we view transistors as stick diagrams.

At the lowest level, we have to worry about all rectangles and polygons making up the layout.






Abstraction Levels: Structural

[Figure: structural axis of the Y chart, from high to low abstraction: Functional Blocks, Registers, Gates, Transistors]

At high levels of abstraction, we view the structure in terms of functional blocks or IP cores.

At lower levels, we see it in terms of registers and simple blocks.

At still lower levels, we view it in terms of logic gates etc.

At the lowest level, we have to see full details at transistor level.






Abstraction Levels: Functional

[Figure: functional axis of the Y chart, from high to low abstraction: Specifications, Algorithms, Data and Control Flow, Equations]

At the top level, we have the functional specifications.

At lower levels, we view the design in terms of protocols and algorithms.

At still lower levels, we view it in terms of data and control flow etc.

At the highest level of detail, we have to worry about all the governing equations at all nodes.






Design Flow: System and logic level

[Flowchart with two “OK?” decision points: System Partitioning, Block Specification, Block Level Simulation, Logic Design, Logic Simulation]






Design Flow: Physical level

[Flowchart with two “OK?” decision points: Physical Design, Layout and Back Extraction, Resimulation and Timing, Mask Making, Fabrication, Test, Debug]






Hierarchical Design

The design process has to be hierarchical.

A complex circuit is converted to a structural description of blocks which have not yet been designed - but whose behaviour can be described.

Each of these blocks is then designed as if it was an independent design problem of lower complexity.

This process is continued till all blocks are broken down into “known” devices.

It is essential that any departure from proper operation is detected early - at a low complexity level.

A hardware description language must be able to simulate a system whose components have been designed to different levels of detail.






But Hardware is different!

Hardware components are concurrent (all parts work at the same time).

Whereas (traditional) software is sequential (executes an instruction at a time).

Description of hardware behaviour has timing as an integral part.

Traditional software is not real time sensitive.

Therefore, design of complex hardware involves many more basic concepts beyond those of programming languages.






Hardware Description Languages

Hardware description languages need the ability to

describe and simulate at behavioural, structural and mixed levels,

and to synthesize (structure from behaviour).






Basic HDL concepts

Timing and Delays

Concurrency

Hardware Simulation process, which involves: Analysis, Elaboration and Simulation

Simulation proceeds in two distinct phases: Signal update and Selective re-simulation







HDL Uses

Hardware Description Languages are used for:

Description of Interfaces, Behaviour and Structure

Test Benches

Synthesis







Delays

How do we describe delays?

[Figure: a block with input In and output Out, Delay = 30 uS]

Out <= In AFTER 30 uS;

Is this description unambiguous?







Delay: Inertial

[Figure: circuit with input In, internal node x and output Out, with waveforms of In, x and Out; delay 30 uS]







Delay: Transport

[Figure: optical fibre link from In to Out with Delay = 30 uS, with waveforms of In and Out]







Modeling Delay

So the same amount of delay (30 µS in our example), can result in qualitatively different phenomena!

We have to define two different kinds of delay.

Inertial Delay is the RC kind of delay, which swallows pulses much narrower than the delay amount.

Transport Delay is the optical fibre kind of delay, which lets all pulses pass through irrespective of their width.

In most hardware description languages, delays are inertial by default. The delay amount is taken to be zero if not specified.







Signal Assignments: Transactions

To represent real hardware, each signal assignment has to be associated with a delay.

When a value is assigned to a signal, the target signal does not acquire the assigned value immediately. The value is acquired after some delay.

Remembering that a signal is scheduled to acquire a value in the future is called a “Transaction”.

Thus, when an assignment is made, we imply that the target signal will acquire this value after so much delay of this type.







Concept of delta delay

When a transaction is placed on a signal, the default type of delay is inertial and the default amount of delay is zero.

Zero delay is implemented as a small (δ) delay which goes to zero in the limit.

This has scheduling implications. Events occurring at t, t + δ, t + 2δ are all reported as having occurred at t, but are time ordered as if δ were non zero.







Handling Concurrency

Concurrency is handled by following an event driven architecture.

In a concurrent system many things can happen at the same time.

We can efficiently handle only one thing at a time. Therefore we need to ‘control’ the passage of time.

Time is treated as a global variable. Things which happen simultaneously are handled one after the other, keeping the time value the same. Time is incremented explicitly after all events at the current time have been handled.

Obviously, the value of the time variable represents the time during the operation of the concurrent system - and has nothing to do with the actual time taken by a computer to simulate the system.







Hardware Simulation

Hardware simulation involves three stages:

Analysis: Syntax of hardware description is checked and interpreted.

Elaboration: This is a preparatory step which sets up a hierarchically described circuit for simulation.

Flattening the hierarchy: For structural descriptions, components are expanded, till the circuit is reduced to an interconnection of simple components which are described behaviourally.

Data structures describing “sensitivity lists” of all elemental components are built up.

Simulation: Event driven simulation is carried out.







Analysis

Check for Syntax and Semantics

Syntax: Grammar of the language

Semantics: Meaning of the model

Analyse each design unit separately.

Place analysed units in a working library (generally in an implementation dependent internal form to enhance efficiency).







Elaboration

This step ‘builds up’ a detailed circuit from a hierarchical description.

‘Flatten’ the design hierarchy:

Create ports (interfaces with other blocks).

Create signals and processes.

For each instantiated component, copy the component ‘template’ to the instance.

Repeat recursively till we are left only with behaviourally described ‘atomic’ modules.

The end result of elaboration is a flat collection of signal nets connected to behaviourally described modules through defined ports.







Event Driven Simulation

We maintain a time-ordered queue of signals which are waiting to acquire their assigned values.

The time variable is advanced to the earliest entry in this queue.

All signals waiting for acquiring their values at this time are updated.

If this updating results in a change in the value of a signal, an Event is said to have occurred on this signal.







Sensitivity List

During the elaboration phase, we determine which pieces of hardware are affected by (are sensitive to) which event.

This is called a ‘sensitivity list’.

The data structure is optimized for reverse look up: that is, given an event, one can quickly get a list of all hardware which is sensitive to it.

Notice that hardware could be sensitive to a particular kind of change - for example to a rising edge of the clock.
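The reverse look up can be pictured as inverting a map from modules to their inputs. The sketch below is only an illustration: the two modules (an inverter reading A, and a NAND gate reading A and B) and all names are hypothetical, and edge sensitivity is not modeled.

```python
from collections import defaultdict

# Forward description: each module lists the signals it reads (hypothetical names).
module_inputs = {
    "inverter": ["A"],
    "nand": ["A", "B"],
}


def build_sensitivity_list(module_inputs):
    """Invert module -> inputs into signal -> modules for fast reverse look up."""
    sensitivity = defaultdict(list)
    for module, inputs in module_inputs.items():
        for signal in inputs:
            sensitivity[signal].append(module)
    return dict(sensitivity)


sensitivity = build_sensitivity_list(module_inputs)
print(sensitivity)
```

Given an event on a signal, a single dictionary access then yields every module that must be re-simulated.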







The Simulation Cycle

The time variable is advanced to the earliest time entry in the time ordered queue of transactions.

The update phase: Update all signals which were to acquire their values at the current time (and then delete their entry from the queue).

Event handling phase: If the value of a signal changes due to the above update, it is said to have had an event. All events which resulted at the current time are handled by a scheduler.







Scheduling

For each event that took place at the current time,

We re-simulate all modules which are sensitive to this event.

As a result of re-simulation, fresh transactions will be placed on various signals. These are inserted at appropriate positions in the time ordered queue.

This is done for all events which occurred at the current time.

When all events have been handled, we advance the time to the earliest entry in the time ordered transactions list and start the update phase again.







A Simulation Example

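[Figure: input A drives an inverter (delay 8) whose output is B; A and B drive a NAND gate (delay 6) whose output is C. Input A changes at times 0, 20 and 50.]

Nodes: A, B and C
Input A, Output C
Inverter delay: 8 units
NAND delay: 6 units

Sensitivity List
Event on A: Inverter, NAND
Event on B: NAND

Time ordered Transaction List:
Time  Trans.
0     A = 0
20    A = 1
50    A = 0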








At Time = 0, update A = 0.

Time     A  B  C
Initial  X  X  X
0        0  X  X

A has an event. Inverter and NAND are sensitive to A.

Initial transaction list: (0, A = 0), (20, A = 1), (50, A = 0)

Re-evaluate: Inverter: B → 1 at 8; NAND: C → 1 at 6

After re-simulation: (6, C = 1), (8, B = 1), (20, A = 1), (50, A = 0)








At Time = 6, update C = 1.

[Figure: waveforms of A, B and C on a time axis from 0 to 60]

Time  A  B  C
0     0  X  X
6     0  X  1

C has an event. No module is sensitive to C.

Initial transaction list: (6, C = 1), (8, B = 1), (20, A = 1), (50, A = 0)

Re-evaluate: None required.

After re-simulation: (8, B = 1), (20, A = 1), (50, A = 0)








At Time = 8, update B = 1.

Time  A  B  C
6     0  X  1
8     0  1  1

B has an event. Only NAND is sensitive to B.

Initial transaction list: (8, B = 1), (20, A = 1), (50, A = 0)

Re-evaluate: NAND: C → 1 at 14

After re-simulation: (14, C = 1), (20, A = 1), (50, A = 0)








At Time = 14, update C = 1.

Time  A  B  C
8     0  1  1
14    0  1  1

There is no event. No sensitivity is triggered.

Initial transaction list: (14, C = 1), (20, A = 1), (50, A = 0)

Re-evaluate: None required.

After re-simulation: (20, A = 1), (50, A = 0)








At Time = 20, update A = 1.

Time  A  B  C
14    0  1  1
20    1  1  1

A has an event. Inverter and NAND are sensitive to A.

Initial transaction list: (20, A = 1), (50, A = 0)

Re-evaluate: Inverter: B → 0 at 28; NAND: C → 0 at 26

After re-simulation: (26, C = 0), (28, B = 0), (50, A = 0)








At Time = 26, update C = 0.

Time  A  B  C
20    1  1  1
26    1  1  0

C has an event. No module is sensitive to C.

Initial transaction list: (26, C = 0), (28, B = 0), (50, A = 0)

Re-evaluate: None required.

After re-simulation: (28, B = 0), (50, A = 0)








At Time = 28, update B = 0.

Time  A  B  C
26    1  1  0
28    1  0  0

B has an event. Only NAND is sensitive to B.

Initial transaction list: (28, B = 0), (50, A = 0)

Re-evaluate: NAND: C → 1 at 34

After re-simulation: (34, C = 1), (50, A = 0)








At Time = 34, update C = 1.

Time  A  B  C
28    1  0  0
34    1  0  1

C has an event. No module is sensitive to C.

Initial transaction list: (34, C = 1), (50, A = 0)

Re-evaluate: None required.

After re-simulation: (50, A = 0)








At Time = 50, update A = 0.

Time  A  B  C
34    1  0  1
50    0  0  1

A has an event. Inverter and NAND are sensitive to A.

Initial transaction list: (50, A = 0)

Re-evaluate: Inverter: B → 1 at 58; NAND: C → 1 at 56

After re-simulation: (56, C = 1), (58, B = 1)








At Time = 56, update C = 1.

Time  A  B  C
50    0  0  1
56    0  0  1

There is no event. No sensitivity is triggered.

Initial transaction list: (56, C = 1), (58, B = 1)

Re-evaluate: None required.

After re-simulation: (58, B = 1)








At Time = 58, update B = 1.

Time  A  B  C
56    0  0  1
58    0  1  1

B has an event. Only NAND is sensitive to B.

Initial transaction list: (58, B = 1)

Re-evaluate: NAND: C → 1 at 64

After re-simulation: (64, C = 1)








At Time = 64, update C = 1.

Time  A  B  C
58    0  1  1
64    0  1  1

There is no event. No sensitivity is triggered.

Initial transaction list: (64, C = 1)

Re-evaluate: None required.

After re-simulation: the time ordered list is empty.
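The walk-through above can be reproduced with a small event driven simulator. The following Python sketch is an illustration under stated assumptions and not a real HDL simulator: the function names, the use of the string "X" for an unknown value and the heap based queue are all choices made here, and neither delta delays nor deletion of waiting transactions (inertial delay) are modeled.

```python
import heapq

INV_DELAY, NAND_DELAY = 8, 6        # delays from the example


def inverter(v):
    a = v["A"]
    return "B", ("X" if a == "X" else 1 - a), INV_DELAY


def nand(v):
    a, b = v["A"], v["B"]
    if a == 0 or b == 0:
        out = 1                     # a 0 on either input forces the output to 1
    elif a == "X" or b == "X":
        out = "X"
    else:
        out = 0
    return "C", out, NAND_DELAY


# Sensitivity list: signal -> modules to re-simulate on an event.
SENSITIVITY = {"A": [inverter, nand], "B": [nand]}


def simulate(stimulus):
    """Run the update / event handling cycle; return the list of events."""
    values = {"A": "X", "B": "X", "C": "X"}
    queue, seq = [], 0              # entries: (time, sequence no., signal, value)
    for t, sig, val in stimulus:
        heapq.heappush(queue, (t, seq, sig, val))
        seq += 1
    events = []
    while queue:
        now = queue[0][0]           # advance time to the earliest transaction
        changed = []
        while queue and queue[0][0] == now:      # update phase
            _, _, sig, val = heapq.heappop(queue)
            if values[sig] != val:               # a change of value is an event
                values[sig] = val
                changed.append(sig)
                events.append((now, sig, val))
        for sig in changed:                      # event handling phase
            for module in SENSITIVITY.get(sig, []):
                target, val, delay = module(values)
                heapq.heappush(queue, (now + delay, seq, target, val))
                seq += 1
    return events


if __name__ == "__main__":
    for event in simulate([(0, "A", 0), (20, "A", 1), (50, "A", 0)]):
        print(event)
```

The events on C at times 26 and 34 correspond to the short low pulse on the output seen in the tables above; the transactions at 14, 56 and 64 change nothing and so produce no events.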







Scheduling for Delay types

What do we do if there is more than one transaction waiting for the same signal?

Inertial Delay: A transaction scheduled for a later time results in deletion of waiting transactions for a different value on the same signal.

Transport Delay: All transactions are retained and signal assignments are made at their respective times.







Inertial Delay Example

[Figure: input In changes at times 0, 40, 45, 80 and 130; with an inertial delay of 30 uS, the output Out changes at times 30, 110 and 160.]

The time ordered transaction list evolves as follows.

Initial list:
Time  Transaction
0     In := 0
40    In := 1
45    In := 0
80    In := 1
130   In := 0

After handling time 0:
Time  Transaction
30    Out := 0
40    In := 1
45    In := 0
80    In := 1
130   In := 0

After handling time 30:
Time  Transaction
40    In := 1
45    In := 0
80    In := 1
130   In := 0

After handling time 40:
Time  Transaction
45    In := 0
70    Out := 1
80    In := 1
130   In := 0

After handling time 45 (a new transaction is placed on Out):
Time  Transaction
70    Out := 1
75    Out := 0
80    In := 1
130   In := 0

The waiting transaction for a different value (Out := 1 at 70) is deleted:
Time  Transaction
75    Out := 0
80    In := 1
130   In := 0

After handling time 75:
Time  Transaction
80    In := 1
130   In := 0

After handling time 80:
Time  Transaction
110   Out := 1
130   In := 0

After handling time 110:
Time  Transaction
130   In := 0

After handling time 130:
Time  Transaction
160   Out := 0
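The scheduling rule for the two delay types can be sketched in Python as below. This follows the simplified rule stated on the slide above rather than the full semantics of any particular HDL; the function name and the representation of transactions as (time, value) pairs are assumptions, and the input changes are assumed to be given in increasing time order.

```python
def delayed_output(input_changes, delay, inertial=True):
    """Return the (time, value) events seen on the output of a delay element."""
    pending = []                    # output transactions, kept in time order
    for t, value in input_changes:
        if inertial:
            # Delete transactions still waiting at time t that carry a different value.
            pending = [p for p in pending if p[0] <= t or p[1] == value]
        pending.append((t + delay, value))
    events, current = [], None
    for t, value in pending:        # only a change of value is an event
        if value != current:
            events.append((t, value))
            current = value
    return events


if __name__ == "__main__":
    changes = [(0, 0), (40, 1), (45, 0), (80, 1), (130, 0)]
    print(delayed_output(changes, 30, inertial=True))
    print(delayed_output(changes, 30, inertial=False))
```

With the input of the example, the inertial case swallows the 5 unit wide pulse at time 40, while the transport case passes it through to the output at times 70 and 75.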







Concurrent Descriptions

The order of placing ‘concurrent’ descriptions in a hardware description language is immaterial.

As seen in the example described earlier, each concurrent block is handled when its ‘sensitivity’ is struck, wherever it is placed in the overall description.

So what defines the limits of a ‘concurrent block’?

If it is a single line, there is no problem.

If the description of a concurrent block needs multiple lines, how are these lines to be executed?







Multi-line concurrent descriptions

A multiline concurrent block has to be executed completely when its sensitivity is struck.

Therefore, the multi-line description of a complex concurrent block must be executed sequentially, line by line.

A hardware description language must therefore provide a syntax to distinguish sequential parts from concurrent parts. (After all, a single line of description could be a stand-alone concurrent description or part of a multi-line sequential code).

Multiline descriptions of hardware blocks are concurrent outside and sequential inside!







Sequential Descriptions

Describing hardware by sequential code raises a problem! What happens when the sequential description reaches its end?

Hardware blocks are perpetual objects. These cannot ‘terminate’ like software routines.

We can make sequential descriptions perpetual by adding the convention that a sequential description loops back to its beginning when it reaches its end.

This, however, leads to yet another problem!







Suspending endless loops

An endless loop will never terminate. Then how can we handle the next event?

Indeed, when can we advance the time variable?

The convention should therefore be that when a sequential description ends, execution will loop back to the beginning, and execution of the loop will be suspended here!

The suspended loop will restart only when the sensitivity of this block is struck again.







Now we can handle multiple blocks waiting to be handled at any given time.

We handle each block whose sensitivity has been triggered, till it is suspended.

Then we handle the next block and so on, till all blocks have been done.

Now we update the time to the next earliest entry in the time ordered queue and go through the next signal update - event handling cycle.







Hardware Description Languages

This ends

The first part of the lecture series on

HARDWARE DESCRIPTION LANGUAGES

Fundamental Concepts




Current Mode Interconnect

Dinesh Sharma, Marshnil Dave

Department of Electrical Engineering, Indian Institute of Technology, Bombay

Sept. 25, 2010



Current Mode Interconnects Group at IIT Bombay

Prof. Maryam Shojaei Baghini Marshnil Dave Amit Vishnani Navin Kacharappu Sandeep Waikar Girish Naik Dinesh Sharma

Supreet Joshi Rajkumar Satkuri Mahavir Jain M. Veerraju



Part I

Current Mode Data Communication

Scaling: Unscaled Interconnect Delay

Solutions for Interconnect Delay problem: Buffer Insertion, Current signaling, Inductive Peaking, Dynamic Overdriving



Scaling

To increase packing density, we would like to reduce the size of transistors and passive components.

In order to decrease lateral sizes, we have to reduce vertical sizes too.

If dimensions are scaled down, voltages must also be reduced to avoid breakdown.

This is known as constant field scaling.

So what price do we have to pay to get denser, more complex circuits?



MOS model

[Figure: drain current (mA) versus drain voltage (V) for Vg = 1.5, 2.0, 2.5, 3.0 and 3.5]

$$K \equiv \mu C_{ox}\frac{W}{L}, \qquad C_{ox} \equiv \frac{\epsilon_{ox}}{t_{ox}}$$

For $V_{gs} \le V_T$:

$$I_{ds} = 0$$

For $V_{gs} > V_T$ and $V_{ds} \le V_{gs} - V_T$:

$$I_{ds} = K\left[(V_{gs} - V_T)V_{ds} - \frac{1}{2}V_{ds}^2\right]$$

For $V_{gs} > V_T$ and $V_{ds} > V_{gs} - V_T$:

$$I_{ds} = \frac{K}{2}(V_{gs} - V_T)^2$$

(Gate capacitance Cox is per unit area)



Consequences of Scaling

All dimensions and voltages are divided by the factor S (> 1).

Quantity              Relation              Scaling              Result
Device area           ∝ W × L               (↓S)(↓S)             ↓ S²
Cox                   εox/tox               const/(↓S)           ↑ S
Ctotal                εA/t                  (↓S²)/(↓S)           ↓ S
VDS, VGS, VT          Voltages              (↓S)                 ↓ S
Id                    µCox(W/L)(∝ V²)       (↑S)(const)(↓S²)     ↓ S
Slew rate dV/dt       I/Ctotal              (↓S)/(↓S)            const.
Delay                 V/(dV/dt)             (↓S)/(const)         ↓ S
Static power          V × I                 (↓S)(↓S)             ↓ S²
Dynamic power         Ctotal V² f           (↓S)(↓S²)(↑S)        ↓ S²
Power delay product   delay × power         (↓S)(↓S²)            ↓ S³
Power density         power/area            (↓S²)/(↓S²)          const.
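The entries of the table can be checked by tracking the power of S carried by each quantity. The sketch below is only a bookkeeping aid: the function name and dictionary keys are assumptions, an exponent e means that the quantity is multiplied by S to the power e, and the clock frequency is assumed to scale as 1/delay.

```python
def scaling_exponents():
    """Exponents of S for each quantity under constant field scaling."""
    length = -1                     # all dimensions divided by S
    voltage = -1                    # all voltages divided by S
    area = 2 * length               # W x L
    cox = -length                   # eps_ox / t_ox
    c_total = area - length         # eps * A / t
    current = cox + 0 + 2 * voltage # Cox * (W/L) * V^2, with W/L unchanged
    slew = current - c_total        # I / C_total
    delay = voltage - slew          # V / (dV/dt)
    frequency = -delay              # assumed: f scales as 1 / delay
    static_power = voltage + current
    dynamic_power = c_total + 2 * voltage + frequency
    return {
        "area": area, "cox": cox, "c_total": c_total, "current": current,
        "slew": slew, "delay": delay, "static_power": static_power,
        "dynamic_power": dynamic_power,
        "power_delay_product": delay + dynamic_power,
        "power_density": dynamic_power - area,
    }


if __name__ == "__main__":
    for name, exponent in scaling_exponents().items():
        print(f"{name}: S^{exponent}")
```

A negative exponent corresponds to a ↓ entry in the table, a positive one to ↑, and zero to const.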



Impact of scaling

Improved packing density: ↑ S²

Improved speed: delay ↓ S

Improved power consumption: ↓ S²

However . . . The above improvements apply to active circuits.

What about passive components?

Also, reduced voltages imply a lower signal to noise ratio.



Concern: Interconnect Delay

[Figure: interconnect wire of length L, width W and thickness tm, over an insulator of thickness ti]

$$R = \rho\frac{L}{W t_m}, \qquad C = \epsilon\frac{L W}{t_i}$$

$$\text{Charge Time} \approx RC = \rho\epsilon\frac{L^2}{t_m t_i}$$

To first order, delay is independent of W.

This is because increasing W reduces resistance but increases capacitance in the same ratio.

Unfortunately W is the only parameter that the circuit designer can decide! (L is fixed by the distance between the points to be connected; ρ, ε, tm and ti are decided by the technology).



Concern: Interconnect Delay

[Figure: relative frequency versus normalized wire length]

Local interconnects scale with device size. Global interconnects scale with die size.

$$\text{Interconnect Delay} = \frac{\rho\epsilon}{t_m t_i}L^2 \equiv AL^2$$

For local interconnects, L scales the same way as tm and ti, so delay is invariant.

For global interconnects, L goes up with die size, while tm and ti scale down. This leads to a sharp increase in delay.



Buffer Insertion

Global interconnect delay can be the determining factor for the speed of an integrated system.

The L² dependence of interconnect delay is a source of particular concern.

This problem can be somewhat mitigated by buffer insertion in long wires.

We define some critical wire length and when a wire segment exceeds this length, we insert a buffer.



Repeater Insertion in Voltage Mode

What is the optimum wire length after which we should insert a buffer? (Wire Delay $= \rho\epsilon\frac{L^2}{t_m t_i} = AL^2$)

Let the wire segment length = L'. Segment wire delay = $AL'^2$. Let buffer delay = τ.

For n segments, there will be n−1 buffers, and L = nL'.

$$\Delta = nAL'^2 + (n-1)\tau = \frac{L}{L'}AL'^2 + \left(\frac{L}{L'} - 1\right)\tau = ALL' + \left(\frac{L}{L'} - 1\right)\tau$$

Putting the derivative with respect to L' = 0 for optimization,

$$AL - \frac{L}{L'^2}\tau = 0, \quad \text{so} \quad AL'^2 = \tau$$

L' should be so chosen that the wire segment delay = τ. Total delay is proportional to n and so, is linear in L.
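The optimum above can be checked numerically. In the following sketch the function names and the values of A, τ and L are assumptions in arbitrary units, and n = L/L' is treated as a continuous quantity, as in the derivation.

```python
import math


def total_delay(length, seg, a, tau):
    """Delay of a wire of the given length cut into segments of length seg."""
    n = length / seg                # number of segments; n - 1 buffers
    return n * a * seg ** 2 + (n - 1) * tau


def optimum_segment(a, tau):
    """Segment length for which the segment wire delay equals the buffer delay."""
    return math.sqrt(tau / a)


if __name__ == "__main__":
    a, tau, length = 1.0, 4.0, 10.0         # assumed values, arbitrary units
    seg = optimum_segment(a, tau)
    print("optimum segment length:", seg)
    print("buffered delay:", total_delay(length, seg, a, tau))
    print("unbuffered delay:", a * length ** 2)
```

For these sample values the buffered delay is 36 units against 100 units for the unbuffered wire, and moving the segment length away from the optimum in either direction increases the delay.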



Difficulties with Buffer Insertion

Currently, buffer insertion is the most widely used method to control interconnect delay. However, there are several difficulties with buffer insertion.

Buffers consume power and silicon area.

Typically, we do floor planning and layout first and then put in the interconnects. When the wire length reaches L', we need to put in a buffer. However, it is quite possible that there is active circuitry underneath, and there is no room to put in a buffer!

We either live with buffer insertion at non-optimal wire lengths or create space by pushing out existing cells and modifying the layout.



Problem with bi-directional data transmission

Global interconnects often include data busses, which may require bidirectional data transmission. (For example, a bus connecting a processor and memory).

However, buffer insertion fixes the direction of data flow!

We need to replace buffers with bidirectional transceivers.

These require a direction signal, which will enable a buffer in the desired direction.

This direction signal must also be routed with the bus and should have its own buffers. It should reach the bidirectional buffers ahead of the data.



Concern: Signal Integrity

As interconnect wire separation is reduced . . .

There is a serious signal integrity problem because of electrostatic coupling between long wires.

Inter-signal interference can lead to unpredictable delay variations.

Grounded shielding wires must often be inserted to avoid interference.

This leads to extra capacitance and CV²f power loss.



Concern: Timing closure

Global interconnects are placed after active circuit design and layout is complete.

One has to anticipate the wire length, and then design the active circuits to meet total delay specifications.

If the actual wire length is different from what was anticipated, one has to re-design the active circuits after layout.

After a fresh layout, wire lengths and hence, delays are changed.

This leads to a design-layout-redesign iteration known as Timing Closure. This iteration becomes longer and longer when total delays are dominated by interconnect delay.



Promise of current mode signaling

Why not signal with current rather than voltage?

Current rise time is limited by inductance rather than capacitance. Typically, inductive effects are much smaller than capacitive effects. (After all, ε ≃ 4, µ = 1 for insulators used in ICs.) So electromagnetic coupling is lower than electrostatic coupling.

Signal voltage swings are limited by scaled down supply voltages: this does not restrict current swings.

In fact, we can use multiple current values to send more than one bit down the same wire!



Promise of current mode signaling

If we hold the voltage on the interconnect nearly constant:

Dynamic power is negligible.

Latency is much lower.

We also have the option of using multiple current levels to transmit multiple bits simultaneously. This can give:

Higher throughput.

Lower interconnect area.

Possibility for improving latency, throughput and power simultaneously!

Since ∆V → 0 while ∆I ≠ 0 ⇒ we need a low (near 0) input impedance receiver.



Digital Designers need not panic!

Only the interface works in current mode. The rest of the circuit is traditional. A library circuit does the voltage mode to current mode conversion (transmitter) and another converts the current back to voltage mode (receiver).

To put this plan into action, we need a receiver with very low input impedance.

(If inductive effects are to be taken into account, we would like to terminate the line into its characteristic impedance.)



Zero input impedance circuit

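Low $r_{in}$ amplifiers are used for photo-detectors. (1)

[Figure: beta multiplier input stage built from Mp1, Mp2, Mn1 and Mn2, with reference voltage Vref, input voltage v, internal node voltages v1 and v2, and branch currents i1 and i2.]

$$ i_1 = g_{mn1} v_1 = g_{mp1}(v - v_2) $$

$$ i_2 = g_{mn2} v_1 = -g_{mp2} v_2 $$

$$ v_2 = -\frac{g_{mn2}}{g_{mp2}} v_1 = -\frac{g_{mn2}}{g_{mp2}} \frac{i_1}{g_{mn1}} $$

$$ i_1 = g_{mp1} v + \frac{g_{mn2}/g_{mn1}}{g_{mp2}/g_{mp1}} i_1 $$

Define

$$ \Gamma \equiv \frac{g_{mn2}/g_{mn1}}{g_{mp2}/g_{mp1}} $$

then $i_1 (1 - \Gamma) = g_{mp1} v$.

This gives $r_{in} = (1 - \Gamma)/g_{mp1}$.

(1) C.-K. Kim et al., "High Injection Efficiency Readout Circuit for Low Resistance Infrared Detector", IEE Electronics Letters, 35, 1507, 1999.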



Robustness of design

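In saturation,

$$ I_d = \frac{1}{2} \mu C_{ox} \frac{W}{L} (V_g - V_T)^2 $$

So,

$$ g_m = \mu C_{ox} \frac{W}{L} (V_g - V_T) = \sqrt{2 \mu C_{ox} \frac{W}{L} I_d} $$

$$ \frac{g_{mn2}}{g_{mn1}} = \sqrt{\frac{(W/L)_{n2}}{(W/L)_{n1}} \, \frac{I_2}{I_1}} $$

$$ \frac{g_{mp2}}{g_{mp1}} = \sqrt{\frac{(W/L)_{p2}}{(W/L)_{p1}} \, \frac{I_2}{I_1}} $$

Therefore

$$ \Gamma \equiv \frac{g_{mn2}/g_{mn1}}{g_{mp2}/g_{mp1}} = \sqrt{\frac{(W/L)_{n2}/(W/L)_{n1}}{(W/L)_{p2}/(W/L)_{p1}}} $$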
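As a rough numerical illustration of this result, the following Python sketch evaluates Γ from the transistor aspect ratios and then the input resistance r_in = (1 − Γ)/g_mp1. The function names, the aspect ratios and the value of g_mp1 are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
import math


def gamma_from_geometry(wl_n1, wl_n2, wl_p1, wl_p2):
    """Gamma = sqrt(((W/L)n2 / (W/L)n1) / ((W/L)p2 / (W/L)p1)).

    The current ratio I2/I1 cancels, so only geometry remains.
    """
    return math.sqrt((wl_n2 / wl_n1) / (wl_p2 / wl_p1))


def input_resistance(gamma, gmp1):
    """r_in = (1 - Gamma) / gmp1, in ohms when gmp1 is in siemens."""
    return (1.0 - gamma) / gmp1


# Hypothetical sizing: only the n-channel pair is ratioed (8.1 : 10).
gamma = gamma_from_geometry(wl_n1=10.0, wl_n2=8.1, wl_p1=20.0, wl_p2=20.0)
r_in = input_resistance(gamma, gmp1=1e-3)  # assumed gmp1 = 1 mS
print(f"Gamma = {gamma:.3f}, r_in = {r_in:.1f} ohm")
```

With Γ close to 1 the input resistance becomes a small fraction of 1/g_mp1, and because Γ depends only on ratios of W/L it tracks geometry rather than absolute process parameters.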



Receiver Design - Input stage

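[Figure: receiver input stage using the beta multiplier (Mp1, Mp2, Mn1, Mn2) with Vref, node voltages v1 and v2, branch currents i1 and i2, input current Iint and output current Iout.]

Input resistance is controlled by the geometry of the transistors.

The interconnect voltage is held fixed.

Input resistance is insensitive to process variations.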



Reduced swing signaling

[Figure: low swing voltage mode link (low swing driver, line, buffer/amplifier) and low swing current mode link (low swing driver, line, receiver terminated in RL).]

In reduced swing voltage mode signaling, the line is not terminated in a low impedance.

Current mode signaling terminates the line in a low impedance.

This reduces the time constant and increases bandwidth.

However, this also leads to static power consumption.



Improving Current Mode Signaling


[Figure: low swing current mode link with low swing driver, line, and receiver terminated in RL.]

Current mode signaling:

Consumes static power.

Has a direct trade-off between speed and static power.

Possible improvements:

Inductive peaking.

Dynamic over-driving.



Concept of Inductive Peaking

On-chip interconnects can be modeled as a distributed RC line, which is essentially a low pass filter.

Bandwidth enhancement techniques used in RF amplifiers can be employed for bandwidth enhancement on interconnects.

Inductive peaking: the line termination circuit exhibits an inductive input impedance.

Shows enhancement of about 500 MHz in 3 dB bandwidth.

[Figure: driver feeding a distributed RC line (sections R0, C0) terminated in RL and L.]



Bandwidth Enhancement Vs Load Inductance

For a given line length, the amount of bandwidth enhancement is a function of inductance and load resistance.

Significant bandwidth enhancement can be achieved for a wide range of inductance values greater than Lpeak.

The required inductance for significant enhancement in bandwidth is a few hundred nanohenries!

An active inductor is required.



Beta Multiplier: A Gyrator

[Figure: beta multiplier (Mp1, Mp2, Mn1, Mn2) with Vref, node voltages v, v1 and v2, and branch currents i1 and i2.]

The Beta Multiplier essentially forms a gyrator circuit, with two Gm elements connected back to back along with the parasitic capacitance of the transistors.

So Beta Multiplier circuits can exhibit inductive input impedance for some frequency range if designed properly.



Beta Multiplier: Input Impedance

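$$ Z_{in} = \frac{(\tau_1 \tau_2 + k \tau_2 \tau_3) s^2 + (\tau_1 + \tau_2 + k(\tau_3 + \tau_2)) s + 1 + k - \gamma}{\left(g_{mp1} + \frac{1}{R_3}\right)(1 + \tau_1 s)(1 + \tau_2 s)(1 + \tau_4 s)} $$

where

$$ \tau_1 = \frac{C_{g1}}{g_{mn1}}, \quad \tau_2 = \frac{C_{g2}}{g_{mp2}}, \quad \tau_3 = C_{g3} r_{op1}, \quad \tau_4 = \frac{C_{g3}}{g_{mp1}} $$

$$ R_1 = \frac{1}{g_{mn1}}, \quad R_3 = r_{op1}, \quad k = \frac{R_1}{R_3}, \quad \gamma = \frac{g_{mp1}/g_{mp2}}{g_{mn1}/g_{mn2}} $$

$$ R_{in} = \frac{(1 - \gamma) + \frac{1}{g_{mn1} r_{op1}}}{g_{mp1} + \frac{1}{r_{op1}}} $$

[Figure: small-signal equivalent circuit of the beta multiplier seen from the input node, with capacitances Cg1, Cg2 and Cg3, resistances 1/gmn1, 1/gmp2 and ro_p1, and controlled currents i1 = gmp1 (vint − vg2) and i2 = gmn2 vg1.]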
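To see how the expression for Zin behaves with frequency, the sketch below evaluates it at s = j2πf in Python. The function name and all device values (transconductances, output resistance, gate capacitances) are hypothetical and serve only as an illustration; a positive phase of Zin indicates an inductive input impedance at that frequency.

```python
import math


def beta_multiplier_zin(f_hz, gmn1, gmn2, gmp1, gmp2, rop1, cg1, cg2, cg3):
    """Evaluate Zin(s) at s = j*2*pi*f using the expression given above."""
    s = 2j * math.pi * f_hz
    tau1 = cg1 / gmn1
    tau2 = cg2 / gmp2
    tau3 = cg3 * rop1
    tau4 = cg3 / gmp1
    k = (1.0 / gmn1) / rop1                  # k = R1 / R3
    gamma = (gmp1 / gmp2) / (gmn1 / gmn2)
    num = ((tau1 * tau2 + k * tau2 * tau3) * s**2
           + (tau1 + tau2 + k * (tau3 + tau2)) * s
           + 1 + k - gamma)
    den = (gmp1 + 1.0 / rop1) * (1 + tau1 * s) * (1 + tau2 * s) * (1 + tau4 * s)
    return num / den


# Hypothetical device values, not taken from the lecture.
params = dict(gmn1=1e-3, gmn2=0.9e-3, gmp1=1e-3, gmp2=1e-3,
              rop1=100e3, cg1=10e-15, cg2=10e-15, cg3=10e-15)

for f in (0.0, 1e7, 1e8, 1e9):
    z = beta_multiplier_zin(f, **params)
    phase = math.degrees(math.atan2(z.imag, z.real))
    print(f"f = {f:8.3g} Hz  |Zin| = {abs(z):8.2f} ohm  phase = {phase:6.2f} deg")
```

At f = 0 the function reduces to the Rin expression, which gives a quick consistency check between the two formulas.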



Beta Multiplier : Equivalent Circuit

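The relative location of poles and zeros determines the nature of the impedance (inductive or capacitive).

If the first zero occurs a decade prior to the first pole, the input impedance is inductive.

$\gamma - \frac{1}{g_{mn1} r_{op1}} > 0.9$ and any two time constants being equal ensures that a zero occurs a decade prior to the first pole.

$$ L_{eff} = \frac{r_{op1}}{g_{mp1} r_{op1} + 1} \left[ \frac{C_{g1}}{g_{mn1}} + \frac{C_{g2}}{g_{mp2}} + \frac{C_{g2}}{g_{mp2} g_{mn1} r_{op1}} + \frac{C_{g3}}{g_{mn1} g_{mp1} r_{op1}} \right] $$

$$ R_{eff} = \frac{(1 - \gamma) + \frac{1}{g_{mn1} r_{op1}}}{g_{mp1} + \frac{1}{r_{op1}}} $$

$$ C_{eff} = K C_{gx} $$

[Figure: equivalent circuit for Zin made of Leq, Req and Ceq.]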



Beta Multiplier : Input Impedance Control

The Beta Multiplier shows an effective inductance of hundreds of nanohenries for a practical range of input current and transistor geometries.

Its effective resistance can be controlled by ratios of transconductances, while its effective inductance depends on the absolute value of transconductance.

It is possible to control Rin and Leff with very little interaction between the two. Inductance changes from 100 nH to 980 nH while the effective resistance remains within 12% of its nominal value for a 20 µA change in the current.



Current Mode Receiver Circuit with Beta Multiplier

[Figure: current mode receiver with a source type beta multiplier and a sink type beta multiplier (transistors Mp1, Mp2, Mn1, Mn2, Mp11, Mp22, Mn11, Mn22) connected to the input node, with Vdd, Vref and an inverting amplifier (Inv Amp).]

The effective impedance offered by the receiver is equal to the parallel combination of the impedances offered by the individual beta multipliers.

The voltage at the input node swings around Vref. The small voltage swing on the line is sensed and amplified by the inverting amplifier.

Vref is generated by shorting the input and output of an inverter, to ensure that the value of Vref is the same as the switching threshold of the receiver amplifier across all process corners.

rout of the Vref generation circuit comes in series with the beta multiplier Zin, and hence the beta multiplier has to be sized accordingly.

The Vref generation circuit consumes static power.



Simulation Results

Performance comparison of three signaling schemes (line = 6 mm, power measured at 1 Gbps)

Signaling Scheme            Delay (ps)   Throughput (Gbps)   Power (µW)   Area (µm²)
CMS-BMul (30 mV) [1]        420          2.56                310          2.00
CMS-Diode-CC (30 mV) [2]    500          2.45                380          2.00
Voltage Mode                1000         2.85                3000         12.53

Compared with CMS-Diode-CC, inductive termination (CMS-BMul) gives 16% improvement in delay and about 18% improvement in power.

Compared with voltage mode, it gives more than 50% improvement in delay and, at the same time, about an order of magnitude lower power.

[1] M. Dave et al., ISLPED 2008. [2] V. Venkatraman et al., ISQED 2005.



Concept of Dynamic Overdriving/Pre-emphasis

Current mode transmission can be speeded up by using a high drive current.

However, this increases static power consumption.

One possible solution is to dump a high drive current only when the state of the line needs to be changed from 0 to 1 or from 1 to 0.

When the line remains at 1 or 0 from one bit to the next, we use a small drive current to maintain the line at the required voltage.

This is called Dynamic Over Driving.

Dynamic Overdriving essentially means amplifying the high frequency components of the input signal.



Possible implementation of Dynamic Overdriving

[Figure: steady state (weak) driver with input, VDD, p drive and n drive transistors, and swing control (high) and swing control (low) transistors.]

The p channel driver gate is low (enabled) when the input is 1.

As the line reaches VDD − VTp, the upper p channel transistor turns off, restricting the line voltage swing.

Similarly, the n channel driver transistor is enabled when the input is 0, and the lower transistor turns off when the line approaches VTn during discharge.

A. Katoch et al., ESSCIRC, 2005




[Figure: dynamic (strong) driver with input, VDD, wire and feedback inverter.]

The feedback inverter acts as an inverting amplifier, converting low swing logic levels on the wire to a full swing (inverted) CMOS logic level on its output.

The p channel gate is low (enabled) only when the input is high AND the line is at 0.

The n channel gate is high (enabled) only when the input is low AND the line is at 1.

The input to the feedback inverter is a low swing level around VDD/2. Therefore it consumes static power.



Self limiting Strong Driver

[Figure: dynamic (strong) driver with input, VDD, wire and feedback inverter.]

Input = 1, wire voltage < Vm:

Inverter output = 1, NAND output = 0, NOR output = 0.

The p channel driver dumps current to charge the line.

Input = 0, wire voltage > Vm:

Inverter output = 0, NAND output = 1, NOR output = 1.

The n channel driver sinks current to discharge the line.

As soon as the low swing logic level on the line equals the input:

Inverter output = complement of the input, NAND output = 1, NOR output = 0.

This disables both drive transistors automatically.
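The three cases listed above can be checked with a small truth-table sketch in Python. It assumes, as the listed gate outputs suggest, that the NAND and NOR gates each take the data input and the feedback inverter output as their two inputs; the function and variable names are hypothetical.

```python
def strong_driver_state(data_in, wire_above_vm):
    """Return gate outputs and which strong-driver transistor is enabled.

    data_in:        logic value at the driver input (0 or 1)
    wire_above_vm:  True if the wire voltage is above the inverter threshold Vm
    Assumption: NAND and NOR both see (data_in, feedback inverter output).
    """
    inv_out = 0 if wire_above_vm else 1          # feedback inverter
    nand_out = 0 if (data_in and inv_out) else 1
    nor_out = 0 if (data_in or inv_out) else 1
    p_on = nand_out == 0                         # p channel gate is active low
    n_on = nor_out == 1                          # n channel gate is active high
    return inv_out, nand_out, nor_out, p_on, n_on


for data_in in (0, 1):
    for wire_above_vm in (False, True):
        inv, nand, nor, p_on, n_on = strong_driver_state(data_in, wire_above_vm)
        print(f"in={data_in} wire>Vm={wire_above_vm!s:5} "
              f"inv={inv} nand={nand} nor={nor} p_on={p_on} n_on={n_on}")
```

In this model the two drive transistors are never enabled together, and both are off whenever the logic level on the wire agrees with the input.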




Dynamic Overdriving with Inductive termination?

Dynamic Overdriving (DOD) and inductive line termination both essentially amplify the high frequency components of the input signal.

Can we use both?



Current Mode Signaling Schemes with Ideal Components

The following four current mode signaling schemes were simulated:

CMS scheme with DOD and resistive load

CMS scheme with simple driver and resistive load

CMS scheme with inductive load

CMS scheme with DOD and inductive load

Implementation details of these circuits are:

The dynamic overdriving driver is implemented by an ideal VCCS with the current wave shape shown in the figure. The controlling voltage is the input.

The simple driver is implemented as a VCCS with a square wave shape, with the input current ranging from −Iavg to +Iavg, where

$$ I_{avg} = \frac{I_{peak} t_p + I_{static} (t - t_p)}{t} $$

RL = 4 kΩ, l = 4 µH
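A short Python sketch of the Iavg expression may help fix the idea. It assumes that t is the bit period and tp the duration of the overdrive pulse; the numerical values of Istatic, tp and t below are hypothetical, and only Ipeak = 500 µA echoes a value used elsewhere in these slides.

```python
def average_drive_current(i_peak, i_static, t_p, t_bit):
    """Iavg = (Ipeak * tp + Istatic * (t - tp)) / t.

    Assumption: t_bit is the bit period and t_p the overdrive pulse width.
    """
    if not 0.0 <= t_p <= t_bit:
        raise ValueError("pulse width must lie within the bit period")
    return (i_peak * t_p + i_static * (t_bit - t_p)) / t_bit


# Hypothetical numbers: 500 uA peak, 20 uA static, 100 ps pulse, 1 ns bit period.
i_avg = average_drive_current(i_peak=500e-6, i_static=20e-6,
                              t_p=100e-12, t_bit=1e-9)
print(f"Iavg = {i_avg * 1e6:.1f} uA")
```

Driving the simple driver with ±Iavg makes the comparison fair: both drivers then deliver the same average current per bit.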



Comparison of Delay

With large overdrive (Ipeak = 500 µA):

Dynamic overdriving shows 5× improvement in delay over RC.

Inductive peaking does not offer substantial additional advantage when combined with dynamic overdriving.

Inductive peaking alone shows 25% improvement in delay over RC.

With small overdrive (Ipeak = 50 µA):

Dynamic overdriving alone and inductive peaking alone give nearly the same delay.

Inductive peaking along with dynamic overdriving shows around 20% improvement in delay over dynamic overdriving alone.



Comparison of Throughput (Eye-opening)

Dynamic overdriving improves throughput by 5× over RC.

Inductive peaking does not offer substantial additional advantage when combined with dynamic overdriving.

Inductive peaking shows throughput enhancement of 26% over RC.



Conclusion: Inductive Peaking vs Dynamic Overdrive

For very high data rate applications, dynamic overdriving alone should be employed, as inductive peaking does not offer any additional advantages.

For low power and low data rate applications, the use of inductive peaking can give 26% improvement in throughput over RC.

For low power and low data rate applications, the use of inductive peaking can give 16% improvement in delay over RC.

For low power and low data rate applications, the use of dynamic overdrive along with inductive peaking can further improve throughput by 20%.



Part II

Variation Tolerant Current Mode Signaling

Need for Process Variation Tolerance

Effect of Process Variations on different CMS Schemes

The Proposed Variation Tolerant CMS Scheme

Performance Evaluation

Bidirectional Links

Simulated Performance of Bidirectional Link



Need for Process Variation Tolerance

Current mode signaling derives its advantages over voltage mode from the reduced swing on the line.

Careful design is necessary, otherwise small changes in device parameters can have a disproportionate effect on the performance of the system.

In modern short channel processes, variations in transistor parameters are large; some of the parameters can vary by as much as 60%.

We have to design circuits so that they are robust with respect to batch-to-batch variations, as well as variations between devices on the same die.

Batch-to-batch or inter-die variations can shift operating points and drive strengths.

Intra-die variations cause mismatch in the parameters of transmitter and receiver transistors.



Robustness requirements

Process, supply voltage and temperature (PVT) variations will affect the core logic as well as the data communication circuitry.

The requirement for data transmission is therefore not one of complete invariance with respect to PVT variations.

We have to ensure that the throughput and delay properties of the interconnect are at least as good as the data generation and clock rates.

Thus the deterioration in interconnect properties should be no worse than the deterioration in general logic.

Because global interconnects, by definition, connect remote points on the die, on-chip variations can be of greater concern.



Effect of common mode voltage mismatch

[Figure: transmitter line swing relative to the receiver common mode voltage Vcm−Rx, shown for the ideal and the misaligned case.]

In the case of an ideal match, small fluctuations in line voltage are converted to a rail to rail swing by the receiver.

If, however, the mismatch is large, the small swing on the line may be completely ignored by the receiver.

It is important, therefore, that the amount of swing on the line is much more than the mismatch in common mode voltages.

But a high swing will cause power dissipation.

It is better to have smart bias circuits, which will reduce mismatch and the need for a large swing.



System parameters affected by variations

Variations in the following parameters have a strong influence on the performance of the signaling scheme:

1. Ipeak: peak current supplied by the strong driver during an input transition

2. tp: duration for which the strong driver is ON

3. ∆V: line voltage swing at the receiver end in steady state

4. Mismatch between any VCMRx and the operating point of an amplifier



CMS Scheme with Feedback (CMS-Fb)

[Figure: CMS-Fb link. Transmitter with input, strong driver, weak driver and feedback inverter I1 on the wire. Receiver equivalent circuit: line terminated in RL to Vcm Rx, with amplifier output RxOut.]

The NAND/NOR logic generates pulses to turn the strong driver on and off.

Input transition → the strong driver turns on → the line voltage at the transmitter end crosses VM of inverter I1 → the strong driver turns off.

The weak driver supplies Istatic, and the line voltage swing at the receiver end is VCMRx ± Istatic RL.




Effect of Inter-die Process Variations on CMS with feedback

[Figure: CMS-Fb link with input, strong driver, weak driver, feedback inverter I1, wire, and receiver (line terminated in RL to Vcm Rx, output RxOut).]

Variations in Ipeak are well compensated due to the feedback at the driver end.

If the driver is weaker due to process variations, the feedback system keeps it on for longer, until the line reaches the desired voltage.

This might, however, not be optimum from a power point of view.



Effect of Intra-die Process Variations on CMS-Fb

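[Figure: line voltage waveform showing VCMRx, the swing ∆V and the transmitter inverter threshold VM−Tx.]

The line voltage is not constant for a constant low input voltage.

During a low to high transition, the strong driver is turned off well before the line voltage crosses VCMRx.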



CMS Scheme without Feedback (CMS-Fpw)

[Figure: CMS-Fpw link with input, fixed width pulse generator (delay element), strong driver, weak driver, wire, and receiver (line terminated in RL to Vcm Rx, output RxOut).]

tp is given by the delay element.

Less sensitive to intra-die variations.

In the skewed corners, sourcing Ipeak and sinking Ipeak are different, leading to different rise and fall delays.

Throughput can degrade significantly in skewed corners.

A. Tabrizi et al., MWSCAS, 2007



Minimizing Process Dependence

To minimize process dependence, we need smart bias circuits which sense the process corner and adjust the bias to compensate for variations.

[Figure: two bias generators supplied from Vdd: a short pMOS with a long nMOS generating Vbp, and a long pMOS with a short nMOS generating Vbn.]

Long channel transistors show relatively less variation with process compared to short channel transistors in the same process.

We can make use of this difference to design a bias generator which senses the process corner and tries to increase the transistor current in the slow corners and to decrease it in the fast corners.

Simple bias generators which use this feature, built from inverters with input and output shorted, are shown here.



Proposed CMS Scheme with Smart Bias

We propose a Dynamic Overdrive scheme in which both the strong and the weak drivers use constant current sources controlled by process aware bias generators.

[Figure: proposed CMS link with input, delay element, strong driver and weak driver biased by Vbp and Vbn from the p bias generator and n bias generator (short/long pMOS and nMOS pairs), wire, and receiver Rx with Rx bias and inverting amplifier driving the output.]

There is no feedback inverter in the driver circuit.

Bias voltages change in the desired direction to keep the current through the weak and strong drivers the same across all corners.



Effect of Process Variation on the Proposed CMS Scheme

Ipeak remains nearly the same across all corners. In the extreme corners, SS and FF, the small change in Ipeak is compensated by the opposite change in tp.

∆V = Istatic RL remains the same across all corners, where $R_L = \frac{1}{g_{mn} + g_{mp}}$.

The inverter with input-output shorted and the inverter amplifier are designed using fingers and placed close to each other, so that their switching thresholds are closely matched across all corners.

This makes the proposed circuit less sensitive to intra-die process variations as well.
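As a small worked example of the swing relation ∆V = Istatic RL with RL = 1/(gmn + gmp), the Python sketch below computes the termination resistance and the resulting line swing. The transconductance and current values are hypothetical, picked so that RL comes out at 4 kΩ, the load value quoted for the ideal-component simulations.

```python
def termination_resistance(gmn, gmp):
    """RL = 1 / (gmn + gmp) for an inverter with input and output shorted."""
    return 1.0 / (gmn + gmp)


def line_swing(i_static, r_load):
    """Steady-state swing about the common mode: dV = Istatic * RL."""
    return i_static * r_load


# Hypothetical values: gmn = 150 uS, gmp = 100 uS, Istatic = 10 uA.
r_load = termination_resistance(gmn=150e-6, gmp=100e-6)
delta_v = line_swing(i_static=10e-6, r_load=r_load)
print(f"RL = {r_load:.0f} ohm, swing = +/-{delta_v * 1e3:.1f} mV")
```

The sketch also shows why a constant Istatic is not sufficient on its own: the swing stays fixed only if gmn + gmp of the terminating inverter is also held steady across corners.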



Simulation Setup

Foundry specified four corner model files and a mismatch model file for Monte Carlo simulations were used.

All the signaling schemes offer the same input capacitance (equivalent to one minimum sized inverter).

All signaling schemes drive an FO4 load.

Line RLC values used were: Rline = 244 Ω/mm, Lline = 1.5 nH/mm, Cline = 201 fF/mm.

All schemes were designed for a throughput of 2.65 Gbps.

Current mode schemes are designed for Ipeak = 500 µA.



Effect of Intra-die Process Variations

Mismatch in VM of the inverter can be up to 40 mV. (2)

For a VM mismatch of 40 mV:

CMS system    Percentage Degradation
              Delay    Throughput
CMS-Fb        25       33
CMS-Fpw       10       14
CMS-Bias      4        9.5

(2) Mismatch data sheet from the foundry.



Effect of Inter-die Process Variations

Signaling System /      Percentage Degradation
Logic Circuit           SS       SNFP     FNSP
CMS-Fb                  17.5     5.7      2.9
CMS-Fpw                 32       33.6     34.9
CMS-Bias                18.75    8.2      7.14
Voltage Mode            27       < 1      2.8
Ring Oscillator Freq    23       2.88     3

Interconnects with the CMS-Fpw scheme become the bottleneck in the overall performance of the chip in skewed corners.

Degradation in the throughput of the proposed scheme in the skewed corners is around 7%, which is less than that in the CMS-Fpw scheme.



Overall Comparison

Performance comparison of four signaling schemes (line = 6 mm, power measured at 1 Gbps)

Signaling Scheme    Delay (ps)   Throughput (Gbps)   Power (µW)   Area (µm²)
CMS-Fb (90 mV)      700          2.56                146          2.00
CMS-Fpw             503          2.65                114          2.40
Proposed CMS        490          2.56                113          3.07
Voltage Mode        1100         2.85                655          12.53

The CMS-Fb scheme consumes higher power than the other schemes due to static power consumption in the feedback inverter.

The proposed scheme shows 78% improvement in area over the voltage mode scheme, whereas the other schemes, CMS-Fb and CMS-Fpw, show 84% and 80% respectively.



Overall Comparison

[Figure: six panels, (a) to (f), comparing DOD-Fb+Rx-Fb [1], DOD-Fpw+Rx-Fb [2], DOD-Fpw+Rx-BMul [3], Voltage Mode and the proposed scheme. Plots: power (µW) vs. line length (mm), delay (ns) vs. line length (mm), data rate (Mbps) vs. line length (mm), energy (pJ) vs. data rate (Mbps), and power (µW) vs. data rate (Mbps). Annotations: Line = 1.5 mm, Line = 6 mm, Data Rate = 50 Mbps, Data Rate = 500 Mbps, 125 Mbps, × 6.6, × 8, CMS Power < VM Power.]



Bidirectional Links

In many applications, on-chip buses need to carry signals in both directions.

For example, the bus between processor and memory, or between the main processor and a floating point multiplier.

Often, bidirectional buffers with direction control are used for this.



Limitations of Conventional Bidirectional Buffer

[Figure: bus built from wire segments joined by bidirectional repeaters, each made of back-to-back connected tri-state buffers controlled by En; En = signal direction.]

One of the two tristate buffers is enabled at a given time.

Two transistors in stack ⇒ increased sizes of PMOS and NMOS.

The delay of a bidirectional repeater is more than that of a unidirectional buffer.

The direction control signal is required by each repeater.

The buffers offer a huge load to the direction control signal.

Buffers carrying the direction control signal consume additional power.

We need a repeaterless signaling scheme.



The Proposed Current Mode Bidirectional Link

Employs only two bidirectional transceivers, one at each end of the line.

The direction signal is required only at the two ends of the line.

The direction control signal can be the same as one of the control signals, or derived from it based on the communication protocol.

Assumption: the direction signal (Tx/Rx) is locally available at both ends before data transmission starts.



Proposed Current-Mode Transceiver

[Figure: proposed current-mode transceiver connected to the wire. Transmitter part: weak driver and strong driver with delay element, long and short PMOS/NMOS bias devices (Vbp, Vbn), signals In, Tx_ip_0 and Tx_ip_1. Receiver part: terminator and inverter amplifier producing Data out. Both parts are controlled by Tx/Rx.]

Either the transmitter part or the receiver part is enabled at a time.



Speed-Power of Proposed Bidirectional CMS Scheme

Current-Mode vs. Voltage-Mode

[Figure: four panels, (a) to (d), comparing CM-Bid and VM-Bid. Plots: delay (ns) vs. line length (mm), power (µW) vs. line length (mm), power (µW) vs. data rate (Mbps), and crossover data rate (Mbps) vs. line length (mm). Annotations: Data Rate = 500 Mbps, Line = 4 mm, 35%, 7x, 5X, 100 Mbps.]

35% improvement in delay for nearly all line lengths.

1.7× lower power for 2 mm lines and 7× lower power for 8 mm lines.

Power crossover frequency is 100 Mbps for 4 mm long lines.

5× reduction in power at 1 Gbps.

For lines longer than 2 mm communicating at data rates of more than 180 Mbps, the proposed scheme consumes less power than voltage-mode.

Designed in 180 nm for Vdd = 1.8 V using nominal Vt devices.

Line characteristics: R = 211 Ω/mm and C = 0.245 pF/mm.



Effect on Supply Noise

Peak Current Drawn From Supply

68% reduction in peak current, and hence the contribution to supply noise is much less.

80% reduction in active area



Performance of Proposed Scheme in Four Digital Process Corners

Specs    Delay (ns)           Power (µW)
         VM-Bid    CM-Bid     VM-Bid    CM-Bid
TT       1.35      0.81       2127      567
SS       1.57      0.90       2055      435
FF       1.21      0.69       2163      727
FNSP     1.35      0.80       2113      572

For a 4 mm line operating at 500 Mbps:

38% improvement in delay even in the worst case (SNFP).

3.45× lower power consumption even in the worst case (SNSP).



Part III

Implementation on Si and Measured Results

On chip measurement

Time to Frequency Conversion

Time to Voltage Conversion

Implementation on a Test Chip

Measurement Results

Bidirectional Lines



Motivation

Delays of on-chip interconnects are of the order of hundreds of picoseconds.

It is nearly impossible to measure these off-chip.

We need on-chip delay measurement circuits. We have designed two test circuits based on:

Time to Frequency Conversion

Time to Voltage Conversion



Time to Frequency Conversion

[Figure: (a) Delay measurement circuit, principle: ring oscillator (RO) of inverters with a mux/demux pair, controlled by select signal S, routing the loop either through the short path or through the transmitter (Tx), wire and receiver. (b) Delay measurement with CMS link, floorplan: transmitter, wire and receiver (Rx) of the CMS link with connecting wires L1 and L2 and short path L3, where L3 = L1 + L2.]

Transmission gates were used to implement switches.

The multiplexer and demultiplexer are designed so that the delays for both possible paths through the mux/demux pair are the same.

The floor plan of the circuit is such that the beginning and the end of the long interconnect are close to each other.

Therefore, when the short path L3 is chosen, the total delay corresponds to the delay in the inverters, mux/demux etc.



[Figure: delay measurement circuit and floorplan, as on the previous slide.]

We first measure the frequency of oscillation choosing the short wire path between the demux and mux.

This gives the delay of the measurement circuit except for the system under test.

We now select the interconnect system whose delay we want to measure and find the frequency again.

$$ \text{Delay} = 0.5 \left( \frac{1}{f_{RO}} - \frac{1}{f_{system}} \right) $$
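The differential measurement can be sketched in a few lines of Python. The sketch assumes that one oscillation period equals twice the loop delay, and it subtracts the shorter period (short path selected) from the longer one (link in the loop) so that the result is positive; which of the two measured frequencies corresponds to f_RO and which to f_system in the formula is an assumption here. The names and the example frequencies are hypothetical.

```python
def link_delay_from_frequencies(f_short_path_hz, f_with_link_hz):
    """Delay of the link = half the difference of the two oscillation periods.

    f_short_path_hz: oscillation frequency with the short wire path selected
    f_with_link_hz:  oscillation frequency with the link under test in the loop
    """
    period_short = 1.0 / f_short_path_hz
    period_link = 1.0 / f_with_link_hz
    return 0.5 * (period_link - period_short)


# Hypothetical readings: 250 MHz with the short path, 200 MHz with the link.
delay_s = link_delay_from_frequencies(250e6, 200e6)
print(f"link delay = {delay_s * 1e12:.0f} ps")
```

Because the delay of the inverters and the mux/demux pair appears in both periods, it cancels in the difference, which is why only the link delay remains.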



Time to Frequency Conversion: Accuracy

To assess the accuracy of the scheme, we simulated the whole circuit for different line lengths up to 14 mm in a 180 nm process.

The delay through the interconnect scheme was noted from the simulation results. We call this the "Simulated Delay".

The delay was also calculated by the formula:

$$ 0.5 \left( \frac{1}{f_{RO}} - \frac{1}{f_{system}} \right) $$

We call this the "Calculated Delay".

These results were tabulated to assess the expected accuracy of this test scheme.



Time to Frequency Conversion: Accuracy

Line Length (mm)   Simulated Delay (ps)   Calculated Delay (ps)   % Error
4                  501                    507                     1.2
6                  661                    658                     0.4
10                 1068                   1077                    0.8
14                 1575                   1599                    1.5

Delays are the average of rise and fall delay.

The power-delay product can be evaluated using this circuit.

This being a differential measurement, the only source of error is the difference between rise and fall times.



Time to Voltage Conversion

[Figure: time to voltage conversion circuit with Vdd, Vref, transistors Mn0 and Mn1, discharge current ID, clock, test pulse, input, system under test, delayed input, and pulse select (0/1).]

Capacitor C is pre-charged to its peak value during the negative phase of the clock.

It is then discharged for a time equal to the delay through the system.

$$ \text{Delay} = \frac{C \, \Delta V}{I} = k \, \Delta V $$

The value of k is found experimentally using a calibration pulse of known duration.
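The calibration step can be illustrated with a short Python sketch. It assumes an ideal constant discharge current, so that k = C/I is a constant; the function names and the numbers (a 1 ns calibration pulse producing a 100 mV drop) are hypothetical.

```python
def calibrate_k(t_cal_s, delta_v_cal):
    """k = C / I, found from a calibration pulse of known duration."""
    return t_cal_s / delta_v_cal


def delay_from_voltage_drop(k, delta_v):
    """Delay = k * dV, assuming the same constant discharge current."""
    return k * delta_v


# Hypothetical calibration: a 1 ns pulse discharges the capacitor by 100 mV.
k = calibrate_k(t_cal_s=1e-9, delta_v_cal=0.1)

# A measured drop of 50 mV then corresponds to a 0.5 ns delay.
delay_s = delay_from_voltage_drop(k, delta_v=0.05)
print(f"k = {k:.2e} s/V, delay = {delay_s * 1e12:.0f} ps")
```

Calibrating k directly avoids having to know C and I separately, both of which vary with process.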



Time to Voltage Conversion: Accuracy

Line Length   Simulated Delay (ps)   Calculated Delay (ps)   Error (%)
(mm)          rising    falling      rising    falling       rising   falling
4             380       393          378       398           0.8      1.0
6             478       497          482       503           0.8      1.2
10            730       769          733       781           0.4      1.8
14            1065      1149         1078      1171          1.2      1.9

This scheme permits the measurement of rise and fall delays separately.

Accuracy of about 2% is predicted by simulations.



Current-Mode Signaling Test Chip

1.5 mm × 1.5 mm chip fabricated in a 180 nm MM/RF process.

44-pin die packaged in a QFN56 package.



Measurement Results

(Frequency measured using a 6-digit frequency counter)

Signaling Scheme   Delay (ns)   Energy (pJ)   EDP (pJ×ns)   Measured at Data Rate (Mbps)
Voltage Mode       1.191        4.54          5.328         371
CMS-Fb             1.006        1.52          1.52          400
CMS-Bias           0.938        0.851         0.799         621

The proposed circuit offers 22% improvement in delay and 85% improvement in EDP over the voltage-mode scheme.
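The quoted improvements can be reproduced approximately from the tabulated delay and energy values. The Python sketch below recomputes the energy-delay product as delay × energy; the helper names are hypothetical, and the recomputed products differ slightly from the tabulated EDP column, presumably because of rounding in the table.

```python
# Measured delay (ns) and energy (pJ) from the table above.
measured = {
    "Voltage Mode": {"delay_ns": 1.191, "energy_pj": 4.54},
    "CMS-Fb": {"delay_ns": 1.006, "energy_pj": 1.52},
    "CMS-Bias": {"delay_ns": 0.938, "energy_pj": 0.851},
}


def edp(entry):
    """Energy-delay product in pJ*ns."""
    return entry["delay_ns"] * entry["energy_pj"]


def improvement_pct(new, ref):
    """Percentage reduction of `new` relative to `ref`."""
    return 100.0 * (ref - new) / ref


vm = measured["Voltage Mode"]
proposed = measured["CMS-Bias"]
delay_gain = improvement_pct(proposed["delay_ns"], vm["delay_ns"])
edp_gain = improvement_pct(edp(proposed), edp(vm))
print(f"delay improvement = {delay_gain:.1f} %, EDP improvement = {edp_gain:.1f} %")
```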



Performance of Proposed CMS Scheme

[Figure: four panels, (a) to (d), comparing VM, CMS-Fb and CMS-Bias. Plots: delay (ns) vs. line length (mm), power (mW) vs. line length (mm), energy/bit (pJ) vs. data rate (Mbps), and break-even data rate (Mbps) vs. line length (mm). Annotations: Line = 6 mm, Data Rate = 600 Mbps, 40%, 66.66 Mbps.]

The voltage-mode scheme was optimized for delay separately for every line length.

At least 7× lower power in the worst process corner.

78% gain in active area.

65% reduction in peak current.



Comparison with Existing Dynamic Overdriving CMS Schemes

Source               JSSCC 2006   CICC 2006   ESSCIRC 2005 (CMS-Fb)   This work   This work*
Sim./Measured        Meas.        Meas.       Meas.                   Meas.       Sim.
Tech.                130nm        250nm       130nm                   180nm       180nm
Line (mm)            10           5           10                      6           6
Gain in Delay        32%          28.3%       53%                     22.5%       32%
Gain in Energy/bit   35.48%       67%         25%                     81.0%       87%
Gain in EDP          56.5%        76.8%       65.5%                   85%         90%
Data Rate (Gbps)     3            2           0.7                     0.62        1
Activity α           1.0          1.0         NA                      1.0         1.0



Comparison With Voltage Mode Buffer Insertion

The proposed dynamic overdriving CMS scheme offers 26-40% improvement in delay over the voltage-mode scheme for 2 mm to 8 mm long lines.

It also offers an improvement in energy consumption over the buffer insertion scheme for lines longer than 2 mm operating at data rates of more than around 66 Mbps.

The proposed 6 mm long link reduces energy consumption by at least a factor of 7 compared to the voltage-mode scheme at 1 Gbps.

It offers 85% improvement in Energy Delay Product (EDP) over the voltage-mode scheme.



Comparison With Other Current Mode Schemes

The scheme proposed by us offers 22% improvement in Power Delay Product (PDP) over the current mode scheme with feedback proposed by Katoch et al.

The CMS scheme with feedback is sensitive to intra-die variations. Our CMS scheme remains faster than the logic circuit even in the presence of intra-die and inter-die process variations.



Measurement Results for Bidirectional Links

Measurement results match simulation results within 20%.

The voltage-mode bidirectional link was not put on silicon due to the limited number of pads.

Signaling Scheme   Delay (ns)   Power (µW)   PDP (mW×ns)   Data rate of measurement (Gbps)
CM-Bid             1.16         680          0.788         0.56



Matched Model Parameters

BSIM parameters corresponding to this run were extracted.

A few main model parameters (BSIM) were changed to define four process corners (FF, SS, FS, SF).

Main model parameters (BSIM) were adjusted to match Isat, Vth, Ioff and a few points on the measured Ids-Vgs characteristics of the devices fabricated in this process run.



Simulation with Matched Model Parameters

Parameters              TT       Measured   MMP     % Match

Basic Device Parameters
Isatn (mA)              6.23     6.44       6.43    99.8
Isatp (mA)              2.40     2.22       2.28    97.3
Vtn (mV)                501      510        506     99.2
Vtp (mV)                494      493        499     98.8
Ioffn (pA)              75       170        120     82.4
Ioffp (pA)              80       48         58      80.5

Idsn/Idsp @ Vgs
… (µA)                  66.6     65         66.4    …
… (µA)                  76.2     70         67.5    …
… (µA)                  154.4    150        145     …
… (µA)                  191      170        172     …
… (µA)                  347      330        317     …
… (µA)                  491      440        452     97.27



Measurement Results and Simulation Results with MMP

[Figure: delay (ns), power (µW) and PDP (×1e−12) plotted against Vdd (V) from 1.6 V to 1.8 V for CM-Bid (MMP), VM-Bid (MMP) and CMS-Bid (measured).]

Improvement in specs, for simulations using MMP:

Vdd (V)   Delay (%)   Power (×)   PDP (×)
1.6       36.8        4.5         7.2
1.7       34.4        4.39        6.8
1.8       34.21       4.01        6.0

Conclusion

Global interconnects form a major bottleneck for the performance of digital systems at scaled down technology nodes.

The use of current mode signaling is a promising way to remove this bottleneck.

Through simulation, circuit fabrication and actual measurements, we have demonstrated that current mode signaling has overwhelming advantages over the currently used voltage mode buffer insertion schemes.

We have demonstrated that the particular configuration suggested by us for a current mode scheme is superior to other current mode schemes.

Our scheme is robust with respect to batch-to-batch parametric variations and to on-chip parametric variation.

Therefore we assert that it is a practical option for use in modern systems for implementing both unidirectional and bidirectional data links.



Current Mode Interconnect

Marshnil Dave, Maryam Shojaei Baghini, Dinesh Sharma

Department of Electrical Engineering
Indian Institute of Technology, Bombay

December 2, 2010



Contents

1 Introduction
  1.1 Scaling
    1.1.1 Unscaled Interconnect Delay
  1.2 Buffer Insertion for Delay Reduction
    1.2.1 Optimum Buffer Insertion
  1.3 Concerns with Voltage mode Buffer Insertion Technique
    1.3.1 Timing closure
    1.3.2 Problem with bi-directional data transmission
    1.3.3 Signal Integrity
  1.4 Current signaling
    1.4.1 Zero input impedance circuit
  1.5 Other low impedance line terminations
    1.5.1 Digital Designers need not panic!
  1.6 Reduced swing signaling
  1.7 Improvement in Current Mode Signaling
    1.7.1 Inductive Peaking
    1.7.2 Simulation Results
    1.7.3 Dynamic Overdriving

2 Variation Tolerant Current Mode Signaling
  2.1 Need for Process Variation Tolerance
  2.2 Robustness requirements
    2.2.1 Effect of Process, Voltage and Temperature Variation
    2.2.2 Effect of common mode voltage mismatch
  2.3 System parameters affected by variations
  2.4 A brief review of Current Mode Signaling Schemes
    2.4.1 CMS Scheme with Feedback (CMS-Fb)
  2.5 Effect of Process Variations on different CMS Schemes
    2.5.1 CMS Scheme with Feedback (CMS-Fb)
    2.5.2 CMS Scheme with fixed pulse width (CMS-Fpw)
  2.6 The Proposed Variation Tolerant CMS Scheme
  2.7 Performance Evaluation
  2.8 Bidirectional Links
    2.8.1 Simulated Performance of Bidirectional Link



Chapter 1

Introduction

1.1 Scaling

VLSI technology has used device scaling to continually improve the performance of circuits. In constant field scaling, all device dimensions as well as all voltages are scaled down by some factor S. This leads to improved packing density (↑ S²), improved speed (delay ↓ S), and improved power consumption (↓ S²). However, these improvements apply only to active circuits. What about passive components?

1.1.1 Unscaled Interconnect Delay

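Consider an interconnect in a chip. This is made of a metal layer of thickness tm running over an insulator of thickness ti.

Figure 1.1: Delay through an Interconnect (wire of length L, width W and thickness tm over an insulator of thickness ti)

$$R = \frac{\rho L}{W t_m}, \qquad C = \frac{\epsilon L W}{t_i}$$

$$\text{Charge Time} \approx RC = \frac{\rho \epsilon L^2}{t_m t_i} \tag{1.1}$$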

To first order, delay is independent of W. This is because increasing W reduces resistance but increases capacitance in the same ratio. Unfortunately W is the only parameter that the circuit designer can decide! (L is fixed by the distance between the points to be connected; ρ, ε, tm and ti are decided by the technology).
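The width independence of eqn 1.1 can be checked numerically. The following is a minimal sketch using the parallel-plate expressions for R and C given above; the material constants and dimensions are illustrative assumptions (roughly aluminium over silicon dioxide), not values taken from these notes.

```python
def wire_rc_delay(rho, eps, length, width, t_metal, t_insulator):
    """First-order RC charge time of a wire (eqn 1.1), SI units throughout."""
    resistance = rho * length / (width * t_metal)          # R = rho L / (W tm)
    capacitance = eps * length * width / t_insulator       # C = eps L W / ti
    return resistance * capacitance


if __name__ == "__main__":
    # Assumed illustrative values, not from the notes.
    rho = 2.7e-8               # ohm.m, approximately aluminium
    eps = 3.9 * 8.854e-12      # F/m, approximately silicon dioxide
    t_metal = 0.5e-6           # m
    t_insulator = 0.5e-6       # m

    for width in (0.25e-6, 0.5e-6, 1.0e-6):
        delay = wire_rc_delay(rho, eps, 1e-3, width, t_metal, t_insulator)
        print(f"W = {width * 1e6:.2f} um -> RC = {delay * 1e12:.3f} ps")

    # Doubling L quadruples the delay, as expected from the L^2 dependence.
    d1 = wire_rc_delay(rho, eps, 1e-3, 0.5e-6, t_metal, t_insulator)
    d2 = wire_rc_delay(rho, eps, 2e-3, 0.5e-6, t_metal, t_insulator)
    print(f"delay ratio for 2x length: {d2 / d1:.2f}")
```

The model ignores fringing and coupling capacitance, so it only illustrates the scaling argument and should not be read as a delay estimate for a real process.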

If we see the distribution of wire lengths on a design, there are a large number of wires with short lengths which connect a gate to the other locally. At the same time, there is a considerable number of much longer wires which run over the entire chip. These include clocks, power on reset signals, power supply lines, data buses etc. These are the global interconnects.

Figure 1.2: Notional distribution of wire lengths on a chip (relative frequency versus normalized wire length)

While local interconnects scale with device size, global interconnects scale with die size. From eqn 1.1

$$\text{Interconnect Delay} = \frac{\rho \epsilon}{t_m t_i} L^2 \equiv A L^2 \tag{1.2}$$

For local interconnects, L scales the same way as tm and ti, so delay is invariant. However, even as the transistor sizes are scaled down as the technology advances, average chip sizes show an increasing trend. This is because the complexity of systems that we put on integrated circuits has increased at a rate higher than the rate at which device geometries shrink. Therefore, for global interconnects, L goes up with die size, while tm and ti scale down. This leads to a sharp increase in delay.

1.2 Buffer Insertion for Delay Reduction

Global Interconnect delay can be the determining factor for the speed of an integrated system. The L² dependence of interconnect delay is a source of particular concern. This problem can be somewhat mitigated by buffer insertion in long wires. We define some critical wire length L′ and when a wire segment exceeds this length, we insert a buffer.

1.2.1 Optimum Buffer Insertion

What is the optimum wire length after which we should insert a buffer? Consider a long wire in which we insert buffers after every segment of length L′. From eqn 1.2,

$$\text{Segment wire delay} = \frac{\rho \epsilon L'^2}{t_m t_i} = A L'^2$$

Let buffer delay = τ. For n segments, there will be n − 1 buffers, and L = nL′.

Figure 1.3: A buffered interconnect line (each segment has length L′)

If the total delay is denoted by ∆,

$$\Delta = n A L'^2 + (n-1)\tau = \frac{L}{L'} A L'^2 + \left(\frac{L}{L'} - 1\right)\tau = A L L' + \left(\frac{L}{L'} - 1\right)\tau$$

Putting the derivative with respect to L′ = 0 for optimization,

$$A L - \frac{L}{L'^2}\tau = 0, \quad \text{so} \quad A L'^2 = \tau \tag{1.3}$$

Since AL′² is the wire delay for the segment, this equation tells us that L′ should be so chosen that the wire segment delay = τ. With this choice every segment and every buffer contributes a delay τ, so the total delay is proportional to n and is therefore linear in L.
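The optimum of eqn 1.3 can be illustrated with a short numerical sketch. The values of A and τ below are assumptions chosen only for illustration, and the number of segments n = L/L′ is treated as a continuous quantity, as in the derivation above.

```python
import math


def optimum_segment_length(a_coeff, tau):
    """Segment length L' that satisfies A L'^2 = tau (eqn 1.3)."""
    return math.sqrt(tau / a_coeff)


def buffered_delay(a_coeff, tau, total_length, segment_length):
    """Total delay A L L' + (L/L' - 1) tau of a buffered line."""
    n_segments = total_length / segment_length   # treated as continuous
    return a_coeff * total_length * segment_length + (n_segments - 1.0) * tau


def unbuffered_delay(a_coeff, total_length):
    """Delay A L^2 of the same wire without buffers (eqn 1.2)."""
    return a_coeff * total_length ** 2


if __name__ == "__main__":
    a_coeff = 3.7e-6     # s/m^2, assumed value of A
    tau = 50e-12         # s, assumed buffer delay
    total_length = 20e-3  # m, assumed wire length

    l_opt = optimum_segment_length(a_coeff, tau)
    print(f"optimum segment length: {l_opt * 1e3:.2f} mm")
    print(f"unbuffered delay: {unbuffered_delay(a_coeff, total_length) * 1e12:.0f} ps")
    for scale in (0.5, 1.0, 2.0):
        d = buffered_delay(a_coeff, tau, total_length, scale * l_opt)
        print(f"L' = {scale:.1f} x optimum -> delay {d * 1e12:.0f} ps")
```

In a real design n must be an integer and buffer placement is constrained by the floor plan, which is one of the concerns discussed in the next section.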

1.3 Concerns with Voltage mode Buffer Insertion Technique

Currently, buffer insertion is the most widely used method to control interconnect delay. However, there are several difficulties with buffer insertion. Buffers consume power and silicon area. Also, we normally do floor planning and layout first and then put in the interconnects. When the wire length reaches L′, we need to put in a buffer. However, it is quite possible that at this point, there is active circuitry underneath, and there is no room to put in a buffer! Then we either have to live with buffer insertion at non-optimal wire lengths or create space by pushing out existing cells and modifying the layout.

1.3.1 Timing closure

Global interconnects are placed after active circuit design and layout is complete. One has to anticipate the wire length, and then design the active circuits to meet total delay specifications. If the actual wire length is different from what was anticipated, one has to re-design the active circuits after layout. After a fresh layout, wire lengths and hence, delays are changed. This leads to a design-layout-redesign iteration known as Timing Closure. This iteration becomes longer and longer when total delays are dominated by interconnect delay.




1.3.2 Problem with bi-directional data transmission

Global interconnects often include data buses, which may require bidirectional data transmission (for example, a bus connecting a processor and memory). However, buffer insertion fixes the direction of data flow! Therefore, if we need bidirectional transmission, we need to replace buffers with bidirectional transceivers. These require a direction signal, which will enable the buffers pointing in the desired direction. This direction signal must also be routed with the bus (and should have its own buffers) and it should reach the bidirectional buffers ahead of the data.

1.3.3 Signal Integrity

As interconnect wire separation is reduced, there is a serious signal integrity problem because of electrostatic coupling between long wires. Inter-signal interference can lead to unpredictable delay variations. Grounded shielding wires must often be inserted to avoid interference. This leads to extra capacitance and CV²f power loss.

1.4 Current signaling

Because of these problems with voltage mode signaling, we propose that 1's and 0's be signaled by the presence or absence of a current and not by a high or a low voltage. This has several advantages:

• Current rise time is limited by inductance rather than capacitance. Typically, inductive effects are much smaller than capacitive effects. (After all, relative permittivity ε ≃ 4 and relative permeability µ = 1 for insulators used in IC's). So electromagnetic coupling is lower than electrostatic coupling.

• Signal voltage swings are limited by scaled down supply voltages: this does not restrict current swings.

• In fact, we can use multiple current values to send more than one bit down the same wire!

If we hold the voltage on the interconnect nearly constant, dynamic power will be negligible and latency will be much lower.

We also have the option of using multiple current levels to transmit multiple bits simultaneously. This can give higher throughput and lower interconnect area.

Current mode transmission offers the possibility of improving latency, throughput and power simultaneously!

Since ∆V → 0 while ∆I ≠ 0, we need a receiver with a low (near 0) input impedance.




1.4.1 Zero input impedance circuit

Low rin amplifiers are used for photo-detectors [?]. One such configuration is shown below:

Figure 1.4: Low input impedance Beta Multiplier Circuit (transistors Mp1, Mp2, Mn1, Mn2; reference Vref; input voltage v; node voltages v1, v2; branch currents i1, i2)

This circuit uses complementary current mirrors feeding each other. This configuration is also known as a beta multiplier. To derive its input impedance, we can write small signal currents and voltages as:

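$$i_1 = g_{mn1} v_1 = g_{mp1}(v - v_2), \qquad i_2 = g_{mn2} v_1 = -g_{mp2} v_2$$

$$v_2 = -\frac{g_{mn2}}{g_{mp2}} v_1 = -\frac{g_{mn2}}{g_{mp2}} \frac{i_1}{g_{mn1}}$$

$$i_1 = g_{mp1} v + \frac{g_{mn2}/g_{mn1}}{g_{mp2}/g_{mp1}}\, i_1$$

We define

$$\Gamma \equiv \frac{g_{mn2}/g_{mn1}}{g_{mp2}/g_{mp1}} \tag{1.4}$$

then $i_1 (1 - \Gamma) = g_{mp1} v$, which gives

$$r_{in} = \frac{1 - \Gamma}{g_{mp1}} \tag{1.5}$$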

By making Γ close to 1, we can reduce the input impedance to 0. In fact we can set the input impedance to any value (for example, the characteristic impedance of a transmission line) by a proper choice of Γ and gmp1. However, we should make sure that Γ does not exceed 1, because that will lead to a negative input impedance, and instability. Therefore it is of some interest to determine how accurately we may set the value of Γ in spite of power supply, process and temperature variations.




Robustness of design

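In saturation,

$$I_d = \frac{1}{2} \mu C_{ox} \frac{W}{L} (V_g - V_T)^2$$

$$\text{So, } g_m = \mu C_{ox} \frac{W}{L} (V_g - V_T) = \sqrt{2 \mu C_{ox} \frac{W}{L} I_d}$$

$$\frac{g_{mn2}}{g_{mn1}} = \sqrt{\frac{(W/L)_{n2}}{(W/L)_{n1}} \, \frac{I_2}{I_1}}, \qquad \frac{g_{mp2}}{g_{mp1}} = \sqrt{\frac{(W/L)_{p2}}{(W/L)_{p1}} \, \frac{I_2}{I_1}}$$

$$\text{Therefore } \Gamma \equiv \frac{g_{mn2}/g_{mn1}}{g_{mp2}/g_{mp1}} = \sqrt{\frac{(W/L)_{n2}/(W/L)_{n1}}{(W/L)_{p2}/(W/L)_{p1}}} \tag{1.6}$$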

This means that Γ depends only on transistor geometries and is independent of supply voltage, bias values, transistor parameters or temperature. This enables us to choose a value of Γ very close to 1, which in turn can provide very low input impedance.
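Equations 1.5 and 1.6 can be combined into a small calculation. The sketch below is only an illustration: the W/L ratios and the value of gmp1 are hypothetical, and the function names are not from the notes.

```python
import math


def gamma_from_geometry(wl_n1, wl_n2, wl_p1, wl_p2):
    """Gamma from W/L ratios of the four mirror transistors (eqn 1.6)."""
    return math.sqrt((wl_n2 / wl_n1) / (wl_p2 / wl_p1))


def input_resistance(gamma, gmp1):
    """Small signal input resistance r_in = (1 - Gamma) / gmp1 (eqn 1.5)."""
    if gamma > 1.0:
        # The notes warn that Gamma > 1 gives negative r_in and instability.
        raise ValueError("Gamma must not exceed 1")
    return (1.0 - gamma) / gmp1


if __name__ == "__main__":
    # Hypothetical geometries: the n mirror is ratioed slightly below unity.
    gamma = gamma_from_geometry(wl_n1=10.0, wl_n2=9.5, wl_p1=20.0, wl_p2=20.0)
    gmp1 = 200e-6   # siemens, assumed transconductance of Mp1
    print(f"Gamma = {gamma:.4f}")
    print(f"r_in  = {input_resistance(gamma, gmp1):.1f} ohm")
```

Because r_in is proportional to (1 − Γ), a small geometric ratio error translates into a large relative change in r_in when Γ is very close to 1, even though the absolute value stays small.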

Receiver Design - Input stage

Just by adding another current mirror transistor and a current to voltage converter, we can use the beta multiplier as a receiver for current mode data signaling.

Figure 1.5: A Beta Multiplier based Current Mode Receiver (transistors Mp1, Mp2, Mn1, Mn2; reference Vref; currents Iint and Iout)

The input resistance is controlled largely by the geometry of transistors. The beta multiplier also has the property that it drives its own input through a low output impedance to bring it to the same voltage as Vref. Thus the interconnect voltage is held fixed. The input resistance is largely insensitive to process variations. The only dependence comes through gmp1, but since it is multiplied by 1 − Γ which is close to 0, the sensitivity to variations is quite low.

1.5 Other low impedance line terminations

The beta multiplier is not the only choice for providing low input impedance. Simpler circuits like a diode connected MOS transistor are often used. Another option is to use an inverter with its output shorted to its input as the termination. This is equivalent to terminating the line to ground through a diode connected n channel transistor and to Vdd through a diode connected p channel transistor. The effective terminating admittance is the sum of gm values of n and p channel transistors.

Indeed in our later work, we have preferred a reference inverter with its output shorted to input as the line termination. Low input impedance can be achieved by adjusting the geometry of the p and n channel transistors. This termination is faster because of the absence of parasitic capacitances contributed by the beta multiplier transistors. The termination holds the line at a DC potential which is matched to the transition voltage of the amplifier inverter which follows the termination.

Figure 1.6: Alternative circuit for Low impedance Termination

1.5.1 Digital Designers need not panic!

We suggest that only the interface works in current mode. The rest of the circuit remains traditional.

A library circuit will do the voltage mode to current conversion (transmitter) and another will convert the current back to voltage mode (receiver).

To put this plan into action, we need a receiver with very low input impedance. (If inductive effects are to be taken into account, we would like to terminate the line into its characteristic impedance.)

1.6 Reduced swing signaling

The main advantage of the current mode signaling comes from the fact that the line voltage is held nearly constant. This is somewhat similar to low swing signaling in voltage mode. Low swing signaling in voltage mode involves driving high capacitive loads like interconnects to re-defined levels for 0 and 1 which drastically reduce the voltage swing on the load. The levels are restored to the usual CMOS levels at the receiver end by amplification. This can drastically reduce the power required by line drivers.

Figure 1.7: Reduced Swing Voltage Mode Signaling (low swing driver, line, buffer/amplifier)

It is important to distinguish between reduced swing voltage mode signaling and current mode signaling.

Figure 1.8: Current Mode signaling (low swing driver, line, receiver with termination RL)

• In reduced swing voltage mode signaling, the line is not terminated in a low impedance.

• Current mode signaling terminates the line in a low impedance.

• This reduces the time constant, increases bandwidth.

• However, this also leads to static power consumption.




1.7 Improvement in Current Mode Signaling

Traditional current mode signaling consumes static power and presents a trade-off between speed, static power and signal to noise ratio. Its performance can be improved by two techniques:

• Inductive Peaking

• Dynamic Over-driving

1.7.1 Inductive Peaking

On-chip interconnects can be modeled as distributed RC lines, which are essentially low pass filters. This results in severe attenuation of high frequency components of the signal arriving at the receiver end. This can be corrected by bandwidth enhancement techniques used in RF amplifiers. This involves inductive peaking, where the line termination circuit exhibits inductive input impedance. Current flowing through the inductor will produce a voltage (jωL)i, which increases with frequency. Thus, this can counteract the high frequency attenuation due to the line.

Figure 1.9: Inductively Terminated Line (driver, line modeled as repeated R0, C0 segments, termination of inductance L and resistance RL)

We performed simulations in which the interconnect line was represented by a realistic LCR segmented line. This was then terminated with resistive/inductive loads of different values. Results of the simulation are shown in fig. 1.10 for a 4 mm long line terminated with a 1 kΩ resistor in series with different inductance values. The transfer function of the terminated line is plotted as a function of frequency on a log-log scale in fig. 1.10 (a). For a given line length, the amount of bandwidth enhancement is a function of inductance and load resistance. The bandwidth increases with inductance up to a point and after that it remains fixed at that value. As can be seen, we can achieve enhancement of about 500 MHz in 3 dB bandwidth in this example for an inductive termination of 100 nH. (Because of the log scale, the separation between the curves does not truly reflect the amount by which the bandwidth has been increased). The bandwidth enhancement remains at roughly the same value for larger inductances. We designate the inductance at which the improvement in bandwidth saturates as Lpeak.

Figure 1.10: Effect of Inductive Termination on Bandwidth (panels (a) and (b))

As seen from fig 1.10 (b), the dependence on L is not very critical as long as the value is greater than Lpeak. The required inductance for significant enhancement in bandwidth, Lpeak, is of the order of a few hundreds of nano Henries. This cannot be conveniently made from spiral inductors etc. Therefore for a practical implementation, we need an active inductor.
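The peaking effect can be explored with a simplified frequency domain sketch. This is not the simulation used for fig. 1.10: the line is modeled here as a plain RC ladder (no line inductance), the per-segment values are assumptions, and the transfer function is taken to be the voltage across the termination per unit drive current, normalized to its DC value RL. Only the 1 kΩ and 100 nH termination values come from the text.

```python
import math


def mat_mul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def normalized_termination_gain(freq_hz, r_seg, c_seg, n_seg, r_load, l_load):
    """|V_load / I_drive| / r_load for an RC ladder ending in r_load + j w l_load."""
    omega = 2.0 * math.pi * freq_hz
    series_r = ((1.0, r_seg), (0.0, 1.0))                 # ABCD of a series resistor
    shunt_c = ((1.0, 0.0), (1j * omega * c_seg, 1.0))     # ABCD of a shunt capacitor
    segment = mat_mul(series_r, shunt_c)
    line = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n_seg):
        line = mat_mul(line, segment)
    c_entry, d_entry = line[1]
    z_load = r_load + 1j * omega * l_load
    # I_drive = C * V_load + D * I_load, with V_load = z_load * I_load
    v_per_i = z_load / (c_entry * z_load + d_entry)
    return abs(v_per_i) / r_load


if __name__ == "__main__":
    # Assumed line model: 40 segments of 10 ohm and 20 fF (400 ohm, 0.8 pF total).
    r_seg, c_seg, n_seg = 10.0, 20e-15, 40
    r_load = 1e3   # 1 kOhm, as in the text
    for l_load in (0.0, 100e-9):
        gains = [
            normalized_termination_gain(f, r_seg, c_seg, n_seg, r_load, l_load)
            for f in (50e6, 100e6, 200e6, 400e6)
        ]
        formatted = ", ".join(f"{g:.3f}" for g in gains)
        print(f"L = {l_load * 1e9:5.0f} nH -> gain at 50/100/200/400 MHz: {formatted}")
```

With these assumed values the inductive termination holds the response up near the corner frequency of the purely resistive case; the amount of improvement depends strongly on the line parameters, so the numbers should not be compared directly with fig. 1.10.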

Beta Multiplier: A Gyrator

The beta multiplier circuit suggested earlier for achieving low input resistance values can in fact be used to simulate inductances of required values. The Beta Multiplier essentially forms a gyrator circuit with two Gm elements connected back to back along with the parasitic capacitance of the transistors. So Beta Multiplier Circuits can exhibit inductive input impedance for some frequency range if designed properly.

(Figure: beta multiplier circuit with transistors Mp1, Mp2, Mn1, Mn2, reference Vref, input voltage v, node voltages v1, v2 and currents i1, i2)




Beta Multiplier: Input Impedance

The input impedance of the beta multiplier is calculated by taking parasitic capacitances into account.

Figure 1.11: Small Signal Equivalent Circuit of Beta Multiplier (elements Cg1, Cg2, Cg3, 1/gmn1, 1/gmp2 and ro_p1; controlled sources i1 = gmp1 (vint − vg2) and i2 = gmn2 vg1)

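We define:

$$\tau_1 \equiv \frac{C_{g1}}{g_{mn1}}, \qquad \tau_2 \equiv \frac{C_{g2}}{g_{mp2}}, \qquad R_1 \equiv \frac{1}{g_{mn1}}$$

$$\tau_3 \equiv C_{g3} r_{op1}, \qquad \tau_4 \equiv \frac{C_{g3}}{g_{mp1}}, \qquad R_3 \equiv r_{op1}$$

$$\gamma \equiv \frac{g_{mp1}/g_{mp2}}{g_{mn1}/g_{mn2}}, \qquad k \equiv \frac{R_1}{R_3}$$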

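Then the input impedance can be shown to be:

$$Z_{in} = \frac{(\tau_1 \tau_2 + k \tau_2 \tau_3) s^2 + (\tau_1 + \tau_2 + k(\tau_3 + \tau_2)) s + 1 + k - \gamma}{\left(g_{mp1} + \frac{1}{R_3}\right)(1 + \tau_1 s)(1 + \tau_2 s)(1 + \tau_4 s)} \tag{1.7}$$

Correspondingly, the resistive part of the input impedance can be expressed as:

$$R_{in} = \frac{(1 - \gamma) + \frac{1}{g_{mn1} r_{op1}}}{g_{mp1} + \frac{1}{r_{op1}}}$$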

Beta Multiplier : Equivalent Circuit

The nature of input impedance (inductive or capacitive) is determined by the relative location of poles and zeros. If the first zero occurs at least a decade prior to the first pole, the input impedance is inductive. To ensure that a zero occurs a decade prior to the first pole, we have to choose operating currents etc., such that $\gamma - \frac{1}{g_{mn1} r_{op1}} > 0.9$ and any two time constants are equal. Under these conditions, we may approximate the input impedance of the beta multiplier by the equivalent circuit shown in fig 1.12.

Figure 1.12: Equivalent circuit for the Beta Multiplier (elements Leq, Req and Ceq; input impedance Zin)

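Where

$$L_{eff} = \frac{r_{op1}}{g_{mp1} r_{op1} + 1} \left[ \frac{C_{g1}}{g_{mn1}} + \frac{C_{g2}}{g_{mp2}} + \frac{C_{g2}}{g_{mp2}\, g_{mn1}\, r_{op1}} + \frac{C_{g3}}{g_{mn1}\, g_{mp1}\, r_{op1}} \right] \tag{1.8, 1.9}$$

$$R_{eff} = \frac{(1 - \gamma) + \frac{1}{g_{mn1} r_{op1}}}{g_{mp1} + \frac{1}{r_{op1}}} \tag{1.10}$$

$$C_{eff} = K C_{gx} \tag{1.11}$$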
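To get a feel for the magnitudes involved, the expressions for Leff and Reff can be evaluated directly. The sketch below assumes that the prefactor rop1/(gmp1 rop1 + 1) multiplies all four capacitance terms of eqns 1.8 and 1.9, which is the dimensionally consistent reading; all device values are hypothetical.

```python
def beta_multiplier_equivalent(gmn1, gmn2, gmp1, gmp2, rop1, cg1, cg2, cg3):
    """Return (gamma, R_eff, L_eff) from the definition of gamma and eqns 1.8 to 1.10."""
    gamma = (gmp1 / gmp2) / (gmn1 / gmn2)
    r_eff = ((1.0 - gamma) + 1.0 / (gmn1 * rop1)) / (gmp1 + 1.0 / rop1)
    prefactor = rop1 / (gmp1 * rop1 + 1.0)          # units of ohm
    time_terms = (                                   # each term has units of seconds
        cg1 / gmn1
        + cg2 / gmp2
        + cg2 / (gmp2 * gmn1 * rop1)
        + cg3 / (gmn1 * gmp1 * rop1)
    )
    l_eff = prefactor * time_terms                   # ohm * s = henry
    return gamma, r_eff, l_eff


if __name__ == "__main__":
    # Hypothetical small signal values, chosen only for illustration.
    gamma, r_eff, l_eff = beta_multiplier_equivalent(
        gmn1=200e-6, gmn2=190e-6, gmp1=200e-6, gmp2=200e-6,
        rop1=500e3, cg1=5e-15, cg2=5e-15, cg3=5e-15,
    )
    print(f"gamma = {gamma:.3f}")
    print(f"R_eff = {r_eff:.1f} ohm")
    print(f"L_eff = {l_eff * 1e9:.1f} nH")
```

With transconductances of a few hundred microsiemens and gate capacitances of a few femtofarads, the effective inductance lands in the hundreds of nano Henries, which is the range the notes ask for.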

Beta Multiplier : Input Impedance Control

We are interested in using an inductor whose value should be in hundreds of nano Henries. We want to find if these values can be achieved under reasonable bias and geometry conditions. We therefore evaluated the input impedance of the beta multiplier under various operating conditions. As can be seen from the figure, the beta multiplier shows an effective inductance of hundreds of nano Henries for a practical range of input current and transistor geometries. Its effective resistance can be controlled by ratios of transconductances while its effective inductance depends on the absolute value of transconductance. It is possible to control Rin and Leff with very little interaction between the two. Inductance changes from 100 nH to 980 nH while the value of effective resistance remains within 12% of its nominal value for 20 µA change in the current.

Figure 1.13: Bandwidth enhancement with Beta multiplier termination

Current Mode Receiver Circuit with Beta Multiplier

Figure 1.14: Current mode receiver with inductive peaking using beta multipliers (source type and sink type beta multipliers built from Mp1, Mp2, Mn1, Mn2, Mp11, Mp22, Mn11, Mn22; input node, Vref, Vdd and inverting amplifier)

We can design a current mode receiver with inductive peaking using two beta multipliers as shown in fig. 1.14 above. One of the beta multipliers sources current while the other sinks current. The effective impedance offered by the receiver is equal to the parallel combination of the impedance offered by individual beta multipliers. Voltage at the input node swings around Vref. The small voltage swing on the line is sensed and amplified by the inverting amplifier. Vref is generated by shorting the input and output of an inverter to ensure that the value of Vref is the same as the switching threshold of the receiver amplifier across all process corners.

The rout of the Vref generation circuit comes in series with the beta multiplier Zin and hence the beta multiplier has to be sized accordingly. The Vref generation circuit consumes static power.




1.7.2 Simulation Results

To see the effectiveness of inductive termination, we should compare the power as well as speed of the voltage mode buffer insertion scheme, the diode connected MOS terminated current mode scheme and the beta multiplier based inductive peaking scheme. Simulations were performed for a 6 mm long line at a rate of 1 Gbps. Results of the comparison are summarized in the table below:

(line = 6 mm, power measured at 1 Gbps)

Signaling Scheme            Delay (ps)   Throughput (Gbps)   Power (µW)   Area (µm²)
CMS-BMul (30 mV) [1]        420          2.56                310          2.00
CMS-Diode-CC (30 mV) [2]    500          2.45                380          2.00
Voltage Mode                1000         2.85                3000         12.53

Inductive termination gives 16% improvement in delay and about 18% improvement in power compared to diode termination. Compared to the voltage mode scheme, we see more than 50% improvement in delay and an order of magnitude lower power [?, ?].
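The percentages quoted above follow directly from the delay and power columns of the table. The short check below recomputes them; the dictionary keys and function name are illustrative only.

```python
# Delay and power figures copied from the comparison table.
results = {
    "CMS-BMul": {"delay_ps": 420.0, "power_uW": 310.0},
    "CMS-Diode-CC": {"delay_ps": 500.0, "power_uW": 380.0},
    "Voltage Mode": {"delay_ps": 1000.0, "power_uW": 3000.0},
}


def improvement_percent(reference, value):
    """Percentage reduction of `value` relative to `reference`."""
    return 100.0 * (reference - value) / reference


if __name__ == "__main__":
    bmul = results["CMS-BMul"]
    for other in ("CMS-Diode-CC", "Voltage Mode"):
        ref = results[other]
        d = improvement_percent(ref["delay_ps"], bmul["delay_ps"])
        p = improvement_percent(ref["power_uW"], bmul["power_uW"])
        ratio = ref["power_uW"] / bmul["power_uW"]
        print(f"vs {other}: delay {d:.1f}% lower, power {p:.1f}% lower "
              f"(power ratio {ratio:.1f}x)")
```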

1.7.3 Dynamic Overdriving

Inductive peaking attempts to correct the low pass nature of the line by putting a high pass termination at the receiver end. However, by the time the signal reaches the receiver, its high frequency components have been severely attenuated. Therefore boosting them back to normal level will also boost high frequency noise.

Rather than boosting the high frequency components at the receiver end, why don't we boost them before attenuation at the transmitter itself? This technique of boosting the high frequency components before passing them through a low pass channel is known as "pre-emphasis".

Concept of Dynamic Overdriving/Pre-emphasis

Current mode transmission can be speeded up by using high drive current. However, this increases static power consumption. One possible solution is to dump high drive current only when the state of the line needs to be changed from 0 to 1 or from 1 to 0. When the line remains at 1 or at 0 from one bit to the next, we use a small drive current to maintain the line at the required voltage. This is called Dynamic Over Driving. Dynamic Overdriving essentially means amplifying high frequency components of the input signal.

The transmitter end contains a weak driver and a strong driver. The strong driver is enabled only when a level change is needed from 0 to 1 or from 1 to 0.

Weak Driver

The weak driver provides the minimal drive required to keep the line (terminated by low impedance) at the desired voltage level. When the input is 1, the p channel driver gate is low (enabled). This charges up the output. As the line voltage reaches VDD − VTp, the upper p channel transistor turns off, restricting line voltage swing in the up direction.

Figure 1.15: Steady State (Weak) Driver (input, p drive and n drive transistors, swing control (high) and swing control (low) transistors, VDD)

Similarly, when the input is 0 the n channel driver transistor is enabled by a high level at its gate. The transistor discharges the line. However, when the line voltage approaches VTn during discharge, the lower transistor turns off, stopping the discharging process.

Thus the line can only swing between VDD − VTp and VTn. [?]

Strong Driver

The strong driver should be enabled only when the input and the level on the output line do not represent the same logic. The feedback inverter acts as an inverting amplifier converting low swing logic levels on the wire to full swing (inverted) CMOS logic level on its output. The P channel gate is low (enabled) only when both inputs to the NAND are 1. This will happen only when the input is high AND the line is at 0. This is indeed the condition when we want the strong driver to charge the line.

Figure 1.16: Dynamic (Strong) Driver (input, feedback from the wire, VDD)

The N channel gate is high (enabled) only when both inputs to the NOR gate are 0. This will happen only when the input is low AND the line is at 1.

Notice that the input to the feedback inverter is a low swing level around VDD/2. Therefore it consumes static power.

The action of the strong driver is self limiting. This is because both NAND and NOR receive the input and the inverted logic level of the line. If the input and the logic level of the line are the same, NAND and NOR are fed with the input and its complement. Thus one of the inputs to NAND/NOR is 1, while the other is 0. This ensures that the output of NAND is 1, while that of NOR is 0, so that both the p and n channel transistors are OFF. Therefore the strong driver does not need a series transistor as was the case for the weak driver.

When the Input = 1 and Wire voltage < Vm,
the inverter output = 1, NAND output = 0 and NOR output = 0.
The P channel driver is ON and dumps current to charge the line.

When the Input = 0 and Wire voltage > Vm,
the inverter output = 0, NAND output = 1 and NOR output = 1.
The N channel driver is ON and sinks current to discharge the line.

As soon as the low swing logic level on the line becomes equal to the logic level at the input,
the inverter output = complement of the input,
and so NAND output = 1, NOR output = 0,
which disables both drive transistors automatically.
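The self limiting behaviour can be checked with a purely logical model of the strong driver. The sketch below treats the line as a single bit (1 when the wire voltage is above Vm, 0 otherwise), which is an idealization; the function name is illustrative.

```python
def strong_driver_state(input_bit, line_bit):
    """Return (p_on, n_on) for the NAND/NOR controlled strong driver.

    line_bit is 1 when the wire voltage is above the switching point Vm
    of the feedback inverter, and 0 otherwise.
    """
    feedback = 1 - line_bit                  # feedback inverter output
    nand_out = 1 - (input_bit & feedback)    # drives the p channel gate
    nor_out = 1 - (input_bit | feedback)     # drives the n channel gate
    p_on = nand_out == 0                     # p channel conducts when its gate is low
    n_on = nor_out == 1                      # n channel conducts when its gate is high
    return p_on, n_on


if __name__ == "__main__":
    for input_bit in (0, 1):
        for line_bit in (0, 1):
            p_on, n_on = strong_driver_state(input_bit, line_bit)
            print(f"input={input_bit} line={line_bit} -> p_on={p_on} n_on={n_on}")
```

The table printed by this model shows that a drive transistor is on only in the two rows where the input and the line disagree, and that the two transistors are never on together.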




Dynamic Overdriving with Inductive termination?

Dynamic Overdriving (DOD) and inductive line termination both essentially amplify high frequency components of the input signal. Can we use both?

Figure 1.17: Current drive from a Dynamic Over Drive (DOD) type transmitter

To answer this question, the following four current mode signaling schemes were simulated:

• CMS Scheme with DOD and Resistive Load

• CMS Scheme with Simple Driver and Resistive Load

• CMS Scheme with Simple Driver and Inductive Load

• CMS Scheme with DOD and Inductive Load

The Dynamic Overdriving driver was implemented by an ideal voltage controlled current source (VCCS) with the output current wave shape as shown in fig 1.17. The simple driver was implemented as a Voltage Controlled Current Source with a square output current wave shape. The drive current in this case is −Iavg for a 0 at the input and +Iavg for a 1 at the input. For a fair comparison, Iavg for the simple driver is equal to the weighted mean of the current used for the dynamic overdrive transmitter.

$$I_{avg} = \frac{I_{peak}\, t_p + I_{static}\,(t - t_p)}{t} \tag{1.12}$$
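Equation 1.12 is a time weighted mean and is easy to evaluate. In the sketch below, t is assumed to be the bit period and tp the duration of the overdrive pulse; Ipeak = 500 µA is the large overdrive value used later in the comparison, while the static current, pulse width and bit period are assumed values.

```python
def average_drive_current(i_peak, i_static, t_pulse, t_bit):
    """Weighted mean drive current of a dynamic overdrive transmitter (eqn 1.12)."""
    if not 0.0 <= t_pulse <= t_bit:
        raise ValueError("pulse width must lie between 0 and the bit period")
    return (i_peak * t_pulse + i_static * (t_bit - t_pulse)) / t_bit


if __name__ == "__main__":
    i_avg = average_drive_current(
        i_peak=500e-6,    # A, large overdrive case from the comparison
        i_static=20e-6,   # A, assumed static current
        t_pulse=100e-12,  # s, assumed overdrive pulse width
        t_bit=1e-9,       # s, assumed bit period (1 Gbps)
    )
    print(f"I_avg = {i_avg * 1e6:.1f} uA")
```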

For this comparison, we used terminations of

RL = 4kΩ, L = 4µH




Comparison of Delay

With Large Overdrive (Ipeak = 500µA)

• Dynamic overdriving shows 5× improvement in delay over RC

• Inductive peaking does not offer substantial additional advantage when combined with dynamic overdriving.

• Inductive peaking alone shows 25% improvement in delay over RC

With Small Overdrive (Ipeak = 50µA)

• Dynamic overdriving alone and inductive peaking alone give nearly the same delay

• Inductive peaking along with dynamic overdriving shows around 20% improvement in delay over dynamic overdriving alone




Comparison of Throughput (Eye-opening)

We apply a random sequence of bits to the input at a given data rate and observe the waveform at the receiver. The waveform, when observed for two clock periods, looks like a pair of eyes and is known as the "eye diagram". Wide open eyes in the vertical direction represent good signal to noise ratio, as the '1' level and the '0' level are well separated. Good eye opening in the time direction represents low timing jitter in the arrival time of bits, which is also a desirable feature.

As the data rate is increased, the eye closes in the vertical direction, as there is not sufficient time for the driver to charge/discharge the line. Assuming that the receiver is capable of resolving a 30 mV input to a full rail to rail swing output, we determine the data rate at which the eye opening is reduced to 30 mV. This is the maximum throughput which can be supported by the interconnect. Using this criterion, we can now compare the throughput for the different schemes. We find that

• Dynamic overdriving improves throughput by 5× over RC

• Inductive peaking does not offer substantial additional advantage when combined with dynamic overdriving.

• Inductive peaking shows throughput enhancement of 26% over RC

Conclusion: Inductive Peaking vs Dynamic Overdrive

• For very high data rate applications, dynamic overdriving alone should be employed as inductive peaking does not offer any additional advantages

• For low power and low data rate applications, the use of inductive peaking can give 26% improvement in throughput and 16% improvement in delay over RC.

• For low power and low data rate applications, the use of dynamic overdrive along with inductive peaking can further improve the throughput by 20%




Figure 1.18: Eye diagram for different schemes at data rates where the eye opening is ≈ 32mV




Chapter 2

Variation Tolerant Current Mode Signaling

2.1 Need for Process Variation Tolerance

Current mode signaling derives its advantages over voltage mode due to the reduced swing on the line. Careful design is necessary, otherwise small changes in device parameters can have a disproportionate effect on the performance of the system. In modern short channel processes, variations in transistor parameters are large; some of the parameters can vary by as much as 40% of their nominal values. We have to design circuits so that they are robust with respect to batch-to-batch variations, as well as variations between devices on the same die. Batch-to-batch or inter-die variations can shift operating points and drive strengths, while intra-die variations cause mismatch in parameters of transmitter and receiver transistors.

2.2 Robustness requirements

Process, Supply Voltage and Temperature (PVT) variations will affect the core logic as well as data communication circuitry. The requirement for data transmission is therefore not of complete invariance with respect to PVT variations. We have to ensure that throughput and delay properties of the interconnect are at least as good as data generation and clock rates. Thus the deterioration in interconnect properties should be no worse than the deterioration in general logic.

2.2.1 Effect of Process, Voltage and Temperature Variation

Due to process, voltage and temperature variations, the drive capabilities and operating points of various circuits used for data transmission will vary. The cumulative effect of all these variations on the performance of the interconnect scheme needs to be evaluated.

2.2.2 Effect of common mode voltage mismatch

Because global interconnects, by definition, connect remote points on the die, on chip variations can, in fact, be of even greater concern. On chip variations will result in different common mode voltages at the transmitter and the receiver end. In case of ideal match, small fluctuations in line voltage are converted to rail to rail swing by the receiver. If, however, the mismatch is large, the small swing on the line may be completely ignored by the receiver. It is important, therefore, that the amount of swing on the line is much more than the mismatch in common mode voltages. But high swing will cause power dissipation. Therefore, it is better to have smart bias circuits, which will reduce mismatch and the need for a large swing.

Figure 2.1: Mismatched common mode voltages at Transmitter and Receiver (transmitter swing against the receiver common mode voltage, ideal and misaligned cases)

2.3 System parameters affected by variations

Variations in the following parameters have a strong influence on the performance of the signaling scheme:

1. Ipeak: Peak current supplied by the strong driver during input transition

2. tp: Duration for which the strong driver is ON




3. ∆V : Line voltage swing at the receiver end in steady state

4. Mismatch between VCMRx and operating point of an amplifier

2.4 A brief review of Current Mode Signaling Schemes

Several current mode signaling schemes have been suggested in the literature. We shall concentrate on three schemes here.

2.4.1 CMS Scheme with Feedback (CMS-Fb)

This scheme uses feedback at both the transmitter and the receiver ends to adjust the operating points of these circuits [?]. The transmitter used by this scheme is shown below:

Figure 2.2: Transmitter used by CMS scheme with feedback (input, strong driver, weak driver, feedback from the wire, VDD)

The feedback inverter converts low swing logic levels on the line to full rail to rail CMOS levels. The NAND/NOR gates ensure that the strong driver is turned on only during data transitions and is turned off as soon as the line crosses the switching point of the feedback inverter to make the logic level on the line equal to the input. The weak driver supplies Istatic and the line voltage swing at the receiver end is VCMRx ± Istatic RL. The receiver also uses feedback to adjust its common-mode voltage.

25



2.5 Effect of Process Variations on different CMS Schemes

2.5.1 CMS Scheme with Feedback (CMS-Fb)

Figure 2.3: Current Mode Scheme with Feedback (CMS-Fb)

Effect of Inter-die Process Variations on CMS with feedback

• Variations in Ipeak are well compensated due to the feedback at the driver end.

• If the driver is weaker due to process variations, the feedback system keeps it on for longer, until the line reaches the desired voltage.

• This might, however, not be optimum from a power point of view.

Effect of Intra-die Process Variations on CMS-Fb

If the VCMTx for the feedback inverter at the transmitter end is not the same as the VCMRx for the receiver amplifier, this scheme does not work very well. Take the case where VCMTx at the transmitter end is lower than the VCMRx at the receiver end. During low-to-high transitions, the strong driver will be turned off well before the line voltage crosses VCMRx. This can result in very slow charging of the line after the strong driver is turned off, leading to a low throughput. In an extreme case, the line voltage may never reach VCMRx, leading to malfunction.

Figure 2.4: Mismatched common mode voltages at Transmitter and Receiver

The same phenomenon will occur for the high-to-low transition if VCMTx > VCMRx.

2.5.2 CMS Scheme with fixed pulse width (CMS-Fpw)

[Figure: CMS-Fpw scheme: input, fixed width pulse generator (delay), strong driver, weak driver, wire, and receiver with load RL biased at Vcm Rx]

• tp is given by delay element

• Less sensitive to intra-die variations

• In the skewed corners, the sourcing Ipeak and the sinking Ipeak are different, leading to different rise and fall delays

• Throughput can degrade significantly in skewed corners

[?]


2.6 The Proposed Variation Tolerant CMS Scheme

Minimizing Process Dependence

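To minimize process dependence, we need smart bias circuits which sense the process corner and adjust the bias to compensate for variations.

[Figure: bias generators built from inverters with input and output shorted: a short pMOS with a long nMOS generates Vbp, and a long pMOS with a short nMOS generates Vbn]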

• Long channel transistors show relatively less variation with process than short channel transistors in the same process.

• We can make use of this difference to design a bias generator which senses the process corner and tries to increase the transistor current in the slow corners and to decrease it in the fast corners.

• Simple bias generators which use this feature, built from inverters with input and output shorted, are shown here.

Proposed CMS Scheme with Smart Bias

We propose a Dynamic Overdrive scheme in which both the strong and the weak drivers use constant current sources controlled by process aware bias generators.

[Figure: proposed CMS scheme with smart bias: p and n bias generators (short/long pMOS and nMOS pairs) produce Vbp and Vbn for the strong and weak drivers; a delay element, the wire, and a receiver (Rx) with receiver bias and inverter amplifier complete the link from Input to Output]

• There is no feedback inverter in the driver circuit

• Bias voltages change in the desired direction to keep the current through the weak and strong drivers the same across all corners

Effect of Process Variation on the Proposed CMS Scheme

• Ipeak remains nearly the same across all corners. In the extreme corners, SS and FF, the small change in Ipeak is compensated by the opposite change in tp.

• ∆V = Istatic·RL remains the same across all corners, where RL = 1/(gmn + gmp).

• The inverter with input-output shorted and the inverter amplifier are designed using fingers and placed close to each other so that their switching thresholds are closely matched across all corners.

• This makes the proposed circuit less sensitive to intra-die process variations as well.

2.7 Performance Evaluation

Simulation Setup

• Foundry specified four corner model files and a mismatch model file for Monte Carlo simulations were used.

• All the signaling schemes offer the same input capacitance (equivalent to one minimum sized inverter).

• All signaling schemes drive an FO4 load.

• Line RLC values used were: Rline = 244 Ω/mm, Lline = 1.5 nH/mm, Cline = 201 fF/mm.

• All schemes were designed for a throughput of 2.65 Gbps.

• Current mode schemes are designed for Ipeak = 500 µA

Effect of Intra-die Process Variations

Mismatch in Vm of an inverter can be up to 40 mV.¹ For a mismatch of 40 mV in the Vm value of the inverters, the percentage degradation is:

CMS system    Delay    Throughput
CMS-Fb        25       33
CMS-Fpw       10       14
CMS-Bias      4        9.5

¹ Mismatch data sheet from the foundry



Effect of Inter-die Process Variations

Signaling System /       Percentage Degradation
Logic Circuit            SS       SNFP     FNSP
CMS-Fb                   17.5     5.7      2.9
CMS-Fpw                  32       33.6     34.9
CMS-Bias                 18.75    8.2      7.14
Voltage Mode             27       < 1      2.8
Ring Oscillator Freq     23       2.88     3

• Interconnects with the CMS-Fpw scheme become the bottleneck in the overall performance of the chip in skewed corners

• Degradation in the throughput of the proposed scheme in the skewed corners is around 7%, which is less than that in the CMS-Fpw scheme

Overall Comparison

Performance Comparison of four signaling schemes (line = 6 mm, power measured at 1 Gbps)

CMS-Fb (90 mV)    700     2.56    146    2.00
CMS-Fpw           503     2.65    114    2.40
Proposed CMS      490     2.56    113    3.07
Voltage Mode      1100    2.85    655    12.53

• The CMS-Fb scheme consumes higher power than the other schemes due to static power consumption in the feedback inverter

• The proposed scheme shows a 78% improvement in area over the voltage mode scheme, whereas the other schemes, CMS-Fb and CMS-Fpw, show 84% and 80% respectively

[Figure: performance plots for the signaling schemes DOD-Fb+Rx-Fb [1], DOD-Fpw+Rx-Fb [2], DOD-Fpw+Rx-BMul [3], Proposed and Voltage Mode. Panels (a) to (f) plot power (µW) and delay (ns) versus line length (mm), data rate (Mbps) versus line length (mm), and energy (pJ) and power (µW) versus data rate (Mbps).]

2.8 Bidirectional Links

In many applications, on-chip buses need to carry signals in both directions, for example the bus between a processor and memory, or between the main processor and a floating point multiplier.

Bidirectional buffers with direction control are often used for this.

Limitations of Conventional Bidirectional Buffer

Back-to-Back Connected Tri-state Buffers

[Figure: wire segments joined by back-to-back connected tri-state buffers, each with an enable input En; En = signal direction]

• One of the two tristate buffers is enabled at a given time

• Two transistors in stack ⇒ increased sizes of PMOS and NMOS

• Delay of a bidirectional repeater is more than that of a unidirectional buffer

• Direction control signal is required by each repeater

• Buffers offer huge load to direction control signal

• Buffers carrying direction control signal consume additional power

We need a repeaterless Signaling Scheme

The Proposed Current Mode Bidirectional Link

• Employs only two bidirectional transceivers, one at each end of the line.

• Direction signal is required only at two ends of the line

• The direction control signal can be the same as one of the control signals, or derived from it based on the communication protocol

• Assumption: the direction signal (Tx/Rx) is locally available at both ends before data transmission starts



Proposed Current-Mode Transceiver

[Figure: proposed current-mode transceiver. Transmitter part: delay element, weak and strong drivers biased by Vbp and Vbn, with inputs In, Tx_ip_0 and Tx_ip_1. Receiver part: terminator and inverter amplifier producing Data out. Both parts connect to the wire and are controlled by Tx/Rx.]

Either the transmitter part or the receiver part is enabled at a time

2.8.1 Simulated Performance of Bidirectional Link

Speed-Power of Proposed Bidirectional CMS Scheme

Current-Mode Vs. Voltage-Mode

[Figure: current-mode (CM-Bid) versus voltage-mode (VM-Bid) bidirectional links. Panels (a) to (d) plot delay (ns) and power (µW) versus line length (mm) at a data rate of 500 Mbps, crossover data rate (Mbps) versus line length (mm), and power (µW) versus data rate (Mbps) for a 4 mm line.]

• 35% improvement in delay for nearly all line lengths

• 1.7× lower power for 2mm lines and 7× lower power for 8mm line

• Power crossover frequency 100Mbps for 4mm long lines

• 5× reduction in power at 1 Gbps

• For lines longer than 2 mm communicating at data rates above 180 Mbps, the proposed scheme consumes less power than voltage mode

Designed in 180nm for Vdd=1.8V using nominal Vt devices

Line Characteristics: R=211Ω/mm and C=0.245pF/mm




Logic Design Styles

Dinesh Sharma

June 1, 2006

CMOS Static Logic
Pseudo nMOS Design Style
Complementary Pass gate Logic
Cascade Voltage Switch Logic
Dynamic Logic





A simple model

[Figure: drain current (mA) versus drain voltage (V) for gate voltages Vg = 1.5, 2.0, 2.5, 3.0 and 3.5 V]

For $V_{gs} \le V_T$:

$$I_{ds} = 0$$

For $V_{gs} > V_T$ and $V_{ds} \le V_{gs} - V_T$:

$$I_{ds} = K\left[(V_{gs} - V_T)V_{ds} - \frac{1}{2}V_{ds}^2\right]$$

For $V_{gs} > V_T$ and $V_{ds} > V_{gs} - V_T$:

$$I_{ds} = \frac{K(V_{gs} - V_T)^2}{2}$$

This model assumes the current to be independent of $V_{ds}$ in the saturation region. (This is somewhat oversimplified.)
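The three regions of the simple model can be written as one small function. This is only a sketch: the function name and the numerical values of K and VT are assumptions chosen for illustration, not values from these notes.

```python
def ids_simple(vgs, vds, k, vt):
    """Drain current of the simple square-law model.

    k is the transconductance parameter K (A/V^2), vt the threshold voltage (V).
    """
    vov = vgs - vt                 # gate overdrive, Vgs - VT
    if vov <= 0:
        return 0.0                 # cut-off: Vgs <= VT
    if vds <= vov:
        # linear regime: Vds <= Vgs - VT
        return k * (vov * vds - 0.5 * vds ** 2)
    # saturation: current independent of Vds in this model
    return 0.5 * k * vov ** 2


if __name__ == "__main__":
    K, VT = 0.4e-3, 1.0            # assumed illustrative values
    for vds in (0.5, 1.0, 2.5, 4.0):
        print(vds, ids_simple(3.5, vds, K, VT))
```

The linear and saturation expressions give the same current at Vds = Vgs − VT, so the modelled curve is continuous at the boundary.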






A more realistic model

[Figure: drain current (mA) versus drain voltage (V)]

Let the ‘Early Voltage’ be $V_E$. Define

$$V_{dss} \equiv V_E\left(\sqrt{1 + \frac{2(V_{gs} - V_T)}{V_E}} - 1\right) \simeq (V_{gs} - V_T)\left(1 - \frac{V_{gs} - V_T}{2V_E}\right)$$

and

$$I_{dss} \equiv K\left[(V_{gs} - V_T)V_{dss} - \frac{1}{2}V_{dss}^2\right]$$

For $V_{gs} > V_T$ and $V_{ds} \le V_{dss}$:

$$I_{ds} = K\left[(V_{gs} - V_T)V_{ds} - \frac{1}{2}V_{ds}^2\right]$$

For $V_{gs} > V_T$ and $V_{ds} > V_{dss}$:

$$I_{ds} = I_{dss}\,\frac{V_{ds} + V_E}{V_{dss} + V_E}$$
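A short sketch of this model in code can help check that the definition of Vdss makes both the current and its slope continuous at the boundary between the two regimes. The function names and the parameter values (K, VT, VE) are assumptions for illustration only.

```python
import math


def vdss(vov, ve):
    """Saturation voltage for overdrive vov = Vgs - VT and Early voltage ve."""
    return ve * (math.sqrt(1.0 + 2.0 * vov / ve) - 1.0)


def ids_early(vgs, vds, k, vt, ve):
    """Drain current with a finite output slope in saturation."""
    vov = vgs - vt
    if vov <= 0:
        return 0.0
    vs = vdss(vov, ve)
    if vds <= vs:
        return k * (vov * vds - 0.5 * vds ** 2)
    idss = k * (vov * vs - 0.5 * vs ** 2)
    # straight line through (Vdss, Idss) that extrapolates to zero at Vds = -VE
    return idss * (vds + ve) / (vs + ve)


if __name__ == "__main__":
    K, VT, VE = 1e-3, 1.0, 20.0     # assumed illustrative values
    vs = vdss(2.0, VE)
    print("Vdss =", vs, " approximation =", 2.0 * (1 - 2.0 / (2 * VE)))
    print("Ids at Vdss and at 5 V:", ids_early(3.0, vs, K, VT, VE),
          ids_early(3.0, 5.0, K, VT, VE))
```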






CMOS Static Logic

Each logic stage contains pull up and pull down networks controlled by input signals.

The pull up network contains p channel transistors.

The pull down network is made of n channel transistors.

If the pull up network is ‘on’, the pull down network is ‘off’ and vice versa.

Since the pull up and pull down networks are never ‘on’ simultaneously, there is no static power consumption.






CMOS Inverter

The simplest CMOS logic structure is the inverter.

[Figure: CMOS inverter with input Vi, output Vo and supply Vdd]

The CMOS inverter is the basic gate.

More complex gates are designed by mapping them to an ‘equivalent’ inverter.

The pull up network of the logic gate is made equivalent to the pMOS of the inverter.

The pull down network of the logic gate is made equivalent to the nMOS of the inverter.

Thumb rules are used to map the geometries of the pull up and pull down networks to single transistors.






Static Characteristics

[Figure: Inverter transfer curve, with VoH, VoL, ViL and ViH marked]

The range of input voltages can be divided into several regions:

nMOS ‘off’, pMOS ‘on’

nMOS saturated, pMOS linear

nMOS saturated, pMOS saturated

nMOS linear, pMOS saturated

nMOS ‘on’, pMOS ‘off’






For $0 < V_i < V_{Tn}$, the n channel transistor is ‘off’, the p channel transistor is ‘on’ and the output voltage is $V_{dd}$.

This is the normal digital operation range with input = ‘0’ and output = ‘1’.






In the second regime (nMOS saturated, pMOS linear), both transistors are ‘on’.

The input voltage $V_i$ is $> V_{Tn}$, but is small enough that the n channel transistor is in saturation and the p channel transistor is in the linear regime.

In static condition, the output voltage will adjust itself such that the currents through the n and p channel transistors are equal.






The absolute value of the gate-source voltage on the p channel transistor is $V_{dd} - V_i$, and therefore the “overvoltage” on its gate is $V_{dd} - V_i - V_{Tp}$.

The drain-source voltage of the pMOS has an absolute value $V_{dd} - V_o$.

Therefore,

$$I_d = K_p\left[(V_{dd} - V_i - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right] = \frac{K_n}{2}(V_i - V_{Tn})^2$$

where the symbols have their usual meanings.






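We define $\beta \equiv K_n/K_p$ and $V_{dp} \equiv V_{dd} - V_o$.

Then we can solve the quadratic equation:

$$I_d = K_p\left[(V_{dd} - V_i - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right] = \frac{K_n}{2}(V_i - V_{Tn})^2$$

So

$$V_o = V_i + V_{Tp} + \sqrt{(V_{dd} - V_i - V_{Tp})^2 - \beta(V_i - V_{Tn})^2}$$

If $K_n = K_p$ ($\beta = 1$),

$$V_o = (V_i + V_{Tp}) + \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(V_{dd} - 2V_i + V_{Tn} - V_{Tp})}$$

for $V_i \le \dfrac{V_{dd} + V_{Tn} - V_{Tp}}{2}$.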






When $V_i = \dfrac{V_{dd} + \sqrt{\beta}\,V_{Tn} - V_{Tp}}{1 + \sqrt{\beta}}$, both transistors are saturated.

[Figure: output voltage versus input voltage (0 to 3.0 V), showing VoH, VoL, ViL, ViH and a vertical drop of VTn + VTp]

The currents of both transistors are independent of their drain voltages.

We do not get a unique solution for $V_o$ by equating drain currents.

The currents will be equal for all values of $V_o$ in the range

$$V_i - V_{Tn} \le V_o \le V_i + V_{Tp}$$

Thus the transfer curve of an inverter shows a drop of $V_{Tn} + V_{Tp}$ at a voltage near $V_{dd}/2$.





As we increase $V_i$ further, so that

$$\frac{V_{dd} + \sqrt{\beta}\,V_{Tn} - V_{Tp}}{1 + \sqrt{\beta}} < V_i < V_{dd} - V_{Tp}$$

both transistors are still ‘on’, but the nMOS enters the linear regime while the pMOS is saturated. Equating currents in this condition,

$$I_d = \frac{K_p}{2}(V_{dd} - V_i - V_{Tp})^2 = K_n\left[(V_i - V_{Tn})V_o - \frac{1}{2}V_o^2\right]$$

From this, we get the quadratic equation

$$\frac{1}{2}V_o^2 - (V_i - V_{Tn})V_o + \frac{(V_{dd} - V_i - V_{Tp})^2}{2\beta} = 0$$






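$$\frac{1}{2}V_o^2 - (V_i - V_{Tn})V_o + \frac{(V_{dd} - V_i - V_{Tp})^2}{2\beta} = 0$$

This has the solution

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_i - V_{Tn})^2 - \frac{(V_{dd} - V_i - V_{Tp})^2}{\beta}}$$

In the special case where $\beta = 1$, we have

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(2V_i - V_{dd} - V_{Tn} + V_{Tp})}$$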






As we increase the input voltage beyond $V_{dd} - V_{Tp}$, the p channel transistor turns ‘off’, while the n channel transistor conducts strongly.

As a result, the output voltage falls to zero.

This is the normal digital operation range with input = ‘1’ and output = ‘0’.






Noise Margins

For robust design, the output levels must be interpreted correctly at the input of the next stage even in the presence of noise.

For the ‘high’ level, we require that the output of one stage should still be interpreted as ‘high’ at the input of the next gate even when pulled down a little due to noise.

Therefore $V_{oH}$ should be $> V_{iH}$.

Similarly, $V_{oL}$ should be $< V_{iL}$.

The difference $V_{iL} - V_{oL}$ is the ‘low’ noise margin, and $V_{oH} - V_{iH}$ is the ‘high’ noise margin.






Logic Levels

A digital circuit should distinguish logic levels, but be insensitive to the exact analog voltage at the input.

Therefore flat portions of the transfer curve (where $\partial V_o/\partial V_i$ is small) are suitable for digital logic.

We select two points on the transfer curve where the slope $\partial V_o/\partial V_i$ is $-1.0$.

The coordinates of these two points define the values of $(V_{iL}, V_{oH})$ and $(V_{iH}, V_{oL})$.

The region to the left of $V_{iL}$ and to the right of $V_{iH}$ has $|\partial V_o/\partial V_i| < 1$, and is suitable for digital operation.






Calculation of Noise Margins

To evaluate the values of the noise margins, we shall use the expressions derived for $\beta = 1$ to keep the algebra simple.

When the input is low and the output high, the n channel transistor is saturated and the p channel transistor is in its linear regime.

When the input is high and the output is low, the n channel transistor is in its linear regime, while the p channel transistor is saturated.






Calculation of $V_{iL}$ and $V_{oH}$

For $(V_{iL}, V_{oH})$, the n channel transistor is saturated, while the p channel transistor is in its linear regime.

$$V_o = (V_i + V_{Tp}) + \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(V_{dd} + V_{Tn} - V_{Tp} - 2V_i)}$$

From this, we evaluate $\partial V_o/\partial V_i$ and set it equal to $-1$:

$$\frac{\partial V_o}{\partial V_i} = -1 = 1 - \sqrt{\frac{V_{dd} - V_{Tn} - V_{Tp}}{V_{dd} + V_{Tn} - V_{Tp} - 2V_i}}$$

This gives

$$V_{iL} = \frac{3V_{dd} + 5V_{Tn} - 3V_{Tp}}{8}$$

$$V_{oH} = \frac{7V_{dd} + V_{Tn} + V_{Tp}}{8} = V_{dd} - \frac{V_{dd} - V_{Tn} - V_{Tp}}{8}$$





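Calculation of $V_{iH}$ and $V_{oL}$

When the input is ‘high’, we should use the equation for nMOS linear and pMOS saturated.

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(2V_i - V_{dd} - V_{Tn} + V_{Tp})}$$

Differentiating with respect to $V_i$ gives

$$\frac{\partial V_o}{\partial V_i} = -1 = 1 - \sqrt{\frac{V_{dd} - V_{Tn} - V_{Tp}}{2V_i - V_{dd} - V_{Tn} + V_{Tp}}}$$

from which we get

$$V_{iH} = \frac{5V_{dd} + 3V_{Tn} - 5V_{Tp}}{8}$$

$$V_{oL} = \frac{V_{dd} - V_{Tn} - V_{Tp}}{8}$$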






The ‘High’ noise margin is given by

$$V_{oH} - V_{iH} = \frac{V_{dd} - V_{Tn} + 3V_{Tp}}{4}$$

Similarly, the ‘Low’ noise margin is

$$V_{iL} - V_{oL} = \frac{V_{dd} + 3V_{Tn} - V_{Tp}}{4}$$

The two noise margins can be made equal by choosing equal values for $V_{Tn}$ and $V_{Tp}$.
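The closed-form levels for β = 1 are easy to evaluate numerically, and the slope condition can be checked by differentiating the transfer curve numerically. The sketch below assumes Vdd = 3 V and VTn = VTp = 0.5 V purely for illustration; the function names are hypothetical.

```python
import math


def cmos_levels(vdd, vtn, vtp):
    """Unity-gain points (ViL, VoH, ViH, VoL) of the beta = 1 CMOS inverter."""
    vil = (3 * vdd + 5 * vtn - 3 * vtp) / 8
    voh = (7 * vdd + vtn + vtp) / 8
    vih = (5 * vdd + 3 * vtn - 5 * vtp) / 8
    vol = (vdd - vtn - vtp) / 8
    return vil, voh, vih, vol


def vo_input_low(vi, vdd, vtn, vtp):
    """Transfer curve with nMOS saturated and pMOS linear (beta = 1)."""
    a = vdd - vtn - vtp
    return vi + vtp + math.sqrt(a * (vdd + vtn - vtp - 2 * vi))


def slope(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    VDD, VTN, VTP = 3.0, 0.5, 0.5   # assumed illustrative values
    vil, voh, vih, vol = cmos_levels(VDD, VTN, VTP)
    print("high noise margin:", voh - vih)
    print("low noise margin:", vil - vol)
    print("slope at ViL:", slope(lambda v: vo_input_low(v, VDD, VTN, VTP), vil))
```

With equal thresholds the two margins come out equal, as the closed-form expressions predict.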






Dynamic Characteristics

For the calculation of rise and fall times, we shall assume that only one of the two transistors in the inverter is ‘on’.

This is more conservative than the static logic levels calculated by slope considerations.

We shall use the simple model described at the beginning of this lecture.






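Rise time

[Figure: inverter charging the load capacitor, with input ViL, output Vo and supply Vdd]

When the input is low, the n channel transistor is ‘off’, while the p channel transistor is ‘on’. From Kirchhoff’s current law at the output node,

$$I_{dp} = C\frac{dV_o}{dt}$$

so

$$\frac{dt}{C} = \frac{dV_o}{I_{dp}}$$

Integrating both sides, we get

$$\frac{\tau_{rise}}{C} = \int_0^{V_{oH}}\frac{dV_o}{I_{dp}}$$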






$$\frac{\tau_{rise}}{C} = \int_0^{V_{oH}}\frac{dV_o}{I_{dp}}$$

Until the output rises to $V_{iL} + V_{Tp}$, the p channel transistor is in saturation. If $V_{oH} > V_{iL} + V_{Tp}$ (which is normally the case), the integration range can be broken into saturation and linear regimes. Thus

$$\frac{\tau_{rise}}{C} = \int_0^{V_{iL} + V_{Tp}}\frac{dV_o}{\frac{K_p}{2}(V_{dd} - V_{iL} - V_{Tp})^2} + \int_{V_{iL} + V_{Tp}}^{V_{oH}}\frac{dV_o}{K_p\left[(V_{dd} - V_{iL} - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right]}$$






$$\tau_{rise} = \frac{2C(V_{iL} + V_{Tp})}{K_p(V_{dd} - V_{iL} - V_{Tp})^2} + \frac{C}{K_p(V_{dd} - V_{iL} - V_{Tp})}\ln\frac{V_{dd} + V_{oH} - 2V_{iL} - 2V_{Tp}}{V_{dd} - V_{oH}}$$

The first term is just the constant current charging of the load capacitor.

The second term represents the charging by the pMOS in its linear range.

This can be compared with resistive charging, which would have taken a charge time of

$$\tau = RC\ln\frac{V_{dd} - V_{iL} - V_{Tp}}{V_{dd} - V_{oH}}$$

to charge from $V_{iL} + V_{Tp}$ to $V_{oH}$.
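As a sanity check on the closed-form rise time, the integral of dVo/Idp can also be evaluated numerically with the simple transistor model. The sketch below uses a midpoint rule; the component values (C, Kp, the voltages) are assumed for illustration and the function names are hypothetical.

```python
import math


def tau_rise_closed(c, kp, vdd, vtp, vil, voh):
    """Closed-form rise time: saturation term plus linear-regime term."""
    x = vdd - vil - vtp                       # pMOS gate overdrive
    sat = 2 * c * (vil + vtp) / (kp * x ** 2)
    lin = c / (kp * x) * math.log((vdd + voh - 2 * vil - 2 * vtp) / (vdd - voh))
    return sat + lin


def tau_rise_numeric(c, kp, vdd, vtp, vil, voh, steps=20000):
    """Midpoint-rule evaluation of C * integral of dVo / Idp from 0 to VoH."""
    x = vdd - vil - vtp

    def idp(vo):
        if vo <= vil + vtp:                   # pMOS saturated
            return 0.5 * kp * x ** 2
        u = vdd - vo                          # pMOS |Vds| in the linear regime
        return kp * (x * u - 0.5 * u ** 2)

    h = voh / steps
    return c * sum(h / idp((i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    # assumed values: 100 fF load, Kp = 100 uA/V^2, input held at 0 V
    args = (100e-15, 1e-4, 3.0, 0.5, 0.0, 2.7)
    print(tau_rise_closed(*args), tau_rise_numeric(*args))
```

The two results should agree closely, which confirms that the logarithmic term is the exact integral over the linear regime.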





Fall Time

[Figure: inverter discharging the load capacitor, with input ViH and output Vo]

When the input is high, the p channel transistor is ‘off’, while the n channel transistor is ‘on’. From Kirchhoff’s current law at the output node,

$$I_{dn} = -C\frac{dV_o}{dt}$$

Separating variables and integrating from the initial voltage ($= V_{dd}$) to some terminal voltage $V_{oL}$ gives

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_{oL}}\frac{dV_o}{I_{dn}}$$






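$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_{oL}}\frac{dV_o}{I_{dn}}$$

The n channel transistor will be in saturation until the output falls to $V_i - V_{Tn}$. Below this, the transistor will be in its linear regime. We can divide the integration range into two parts.

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_i - V_{Tn}}\frac{dV_o}{I_{dn}} - \int_{V_i - V_{Tn}}^{V_{oL}}\frac{dV_o}{I_{dn}}$$

$$= \int_{V_i - V_{Tn}}^{V_{dd}}\frac{dV_o}{\frac{K_n}{2}(V_i - V_{Tn})^2} + \int_{V_{oL}}^{V_i - V_{Tn}}\frac{dV_o}{K_n\left[(V_i - V_{Tn})V_o - \frac{1}{2}V_o^2\right]}$$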






$$\frac{\tau_{fall}}{C} = \frac{V_{dd} - V_i + V_{Tn}}{\frac{K_n}{2}(V_i - V_{Tn})^2} + \frac{1}{K_n(V_i - V_{Tn})}\ln\frac{2(V_i - V_{Tn}) - V_{oL}}{V_{oL}}$$

The first term represents the time taken to discharge at constant current in the saturation regime, whereas the second term is the quasi-resistive discharge in the linear regime.






Trade off between power, speed and robustness

Noise margins are given by

$$V_{oH} - V_{iH} = \frac{V_{dd} - V_{Tn} + 3V_{Tp}}{4}$$

$$V_{iL} - V_{oL} = \frac{V_{dd} + 3V_{Tn} - V_{Tp}}{4}$$

As we scale technologies, we improve speed and power consumption. However, the noise margin becomes worse.

We can improve noise margins by choosing relatively higher threshold voltages. However, this will reduce speeds.

We could also increase $V_{dd}$, but that would increase power dissipation.

Thus we have a trade off between power, speed and noise margins.






CMOS Inverter Design Flow

A common design requirement is symmetric charge and discharge behaviour and equal noise margins for high and low logic values.

This requires matched values of $K_n$ and $K_p$ and equal values of $V_{Tn}$ and $V_{Tp}$.

For a given load capacitance, the rise and fall times are inversely proportional to $K_p$ and $K_n$ respectively.

Thus it is a straightforward calculation to determine transistor geometries if speed requirements and technological parameters are given.

However, as transistor geometries are made larger, self loading can become significant.






For large self-loading, we have to model the load capacitance as

$$C_{Load} = C_{ext} + \alpha K_n$$

where we have assumed that $\beta = K_n/K_p$ is constant. $\alpha$ is a technological constant.

We use the expressions for $K\tau/C$, which depend only on voltages. Once these values are calculated, the geometry can be determined.

In the extreme case, when self capacitance dominates the load capacitance, $K/C$ becomes constant and $\tau$ becomes geometry independent. There is no advantage in using wider transistors in this regime to increase the speed. It is better to use multi-stage logic with tapered buffers in this regime.
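The saturation of speed with transistor size follows directly from the load model. If g stands for the voltage-only factor Kτ/C, then τ = g(C_ext + αK)/K, which tends to gα as K grows. The sketch below illustrates this; the values of C_ext, α and g are assumed for illustration only.

```python
def stage_delay(k, c_ext, alpha, g):
    """Delay tau = g * C_load / K with C_load = C_ext + alpha * K.

    g is the voltage-dependent factor K*tau/C (units 1/V), taken as given.
    """
    c_load = c_ext + alpha * k
    return g * c_load / k


if __name__ == "__main__":
    C_EXT = 50e-15      # assumed external load, F
    ALPHA = 2e-10       # assumed self-loading constant, F per (A/V^2)
    G = 1.0             # assumed voltage factor, 1/V
    for k in (1e-5, 1e-4, 1e-3, 1e-2):
        print(f"K = {k:.0e} A/V^2  ->  tau = {stage_delay(k, C_EXT, ALPHA, G):.3e} s")
    print("limit for very large K:", G * ALPHA)
```

Once αK is much larger than C_ext, further widening buys almost no speed, which is the regime where tapered buffers are preferred.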






From Inverters to Other Logic

Once the basic CMOS inverter is designed, other logic gates can be derived from it. The logic has to be put in a canonical form which is a sum of products with a bar (inversion) on top.

For every ‘.’ in the expression, we put the corresponding n channel transistors in series and the corresponding p channel transistors in parallel.

For every ‘+’, we put the n channel transistors in parallel and the p channel transistors in series.

We scale the transistor widths up by the number of devices (n or p) put in series.

The geometries are left untouched for devices put in parallel.






CMOS implementation of $\overline{A.B + C.(D + E)}$

[Figure: CMOS gate with inputs A, B, C, D, E, output Out and supply Vdd]

For the n channel network, A and B are in series. The pair is in parallel with C, which is in series with a parallel combination of D and E.

For the p channel network, A is in parallel with B. The pair is in series with C, which is in parallel with a series combination of D and E.

Implementation of $\overline{A.B + C.(D + E)}$ in CMOS logic design style.
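The series/parallel rules can be checked by brute force: for every input combination exactly one of the two networks should conduct. The sketch below models each network as a Boolean conduction function (an nMOS conducts when its input is 1, a pMOS when its input is 0); the function names are hypothetical.

```python
from itertools import product


def pull_down_conducts(a, b, c, d, e):
    """n network: (A series B) in parallel with (C series (D parallel E))."""
    return (a and b) or (c and (d or e))


def pull_up_conducts(a, b, c, d, e):
    """p network: (A parallel B) in series with (C parallel (D series E))."""
    na, nb, nc, nd, ne = (not v for v in (a, b, c, d, e))
    return (na or nb) and (nc or (nd and ne))


def networks_are_complementary():
    """True if exactly one network conducts for every input combination."""
    return all(
        pull_down_conducts(*bits) != pull_up_conducts(*bits)
        for bits in product([False, True], repeat=5)
    )


if __name__ == "__main__":
    print("complementary:", networks_are_complementary())
    # The output is high exactly when the pull up network conducts.
    print("Out for A=B=1:", pull_up_conducts(True, True, False, False, False))
```

Because the networks are complementary, the output is never floating and there is never a static path from supply to ground.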






CMOS summary

Logic consumes no static power in the CMOS design style.

However, signals have to be routed to the n pull down network as well as to the p pull up network.

So the load presented to every driver is high.

This is exacerbated by the fact that n and p channel transistors cannot be placed close together, as these are in different wells which have to be kept well separated in order to avoid latchup.






Pseudo nMOS Design Style

[Figure: pseudo nMOS inverter with grounded-gate pMOS load between Vdd and Out, and nMOS driven by in between Out and Gnd]

The CMOS pull up network is replaced by a single pMOS transistor with its gate grounded.

Since the pMOS is not driven by signals, it is always ‘on’.

The effective gate voltage seen by the pMOS transistor is $V_{dd}$. Thus the overvoltage on the p channel gate is always $V_{dd} - V_{Tp}$.

When the nMOS is turned ‘on’, a direct path between supply and ground exists and static power will be drawn.

However, the dynamic power is reduced due to lower capacitive loading.





Static Characteristics

As we sweep the input voltage from ground to $V_{dd}$, we encounter the following regimes of operation:

nMOS ‘off’

nMOS saturated, pMOS linear

nMOS linear, pMOS linear

nMOS linear, pMOS saturated






Low input

When the input voltage is less than $V_{Tn}$, the output is ‘high’ and no current is drawn from the supply.

As we raise the input just above $V_{Tn}$, the output starts falling.

In this region the nMOS is saturated, while the pMOS is linear.






The input voltage is assumed to be sufficiently low so that the output voltage exceeds the saturation voltage $V_i - V_{Tn}$. Normally, the output voltage will also be higher than $V_{Tp}$, so the p channel transistor is in its linear mode of operation. Equating currents through the n and p channel transistors, we get

$$K_p\left[(V_{dd} - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right] = \frac{K_n}{2}(V_i - V_{Tn})^2$$

Defining $V_1 \equiv V_{dd} - V_o$ and $V_2 \equiv V_{dd} - V_{Tp}$, we get

$$\frac{1}{2}V_1^2 - V_2V_1 + \frac{\beta}{2}(V_i - V_{Tn})^2 = 0$$






$$\frac{1}{2}V_1^2 - V_2V_1 + \frac{\beta}{2}(V_i - V_{Tn})^2 = 0$$

The solutions are:

$$V_1 = V_2 \pm \sqrt{V_2^2 - \beta(V_i - V_{Tn})^2}$$

Substituting the values of $V_1$ and $V_2$ and choosing the sign which puts $V_o$ in the correct range, we get

$$V_o = V_{Tp} + \sqrt{(V_{dd} - V_{Tp})^2 - \beta(V_i - V_{Tn})^2}$$






$$V_o = V_{Tp} + \sqrt{(V_{dd} - V_{Tp})^2 - \beta(V_i - V_{Tn})^2}$$

As the input voltage is increased, the output voltage will decrease.

The output voltage will fall below $V_i - V_{Tn}$ when

$$V_i > V_{Tn} + \frac{V_{Tp} + \sqrt{V_{Tp}^2 + (\beta + 1)V_{dd}(V_{dd} - 2V_{Tp})}}{\beta + 1}$$

The nMOS is now in its linear mode of operation. The derived equation does not apply beyond this input voltage.






As the input voltage is raised still further, the output voltage will fall below $V_{Tp}$. The pMOS transistor is now in the saturation regime. Equating currents, we get

$$K_n\left[(V_i - V_{Tn})V_o - \frac{1}{2}V_o^2\right] = \frac{K_p}{2}(V_{dd} - V_{Tp})^2$$

which gives

$$\frac{1}{2}V_o^2 - (V_i - V_{Tn})V_o + \frac{(V_{dd} - V_{Tp})^2}{2\beta} = 0$$

This can be solved to get

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_i - V_{Tn})^2 - (V_{dd} - V_{Tp})^2/\beta}$$






Noise Margins

We find points on the transfer curve where the slope is $-1$. When the input is low and the output high, we should use

$$V_o = V_{Tp} + \sqrt{(V_{dd} - V_{Tp})^2 - \beta(V_i - V_{Tn})^2}$$

Differentiating this equation with respect to $V_i$ and setting the slope to $-1$, we get

$$V_{iL} = V_{Tn} + \frac{V_{dd} - V_{Tp}}{\sqrt{\beta(\beta + 1)}}$$

and

$$V_{oH} = V_{Tp} + \sqrt{\frac{\beta}{\beta + 1}}\,(V_{dd} - V_{Tp})$$






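When the input is high and the output low, we use

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_i - V_{Tn})^2 - (V_{dd} - V_{Tp})^2/\beta}$$

Differentiating with respect to $V_i$ and setting the slope to $-1$, we get

$$V_{iH} = V_{Tn} + \frac{2}{\sqrt{3\beta}}(V_{dd} - V_{Tp})$$

and

$$V_{oL} = \frac{V_{dd} - V_{Tp}}{\sqrt{3\beta}}$$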






Ratioed Logic

To make the output ‘low’ value lower than $V_{Tn}$, we get the condition

$$\beta > \frac{1}{3}\left(\frac{V_{dd} - V_{Tp}}{V_{Tn}}\right)^2$$

This places a requirement on the ratios of widths of n and p channel transistors. The logic gates work properly only when this condition is satisfied.

Therefore this kind of logic is also called ‘ratioed logic’.

In contrast, CMOS logic is called ratioless logic because it does not place any restriction on the ratios of widths of n and p channel transistors for static operation.

The noise margin for pseudo nMOS can be determined easily from the expressions for $V_{iL}$, $V_{oL}$, $V_{iH}$, $V_{oH}$.
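The pseudo nMOS levels and the ratio condition can be evaluated together. The sketch below assumes Vdd = 3 V and VTn = VTp = 0.5 V for illustration only; with these values the minimum β works out to 25/3, so β = 10 is used as an example that satisfies the condition. The function names are hypothetical.

```python
import math


def pseudo_nmos_levels(vdd, vtn, vtp, beta):
    """Unity-gain points (ViL, VoH, ViH, VoL) of the pseudo nMOS inverter."""
    v2 = vdd - vtp
    vil = vtn + v2 / math.sqrt(beta * (beta + 1))
    voh = vtp + math.sqrt(beta / (beta + 1)) * v2
    vih = vtn + 2 * v2 / math.sqrt(3 * beta)
    vol = v2 / math.sqrt(3 * beta)
    return vil, voh, vih, vol


def min_beta(vdd, vtn, vtp):
    """Smallest beta = Kn/Kp for which VoL stays below VTn."""
    return ((vdd - vtp) / vtn) ** 2 / 3


if __name__ == "__main__":
    VDD, VTN, VTP = 3.0, 0.5, 0.5           # assumed illustrative values
    print("minimum beta:", min_beta(VDD, VTN, VTP))
    vil, voh, vih, vol = pseudo_nmos_levels(VDD, VTN, VTP, beta=10.0)
    print("low noise margin :", vil - vol)
    print("high noise margin:", voh - vih)
```

Unlike the CMOS case, the two margins are strongly asymmetric, with the low margin much smaller than the high one.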






Rise Time

When the input is low, the nMOS is off and the output rises from ‘low’ to ‘high’. The situation is identical to the charge up condition of a CMOS gate with the pMOS biased with its gate at 0 V.

This gives

$$\tau_{rise} = \frac{C}{K_p(V_{dd} - V_{Tp})}\left[\frac{2V_{Tp}}{V_{dd} - V_{Tp}} + \ln\frac{V_{dd} + V_{oH} - 2V_{Tp}}{V_{dd} - V_{oH}}\right]$$






Fall Time

Calculation of the fall time is complicated by the fact that the pMOS load continues to dump current into the output node, even as the nMOS tries to discharge the output capacitor.

The nMOS needs to sink the discharge current as well as the drain current of the pMOS transistor.

Simplifying assumption: the pMOS current remains constant at its saturation value through the entire discharge process.

(This will result in a slightly pessimistic value of the discharge time.)






If we assume that the pMOS current remains constant at its saturation value,

$$I_p = \frac{K_p}{2}(V_{dd} - V_{Tp})^2$$

we can write the KCL equation at the output node as:

$$I_n - I_p + C\frac{dV_o}{dt} = 0$$

which gives

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_{oL}}\frac{dV_o}{I_n - I_p}$$

We define $V_1 \equiv V_i - V_{Tn}$ and $V_2 \equiv V_{dd} - V_{Tp}$.






The integration range can be divided into two regimes.

The nMOS is saturated when $V_1 \le V_o < V_{dd}$.

It is in the linear regime when $V_{oL} < V_o < V_1$.






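$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_1}\frac{dV_o}{\frac{1}{2}K_nV_1^2 - I_p} - \int_{V_1}^{V_{oL}}\frac{dV_o}{K_n\left(V_1V_o - \frac{1}{2}V_o^2\right) - I_p}$$

so

$$\frac{\tau_{fall}}{C} = \frac{V_{dd} - V_1}{\frac{1}{2}K_nV_1^2 - I_p} + \int_{V_{oL}}^{V_1}\frac{dV_o}{K_n\left(V_1V_o - \frac{1}{2}V_o^2\right) - I_p}$$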
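The remaining integral is straightforward to evaluate numerically. The sketch below uses a midpoint rule under the constant-Ip assumption stated earlier, and compares against the CMOS closed form (the Kp = 0 limit) as a check. All component values are assumed for illustration; the terminal voltage VoL must lie above the static low level, otherwise the net discharge current goes to zero and the integral diverges.

```python
import math


def tau_fall_pseudo_nmos(c, kn, kp, vdd, vtn, vtp, vi, vol, steps=20000):
    """Fall time with the pMOS load modelled as a constant current Ip."""
    v1 = vi - vtn
    ip = 0.5 * kp * (vdd - vtp) ** 2
    if kn * (v1 * vol - 0.5 * vol ** 2) <= ip:
        raise ValueError("VoL is at or below the static low level")
    sat = (vdd - v1) / (0.5 * kn * v1 ** 2 - ip)     # nMOS saturated
    h = (v1 - vol) / steps
    lin = 0.0
    for i in range(steps):                            # nMOS linear, midpoint rule
        vo = vol + (i + 0.5) * h
        lin += h / (kn * (v1 * vo - 0.5 * vo ** 2) - ip)
    return c * (sat + lin)


def tau_fall_cmos_closed(c, kn, vdd, vtn, vi, vol):
    """CMOS closed form, i.e. the same expression with Ip = 0."""
    v1 = vi - vtn
    return c * ((vdd - v1) / (0.5 * kn * v1 ** 2)
                + math.log((2 * v1 - vol) / vol) / (kn * v1))


if __name__ == "__main__":
    C, KN, VDD, VT = 100e-15, 1e-3, 3.0, 0.5          # assumed values
    print("Kp = 0     :", tau_fall_pseudo_nmos(C, KN, 0.0, VDD, VT, VT, VDD, 0.5))
    print("closed form:", tau_fall_cmos_closed(C, KN, VDD, VT, VDD, 0.5))
    print("Kp = Kn/10 :", tau_fall_pseudo_nmos(C, KN, KN / 10, VDD, VT, VT, VDD, 0.5))
```

The always-on load lengthens the fall time relative to the CMOS case, since part of the nMOS current is spent sinking the pMOS current.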






Pseudo nMOS Inverter design

We design the basic inverter and then scale device sizes based on the logic function being designed.

The load device size is calculated from the rise time.

$$\tau_{rise} = \frac{C}{K_p(V_{dd} - V_{Tp})}\left[\frac{2V_{Tp}}{V_{dd} - V_{Tp}} + \ln\frac{V_{dd} + V_{oH} - 2V_{Tp}}{V_{dd} - V_{oH}}\right]$$

Given a value of $\tau_{rise}$, operating voltages and technological constants, $K_p$ and hence the geometry of the p channel transistor can be determined.






The geometry of the n channel transistor can be determined from static considerations.

$$V_{oL} = (V_{iH} - V_{Tn}) - \sqrt{(V_{iH} - V_{Tn})^2 - (V_{dd} - V_{Tp})^2/\beta}$$

We take $V_{oL} = V_{Tn}$, and calculate $\beta$.

But $\beta \equiv K_n/K_p$ and $K_p$ is already known.

This evaluates $K_n$ and hence the geometry of the n channel transistor.
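The two design steps can be strung together as a small calculation. Squaring the static equation gives β = (Vdd − VTp)² / [VoL (2(ViH − VTn) − VoL)], which is then evaluated with VoL = VTn. In the sketch below the rise time target, load capacitance, VoH = 0.9 Vdd and the choice ViH = Vdd are all assumptions made for illustration; the function names are hypothetical.

```python
import math


def design_pseudo_nmos(tau_rise, c, vdd, vtn, vtp, voh, vih):
    """Return (Kp, beta, Kn) from a rise time target and the VoL = VTn condition."""
    v2 = vdd - vtp
    # Step 1: Kp from the rise time expression
    kp = c / (tau_rise * v2) * (
        2 * vtp / v2 + math.log((vdd + voh - 2 * vtp) / (vdd - voh))
    )
    # Step 2: beta from the static low level, taking VoL = VTn
    vol = vtn
    x = vih - vtn
    beta = v2 ** 2 / (vol * (2 * x - vol))
    return kp, beta, beta * kp


def vol_static(vih, vdd, vtn, vtp, beta):
    """Static low output level for input vih (nMOS linear, pMOS saturated)."""
    x = vih - vtn
    return x - math.sqrt(x ** 2 - (vdd - vtp) ** 2 / beta)


if __name__ == "__main__":
    # assumed targets: 1 ns rise time into 100 fF, VoH = 0.9 Vdd, ViH = Vdd
    kp, beta, kn = design_pseudo_nmos(1e-9, 100e-15, 3.0, 0.5, 0.5, 2.7, 3.0)
    print("Kp =", kp, " beta =", beta, " Kn =", kn)
    print("check VoL:", vol_static(3.0, 3.0, 0.5, 0.5, beta))
```

Substituting the computed β back into the static equation recovers VoL = VTn, which is a quick consistency check on the algebra.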






Conversion to other logic

Once the basic pseudo nMOS inverter is designed, other logic gates can be derived from it.

The procedure is the same as that for CMOS, except that it is applied only to nMOS transistors.

The p channel transistor is kept at the same size as that for an inverter.






The logic is expressed as a sum of products with a bar (inversion) on top.

For every ‘.’ in the expression, we put the corresponding n channel transistors in series.

For every ‘+’, we put the n channel transistors in parallel.

We scale the transistor widths up by the number of devices put in series.

The geometries are left untouched for devices put in parallel.






$\overline{A.B + C.(D + E)}$ in pseudo-nMOS

[Figure: pseudo-nMOS gate with nMOS inputs A, B, C, D, E, output Out and supply Vdd]

A and B are in series.

The pair is in parallel with C, which is in series with a parallel combination of D and E.

Implementation of $\overline{A.B + C.(D + E)}$ in pseudo-nMOS logic design style.






Complementary Pass gate Logic

This logic family is based on multiplexer logic.

Given a boolean function $F(x_1, x_2, \ldots, x_n)$, we can express it as:

$$F(x_1, x_2, \ldots, x_n) = x_i \cdot f_1 + \bar{x}_i \cdot f_2$$

where $f_1$ and $f_2$ are reduced expressions for $F$ with $x_i$ forced to 1 and 0 respectively.

Thus, $F$ can be implemented with a multiplexer controlled by $x_i$ which selects $f_1$ or $f_2$ depending on $x_i$.

$f_1$ and $f_2$ can themselves be decomposed into simpler expressions by the same technique.
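This decomposition is easy to try out in code. The sketch below builds the multiplexer form of an arbitrary Boolean function and checks it against the original over the full truth table; the helper names and the majority function used as the example are illustrative choices, not taken from these notes.

```python
from itertools import product


def shannon_mux(f, i):
    """Return a mux-based implementation of f, controlled by input number i."""
    def g(*x):
        f1 = f(*x[:i], True, *x[i + 1:])     # f with x_i forced to 1
        f2 = f(*x[:i], False, *x[i + 1:])    # f with x_i forced to 0
        return f1 if x[i] else f2            # multiplexer selects on x_i
    return g


def majority(a, b, c):
    """Example function: true when at least two inputs are true."""
    return (a and b) or (b and c) or (a and c)


def equivalent(f, g, n):
    """Compare two n-input Boolean functions over the whole truth table."""
    return all(bool(f(*x)) == bool(g(*x))
               for x in product([False, True], repeat=n))


if __name__ == "__main__":
    for i in range(3):
        print("control input", i, "->", equivalent(majority, shannon_mux(majority, i), 3))
```

For the majority function with A as the control variable, the cofactors are f1 = B + C and f2 = B.C, each of which could be decomposed again in the same way.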






To implement a multiplexer, we need both $x_i$ and $\bar{x}_i$.

Therefore, this logic family needs all inputs in true as well as in complement form.

In order to drive other gates of the same type, it must produce the outputs also in true and complement forms.

Thus each signal is carried by two wires.

This logic style is called “Complementary Passgate Logic”, or CPL for short.






Basic Multiplexer Structure

[Figure: pair of multiplexers controlled by xi and its complement, selecting between f1 and f2 and their complements, with inverters producing F and its complement]

Pure passgate logic contains no ‘amplifying’ elements. Therefore, each logic stage degrades the logic level.

Hence, multiple logic stages cannot be cascaded.

We include conventional CMOS inverters to restore the logic level.

Ideally, the multiplexer should be composed of complementary pass gate transistors. However, we shall use just n channel transistors as switches for simplicity.






Logic Design using CPL

For any logic function, we pick one input as the control variable.

Multiplexer inputs are decided by re-evaluating the function, forcing this variable to 1 and 0 respectively.

Since both true and complement outputs are generated by CPL, we need fewer types of gates.

For example, we do not need separate gates for AND and NAND functions.

The same applies to OR-NOR, and XOR-XNOR functions.






Implementation of XOR and XNOR

To take an example, let us consider the XOR-XNOR functions.

[Figure: CPL XOR-XNOR gate: multiplexers controlled by A and its complement, with B and its complement as data inputs, followed by inverters]

Because of the inverter, for the XOR output we calculate the XNOR function, given by $A.B + \bar{A}.\bar{B}$.

If we put A = 1, this reduces to $B$, and for A = 0 it reduces to $\bar{B}$.

For the XNOR output, we generate the XOR expression $A.\bar{B} + \bar{A}.B$.

This expression reduces to $\bar{B}$ for A = 1 and to $B$ for A = 0.






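Implementation of AND-NAND and OR-NOR

[Figure: CPL implementations of (a) AND-NAND and (b) OR-NOR. In each gate, multiplexers controlled by A and $\overline{A}$ select between the A and B rails and drive the restoring inverters.]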

For AND, the mux should output $\overline{A.B}$, to be inverted by the buffer. This reduces to $\overline{B}$ when A = 1 and to 1 (= $\overline{A}$) when A = 0.

Implementation of NAND, OR and NOR functions follows along the same lines.






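Buffer Leakage Current

[Figure: nMOS multiplexer (data inputs f1 and f2, control xi and $\overline{x_i}$) driving a buffer inverter. The multiplexer output is y = $\overline{F}$ and the inverter output is F.]

The high output of the multiplexer (y) cannot rise above Vdd - VTn because we use nMOS multiplexers.

Consequently, the pMOS transistor in the buffer inverter never quite turns off.

This results in static power consumption in the inverter.

[Figure: the same circuit with a pull up pMOS added at the inverter input.]

This can be avoided by adding a pull up pMOS with the inverter.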






Use of Pullup PMOS

[Figure: multiplexer (inputs f1 and f2, control xi and $\overline{x_i}$) with a pull up pMOS on the inverter input y = $\overline{F}$; the inverter output is F.]

When the multiplexer output (y) is ‘low’, the inverter output (F) is high. The pMOS is off and has no effect.

When the multiplexer output (y) goes ‘high’, the inverter output falls and turns the pMOS on.

Now, even though the multiplexer nMOS turns ‘off’ as y approaches Vdd - VTn, the pMOS remains ‘on’ and takes the inverter input (y) all the way to Vdd.

This avoids leakage in the inverter.






Need for ratioing

The use of pMOS pullup brings up another problem.

Consider the equivalent circuit when the inverter output is ‘low’ and the pMOS is ‘on’.

[Figure: equivalent circuit with the pull up pMOS ‘on’ (connected to Vdd) while the multiplexer nMOS tries to pull the inverter input low.]

If the final output is ‘low’, the pMOS pullup is ‘on’. Now if the multiplexer output wants to go ‘low’, it has to fight the pMOS pullup, which is trying to keep this node ‘high’.

In fact, the multiplexer n transistor and the pull up p transistor constitute a pseudo nMOS inverter.

Therefore, the multiplexer output cannot be pulled low unless the transistor geometries are appropriately ratioed.






Improving Pseudo nMOS

[Figure: left, pseudo-nMOS NOR of A and B; right, pseudo-nMOS NAND of $\overline{A}$ and $\overline{B}$. Both circuits are connected to Vdd and have outputs labelled Out.]

In the pseudo-nMOS NOR circuit on the left, static power is consumed when the output is ‘LOW’.

We would like to turn the pMOS off when A OR B is TRUE.

The OR logic can be constructed by using a pseudo-nMOS NAND of $\overline{A}$ and $\overline{B}$, as in the circuit on the right.

But then what about the pMOS drive of this circuit?






Pseudo nMOS without Static Power

[Figure: left, pseudo-nMOS NOR of A and B; right, pseudo-nMOS NAND of $\overline{A}$ and $\overline{B}$.]

The output of the circuit on the right is ‘LOW’ when both $\overline{A}$ and $\overline{B}$ are ‘HIGH’ (A = B = 0).

We would like to turn its pMOS off when the NOR of A and B is ‘TRUE’.

But this can be provided by the circuit on the left!

So the two circuits can drive each other’s pMOS transistors and avoid static power consumption.






Cascade Voltage Switch Logic

[Figure: OR-NOR implementation in Cascade Voltage Switch Logic, with two cross-coupled pMOS loads and nMOS networks driven by A, B and their complements.]

This kind of logic is called Cascade Voltage Switch Logic (CVSL).

It can use any network f and its complementary network $\overline{f}$ in the two cross-coupled branches.

Like CMOS static logic, there is no static power consumption.

Like CPL, this logic requires both true and complement signals. It also provides both true and complement outputs (dual rail logic).

Like pseudo nMOS, the inputs present a single transistor load to the driving stage.

The circuit is self latching. This reduces ratioing requirements.






Dynamic logic

In this style of logic, some nodes are required to hold their logic value as a charge stored on a capacitor.

These nodes are not connected to their ‘drivers’ permanently.

The ‘driver’ places the logic value on them, and is then disconnected from the node.

Due to leakage etc., the logic value cannot be held indefinitely.

Dynamic circuits therefore require a minimum clock frequency to operate correctly.

Use of dynamic circuits can reduce circuit complexity and power consumption substantially.






A CMOS dynamic logic circuit

[Figure: CMOS dynamic gate with a clocked pMOS to Vdd, an nMOS evaluation network with inputs A, B and C, a clocked bottom nMOS, and the load capacitance CL on the output node Out.]

When the clock is low, the pMOS is on and the bottom nMOS is off.

The output is ‘pre-charged’ to 1 unconditionally.

When the clock goes high, the pMOS turns off and the bottom nMOS comes on.

The circuit then conditionally discharges the output node, if (A+B).C is TRUE.

This implements the function $\overline{(A + B).C}$.
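The pre-charge and evaluate behaviour described above can be captured in a small behavioural model. This is a sketch only: the function name `dynamic_gate` is hypothetical, and charge leakage and timing are ignored.

```python
def dynamic_gate(clock, a, b, c, stored_out):
    """Logic level on the output node after one clock phase.

    stored_out is the level currently held as charge on the load capacitor.
    """
    if clock == 0:
        return 1           # pMOS on, bottom nMOS off: unconditional pre-charge
    if (a or b) and c:
        return 0           # pull down path conducts: the output node discharges
    return stored_out      # no path to ground: the stored charge holds the level

out_precharged = dynamic_gate(0, a=1, b=0, c=1, stored_out=0)
out_eval_true = dynamic_gate(1, a=1, b=0, c=1, stored_out=out_precharged)
out_eval_false = dynamic_gate(1, a=0, b=0, c=1, stored_out=out_precharged)
```

After evaluation the output equals the complement of (A+B).C, which is why the pre-charged level survives only when the pull down network does not conduct.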






Problem with Cascading

[Figure: the dynamic gate of the previous slide with its output X driving a second clocked stage, and waveforms of Ck, X and Out for the cases (A+B).C = FALSE and (A+B).C = TRUE.]

There is no problem when (A+B).C is false. X pre-charges to 1 and remains at 1.

When (A+B).C is TRUE, X takes some time to discharge.

During this time, charge placed on the output leaks away as the input to the nMOS of the inverter is not 0.






4 Phase Dynamic Logic

[Figure: CMOS 4 phase dynamic gate with internal node P, inputs A, B and C, clock signals Ck12 and Ck23, output Out, and the waveforms of the four clock phases Ck1 to Ck4.]

The problem can be solved by using a 4 phase clock.

In phase 1, node P is pre-charged.

In phase 2, P and the output are pre-charged.

In phase 3, the gate evaluates.

In phases 4 and 1, the output is isolated from the driver and remains valid.

This is called a type 3 gate. It evaluates in phase 3 and is valid in phases 4 and 1.

Similarly, we can have type 4, type 1 and type 2 gates.





Drive cycles

[Figure: drive sequences between type 1, type 2, type 3 and type 4 gates.]

A type 3 gate can drive a type 4 or a type 1 gate.

Similarly, type 4 will drive types 1 and 2; type 1 will drive types 2 and 3; and type 2 will drive types 3 and 4.

We can use a 2 phase clock if we stick to type 1 and type 3 gates (or type 2 and type 4 gates) as these can drive each other.
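The drive constraints follow a cyclic pattern: a gate of a given type can drive the next two types in the sequence 1, 2, 3, 4, 1, and so on. A minimal sketch, with the hypothetical helper name `driven_types`:

```python
def driven_types(gate_type):
    """Gate types (1..4) that a gate of the given type may drive."""
    # The next two types in cyclic order 1 -> 2 -> 3 -> 4 -> 1.
    return [(gate_type + k - 1) % 4 + 1 for k in (1, 2)]

drive_table = {t: driven_types(t) for t in (1, 2, 3, 4)}

# Types 1 and 3 can drive each other, so a 2 phase clock is sufficient
# if only these two types are used (the same holds for types 2 and 4).
mutual_1_3 = 3 in drive_table[1] and 1 in drive_table[3]
mutual_2_4 = 4 in drive_table[2] and 2 in drive_table[4]
```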






Domino Logic

[Figure: CMOS domino logic. A dynamic gate with internal node P, inputs A, B and C and clock Ck, followed by a static inverter.]

Another way to eliminate the problem with cascading logic stages is to use a static inverter after the CMOS dynamic gate.

The output is ‘0’ when it is not valid. Therefore, it does not affect the evaluation of the next gate.

However, the logic is non-inverting. Therefore, it cannot be used to implement any arbitrary logic function.






Zipper Logic

Instead of using an inverter, we can alternate n and p evaluation stages.

[Figure: zipper logic. An n evaluation stage (inputs A, B, C) followed by a p evaluation stage (inputs D, E), clocked by Ck and its complement, between Vdd and Gnd.]

A, B, C must be from p stages. D and E must be from n stages.

The n stage is pre-charged high, but it drives a p stage.

A high pre-charged stage will keep the p evaluation stage off, which will not cause any malfunction.

The p stage will be pre-discharged to ‘low’, which is safe for driving n stages.

This kind of logic is called zipper logic.




Logic Design

Dinesh Sharma

Microelectronics group

EE Department, IIT Bombay



Contents

1 Transistor Models

2 Static CMOS Logic Design
2.1 Static CMOS Design style
2.2 CMOS Inverter
2.2.1 Static Characteristics
2.2.2 Noise margins
2.2.3 Dynamic Considerations
2.2.4 Trade off between power, speed and robustness
2.2.5 CMOS Inverter Design Flow
2.2.6 Conversion of CMOS Inverters to other logic

3 Beyond Static CMOS
3.1 Pseudo nMOS Design Style
3.1.1 Static Characteristics
3.1.2 Noise margins
3.1.3 Dynamic characteristics
3.1.4 Pseudo nMOS design Flow
3.1.5 Conversion of pseudo nMOS Inverter to other logic
3.2 Complementary Pass gate Logic
3.2.1 Basic Multiplexer Structure
3.2.2 Logic Design using CPL
3.2.3 Buffer Leakage Current
3.3 Cascade Voltage Switch Logic
3.4 Dynamic Logic
3.4.1 Problem with Cascading CMOS dynamic logic
3.4.2 Four Phase Dynamic Logic
3.4.3 Domino Logic
3.4.4 Zipper logic



List of Figures

1.1 MOS characteristics according to the simple analytic model
1.2 MOS characteristics with non zero conductance in saturation
2.1 The basic CMOS inverter
2.2 Transfer Curve of a CMOS inverter
2.3 CMOS inverter with the nMOS ‘off’
2.4 CMOS inverter with the pMOS ‘off’
2.5 CMOS implementation of $\overline{A.B + C.(D + E)}$
3.1 ‘high’ to ‘low’ transition on the output
3.2 Pseudo NMOS implementation of $\overline{A.B + C.(D + E)}$
3.3 Basic Multiplexer with logic restoring inverters
3.4 Implementation of XOR and XNOR by CPL logic
3.5 Implementation of (a) AND-NAND and (b) OR-NOR functions using complementary passgate logic
3.6 High leakage current in inverter
3.7 Pull up pMOS to avoid leakage in the inverter
3.8 Problem with a low to high transition on the output
3.9 Pseudo-nMOS NOR
3.10 Pseudo-nMOS OR from complemented inputs
3.11 OR-NOR implementation in Cascade Voltage Switch Logic
3.12 CMOS dynamic gate to implement $\overline{(A + B).C}$
3.13 CMOS 4 phase dynamic logic
3.14 CMOS 4 phase dynamic logic drive constraints
3.15 CMOS domino logic
3.16 Zipper logic



Chapter 1

Transistor Models

In this booklet, we shall use simple analytical models for MOS transistors. We use a sign convention according to which voltage and current symbols associated with the pMOS transistor (such as VTp) have positive values. Then, the n channel formulae can be used for both transistors and we shall assign signs to quantities explicitly.

Figure 1.1: MOS characteristics according to the simple analytic model (drain current in mA versus drain voltage in V, for Vg = 1.5, 2.0, 2.5, 3.0 and 3.5 V)

The model we use is described by the following equations:

for Vgs ≤ VT,

$$I_{ds} = 0 \tag{1.1}$$

for Vgs > VT and Vds ≤ Vgs − VT,

$$I_{ds} = K\left[(V_{gs} - V_T)V_{ds} - \frac{1}{2}V_{ds}^2\right] \tag{1.2}$$

and for Vgs > VT and Vds > Vgs − VT,

$$I_{ds} = \frac{K(V_{gs} - V_T)^2}{2} \tag{1.3}$$

The saturation region equation is somewhat oversimplified because it assumes that the current is independent of Vds. In reality, the current has a weak dependence on Vds in this region.
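The three regions of eqs. (1.1) to (1.3) translate directly into a piecewise function. The sketch below is illustrative; the function name `ids_simple` and the numerical values of K and VT are assumptions, not taken from the text.

```python
def ids_simple(vgs, vds, k, vt):
    """Drain current from the simple model, eqs. (1.1) to (1.3)."""
    if vgs <= vt:
        return 0.0                                  # cut off, eq. (1.1)
    vov = vgs - vt                                  # gate over-voltage
    if vds <= vov:
        return k * (vov * vds - 0.5 * vds ** 2)     # linear regime, eq. (1.2)
    return 0.5 * k * vov ** 2                       # saturation, eq. (1.3)

# Hypothetical device values, for illustration only.
K_EXAMPLE = 1.0e-3   # A/V^2
VT_EXAMPLE = 0.6     # V

i_off = ids_simple(0.5, 1.0, K_EXAMPLE, VT_EXAMPLE)
i_edge = ids_simple(2.0, 2.0 - VT_EXAMPLE, K_EXAMPLE, VT_EXAMPLE)
i_sat = ids_simple(2.0, 3.0, K_EXAMPLE, VT_EXAMPLE)
```

At Vds = Vgs − VT the linear expression equals the saturation value, so the modelled current is continuous at the boundary.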

In order to model the saturation region more accurately, we adopt an “Early Voltage” like formalism.

Figure 1.2: MOS characteristics with non zero conductance in saturation (drain current in mA versus drain voltage in V)

It is assumed that the current increases linearly in the saturation region. All linear characteristics in saturation can be produced backwards towards negative drain voltages and will intersect the drain voltage axis at a single point at -VE. (This is, at best, an approximation.) Because the conductance in saturation is now non zero, the onset of saturation has to be redefined, so that the current and its derivative are continuous at the boundary of the linear and saturation regimes. The current equations are given by:

For Vgs > VT and Vds ≤ Vdss,

$$I_{ds} = K\left[(V_{gs} - V_T)V_{ds} - \frac{1}{2}V_{ds}^2\right] \tag{1.4}$$

and for Vgs > VT and Vds > Vdss,

$$I_{ds} = I_{dss}\,\frac{V_{ds} + V_E}{V_{dss} + V_E} \tag{1.5}$$

where VE is the ‘Early Voltage’. Here Vdss and Idss are the saturation drain voltage and drain current respectively. Since the current values must match on either side of Vds = Vdss, we must have:

$$I_{dss} \equiv K\left[(V_{gs} - V_T)V_{dss} - \frac{1}{2}V_{dss}^2\right] \tag{1.6}$$

For the curve to be smooth and continuous at Vds = Vdss, the value of the first derivative should match on either side of Vdss. Therefore,

$$K(V_{gs} - V_T - V_{dss}) = \frac{I_{dss}}{V_{dss} + V_E}$$

So,

$$K(V_{gs} - V_T - V_{dss})(V_{dss} + V_E) = K\left[(V_{gs} - V_T)V_{dss} - \frac{1}{2}V_{dss}^2\right] \tag{1.7}$$

This leads to a quadratic equation in Vdss:

$$\frac{1}{2}V_{dss}^2 + V_E V_{dss} - (V_{gs} - V_T)V_E = 0 \tag{1.8}$$

Solving this quadratic, we get

$$V_{dss} = V_E\left[\sqrt{1 + \frac{2(V_{gs} - V_T)}{V_E}} - 1\right] \tag{1.9}$$

For VE >> Vgs − VT this reduces to

$$V_{dss} \simeq (V_{gs} - V_T)\left(1 - \frac{V_{gs} - V_T}{2V_E}\right) \tag{1.10}$$
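A quick numerical check of eqs. (1.8) to (1.10) can be written as follows. The helper names and the values of K, VT and VE are hypothetical; the sketch evaluates the exact and approximate saturation voltage and confirms that the slope of the current is continuous at Vdss.

```python
import math

def vdss_exact(vov, ve):
    """Saturation drain voltage from eq. (1.9); vov = Vgs - VT."""
    return ve * (math.sqrt(1.0 + 2.0 * vov / ve) - 1.0)

def vdss_approx(vov, ve):
    """Approximation of eq. (1.10), valid for VE >> Vgs - VT."""
    return vov * (1.0 - vov / (2.0 * ve))

# Hypothetical values, for illustration only.
K, VT, VE = 1.0e-3, 0.6, 20.0
vov = 2.0 - VT
vdss = vdss_exact(vov, VE)

# Residual of the quadratic in eq. (1.8); should be close to zero.
residual = 0.5 * vdss ** 2 + VE * vdss - vov * VE

# Slopes on either side of Vdss (the derivative matching condition).
idss = K * (vov * vdss - 0.5 * vdss ** 2)
slope_linear = K * (vov - vdss)
slope_saturation = idss / (vdss + VE)
```

For these values the approximation differs from the exact result by a few millivolts, and the error shrinks as VE grows relative to Vgs − VT.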

Characteristics of a MOS transistor using this model are shown in fig. 1.2. While accurate modeling of the output conductance is essential for linear design, the simpler model assuming constant Id in saturation is often adequate for preliminary digital design. In any case, final designs will have to be validated with detailed simulations. In this booklet, we shall use the simple model for MOS devices to keep the algebra simple.



Chapter 2

Static CMOS Logic Design

Static logic circuits are those which can hold their output logic levels for indefinite periods as long as the inputs are unchanged. Circuits which depend on charge storage on capacitors are called dynamic circuits and will be discussed in a later chapter.

2.1 Static CMOS Design style

The most common design style in modern VLSI design is the Static CMOS logic style. In this, each logic stage contains pull up and pull down networks which are controlled by input signals. The pull up network contains p channel transistors, whereas the pull down network is made of n channel transistors. The networks are so designed that the pull up and pull down networks are never ‘on’ simultaneously. This ensures that there is no static power consumption.

2.2 CMOS Inverter

The simplest of such logic structures is the CMOS inverter. In fact, for any CMOS logic design, the CMOS inverter is the basic gate which is first analyzed and designed in detail. Thumb rules are then used to convert this design to other more complex logic. The basic CMOS inverter is shown in fig. 2.1. We shall develop the characteristics of CMOS logic through the inverter structure, and later discuss ways of converting this basic structure to more complex logic gates.

2.2.1 Static Characteristics

The range of input voltages can be divided into several regions.

Figure 2.1: The basic CMOS inverter (input Vi, output Vo, supply Vdd)


For 0 < Vi < VTn the n channel transistor is ‘off’, the p channel transistor is ‘on’ and the output voltage = Vdd. This is the normal digital operation range with input = ‘0’ and output = ‘1’.

In the next regime, both transistors are ‘on’. The input voltage Vi is > VTn, but is small enough so that the n channel transistor is in saturation, and the p channel transistor is in the linear regime. In static condition, the output voltage will adjust itself such that the currents through the n and p channel transistors are equal. The absolute value of the gate-source voltage on the p channel transistor is Vdd − Vi, and therefore the “over voltage” on its gate is Vdd − Vi − VTp. The drain-source voltage of the pMOS has an absolute value Vdd − Vo. Therefore,

$$I_d = K_p\left[(V_{dd} - V_i - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right] = \frac{K_n}{2}(V_i - V_{Tn})^2 \tag{2.1}$$

where symbols have their usual meanings.

We define β ≡ Kn/Kp. We make the substitution Vdp ≡ Vdd − Vo, where Vdp is the absolute value of the drain-source voltage for the p channel transistor. Then,

$$(V_{dd} - V_i - V_{Tp})V_{dp} - \frac{1}{2}V_{dp}^2 = \frac{\beta}{2}(V_i - V_{Tn})^2 \tag{2.2}$$

which gives the quadratic

$$\frac{1}{2}V_{dp}^2 - V_{dp}(V_{dd} - V_i - V_{Tp}) + \frac{\beta}{2}(V_i - V_{Tn})^2 = 0 \tag{2.3}$$

Solutions to the quadratic are:

$$V_{dp} = (V_{dd} - V_i - V_{Tp}) \pm \sqrt{(V_{dd} - V_i - V_{Tp})^2 - \beta(V_i - V_{Tn})^2} \tag{2.4}$$



These equations are valid only when the pMOS is in its linear regime. This requires that

$$V_{dp} \equiv V_{dd} - V_o \le V_{dd} - V_i - V_{Tp}$$

Therefore, we must choose the negative sign. Thus

$$V_{dd} - V_o = (V_{dd} - V_i - V_{Tp}) - \sqrt{(V_{dd} - V_i - V_{Tp})^2 - \beta(V_i - V_{Tn})^2} \tag{2.5}$$

Therefore,

$$V_o = V_i + V_{Tp} + \sqrt{(V_{dd} - V_i - V_{Tp})^2 - \beta(V_i - V_{Tn})^2} \tag{2.6}$$

Since Vo must be ≥ Vi + VTp, the limit of applicability of the above result is given by

$$(V_{dd} - V_i - V_{Tp})^2 = \beta(V_i - V_{Tn})^2$$

That is, the solution for Vo is valid for

$$V_i \le \frac{V_{dd} + \sqrt{\beta}\,V_{Tn} - V_{Tp}}{1 + \sqrt{\beta}} \tag{2.7}$$

In the case where we size the n and p channel transistors such that

$$K_n = K_p; \quad \text{so } \beta = 1$$

we have

$$V_o = (V_i + V_{Tp}) + \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(V_{dd} - 2V_i + V_{Tn} - V_{Tp})} \tag{2.8}$$

with

$$V_i \le \frac{V_{dd} + V_{Tn} - V_{Tp}}{2}$$

At the limit of applicability of eq. 2.7, when the input voltage is exactly at

$$V_i = \frac{V_{dd} + \sqrt{\beta}\,V_{Tn} - V_{Tp}}{1 + \sqrt{\beta}} \tag{2.9}$$

both transistors are saturated. Since the currents of both transistors are independent of their drain voltages in this condition, we do not get a unique solution for Vo by equating drain currents. The currents will be equal for all values of Vo in the range

$$V_i - V_{Tn} \le V_o \le V_i + V_{Tp}$$

Thus the transfer curve of an inverter shows a drop of VTn + VTp at a voltage near Vdd/2. This is actually an artifact of the simple transistor model chosen for this analysis, which assumes perfect saturation of drain current. In a real case, the drain current does depend on the drain voltage (albeit weakly) in the saturation region. If the model incorporates an Early Voltage like effect, the drop near the middle of the characteristic is more gradual.

Figure 2.2: Transfer Curve of a CMOS inverter (output voltage versus input voltage, with ViL, ViH, VoL, VoH and the drop of VTn + VTp marked)


At the gate voltage given by eq. 2.9, both transistors are saturated. As we increase Vi beyond this value, such that

$$\frac{V_{dd} + \sqrt{\beta}\,V_{Tn} - V_{Tp}}{1 + \sqrt{\beta}} < V_i < V_{dd} - V_{Tp}$$

both transistors are still ‘on’, but the nMOS enters the linear regime while the pMOS gets saturated. Equating currents in this condition,

$$I_d = \frac{K_p}{2}(V_{dd} - V_i - V_{Tp})^2 = K_n\left[(V_i - V_{Tn})V_o - \frac{1}{2}V_o^2\right] \tag{2.10}$$

From this, we get the quadratic equation

$$\frac{1}{2}V_o^2 - (V_i - V_{Tn})V_o + \frac{(V_{dd} - V_i - V_{Tp})^2}{2\beta} = 0 \tag{2.11}$$



This has solutions

$$V_o = (V_i - V_{Tn}) \pm \sqrt{(V_i - V_{Tn})^2 - \frac{(V_{dd} - V_i - V_{Tp})^2}{\beta}} \tag{2.12}$$

Since the equations are valid only when the n channel transistor is in the linear regime (Vo < Vi − VTn), we choose the negative sign. This gives,

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_i - V_{Tn})^2 - \frac{(V_{dd} - V_i - V_{Tp})^2}{\beta}} \tag{2.13}$$

Again, in the special case where β = 1, we have

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(2V_i - V_{dd} - V_{Tn} + V_{Tp})} \tag{2.14}$$


As we increase the input voltage beyond Vdd − VTp, the p channel transistor turns ‘off’, while the n channel conducts strongly. As a result, the output voltage falls to zero. This is the normal digital operation range with input = ‘1’ and output = ‘0’.

The figure below shows the transfer curve of an inverter with Vdd = 3V, VTn = 0.6V and VTp = 0.5V, and β = 1.

[Figure: calculated transfer curve, output voltage versus input voltage, both from 0 to 3 V.]

The plot produced by SPICE for this circuit with realistic models is quite similar.
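The piecewise transfer characteristic derived above can be evaluated numerically, as in the sketch below. The function name `inverter_vo` is hypothetical, the default parameters are the values quoted for the figure, and the output at the exact transition point (where the simple model gives no unique solution) is arbitrarily taken as the middle of the allowed range.

```python
import math

def inverter_vo(vi, vdd=3.0, vtn=0.6, vtp=0.5, beta=1.0):
    """Static output voltage of a CMOS inverter in the simple model."""
    if vi <= vtn:
        return vdd                       # nMOS off
    if vi >= vdd - vtp:
        return 0.0                       # pMOS off
    v_mid = (vdd + math.sqrt(beta) * vtn - vtp) / (1.0 + math.sqrt(beta))
    p_ov = vdd - vi - vtp                # pMOS over-voltage
    n_ov = vi - vtn                      # nMOS over-voltage
    if vi < v_mid:
        # nMOS saturated, pMOS linear: eq. (2.6)
        return vi + vtp + math.sqrt(max(0.0, p_ov ** 2 - beta * n_ov ** 2))
    if vi > v_mid:
        # nMOS linear, pMOS saturated: eq. (2.13)
        return n_ov - math.sqrt(max(0.0, n_ov ** 2 - p_ov ** 2 / beta))
    # Both saturated: any Vo between Vi - VTn and Vi + VTp balances the currents.
    return 0.5 * ((vi - vtn) + (vi + vtp))

# Sweep the input from 0 to 3 V in 10 mV steps.
curve = [inverter_vo(i * 0.01) for i in range(301)]
```

The `max(0.0, ...)` guards only protect against rounding errors very close to the transition point; they do not change the model.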

2.2.2 Noise margins

The requirement from a digital circuit is that it should distinguish logic levels, but be insensitive to the exact analog voltage at the input. This implies that the flat portions of the transfer curve (where ∂Vo/∂Vi is small) are suitable for digital logic. We select two points on the transfer curve where the slope (∂Vo/∂Vi) is -1.0. The coordinates of these two points define the values of (ViL, VoH) and (ViH, VoL). Robust digital design requires that the output high level be higher than what is acceptable as a high level at the input (VoH > ViH). The difference between these two levels is the ‘high’ noise margin. This is the amount of noise that can ride on the worst case ‘high’ output and still be accepted as a ‘high’ at the input of the next gate. Similarly, we require VoL < ViL. The difference, ViL − VoL, is the ‘low’ noise margin. Obviously, it is of interest to evaluate the values of these noise margins. For the discussion which follows, we shall use the expressions derived earlier for β = 1 to keep the algebra simple.

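Calculation of ViL and VoH

From eq. (2.8),

$$V_o = (V_i + V_{Tp}) + \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(V_{dd} + V_{Tn} - V_{Tp} - 2V_i)}$$

From this, we can evaluate ∂Vo/∂Vi and set it = -1.

$$\frac{\partial V_o}{\partial V_i} = -1 = 1 - \sqrt{\frac{V_{dd} - V_{Tn} - V_{Tp}}{V_{dd} + V_{Tn} - V_{Tp} - 2V_i}} \tag{2.15}$$

This gives

$$V_{iL} = \frac{3V_{dd} + 5V_{Tn} - 3V_{Tp}}{8} \tag{2.16}$$

Substituting this in eq. (2.8), we get

$$V_{oH} = \frac{7V_{dd} + V_{Tn} + V_{Tp}}{8} = V_{dd} - \frac{V_{dd} - V_{Tn} - V_{Tp}}{8} \tag{2.17}$$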

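Calculation of ViH and VoL

When the input is ‘high’, we should use eq. (2.14).

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_{dd} - V_{Tn} - V_{Tp})(2V_i - V_{dd} - V_{Tn} + V_{Tp})}$$

Differentiating with respect to Vi gives

$$\frac{\partial V_o}{\partial V_i} = -1 = 1 - \sqrt{\frac{V_{dd} - V_{Tn} - V_{Tp}}{2V_i - V_{dd} - V_{Tn} + V_{Tp}}} \tag{2.18}$$

From where, we get

$$V_{iH} = \frac{5V_{dd} + 3V_{Tn} - 5V_{Tp}}{8} \tag{2.19}$$

and

$$V_{oL} = \frac{V_{dd} - V_{Tn} - V_{Tp}}{8} \tag{2.20}$$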

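The high noise margin is given by

$$V_{oH} - V_{iH} = \frac{V_{dd} - V_{Tn} + 3V_{Tp}}{4} \tag{2.21}$$

Similarly, the low noise margin is

$$V_{iL} - V_{oL} = \frac{V_{dd} + 3V_{Tn} - V_{Tp}}{4} \tag{2.22}$$

The two noise margins can be made equal by choosing equal values for VTn and VTp.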
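As a worked example, the expressions for β = 1 can be evaluated for the inverter used for the transfer curve earlier (Vdd = 3 V, VTn = 0.6 V, VTp = 0.5 V). The function name `cmos_noise_margins` is hypothetical.

```python
def cmos_noise_margins(vdd, vtn, vtp):
    """Critical voltages and noise margins for beta = 1, eqs. (2.16) to (2.20)."""
    vil = (3 * vdd + 5 * vtn - 3 * vtp) / 8
    voh = (7 * vdd + vtn + vtp) / 8
    vih = (5 * vdd + 3 * vtn - 5 * vtp) / 8
    vol = (vdd - vtn - vtp) / 8
    return {
        "ViL": vil, "VoH": voh, "ViH": vih, "VoL": vol,
        "NM_high": voh - vih,   # 'high' noise margin
        "NM_low": vil - vol,    # 'low' noise margin
    }

levels = cmos_noise_margins(vdd=3.0, vtn=0.6, vtp=0.5)
```

For these values the high noise margin comes out at 0.975 V and the low noise margin at 1.075 V; the small asymmetry comes from the unequal threshold voltages.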

2.2.3 Dynamic Considerations

In this section, we analyze the dynamic behaviour of the inverter. For the calculation of rise and fall times, we shall assume that only one of the two transistors in the inverter is ‘on’. (Notice that this is more conservative than the input high and low conditions determined by slope considerations in eq. 2.19 and 2.16.) We shall continue to use the simple model described at the beginning of this booklet.

Rise time

When the input is low, the n channel transistor is ‘off’, while the p channel transistor is ‘on’. The equivalent circuit in this condition is shown in fig. 2.3.

Figure 2.3: CMOS inverter with the nMOS ‘off’

From Kirchhoff’s current law at the output node,

$$I_{dp} = C\frac{dV_o}{dt}$$

so,

$$\frac{dt}{C} = \frac{dV_o}{I_{dp}}$$

This separates the variables, with the LHS independent of operating voltages and the RHS independent of time. Integrating both sides, we get

$$\frac{\tau_{rise}}{C} = \int_0^{V_{oH}} \frac{dV_o}{I_{dp}}$$

Till the output rises to ViL + VTp, the p channel transistor is in saturation. Since the current is constant, the integration is trivial. If VoH > ViL + VTp (which is normally the case), the integration range can be broken into saturation and linear regimes. Thus

$$\frac{\tau_{rise}}{C} = \int_0^{V_{iL}+V_{Tp}} \frac{dV_o}{\frac{K_p}{2}(V_{dd} - V_{iL} - V_{Tp})^2} + \int_{V_{iL}+V_{Tp}}^{V_{oH}} \frac{dV_o}{K_p\left[(V_{dd} - V_{iL} - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right]}$$

We define V1 ≡ Vdd − Vo and V2 ≡ Vdd − ViL − VTp, so dVo = −dV1. We get

$$\frac{K_p\tau_{rise}}{2C} = \frac{V_{iL} + V_{Tp}}{V_2^2} - \int_{V_2}^{V_{dd}-V_{oH}} \frac{dV_1}{2V_1V_2 - V_1^2}$$

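The integral can be evaluated as

$$I \equiv -\int_{V_2}^{V_{dd}-V_{oH}} \frac{dV_1}{2V_1V_2 - V_1^2} = \frac{1}{2V_2}\int_{V_{dd}-V_{oH}}^{V_2} \left(\frac{1}{V_1} + \frac{1}{2V_2 - V_1}\right)dV_1$$

$$= \frac{1}{2V_2}\left[\ln\frac{V_1}{2V_2 - V_1}\right]_{V_{dd}-V_{oH}}^{V_2} = \frac{1}{2V_2}\ln\frac{2V_2 - V_{dd} + V_{oH}}{V_{dd} - V_{oH}}$$

Therefore,

$$\frac{K_p\tau_{rise}}{2C} = \frac{V_{iL} + V_{Tp}}{V_2^2} + \frac{1}{2V_2}\ln\frac{2V_2 - V_{dd} + V_{oH}}{V_{dd} - V_{oH}}$$

or

$$\frac{K_p\tau_{rise}}{2C} = \frac{V_{iL} + V_{Tp}}{(V_{dd} - V_{iL} - V_{Tp})^2} + \frac{1}{2(V_{dd} - V_{iL} - V_{Tp})}\ln\frac{2V_2 - V_{dd} + V_{oH}}{V_{dd} - V_{oH}}$$

Thus,

$$\tau_{rise} = \frac{C(V_{iL} + V_{Tp})}{\frac{K_p}{2}(V_{dd} - V_{iL} - V_{Tp})^2} + \frac{C}{K_p(V_{dd} - V_{iL} - V_{Tp})}\ln\frac{V_{dd} + V_{oH} - 2V_{iL} - 2V_{Tp}}{V_{dd} - V_{oH}} \tag{2.23}$$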

The first term is just the constant current charging of the load capacitor. The second term represents the charging by the pMOS in its linear range. This can be compared with resistive charging, which would have taken a charge time of

$$\tau = RC\,\ln\frac{V_{dd} - V_{iL} - V_{Tp}}{V_{dd} - V_{oH}}$$

to charge from ViL + VTp to VoH.
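Eq. (2.23) can be cross-checked by integrating dVo/Idp numerically with the simple transistor model. The sketch below does this with a midpoint rule; the function names and the numerical values (C = 1, Kp = 1, input at 0 V, VoH taken as 90% of Vdd) are assumptions chosen only to exercise the formula.

```python
import math

def tau_rise(c, kp, vdd, vtp, vil, voh):
    """Rise time from eq. (2.23)."""
    v2 = vdd - vil - vtp
    t_sat = c * (vil + vtp) / (0.5 * kp * v2 ** 2)
    t_lin = c / (kp * v2) * math.log((vdd + voh - 2 * vil - 2 * vtp) / (vdd - voh))
    return t_sat + t_lin

def tau_rise_numeric(c, kp, vdd, vtp, vil, voh, steps=20000):
    """Midpoint-rule integration of C dVo / Idp from 0 to VoH."""
    v2 = vdd - vil - vtp
    dv = voh / steps
    total = 0.0
    for n in range(steps):
        vo = (n + 0.5) * dv
        vdp = vdd - vo                       # |Vds| of the pMOS
        if vdp >= v2:
            i_dp = 0.5 * kp * v2 ** 2        # pMOS saturated
        else:
            i_dp = kp * (v2 * vdp - 0.5 * vdp ** 2)   # pMOS linear
        total += c * dv / i_dp
    return total

analytic = tau_rise(1.0, 1.0, vdd=3.0, vtp=0.5, vil=0.0, voh=2.7)
numeric = tau_rise_numeric(1.0, 1.0, vdd=3.0, vtp=0.5, vil=0.0, voh=2.7)
```

The two results agree closely, which gives some confidence in the partial fraction step of the derivation.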

Fall time

When the input is high, the n channel transistor is ‘on’ and the p channel transistor is ‘off’. If the output was initially ‘high’, it will be discharged to ground through the nMOS.

Figure 2.4: CMOS inverter with the pMOS ‘off’

To analyze the fall time, we apply Kirchhoff’s current law to the output node. This gives

$$I_{dn} = -C\frac{dV_o}{dt}$$

Again, separating variables and integrating from the initial voltage (= Vdd) to some terminal voltage VoL gives

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_{oL}} \frac{dV_o}{I_{dn}}$$



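The n channel transistor will be in saturation till the output voltage falls to Vi − VTn. Below this voltage, the transistor will be in its linear regime. Thus, we can divide the integration range in two parts.

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_i - V_{Tn}} \frac{dV_o}{I_{dn}} - \int_{V_i - V_{Tn}}^{V_{oL}} \frac{dV_o}{I_{dn}}$$

$$= \int_{V_i - V_{Tn}}^{V_{dd}} \frac{dV_o}{\frac{K_n}{2}(V_i - V_{Tn})^2} + \int_{V_{oL}}^{V_i - V_{Tn}} \frac{dV_o}{K_n\left[(V_i - V_{Tn})V_o - \frac{1}{2}V_o^2\right]}$$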

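Therefore

$$\frac{K_n\tau_{fall}}{2C} = \frac{V_{dd} - V_i + V_{Tn}}{(V_i - V_{Tn})^2} + \int_{V_{oL}}^{V_i - V_{Tn}} \frac{dV_o}{2V_o(V_i - V_{Tn}) - V_o^2}$$

$$= \frac{V_{dd} - V_i + V_{Tn}}{(V_i - V_{Tn})^2} + \frac{1}{2(V_i - V_{Tn})}\int_{V_{oL}}^{V_i - V_{Tn}} \left(\frac{1}{V_o} + \frac{1}{2(V_i - V_{Tn}) - V_o}\right)dV_o$$

which gives

$$\frac{K_n\tau_{fall}}{2C} = \frac{V_{dd} - V_i + V_{Tn}}{(V_i - V_{Tn})^2} + \frac{1}{2(V_i - V_{Tn})}\left[\ln\frac{V_o}{2(V_i - V_{Tn}) - V_o}\right]_{V_{oL}}^{V_i - V_{Tn}}$$

$$= \frac{V_{dd} - V_i + V_{Tn}}{(V_i - V_{Tn})^2} + \frac{1}{2(V_i - V_{Tn})}\ln\frac{2(V_i - V_{Tn}) - V_{oL}}{V_{oL}}$$

and therefore

$$\tau_{fall} = \frac{C(V_{dd} - V_i + V_{Tn})}{\frac{K_n}{2}(V_i - V_{Tn})^2} + \frac{C}{K_n(V_i - V_{Tn})}\ln\frac{2(V_i - V_{Tn}) - V_{oL}}{V_{oL}} \tag{2.24}$$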

Again, the first term represents the time taken to discharge at constant current in the saturation regime, whereas the second term is the quasi-resistive discharge in the linear regime.
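The two contributions in eq. (2.24) can be evaluated separately to see which one dominates. The function name `tau_fall_terms` and the numerical values (C = 1, Kn = 1, input at Vdd = 3 V, VTn = 0.5 V, VoL = 0.3 V) are hypothetical.

```python
import math

def tau_fall_terms(c, kn, vdd, vtn, vi, vol):
    """Saturation and linear-regime parts of the fall time, eq. (2.24)."""
    vov = vi - vtn                                   # nMOS over-voltage
    t_sat = c * (vdd - vov) / (0.5 * kn * vov ** 2)  # constant current discharge
    t_lin = c / (kn * vov) * math.log((2.0 * vov - vol) / vol)  # quasi-resistive part
    return t_sat, t_lin

t_sat, t_lin = tau_fall_terms(c=1.0, kn=1.0, vdd=3.0, vtn=0.5, vi=3.0, vol=0.3)
tau_fall = t_sat + t_lin
```

For these values most of the fall time is spent in the linear regime, because the transistor leaves saturation after the output has dropped by only VTn.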

2.2.4 Trade off between power, speed and robustness

As we scale technologies, we improve speed and power consumption. However, as we can see from the expression for noise margins (eq. 2.21 and eq. 2.22), the noise margin becomes worse. We can improve noise margins by choosing relatively higher threshold voltages. However, this will reduce speeds. We could also increase Vdd, but that would increase power dissipation. Thus we have a trade off between power, speed and noise margins.

This choice is made much more complicated by process variations, because we have to design for the worst case.



2.2.5 CMOS Inverter Design Flow

The CMOS inverter forms the basis of most static CMOS logic design. More complex logic can be designed from it by simple thumb rules. A common (though not universal) design requirement is symmetric charge and discharge behaviour and equal noise margins for high and low logic values. This requires matched values of Kn and Kp and equal values of VTn and VTp. For a constant load capacitance, the rise and fall times are inversely proportional to Kp and Kn respectively (see eq. 2.23 and 2.24). Thus it is a straightforward calculation to determine transistor geometries if speed requirements and technological parameters are given. However, as transistor geometries are made larger, self loading can become significant. We now have to model the load capacitance as

$$C_{Load} = C_{ext} + \alpha K_n$$

where we have assumed that β = Kn/Kp is kept constant. α is a technological constant. We use the expressions for Kτ/C which depend only on voltages. Once these values are calculated, the geometry can be determined.

In the extreme case, when self capacitance dominates the load capacitance, K/C becomes constant and τ becomes geometry independent. There is no advantage in using wider transistors in this regime to increase the speed. It is better to use multi-stage logic with tapered buffers in this regime. This will be discussed in the module on Logical Effort.
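The saturation of speed with transistor size can be illustrated with the load model CLoad = Cext + αKn. In the sketch below the function name `delay` and all numerical values are hypothetical; `voltage_factor` stands for the voltage-dependent quantity Kτ/C obtained from the rise or fall time expressions.

```python
def delay(k_n, c_ext, alpha, voltage_factor):
    """tau = (K tau / C) * C_load / K_n with C_load = C_ext + alpha * K_n."""
    return voltage_factor * (c_ext + alpha * k_n) / k_n

# Hypothetical, normalised values: sweep K_n over several decades.
sizes = [1.0, 10.0, 100.0, 1000.0, 1.0e6]
delays = [delay(k, c_ext=1.0, alpha=0.1, voltage_factor=2.0) for k in sizes]
floor = 2.0 * 0.1   # limiting delay when self loading dominates
```

The delay falls quickly at first and then flattens towards `voltage_factor * alpha`, so beyond some size a wider transistor buys almost no speed.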

2.2.6 Conversion of CMOS Inverters to other logic

Once the basic CMOS inverter is designed, other logic gates can be derived from it. The logic has to be put in a canonical form which is a sum of products with a bar (inversion) on top. For every ‘.’ in the expression, we put the corresponding n channel transistors in series and the corresponding p channel transistors in parallel. For every ‘+’, we put the n channel transistors in parallel and the p channel transistors in series. We scale the transistor widths up by the number of devices (n or p) put in series. The geometries are left untouched for devices put in parallel. Fig. 2.5 shows the implementation of $\overline{A.B + C.(D + E)}$ in CMOS logic design style.

Figure 2.5: CMOS implementation of $\overline{A.B + C.(D + E)}$



Chapter 3

Beyond Static CMOS

3.1 Pseudo nMOS Design Style

CMOS design style ensures that the logic consumes no static power. This is because the pull down and pull up networks are never ‘on’ simultaneously. However, this requires that signals be routed to the n pull down network as well as to the p pull up network. This means that the load presented to every driver is high. The problem is exacerbated by the fact that n and p channel transistors cannot be placed close together, as these are in different wells which have to be kept well separated in order to avoid latchup.

Pseudo nMOS design style reduces dynamic power (by reducing capacitive loading) at the cost of having non-zero static power, by replacing the pull up network by a single pMOS transistor with its gate terminal grounded. The pseudo nMOS inverter is shown below.

[Figure: pseudo nMOS inverter. A pMOS load with grounded gate connects Vdd to Out, and an nMOS driven by the input connects Out to Gnd.]

Notice that since the pMOS is not driven by signals, it is always ‘on’. The effective gate voltage seen by the pMOS transistor is Vdd. Thus the overvoltage on the p channel gate is always Vdd − VTp. When the nMOS is turned ‘on’, a direct path between supply and ground exists and static power will be drawn.



3.1.1 Static Characteristics

As we sweep the input voltage from ground to Vdd, we encounter the following regimes of operation:

nMOS ‘off’

This is the case when the input voltage is less than VTn. The output is ‘high’ and no current is drawn from the supply.


As the input voltage is raised above VTn, we enter the next region, in which the nMOS is saturated. The input voltage is assumed to be sufficiently low that the output voltage exceeds the saturation voltage Vi − VTn. Normally, the output voltage will be higher than VTp, so the p channel transistor is in the linear mode of operation. Equating currents through the n and p channel transistors, we get

$$K_p\left[(V_{dd} - V_{Tp})(V_{dd} - V_o) - \frac{1}{2}(V_{dd} - V_o)^2\right] = \frac{K_n}{2}(V_i - V_{Tn})^2 \tag{3.1}$$

Defining V1 ≡ Vdd − Vo and V2 ≡ Vdd − VTp, we get

$$\frac{1}{2}V_1^2 - V_2V_1 + \frac{\beta}{2}(V_i - V_{Tn})^2 = 0 \tag{3.2}$$

with solutions

$$V_1 = V_2 \pm \sqrt{V_2^2 - \beta(V_i - V_{Tn})^2}$$

Substituting the values of V1 and V2 and choosing the sign which puts Vo in the correct range, we get

$$V_o = V_{Tp} + \sqrt{(V_{dd} - V_{Tp})^2 - \beta(V_i - V_{Tn})^2} \tag{3.3}$$


As the input voltage is increased, the output voltage will decrease in accordance with equation (3.3). At some point, the output voltage will fall below Vi − VTn. It can be shown that this will happen when

$$V_i > V_{Tn} + \frac{V_{Tp} + \sqrt{V_{Tp}^2 + (\beta + 1)V_{dd}(V_{dd} - 2V_{Tp})}}{\beta + 1}$$

The nMOS is now in its linear mode of operation. We shall not derive the expression for the output voltage in this mode of operation in the discussion here. The solution is straightforward, though algebraically tedious.




As the input voltage is raised still further, the output voltage will fall below VTp. The pMOS transistor is now in the saturation regime. Equating currents, we get

$$K_n\left[(V_i - V_{Tn})V_o - \frac{1}{2}V_o^2\right] = \frac{K_p}{2}(V_{dd} - V_{Tp})^2$$

which gives

$$\frac{1}{2}V_o^2 - (V_i - V_{Tn})V_o + \frac{(V_{dd} - V_{Tp})^2}{2\beta} = 0$$

This can be solved to get

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_i - V_{Tn})^2 - (V_{dd} - V_{Tp})^2/\beta} \tag{3.4}$$

3.1.2 Noise margins

As in the case of the CMOS inverter, we find points on the transfer curve where the slope is -1.

When the input is low and the output high, we should use eq. (3.3). Differentiating this equation with respect to Vi and setting the slope to -1, we get

$$V_{iL} = V_{Tn} + \frac{V_{dd} - V_{Tp}}{\sqrt{\beta(\beta + 1)}} \tag{3.5}$$

and

$$V_{oH} = V_{Tp} + \sqrt{\frac{\beta}{\beta + 1}}\,(V_{dd} - V_{Tp}) \tag{3.6}$$

When the input is high and the output low, we use eq. (3.4). Again, differentiating with respect to Vi and setting the slope to -1, we get

$$V_{iH} = V_{Tn} + \frac{2}{\sqrt{3\beta}}(V_{dd} - V_{Tp}) \tag{3.7}$$

and

$$V_{oL} = \frac{V_{dd} - V_{Tp}}{\sqrt{3\beta}} \tag{3.8}$$

To make the output ‘low’ value lower than VTn, we get the condition

$$\beta > \frac{1}{3}\left(\frac{V_{dd} - V_{Tp}}{V_{Tn}}\right)^2$$



This condition on values of β places a requirement on the ratios of widths of n and p channel transistors. The logic gates work properly only when this equation is satisfied. Therefore this kind of logic is also called ‘ratioed logic’. In contrast, CMOS logic is called ratioless logic because it does not place any restriction on the ratios of widths of n and p channel transistors for static operation. The noise margin for pseudo nMOS can be determined easily from the expressions for ViL, VoL, ViH, VoH.
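The ratio condition and the critical voltages of eqs. (3.5) to (3.8) are easy to evaluate numerically. The sketch below uses hypothetical function names and, purely for illustration, the same supply and threshold values as the CMOS example (Vdd = 3 V, VTn = 0.6 V, VTp = 0.5 V).

```python
import math

def pseudo_nmos_levels(vdd, vtn, vtp, beta):
    """ViL, VoH, ViH and VoL of a pseudo nMOS inverter, eqs. (3.5) to (3.8)."""
    v2 = vdd - vtp
    vil = vtn + v2 / math.sqrt(beta * (beta + 1.0))
    voh = vtp + math.sqrt(beta / (beta + 1.0)) * v2
    vih = vtn + 2.0 * v2 / math.sqrt(3.0 * beta)
    vol = v2 / math.sqrt(3.0 * beta)
    return vil, voh, vih, vol

def min_beta(vdd, vtn, vtp):
    """Smallest beta for which VoL stays below VTn."""
    return ((vdd - vtp) / vtn) ** 2 / 3.0

beta_limit = min_beta(3.0, 0.6, 0.5)
vil, voh, vih, vol = pseudo_nmos_levels(3.0, 0.6, 0.5, beta=2.0 * beta_limit)
noise_margin_high = voh - vih
noise_margin_low = vil - vol
```

For these values β must exceed roughly 5.8, which shows how strongly the nMOS has to be sized relative to the pMOS load.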

3.1.3 Dynamic characteristics

In the sections above, we have derived the behaviour of a pseudo nMOS inverter in static conditions. In the sections below, we discuss the dynamic behaviour of this inverter.

Rise Time

When the input is low and the output rises from ‘low’ to ‘high’, the nMOS is off. The situation is identical to the charge up condition of a CMOS gate with the pMOS being biased with its gate at 0V. This gives

$$\tau_{rise} = \frac{C}{K_p(V_{dd} - V_{Tp})}\left[\frac{2V_{Tp}}{V_{dd} - V_{Tp}} + \ln\frac{V_{dd} + V_{oH} - 2V_{Tp}}{V_{dd} - V_{oH}}\right] \tag{3.9}$$

Fall Time

Analytical calculation of fall time is complicated by the fact that the pMOS load continues to dump current into the output node, even as the nMOS tries to discharge the output capacitor.

Figure 3.1: ‘high’ to ‘low’ transition on the output

Thus the nMOS should sink the discharge current as well as the drain current of the pMOS transistor. We make the simplifying assumption that the pMOS current remains constant at its saturation value through the entire discharge process. (This will result in a slightly pessimistic value of discharge time.) Then,

$$I_p = \frac{K_p}{2}(V_{dd} - V_{Tp})^2$$

We can write the KCL equation at the output node as:

$$I_n - I_p + C\frac{dV_o}{dt} = 0$$

which gives

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_{oL}} \frac{dV_o}{I_n - I_p}$$

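We define $V_1 \equiv V_i - V_{Tn}$ and $V_2 \equiv V_{dd} - V_{Tp}$. The integration range can be divided into two regimes. The nMOS is saturated when $V_1 \le V_o < V_{dd}$ and is in the linear regime when $V_{oL} < V_o < V_1$. Therefore,

$$\frac{\tau_{fall}}{C} = -\int_{V_{dd}}^{V_1} \frac{dV_o}{\frac{1}{2}K_nV_1^2 - I_p} - \int_{V_1}^{V_{oL}} \frac{dV_o}{K_n(V_1V_o - \frac{1}{2}V_o^2) - I_p}$$

so,

$$\frac{\tau_{fall}}{C} = \frac{V_{dd} - V_1}{\frac{1}{2}K_nV_1^2 - I_p} + \int_{V_{oL}}^{V_1} \frac{dV_o}{K_n(V_1V_o - \frac{1}{2}V_o^2) - I_p}$$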
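The remaining integral has no convenient closed form once $I_p$ is included, so a numerical estimate is often the practical route. The following is a minimal sketch, assuming a simple midpoint rule and a caller-supplied end voltage `v_end` that lies above the static output ‘low’ level (where the net discharge current would vanish and the integral would diverge). The function name, the midpoint rule and all numbers are illustrative assumptions, not taken from the notes.

```python
def fall_time(c, kn, kp, vdd, vtn, vtp, v_end, vi=None, steps=2000):
    """Estimate the fall time with a constant pMOS current I_p.

    The output discharges from vdd to v_end. vi defaults to vdd (input fully
    high). vtp is the magnitude of the pMOS threshold.
    """
    vi = vdd if vi is None else vi
    v1 = vi - vtn                       # V1 = Vi - VTn
    v2 = vdd - vtp                      # V2 = Vdd - VTp
    ip = 0.5 * kp * v2 ** 2             # saturated pMOS current
    i_sat = 0.5 * kn * v1 ** 2          # saturated nMOS current

    def net_linear(vo):
        # Net discharge current with the nMOS in the linear regime.
        return kn * (v1 * vo - 0.5 * vo ** 2) - ip

    if v_end > v1 or i_sat <= ip or net_linear(v_end) <= 0.0:
        raise ValueError("v_end must lie between the static low level and V1")

    # Saturated regime: constant net current from Vdd down to V1.
    t_sat = (vdd - v1) / (i_sat - ip)

    # Linear regime: midpoint-rule integration from v_end up to V1.
    h = (v1 - v_end) / steps
    t_lin = sum(h / net_linear(v_end + (k + 0.5) * h) for k in range(steps))
    return c * (t_sat + t_lin)


if __name__ == "__main__":
    # Hypothetical values: C = 100 fF, Kn = 600 uA/V^2, Kp = 50 uA/V^2.
    t = fall_time(100e-15, 600e-6, 50e-6, 3.3, 0.5, 0.5, v_end=0.5)
    print("estimated fall time (s):", t)
```

Setting `kp=0.0` recovers the fall time without the load current, which gives a quick feel for how much the always-on pMOS slows the transition.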

3.1.4 Pseudo nMOS design Flow

We design the basic inverter first and then map the inverter design to other logic circuits. The load device size is calculated from the rise time. From eq. 3.9 we have

$$\tau_{rise} = \frac{C}{K_p(V_{dd} - V_{Tp})}\left[\frac{2V_{Tp}}{V_{dd} - V_{Tp}} + \ln\frac{V_{dd} + V_{oH} - 2V_{Tp}}{V_{dd} - V_{oH}}\right]$$

Given a value of $\tau_{rise}$, operating voltages and technological constants, $K_p$ and hence the geometry of the p channel transistor can be determined.

The geometry of the n channel transistor in the reference inverter design can be determined from static considerations. Using eq. 3.4, the output ‘low’ level is given by:

$$V_o = (V_i - V_{Tn}) - \sqrt{(V_i - V_{Tn})^2 - (V_{dd} - V_{Tp})^2/\beta}$$

If the desired value of the output ‘low’ level is given, we can calculate β. But $\beta \equiv K_n/K_p$ and $K_p$ is already known. This determines $K_n$ and hence the geometry of the n channel transistor.



Figure 3.2: Pseudo NMOS implementation of $\overline{A.B + C.(D + E)}$

3.1.5 Conversion of pseudo nMOS Inverter to other logic

Once the basic pseudo nMOS inverter is designed, other logic gates can be derived from it. The procedure is the same as that for CMOS, except that it is applied only to nMOS transistors. The p channel transistor is kept at the same size as that for an inverter.

The logic is expressed as a sum of products with a bar (inversion) on top. For every ‘.’ in the expression, we put the corresponding n channel transistors in series and for every ‘+’, we put the n channel transistors in parallel. We scale the transistor widths up by the number of devices put in series. The geometries are left untouched for devices put in parallel. Fig. 3.2 shows the implementation of $\overline{A.B + C.(D + E)}$ in pseudo NMOS logic design style.

3.2 Complementary Pass gate Logic

This logic family is based on multiplexer logic.

Given a boolean function $F(x_1, x_2, \ldots, x_n)$, we can express it as:

$$F(x_1, x_2, \ldots, x_n) = x_i \cdot f_1 + \overline{x_i} \cdot f_2$$

where $f_1$ and $f_2$ are reduced expressions for F with $x_i$ forced to 1 and 0 respectively. Thus, F can be implemented with a multiplexer controlled by $x_i$ which selects $f_1$ or $f_2$ depending on $x_i$. $f_1$ and $f_2$ can themselves be decomposed into simpler expressions by the same technique.
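The decomposition can be checked exhaustively for small functions. The sketch below is a behavioural illustration only: it treats logic values as 0/1 integers, and the helper names (`cofactors`, `mux`, `shannon_holds`) are hypothetical.

```python
from itertools import product


def cofactors(f, i):
    """Return (f1, f2): f with input number i forced to 1 and to 0."""
    def f1(*x):
        return f(*x[:i], 1, *x[i + 1:])

    def f2(*x):
        return f(*x[:i], 0, *x[i + 1:])

    return f1, f2


def mux(sel, in1, in0):
    """Two-input multiplexer: passes in1 when sel is 1, else in0."""
    return in1 if sel else in0


def shannon_holds(f, n, i):
    """Check F = x_i.f1 + not(x_i).f2 over all 2**n input combinations."""
    f1, f2 = cofactors(f, i)
    return all(
        f(*x) == mux(x[i], f1(*x), f2(*x))
        for x in product((0, 1), repeat=n)
    )


def xor(a, b):
    return a ^ b


def example(a, b, c, d, e):
    return (a & b) | (c & (d | e))


if __name__ == "__main__":
    print(shannon_holds(xor, 2, 0))
    print(all(shannon_holds(example, 5, i) for i in range(5)))
```

For XOR with A as the select input, the two cofactors are simply the complement of B and B itself, which is what makes the pass-gate implementation so compact.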

To implement a multiplexer, we need both $x_i$ and $\overline{x_i}$. Therefore, this logic family needs all inputs in true as well as in complement form. In order to drive other gates of the same type, it must produce the outputs also in true and complement forms. Thus each signal is carried by two wires. This logic style is called “Complementary Passgate Logic” or CPL for short.

Figure 3.3: Basic Multiplexer with logic restoring inverters

3.2.1 Basic Multiplexer Structure

Pure passgate logic contains no ‘amplifying’ elements. Therefore, it has zero or negative noise margin (each logic stage degrades the logic level), and multiple logic stages cannot be cascaded. We shall assume that each stage includes conventional CMOS inverters to restore the logic level. Ideally, the multiplexer should be composed of complementary pass gate transistors. However, we shall use just n channel transistors as switches for simplicity. This gives us the multiplexer structure shown in fig. 3.3.

3.2.2 Logic Design using CPL

Since both true and complement outputs are generated by CPL, we do not need separate gates for AND and NAND functions. The same applies to OR-NOR, and XOR-XNOR functions.

To take an example, let us consider the XOR-XNOR functions. Because of the inverter, the multiplexer for the XOR output first calculates the XNOR function given by $A.B + \overline{A}.\overline{B}$. If we put A = 1, this reduces to $B$ and for A = 0, it reduces to $\overline{B}$. Similarly, for the XNOR output, we generate the XOR expression $A.\overline{B} + \overline{A}.B$, which will be inverted by the logic level restoring inverter. The expression reduces to $\overline{B}$ for A = 1 and to $B$ for A = 0. This leads to an implementation of XOR-XNOR as shown in fig. 3.4.

Figure 3.4: Implementation of XOR and XNOR by CPL logic.

Figure 3.5: Implementation of (a) AND-NAND and (b) OR-NOR functions using complementary passgate logic.

Implementation of AND and OR functions is similar. In case of AND, the multiplexer should output $\overline{A.B}$ to be inverted by the buffer. This reduces to $\overline{B}$ when A = 1. When A = 0, it evaluates to $1 = \overline{A}$. For the NAND output, the multiplexer should output $A.B$, which evaluates to $B$ for A = 1 and to 0 (or $A$) when A = 0.

3.2.3 Buffer Leakage Current

The circuit configuration described above uses nMOS multiplexers. This limits the ‘high’ output of the multiplexer (node y, which is the input for the inverter) to $V_{dd} - V_{Tn}$. Consequently, the pMOS transistor in the buffer inverter never quite turns off. This results in static power consumption in the inverter.

Figure 3.6: High leakage current in inverter

This can be avoided by adding a pull up pMOS as shown in fig. 3.7. When the multiplexer output (y) is ‘low’, the inverter output is high. The pMOS is therefore off and has no effect. When the multiplexer output goes ‘high’, the inverter input charges up, the output starts falling and turns the pMOS on. Now, as the multiplexer output (y) approaches $V_{dd} - V_{Tn}$, the nMOS switch in the multiplexer turns off. However, the pMOS pull up remains ‘on’ and takes the inverter input all the way to $V_{dd}$. This avoids leakage in the inverter.

Figure 3.7: Pull up pMOS to avoid leakage in the inverter

However, this solution brings up another problem. Consider the equivalent circuit when the inverter output is ‘low’ and the pMOS is ‘on’. Now if the multiplexer output wants to go ‘low’, it has to fight the pMOS pull up, which is trying to keep this node ‘high’.

Figure 3.8: Problem with a low to high transition on the output

In fact, the multiplexer n transistor and the pull up p transistor constitute a pseudo nMOS inverter. Therefore, the multiplexer output cannot be pulled low unless the transistor geometries are appropriately ratioed.

3.3 Cascade Voltage Switch Logic

We can understand this logic configuration as an attempt to improve pseudo-nMOS logic circuits. Consider the NOR gate shown in fig. 3.9.

Figure 3.9: Pseudo-nMOS NOR

Static power is consumed by this NOR circuit whenever the output is ‘LOW’. This happens when A OR B is TRUE. We wish that the pMOS could be turned off for just this combination of inputs.

To turn the pMOS transistor off, we need to apply a ‘HIGH’ voltage level to its gate whenever A OR B is true. This obviously requires an OR gate. Non-inverting gates cannot be made in a single stage. However, we can create the OR function by using a NAND of $\overline{A}$ and $\overline{B}$ as shown in figure 3.10.

Figure 3.10: Pseudo-nMOS OR from complemented inputs

But then what about the pMOS drive of this circuit?

We want to turn the pMOS of this OR circuit off when both $\overline{A}$ and $\overline{B}$ are ‘HIGH’; i.e. when A = B = 0. This means we would like to turn the pMOS of this circuit off when the NOR of A and B is ‘TRUE’.

But we already have this signal as the output of the first (NOR) circuit! So the two circuits can drive each other’s pMOS transistors and avoid static power consumption. This kind of logic is called Cascade Voltage Switch Logic (CVSL).

Figure 3.11: OR-NOR implementation in Cascade Voltage Switch Logic

It can use any network $f$ and its complementary network $\overline{f}$ in the two cross-coupled branches. The complementary network is constructed by changing all series connections in $f$ to parallel and all parallel connections to series, and complementing all input signals.

CVSL shares many characteristics with static CMOS, CPL and pseudo-nMOS.

• Like CMOS static logic, there is no static power consumption.

• Like CPL, this logic requires both True and Complement signals. It also provides both True and Complement outputs. (Dual Rail Logic).

• Like pseudo nMOS, the inputs present a single transistor load to the driving stage.

• The circuit is self latching. This reduces ratioing requirements.

3.4 Dynamic Logic

In this style of logic, some nodes are required to hold their logic value as a charge stored on a capacitor. These nodes are not connected to their ‘drivers’ permanently. The ‘driver’ places the logic value on them, and is then disconnected from the node. Due to leakage etc., the logic value cannot be held indefinitely. Dynamic circuits therefore require a minimum clock frequency to operate correctly. Use of dynamic circuits can reduce circuit complexity and power consumption substantially.

Figure 3.12: CMOS dynamic gate to implement $\overline{(A + B).C}$.

When the clock is low, the pMOS is on and the bottom nMOS is off. The output is ‘pre-charged’ to 1 unconditionally. When the clock goes high, the pMOS turns off and the bottom nMOS comes on. The circuit then conditionally discharges the output node, if (A+B).C is TRUE. This implements the function $\overline{(A + B).C}$.



3.4.1 Problem with Cascading CMOS dynamic logic

[Figure: dynamic gate with internal node X driving a further stage with output Out; waveforms of Ck, X and Out for (A+B).C = FALSE and for (A+B).C = TRUE]

There is no problem when (A+B).C is false. X pre-charges to 1 and remains at 1.

When (A+B).C is TRUE, X takes some time to discharge. During this time, charge placed on the output leaks away as the input to the nMOS of the inverter is not 0.



3.4.2 Four Phase Dynamic Logic

Figure 3.13: CMOS 4 phase dynamic logic

The problem can be solved by using a 4 phase clock. The idea is to sample the previous stage only after its evaluation is complete.

In phase 1, node P is pre-charged. In phase 2, P as well as the output are pre-charged. In phase 3, the gate evaluates. In phases 4 and 1, the output is isolated from the driver and remains valid. This is called a type 3 gate. It evaluates in phase 3 and is valid in phases 4 and 1. Similarly, we can have type 4, type 1 and type 2 gates. A type 3 gate can drive a type 4 or a type 1 gate. Similarly, type 4 will drive types 1 and 2; type 1 will drive types 2 and 3; and type 2 will drive types 3 and 4.

Figure 3.14: CMOS 4 phase dynamic logic drive constraints

We can use a 2 phase clock if we stick to type 1 and type 3 gates (or type 2 and type 4 gates) as these can drive each other.
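The drive constraints follow one modular rule: a type k gate is valid in the two phases after phase k, so it can feed only gates that evaluate in one of those phases. A small sketch of that rule follows (the function names are hypothetical, and this is a bookkeeping aid rather than a circuit model):

```python
def valid_phases(gate_type):
    """Phases (1..4) in which a gate of the given type holds a valid output.

    A type k gate evaluates in phase k and stays valid in the next two phases.
    """
    return [gate_type % 4 + 1, (gate_type + 1) % 4 + 1]


def can_drive(driver_type, load_type):
    """A load of type k evaluates in phase k, so the driver must be valid then."""
    return load_type in valid_phases(driver_type)


if __name__ == "__main__":
    for driver in (1, 2, 3, 4):
        loads = [t for t in (1, 2, 3, 4) if can_drive(driver, t)]
        print("type", driver, "can drive types", loads)
```

The two-phase special case falls out directly: types 1 and 3 each evaluate while the other is valid, and the same holds for types 2 and 4.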

3.4.3 Domino Logic

Figure 3.15: CMOS domino logic

Another way to eliminate the problem with cascading logic stages is to use a static inverter after the CMOS dynamic gate. Recall that the cascaded dynamic CMOS stage causes problems because the output is pre-charged to $V_{dd}$. If the final value is meant to be zero, the next stage nMOS to which the output is connected erroneously sees a one till the pre-charged output is brought down to zero. During this time, it ends up discharging its own pre-charged output, which it was not supposed to do. If an inverter is added, the output is held ‘low’ before logic evaluation. If the final output is zero, there is no problem anyway. If the final output is supposed to be one, the next stage is erroneously held at zero for some time. However, this does not result in a false evaluation by the next stage. The only effect it can have is that the next stage starts its evaluation a little later. However, the addition of an inverter means that the logic is non-inverting. Therefore, it cannot be used to implement any arbitrary logic function.

3.4.4 Zipper logic

Instead of using an inverter, we can alternate n and p evaluation stages. The n stage is pre-charged high, but it drives a p stage. A high pre-charged stage will keep the p evaluation stage off, which will not cause any malfunction. The p stage will be pre-discharged to ‘low’, which is safe for driving n stages. This kind of logic is called zipper logic.

Figure 3.16: Zipper logic. (A, B, C must be from p stages. D and E must be from n stages.)



CMOS Mixed Signal Design

Part I: OpAmp Design

Dinesh Sharma


September 19, 2010




Introduction

Linear Mode of Operation

[Figure: inverter transfer curve with $V_{OH}$, $V_{OL}$, $V_{iL}$ and $V_{iH}$ marked]

Analog circuits require the output voltage to be sensitive to the input voltage.

Digital logic requires the output to be insensitive to the exact input voltage.

Circuits need to be biased for operation in the linear regime.




Single Transistor Amplifier

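[Figure: single transistor amplifier with a current source load $I_d$; input $v_i$ at the gate ($V_g$), output $v_o$ at the drain ($V_d$)]

$$dI_d = \frac{\partial I_d}{\partial V_g}dV_g + \frac{\partial I_d}{\partial V_d}dV_d$$

$$\frac{\partial I_d}{\partial V_g} = g_m \ \text{(Transconductance)}, \qquad \frac{\partial I_d}{\partial V_d} = g_o \ \text{(Output conductance)}$$

The current source load keeps the drain current constant. So

$$dI_d = 0 = g_m v_i + g_o v_o$$

Hence, the voltage gain ($A_o$) is

$$A_o = \frac{v_o}{v_i} = -\frac{g_m}{g_o} = -g_m r_o$$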





Transistor Characteristics


$g_m$ and $g_o$ depend on the transistor characteristics. In saturation,

$$I_d \simeq \frac{K}{2}(V_{gs} - V_T)^2$$

where K is the conductivity factor given by:

$$K = K'\left(\frac{W}{L}\right) \equiv \mu C_{ox}\left(\frac{W}{L}\right)$$

$V_T$ is the threshold voltage, W and L are the transistor width and length respectively, µ is the mobility and $C_{ox}$ is the gate oxide capacitance per unit area.






Transconductance

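Let $V_{GT} \equiv (V_{gs} - V_T)$.

Then $I_d = \dfrac{K V_{GT}^2}{2}$ and $V_{GT} = \sqrt{\dfrac{2I_d}{K}}$.

$$g_m = \frac{\partial I_d}{\partial V_g} = K V_{GT} = K'\left(\frac{W}{L}\right)V_{GT}$$

Also

$$g_m = K V_{GT} = K\sqrt{\frac{2I_d}{K}} = \sqrt{2KI_d} = \sqrt{2K'\left(\frac{W}{L}\right)I_d}$$

Similarly, $K = \dfrac{2I_d}{V_{GT}^2}$; therefore

$$g_m = \frac{2I_d}{V_{GT}^2}V_{GT} = \frac{2I_d}{V_{GT}}$$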






Which formula?

$$g_m = K'\left(\frac{W}{L}\right)V_{GT}$$

$$g_m = \sqrt{2K'\left(\frac{W}{L}\right)I_d}$$

$$g_m = \frac{2I_d}{V_{GT}}$$

To increase $g_m$, should we increase $V_{GT}$, or decrease it? Is $g_m$ linearly dependent on transistor size? Dependent on its square root? Or is it independent of transistor size?

In fact, which formula should be applied depends on how the transistor is biased and sized. If size and $V_{GT}$ are known, the first formula applies. If the drain current and size are known, the second one does. If gate voltage and drain current are given and the transistor is accordingly sized, the third formula should be used.
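The three expressions are the same quantity written in different free variables, which a few lines of arithmetic make concrete. The sketch below uses hypothetical function names and arbitrary example values; it only restates the square-law formulas given above.

```python
import math


def drain_current(k_prime, w_over_l, vgt):
    """Square-law saturation current: Id = K'(W/L) VGT^2 / 2."""
    return 0.5 * k_prime * w_over_l * vgt ** 2


def gm_from_size_and_vgt(k_prime, w_over_l, vgt):
    """First formula: size and VGT known."""
    return k_prime * w_over_l * vgt


def gm_from_size_and_current(k_prime, w_over_l, i_d):
    """Second formula: size and drain current known."""
    return math.sqrt(2.0 * k_prime * w_over_l * i_d)


def gm_from_current_and_vgt(i_d, vgt):
    """Third formula: drain current and VGT known (size follows)."""
    return 2.0 * i_d / vgt


if __name__ == "__main__":
    # Arbitrary illustrative values: K' = 150 uA/V^2, W/L = 10, VGT = 0.2 V.
    k_prime, w_over_l, vgt = 150e-6, 10.0, 0.2
    i_d = drain_current(k_prime, w_over_l, vgt)
    print(gm_from_size_and_vgt(k_prime, w_over_l, vgt))
    print(gm_from_size_and_current(k_prime, w_over_l, i_d))
    print(gm_from_current_and_vgt(i_d, vgt))
    # Doubling W/L doubles gm at fixed VGT, but scales it by sqrt(2) at fixed Id.
    print(gm_from_size_and_vgt(k_prime, 2 * w_over_l, vgt))
    print(gm_from_size_and_current(k_prime, 2 * w_over_l, i_d))
```

The apparent contradictions in the questions above disappear once it is clear which two quantities are being held fixed.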






Output conductance

Assuming a simple Early effect like model, we can write for $g_o$:

$$g_o \simeq \lambda' I_d / L$$

where L is the channel length and $\lambda'$ is a technology dependent parameter. In terms of geometry and $V_{GT}$, we can write:

$$g_o = \frac{\lambda' K'}{2}\,\frac{W}{L^2}\,V_{GT}^2$$

The Early voltage $V_A$ is $L/\lambda'$. So,

$$g_o \simeq I_d / V_A = \frac{K'W}{2\lambda'}\left(\frac{V_{GT}}{V_A}\right)^2$$





DC Voltage Gain

The magnitude of the voltage gain in terms of geometry and $V_{GT}$:

$$A_o = \frac{2L}{\lambda' V_{GT}}$$

In terms of drain current and geometry:

$$A_o = \frac{1}{\lambda'}\sqrt{\frac{2K'WL}{I_d}}$$

Thus, if the transistor is biased at constant current, the DC gain is determined by the square root of the gate area.





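AC Behaviour

[Figure: small-signal equivalent circuit with $C_g$ at the input, $C_{gd}$ between gate and drain, and $g_m v_i$, $r_o$ and $C_o$ at the output]

$$sC_{gd}(v_i - v_o) - g_m v_i - \frac{v_o}{r_o} - sC_o v_o = 0$$

$$v_i(sC_{gd} - g_m) - v_o\left(sC_{gd} + \frac{1}{r_o} + sC_o\right) = 0$$

So the AC gain is

$$A_1 = \frac{v_o}{v_i} = -g_m r_o\,\frac{1 - sC_{gd}/g_m}{1 + sr_o(C_{gd} + C_o)}$$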





Bandwidth

$$A_1 = -g_m r_o\,\frac{1 - sC_{gd}/g_m}{1 + sr_o(C_{gd} + C_o)}$$

Let $C_{tot} \equiv C_{gd} + C_o$. Then,

$$A_1 = A_o\,\frac{1 - sC_{gd}/g_m}{1 + sr_oC_{tot}}$$

Normally, $\omega C_{gd}/g_m \ll 1$. Therefore,

$$A_1 \simeq \frac{A_o}{1 + sr_oC_{tot}}$$

This describes the frequency response of a system with one dominant pole. The bandwidth is given by $1/(r_oC_{tot})$.





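Gain Bandwidth Product

[Figure: gain (dB) versus frequency, showing $A_o$, the $A_o$ - 3 dB point at the bandwidth BW, and the 0 dB crossing at GBW]

$$GBW = g_m r_o \cdot \frac{1}{r_oC_{tot}} = \frac{g_m}{C_{tot}}$$

The gain bandwidth product (or the cutoff frequency) is independent of $r_o$.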





Maximum GBW

GBW is maximum when there is no load connected and the load is entirely due to the device capacitance itself. Then the load capacitance is proportional to the device width.

$C_{tot} = \chi W$, where χ is a technological parameter.

$$GBW_{max} = \frac{g_m}{\chi W}$$

$$GBW_{max} = \frac{K'V_{GT}}{\chi L} = \frac{1}{\chi}\sqrt{\frac{2K'I_d}{WL}} = \frac{2I_d}{\chi WV_{GT}}$$





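Summary

Each column lists the expressions for one choice of free design variables.

$$
\begin{array}{l|ccc}
\text{Parameter} & W,\ L,\ V_{GT} & W,\ L,\ I_d & L,\ V_{GT},\ I_d \\ \hline
g_m & K'\dfrac{W}{L}V_{GT} & \sqrt{2K'\dfrac{W}{L}I_d} & \dfrac{2I_d}{V_{GT}} \\
g_o & \dfrac{\lambda'K'WV_{GT}^2}{2L^2} & \dfrac{\lambda'I_d}{L} & \dfrac{\lambda'I_d}{L} \\
A_o & \dfrac{2L}{\lambda'V_{GT}} & \dfrac{1}{\lambda'}\sqrt{\dfrac{2K'WL}{I_d}} & \dfrac{2L}{\lambda'V_{GT}} \\
GBW & \dfrac{K'WV_{GT}}{LC_{tot}} & \sqrt{\dfrac{2K'WI_d}{L}}\,\dfrac{1}{C_{tot}} & \dfrac{2I_d}{V_{GT}C_{tot}} \\
GBW_{max} & \dfrac{K'V_{GT}}{\chi L} & \dfrac{1}{\chi}\sqrt{\dfrac{2K'I_d}{WL}} & \dfrac{K'V_{GT}}{\chi L}
\end{array}
$$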





Technological Constraint

$$A_o \cdot GBW_{max} = \frac{2L}{\lambda'V_{GT}} \cdot \frac{K'V_{GT}}{\chi L} = \frac{1}{\lambda'}\sqrt{\frac{2K'WL}{I_d}} \cdot \frac{1}{\chi}\sqrt{\frac{2K'I_d}{WL}}$$

So

$$A_o \cdot GBW_{max} = \frac{2K'}{\lambda'\chi}$$

Therefore, this quantity is a technological constant and the designer has no control over it. What if an application requires a Gain-GBW product higher than this value?
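A quick way to see the constraint is to vary the design variables and watch the product stay fixed. The sketch below does this with hypothetical technology values ($\lambda'$ in m/V and χ in F/m are made-up numbers chosen only for illustration) and hypothetical function names.

```python
def dc_gain(length, lam, vgt):
    """|Ao| = 2L / (lambda' VGT) for the single transistor amplifier."""
    return 2.0 * length / (lam * vgt)


def gbw_max(k_prime, vgt, chi, length):
    """GBWmax = K' VGT / (chi L), in rad/s when SI units are used."""
    return k_prime * vgt / (chi * length)


def gain_gbw_limit(k_prime, lam, chi):
    """The technological constant 2K' / (lambda' chi)."""
    return 2.0 * k_prime / (lam * chi)


if __name__ == "__main__":
    # Hypothetical technology: K' = 150 uA/V^2, lambda' = 0.05 um/V, chi = 2 fF/um.
    k_prime, lam, chi = 150e-6, 0.05e-6, 2e-9
    for length, vgt in ((0.5e-6, 0.2), (1.0e-6, 0.2), (2.0e-6, 0.4)):
        product = dc_gain(length, lam, vgt) * gbw_max(k_prime, vgt, chi, length)
        print(length, vgt, product, gain_gbw_limit(k_prime, lam, chi))
```

Longer channels buy gain and shorter ones buy speed, but within a single stage the product does not move.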




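Cascode Amplifier

[Figure: cascode amplifier. M1 (gate $V_{g1}$, input $v_{in}$) with the cascode transistor M2 (gate $V_{g2}$, biased at $V_{ref}$) stacked on top; drain nodes $V_{d1}$ and $V_{d2}$ (output $v_{out}$); current source load $I_d$]

$$dI_d = g_{meq}\,dV_{g1} + g_{oeq}\,dV_{d2}$$

So

$$g_{meq} = \frac{\partial I_d}{\partial V_{g1}} \ \text{with } dV_{d2} = 0 \qquad \text{and} \qquad g_{oeq} = \frac{\partial I_d}{\partial V_{d2}} \ \text{with } dV_{g1} = 0$$

To calculate $g_{meq}$, we put a voltage source at the output node and calculate $\partial I_d/\partial V_{g1}$.

$g_{oeq}$ is calculated by putting a voltage source at $v_{g1}$ and calculating $\partial I_d/\partial V_{d2}$.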




Equivalent $g_m$ of Cascode

$$g_{meq} = \frac{\partial I_d}{\partial V_{g1}} \ \text{with } dV_{d2} = 0$$

$$dV_{ds2} = -dV_{d1}, \qquad dV_{gs2} = -dV_{d1}$$

$$i_d = g_{m1}v_{g1} + g_{o1}v_{d1}$$

$$i_d = -g_{m2}v_{d1} - g_{o2}v_{d1}$$

So

$$v_{d1} = \frac{-i_d}{g_{m2} + g_{o2}}$$

$$i_d = g_{m1}v_{g1} - i_d\,\frac{g_{o1}}{g_{m2} + g_{o2}}$$

$$g_{meq} = \frac{i_d}{v_{g1}} = g_{m1}\,\frac{g_{m2} + g_{o2}}{g_{o1} + g_{o2} + g_{m2}} \simeq g_{m1}$$




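Equivalent $g_o$ of Cascode

$$g_{oeq} = \frac{\partial I_d}{\partial V_{d2}} \ \text{with } dV_{g1} = 0$$

$$dV_{gs1} = 0, \qquad dV_{gs2} = -dV_{d1}, \qquad dV_{ds2} = dV_{d2} - dV_{d1}$$

$$i_d = 0 + g_{o1}v_{d1}, \quad \text{so } v_{d1} = \frac{i_d}{g_{o1}}$$

$$i_d = -g_{m2}v_{d1} + g_{o2}(v_{d2} - v_{d1})$$

$$i_d = -i_d\,\frac{g_{m2} + g_{o2}}{g_{o1}} + g_{o2}v_{d2}$$

$$g_{oeq} = \frac{i_d}{v_{d2}} = \frac{g_{o1}g_{o2}}{g_{o1} + g_{o2} + g_{m2}}$$

$$g_{oeq} \simeq \frac{g_{o1}g_{o2}}{g_{m2}}$$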




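DC gain of Cascode

$$A_o = -\frac{g_{meq}}{g_{oeq}} = -\frac{g_{m1}(g_{m2} + g_{o2})}{g_{o1} + g_{o2} + g_{m2}} \cdot \frac{g_{o1} + g_{o2} + g_{m2}}{g_{o1}g_{o2}}$$

So

$$A_o = -\frac{g_{m1}(g_{m2} + g_{o2})}{g_{o1}g_{o2}} = -\frac{g_{m1}}{g_{o1}} \cdot \left(1 + \frac{g_{m2}}{g_{o2}}\right)$$

Let $A_{01} \equiv -\dfrac{g_{m1}}{g_{o1}}$ (common source gain) and $A_{02} \equiv 1 + \dfrac{g_{m2}}{g_{o2}}$ (common gate gain).

Then, $A_o = A_{01} \cdot A_{02}$.

DC gain = the product of the DC gains of the two transistors.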
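The closed forms for $g_{meq}$ and $g_{oeq}$ can be cross-checked by solving the two small-signal equations directly as a 2×2 linear system in the unknowns $(i_d, v_{d1})$. The sketch below does this with Cramer's rule; the helper names are hypothetical and the conductance values are only examples.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] [x, y]^T = [b1, b2]^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def cascode_gm_eq(gm1, go1, gm2, go2):
    """id per unit vg1 with vd2 = 0.

    id - go1*vd1         = gm1*vg1   (M1)
    id + (gm2 + go2)*vd1 = 0         (M2)
    """
    i_d, _vd1 = solve_2x2(1.0, -go1, 1.0, gm2 + go2, gm1, 0.0)
    return i_d


def cascode_go_eq(gm1, go1, gm2, go2):
    """id per unit vd2 with vg1 = 0.

    id - go1*vd1         = 0         (M1)
    id + (gm2 + go2)*vd1 = go2*vd2   (M2)
    """
    i_d, _vd1 = solve_2x2(1.0, -go1, 1.0, gm2 + go2, 0.0, go2)
    return i_d


if __name__ == "__main__":
    gm, go = 628.3e-6, 12.7e-6   # example values, identical transistors
    gm_eq = cascode_gm_eq(gm, go, gm, go)
    go_eq = cascode_go_eq(gm, go, gm, go)
    print("gm_eq:", gm_eq, " go_eq:", go_eq, " |Ao|:", gm_eq / go_eq)
```

With identical devices the gain magnitude comes out as $A(A+1)$ with $A = g_m/g_o$, in line with the product form above.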




AC Behaviour of Cascode

[Figure: cascode amplifier and its small-signal equivalent circuit, with $C_{g1}$ and $C_{dg1}$ at the input, $g_{m1}v_i$ and $r_{o1}$ at the intermediate node $v_x$, $g_{m2}v_x$ and $r_{o2}$ between $v_x$ and the output $v_o$, and $C_o$ at the output]

We shall see presently that $v_x$ is quite small.

Initially, we shall ignore the effect of the drain capacitance of the lower transistor and the gate capacitance of the upper one. If necessary, we can always replace $r_{o1}$ by $r_{o1} \| C_{ds1} \| C_{g2}$.




$$g_{m2}v_x + \frac{v_x - v_o}{r_{o2}} = sC_ov_o$$

$$v_x = \frac{1 + sr_{o2}C_o}{1 + g_{m2}r_{o2}}\,v_o = \frac{1 + sr_{o2}C_o}{A_2}\,v_o$$

Since $A_2$ is quite large, $v_x$ is very small compared to $v_o$.




$$sC_{dg1}(v_i - v_x) = g_{m1}v_i + \frac{v_x}{r_{o1}} + sC_ov_o$$

With $A_1 \equiv g_{m1}r_{o1}$ and the expression for $v_x$ obtained above, this gives

$$\frac{v_o}{v_i} = -\frac{(A_1 - sr_{o1}C_{dg})A_2}{(1 + sr_{o2}C_o)(1 + sr_{o1}C_{dg}) + A_2sC_or_{o1}}$$

If $sr_{o1}C_{dg}$ is small,

$$\text{Voltage gain} = \frac{v_o}{v_i} = -\frac{A_1A_2}{1 + sr_{o1}C_o(A_2 + r_{o2}/r_{o1})}$$

This shows that the DC gain is multiplied by $A_2$ and the bandwidth is reduced by roughly the same factor.




Example Cascode Design

We want to design a cascode amplifier with the following specifications:

DC gain = 2500

Gain-Bandwidth product = 100 MHz

Load capacitance = 1 pF

The two transistors in cascode configuration have identical geometries and the load is an ideal current source. Assume the following technological parameters: $K'_n = 150\,\mu A/V^2$, $V_{Tn} = 0.5\,V$, $V_E = 20\,V$. Assume the supply voltage to be 3.3 V.




Calculation of $g_m$

The gain bandwidth product is given by $g_m/C$. So,

$$2\pi \times 10^8 = \frac{g_m}{C} = \frac{g_{m1}}{10^{-12}}$$

So $g_{m1} = 628.3\,\mu S$.

Since the same current flows through the two transistors and they have the same geometry, $g_{m1} = g_{m2}$ and $g_{o1} = g_{o2}$.

Let $A = \dfrac{g_{m1}}{g_{o1}} = \dfrac{g_{m2}}{g_{o2}}$. Therefore,

$$2500 = \frac{g_{m1}}{g_{o1}} \cdot \left(1 + \frac{g_{m2}}{g_{o2}}\right) = A(A + 1)$$

This gives $A \simeq 49.5$.




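Calculation of bias current and geometry

$$49.5 = \frac{628.3 \times 10^{-6}}{g_{o1}} \quad \text{so } g_{o1} = 12.7\,\mu S$$

Therefore

$$g_{o1} = 12.7 \times 10^{-6} = \frac{I_d}{V_E} = \frac{I_d}{20}$$

from which the drain current is 254 µA.

Since $g_{m1} = \sqrt{2K'\dfrac{W}{L}I_d}$,

$$\frac{W}{L} = \frac{628.3^2 \times 10^{-12}}{300 \times 10^{-6} \times 254 \times 10^{-6}} \simeq 5.2$$

Therefore $g_m = 628.3\,\mu S$, $W/L = 5.2$, $I_d = 254\,\mu A$.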




Bias Voltages

$$I_d = \frac{1}{2}K'\frac{W}{L}V_{GT}^2 \quad \text{so} \quad V_{GT} = \sqrt{\frac{2 \times 254}{150 \times 5.2}} = 0.81\,V$$

$V_{g1} \ge V_{Tn} + V_{GT} = 0.5 + 0.81 = 1.31\,V$.

M1 will be in saturation when $V_{d1} = V_{S2} \ge 0.81\,V$, so $V_{g2} \ge 0.81 + 0.5 + 0.81 = 2.12\,V$.

For M2 to be in saturation, $V_{d2} \ge 2.12 - 0.5 = 1.62\,V$.

Thus the maximum output swing is from 1.62 V to $V_{dd}$.
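The whole design procedure is a short chain of formulas, so it is easy to script and re-run for other specifications. The following sketch reproduces the numbers above under the same assumptions (identical transistors, ideal current source load, square-law devices). The function name and the dictionary keys are hypothetical.

```python
import math


def design_cascode(dc_gain, gbw_hz, c_load, k_prime, v_early, vt):
    """Follow the example's design steps for a cascode with identical devices."""
    gm = 2.0 * math.pi * gbw_hz * c_load              # GBW = gm / C
    a = (-1.0 + math.sqrt(1.0 + 4.0 * dc_gain)) / 2   # solves A(A + 1) = gain
    go = gm / a                                       # A = gm / go
    i_d = go * v_early                                # go = Id / VE
    w_over_l = gm ** 2 / (2.0 * k_prime * i_d)        # gm = sqrt(2 K' (W/L) Id)
    vgt = math.sqrt(2.0 * i_d / (k_prime * w_over_l))
    vg1_min = vt + vgt
    vg2_min = vgt + vt + vgt                          # Vd1 >= VGT, plus Vgs2
    vd2_min = vg2_min - vt                            # saturation of M2
    return {
        "gm": gm, "A": a, "go": go, "Id": i_d, "W/L": w_over_l,
        "VGT": vgt, "Vg1_min": vg1_min, "Vg2_min": vg2_min, "Vd2_min": vd2_min,
    }


if __name__ == "__main__":
    result = design_cascode(
        dc_gain=2500, gbw_hz=100e6, c_load=1e-12,
        k_prime=150e-6, v_early=20.0, vt=0.5,
    )
    for name, value in result.items():
        print(name, "=", value)
```

Small differences from the hand calculation in the last digit come only from rounding intermediate values such as A and $I_d$.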




DC level incompatibility

The output DC level of a cascode amplifier is higher than the input DC level. This causes problems with direct connection to the next stage, or with DC feedback to itself.

These problems can be reduced if we use a complementary arrangement of n and p channel transistors for cascoding.

The upper transistor of the cascode arrangement can be thought of as a source follower to its bias voltage, which keeps the drain voltage of the lower amplifier transistor (nearly) constant.

Can we use a p channel transistor as a source follower?

[Figure: cascode with nMOS amplifier transistor ($V_{in}$), nMOS cascode transistor ($V_{biasn}$), load connected to $V_{dd}$, and output $V_{out}$]




Alternative Cascode

The p source follower will keep the drain voltage of the amplifier at $\simeq V_{biasp} + |V_{Tp}|$, allowing the cascode action as before.

Unfortunately, the circuit won’t work as there is no path between $V_{dd}$ and ground!

We can rectify this problem by providing a current source p load to the amplifier transistor M1.

[Figure: amplifier transistor M1 with p channel cascode transistor M2 ($V_{biasp}$), load connected to Gnd, and output $V_{out}$]




Folded Cascode

[Figure: folded cascode with amplifier transistor M1 ($V_{in}$), p channel current source M3 ($V_{biasp1}$) from $V_{dd}$, p channel cascode transistor M2 ($V_{biasp2}$), load connected to Gnd, and output $V_{out}$]

This arrangement is called a folded cascode. M3 provides the bias current. M2 and M3 keep the drain voltage of M1 nearly fixed. $I_{d3} - I_{d1}$ flows through the p channel cascoding transistor M2, which provides amplification in a common gate configuration.

$$r_{out} = (1 + g_{m2}r_{o2})(r_{o1} \| r_{o3}) + r_{o2}$$

This is lower than the output resistance of the telescopic cascode stage, because of the paralleling of $r_{o1}$ and $r_{o3}$.

However, it is much higher than the single transistor output resistance.




Current Mirrors

Current Source Loads

Up to now we have assumed current source loads. How do we implement these?

A transistor in saturation has a (nearly) constant drain current.

Therefore single transistors (preferably with long channels) can be used as current sources/sinks.

These act as current sources/sinks only over some voltage range, not for all voltages.

There is a weak dependence on voltage due to nonzero output conductance.

This dependence can be reduced by using a cascode stage.




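A simple Current Mirror

[Figure: simple current mirror with diode-connected M1 carrying $I_{ref}$, output transistor M2 carrying $I_o$, and common gate voltage $V_{ref}$]

For M1, $V_{ds} = V_{gs} > V_{gs} - V_T$. Therefore M1 is saturated.

$$I_{ref} = \frac{K}{2}(V_{ref} - V_T)^2$$

Therefore

$$V_{ref} = V_T + \sqrt{\frac{2I_{ref}}{K}}$$

If M2 is also saturated, $I_o = I_{ref}$.

Thus M2 can act as a current source load if $V_o > V_{ref} - V_T$, i.e.

$$V_o > \sqrt{\frac{2I_{ref}}{K}}$$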




Load for a Cascode stage

[Figure: cascode stage ($V_{in}$, $V_{biasn}$) with a cascoded p channel current source load ($V_{biasp1}$, $V_{biasp2}$) between $V_{dd}$ and the output $V_{out}$]

The output resistance of the load appears in parallel with that of the amplifying stage. If we use a single transistor current load for a cascode, the output resistance of the load will be $\approx r_o$ while that of the cascode stage will be $\approx A \times r_o$. The effective output resistance will thus be dominated by the much lower resistance of the load and we shall lose the advantages of the cascode stage. It is important, therefore, that the load also should be a current source made from a cascode pair.




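A cascode current mirror

[Figure: cascode current mirror with M1 and M2 sharing gate voltage $V_{ref}$, cascode transistor M3 with gate bias $V_b$ above M2, internal nodes $V_x$ and $V_y$, reference current $I_{ref}$ and output current $I_o$]

A single transistor current mirror will have some dependence on the drain voltage due to its output resistance. This dependence can be reduced substantially by using a cascode stage. However, this reduces the available voltage range over which the transistors are saturated.

For saturation of M2,

$$V_y \ge V_{ref} - V_T = \sqrt{\frac{2I_{ref}}{K}}$$

Therefore

$$V_b \ge 2\sqrt{\frac{2I_{ref}}{K}} + V_T$$

For saturation of M3,

$$V_o \ge 2\sqrt{\frac{2I_{ref}}{K}}$$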




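Self biased Cascode current mirror

[Figure: self biased cascode current mirror with diode-connected M0 and M1 on the reference side generating $V_b$ and $V_{ref}$, and M2 and M3 on the output side; internal nodes $V_x$ and $V_y$, reference current $I_{ref}$ and output current $I_o$]

This circuit does not need an external voltage bias.

The reference side of the mirror generates the bias voltages for both the transistors of the cascode output side.

However, this reduces the voltage range over which the output may swing.

$$V_b = 2\sqrt{\frac{2I_{ref}}{K}} + 2V_T$$

For saturation of M3,

$$V_o \ge 2\sqrt{\frac{2I_{ref}}{K}} + V_T$$

The output voltage needs to be a $V_T$ higher than the minimum.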
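The headroom cost of the three mirror variants can be compared side by side. The sketch below simply evaluates the saturation limits quoted on the mirror slides; the function names, the `kind` labels and the example values are assumptions made for illustration.

```python
import math


def overdrive(i_ref, k):
    """VGT of a square-law transistor carrying i_ref: sqrt(2 Iref / K)."""
    return math.sqrt(2.0 * i_ref / k)


def min_output_voltage(i_ref, k, vt, kind):
    """Lowest output voltage that keeps the output transistor(s) saturated."""
    vov = overdrive(i_ref, k)
    if kind == "simple":
        return vov                 # Vo > Vref - VT
    if kind == "cascode":
        return 2.0 * vov           # externally biased cascode mirror
    if kind == "self_biased_cascode":
        return 2.0 * vov + vt      # one VT above the minimum
    raise ValueError("unknown mirror kind: " + kind)


if __name__ == "__main__":
    # Hypothetical bias point: Iref = 100 uA, K = 800 uA/V^2, VT = 0.4 V.
    for kind in ("simple", "cascode", "self_biased_cascode"):
        print(kind, min_output_voltage(100e-6, 800e-6, 0.4, kind))
```

The self biased version trades one threshold voltage of output swing for not needing a separate bias generator.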




Folded Cascode with load

[Figure: folded cascode (M1 with input $V_{in}$, M3 biased by $V_{biasp1}$, M2 biased by $V_{biasp2}$) with an n channel cascode load biased by $V_{biasn1}$ and $V_{biasn2}$; output $V_{out}$]

The load for the folded cascode should also be a cascode pair. Here two n channel transistors in cascode configuration are used as the load.

One major advantage of the folded cascode is that the output can be directly coupled to the input for negative feedback.




The single transistor amplifier can be replaced by any transconductance, of course. In operational amplifiers, the single transistor stage will be replaced by a differential amplifier.




Operational Amplifiers

Differential Amplifiers


Circuits which amplify the difference of two input voltages (each of which has equal and opposite signal excursions) have many advantages over single ended amplifiers.

Noise picked up by both inputs gets canceled in the output.

Input and feedback paths can be isolated.

If both inputs have the same DC bias, the output is insensitive to changes in the bias.






Some definitions

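It is more convenient to represent the two input voltages and the two output voltages by their mean and difference values.

$$v_{id} \equiv v_{i1} - v_{i2}, \qquad v_{icm} \equiv \frac{v_{i1} + v_{i2}}{2}$$

$$v_{od} \equiv v_{o1} - v_{o2}, \qquad v_{ocm} \equiv \frac{v_{o1} + v_{o2}}{2}$$

The common mode and differential gains are:

$$A_{diff} \equiv \frac{v_{od}}{v_{id}}, \qquad A_{cm} \equiv \frac{v_{ocm}}{v_{icm}}$$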






Common Mode Rejection Ratio

For a good diff amp, the differential gain should be high and independent of input common mode voltage, whereas the common mode gain should be as low as possible. The common mode rejection ratio is:

$$CMRR \equiv 20\log\frac{A_{diff}}{A_{cm}}\ \text{dB}$$






Will this do?

[Figure: two separate single ended amplifiers sharing $V_{dd}$, with inputs $v_{i1}$, $v_{i2}$ and outputs $v_{o1}$, $v_{o2}$]

One (not very good) way of implementing a diff amp is to use two single ended amplifiers as shown above.

Output = $V_{o1} - V_{o2}$

Here the transistor currents, and hence the differential gain, will depend on the common mode voltage. This is not desirable as we would like the circuit to ignore the common mode voltage and to amplify just the difference signal.






The long tail pair

A better diff amp can be implemented by adding a current source to keep the total current constant.

[Figure: long tail pair with inputs $v_{i1}$, $v_{i2}$, outputs $v_{o1}$, $v_{o2}$, and a tail current source $I_s$ at the common source node $V_s$]

If the common mode voltage appearing at the two inputs changes, it will only change the voltage at the node where the two sources join ($V_s$). However, the current remains unchanged due to the current source, and therefore the differential gain is unaffected by the common mode voltage.






Diff amp with single ended output

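[Figure: differential pair Mn1, Mn2 (inputs $v_{i1}$, $v_{i2}$) with tail current source $I_s$ at node $V_s$ and p channel current mirror load Mp1, Mp2; single ended output current $i_{out}$]

$$i_{out} = I(Mp2) - I(Mn2)$$

$I(Mp2) = I(Mp1)$ (current mirror)

$I(Mp1) = I(Mn1)$ (series connection)

$$i_{out} = I(Mn1) - I(Mn2) = g_m(v_{i1} - v_{i2})$$

$$i_{out} \equiv G_m(v_{i1} - v_{i2}) = G_mv_{id}$$

Thus we have a single output which is proportional to the difference of inputs. The effective $G_m$ is just the $g_m$ of either of the diff-pair transistors.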






Gain of the OTA

This circuit is also called an operational transconductance amplifier (OTA) because the output is a current.

$$R_{out} = r_o(Mn2) \| r_o(Mp2)$$

So DC voltage gain $= g_m\left(r_o(Mn2) \| r_o(Mp2)\right)$

and $GBW = \dfrac{g_m}{C_L}$

$C_L$ includes $C_{dg}$ and $C_d$ for Mn2 and Mp2, as well as the load capacitance.





The two stage op-amp

[Figure: two stage op-amp. Differential pair Mn1, Mn2 (inputs $v_{i1}$, $v_{i2}$) with mirror load Mp1, Mp2 and tail current source Mn4; second stage with p channel driver Mp3 and n channel current source load Mn3, both current sources biased by $V_{bias}$; output $v_{out}$]

A simple two stage op-amp can be constructed by following the diff amp by a common source stage with a constant current load. The current source for the diff amp is implemented by an n channel MOS transistor in saturation.

The two stage design permits us to optimize the output stage for driving the load and the input stage for providing good differential gain and CMRR. A diff amp with n transistors and an output stage with a p driver is shown. However, a p type diff amp with an n type common source stage is better for low noise operation.






op-amp eq. circuit

[Figure: equivalent circuit with two cascaded stages. Differential stage: source $g_{m1}v_1$ loaded by $R_1$ and $C_1$ at node $v_2$. Output stage: source $g_{m2}v_2$ loaded by $R_2$ and $C_2$ at node $v_0$]

Each stage of the opamp can be considered a gain stage with a single pole frequency response. Notice that the output of each stage will undergo a phase change of 90° around its pole frequency.






op-amp Compensation

Most opamps are used with negative feedback. If the opamp stages themselves contribute a phase difference of 180°, the negative feedback will appear as positive feedback. If the gain at this frequency is > 1, the circuit will become unstable.

Both stages of the opamp have a single pole frequency response. The poles for both the stages can be quite close together. As a result, they can contribute a total of 180° phase shift over a relatively narrow frequency range.






Pole Splitting

To avoid instability, we would like to arrange things such that the gain drops to below one by the time the phase shift through the opamp becomes 180°, even if it means that we have to reduce the bandwidth of the op amp.

This is often achieved by a technique called pole splitting.

The lower frequency pole is brought to a low enough frequency, so that the gain diminishes to below one by the time the second pole is reached.

One way of doing this is to use a Miller capacitor.






Eq. Circuit of compensated Opamp

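[Figure: the two stage equivalent circuit as before, with a compensation capacitor $C_c$ connected between the differential stage output $v_2$ and the output stage output $v_0$]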






Miller Compensation

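[Figure: gain stages $A_1$ and $A_2$ in cascade, with capacitor C connected across $A_2$]

The diff amp stage sees a load capacitance $A_2C$. This brings its pole to $\dfrac{1}{r_{o1}A_2C}$. The total DC gain is $A_1A_2$. The bandwidth is set by the diff amp stage.

Therefore the gain-bandwidth product is:

$$\frac{A_1A_2}{r_{o1}A_2C} = \frac{A_1}{r_{o1}C}$$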






Slew rate

Miller compensation also sets the slew rate of the op amp.

For large signal input, the output current of the OTA = tail current. The effective load capacitance for this stage is $A_2 \times C$.

$$A_2 \times C\,\frac{dV}{dt} = I(Mn4)$$

The output of the OTA slews at a rate $\dfrac{I(Mn4)}{A_2 \times C}$.

So the op amp slews at a rate which is $A_2$ times this value.

Hence the slew rate of the op amp is $\dfrac{I(Mn4)}{C}$.






Design Equations-I

All transistors must be saturated.

$$I(Mn1) = I(Mn2) = \frac{I(Mn4)}{2}$$

$I(Mn1) = I(Mp1)$ (series connection)

$I(Mp1) = I(Mp2)$ (mirror)

Mp1 is always saturated. Mp1 and Mp2 have the same $V_s$, $V_g$, $I_d$. Since W/L(Mp2) = W/L(Mp1), Mp2 will have the same $V_d$ as Mp1, and so will be saturated.






Design Equations-II

Mp3 has the same $V_s$, $V_g$ as Mp1.

If

$$\frac{I(Mp3)}{I(Mp1)} = \frac{W/L(Mp3)}{W/L(Mp1)}$$

Mp3 will have the same $V_d$ as Mp1 and will be saturated.

The slew rate determines I(Mn4).

$$I(Mn4) = C \times \text{Slew Rate}$$

$$I(Mn1) = I(Mn2) = \frac{I(Mn4)}{2}$$






Design Equations-III

GBW determines $g_m$ of Mn1, Mn2.

$$GBW = \frac{g_m(Mn2)}{C}$$

Since the current as well as $g_m$ of Mn1 and Mn2 are now known,

$$g_m(Mn2) = \sqrt{2K'\,W/L(Mn2)\,I(Mn2)}$$

$$W/L(Mn1) = W/L(Mn2)$$

This will determine the geometries of Mn1 and Mn2.






Design Equations-IV

Currents through Mn2, Mp2, Mp3 and Mn3 are known.

$g_o = I_d/V_A$, where $V_A$ is the Early voltage $= L/\lambda'$.

The overall DC gain is given by

$$A = \frac{g_m(Mn2)\,g_m(Mp3)}{\left(g_o(Mn2) \| g_o(Mp2)\right)\left(g_o(Mp3) \| g_o(Mn3)\right)}$$

As $g_m$ for Mn2 and all $g_o$ values are known, this determines the $g_m$ for Mp3. Once we know the $g_m$ as well as the current for Mp3, we can calculate its geometry.






Example Design: Specifications

$K'(n) = 120\,\mu A/V^2$, $K'(p) = 60\,\mu A/V^2$

$V_T(n) = 0.4\,V$, $V_T(p) = -0.4\,V$

Early Voltage $V_A$ = 20 V for both p and n channel transistors

Op amp DC gain = 80 dB (Voltage gain of 10000)

Gain Bandwidth product = 50 MHz, slew rate = 20 V/µs






Example Design-1


We choose a compensation capacitor value of 2 pF.

We shall bias the second stage at 5 times the tail current of the differential stage.

From the slew rate, $I(Mn4) = 2 \times 10^{-12} \times \frac{20}{10^{-6}} = 40\,\mu A$

Therefore I(Mn1) = I(Mn2) = I(Mp1) = I(Mp2) = 20 µA and I(Mp3) = I(Mn3) = 200 µA.






Example Design-2


From the GBW requirement,

$$2\pi \times 50 \times 10^{6} = \frac{g_m(Mn2)}{2 \times 10^{-12}}$$

This gives gm(Mn2) ≃ 628 µA/V. To get a gm of 628 µA/V with a current of 20 µA,

$$628 \times 10^{-6} = \sqrt{2 \times 120 \times 10^{-6} \times (W/L) \times 20 \times 10^{-6}}$$

This gives W/L(Mn2) ≈ 82 = W/L(Mn1).






Example Design-3


go of Mn2 and Mp2 = 20 µA / 20 V = 1 µA/V. Therefore go(Mn2)‖go(Mp2) = 2 µA/V.

go of Mn3 and Mp3 = 200 µA / 20 V = 10 µA/V. Therefore go(Mp3)‖go(Mn3) = 20 µA/V.

$$\text{DC gain} = 10000 = \frac{628\,\mu}{2\,\mu} \times \frac{g_m(Mp3)}{20\,\mu}$$

So, gm(Mp3) ≃ 637 µA/V.






Example Design-4


To get a gm of 637 µA/V with a drain current of 200 µA, we should have

$$637 \times 10^{-6} = \sqrt{2 \times 60 \times 10^{-6} \times (W/L) \times 200 \times 10^{-6}}$$

which gives the W/L of Mp3 ≈ 17. Since the geometry of Mp1 and Mp2 has to be in the current ratio with Mp3 (20 µA : 200 µA), W/L of Mp1 and Mp2 should be ≈ 1.7.






Example Design-5


Finally, we assume that an n type reference bias transistor of W/L = 4 is available with a current of 10 µA. Scaling W/L in proportion to the currents (40 µA and 200 µA), this will give the W/L of Mn4 and Mn3 as 16 and 80 respectively.

This completes the design for the simple two stage op amp.
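The design steps of this example can be retraced numerically. The following is a minimal sketch, not a design tool: it only evaluates the equations used above with the specification values of this example. The function and dictionary key names are illustrative assumptions.

```python
import math

# Specification and process values taken from the example above.
K_N = 120e-6      # K'(n) in A/V^2
K_P = 60e-6       # K'(p) in A/V^2
V_A = 20.0        # Early voltage in V (n and p)
GAIN = 10_000     # DC voltage gain (80 dB)
GBW_HZ = 50e6     # gain-bandwidth product in Hz
SLEW = 20e6       # slew rate in V/s (20 V/us)
C_C = 2e-12       # chosen compensation capacitor in F
STAGE2_RATIO = 5  # second stage current / tail current
I_REF, WL_REF = 10e-6, 4.0  # reference bias transistor


def w_over_l(gm, k_prime, i_d):
    """Invert gm = sqrt(2 K' (W/L) Id) for W/L."""
    return gm ** 2 / (2 * k_prime * i_d)


def design():
    i_tail = C_C * SLEW                  # I(Mn4) from the slew rate
    i_diff = i_tail / 2                  # I(Mn1) = I(Mn2) = I(Mp1) = I(Mp2)
    i_out = STAGE2_RATIO * i_tail        # I(Mp3) = I(Mn3)
    gm_n2 = 2 * math.pi * GBW_HZ * C_C   # from GBW = gm(Mn2)/C
    go_1 = 2 * i_diff / V_A              # go(Mn2) + go(Mp2)
    go_2 = 2 * i_out / V_A               # go(Mp3) + go(Mn3)
    gm_p3 = GAIN * go_1 * go_2 / gm_n2   # from the DC gain expression
    wl_p3 = w_over_l(gm_p3, K_P, i_out)
    return {
        "I_Mn4": i_tail,
        "gm_Mn2": gm_n2,
        "gm_Mp3": gm_p3,
        "WL_Mn2": w_over_l(gm_n2, K_N, i_diff),
        "WL_Mp3": wl_p3,
        "WL_Mp1": wl_p3 * i_diff / i_out,     # mirror ratio with Mp3
        "WL_Mn4": WL_REF * i_tail / I_REF,    # scaled from the reference
        "WL_Mn3": WL_REF * i_out / I_REF,
    }


for name, value in design().items():
    print(f"{name}: {value:.4g}")
```

The printed values differ slightly from the slides in the last digit because the slides round gm(Mn2) to 628 µA/V before using it.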





Cascode Opamps

Telescopic Cascode Opamp

[Figure: telescopic cascode op amp between Vdd and Gnd, with differential inputs Vin+ and Vin-, cascode bias voltages Vbiasp1, Vbiasp2, Vbiasn2 and Vbiasn1, and output Vout.]

The telescopic cascode is a differential version of the cascode amplifier discussed earlier.

Its gain is comparable to the two stage op-amp.

The output impedance is (very) high!

The output impedance in conjunction with the load capacitance constitutes the dominant pole of the system.





Cascode Opamps

Telescopic Cascode Opamp

Gain is comparable to the two stage opamp (product of two single stage amplifiers).

It needs a higher supply voltage compared to a two stage opamp.

The output stage is high impedance, so the dominant pole is at the output.

Compensation is provided by the load capacitance. So a minimum value of load capacitance is required for stability.

The output common mode voltage is different from the input common mode voltage range.

This presents difficulties in direct coupling to the next stage and DC feedback to its own input.





Cascode Opamps

Folded Cascode

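The common mode voltage incompatibility of a telescopic cascode can be solved by using a folded cascode.

[Figure: folded cascode op amp between Vdd and Gnd, with differential inputs Vin+ and Vin-, bias voltages Vbiasp1, Vbiasp2, Vbiasn2 and Vbiasn1, and output Vout.]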




Push Pull Output Stage

Push-Pull Op Amp

Differential to single ended conversion can be done in the output stage, by using a push-pull driver. The output loads in the differential stage (Mp1 and Mp2) are diode connected.

[Figure: push-pull op amp between Vdd and Gnd. Input pair Mn1, Mn2 (inputs vi-, vi+, common source node Vs) with bias transistor Mn5 (gate at Vbias) and diode connected loads Mp1, Mp2; mirror transistors Mp3, Mn3; output transistors Mp4 and Mn4 driving the node Out.]

Current through Mp2 is mirrored in the output p transistor Mp4.

Current through Mp1 is mirrored into a pMOS (Mp3) and passed through a diode connected nMOS (Mn3).

This current is mirrored in the output stage nMOS (Mn4).

Mirroring ratio of Mp4 to Mp2 and Mn4 to Mn3 should be identical (and can be large).



Pipeline Optimization

Dinesh Sharma


2006

Dinesh Sharma Pipeline Optimization



Von Neumann Architecture

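[Figure: a processing unit (data processing, instruction processing and state) connected through a single bus to one memory holding both data and instructions. The shared bus is marked "Bottleneck!".]

A common bus is used for data as well as instructions.

The system can become 'bus bound'.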




Harvard Architecture


[Figure: a processing unit (processing, state, instructions) connected to a data memory and an instruction memory through separate data and instruction buses.]

Separate data and instruction paths

Good performance

Needs 2 buses → expensive!

Traffic on the buses is not balanced.

Instruction bus may remain idle.




Modified Harvard Architecture


[Figure: as in the Harvard architecture, but the read only instruction memory also holds constants, which reach the data path through a MUX.]

Constants can be stored with Instructions in ROM.

Better Bus balancing is possible.

Typically, 1 instruction read, 1 constant read, 1 data read and 1 result write per instruction.

2 mem ops per bus.




Modified Harvard with Cache


[Figure: the modified Harvard architecture (data memory, read only instruction memory with constants, MUX) with a cache added.]

Cache allows optimum utilization of bus bandwidths.

Each operation need not be balanced individually.




Instruction and Data State Machines

[Figure: two interacting state machines. Instruction processor: address from PC, request instruction, receive instruction, decode and send operand addresses to the data processor (DP), receive state from DP. Data processor: receive instruction, receive operand addresses, request operands, receive operands, execute instruction, store results, return state.]

Operation of the system may be modeled as two interacting state machines.

Instruction processor fetches the instruction, decodes it and gives the operation type and operand locations to the data processor.

Data processor fetches operands, performs the operation and writes back the result.




A pipelined processor

[Figure: a processor connected to a ROM (ROM address, ROM data) and a RAM (RAM address, RAM data). Successive panels show the instruction fetch (ROM address out, instruction in), the data and constant fetch, the execution phase, and the write back of the result.]

Consider a Harvard architecture processor, which performs the following tasks repetitively:

Fetch Op Code (ROM)

Fetch variable (RAM)

Fetch constant (ROM)

Calculate result

Store result (RAM)




Resource Reservation

We can keep track of which resource is doing what at any given time by a table as shown below (a dash marks an idle cycle):

Resource Reservation Table

Cycle  0            1             2        3           4
ROM    Instr Fetch  Const. fetch  -        -           -
RAM    -            Var. Fetch    -        Write Back  -
ALU    -            -             Compute  -           -

This is called a reservation table. Given this reservation table, it appears that we can launch a new instruction every 4 cycles.




Overlapping Operations

However, we need not wait for the previous operation to be over before launching a new one.

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  -  -  -  -  -  -  -  -  -
RAM    -  0  -  0  -  -  -  -  -  -  -
ALU    -  -  0  -  -  -  -  -  -  -  -

When can we launch the next calculation?




Pipelining

We can fetch the next instruction from ROM while we write back the result of the current one to the RAM.

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  -  1  1  -  2  2  -  -  -
RAM    -  0  -  0  1  -  1  2  -  2  -
ALU    -  -  0  -  -  1  -  -  2  -  -

This will enable us to launch a new calculation every third cycle.
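The claim that a launch every third cycle is the best constant rate for this table can be checked mechanically. The sketch below assumes the reservation table is stored as a dictionary from resource name to the cycles in which that resource is busy; the function names are illustrative. With a constant launch period P, two operations collide on a resource exactly when two of its busy cycles are equal modulo P.

```python
def collides(table, period):
    """True if launching every `period` cycles double-books a resource."""
    for cycles in table.values():
        slots = [c % period for c in cycles]
        if len(set(slots)) != len(slots):
            return True
    return False


def min_constant_period(table):
    """Smallest constant launch interval that causes no collision."""
    period = 1
    while collides(table, period):
        period += 1
    return period


# Reservation table of the example: ROM busy in cycles 0 and 1,
# RAM in cycles 1 and 3, ALU in cycle 2.
TABLE = {"ROM": [0, 1], "RAM": [1, 3], "ALU": [2]}

print(min_constant_period(TABLE))
```

A period of 1 fails on the ROM, a period of 2 fails on the RAM (cycles 1 and 3 coincide modulo 2), and a period of 3 is collision free.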




Overlapping Operations

Is this the best we can do?

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  -  1  1  -  2  2  -  -  -
RAM    -  0  -  0  1  -  1  2  -  2  -
ALU    -  -  0  -  -  1  -  -  2  -  -

None of the resources are utilized 100% in this scheme. The ROM and the RAM are busy for 2 out of 3 cycles, whereas the ALU is used for 1 cycle out of 3.

A new sample is handled every 3rd cycle now. Can we get even better throughput?




Improved Scheduling

If we store the result in a local register (BUF) for 1 cycle, and write it to the RAM only in the 4th cycle, we get

Modified Resource Reservation Table

Cycle  0  1  2  3  4  5  6
ROM    0  0  -  -  -  -  -
RAM    -  0  -  -  0  -  -
ALU    -  -  0  -  -  -  -
BUF    -  -  -  0  -  -  -

By delaying the write back, we can launch the next instruction earlier!




Improved Scheduling




Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  1  1  2  2  3  3  4  4  5
RAM    -  0  -  1  0  2  1  3  2  4  3
ALU    -  -  0  -  1  -  2  -  3  -  4
BUF    -  -  -  0  -  1  -  2  -  3  -

We can now launch a new operation every 2nd cycle.

Can this be further improved?




Improved Scheduling




Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  1  1  2  2  3  3  4  4  5
RAM    -  0  -  1  0  2  1  3  2  4  3
ALU    -  -  0  -  1  -  2  -  3  -  4
BUF    -  -  -  0  -  1  -  2  -  3  -

Once the pipeline has filled, the RAM and the ROM are occupied 100% of the time. So the design is optimal and the throughput cannot be improved any further.




How can we always find the optimum solution?

Given a Resource Reservation Table, we would like to set up a systematic method which optimizes the throughput of the process using this table.

For maximum throughput, we would like to launch new operations as frequently as possible.

Thus, we want to minimize the time gap between launching two operations.

This is called the Sample Period (SP).

What is the minimum possible value of SP?




The minimum Sampling Period

Consider an operation in which the busiest resource is used for n cycles.

If we launch a new operation every n cycles, this resource will be used 100% of the time.

If we launch operations any more frequently than this, the resource will not have enough time to do its work.

Therefore, the minimum possible Sample Period is equal to the number of cycles for which the busiest resource is in operation.
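This bound is easy to read off a reservation table. The sketch below assumes the table is a dictionary from resource name to its list of busy cycles (an illustrative representation, not one defined in the notes) and also reports the utilization of each resource for a given constant launch period.

```python
from fractions import Fraction


def minimum_sample_period(table):
    """Number of cycles for which the busiest resource is used."""
    return max(len(cycles) for cycles in table.values())


def utilization(table, period):
    """Fraction of time each resource is busy when launching every `period` cycles."""
    return {name: Fraction(len(cycles), period) for name, cycles in table.items()}


# Example table: ROM busy in cycles 0 and 1, RAM in 1 and 3, ALU in 2.
TABLE = {"ROM": [0, 1], "RAM": [1, 3], "ALU": [2]}

print(minimum_sample_period(TABLE))
print(utilization(TABLE, 3))
```

For the example table the bound is 2, while a launch every 3 cycles keeps the ROM and RAM busy 2/3 of the time and the ALU 1/3 of the time.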




Sampling Period

We want to minimize the sampling period.

But the sampling period need not be a constant!

SP can cycle through a finite set of values.

We should therefore define an Average Sampling Period (ASP).

The minimum value of this Average Sampling Period (MASP) is given by the number of cycles for which the busiest resource is used in an operation.




Cyclic Sampling Period

Consider the following reservation table:

Cycle  0  1  2  3  4  5  6  7  8
RSC1   0  -  0  -  -  -  -  -  -
RSC2   -  0  -  0  -  -  -  -  -
RSC3   -  -  0  -  -  -  -  -  -

Now the next operation can be launched in cycle 1 itself. However, the following one can only be launched after a gap of 3 cycles, in cycle 4.

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  1  0  1  2  3  2  3  4  5  4
RAM    -  0  1  0  1  2  3  2  3  4  5
ALU    -  -  0  1  -  -  2  3  -  -  4

Again, the next operation can be launched in the next cycle (in cycle 5) and after that, with a gap of 3 cycles, in cycle 8.




Average Sampling Period

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  1  0  1  2  3  2  3  4  5  4
RAM    -  0  1  0  1  2  3  2  3  4  5
ALU    -  -  0  1  -  -  2  3  -  -  4

New operations can be launched in clock periods 0, 1, 4, 5, 8, 9, . . .

Thus, the sample period cycles through the values 1, 3.

The average of the cycle is called the Average Sampling Period (ASP).

The Average Sampling Period (ASP) is 2 here.

The whole pattern repeats every 4 cycles. This is called the period (p).
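A cyclic launch pattern can be checked by expanding it into launch times and looking for a resource that is booked twice in the same cycle. The sketch below assumes the reservation table of this example has the first resource busy in cycles 0 and 2, the second in cycles 1 and 3, and the third in cycle 2 (read off from the schedule above); the function names are illustrative.

```python
from itertools import cycle, islice


def launch_times(gaps, count):
    """First `count` launch cycles when the gaps repeat cyclically."""
    times, t = [], 0
    for gap in islice(cycle(gaps), count):
        times.append(t)
        t += gap
    return times


def has_collision(table, launches):
    """True if any resource is needed by two operations in the same cycle."""
    for cycles in table.values():
        busy = [start + c for start in launches for c in cycles]
        if len(set(busy)) != len(busy):
            return True
    return False


def average_sampling_period(gaps):
    return sum(gaps) / len(gaps)


# Assumed reservation table for this example (see lead-in).
TABLE = {"RSC1": [0, 2], "RSC2": [1, 3], "RSC3": [2]}

launches = launch_times((1, 3), 6)
print(launches, has_collision(TABLE, launches), average_sampling_period((1, 3)))
```

Launching in three consecutive cycles (0, 1, 2) would collide on the first resource in cycle 2, which is why the gap of 3 is needed after each pair of launches.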




Minimum Average Sampling Period

The minimum value of the Average Sampling Period (MASP) is given by the maximum number of cycles for which a resource is busy during an operation.

Therefore, given a reservation table, MASP is known.

If the actual Average Sampling Period is equal to MASP, the system is already optimum and nothing needs to be done.

If the actual Average Sampling Period is greater than MASP, we can attempt to modify the reservation table, such that MASP is achieved.




Pipeline Optimization

1 For a given reservation table, find the current Average Sampling Period (ASP).

2 Find the largest number of cycles for which a resource is busy.

3 This is equal to the Minimum possible Average Sampling Period (MASP).

4 If ASP = MASP, there is nothing to be done.

5 Else, we should try to re-schedule events such that MASP is achieved.




Method to achieve MASP

We first consider various cycles whose average is the desired MASP.

For example, if MASP is 2, we can have cycles of (2), (1,3) or (1,1,4) etc.

The periods are 2, 4 and 6 in these three cases.




The Generator Set

For each cycle, we construct a generator set G, which contains elements of the cycle, their sums taken two at a time, three at a time etc., modulo periodicity p.

In our example, cycles are (2), (1,3) and (1,1,4).
For a cycle of (2), p = 2, so G = {0}
For a cycle of (1,3), p = 4, so G = {0,1,3}
For a cycle of (1,1,4), p = 6, so G = {0,1,2,4,5}




The Source Set

For each selected cycle, we now construct the Source set S. This contains integers 0 through p-1, from which all members of G except 0 have been removed.

In our example, cycles are (2), (1,3) and (1,1,4).

Cycle    p  G            S
(2)      2  {0}          {0,1}
(1,3)    4  {0,1,3}      {0,2}
(1,1,4)  6  {0,1,2,4,5}  {0,3}
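The two sets can be computed directly from a cycle. The sketch below makes one assumption that the slides do not spell out: the sums "two at a time, three at a time" are taken over consecutive elements of the cycle, wrapping around, so that G holds every gap between two launches modulo p. For the three cycles of this example this reproduces the table above.

```python
def generator_set(gaps):
    """Gaps between any two launches of the cycle, modulo the period p."""
    p, n = sum(gaps), len(gaps)
    g = set()
    for start in range(n):
        total = 0
        for k in range(n):
            total += gaps[(start + k) % n]  # consecutive elements, wrapping
            g.add(total % p)
    return g


def source_set(gaps):
    """Integers 0..p-1 with all members of G except 0 removed."""
    p = sum(gaps)
    return set(range(p)) - (generator_set(gaps) - {0})


for gaps in [(2,), (1, 3), (1, 1, 4)]:
    print(gaps, sum(gaps), sorted(generator_set(gaps)), sorted(source_set(gaps)))
```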




Design Sets

For each selected cycle, we construct Design sets Di which have the property that:
if a ∈ D and b ∈ D then |a − b| also ∈ D.

In our example,

Cycle    p  S      D sets
(2)      2  {0,1}  {0}, {1} and {0,1}
(1,3)    4  {0,2}  {0}, {2} and {0,2}
(1,1,4)  6  {0,3}  {0}, {3} and {0,3}




Notice that Design sets do not depend on the reservation table.

The sets G, S and Di are constructed from the repetition cycles whose average value is the MASP.

Therefore we can make a library of these in advance for different combinations of MASP values and cycles, and use them when needed.




Row Vectors

We construct a row vector for each resource in the reservation table.

The row vector is a set which contains the clock periods in which a specific resource is busy.

Cycle  0  1  2  3
ROM    0  0  -  -
RAM    -  0  -  0
ALU    -  -  0  -

In this example, the row vector for ROM is 0,1, for RAM is 1,3 and for ALU is 2.




Matching Rows with Design Sets

Choose a particular cycle with the desired MASP. (Say MASP = 2, cycle = (2).)

Pick the corresponding design sets. (In this example, D = {0}, {1}, {0,1}.)

For each resource, take its row vector and take a design set with the same cardinality.

Align these according to defined rules.




Rules for Alignment of the First elements

Compare R(1) and D(1). If these are equal, nothing needs to be done. Else,

If R(1) < D(1), add D(1)-R(1) to all members of R.
If R(1) > D(1), add R(1)-D(1) to all members of D.

This is equivalent to a rigid shift of R or D till their first members are aligned.

For example, if R = 1,3,4,6 and D = 0,2,5,6, then D is shifted by 1 to give D = 1,3,6,7.

[Figure: R = 1,3,4,6 and D = 0,2,5,6 marked on a time axis from 0 to 7, before and after D is shifted rigidly to 1,3,6,7.]




Alignment of other elements

If R(i) = D(i), nothing needs to be done.

If R(i) < D(i), add D(i) - R(i) delays to all members of R at position i and beyond.

[Figure: R = 1,3,4,6 and D = 1,3,6,7 on a time axis. R is broken at its third element and the third and later elements are moved forward by 2, giving R = 1,3,6,8.]

The i'th elements are now aligned.




Alignment of other elements

If D(i) < R(i)
(for example, p = 2, R = 1,3,4,6, D = 1,2,5,6. Now D(2) < R(2)):

1 Add sufficient multiples of p to D(i) such that it is ≥ R(i).

2 Add the same number to members of D beyond i.

3 Now if R(i) < D(i), add D(i) - R(i) delays to all members of R at position i and beyond.

[Figure: with periodicity p = 2, D = 1,2,5,6 is broken at its second element and moved forward by p (= 2) steps, giving D = 1,4,7,8. R = 1,3,4,6 is then aligned by delaying its second and later elements by 1, giving R = 1,4,5,7.]




Alignment Example

Let R = 1,3,4,6 and D = 0,1,4,5; with periodicity p = 2

Cycle  0  1  2  3  4  5  6  7  8
R      -  X  -  X  X  -  X  -  -
D      X  X  -  -  X  X  -  -  -
D      -  X  X  -  -  X  X  -  -   (after the shift)

To align the first element, move all elements of D forward by 1 step. Now D = 1,2,5,6.

Cycle  0  1  2  3  4  5  6  7  8
R      -  X  -  X  X  -  X  -  -
D      -  X  X  -  -  X  X  -  -
D      -  X  -  -  X  -  -  X  X   (after the move)
R      -  X  -  -  X  X  -  X  -   (after the move)

For the second element, D is behind. Move D(2) onwards forward by p = 2, so D = 1,4,7,8. Move R(2) onwards forward by 1. So R = 1,4,5,7.




Alignment Example

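R = 1,4,5,7 and D = 1,4,7,8. R(3) < D(3)

Cycle  0  1  2  3  4  5  6  7  8  9  10
D      -  X  -  -  X  -  -  X  X  -  -
R      -  X  -  -  X  X  -  X  -  -  -
R      -  X  -  -  X  -  -  X  -  X  -   (after the move)

Move R(3) and beyond forward by 2. So R = 1,4,7,9 and D = 1,4,7,8.

Cycle  0  1  2  3  4  5  6  7  8  9  10
R      -  X  -  -  X  -  -  X  -  X  -
D      -  X  -  -  X  -  -  X  X  -  -
D      -  X  -  -  X  -  -  X  -  -  X   (after the move)
R      -  X  -  -  X  -  -  X  -  -  X   (after the move)

D(4) < R(4). Move D(4) forward by 2 to 10. Now R(4) < D(4). Move R(4) forward by 1 to 10.

Vectors are now aligned at 1,4,7,10.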
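The alignment rules can be collected into one routine. The following is a minimal sketch of the rules as stated on these slides, using 0-based list indices where the slides count from 1; the function name `align` is illustrative. It reproduces the worked example above.

```python
def align(r, d, p):
    """Align row vector r with design set d for periodicity p.

    Returns the modified (r, d); both lists are equal on return.
    """
    r, d = list(r), list(d)
    if len(r) != len(d):
        raise ValueError("R and D must have the same cardinality")

    # First elements: rigid shift of whichever vector is behind.
    if r[0] < d[0]:
        r = [x + d[0] - r[0] for x in r]
    elif r[0] > d[0]:
        d = [x + r[0] - d[0] for x in d]

    for i in range(1, len(r)):
        if d[i] < r[i]:
            # Move D(i) and beyond forward by enough multiples of p.
            k = -(-(r[i] - d[i]) // p)  # ceiling division
            for j in range(i, len(d)):
                d[j] += k * p
        if r[i] < d[i]:
            # Delay R(i) and beyond until R(i) meets D(i).
            delay = d[i] - r[i]
            for j in range(i, len(r)):
                r[j] += delay
    return r, d


print(align([1, 3, 4, 6], [0, 1, 4, 5], 2))
```

Only R receives arbitrary delays; D is moved in whole multiples of p, which preserves its members modulo p apart from the initial rigid shift.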




Example System

We shall illustrate the method using our original example, whose reservation table is:

Cycle  0  1  2  3  4  5  6
ROM    0  0  -  -  -  -  -
RAM    -  0  -  0  -  -  -
ALU    -  -  0  -  -  -  -

Since the ROM and the RAM are used for 2 cycles each in every operation, MASP = 2. However, as we had seen before, ASP = 3 in this case. Therefore, the schedule needs improvement.




Example Application

Aligning the ROM

Cycle  0  1  2  3
ROM    0  0  -  -
RAM    -  0  -  0
ALU    -  -  0  -

MASP = 2. Choose the cycle (2). Then D = {0}, {1}, {0,1}.

For ROM: R = 0,1, D = 0,1

So no alignment is required.




Adjusting the RAM Schedule

For RAM: R = 1,3, D = 0,1

Aligning the first element:
R(1) > D(1)
Add (1-0) = 1 to D elements ⇒ D = 1,2

Aligning other elements:
R(2) > D(2)
Add p (= 2) to D(2) ⇒ D = 1,4
Now R(2) < D(2)
Add (4-3) = 1 to R(2) ⇒ R = 1,4
R and D are now aligned.




ALU Schedule

For ALU: R = 2, D = 0

Aligning first element: Add (2-0) = 2 to D ⇒ D = 2. R and D are now aligned.

ROM = 0,1, RAM = 1,4, ALU = 2

Modified Reservation Table

Cycle  0  1  2  3  4
ROM    0  0  -  -  -
RAM    -  0  -  -  0
ALU    -  -  0  -  -

As we have seen earlier, this is indeed the optimal schedule with ASP = 2.




Optimized Reservation Table


Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  1  1  2  2  3  3  4  4  5
RAM    -  0  -  1  0  2  1  3  2  4  3
ALU    -  -  0  -  1  -  2  -  3  -  4

The ALU is idle 50% of the time.

Rather than buffering its result to delay the write back, we can use a slower ALU which takes 2 cycles to compute.




Using a Slower ALU

The reservation table with a slower ALU is:

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  0  1  1  2  2  3  3  4  4  5
RAM    -  0  -  1  0  2  1  3  2  4  3
ALU    -  -  0  0  1  1  2  2  3  3  4

One can trade off power for speed when designing the ALU.

By using optimization techniques, we are able to reach a higher throughput, even with a slower ALU!




Alternative Choice of Cycle

Cycle  0  1  2  3
ROM    0  0  -  -
RAM    -  0  -  0
ALU    -  -  0  -

MASP = 2. Choose the cycle (1,3). Then D = {0}, {2}, {0,2}.

For ROM: R = 0,1, D = 0,2
R(1) = D(1) = 0, R(2) < D(2)

Add D(2) - R(2) to all members of R at position 2 (and beyond) ⇒ R(2) = 2.

R and D are now aligned at 0,2.




Alternative Cycle:RAM Schedule

For RAM: R = 1,3, D = 0,2

R(1) > D(1)
Add (1-0) = 1 to D elements ⇒ D = 1,3
R and D are now aligned at 1,3.

For ALU: R = 2, D = 0

Aligning first element: Add (2-0) = 2 to D ⇒ D = 2. R and D are now aligned at 2.

Cycle  0  1  2  3
ROM    0  -  0  -
RAM    -  0  -  0
ALU    -  -  0  -




Time Ordering

Cycle  0  1  2  3  4  5  6  7  8  9  10
ROM    0  1  0  1  2  3  2  3  4  5  4
RAM    -  0  1  0  1  2  3  2  3  4  5
ALU    -  -  0  1  -  -  2  3  -  -  4

As expected, the schedule is optimum.

The sampling period alternates between 1 and 3.

However this schedule does not preserve time order.

It asks for computation and constant fetch in the same cycle.

If we pre-fetch the constant for the next to next calculation in this cycle and store it for 4 cycles, it may still work.




Conclusions

Pipelining can improve throughput of systems.

A systematic procedure for optimizing pipeline throughput exists. It can create modified reservation tables which are optimal by delaying some operations.

However, it does not guarantee that the time order of different operations will be preserved.

Different cycles with the same Average Sampling Period may have to be tried before an acceptable time order is found.

The procedure also allows us to identify non-critical components which can then be redesigned to be slower but at lower power consumption.




An Introduction to VHDL: Overview

Dinesh Sharma


August 2008

Dinesh Sharma VHDL



Part I

VHDL Design Units

1 Design Units in VHDL: Entity, Architecture, Component, Configuration, Packages and Libraries

2 Object and Data Types: Scalar data types, Composite data types

An introduction to VHDL

VHDL is a hardware description language which uses the syntax of ADA. Like any hardware description language, it is used for many purposes.

For describing hardware.

As a modeling language.

For simulation of hardware.

For early performance estimation of system architecture.

For synthesis of hardware.

For fault simulation, test and verification of designs.

etc.





Design Elements in VHDL: ENTITY

The basic design element in VHDL is called an ‘ENTITY’.

An ENTITY represents a template for a hardware block.

It describes just the outside view of a hardware module, namely its interface with other modules in terms of input and output signals.

The hardware block can be the entire design, a part of it or indeed an entire "test bench".

A test bench includes the circuit being designed, blocks which apply test signals to it and those which monitor its output.

The inner operation of the entity is described by an ARCHITECTURE associated with it.





ENTITY DECLARATION

The declaration of an ENTITY describes the signals which connect this hardware to the outside. These are called port signals. It also provides optional values of manifest constants. These are called generics.

VHDL 93

    entity name is
        generic (list);
        port (list);
    end entity name;

VHDL 87

    entity name is
        generic (list);
        port (list);
    end name;





ENTITY EXAMPLE

VHDL 93

    entity flipflop is
        generic (Tprop: delay_length);
        port (clk, d: in bit; q: out bit);
    end entity flipflop;

VHDL 87

    entity flipflop is
        generic (Tprop: delay_length);
        port (clk, d: in bit; q: out bit);
    end flipflop;

The entity declares port signals, their directions and data types.

These signals are used by an architecture associated with this entity.





Design Elements in VHDL: ARCHITECTURE

An ARCHITECTURE describes how an ENTITY operates. An ARCHITECTURE is always associated with an ENTITY.

There can be multiple ARCHITECTURES associated with an ENTITY.

An ARCHITECTURE can describe an entity in a structural style, behavioural style or mixed style.

The language provides constructs for describing components, their interconnects and composition (structural descriptions).

The language also includes signal assignments, sequential and concurrent statements for describing data and control flow, and for behavioural descriptions.





ARCHITECTURE Syntax

VHDL 93

    architecture name of entity-name is
        (declarations)
    begin
        (concurrent statements)
    end architecture name;

VHDL 87

    architecture name of entity-name is
        (declarations)
    begin
        (concurrent statements)
    end name;

The architecture inherits the port signals from its entity. It must declare its internal signals. Concurrent statements constituting the architecture can be placed in any order.





ARCHITECTURE Example

VHDL 93

    architecture simple of dff is
        signal ...;
    begin
        ...
    end architecture simple;

VHDL 87

    architecture simple of dff is
        signal ...;
    begin
        ...
    end simple;





Design Elements in VHDL: COMPONENTS

An ENTITY↔ARCHITECTURE pair actually describes a component type.

In a design, we might use several instances of the same component type.

Each instance of a component type may be distinguished by using a unique name.

Thus, a component instance with a unique instance name is associated with a component type, which in turn is associated with an ENTITY↔ARCHITECTURE pair.

This is like saying U1 (component instance) is a D Flip Flop (component type) which is associated with an entity DFF (which describes its pin diagram) using architecture LS7474 (which describes its inner operation).





Component Example

VHDL 93

    component name is
        generic (list);
        port (list);
    end component name;

EXAMPLE:

    component flipflop is
        generic (Tprop: delay_length);
        port (clk, d: in bit; q: out bit);
    end component flipflop;

VHDL 87

    component name
        generic (list);
        port (list);
    end component;

EXAMPLE:

    component flipflop
        generic (Tprop: delay_length);
        port (clk, d: in bit; q: out bit);
    end component;





Design Elements in VHDL: Configuration

Structural Descriptions describe components and their interconnections.

A component is an instance of a component type. Each component type is associated with an ENTITY↔ARCHITECTURE pair.

The architecture used can itself contain other components, whose type will then be associated with other ENTITY↔ARCHITECTURE pairs.

A "configuration" describes linkages between component types and ENTITY↔ARCHITECTURE pairs. It specifies bindings for all components used in an architecture associated with an entity.





Design Elements in VHDL: Packages

Related declarations and design elements like subprograms and procedures can be placed in a "package" for re-use.

A package has a declarative part and an implementation part.

This is somewhat like entity and architecture for designs.

Objects in a package can be referred to by a packagename.objectname syntax.

A description can include a 'use' clause to incorporate the package in the design. Objects in the package then become visible to the description without having to use the dot reference as above.





Design Elements in VHDL: Libraries

Many design elements such as packages, definitions and entire entity architecture pairs can be placed in a library.

The description invokes the library by first declaring it:
For example, library IEEE;

Objects in the Library can then be incorporated in the design by a 'use' clause.
For example, use IEEE.std_logic_1164.all;

In this example, IEEE is a library and std_logic_1164 is a package in the library.

Object and Data Types in VHDL

VHDL defines several types of objects. These include constants, variables, signals and files.

The types of values which can be assigned to these objects are called data types.

Same data types may be assigned to different object types. For example, a constant, a variable and a signal can all have values which are of data type BIT.

Declarations of objects include their object type as well as the data type of values that they can acquire.
For example: signal Enable: BIT;





Data Types

[Figure: tree of VHDL data types.
Scalar: Discrete (integer; enumeration: bit, character, boolean, severity_level, file_open_kind, file_open_status), Floating Pt. (real), Physical (time).
Composite: constrained array, unconstrained array (bit_vector, string).
Access.
File.]





Enumeration Type

VHDL enumeration types allow us to define a set of values that a variable of this type can acquire. For example, we can define a data type by the following declaration:

    type instr is (add, sub, adc, sbb, rotl, rotr);

Now a variable or a signal defined to be of type instr can only be assigned values enumerated above, that is: add, sub, adc, sbb, rotl and rotr.

In actual implementation, these values may be mapped to a 3 bit value. However, an attempt to assign, say, "010" to a variable of type instr will result in an error. Only the enumerated values can be assigned to a variable of this type.





Pre-defined Enumeration Types

A few enumeration types are pre-defined in the language. These are:

    type bit is ('0', '1');
    type boolean is (false, true);
    type severity_level is (note, warning, error, failure);
    type file_open_kind is (read_mode, write_mode, append_mode);
    type file_open_status is
        (open_ok, status_error, name_error, mode_error);

In addition to these, the character type enumerates all the ASCII characters.





Types and SubTypes

A signal type defined in the IEEE Library is std_logic. This is a signal which can take one of 9 possible values. It is defined by:

    type std_logic is ('U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-');

(In the std_logic_1164 package itself, this enumeration is declared as std_ulogic, and std_logic is its resolved subtype.)

A subtype of this kind of signal can be defined, which can take the four values 'X', '0', '1', and 'Z' only. This can be defined to be a subtype of std_logic:

    subtype fourval_logic is std_logic range 'X' to 'Z';

Similarly, we may want to constrain some integers to a limited range of values. This can be done by defining a subtype:

    subtype bitnum is integer range 31 downto 0;





Physical Types

Objects which are declared to be of Physical type carry a value as well as a unit. These are used to represent physical quantities such as time, resistance and capacitance.

The Physical type defines a basic unit for the quantity and may define other units which are multiples of this unit.

Time is the only Physical type which is pre-defined in the language. The user may define other Physical types.





Pre-defined Physical Type: Time

    type time is range 0 to . . .
        units
            fs;
            ps = 1000 fs;
            ns = 1000 ps;
            us = 1000 ns;
            ms = 1000 us;
            sec = 1000 ms;
            min = 60 sec;
            hr = 60 min;
        end units time;

The user may define other physical types as required.





User Defined Physical Types

As an example of user defined Physical types, we can define the resistance type.

    type resistance is range 0 to 1E9
        units
            ohm;
            kohm = 1000 ohm;
            Mohm = 1000 kohm;
        end units resistance;





Composite Data Types

Composite data types are collections of scalar types.

VHDL recognizes records and arrays as composite data types.

Records are like structures in C.

Arrays are indexed collections of scalar types. The index must be a discrete scalar type.

Arrays may be one-dimensional or multi dimensional.





Arrays

Arrays can be constrained or unconstrained.

In constrained arrays, the type definition itself places bounds on index values. For example:

    type byte is array (7 downto 0) of bit;
    type rotmatrix is array (1 to 3, 1 to 3) of real;

In unconstrained arrays, no bounds are placed on index values. Bounds are established at the time of declaration.

    type bus is array (natural range <>) of bit;

The declaration could be:

    signal addr_bus: bus(15 downto 0);
    signal data_bus: bus(7 downto 0);

(Note that bus is a reserved word in VHDL, so an actual description would need a different name for this type.)





Built in Array types

VHDL defines two built in types of arrays. These are bit vectors and strings. Both are unconstrained.

    type bit_vector is array (natural range <>) of bit;
    type string is array (positive range <>) of character;

As a result we can directly declare:

    variable message: string(1 to 20);
    signal Areg: bit_vector(7 downto 0);





Records

While an array is a collection of the same type of objects, a record can hold components of different types and sizes.

This is like a struct in C.

The syntax of a record declaration contains a semicolon separated list of fields, each field having the format
name, . . ., name : subtype
For example:

    type resource is record
        P_reg, Q_reg : bit_vector(7 downto 0);
        Enable : bit;
    end record resource;



Part II

Structural Description in VHDL

3 Structural Description: Component Declarations, Component Instantiation, Configuration, Repetition Grammar

Structural Style

Structural style describes a design in terms of components and their interconnections.

Each component declares its ports and the type and direction of signals that it expects through them.

How can we describe interconnections between components?

[Figure: three components U1, U2 and U3, each with pins p1 to p6, placed between the external ports In and Out and interconnected by the internal signals s1 to s7.]





Describing Interconnect

[Figure: the same circuit of components U1, U2 and U3 (pins p1 to p6), ports In and Out, and internal signals s1 to s7.]

For each internal interconnect, we define an internal signal.

When instantiating a component, we map its ports to specific internal signals.

For example, in the circuit above, at the time of instantiating U1, we map its pin p2 to signal s2.

Similarly, when instantiating U2, we map its pin p3 to s2.

This connects p2 of U1 to s2 and through s2 to pin p3 of U2.





Structural Architecture

A purely structural architecture for an entity will consist of

1 Component declarations: to associate component types with their port lists.

2 Signal Declarations: to declare the signals used.

3 Component Instantiations: to place component instances and to portmap their ports to signals. Signals can be internal or port signals declared by the ENTITY.

4 Configurations: to bind component types to ENTITY↔ARCHITECTURE pairs.

5 Repetition grammar: for describing multiple instances of the same component type, for example memory cells or bus buffers.





Component Declarations

VHDL 93

    component name is
        generic (list);
        port (list);
    end component name;

EXAMPLE:

    component flipflop is
        generic (Tprop: delay_length);
        port (clk, d: in bit; q: out bit);
    end component flipflop;

VHDL 87

    component name
        generic (list);
        port (list);
    end component;

EXAMPLE:

    component flipflop
        generic (Tprop: delay_length);
        port (clk, d: in bit; q: out bit);
    end component;





Component Instantiation

VHDL-93: Direct Instantiation

VHDL-93 allows direct instantiation of ENTITY↔ARCHITECTURE pairs without having to go through a component type declaration first.

    Instance-name: entity entity-name (architecture-name)
        generic map (list)
        port map (list);

This form is convenient, but does not have the flexibility of associating alternative ENTITY↔ARCHITECTURE pairs with a component.

VHDL-87 does not allow direct instantiation.






VHDL-93: Normal Instantiation

    instance-name: component component-type-name
        generic map (list)
        port map (list);

The association here is with a previously declared component type. The type will be bound to an ENTITY↔ARCHITECTURE pair using an inline configuration statement or a configuration construct.






VHDL-87

The keyword component is not used in VHDL-87. This is because direct instantiations are not allowed and therefore the binding is always to a component.

    instance-name: component-type-name
        generic map (list)
        port map (list);

The association is with a previously declared component type. The type will be bound to an ENTITY↔ARCHITECTURE pair using an inline configuration statement or a configuration construct.





Inline Configuration

The association between component types and ENTITY↔ARCHITECTURE pairs can be made inline with a use clause.

    for all: component-name
        use entity entity-name(architecture-name);

Instead of saying for all, we can specify a list of selected instances of this component type to which this binding will apply.

    for instance-name-list: component-name
        use entity entity-name(architecture-name);





The keyword OTHERS

If we use the keyword others instead of a list of instance names, it refers to all component instances of this component-name which have not yet figured in a name-list.

In VHDL, the keyword others is used in different contexts involving lists.

If some members of the list have been specified, then others refers to the remaining members. (If none was specified, it is equivalent to all.)
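A minimal sketch of an inline configuration that binds one named instance and uses others for the rest. The entity inv, its two architectures fast and slow, and the buffer chain buf3 are all assumptions made up for illustration.

```vhdl
entity inv is
  port (a : in bit; y : out bit);
end entity inv;

architecture fast of inv is
begin
  y <= not a;
end architecture fast;

architecture slow of inv is
begin
  y <= not a after 2 ns;
end architecture slow;

entity buf3 is
  port (x : in bit; z : out bit);
end entity buf3;

architecture structural of buf3 is
  component inverter is
    port (a : in bit; y : out bit);
  end component inverter;

  -- I1 is bound by name; "others" picks up every inverter
  -- instance that has not yet appeared in a name list (I2 and I3).
  for I1 : inverter use entity work.inv(slow);
  for others : inverter use entity work.inv(fast);

  signal n1, n2 : bit;
begin
  I1 : component inverter port map (a => x,  y => n1);
  I2 : component inverter port map (a => n1, y => n2);
  I3 : component inverter port map (a => n2, y => z);
end architecture structural;
```

Replacing the two binding lines with a single "for all : inverter use entity work.inv(fast);" would bind all three instances to the same architecture.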





Hierarchical Configuration

When we associate a component type with a previously defined ENTITY↔ARCHITECTURE pair, the chosen architecture could itself contain other components, and these components in turn would be associated with other ENTITY↔ARCHITECTURE pairs.

This hierarchical association can be described by a standalone design unit called a configuration.





VHDL contains fairly complex configuration statements. A simplified construct is introduced here:

    configuration config-name of entity-name is
        for architecture-name
            for component-instance-namelist: component-type-name
                use entity entity-name(architecture-name);
            end for;
        end for;
    end configuration config-name;





Structural description: Example

[Figure: an XOR gate built from four two-input NAND gates, with inputs A and B.]

Let us choose the XOR gate shown in the figure as an example for structural description.

It uses four instances of a single type of component: the two-input NAND.

We shall describe the NAND gate first.





The work library

In VHDL, as we describe entities and architectures, these are compiled into a special library called WORK.

This library is always included and does not have to be declared.

In some sense, the WORK library represents the current state of development of the design project.





Definition of NAND

    Entity nand2 is
        port (in1, in2: in bit; p: out bit);
    end entity nand2;

We do not use any generic for this simple example.

    Architecture trivial of nand2 is
    begin
        p <= not (in1 and in2);
    end Architecture trivial;

‘not’ and ‘and’ are inbuilt logical functions. (Actually so is nand, but we are trying to be cute!)

Now that we have this entity-architecture pair, we can use it to build our XOR gate.





XOR Gate example

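[Figure: the XOR gate with NAND instances N1 to N4, inputs A and B, internal signals s1, s2, s3 and output axb.]

    USE WORK.ALL;
    Entity xor2 is        -- "xor" itself is a reserved word and cannot name an entity
        port (a, b: in bit; axb: out bit);
    End Entity xor2;

    Architecture simple of xor2 is
        component NAND2in is
            port (in1, in2: in bit; p: out bit);   -- same ports as entity nand2
        end component NAND2in;
        for all: NAND2in use Entity NAND2(Trivial);
        signal s1, s2, s3: bit;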





XOR Architecture body

[Figure: the XOR gate with NAND instances N1 to N4, inputs A and B, internal signals s1, s2, s3 and output axb.]

    begin
        N1: component NAND2in port map (a, b, s1);
        N2: component NAND2in port map (a, s1, s2);
        N3: component NAND2in port map (b, s1, s3);
        N4: component NAND2in port map (s2, s3, axb);
    end Architecture simple;





Repetition Grammar

We frequently use a large number of identical components of the same type (for example, memory cells or bus drivers). It is tedious to instantiate and configure each one of them individually.

VHDL provides a way to place a collection of instances of a component type in one go using the generate statement.





GENERATE Statement

The generate statement contains a for loop which takes effect during the circuit elaboration step. This can be used to repeat instantiation constructs. We illustrate this statement with an example:

    groupname: for index in 0 to width-1 generate
    begin
        some-name: component outbuf
            port map (...);
    end generate groupname;

The index defined in the “for” construct has local scope and can be used to pick specific signals from an array in port map statements.
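A fuller sketch of the generate statement, showing the index selecting elements of array ports. The ports and behaviour given to outbuf here, and the bus_driver entity, are assumptions for illustration only.

```vhdl
entity outbuf is
  port (d : in bit; en : in bit; q : out bit);
end entity outbuf;

architecture behave of outbuf is
begin
  q <= d and en;  -- placeholder behaviour, assumed for illustration
end architecture behave;

entity bus_driver is
  generic (width : positive := 8);
  port (din  : in  bit_vector(width-1 downto 0);
        en   : in  bit;
        dout : out bit_vector(width-1 downto 0));
end entity bus_driver;

architecture structural of bus_driver is
  component outbuf is
    port (d : in bit; en : in bit; q : out bit);
  end component outbuf;
begin
  -- One outbuf instance is created per bit during elaboration;
  -- index picks the matching element of din and dout.
  bufs : for index in 0 to width-1 generate
    b : component outbuf
      port map (d => din(index), en => en, q => dout(index));
  end generate bufs;
end architecture structural;
```

Because width is a generic, the same description yields a bus driver of any width without editing the architecture.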





Example: Full adder

[Figure: full adder block with inputs a, b, C_in and outputs sum, C_out.]

    Entity FullAdder is
        Port (a, b, C_in: in bit; sum, C_out: out bit);
    End Entity FullAdder;

C_out and sum represent the more significant and less significant bits of a + b + C_in.










Suppose this is too difficult for the likes of us to figure out.

We would like to decompose the circuit into blocks which handle two bits at a time.





Decomposition of Full Adder

[Figure: full adder built from two half adders, HA1 and HA2, and a combiner. Inputs a, b, C_in; outputs sum, C_out; internal signals s1, cy1, cy2. Each half adder has inputs i1, i2 and outputs s, cy.]

The combiner just combines the carries from the two half adders. (Just an OR gate will do it.)

Half Adder

Each half adder represents the sum and carry of just two bits.

Carry occurs only if both bits are 1. Sum is zero if both bits are zero or both are one. So sum = a xor b, cy = a and b.





Description of full Adder

    Entity HalfAdder is
        port (a, b: in bit; s, cy: out bit);
    End Entity HalfAdder;

    Architecture trivial of HalfAdder is
    begin
        s  <= a xor b;
        cy <= a and b;
    end Architecture trivial;

    Architecture simple of FullAdder is
        Component HalfAdder is
            port (a, b: in bit; s, cy: out bit);
        End Component HalfAdder;
        Component OR2in is      -- two-input OR gate, assumed to be defined elsewhere
            port (in1, in2: in bit; p: out bit);
        End Component OR2in;
        signal s1, cy1, cy2: bit;
    begin
        HA1:  Component HalfAdder port map (a, b, s1, cy1);
        HA2:  Component HalfAdder port map (s1, C_in, sum, cy2);   -- second half adder adds C_in
        Cmbn: Component OR2in port map (cy1, cy2, C_out);
    end Architecture simple;





The half adder

Carry from the half adder is an AND gate, and the combiner is an OR.

But gates without inversion are slow. So we bring out the inverted carry (cybar) rather than carry, using a NAND gate.

[Figure: half adder with inputs i1, i2 and outputs s, cybar.]

    Entity HalfAdder is
        port (a, b: in bit; s, cybar: out bit);
    End Entity HalfAdder;

    Architecture better of HalfAdder is
    begin
        s     <= a xor b;
        cybar <= a nand b;
    end Architecture better;

The combiner should now be an OR of negative-true signals. This is just a NAND.





Efficient Full Adder

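[Figure: full adder built from half adders HA1 and HA2 with inverted carry outputs (cybar) and a NAND combiner. Inputs a, b, C_in; outputs sum, C_out; internal signals s1, c1b, c2b.]

    Architecture better of FullAdder is
        Component HalfAdder is
            port (a, b: in bit; s, cybar: out bit);
        End Component HalfAdder;
        Component NAND2in is    -- two-input NAND, with the ports of entity nand2
            port (in1, in2: in bit; p: out bit);
        End Component NAND2in;
        signal s1, c1b, c2b: bit;
    begin
        HA1:  Component HalfAdder port map (a, b, s1, c1b);
        HA2:  Component HalfAdder port map (s1, C_in, sum, c2b);   -- second half adder adds C_in
        Cmbn: Component NAND2in port map (c1b, c2b, C_out);
    end Architecture better;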



Part III

Behavioural Description Using VHDL

4 Behavioural Description
    Concurrent Statements
    VHDL Operators
    Processes
    Sequential Statements

5 Subprograms

6 Attributes
    Array attributes
    Type Attributes
    Signal attributes

Behavioural Style

Behavioural style describes a design in terms of its behaviour, and not in terms of a netlist of components.

We describe behaviour through “if-then-else” type of constructs, loops, and sequential and concurrent assignment statements.

Statements like “if-then-else” are inherently sequential. These must therefore occur only inside sequential bodies like processes.

A concurrent assignment statement may be considered as a shorthand for a very simple process.


Specifying a waveform

A waveform is described by a comma-separated list of values and, optionally, delays. For example, we may assign a waveform by a statement like

    indata <= '0', '1' AFTER 20 NS, '0' AFTER 50 NS;

The values at different times are treated as transport delays and are all inserted in the time-ordered queue without wiping out earlier values.

(This is the only context where delays are transport by default.) Single value assignments use inertial delay by default.
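The waveform assignment above can be tried in a small self-checking test bench. This is a sketch; the entity name stimulus and the check times of 30 ns and 60 ns are arbitrary choices.

```vhdl
entity stimulus is
end entity stimulus;

architecture tb of stimulus is
  signal indata : bit;
begin
  -- One assignment schedules three transactions:
  -- '0' now, '1' at 20 ns and '0' at 50 ns.
  indata <= '0', '1' after 20 ns, '0' after 50 ns;

  check : process
  begin
    wait for 30 ns;   -- between the second and third transactions
    assert indata = '1' report "expected '1' at 30 ns" severity error;
    wait for 30 ns;   -- now at 60 ns, after the last transaction
    assert indata = '0' report "expected '0' at 60 ns" severity error;
    wait;             -- suspend forever
  end process check;
end architecture tb;
```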


Concurrent Assignment

A concurrent assignment can be made conditionally by using ‘when’ clauses.

    name <= [delay-mechanism]
        waveform when Boolean-expression else
        waveform when Boolean-expression;

The assignment is made from the first waveform for which the Boolean expression evaluates to TRUE.


The assignment can also be made on a selective basis, based on the value of some expression:

    with expression select
        name <= [delay-mechanism]
            waveform when choices,
            waveform when choices;

If the expression evaluates to one of the specified choices, the corresponding assignment is made.
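Both forms of concurrent assignment can be compared on a 4-to-1 multiplexer. This is a minimal sketch; the entity mux4 and its port names are assumptions made for illustration.

```vhdl
entity mux4 is
  port (d   : in  bit_vector(3 downto 0);
        sel : in  bit_vector(1 downto 0);
        y_cond, y_sel : out bit);
end entity mux4;

architecture dataflow of mux4 is
begin
  -- Conditional form: conditions are tested in order and
  -- the first one that is TRUE supplies the waveform.
  y_cond <= d(0) when sel = "00" else
            d(1) when sel = "01" else
            d(2) when sel = "10" else
            d(3);

  -- Selected form: the choices must cover every value of sel,
  -- which "others" takes care of here.
  with sel select
    y_sel <= d(0) when "00",
             d(1) when "01",
             d(2) when "10",
             d(3) when others;
end architecture dataflow;
```

The two outputs always carry the same value; the selected form is usually the clearer choice when every branch tests the same expression.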


Assignment to an aggregate

Assignments can be made to a collection of signals simultaneously. For example, let vec be defined as bit_vector(2 downto 0):

    vec <= ("000");                        -- 000 : string
    vec <= ('0', '0', '1');                -- 001 : positional
    vec <= (1 => '1', others => '0');      -- 010 : named, partial
    vec <= ('1', others => '0');           -- 100 : positional, partial
    vec <= (2|0 => '1', others => '0');    -- 101 : partial
    vec <= (others => '1');                -- 111


VHDL Operators

Logical operators: AND, OR, NAND, NOR, XOR, XNOR and NOT. For example, x <= a xor b;

Relational operators: =, /=, <, <=, >, >=

= and /= operate on any type. Others operate on arithmetic types (integers, reals etc.). All of these return a Boolean value.

Shift operators: SLL (logical left), SLA (arithmetic left), SRL (logical right), SRA (arithmetic right), ROL (rotate left) and ROR (rotate right).


Processes

Sequential constructs need to be placed inside a process. A process uses the syntax:

    [process-label:] process [(sensitivity-list)] [is]
        [declarations]
    begin
        [sequential statements]
    end process [process-label];

Sequential statements include “if” constructs, case statements, looping constructs, assertions, wait statements etc.


Process with Sensitivity list

Every process is like an endless loop. Therefore, it requires an explicit or implicit suspend statement.

If a sensitivity list is given with the process statement, the process automatically suspends when it reaches its end.

It restarts from the beginning when any of the signals in its sensitivity list has an event.

Such a process has a static sensitivity and an implicit suspend statement.


Wait statements

A process without a sensitivity list requires explicit suspend statements. These are provided by wait statements. These can be of the form:

    wait for waiting-time;
    wait on signal-list;
    wait until waiting-condition;
    wait for 0 some-time-unit;
    wait;

wait for 0 ns causes the process to suspend till the next delta. The last form (the bare wait statement) suspends the process forever.
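The two ways of suspending a process can be placed side by side. In this sketch (the entity and_gate and its ports are assumed names) both processes describe the same AND gate.

```vhdl
entity and_gate is
  port (a, b : in bit; y1, y2 : out bit);
end entity and_gate;

architecture behave of and_gate is
begin
  -- Static sensitivity: the process suspends when it reaches its end
  -- and restarts on an event on a or b.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- The same behaviour with an explicit wait as the last statement.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behave;
```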


Dynamic sensitivity

Processes without a sensitivity list but with multiple wait statements have a dynamic sensitivity. This is because these processes are sensitive to different events at different times.

One cannot mix static and dynamic sensitivity. Thus, a process with a sensitivity list cannot use wait statements.

This is because once the process is suspended, it is possible to have an event on a signal in the sensitivity list simultaneously with the condition for resumption after wait being fulfilled.

This would leave the process undecided on where to resume from.


IF statements

if statements are similar to their counterparts in programming languages. The syntax is:

    [if-label:] if Boolean-expression then
        sequential statements
    [elsif Boolean-expression then
        sequential statements]
    [elsif ...]
    [else
        sequential statements]
    end if [if-label];


CASE statements

A case statement acts like a multiplexer. The syntax is:

    [case-label:] case expression is
        when choices =>
            sequential-statements
        [when ...]
    end case [case-label];


CASE Choices

Choices can be specified in CASE statements as vertical-bar-separated lists of expressions, discrete ranges or the keyword others. For example:

    case opcode is
        when load | store | add | subtract =>
            ...
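A complete case statement along these lines might look as follows. The enumerated type opcode_t, the package opcodes and the decode entity are assumptions introduced to make the fragment self-contained.

```vhdl
package opcodes is
  -- Hypothetical opcode set, assumed for illustration.
  type opcode_t is (load, store, add, subtract, nop);
end package opcodes;

use work.opcodes.all;

entity decode is
  port (opcode : in opcode_t;
        uses_memory, uses_alu : out bit);
end entity decode;

architecture behave of decode is
begin
  process (opcode)
  begin
    case opcode is
      when load | store =>           -- vertical bar separated list
        uses_memory <= '1'; uses_alu <= '0';
      when add | subtract =>
        uses_memory <= '0'; uses_alu <= '1';
      when others =>                 -- everything not yet listed
        uses_memory <= '0'; uses_alu <= '0';
    end case;
  end process;
end architecture behave;
```

Every value of the selector must be covered by exactly one choice, which is why the final others branch is present.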


Loop Statements

There are several different forms of the loop statement. The simplest is the endless loop:

    [loop-label:] loop
        sequential statements
    end loop [loop-label];

It is assumed that such a loop will have an exit statement or a wait statement inside to suspend operation.


Exiting a Loop

The exit statement has the syntax:

    [label:] exit [loop-label] [when Boolean-expression];

The loop label allows one to exit several levels of nested loops.

We can also skip to the end of a loop by using the next statement. This works like “continue” in C.


NEXT Statement

    [label:] next [loop-label] [when Boolean-expression];

The next statement skips the remaining statements of the loop and immediately starts the next iteration of the specified loop.

The loop label allows one to skip through several levels of nested loops.


WHILE Loops

VHDL also has a while loop.

    [loop-label:] while Boolean-expression loop
        sequential statements
    end loop [loop-label];

The loop continues as long as the Boolean expression is TRUE.


For Loops

VHDL also provides a for loop.

    [loop-label:] for identifier in discrete-range loop
        sequential statements
    end loop [loop-label];

The discrete range can be of the form

    expression to | downto expression

The identifier is initialized to the left limit of the range and takes on successive values in the discrete range until it passes the right limit.
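A short sketch combining a for loop with the next statement: it counts the bits that are '1' in a byte. The entity count_ones and its ports are assumed names.

```vhdl
entity count_ones is
  port (v : in bit_vector(7 downto 0); n : out natural);
end entity count_ones;

architecture behave of count_ones is
begin
  process (v)
    variable count : natural;
  begin
    count := 0;
    for i in 7 downto 0 loop   -- i is declared implicitly by the loop
      next when v(i) = '0';    -- skip zeros, like "continue" in C
      count := count + 1;
    end loop;
    n <= count;
  end process;
end architecture behave;
```

The loop identifier i exists only inside the loop and cannot be assigned to.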


Assertions and Reports

The assert statement takes the form

    [label:] assert Boolean-expression
        [report expression] [severity expression];

If the Boolean expression is TRUE, no action is taken. If it is FALSE, an assertion violation is said to have occurred. The simulator then outputs the report expression.

Subsequent operation depends on the severity clause.


Severity Clause in Assertions

Assert statements are used for debugging and documentation. The severity clause decides what happens when an assertion failure occurs.

Severity is an enumerated type which is predefined to take any of the values:

    note, warning, error, failure

Depending on the severity value, simulation continues or is aborted.


Severity values

Note simply generates an output when an assertion violation occurs.

Warning is useful when the validity of the simulation may be in doubt, but we would like to issue a warning and continue anyway.

Error is used when an unexpected value is encountered.

Failure is the most severe violation and is used when some inconsistency is detected.


Assertions defaults

    [label:] assert Boolean-expression
        [report expression] [severity expression];

If the optional report clause is missing in the assert statement, the default report message is “Assertion Violation”.

If the severity clause is omitted, the default value is ‘error’.

Most simulators allow the user to set a severity threshold, beyond which the simulation is aborted on an assertion violation. It is common to continue on note and warning and to abort on error and failure.

In VHDL-93, the report clause can be used by itself as a statement to output useful messages.
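As an illustration, an assertion can watch for the forbidden input combination of an RS latch, and a bare VHDL-93 report statement can log an event. The entity rs_check and the message texts are assumptions.

```vhdl
entity rs_check is
  port (r, s : in bit);
end entity rs_check;

architecture behave of rs_check is
begin
  process (r, s)
  begin
    -- Nothing happens while the condition is TRUE; a violation
    -- prints the message and is classed as a warning.
    assert not (r = '1' and s = '1')
      report "R and S are both high"
      severity warning;

    -- VHDL-93: report used on its own (severity defaults to note).
    if r = '1' then
      report "reset requested";
    end if;
  end process;
end architecture behave;
```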

Subprograms in VHDL

VHDL has two types of subprograms: Functions and Procedures.

FUNCTIONS are used to return a single value from a given list of input parameters. These occur in expressions on the right hand side of VHDL statements. Functions execute in zero simulation time.

PROCEDURES can return multiple values and need not execute in zero simulation time. The parameters have their type as well as direction defined in the parameter list. These are invoked like a VHDL statement.

FUNCTIONS

Functions can be PURE or IMPURE.

A PURE function returns the same value every time it is called with the same values of input parameters. Most functions are PURE.

An IMPURE function can return different values for calls with the same parameter values. An example is the function NOW, which returns the current simulation time. A function returning random numbers would also be IMPURE.

Functions

    FUNCTION name (parameter list) RETURN type IS
        . . . Local declarations . . .
    BEGIN
        Sequential Statements;
        . . . ;
    END [FUNCTION] name;

Function Example

    TYPE Byte IS ARRAY (7 DOWNTO 0) OF BIT;

    FUNCTION ByteVal (InByte: Byte) RETURN Integer IS
        Variable RetVal: Integer := 0;
    BEGIN
        -- Scan from the most significant bit down, doubling as we go.
        FOR I IN 7 DOWNTO 0 LOOP
            RetVal := 2 * RetVal;
            IF (InByte(I) = '1') THEN
                RetVal := RetVal + 1;
            END IF;
        END LOOP;
        RETURN RetVal;
    END FUNCTION ByteVal;

Procedures

Declaration:

    PROCEDURE name (parameter list) IS
        . . . Local declarations . . .
    BEGIN
        Sequential Statements;
        . . . ;
    END [PROCEDURE] name;

A procedure ends when it reaches the END statement. It can be terminated earlier by using the RETURN statement.

Parameter Lists for Procedures

The parameter list is similar to the list of signals in a PORT declaration.

Elements of the list have a TYPE as well as a direction.

The direction can be in, out or inout.

Elements of the list can also have their object class (Constant / Variable / Signal) specified in the parameter list.

For example: (SIGNAL a, b, c: IN BIT; VARIABLE result: OUT INTEGER);
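A procedure using the example parameter list above could be sketched as follows. The procedure name count_high, its body and the surrounding test bench are assumptions; only the parameter list comes from the text.

```vhdl
entity proc_demo is
end entity proc_demo;

architecture tb of proc_demo is
  signal x, y, z : bit := '1';

  -- Counts how many of the three signals are '1'.
  procedure count_high (signal a, b, c : in bit;
                        variable result : out integer) is
    variable n : integer := 0;   -- an "out" parameter cannot be read back
  begin
    if a = '1' then n := n + 1; end if;
    if b = '1' then n := n + 1; end if;
    if c = '1' then n := n + 1; end if;
    result := n;
  end procedure count_high;
begin
  process
    variable total : integer;
  begin
    wait for 1 ns;
    count_high(x, y, z, total);   -- invoked like a statement
    assert total = 3 report "unexpected count" severity error;
    wait;
  end process;
end architecture tb;
```

The actual parameters must match the declared object classes: signals for a, b and c, and a variable for result.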

Attributes

VHDL provides built-in functions which return useful attributes of the objects that they operate on. Attribute functions may provide attributes of

Arrays

Types

Signals

Entities

Attributes are invoked as name'attribute_name. The single quote is read as “tick”.


Array Attributes

Array attributes interrogate the properties of arrays. Consider the declaration:

    TYPE regfile IS ARRAY (0 TO 3, 7 DOWNTO 0) OF BIT;

Then we can use the following attributes:

    'LEFT:           regfile'LEFT(2) = 7
    'RIGHT:          regfile'RIGHT(1) = 3
    'HIGH:           regfile'HIGH(2) = 7
    'LOW:            regfile'LOW(1) = 0
    'RANGE:          regfile'RANGE(1) = 0 TO 3
    'REVERSE_RANGE:  regfile'REVERSE_RANGE(1) = 3 DOWNTO 0
    'LENGTH:         regfile'LENGTH(1) = 4
    'ASCENDING:      regfile'ASCENDING(1) = TRUE
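Array attributes are most useful for writing code that does not depend on the bounds of its arguments. The sketch below is a generalisation of the earlier ByteVal idea to any bit_vector; the package vec_util and function to_natural are assumed names.

```vhdl
package vec_util is
  function to_natural (v : bit_vector) return natural;
end package vec_util;

package body vec_util is
  function to_natural (v : bit_vector) return natural is
    variable result : natural := 0;
  begin
    -- v'range adapts to the bounds and direction of the actual argument;
    -- the leftmost element is treated as the most significant bit.
    for i in v'range loop
      result := 2 * result;
      if v(i) = '1' then
        result := result + 1;
      end if;
    end loop;
    return result;
  end function to_natural;
end package body vec_util;
```

For a one-dimensional array the dimension argument can be omitted, so v'range here means v'range(1).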


Type Attributes

Type attributes apply only to scalar types. Consider the declarations:

    TYPE nineval IS ('U', 'X', '0', '1', 'Z', 'L', 'H', 'W', '-');
    SUBTYPE fourval IS nineval RANGE 'X' TO 'Z';

Then, fourval'BASE = nineval.

Attributes LEFT, RIGHT, HIGH and LOW are defined for TYPES also. When applied to a TYPE, these return the corresponding values as defined for the type. For example,

    nineval'LEFT = 'U', fourval'LEFT = 'X'
    POSITIVE'LOW = 1


Signal Attributes

    Name             Example            Return type   Value type
    'DELAYED         s'DELAYED          Signal        same as s
    'STABLE          s'STABLE(5 ns)     Signal        Boolean
    'EVENT           s'EVENT            Value         Boolean
    'QUIET           s'QUIET(3 ns)      Signal        Boolean
    'TRANSACTION     s'TRANSACTION      Signal        BIT
    'DRIVING         s'DRIVING          Value         Boolean
    'DRIVING_VALUE   s'DRIVING_VALUE    Value         same as s


Case of RS Latch

[Figure: RS latch with inputs R, S and outputs Q, Qbar.]

    Entity RS_Latch is
        Port (R, S: IN BIT; Q, Qbar: OUT BIT);
    End Entity RS_Latch;

    Architecture trouble of RS_Latch is
    Begin
        Q    <= R NOR Qbar;
        Qbar <= S NOR Q;
    End Architecture trouble;

This will run into trouble as Q and Qbar are declared to be outputs and cannot be used in the RHS expression of an assignment.


RS Latch

[Figure: RS latch with inputs R, S and outputs Q, Qbar.]

We have several choices:

Declare Q and Qbar to be inout. This is not safe as this will allow outside circuitry to drive the Q and Qbar nodes.

Use structural description and connect the NOR outputs to internal signals s1 and s2. Later assign s1 and s2 to Q, Qbar. This introduces an artificial delay in the driving of Q and Qbar.

A better choice is to use the driving value attribute.
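For comparison, the second choice in the list above (internal signals) can be sketched as below. The signal names q_int and qbar_int and the initial value given to qbar_int are assumptions, not part of the original design.

```vhdl
entity RS_Latch is
  port (R, S : in bit; Q, Qbar : out bit);
end entity RS_Latch;

architecture internal of RS_Latch is
  signal q_int    : bit := '0';
  -- Starting the two internal nodes in opposite states avoids an
  -- endless delta-cycle oscillation when R = S = '0' at time zero.
  signal qbar_int : bit := '1';
begin
  -- The cross-coupled NOR gates read and drive internal signals only.
  q_int    <= R nor qbar_int;
  qbar_int <= S nor q_int;

  -- The output ports are driven, never read.
  Q    <= q_int;
  Qbar <= qbar_int;
end architecture internal;
```

The copies onto Q and Qbar take one extra delta cycle, which is the artificial delay mentioned in the text.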



Part IV

The IEEE Package Std_Logic_1164

7 Signal types in Package Std_Logic_1164
    The Resolution Function
    Logic Functions with std_logic

8 Functions Defined in std_logic package 1164

9 Valued Logic

The std_logic package uses 9-valued logic. The basic unresolved signal type is declared as:

    TYPE std_ulogic IS ('U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-');

Here U is uninitialized, X is forcing unknown, W is weak unknown, L and H are weak 0 and 1, Z is high impedance and - is “don’t care”.

This type combines signal values and drive strengths, permitting modeling of open-drain and wired-OR circuits. Other types are derived from this basic signal type.





Derived types

We derive the following types from the basic std_ulogic signal type:

    TYPE std_ulogic_vector IS
        ARRAY (NATURAL RANGE <>) OF std_ulogic;

    FUNCTION resolved (s: std_ulogic_vector) RETURN std_ulogic;

    SUBTYPE std_logic IS resolved std_ulogic;

    TYPE std_logic_vector IS
        ARRAY (NATURAL RANGE <>) OF std_logic;





Other Types

The IEEE package 1164 also defines the following subtypes of std_ulogic.

1 X01 allows the values X, 0 and 1.

2 X01Z allows the values X, 0, 1 and Z. This type is compatible with the default Verilog signal type.

3 UX01 allows the values U, X, 0 and 1.

4 UX01Z allows the values U, X, 0, 1 and Z.

The package includes functions for conversion between the various types.





The Resolution Function

This function uses the following table:

         U  X  0  1  Z  W  L  H  -
    U    U  U  U  U  U  U  U  U  U
    X    U  X  X  X  X  X  X  X  X
    0    U  X  0  X  0  0  0  0  X
    1    U  X  X  1  1  1  1  1  X
    Z    U  X  0  1  Z  W  L  H  X
    W    U  X  0  1  W  W  W  W  X
    L    U  X  0  1  L  W  L  W  X
    H    U  X  0  1  H  W  W  H  X
    -    U  X  X  X  X  X  X  X  X

The resolution function receives a vector of driving values of type std_ulogic. The return value is also of type std_ulogic!





    FUNCTION resolved (s: std_ulogic_vector)
        RETURN std_ulogic IS
        VARIABLE result: std_ulogic := 'Z';
    BEGIN
        IF (s'LENGTH = 1) THEN
            RETURN s(s'LOW);          -- single driver: nothing to resolve
        ELSE
            FOR i IN s'RANGE LOOP     -- fold all drivers through the table
                result := resolution_table(result, s(i));
            END LOOP;
        END IF;
        RETURN result;
    END resolved;





Logic Functions with std logic

Since signals can now acquire a multiplicity of values, we need to redefine logic functions.

This is done by overloading logic functions with new definitions when their arguments are of type std_ulogic or std_logic.

What happens when we put an inverter on a std_ulogic signal?

This is defined by the ‘NOT’ logic function:

    NOT
    input    U  X  0  1  Z  W  L  H  -
    output   U  X  1  0  X  X  1  0  X





Logic Truth TABLES

Truth tables of 2-input logic functions will now be 9x9 matrices!

    AND
         U  X  0  1  Z  W  L  H  -
    U    U  U  0  U  U  U  0  U  U
    X    U  X  0  X  X  X  0  X  X
    0    0  0  0  0  0  0  0  0  0
    1    U  X  0  1  X  X  0  1  X
    Z    U  X  0  X  X  X  0  X  X
    W    U  X  0  X  X  X  0  X  X
    L    0  0  0  0  0  0  0  0  0
    H    U  X  0  1  X  X  0  1  X
    -    U  X  0  X  X  X  0  X  X




Conversion Functions

The following type conversion functions are included in package 1164:

To_bit (from std_ulogic) and To_StdULogic (from bit)

To_bitvector (from std_logic_vector and std_ulogic_vector)

To_StdULogicVector (from bit_vector) and To_StdLogicVector (from bit_vector)

To_StdLogicVector (from std_ulogic_vector) and To_StdULogicVector (from std_logic_vector)

There are similar functions for inter-conversions between X01, X01Z etc. and std_logic and std_ulogic.




Edge Detection Functions

The IEEE library package 1164 includes edge detection functions for std_ulogic types. These are defined as:

    FUNCTION rising_edge (SIGNAL s: std_ulogic) RETURN Boolean;

The rising edge is detected when there is a transition from 0 or L to 1 or H.

    FUNCTION falling_edge (SIGNAL s: std_ulogic) RETURN Boolean;

The falling edge is detected when there is a transition from 1 or H to 0 or L.
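A typical use of rising_edge is a D flip-flop. This is a minimal sketch; the entity dff and its port names are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dff is
  port (clk, d : in std_ulogic; q : out std_ulogic);
end entity dff;

architecture behave of dff is
begin
  process (clk)
  begin
    -- q is updated only on a 0-to-1 (or L-to-H) transition of clk;
    -- at all other times it holds its previous value.
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture behave;
```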



Part V

An Example Design

9 A magnitude comparator
    First Level Description
    Constructing the Byte Comparator
    Structural Description of Bit Comparator

A Magnitude Comparator

The example used in this section has been described in the book “VHDL: Analysis and Modeling of Digital Systems” by Zainalabedin Navabi (McGraw Hill).

However, the treatment in this tutorial is different.

We illustrate top-down design using this example.




A magnitude comparator

We want to design a circuit to compare the magnitude of two binary numbers.

We shall illustrate the design by a comparator for byte-wide numbers.

However, the design should be stackable, so that wider numbers can be compared.

The inputs to the system are the two numbers and the stacking inputs gt_in, eq_in and lt_in.

The outputs are the result of comparison: gt_out, eq_out and lt_out.

The stacking inputs and outputs use “one hot” coding: exactly one of the conditions gt, eq or lt is TRUE at a given time.




First level description

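    Library IEEE;
    USE IEEE.std_logic_1164.ALL;

    Package Byte_Types is     -- a type needs a package; the name Byte_Types is assumed
        TYPE Byte IS Array (7 DownTo 0) OF std_ulogic;
    End Package Byte_Types;

    Library IEEE;
    USE IEEE.std_logic_1164.ALL;
    USE WORK.Byte_Types.ALL;

    Entity Byte_Compar is
        Port (a, b: IN Byte;
              gt_in, eq_in, lt_in: IN std_ulogic;
              gt_out, eq_out, lt_out: OUT std_ulogic);
    End Entity Byte_Compar;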




Architecture of Byte Comparator

    Architecture first Of Byte_Compar is
    BEGIN
        P1: PROCESS (a, b, gt_in, eq_in, lt_in)
            Variable val1, val2: Integer := 0;   -- variables must be local to the process
        BEGIN
            val1 := ByteVal(a);
            val2 := ByteVal(b);
            IF (val1 > val2) THEN
                gt_out <= '1'; eq_out <= '0'; lt_out <= '0';
            ELSIF (val1 < val2) THEN
                gt_out <= '0'; eq_out <= '0'; lt_out <= '1';
            ELSE
                -- Equal bytes: pass the stacking inputs through.
                gt_out <= gt_in; eq_out <= eq_in; lt_out <= lt_in;
            END IF;
        END PROCESS P1;
    END Architecture first;




Decomposition of Byte Comparator

The byte comparator is difficult to design directly. We can break up the design into bit comparators with cascading inputs gt_in, eq_in and lt_in, and cascading outputs gt_out, eq_out and lt_out.

[Figure: chain of eight BitPart blocks comparing the bit pairs A0/B0 to A7/B7, with the >, = and < signals cascading from one block to the next.]

Notice that the most significant bit is compared closest to the output.




Composing the Byte comparator

    Architecture compose of Byte_Compar IS
        COMPONENT BitPart IS
            Port (a, b: IN std_ulogic;
                  gt_in, eq_in, lt_in: IN std_ulogic;
                  gt_out, eq_out, lt_out: OUT std_ulogic);
        END COMPONENT BitPart;
        FOR ALL: BitPart
            USE ENTITY Bit_Compar(behave);
        TYPE Connect IS ARRAY (1 TO 3, 0 TO 6) OF std_ulogic;
        Signal Cascade: Connect;
    BEGIN
        Bits: FOR I in 0 TO 7 GENERATE
            -- Least significant bit: takes the stacking inputs of the byte.
            First: IF I = 0 GENERATE
                B0: COMPONENT BitPart
                    PORT MAP (a(I), b(I), gt_in, eq_in, lt_in,
                              Cascade(1, I), Cascade(2, I), Cascade(3, I));
            END GENERATE;

            -- Most significant bit: drives the outputs of the byte.
            Last: IF I = 7 GENERATE
                B7: COMPONENT BitPart
                    PORT MAP (a(I), b(I),
                              Cascade(1, I-1), Cascade(2, I-1), Cascade(3, I-1),
                              gt_out, eq_out, lt_out);
            END GENERATE;

            -- Bits 1 to 6: connected on both sides through Cascade.
            Mid: IF (I > 0) AND (I < 7) GENERATE
                Bm: COMPONENT BitPart
                    PORT MAP (a(I), b(I),
                              Cascade(1, I-1), Cascade(2, I-1), Cascade(3, I-1),
                              Cascade(1, I), Cascade(2, I), Cascade(3, I));
            END GENERATE;
        END GENERATE;
    END Architecture compose;




The bit comparator

Once we have decomposed the byte comparator as above, we need to design the bit comparator.

The bit comparator receives a pair of bits to compare.

If A > B, i.e. A=1 and B=0, it makes the output gt_out TRUE and makes the other outputs FALSE.

If A < B, i.e. A=0 and B=1, it makes the output lt_out TRUE and makes the other outputs FALSE.

If A and B are equal, it copies its cascading inputs (gt_in, eq_in, lt_in) to its outputs (gt_out, eq_out, lt_out).




The bit comparator

    Library IEEE;
    USE IEEE.std_logic_1164.ALL;

    Entity Bit_Compar is
        Port (a, b: IN std_ulogic;
              gt_in, eq_in, lt_in: IN std_ulogic;
              gt_out, eq_out, lt_out: OUT std_ulogic);
    End Entity Bit_Compar;




Behavioural Architecture of Bit Comparator

    Architecture behave Of Bit_Compar is
    BEGIN
        P1: PROCESS (a, b, gt_in, eq_in, lt_in)
        BEGIN
            IF (a = '1' AND b = '0') THEN
                gt_out <= '1'; eq_out <= '0'; lt_out <= '0';
            ELSIF (a = '0' AND b = '1') THEN
                gt_out <= '0'; eq_out <= '0'; lt_out <= '1';
            ELSE
                gt_out <= gt_in; eq_out <= eq_in; lt_out <= lt_in;
            END IF;
        END PROCESS P1;
    END Architecture behave;




Structural Description of Bit Comparator

We can write Karnaugh maps for the three outputs easily:

    gt_out
        ab →          00   01   11   10
        gt_in = 0                    √
        gt_in = 1     √         √    √

    lt_out
        ab →          00   01   11   10
        lt_in = 0          √
        lt_in = 1     √    √    √

    eq_out
        ab →          00   01   11   10
        eq_in = 0
        eq_in = 1     √         √

This gives:

$gt\_out = a \cdot \overline{b} + gt\_in \cdot (a + \overline{b})$

$lt\_out = \overline{a} \cdot b + lt\_in \cdot (\overline{a} + b)$

$eq\_out = eq\_in \cdot (a \cdot b + \overline{a} \cdot \overline{b})$




Final Design of bit comparator

[Figure: gate-level schematic of the bit comparator, with inputs a, b, gt_in, eq_in, lt_in and outputs gt_out, eq_out, lt_out.]

This design can be described structurally in terms of basic gates.

The design uses only inverting gates. It can be implemented directly on a chip.




Structural Description of Bit Comparator

    Architecture struct Of Bit_Compar is
        Component Inv IS
            PORT (In1: IN std_ulogic; op1: OUT std_ulogic);
        END COMPONENT Inv;
        FOR ALL: Inv USE ENTITY Inverter(behav);

        Component Nand2 IS
            PORT (In1, In2: IN std_ulogic; op1: OUT std_ulogic);
        END COMPONENT Nand2;
        FOR ALL: Nand2 USE ENTITY Nand2(behav);

        Component Nand3 IS
            PORT (In1, In2, In3: IN std_ulogic; op1: OUT std_ulogic);
        END COMPONENT Nand3;
        FOR ALL: Nand3 USE ENTITY Nand3(behav);




Structural Architecture of Bit Comparator

        SIGNAL Abar, Bbar, AplusBbar, BplusAbar: std_ulogic;
        SIGNAL s1, s2, Eqbar: std_ulogic;
    BEGIN
        Inv1: Inv PORT MAP (A, Abar);
        Inv2: Inv PORT MAP (B, Bbar);
        N1: Nand2 PORT MAP (A, Bbar, BplusAbar);
        N2: Nand2 PORT MAP (B, Abar, AplusBbar);
        N3: Nand2 PORT MAP (lt_in, BplusAbar, s1);
        N4: Nand2 PORT MAP (gt_in, AplusBbar, s2);
        N5: Nand2 PORT MAP (s1, AplusBbar, lt_out);
        N6: Nand2 PORT MAP (s2, BplusAbar, gt_out);
        N7: Nand3 PORT MAP (AplusBbar, BplusAbar, Eq_in, Eqbar);
        Inv3: Inv PORT MAP (Eqbar, Eq_out);
    END ARCHITECTURE struct;




Inline configuration

The configuration of a component can be declared “inline” in an architecture.

    Architecture compose of Byte_Compar IS
        COMPONENT BitPart IS
            Port (a, b: IN std_ulogic;
                  gt_in, eq_in, lt_in: IN std_ulogic;
                  gt_out, eq_out, lt_out: OUT std_ulogic);
        END COMPONENT BitPart;
        FOR ALL: BitPart
            USE ENTITY Bit_Compar(behave);
        TYPE Connect IS ARRAY (1 TO 3, 0 TO 6) OF std_ulogic;
        Signal Cascade: Connect;

All components of type BitPart have been configured to use the entity Bit_Compar with architecture behave.




Standalone configuration

In the example given, all components of type BitPart were configured to use the entity Bit_Compar with architecture behave.

This was specified “inline” in the architecture declarative part.

We can instead write a separate configuration description outside the architecture, using the configuration construct.




Standalone configuration

The syntax of a standalone configuration is:

CONFIGURATION configname OF entityname IS
  FOR architecture_name
    FOR instance_name | OTHERS | ALL : component_name
      USE ENTITY sub_entity_name(sub_architecture_name);
      . . .
    END FOR;
  END FOR;
END [CONFIGURATION] [configname];




Hierarchical configuration

The architecture being configured may contain components which are bound to architectures containing other components.

This requires hierarchical configuration.

Instead of binding component instances to entity-architecture pairs directly, we bind these to other configurations.

These other configurations associate the component with an entity-architecture pair and configure the lower level components.




Hierarchical configuration

The syntax used for hierarchical configuration is:

CONFIGURATION configname OF entityname IS
  FOR architecture_name
    FOR instance_name | OTHERS | ALL : component_name
      USE CONFIGURATION subconfig_name;
      . . .
    END FOR;
  END FOR;
END [CONFIGURATION] [configname];

Subconfig_name will associate the component with an entity-architecture pair and will configure lower level components in the hierarchy.
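
As an illustration of the two syntaxes, here is a minimal sketch of a three-level design. The entities Buf and Top, the component BufCell and the configuration names Buf_cfg and Top_cfg are hypothetical names chosen for this example; only the inverter interface follows the earlier slides. Buf_cfg is an ordinary standalone configuration, while Top_cfg binds its component to that configuration rather than to an entity-architecture pair.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Leaf cell: a behavioural inverter.
ENTITY Inverter IS
  PORT(In1: IN std_ulogic; op1: OUT std_ulogic);
END ENTITY Inverter;

ARCHITECTURE behav OF Inverter IS
BEGIN
  op1 <= NOT In1;
END ARCHITECTURE behav;

LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Middle level (hypothetical): two inverters in series, no inline binding.
ENTITY Buf IS
  PORT(a: IN std_ulogic; y: OUT std_ulogic);
END ENTITY Buf;

ARCHITECTURE struct OF Buf IS
  COMPONENT Inv IS
    PORT(In1: IN std_ulogic; op1: OUT std_ulogic);
  END COMPONENT Inv;
  SIGNAL mid: std_ulogic;
BEGIN
  Inv1: Inv PORT MAP(a, mid);
  Inv2: Inv PORT MAP(mid, y);
END ARCHITECTURE struct;

-- Lower-level configuration: binds Inv to an entity-architecture pair.
CONFIGURATION Buf_cfg OF Buf IS
  FOR struct
    FOR ALL: Inv
      USE ENTITY WORK.Inverter(behav);
    END FOR;
  END FOR;
END CONFIGURATION Buf_cfg;

LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Top level (hypothetical): one instance of the buffer cell.
ENTITY Top IS
  PORT(d: IN std_ulogic; q: OUT std_ulogic);
END ENTITY Top;

ARCHITECTURE struct OF Top IS
  COMPONENT BufCell IS
    PORT(a: IN std_ulogic; y: OUT std_ulogic);
  END COMPONENT BufCell;
BEGIN
  U1: BufCell PORT MAP(d, q);
END ARCHITECTURE struct;

-- Hierarchical configuration: binds BufCell to the configuration Buf_cfg,
-- which in turn selects Buf(struct) and configures its inverters.
CONFIGURATION Top_cfg OF Top IS
  FOR struct
    FOR ALL: BufCell
      USE CONFIGURATION WORK.Buf_cfg;
    END FOR;
  END FOR;
END CONFIGURATION Top_cfg;
```

Simulating Top_cfg (rather than Top itself) elaborates the whole hierarchy with these bindings. Swapping in a different inverter architecture then only requires editing Buf_cfg.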





Hierarchy in a single configuration

The hierarchy can be described through nested FORs in a single configuration description.

CONFIGURATION single OF Byte_Compar IS
  FOR compose -- architecture name
    FOR ALL: BitPart
      USE ENTITY WORK.Bit_Compar(struct);
      FOR struct -- architecture of Bit_Compar
        FOR ALL: Nand2 USE ENTITY . . .


Part VI

File I-O in VHDL

10 Files in VHDL
   File Declarations
   Opening and Closing Files
   Reading and writing
   Example of File usage

11 The Textio Package


Files in VHDL

To VHDL, a file is a collection of information of a type that is known to it.

File I-O presents a special problem, because conventions for naming files and directories are different for different Operating Systems.

We would like to insulate hardware descriptions from this variation.

We do it by making a distinction between file names used by VHDL and the operating system dependent filename which is associated with it.





FILE Types

In VHDL, in order to use files, we use a two step procedure.

1 We declare a FILE TYPE first. This associates a FILE TYPE with the kind of objects that files of this type will contain.

2 We can then declare files of this FILE TYPE. The file declaration associates a VHDL filename with a FILE TYPE and, optionally, with a physical file name and file mode (read, write or append).





Examples

TYPE datafile IS FILE OF CHARACTER;

This specifies that any file which has the type datafile will contain characters; each read will return a character, while each write will accept a character to be written to the file.

Once a file type has been declared, we may declare one or more files of this type. For example,

FILE vfile1: datafile;
FILE vfile2: datafile IS "indata.dat";
FILE vfile3: datafile OPEN WRITE_MODE IS "output.dat";





FILE vfile1: datafile;
This form merely associates the VHDL name vfile1 with the file TYPE datafile, which specifies that it contains characters.

FILE vfile2: datafile IS "indata.dat";
This form also associates the VHDL filename vfile2 with the physical filename indata.dat. Since no mode is given, the file is opened in the default read mode.

FILE vfile3: datafile OPEN WRITE_MODE IS "output.dat";
This form associates the VHDL filename vfile3 with the physical filename output.dat and also opens it in write mode.





Opening and Closing Files

If a file has not been opened during its declaration, it can be opened later by specific statements.

Once a file type has been declared as:
  TYPE FileType IS FILE OF DataType;
it implicitly defines various procedures and functions.

PROCEDURE FILE_OPEN(FILE f: FileType;
                    Phys_name: IN string;
                    open_kind: IN FILE_OPEN_KIND := READ_MODE);

PROCEDURE FILE_CLOSE(FILE f: FileType);





Reading from and Writing to Files

Once file types and files have been declared, various subprograms become available.

PROCEDURE READ(FILE f: FileType; value: OUT DataType);
PROCEDURE WRITE(FILE f: FileType; value: IN DataType);
FUNCTION ENDFILE(FILE f: FileType) RETURN Boolean;
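
The following is a minimal sketch of how these implicitly defined subprograms fit together, assuming a simulator that can create files in its working directory. The entity name File_Demo and the physical file name demo.dat are hypothetical, chosen only for this example. The process writes a few characters to a file, closes it, then re-opens it and reads until ENDFILE returns true.

```vhdl
ENTITY File_Demo IS
END ENTITY File_Demo;

ARCHITECTURE sim OF File_Demo IS
  TYPE datafile IS FILE OF CHARACTER;
BEGIN
  Demo: PROCESS IS
    FILE vfile: datafile;            -- declared without opening it
    CONSTANT msg: STRING := "VHDL";
    VARIABLE ch: CHARACTER;
    VARIABLE count: NATURAL := 0;
  BEGIN
    -- Write the characters of msg to the (hypothetical) file demo.dat.
    FILE_OPEN(vfile, "demo.dat", WRITE_MODE);
    FOR i IN msg'RANGE LOOP
      WRITE(vfile, msg(i));
    END LOOP;
    FILE_CLOSE(vfile);

    -- Re-open the same file for reading and count the characters.
    FILE_OPEN(vfile, "demo.dat", READ_MODE);
    WHILE NOT ENDFILE(vfile) LOOP
      READ(vfile, ch);
      count := count + 1;
    END LOOP;
    FILE_CLOSE(vfile);

    -- Every character written should have been read back.
    ASSERT count = msg'LENGTH
      REPORT "unexpected number of characters read" SEVERITY ERROR;
    WAIT;
  END PROCESS Demo;
END ARCHITECTURE sim;
```

The on-disk representation of a FILE OF CHARACTER is left to the simulator, so a file written this way is only guaranteed to be readable by the same tool.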






Unconstrained Data Types

It is possible to declare a File Type to contain unconstrained arrays as data types. For example:

TYPE VectorFile IS FILE OF std_ulogic_vector;

Now how do we know the amount of data which will be returned upon each read request? For this, there is an additional syntax for the read procedure:

PROCEDURE READ(FILE f: FileType; value: OUT DataType;
               Length: OUT natural);

When we use this form, we supply an array large enough to accommodate the data in the worst case, and a variable which will receive the length of the vector actually read.
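
A minimal sketch of the three-argument READ follows, under the assumption that the file is written and read back by the same simulator. The entity name VecFile_Demo, the file name vectors.dat and the worst-case size of 16 elements are hypothetical choices for this example. Two vectors of different lengths are written, then each is read into the same oversized buffer, with len receiving the number of elements actually read.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

ENTITY VecFile_Demo IS
END ENTITY VecFile_Demo;

ARCHITECTURE sim OF VecFile_Demo IS
  TYPE VectorFile IS FILE OF std_ulogic_vector;
BEGIN
  Demo: PROCESS IS
    FILE vf: VectorFile;
    VARIABLE buf: std_ulogic_vector(0 TO 15);  -- assumed worst-case size
    VARIABLE len: NATURAL;
  BEGIN
    -- Write two vectors of different lengths.
    FILE_OPEN(vf, "vectors.dat", WRITE_MODE);
    WRITE(vf, std_ulogic_vector'("1010"));
    WRITE(vf, std_ulogic_vector'("11110000"));
    FILE_CLOSE(vf);

    -- Read them back; len tells us how much of buf is valid each time.
    FILE_OPEN(vf, "vectors.dat", READ_MODE);
    WHILE NOT ENDFILE(vf) LOOP
      READ(vf, buf, len);
      REPORT "read a vector of length " & INTEGER'IMAGE(len);
    END LOOP;
    FILE_CLOSE(vf);
    WAIT;
  END PROCESS Demo;
END ARCHITECTURE sim;
```

Only the first len elements of buf carry data after each call; the rest of the buffer should be ignored.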






Example of File usage

LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

ENTITY ROM_Block IS
  GENERIC(size: NATURAL; content_file: STRING);
  PORT(Chip_sel: IN std_logic;
       rdbar: IN std_logic;
       Addr: IN std_logic_vector;
       Data: OUT std_logic_vector);
END ENTITY ROM_Block;





ROM Initialization

ARCHITECTURE From_File OF ROM_Block IS
  SUBTYPE Word IS
    std_logic_vector(Data'Length-1 DOWNTO 0);
  TYPE Mem_Array IS
    ARRAY(NATURAL RANGE 0 TO 2**size - 1) OF Word;
  -- declared SHARED (VHDL-93) because they lie outside any process
  SHARED VARIABLE Mem_Contents: Mem_Array;
  SHARED VARIABLE Index: Natural;
  . . .
  TYPE RomData_File IS FILE OF Word;
  FILE Rom_Contents: RomData_File
    OPEN Read_Mode IS content_file;
  . . .





ROM Initialization

BEGIN
  Filling: PROCESS IS
  BEGIN
    Index := 0;
    WHILE NOT EndFile(Rom_Contents) LOOP
      READ(Rom_Contents, Mem_Contents(Index));
      Index := Index + 1;
    END LOOP;
    WAIT;
  END PROCESS Filling;
  . . . -- process to handle rdbar
END ARCHITECTURE From_File;




The Textio Package

This package defines various TYPEs and provides many procedures for handling text.

TYPE TEXT IS FILE OF STRING;
TYPE LINE IS ACCESS STRING;
FILE INPUT: TEXT OPEN READ_MODE IS "STD_INPUT";
FILE OUTPUT: TEXT OPEN WRITE_MODE IS "STD_OUTPUT";
PROCEDURE READLINE(FILE f: TEXT; L: INOUT LINE);




Reading and Writing Text

Text reading and writing is a two step procedure. For writing, you first compose a line and then write it to a file. For reading, you read a line and then extract values from it.

Several overloaded procedures, all carrying the names READ or WRITE, are provided for this. For example:

PROCEDURE READ (L: INOUT LINE; value: OUT BIT);
PROCEDURE READ (L: INOUT LINE; value: OUT BIT_VECTOR);
PROCEDURE READ (L: INOUT LINE; value: OUT Integer);
etc.

Similarly, there are many WRITE procedures.
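
The two step procedure can be sketched as follows, using only subprograms from STD.TEXTIO. The entity name Text_Demo and the file name demo.txt are hypothetical, chosen for this example. Literals such as '1' are qualified with their type because they would otherwise match more than one overloaded WRITE.

```vhdl
USE STD.TEXTIO.ALL;

ENTITY Text_Demo IS
END ENTITY Text_Demo;

ARCHITECTURE sim OF Text_Demo IS
BEGIN
  Demo: PROCESS IS
    FILE tf: TEXT;
    VARIABLE L: LINE;
    VARIABLE n: INTEGER;
    VARIABLE b: BIT;
  BEGIN
    -- Writing: compose a line, then write it to the file.
    FILE_OPEN(tf, "demo.txt", WRITE_MODE);
    WRITE(L, 42);
    WRITE(L, CHARACTER'(' '));   -- separator between the two values
    WRITE(L, BIT'('1'));
    WRITELINE(tf, L);
    FILE_CLOSE(tf);

    -- Reading: read a line, then extract values from it.
    FILE_OPEN(tf, "demo.txt", READ_MODE);
    READLINE(tf, L);
    READ(L, n);
    READ(L, b);
    FILE_CLOSE(tf);

    -- The values extracted should match those composed above.
    ASSERT n = 42 AND b = '1'
      REPORT "values read back differ from those written" SEVERITY ERROR;
    WAIT;
  END PROCESS Demo;
END ARCHITECTURE sim;
```

Unlike files of other types, a TEXT file is human readable, so demo.txt can be inspected or prepared with an ordinary text editor.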
