Contactless Test of IC Pads, Pins, and TSVs via Standard Boundary Scan
Stephen Sunter and Aubin Roy
Mentor Graphics
MANY MIXED-SIGNAL AND flash memory circuits
have inherently long time constants which impose
fundamental limits on how much the test time for
integrated circuits (ICs) can be reduced when test-
ing one IC at a time. Industry generally uses multisite
testing to reduce the effective test time for these ICs.
To facilitate multisite testing without greatly increas-
ing the number of automatic test equipment (ATE)
channels needed, many of each IC’s pins are not
contacted by the tester. By making the uncontacted
pins bidirectional, these pins are tested via bound-
ary scan for basic shorts and opens.
In our ITC’10 and ITC’11 papers [1], [2], a built-in
self-test for input/output pins (I/O BIST) was de-
scribed that tests various delays of I/O pins via
standard 1149.1 boundary scan [3]. No changes are
needed in the boundary scan cells if they are similar
to the BC_7 cells specified in the standard. And no
changes are needed in the I/O cells. As we will describe
later, the delay measurements are made without using
delay circuitry (which would be process sensitive), by
using a phase-locked loop (PLL) to generate a slightly
asynchronous sampling frequency and applying it to the
capture latches of the boundary scan cells.
Testing delays of low-speed I/Os (< 50 MHz) is
generally considered unnecessary, but higher speed
I/Os, like those used for double data rate (DDR)
memory access, have important timing re-
quirements and more circuitry susceptible to pro-
cess variations. Adding delay measurement circuitry
within these I/O cells would risk altering their
silicon-characterized ac, dc, or electrostatic dis-
charge (ESD) characteristics. Fortunately for our
BIST approach, many DDR interfaces already have
boundary scan.
The sampling instant for DDR inputs is usually
controlled by a delay-locked loop (DLL) that gener-
ates a clock edge midway between received data
edges. Jitter on DDR pins is sometimes tested by a
BIST that drives the I/O pads with pseudorandom
data while capturing the pad data with increasingly
uncentered DLL clock edges and detecting any
resulting differences in the captured data. But the
timing of the DLL is untested, uncalibrated, and relatively coarse.

Editor’s notes: The performance of an IC’s inputs and outputs (I/Os) is always specified in IC datasheets and is the performance most likely to be affected by assembly steps. As the speed and number of I/Os increase beyond low-cost ATE capabilities, and I/O pads become smaller (less than 10 microns wide for 3D assemblies), built-in self-test (BIST) of this performance becomes more attractive. This article describes a BIST that exploits relatively low-speed IEEE 1149.1 boundary scan to access the I/Os and test performance with as low as 5 ps calibrated resolution, equivalent to a bandwidth approaching 100 GHz.
Shawn Blanton, Carnegie Mellon University

0740-7475/12/$31.00 © 2012 IEEE. September/October 2012. Copublished by the IEEE CEDA, IEEE CASS, IEEE SSCS, and TTTC.
Digital Object Identifier 10.1109/MDT.2012.2206363
Date of publication: 28 June 2012; date of current version: 23 October 2012.
For many applications, matching between I/O
delays is more important than whether each pin’s
performance meets absolute specifications. In DDR
interfaces, each I/O within a group of pins is
designed to have performance identical to the other
I/Os. Nevertheless, process variations create interpin
differences in duty cycle, clock-to-Q delay, and slew
rate, which cause timing skew. All published BIST
techniques that we are aware of, including [4]–[7],
do not measure pin-to-pin performance matching,
though [5] provides pass/fail testing.
Many time-to-digital converters (TDCs) are de-
scribed in the literature, most of them based on
Vernier delay-line or oscillator techniques, or
combining them [8], aimed at measuring one-shot
events, such as phase delay of each clock cycle of a
phase-locked loop (PLL). None of these techniques is
suitable for measuring I/O delays without connect-
ing additional circuitry to each I/O.
The development of three-dimensional (3D) de-
vices containing stacked ICs, or an intermediate
form designated 2.5D and containing ICs on a
silicon substrate ‘‘interposer’’, has created an in-
teresting test challenge: they are connected by
through-silicon vias (TSVs) instead of printed wires.
The ICs may contain thousands of TSVs which,
compared to conventional bond pad I/Os, are more
numerous, much smaller, and driven by weaker I/O
circuitry, so probing these I/Os risks damaging or
overloading them and requires many more ATE
channels. Furthermore, the quality of connections
between the ICs is presently a significant yield
limiter for the assemblies, so an on-chip solution is
needed for both prebond and postbond testing of
TSV quality.
Measuring I/O delays via boundary scan
Our technique is based on the ‘‘I/O wrap delay’’
measurement technique first described in [4] which
used a precision-delayed strobe pulse generated by
ATE or a proposed on-chip delay line; the strobe
pulse was delivered to a capture latch with increas-
ing delays until the captured result changed.
In our case, the clock for the Capture latches in
the boundary scan register (BSR), as shown in
Figure 1, is derived from an asynchronous but co-
herent frequency, relative to the frequency used for
the Update latches and core logic of the IC. The ratio
between the frequencies of the ‘‘Async’’ clock and
the ‘‘Ref’’ clock is made equal to (N-1)/N, where N is
an integer typically between 10 and 4000. The test
access port (TAP) clock, TCK, is not used for this
sampling because typically its frequency is not
constant and may have considerable jitter.
The technique’s measurement resolution TRES is
equal to the Ref clock’s period divided by N − 1.
Figure 2 shows waveforms for an example 3-bit BSR
(LBSR is its length, in bits), when N = 6.
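The relationship between N, the two clock frequencies, and the resolution can be checked numerically. The following is only an illustrative sketch, not part of the BIST itself; the function name `async_clock` is hypothetical, and exact rational arithmetic is used so that the identity TRES = TREF / (N − 1) can be asserted without rounding error.

```python
from fractions import Fraction


def async_clock(f_ref_hz, n):
    """Return (f_async_hz, t_res_s) for an Async/Ref frequency ratio of (N-1)/N."""
    f_ref = Fraction(f_ref_hz)
    f_async = f_ref * (n - 1) / n      # Async runs slightly slower than Ref
    t_ref = 1 / f_ref
    t_async = 1 / f_async
    t_res = t_async - t_ref            # the sampling instant slips by this much each cycle
    assert t_res == t_ref / (n - 1)    # T_RES = T_REF / (N - 1)
    return f_async, t_res


# Example: the N = 6 case, with an assumed 25 MHz (40 ns) Ref clock
f_async, t_res = async_clock(25_000_000, 6)
print(float(f_async), float(t_res))
```

With these assumed values the sampling instant advances in 8 ns steps (40 ns divided by 5); larger N gives proportionally finer steps.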
The BIST and BSR are initialized and loaded via
the 1149.1 TAP at TCK clock rates, and a delay path
in each I/O cell, as shown in Figure 1, is selected via
the mode_2 control. The BIST’s Ref and Async clock
inputs are then enabled synchronously. At least one
of these clocks must be generated by a PLL, which
may be on-chip or off-chip.
After initialization, which occurs when the Ref
and Async clock edges align, the BSR is clocked
continuously, and the first Async clock edge chosen
for a Capture occurs in advance of the Update edge
by up to one half clock period; this ensures that up
to one half clock period of skew in clockDR relative
to updateDR can be tolerated. The captured data is
then shifted to the BIST circuitry (while data for
the next update is shifted in).
Figure 1. Bidirectional I/O, indicating pad, and reference delay paths.
The BIST module has multiple ‘‘measurement slices’’
containing counters and a register, each slice’s register
loaded with a different I/O address (bit position in the BSR). As
boundary scan data is shifted out, a master counter
is incremented and when its count equals that of
one of the slice address registers, the measurement
counter in that slice is incremented if an edge has
been detected earlier in sampled data at that ad-
dress. Simultaneously, the data shifted into the BSR
for the associated I/O is its previous output logic
value inverted. Boundary scan data for all unin-
volved I/Os is simply recirculated so that they are
stable.
Uniquely, this I/O BIST can be placed into a design
after functional design, most layout, and even
timing closure are complete. As shown in Figure 3, a
multiplexer in the BSR design enables the I/O BIST to
substitute new BSR clocks, modes, and data, whose
timing needs to be verified only within the BIST
module.
Figure 3. BIST connection to 1149.1 TAP, BSR, and system PLL.
Figure 2. Waveforms for N = 6 and LBSR = 3, when sampling midrange.
The BIST measures time intervals or delay dif-
ferences, from one edge to another for the same
signal, or from a reference edge to a resulting signal
edge for one path versus another (e.g., Reference
path versus Pad path in Figure 1), or from a ref-
erence edge to a resulting signal edge for one con-
dition versus another (e.g., high versus low drive).
Measurements are performed by capturing all
output values N times, uniformly spanning the
measurement range (one clock period) for one
path or condition, and again for a second path or
condition. The measurement slices count from
when their pin’s captured data rises/falls for one
path/condition to the time it rises/falls for a second
path/condition.
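The two-pass counting described above can be pictured with a small idealized model. This is a sketch under stated assumptions, not the BIST hardware: capture k is assumed to sample the pin exactly k resolution steps after the stimulus edge, jitter and clock skew are ignored, and the function names are hypothetical. Times are integers in picoseconds to avoid floating-point comparisons.

```python
def first_edge_index(delay_ps, t_res_ps, n):
    """Index of the first capture that sees the pin after it has switched.

    Idealized model: capture k samples the pin k * t_res_ps after the
    stimulus edge, with no jitter and no clock skew.
    """
    for k in range(n):
        if k * t_res_ps >= delay_ps:
            return k
    return None  # edge never seen: the delay is out of range


def delay_difference_counts(delay_a_ps, delay_b_ps, t_res_ps, n):
    """Counts between the edge seen for path A and the edge seen for path B."""
    k_a = first_edge_index(delay_a_ps, t_res_ps, n)
    k_b = first_edge_index(delay_b_ps, t_res_ps, n)
    if k_a is None or k_b is None:
        raise ValueError("edge outside the measurement range")
    return k_b - k_a


# Hypothetical delays: 3250 ps (reference path) and 4010 ps (pad path)
counts = delay_difference_counts(3250, 4010, t_res_ps=100, n=400)
print(counts, "counts =", counts * 100, "ps")
```

In this example a true difference of 760 ps is reported as 8 counts (800 ps), which shows the quantization inherent in counting at the measurement resolution.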
For most measurements, the time to test each
group of I/Os is calculated as
$$T_{TEST} \approx 4 \times \mathrm{Roundup}\!\left(\frac{L_{BSR}}{N}\right) \times N^{2}\, T_{REF} \times R \qquad (1)$$
where TREF is the Ref clock period, and R is the
measurement range in clock periods. For a 40-ns
period, 100-ps resolution (N = 400), LBSR < 400, and R = 1, test time is 26 ms per group of I/Os; for
LBSR = 400–800, it is 52 ms.
The total test time depends on the total number
of pins whose delays are to be tested. For the case
above where TTEST = 52 ms, if the number of measurement
slices was 16 and the total number of pins
to be measured was ten times that number, then
total test time would be 10 × 52 = 520 ms.
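The test-time arithmetic above can be reproduced with a few lines of code. This is a minimal sketch of equation (1) and of the group count; the function names are hypothetical, and rounding the number of groups up when the pin count is not a multiple of the slice count is an assumption of the sketch rather than something stated in the text.

```python
import math


def group_test_time(l_bsr, n, t_ref_s, r=1):
    """Equation (1): approximate time to test one group of I/Os, in seconds."""
    return 4 * math.ceil(l_bsr / n) * n**2 * t_ref_s * r


def total_test_time(n_pins, n_slices, t_group_s):
    """Groups of up to n_slices pins are measured one after another."""
    return math.ceil(n_pins / n_slices) * t_group_s


short_bsr = group_test_time(l_bsr=399, n=400, t_ref_s=40e-9)   # LBSR < 400
long_bsr = group_test_time(l_bsr=800, n=400, t_ref_s=40e-9)    # LBSR = 400-800
print(short_bsr * 1e3, "ms", long_bsr * 1e3, "ms")
print(total_test_time(n_pins=160, n_slices=16, t_group_s=0.052), "s")
```

Equation (1) evaluates to 25.6 ms and 51.2 ms for the two cases, which the text quotes as 26 ms and 52 ms.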
The number of nand2-equivalent gates in the
BIST circuitry is approximately 700 per measure-
ment slice, plus 2000 for common circuitry. The
number of slices can be optimized for test time and
silicon area, and hence, test cost.
Improving range, resolution, and test time
One way to reduce test time is to increase the
reference frequency, since that reduces both the
range and the time to shift out the results, hence
the N² term in (1). However, if the I/O path delay
(measured from the TAP controller updateDR edge
to the transition-capturing clockDR edge) is greater
than the usable range, its delay cannot be measured;
it will be ‘‘out of range.’’ For zero clock skew
(relative to nominal Update to Capture delay, which
is always an odd number of half clock-periods), the
usable measurement range is a half clock-period.
To increase the BSR shift frequency without re-
ducing the delay measurement range, we permit the
number of captures per path/condition to span multiple
clock periods, i.e., R = 2 in (1). This permits
the clock frequency to be doubled, which reduces
test time yet increases the usable range. For example,
at 25 MHz (40 ns period) and R = 1, the usable
range for less than ±5 ns clock skew is 15–25 ns.
Doubling the frequency to 50 MHz (while maintaining
±5 ns clock skew) and doubling the span to two
clock periods (R = 2) means the new range following
the update edge becomes 25–35 ns, yet test
time is reduced 50%.
The BIST’s delay measurements mentioned thus far
include rise delay and fall delay for one path or
condition versus another, but other differences can be
measured. Outside the measurement slices, we
provided two additional counters so that while each
slice’s rise and fall counters are being incremented,
the increment signals of all slices are ORed together
to increment an ‘‘average rise’’ counter and an
‘‘average fall’’ counter. Comparing pin delays to these
averages can be quite useful, as we shall see next.
Comparing measurements to test limits
In practice, the BIST’s operation always includes
two major steps: a measurement followed by shifting
out all measurement counter contents (for charac-
terization), or a measurement followed by compar-
ison to shifted-in test limits and shifting out pass/fail
bits (for production test).
If the delay counts are to be compared to test
limits, then after measuring a group of pin delays
simultaneously, the rise and fall delay counters of all
the measurement slices are concatenated into a
single circular shift register (the two average count-
ers are concatenated into a separate circular shift
register). Then, as lower or upper per-pin test limits
are shifted in via the TDI pin of the TAP, and the
concatenated counters are shifted, each limit is
compared to each count to produce a pass/fail bit.
Optionally, as all the counts are shifted serially
past a single subtract/compare unit, every adjacent
pair of counts is subtracted from each other and
each result compared to a respective test limit. For
example, each I/O’s fall count can be subtracted
from its rise count to produce a rise-fall mismatch.
Each rise (and fall) count can be subtracted from
the neighboring slice’s rise (and fall) count to produce
a pin-to-pin rise (and fall) mismatch count.
Note that each slice can be programmed with any
I/O’s address so delays of any two pins can be
compared. Also, each rise (and fall) count can be
subtracted from the rise (and fall) count of a
selected reference I/O. Lastly, each rise (and fall)
count can be subtracted from the average rise (and
fall) count.
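The subtractions listed above are simple enough to express in software for illustration. The sketch below is not the serial subtract/compare hardware; it operates on lists of per-slice counts, the function and key names are hypothetical, and the averages are taken as arithmetic means, which is an assumption of this sketch. The sign convention follows the wording of the text, where "A subtracted from B" is B − A.

```python
def mismatch_counts(rise, fall, ref_index=0):
    """Differences between per-slice delay counts (units of the resolution).

    Sign convention follows the text: "A subtracted from B" is B - A.
    """
    n = len(rise)
    avg_rise = sum(rise) / n   # arithmetic mean: an assumption of this sketch
    avg_fall = sum(fall) / n
    return {
        "rise_fall": [r - f for r, f in zip(rise, fall)],
        "pin_to_pin_rise": [rise[i + 1] - rise[i] for i in range(n - 1)],
        "pin_to_pin_fall": [fall[i + 1] - fall[i] for i in range(n - 1)],
        "vs_reference_rise": [rise[ref_index] - r for r in rise],
        "vs_reference_fall": [fall[ref_index] - f for f in fall],
        "vs_average_rise": [avg_rise - r for r in rise],
        "vs_average_fall": [avg_fall - f for f in fall],
    }


# Hypothetical counts for four measurement slices
result = mismatch_counts(rise=[40, 42, 41, 45], fall=[38, 43, 41, 44])
print(result["rise_fall"])
print(result["pin_to_pin_rise"])
```

Each list produced here would then be compared, element by element, against its respective test limit.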
This last capability can provide adaptive BIST at
the subdie level, as forecast in the ITRS 2009 road-
map [9], wherein test limits are adjusted based on
average measurements for structures within the
same die. Since our I/O BIST’s test limits can be re-
lative to the average delay, the absolute limits are in
effect adjusted dynamically as each IC is tested,
similar to dynamic part average testing (DPAT), except
that this BIST allows simpler tests since no
computations are needed in the ATE. The applied
patterns contain binary-coded test limits for the
difference between the individual measurements
and each I/O group average.
The advantage of adaptive testing at the subdie
level is that every measurement is treated equally,
including the first I/O group of the first die on the
first wafer in the first lot. Defects that affect a single
path are more detectable with a subdie approach
because each path can be compared to paths hav-
ing more-similar process, voltage, and temperature
than other dies. Note, however, that our approach
compares each measurement to a mean, assuming a
constant standard deviation (sigma), whereas DPAT
may use a dynamic sigma.
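A short sketch may help show why limits expressed relative to the group average adapt to each die. It is an illustration under assumptions, not the BIST implementation: the function name, the example counts, and the limit values are all hypothetical, and the average is an arithmetic mean.

```python
def adaptive_pass_fail(counts, lower, upper):
    """One pass/fail bit per pin, with limits relative to the group average."""
    avg = sum(counts) / len(counts)
    return [lower <= c - avg <= upper for c in counts]


# Two hypothetical dies: die_b is uniformly 10 counts slower than die_a
die_a = [40, 42, 41, 47, 40]
die_b = [c + 10 for c in die_a]

print(adaptive_pass_fail(die_a, lower=-3, upper=3))
print(adaptive_pass_fail(die_b, lower=-3, upper=3))
```

Both dies give the same pass/fail pattern because a uniform process shift moves the average with it; only the pin that deviates from its neighbors fails.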
Instead of adaptive testing, testing pin-to-pin va-
riations can be used to simply detect delay faults
that are within normal die-to-die process variations
but excessive considering that the I/O cells and
loading were designed identically. Differences in
output loading at the board level, or in 3D devices,
can cause detectable delay differences. This is
clearly important for high-speed parallel data bus-
ses, such as DDR interfaces, but it is also important
for diagnosing TSV quality. A resistive void in a TSV
can decrease the capacitive loading seen by the
TSV’s driver, and hence its I/O wrap delay. A leakage
path in the TSV can be detected via boundary scan
as described in [6].
Another small delay difference of interest occurs
when a pin’s transition affects that of neighboring
pins: crosstalk. Our BIST measures this by changing
the BSR contents between the first and second
phases of the I/O delay measurement, so that neigh-
boring pins toggle in the same direction during
phase 1 of the measurement, and opposite to each
other in phase 2 of the measurement. Test limits are
applied to the change in each pin’s measured delay,
for both the rising edge and the falling edge. Test
limits can also be applied to the mismatch between
I/Os for this delay change.
Measuring high-speed signal timing
The delay measurements described thus far use
the Update latch in boundary scan cells to provide
the stimulus. Each time the BSR is reloaded, the data
for the Update latch is inverted relative to its pre-
vious value. As described in [1], this allows many
timing properties of the I/O cells to be measured,
and with time resolution limited only by the fre-
quency resolution of the PLL that provides the Async
clock.
For I/Os that deliver signals at frequencies higher
than the boundary scan rate, for example DDR data
at 1.6 Gb/s, it will usually be more accurate to mea-
sure timing parameters while data is delivered to the
I/O from the core function logic instead of from the
boundary scan logic.
To perform this type of measurement, the core
logic must generate alternating data (1010. . .) at a
rate synchronous to the BIST’s Ref clock. This may
be accomplished by inserting a toggle flip-flop in
test mode, or combinational logic that applies a
constant 0 to even bits and 1 to odd bits of the pa-
rallel data that is converted to serial DDR data.
For example, if the BSR has a 20 ns period
(50 MHz) Ref clock and a 20.025 ns period Async
clock, then the DDR logic can deliver data at
1.6 Gb/s, and the BIST’s measurement resolution
will be 25 ps. The boundary scan logic samples the
high-speed data using selected edges of the Async
clock and shifts samples to the BIST at 50 MHz. The
BSR mode_1 signal (see Figure 1) selects the high-
speed signal from the core, and the mode_2 signal
selects whether it is sampled before or after the I/O
driver.
The measurement range can be reduced to
span two periods of the high-speed signal, which
ensures that it spans at least one rising edge fol-
lowed by a falling edge. This reduces R and test
time proportionally.
Duty cycle distortion (DCD)
Total DCD for an I/O signal is equal to mismatch
in output pad rise and fall times plus DCD in the data
signal from the core logic. To measure DCD in the
core data signal, the BIST samples the signal with the
BSR Capture latches (mode_1 = 0 and mode_2 = 1)
and counts the interval from each first detected
rising edge to the next detected falling edge, in units
of the measurement resolution (e.g., there should be
25 consecutive ‘‘1’’ samples for 1.6 Gb/s data, when
sampled with 25 ps resolution). This count is com-
pared directly to test limits.
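The interval count can be illustrated on a list of captured samples. This is a software sketch of the counting rule only, assuming clean samples with no jitter; the function name is hypothetical, and the BIST itself performs the equivalent counting in hardware as the samples are shifted out.

```python
def high_pulse_width(samples):
    """Count consecutive 1 samples from the first detected rising edge
    to the next detected falling edge. Returns None if no complete
    pulse is present in the samples."""
    seen_rise = False
    count = 0
    prev = samples[0]
    for s in samples[1:]:
        if not seen_rise:
            if prev == 0 and s == 1:   # first rising edge detected
                seen_rise = True
                count = 1
        elif s == 1:
            count += 1
        else:
            return count               # falling edge ends the interval
        prev = s
    return None


# 1.6 Gb/s alternating data has a 625 ps bit time: 25 samples at 25 ps resolution
ideal = [1] * 3 + [0] * 25 + [1] * 25 + [0] * 25
print(high_pulse_width(ideal))
```

A distorted signal, for example 27 high samples followed by 23 low samples, would return 27, and that count is what gets compared to the test limits.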
The detected rising edge will be affected by jitter
in the clocks and power rail noise, but the BIST
computes median edge positions [1] so that it can
tolerate peak-to-peak jitter as large as six times the
measurement resolution.
Slew rate
Output slew rate can be tested by measuring the
captured edge time for a transition through the I/O
pad at two different receiver threshold voltages. The
input threshold for DDR inputs is controlled by a
termination voltage (VT) and it could be offset by,
say, 50 mV between the first and second half of the
delay measurement; the voltage change divided by
the difference between the two delays is equal to the
slew rate. During a production test, this delay differ-
ence would be compared to test limits, e.g., a 50 mV
offset would produce 25 ps delay difference for a
2 V/ns slew rate.
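The slew-rate relationship is a single division, shown here as a small sketch with hypothetical function names. Millivolts divided by picoseconds is numerically equal to volts per nanosecond, so no unit conversion is needed.

```python
def slew_rate_v_per_ns(delta_v_mv, delta_t_ps):
    """Slew rate from a threshold offset (mV) and the resulting delay change (ps).

    mV / ps is numerically the same as V / ns.
    """
    return delta_v_mv / delta_t_ps


def expected_delay_difference_ps(delta_v_mv, slew_v_per_ns):
    """Delay difference a given threshold offset should produce."""
    return delta_v_mv / slew_v_per_ns


print(slew_rate_v_per_ns(50, 25))            # example from the text: 50 mV, 25 ps
print(expected_delay_difference_ps(50, 2))   # and the reverse calculation
```

In a production test the second form is the useful one: the expected delay difference, converted to counts of the measurement resolution, sets the test limits.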
Clock-to-output delay
Measuring the delay from a clock input pin to a
data output pin requires sampling at the two pins
with zero sampling skew, which is not generally
practical within an IC. But if signals for a small num-
ber of I/Os can be captured by Capture latches
clocked by a single clock buffer, then the delay from
an edge of one of the signals to an edge of another of
the signals may be measured with sufficient accuracy.
To measure delay between edges of different sig-
nals occurring nearly simultaneously on I/Os spaced
far apart, delay is measured relative to an arbitrary
phase reference; in our case it is when edges of
the Ref and Async clocks align. The delay mea-
sured for one I/O is subtracted from the delay mea-
sured for a reference I/O. As long as the reference
I/O is sampled similarly to the other I/Os (i.e., has a
similar boundary scan cell), regardless of whether
it is an input or output clock, the difference can be
measured.
Skew
Timing skew is the maximum difference in clock-
to-output delay measured between any two output
pins within a defined group of pins. In [5], timing
skew is measured by continually incrementing the
delay of the sampling strobe pulse and counting the
number of increments between the strobe time
when the first output fails and the strobe time when
the last output fails.
To measure skew with our BIST, each measure-
ment slice starts counting samples when a first edge
is detected for its respective pin, and does not stop
counting until the test is complete. The two counters
that measure average count are modified so that
each starts counting when the first delay count in-
crement pulse (for rise, and fall) is detected in any
measurement slice, and does not stop counting until
the test is complete; this produces a count for the
first edge on any pin (for rise, and fall) instead of an
average count. Then the difference between each
I/O’s delay count and the ‘‘first edge on any pin’’
count is compared to test limits as described earlier
for the average count.
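The skew calculation can be sketched as follows. This is an idealized model with hypothetical names: each counter is assumed to count every capture from its start event to the end of the test, so an earlier edge yields a larger count, and the per-pin skew is the ‘‘first edge on any pin’’ count minus that pin’s count.

```python
def run_counters(edge_indices, n_captures):
    """Model the counters: each counts captures from its edge to the end of test.

    edge_indices[i] is the capture index at which pin i's edge is first seen.
    Returns (per-pin counts, first-edge-on-any-pin count).
    """
    pin_counts = [n_captures - k for k in edge_indices]
    first_edge_count = n_captures - min(edge_indices)
    return pin_counts, first_edge_count


def skew_counts(pin_counts, first_edge_count):
    """Lag of each pin behind the earliest pin, in units of the resolution."""
    return [first_edge_count - c for c in pin_counts]


# Hypothetical edge positions for four pins over 400 captures
pins, first = run_counters([33, 35, 34, 41], n_captures=400)
lags = skew_counts(pins, first)
print(lags, "group skew =", max(lags), "counts")
```

The largest lag is the group’s timing skew in counts; multiplying by the measurement resolution gives it in time units.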
Hardware results
We used an Altera Stratix II FPGA to implement
the I/O BIST, and synthesized from RTL with no lay-
out guidance other than typical clock timing skew
constraints. The boundary scan cells were syn-
thesized in the logic fabric of the FPGA (the FPGA’s
hard-wired boundary scan could not be used be-
cause its clocks, data, and control signals are not
accessible). We measured delays for four I/O pins
that were adjacent to one another on the die and in
the package but not connected to the board. We
chose these pins to emulate contactless testing yet
maximize crosstalk. We configured the FPGA with
the four outputs programmed for 8 mA drive and
then for 4 mA drive. Detailed results are given in [2]
so only a summary is provided here.
The TAP interface was implemented with general-
purpose FPGA I/O pins, and connected to an off-
the-shelf USB-to-JTAG module connected to a
desktop PC. The TCK clock rate was limited by the
speed of the PC software, and was approximately
2 MHz. Two differential reference clocks were gener-
ated by two telecom-quality PLLs: one provided
200 MHz to the FPGA and 25 MHz to the other PLL,
which in turn provided 24.9875 MHz to the FPGA.
These PLLs deliver clocks with less than 0.5 ps rms
jitter, which is less than a typical on-chip PLL but
they allowed us to explore the limits of the BIST
technique’s timing resolution. Within the FPGA, the
200 MHz was supplied as a high-speed core ‘‘data’’
signal to the boundary scan cells, and it was divided
by 8 to provide a 25 MHz Ref clock to the BIST.
Hence, during measurements, the Ref clock period
was 40.00 ns, the Async clock period was 40.02 ns,
and the sampling resolution was the difference,
20 ps.
I/O wrap delay
Our hardware measurements showed that I/O
wrap propagation delays through 8 mA output driv-
ers, pad, and input receiver, decreased by an aver-
age of 719 ps, or about 10%, when the drive was
decreased to 4 mA (the I/Os are not connected to
board wiring).
The 8 mA drive delays also decreased by an
average of 24 ps when adjacent pins had the same
transitions, and by 12 ps when adjacent pins had
opposite transitions, whereas the 4 mA drive delays
were unaffected (1–6 ps average difference). The
delay dependence on adjacent pin transitions will
appear as crosstalk-induced jitter in function mode.
In experiments with other 8 mA I/O pins connected
to approximately 10 cm of adjacent board wiring, we
measured almost 100 ps delay decrease for same
transitions on adjacent pins, and 100 ps delay in-
crease for opposite transitions.
Duty cycle
Mode_1 of the boundary scan cells was set to
logic 0 to select the core signal (200 MHz clock sup-
plied differentially to the FPGA from off-chip PLL).
The positive and negative pulse widths were mea-
sured simultaneously, starting from the first detected
target edge-type to the next opposite edge-type. The
sum of the two pulse width measurements for each
pin provided a checksum that should equal the
period of the 200 MHz clock (and did, within 36 ps
or 0.7%), since the measurements are independent.
For the 200 MHz clock as data signal, we measured
68 and 182 ps positive-negative pulse width
mismatch at the two different pins, corresponding to
0.7% and 1.8% DCD, respectively. Clock buffers and
logic gates within the FPGA likely altered the duty
cycle, so this amount of DCD seems reasonable.
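The quoted percentages can be reproduced from the pulse width mismatches. The formula below is an assumption inferred from the numbers in the text rather than a definition given by it: DCD is taken as half of the positive-negative width mismatch (the displacement of one edge from its ideal position) relative to the signal period. The function name is hypothetical.

```python
def dcd_percent(width_mismatch_ps, period_ps):
    """DCD as a percentage of the period.

    Assumed definition: half of the positive-negative pulse width mismatch,
    divided by the period. Inferred from the reported values, not stated
    explicitly in the text.
    """
    return 100.0 * (width_mismatch_ps / 2) / period_ps


period_ps = 5000  # 200 MHz clock
for mismatch_ps in (68, 182):
    print(mismatch_ps, "ps ->", round(dcd_percent(mismatch_ps, period_ps), 2), "%")
```

Under this assumption the two measured mismatches correspond to 0.68% and 1.82%, matching the rounded figures reported above.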
We used only 200 MHz due to practical limita-
tions of conveying higher frequencies into the FPGA,
but nothing in the technique prevents measuring
DCD for data rates up to about 4 Gb/s with better
than 1% resolution.
Discussion
The I/O wrap measurement results for a path
that included only a multiplexer, pad driver, pad
receiver, and another multiplexer, show that the
delay varied less than 25 ps when there was cross-
talk from adjacent pins but by 200 ps when con-
nected to 10 cm of adjacent wiring. We concluded
that a delay test of an uncontacted pin would not
reliably distinguish 4 from 8 mA drive; a load
capacitance is needed.
However, for ICs intended for 3D applications,
output drive is much less than for stand-alone ICs, to
save power and area, so the I/O wrap delay will be
more sensitive to changes in drive or loading. For
example, for 0.5 mA output drive and 1 volt signal
swing, an open defect in a TSV that causes the out-
put capacitance to change from 100 fF to zero will
cause a 100 ps reduction in delay, which could be
reliably detected by our I/O BIST.
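The 100 ps figure can be reproduced with a first-order estimate. The sketch below assumes a constant-current driver and a receiver threshold at half of the signal swing; both are simplifying assumptions of this illustration, not statements from the text, and the function name is hypothetical. With capacitance in fF, voltage in V, and current in mA, the result comes out directly in ps.

```python
def delay_change_ps(delta_c_ff, drive_ma, swing_v, threshold_fraction=0.5):
    """First-order delay change from a load capacitance change.

    delta_t = delta_C * delta_V / I, where delta_V is the voltage the output
    must traverse to reach the receiver threshold. Assumes a constant-current
    driver and a threshold at threshold_fraction of the swing.
    Units: fF * V / mA = ps.
    """
    delta_v = swing_v * threshold_fraction
    return delta_c_ff * delta_v / drive_ma


# Example from the text: 100 fF lost to an open TSV, 0.5 mA drive, 1 V swing
print(delay_change_ps(delta_c_ff=100, drive_ma=0.5, swing_v=1.0), "ps")
```

At a measurement resolution of 20 ps, such a change would span about five counts, which is why weaker 3D drivers make open defects easier to detect.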
The BIST’s measurement technique allows arbi-
trary and almost unlimited timing resolution be-
cause the resolution is set by the ratio between the
Ref and Async frequencies supplied to the BIST.
On the FPGA, we verified resolutions ranging from
6.4 ns to 5 ps.
When a DLL is used to measure delays, the delay
resolution is equal to the clock period (to which the
delay loop is locked) divided by the number of
inverters in the loop. If two DLLs are used, to imple-
ment Vernier delay lines, then finer resolution can
be achieved, but only for a range of one clock period.
When a PLL is used, as in our BIST, the frequency
resolution is independent of the number of inverters in
its voltage-controlled oscillator. Instead, resolution
depends on the feedback binary divider, which is
typically 6–10 bits, providing up to 1,024 steps of
resolution. PLLs that use an LC-tank oscillator can
achieve resolution up to 28 bits, which provides more
than a million steps of resolution per clock period.
This BIST was recently implemented in an IC
containing DDR3 interfaces, for use in a 3D device
containing many of the ICs. The delay of most interest
was between ICs; our I/O BIST provides two ways to
measure these delays. A system clock was provided
to all ICs with near-zero skew, and within each IC the
BSR capture clock was manually routed to provide
near-zero skew sampling of the DDR data signals and
the system clock. This arrangement allows the delay
of each DDR I/O signal to be measured relative to the
system reference clock. In simulations, measured
output delay for the transmitting IC plus the
measured input delay for the receiving IC, for any
signal, equaled the signal’s delay between the ICs.
A MORE PRACTICAL way to detect excessive delays
in signal paths between ICs is characterization of the
incoming signal skew, as detected by a capture clock
having constant but unknown skew. Our BIST allows
every I/O signal’s apparent arrival time in an IC
within a 3D device to be measured and character-
ized automatically so that individual upper and
lower test limits can be set for every signal. This
approach is quite general and could be used to test
signal paths between ICs in any system.
References
[1] S. Sunter and M. Tilmann, ‘‘BIST of I/O circuit
parameters via standard boundary scan,’’ in Proc.
ITC, Nov. 2010, pp. 529–551.
[2] S. Sunter and A. Roy, ‘‘Adaptive parametric BIST
of high-speed parallel I/Os via standard boundary
scan,’’ in Proc. ITC, Sep. 2011.
[3] IEEE Standard Test Access Port and Boundary-Scan
Architecture, IEEE Std. 1149.1-2001, The IEEE, Inc.,
New York.
[4] P. Gillis, F. Woytowich, K. McCauley, and U. Baur,
‘‘Delay test of chip I/Os using LSSD boundary
scan,’’ in Proc. ITC, Oct. 1998, pp. 83–90.
[5] M. Tripp, T. M. Mak, and A. Meixner, ‘‘Elimination
of traditional functional testing of interface timings
at Intel,’’ in Proc. ITC, Oct. 2003.
[6] S. Sunter, C. McDonald, and G. Danialy, ‘‘Contactless
digital testing of IC pin leakage currents,’’ in Proc. ITC,
Oct. 2001, pp. 204–10.
[7] N. Vijayaraghavan, B. Singh, S. Singh, and
V. Srivastava, ‘‘Novel architecture for on-chip
AC characterization of I/Os,’’ in Proc. ITC,
Oct. 2006.
[8] J. Yu, F. Dai, and R. Jaeger, ‘‘A 12-bit Vernier
ring time-to-digital converter in 0.13 μm CMOS
technology,’’ J. Solid-State Circuits, vol. 45, no. 4,
Apr. 2010.
[9] International Technology Roadmap for Semiconductors,
2009 Edition, Test and Test Equipment, ITRS 2009.
[Online]. Available: http://www.itrs.net/Links/2009ITRS/
2009Chapters_2009Tables/2009_Test.pdf.
Stephen Sunter is Mentor Graphics’ engineering director of mixed-signal DFT, in Ottawa, Canada, where he has focused on the development of DFT and BIST for analog/mixed-signal ICs for over 15 years. He received the BASc degree in electrical engineering from the University of Waterloo. He is a senior member of the IEEE.
Aubin Roy is a staff engineer at Mentor Graphics and has been responsible for the development of mixed-signal DFT products for more than 15 years. He received the BASc degree in electrical engineering from University of Sherbrooke (Canada). He is a member of the IEEE and ACM.
Direct questions and comments about this article to Stephen Sunter, Mentor Graphics.
IEEE Design & Test of Computers
ITC Special Section