
TetraMAX for ATPG and test diagnosis; an industrial case study

    SNUG Europe 2007

    Sverre Wichlund

    Nordic Semiconductor ASA

    [email protected]

    ABSTRACT

New process technologies become progressively more complex. Combined with steadily increasing design sizes, the result is a significant increase in the number of scan test vectors that must be applied during manufacturing test. Delay fault testing fills a large gap in defect coverage as it addresses the dynamic behavior of the CUT (Circuit-Under-Test). Unfortunately, scan-based delay fault testing gives rise to a significant increase in the number of scan test vectors; the number of TF (Transition Fault) patterns has been reported to amount to as much as 3-5 times the number of single stuck-at patterns [1,2,3]. In turn, test data volumes and test times may spin out of control. We present practical experience with TetraMAX ATPG applied to a 3 Mgate industrial design manufactured in 0.13µm technology. The focus has been on minimizing the total pattern count to cut test time and stay within the available ATE memory budget without compromising test quality. Furthermore, we also applied the TetraMAX diagnosis flow during test program debug, which turned out to be quite useful.


    Table of Contents

1.0 Introduction
2.0 The case
3.0 Delay fault testing
4.0 Diagnosis and debug
5.0 Conclusions
6.0 References
7.0 Appendix A
8.0 Appendix B

    Table of figures

Figure 1 DFT inserted to facilitate AC-scan testing of paths between primary inputs/outputs and FFs (flip-flops) in the design.
Figure 2 Our chip toplevel (simplified view).
Figure 3 An illustration of a transition fault test.
Figure 4 Waveform using a single clock period of 100 ns for shift and capture [8].
Figure 5 Waveform using early-late-early.
Figure 6 Using two different periods, one for shift and one for capture.
Figure 7 Our TetraMAX ATPG flow.
Figure 8 The relationship between the ATE test clock period (T_test), the system clock period (T_app) and the available slack on a path under test (T_det).
Figure 9 Outline of the TetraMAX diagnosis flow.
Figure 10 ATE cycle-based failure log output.
Figure 11 Example TetraMAX diagnosis output.
Figure 12 Example TetraMAX output after truncating the failure log.


1.0 Introduction

Today's increased chip complexities, combined with smaller feature sizes, require that we now address defect mechanisms that could safely be more or less ignored in earlier technologies. Research has shown that the traditional stuck-at fault test is no longer sufficient to maintain low DPM levels at deep sub-micron geometries, as defects which require at-speed tests become more prevalent. Delay fault testing fills a large gap in defect coverage as it addresses the dynamic behavior of the CUT. Unfortunately, scan-based delay fault testing gives rise to a significant increase in the number of scan test vectors; the number of TF patterns has been reported to amount to as much as 3-5 times the number of single stuck-at patterns [1,2,3]. In turn, test data volumes and test times may spin out of control.

There are mainly three ways to reduce pattern volume and thereby test time:

1) DFT techniques like scan compression, which are widely used nowadays. Observation point insertion and some divide-and-conquer techniques are also used.
2) Non-DFT techniques like tuning the ATPG tool and truncating patterns.
3) ATE techniques: some vendors provide ways of compressing the patterns stored in the ATE memory. This helps in freeing up tester memory, but not test time.

In this paper we discuss experience with TetraMAX ATPG applied to a 3 Mgate industrial design manufactured in 0.13µm technology. The focus has been on minimizing the total pattern count to cut test time and stay within the available ATE memory budget without compromising test quality. Furthermore, we briefly discuss the trade-offs made between the stuck-at and transition fault patterns included in the test set. The transition fault coverage is a number to be interpreted with care: it is really an upper bound on the achieved coverage, valid only if every delay defect increases the delay by an amount greater than the available slack on the path in question, which of course depends on the path used to sensitize a given fault.

TetraMAX has a large number of useful switches to help in reducing the pattern count; we look closer at some of these. We also applied the TetraMAX diagnosis flow during test program debug, which turned out to be quite useful. More specifically, this flow may not only be used to debug manufacturing defects, but may very well be used to resolve pattern simulation mismatches as well as improper use of ATPG models during ATPG.

The focus of this paper is primarily on the tool flow in an industrial environment. The flow, which is based on TetraMAX and VTRAN, is discussed in detail.

2.0 The case

Our design is a digital ASIC of approximately 3.3 Mgates manufactured in 0.13µm technology. It contains 4 clock domains (100MHz, 200MHz, 400MHz, 12MHz) in addition to one clock controlling the digital part of a usb2phy. The larger part of the logic is within the first two clock domains. A significant portion of the die is occupied by memory, and each memory instance was wrapped with a memory BIST block. All IOs were bounded in scan to be able to test paths between primary inputs/outputs and flip-flops during AC-scan, see Figure 1.


    Figure 1 DFT inserted to facilitate AC-scan testing of paths between primary input/outputs and FFs (flip-flops) in the design.

A couple of non-scan blocks were wrapped during scan. Apart from this, full scan was used. To reduce scan test time and ATE memory requirements, on-chip scan compression was utilized [4]. All five clocks were individually controllable from chip-level pads during scan test. This was to ensure that a relatively fast clock domain would not be slowed down during AC-scan by sharing a clock input port with a slower clock domain. Boundary scan (IEEE 1149.1) was also employed; we used BSD Compiler to insert this logic.

    A top-level view of the ASIC is shown in Figure 2.

    The ATPG related tools used in the project are listed in Table 1 below.

Tool                 Version
TetraMAX             Y-2006.06-SP4 / Z-2007.03
VTRAN (Source III)   7.7 linux

OS / Platform: Linux Redhat Enterprise 3, 64-bit, running on amd64, 3.4 GHz / 6 GB RAM

Table 1 Tools & OS used in the project.



    Figure 2 Our chip toplevel (simplified view).

3.0 Delay fault testing

In the smaller geometries, a significant portion of manufacturing defects are speed related. Resistive opens (typically bad vias) and shorts are examples of such defects [6,7]. Furthermore, the impact of technology scaling can be quickly summarized as:

Increasing design complexity as per Moore's law.

Sub-wavelength lithography and printability issues.

Difficulty in checking all corner cases; all noise and power drop sources may not be covered.

Low-power design and logic optimization; timing paths are clustered around the cycle time, which makes the design more vulnerable to variations and timing defects.


    Timing related defects can only be detected by a test propagating a transition down a path. A speed related defect increasing the delay by an amount bigger than the available slack will be detected. It should be noted that the test environment and applied clock speed will affect the available slack on a given path. It should also be mentioned that at-speed test is used for speed-binning, where a product not meeting the fastest speed may be binned to the next lower speed level.

3.1 AC-scan theory

There are two important classes of defects:

    Systematic defects. Caused by process variations as well as lithography induced defects. May have a dramatic impact on yield.

    Random defects. Contamination during manufacturing.

The first class of defects is more isolated in nature and may be targeted by carefully designed functional patterns or ATPG-generated PDF (Path-Delay-Fault) patterns. To detect the random ones, a more global view must be taken, i.e. the whole die must be adequately covered. Earlier, handcrafted functional patterns were used for these purposes (and still are). However, the effort in generating such patterns may be very high and the resulting coverage insufficient. It is here that structural test in the form of AC-scan and TF ATPG comes into play, and TF patterns have become a de facto industry standard for detecting random timing related defects [1,2,6-8].

The TF model is easy to use and the number of faults is similar to the number of SSA (single-stuck-at) faults. In this model, slow-to-rise/slow-to-fall faults are placed on gate inputs and outputs. The computational complexity of ATPG in this case is quite similar to SSA ATPG. As opposed to the PDF model, the TF model covers the entire logic portion of the die. The size of the delay defects detected is, however, affected by the path length selected by the ATPG tool for propagating a fault effect; see the next section.

An example of a transition fault test is depicted in Figure 3. Here a 0->1 transition is launched at the A-pin. After a predefined time determined by the actual test cycle time, the output f is strobed. If a delay defect along the path A->D->f is large enough, we would strobe a 0 at f instead of a 1. When we use scan for this, two approaches may be used for launching the transition:

1) Launch-by-shift, where the actual transition is launched by the last shift cycle, and
2) Launch-by-capture, where the actual transition is launched by the first capture clock (out of two).

Comparing the two techniques, launch-by-shift may utilize combinational ATPG only. On the downside, an at-speed scan enable signal would be required (it must be treated like a clock throughout physical implementation). Also, the shift clock speed, and thereby the test clock period, may be limited by the driving capabilities of the IOs. Furthermore, yield loss may result since the launch state (directly scanned in) may be an illegal state during functional operation. Using launch-by-capture, there are no balancing requirements on the scan enable signal, since the shift operation is decoupled from the capture clock pulses (launch+capture) and we may insert a wait state between the last shift cycle and the launch clock, allowing scan enable to stabilize. Setting up the clock waveforms is also easier. Based on this, we selected launch-by-capture as the launch methodology in our ATPG setup. Although dual-cycle sequential ATPG is essentially needed here, the optimized two-clock transition ATPG engine in TetraMAX posed no run-time problems in our project.

    Figure 3 An illustration of a transition fault test.

    So how should the waveforms be selected? Consider the example in Figure 4. Here a single clock period of 100ns is utilized both for shift and capture.

Figure 4 Waveform using a single clock period of 100 ns for shift and capture [8].

To accommodate a system clock frequency of 100 MHz (T=10ns), an early capture pulse follows a late launch pulse. The problem with this approach is the relatively small shift clock pulse width. If LOCKUP latches are utilized, the pulse width of the clock must be larger than the clock skew between adjacent clocks. We may alter this approach slightly by modifying the shift clock waveform as in Figure 5. In this early-late-early approach, the shift clock pulse width is increased. However, we must ensure that the delay between a scan input and the first flip-flop in a scan chain is less than 5 ns plus the clock insertion delay.

    Figure 5 Waveform using early-late-early.

Using two different clock periods, one for shift and one for capture, is a better choice. Then we are free to select the most appropriate waveforms to accommodate the desired shift speed as well as the system clock speed at which we want to test, see Figure 6. Here we have selected a shift clock speed of 20 MHz and a test clock speed of 100 MHz. Note the dead periods, without any clock pulses, between the last shift cycle and the first launch cycle, and between the capture cycle and the first shift cycle. This allows enough time for the scan enable signal to settle. This scheme is often denoted slow-dead-fast-fast.

Figure 6 Using two different periods, one for shift and one for capture.

    It should be noted however, that not all ATE environments will easily accommodate two or more different waveform periods within the same pattern file.
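As a rough illustration of the resulting tester timing (a sketch only: the 20 MHz/100 MHz periods are taken from the text, but the number of shift cycles and the single dead cycle on each side are assumptions, not the exact waveforms we used), a few cycles of such a slow-dead-fast-fast sequence could be laid out as follows:

    #!/usr/bin/perl
    # Sketch of a slow-dead-fast-fast cycle sequence: 20 MHz shift clock,
    # 100 MHz launch/capture, one dead shift period on each side (assumed).
    my $t_shift   = 1e9 / 20e6;     # 50 ns shift period
    my $t_capture = 1e9 / 100e6;    # 10 ns launch/capture period
    my $n_shift   = 3;              # only a few shift cycles for the printout

    my $t = 0;
    for my $i (1 .. $n_shift) {
        printf "%6.1f ns  shift cycle %d (SE=1, %.0f ns period)\n", $t, $i, $t_shift;
        $t += $t_shift;
    }
    printf "%6.1f ns  dead cycle, no clock pulse (SE goes to 0 and settles)\n", $t;
    $t += $t_shift;
    printf "%6.1f ns  launch pulse\n", $t;
    $t += $t_capture;
    printf "%6.1f ns  capture pulse (launch-to-capture = %.0f ns, i.e. at speed)\n", $t, $t_capture;
    $t += $t_capture;
    printf "%6.1f ns  dead cycle, SE goes back to 1, shifting resumes\n", $t;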

3.2 TetraMAX ATPG in our case

To minimize pattern count, we first ran TF ATPG, then we fault simulated the resulting patterns with respect to SSA faults. Finally, we ran a topup SSA ATPG to ensure the highest possible SSA coverage. The flow is depicted in Figure 7 below.



Figure 7 Our TetraMAX ATPG flow.

[Figure 7 flow steps: the netlist(s), ATPG constraints and STIL protocol are read in; set faults -model transition; run atpg auto; write faults / write patterns (TF patterns); set faults -model stuck; set patterns external; run fault_sim; run atpg auto; write faults / write patterns (SSA patterns).]

The pattern format in the figure is wgl. The final pattern formatting flow is a bit more complex and is shown in Appendix A. We will not elaborate on this here; see [4] for details. However, it should be noted that due to problems with the translation software of our ATE vendor, we found that serialized wgl was the most convenient pattern format. Serializing the wgl was a separate step in their translation software anyway, and it took a very long time to execute. Furthermore, their translator was not stable. Due to the sheer size of the wgl files (~2GB) and the limitations mentioned, serializing the wgl ourselves using VTRAN was much more effective. In Table 2 we have summarized the ATPG results following the flow in Figure 7.

DFT:
  Methodology               Muxed FF / full scan / latch-based clock gating (DFT in front of latch). Memory shadow muxes.
  #chains                   44 (140 internal, where 128 are daisy-chained to 32, in addition to 9 + 3)
  Length longest chain      5180 (in normal scan mode) / 1295
  #test clocks              4 (+2 USB2)

AC-scan specific:
  IO-bounding               Yes
  IO handling (un-bounded)  No PI changes / no PO measures
  CDC handling              Single-clock launch/capture
  Exception handling        Add slow path
  Launch method             Launch-by-capture

ATPG, stuck-at fault (reference):
  ATPG efficiency           99.9%
  Test coverage             99.14%
  #patterns                 3278
  Algorithm                 Combinational + fast-sequential ATPG
  Other                     Applied add slow bidis. High effort pattern merge. 5-pass reverse+random pattern reordering.
  CPU time (s)              35000 (incl. 1250 s spent during run pattern_compression)

ATPG, stuck-at fault topup (run after transition fault ATPG):
  ATPG efficiency           99.9%
  Test coverage             99.15%
  #patterns                 2749
  Algorithm                 Combinational + fast-sequential ATPG
  Other                     Applied add slow bidis.
  CPU time (s)              29500

ATPG, transition fault:
  Test coverage             84.80%
  #patterns                 1986 (pattern constraint of 3k patterns applied)
  Algorithm                 Two-clock transition optimization, clock grouping disabled (-nodisturb -noallow_multiple_common_clocks)
  Other                     Applied add slow bidis. High effort pattern merge. 5-pass reverse+random pattern reordering.
  CPU time (s)              14500 (incl. 8500 s spent during run pattern_compression)

Table 2 ATPG results summary.


Please note that the SSA topup ATPG in Table 2 was run after the TF ATPG, following the flow depicted in Figure 7. The SSA ATPG denoted reference is a full SSA ATPG run (no topup). It was run in daisy-chain scan mode, with the on-chip compressor bypassed, to be able to compare the fallout from this test against the fallout from the two other tests, where the on-chip compressor was enabled. This was necessary to ensure that no unwanted aliasing occurred in the on-chip compressor. After this check was carried out, the reference test was skipped.

To avoid unintentional loopback tests, that is, testing paths going from the digital core, through bidirectional ports, and back to the digital core, we applied add slow bidis all. Furthermore, to avoid propagating transitions from primary inputs and strobing primary outputs during AC-scan, we applied:

    set delay -nopi_changes add po masks -all

The reader may ask why the first two constraints were necessary, taking the IO-bounding in Figure 1 into account. The answer is that not quite all IOs were IO-bounded, due to critical timing constraints on the associated paths. During AC-scan we also wanted to avoid testing paths going from one clock domain to another. Thus, we applied the constraint:

    set delay -common_launch_capture_clock -nodisturb

The reason for adding the switch -nodisturb was that we wanted to avoid grouping of clocks with a (limited) number of cells having sequential dependencies. This was to reduce the amount of masking in the generated patterns (our scan compression scheme was vulnerable to the amount of X-content in the patterns).

    The switch -nodisturb may increase the number of patterns since logic belonging to different clock domains must be tested in sequence. This depends on the amount of interaction between the clock domains.

Clock grouping   disturb   nodisturb
#patterns        3278      3304

Table 3 The effect of disturbed clocking on the resulting number of patterns, as applied to SSA ATPG.

We investigated the effect of the switch -nodisturb. From Table 3 we observe that the increase in pattern count was very marginal in our case.

3.2.1 ATE memory

For us, the ATE memory was a concern, not only test time. Initial calculations showed that we would not stay within the available ATE memory. However, by the combination of DFT and the ability of TetraMAX to provide highly compact test sets, we stayed within budget.

It should also be mentioned that the ATE we utilized had the ability to store patterns in a compressed form. This was accomplished by essentially packing two cycles in the wgl file into one cycle of twice the period, utilizing a larger number of timing edges within each cycle. In other words, two consecutive cycles within the wgl file were merged into one using a new waveform table with twice the number of edges in it. It should be emphasized that this process was carried out by the ATE vendor and was transparent to us. This methodology only applied to our SSA patterns, since our TF patterns used too many timing edges (separate tables for shift, launch and capture respectively) compared to what the ATE could provide. Therefore it was important to keep the number of TF patterns low. By using this methodology (called X-mode) we were able to recover 50% of the ATE memory used by the SSA patterns.
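Conceptually, and this is only our reading of the vendor's X-mode rather than their actual implementation, the packing pairs up consecutive tester cycles and re-expresses each pair as one cycle of twice the period, referencing a waveform table with twice as many timing edges (the table name 'wt20ns_x2' below is hypothetical), so the stored vector depth is halved:

    #!/usr/bin/perl
    # Conceptual sketch of X-mode style packing (not the vendor's algorithm):
    # merge pairs of 10 ns cycles into single 20 ns cycles that reference a
    # hypothetical waveform table 'wt20ns_x2' with twice the timing edges.
    my @cycles = map { [ 'wt10ns', $_ ] } qw(1010 1100 0011 0110);   # made-up pin data

    my @packed;
    while (@cycles >= 2) {
        my $a = shift @cycles;
        my $b = shift @cycles;
        push @packed, [ 'wt20ns_x2', $a->[1] . $b->[1] ];
    }
    printf "%d original cycles stored as %d packed cycles\n",
           2 * scalar(@packed), scalar(@packed);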

    One TetraMAX switch that may have a profound effect on the pattern count is the merge effort switch of the set atpg command:

set atpg -merge <effort>

The set atpg command has numerous switches controlling the behavior of the TetraMAX ATPG algorithm(s); please refer to the TetraMAX ATPG User Guide for details. We conducted some experiments with the merge effort during TF ATPG. To limit the resulting number of patterns it was also necessary to apply the maximum pattern constraint of the set atpg command.

Merge effort                 set atpg -merge low   set atpg -merge high   set atpg -merge 2500 25000
Maximum pattern constraint   7000                  3000                   2000
TF coverage (pattern count)  84.08% (7000)         85.35% (2987)          84.80% (1986)
CPU time (s), ATPG /
run pattern_compression      39000 / 0             5000 / 12500           6000 / 8500

Table 4 The effect of the merge effort and the maximum pattern constraint on the resulting TF coverage and CPU time.

The results are shown in Table 4. We tried to balance the merge effort against the maximum pattern constraint to keep the resulting coverage as constant as possible. It should be noted that the results in the leftmost column of Table 4 were obtained without running a post-ATPG run pattern_compression pass. Note also the relatively long run time of 39000 s in this case; almost 34000 s of it was spent in testability analysis. Starting out with 7k patterns, we were able to reduce the pattern count by a factor of 3.5 without reducing the TF coverage. We also observe that the CPU time spent during ATPG depends heavily on the merge effort applied; for a maximum pattern constraint of 2000 patterns, ATPG required 6000 s compared to 5000 s for the 3000-pattern case. However, all the run times listed in Table 4 were within the run-overnight limit.

    It is generally difficult to achieve very high TF coverage due to pattern count and complexity of TF ATPG. Therefore the usual approach is to aim at the highest SSA coverage, and add TF patterns on top while staying within the budgeted test time and ATE memory. This was also the approach taken by us, please refer to Table 2. Our ATE had a physical memory limit of 28Mbit for each channel. The memory usage breakdown is shown in Table 5.


Pattern type (count)    Memory usage   X-mode
TF (1986)               2.6 Mbit       No
SSA topup (2749)        1.8 Mbit       Yes
SSA reference (3278)    8.5 Mbit       Yes
Total                   12.9 Mbit      -

Table 5 ATE memory usage for the different types of scan patterns.

This gave us a good margin to accommodate all the other (non-scan) tests as well, while keeping the total test time within the budget of approximately 2 s.
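The numbers in Table 5 can be roughly reproduced with a back-of-the-envelope estimate (our own sanity check, not a figure from the ATE documentation): per tester channel, the vector depth is approximately the pattern count times the shift length per pattern (assumed to be the 1295-bit compressed-mode chain length from Table 2, or 5180 bits for the bypassed reference test), halved where X-mode applies. Capture cycles and other overhead are ignored.

    #!/usr/bin/perl
    # Rough per-channel memory estimate: #patterns x shift length per pattern,
    # halved when X-mode packing applies. Approximation only.
    my @tests = (
        # name             patterns  shift_len  x_mode
        [ 'TF',              1986,     1295,      0 ],
        [ 'SSA topup',       2749,     1295,      1 ],
        [ 'SSA reference',   3278,     5180,      1 ],  # daisy-chain mode, compressor bypassed
    );

    my $total = 0;
    for my $t (@tests) {
        my ($name, $pat, $len, $x) = @$t;
        my $bits = $pat * $len;
        $bits /= 2 if $x;                 # X-mode recovers ~50%
        $total += $bits;
        printf "%-15s ~%.1f Mbit\n", $name, $bits / 1e6;
    }
    printf "%-15s ~%.1f Mbit (limit: 28 Mbit per channel)\n", 'Total', $total / 1e6;

This lands within rounding of the 12.9 Mbit total in Table 5.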

In this discussion it should be noted that the TF coverage must be treated with care; a given delay defect on a path will not be detected unless the magnitude of the defect is larger than the available slack on the path in question. Thus the reported TF coverage is really an upper bound on the delay defect coverage actually achieved. This becomes clear by looking at Figure 8 below:

    Figure 8 The relationship between the ATE test clock period (T_test), system clock period (T_app) and the available slack on a path under test (T_det).

Figure 8 shows an example where the CUT has an application clock period of T_app = 7 ns and a test clock period of T_test = 10 ns. Say that the longest path is 6 ns long, while a given sensitized path is 4 ns long. This means that the magnitude of the smallest delay defect that can be detected on the sensitized path in this example is T_det = T_test - 4 ns = 6 ns. Delay defects may be significantly smaller than this [9]. Among the complicating factors are:

    Most ATPG tools will propagate a given fault effect along a relatively short path to a primary output (a flip-flop).



Furthermore, the tester clock period must be selected such that we are sufficiently robust against process variations and other environmental conditions.

It is not always easy to take into account during test all the timing exceptions that apply in functional mode.

The maturity of the process technology used also plays a role.

Based on the above, a relatively high TF coverage rather means a higher potential for screening out delay defects. One should therefore be careful when striving for the highest TF coverage without taking into account the magnitude of the smallest delay defects that can be detected. Choosing between two test sets A and B of TF patterns, where A has higher coverage than B, test set B could still be of higher value if its ability to screen the smallest delay defects is better.
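To make the Figure 8 arithmetic concrete, here is a minimal sketch; note that the 10 ns tester period is implied by the T_det = 6 ns figure in the text rather than stated explicitly:

    #!/usr/bin/perl
    # Smallest detectable delay defect on a path: T_det = T_test - T_path.
    my $T_test = 10;    # ns, tester clock period (implied by the Figure 8 example)
    my %path   = ( 'longest (6 ns)' => 6, 'sensitized (4 ns)' => 4 );

    for my $name (sort keys %path) {
        my $t_det = $T_test - $path{$name};
        printf "%-18s smallest detectable defect = %d ns\n", $name, $t_det;
    }
    # Example escape: a defect adding 5 ns to the 4 ns sensitized path gives a
    # 9 ns path, which fails at T_app = 7 ns in the field but passes this test,
    # since 9 ns is still shorter than T_test = 10 ns.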

    Another thing to keep in mind is that the screening power of a TF pattern test is not a step function of the tester clock period. In an empirical study the authors in [7] observed that the cumulative fallout as a function of tester clock frequency F was proportional to (F/Ft) where Ft is the target clock frequency of the CUT. This means that a TF pattern may still be useful even though the tester clock frequency is e.g. only half the target clock frequency.

So how should one allocate the precious tester memory to SSA and TF patterns such that the total number of defective circuits detected by these patterns is maximized? What is the optimal mix between SSA and TF patterns? If each test pattern were equal with respect to its ability to screen out defective devices, we could select between the two types of patterns at random. This is of course not the case; the fault models are orthogonal in a sense, aiming at detecting different types of defects. The relevance of the two models will vary with the technology as well as the design itself (e.g. does the design push the technology limits?). In [10] the author studied the trade-off of tester memory between SSA and TF patterns. It was observed that a 50/50 mix was the worst choice, actually hitting a local minimum in the number of defective devices detected. It was also observed that increasing the weight of SSA patterns clearly maximized the number of defective devices detected. The latter observation will probably depend on the actual coverage of the TF patterns, as discussed earlier in this section, as well as on the maturity of the technology and on the design itself. Nevertheless, it indicates that one should be very careful with adding TF patterns at the expense of SSA patterns.

4.0 Diagnosis and debug

The TetraMAX diagnosis feature may be very useful! In addition to analyzing manufacturing failures, it can be useful during test debug as well, something we discovered ourselves. During initial scan debug we ran into a problem where only a few cycles failed on the ATE; to find the root cause we set up the TetraMAX diagnosis flow outlined in Figure 9.


    Figure 9 Outline of the TetraMAX diagnosis flow.

It should be noted that TetraMAX accepts two formats of failure log data. The pattern-based format is of the form:

<pattern #> <output port> [shift cycle #] [expected data]

where the pattern number is the scan pattern in which the error occurred, starting with 0, and the output port is the port on which the error was observed. In case of an error during scan shift, the shift cycle number must be given, starting with 0 for the cycle closest to the scan output. The expected value is optional; however, it is useful to provide it because, together with the -check switch of the set diagnosis command, it can be used to check that everything is lined up correctly before running diagnosis.
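For illustration only (the numbers are made up, not from our failure data), an entry in this format, matching what the Appendix B script emits, could look like:

    3 rd_n 17 exp=0

meaning that scan pattern 3 observed a failure on port rd_n at shift cycle 17, counted from the scan output, where a 0 was expected.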

There is also a cycle-based format; however, this format did not comply with our binary pattern format, so we used the pattern-based format in our flow. Since the test house only provided us with a cycle-based failure log, we had to translate it into the pattern-based format. An example of the cycle-based output from the ATE is shown in Figure 10.

[Figure 9, outlined above, consists of the following steps: read netlist; run build_model; run drc (with the STIL protocol); set patterns external cut_debug.bin; set diagnosis -auto -check; run diagnosis failure.log -verbose -write diag.flt.gz -replace -compress gzip. Inputs: netlist(s), libraries, STIL protocol, patterns and the failure log; output: diagnosis reports.]


Figure 10 ATE cycle-based failure log output.

Vector Label: nic3400_flat
(failing pin columns, printed vertically in the original log: pa6, pa5, pa4, pa3, rd_n)
V0000140248 C0000140248 F L^HHL
V0000140249 C0000140249 F L^HHH
V0000140270 C0000140270 F H^HLL
V0000140271 C0000140271 F H^LHL
V0000140272 C0000140272 F L^LHL
V0000142886 C0000142886 F ^HHHH
V0000142903 C0000142903 F ^LLHH
V0000180304 C0000180304 F LLHL^

We observed that the error on the rightmost port, named rd_n, occurred in one cycle only. Next, we translated this output into the pattern-based TetraMAX format described above, using the script in Appendix B. When executing the flow depicted in Figure 9, we got the output shown in Figure 11 below. One defect was reported, resolved to one of two fault locations, but the match was not particularly good (12.5%).

Figure 11 Example TetraMAX diagnosis output.

run diagnosis /work/sw/failure.log -verbose -write /work/sw/diag.flt.gz -replace -compress gzip
Warning: Diagnosis will use 74 po_masks and 16 capture_masks. (M633)
Check expected data completed: 8 out of 8 failures were checked
Write fault list completed : 2 faults were written into file "/work/sw/diag.flt.gz".
Diagnosis summary for failure file /work/sw/failure.log
#failing_pat=3, #failures=8, #defects=1, #faults=2, CPU_time=41.25
Simulated : #failing_pat=3, #passing_pat=96 #failures=8
------------------------------------------------------------------------------
Fault candidates for defect 1: stuck fault model, #faults=2, #failures=8
Observable points: 1800149 1800099 1799949 1800217 1800218 1800652 1761031
------------------------------------------------------------------------------
match=12.50%, (TFSF=1/TFSP=7/TPSF=0), #perfect/partial match:
sa1 DS DUT/u146_USB2WRAP/UTM_MODEL/u_sl210/ANALOG/analogOtgTrans/bSessEnd (analogOtgTrans)
------------------------------------------------------------------------------
match=12.50%, (TFSF=1/TFSP=7/TPSF=0), #perfect/partial match:
sa1 DS DUT/u_NVO_NT/u_NVO_CORE/U_TESTMUX/U3708/Y (CLKBUFX4)
sa1 ** DUT/u_NVO_NT/u_NVO_CORE/U_TESTMUX/U3708/A (CLKBUFX4)
sa1 ** DUT/u_NVO_NT/u_NVO_CORE/U_vo_core/i/global/cfg_wrap/registers/U80/A0 (AOI22X1)
------------------------------------------------------------------------------


Next, we truncated the failure log so that the only line remaining was the single error on the rd_n port. We wanted to check whether this single erroneous output could give us a better hint about the root cause of the ATE errors. After running diagnosis we got the output shown in Figure 12.

Figure 12 Example TetraMAX output after truncating the failure log.

run diagnosis onefail.log -verbose -write /work/sw/diag.flt.gz -replace -compress gzip
Warning: Diagnosis will use 74 po_masks and 16 capture_masks. (M633)
Check expected data completed: 1 out of 1 failures were checked
Write fault list completed : 2 faults were written into file "/work/sw/diag.flt.gz".
Diagnosis summary for failure file onefail.log
#failing_pat=1, #failures=1, #defects=1, #faults=2, CPU_time=37.44
Simulated : #failing_pat=1, #passing_pat=96 #failures=1
------------------------------------------------------------------------------
Fault candidates for defect 1: stuck fault model, #faults=2, #failures=1
Observable points: 1761031
------------------------------------------------------------------------------
match=100.00%, (TFSF=1/TFSP=0/TPSF=0), #perfect/partial match:
sa1 DS DUT/u146_USB2WRAP/UTM_MODEL/u_sl210/ANALOG/analogOtgTrans/bSessEnd (analogOtgTrans)
------------------------------------------------------------------------------
match=100.00%, (TFSF=1/TFSP=0/TPSF=0), #perfect/partial match:
sa1 DS DUT/u_NVO_NT/u_NVO_CORE/U_TESTMUX/U3708/Y (CLKBUFX4)
sa1 ** DUT/u_NVO_NT/u_NVO_CORE/U_TESTMUX/U3708/A (CLKBUFX4)
sa1 ** DUT/u_NVO_NT/u_NVO_CORE/U_vo_core/i/global/cfg_wrap/registers/U80/A0 (AOI22X1)
------------------------------------------------------------------------------

This time the match is much better. We soon suspected the first of the fault candidates in the figure (the pin bSessEnd, coming from a usb2phy block). Looking through the library files we had used for TetraMAX ATPG, we realized that we had used a wrong model for the usb2phy during ATPG. We then changed to the correct ATPG model and generated a new debug pattern. After running this pattern on the ATE, all the previous errors disappeared. Setting up the diagnosis flow really proved to be a useful exercise for us.

5.0 Conclusions

Through an industrial design manufactured in 0.13µm technology we have demonstrated the usefulness of TetraMAX for ATPG and diagnosis. We focused on controlling the pattern count to stay within the available ATE memory as well as cutting test time. Furthermore, we have looked closer at AC-scan and the trade-off between SSA and TF patterns, to achieve the highest coverage while keeping the pattern count within acceptable limits. Finally, we showed how TetraMAX diagnosis may be used as a valuable tool during the test-program debug phase.


6.0 References

[1] B. Keller et al., "An Economic Analysis and ROI Model for Nanometer Test," in Proc. International Test Conference, 2004, pp. 518-524.
[2] B. Vermeulen et al., "Trends in Testing Integrated Circuits," in Proc. International Test Conference, 2004, pp. 688-697.
[3] I. Yarom, "Test Pattern Generation for Sub-Micron Designs," SNUG Israel, 2005.
[4] S. Wichlund, "Fighting Scan Test Time and Data Volumes; Squeezing the last drop out of DFT Compiler and TetraMAX?," SNUG Boston, 2006.
[5] S. Wichlund et al., "Reducing Scan Test Data Volume and Time: A diagnosis friendly finite memory compactor," in Proc. IEEE Asian Test Symposium, 2006, pp. 421-428.
[6] C. Hawkins et al., "Parametric Timing Failures and Defect-based Testing in Nanotechnology CMOS Digital ICs," in Proc. 11th Annual NASA Symposium on VLSI Design, 2003.
[7] R. Madge and B. R. Benware, "Obtaining High Defect Coverage for Frequency-Dependent Defects in Complex ASICs," IEEE Design & Test of Computers, Sept./Oct. 2003, pp. 46-53.
[8] J. Saxena et al., "Scan-Based Transition Fault Testing - Implementation and Low Cost Test Challenges," in Proc. International Test Conference, 2002, pp. 1120-1129.
[9] P. Nigh and A. Gattiker, "Test Method Evaluation Experiments & Data," in Proc. International Test Conference, 2000, pp. 454-463.
[10] K. Butler, "A Study of Test Quality/Tester Scan Memory Trade-offs Using the SEMATECH Test Methods Data," in Proc. International Test Conference, 1999, pp. 839-847.


7.0 Appendix A

The figure below depicts the pattern formatting flow used to interface to our test vendor.

[Appendix A figure, pattern formatting flow. Boxes and annotations: ATPG → wgl; Vtran: wgl2wgl → flat wgl (this step is necessary because TetraMAX wgl_flat is not compatible with VTRAN); Extract scan outputs → scanout data (dsco*); Simulate compactor (the calc utility, driven by the netlist with dsco outputs and constraints masking the compactor outputs) → compactor response; Merge → merged wgl; Vtran: wgl2wgl_dsco → final wgl → ATE; Vtran: wgl2ver → testbench → simulate some patterns; stil → Xtract: X check.]


The reader should keep in mind that most of the steps in the depicted pattern formatting flow are related to the on-chip scan compression scheme used. To be able to compute the compressor output response during scan shift and merge this response into the pattern, the compactor had to be simulated standalone with the ATPG patterns as stimuli. A separate utility written in C, denoted calc in the figure, was used for this purpose. For more details, see [4,5]. Without the employed compression scheme, we could probably have skipped all the steps in the figure but the first one (using VTRAN to flatten/serialize the wgl).
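To illustrate what the calc step has to do, the sketch below runs made-up scan-out data through a simple XOR-tree space compactor; this is a generic stand-in, not the finite memory compactor of [5] that we actually used:

    #!/usr/bin/perl
    # Generic stand-in for the 'calc' step: compute the expected output of a
    # simple XOR-tree space compactor for each shift cycle, so the expected
    # compacted response can be merged back into the pattern file.
    my @scanout_per_cycle = qw(1011 0001 1110);   # made-up data, one string of
                                                  # internal chain outputs per cycle
    my $cycle = 0;
    for my $chains (@scanout_per_cycle) {
        my $out = 0;
        for my $bit (split //, $chains) {
            $out ^= $bit;                         # XOR tree = parity of the chain outputs
        }
        printf "shift cycle %d: chains=%s -> compactor output=%d\n",
               $cycle++, $chains, $out;
    }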


    8.0 Appendix B

    Script example translating a cycle based failure log to a pattern-based failure log file for TetraMAX diagnosis:

#!/usr/local/bin/perl
#
# 20070417 sw
#
# Small util to convert an ATE failure log to TetraMAX diag format
#

$h = 0;
$PLENGTH = -1;
$SLENGTH = -1;

# parse command line options (-h, -p<num>, -l<num>)
while ($_ = $ARGV[0], /^-/) {
    shift;
    last if /^--$/;
    m/^-h/i && ($h = 1);
    m/^-p(.*)/i && ($PLENGTH = $1);
    m/^-l(.*)/i && ($SLENGTH = $1);
}

if ($h || ($PLENGTH == -1) || ($SLENGTH == -1)) {
    print("\n");
    print("convert a 93K failure log to a Tmax pattern based failure log\n\n");
    print("Usage: log2tmax.pl [-h] -p<num> -l<num> <ATE failure log>\n\n");
    print("Options:\n");
    print(" -h: Help (display this help message).\n");
    print(" -p: number of cycles up to and including the last shift cycle in the first scan pattern.\n");
    print(" -l: length of longest chain + number of cycles during one capture.\n\n");
    # print("Description:\n");
    exit 0;
}

####################
#
####################

$input_file  = ($#ARGV == -1) ? "-" : shift;
$output_file = "failure.log";

open(INFILE, $input_file)      || die "Couldn't open $input_file";
open(OUTFILE, ">$output_file") || die "Couldn't open $output_file";

$pin_dec_start = 0;
$pin_count     = 0;
$fail_index    = 0;

while (<INFILE>) {
    # skip the preamble
    if (m/[-]+ Pin Results.*/) {
        $pin_dec_start = 1;
    }
    # first, process the pin results section
    elsif ((m/\W+([a-zA-Z0-9_]+) FAILED/) && ($pin_dec_start == 1)) {
        # a pin declaration
        $pin_name[$pin_count] = $1;
        $pin_count = $pin_count + 1;
    }
    # then process the vector results section
    # ('v' added to the character class so that expected-1 failures match too)
    elsif ((m/V[0-9]+ C([0-9]+) F ([A-Zv^]+)/) && ($pin_dec_start == 1)) {
        # a failing cycle
        $failure_cycle[$fail_index] = $1;
        $failure_code[$fail_index]  = $2;
        $fail_index = $fail_index + 1;
    }
} # end while()

close(INFILE);

$no_of_patterns = 0;
$max_pattern_no = -1;

# process the fail codes to calculate pin and exp value
for ($i = 0; $i < $fail_index; $i++) {

    # calculate pattern no
    $_pattern_no = ($failure_cycle[$i] - $PLENGTH) / $SLENGTH;
    # calculate floor($_pattern_no)
    @n = split(/\./, $_pattern_no);
    $pattern_no = $n[0];
    if ($pattern_no > $max_pattern_no) {
        $max_pattern_no = $pattern_no;
        $no_of_patterns = $no_of_patterns + 1;
    }

    # calculate shift cycle within the pattern, starting with 0 at the chain output
    $shift_no = $failure_cycle[$i] - $PLENGTH - $pattern_no * $SLENGTH;

    # first '^' (which means 0 was expected); the character position within
    # the fail code selects the corresponding failing pin
    $where = 0;
    $prev  = 0;
    while ($where != -1) {
        $where = index($failure_code[$i], "^", $prev);
        if ($where != -1) {
            print OUTFILE "$pattern_no $pin_name[$where] $shift_no exp=0\n";
        }
        $prev = $where + 1;
    }

    # then 'v' (which means 1 was expected)
    $where = 0;
    $prev  = 0;
    while ($where != -1) {
        $where = index($failure_code[$i], "v", $prev);
        if ($where != -1) {
            print OUTFILE "$pattern_no $pin_name[$where] $shift_no exp=1\n";
        }
        $prev = $where + 1;
    }
}

close(OUTFILE);

# print some statistics
print "Number of failing pins: $pin_count\n";
for ($i = 0; $i < $pin_count; $i++) {
    print "Fails: $pin_name[$i]\n";
}
print "Number of failing cycles: $fail_index\n";
print "Number of failing scan patterns: $no_of_patterns\n";

print "Done translation...output in file $output_file\n";
print "\n";