VLSI Testing Chapter 5 Design For Testability & Scan Test ...syhuang/testing/ch5.DFT.pdf4 Outline...

EE-6250

Department of Electrical Engineering, National Tsing Hua University

VLSI Testing

Chapter 5: Design For Testability & Scan Test

Outline

• Introduction
– Why DFT?
– What is DFT?
• Ad-Hoc Approaches
• Full Scan
• Partial Scan

Why DFT ?

• Direct testing is far too difficult
– Large number of FFs
– Embedded memory blocks
– Embedded analog blocks
• Design For Testability is inevitable
• Like death and taxes

Design For Testability

• Definition
– Design For Testability (DFT) refers to those design techniques that make test generation and testing cost-effective
• DFT Methods
– Ad-hoc methods
– Scan, full and partial
– Built-In Self-Test (BIST)
– Boundary scan
• Cost of DFT
– Pin count, area, performance, design-time, test-time

Why Wasn’t DFT Universally Used Previously?

– Short-sighted view of management
– Time-to-market pressure
– Life-cycle cost ignored by development management/contractors/buyers
– Area/functionality/performance myths
– Lack of knowledge by design engineers
– Testing is someone else’s problem
– Lack of tools to support DFT until recently

We don’t have to worry about this management barrier any more: most design teams now have DfT people.

Important Factors

• Controllability
– Measures the ease of controlling a line
• Observability
– Measures the ease of observing a line at a PO
• Predictability
– Measures the ease of predicting output values
• DFT deals with ways of improving
– Controllability
– Observability
– Predictability

Outline

• Introduction
• Ad-Hoc Approaches
– Test Points
– Design Rules
• Full Scan
• Partial Scan

Ad-Hoc Design For Testability

• Design Guidelines
– Avoid redundancy
– Avoid asynchronous logic
– Avoid clock gating (e.g., ripple counter)
– Avoid large fan-in
– Consider tester requirements (tri-stating, etc.)
• Disadvantages
– High fault coverage not guaranteed
– Manual test generation
– Design iterations required

Some Ad-Hoc DFT Techniques

• Test Points
• Initialization
• Monostable multivibrators
– One-shot circuit
• Oscillators and clocks
• Counters / Shift-Registers
– Add control points to long counters
• Partition large circuits
• Logical redundancy
• Break global feedback paths

[Figure: one-shot circuit with a delay element; labels: input, output]

On-Line Self-Test & Fault Tolerance By Redundancy

• Information Redundancy
– Outputs = (information bits) + (check bits)
– Information bits are the original normal outputs
– Check bits always maintain a specific pre-defined logical or mathematical relationship with the corresponding information bits
– At any time, if the information bits and check bits violate the pre-defined relationship, an error is indicated
• Hardware Redundancy
– Use extra hardware (e.g., duplicate or triplicate the system) so that a fault within one module is masked (i.e., the faulty effect is never observed at the final output)
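As a minimal sketch of information redundancy, assume the check bits are a single even-parity bit. The slides do not name a specific code; parity is chosen here only for illustration, and the function names are hypothetical.

```python
def add_parity(info_bits):
    """Append one even-parity check bit (assumed code) to the information bits."""
    check_bit = sum(info_bits) % 2
    return info_bits + [check_bit]


def violates_relationship(word):
    """True when information bits and check bit break the even-parity rule."""
    return sum(word) % 2 != 0


word = add_parity([1, 0, 1, 1])   # information bits followed by the check bit
faulty = word.copy()
faulty[2] ^= 1                    # a single-bit error on one output line

print(violates_relationship(word))    # relationship holds, no error indicated
print(violates_relationship(faulty))  # relationship violated, error indicated
```

A single parity bit only flags an odd number of flipped bits; stronger codes trade more check bits for better detection.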

Module Level Redundancy

• Triple Module Redundancy (TMR)
– Majority voting on the outputs of three identical modules helps mask faults that occur in a single module

[Figure: three identical modules feed a voter; with module outputs 0, 1, 0 the majority verdict is 0]
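The majority vote can be sketched in a few lines. The module below is a hypothetical 1-bit inverter, and the fault (output stuck-at-1) is an assumption used only to show the masking effect.

```python
def majority(a, b, c):
    """Majority vote over three 1-bit module outputs."""
    return (a & b) | (b & c) | (a & c)


def module(x):
    """Hypothetical fault-free module: a 1-bit inverter."""
    return x ^ 1


def faulty_module(x):
    """The same module with its output stuck-at-1 (assumed fault)."""
    return 1


for x in (0, 1):
    voted = majority(module(x), faulty_module(x), module(x))
    print(x, voted, voted == module(x))   # the single faulty module is out-voted
```

The voter masks a fault in any one module; two faulty modules that agree with each other would win the vote.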

Test Point Insertion

• Employ test points to enhance
– Controllability
– Observability
• CP: Control Points
– Primary inputs (extra PIs) used to enhance controllability
• OP: Observation Points
– Primary outputs (extra POs) used to enhance observability

[Figure: a circuit with an added 0-CP, an added 1-CP, and an added OP]

0/1 Injection Circuitry

• Normal operation
– Set CP_enable = 0
• Inject 0
– Set CP_enable = 1 and CP = 0
• Inject 1
– Set CP_enable = 1 and CP = 1

[Figure: circuit inserted for controlling line w. A MUX between C1 and C2 selects either w (input 0) or CP (input 1), with CP_enable as the select signal]
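The three operating modes above amount to a 2-to-1 multiplexer on line w. A minimal behavioural sketch follows; the function name is hypothetical.

```python
def control_point(w, cp, cp_enable):
    """Value seen by the logic downstream of line w.

    cp_enable = 0: normal operation, w passes through unchanged.
    cp_enable = 1: the CP value is injected regardless of w.
    """
    return cp if cp_enable else w


for w in (0, 1):
    print("normal  ", w, control_point(w, cp=0, cp_enable=0))
    print("inject 0", w, control_point(w, cp=0, cp_enable=1))
    print("inject 1", w, control_point(w, cp=1, cp_enable=1))
```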

Single I/O Port for Multiple Test Points

• Constraints of using test points
– A large demand on I/O pins
– This constraint can be somewhat relieved by using MUX & DEMUX, at the cost of increased test time

[Figure: multiplexing observation points. A MUX with select lines C1, C2, C3, ..., Cn routes one of OP1, OP2, OP3, ..., OPN to a single output pin, where N = 2^n]
[Figure: demultiplexing control points. A DEMUX (dispatcher) with select lines C1, C2, C3, ..., Cn routes a single input pin to one of CP1, CP2, CP3, ..., CPN, where N = 2^n]
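A rough sketch of the pin/time trade-off, assuming the n select lines are themselves driven from pins and that a shared port reaches one test point per cycle. Both are simplifying assumptions, and the function names are hypothetical.

```python
def select_lines(num_points):
    """Smallest n such that 2**n >= num_points."""
    return (num_points - 1).bit_length()


def pins_needed(num_points, shared):
    """I/O pins for one kind of test point (all OPs or all CPs)."""
    if not shared:
        return num_points                    # one dedicated pin per test point
    return select_lines(num_points) + 1      # n select pins plus one data pin


def cycles_to_visit_all(num_points, shared):
    """Cycles to access every test point once under the assumptions above."""
    return num_points if shared else 1


for n_points in (8, 64):
    print(n_points,
          pins_needed(n_points, shared=False),
          pins_needed(n_points, shared=True),
          cycles_to_visit_all(n_points, shared=True))
```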

Sharing Between Test Points & Normal I/O

• Advantage: Even fewer I/O pins for test points
• Overhead: Extra MUX delay for normal I/O

[Figure: n 2-to-1 MUXes, controlled by a SELECT pin, choose between n normal functional outputs and n observation points to drive n output pins; n 1-to-2 DEMUXes, controlled by a SELECT pin, route n input pins to either the n normal functional inputs or n CPs]

Control Point Selection

• Impact
– The controllability of the fanout-cone of the added point is improved
• Common selections
– Control, address, and data buses
– Enable / Hold inputs
– Enable and read/write inputs to memory
– Clock and set/clear signals of flip-flops
– Data select inputs to multiplexers and demultiplexers

Example: Use CP to Fix DFT Rule Violation

• DFT rule violations
– The set/clear signal of a flip-flop is generated by other logic, instead of being directly controlled by an input pin
– Gated clock signals
• Violation fix
– Add a control point to the set/clear signal or clock signals

[Figure: before the fix, internal logic drives the clear input of a D flip-flop clocked by CK (violation); after the fix, an added CLEAR control point acts on the clear signal]

Example: Fixing Gated Clock

• Gated Clocks
– Advantage: power dissipation of a logic design can thus be reduced
– Drawback: the design’s testability is also reduced
• Testability Fix

[Figure: violation: a D flip-flop clocked by a gated clock derived from CK and CK_enable; fix: a MUX controlled by CP_enable is added on the clock path]

Example: Fixing Tri-State Bus Contention

• Bus Contention
– A stuck-at fault at the tri-state enable line may cause bus contention: multiple active drivers are connected to the bus simultaneously
• Fix
– Add CPs to turn off tri-state devices during testing

[Figure: a bus contention scenario in the presence of a fault. One driver has its enable line active while another has its enable line stuck-at-1, so drivers of 0 and 1 are active together. The unpredictable voltage on the bus may cause a fault to go unnoticed]

Example: Partitioning Counters

• Consider a 16-bit ripple-counter
– Could take up to 2^16 = 65536 cycles to test
– After being partitioned into two 8-bit counters as below, it can be tested with just 2 x 2^8 = 512 cycles

[Figure: two 8-bit counters (Q0 to Q7 and Q8 to Q15), each with a start signal. A MUX controlled by CP_enable selects between CK and the trigger clock from the first counter to produce CK_for_Q8, the clock of the second 8-bit counter]
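The cycle counts above follow from stepping each counter through all of its states. A small check of the arithmetic (the function name is hypothetical):

```python
def exhaustive_cycles(bits):
    """Clock cycles needed to step a counter through all 2**bits states."""
    return 2 ** bits


unpartitioned = exhaustive_cycles(16)     # one 16-bit counter
partitioned = 2 * exhaustive_cycles(8)    # two 8-bit counters, tested one after the other

print(unpartitioned, partitioned, unpartitioned // partitioned)
```

The saving grows quickly with width: splitting a counter in half replaces 2^n cycles with 2 x 2^(n/2).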

Observation Point Selection

• Impact
– The observability of the fanin-cone (or transitive fanins) of the added OP is improved
• Common choices
– Stem lines having a large number of fanouts
– Global feedback paths
– Redundant signal lines
– Outputs of logic devices having many inputs
  • MUX, XOR trees
– Outputs from state devices
– Address, control and data buses (often the interface signals between circuit blocks)

Problems of CP & OP

• Large number of I/O pins
– Add MUXes to reduce the number of I/O pins
– Serially shift CP values by shift-registers
• Larger test time

[Figure: shift-register R1 supplies the control values (X, X’) and shift-register R2 observes the responses (Z, Z’)]

Outline

• Introduction
• Ad-Hoc Approaches
• Full Scan
– The Concept
– Scan Cell Design
– Random Access Scan
• Partial Scan

What Is Scan ?

• Objective
– To provide controllability and observability at internal state variables for testing
• Method
– Add test mode control signal(s) to the circuit
– Connect flip-flops to form shift registers in test mode
– Make inputs/outputs of the flip-flops in the shift register controllable and observable
• Types
– Internal scan
  • Full scan, Partial scan, Random access
– Boundary scan

The Scan Concept

[Figure: combinational logic surrounded by flip-flops that are linked into a chain from Scan In to Scan Out, with a mode switch (normal or test)]

A Logic Design Before Scan Insertion

[Figure: combinational logic between input pins and output pins, with three D flip-flops driven by a common clock]

Sequential ATPG is extremely difficult, due to the lack of controllability and observability at the flip-flops.

Example: A 3-stage Counter

[Figure: a 3-stage counter. Combinational logic between input pins and output pins, three D flip-flops with outputs q1, q2, q3 on a common clock, and a stuck-at-0 fault on line g]

It takes 8 clock cycles to set the flip-flops to (1, 1, 1) for detecting the g stuck-at-0 fault
(2^20 clock cycles for a 20-stage counter!)

A Logic Design After Scan Insertion

[Figure: the same design after scan insertion. A MUX in front of each D flip-flop links the flip-flops q1, q2, q3 into a chain from scan-input (SI) to scan-output (SO); clock and scan-enable are shared by all flip-flops]

The scan chain provides easy access to the flip-flops. Pattern generation is much easier!

Note: Scan Enable (SE), not shown here, controls every MUX.

Procedure of Applying Test Patterns

• Notation
– Test vectors $T = \langle t_i^I, t_i^F \rangle$, $i = 1, 2, \dots$
– Output response $R = \langle r_i^O, r_i^F \rangle$, $i = 1, 2, \dots$
• Test Application
– (1) i = 1;
– (2) Scan-in $t_1^F$ /* scan-in the first state vector for PPI’s */
– (3) Apply $t_i^I$ /* apply current input vector at PI’s */
– (4) Observe $r_i^O$ /* observe current output response at PO’s */
– (5) Capture PPOs to FFs as $r_i^F$ /* capture the response at PPO’s to FFs */
  • (Set to ‘Normal Mode’ by raising SE to ‘1’ for one clock cycle)
– (6) Scan-out $r_i^F$ while scanning-in $t_{i+1}^F$ /* overlap scan-in and scan-out */
– (7) i = i+1; Goto step (3)

[Figure: the combinational portion, with PI’s and PPI’s as inputs and PO’s and PPO’s as outputs]
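The test application steps can be sketched as a small simulation. The combinational function `comb`, the two-flip-flop chain, and the test vectors are all hypothetical; the chain is modelled as a plain list, with scan-in at index 0 and scan-out at the last index.

```python
def scan_shift(chain, vector_in):
    """Shift a full vector in while collecting the vector that is shifted out."""
    out = []
    for bit in reversed(vector_in):
        out.append(chain[-1])            # bit currently visible at scan-out
        chain[:] = [bit] + chain[:-1]    # one scan-mode clock
    return out[::-1]                     # restore flip-flop order


def comb(pi, state):
    """Hypothetical combinational portion: returns (PO values, PPO values)."""
    (a,) = pi
    q1, q2 = state
    return [q1 & q2], [a ^ q1, q1 | q2]


def apply_tests(tests, n_ff):
    chain = [0] * n_ff
    responses = []
    scan_shift(chain, tests[0][1])                  # step 2: scan in first state vector
    for i, (t_in, _) in enumerate(tests):
        po, ppo = comb(t_in, chain)                 # steps 3-4: apply PI, observe PO
        chain[:] = ppo                              # step 5: capture PPOs into the FFs
        nxt = tests[i + 1][1] if i + 1 < len(tests) else [0] * n_ff
        responses.append((po, scan_shift(chain, nxt)))   # step 6: overlapped shift
    return responses


tests = [([1], [1, 1]), ([0], [0, 1])]   # pairs of (PI vector, state vector)
print(apply_tests(tests, n_ff=2))
```

Overlapping the scan-out of one response with the scan-in of the next state vector means each test costs roughly one chain length of shift cycles plus one capture cycle.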

Testing Scan Chain ?

• Common practice
– The scan chain is often tested first, before the core logic, by a so-called flush test, which pumps random vectors in and out of the scan chain
• Procedure (flush test of scan chain)
– (1) i = 1;
– (2) Scan-in the 1st random vector to the flip-flops
– (3) Scan-out the (i)th random vector while scanning-in the (i+1)th vector for the flip-flops
  • The (i)th scan-out vector should be identical to the (i)th vector scanned in earlier; otherwise the scan chain is malfunctioning
– (4) If necessary, i = i+1, goto step (3)
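A minimal sketch of the flush test, modelling the chain as a list. The optional `stuck_at_0` argument is a hypothetical defect (one cell that always holds 0), added only to show a failing case.

```python
import random


def shift_vector(chain, vector_in, stuck_at_0=None):
    """Shift one vector in; return the vector shifted out, in flip-flop order."""
    out = []
    for bit in reversed(vector_in):
        out.append(chain[-1])
        chain[:] = [bit] + chain[:-1]
        if stuck_at_0 is not None:
            chain[stuck_at_0] = 0        # hypothetical defective scan cell
    return out[::-1]


def flush_test(vectors, chain_length, stuck_at_0=None):
    """Return True when every scanned-out vector matches what was scanned in."""
    chain = [0] * chain_length
    shift_vector(chain, vectors[0], stuck_at_0)                   # step 2
    for i in range(len(vectors) - 1):                             # step 3
        observed = shift_vector(chain, vectors[i + 1], stuck_at_0)
        if observed != vectors[i]:
            return False                                          # chain malfunction
    return True


rng = random.Random(1)
random_vectors = [[rng.randint(0, 1) for _ in range(4)] for _ in range(5)]
print(flush_test(random_vectors, chain_length=4))                 # healthy chain passes
print(flush_test([[1, 1, 1], [0, 0, 0]], chain_length=3, stuck_at_0=1))
```

The last vector in the list is only scanned in; it serves to push the previous vector out.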

MUX-Scan Flip-Flop

– Only D-type master-slave flip-flops are used
– All flip-flop clocks controlled from primary inputs
  • No gated clock allowed
– Clocks must not feed data inputs of flip-flops
– Most popularly supported in standard cell libraries

[Figure: MUX-scan flip-flop. A MUX controlled by SC (normal / test) selects between D and SI (scan input) as the data input of a normal master-slave flip-flop clocked by CLK]
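A behavioural sketch of the MUX-scan cell and of a chain built from it. The class and function names are hypothetical, and the mode is passed as a boolean `test_mode` so that no particular polarity of SC is assumed.

```python
class MuxScanFF:
    """D flip-flop whose data input is chosen by a 2-to-1 MUX."""

    def __init__(self):
        self.q = 0

    def next_value(self, d, si, test_mode):
        return si if test_mode else d


def clock_chain(ffs, d_inputs, scan_in, test_mode):
    """One clock edge: every FF samples its selected input at the same time."""
    si_values = [scan_in] + [ff.q for ff in ffs[:-1]]   # SI of FF k is Q of FF k-1
    new_q = [ff.next_value(d, si, test_mode)
             for ff, d, si in zip(ffs, d_inputs, si_values)]
    for ff, q in zip(ffs, new_q):
        ff.q = q


ffs = [MuxScanFF() for _ in range(3)]
for bit in (1, 0, 1):                                   # three shift clocks in test mode
    clock_chain(ffs, d_inputs=[0, 0, 0], scan_in=bit, test_mode=True)
print([ff.q for ff in ffs])

clock_chain(ffs, d_inputs=[0, 1, 1], scan_in=0, test_mode=False)   # one normal clock
print([ff.q for ff in ffs])
```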

Two-Port Dual-Clock Scan FF

• Separate the normal clock from the clock used for scanning
– D: normal input data
– CK1: normal clock
– SI: scan input
– CK2: scan clock

[Figure: two-port dual-clock scan FF built from a master latch and a slave latch, with inputs D, CK1, SI, CK2 and outputs Q1, Q2]

Race-Free Scan FF

• Use two-phase clocking
– CK1 and CK2 are two-phase non-overlapping clocks, which ensure race-free operation

[Figure: race-free scan FF built from two latches, with inputs D, SI, SC, CK1, CK2 and outputs Q1, Q2]

LSSD flip-flop (1977 IBM)

• LSSD: Level Sensitive Scan Design
– Less performance degradation than MUX-scan FF
• Clocking
– Normal operation: non-overlapping CK1 = 1, CK3 = 1
– Scan operation: non-overlapping CK2 = 1, CK3 = 1

[Figure: LSSD flip-flop with inputs D, CK1, SD, CK2, CK3 and outputs Q1, Q2]

The idea is to merge the MUX into the FF design, so as to reduce the negative impact of scan on speed.

Symbol of LSSD FF

[Figure: symbol of the LSSD FF. Latch 1 takes data input D and scan input SI with clocks CK1 and CK2 and drives Q1 (the normal level-sensitive latch output); Latch 2 drives the scan output SO]

Scan Rule Violation Example

Rule violation: flip-flops cannot form a shift-register

[Figure: two D flip-flops with inputs D1, D2, outputs Q1, Q2, and a Clock signal, shown first in the rule-violating form and then in a workaround form]

• All FFs are triggered by the same clock edge
• Set and reset signals are not controlled by any internal signals

Some Problems With Full Scan

• Problems
– Area overhead
– Possible performance degradation
– High test application time
– Power dissipation
• Features of Commercial Tools
– Scan-rule violation check (e.g., DFT rule check)
– Scan insertion (convert a FF to its scan version)
– ATPG (both combinational and sequential)
– Scan chain reordering after layout
• Major Commercial Test Tool Companies
– Synopsys
– Mentor Graphics
– SynTest (華騰科技)
– Cadence

Performance Overhead

• The increase of delay along the normal data paths includes:
– Extra gate delay due to the multiplexer
– Extra delay due to the capacitive loading of the scan wiring at each flip-flop’s output
• Timing-driven partial scan
– Try to avoid scan flip-flops that belong to the timing-critical paths
– The flip-flop selection algorithm for partial scan can take this into consideration to reduce the timing impact of scan on the design

Scan-Chain Reordering

– Scan-chain order is often decided at gate level, without knowing the physical locations of the cells
– The scan chain consumes a lot of routing resources, which can be minimized by re-ordering the flip-flops in the chain after the layout is known

[Figure: layout of a scan design with scan cells 1 to 5 between Scan-In and Scan-Out, next to a better scan-chain order for the same cells]
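To illustrate why reordering after layout helps, here is a sketch that compares the wire length of a gate-level order with a greedy nearest-neighbour order. The cell coordinates are invented, Manhattan distance is an assumed cost, and nearest-neighbour is only a simple stand-in heuristic, not the method used by any particular tool.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def chain_length(order, positions, scan_in, scan_out):
    """Total Manhattan wire length of a chain visiting cells in the given order."""
    points = [scan_in] + [positions[c] for c in order] + [scan_out]
    return sum(manhattan(p, q) for p, q in zip(points, points[1:]))


def greedy_reorder(positions, scan_in):
    """Nearest-neighbour ordering starting from the scan-in pin."""
    remaining = dict(positions)
    here, order = scan_in, []
    while remaining:
        cell = min(remaining, key=lambda c: manhattan(remaining[c], here))
        order.append(cell)
        here = remaining.pop(cell)
    return order


# Hypothetical placement of five scan cells along one row.
positions = {1: (0, 0), 2: (4, 0), 3: (1, 0), 4: (3, 0), 5: (2, 0)}
scan_in, scan_out = (-1, 0), (5, 0)

original = [1, 2, 3, 4, 5]
better = greedy_reorder(positions, scan_in)
print(original, chain_length(original, positions, scan_in, scan_out))
print(better, chain_length(better, positions, scan_in, scan_out))
```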

Overhead of Scan Design

– Number of CMOS gates = 2000
– Fraction of flip-flops = 0.478

Scan implementation | Predicted overhead | Actual area overhead | Normalized operating frequency
None                | 0                  | 0                    | 1.0
Hierarchical        | 14.05%             | 16.93%               | 0.87
Optimized           | 14.05%             | 11.9%                | 0.91

Random Access Scan

• Comparison with Scan-Chain
– More flexible: any FF can be accessed in constant time
– Test time could be reduced
– More hardware and routing overhead

[Figure: flip-flops arranged as an addressable array. An X decoder (X address, X-enable) and a Y decoder (Y address, Y-enable) select one FF, whose input MUX chooses between test data (1) and normal data (0)]

Outline

• Introduction
• Ad-Hoc Approaches
• Full Scan
• Partial Scan

Partial Scan

• Basic idea
– Select a subset of flip-flops for scan
– Lower overhead (area and speed)
– Relaxed design rules
• Cycle-breaking technique
– Cheng & Agrawal, IEEE Trans. on Computers, April 1990
– Select scan flip-flops to simplify sequential ATPG
– Overhead is about 25% lower than with full scan
• Timing-driven partial scan
– Jou & Cheng, ICCAD, Nov. 1991
– Allows optimization of area, timing, and testability simultaneously

Full Scan vs. Partial Scan

Scan design       | Full scan                    | Partial scan
                  | Every flip-flop is a scan-FF | NOT every flip-flop is a scan-FF
Test time         | Longer                       | Shorter
Hardware overhead | More                         | Less
Fault coverage    | ~100%                        | Unpredictable
Ease-of-use       | Easier                       | Harder

A Partial-Scan DfT Flow

[Figure: partial-scan flow chart with the blocks: circuit file, flip-flop selection, flip-flop list, test model generator, test model, test generation (stg3), test vectors, circuit modifier, circuit with partial scan]

Directed Graph Of A Synchronous Sequential Circuit

[Figure: a circuit with six flip-flops (1 to 6) between primary inputs and primary outputs, and the directed graph of the circuit, showing cycles of length L=1, L=2, and L=3 and depth D=4]

Partial Scan For Cycle-Free Structure

• Select a minimal set of flip-flops
– To eliminate some or all cycles
• Self-loops of unit length
– Are not broken, to reduce scan overhead
– The number of self-loops in a real design can be quite large
• Limit the length of
– Consecutive self-loop paths
– Long consecutive self-loop paths in large circuits may pose problems to sequential ATPG
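Cycle breaking can be sketched on a flip-flop dependency graph, where an edge u → v means a combinational path leads from FF u to FF v. The graph below is hypothetical, and the greedy rule (scan the most-connected flip-flop on each cycle found) is an illustrative heuristic, not the published Cheng & Agrawal algorithm; it does not guarantee a minimal set.

```python
def find_cycle(graph, removed):
    """Return one cycle of length >= 2 as a list of nodes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in graph}
    stack = []

    def visit(u):
        colour[u] = GREY
        stack.append(u)
        for v in graph[u]:
            if v == u or v in removed:   # keep self-loops, skip scanned FFs
                continue
            if colour[v] == GREY:
                return stack[stack.index(v):]
            if colour[v] == WHITE:
                cycle = visit(v)
                if cycle:
                    return cycle
        stack.pop()
        colour[u] = BLACK
        return None

    for n in graph:
        if n not in removed and colour[n] == WHITE:
            cycle = visit(n)
            if cycle:
                return cycle
    return None


def select_scan_ffs(graph):
    """Greedily pick scan FFs until only self-loops remain."""
    def degree(n):
        return len(graph[n]) + sum(n in graph[m] for m in graph)

    scan = set()
    while True:
        cycle = find_cycle(graph, scan)
        if cycle is None:
            return scan
        scan.add(max(cycle, key=degree))


# Hypothetical six-FF graph: cycles 1-2 and 4-5-6, plus a self-loop on 3.
graph = {1: [2], 2: [1, 4], 3: [3, 4], 4: [5], 5: [6], 6: [4]}
print(select_scan_ffs(graph))
```

With flip-flops 2 and 4 scanned, the remaining graph has only the self-loop on flip-flop 3, which is deliberately left unbroken.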

Test Generation for Partial Scan Circuits

• A separate scan clock is used
• Scan flip-flops are removed
– And their input and output signals are added to the PO/PI lists
• A sequential circuit test generator
– Is used for test generation
• The vector sequences
– Are then converted into scan sequences
– Each vector is preceded by a scan-in sequence to set the states of the scanned flip-flops
– A scan-out sequence is added at the end of applying each vector

Partial Scan Design

[Figure: partial scan design with six flip-flops between PI/PPI and PO/PPO; the scan flip-flops are chained from Scan In to Scan Out]

Scan Flip-Flops: {2, 5}
Non-Scan FFs: {1, 3, 4, 6}

Trade-Off of Area Overhead vs. Test Generation Effort

[Figure: curves of area overhead and test generation complexity (CPU time) plotted across four design points: Non-Scan, Only Self-Loops Remain, Feedback-Free Circuit, Full-Scan]

Summary of Partial-Scan

• Partial Scan
– Allows the trade-off between test generation effort and hardware overhead to be automatically explored
• Breaking Cycles
– Dramatically simplifies the sequential ATPG
• Limiting the Length of Self-Loop Paths
– Is crucial in reducing test generation effort for large circuits
• Performance Degradation
– Can be minimized by using timing analysis data for flip-flop selection
