Boundary-Scan driven Vectorless Testing on Active Components
VLSI Testing Chapter 5 Design For Testability & Scan Test ...syhuang/testing/ch5.DFT.pdf4 Outline...
Transcript of VLSI Testing Chapter 5 Design For Testability & Scan Test ...syhuang/testing/ch5.DFT.pdf4 Outline...
-
1
EE-6250
國立清華大學電機系
超大型積體電路測試VLSI Testing
Chapter 5Design For Testability
& Scan Test
Outline
• Introduction– Why DFT?– What is DFT?
• Ad-Hoc Approaches• Full Scan• Partial Scan
ch5-2
• Partial Scan
-
2
Why DFT ?
• Direct Testing is Way Too Difficult !– Large number of FFs– Embedded memory blocks– Embedded analog blocks
• Design For Testability is inevitable• Like death and tax
ch5-3
• Like death and tax
Design For Testability
• Definition– Design For Testability (DFT) refers to those design Design For Testability (DFT) refers to those design
techniques that make test generation and testing cost-effective
• DFT Methods– Ad-hoc methods– Scan, full and partial– Built-In Self-Test (BIST)
ch5-4
– Boundary scan
• Cost of DFT– Pin count, area, performance, design-time, test-time
-
3
Why DFT Isn’t Universally Used Previously?
– Short-sighted view of management
– Time-to-market pressurep ss
– Life-cycle cost ignored by development management/contractors/buyers
– Area/functionality/performance myths
– Lack of knowledge by design engineers
Testing is someone else’s problem
ch5-5
– Testing is someone else s problem
– Lack of tools to support DFT until recently
We don’t’ have to worry about this management barrier any more Most design teams now have DfT people
Important Factors
• Controllability– Measure the ease of controlling a lineg
• Observability– Measure the ease of observing a line at PO
• Predictability– Measure the ease of predicting output values
• DFT deals with ways of improving
ch5-6
• DFT deals with ways of improving– Controllability– Observability– Predictability
-
4
Outline
• Introduction• Ad-Hoc Approaches
– Test Points– Design Rules
• Full Scan• Partial Scan
ch5-7
• Partial Scan
Ad-Hoc Design For Testability
• Design Guidelines– Avoid redundancy– Avoid asynchronous logic– Avoid clock gating (e.g., ripple counter)– Avoid large fan-in– Consider tester requirements (tri-stating, etc.)
• Disadvantages
ch5-8
– High fault coverage not guaranteed– Manual test generation– Design iterations required
-
5
Some Ad-Hoc DFT Techniques
• Test Points• Initialization• Monostable multivibrators
– One-shot circuit
• Oscillators and clocks• Counters / Shift-Registers
– Add control points to long counters
Delayelement
One-shot
input output
input
ch5-9
• Partition large circuits• Logical redundancy• Break global feedback paths
p
output
On-Line Self-Test & Fault Tolerance By Redundancy
• Information Redundancy– Outputs = (information-bits) + (check-bits)– Information bits are the original normal outputs– Check bits always maintains a specific pre-defined
logical or mathematical relationship with the corresponding information bits
– Any time, if the information-bits and check-bits violatethe pre-defined relationship, then it indicates an error
Hard are Red ndanc
ch5-10
• Hardware Redundancy– Use extra hardware (e.g., duplicate or triplicate the
system) so that the fault within one module will be masked (I.e., the faulty effect never observed at the final output)
-
6
Module Level Redundancy
• Triple Module Redundancy (TMR)– majority voting on three identical modules’
outputs help mask out faults that occur in a outputs help mask out faults that occur in a single module
Module
Majority verdict 0
1
0
ch5-11
Module
Module
voter0
0
Test Point Insertion
• Employ test points to enhance– Controllability– Observability
• CP: Control Points– Primary inputs used to enhance controllability
• OP: Observability Points– Primary outputs used to enhance observability
0OP
(extra PI) (extra PO)
ch5-12
1
Add 0-CP
Add 1-CP
Add OP
OP
(extra PI)
-
7
0/1 Injection Circuitry
• Normal operationWhen CP_enable = 0
• Inject 0j– Set CP_enable = 1 and CP = 0
• Inject 1– Set CP_enable = 1 and CP = 1
C1 MUX0 w
ch5-13
C1 C2MUX1
CP
CP_enableInserted circuit for controlling line w
Single I/O Port for Multiple Test Points
• Constraints of using test points– A large demand on I/O pins
Thi t i t b h t li d b – This constraint can be somewhat relieved by using MUX & DEMUX at the cost of increasing the test time
MUX
OP1OP2OP3 output pin
For OPInput pinFor CP
CP1CP2CP3DEMUX
dispatcher
ch5-14
C1 C2 C3 Cn
OPN
N = 2n
(multiplexing observation points)
CPN
C1 C2 C3 Cn
N = 2n
(demultiplexing control points)
-
8
Sharing Between Test Points & Normal I/O
• Advantage: Even fewer I/O pins for Test Points• Overhead: Extra MUX delay for normal I/O• Overhead: Extra MUX delay for normal I/O
Output pins
n 2-to-1MUX’s n
n
Normal functional
outputs
n 1-to-2DEMUX’s
nInputpins n
NormalFunctional
inputs
ch5-15
pn
n
SELECTPIN
Observationpoints
n
SELECTPIN
n CP’s
Control Point Selection
• Impact– The controllability of the fanout-cone of the added y
point is improved
• Common selections– Control, address, and data buses– Enable / Hold inputs– Enable and read/write inputs to memory
ch5-16
– Clock and set/clear signals of flip-flops– Data select inputs to multiplexers and
demultiplexers
-
9
Example: Use CP to Fix DFT Rule Violation
• DFT rule violations– The set/clear signal of a flip-flop is generated by other
logic, instead of directly controlled by an input pinlogic, instead of directly controlled by an input pin– Gated clock signals
• Violation Fix– Add a control point to the set/clear signal or clock signals
QD QD
ch5-17
logic
clearCK
logic
clearCKViolation
fix
CLEAR
Example: Fixing Gated Clock
• Gated Clocks– Advantage: power dissipation of a logic design can thus
reducedreduced– Drawback: the design’s testability is also reduced
• Testability Fix
QDCK Violation
QD
CK
ch5-18
GatedCKCK_enable
Violationfix
CK_enable
CKCP_enable
MUX
-
10
Example: Fixing Tri-State Bus Contention
• Bus Contention– A stuck-at-fault at the tri-state enable line may cause
b t ti lti l ti d i t dbus contention – multiple active drivers are connected to the bus simultaneously
• Fix– Add CPs to turn off tri-state devices during testing
Enable line stuck-at-1 x
(A Bus Contention Scenario in the presence of a fault)
ch5-19
Enable line active
Enable line stuck at 1 x
0 0
1 1
Unpredictable voltage on bus maycause a fault to go unnoticed
Example: Partitioning Counters
• Consider a 16-bit ripple-counter– Could take up to 216 = 65536 cycles to test– After being partitioned into two 8-bit counters below, it
can be tested with just 2x28 = 512 cycles
Trigger clockFor 2nd 8-bit
counterQ0Q1Q2Q3 8 bit co nters
Q8Q9Q10Q11
startstart
ch5-20
Q7
8-bit countersQ3Q4Q5Q6
MUX
CKCP_enable
8-bit counters Q12Q13Q14Q15
CK_for_Q8
-
11
Observation Point Selection
• Impact– The observability of the fanin-cone (or transitive y (
fanins) of the added OP is improved
• Common choice– Stem lines having a large number of fanouts– Global feedback paths– Redundant signal lines
O t t f l i d i h i i t
ch5-21
– Output of logic devices having many inputs• MUX, XOR trees
– Output from state devices– Address, control and data buses
(常為電路區塊間之介面訊號)
Problems of CP & OP
• Large number of I/O pins– Add MUXes to reduce the number of I/O pins– Serially shift CP values by shift-registers
• Larger test time
X Z
X’ Z’Shift-register R1 Shift-register R2
ch5-22
X Zg
control Observe
-
12
Outline
• Introduction• Ad-Hoc Approaches• Full Scan
– The Concept– Scan Cell Design– Random Access Scan
ch5-23
• Partial Scan
What Is Scan ?
• Objective– To provide controllability and observability at internal
state variables for testingg
• Method– Add test mode control signal(s) to circuit– Connect flip-flops to form shift registers in test mode– Make inputs/outputs of the flip-flops in the shift register
controllable and observable
ch5-24
• Types– Internal scan
• Full scan, Partial scan, Random access
– Boundary scan
-
13
The Scan Concept
CombinationalLogicLogic
FF
FF
Mode Switch(normal or test)
Scan In
ch5-25
FF
FFScan Out
A Logic Design Before Scan Insertion
input output
Combinational Logic
D Q
inputpins
clock
outputpins
D Q D Q
ch5-26
Sequential ATPG is extremely difficult: due to the lack of controllability and observability at flip-flops.
clock
-
14
Example: A 3-stage Counter
input output
Combinational LogicCombinational Logic
q1q
1
D Q
pins
clock
ppins
1
D Q1
D Q
q1 q2q3
g stuck-at-0 q2q3
ch5-27
It takes 8 clock cycles to set the flip-flops to be (1, 1, 1),for detecting the g stuck-at-0 fault
(220 clock cycles for a 20-stage counter !)
A Logic Design After Scan Insertion
input output
Combinational Logicq1
1D Q
putpins
outputpins
1D Q
1D Qscan-input
(SI)
scan-output(SO)
MU
X
MU
X
MU
X
g stuck-at-0 q2q3
q1q2
q3
ch5-28
Scan Chain provides an easy access to flip-flopsPattern Generation is much easier !!
clockscan-enable
Note: Scan Enable (SE), not shown here, controls every MUX.
-
15
Procedure of Applying Test Patterns
• Notation– Test vectors T = < tiI, tiF > i= 1, 2, …
Output Response R = < r O r F > i= 1 2
PI’s PO’sComb.– Output Response R = < riO, riF > i= 1, 2, …
• Test Application– (1) i = 1;
– (2) Scan-in t1F /* scan-in the first state vector for PPI’s */
– (3) Apply tiI /* apply current input vector at PI’s */
– (4) Observe riO /* observe current output response at PO’s */
PPI’s PPO’sportion
ch5-29
(4) Observe ri / observe current output response at PO s /
– (5) Capture PPOs to FFs as riF /* capture the response at PPO’s to FFs */• (Set to ‘Normal Mode’ by raising SE to ‘1’ for one clock cycle)
– (6) Scan-out riF while scanning-in ti+1F /* overlap scan-in and scan-out */
– (7) i = i+1; Goto step (3)
Testing Scan Chain ?
• Common practice– Scan chain is often first tested before testing the core logic
by a so-called flush test - which pumps random vectors in and out of the scan chain
• Procedure (flush test of scan chain)– (1) i = 0;– (2) Scan-in 1st random vector to flip-flops
ch5-30
– (3) Scan-out (i)th random vector while scanning-in (i+1)thvector for flip-flops.
• The (i)th scan-out vector should be identical to (i)th vector scanned in earlier, otherwise scan-chain is mal-functioning
– (4) If necessary i = i+1, goto step (3)
-
16
MUX-Scan Flip-Flop
– Only D-type master-slave flip-flops are used– All flip-flop clocks controlled from primary inputs
• No gated clock allowedg– Clocks must not feed data inputs of flip-flops– Most popularly supported in standard cell libraries
DSC (normal / test)
ch5-31
NormalMaster-Slave
Flip-flopSI (scan input)
CLK
Two-Port Dual-Clock Scan FF
• Separate normal clock from the clock used for scanning– D: normal input data– D: normal input data– CK1: normal clock– SI: scan input– CK2: scan clock
D Q D Q
Q1
Q2
DCK1
masterlatch
ch5-32
Q
CK
D Q
CKSICK2
slavelatch
-
17
Race-Free Scan FF
• Use two-phase clocking– CK1 and CK2 are two-phase non-overlapping
clocks which insure race-free operation
Q1
D
CK1CK2
ch5-33
D Q
CK
D Q
CK
Q2
SISC
CK1CK2
LSSD flip-flop (1977 IBM)
• LSSD: Level Sensitive Scan Design– Less performance degradation than MUX-scan FF
• Clocking– Normal operation: non-overlapping CK1=1 CK3=1– Scan operation: non-overlapping CK2=1 CK3=1
D
CK1
Q2Q1
ch5-34
C
SD
CK2CK3
想辦法將 MUX融入FF設計中,
以降低 Scan 對速度的負面影響
-
18
Symbol of LSSD FF
1D
Latch 1
QD Q1 (normal level-sensitive 1D2D
CK1CK2
QDSICA
D
Latch 2
Q
latch output)
SO
ch5-35
CKB
Scan Rule Violation Example
Q2Q1
DFlipFl
DFlipFl
D1 D2
Flop Flop
Clock Rule violation:Flip-flops cannot form a shift-register
D1 Q2
A workaround
ch5-36
DFlipFlop
DFlipFlop
Clock
D1
D2
Q2
All FFs are triggered by the same clock edgeSet and reset signals are not controlled by any internal signals
-
19
Some Problems With Full Scan
• ProblemsMajor Commercial Test Tool Companies
SynopsysMentor-Graphics
SynTest (華騰科技)– Area overhead– Possible performance degradation– High test application time– Power dissipation
• Features of Commercial Tools
Cadence
ch5-37
– Scan-rule violation check (e.g., DFT rule check)– Scan insertion (convert a FF to its scan version)– ATPG (both combinational and sequential)– Scan chain reordering after layout
Performance Overhead
• The increase of delay along the normal data paths include:p– Extra gate delay due to the multiplexer– Extra delay due to the capacitive loading of the
scan-wiring at each flip-flop’s output
• Timing-driven partial scan– Try to avoid scan flip-flops that belong to the
ch5-38
timing critical paths– The flip-flop selection algorithm for partial scan
can take this into consideration to reduce the timing impact of scan to the design
-
20
Scan-Chain Reordering
– Scan-chain order is often decided at gate-level without knowing the physical locations of the cells
– Scan-chain consumes a lot of routing resources, and could be minimized by re-ordering the flip-flops in the chain after layout is known
Scan-In
S O t S O t
Scan-In1
3
51
2
3
ch5-39
Scan-Out Scan-Out
Layout of a scan design A better scan-chain order
Scan cell2
4
5
4
Overhead of Scan Design
– Number of CMOS gates = 2000– Fraction of flip-flops = 0.478
Scan implementation
Predicted overhead
Actual area overhead
Normalized operating frequency
None 0 0 1.0
ch5-40
Hierarchical 14.05% 16.93% 0.87
Optimized 14.05% 11.9% 0.91
-
21
Random Access Scan
• Comparison with Scan-Chain– More flexible – any FF can be accessed in constant time
Test time could be reduced– Test time could be reduced– More hardware and routing overhead
ecod
er
Y address
QDMUX
Testdata
Normaldata
1
0
FF
FF
ch5-41
X decoder
Y d
X address X-enable
Y-enable
Outline
• Introduction• Ad-Hoc Approaches• Full Scan• Partial Scan
ch5-42
-
22
Partial Scan
• Basic idea– Select a subset of flip-flops for scan– Lower overhead (area and speed)– Relaxed design rules
• Cycle-breaking technique– Cheng & Agrawal, IEEE Trans. On Computers, April 1990– Select scan flip-flops to simplify sequential ATPG– Overhead is about 25% off than full scan
ch5-43
• Timing-driven partial scan– Jou & Cheng, ICCAD, Nov. 1991– Allow optimization of area, timing, and testability
simultaneously
Full Scan vs. Partial Scan
scan design
full scan partial scan
every flip-flop is a scan-FF NOT every flip-flop is a scan-FF
test time longer shorter
ch5-44
hardware overhead
fault coverage
ease-of-use
more
~100%
easier
less
unpredictable
harder
-
23
A Partial-Scan DfT Flow
Circuit file
Flip-flop selection
Flip-flop list Test model generator
Test model
Circuit modifier
Circuit with
ch5-45
Test generation (stg3)
Test vectors
Circuit withPartial scan
Directed Graph Of A Synchronous Sequential Circuit
3 A circuit with six flip-flopsprimaryinputs
1 2 4 5 6
3Graph of the circuit
primaryinputs
inputs
primaryinputs
primaryoutputs
ch5-46
1 2 4 5 6L=3
L=1L=2
Graph of the circuit
Depth D=4
-
24
Partial Scan For Cycle-Free Structure
• Select minimal set of flip-flops– To eliminate some or all cycles– To eliminate some or all cycles
• Self-loops of unit length– Are not broken to reduce scan overhead
– The number of self-loops in real design can be quite large
• Limit the length of
ch5-47
– Consecutive self-loop paths
– Long consecutive self-loop paths in large circuits may pose problems to sequential ATPG
Test Generation for Partial Scan Circuits
• Separate scan clock is used• Scan flip-flops are removed
– And their input and output signals are added to the PO/PI lists
• A sequential circuit test generator– is used for test generation
• The vector sequencesAre then converted into scan sequences
ch5-48
– Are then converted into scan sequences– Each vector is preceded by a scan-in sequence to set
the states of scanned flip-flops– A scan-out sequence is added at the end of applying
each vector
-
25
Partial Scan Design
PI PO
Scan In
PPIPI
PPO
PO
1 2
3
4 5 6
Scan In Scan Out
ch5-49
Scan OutScan Flip-Flops: {2, 5}Non-Scan FFs: {1, 3, 4, 6}
Trade-Off of Area Overhead v.s. Test Generation Effort
CPUTime
Area Overhead
Area overhead
TestGenerationComplexity
ch5-50
Non-Scan Only Self-LoopsRemain
FeedbackFree Circuit
Full-Scan
-
26
Summary of Partial-Scan
• Partial Scan– Allows the trade-off between test generation effort
and hardware overhead to be automatically explored
• Breaking Cycles– Dramatically simplifies the sequential ATPG
• Limiting the Length of Self-Loop Paths– Is crucial in reducing test generation effort for large
circuits
ch5-51
circuits
• Performance Degradation– Can be minimized by using timing analysis data for
flip-flop selection