CMP238: Projeto e Teste de Sistemas VLSI. Marcelo Lubaszewski. Lecture 3 - Test. PPGC - UFRGS, 2005/I.

CMP238: Projeto e Teste de Sistemas VLSI

Marcelo Lubaszewski

Lecture 3 - Test

PPGC - UFRGS

2005/I

Lecture 3 - Fault Simulation and Fault Injection

• Fault simulation

• Fault simulation algorithms
  – Serial
  – Parallel
  – Deductive
  – Concurrent

• Random fault sampling

• Fault injection
  – ASIC
  – FPGAs

Problem and Motivation

• Fault simulation problem. Given:
  – A circuit
  – A sequence of test vectors
  – A fault model

• Determine:
  – Fault coverage: the fraction (or percentage) of modeled faults detected by the test vectors
  – The set of undetected faults

• Motivation:
  – Determine test quality and, in turn, product quality
  – Find undetected fault targets to improve tests

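Fault Simulator in a VLSI Design Process

[Figure: flowchart. The verified design netlist, the verification input stimuli, and the modeled fault list feed the fault simulator, which works on the test vectors and removes tested faults from the list. A "Fault coverage?" decision follows: if coverage is low, the test generator adds vectors; if it is adequate, the test compactor deletes vectors and the flow stops.]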

Fault Simulation Scenario

• Circuit model: mixed-level
  – Mostly logic, with some switch-level modeling for high-impedance (Z) and bidirectional signals
  – High-level models (memory, etc.) with pin faults

• Signal states: logic
  – Two (0, 1) or three (0, 1, X) states for purely Boolean logic circuits
  – Four states (0, 1, X, Z) for sequential MOS circuits

• Timing:
  – Zero-delay for combinational and synchronous circuits
  – Mostly unit-delay for circuits with feedback

Fault Simulation Scenario (continued)

• Faults:
  – Mostly single stuck-at faults
  – Sometimes stuck-open, transition, and path-delay faults; analog circuit fault simulators are not yet in common use
  – Equivalence fault collapsing of single stuck-at faults
  – Fault dropping: a fault, once detected, is dropped from consideration as more vectors are simulated; fault dropping may be suppressed for diagnosis
  – Fault sampling: a random sample of faults is simulated when the circuit is large

Fault Simulation Algorithms

• Serial

• Parallel

• Deductive

• Concurrent

Serial Algorithm

• Algorithm: simulate the fault-free circuit and save the responses. Repeat the following steps for each fault in the fault list:
  – Modify the netlist by injecting one fault
  – Simulate the modified netlist, vector by vector, comparing responses with the saved responses
  – If a response differs, report fault detection and suspend simulation of the remaining vectors

• Advantages:
  – Easy to implement; needs only a simulator and less memory
  – Most faults, including analog faults, can be simulated
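The serial procedure can be sketched in a few lines of Python. This is a minimal illustration, not the simulator used in the course. The circuit (e = a AND c, f = NOT d, g = e OR f, with c and d the fanout branches of input b) is an assumption reconstructed from the worked examples in this lecture, and the names `simulate` and `serial_fault_sim` are hypothetical.

```python
def simulate(a, b, fault=None):
    """Evaluate the assumed example circuit; fault is (line, stuck_value) or None."""
    def drive(line, value):
        # A stuck-at fault overrides whatever value the line would carry.
        return fault[1] if fault and fault[0] == line else value

    a = drive("a", a)
    b = drive("b", b)
    c = drive("c", b)          # fanout branch of b
    d = drive("d", b)          # fanout branch of b
    e = drive("e", a & c)      # AND gate
    f = drive("f", 1 - d)      # inverter
    return drive("g", e | f)   # OR gate, primary output


def serial_fault_sim(vectors, faults):
    """Return {fault: index of the first detecting vector}."""
    good = [simulate(a, b) for a, b in vectors]      # fault-free responses, saved once
    detected = {}
    for fault in faults:                             # one faulty circuit at a time
        for index, (a, b) in enumerate(vectors):
            if simulate(a, b, fault) != good[index]:
                detected[fault] = index
                break                                # fault dropping: skip remaining vectors
    return detected


faults = [(line, value) for line in "abcdefg" for value in (0, 1)]
vectors = [(1, 1)]
detected = serial_fault_sim(vectors, faults)
coverage = len(detected) / len(faults)
```

For the single vector a = 1, b = 1, this sketch reports a s-a-0, c s-a-0, e s-a-0, and g s-a-0 as detected, that is, 4 of the 14 uncollapsed single stuck-at faults.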

Serial Algorithm (Cont.)

• Disadvantage: much repeated computation; CPU time is prohibitive for VLSI circuits

• Alternative: simulate many faults together

[Figure: the test vectors drive the fault-free circuit and the circuits with faults f1, f2, ..., fn. A comparator on each faulty circuit's response answers "f1 detected?", "f2 detected?", ..., "fn detected?".]

Parallel Fault Simulation

• Exploits inherent bit-parallelism of logic operations on computer words

• Storage: one word per line for two-state simulation

• Multi-pass simulation: Each pass simulates w-1 new faults, where w is the machine word length

• Speed up over serial method ~ w-1

• Not suitable for circuits with timing-critical and non-Boolean logic

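Parallel Fault Sim. Example

[Figure: example circuit with lines a, b, c, d, e, f, g, simulated with 3-bit words for one input vector. Fault c s-a-0 is injected in bit 1 (c carries 1 0 1) and fault f s-a-1 in bit 2 (f carries 0 0 1). The output g carries 1 0 1, so c s-a-0 is detected.]

Bit 0: fault-free circuit

Bit 1: circuit with c s-a-0

Bit 2: circuit with f s-a-1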
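The bit assignment above can be reproduced with ordinary integer operations. The sketch below is a hedged illustration: the circuit structure (e = a AND c, f = NOT d, g = e OR f, with c and d branches of b) is inferred from the example, the function name `parallel_fault_sim` is hypothetical, and Python integers stand in for machine words.

```python
def parallel_fault_sim(a_val, b_val, faults):
    """Bit 0 carries the fault-free circuit; fault i (1-based) occupies bit i."""
    width = len(faults) + 1
    ones = (1 << width) - 1

    def replicate(bit):
        # Copy one logic value into every bit position of the word.
        return ones if bit else 0

    def inject(line, word):
        # Force the stuck value in the bit position owned by each fault on this line.
        for i, (fault_line, stuck) in enumerate(faults, start=1):
            if fault_line == line:
                word = (word | (1 << i)) if stuck else (word & ~(1 << i))
        return word & ones

    a = inject("a", replicate(a_val))
    b = inject("b", replicate(b_val))
    c = inject("c", b)
    d = inject("d", b)
    e = inject("e", a & c)
    f = inject("f", ~d)
    g = inject("g", e | f)

    diff = g ^ replicate(g & 1)            # bits that disagree with the fault-free bit
    detected = [fault for i, fault in enumerate(faults, start=1) if (diff >> i) & 1]
    return detected, {"c": c, "e": e, "f": f, "g": g}


detected, words = parallel_fault_sim(1, 1, [("c", 0), ("f", 1)])
```

With a = 1 and b = 1, the word on g has bit 0 = 1, bit 1 = 0, bit 2 = 1, so only c s-a-0 is reported, in agreement with the slide.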

Deductive Fault Simulation

• One-pass simulation

• Each line k contains a list Lk of faults detectable on k

• Following simulation of each vector, fault lists of all gate output lines are updated using set-theoretic rules, signal values, and gate input fault lists

• PO fault lists provide detection data

• Limitations:
  – Set-theoretic rules are difficult to derive for non-Boolean gates
  – Gate delays are difficult to use

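Deductive Fault Simulation Example

[Figure: example circuit with lines a, b, c, d, e, f, g and input vector a = 1, b = 1, annotated with the fault list deduced on each line.]

Notation: $L_k$ is the fault list for line k; $k_n$ is the s-a-n fault on line k.

$L_a = \{a_0\}$

$L_b = \{b_0\}$

$L_c = \{b_0, c_0\}$

$L_d = \{b_0, d_0\}$

$L_f = \{b_0, d_0, f_1\}$

$L_e = L_a \cup L_c \cup \{e_0\} = \{a_0, b_0, c_0, e_0\}$

$L_g = (L_e \cap \overline{L_f}) \cup \{g_0\} = \{a_0, c_0, e_0, g_0\}$

Faults detected by the input vector: $\{a_0, c_0, e_0, g_0\}$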
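The set-theoretic rules used in this example can be written as one small function. The sketch below is an illustration under assumptions: it handles AND/OR-style gates through their controlling value, the circuit (e = a AND c, f = NOT d, g = e OR f) is inferred from the example, and the name `gate_fault_list` is hypothetical.

```python
def gate_fault_list(inputs, controlling, output_fault):
    """Deduce the output fault list of an AND/OR-style gate.

    inputs: list of (value, fault_list) pairs, one per gate input.
    controlling: 0 for AND, 1 for OR.
    output_fault: the stuck-at fault on the output that the vector exposes.
    """
    ctrl = [faults for value, faults in inputs if value == controlling]
    non_ctrl = [faults for value, faults in inputs if value != controlling]
    if not ctrl:
        # No controlling input: a fault flipping any input flips the output.
        propagated = set().union(*non_ctrl)
    else:
        # The output flips only if every controlling input flips
        # and no non-controlling input flips.
        propagated = set.intersection(*ctrl) - set().union(*non_ctrl)
    return propagated | {output_fault}


# Input vector a = 1, b = 1, so c = d = 1, e = 1, f = 0, g = 1.
La = {"a0"}
Lb = {"b0"}
Lc = Lb | {"c0"}                 # fanout branch: stem faults plus the local fault
Ld = Lb | {"d0"}
Le = gate_fault_list([(1, La), (1, Lc)], controlling=0, output_fault="e0")
Lf = Ld | {"f1"}                 # inverter: input list plus the local fault
Lg = gate_fault_list([(1, Le), (0, Lf)], controlling=1, output_fault="g0")
```

Here `Lg` evaluates to the set a0, c0, e0, g0, the same faults the slide lists as detected.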

Concurrent Fault Simulation

• Simulation of the fault-free circuit and only those parts of the faulty circuits that differ in signal states from the fault-free circuit.

• A list per gate containing copies of the gate from all faulty circuits in which this gate differs. List element contains fault ID, gate input and output values and internal states, if any.

• All events of fault-free and all faulty circuits are implicitly simulated.

• Faults can be simulated in any modeling style or detail supported in simulation (offers most flexibility.)

• Faster than other methods, but uses most memory.

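Conc. Fault Sim. Example

[Figure: example circuit with lines a, b, c, d, e, f, g under concurrent fault simulation. Each gate carries a list of faulty-gate copies, each labeled with its fault and its input and output values. The fault labels shown are a0, b0, c0, e0 on one gate; b0, d0, f1 on another; and a0, b0, c0, e0, d0, g0, f1 on the output gate.]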
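The per-gate lists of a concurrent simulator can be illustrated for a single vector. The sketch below is deliberately simplified: it is zero-delay, processes gates once in topological order, and has no event scheduling, whereas a real concurrent simulator updates the lists incrementally as events occur. The circuit structure, the modeling of fanout branches as buffers, and the names `GATES` and `concurrent_sim` are assumptions made for this illustration.

```python
GATES = [  # (output line, function, input lines), in topological order
    ("c", lambda x: x, ("b",)),            # fanout branch modeled as a buffer
    ("d", lambda x: x, ("b",)),            # fanout branch modeled as a buffer
    ("e", lambda x, y: x & y, ("a", "c")),
    ("f", lambda x: 1 - x, ("d",)),
    ("g", lambda x, y: x | y, ("e", "f")),
]


def concurrent_sim(vector, faults):
    """vector: {primary input: value}; faults: list of (line, stuck_value)."""
    good = dict(vector)
    # visible[line][fault] = faulty value, kept only when it differs from the good value
    visible = {line: {} for line in vector}
    for line, value in vector.items():
        for fault in faults:
            if fault[0] == line and fault[1] != value:
                visible[line][fault] = fault[1]

    gate_lists = {}
    for out, func, ins in GATES:
        good_in = tuple(good[i] for i in ins)
        good[out] = func(*good_in)
        # Faults reaching this gate through its inputs, plus faults local to its output.
        candidates = {flt for i in ins for flt in visible[i]}
        candidates |= {flt for flt in faults if flt[0] == out}
        entries = {}
        for fault in candidates:
            bad_in = tuple(visible[i].get(fault, good[i]) for i in ins)
            bad_out = fault[1] if fault[0] == out else func(*bad_in)
            if bad_in != good_in or bad_out != good[out]:
                entries[fault] = (bad_in, bad_out)   # faulty-gate copy that diverges
        gate_lists[out] = entries
        visible[out] = {flt: o for flt, (_, o) in entries.items() if o != good[out]}
    return good, gate_lists, visible


all_faults = [(line, value) for line in "abcdefg" for value in (0, 1)]
good, gate_lists, visible = concurrent_sim({"a": 1, "b": 1}, all_faults)
```

For a = 1, b = 1, the list attached to gate g holds copies for a0, b0, c0, d0, e0, f1, and g0, while only a0, c0, e0, and g0 change the value of g and are therefore detected.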

Conc. Fault Sim. Example

[Figure: the same circuit and fault lists (a0, b0, c0, e0, d0, g0, f1), with the signal values that change from 1 to 0 marked on the affected gate copies.]

Conc. Fault Sim. Example

[Figure: the same circuit after the change, with updated lists of faulty-gate copies. The fault labels shown are a1, b0, c0, e1, d0, g1, and f1.]

Fault Sampling

• A randomly selected subset (sample) of faults is simulated.

• Measured coverage in the sample is used to estimate fault coverage in the entire circuit.

• Advantage: Saving in computing resources (CPU time and memory.)

• Disadvantage: Limited data on undetected faults.

Motivation for Sampling

• Complexity of fault simulation depends on:
  – Number of gates
  – Number of faults
  – Number of vectors

• Complexity of fault simulation with fault sampling depends on:
  – Number of gates
  – Number of vectors

Random Sampling Model

[Figure: random picking of a sample from the population of all faults, which has a fixed but unknown coverage; each fault is either detected or undetected.]

• Np = total number of faults (population size)

• C = fault coverage (unknown)

• Ns = sample size, with Ns << Np

• c = sample coverage (a random variable)
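The sampling model can be tried out numerically. The sketch below is only an illustration: the population of 10,000 faults with C = 0.871 is hypothetical, a list of booleans stands in for the outcome of fault simulation, and the name `sample_coverage` is an assumption.

```python
import random


def sample_coverage(detected_flags, sample_size, seed=0):
    """Pick Ns faults at random (without replacement) and return the sample coverage c."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    sample = rng.sample(detected_flags, sample_size)
    return sum(sample) / sample_size


# Hypothetical population: Np = 10,000 faults, of which 8,710 are detected (C = 0.871).
population = [True] * 8710 + [False] * 1290
c = sample_coverage(population, sample_size=500)
```

Changing `seed` gives a different value of c each time, which is the sense in which the sample coverage is a random variable scattered around the fixed coverage C.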

Example

A circuit with 39,096 faults has an actual fault coverage of 87.1%. The measured coverage in a random sample of 1,000 faults is 88.7%. The sampling-error formula gives an estimate of 88.7% ± 3%. CPU time for sample simulation was about 10% of that for all faults.
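The formula this example refers to did not survive in the transcript. A commonly used 3-sigma estimate for fault sampling (given, for instance, in Bushnell and Agrawal's testing textbook) is C = x ± (4.5/Ns) · sqrt(1 + 0.44 · Ns · x · (1 − x)), where x is the measured sample coverage. Assuming that this is the formula meant, the sketch below reproduces the ±3% figure; the function name is hypothetical.

```python
import math


def three_sigma_half_width(sample_coverage, sample_size):
    """Half-width of the 3-sigma coverage estimate for a random fault sample."""
    x, ns = sample_coverage, sample_size
    return (4.5 / ns) * math.sqrt(1.0 + 0.44 * ns * x * (1.0 - x))


half_width = three_sigma_half_width(0.887, 1000)   # about 0.03
low, high = 0.887 - half_width, 0.887 + half_width
```

The resulting interval, roughly 85.7% to 91.7%, contains the actual coverage of 87.1%.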

Fault Injection

• ASIC (general)
  – Performed by simulation, or
  – By emulation to speed up the process

• FPGAs
  – Directly into the bitstream

Fault Injection

ASIC (general)

(Performed by simulation, or by emulation to speed up the process)

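• When should a fault be inserted?

• Where should a fault be inserted?

[Figure: fault injection block connected to the application core. A pseudo-random number generator, shown as an 8-bit pseudo-random register with positions [0], [4], [5], and [7] marked, selects the time, location, and bit of each fault. The fault injection control block has reset and clock inputs, and the signals labeled in the figure are the fault injection enable signals (Enable_registers, Enable_memory), the fault mask, the memory location, and Reset_core.]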
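An 8-bit pseudo-random register of this kind can be modeled as a linear feedback shift register. The sketch below assumes that the marked positions [0], [4], [5], and [7] are the feedback taps of a right-shifting Fibonacci LFSR; the transcript does not say how they are wired, so this is an assumption. The way a bit position is derived from the register value is also purely illustrative, and the names are hypothetical.

```python
def lfsr8_step(state, taps=(0, 4, 5, 7)):
    """Advance an 8-bit Fibonacci LFSR by one clock (assumed tap positions)."""
    feedback = 0
    for tap in taps:
        feedback ^= (state >> tap) & 1            # XOR of the tapped bits
    return ((state >> 1) | (feedback << 7)) & 0xFF


def pseudo_random_states(seed, count):
    """Return the next `count` register values; the all-zero seed would lock up."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    states = []
    for _ in range(count):
        state = lfsr8_step(state)
        states.append(state)
    return states


states = pseudo_random_states(seed=1, count=300)
# Hypothetical use: the three low bits select which bit of an 8-bit word to flip.
fault_masks = [1 << (value & 0x07) for value in states]
```

Because bit 0, the bit shifted out, is among the taps, each step is reversible, so a non-zero seed can never collapse to the all-zero state.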

Device Under Test Core Registers

[Figure: two register schematics with the signals CLK, RST, EN, FI_EN, and mask; the second register is placed between Hamming Code and Hamming Decode blocks.]

• Special paths are required in the high-level description to ensure full controllability of fault insertion.

• No major performance penalties are observed.

Device Under Test Core Memory

[Figure: two dual-port memory schematics with port A signals (WEA, ENA, RSTA, CLKA, ADDRA, DIA, DOA) and port B signals (WEB, ENB, RSTB, CLKB, ADDRB, DIB, DOB), together with the fault injection signals FI_memory location, mask, and FI_enable_memory; the second memory is placed between Hamming Code and Hamming Decode blocks.]

• Dual-port memories allow high flexibility with little extra logic.

• Good fault model approximation: faults can be injected into the memory at any time (in all micro-controller states) without perturbing normal operation.
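In software terms, injecting a fault through a mask amounts to flipping the selected bits of one stored word. The sketch below assumes that the mask is combined with the stored value by XOR, which the slide does not state explicitly; the function and parameter names are hypothetical stand-ins for FI_memory location, mask, and FI_enable_memory.

```python
def inject_memory_fault(memory, location, mask, enable=True):
    """Return a copy of `memory` with the bits selected by `mask` flipped at `location`."""
    faulty = bytearray(memory)                 # leave the original contents untouched
    if enable:
        faulty[location] ^= mask & 0xFF        # XOR with the mask flips the selected bits
    return faulty


memory = bytearray(256)                        # 256-byte application memory
memory[0x0A] = 0b00000110
faulty = inject_memory_fault(memory, location=0x0A, mask=0b00000100)
```

A mask with a single bit set models a single bit flip; applying the same mask a second time restores the original value.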

Monitor Block

• Observability of an error in response to a fault injected in the DUT.

• What is the best method to analyze the results?

• Assumption: application results are stored in the internal memory, implemented using BlockRAM in the Virtex FPGA.

• Two main tasks are performed by the monitor block:
  – Clear all the internal memory in order to avoid accumulation of faults;
  – Read all the internal memory to determine whether there is an error.

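Fault Injection Platform

[Figure: platform implemented in a Virtex FPGA. A fault injection block, with reset and clock inputs and a Reset_core output, accompanies the fault injection soft-core, which runs next to a “golden” soft-core. Each core has a monitor block with data and status signals. The resulting error and loss-of-sequence flags are observed with the Chipscope Analyzer on a computer connected through JTAG.]

• A “golden” chip approach was used to observe the faults in the system.

• Comparing:
  – Data memory contents (error)
  – Application_end signal at the end of the run (loss of sequence)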
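The comparison against the golden core can be expressed as a small classification function. This is a sketch under assumptions: the slide names the two observed quantities (data memory and the Application_end signal) but not the exact decision logic, so the ordering of the checks and the names used here are hypothetical.

```python
def classify_run(golden_memory, dut_memory, golden_end, dut_end):
    """Classify one fault injection run by comparing the DUT with the golden core."""
    if golden_end and not dut_end:
        return "loss of sequence"      # the faulty core never signaled Application_end
    if bytes(golden_memory) != bytes(dut_memory):
        return "error"                 # the application finished with wrong data
    return "no effect"


golden = bytes([1, 2, 3, 4])
outcomes = [
    classify_run(golden, bytes([1, 2, 3, 4]), True, True),
    classify_run(golden, bytes([1, 2, 7, 4]), True, True),
    classify_run(golden, bytes([1, 2, 3, 4]), True, False),
]
```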

[Figure: timing diagram of the signals Reset, Reset_MEM, Read_MEM, Clear_MEM, Reset_DUT, the fault injection signals, Application_end, Lost of sequence, and Error. The run goes through the phases Clear Memory, Application, and Read Memory, and the time range for fault injection is marked.]

Case Study: 8051 Fault Injection Experiment

• Registers: datapath, control block, state machine (16 registers)

• Memory: 256 bytes of application memory (RAM)

• 8051 application: 6x6 matrix multiplication
  – Multiplication is performed by successive additions and register left shifts.

• Depending on the injection time:
  – A fault in a memory location below 52h can generate an error or a loss of sequence.

[Figure: map of the application memory with regions labeled variables, Matrix 1, Matrix 2, and Matrix Result, and address labels 0h, 0Ah, 0Bh, 1Fh, 2Fh, 52h, 53h, 76h, and 100h, shown next to a binary dump of the memory contents.]

Case Study: 8051 Fault Injection Experiment: Virtex Platform

• The fault injection system and the device under test (DUT) were implemented on a Virtex FPGA platform.

• Thousands of faults were injected in a few seconds.

• Emulation in the FPGA showed an acceleration of about 120,000 times.

• Component: Virtex XCV300 - 240p

• The 8051-like micro-controller runs at 10 MHz.

• Fault injection with the standard 8051: 28% of LUTs

• Fault injection with the hardened 8051: 30% of LUTs

Case Study: 8051 Fault Injection Experiment

Chipscope Analyzer

• Results of thousands of injected faults were analyzed in the computer by the Chipscope Analyzer software (ISE).

Fault Injection

FPGAs

(Directly into the bitstream)

Introduction

Digital designs synthesized in SRAM-based FPGAs are sensitive to upsets in the FPGA customization cells.

[Figure: a design (blocks E1E2, E1E3, E2E3, clocked by clk) is synthesized into customization bits (…10101011110001101010…) held in memory cells (M). An upset, that is, a bit flip (…10101011100001101010…), produces a fault in the implemented design.]

Motivation

• A way to test the reliability of a (fault-tolerant) design is to perform fault injection, simulating the effects of SEUs, and see how the design responds.

• The estimated time for injecting a one-bit fault into an FPGA bitstream is around 1 to 3 seconds.

• The main steps of the fault injection are:
  – Read the bit file
  – Flip one bit
  – Write the new bit file with the correct CRC
  – Download it to the board through the SelectMAP configuration mode interface
  – Check the output signal
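The "flip one bit" step can be sketched on an in-memory copy of the bit file. This is only an illustration of that single step: recomputing the CRC, handling the file header and frame structure, and downloading through SelectMAP are not covered, and the MSB-first bit numbering and the name `flip_bit` are assumptions.

```python
def flip_bit(bitstream, bit_index):
    """Return a copy of `bitstream` (bytes) with one bit inverted.

    Bits are numbered MSB-first within each byte (an assumption).
    """
    data = bytearray(bitstream)
    byte_index, offset = divmod(bit_index, 8)
    data[byte_index] ^= 0x80 >> offset
    return bytes(data)


original = bytes([0b10101011, 0b11000110])
upset = flip_bit(original, 9)          # second byte becomes 0b10000110
```

Flipping the same bit again restores the original contents, which is convenient when one fault is injected per run.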

Fault Injection Cost Analysis: XCV300 Analysis

• CLB rows x CLB columns: 32 x 48, a total of 1,536 CLBs
  – 1,327,104 CLB bits: 31 days!

• Total of 470 CLBs used
  – 406,080 CLB bits: 9.5 days!
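These figures can be checked with a short calculation. The sketch below assumes 2 seconds per injected bit, the midpoint of the 1 to 3 second range quoted in this lecture; that rate and the names used are assumptions, not values stated on this slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def campaign_days(num_bits, seconds_per_injection=2.0):
    """Duration of an exhaustive single-bit injection campaign, in days."""
    return num_bits * seconds_per_injection / SECONDS_PER_DAY


total_clbs = 32 * 48                          # 1,536 CLBs
bits_per_clb = 1_327_104 // total_clbs        # 864 bits per CLB
used_bits = 470 * bits_per_clb                # 406,080 bits in the CLBs used
all_bits_days = campaign_days(1_327_104)      # about 30.7 days
used_bits_days = campaign_days(used_bits)     # about 9.4 days
```

Under this assumed rate the estimates come out close to the 31 and 9.5 days quoted on the slide.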

CLB View in the FPGA Editor

Which bit in the bitstream corresponds to this connection?

Which bit in the bitstream corresponds to the logic?

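Virtex Bitstream

[Figure: layout of the Virtex bitstream, organized by frame and by bit, with columns numbered 0, 1, 2, 3, ..., 47, frame labels 0, ..., 15, 16, 17, ..., regions for CLB, BRAM, and IOB, and a span of 586 bits marked. One bit is singled out with the question "What does this bit do?"]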

CLB View in the FPGA Editor

Which bit in the bitstream corresponds to this connection? Bit: ?, frame: ?

Which bit in the bitstream corresponds to the logic? Bit: ?, frame: ?

CLBclass Tool: CLB Map

SEU Effects on FPGAs I

Bit flip in a single line (bit: 03, frame: 29)

Fault effect:
  – Open circuit

SEU Effects on FPGAs II

Bit flip in a hex line mux (bit: 01, frame: 32)

Fault effect:
  – Open circuit
  – Short circuit

Fault Injection Tool Used at Xilinx

[Figure: the tool loads bitstreams into the DUT (XCV300 240p; bitstream of 1,663,200 bits) and observes an error flag. It produces the list of bits that cause an error, each identified by major address, frame, byte in the frame, and bit (the bit location in the bitstream).]

• 3 modes

• Read: … 011001010101000001100001…

• Step 1: load … 011000010101000001100001…

• Step 2: load … 0110011010101000001100001…

• Step 3: reset flip-flops … 011001100101000001100001…

Fault Injection Tool (under development at UFRGS)

Fault Injection Time Reduction

Bits to be tested of the 1,536 CLBs:

Bits tested      Number of bits   Estimated time   Time saved (%)
All bits         1,327,104        31 days
LUTs             98,304           1.1 days         96.45
GRM              638,976          7.4 days         76.13
Single matrix    466,944          5.4 days         82.58
Hex matrix       172,032          2 days           93.55

Responsible for the majority of the errors due to SEUs: open circuits combined with short circuits.

One can choose where to inject the faults! This can reduce the fault injection time.
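The time-saved column follows from the estimated times relative to the 31-day exhaustive campaign. The short sketch below recomputes it from the table values; the function and variable names are hypothetical.

```python
def time_saved_percent(days, baseline_days=31.0):
    """Percentage of campaign time saved relative to testing all CLB bits."""
    return round(100.0 * (1.0 - days / baseline_days), 2)


estimated_days = {
    "LUTs": 1.1,
    "GRM": 7.4,
    "Single matrix": 5.4,
    "Hex matrix": 2.0,
}
saved = {name: time_saved_percent(days) for name, days in estimated_days.items()}
```

The bit counts are also consistent with each other: the single matrix and hex matrix bits (466,944 + 172,032) add up to the 638,976 GRM bits.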