Using Memory to Cope with Simultaneous Transient Faults

Using Memory to Cope with Simultaneous Transient Faults Authors: Universidade Federal do Rio Grande do Sul Programa de Pós-Graduação em Engenharia Elétrica Eduardo L. Rhod ([email protected]) Carlos A. L. Lisbôa ([email protected]) Luigi Carro ([email protected])


Page 1: Using Memory to Cope with Simultaneous Transient Faults

Using Memory to Cope with Simultaneous Transient Faults

Authors:

Universidade Federal do Rio Grande do Sul

Programa de Pós-Graduação em Engenharia Elétrica

Eduardo L. Rhod ([email protected])

Carlos A. L. Lisbôa ([email protected])

Luigi Carro ([email protected])

Page 2: Using Memory to Cope with Simultaneous Transient Faults

The Problem

• Due to technology scaling, future (and current) technologies will be heavily affected by electromagnetic noise, which causes errors induced by SEUs (single event upsets) and SETs (single event transients);

• The occurrence of multiple simultaneous SEUs and SETs, which was not a problem in the past, must now be considered;

• We must guarantee robustness at the lowest cost;

• Some usual protection techniques, such as TMR and N-MR, might not work properly;
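To illustrate why simultaneous faults threaten TMR, here is a minimal Python sketch of a bitwise 2-of-3 majority voter. The data word and the fault positions are hypothetical values chosen for illustration, and the voter itself is assumed fault free, which a real circuit cannot guarantee.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant copies."""
    return (a & b) | (a & c) | (b & c)

correct = 0b10110110   # hypothetical fault-free module output
flip = 0b00000100      # hypothetical transient fault: bit 2 flipped

# One transient fault: only one copy is corrupted, so the voter masks it.
single = majority(correct ^ flip, correct, correct)

# Two simultaneous faults on the same bit of two copies: the corrupted
# copies now form the majority and the wrong value is voted out.
double = majority(correct ^ flip, correct ^ flip, correct)

print(single == correct)  # True
print(double == correct)  # False
```

The sketch covers only faults in the replicated modules; a fault inside the voter would propagate regardless of how many copies agree.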

Page 3: Using Memory to Cope with Simultaneous Transient Faults


Motivations

• Memory comes with intrinsic protection against manufacturing errors (spare columns and spare rows);

• There are protection techniques with low area and latency overhead, such as Reed-Solomon codes, that can be applied;

Page 4: Using Memory to Cope with Simultaneous Transient Faults

Our Proposal

• Use Reed-Solomon protected memory to replace combinational circuits;

• Reduce the area sensitive to faults;

• Reduce the SER (soft error rate) of the circuit;

Page 5: Using Memory to Cope with Simultaneous Transient Faults


Outline

• Case Studies;

• Results;

• Conclusions;

• Future Work.

Page 6: Using Memory to Cope with Simultaneous Transient Faults

Replacing Combinational Circuit by Memory (ROM)

• Example: 4x4 bit multiplier

Fully combinational:

Total area = 304 transistors

Fully memory:

[Figure: memory addressed by Input A (4 bits) and Input B (4 bits), producing an 8-bit result]

8 inputs and 8 outputs

2^8 x 8 = 2,048 bits

Total area = 2,048 transistors, considering 1 transistor per bit

EXPENSIVE
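As a rough illustration of the fully memory approach, the following Python sketch models the multiplier as a lookup table. The address layout (A in the upper 4 bits, B in the lower 4 bits) and the names `ROM` and `rom_multiply` are assumptions made for this example; the slides do not specify them.

```python
# Hypothetical ROM model of a 4x4 bit multiplier.
# Address = {A, B} (8 bits), stored word = A * B (fits in 8 bits: 15 * 15 = 225).
ROM = [a * b for a in range(16) for b in range(16)]

def rom_multiply(a: int, b: int) -> int:
    """Multiply two 4-bit values with a single memory read."""
    return ROM[(a << 4) | b]

total_bits = len(ROM) * 8  # 256 words of 8 bits each

print(rom_multiply(13, 11))
print(total_bits)
```

Every product is precomputed, so no arithmetic logic remains; the price is the 2,048-bit table.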

Page 7: Using Memory to Cope with Simultaneous Transient Faults

Replacing Combinational Circuit by Memory (ROM)

• Example: 4x4 bit multiplier

Fully combinational:

Total area = 304 transistors

Let's replace just one part of the circuit: 1 column

[Figure: 512-bit memory with 7 inputs and 4 outputs]

7 inputs and 4 outputs

2^7 x 4 = 512 bits

Area cost = 512 transistors

Latency = 7 cycles
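The memory sizes quoted on these slides follow from the usual ROM sizing rule: one word per input combination, one bit per output. A small Python sketch of that arithmetic (the helper name `rom_bits` is hypothetical):

```python
def rom_bits(n_inputs: int, n_outputs: int) -> int:
    """Bits needed for a ROM with one word per input combination."""
    return (2 ** n_inputs) * n_outputs

full = rom_bits(8, 8)    # whole 4x4 multiplier in memory
column = rom_bits(7, 4)  # a single column in memory

print(full, column)
```

Because the size grows exponentially with the number of address bits, removing one input and halving the output width cuts the table to a quarter of its size, at the cost of using the memory over several cycles.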

Page 8: Using Memory to Cope with Simultaneous Transient Faults

Case Studies

4x4 bit multiplier

Two memory-based solutions were proposed:

• Column multiplier;

• Line multiplier;

These two solutions were compared with the TMR and N-MR techniques.

Page 9: Using Memory to Cope with Simultaneous Transient Faults

Case Studies

8-bit FIR filter with 4 taps

Memory-based solution compared with the combinational one

Page 10: Using Memory to Cope with Simultaneous Transient Faults

Case Studies

4x4 bit multiplier - Column Solution

[Figure: column solution, with regions labeled "Protected by RS code" and "Sensitive to faults"]

Page 11: Using Memory to Cope with Simultaneous Transient Faults

Case Studies

4x4 bit multiplier - Line Solution

[Figure: line solution, with regions labeled "Sensitive to faults" and "Protected by RS code"]

Page 12: Using Memory to Cope with Simultaneous Transient Faults

Case Studies

8-bit FIR filter with 4 taps

[Figure: memory with coefficients; Inputs 1 to 4, 8 bits each; result; one bus labeled 10]

• Just using memory:

Memory size: 2^(4*8) x 18 = 77 Gb

• Memory + combinational solution:

Memory size: 2^4 x 10 = 160 bits

Latency = 8 cycles

[Figure labels: "Sensitive to faults" and "Protected by RS code"]
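The slide does not spell out how the 16-word memory is used. One scheme consistent with the 2^4 x 10 size and the 8-cycle latency is a bit-serial evaluation in the style of distributed arithmetic: each cycle, one bit from each of the four inputs addresses a table of precomputed coefficient sums. The Python sketch below works under that assumption; the coefficient values and all names are hypothetical, and inputs and coefficients are treated as unsigned.

```python
# Memory sizes quoted on the slide.
full_bits = 2 ** (4 * 8) * 18   # four 8-bit inputs as address, 18-bit result
small_bits = 2 ** 4 * 10        # one bit per input as address, 10-bit word

COEFFS = [12, 200, 200, 12]     # hypothetical unsigned 8-bit tap coefficients

# Entry k holds the sum of the coefficients whose tap bit is set in k.
# With 8-bit coefficients the largest entry is 4 * 255 = 1020 < 2**10.
LUT = [sum(c for i, c in enumerate(COEFFS) if (k >> i) & 1) for k in range(16)]

def fir_bit_serial(samples):
    """Weighted sum of four unsigned 8-bit samples, one LUT read per bit."""
    acc = 0
    for bit in range(8):                      # 8 cycles, LSB first
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << i     # gather bit `bit` of each input
        acc += LUT[addr] << bit               # shift-and-accumulate
    return acc

print(full_bits)                 # about 77e9 bits
print(small_bits)
print(fir_bit_serial([1, 2, 3, 4]))
```

In this reading only the small table needs Reed-Solomon protection, while the shift-and-accumulate logic stays combinational and remains sensitive to faults.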

Page 13: Using Memory to Cope with Simultaneous Transient Faults

Fault Injection Process

Fault injection steps:

• Run the circuit fault free with the 1st input;

• Run the circuit with "single event level 0" at the 1st gate;

• Compare the fault-free and the "single event level 0" results to detect whether the fault has propagated;

• Run the circuit with "single event level 1" at the 1st gate;

• Compare the fault-free and the "single event level 1" results to detect whether the fault has propagated;

• Repeat the process for all gates;

• Repeat the process for all inputs;

• Repeat the process for double faults;
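The steps above can be sketched as an exhaustive loop over inputs, fault sites, and forced levels. The Python example below is an illustration only: the netlist is a hypothetical full adder rather than one of the circuits from the slides, and "single event level 0/1" is interpreted as forcing a gate output to 0 or 1 for one evaluation.

```python
from itertools import combinations, product

# Hypothetical gate-level netlist (a full adder), listed in evaluation order:
# (gate output name, gate function, input net names).
NETLIST = [
    ("x1",   lambda a, b: a ^ b, ("a", "b")),
    ("sum",  lambda a, b: a ^ b, ("x1", "cin")),
    ("a1",   lambda a, b: a & b, ("a", "b")),
    ("a2",   lambda a, b: a & b, ("x1", "cin")),
    ("cout", lambda a, b: a | b, ("a1", "a2")),
]
INPUTS = ("a", "b", "cin")
OUTPUTS = ("sum", "cout")

def simulate(inputs, forced=None):
    """Evaluate the netlist; `forced` maps gate names to an injected level."""
    forced = forced or {}
    nets = dict(inputs)
    for name, fn, srcs in NETLIST:
        value = fn(*(nets[s] for s in srcs))
        nets[name] = forced.get(name, value)  # injected level overrides the gate
    return tuple(nets[o] for o in OUTPUTS)

def fault_rate(n_faults):
    """Percentage of injections whose effect reaches a primary output."""
    gates = [name for name, _, _ in NETLIST]
    injected = propagated = 0
    for bits in product((0, 1), repeat=len(INPUTS)):       # all inputs
        inputs = dict(zip(INPUTS, bits))
        golden = simulate(inputs)                          # fault-free run
        for sites in combinations(gates, n_faults):        # all gate sets
            for levels in product((0, 1), repeat=n_faults):  # level 0 / 1
                injected += 1
                if simulate(inputs, dict(zip(sites, levels))) != golden:
                    propagated += 1
    return 100.0 * propagated / injected

print(f"single faults: {fault_rate(1):.2f} %")
print(f"double faults: {fault_rate(2):.2f} %")
```

Injections that force a gate to the value it already had are counted but never propagate, which mirrors the compare step in the list above.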

Page 14: Using Memory to Cope with Simultaneous Transient Faults

Results

4x4 Bit Multiplier Fault Rate Results for SINGLE Fault Injection

Circuit         Total area (transistors)   # of gates that fail   Latency (ns)   Fault rate (%)   Proportional fault rate (%)
5-MR            2128                       532                    18.5           8.80             8.80
TMR             1072                       262                    18.2           5.49             2.77
Combinational   304                        76                     17.5           49.02            7.00
Column          2004                       33                     120            46.82            2.90
Line            4252                       9                      66             70.23            1.19

Slide callouts: 3 x; 7 x; 2 x more area; the voter is too big
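The slides do not define "proportional fault rate". One reading that reproduces the column for single faults is to scale the raw fault rate by the fault-sensitive area relative to the largest one (the 5-MR area). The Python sketch below checks that reading; the formula, the choice of reference, and the figure of 4 transistors per gate (suggested by 304 / 76 and 2128 / 532) are all assumptions, not statements from the authors.

```python
REFERENCE_AREA = 2128  # 5-MR total area, assumed to be the normalization base

# name: (raw fault rate in %, assumed fault-sensitive area in transistors)
# Redundant and combinational designs: the whole area is taken as sensitive.
# Memory-based designs: only the unprotected gates, at 4 transistors per gate.
circuits = {
    "5-MR":          (8.80, 2128),
    "TMR":           (5.49, 1072),
    "Combinational": (49.02, 304),
    "Column":        (46.82, 33 * 4),
    "Line":          (70.23, 9 * 4),
}

def proportional(rate, sensitive_area):
    """Raw fault rate weighted by the share of area exposed to faults."""
    return rate * sensitive_area / REFERENCE_AREA

for name, (rate, area) in circuits.items():
    print(f"{name:14s} {proportional(rate, area):5.2f}")
```

Under this reading the memory-based circuits score well because most of their area sits in the protected memory, even though a fault in the remaining gates is more likely to reach the output.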

Page 15: Using Memory to Cope with Simultaneous Transient Faults

Results

4x4 Bit Multiplier Fault Rate Results for DOUBLE Fault Injection

Circuit         Total area (transistors)   # of gates that fail   Latency (ns)   Proportional fault rate (%)
5-MR            2128                       532                    18.5           20.50
TMR             1072                       262                    18.2           8.19
Combinational   304                        76                     17.5           8.95
Column          2004                       33                     120            4.19
Line            4252                       9                      66             1.53

Slide callouts: 5 x; 13 x; 2 x more area; 2 x; 5 x; the voter is too big; 4 x more area

Page 16: Using Memory to Cope with Simultaneous Transient Faults

Results

FIR Filter Fault Rate Results for SINGLE Fault Injection

Circuit         Total area (transistors)   # of gates that fail   Latency (ns)   Proportional fault rate (%)
Combinational   6524                       1631                   69             48.21
Memory          1832                       50                     56.8           2.58

Slide callouts: 3.5 x less area; 18 x

FIR Filter Fault Rate Results for DOUBLE Fault Injection

Circuit         Total area (transistors)   # of gates that fail   Latency (ns)   Proportional fault rate (%)
Combinational   6524                       1631                   69             67.35
Memory          1832                       50                     56.8           2.96

Slide callouts: 3.5 x less area; 22.5 x

Page 17: Using Memory to Cope with Simultaneous Transient Faults

Conclusions

• This work showed that replacing combinational circuits with memory-based circuits can improve circuit reliability against single and double faults, with some penalties in area and computation time;

• The presented technique permits different memory-based solutions with different costs and gains;

• Results showed that the 5-MR technique may not work as expected.

Page 18: Using Memory to Cope with Simultaneous Transient Faults


Future Work

• Implement this technique using magnetic memory (no area overhead);

• Test the presented approach with different case studies;

• Develop a tool that chooses, among the different memory-based solutions, the one that best fits each application;

• Implement this technique to develop a memory based processor.

Page 19: Using Memory to Cope with Simultaneous Transient Faults


Thank You !!!

Questions ???

e-mails: Eduardo L. Rhod ([email protected])

Carlos A. L. Lisbôa ([email protected])

Luigi Carro ([email protected])

Page 20: Using Memory to Cope with Simultaneous Transient Faults

Fault Injection Process

Tools:

4x4 bit multiplier

• Caco-ps (Cycle Accurate Configurable Power Simulator): combinational; column; line;

• Synthesized solutions* (for more than 100 gates failing): TMR; 5-MR;

FIR filter: combinational; memory based;

*Using Altera FPGA EP20K200EFC484-2X.