SBCCI’08 - 1-4 September - Gramado, Brazil :: These slides are available at http://www.slideshare.net/josemmf (1)
A comparative analysis of fault injection methods via enhanced on-chip debug infrastructures
J. M. Martins Ferreira [ [email protected] ]
FEUP / DEEC, Rua Dr. Roberto Frias, 4200-465 Porto - PORTUGAL
André Fidalgo, Gustavo R. Alves, Manuel Gericota [ anf/gca/mgg @isep.ipp.pt ]
ISEP / DEE, Rua Ant. Bernardino Almeida, 431, 4200-072 Porto - PORTUGAL
SBCCI’08: Gramado, Brazil, 1-4 September 2008
Outline of the presentation
• Introduction and motivation
• Setup, workbench, workflow
• Experimental results
  – Basic, extended and OCD-FI
  – OCD-FI extensions (EDAC, RTREG)
• Comparison and discussion
• Conclusion
Scope, focus, setup
• Scope: usage of on-chip debug (OCD) resources for validating fault tolerance / fault injection
• Focus: comparative analysis of experimental results for various OCD configurations and debugging scenarios
• Setup:
  a) 32-bit Freescale MPC565, iSystem IC3000 (iTracePro), winIDEA 2005
  b) OCD enhancements in VHDL
Motivation
• OCD offers controllability and observability features that may be used to inject faults and observe their effect (R/W access to registers and memory)
• Its usefulness for fault tolerance validation may be limited by bandwidth, coverage, and the repeatability / representativeness of results
• Mitigation is possible by enhancing OCD
Our approach
• Configurations: basic (MDI:MDO = 2:8), extended (MDI:MDO = 8:8), OCD-FI (with a fault injection module)
• Fault injection scenarios: off-line or real-time, predefined or on-the-fly
• OCD-FI is able to cope with error detection / correction and real-time requirements
• Comparison of results uses a common set of workload applications and FI campaigns
NEXUS FI for the MPC565
[Figure: block diagram with the labels: Host machine (Pentium PC), Debugger (fault injector), Data link, Campaign data, Trace data, IC3000, NEXUS, MPC565 (CPU, OCD)]
Trace data: Program trace data output by the OCD
Campaign data: scripts that describe the FI experiments
NEXUS Debug Features                 Class  Usability for FI
Run-Control                          1      External Triggering
Breakpoints                          1      Internal Triggering
Watchpoints                          1      Real Time Triggering
Static Register and Memory Access    1      Static Fault Insertion
Program Trace                        2      Fault Effects Classification
Dynamic Register and Memory Access   3      Real Time Fault Insertion
Data Trace                           3      Improved Fault Effects Classification
OCD infrastructure developed to support this work
• NEXUS class 2 compliant, with real-time memory access
• Adjustable data bus
• OCD configurations
  – Basic (2,8)
  – Extended (8,8)
  – OCD-FI: comprises a fault injection module
[Figure: OCD block diagram with the labels: CPU core, ROM, RAM, I/O, buses; OCD blocks RCT, RWA, MQM, AUX port, bus snooper, bus master, FI]
Fault injection: Workload applications
• Workload applications:
  – Matrix adder (Madder)
  – Vector sorter (Vsorter)
  – LUT control algorithm (Xcontrol)
• Each application was implemented in two versions: normal and fault tolerant
• Fault tolerance by duplicating data in memory and repeating each operation
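The slides do not give the code of the fault-tolerant workloads, so the following is only a minimal sketch of the idea of duplicating data and repeating each operation, using a matrix adder as the example. The function names and the choice to compare the two runs in order to raise a detection flag are assumptions made for illustration.

```python
def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def ft_add_matrices(a, a_copy, b, b_copy):
    """Hypothetical fault-tolerant add: duplicated inputs, operation run twice.

    Returns (result, error_detected). A mismatch between the two runs
    sets the detection flag (the DERR outcome in the result tables).
    """
    first = add_matrices(a, b)              # run on the primary copies
    second = add_matrices(a_copy, b_copy)   # repeat on the duplicated data
    return first, first != second


a = [[1, 2], [3, 4]]
b = [[10, 20], [30, 40]]
a_copy = [row[:] for row in a]
b_copy = [row[:] for row in b]

a[0][1] ^= 1 << 3   # emulate a bit-flip in the primary copy only
result, detected = ft_add_matrices(a, a_copy, b, b_copy)
print(detected)     # the corrupted primary copy no longer matches the duplicate
```

A fault that hits only one copy is detected because the two runs disagree; a fault that hits neither copy, or a cell that is never read again, leaves the result correct.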
Fault injection campaigns
• Scripts that define 10 FI experiments during system operation
• 100 campaigns were executed for each scenario using the three workload applications (Madder, Vsorter, Xcontrol)
• FI campaigns mostly target memory positions and cause a bit-flip to emulate SEU effects
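As a rough illustration of what one campaign contains, the sketch below models memory as a byte array and a campaign as 10 (address, bit) pairs, each applied as a single bit-flip. The byte-wide cell, the random choice of targets and the names `flip_bit` and `make_campaign` are assumptions; the actual campaign script format is not shown on the slides.

```python
import random


def flip_bit(memory: bytearray, address: int, bit: int) -> None:
    """Invert one bit of one memory byte in place (single bit-flip SEU model)."""
    if not 0 <= bit < 8:
        raise ValueError("bit index must be in 0..7 for a byte-wide cell")
    memory[address] ^= 1 << bit


def make_campaign(memory_size: int, experiments: int = 10, seed: int = 0):
    """Return a list of (address, bit) targets, one per FI experiment."""
    rng = random.Random(seed)   # seeded so that a campaign can be repeated
    return [(rng.randrange(memory_size), rng.randrange(8))
            for _ in range(experiments)]


memory = bytearray(256)
for address, bit in make_campaign(len(memory)):
    flip_bit(memory, address, bit)
```

Seeding the generator keeps a campaign repeatable, which matters when the same faults have to be replayed across several OCD configurations.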
Predetermination to improve performance of FI campaigns
• Predetermination of the contents of the target memory cell at the FI instant may be done through a “gold run” or by ensuring:
  – Complete knowledge of the program flow
  – Full observability of external inputs
  – Precise control of the FI instant and location
• Otherwise the target memory cell must be read “immediately” before the FI instant
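The difference between the two cases can be sketched as follows. With predetermination the faulty value is computed beforehand (here from a value recorded in a gold run), so the insertion is a single write; without it, the cell has to be read first. The function names and the access counts returned are illustrative assumptions, not measurements from the slides.

```python
def faulty_value_from_gold_run(gold_value: int, bit: int) -> int:
    """Predetermine the faulty value from the content seen in a gold run."""
    return gold_value ^ (1 << bit)


def inject_with_predetermination(memory, address, faulty_value):
    """Faulty value known in advance: one write access is enough."""
    memory[address] = faulty_value
    return 1   # debug-port memory accesses used (assumed model)


def inject_without_predetermination(memory, address, bit):
    """Faulty value unknown: read the cell, flip the bit, write it back."""
    current = memory[address]                # read "immediately" before the FI instant
    memory[address] = current ^ (1 << bit)
    return 2   # one read plus one write (assumed model)


memory = bytearray([0x5A])
accesses = inject_with_predetermination(
    memory, 0, faulty_value_from_gold_run(gold_value=0x5A, bit=0))
```

The extra read is one plausible reason why the scenarios without predetermination show longer insertion delays.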
Experimental scenarios
Configur. &  Bandwidth     Predetermination of  Fault injection  Delays (clk cycles)
Scenario                   the faulty value     method           Set-up  Insertion
BOF          MDI=2 MDO=8   YES                  Off-line         22      35
BOF+         MDI=2 MDO=8   NO                   Off-line         22      44
EOF          MDI=8 MDO=8   YES                  Off-line         6       9
EOF+         MDI=8 MDO=8   NO                   Off-line         6       18
BRT          MDI=2 MDO=8   YES                  Real time        22      35
BRT+         MDI=2 MDO=8   NO                   Real time        22      44
ERT          MDI=8 MDO=8   YES                  Real time        6       9
ERT+         MDI=8 MDO=8   NO                   Real time        6       18
OCD-FI       MDI=2 MDO=8   YES                  Real time        57      2
OCD-FI+      MDI=2 MDO=8   NO                   Real time        57      4

B: Basic; E: Extended; OCD-FI: OCD for Fault Injection
OF: Off-line; RT: Real-time; +: predetermination not required
Experimental results (%): B, E, OCD-FI
UERR: Undetected errors (incorrect final result that goes undetected)
DERR: Detected errors (error detection signal activated)
NERR: No errors (application ended correctly)

Configur. &  MAdder                         VSorter                        XControl
Scenario     non-FT       SW-FT             non-FT       SW-FT             non-FT       SW-FT
             UERR  NERR   DERR  UERR  NERR  UERR  NERR   DERR  UERR  NERR  UERR  NERR   DERR  UERR  NERR
OFF          19    81     28    13,9  58,1  98    2      97    2     1     Not possible
BRT          19,4  80,6   28,3  13,8  57,9  98,1  1,9    96,8  2     1,2   Not possible
ERT          19,2  80,8   28,1  13,9  58    98    2      96,9  2     1,1   Not possible
OCD-FI       19    81     28    13,9  58,1  98    2      97    2     1     Not possible
BRT+         19,5  80,5   28,4  13,8  57,8  98,2  1,8    96,7  1,9   1,4   29,3  70,7   29,1  1,5   69,4
ERT+         19,3  80,7   28,2  13,8  58    98,1  1,9    96,8  1,9   1,3   29,6  70,4   28,9  1,2   69,9
OCD-FI+      19,1  80,9   28,1  13,9  58    98    2      96,9  1,9   1,2   29,8  70,2   28,8  1,1   70,1
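The three outcome classes defined on this slide can be turned into a small classifier. This is a sketch under the assumption that each experiment yields a final result, a reference (fault-free) result and the state of the error detection signal; the function names are hypothetical.

```python
from collections import Counter


def classify(final_result, expected_result, error_detected: bool) -> str:
    """Map one FI experiment to DERR, NERR or UERR."""
    if error_detected:
        return "DERR"   # error detection signal activated
    if final_result == expected_result:
        return "NERR"   # application ended correctly
    return "UERR"       # incorrect final result that went undetected


def percentages(outcomes):
    """Share of each outcome class, in percent of all experiments."""
    counts = Counter(outcomes)
    return {label: 100.0 * n / len(outcomes) for label, n in counts.items()}


outcomes = [
    classify([1, 2, 3], [1, 2, 3], False),
    classify([1, 2, 3], [1, 2, 3], False),
    classify([1, 9, 3], [1, 2, 3], False),
    classify([1, 9, 3], [1, 2, 3], True),
]
print(percentages(outcomes))
```

In a non-FT workload the detection signal never fires, so only the UERR and NERR columns are populated, as in the table.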
Experimental results (%): Erroneous fault insertions
• Further experiments in RT scenarios were carried out to identify erroneous fault insertions, which were classified as Inconclusive (INC)

Configur. &  non-FT                         SW-FT
Scenario     MAdder  VSorter  XControl      MAdder  VSorter  XControl
OFF          0                              0
BRT          3,1     0,9      Not possible  4       2,2      Not possible
ERT          1,4     0,6      Not possible  2,3     1,1      Not possible
OCD-FI       0,2     0,1      Not possible  0,2     0,2      Not possible
BRT+         3       1,2      2,1           4,8     2,8      3,2
ERT+         2       0,8      1,5           3,7     2,1      2,4
OCD-FI+      0,4     0,2      0,3           1,7     1,2      1,3
Experimental results: Pros and cons of FI methods
• Off-line configurations always produce the most reliable results
• The CPU may overwrite the target memory cell before the FI is complete (INC)
• INC results increase with the delay between fault triggering and fault insertion, and are mitigated by OCD-FI and predetermination
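A simplified model of why the insertion delay matters: if the CPU writes the target cell between the trigger and the end of the insertion, the experiment is inconclusive. The sketch below uses the insertion delays from the scenario table; the cycle-level window model and the function name are assumptions made for illustration.

```python
def is_inconclusive(trigger_cycle, insertion_delay, cpu_write_cycles):
    """True if the CPU writes the target cell while the insertion is pending."""
    window_end = trigger_cycle + insertion_delay
    return any(trigger_cycle <= cycle < window_end for cycle in cpu_write_cycles)


# Insertion delays (clock cycles) taken from the scenario table.
INSERTION_DELAY = {"BRT+": 44, "ERT+": 18, "OCD-FI+": 4}

# Hypothetical run: fault triggered at cycle 100, CPU writes the cell at cycle 110.
for scenario, delay in INSERTION_DELAY.items():
    print(scenario, is_inconclusive(100, delay, cpu_write_cycles=[110]))
```

In this example only OCD-FI+ completes the insertion before the CPU write, which matches the trend of lower INC rates for shorter insertion delays.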
Experimental results (%): OCD-FI extensions for EDAC
• FT versions of the workload applications were not used due to EDAC
DERR: Percentage of errors detected that were corrected by EDAC

           No Predetermination         Predetermination
           DERR  UERR  NERR  INC       DERR  UERR  NERR  INC
MAdder     39,6  0     58,8  1,6       39,7  0     59,5  0,8
VSorter    98,3  0     0,8   0,9       99    0     0,7   0,3
XControl   29,9  0     69,1  1         30    0     69,5  0,5
Experimental results: Pros and cons of OCD-FI EDAC extensions
• EDAC mechanisms effectively eliminate the effects of single bit-flip errors on the target system
• The OCD-FI EDAC extension enables FI into protected memory blocks
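The slides do not say which EDAC code protects the memory. As a stand-in, the sketch below uses a Hamming(7,4) code to show how a single bit-flip in a protected word is detected and corrected; the code choice and the function names are assumptions for illustration only.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword (bit i holds position i + 1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                        # index 0 unused; positions 1..7
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    return sum(code[pos] << (pos - 1) for pos in range(1, 8))


def hamming74_decode(word: int):
    """Return (nibble, corrected); repairs any single flipped bit."""
    code = [0] + [(word >> (pos - 1)) & 1 for pos in range(1, 8)]
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos               # non-zero syndrome = faulty position
    corrected = syndrome != 0
    if corrected:
        code[syndrome] ^= 1
    data = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(data)), corrected


stored = hamming74_encode(0b1011)
faulty = stored ^ (1 << 2)                # inject a single bit-flip
print(hamming74_decode(faulty))           # data recovered, correction flagged
```

With such a code every single bit-flip is corrected on read, which is consistent with the UERR column being 0 in the EDAC results.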
Experimental results (%): OCD-FI for RTREG
• RT register access requires a collision manager that degrades dynamic performance…

           non-FT        SW-FT
           UERR  NERR    DERR  UERR  NERR
MAdder     89    11      62    22    16
VSorter    60    40      46    14    40
Experimental results: Pros and cons of OCD-FI RTREG extensions
• Due to their higher occurrence rate, INC results were explicitly avoided
• Not all code lines qualify to trigger a FI experiment (45% of the code lines could be used for triggering accumulator FI)
• FI results and software fault tolerance efficiency differ significantly between registers and memory
Performance (FI rate)
• Maximum faults / second rates (single bit-flips on the same memory cell, 30 MHz clock frequency):
Conf. & Scenario  Real Time     Halted Access
BOF+              Not possible  400k
EOF+              Not possible  1150k
BRT+              454k          400k
ERT+              1250k         1150k
OCD-FI+           491k          483k
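The real-time rates in this table appear consistent with one fault per (set-up + insertion) delay at the 30 MHz clock, using the delays from the scenario table. That relationship is inferred from the numbers and is not stated on the slides; the sketch below only reproduces the arithmetic.

```python
CLOCK_HZ = 30_000_000   # clock frequency quoted on this slide

# (set-up, insertion) delays in clock cycles, from the scenario table.
DELAYS = {"BRT+": (22, 44), "ERT+": (6, 18), "OCD-FI+": (57, 4)}


def max_fault_rate(setup_cycles, insertion_cycles, clock_hz=CLOCK_HZ):
    """Faults per second if each fault costs set-up plus insertion cycles."""
    return clock_hz / (setup_cycles + insertion_cycles)


for scenario, (setup, insertion) in DELAYS.items():
    rate = max_fault_rate(setup, insertion)
    print(f"{scenario}: {rate / 1000:.1f}k faults/s")
```

The computed values (about 454.5k, 1250.0k and 491.8k faults/s) line up with the Real Time column, which suggests the quoted maxima are back-to-back injections on the same cell.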
Performance (overhead, dynamic)
• Silicon overhead and maximum operating frequency on a Virtex-2 FPGA:
CPU Core  OCD   OCD-FI   EDAC  RTREG  Area [Eq Gates]  Overhead [%]  Max f [MHz]
x         -     -        -     -      53926            75,4          37
x         -     -        x     -      55018            76,9          32
x         BRT   -        -     -      71527            100,0         36
x         BRT   -        x     -      72619            101,5         32
x         ERT   -        -     -      76127            106,4         36
x         -     x        -     -      71842            100,4         36
x         -     +EDAC    x     -      73184            102,3         32
x         -     +RTREG   -     x      76392            106,8         27
x         -     +BOTH    x     x      77484            108,3         25
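The percentage column appears to be each area normalised to the CPU core with the basic OCD (71527 equivalent gates = 100%). This reading is inferred from the figures rather than stated on the slide; the short check below reproduces the column under that assumption, with hypothetical labels for the configurations.

```python
BASELINE_GATES = 71527   # CPU core + basic OCD (assumed 100% reference)

AREA_GATES = {
    "CPU core only": 53926,
    "CPU core + basic OCD + EDAC": 72619,
    "CPU core + extended OCD": 76127,
    "CPU core + OCD-FI": 71842,
    "CPU core + OCD-FI + EDAC + RTREG": 77484,
}


def relative_area(gates, baseline=BASELINE_GATES):
    """Area as a percentage of the baseline configuration, one decimal place."""
    return round(100.0 * gates / baseline, 1)


for name, gates in AREA_GATES.items():
    print(f"{name}: {relative_area(gates)}%")
```

Under this reading the OCD-FI module adds 0.4% over the basic OCD, and the full set of extensions 8.3%.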
Conclusions
• Wide spectrum (FPGA, ASIC, etc.)
• FI rate does not justify real-time
• Low overhead
• Better controllability and observability (C&O) than radiation techniques
• Less intrusive than software techniques
• Should be used with the final HW and SW
• Limitations in coverage, lack of standards