[email protected] Exploiting HW+SW Partitioning for Reliable Embedded Systems Part 2.


Summary

1. Introduction: targeting the problem

2. The Possible Solution

2.1. SW-Based Fault Detection Mechanisms

2.2. Migrating SW-Based Fault Detection Mechanisms into HW

3. Experimental Evaluation

4. Final Considerations

The increasing number of computer-based critical applications raises the question of which techniques can guarantee a sufficient degree of reliability while keeping design and manufacturing costs reasonable.

Techniques commonly used (on-chip and system level): stand-alone solutions

Fault-Tolerance Techniques (HW, SW, Time or Info domains):

• Duplication/Voter, TMR

• Layout-Driven Fault Avoidance

• Watch-Dogs (Watch-Dog Timer)

• Consistency Checks

• Capability Checks

• Re-computation

• EDAC


These techniques impact the design: performance, weight, size/volume, power consumption, reliability.


HW Techniques: Disadvantages:

• High area overhead

• High development/fab cost

SW Techniques: Disadvantages:

• Significant performance degradation

• Memory overhead


2. The Possible Solution

Developing a hybrid methodology (HW+SW redundancies) able to detect errors at runtime in μprocessor-based SoCs may yield a very good cost/benefit return.


Returns:

• Minimization of area overhead and fab/development costs (benefits of SW-based redundancy techniques)

• Improvement of performance and minimization of memory overhead (benefits of HW-based redundancy techniques)

In summary: minimize fab cost and performance degradation, while improving reliability.

Target faults:

• Control flow errors

• Data handling errors


The hybrid methodology (HW+SW redundancies) exploits:

• I-IP Core Architecture

• Software-Based Techniques


HW+SW SoC FT Architecture:

[Figure: SoC in which a µP IP, a Memory IP, a Custom IP, an I/O port and the WDT / I-IP are attached to a common bus; the WDT / I-IP outputs a mismatch signal.]

• Memory IP: stores a hardened program.

• WDT / I-IP: computes at run time, and stores, control-flow signatures and data read from memory.

• Bus: carries the information flow observed by the WDT / I-IP.



2.1. SW-Based Fault Detection Mechanisms

Faults Affecting Data:

• Cerberus (Matteo et al.)

Faults Affecting Control:

• ECCA (Matteo et al.)

• CFCSS (McCluskey et al.)

• ECI (Miremadi et al.)


Faults Affecting Data: Cerberus (Matteo et al.)

Original Code:          Modified Code:

a = b;                  a0 = b0;
                        a1 = b1;
                        if (b0 != b1)
                            error();

a = b + c;              a0 = b0 + c0;
                        a1 = b1 + c1;
                        if ((b0 != b1) || (c0 != c1))
                            error();

Code modification for errors affecting data.
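The data-duplication rule can be tried out on a host PC with a minimal sketch such as the one below. The `dup_int` type, `hardened_add` and the flag-setting `error` handler are illustrative names, not taken from the slides; a real system would probably react differently to a detected error.

```c
#include <stdbool.h>

/* Set by the consistency check when the two copies of an operand disagree. */
static bool error_detected = false;

/* Hypothetical error handler: here it only records the event. */
static void error(void) { error_detected = true; }

/* Every variable is kept twice: v0 is the original, v1 its replica. */
typedef struct { int v0; int v1; } dup_int;

/* Hardened version of "a = b + c". */
static dup_int hardened_add(dup_int b, dup_int c)
{
    dup_int a;
    a.v0 = b.v0 + c.v0;                       /* operation on the originals */
    a.v1 = b.v1 + c.v1;                       /* same operation on replicas */
    if ((b.v0 != b.v1) || (c.v0 != c.v1))     /* check the source operands  */
        error();
    return a;
}
```

Flipping a bit in only one copy of `b` before the call makes the check fire, which is the situation the duplication is meant to expose.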



Original Code:            Modified Code:

res = search(a);          search(a0, a1, &res0, &res1);
…                         …
int search(int p)         void search(int p0, int p1, int *r0, int *r1)
{ int q;                  { int q0, q1;
  …                         …
  q = p + 1;                q0 = p0 + 1;
  …                         q1 = p1 + 1;
  return(1);                if (p0 != p1)
}                               error();
                            …
                            *r0 = 1;
                            *r1 = 1;
                            return;
                          }

Code transformation for errors affecting procedure parameters.
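A compilable sketch of the parameter transformation follows. It uses a hypothetical function `next_dup` (the hardened form of a function returning `p + 1`) and a flag-setting handler; both names are assumptions made for illustration only.

```c
#include <stdbool.h>

static bool param_error = false;

/* Hypothetical handler: records that the two parameter copies diverged. */
static void param_error_handler(void) { param_error = true; }

/* Hardened form of "int next(int p) { return p + 1; }":
 * the parameter and the result are both passed in duplicate. */
static void next_dup(int p0, int p1, int *r0, int *r1)
{
    int q0 = p0 + 1;          /* computation on the first copy  */
    int q1 = p1 + 1;          /* computation on the second copy */
    if (p0 != p1)             /* parameters must still agree    */
        param_error_handler();
    *r0 = q0;
    *r1 = q1;
}
```

The caller keeps two result variables as well, so the duplication carries through the call boundary.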


Faults Affecting Control: ECCA (Enhanced Control-Flow Checking using Assertions) (Matteo et al.)

Original Code:                  Modified Code:

/* Basic Block beginning */     /* Basic Block beginning #371 */
…                               ecf = 371;
/* Basic Block end */           …
                                if (ecf != 371)
                                    error();
                                /* Basic Block end */

Example of the detection of branches that are not allowed.


Original Code:            Modified Code:

if (condition)            if (condition)
{ /* Block A */           { /* Block A */
  …                         if (!condition)
}                               error();
else                        …
{ /* Block B */           }
  …                       else
}                         { /* Block B */
                            if (condition)
                                error();
                            …
                          }

Code transformation for a test statement
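The test-statement transformation can be exercised with the sketch below. To emulate a fault, the branch actually followed (`taken`) is passed separately from the value obtained when the condition is re-evaluated inside the block (`condition`); this split, and all names used, are assumptions made for the example.

```c
#include <stdbool.h>

static bool branch_error = false;

/* Hypothetical handler: records that a wrong branch was followed. */
static void branch_error_handler(void) { branch_error = true; }

/* Returns 1 when block A runs and 0 when block B runs. */
static int hardened_branch(bool taken, bool condition)
{
    if (taken) {                      /* Block A */
        if (!condition)               /* condition must hold in A */
            branch_error_handler();
        return 1;
    } else {                          /* Block B */
        if (condition)                /* condition must not hold in B */
            branch_error_handler();
        return 0;
    }
}
```

In a fault-free run both arguments are equal and no error is raised; a disagreement models a fault that corrupted the branch decision.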



In summary

To harden a given program, this approach introduces the following assertions into each basic block vj:

• Test Assertion: checks the signature on entry to basic block vj, verifying that the previously executed block vi belongs to pred(vj).

• Set Assertion: updates the signature, setting it to the value Bj associated with vj:

Bj = (Bi XOR M1) XOR M2



01: while(k1<DIM)
02: {
03:   if( != M1 && != M2 )
04:     //Error detected
05:   A1 = matrixA1[i1][k1];
06:   B1 = matrixB1[k1][j1];
07:   C1 += A1*B1;
08:   matrixC1[i1][j1] = C1;
09:   k1++;
10:   j =(i ^M1)^M2;
11: }



Faults Affecting Control: CFCSS (McCluskey et al.)

Principle: Modification of a Basic Block

Basically, the approach consists of six steps:

1) Divide the program into basic blocks. A basic block is a maximal set of ordered instructions whose execution begins at the first instruction and terminates at the last one. There is no branching instruction in a basic block except possibly the last one. A basic block terminates either at an instruction branching to another basic block or at an instruction receiving transfer of control flow (CF) from two or more places in the program. Notations: (a) V = {vi: i = 1, 2,…, n}: set of vertices denoting basic blocks. (b) E: set of edges denoting possible CF between basic blocks.

2) Construct a graph for the program according to the instruction flow (each node represents a basic block). Note that a program can be represented by a program graph, P, whose edges bri,j are not necessarily explicit branch instructions; they also represent fall-through execution paths, jumps, subroutine calls, and returns. Fig. 2.5 is an example. Notation: P: Program Graph {V, E}.

3) Arbitrarily assign a signature to each node (compilation time).

4) Compute the signature difference between the source and the destination blocks.

5) Compute the new signature for each node (execution time).

6) Compare both signatures.


Sequence of instructions and its graph. Detection of an illegal branch.

General Form f = f(G, di) = G XOR di

G2 = f(G1, d2) = G1 XOR d2 = s1 XOR (s1 XOR s2) = s2

G4 = f(G1, d4) = G1 XOR d4 = G1 XOR (s3 XOR s4) = s1 XOR s3 XOR s4 ≠ s4
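The two derivations above can be checked numerically with the following sketch. The signature values `S1` to `S4` are arbitrary example values chosen here (any assignment with s1 different from s3 gives the same outcome), and the function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compile-time signatures s1..s4 of nodes v1..v4. */
enum { S1 = 0x1, S2 = 0x2, S3 = 0x3, S4 = 0x5 };

/* Signature differences of the legal edges v1->v2 and v3->v4. */
enum { D2 = S1 ^ S2, D4 = S3 ^ S4 };

/* f(G, d) = G XOR d: run-time signature update on entering a block. */
static uint8_t cfcss_update(uint8_t g, uint8_t d)
{
    return (uint8_t)(g ^ d);
}

/* True when the run-time signature equals the signature of the block. */
static bool cfcss_check(uint8_t g, uint8_t s)
{
    return g == s;
}
```

Starting from G1 = s1, the legal branch to v2 yields s2 and passes the check, while the illegal branch from v1 to v4 yields s1 XOR s3 XOR s4, which differs from s4 and is therefore detected.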


Detection of an illegal branch: a numerical example

Branch fan-in nodes: node v1 and node v3 have the same signature

Adjusting signature D: node v1 and node v3 have different signatures


G5 = f(G1, d5, D1) = G1 XOR d5 XOR D1 = s1 XOR (s1 XOR s5) XOR 000 = s5

G5 = f(G3, d5, D3) = G3 XOR d5 XOR D3 = s3 XOR (s1 XOR s5) XOR (s1 XOR s3) = s5
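A small sketch of the fan-in case: v5 can be reached from v1 or from v3, d5 is computed against s1, and the adjusting signature D compensates when control arrives from v3. The numeric signatures and the identifiers below are example choices, not values from the slides.

```c
#include <stdint.h>

/* Hypothetical signatures of v1, v3 and of the fan-in node v5. */
enum { FS1 = 0x1, FS3 = 0x3, FS5 = 0x6 };

/* d5 is computed with respect to v1 only. */
enum { FD5 = FS1 ^ FS5 };

/* Adjusting signatures: D1 = 000 in v1, D3 = s1 XOR s3 in v3. */
enum { ADJ1 = 0x0, ADJ3 = FS1 ^ FS3 };

/* f(G, d, D) = G XOR d XOR D */
static uint8_t cfcss_update_fanin(uint8_t g, uint8_t d, uint8_t adj)
{
    return (uint8_t)(g ^ d ^ adj);
}
```

Both legal paths end with the run-time signature equal to s5; arriving from v3 without the adjustment D3 does not, so leaving out the adjusting signature would be flagged as an error.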



Faults Affecting Control:ECI (Miremadi et al.)

Insertion of trap instructions in the program area, in the data area, and in the unused area of the memory.

The ECIs are inserted in main memory locations that are not used by the CPU during normal execution. Thus, the execution of an ECI is an indication that a control flow error has occurred.

The task of an ECI is to initiate a recovery process.


2.2. Migrating SW-Based Fault Detection Mechanisms into HW

• The WDT / I-IP works in symbiosis with the processor, which is not modified.

• The WDT / I-IP continuously observes the information flowing on the bus and uses it to test and update the signatures.

• If a mismatch is detected, the WDT / I-IP outputs a mismatch signal.


Piece of code for control-flow fault detection (ECCA partitioning):

01: while(k1<DIM)
02: {
03:   IIPtest( BB1 );
04:   IIPtest( BB2 );
05:   A1 = matrixA1[i1][k1];
06:   B1 = matrixB1[k1][j1];
07:   C1 += A1*B1;
08:   matrixC1[i1][j1] = C1;
09:   k1++;
10:   IIPset( BB2);
11: }

The IIPtest and IIPset calls take the place of the following lines of the SW-only version:

03: if( != M1 && != M2 )
04: //Error detected
10: j =(i ^M1)^M2;


WDT / I-IP Architecture:

• Three modules:

- bus interface logic: detects signatures passing on the bus (adx, data)

- consistency check logic: compares flow signatures and drives the mismatch signal

- CAM memory: stores flow signatures

WDT / I-IP Architecture: signals

• WDT / I-IP (top level): Clk, Reset, Instruction_in, Ram_data_in, Ram_address_in, Mismatch Signal

• Module 1, Bus Interface Logic: Clk, Reset, Instruction_in, Ram_data_in, Ram_address_in, Data_memory_in, Data_memory_out, Adr_memory_out, Ctrl_rw_out, En_compare_out, Data_1_out, Data_2_out

• Module 2, CAM Memory: Clk, Reset, Data_memory_out, Data_memory_in, Adr_memory_in, Ctrl_rw_in

• Module 3, Consistency Check Logic: Clk, Reset, En_compare_out, Data_1_out, Data_2_out, Mismatch Signal
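The cooperation of the three modules can be pictured with a purely behavioral C model, given here as a sketch under stated assumptions: the structure `iip_model`, the CAM depth and the functions `iip_set` and `iip_test` are hypothetical and do not reproduce the VHDL design or the exact semantics of IIPtest/IIPset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAM_ENTRIES 8   /* hypothetical CAM depth */

typedef struct {
    uint16_t cam[CAM_ENTRIES]; /* CAM: signatures of legal predecessors   */
    size_t   used;             /* number of valid CAM entries             */
    uint16_t current;          /* signature last seen on the bus          */
    bool     mismatch;         /* models the mismatch output signal       */
} iip_model;

/* Bus interface step: the program writes the signature of the block it
 * is leaving, and the model captures it. */
static void iip_set(iip_model *m, uint16_t signature)
{
    m->current = signature;
}

/* Consistency check step: the captured signature must be in the CAM. */
static void iip_test(iip_model *m)
{
    bool hit = false;
    for (size_t i = 0; i < m->used; i++) {
        if (m->cam[i] == m->current)
            hit = true;
    }
    if (!hit)
        m->mismatch = true;   /* stays asserted once raised */
}
```

The point of the partitioning is visible even in this toy model: the comparison work sits in `iip_test`, which in the real system runs in hardware, so the program only pays for the bus writes.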

Consider now that the µprocessor-based SoC runs under an Operating System …

The application code accounts for only a fraction of the total execution time during system operation!



• Critical applications need operating systems (OS) which guarantee a correct and safe behavior despite the occurrence of errors.

• Faults can affect OS calls as well as the OS kernel: how does the system react to invalid or corrupted values handled by the kernel?


HW-SW Partitioning for Fault-Detection in Complex Systems

[Figure: SoC block diagram with the following labels.]

• µProcessor: DragonBall, ARM, Pentium, 8086, 68K

• WDT / I-IP (HW part): programmable logic, with a Status Register and an Error Indication output

• Application (SW part), stored in Memory (Application Code + Data)

• Memory (Operating System) with Driver (SW part): µCLinux, µCOS-II

• Address + Data Bus, Com Channel




MC68VZ328 Block Diagram

[Figure: MC68VZ328 block diagram. Blocks: CGM & Power Control, Real-Time Clock, In-Circuit Emulation, Interrupt Controller, Memory Controller, Bootstrap Mode, 8/16-Bit 68000 Bus Interface, FLX68000 Static CPU, 16-Bit Timers (2), 8-Bit PWM 1, 16-Bit PWM 2, SPI 1, SPI 2, UART 1 IrDA 1.0, UART 2 IrDA 1.0, LCD Controller, GPIO Ports, 68000 Internal Bus. Highlighted: Special Function Pins (CPU Space), which provide status information.]



Special Function Pins (CPU Space): FC2, FC1, FC0 (68000 die)

Function Code Output    Processor Cycle Type
FC2  FC1  FC0
 0    0    0            Undefined, reserved
 0    0    1            User Data
 0    1    0            User Program
 1    0    1            Supervisor Data
 1    1    0            Supervisor Program
 1    1    1            CPU space (interrupt acknowledge)
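A monitor that samples the function-code pins could classify each bus cycle as in the sketch below. The enum and function names are hypothetical, and the codes 011 and 100, which the table above does not list, are mapped to a separate "not listed" value rather than guessed.

```c
typedef enum {
    CYCLE_RESERVED,
    CYCLE_USER_DATA,
    CYCLE_USER_PROGRAM,
    CYCLE_SUPERVISOR_DATA,
    CYCLE_SUPERVISOR_PROGRAM,
    CYCLE_CPU_SPACE,
    CYCLE_NOT_LISTED          /* 011 and 100: absent from the table */
} cycle_type;

/* Decode the three function-code pins into a processor cycle type. */
static cycle_type decode_fc(unsigned fc2, unsigned fc1, unsigned fc0)
{
    unsigned code = ((fc2 & 1u) << 2) | ((fc1 & 1u) << 1) | (fc0 & 1u);
    switch (code) {
    case 0:  return CYCLE_RESERVED;
    case 1:  return CYCLE_USER_DATA;
    case 2:  return CYCLE_USER_PROGRAM;
    case 5:  return CYCLE_SUPERVISOR_DATA;
    case 6:  return CYCLE_SUPERVISOR_PROGRAM;
    case 7:  return CYCLE_CPU_SPACE;
    default: return CYCLE_NOT_LISTED;
    }
}

/* Lets a watchdog tell OS (supervisor) cycles from application cycles. */
static int is_supervisor_cycle(cycle_type t)
{
    return t == CYCLE_SUPERVISOR_DATA || t == CYCLE_SUPERVISOR_PROGRAM;
}
```

With such a decoder the watchdog can restrict its checks to kernel activity, or to application activity, without any change to the processor.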

68010 – 68030 Dies: A16 - A19 Pins

On these dies, FC2 = FC1 = FC0 = 1 also indicates CPU operations other than interrupt acknowledge cycles (e.g. co-processor communications).

The different CPU spaces are then indicated on the A16 - A19 pins, if properly decoded.




Interrupt Control Pins: IPL2, IPL1, IPL0 (68000 die)

Interrupt Priority Level    Priority
IPL2  IPL1  IPL0
 0     0     0              Lowest priority
 0     0     1              |
 0     1     0              |
 0     1     1              |
 1     0     0              |
 1     0     1              |
 1     1     0              |
 1     1     1              Highest priority




Event-Ticking Pins – ETPs: PM0, PM1 (Pentium die)

The Event-Ticking Pins (ETPs) are associated with Model Specific Registers (MSRs) to monitor: number of cache memory misses, number of committed instructions, number of interrupts executed, number of taken branches, ...

Model Specific Registers – MSRs: counters CRT0 and CRT1, programmed through the Control and Events Selector Register (CESR).





Instructions used to program the counters CRT0 and CRT1 through the Control and Events Selector Register (CESR):

• WRMSR (write MSR)

• RDMSR (read MSR)

Both WRMSR and RDMSR are privileged instructions: they may only be executed at CPL0 (Current Privilege Level 0).




Event-Ticking Pins – ETPs: d_i, s_u (DragonBall core)

• d_i: if “0”: data; if “1”: instruction; if “z”: undefined.

• s_u: if “0”: supervisor mode; if “1”: user mode; if “z”: undefined.

These pins were added to the processor core to serve as the interface with the I-IP (watch-dog).



• The OS error detection coverage has been measured, and the OS critical data structures that should be improved have been pointed out, in order to improve the final robustness of the µC/OS-II operating system.

Juan Pardo, 2004, Fault Tolerant Systems Group, Polytechnic University of Valencia, Spain



µC/OS-II Operating System

• It was selected because it has been widely used for years, in particular in embedded applications (first version, µC/OS: 1992).

• Industrial robots, motor control, medical instruments, etc.

• It is 99% compliant with the Motor Industry Software Reliability Association (MISRA) C Coding Standards.

• All Modified Condition Decision Coverage (MCDC) code in µC/OS-II has been removed, improving code quality for RTCA / EUROCAE DO-178B Level A-certified environments for avionics applications.


µC/OS-II: Characteristics

• Portable: uC/OS-II is written in highly portable ANSI C, with target microprocessor-specific code written in assembly language.

• ROMable: it was designed for embedded applications. This means that if you have the proper tool chain (i.e., C compiler, assembler, and linker/locator), you can embed uC/OS-II as part of a product.

• Scalable: it is possible to use only the services needed by the application, which reduces the amount of memory (both RAM and ROM) required. Scalability is achieved through conditional compilation (full version: 8KB).

• Preemptive: uC/OS-II is a fully preemptive real-time kernel. This means that uC/OS-II always runs the highest priority task that is ready.

• Multitasking: uC/OS-II can manage up to 64 tasks. The current version of the software reserves 8 of these tasks for system use, which leaves up to 56 tasks for the application. Each task has a unique priority assigned to it, which means that uC/OS-II cannot do round-robin scheduling.

• Deterministic: the execution time of all uC/OS-II functions and services is deterministic. You can always know how much time uC/OS-II will take to execute a function or a service. Furthermore, the execution time of uC/OS-II services does not depend on the number of tasks running in your application.

• Task Stacks: each task requires its own stack. uC/OS-II allows each task to have a different stack size, which reduces the amount of RAM needed by the application.

• Services: system services such as mailboxes, queues, semaphores, fixed-sized memory partitions, time-related functions, etc.

• Interrupt Management: interrupts can suspend the execution of a task. If a higher priority task is awakened as a result of the interrupt, the highest priority task will run as soon as all nested interrupts complete. Interrupts can be nested up to 255 levels deep.

• Robust and Reliable: uC/OS-II is based on uC/OS, which has been used in hundreds of commercial applications since 1992.



Workload Design

Characteristics:

• Worst-case application: maximum consumption of system calls.

• System calls: Synchronization, Semaphores, Memory, Queues, Messages, Task Handling, Timing Management, etc.


Addition of Consistency Checks

The system workload is continuously running and consists of a series of tasks executing the application.

Consistency checks are added to the application code and to the kernel to detect faults and invalid values at the kernel calls, in order to improve system robustness.

The WDT / I-IP is the monitor.


// 1. Define necessary configuration constants for uC/OS-II
#define OS_MAX_EVENTS 2
#define OS_MAX_TASKS 20
#define OS_MAX_QS 0
#define OS_Q_EN 0
#define OS_MBOX_EN 0
#define OS_TICKS_PER_SEC 32

// 2. Define necessary stack configuration constants
#define STACK_CNT_512 1            // initial program stack
#define STACK_CNT_1K OS_MAX_TASKS  // task stacks

// 3. This ensures that the above definitions are used
#use "ucos2.lib"

void RandomNumberTask(void *pdata);

// Declare semaphore global so all tasks have access
OS_EVENT* RandomSem;

void main(){
    int i;
    // Initialize OS internals
    OSInit();
    // Initializing tasks
    for(i = 0; i < OS_MAX_TASKS; i++){
        // Create each of the system tasks
        OSTaskCreate(RandomNumberTask, NULL, 1024, i);
    }
    // semaphore to control access to random number generator
    RandomSem = OSSemCreate(1);
    // 4. Set number of system ticks per second
    OSSetTicksPerSec(OS_TICKS_PER_SEC);
    // Begin multi-tasking (starting tasks)
    OSStart();
}

void RandomNumberTask(void *pdata)
{
    // Declare as auto to ensure reentrancy.
    auto OS_TCB data;
    auto INT8U err;
    auto INT16U RNum;
    OSTaskQuery(OS_PRIO_SELF, &data);
    while(1)
    {
        // Rand is not reentrant, so access must be controlled
        // via a semaphore.
        OSSemPend(RandomSem, 0, &err);   // OS call (task waits for signal)
        RNum = (int)(rand() * 100);
        OSSemPost(RandomSem);            // OS call (task sends a signal)
        printf("Task%02d's random #: %d\n",data.OSTCBPrio,RNum);
        // Wait 3 seconds in order to view output from each task.
        OSTimeDlySec(3);
    }
}



Workload Design

OS_ENTER_CRITICAL

/* Code implemented for GNU-GAS */
asm ("move.l #0x0100, %a0 \n\t"   /* load the address 0x0100 into a0 */
     "move.b #11, (%a0)");        /* write the byte 11 to the address held in a0 */

…

asm ("move.l #0x0100, %a0 \n\t"   /* load the address 0x0100 into a0 */
     "move.b #00, (%a0)");        /* write the byte 00 to the address held in a0 */

OS_EXIT_CRITICAL

An indication is set at the instant the processor enters supervisor mode (“OS_ENTER_CRITICAL”) and at the instant it leaves this mode (“OS_EXIT_CRITICAL”). The signaling is done by writing to a specific memory address.
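The same signaling idea can be written in portable C, as a sketch: the marker values 11 and 0 are the ones used above, while the function names and the choice to pass the monitored location as a pointer (so that the code also runs on a host PC) are assumptions of this example.

```c
#include <stdint.h>

/* Marker values used above: 11 on entry, 0 on exit. */
#define MARK_ENTER 11u
#define MARK_EXIT   0u

/* On the target, 'port' would point at the monitored address (0x0100 in
 * the assembly above). 'volatile' keeps the compiler from removing or
 * reordering the write that the watchdog is waiting to see on the bus. */
static void signal_enter(volatile uint8_t *port) { *port = MARK_ENTER; }
static void signal_exit(volatile uint8_t *port)  { *port = MARK_EXIT; }
```

On real hardware the call would look like `signal_enter((volatile uint8_t *)0x0100)`, assuming that address is mapped and observed by the WDT / I-IP.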

System calls performed by Pend and Post through Semaphore, Mailbox and Queue

/**************************************************************
 * PEND ON SEMAPHORE
 **************************************************************/
UBYTE OSSemPend(OS_SEM *psem, UWORD timeout)
{
    UBYTE x, y, bitx, bity;

    OS_ENTER_CRITICAL();
    /* Code implemented for GNU-GAS */
    asm ("move.l #0x0100, %a0 \n\t"   /* load the address 0x0100 into a0 */
         "move.b #4, (%a0)");         /* write the byte 4 to the address held in a0 */
    /* End */
    if (psem->OSSemCnt-- > 0) {
        OS_EXIT_CRITICAL();
        return (OS_NO_ERR);
    } else {
        OSTCBCur->OSTCBStat |= OS_STAT_SEM;
        OSTCBCur->OSTCBDly = timeout;
        y = OSTCBCur->OSTCBPrio >> 3;
        x = OSTCBCur->OSTCBPrio & 0x07;
        bity = OSMapTbl[y];
        bitx = OSMapTbl[x];
        if ((OSRdyTbl[y] &= ~bitx) == 0)
            OSRdyGrp &= ~bity;
        psem->OSSemTbl[y] |= bitx;
        psem->OSSemGrp |= bity;
        OS_EXIT_CRITICAL();
        OSSched();
        OS_ENTER_CRITICAL();
        if (OSTCBCur->OSTCBStat & OS_STAT_SEM) {
            if ((psem->OSSemTbl[y] &= ~bitx) == 0) {
                psem->OSSemGrp &= ~bity;
            }
            OSTCBCur->OSTCBStat = OS_STAT_RDY;
            OS_EXIT_CRITICAL();
            return (OS_TIMEOUT);
        } else {
            OS_EXIT_CRITICAL();
            return (OS_NO_ERR);
        }
    }
}

The slide marks three places in this routine with the label “Consistency Check”.


3. Experimental Evaluation

Matteo Sonza Reorda, 2002-05, Fault Tolerant Systems Group, Politecnico di Torino

• An Intel 8051-based SoC was inspected.

• PANDORA I-IP: VHDL (~1500 lines).



• Fault detection capabilities evaluated via HW-based fault injection experiments (FPGA environment).

• Four benchmarks considered: Matrix multiplication, Elliptical Filter, FIR Filter and Viterbi Algorithm.



Detection capabilities:

• Transient faults (30,000 bit-flips)

• Percentage of wrong answers evaluated (faults that escape detection).

Program   Plain [%]    Pandora [%]    ECCA [%]    CFCSS [%]
          (Orig. SW)   (IP, HW+SW)    (SW Sol.)   (SW Sol.)
Matrix     9.78         0.18           0.99        4.88
Ellipf    20.83         0              2.38       14.29
FIR        5.64         0              2.12        4.49
Viterbi   21.06         4.89           6.33       17.48



Memory overhead:

• Additional code required to implement the hybrid technique.

Prog.     Plain [byte]   Pandora [byte]   ECCA [byte]   CFCSS [byte]
Matrix    223            385              902           456
Ellipf    303            361              640           347
FIR       194            364              701           320
Viterbi   436            707              1,115         725



Execution time overhead:

Prog.     Plain [cycle]   Pandora [cycle]   ECCA [cycle]   CFCSS [cycle]
Matrix    31,211          41,462            102,356        43,791
Ellipf    16,268          17,815            25,635         17,611
FIR       43,434          71,994            153,458        57,357
Viterbi   286,364         328,150           349,111        314,244
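To read the cycle counts as relative overheads, each hardened figure can be compared with the plain one. The helper below is an illustrative sketch (the function name is an assumption); for the Matrix benchmark, the plain 31,211 cycles against Pandora's 41,462 cycles correspond to roughly 33% more execution time.

```c
/* Relative overhead, in percent, of a hardened program over the plain one. */
static double overhead_percent(double plain, double hardened)
{
    return 100.0 * (hardened - plain) / plain;
}
```

The same helper applies to the byte counts of the memory overhead table.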




Area overhead:

• PANDORA size: 992 gates

• 8051 size: 30480 gates

• PANDORA introduces about 3.2% of area overhead.

Area overhead is expected to decrease when processor size increases.
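The area figure follows directly from the two gate counts, as the sketch below shows; the function name is illustrative. It also makes the last remark concrete: with the I-IP size unchanged, a larger processor core gives a smaller percentage.

```c
/* Area of the I-IP relative to the processor core, in percent. */
static double area_overhead_percent(unsigned ip_gates, unsigned cpu_gates)
{
    return 100.0 * (double)ip_gates / (double)cpu_gates;
}
```

For 992 and 30480 gates the result lies between 3.2% and 3.3%; for a hypothetical core of 100,000 gates the same I-IP would account for less than 1%.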

4. Final Considerations

Developing a hybrid methodology (HW+SW redundancies) able to detect errors at runtime in μprocessor-based SoCs may yield a very good cost/benefit return.


Returns:

• Minimization of area overhead and fab/development costs (benefits of SW-based redundancy techniques)

• Improvement of performance and minimization of memory overhead (benefits of HW-based redundancy techniques)

In summary: minimize fab cost and performance degradation, while improving reliability.

Target faults:

• Control flow errors

• Data handling errors


A hybrid methodology (HW+SW redundancies) exploits:

• I-IP Core Architecture

• Software-Based Techniques



The system architecture is co-implemented in HW+SW to detect faults in control flow and application data. The main characteristics of this architecture are:

• SW-embedded structures at the application code level.

• Partial migration of the SW-embedded structures into HW: a specific I-IP monitors the application processor, like a “watch-dog”.

• Communication channel between the HW and SW entities: a driver embedded in the OS kernel and specific signals used for communication between the I-IP and the application processor.
