Software Fault Tolerance · emadilms.ir/emadi/wp-content/uploads/2014/12/SFT04_NVP.pdf

Software Fault Tolerance. Chapter 4: N-Version Programming (NVP). Sima Emadi


Sima Emadi Slides for Software Fault Tolerance 2

NVP

NVP and recovery blocks (RcB) are the original design-diverse software fault tolerance techniques.

NVP was suggested by Elmendorf in 1972.

NVP is a static technique.

In static software fault tolerance techniques, a task is executed by several processes or programs, and a result is accepted only if it is adjudicated as acceptable, usually via a majority vote.

The technique is called static because the various programs executing the task will execute in the same manner, regardless of which result(s) the decision mechanism (DM) determined to be acceptable.


NVP

The NVP technique uses a decision mechanism (DM) and forward recovery to accomplish fault tolerance.

The technique uses at least two independently designed, functionally equivalent versions (variants) of a program developed from the same specification.

The variants are run in parallel and a DM examines the results and selects the "best" result, if one exists.

There are many alternative decision mechanisms available for use with NVP.


Basic NVP Operation

run Version 1, Version 2, …, Version n
if (Decision Mechanism (Result 1, Result 2, …, Result n))
    return Result
else
    failure exception
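The syntax above might be sketched in Python as follows. This is a minimal illustration, not the slides' implementation: it assumes an exact-match majority voter as the DM, hashable results, and threads for concurrent execution. The names `nvp`, `majority_vote`, `NVPFailure`, and the three `square_*` variants are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class NVPFailure(Exception):
    """Raised when the DM cannot adjudicate a correct result."""


def majority_vote(results):
    """Basic majority voter: return (True, value) if one value is produced
    by more than half of the versions, otherwise (False, None)."""
    value, count = Counter(results).most_common(1)[0]
    if count > len(results) // 2:
        return True, value
    return False, None


def nvp(versions, *inputs):
    """Executive: run all versions concurrently on the same inputs, then adjudicate."""
    with ThreadPoolExecutor(max_workers=len(versions)) as pool:
        futures = [pool.submit(v, *inputs) for v in versions]
        results = [f.result() for f in futures]  # wait for every version
    ok, result = majority_vote(results)
    if ok:
        return result
    raise NVPFailure("no majority among version results")


# Three hypothetical variants of "square a number"; the third is faulty.
def square_v1(x):
    return x * x


def square_v2(x):
    return x ** 2


def square_v3(x):
    return x + x


print(nvp([square_v1, square_v2, square_v3], 3))  # the two agreeing versions outvote the faulty one
```

If all three versions disagree, `nvp` raises `NVPFailure`, which corresponds to the "failure exception" branch.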


Basic NVP Operation

The NVP syntax above states that the technique executes the n versions concurrently.

The results of these executions are provided to the DM, which operates upon them to determine if a correct result can be adjudicated.

If one can (i.e., the Decision Mechanism statement above evaluates to TRUE), then it is returned. If a correct result cannot be determined, then an error occurs.


Basic NVP Operation

Vi: version i;

n: the number of versions;

NVP: N-version programming;

DM: decision mechanism;

Ri: result of Vi.


Failure-Free Operation

Upon entry to NVP, the executive formats calls to the n versions and, through those calls, distributes the input(s) to the versions.

Each version, Vi, executes. No failures occur during their execution.

The results of the version executions (Ri, i = 1, …, n) are gathered by the executive and submitted to the DM.

The Ri are equal to one another, so the DM selects one of them, say R2, as the correct result.

Control returns to the executive.

The executive passes the correct result outside the NVP, and the NVP module is exited.


Failure Scenario: Incorrect Results

Upon entry to NVP, the executive formats calls to the n versions and, through those calls, distributes the input(s) to the versions.

Each version, Vi, executes.

The results of the version executions (Ri, i = 1, …, n) are gathered by the executive and submitted to the DM.

The Ri differ significantly from one another. The DM cannot determine a correct result, and it sets a flag indicating this fact.

Control returns to the executive.

The executive raises an exception and the NVP module is exited.


Failure Scenario: Version Does Not Execute

Upon entry to NVP, the executive formats calls to the n versions and, through those calls, distributes the input(s) to the versions.

The versions, Vi, begin execution. One or more versions do not complete execution for some reason.

The executive cannot retrieve all version results in a timely manner. The executive submits the results it does have to the DM.

The DM expects n results but receives fewer (n − 1 if a single version failed to complete). The basic majority voter cannot handle fewer than n results and sets a flag indicating its failure to select a correct result. Control returns to the executive.

The executive raises an exception and the NVP module is exited.
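The scenario above can be sketched as an executive with a deadline and a basic voter that refuses to work with fewer than n results. This is an illustrative sketch only: the deadline value, the thread-based execution, and all names (`nvp_with_deadline`, `basic_majority_voter`, `fast`, `slow`) are assumptions, not part of the slides.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, wait


class NVPFailure(Exception):
    """Raised by the executive when the DM flags a failure."""


def basic_majority_voter(results, n):
    """Return (ok, value). The basic voter insists on exactly n results."""
    if len(results) != n:
        return False, None  # flag: cannot select a correct result
    value, count = Counter(results).most_common(1)[0]
    return (True, value) if count > n // 2 else (False, None)


def nvp_with_deadline(versions, x, deadline_s):
    """Executive: start all versions, keep only the results that arrive in time."""
    pool = ThreadPoolExecutor(max_workers=len(versions))
    futures = [pool.submit(v, x) for v in versions]
    done, _not_done = wait(futures, timeout=deadline_s)
    pool.shutdown(wait=False)  # do not block on the late versions
    results = [f.result() for f in futures if f in done]
    ok, value = basic_majority_voter(results, n=len(versions))
    if not ok:
        raise NVPFailure(f"DM received {len(results)} of {len(versions)} results")
    return value


def fast(x):
    return x + 1


def slow(x):
    time.sleep(0.5)  # hypothetical version that misses the deadline
    return x + 1


try:
    nvp_with_deadline([fast, fast, slow], 1, deadline_s=0.1)
except NVPFailure as err:
    print("exception:", err)
```

In this run the two timely results agree, yet the basic voter still fails because it received only n − 1 results. A DM that tolerates missing results would not raise here.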


Augmentations to N-Version Programming Operation

In these scenarios, NVP operation continues until the DM adjudicates a correct result, the DM cannot select a correct result, or the DM fails.

Several augmentations to the basic NVP have been suggested.

Many of these simply involve using a different decision mechanism than the basic majority voter.


Augmentations to N-Version Programming Operation

Another augmentation to the basic NVP involves voting upon the results as each version completes execution.

Once two results are available, the DM can compare them and, if they agree, complete that NVP cycle.


Augmentations to N-Version Programming Operation

If the first two results do not match, the DM performs a majority vote on three results, and continues voting through the nth version execution, until it finds an acceptable result.

When an acceptable result is found, it is passed outside the NVP, and the NVP module is exited.

This scheme provides results more quickly than the basic NVP, assuming the versions have different expected execution times.
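The vote-as-results-arrive scheme might look like the sketch below. It assumes that "acceptable" means a majority among the results received so far, and it uses threads and artificial delays to imitate versions with different execution times. The function `vote_as_completed` and the `v_*` versions are hypothetical names.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def vote_as_completed(versions, x):
    """Adjudicate after each arriving result and stop at the first majority
    among the results received so far. Returns (result, results_used)."""
    pool = ThreadPoolExecutor(max_workers=len(versions))
    futures = [pool.submit(v, x) for v in versions]
    received = []
    try:
        for fut in as_completed(futures):
            received.append(fut.result())
            if len(received) < 2:
                continue  # need at least two results to compare
            value, count = Counter(received).most_common(1)[0]
            if count > len(received) // 2:  # majority of the results so far
                return value, len(received)
        raise RuntimeError("no acceptable result after all versions completed")
    finally:
        pool.shutdown(wait=False)  # do not wait for slower versions


def v_fast(x):
    return x * 2


def v_medium(x):
    time.sleep(0.05)
    return x * 2


def v_slow(x):
    time.sleep(0.5)
    return x * 2


result, used = vote_as_completed([v_slow, v_fast, v_medium], 21)
print(result, "adjudicated from", used, "results")
```

Here the two faster versions agree, so the cycle completes without waiting for the slowest version.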


NVP Example


Design DM

If we designate the result in this example as rij, where i = 1, 2, 3 (up to n = 3 versions) and j = 1, 2, …, 6 (up to k = 6 items in the result set), then our DM performs the following tests for each j = 1, …, k:

Is r1j = r2j = r3j?

If the rij are all equal for a specific j, then the result for that entry in the list is r1j.

If they are not all equal for a specific j, do any two entries for that j match? That is, does:

r1j = r2j OR r1j = r3j OR r2j = r3j

If a match is found, the matching value is selected as the result for that position in the list.

If there is no match, that is, r1j ≠ r2j, r1j ≠ r3j, and r2j ≠ r3j, then there is no correct result for that entry.
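The tests above translate almost directly into code. The sketch below is one possible rendering for n = 3. The function names, the (found, value) return convention, and the length check are assumptions added for illustration, and the sample values are hypothetical.

```python
def adjudicate_entry(r1, r2, r3):
    """Apply the DM tests to one position j.
    Returns (True, value) if at least two entries match, else (False, None)."""
    if r1 == r2 == r3:
        return True, r1  # all equal: the result is r1j
    if r1 == r2 or r1 == r3:
        return True, r1  # r1j matches one of the others
    if r2 == r3:
        return True, r2  # only r2j and r3j match
    return False, None  # no two entries match: no correct result


def adjudicate(result1, result2, result3):
    """Vote position by position over three equal-length result lists."""
    if not (len(result1) == len(result2) == len(result3)):
        raise ValueError("result lists must have the same length k")
    return [adjudicate_entry(a, b, c) for a, b, c in zip(result1, result2, result3)]


# Hypothetical results: position 1 agrees fully, position 2 has a
# two-out-of-three match, position 3 has no match at all.
for found, value in adjudicate([1, 2, 3], [1, 9, 4], [1, 2, 5]):
    print(found, value)
```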


Design DM

Upon entry to NVP, the executive formats calls to the n = 3 versions and, through those calls, distributes the inputs to the versions. The input set is (8, 7, 13, −4, 17, 44).

Each version, Vi (i = 1, 2, 3), executes.

The results of the version executions (rij, i = 1, …, n; j = 1, …, k) are gathered by the executive and submitted to the DM.


Design DM

The DM examines the results entry by entry, applying the tests above.

The adjudicated result is (−4, 7, 8, 13, 17, 44).

Control returns to the executive.

The executive passes the correct result, (−4, 7, 8, 13, 17, 44), outside the NVP, and the NVP module is exited.
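The adjudicated result is the input set in ascending order, so the run can be imitated with three sort routines. Everything in the sketch below other than the input set and the adjudicated result is an assumption: the individual version outputs are not available here, so the three variants (including the deliberately faulty one) are hypothetical, and the versions are called sequentially for simplicity.

```python
from collections import Counter


def sort_builtin(items):
    """Variant 1: library sort."""
    return sorted(items)


def sort_insertion(items):
    """Variant 2: hand-written insertion sort."""
    out = []
    for x in items:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def sort_faulty(items):
    """Variant 3 (hypothetical fault): compares values as strings."""
    return sorted(items, key=str)


def vote_by_position(results):
    """For each position j, select the value at least two versions agree on."""
    adjudicated = []
    for column in zip(*results):
        value, count = Counter(column).most_common(1)[0]
        if count < 2:
            raise RuntimeError("no two versions agree on an entry")
        adjudicated.append(value)
    return tuple(adjudicated)


inputs = (8, 7, 13, -4, 17, 44)
results = [v(inputs) for v in (sort_builtin, sort_insertion, sort_faulty)]
print(vote_by_position(results))  # (-4, 7, 8, 13, 17, 44)
```

The faulty variant returns [-4, 13, 17, 44, 7, 8], but at every position the other two variants agree, so the fault is masked.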


Disadvantage

The overhead incurred (beyond that of running a single version, as in non-fault-tolerant software) includes additional memory for the second through the nth variants, the executive, and the DM; additional execution time for the executive and the DM; and synchronization overhead.

The time overhead for the NVP technique is always dependent upon the slowest variant, since the basic DM needs all variant results before it can adjudicate.

Even though NVP utilizes the design diversity principle, it cannot be guaranteed that the variants have no common residual design faults. If such common faults exist, the purpose of NVP is defeated.

The DM may also contain residual design faults.


Advantage

Using NVP to improve testing (e.g., in back-to-back testing) will likely result in bugs being found that might otherwise not be found in single-version software.

NVP is typically run in a multiprocessor environment, although it can be executed sequentially in a uniprocessor environment.


Advantage

There are three elements to the NVP approach to software fault tolerance:

(a) the process of initial specification and N-version programming (NVP);

(b) the product of that process, the N-version software (NVS);

(c) the environment that supports execution of NVS and provides decision algorithms, the N-version executive (NVX).


Advantage

The objectives of the design paradigm are:

(a) reduce the possibility of oversights, mistakes, and inconsistencies in software development and testing;

(b) eliminate the most perceivable causes of remaining design faults;

(c) minimize the probability that two or more variants produce similar erroneous results during the same decision action.


Performance

One way to improve the performance of NVP is to use a DM that is appropriate for the problem solution domain.

Consensus voting (CV) is one such alternative to majority voting.

Consensus voting has the advantage of being more stable than majority voting.

The reliability of CV is at least equivalent to that of majority voting. It performs better than majority voting when average N-tuple reliability is low, or when the average decision space in which voters work is not binary.
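The difference between the two voters can be illustrated with a small sketch. It assumes a common formulation of consensus voting in which the largest agreement group wins even without a majority. Treating a tie between the largest groups as a failure is a simplification made here, and other tie-breaking rules exist. The function names and sample outputs are hypothetical.

```python
from collections import Counter


def majority_vote(results):
    """Succeeds only if one value has more than half of the votes."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) // 2 else None


def consensus_vote(results):
    """Accept the largest agreement group, even without a majority.
    Assumption: a tie for the largest group, or no agreement at all,
    is reported as a failure (None)."""
    ranked = Counter(results).most_common()
    top_value, top_count = ranked[0]
    if top_count < 2:
        return None  # no two results agree
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # tie between the largest groups
    return top_value


# Hypothetical outputs of n = 5 versions in a non-binary output space:
# the three wrong answers all differ, so no value reaches a majority.
results = [10, 10, 11, 12, 13]
print(majority_vote(results), consensus_vote(results))  # None 10
```

When a majority exists, both voters return the same value, which is consistent with CV being at least as reliable as majority voting.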


Performance

A disadvantage of consensus voting is the added complexity of the decision algorithm.

However, this may be overcome, at least in part, by pre-approved DM components.


Questions?