Software Fault Tolerance
Chapter 4: N-Version Programming (NVP)
Sima Emadi
Sima Emadi Slides for Software Fault Tolerance 2
NVP
The NVP and recovery block (RcB) techniques are the original design-diverse software fault tolerance techniques.
NVP was suggested by Elmendorf in 1972.
NVP is a static technique.
In static software fault tolerance techniques, a task is executed by several processes or programs and a result is accepted only if it is adjudicated as an acceptable result, usually via a majority vote.
The technique is called static because the various programs executing the task will execute in the same manner, regardless of which result(s) the decision mechanism (DM) determines to be acceptable.
NVP
The NVP technique uses a decision mechanism (DM)
and forward recovery to accomplish fault tolerance.
The technique uses at least two independently
designed, functionally equivalent versions (variants)
of a program developed from the same specification.
The variants are run in parallel and a DM examines
the results and selects the "best" result, if one exists.
There are many alternative decision mechanisms
available for use with NVP.
Basic NVP Operation
run Version 1, Version 2, …, Version n
if (Decision Mechanism (Result 1, Result 2, …, Result n))
    return Result
else
    failure exception
Basic NVP Operation
The NVP syntax above states that the technique executes the n versions concurrently.
The results of these executions are provided to the DM, which operates upon them to determine if a correct result can be adjudicated.
If one can (i.e., the Decision Mechanism statement above evaluates to TRUE), then it is returned. If a correct result cannot be determined, then an error occurs.
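The operation described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a reference implementation: the names `nvp_execute`, `majority_vote`, and `NVPFailure` are hypothetical, the versions are ordinary callables run in a thread pool, and results are assumed to be hashable and compared by exact equality.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class NVPFailure(Exception):
    """Raised when the DM cannot adjudicate a correct result."""


def majority_vote(results):
    """Basic majority voter: (True, value) if more than half of the results agree."""
    value, count = Counter(results).most_common(1)[0]
    if count > len(results) / 2:
        return True, value
    return False, None


def nvp_execute(versions, *inputs):
    """Run all versions on the same inputs, then let the DM adjudicate."""
    with ThreadPoolExecutor(max_workers=len(versions)) as pool:
        futures = [pool.submit(version, *inputs) for version in versions]
        results = [future.result() for future in futures]  # wait for every version
    ok, result = majority_vote(results)
    if ok:
        return result
    raise NVPFailure("no majority among version results")


# Three hypothetical variants of "square a number"; the third is deliberately faulty.
versions = [lambda x: x * x, lambda x: x ** 2, lambda x: x + x]
print(nvp_execute(versions, 3))  # 9: two of the three versions agree
```

If no value is held by more than half of the versions, `nvp_execute` raises `NVPFailure`, which plays the role of the failure exception in the pseudocode.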
Basic NVP Operation
Vi: version i;
n: the number of versions;
NVP: N-version programming;
DM: decision mechanism;
Ri: result of Vi.
Basic NVP Operation
Failure-Free Operation
Upon entry to NVP, the executive performs the following:
formats calls to the n versions and through those calls
distributes the input(s) to the versions.
Each version, Vi, executes. No failures occur during their
execution.
The results of the version executions (Ri, i = 1,…, n) are
gathered by the executive and submitted to the DM.
The Ri are equal to one another, so the DM selects one of them, say R2, as the
correct result (the choice is arbitrary, since all results are the same).
Control returns to the executive.
The executive passes the correct result outside the NVP, and
the NVP module is exited.
Failure Scenario: Incorrect Results
Upon entry to NVP, the executive performs the following:
formats calls to the n versions and through those calls
distributes the input(s) to the versions.
Each version, Vi, executes.
The results of the version executions (Ri, i = 1,…, n) are
gathered by the executive and submitted to the DM.
The Ri differ significantly from one another. The DM cannot
determine a correct result, and it sets a flag indicating this fact.
Control returns to the executive.
The executive raises an exception and the NVP module is
exited.
Failure Scenario: Version Does Not Execute
Upon entry to NVP, the executive performs the following: formats calls to the n versions and through those
calls distributes the input(s) to the versions.
The versions, Vi, begin execution. One or more versions do
not complete execution for some reason.
The executive cannot retrieve all version results in a timely
manner. The executive submits the results it does have to the
DM.
The DM expects n results, but receives n − 1 or fewer. The basic majority
voter cannot handle fewer than n results and sets a flag
indicating its failure to select a correct result.
Control returns to the executive.
The executive raises an exception and the NVP module is
exited.
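The scenario above can be illustrated with a short Python sketch. It rests on several assumptions: the function names are hypothetical, a version that "does not complete" is simulated by raising an exception rather than by a timeout, and the basic voter is modelled as one that refuses to vote on anything other than exactly n results.

```python
def gather_results(versions, *inputs):
    """Executive: call each version and keep only the results that arrive."""
    results = []
    for version in versions:
        try:
            results.append(version(*inputs))
        except Exception:
            pass  # this version did not complete; there is no result to submit
    return results


def basic_majority_vote(results, n):
    """Basic voter: needs exactly n results, otherwise it flags failure."""
    if len(results) != n:
        return False, None  # flag: cannot select a correct result
    for candidate in results:
        if results.count(candidate) > n / 2:
            return True, candidate
    return False, None


def crashing_version(x):
    raise RuntimeError("version did not complete")  # simulated failure


# Two hypothetical absolute-value variants plus one that never returns a result.
versions = [abs, lambda x: x if x >= 0 else -x, crashing_version]
results = gather_results(versions, -5)
ok, value = basic_majority_vote(results, n=len(versions))
print(results, ok)  # [5, 5] False
```

The two surviving results agree, yet this voter still reports failure because it insists on n inputs.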
Augmentations to N-Version Programming Operation
In these scenarios, NVP operation continues until
the DM adjudicates a correct result, the DM
cannot select a correct result, or the DM fails.
Several augmentations to the basic NVP have
been suggested.
Many of these simply involve using a different
decision mechanism than the basic majority voter.
Augmentations to N-Version Programming Operation
Another augmentation to the basic NVP involves
voting upon the results as each version completes
execution.
Once two results are available, the DM can
compare them and if they agree, complete that
NVP cycle.
Augmentations to N-Version Programming Operation
If the first two results do not match, the DM
performs a majority vote on three results, and
continues voting through the nth version
execution, until it finds an acceptable result.
When an acceptable result is found, it is passed
outside the NVP, and the NVP module is exited.
This scheme provides results more quickly than
the basic NVP, assuming the versions have
different expected execution times.
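A minimal Python sketch of this vote-as-results-arrive scheme follows. The function name `vote_as_completed` is hypothetical, results are assumed to be hashable and compared by exact equality, and "majority" is read as a strict majority of the results received so far. In a real executive the iterable could be fed by something like `concurrent.futures.as_completed`; here it is a plain list in completion order.

```python
from collections import Counter


def vote_as_completed(results_in_completion_order):
    """Vote each time a new result arrives, starting with the second one.

    Returns (result, number_of_results_used), or (None, n) if no acceptable
    result is found after all n results have arrived.
    """
    seen = []
    for result in results_in_completion_order:
        seen.append(result)
        if len(seen) < 2:
            continue  # a single result cannot be compared with anything
        value, count = Counter(seen).most_common(1)[0]
        if count > len(seen) / 2:  # strict majority of the results so far
            return value, len(seen)
    return None, len(seen)


print(vote_as_completed([10, 10, 99, 10, 10]))  # (10, 2)
print(vote_as_completed([10, 99, 10, 10, 10]))  # (10, 3)
print(vote_as_completed([1, 2, 3]))             # (None, 3)
```

In the first call the cycle ends after the two fastest versions agree, so the slower versions never have to be waited for.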
NVP Example
Design DM
If we designate the results in this example as rij, where i = 1, 2, 3 (up to n = 3 versions) and j = 1, 2, …, 6 (up to k = 6 items in the result set), then our DM performs the following tests for each j = 1, …, k:
Are all three entries equal, that is, does r1j = r2j = r3j?
If the rij are equal for a specific j, then the result for that entry in the list is r1j.
If they are not all equal for a specific j, do any two entries for that j match? That is, does:
r1j = r2j OR r1j = r3j OR r2j = r3j
If a match is found, the matching value is selected as the result for that position in the list.
If there is no match, that is, r1j ≠ r2j, r1j ≠ r3j, and r2j ≠ r3j, then there is no correct result for that entry.
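The tests above translate almost directly into code. The following Python sketch is one possible reading: the function name `elementwise_vote` is hypothetical, a position with no matching pair is marked with `None`, and the sample version outputs are made up for illustration rather than taken from the slides.

```python
def elementwise_vote(r1, r2, r3):
    """Adjudicate three equal-length result lists position by position.

    Returns (adjudicated_list, ok). Positions with no matching pair are set
    to None, and ok is False if any such position exists.
    """
    adjudicated, ok = [], True
    for a, b, c in zip(r1, r2, r3):
        if a == b or a == c:   # also covers the case where all three are equal
            adjudicated.append(a)
        elif b == c:
            adjudicated.append(b)
        else:                  # all three differ: no correct result for this entry
            adjudicated.append(None)
            ok = False
    return adjudicated, ok


# Hypothetical version outputs: version 2 is wrong at one position,
# but the other two versions outvote it.
r1 = [-4, 7, 8, 13, 17, 44]
r2 = [-4, 7, 8, 13, 71, 44]
r3 = [-4, 7, 8, 13, 17, 44]
print(elementwise_vote(r1, r2, r3))  # ([-4, 7, 8, 13, 17, 44], True)
```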
Design DM
Upon entry to NVP, the executive performs the following: it
formats calls to the n = 3 versions and through those calls
distributes the inputs to the versions. The input set is (8, 7, 13,
−4, 17, 44).
Each version, Vi (i = 1, 2, 3), executes.
The results of the version executions (rij, i = 1, …, n; j = 1, …, k)
are gathered by the executive and submitted to the DM.
Design DM
The DM examines the results as follows:
The adjudicated result is (−4, 7, 8, 13, 17, 44).
Control returns to the executive.
The executive passes the correct result, (−4, 7, 8, 13, 17, 44),
outside the NVP, and the NVP module is exited.
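The walk-through above can be reproduced end to end with a small Python sketch. The adjudicated result is the input set in ascending order, so the sketch assumes the versions are sorting routines. The three routines and the `adjudicate` function are hypothetical stand-ins for independently developed variants and the DM, not the versions used in the original example.

```python
from collections import Counter


def sort_builtin(items):
    """Variant 1: rely on the language's built-in sort."""
    return sorted(items)


def sort_insertion(items):
    """Variant 2: insertion sort."""
    out = []
    for x in items:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def sort_selection(items):
    """Variant 3: repeatedly extract the minimum."""
    rest, out = list(items), []
    while rest:
        smallest = min(rest)
        rest.remove(smallest)
        out.append(smallest)
    return out


def adjudicate(results):
    """DM: per position, accept a value reported by at least two versions."""
    adjudicated = []
    for column in zip(*results):
        value, count = Counter(column).most_common(1)[0]
        if count < 2:
            raise ValueError("no correct result for this entry")
        adjudicated.append(value)
    return tuple(adjudicated)


inputs = (8, 7, 13, -4, 17, 44)
results = [version(inputs) for version in (sort_builtin, sort_insertion, sort_selection)]
print(adjudicate(results))  # (-4, 7, 8, 13, 17, 44)
```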
Disadvantage
The overhead incurred (beyond that of running a single version, as in non-fault-tolerant software) includes additional memory for the second through the nth variants, the executive, and the DM; additional execution time for the executive and the DM; and synchronization overhead.
The time overhead for the NVP technique always depends on the slowest variant, since the basic DM needs all variant results before it can vote.
Even though NVP utilizes the design diversity principle, it cannot be guaranteed that the variants have no common residual design faults. If this occurs, the purpose of NVP is defeated.
The DM may also contain residual design faults.
Advantage
Using NVP to improve testing (e.g., in back-to-back testing)
will likely result in bugs being found
that might otherwise not be found in single-version
software.
NVP typically runs in a multiprocessor environment,
although it could be executed sequentially in a
uniprocessor environment.
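Back-to-back testing, mentioned above, can be sketched in a few lines of Python: two variants receive the same inputs and any disagreement is recorded for investigation. Everything here is a hypothetical illustration: the two median-of-three variants, the deliberately seeded bug in the second one, and the `back_to_back` harness with its random input range.

```python
import random


def median3_a(a, b, c):
    """Variant A: sort and take the middle element."""
    return sorted((a, b, c))[1]


def median3_b(a, b, c):
    """Variant B: comparison chain with a seeded bug for illustration."""
    if a <= b <= c or c <= b <= a:
        return b
    if b <= a <= c or c <= a <= b:
        return a
    return a  # BUG (deliberate): this branch should return c


def back_to_back(variant_a, variant_b, trials=1000, seed=0):
    """Feed both variants the same random inputs and collect disagreements."""
    rng = random.Random(seed)
    disagreements = []
    for _ in range(trials):
        args = tuple(rng.randint(-10, 10) for _ in range(3))
        if variant_a(*args) != variant_b(*args):
            disagreements.append(args)
    return disagreements


found = back_to_back(median3_a, median3_b)
print(len(found) > 0)  # True: some inputs expose the seeded bug
```

A disagreement only shows that at least one variant is wrong, not which one, and a fault shared by both variants would go unnoticed.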
Advantage
There are three elements to the NVP approach to
software fault tolerance:
the process of initial specification and N-version programming (NVP);
the product of that process, the N-version software (NVS);
the environment that supports execution of NVS and
provides decision algorithms, the N-version executive
(NVX).
Advantage
The objectives of the design paradigm are:
(a) reduce the possibility of oversights, mistakes,
and inconsistencies in software development and
testing;
(b) eliminate the most perceivable causes of
remaining design faults;
(c) minimize the probability that two or more
variants produce similar erroneous results during
the same decision action.
Performance
One way to improve the performance of NVP is to use a
DM that is appropriate for the problem solution domain.
Consensus voting (CV) is one such alternative to majority voting.
CV has the advantage of being more stable
than majority voting.
The reliability of CV is at least equivalent to majority
voting. It performs better than majority voting when
average N-tuple reliability is low, or the average decision
space in which voters work is not binary.
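The difference between the two voters can be seen in a short Python sketch. It assumes the common description of consensus voting, in which the value with the largest agreement is selected even when it falls short of a majority. Breaking ties at random is an assumption of this sketch, and the function names and sample outputs are hypothetical.

```python
import random
from collections import Counter


def majority_vote(results):
    """Return the value held by more than half of the results, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


def consensus_vote(results, rng=random):
    """Return the value with the largest agreement; break ties at random."""
    counts = Counter(results)
    top = max(counts.values())
    candidates = [value for value, count in counts.items() if count == top]
    return candidates[0] if len(candidates) == 1 else rng.choice(candidates)


results = [3.0, 3.0, 2.9, 3.2, 2.7]  # hypothetical outputs of 5 versions
print(majority_vote(results))   # None: 2 of 5 is not a majority
print(consensus_vote(results))  # 3.0: the largest agreement group
```

When the output space is not binary, faulty versions tend to produce different wrong answers, so a correct pair can form the largest group without reaching a majority. When a majority does exist, both voters return the same value.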
Performance
A disadvantage of consensus voting is the added
complexity of the decision algorithm.
However, this may be overcome, at least in part, by
pre-approved DM components.
Questions?