
Software criticality analysis of COTS/SOUP

Peter Bishop a,b, Robin Bloomfield a,b, Tim Clement a, Sofia Guerra a,*

a Adelard, Drysdale Building, Northampton Square, London EC1V 0HB, UK
b CSR, City University, Drysdale Building, Northampton Square, London EC1V 0HB, UK

Reliability Engineering and System Safety 81 (2003) 291–301. doi:10.1016/S0951-8320(03)00093-0

* Corresponding author. E-mail addresses: [email protected] (S. Guerra), [email protected] (P. Bishop), [email protected] (R. Bloomfield), [email protected] (T. Clement).

Abstract

This paper describes the Software Criticality Analysis (SCA) approach that was developed to support the justification of using commercial off-the-shelf software (COTS) in a safety-related system. The primary objective of SCA is to assess the importance to safety of the software components within the COTS and to show that there is segregation between software components with different safety importance. The approach taken was a combination of Hazops based on design documents and a detailed analysis of the actual code (100 kloc). Considerable effort was spent on validation and on ensuring the conservative nature of the results. The results of reverse engineering from the code showed that results based only on architecture and design documents would have been misleading.

© 2003 Elsevier Ltd. All rights reserved.

Keywords: Software criticality analysis; Commercial off-the-shelf software; Hazops

1. Introduction

This paper describes the Software Criticality Analysis (SCA) approach that was developed to support the justification of using commercial off-the-shelf software (COTS) in a safety-related control and instrumentation (C&I) system. The primary objective of the criticality analysis is to assess the importance to safety of the software components within the COTS and to show that there is segregation between software components with different safety importance. There are both economic and safety assurance motivations for using SCA on pre-developed software products or COTS in safety-related applications. Typically, the implementation of such software products does not a priori comply with the relevant industry safety requirements, and may therefore need additional documentation, testing, field experience studies and static analysis of the code to justify its use in a safety context. There can therefore be considerable benefit in identifying or limiting the safety criticality of software components, as this can reduce the costs of justifying the overall system.

This work can be seen as an extension of the IEC 61508 concept of Safety Integrity Level (SIL), which applies to safety functions, to provide a safety indication of the criticality of software components. It also extends the various work reported on software failure modes and effects analysis and on Hazops. The distinctive features of our analysis include the application of such techniques to COTS and the validation of the assumptions underlying the safety analysis. This validation had a large impact on the results. The overall approach provides detailed support for the framework we advocated in a study for the UK HSE [1] for justifying 'software of uncertain pedigree' (SOUP).

2. Identifying the software concerned

A pre-requisite for the SCA is the identification of the main software components and their interconnections. A number of different representations could be used for this purpose: design documents augmented with expert opinion, or the source code itself. In this particular project, both methods of identification were used, as described below.

2.1. Analysis of design documents and expert opinion

In the first stage of subsystem identification, the design documentation was analysed to establish the major software components, such as:

• operating system kernel,
• device drivers,
• support services (communications, remote program loading),


• application-level scheduler for running control applications,
• interpreter and 'function blocks' for executing control application functions.

The overall software architecture was re-engineered with the aid of experts in maintaining and developing the system, and from an examination of the system documentation and code. This described the major software partitions and provided a mapping from the software files to the architecture components. (Note that the structure of the source files did not readily map onto the architecture partitions.) At a more detailed level, the design experts provided the names of the top-level software components within the system and also identified which elements were to be used in this particular application. The unused software would not represent a threat provided it was really not used or activated.

2.2. Analysis of the software structure

There is a risk that design and maintenance documents and expert opinion may not represent the actual software implementation. To provide a rigorous traceable link between the criticality analysis and the software, the structure was reverse engineered from the code. This established the actual architecture and identified the 'top' procedures in the software call trees, together with all the software influenced by these, as the basis for analysis.

There were difficulties in relying on the maintenance documentation or the design documentation, which was not written from an assurance viewpoint. As the project continued and the code-based analysis was shown to be feasible, greater reliance was placed on using the code analysis to provide the structure of the software, with the experts and the function call names providing the semantics of the code.

The 100 k lines of C code were analysed to identify the top-level call trees and the supporting subroutines. This analysis was implemented using the CodeSurfer [2] tool, which parses the C source code and generates a description of all the dependencies between the C elements (data and subroutines).

The system was then represented as a collection of top-level functions, covering the entry points into the system and the interfaces to major sub-components of the system. These were identified by a combination of analysis of the call graph extracted from the C components of the code using CodeSurfer [2] and domain knowledge. The call tree analysis was applied to the C code of the software, some 100,000 lines of code with around 2500 functions. The results were translated into an Access database developed to support the criticality analysis. The derived tree structures were presented as indented lists, as shown in the example below:

Sup_Task
    Sup_Struct
        RDword
        BUS_Inst
        Sys_Error
    Handle_Heartbeat
        WDword
        Get_SN
    HandleSD
        AbortAll_SD
            RWord
            WWord
            Get_SN
    Inst_Task_Monitors
        Task_Dropout
    Sys_Error
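As an illustration of this step, the sketch below shows how top-level procedures might be identified from a call-graph edge list and printed as indented call trees. The call relation shown is our own reconstruction of the example above, not output from the project's database:

# Sketch: identifying top-level procedures and their call trees from a
# call-graph edge list (as a dependency tool like CodeSurfer might emit).
# The 'calls' relation below is illustrative only.

calls = {
    "Sup_Task": ["Sup_Struct", "Handle_Heartbeat", "HandleSD",
                 "Inst_Task_Monitors", "Sys_Error"],
    "Sup_Struct": ["RDword", "BUS_Inst", "Sys_Error"],
    "Handle_Heartbeat": ["WDword", "Get_SN"],
    "HandleSD": ["AbortAll_SD"],
    "AbortAll_SD": ["RWord", "WWord", "Get_SN"],
    "Inst_Task_Monitors": ["Task_Dropout"],
}

def top_level(calls):
    """Procedures that are never called by another procedure."""
    callees = {c for cs in calls.values() for c in cs}
    return [p for p in calls if p not in callees]

def print_tree(proc, calls, depth=0, seen=None):
    """Print a call tree as an indented list (cycles are cut off)."""
    seen = seen or set()
    print("    " * depth + proc)
    if proc in seen:
        return
    for callee in calls.get(proc, []):
        print_tree(callee, calls, depth + 1, seen | {proc})

for root in top_level(calls):
    print_tree(root, calls)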

We would expect a significant proportion of unused procedures in a generic system: perhaps as much as 25%. Of the top-level procedures, some could be immediately assigned the highest criticality by inspection, as they were essential to the functioning of the entire system. These would typically include low-level interrupt service routines that were triggered by hardware events, and key scheduling functions. The remaining top-level functions would be candidates for a Hazops to determine their criticality.

The analysis showed that conceptually independent components at the design level could actually share code (such as utility functions). In such cases, the shared code must have the same classification as the most critical call tree it supports.

The main limitation of such a control flow analysis is that it assumes that there are no 'covert channels' (e.g. via data shared between trees) that can affect an apparently independent software call tree. A later validation (described in Section 5) addressed this aspect, and this resulted in some changes to the safety classification of the software components.

The approach used to assess and classify the safety impact of the identified software components is described in Section 3.

3. Assessing the safety impact of failure—Hazops

One of the key techniques for assessing the impact of failure is the software equivalent of a hazard and operability study (Hazops) [3]. The basic concept of a Hazops is to take a full description of the process (traditionally a piping and instrumentation (P&I) diagram) and use a multidisciplinary team to question every part of it systematically to discover how deviations from the design intention can occur, and to decide whether these deviations can give rise to accidents. Guide words are used to ensure that the team explores all credible deviations. Often, the study team will recommend design changes during the study. The systematic multidisciplinary examination of the software described below should not identify any new system-level hazards, but will identify potential software failure modes and their consequences for the overall system safety function(s).

3.1. Hazops

The analysis builds on the partition-based analysis described in Section 2, where components of the system were identified. A rigorous and systematic procedure is used to investigate the possible deviations of attributes from the design intent for each of the components identified.

The Hazops was carried out broadly in accordance with the guidance in Interim Defence Standard 00-58 [4].

3.2. Procedure

The approach was to apply the guide words to the informal specifications. In this study, the deviations NO, MORE, LESS, AS WELL AS, PART OF, REVERSE, OTHER THAN, EARLY and LATE ACTION were considered for the operations as a whole; in addition, specific errors in output values were considered using MORE, LESS and WRONG VALUE.
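To make the recording procedure concrete, the following sketch shows one way the guide words and template fields could be represented in machine-readable form. The field names and the example entry (paraphrasing item h1 of Table 1 below) are illustrative, not a transcription of the project's Access database:

# Sketch: a machine-readable form of the Hazops recording template of
# Table 1. Field names mirror the template columns; the entry below
# paraphrases item h1 and is for illustration only.

from dataclasses import dataclass
from typing import Optional

GUIDE_WORDS = ["NO", "MORE", "LESS", "AS WELL AS", "PART OF",
               "REVERSE", "OTHER THAN", "EARLY", "LATE ACTION"]
VALUE_GUIDE_WORDS = ["MORE", "LESS", "WRONG VALUE"]

@dataclass
class HazopItem:
    item_id: str
    guide_word: str          # drawn from the guide word lists above
    characterisation: str
    consequence: str
    indication_protection: str = ""
    recommendation: str = ""
    risk_index: Optional[int] = None   # sentenced later (Section 4)

h1 = HazopItem(
    item_id="h1",
    guide_word="NO",
    characterisation="Task not created",
    consequence="Stack overflow check does not work for the last task",
    indication_protection="No indication of non-running",
    recommendation="Consider static arguments against stack overflow",
)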

The analysis was initially directed towards functions that we expected to have a criticality of less than 4 (the assumed criticality of the system as a whole), in order to derive and justify that lower criticality.

The Hazops analysis was recorded using a template, as shown in the example in Table 1. The record summarises the function being assessed, identifies credible failure modes and provides a detailed assessment of the consequences of failure.

The Hazops process also developed recommendations such as:

• The need for further study of system behaviour: for example, when the results were not readily apparent to the Hazops team.
• The need for further analysis to underwrite the assumptions in the Hazops. Often these concerned the incredibility of certain failure modes.
• The need to consider additions to the design.

After the assessment of failure impact, the software components were sentenced as discussed in Section 4.

4. Ranking software components

The basis of the classification scheme is that software and hardware in the system have different types of requirements. Broadly speaking, these arise from the need to show correctness of the safety system functions, correctness of the internal fault handling features, and lack of impact when the fault handling features fail.

These requirements will depend on the frequency of the challenge, which in turn depends on the overall system architecture and design.

Table 1
Example of Hazops record

Function: FcTst_Main (file h/fctst.c)
Characterisation: Dummy task. Created as the last task to provide markers at the top of memory used for supervision of overflow in the common task stack pool. Contains test functions, but these should not be enabled in the release version: they are compiled out.
(The template also provides columns for the risk calculation (factors used) and the risk index; these were not completed for the items below.)

Item h1. Guide word: No action.
  Characterisation: Task not created.
  Consequence/implication: Stack overflow check does not work for the last task. If the last task overflows its stack, then it could affect anything.
  Indication/protection: No indication of non-running.
  Question/recommendation: Consider checking that the task is running in the test version. Consider applying static arguments against stack overflow.

Item h2. Guide word: More action.
  Characterisation: Task never sleeps.
  Consequence/implication: Could block processes if the task scheduler fails.
  Indication/protection: If the scheduler fails, everything fails.

Item h3. Guide word: More action.
  Characterisation: Task given high priority and never sleeps.
  Consequence/implication: Could block other processes without needing a scheduler failure.
  Indication/protection: Detected by the system scheduler, which is run on an interrupt, because regular cyclic tasks do not run. Priorities are fixed at system configuration.

Item h4. Guide word: Less action.
  Characterisation: Task scheduled less often.
  Consequence/implication: None: does nothing.


In general, because of the redundancy in the architecture and the periodic manual testing, the fault handling features will have lower integrity requirements. The most important events are likely to be:

• Failure of common protection functions (where correct operation is important).
• Interference failures of supporting functions (such as fault handling and system maintenance) that affect the integrity of the protection functions.

In this section we describe the process of ranking the previously identified components according to risk, assessed on likely failure rate and consequence. This ranking builds on the Hazops analysis described above. The results of the criticality analysis are summarised in a Safety Criticality Index (SCI) that defines the importance to safety of a software element. It does not indicate the resources required for the qualification of that software: it may be that very strong and easily defended arguments exist to show that a piece of software meets very stringent requirements, while other, more difficult, arguments are needed for lower criticalities.

Each failure was attributed a value based on its (system-level) consequence, the estimated frequency of demand, and the degree of mitigation of the consequence. The SCI was then derived from this, and a sentence encapsulates the results of this assessment. It should be noted that the same software item might be in several classes at once; the highest classification would take precedence, which is a conservative assumption.

4.1. Defining the SCI

It is useful to distinguish different types of risk from the software:

1. it defeats a safety function;
2. it spuriously initiates a safety function;
3. it defeats the handling of a safety function after its initiation.

The SCI calculation evaluated these separately, and for each type of risk considered:

• the worst case consequence on the computer-based safety function;
• the frequency with respect to demands placed on the safety function;
• the mitigation.

The basis for the SCI calculation is a semi-quantitative calculation of the risk from the software, where informally:

    software criticality = consequence × frequency / mitigation

The Software Criticality Index is only semi-quantitative, as the factors used in the SCI are derived in a sentencing process based on engineering judgement. However, from a quantitative perspective, the SCI can be viewed as the logarithm of the relative risk posed by the software, i.e.

    log(criticality) = log(consequence) + log(frequency) − log(mitigation)

Or, put in terms of the index numbers representing these logarithmic values:

    SCI = (consequence_index + relative_frequency_index − mitigation_index) = (C + F − M)

If there is no consequence, we set C to 0 and the SCI to 0. The maximum SCI is limited to 4 and the minimum to 0. Appendix A provides a more detailed rationale for the above equation.
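A minimal sketch of this sentencing arithmetic, including the clamping to the range 0 to 4, is shown below. The function name and signature are ours; the index values would come from engineering judgement during sentencing:

# Sketch: the SCI sentencing rule described above, with clamping to [0, 4].

def sci(consequence_index: int, frequency_index: int,
        mitigation_index: int) -> int:
    """SCI = C + F - M, clamped to [0, 4]; no consequence means SCI 0."""
    if consequence_index == 0:
        return 0
    raw = consequence_index + frequency_index - mitigation_index
    return max(0, min(4, raw))

# Example taken from Appendix B ('failure to shutdown' column):
# C = 3, F = +1, M = 1  ->  SCI = 3
assert sci(3, 1, 1) == 3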

In order to systematically sentence the SCI of the components, rules are needed to relate the consequence of the failure mode to the loss of the system functions (as the SCI is relative to this), and to the length of time the system is exposed to the risk.

For example, for a process protection application, rules are needed for the severity of:

• Loss of a controlled process shutdown function.
• The consequences of spuriously initiating a shutdown.

There are a number of other factors that might change the consequence:

• Failure only affects a single channel of a redundant system (e.g. faulty diagnostic software).
• A delayed shutdown rather than a total defeat of the shutdown function.

There are also mitigations that would need to be taken into account. For example:

• The identification of precise failure modes that defeat a shutdown means that there are often no specific mitigations. However, the failure mode that defeats the shutdown is often one of many; the mitigation results from taking into account the fact that another failure, which does not defeat the shutdown, may occur first. This could be verified by examining the impact of actual historic errors.

The frequency component is also influenced by the number of demands on the component. The rules needed in sentencing are for:

• Moving from continuous operation, or from many demands, to figures for failures per demand.
• When a second software failure is needed to expose the fault, the reduction in the frequency index for defeating the safety function.
• The impact of the hardware failure rate that can lead to a demand on the fault detection and management software. These failures often lead only to the loss of a single component in a redundant architecture, and the SCI is normally dominated by the consequence of other common mode failures.

4.2. Assigning a criticality index (SCI)

The results of the Hazops were reviewed and consolidated, and then the definition of the SCI was applied to each of the functions considered. For each top-level function, we considered features that impact the consequence, frequency or mitigation. The process is illustrated by the example table in Appendix B. Further analyses on the overall results were performed in order to:

• Derive the worst case criticality index in each risk category (e.g. spurious execution of safety function, safety function defeated) for each procedure in the Hazops, by taking the maximum over the subfunctions considered.
• Derive the worst case criticality index over all categories for each procedure in the Hazops.
• Map each procedure to the criticality associated with its top-level caller. Where there were many callers, the maximum value was taken.
• Summarise the criticality assignment to procedures by showing the number of procedures of each criticality.

This analysis required only the correct assignment of SCIs to the top-level procedures. Propagation down the tree is a mechanical matter of taking, for each procedure, the maximum criticality of all the top-level procedures from which it can be reached, and so to which it can contribute. The coverage obtained in this way can be checked automatically, and showed that the analysis had covered all the procedures in the build of the software analysed. These rules for propagating the criticality index down the call tree are conservative, and another pass of the SCI assignments was later undertaken to remove some of the conservatism during the SCA validation.
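This mechanical propagation amounts to a reachability computation over the call graph. A minimal sketch, assuming the call graph and the top-level assignments are available as simple dictionaries (all names illustrative):

# Sketch: conservative propagation of SCIs down the call graph. Each
# procedure receives the maximum SCI of every top-level procedure that
# can reach it, so shared code inherits the most critical call tree.

from collections import deque

def propagate_sci(calls, top_level_sci):
    sci = dict(top_level_sci)
    for root, root_sci in top_level_sci.items():
        queue = deque([root])
        visited = set()
        while queue:
            proc = queue.popleft()
            if proc in visited:
                continue
            visited.add(proc)
            # Conservative rule: maximum over all reaching roots.
            sci[proc] = max(sci.get(proc, 0), root_sci)
            queue.extend(calls.get(proc, []))
    return sci

# A shared utility called from SCI 3 and SCI 1 trees ends up at SCI 3:
calls = {"top_a": ["util"], "top_b": ["util"]}
assert propagate_sci(calls, {"top_a": 3, "top_b": 1})["util"] == 3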

5. Validation of the SCA

The criticality analysis required validation that:

1. The analysis was applied to the correct object, i.e.:
   • It applied to the actual code being used in the system, i.e. the documents used all related to the same versions as the product. This was addressed by the re-engineering of the program structure from the actual code.
   • The Hazops were adequately undertaken (with respect to procedure, people and documentation). This was addressed by the review of the results and by the check on the consistency of the SCI assignment, as described in Section 5.1.

2. The assumptions about the behaviour of the software and the surrounding system were correct, i.e. that:
   • The judgements used about the code partitioning were correct. Verification of this involved covert channel analysis (Section 5.2) and was addressed by subsequent static analysis of pointers [5] and by the trust placed in the reverse engineering and expert judgement.
   • The mitigations assumed (e.g. benefits from redundant channels, time skew) were correct, and the behaviour of the software was as assumed by the experts. This could be checked by confirmatory tests or analysis.

The process of assigning software criticality indices to procedures depends on analysis by domain experts and on a detailed understanding of the function of the code and the system-level consequences of failure. When dealing with a large body of complex code, it is possible that this analysis will miss some aspect of the function that should affect the criticality assigned. One way to verify the self-consistency of an assignment is to examine the interconnections between procedures, which should place constraints on their relative criticalities. This can be done mechanically and helps to build confidence in the result (although clearly it is possible for an assignment to be internally consistent but not meet the external criticality requirements of the application).

There are three aspects to such verification. The first is to decide what consistency conditions should be satisfied by an assignment of criticalities. The second is to determine how these conditions can be evaluated mechanically. Finally, where this information cannot be extracted completely and exactly, we need to decide whether we can make a useful approximation, and whether that approximation is conservative. A restricted analysis limits the amount of consistency checking that can be done, while a conservative analysis may result in apparent inconsistencies that further investigation shows not to be problems with the assignment. Section 6 summarises the results from an application of these methods and describes how the apparent inconsistencies found were resolved.

5.1. Consistency of SCI assignments

The consistency checks we performed were based on the data flows between procedures. We assigned to each variable the maximum criticality of any procedure that used its value, and checked that any procedure that affected its value had a criticality at least as high. (This makes the conservative assumption that all variables contribute to the critical function of the procedure.) The consistency constraints below consider three ways in which one procedure can affect a value in another: through values returned from a call (by assignment to a pointer parameter or as the function result); through changes made to static variables directly by a procedure and then used by another; and through changes made to static variables in the procedures called by a given procedure.

In this consistency procedure, we did not consider the criticality of shared constants. The validation could be extended by allocating a criticality index to each constant according to the maximum criticality of the procedures that use it.

The diagrams show the consistency conditions graphically. Code is shown in rectangular boxes, and data in oval boxes. The flow of data associated with each constraint is shown by the arrows between the boxes. The relationship on the SCI constraint arrow holds between the box at the tail of the arrow and that at its head. (So, for example, the caller must have an SCI less than or equal to that of the called procedure for the downwards consistency condition to hold.)

5.1.1. Downwards call consistency

This condition addresses the constraint imposed by the communication of results from a called procedure to a calling one, through its results or through pointers to local variables in its parameter list (Fig. 1).

In general, these returned values are used by the calling procedure in deriving its own results. They are thus as critical as the results that they affect. We made the conservative assumption that all called procedure results can affect any calling procedure result, and hence a called procedure should be no less critical than its caller. This condition only needs to hold if a procedure returns a result (i.e. is not declared as void) or writes to the address provided by a pointer parameter.

This condition is called downwards call consistency because the constraint propagates downwards through the call tree. It is clearly met by any assignment produced by defining criticalities for source nodes of the call graph and then propagating them by taking the maximum of the callers at each procedure, as was done in the original SCI assignment.

5.1.2. Static data consistency

This condition addresses the constraints placed on the criticality of a procedure by the static variables it modifies. If these have a large scope, the consequences are visible to other procedures, but even a change to a function static variable will influence subsequent calls of the procedure, which may be from different calling procedures. The modifications to be considered can be made by referencing the static variable directly or via a pointer; modification in sub-procedures is not included, as this is addressed by the next condition (Fig. 2).

We can associate a criticality with each static variable, determined by the maximum criticality of any procedure using its value. In accordance with the general principles above, the criticality of a procedure changing the value must be at least as high.

5.1.3. Upwards call consistency

One procedure may call another that changes static variables, either directly or through a chain of intermediate procedures. In this case, the data flow is not only from the parameters passed through the calls, which affect the value of the static variable, but also from the variables in the conditions under which the call is made, which determine whether the variable is set at all. Under the conservative assumption that all variables affect the value, the criticality of every caller should thus be at least as high as the criticality of the callee. For example, if we consider a low-level routine that writes to a hardware bus address, passing the right address is equally critical for all callers, because using the wrong address could overwrite the most critical data accessible on the bus (Fig. 3).

This constraint is called upwards call consistency because obligations flow from callee to caller up the call trees. It is not automatically satisfied by propagating values down the call graph, because one procedure may have many callers that have been assigned different criticalities.

Fig. 1. Downwards consistency.

Fig. 2. Static data consistency.

Fig. 3. Upwards call consistency.


Going back to the example of the low-level routine that writes to a hardware bus address, suppose this routine has SCI 3 (perhaps because the hardware bus has SCI 3). Assume that the low-level routine is called by two procedures, one with SCI 3 and another with SCI 1. Both procedures could affect the way the low-level routine updates the bus address. Therefore, the caller with the lower SCI should be assigned a higher SCI, at least as high as that of the called procedure. Upwards consistency guarantees that the caller with SCI 1 would have its SCI changed to 3. Indeed, we might expect this kind of subtle interconnection to be a common source of inconsistencies in SCI assignments.
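All three conditions can be checked mechanically once the call graph and the static-variable read/write relations are available. The sketch below shows such a checker under the paper's conservative assumptions; the relation names are ours, and the relations themselves would be extracted by a tool such as CodeSurfer:

# Sketch: mechanical checks for the three SCI consistency conditions.
# 'sci' maps procedure -> assigned SCI; 'calls' is the call graph;
# 'returns_result' marks procedures returning values to their callers;
# 'reads'/'writes' map procedures to the static variables they use/modify.

def check_consistency(sci, calls, returns_result, reads, writes):
    violations = []

    # Downwards call consistency: a called procedure that returns a result
    # must be no less critical than its caller.
    for caller, callees in calls.items():
        for callee in callees:
            if callee in returns_result and sci[callee] < sci[caller]:
                violations.append(("downwards", caller, callee))

    # Static data consistency: a writer of a static variable must be at
    # least as critical as any reader of that variable.
    var_sci = {}
    for proc, vars_read in reads.items():
        for v in vars_read:
            var_sci[v] = max(var_sci.get(v, 0), sci[proc])
    for proc, vars_written in writes.items():
        for v in vars_written:
            if sci[proc] < var_sci.get(v, 0):
                violations.append(("static", proc, v))

    # Upwards call consistency (direct calls only; the full condition
    # would use the transitive call relation): a caller must be at least
    # as critical as any callee that writes static variables.
    writers = {p for p, vs in writes.items() if vs}
    for caller, callees in calls.items():
        for callee in callees:
            if callee in writers and sci[caller] < sci[callee]:
                violations.append(("upwards", caller, callee))

    return violations

# The bus-writing example above: the SCI 1 caller is flagged.
sci = {"caller_hi": 3, "caller_lo": 1, "bus_write": 3}
calls = {"caller_hi": ["bus_write"], "caller_lo": ["bus_write"]}
violations = check_consistency(sci, calls, returns_result=set(),
                               reads={}, writes={"bus_write": {"bus"}})
assert violations == [("upwards", "caller_lo", "bus_write")]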

5.1.4. Checking the consistency conditions

Checking the consistency conditions above required the following information from the code:

• The call graph.
• The subset of procedures returning results directly to their callers, either by an explicit return parameter or via pointers to local variables.
• The relationship between procedures and the static variables that they read, either directly or indirectly via pointers.
• The relationship between procedures and the static variables that they write, either directly or indirectly via pointers.

This information was extracted mechanically using CodeSurfer version 1.1p1. The code was too large to analyse in one piece, so it was divided into smaller components (projects). The limit on project size was set not so much by the analysis time as by the size of the files produced. A reasonable amount of effort was needed to synthesise a view of the whole project in a conservative way.

5.2. Covert channels

In classical safety analysis, the idea of a segregation domain is used to define a region within which a failure can propagate. This is done so that interactions of, say, hardware components of a physical plant are not overlooked when a functional or abstract view of the system is considered. A segregation domain defines an area where common cause failures are likely.

In assessing safety-critical software, a great deal of effort is usually placed on trying to show segregation or non-interference of software components. This often involves sophisticated and extensive analysis and a design-for-assurance approach that builds in such segregation. Indeed, the level of granularity that we take for a software component is strongly influenced by our ability to demonstrate a segregation domain: there may be no point in producing a very refined analysis within a segregation domain.

When we are dealing with software, the representation used defines certain types of interaction (e.g. dataflow) as intended by the designer. It is these representations that are used as a basis for criticality analysis, but the possibility remains that there are unintended interactions, or interactions not captured by the notation (e.g. dynamic pointer allocation). Furthermore, some SOUP might not even have this documentation, so that we might be relying on expert judgement to assess the possible flows.

Therefore the possibility remains of what, by analogy with the security area, we term covert channels or flows. Covert channel analysis should be addressed by the use of the guide words in the Hazops analysis and by being cautious about what is rejected as incredible. Other techniques that can possibly be used for covert channel analysis include:

• manual review
• tool-supported review (e.g. program slicing using a tool like CodeSurfer [2])
• symbolic execution (e.g. pointer analysis using PolySpace [6])
• static flow analysis
• formal verification

In the application discussed here, this was addressed by program analysis and review.

6. Discussion

This section presents the results of the more detailed analysis and summarises the issues raised in the development and application of the SCA approach. Some of these issues are common to the subsequent work on static analysis [5].

6.1. Impact of more detailed analysis

The more detailed final analysis (involving the re-engineered call trees and the conservative, mechanical propagation of SCIs from the top-level functions downwards) identified many more unused procedures (about 30%). This resulted in a corresponding reduction in the qualification work required. The number of SCI 4 procedures was also reduced, but the detailed assessment in many cases found at least one route by which a top-level procedure could become SCI 3 that was not visible in the initial analysis, so procedures tended to move from SCI 2 to SCI 3.

The impact of the more detailed analysis on the proportion of procedures in each SCI category is shown in Table 2. The initial SCI was based on the software architecture partitions. As one might expect, partitions at level 4 are not homogeneously critical and contain a range of functions. The table also shows that although the initial broad categorisations are indicative, it would not be sufficient to assure to the average of the partition: the detailed assignment is important.

6.2. Tools

There were problems of scale: CodeSurfer was unable to analyse the program in one go. We needed to split the software into separate projects and then devise means for checking for interconnections between projects (via the database of information extracted from all the projects). Clearly, tools that cope with more complex systems are desirable, and this may mean that tools need significant redesign, for example to incrementally build a database of results. While our application had some particular pointer complexities, it would seem that about 100 kloc is the limit for this tool at present. Other program slicers [7] claim to scale, but these comparisons are not properly benchmarked.

6.3. Levels of detail and diminishing returns

There are clearly diminishing returns in the level of detail of the SCA. More detail might limit the amount of software to be analysed, but takes more time to justify and to demonstrate segregation, and there is a trade-off against having more time for the actual analysis. Progressive application may be the best way of proceeding, and this is what we recommend in Table 3, taken from Ref. [1].

The optimum level of SCA will depend on the costs associated with the subsequent assurance activities. Assurance could make use of testing, static analysis, inspection and evaluation of field experience. Where these costs increase significantly with SCI, expending more SCA effort to limit downstream assurance costs can be justified. On the other hand, where the downstream cost is low, or relatively insensitive to SCI, the optimum choice may be to assure all software to the maximum SCI.

There are also pragmatic issues that influence the level of detail needed in an SCA. If the SCA is being used to decide the amount of testing required, all that might be required is an estimate of the highest SCI in a testable collection of software components (some part of the pruned call tree).

6.4. Hazops

A range of expertise is needed for the safety analysis: expertise in the software, in the computer equipment and in the application is needed to assess failure consequences. The volume of work can be significant if there are many top-level functions to assess. This has an impact on resources, of course, and also on the credibility of the results, as large Hazops can become difficult to sustain when there are only a few experts available.

The representation used for the Hazops is a key issue. The differences between the actual code and the block diagrams, and the prevalence of unused code (not just from configuration of the product but also left in from development and testing), meant that validation of the SCI required detailed analysis at the code level.

There are also some issues with the Hazops process as to how much can be done by individuals and then reviewed (a desktop approach) and how much needs to be done by groups. Apart from the formal outputs, the Hazops led to an increased understanding of the system, and it would be useful to somehow recognise the value of this in the qualification process.

Hazops requires knowledge of the consequences of failures. There is a need to validate some of these judgements about the behaviour of complex systems: failure impact analysis may be amenable to some automated support (see, for example, [8]).

6.5. SCI scheme

There is an issue of how to have a consistent index when part of the application is demand driven and part continuous; this is also an issue with the use of IEC 61508 and SILs generally.

In the safety application, there was a need for continuous control of the process during the shutdown, which can take many hours. This meant that fail-stop was not an option for that part of the equipment, which can complicate the SCI and the safety arguments. It meant that arguments that faults would be detected were not sufficient.

The SCI scheme is necessary for COTS/SOUP, but the definitions and approach could be put on a more rigorous basis.

Table 2
Initial SCI and impact of detailed analysis (percentage of procedures in each SCI category)

SCI    Initial SCI    Detailed SCI
0      10             40
1      20              5
2      40              0
3       0             35
4      30             20

Table 3
SCA stages for progressive application

Initial SCA
  Based on: expert judgement, top-level architecture, high-level description.
  Remarks: if it exists, the architecture may not reflect functionality, so there is a need to go a level lower.

Design SCA
  Based on: software design descriptions.
  Remarks: if they exist, the descriptions may not capture behaviour sufficiently, so there is a need to appeal to expert judgement and code review.

Code SCA
  Based on: source code/assembler.
  Remarks: may be too detailed to abstract behaviour. Tools are needed to extract control, data and information flow.

The link to standards could also be elaborated, and the need for pre-defined categories related to SILs discussed (see, for example, the proposals in [9]).

6.6. SCA validation

There was a problem of conservatism causing everything to float up a criticality level (as is often the case with multilevel security). We therefore need conservative but usable propagation rules. The detailed SCA reduced the amount of level 4 software at the expense of needing to show certain verifications. The pragmatic approach to reducing the propagation was to verify some properties, so that, given this verification, the SCI assignments were correct.

6.7. Overall assurance activities

The criticality analysis provides a focus for the overall assurance activities and for the safety case design. In practice, there is the issue of how many criticality index levels are really needed. More importantly, we lack empirical evidence (as opposed to expert judgement or compliance with standards) of what techniques and what rigour are needed for the different criticality indexes and the different types of software. This is an unresolved problem in all current standards.

7. Conclusions

The overall conclusions of this work are that:

1. SCA enabled the safety assurance activities to be focused on specific software components. Not only does this bring safety benefits, but it also reduced the effort required for a whole range of subsequent validation activities: software reviews, testing and field experience studies, as well as static analysis.

2. A useful by-product of SCA is the identification of the software and its structure. This provides an independent assessment of the system design documents and, where these are of doubtful provenance, provides some re-engineering of them.

3. The Hazops method can be adapted to assess the safety impact of software component failures.

4. The SCI used to classify the criticality of software components was useful, but in practice the number of levels requiring different treatment in subsequent qualification is fairly small. So there is scope for simplification and for closer coupling between the definition of the SCI and its subsequent use.

5. The level of detail considered in the SCA should be a trade-off between the effort required to validate software at a given criticality and the effort required to justify the partitioning of components into different criticality levels. A more detailed SCA might reduce the amount of software classified at the highest criticality, but the effort required for classification might be more than the validation effort saved. The optimum level depends on the effort required in the subsequent validation activities.

6. There is a need for improved tool support for analysing code structure and for assessing the impact of failures. This is especially true if the approach described here is to be scaled up to even larger systems. The use of different programming languages, or adherence to safe subsets and coding guidelines, would increase the scale of what the tools can cope with. However, many systems of interest will be written in C, developed over many years by different groups of people.

7. In general, some expertise in the application, design and code is needed to make the SCA tractable. However, in cases where expertise is not available, this may be compensated by further testing, static analysis or code review. Further conclusions on the generality of the approach will result from our current work on applying the SCA approach to a different class of SOUP/COTS.

Acknowledgements

This work was partly funded under the HSE Generic Nuclear Safety Research Programme under contract 40051634 and is published with the permission of the Industry Management Committee (IMC). This work was assisted by the EPSRC Dependability Interdisciplinary Research Collaboration (DIRC), grant GR/N13999.

The views expressed in this report are those of the authors and do not necessarily represent the views of the members of the IMC or the Health and Safety Commission/Executive. The IMC does not accept liability for any damage or loss incurred as a result of the information contained in this paper.

Appendix A. Treating the SCI probabilistically

There is a probabilistic basis to the SCI assignment formula given in the main body of the paper.

Let us define:

risk_P         the annual risk of demand failures
con_P          the consequences of the protection system failing on demand
pfd_P          the probability of failure of the protection system per demand
demand_rate    trip demands per year

For each software component i, let us define:

risk_i         the risk from component i failing on demand
con_i          the consequence of component i failing on demand
pf_sw_i        the probability of failure on demand given erroneous behaviour of software component i
psw_err_i      the probability of erroneous behaviour in component i given a software demand
nsw_dem_i      the number of times software component i is activated on a demand

For a given safety function, the risk is:

    risk_P = con_P · pfd_P · demand_rate

For a software component of the system:

    risk_i = con_i · pf_sw_i · psw_err_i · nsw_dem_i · demand_rate

We can express this as a relative risk:

    risk_ratio_i = con_i · pf_sw_i · psw_err_i · nsw_dem_i / (con_P · pfd_P)

We are interested in the impact of the erroneous behaviour of software component i. If we set psw_err_i = 1, we obtain:

    criticality_i = (con_i / con_P) · pf_sw_i · nsw_dem_i / pfd_P

Taking logs to get an index of importance and defining:

C_i = log(con_i / con_P)
M_i = log(1 / pf_sw_i)    (as pf_sw_i < 1, M_i is an index ≥ 0)
F_i = log(nsw_dem_i)
sci_i = log(criticality_i)
SIL = log(1 / pfd_P)      (relates to the lower bound on the pfd target)

we simply get:

    sci_i = SIL + C_i + F_i − M_i

In practice, the sentencing scheme described in the main paper uses a consequence index that is a combination of SIL + C_i. In the equation above, C_i is a negative index, as con_i may be less severe than the worst-case consequence con_P (e.g. a delayed trip rather than no trip). Hence the overall consequence index is related to the SIL (0–4) of the safety function, but modified by consideration of the expected result of the component failure.

If the software component can contribute to different safety function failures, then a similar calculation is required for each one. The maximum criticality of the software component can then be determined.

There is an equivalent scheme for continuous components.

Appendix B. Example of SCI sentencing

Supervision task: on service master

Description:

To support multiprocessing on a bus with only one interrupt line, it is necessary to nominate one processor to handle interrupts on behalf of the others. This is the service data master, and communication with the others is done via requests posted in global memory.

A state machine on each processor determines which is service data master. A service data master supervises its own live signal. The slave processor supervises the master. If the service data master cannot see its own live signal, it will reboot. This is a potentially critical failure.

Most access to the bus is to communicate data with the IO modules; this is handled by the hardware and needs no software intervention.

This task handles the requests for interrupt handling in the service data master and slaves. Its actions depend on whether it runs on the master or on the slave, and so these have been analysed separately.

Key hazards: h99
Frequency: system task frequency.
Consequences: failure to generate the live signal will cause a reboot. Other failures may result in not being able to configure I/O modules. Configuration is only needed at startup, or for single channels on reset, so only a channel in operation is lost in these cases.
Mitigation: prompt failures on start-up will occur before control is taken.

SCA table

                   Spurious shutdown    Failure to shutdown    Shutdown management failure
Frequency          Continuous           +1                     Continuous
Consequence        2                    3                      4
Mitigation         One of several       One of several         One of several
                   failure modes: 1     failure modes: 1       failure modes: 1
SCI (C + F − M)    1                    3                      3
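As a cross-check of the sentencing arithmetic, the SCI row follows from SCI = C + F − M, reading the frequency index of a continuous entry as 0 (our interpretation of the table):

% Check of the sentencing arithmetic above (SCI = C + F - M), taking the
% frequency index for continuous operation as 0 (an assumption of ours):
\[
\begin{aligned}
\text{spurious shutdown:}           &\quad 2 + 0 - 1 = 1,\\
\text{failure to shutdown:}         &\quad 3 + 1 - 1 = 3,\\
\text{shutdown management failure:} &\quad 4 + 0 - 1 = 3.
\end{aligned}
\]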

Recommendation:

Code analysis to confirm no transition from master to slave in the state machine.

References

[1] Bishop PG, Bloomfield RE, Froome PKD. Justifying the use of software of uncertain pedigree (SOUP) in safety-related applications. Report No. CRR336, HSE Books, 2001. ISBN 0 7176 2010 7. http://www.hse.gov.uk/research/crr_pdf/2001/crr01336.pdf
[2] CodeSurfer user guide and technical reference, Version 1.0. GrammaTech, 1999.
[3] Burns DJ, Pitblado RM. A modified Hazop methodology for safety critical system assessment. In: Redmill F, Anderson T, editors. Directions in safety-critical systems. Berlin: Springer; 1993.
[4] Interim Defence Standard 00-58. Hazop studies on systems containing programmable electronics. Part 1: requirements. Part 2: general application guidance. Issue 2, MoD, 2000.
[5] Bishop PG, Bloomfield RE, Clement T, Guerra S, Jones C. Static analysis of COTS used in safety application. Adelard document D198/4308/2, 2001.
[6] PolySpace Technologies. http://www.polyspace.com
[7] Tip F. A survey of program slicing techniques. J Program Lang 1995;3(3):121–89. http://citeseer.nj.nec.com/tip95survey.html
[8] Cichocki T, Gorski J. Formal support for fault modelling and analysis. In: Voges U, editor. SAFECOMP 2001, LNCS 2187. Berlin: Springer; 2001. p. 202–11.
[9] Faller R. Project experience with IEC 61508 and its consequences. In: Voges U, editor. SAFECOMP 2001, LNCS 2187. Berlin: Springer; 2001. p. 212–26.
