Safety in Interactive Systems

Christopher Powell

Safety?

• Usually in HCIT, we have talked about properties of the interactive system as being facets of usability

• However, there are other properties that we often need to consider – safety is one of them

• Informally, safety in interactive systems broadly means preventing incidents that can lead to catastrophic loss, whether from the human, the machine or the organisation

Human error

• We often talk about human error – if you have read the QUAN primer on error, you know that the words “human error” are kind of meaningless

• Why is it meaningless, though?
– Because life is messy … there are lots of routes to any given error
– Observed phenomena may be caused by many underlying factors, some related to the human, some to the device and some to the environment/context

Reason’s “Swiss Cheese” Model of Human Error

• James Reason proposed a model in 1990 for understanding where errors occur, categorising them by the process in which they arise.

• He proposed that errors occurring in one process can be propagated forward through a system.

• Forms the basis for the Human Factors Analysis and Classification System (HFACS)

Swiss Cheese Model

• Each process is a layer of cheese, with holes where errors can slip through:

HFACS: Swiss Cheese Model

• In most cases, safeguards in other processes catch errors and correct them.

• However, sometimes the holes in the system line up, and an error makes it all the way to the end, with the effects of the error being realised.

Unsafe Acts

• Unsafe acts are those tied to the action cycle of human–system interaction.

• They fall into two categories: errors and violations.

• This looks kind of familiar:
– Errors deal with perception, evaluation, integration and the execution of actions.
– Violations deal with goals, intentions and action specifications.

HEA and HRA

• How is this classification useful to us?

• With respect to error, there are two different processes that we can undertake:
– Human Error Analysis – trying to capture where errors can happen in the system, either proactively (evaluation prior to an incident) or retrospectively (after one has occurred)
– Human Reliability Analysis – trying to capture the probability that a human fault will occur in the system at some point

Human Error Analysis Techniques

• There are around 40 different techniques that I could point to in the literature

• Many of these have little empirical basis and have never been validated

• Some have had some work done, but it is debatable how well they work

• We’re going to look at different types of error analysis methods over the next couple of hours

Error Modes

• Most modern techniques use the idea of an “error mode”

• Error modes are categories of phenomena that we see when an incident occurs in the world

• These phenomena could have many causes – we can track back along the causal chain

• Alternatively, we can take the phenomena we see (or suspect will happen) and compare them to interface components to see what might happen in the future

SHERPA

Background

• SHERPA stands for “The Systematic Human Error Reduction and Prediction Approach”

• Developed by Embrey in the mid-1980s for the nuclear reprocessing industry (but you cannot find the original reference!)

• Has more recently been applied with notable success to a number of other domains

• (Baber and Stanton 1996, Stanton 1998, Salmon et al. 2002, Harris et al. 2005…)

• Has its roots in Rasmussen’s “SRK” model (1982)…

SRK Reminder …

• Skill-based actions
– Those that require very little conscious control, e.g. driving a car on a known route

• Rule-based actions
– Those which deviate from the “normal” but can be dealt with using rules stored in memory or rules which are otherwise available, e.g. setting the timer on an oven

• Knowledge-based actions
– The highest level of behaviour, applicable when the user has either run out of rules to apply or did not have any applicable rules in the first place. At that point the user is required to use in-depth problem-solving skills and knowledge of the mechanics of the system to proceed, e.g. the pilot response during the QF32 incident

SHERPA Taxonomy

• SHERPA, like many HEI techniques, has its own cut-down taxonomy, here drawn up taking cues from SRK

• Its prompts are firmly based on operator behaviours, as opposed to listing every conceivable error as purely taxonomic approaches do

• The taxonomy was “domain specific” (nuclear), but SHERPA has still been shown to work well across other domains (see references)

• Rather than have the evaluator consider at what psychological level the error has occurred, the taxonomy simplifies this into the most likely manifestations (modes) in which errors occur

• The headings for SHERPA’s modes are (expanded next):
– Action (doing something, like pressing a button)
– Retrieval (getting information from a screen or instruction list)
– Checking (verifying an action)
– Selection (choosing one of a number of options)
– Information Communication (conversation, radio call, etc.)

Taxonomy - Action

• Action modes:
– A1: Operation too long/short
– A2: Operation mistimed
– A3: Operation in wrong direction
– A4: Operation too little/much
– A5: Misalign
– A6: Right operation on wrong object
– A7: Wrong operation on right object
– A8: Operation omitted
– A9: Operation incomplete
– A10: Wrong operation on wrong object

Taxonomy - Retrieval & Checking

• Retrieval modes are:
– R1: Information not obtained
– R2: Wrong information obtained
– R3: Information retrieval incomplete

• Checking modes are:
– C1: Check omitted
– C2: Check incomplete
– C3: Right check on wrong object
– C4: Wrong check on right object
– C5: Check mistimed
– C6: Wrong check on wrong object

Taxonomy - Selection & Comms.

• Selection modes are:
– S1: Selection omitted
– S2: Wrong selection made

• Information Communication modes are:
– I1: Information not communicated
– I2: Wrong information communicated
– I3: Information communication incomplete
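
For tool support (e.g. a checklist spreadsheet or script), the taxonomy above can be captured directly as data. A minimal Python sketch – the dictionary layout and variable name are illustrative choices, not part of SHERPA itself:

```python
# SHERPA error-mode taxonomy, keyed by behaviour category.
# Codes and wordings follow the slides; the data structure itself is illustrative.
SHERPA_TAXONOMY = {
    "Action": {
        "A1": "Operation too long/short",
        "A2": "Operation mistimed",
        "A3": "Operation in wrong direction",
        "A4": "Operation too little/much",
        "A5": "Misalign",
        "A6": "Right operation on wrong object",
        "A7": "Wrong operation on right object",
        "A8": "Operation omitted",
        "A9": "Operation incomplete",
        "A10": "Wrong operation on wrong object",
    },
    "Retrieval": {
        "R1": "Information not obtained",
        "R2": "Wrong information obtained",
        "R3": "Information retrieval incomplete",
    },
    "Checking": {
        "C1": "Check omitted",
        "C2": "Check incomplete",
        "C3": "Right check on wrong object",
        "C4": "Wrong check on right object",
        "C5": "Check mistimed",
        "C6": "Wrong check on wrong object",
    },
    "Selection": {
        "S1": "Selection omitted",
        "S2": "Wrong selection made",
    },
    "Information Communication": {
        "I1": "Information not communicated",
        "I2": "Wrong information communicated",
        "I3": "Information communication incomplete",
    },
}

# Example: list the prompts an analyst would consider for a task classified as "Selection".
for code, description in SHERPA_TAXONOMY["Selection"].items():
    print(f"{code}: {description}")
```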

SHERPA Methodology

• SHERPA begins like many HEI methods, with a Hierarchical Task Analysis (HTA)

• Then the ‘credible’ error modes are applied to each of the bottom-level tasks in the HTA

• The analyst categorises each task into a behaviour, and then determines if any of the error modes provided are credible

• Each credible error is then considered in terms of consequence, error recovery, probability and criticality
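
To see how the steps hang together, it can help to sketch the record an analyst fills in for each credible error. This is a hypothetical structure for tool support (the field names are my own, not an official SHERPA format); a filled-in example appears after Step 8 below.

```python
from dataclasses import dataclass, field


@dataclass
class SherpaEntry:
    """One credible error for a bottom-level HTA task (hypothetical layout, not an official format)."""
    task: str            # bottom-level task from the HTA (Step 1)
    behaviour: str       # taxonomy category assigned to the task (Step 2), e.g. "Action"
    error_mode: str      # credible error-mode code (Step 3), e.g. "A8"
    description: str     # what the error looks like in practice (Step 3)
    consequence: str     # what happens if the error occurs (Step 4)
    recovery: str        # how, or whether, the error can be recovered (Step 5)
    probability: str     # ordinal judgement: "L", "M" or "H" (Step 6)
    critical: bool       # binary criticality judgement (Step 7)
    remedies: dict = field(default_factory=dict)  # remedy strategies by heading (Step 8, covered later)
```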

Step 1 - HTA

• Using the example of a Sat-nav
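
The HTA diagram from the original slide is not reproduced here. Purely to make the later steps concrete, the sketch below invents a plausible fragment for destination entry on a sat-nav; the task names and numbering are hypothetical, not taken from the original analysis.

```python
# Hypothetical HTA fragment for "enter a destination on a sat-nav".
# Each node is (task id, task name, subtasks); only bottom-level tasks are analysed in SHERPA.
SATNAV_HTA = (
    "0", "Navigate to a destination", [
        ("1", "Enter destination", [
            ("1.1", "Select 'address' entry from the menu", []),
            ("1.2", "Read the postcode from a letter or note", []),
            ("1.3", "Type the postcode into the device", []),
            ("1.4", "Confirm the displayed destination", []),
        ]),
        ("2", "Follow route guidance", []),
    ],
)


def bottom_level_tasks(node):
    """Yield the bottom-level (leaf) tasks of an HTA node."""
    task_id, name, subtasks = node
    if not subtasks:
        yield task_id, name
    for sub in subtasks:
        yield from bottom_level_tasks(sub)


for task_id, name in bottom_level_tasks(SATNAV_HTA):
    print(task_id, name)
```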

Step 2 - Task Classification

• Each task at the bottom level of the HTA is classified into a category from the taxonomy

(Figure: bottom-level tasks from the sat-nav HTA, labelled Action, Action, Selection and Retrieval.)

Step 3 – Error Identification

• For the category selected for a given task, the credible error modes are selected and a description of the error provided

Selection – “Wrong selection made”: the user makes the wrong selection, clicking “point of interest” or something similar

Retrieval – “Wrong information obtained”: the user reads the wrong postcode and inputs it

Step 4 – Consequence Analysis

• For each error, the analyst considers the consequences

The user makes the wrong selection, clicking “point of interest” or something similar… This would lead to the wrong menu being displayed, which may confuse the user.

The user reads the wrong postcode and inputs it… Depending on the validity of the entry made, the user may plot a course to the wrong destination.

Step 5 – Recovery Analysis

• For each error, the analyst considers the potential for recovery

The user makes the wrong selection… There is good recovery potential from this error, as the desired option will not be available and back buttons are provided. It may take a few menus before the correct one is selected, though.

The user reads the wrong postcode and inputs it… The recovery potential from this is fair, in that the sat-nav shows the duration and an overview of the route, so depending on how far wrong the postcode is, the error may be noticed at that point.

Steps 6, 7 – Probability & Criticality

• Step 6 is an ordinal probability analysis, where Low/Medium/High (L/M/H) is assigned to the error based on previous occurrence
– This requires experience and/or subject matter expertise

• Step 7 is a criticality analysis, which is done in a binary fashion (the error is either critical or it is not)

Step 8 – Remedy

• Step 8 is a remedy analysis, where error reduction strategies are proposed under the headings: Equipment, Training, Procedures, Organisational

Equipment: the use of the term ‘address’ may confuse some people when intending to input a postcode… as postcode is a common entry, perhaps it should not be beneath ‘address’ in the menu system

Procedure: the user should check the destination/postcode entered for validity. The device design could display the destination more clearly than it does, to offer confirmation to the user
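
Pulling Steps 2–8 together for the sat-nav example, the two errors discussed above might be recorded as follows. This is a sketch only: the dictionary keys, the probability/criticality values and the ranking rule at the end are illustrative choices of mine, and the task names follow the hypothetical HTA fragment sketched earlier.

```python
# Two completed SHERPA rows for the sat-nav example, recorded as plain dictionaries.
wrong_selection = {
    "task": "Select 'address' entry from the menu",
    "behaviour": "Selection",
    "error_mode": "S2",  # Wrong selection made
    "description": "The user clicks 'point of interest' or something similar instead",
    "consequence": "The wrong menu is displayed, which may confuse the user",
    "recovery": "Good - the desired option is absent and back buttons are provided",
    "probability": "L",   # illustrative judgement, not given on the slides
    "critical": False,    # illustrative judgement, not given on the slides
    "remedies": {
        "Equipment": "The term 'address' may confuse; postcode entry perhaps should not sit beneath 'address'",
    },
}

wrong_postcode = {
    "task": "Read the postcode from a letter or note",
    "behaviour": "Retrieval",
    "error_mode": "R2",  # Wrong information obtained
    "description": "The user reads the wrong postcode and inputs it",
    "consequence": "A route may be plotted to the wrong destination",
    "recovery": "Fair - the route duration/overview may reveal the mistake",
    "probability": "M",   # illustrative judgement, not given on the slides
    "critical": True,     # illustrative judgement, not given on the slides
    "remedies": {
        "Procedures": "Check the entered destination/postcode; the device could display the destination more clearly for confirmation",
    },
}

# A simple way to prioritise remedy effort: critical errors first, then by probability.
rank = {"H": 0, "M": 1, "L": 2}
for entry in sorted([wrong_selection, wrong_postcode],
                    key=lambda e: (not e["critical"], rank[e["probability"]])):
    print(entry["error_mode"], "-", entry["task"], "-", entry["description"])
```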

Output

• Output of a full SHERPA analysis (Stanton et al 2005)

Summary

• SHERPA is an alternative to HE HAZOP

• Claims in the literature point to it being “more easy to learn” and “more easy to apply by novices” – which is attractive

• Founded on some of the roots of HF work done in the 1970s but …

• … simplifies that work into something that can be applied

References

• Rasmussen, J. (1982). Human errors: a taxonomy for describing human malfunction in industrial installations. The Journal of Occupational Accidents, 4, 22.

• Baber, C. & Stanton, N. A. (1996). Human error identification techniques applied to public technology: predictions compared with observed use. Applied Ergonomics, 27, 119–131.

• Stanton, N. (1998). Human Factors in Consumer Products. CRC Press.

• Salmon, P., Stanton, N., Young, M., Harris, D., Demagalski, J., Marshall, A., Waldman, T. & Dekker, S. (2002). Using existing HEI techniques to predict pilot error: a comparison of SHERPA, HAZOP and HEIST. HCI-02 Proceedings.

• Harris, D., Stanton, N. A., Marshall, A., Young, M. S., Demagalski, J. & Salmon, P. (2005). Using SHERPA to predict design-induced error on the flight deck. Aerospace Science and Technology, 9, 525–532.

• Stanton, N., Salmon, P., Walker, G., Baber, C. & Jenkins, D. (2005). Human Factors Methods. Ashgate.

THEA

Human Error Analysis

• The qualitative nature of the techniques in HEA allows the participants to explore the cause of the error as opposed to the effects of the error.

• This differs from quantification in that it is not about when or if an error will happen but instead about how and why it will happen.

• Most techniques involve asking detailed questions about where errors could occur in a design.

• We have just seen two examples, with SHERPA and HE HAZOP – but there are problems …

Behavioural guide words

• ‘Traditional’ HRA guidewords for error analysis (Swain & Guttman, 1983):
– Errors of omission: omit actions / sub-goals
– Errors of commission: substitute actions / sub-goals; carry out an action incorrectly; insert an extraneous action
– Errors of sequence: actions in the wrong order
– Errors of repetition: actions repeated unnecessarily
– Qualitative errors: too much / too little
– Time errors: too early / too late / too long

Examples of HEHAZOP Guidewords

• Omission: operator fails to close the valve.

• Commission: operator turns the valve clockwise, thereby opening it wider rather than closing it.

• Commission (extraneous): instead of closing the isolation valve, the operator switches off the pump because the pump on/off switch is close to the isolation valve (“doing the wrong thing”).

Some problems of definition

• Hollnagel (1998) distinguishes four variations of “omission”, defined relative to the time interval when the action was required: missing, delayed, premature, and replaced (a commission in its place).

(Original slide: timeline figure illustrating the four variations of omission, Hollnagel 1998.)

• Task: entering an altitude value into the altitude alert window in an aircraft cockpit.

• A “substitution error” here could be:
– Doing something other than entering data
– Entering data into a different device
– Entering a distance value instead of the altitude

• “Commission error” is therefore not very constraining as a guide, because of the large number of possible substitutions.

• What is needed is more cognitive analysis for attributing error causes.

THEA: Technique for Human Error Analysis

• It is not always the case that a product’s safety needs to be quantified at every stage of development.

• Early designs and prototypes can be examined early in the iterative design cycle to determine whether there are major failures.

• As a result, qualitative analyses can be completed by people with comparatively little training, compared with quantification techniques.

• One example of this is THEA (Fields, Harrison, Wright, 2001).

THEA: Scenario Template

• In each scenario, the evaluator completes the following headings:

• Agents: the human agents involved in the interaction with the system.

• Rationale: the reasons the scenario is being examined.

• Situation and environment: a description of the setting, and the environmental triggers and events that occur during the scenario.

• Task context: what tasks are performed (at a high level), what procedures are being used, and are the procedures violated at any time?

• System context: what devices are involved, what known usability problems are there, and what effects can users have on the system that affect the flow of the scenario?

• Action: how are the tasks carried out? How do they relate to the overall goals?

• Exceptional Circumstances: how might things evolve differently if known exceptions occur?

• Assumptions: are there any implicit conditions or activities going on in the environment that should be detailed?
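
As a sketch of how the template could be captured for reuse across scenarios (the field names simply paraphrase the headings above; the structure itself is an illustrative choice, not part of THEA):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TheaScenario:
    """A THEA scenario description, one field per heading (illustrative layout)."""
    name: str
    agents: List[str]               # human agents interacting with the system
    rationale: str                  # why this scenario is being examined
    situation_environment: str      # setting, environmental triggers and events
    task_context: str               # high-level tasks, procedures, possible violations
    system_context: str             # devices involved, known usability problems, user effects on flow
    action: str                     # how tasks are carried out and how they relate to goals
    exceptional_circumstances: str  # how things might evolve if known exceptions occur
    assumptions: List[str] = field(default_factory=list)
```

The in-flight refuelling example that follows would then populate each of these fields.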

THEA: Scenario Example – Scenario 2: In-Flight Refuelling (IFR)

• Agents: a pilot engaged in in-flight refuelling (IFR) activities; tanker crew; Eurofighter MHDD.

• Rationale: the scenario involves the pilot in considerable fault-finding analysis as well as some difficult decision making arising from two fuel system abnormalities. One of these failures may be regarded as latent, as neither pilot nor system can detect the failed-shut refuel valve since this is its default position.

• Situation and environment: the scenario takes place at a designated refuelling altitude over ocean and in fine visual weather conditions.

• Task context: the pilot is required to draw on extensive task and system knowledge, as well as experience, in order to take appropriate actions at the appropriate time.

• System context: the fuselage forward group (FRG) experiences a refuel valve (RSOV) failure shortly after take-off. This cannot be detected by either aircraft or fuel management computers and thus exhibits no external fault manifestation. The first indication of a problem will be after refuelling has commenced, when the MHDD will show that the FRG is not filling. Just as this becomes apparent, a left-hand hydraulic system failure occurs.

• Action: the pilot must diagnose how and why the FRG does not appear to be filling with fuel. At the same time, the hydraulic failure complicates fault finding and presents the pilot with difficult decisions and task prioritisation issues.

• Exceptional circumstances: this scenario is constructed around the production Eurofighter aircraft, since the problems encountered are anticipated as being harder to correct than in the development aircraft. In the latter aircraft, current procedure simply requires the pilot to terminate IFR and land as soon as possible. This scenario would also involve different decisions if the aircraft were refuelling over densely populated areas such as those encountered in Europe.

• Assumptions: 1. There are no complications other than those presented in the scenario.

THEA: Creation of HTA

• The task information in the scenario creates the basis for an HTA.

• For each task in the hierarchy, questions are asked about human performance at four different stages (which should look very familiar!):
– Goals
– Plans
– Perception/Interpretation/Evaluation
– Action

• For each error detected, the evaluators can record the consequences of the error and possible error reduction measures, such as changes to the design (a sketch of mechanising the questioning follows below).
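
A sketch of how that questioning could be mechanised over the HTA: walk every node and pose a prompt for each of the four stages. The Task type and the prompts below are illustrative only – THEA’s actual question set is more detailed than the one-line versions shown here, and the example tasks are invented, loosely based on the refuelling scenario.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative prompts for each cognitive stage; THEA's real questionnaire is richer.
STAGE_PROMPTS = {
    "Goals": "Could the goal be lost, forgotten, or achieved too early/late?",
    "Plans": "Could the plan be ill-formed, or could steps be reordered or omitted?",
    "Perception/Interpretation/Evaluation": "Could feedback be missed or misread?",
    "Action": "Could the action be performed incorrectly or on the wrong object?",
}


@dataclass
class Task:
    name: str
    subtasks: List["Task"] = field(default_factory=list)


def thea_walk(task: Task):
    """Yield (task name, stage, prompt) for every node in the hierarchy."""
    for stage, prompt in STAGE_PROMPTS.items():
        yield task.name, stage, prompt
    for sub in task.subtasks:
        yield from thea_walk(sub)


refuel = Task("Carry out in-flight refuelling",
              [Task("Monitor fuel group fill state"), Task("Respond to hydraulic failure")])
for name, stage, prompt in thea_walk(refuel):
    print(f"{name} [{stage}]: {prompt}")
```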

THEA: HTA Example

• However, in a more advanced task model, it may be necessary to treat each subtask as a goal itself. Consider the following:

• The evaluation begins with the level above the lowest level tasks that have been modelled.

• When analysis of this subgoal is complete, it can be considered a task in the higher level plan.

THEA: Questions divided by cognitive model stage

THEA Summary

• For each goal and its related plans and tasks, an evaluator must answer a set of questions to determine possible errors. These can be used to inform further designs.

• The goal of this method is not to quantify error, but to help identify possible error conditions in a scenario early in prototype design.

• For a large task model, this method becomes very time consuming.

• This method does not catch collaborative errors.

Conclusions

• Qualitative approaches are useful for examining prototypes and early designs to identify potential trouble spots where errors could happen.

• They have a different purpose from quantification. Quantification examines the probability that something bad will happen, whereas the approaches above discuss how and why errors could occur.