Preventing recurrence of industrial control system accident using assurance case Mirko Napolano, Fumio Machida, Roberto Pietrantuono, and Domenico Cotroneo University of Naples Federico II, NEC Corporation


Preventing recurrence of industrial control system accident using assurance case

Mirko Napolano, Fumio Machida,

Roberto Pietrantuono, and Domenico Cotroneo

University of Naples Federico II, NEC Corporation


Outline

1. Motivation

2. Assurance of accident recurrence prevention

3. A case study

4. Conclusion


Critical infrastructure systems

▌Critical infrastructure systems

Power grids, gas pipelines, water supplies, communication and transportation services, etc.

They are essential to human life and to a wide variety of social activities

▌Advances and threats

Infrastructure systems are getting smarter

They may confront new types of threats


Accidents can happen

▌Accident in a critical infrastructure system

Ex) PG&E gas pipeline explosion killed 8 people and injured 58

September 9, 2010 - San Bruno, California

Avoid similar accidents in the future by applying the lessons learned from this experience

NTSB accident report, PAR-11/01


Understanding what happened

▌Independent public agencies investigate the accident

Authoritative body with experience in the field

Many months to reconstruct the events and assess the causes

Participation of all stakeholders

▌At the end of this process a final report is published with:

Accident narrative

Systems descriptions and analyses

List of safety recommendations

▌Recommendations are guidelines to solve identified problems

E.g. “The flight management computer needs to be improved in accordance with the design specifications” (issued for an aircraft crash)


Challenge

▌A source of information is available: accident knowledge

Useful for third-party organizations that need to improve their existing systems in the same domain

▌However, the list of recommendations alone is not enough:

Directed to the concerned system providers

Issued as generic solutions that are not straightforward to apply

Goal

• Learning clearly from experience how to effectively avoid the recurrence of similar accidents

Our contribution

• A methodology to structure the accident knowledge through graphical notations and arguments


Outline

1. Motivation

2. Assurance of accident recurrence prevention

3. A case study

4. Conclusion


Approach overview

▌Step 1: ECFMA (Event and Causal Factor Mitigation Analysis)

Graphical representation of events, problems and solutions

Information provided by the whole report (descriptions and recommendations)

▌Step 2: Assurance Case

Argumentation over the mitigation of the discovered problems

Instantiation of a new pattern, “Accident Recurrence Prevention Pattern”


Example of ECFMA

▌ECFA (Events and Causal Factors Analysis): a tool used by investigative agencies as an accident causation model to identify root, direct and contributory causes

ECFMA introduces a “solution” element connected to a “causal factor”
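
As a rough illustration of this idea, an ECFMA chart can be held in memory as a sequence of events, each with its causal factors, and each causal factor with zero or more solutions. The Python sketch below is only one possible encoding: the class names, the `unmitigated` helper and the sample entries are hypothetical and are not taken from the chart shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CausalFactor:
    description: str
    kind: str  # "root", "direct", or "contributory"
    solutions: List[str] = field(default_factory=list)  # the element ECFMA adds


@dataclass
class Event:
    description: str
    causal_factors: List[CausalFactor] = field(default_factory=list)


def unmitigated(events: List[Event]) -> List[CausalFactor]:
    """Return every causal factor that has no solution attached yet."""
    return [cf for ev in events for cf in ev.causal_factors if not cf.solutions]


# Illustrative chart with made-up entries, events in chronological order.
chart = [
    Event("Maintenance work starts",
          [CausalFactor("Incomplete work procedure", "root",
                        ["Extend the procedure with a risk assessment"])]),
    Event("Valve control is lost",
          [CausalFactor("Power supply failure", "direct")]),
]

open_factors = unmitigated(chart)
```

Keeping solutions attached to the causal factor they address makes it easy to list the factors that still lack a mitigation, which is the gap the assurance case is later expected to close.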


Assurance case concepts

▌Safety case

A structured argument supported by a body of evidence used for assuring system safety

▌Assurance case

A general argumentation for assuring any kind of system property

▌Goal Structuring Notation (GSN)

A standard graphical notation widely used to describe assurance cases

▌Assurance case patterns

A means of documenting and reusing successful argument structures
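
To make these concepts concrete, a GSN argument can be sketched as a tree of typed nodes. The Python snippet below is a simplified assumption: it collapses the GSN "supported by" and "in context of" relationships into a single `children` list, and the `Node` class, the `render` function and the sample argument are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str  # GSN element type, e.g. "Goal", "Strategy", "Solution", "Context"
    text: str
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Flatten the argument tree into indented lines, one per node."""
    lines = ["  " * depth + f"[{node.kind}] {node.text}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Made-up argument, for illustration only.
case = Node("Goal", "The system is acceptably safe", [
    Node("Context", "Operating environment definition"),
    Node("Strategy", "Argue over each identified hazard", [
        Node("Goal", "Hazard H1 is mitigated", [
            Node("Solution", "Test report for H1"),
        ]),
    ]),
])

text = "\n".join(render(case))
```

Printing `text` gives an indented outline of the argument, a plain-text stand-in for the graphical notation.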


Example of assurance case


Accident Recurrence Prevention Pattern

▌Define a new assurance case pattern

The goal is to ensure that similar accidents do not recur in the future


Outline

1. Motivation

2. Assurance of accident recurrence prevention

3. A case study

4. Conclusion


Case study: PG&E accident

▌Date and location: September 9, 2010 - San Bruno, California

▌Industrial system: SCADA system managing and controlling a gas pipeline

▌The accident: an explosion in the pipeline caused by an overpressure not adequately managed by the SCADA system

▌Consequences: 8 people killed, 58 injured and 38 homes destroyed

NTSB accident report, PAR-11/01


Accident analysis

▌Analysis performed using the final report issued by NTSB

▌Problems identified from ECFMA

1. Lack of information in the maintenance work procedures (root cause)

2. Failure of the two redundant power supplies that energize the electrical valves in the station under maintenance (direct cause)

3. Inadequate fail-safe mode (contributory cause)

4. Absence of Remote Control Valves (RCV) (contributory cause)

▌Proposed solutions

1. Maintenance work procedure including requirements for identifying the likelihood and consequences of planned work on the SCADA system

2. Use of separate circuit breakers in the station

3. Use of a fail-closed fail-safe mode

4. Installation of RCVs along all the lines
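
The pairing of problems and solutions above can be turned into the skeleton of an argument. The pattern itself is given as a figure that is not reproduced in this transcript, so the Python sketch below assumes a simple shape: one top-level goal and one sub-goal per identified problem, each backed by its proposed solution. The `instantiate` function, the dictionary keys and the wording of the goals are hypothetical; the findings are copied from the two lists.

```python
from typing import Dict, List, Tuple

# (problem, cause type, proposed solution), copied from the two lists above.
FINDINGS: List[Tuple[str, str, str]] = [
    ("Lack of information in the maintenance work procedures", "root",
     "Maintenance work procedure including requirements for identifying the "
     "likelihood and consequences of planned work on SCADA system"),
    ("Failure of the two redundant power supplies", "direct",
     "Use of separate circuit breakers in the station"),
    ("Inadequate fail-safe mode", "contributory",
     "Use of close fail-safe mode"),
    ("Absence of Remote Control Valves (RCV)", "contributory",
     "Installation of RCVs along all the lines"),
]


def instantiate(findings: List[Tuple[str, str, str]]) -> Dict:
    """Build a nested dict: one top goal, one sub-goal per problem (assumed shape)."""
    return {
        "goal": "Recurrence of similar accidents is prevented",
        "subgoals": [
            {
                "goal": f"Problem mitigated: {problem} ({kind} cause)",
                "mitigation": solution,
            }
            for problem, kind, solution in findings
        ],
    }


argument = instantiate(FINDINGS)
```

Each entry in `argument["subgoals"]` links one problem directly to its mitigation, which is the kind of direct link that the evaluation later counts.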


PG&E ECFMA: an excerpt


PG&E assurance case


Evaluation

▌Comparison between two possible approaches to improving systems from accident knowledge:

Use of the list of recommendations

Assurance case

▌The report is treated as a structured document made of nodes (sections, subsections, paragraphs) and links, so that it can be compared against the nodes of the assurance case

▌Evaluation criteria:

Understandability

Reusability

Effectiveness


Results

▌Understandability

#1: Direct links from hazard to mitigation: Recommendations 0/4, Assurance case 4/4

#2: Average hops from hazard to mitigation: Recommendations 24.5, Assurance case 1

▌Reusability

#1: Links from recommendations to hazard context: Recommendations 0/4, Assurance case 4/4

#2: Hops from mitigation to hazard context: Recommendations 31.25, Assurance case 2

▌Effectiveness

Number of mitigated hazards: Recommendations 2, Assurance case 4

Assurance case provides more structured and reusable knowledge
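
The slides do not spell out how hops are counted. One plausible reading, assumed here, is that a hop is one link on the shortest path between two nodes of the document graph. The Python sketch below counts hops with a breadth-first search; the `hops` function and both toy graphs are hypothetical and do not reproduce the figures reported above.

```python
from collections import deque
from typing import Dict, List, Optional


def hops(graph: Dict[str, List[str]], start: str, target: str) -> Optional[int]:
    """Breadth-first search: links on the shortest path, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Made-up graphs: a report read section by section vs. a direct argument link.
report = {
    "hazard": ["section 1"],
    "section 1": ["section 2"],
    "section 2": ["recommendation"],
}
assurance_case = {"hazard": ["mitigation"]}

report_hops = hops(report, "hazard", "recommendation")
case_hops = hops(assurance_case, "hazard", "mitigation")
```

Averaging such hop counts over all hazards would give a figure comparable in kind to the "average hops" metric.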


Conclusions

▌We presented an approach to create a post-failure assurance case from the accident analysis

▌A new assurance case pattern has been developed to directly use the analysis outcomes about identified problems and solutions

▌Our approach effectively increases understandability and reusability in the system improvement process
