On Resilient Computing

15
On Resilient Computing ISSI 2011, Tokyo, Japan February 16, 2012 Sven Wohlgemuth Transdisciplinary Research Integration Center National Institute of Informatics, Japan Research Organization for Information and Systems, Japan

Transcript of On Resilient Computing

Page 1: On Resilient Computing

On Resilient ComputingISSI 2011, Tokyo, Japan

February 16, 2012

Sven WohlgemuthTransdisciplinary Research Integration CenterNational Institute of Informatics, JapanResearch Organization for Information and Systems, Japan

Page 2: On Resilient Computing

Agenda

I. Social Infrastructures and ICT

II. Adaptation and Interdependencies

III. Isolation Mechanisms

IV. Resilient Computing

2 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 3: On Resilient Computing

Sensor and controller

ICT services

I. Social Infrastructures and ICT

Sensor and controller

Workflows

Energy supply

Communication network

ICT services

S1 S2 S3

S4

Physical

Cyber

Function Event-driven

S5 S6

S8

...

...

• ICT control systems implement functions of social infrastructures• Real-time processing of context data and controlling location• Centralized control• Operated by public or private organizations

3 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 4: On Resilient Computing

Sensor and controller

ICT services

I. Social Infrastructures and ICT

Sensor and controller

Workflows

Energy supply

Communication network

ICT services

S1 S2 S3

S4

Physical

Cyber

Function Event-driven

S5 S6

S8

...

...

• ICT control systems implement functions of social infrastructures• Real-time processing of context data and controlling location• Centralized control• Operated by public or private organizations

Correctness threatened by crime, terrorism, and natural disasters

3 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 5: On Resilient Computing

Resilience and ICT

• An affected resilient ICT system delivers at least correct critical services in a hostile environment (brittle) (Hollnagel et al., 2006)

• Ability of an ICT system to provide and maintain an acceptable level of service in the face of various faults and challenges to normal operation (Sterbenz et al., 2010)

• Persistence of dependability when facing changes (Laprie, 2008)

Own illustration following (Sheffi, 2005; Günther et al., 2007; McNanus, 2009)

4 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 6: On Resilient Computing

II. Adaptation and Interdependencies

Function

Specification Service

Sensor and controller

ICT services

d1 d2 c1

S1 S2 S3

S4

d1 d1, d1*d1. d2, ...

c2

S4

Sn

d1. d2, ...

c2

d1OS

Sj Sk

Si

Data flows describe interdependenciesAdaptation of an ICT system

5 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 7: On Resilient Computing

Shared service C

Shared service C

Sensor

Service A

Actuator

d r

d

Case (a) - Passive attack

Sensor

Service A

Actuator

d r*

d*

Case (b) - Active attack

Sensor Actuator

Case (c) - Non-availability

Malicious interferences Non-malicious interference

d, d* : Input data for a data processing

: Shared used servicer, r* : Result of a data processing

d

Attacking service B

Attacking service B

Shared service C

Service A

Service B

Covert Channels

Automatic detection of all cover channels is impossible (Wang and Ju, 2006)Covert channels may be unknown and lead to a failure Fault isolation

6 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 8: On Resilient Computing

III. Isolation Mechanisms

Mechanisms & Methods

Policies• Bell-LaPadula, Chinese Wall• BiBa, Clark-Wilson• Role-based access control• Optimistic Security• APPLE• Obligation Specification Language (OSL)• Extended Privacy Definition Tools (ExPDT)

• Testing• Simulation• Model checking

• Security engineering• Non-linkable Delegation of Rights• Monitors • Virtualization• Privacy-enhancing technologies• Verifiable homomorphic encryption• Secure data aggregation• Certified security patterns

• Vulnerability analysis• Model checking• Penetration testing• Process Rewriting• Software patches

Fault acceptance Fault avoidance

7 Sven Wohlgemuth <[email protected]> On Resilient Computing

Fault tolerance Fault forecasting

Fault prevention Fault removal

• Forensics• Process mining• Data provenance• Redundancy• Consensus protocols• Recovery-oriented computing

Page 9: On Resilient Computing

Consensus and Adaptation Objective: Majority on correct data (sensor data, computation result)

S4

S5

S6

Sj

Sl

Monitor

d1

d2

d3

d1, d2, d3

d1, d2, d3

Sk

Consensus protocols and malicious faults:

• Synchronous communication:• Asynchronous communication: Consensus not possible if one process fails

• But: Bears risk of failure due to non-availability of data• Tolerates t < n/3 faulty processes, with authenticated messages: t < n

dcorrect = (d1=d2=d3), (d1=d2), (d1=d3) OR (d2=d3)?

Cachin et al. 2011

8 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 10: On Resilient Computing

Challenge: Correct data processing in spite of covert channels

Fulfilled safety (correct) propertiesFulfilled liveness (adaptation) propertiesExpected risk of failure

Error rate0% 100%

The Error rate represents the probability of faulty services of a system according to its

specification

Safety Liveness

5On Resilient Computing Sven Wohlgemuth <[email protected]>

IV. Resilient Computing

Page 11: On Resilient Computing

Challenge: Correct data processing in spite of covert channels

Error rate0% 100%CriticalBrittleBrittleCritical

Fulfilled safety (correct) propertiesFulfilled liveness (adaptation) propertiesExpected risk of failure

Safety Liveness

5On Resilient Computing

Failure due to safety

High capability of correct data processing

Few on demand data processing

Sven Wohlgemuth <[email protected]>

IV. Resilient Computing

Page 12: On Resilient Computing

Failure due to liveness

Low capability on correct data processing

High on demand data processing

Challenge: Correct data processing in spite of covert channels

Error rate0% 100%CriticalBrittleBrittleCritical

Fulfilled safety (correct) propertiesFulfilled liveness (adaptation) propertiesExpected risk of failure

Safety Liveness

5On Resilient Computing Sven Wohlgemuth <[email protected]>

IV. Resilient Computing

Page 13: On Resilient Computing

Acceptable states

Acceptable correctness of data processing

Acceptable on demand data processing

Challenge: Correct data processing in spite of covert channels

Error rate0% 100%CriticalBrittleBrittleCritical

Fulfilled safety (correct) propertiesFulfilled liveness (adaptation) propertiesExpected risk of failure

Safety Liveness

5On Resilient Computing Sven Wohlgemuth <[email protected]>

IV. Resilient Computing

Page 14: On Resilient Computing

Generate Evidences

S4

S5

S6

Sj Sk

Sl

Risk Assessment with Uncertainty

Usage Control Policy Select Services

S4

S5

S6

Sj Sk

Sl

De-Select Services

S4

S5

S6

Sj Sk

Sl

Security Architecture for Resilient Computing

10 Sven Wohlgemuth <[email protected]> On Resilient Computing

Page 15: On Resilient Computing

Generate Evidences

S4

S5

S6

Sj Sk

Sl

Risk Assessment with Uncertainty

Usage Control Policy Select Services

S4

S5

S6

Sj Sk

Sl

De-Select Services

S4

S5

S6

Sj Sk

Sl

Preliminary work: DREISAM (Delegation of Rights) & DETECTIVE (Data Provenance)

Security Architecture for Resilient Computing

10 Sven Wohlgemuth <[email protected]> On Resilient Computing