Survivability Analysis of Networked Systems

30
Survivability Analysis of Networked Systems Computer Science Department Carnegie Mellon University Pittsburgh, PA DARPA OASIS PI Meeting, Norfolk, VA 14 February 2001 Jeannette M. Wing with Somesh Jha (Wisconsin) and Oleg Sheyner DARPA co-PI: Tom Longstaff (SEI)

description

Survivability Analysis of Networked Systems. Jeannette M. Wing with Somesh Jha (Wisconsin) and Oleg Sheyner DARPA co-PI: Tom Longstaff (SEI). Computer Science Department Carnegie Mellon University Pittsburgh, PA DARPA OASIS PI Meeting, Norfolk, VA 14 February 2001. Survivability. What if - PowerPoint PPT Presentation

Transcript of Survivability Analysis of Networked Systems

Page 1: Survivability Analysis of Networked Systems

Survivability Analysis of Networked Systems

Computer Science DepartmentCarnegie Mellon University

Pittsburgh, PA

DARPA OASIS PI Meeting, Norfolk, VA14 February 2001

Jeannette M. Wingwith Somesh Jha (Wisconsin) and Oleg Sheyner

DARPA co-PI: Tom Longstaff (SEI)

Page 2: Survivability Analysis of Networked Systems

2Survivability Analysis Jeannette M. Wing

Survivability

• What if– a terrorist hacker brings down the nation’s power

grid?– an act of Mother Nature causes the US banking

network to fail?

• Critical infrastructures– Utilities: gas, electricity, nuclear, water, …– Communications: telephone, networks, …– Transportation: airlines, railways, highways, …– Medical: emergency services, hospitals, …– Financial: banking, trading, …

Page 3: Survivability Analysis of Networked Systems

3Survivability Analysis Jeannette M. Wing

Survivability

• A system is survivable if it can continue to provide end services despite the presence of faults.

• Faults– Accidental or malicious– Not necessarily independent Finer-grained reliability analysis is required.

• Service-oriented– Exploit semantics of application Not all network nodes and links are treated equally.

Page 4: Survivability Analysis of Networked Systems

4Survivability Analysis Jeannette M. Wing

Foundational Questions

• What is the difference between models for survivability and those for– Fault-tolerant distributed systems?– Secure systems?

• Our starting point:– Independence assumption goes out the window.– Cost must be included in the equation.

Page 5: Survivability Analysis of Networked Systems

5Survivability Analysis Jeannette M. Wing

Determining Survivability Strategies

System Requirements/Architecture

SurvivableNetwork Analysis

Essential ServicesIntrusion EffectsMitigation Strategies

SEI CERT/CCIntrusionKnowledge

ImprovedRequirements/Architecture

Page 6: Survivability Analysis of Networked Systems

6Survivability Analysis Jeannette M. Wing

Two Parts in Cooperation

• The Survivable Network Analysis Method (SEI)– Measures existing systems for survivability– Focuses on user and intruder models

• Applying Formal Methods to SNA– Applies model checking and other techniques to

survivability– Allows systems that are formally specified to be

submitted to survivability analysis

Page 7: Survivability Analysis of Networked Systems

8Survivability Analysis Jeannette M. Wing

Simple Example: A Banking System

FRB 1

FRB 3FRB 2

MC 1 MC 3MC 2

Bank A Bank CBank B

a1a2

b1 b2 c1

Page 8: Survivability Analysis of Networked Systems

9Survivability Analysis Jeannette M. Wing

Overview of Our Formal Method

Checker

Network Model Survivability Property

Phase 1

Scenario Graph

Analyzer

Reliability Query,Cost Query, etc.Phase

2

Scenario Set

Page 9: Survivability Analysis of Networked Systems

10Survivability Analysis Jeannette M. Wing

Phase 1

Network Model =

Survivability Property =

Scenario Graph =

Model Checker = (modified) NuSMV

A set of concurrently executing Finite State Machines.

A predicate in CTL.

A set of related examples.

Page 10: Survivability Analysis of Networked Systems

11Survivability Analysis Jeannette M. Wing

Network Model

• Processes– Nodes and links are processes (i.e., FSMs)

• banks, money centers, federal reserve banks, and links

– Communication via shared variables (i.e., finite queues)

• representing channels, and hence interconnections.

• Failures– Faults represented by special state variable

• fault:{normal, failed, intruded}

– Links and banks can fail at any time• Failed link blocks all traffic.• Failed bank routes all checks to an arbitrarily chosen

money center.

– Money centers and federal reserve banks do not fail.

Page 11: Survivability Analysis of Networked Systems

12Survivability Analysis Jeannette M. Wing

Survivability Properties

• Fault-related– Money never deposited into wrong account.

• AG(error)

• Service-related– A check issued eventually clears.

• AG(checkIssued AF(checkCleared))

Page 12: Survivability Analysis of Networked Systems

14Survivability Analysis Jeannette M. Wing

Output: Fault Scenario Graph

Intuition:

• Each “counterexample” spit out by the model checker is a scenario.

• Survivability property gives a slice of the model.

Each path is a scenario of how a fault can occur.

Page 13: Survivability Analysis of Networked Systems

15Survivability Analysis Jeannette M. Wing

Survivability Properties

• Fault-related– Money never deposited into wrong account.

• AG(error)

• Service-related– A check issued eventually clears.

• AG(checkIssued AF(checkCleared))

Page 14: Survivability Analysis of Networked Systems

16Survivability Analysis Jeannette M. Wing

A Service Success Scenario Graph

issueCheck(A, C)

send(A, MC-2)

send(MC-2, FRB-1)

send(FRB-1, FRB-3)

send(FRB-3, MC-3)

send(MC-3, C)

debitAccount

send(FRB-2, FRB-3)

send(MC-1, FRB-2)

send(A, MC-1)

up(a2)

up(c1)

down(a2) & up(a1)

Page 15: Survivability Analysis of Networked Systems

17Survivability Analysis Jeannette M. Wing

A Service Fail Scenario Graph

issueCheck(A, C)

FAIL

down(A)

up(a2)

up(A)

pick(MC-2)

down(c1)

down(a2)

pick(MC-1)

down(a1)

down(c1)

down(c1)

up(a1)

send(A, MC-2) send(A, MC-1)

FAIL

FAIL

FAIL

Page 16: Survivability Analysis of Networked Systems

18Survivability Analysis Jeannette M. Wing

Overview of Method

Network Model

Reliability Query,Cost Query, etc.Analyzer

Scenario Set

Survivability Property

Phase 2

Phase 1

Scenario Graph

Checker

Annotations(e.g., probabilities, cost)

Page 17: Survivability Analysis of Networked Systems

19Survivability Analysis Jeannette M. Wing

Phase 2: Reliability Analysis (in a Nutshell)

• Annotations = Probabilities– Use Bayesian Networks to model dependence of

events.

• Symbolic– Use symbolic probabilities

• high, medium, low

– Use NDFA theory to compute scenario set.

• Continuous– Use numeric probabilities

• [0.0, 1.0]

– Use Markov Decision Processes to model both nondeterministic and probabilistic transitions.

Page 18: Survivability Analysis of Networked Systems

20Survivability Analysis Jeannette M. Wing

Phase 2a: Symbolic Analysis

Annotated Scenario Graph =

Reliability Query =

Scenario Set =

Composer = ASG + DFA

Bayesian Network + Scenario Graph

Regular Expression (DFA)

High-risk scenarios

high

{medium, low}

Page 19: Survivability Analysis of Networked Systems

23Survivability Analysis Jeannette M. Wing

Annotated Scenario Graph

issueCheck(A, C)

down(A)

up(a1)

up(A)

pick(MC-2)

down(a1)

pick(MC-1)

down(a2)down(a2)

FAIL

M

M

M

M

M

M

L H

Page 20: Survivability Analysis of Networked Systems

24Survivability Analysis Jeannette M. Wing

Phase 2b: Continuous Analysis

• Use real values for probabilities.• May leave probabilities of some events

unspecified. Markov Decision Processes

• Mix of nondeterministic and probabilistic transitions

• Why? System is not closed.– Hard to assign probabilities to some faults (e.g.,

intrusions).– Environment makes choice (i.e., decision) and can be

demonic!

Page 21: Survivability Analysis of Networked Systems

25Survivability Analysis Jeannette M. Wing

Reliability Analysis

Goal of (malicious) environment: Devise an optimal policy to minimize reliability.

• Assign to each state, s, a value, V(s), computed using a standard policy iteration algorithm from MDP literature.

• Let V* be the value function after convergence. Then, for initial state of scenario graph, s0, V*(s0) computes worst-case probability of service eventually finishing.

Page 22: Survivability Analysis of Networked Systems

26Survivability Analysis Jeannette M. Wing

A Typical Example

V(Bad) = 0.0V(Good) = 1.0

0.6

0.7

0.6

BadGood

0.40.

60.3

0.7

0.65

Page 23: Survivability Analysis of Networked Systems

27Survivability Analysis Jeannette M. Wing

A Service Success Scenario Graph

issueCheck(A, C)

send(A, MC-2)

send(MC-2, FRB-1)

send(FRB-1, FRB-3)

send(FRB-3, MC-3)

send(MC-3, C)

debitAccount

send(FRB-2, FRB-3)

send(MC-1, FRB-2)

send(A, MC-1)

up(a2)

up(c1)

down(a2) & up(a1)3/8 1/4

1/2The worst case probability that a check issued by Bank A on Bank C is

(1/2 * 3/8) + (1/2 * 1/4) = 5/16

Page 24: Survivability Analysis of Networked Systems

28Survivability Analysis Jeannette M. Wing

Phase 2c: Latency and Cost Analysis

• Latency Analysis– Associate with each edge in scenario graph an

immediate cost (e.g., time it takes to execute event).– Q: What is the worst case latency scenario?

• Cost Analysis– Identify new actions that correspond to decisions an

architect needs to make.– Associate a cost with each action.– Define constraints on costs.– Q: Which set of links can I afford to upgrade to

achieve higher reliability, given my cost constraints?

Page 25: Survivability Analysis of Networked Systems

31Survivability Analysis Jeannette M. Wing

Constrained Markov Decision Processes

<S, A, P, c, d>

• S is a finite state space.• A is a finite set of actions.

• P are transition probabilities. Psas’ is the probability of moving from state s to s’ if action a is chosen.

• c: (S x A) is the immediate cost. c(s, a) is the cost of choosing action a at state s.

• d: (S x A) is a k-dimensional vector of immediate costs, captures additional cost constraints.

Page 26: Survivability Analysis of Networked Systems

32Survivability Analysis Jeannette M. Wing

Progress To Date: Tools

• Trishul tool• Uses NuSMV model checker, done by Somesh Jha.• History variable explodes state space, leading to…

• …New tool• Uses SPIN, ongoing by Oleg Sheyner.• No need for history variable.

Page 27: Survivability Analysis of Networked Systems

33Survivability Analysis Jeannette M. Wing

Progress To Date: Case Studies

• Trading floor model of major investment bank (being “sanitized”)– 10K lines of NuSMV– half-million nodes in scenario graph– 50 threat scenarios– 45 found by system– 5 new threat scenarios found– With independence assumption, too many misses.

• B2B e-commerce NYC start-up (Jha)– 50K lines of Statecharts– 2 million NuSMV beyond capability of tool

• Lincoln Labs example (Sheyner)– TBD

Page 28: Survivability Analysis of Networked Systems

34Survivability Analysis Jeannette M. Wing

Next Steps

• Show applicability of the CMDP model for other critical infrastructure examples.

• Via Lincoln Labs connection

• Combine with other tools to further automate the analysis.

• Linear programming package, theorem provers, …

• Integrate with informal SEI Survivability Network Analysis Method

• Via case studies

Page 29: Survivability Analysis of Networked Systems

35Survivability Analysis Jeannette M. Wing

References

• Applying Formal Methods to SNA– Jha and Wing, “Survivability Analysis of Networked

Systems,” to appear, Proceedings of the International Conference of Software Engineering, 2001.

• Survivable Networking Analysis– Ellison, Linger, Longstaff, and Mead, “Survivable Network

System Analysis: A Case Study,” IEEE Software, July/August 1999.

• The Vigilant Healthcare System– Ellison, Fisher, Linger, Lipson, Longstaff, and Mead,

“Survivability: Protection Your Critical Systems, IEEE Internet Computing, November/December 1999.

– Web site: IEEE article and other reports www.sei.cmu.edu/organization/programs/nss/surv

-net-tech.html

Page 30: Survivability Analysis of Networked Systems

36Survivability Analysis Jeannette M. Wing

Other OASIS Connection

• Recovery service for PASIS (Greg Ganger, Pradeep Khosla, Carnegie Mellon)

– Anticipate intrusion• Proactive secret-sharing

– Upon intrusion detection• Reactive secret-sharing